Thursday, January 01, 2009

Failed Ideas and Two-Core Rendering

I'm pretty gun-shy about posting new features to this blog before they are released.  One reason is that a fair number of the things I code never make it into the final X-Plane because they just don't perform as expected.  But the converse of that is: there should be no problem posting about what failed.

One idea that I now believe will not make it into the sim is dual-core pipelined rendering.  Let me clarify what I mean by that.

As I have blogged before, object throughput is one of the hardest things to improve in X-Plane. That code has been tuned over and over, and it's getting to be like squeezing water from a rock. That's where dual-core pipelined rendering comes in.  The idea is pretty simple.  Normally, the way X-Plane draws the frame is this:
    for each object
        is it on screen?
        if it is, tell the video driver: hey, go draw this OBJ
Now the decision about whether objects are on screen (culling) is actually heavily optimized with a quadtree, so it's not that expensive.  But still, when we look at the loop, one core is spending all of its time both (1) deciding what is visible and (2) telling the video driver to draw the object.
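To make the culling step concrete, here is a rough sketch of how a quadtree lets the loop skip whole groups of objects at once. It is an illustration only, not X-Plane's actual code: the `Box`, `Node`, and `cull` names, the 2-D rectangles, and the sample scene are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """Axis-aligned rectangle (hypothetical); used for node bounds and the view."""
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "Box") -> bool:
        return (self.x0 <= other.x1 and other.x0 <= self.x1 and
                self.y0 <= other.y1 and other.y0 <= self.y1)


@dataclass
class Node:
    """One quadtree node (hypothetical layout)."""
    bounds: Box
    objects: list = field(default_factory=list)   # (name, Box) pairs
    children: list = field(default_factory=list)  # up to four child Nodes


def cull(node: Node, view: Box, visible: list) -> None:
    """Append the names of objects whose boxes overlap the view."""
    if not node.bounds.intersects(view):
        return  # the whole subtree is off screen, so skip everything under it
    for name, box in node.objects:
        if box.intersects(view):
            visible.append(name)
    for child in node.children:
        cull(child, view, visible)


# Made-up scene: two quadrants, one object in each.
root = Node(Box(0, 0, 100, 100), children=[
    Node(Box(0, 0, 50, 50), objects=[("hangar", Box(10, 10, 12, 12))]),
    Node(Box(50, 50, 100, 100), objects=[("tower", Box(80, 80, 82, 82))]),
])
visible = []
cull(root, Box(0, 0, 30, 30), visible)
```

The second quadrant is rejected with a single bounds test, so the tower is never examined individually. That early-out is why culling stays cheap compared to the draw calls.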

So the idea of the pipelined render is to have one core decide what's on screen and then send that to another core that talks to the video driver.  Sort of a bucket-brigade for visible objects. The idea would be that instead of each frame taking the sum of the time to cull and draw, each frame should take whichever one is longer, and that's it.
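The bucket brigade might be sketched roughly like this. This is a minimal illustration of the hand-off structure only, not the sim's code: `run_pipelined_frame`, `is_on_screen`, and `draw` are hypothetical names, and Python threads will not show a real two-core speedup, so read it for the shape of the pipeline rather than its performance.

```python
import queue
import threading


def run_pipelined_frame(objects, is_on_screen, draw):
    """Cull on the calling thread while a second thread issues the draw calls."""
    handoff = queue.Queue()  # the "bucket brigade" between the two stages
    done = object()          # sentinel marking the end of the frame

    def draw_worker():
        while True:
            obj = handoff.get()
            if obj is done:
                break
            draw(obj)  # stands in for "tell the video driver to draw this"

    worker = threading.Thread(target=draw_worker)
    worker.start()
    for obj in objects:
        if is_on_screen(obj):  # culling stage
            handoff.put(obj)
    handoff.put(done)
    worker.join()  # the frame is finished only when the last draw completes


# Made-up frame: ten objects, the even-numbered ones are "on screen".
drawn = []
run_pipelined_frame(range(10), lambda o: o % 2 == 0, drawn.append)
```

Note that every hand-off goes through a thread-safe queue, so even this toy version pays a synchronization cost per visible object.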

The problem is: the idea doesn't actually work very well.  First, the math above is wrong: the time it takes to run is the time of the longer process plus the waiting time.  If you are at the end of a bucket brigade putting out the fire, you waste time waiting until that first bucket goes down the line.  In practice, though, the real problem is that on the kinds of machines that are powerful enough to be limited only by object count, the culling phase is really fast.  If it takes 1 ms to cull and 19 ms to draw, and we wait for 0.5 ms, the frame goes from 20 ms down to 19.5 ms, so the savings of this scheme is only 2.5%.
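The arithmetic is easy to check with a few lines. This is a simplified model using only the numbers from the example above; the function name `frame_times` is made up for illustration.

```python
def frame_times(cull_ms, draw_ms, wait_ms):
    """Return (serial, pipelined, fractional savings) for one frame."""
    serial = cull_ms + draw_ms                   # one core does both jobs
    pipelined = max(cull_ms, draw_ms) + wait_ms  # longer stage plus pipeline wait
    savings = (serial - pipelined) / serial
    return serial, pipelined, savings


serial, pipelined, savings = frame_times(1.0, 19.0, 0.5)
print(f"{serial} ms -> {pipelined} ms, saving {savings:.1%}")
```

With a 1 ms cull, the most the scheme could ever save is that 1 ms; the wait eats half of it.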

Now 2.5% is better than nothing, but there's another problem: this scheme assumes that we have two cores with nothing to do but draw.  This is true sometimes, but if you have a dual-core machine and you just flew over a DSF boundary, or there are heavy forests, or a lot of complex airports, or you have paged-texture orthophoto scenery, then that second core really isn't free some of the time, and at least some frames will pick up an extra delay: the delay waiting for the second core to finish the last thing it was doing (e.g. building one taxiway, or one forest stand) and be ready to help render.

And we lose because of one more problem: the actual cost of rendering goes up due to the overhead of having to make it work on two cores.  Nothing quite gloms up tight fast inlined code like making it thread-safe.
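A back-of-the-envelope model shows how little it takes to wipe out the gain. The 1 ms cull, 19 ms draw, and 0.5 ms wait are the figures from the example earlier; the 1 ms busy-core delay and the 3% thread-safety overhead are purely assumed numbers for illustration, not measurements, and `pipelined_gain_ms` is a hypothetical helper.

```python
def pipelined_gain_ms(cull_ms, draw_ms, wait_ms, busy_ms=0.0, overhead=0.0):
    """Milliseconds saved per frame versus the single-core loop (negative = slower)."""
    serial = cull_ms + draw_ms
    # Assumption: thread-safety overhead slows both stages by the same fraction.
    cull = cull_ms * (1.0 + overhead)
    draw = draw_ms * (1.0 + overhead)
    # busy_ms is time spent waiting for the second core to finish other work.
    pipelined = max(cull, draw) + wait_ms + busy_ms
    return serial - pipelined


best_case = pipelined_gain_ms(1.0, 19.0, 0.5)                   # idle second core
busy_core = pipelined_gain_ms(1.0, 19.0, 0.5, busy_ms=1.0)      # assumed 1 ms delay
slower_code = pipelined_gain_ms(1.0, 19.0, 0.5, overhead=0.03)  # assumed 3% overhead
```

Under these assumptions the best case saves half a millisecond, while either a 1 ms wait for the second core or a 3% slowdown from thread safety makes the pipelined frame slower than the plain single-core loop.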

So in the end I suspect that this idea won't ever make it into the sim.  The combination of little benefit, interference from normal multi-core processing, and a slowdown to the code in all cases means it just doesn't perform the way we hoped.

I am still trying to use multiple cores as much as possible.  But I believe that the extra cores are better spent preparing scenery than trying to help with that main render.  (For example, having more cores to recompute the forest meshes more frequently lowers the total forest load on the first CPU, indirectly improving fps.)

1 comment:

Anonymous said...

Hi Ben,

According to various forum posts, X-Plane uses geodetic latitude for reference points, but also assumes a "round" earth. This is a little confusing to me --- Does it mean that given a reference lat/lon for a scenery map, the map is round (spherical) at its base, though the reference point itself is tied to an elliptical Earth model?

According to the tech notes...

“A few pitfalls of this coordinate system: the Y axis is not synonymous with up; this divergence increases as you go away from the reference point. Most parts of x-plane compensate for this, but there are still some shortcuts, typically for performance reasons.”

Does this mean that I can expect to observe unrealistic jumps in simulation time and/or Lat, Lon, Alt position data when crossing scenery borders --- or in general when X-Plane detects a coordinate system shift? If so, would this be a CPU/GPU issue (which can be remedied, e.g. with a quad-core CPU) --- or a physical modeling issue, or both?

This is important to me because I count on realistic progression of the aircraft dynamic state, and must bypass X-Plane's autopilot in favor of externally computed steering commands.