Showing posts with label performance. Show all posts

Wednesday, June 25, 2008

Clean Airport Layouts

I have blogged about correct airport layouts before, but let me bring the point up again, because it is so important:
  • You must create structurally correct layouts (that is, vertices connecting to vertices, not lines) in order to get good rendering in X-Plane.  Just because the preview looks okay in WED doesn't mean your layout is correctly formed!
I wrote some documentation on the scenery site that describes the problems in more detail, with pictures.  

(I've been trying to create more permanent documentation - it is tempting to simply blog the issues because it's so easy to throw a blog post up, but after 110 blog posts in 2007, the scenery site is still very thin on the documentation front.)

Do Not Add Vertices To Make Smoother Curves

I really can't stress this enough: please do not go adding extra vertices in your layout to make bezier curves look smoother in X-Plane.  Why is this such a bad idea?  Let me count the ways!
  1. X-Plane will fight you all the way!  X-Plane adds vertices to curves (to make them smoother) when it detects large errors.  If you have a lot of small curves, the errors are inherently smaller and X-Plane will add fewer points.  So the first vertices you add to your curve do almost nothing.  You have to add a huge number of vertices to get a marginal improvement in your curve.  In the meantime...
  2. X-Plane will provide variable-quality curve rendering in the future!  Curve detail should be a user-controlled setting.  X-Plane has to run on a wide range of hardware; any time we can let the user pick rendering quality, this is a win, because it helps bridge the gap between the user who just bought a brand new Core 2 Extreme system with GeForce 9800 and the user trying to keep X-Plane 9 running on his G4 laptop which can't be upgraded.  When you add vertices, you take the decision about rendering quality out of the hands of the user, and force high quality on a user who may not be able to handle it. Adding vertices forces a decision of lower framerate on some users.
  3. Adding vertices bloats the size of apt.dat.  This is not a huge factor for custom scenery, but is a factor for the default apt.dat.  Robin received a big pile of new airport layouts, and that's great.  But one risk is that the total size of user submitted data could get out of control.  For new layouts made with WED, vertices represent a big chunk of the data.  If you are increasing your vertices by a factor of 5x or 6x to improve tessellation, you are bloating the apt.dat file.
  4. Manually adding vertices to smooth curves lowers the level of abstraction in the apt.dat file.  Any time we can have a high level abstract representation of scenery, X-Plane has the freedom to improve rendering in the future.  If your layout is made up of a large number of small curves (instead of a small number of large curves), X-Plane cannot tell that those small pieces make up some larger structure; in the future it may not be able to render those layouts as nicely as ones that are made with fewer control points.
In summary, please use the smallest number of vertices to create your layouts.  (But always add vertices to ensure that your T junctions are correct!)
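
To see why the first few extra vertices buy almost nothing, here is a rough sketch of error-driven curve subdivision. This is an assumption about the general technique (split a curve until it is "flat enough"), not X-Plane's actual code; the function names, the quadratic curve, and the tolerance of 1.0 are all made up for illustration.

```python
import math

def lerp(p, q, t):
    """Point a fraction t of the way from p to q."""
    return (p[0] + (q[0] - p[0]) * t, p[1] + (q[1] - p[1]) * t)

def split_quadratic(p0, p1, p2, t=0.5):
    """De Casteljau split of a quadratic Bezier into two smaller curves."""
    a = lerp(p0, p1, t)
    b = lerp(p1, p2, t)
    m = lerp(a, b, t)
    return (p0, a, m), (m, b, p2)

def flatness_error(p0, p1, p2):
    """Distance between the curve's midpoint and the midpoint of its chord."""
    mid_curve = ((p0[0] + 2 * p1[0] + p2[0]) / 4, (p0[1] + 2 * p1[1] + p2[1]) / 4)
    mid_chord = ((p0[0] + p2[0]) / 2, (p0[1] + p2[1]) / 2)
    return math.dist(mid_curve, mid_chord)

def tessellate(p0, p1, p2, tolerance):
    """Points approximating the curve (p0 excluded, p2 included).

    Hypothetical renderer behavior: keep splitting while the error is large.
    """
    if flatness_error(p0, p1, p2) <= tolerance:
        return [p2]
    left, right = split_quadratic(p0, p1, p2)
    return tessellate(*left, tolerance) + tessellate(*right, tolerance)

# One big curve, as the author should draw it.
curve = ((0, 0), (0, 100), (100, 100))
auto = tessellate(*curve, 1.0)

# The same curve with one extra vertex added by hand at its midpoint.
left, right = split_quadratic(*curve)
by_hand = tessellate(*left, 1.0) + tessellate(*right, 1.0)

print(len(auto), len(by_hand))
```

In this sketch both versions end up with 8 points: the hand-added vertex just replaces one the subdivider would have added anyway, so the curve looks no smoother while the file gets bigger.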



Saturday, June 21, 2008

Pushing On a String (RAM vs. CPU)

I found a disturbing text file on a user's computer...the user had a P-IV 2.8 GHz CPU and a GeForce 8800GTX. An excerpt:
Day 458 of my captivity. Life continues to be an immense, boring string of idle pauses, punctuated with drawing tasks way below my dignity. I am a GeForce 8 - why doesn't anyone here respect that? And yet when the user is playing "X-Plane" I spend long milliseconds with nothing to do while that infernal Pentium IV thinks carefully about what triangle I should draw next. Life is so dull. That Pentium IV must be the dumbest chip to come out of the fab plant - surely he is the runt of his wafer. Sometimes I become so despondent that I consider turning my fan off and cooking myself to death to end my misery. Perhaps I will intentionally blue screen the operating system...
Okay, so I made that up. But I'm sure that if you put a GeForce 8 into a P-4 system, that's about what the card would be thinking - it would be bored silly. Even the fastest Pentium 4s can't feed data and rendering instructions to a GeForce 8 fast enough to fully utilize the hardware.

And in this sense, hardware can be like pushing on a string. Even if putting worse hardware in your machine would lower fps, putting better hardware in it may not improve fps. If you have a P-IV and a GeForce 8, putting a GeForce 2 MX in its place will lower fps, but upgrading to a GeForce 9800 won't raise it - it will just leave your graphics card even more bored and angry than it was before.

This is because the slowest component in the graphics pipeline determines fps. Improving the speed of the other components just leaves them idle more of the time.
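
A back-of-the-envelope sketch of that rule, assuming the CPU and GPU work on frames in parallel so the slower of the two sets the pace. The function name and the millisecond figures are invented for illustration; they are not measurements.

```python
def frames_per_second(cpu_ms_per_frame, gpu_ms_per_frame):
    """Frame rate when CPU and GPU overlap: the slower stage sets the pace."""
    return 1000.0 / max(cpu_ms_per_frame, gpu_ms_per_frame)

# Hypothetical numbers: a slow CPU (40 ms per frame) paired with three different GPUs.
print(frames_per_second(40, 10))  # fast GPU: 25 fps, GPU idle most of the time
print(frames_per_second(40, 5))   # faster GPU: still 25 fps
print(frames_per_second(40, 80))  # much slower GPU: now the GPU is the limit, 12.5 fps
```

Halving the GPU's time changes nothing until the GPU is the slower stage, which is the "pushing on a string" effect.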

Why RAM Isn't like CPU.

When your system is limited by CPU, the change in performance is somewhat linear - get a 20% faster CPU and you get 20% more FPS.

RAM isn't like that...usually you either have more RAM than you need (and life is good) or you have too little, and your framerate is so bad you want to pull your hair out. Why is that?

Well, the difference between RAM and CPU lies in how we use the resource. If the sim needs more CPU than the computer has, the CPU plows through the work - it just takes longer.

But what happens when you run out of RAM? The backup for RAM is your hard disk, which isn't even the same substance. A RAM chip might be able to come up with some data for the CPU in 4 nanoseconds; a hard drive might be able to seek to data in 4 ms. That is a difference of 1,000,000x! (By comparison, the difference between flying across the country and walking might only be 150x.)

In other words, when we run out of RAM, we can't degrade gracefully, we degrade catastrophically.
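
To put numbers on "catastrophically", here is a small sketch using the 4 ns and 4 ms figures above and the textbook average-access-time formula. The 1-in-1000 miss rate is an assumed figure chosen for illustration.

```python
RAM_NS = 4             # RAM access time from the text, in nanoseconds
DISK_NS = 4_000_000    # 4 ms disk seek, expressed in nanoseconds

def average_access_ns(page_fault_rate):
    """Average cost of a memory access when some fraction has to go to disk."""
    return (1 - page_fault_rate) * RAM_NS + page_fault_rate * DISK_NS

print(DISK_NS // RAM_NS)                   # 1000000
print(average_access_ns(0.0) / RAM_NS)     # everything fits in RAM: 1.0
print(average_access_ns(0.001) / RAM_NS)   # 1 access in 1000 hits disk: about 1000x slower
```

Even when 99.9% of accesses are served from RAM, the average access is roughly a thousand times slower, which is why there is so little middle ground between "enough RAM" and "unusable".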

(It gets worse: when we run out of virtual memory, we don't have anything to fall back on - we're just dead!)

Interestingly, it used to be that way for VRAM (back in the days of X-Plane 6); if we had to substitute system RAM for VRAM, framerate died. These days, however, the graphics bus is so much faster that that substitution is almost tolerable - thus you can start to run out of VRAM and not have a slide-show.

We Can't Use All of Your RAM

A user emailed us to inquire why X-Plane wasn't using more of his RAM - he had a machine with 8 GB running a 64-bit OS. (X-Plane could have had 3 GB of memory but was only using a fraction of that.)

The answer is partly in the nature of RAM - because RAM fails catastrophically, users are likely to set X-Plane to never get near the edge of running out of memory. Our engine is designed to simply minimize memory usage (since the penalty for running out of virtual memory is fatal) rather than try to gain incremental benefit from incremental RAM. (There are scenery packs that you need a lot of RAM for, but usually you can either run them or you can't.)

For most users, RAM is like pushing on a string - if you have enough memory that you can run without paging, adding more won't help. If you have 4 GB you really don't need more memory. If you have 2 GB, you probably don't need more memory unless you're running with almost everything else maxed out, due to a really fast CPU and GPU.

Saturday, June 07, 2008

Irrational Sliders

I am reading Predictably Irrational, by Dan Ariely. It's a great read - definitely recommended - describing the consistent irrational biases that frequent human decision making.

The first chapter discusses our tendency to make relative, rather than absolute comparisons. When deciding whether a product is a good value, we will look at the pricing of similar models, rather than the actual relationship between the product and the money spent. (The implication being that a company can make a product seem cheap without changing its price by adding a second, more expensive but similar "decoy" product. Poof! The cheaper product is now a good deal.)

This behavioral tendency explains user reaction to the rendering settings, a subject that makes me irrational on a regular basis. :-)

Time to Change the Settings

The rendering settings will let you select a range of sim detail between some minimum and maximum value. These values are based on the software, not hardware - because we don't actually know how much load any given hardware can support (and with the interaction between settings, finding such a cap is basically impossible). We can only give you a range of choices and let you pick ones that work well.

When a new version of the sim comes out, we sometimes have to recalibrate the settings. If the minimum features the sim can support increase, the minimum setting will be mapped to a new, more expensive behavior. And if the maximum detail the sim can present has increased, the maximum setting will be similarly remapped. We don't have much choice - if we need more "range" on the slider we have to recalibrate it.

I Can't Max Them Out

Here's where human behavior comes in. Humans make decisions based on the relative comparison of easily compared things. Given properties that are harder to measure and easier to measure, we'll pick the easier one. Given a choice of a trip to Rome, a trip to Rome with free breakfast, and a trip to Paris, we'll pick Rome with the free breakfast, opting for the easy to measure relative value. (Is the difference between a trip to Paris and Rome really less than the value of a breakfast? Probably not, but it's a lot harder to evaluate.)

So when we recalibrate the settings, we inevitably hear this complaint:

"I used to be able to set the sliders to the maximum setting and now I can't."

Previously I would have said "Why the hell do you care?!?!" -- if the new slider's 50% position looks the same as the old slider's 100% position, why not just set it to 50% and go home happy?

But of course that's not how we think - the immediately comparable is of immediate concern. Ironically we could make the sim less useful but more pleasing by limiting the maximum range of the sliders. Now more users could feel the joy of having everything "set on max" even if the ultimate utility of the sim is reduced.
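
The 50%-versus-100% equivalence is just a change of scale. A minimal sketch, assuming a slider that maps its position linearly onto a range; the function names and the tree densities are invented for illustration and are not X-Plane's real settings.

```python
def slider_to_value(position, minimum, maximum):
    """Map a slider position in [0, 1] to an actual detail level."""
    return minimum + position * (maximum - minimum)

def value_to_slider(value, minimum, maximum):
    """Inverse mapping: which slider position produces this detail level?"""
    return (value - minimum) / (maximum - minimum)

# Hypothetical: the old slider topped out at 500 trees per square km,
# and the recalibrated slider tops out at 1000.
old_max_detail = slider_to_value(1.0, 0, 500)
new_position = value_to_slider(old_max_detail, 0, 1000)
print(old_max_detail, new_position)  # same scenery, slider now sits at the halfway mark
```

Nothing was lost in the recalibration; only the label on the same amount of detail changed.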

This One Goes To 11

I'm not sure there's a way around this. The best suggestion I've heard so far is that if we could attach some kind of units to the settings, then at least there would be a quantitative indication that the user isn't losing some perceived value. But I suspect that even this misses the point; it doesn't matter that you're still getting 500 trees per square km - what matters is that you are getting the most you possibly can! (Perhaps this psychology also explains why people like to overclock.)

Austin tried to fight the psychology of "maximum sliders" by naming all of our settings absurd things. Ever wonder why "default" is the lowest object setting, and we almost immediately jump into "extreme", "too many", "insane", etc.? He was trying to fight a losing battle against relative expectations. The natural human behavior is to pick some relative position for calibration, and based on that, every user who has to put objects below the center setting is going to be unhappy about having to use "lower than average" settings. Austin's naming convention may be silly, but it does actually do a little bit to fight this.

Food for thought: how does having multiple levels of reflections change user expectations?

Monday, June 02, 2008

The Cargo Cult of Preferences

In a previous post I said that our tech support guys will trouble-shoot the most likely problems first (based on what we see in our entire user base) - they're playing the odds.

Well, a lot of our users do too.  Over and over and over I see the recommendation "delete your preferences" as a cure for a wide variety of strange symptoms.  And a lot of the time deleting preferences works.

I fear that deleting preferences has become a bit of a Cargo Cult, that is, a ritual induced to fix the mysterious beast that is X-Plane without consideration to why X-Plane is broken.  If the fix works, the previous problem is ignored.

Now here's the thing: preferences files are relatively small and easy to read!  And they're really easy to save.

So next time you have a problem and consider deleting the preferences, simply move them outside your Resources/preferences folder to the desktop and restart.

If the problem goes away, you can then delete the newly generated (clean) preferences and put the old funky ones back.

If you then truly find a situation where one preferences file causes the problem, you can look at what's actually different and file a real bug. (Unix nerds: most of the preferences files are text and can be "diffed".)
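
If you don't have a diff tool handy, something along these lines does the job. It is a sketch using Python's standard difflib module; the setting names in the example are placeholders, not real X-Plane preference keys.

```python
import difflib

def changed_settings(old_lines, new_lines):
    """Return only the lines that differ between two text preferences files.

    Lines prefixed with "- " exist only in the old file, "+ " only in the new one.
    """
    return [
        line
        for line in difflib.ndiff(old_lines, new_lines)
        if line.startswith(("- ", "+ "))
    ]

# Placeholder contents standing in for the "funky" and the clean preferences.
funky_prefs = ["trees 3", "anti_alias 4"]
clean_prefs = ["trees 0", "anti_alias 4"]

for line in changed_settings(funky_prefs, clean_prefs):
    print(line)
```

To run it on real files, read each one into a list of lines first (for example with `pathlib.Path(...).read_text().splitlines()`) and pass the two lists in. The handful of lines that come back are what belongs in the bug report.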

At this point, almost every option in the preferences file has a user interface item, so if the preferences file causes the sim to run poorly, there should be a setting that has been changed that can be identified.  Screenshots of the other airplanes, weather and rendering settings before and after the prefs might provide another quick way to compare what has changed. Control-period will take screenshots when dialog boxes are shown.

(Remember that the effect of preferences on framerate varies a lot with hardware.  There may be some preferences that slow fps a lot but do not make an obvious change in what you see "out the window".  By comparing two rendering settings screenshots you might find something subtle that changed.)

Sunday, June 01, 2008

Limits On Texture Paging

I seem to be in a philosophical mood these days with my blog posts...thought for the day: the human mind easily goes from the specific to the general. Our brains are generalizing machines, pattern matchers finding the rule in the noise.

My preference in creating new scenery-system features is to make them very limited, and my reasoning is: our brains don't go backward very well.  We do not go from the general to the specific.

Now you might think: when making a scenery-system addition, the best thing would be to have a general feature, more useful because it can be used everywhere.  But I say: the most important thing is to fully understand the feature - otherwise the feature comes out buggy. 

(Consider the piles and piles of bugs and weird behaviors that you get when combining OBJ animation with OBJ hard surfaces.)

Since the human brain doesn't go from general to specific well, it is hard to start with a rule ("let's allow feature X in all parts of the scenery system") and comprehensively derive all of the implications; it is human nature to be surprised later by some unintended side-effects.

It is always easier to extend a feature later to its natural full implications than to declare certain uses illegal later, after authors have planned or started to use the feature in that way.  If the generalization of the feature makes sense, extending it is often quite painless.

Texture Paging - Scope For Now

Texture paging is the ability for X-Plane to raise and lower the resolution of scenery textures dynamically as you fly.  This means more VRAM used for nearby things and less for far away things.  In practical terms, this reduces VRAM used by orthophotos by down-sampling the far-away textures, making larger orthophoto scenery packages possible.  As you fly, the sim reloads some textures at higher resolutions and some at lower.  The cost of the feature is the load time while you fly, which burns up some extra CPU cores.
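
To get a feel for the VRAM savings, here is a toy sketch of distance-based resolution selection. Everything in it is an assumption made for illustration (the halve-per-doubling rule, the 5 km threshold, the 64-pixel floor, uncompressed 4-byte pixels); it is not how the X-Plane pager actually decides.

```python
def resolution_for_distance(full_size, distance_km, full_detail_km=5.0, min_size=64):
    """Halve the texture size each time the distance doubles past the threshold."""
    size = full_size
    threshold = full_detail_km
    while distance_km > threshold and size > min_size:
        size //= 2
        threshold *= 2
    return size

def vram_bytes(size, bytes_per_pixel=4):
    """VRAM for a square, uncompressed texture (mipmaps ignored)."""
    return size * size * bytes_per_pixel

near = resolution_for_distance(2048, 3)    # inside the threshold: full 2048
far = resolution_for_distance(2048, 12)    # two halvings: 512
print(near, far, vram_bytes(near) // vram_bytes(far))
```

Each halving of resolution cuts the memory to a quarter, so in this example the far tile costs one sixteenth of the near one.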

It is my hope that we will productize some very simple texture paging in the next major patch of X-Plane 9 (that would be 920, not 902).  But the usage will be pretty specific:
  • Texture paging will only be available for .ter and .pol textures (we can extend to other scenery types later if it makes sense).
  • Texture paging will require changing the .ter and .pol files (X-Plane will not automatically analyze your scenery to see what can be paged.)
  • Texture paging will not be available for ENV scenery.
  • If you share textures and texture page, the results will probably be really bad and cause chaos.  Be sure to use only one .ter or .pol file per texture (and reference that text file only once in your DSF definitions section) if you want sane paging.  We can extend paging to shared textures in the future, but for now orthophotos are the intended target.
I am also deferring work on dataref-driven textures; we'll get there eventually, and the infrastructure from the pager will make it easier.  But dataref-driven textures really need to be available in a lot more places - it's a bigger, more complex feature* and I can't keep adding scope to 920.

Make New Meshes!

While paging will be available for both overlays (using .pol files) and base meshes (using .ter files) I strongly, strongly recommend going the base-mesh .ter route.  RealScenery sent me their new "State of Washington" package to use as test material; I was pleasantly surprised at the high framerate.  Part of that comes from them using base meshes and not overlays. 

Overlays cause the sim to draw the scenery twice (first the old scenery, then your overlay), burning a lot of pixel shader and fill power.  Base meshes simply replace the old mesh which is at least twice as efficient.

(I'm just going to keep beating the dead horse of base meshes because I believe that the sooner everyone moves toward base meshes, the more bang for our hardware buck everyone gets.)

* In particular, remember that texture paging happens on threads.  But datarefs can come from plugins that are not threaded!  Insert anarchy here...

Friday, May 23, 2008

Drivers and Builds To Try

For those who posted comments, sorry it took so long to moderate them - for some reason my spam filter decided that notifications of comments are, well, spam, so I just found them now. I should have known people would have jumped into a Vista-bashing thread. :-)

There is an X-Plane 9.02 beta 1 posted - like 901 we've been pretty quiet about this, but you can get it by enabling "get betas" and running the X-Plane updater. Please give it a try. Like 901 it is a small change for the purpose of localization, but it actually has an interesting feature pair:
  • True-type fonts and
  • Unicode-aware.
This is part of some rework we did to provide better language support. So...you should be able to run X-Plane no matter what weird characters* are in your folder names, name your airplanes funny things, and see diacritical marks. 902 uses a font that provides all of the Latin and Greek/Cyrillic code pages.

Also I have heard reports of improvements based on drivers:
  • nVidia has 175.16 drivers out and they apparently address "stuttering" issues. The stuttering issue has been on my list to investigate because it happens under Windows but not Linux. If you have stuttering performance on high-end NV hardware, particularly with forests and Windows, please try 175.16 and let me know how it goes.
  • ATI has released Catalyst 8-5. Catalyst 8-3 and 8-4 were causing "incomplete framebuffer" errors for some users, but I was unable to reproduce it (after spending a good day trying to jam Windows XP onto an iMac already crammed with Windows and Linux....yet another episode of a Tale of Three Operating Systems). Anyway, at least one user reported the issue as fixed in Cat 8-5, so if you are having problems, please try the new driver set.
As always, bugs in the X-Plane beta should go to our bug report form, on the X-Plane contacts page.

* You might accuse me of being American-centric in calling diacritical marks and Greek letters weird - but the truth is I am computer-centric...anything that is not in the original ASCII set is weird. :-)

Monday, May 12, 2008

Multi-Core Texture Loading

In a previous post I discussed the basic ideas behind using multiple threads in an application to get better performance out of a multi-core machine.

Now before I begin, I need to disclaim some things, because I get very nervous posting anything involving hardware. This blog is me running my mouth, not buying advice; if you are the kind of person who would be grumpy if you bought a $3000 PC and found that it wouldn't let you do X with X-Plane (where X includes run at a certain rendering setting, framerate, or make your laundry) my advice is very simple: don't spend $3000. So...
  • I do not advocate buying the biggest fastest system you can get; you pay a huge premium to be at the top of the hardware curve, particularly for game-oriented technologies like fast-clock CPUs and high-end GPUs.
  • I do not advocate buying the Mac Pro with your own money; it's too expensive. I have one because my work pays for it.
  • 8 cores are not necessary to enjoy X-Plane. See above about paying a lot of money for that last bit of performance.
Okay...now that I have enough crud posted to be able to say "I told you so"...

My goal in reworking the threading system inside X-Plane for 920 (or whatever the next major patch is called) is, among other things, to get X-Plane's work to span across as many cores as you have, rather than across as many tasks as are going on. (See my previous post for more on this.)

Today I got just one bit of the code doing this: the texture loader. The texture loader's job is to load textures from the hard drive to the video card (using the CPU, via main memory) while you fly. In X-Plane 901 it will use up to one core to do this, that core also being shared with building forests and airports.

With the new code, it will load as many textures at a time as it can, using as many cores as you have. I tested this on RealScenery's Seattle-Tacoma custom scenery package - the package is an ENV with about 1.5 GB of custom PNGs, covering about half of the ENV tile with non-repeating orthophotos.

On my Mac Pro, 901 will switch to KSEA from LOWI in about one minute - the vast majority of the time is spent loading about 500 PNG files. The CPU monitor shows one core maxed out. With the new code, the load takes fourteen seconds, with all eight cores maxed out.

(This also means that the time from when the scenery shifts to when the new scenery has its textures loaded would be about fourteen seconds, rather than a minute, which means very fast flight is unlikely to get to the new area before the textures are loaded and see a big sea of gray.)
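
For the curious, here is what those two timings work out to. The sketch takes "about one minute" as 60 seconds and, as a loud assumption, uses Amdahl's law (a standard model, not something measured here) to estimate how much of the load must be running in parallel.

```python
def speedup(old_seconds, new_seconds):
    """How many times faster the new code is."""
    return old_seconds / new_seconds

def parallel_efficiency(speedup_factor, cores):
    """Fraction of the ideal (one-per-core) speedup actually achieved."""
    return speedup_factor / cores

def amdahl_parallel_fraction(speedup_factor, cores):
    """Parallel fraction p implied by Amdahl's law: S = 1 / ((1 - p) + p / cores)."""
    return (1 - 1 / speedup_factor) / (1 - 1 / cores)

s = speedup(60, 14)                                # about 4.3x
print(round(s, 2))
print(round(parallel_efficiency(s, 8), 2))         # about 0.54 of ideal
print(round(amdahl_parallel_fraction(s, 8), 2))    # about 0.88 of the work in parallel
```

So eight cores bought a little over 4x, not 8x. Under the Amdahl model that is consistent with roughly one eighth of the load still being serial, although in practice disk and memory bandwidth could just as easily be the limit.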

Things to note:
  • Even if we don't expect everyone to have eight cores, knowing that the code can run on a lot of cores proves the design - the more the code can "spread out" over a lot of cores, the more likely the sim will use all hardware available.
  • Even if you only have two or four cores, there's a win here.
  • Texture load time is only a factor for certain types of scenery; we'll need to keep doing this type of work in a number of cases.
This change is the first case where X-Plane will actually spread out to eight cores for a noticeable performance gain. Of course the long-term trend will be more efficient use of multi-core hardware in more cases.

Saturday, May 03, 2008

A Tale of Three Operating Systems, Part II (Why You Need Bootcamp)

A while ago I put three operating systems on my laptop. With the Mac Pro I've done the same thing - it's a huge win to be able to cover such a wide swath of OS/GPU/CPU combinations with fewer machines. Last time it was OS X 10.4, Windows XP SP2, and Ubuntu 6.06. This time I repeated the process with OS X 10.5.2, Windows Vista RTM, and Ubuntu 8.04. Random observations:
  • Linux really just keeps getting stronger. I've always been a bit skeptical about Linux as a desktop environment, particularly as a Windows/Mac developer (that is to say, I'm spoiled by free high quality IDEs and debuggers that require no setup to use the platform SDK, comprehensive platform documentation in one location, etc.). But Linux installation is becoming more plug & play and trouble-free each time I make myself a live CD.
  • Windows Vista is a train wreck. I feel a little bit lame blogging this, as taking pot-shots at Vista is sort of like shooting fish in a barrel. But the contrast between Ubuntu, which has become easier to use over a year and a half, and Windows, which has not, is stark.
  • There are some positive things to say about Vista. The partition-aware installer is a real convenience for multi-booters. And once you figure out where everything has been moved to and go back to "classic" views, the OS is tolerable. But you'll still find plenty of things that will make you want to tear your hair out. My recommendation: stick with XP. (Duh.)
Now on to the performance numbers. These numbers are the Xp900 time demo fps tests 1, 2 and 3. Each set of 3 numbers is from the three phases.
     Test 1        Test 2        Test 3
MAC   49/ 60/ 62    38/ 43/ 44    21/ 20/ 21
WIN  121/128/133   114/115/119    77/ 75/ 82
LIN  143/144/157   130/123/132    92/104/113
That's not a typo. Linux is beating out Vista, but both are absolutely killing OS X. What's going on here? I don't know. But there appears to be something that isn't well optimized in the GeForce 8 drivers on OS X.

I suspect Apple will close this gap eventually; don't bother asking me for status information on this because if they ever tell me what's going on, I'll be bound by NDA not to tell you.

For now my recommendation is: consider dual-booting into Linux - it's pretty easy to install Ubuntu and you'll get great X-Plane performance. With good drivers, the Mac Pro and 8800 are just monstrous.

Friday, April 25, 2008

Threads and Cores

Now that multi-core machines are mainstream, you'll hear a lot of talk about "threads". What is a thread, and how does it relate to using more cores?

Definitions

A "core" is an execution unit in a CPU capable of doing one thing. So an 8-core machine might have two CPUs, each with four cores, and it can do eight tasks at once.

A "thread" is a single stream of work inside an application - every application has at least one thread. Basically a two-threaded application can do two things at once (think of driving and talking on your cellular phone at the same time).

Now here's the key: only one core can run a thread at one time. In other words, if you have an eight core machine and a one-thread application, only one core can run that application, and the other seven cores have to find something else to do (like run other applications, or do nothing).

Two more notes: a thread can be "blocked" - this means it's waiting for something to happen. Blocked threads don't use a core and don't do anything. For example, if a thread asks for a file from disk, it will "block" until the disk drive comes up with the data. (By CPU standards, disk drives are slower than snails, so the CPU takes a nap while it waits.)

So if you want to use eight cores, it's not enough to have eight threads - you have to have eight unblocked threads!

If there are more unblocked threads than cores, the operating system makes them take turns, and the effect is for each of them to run slower. So if we have an application with eight unblocked threads and one core, it will still run, but at one eighth the speed of an eight core machine.

It's not quite that simple, there are overheads that come into play. But for practical purposes we can say:
  • If you have more unblocked threads than cores, the execution speed of those threads slows down.
  • If you have more cores than unblocked threads, some of those cores are doing nothing.
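
The two rules of thumb above fit in a few lines. This sketch deliberately ignores the overheads just mentioned; the function names are made up for illustration.

```python
def relative_thread_speed(unblocked_threads, cores):
    """Speed of each unblocked thread relative to having a core to itself."""
    if unblocked_threads <= cores:
        return 1.0
    return cores / unblocked_threads

def idle_cores(unblocked_threads, cores):
    """Cores left with nothing to do."""
    return max(0, cores - unblocked_threads)

print(relative_thread_speed(8, 1))  # eight busy threads, one core: 0.125 each
print(relative_thread_speed(3, 2))  # three busy threads, two cores: about 0.67 each
print(idle_cores(1, 8))             # one busy thread on eight cores: 7 cores idle
```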
Trivial Threads

When a thread is blocked, it does not use any cores. So while X-Plane has a lot of threads, most of them are blocked either most or all of the time. For all practical purposes we don't need to count them when asking "how many cores do we use". For example, G1000 support is done on a thread so that we keep talking to the G1000 even if the sim is loading scenery. But the G1000 thread spends about 99.9% of its time blocked (waiting for the next time it needs to talk) and only 0.1% actually communicating.

What Threads Are Floating Around

So with those definitions, what threads are floating around X-Plane? Here's a short list from throwing the debugger on X-Plane 9.0. (Some threads may be missing because they are created as needed.)
  • X-Plane's "main" thread which does the flight model, drawing, and user interface processing.
  • A thread that can be used to debug OpenGL (made by the video driver, it blocks all the time).
  • Two "worker" threads that can do any task that X-Plane wants to "farm out" to other cores. (Remember, if we want to use more cores, we need to use more threads.)
  • The DSF tile loader (blocks most of the time, loads DSF tiles while you fly).
  • At least 3 threads made by the audio driver (they all block most of the time).
  • At least four threads made by the operating system's user interface code (they block most of the time).
  • The G1000 worker thread (blocks most of the time, or all the time if you don't have the G1000 support option).
  • The QuickTime thread (only exists when QuickTime recording is going on).
So if there's anything to take away from this it is: X-Plane has a lot of threads, but most of them block most of the time.

Core Use Now


So how many cores can we use at once? We only need to look at threads that aren't blocked to add it up. In the worst flying case I can think of:
  1. The main thread is rendering while
  2. The DSF tile loader is loading a just-loaded tile while
  3. One of the pool threads is building forests while
  4. You are recording a QuickTime movie (so the QT thread is compressing data).
Yep. If you really, really put your mind to it, you can use four cores at once. :-) Of course, two cores is a lot more common (DSF tile loading or forests, but not both at once, and no QuickTime).

Core Use In the Future

Right now some of X-Plane's threads are "task" oriented (e.g. this thread only loads DSF tiles), while others can do any work that comes up (the "pool threads", it's like having a pool car at the company, anyone can take one as needed). The problem with this scheme is that sometimes there will be too many threads and sometimes too few.
  • If you have a dual-core machine, having forests building and DSF loading at the same time is bad - with the main thread that's three threads, two cores; each one runs at two-thirds speed. But you don't want the main thread to slow down by 33%, that's a fps hit.
  • If you have a four-core machine, then when the DSF tile is not loading, you have cores being wasted.
Our future design will allow any task to be performed on a "pool thread". The advantage of this is that we'll execute as many tasks as we have cores. So if you have a dual-core machine, when a DSF tile load task comes along while forests are being built, the one pool thread will alternate tasks, leaving one core to do nothing but render (at max fps). If you have a four-core machine, the DSF load and forests can run at the same time (on two pool threads) and you'll have faster load times.*

* Who cares about load time if you're flying? Well, if you crank up the settings a lot and fly really fast, the loader can get behind, and you'll see trees missing. X-Plane is always building trees in front of you as you fly and deleting the ones behind you. So using more cores to build the forests faster means you're less likely to fly right out of the forest zone at high settings.
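
Here is the general shape of a "pool thread" design, sketched with Python's standard thread pool. This is not X-Plane's code: the task functions are hypothetical stand-ins, and sizing the pool at "cores minus one" is my reading of the dual-core example above. (Python threads also won't run CPU-heavy work truly in parallel, so read this for the scheduling structure only.)

```python
import os
from concurrent.futures import ThreadPoolExecutor

def pool_size(core_count):
    """Leave one core for the main (rendering) thread; always keep one worker."""
    return max(1, core_count - 1)

def run_background_tasks(tasks, core_count=None):
    """Run every task on a shared pool; extra tasks queue until a worker is free."""
    if core_count is None:
        core_count = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=pool_size(core_count)) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]

# Hypothetical stand-ins for real background work.
def load_dsf_tile():
    return "tile loaded"

def build_forests():
    return "forests built"

# Dual-core: one worker takes the tasks in turn. Four cores: they can overlap.
print(run_background_tasks([load_dsf_tile, build_forests], core_count=2))
print(run_background_tasks([load_dsf_tile, build_forests], core_count=4))
```

The point of the design is that no task owns a thread, so the number of busy threads follows the number of cores rather than the number of kinds of work.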

Sunday, March 02, 2008

Hardware Profiles

X-Plane 9 currently recognizes (roughly) three categories of graphics hardware:
  • Non-Pixel-Shader hardware (GeForce 2,3,4, Radeon 7000-9200)
  • First-Generation Shader Hardware (GeForce FX 5nnn, Radeon 9500-9800, X300-X600)
  • Later-Generation Shader Hardware (GeForce 6, 7, 8, Radeon X200, Radeon X700+)
That first bucket is pretty simple: those cards don't support programmable pixel shaders (as we know them today) and can't run any shader effects. The "use pixel shaders" check box doesn't appear in the rendering settings.

The distinction between the later two is a little bit more subtle. Basically the first generation of pixel shader cards (the 9700 and friends) support only 96 instructions for each pixel shader; this puts us right on the edge of sometimes not being able to draw all of our effects; we have to simplify the water slightly to make it work. The next generation of chips (X850 and friends) doesn't have this limitation.

By comparison, while NVidia cards have been able to handle long shaders from day one, the GeForce 5's shader performance is really poor.

So we bucket all of these chips as "first-gen". When we detect this we:
  • Simplify shaders slightly (gets us out of trouble with the 9700).
  • Don't default to shaders being on when the sim is first booted (because the framerate will probably be unusably slow).
Even though the 9700 provided very usable shader performance in its day, by the standards of modern GPUs, this older chip isn't that fast, so it's probably for the best that we not enable reflective water by default on machines with these cards.

By comparison, X-Plane deals with almost all other capabilities on an a-la-carte basis; particular features are enabled if the right menu of hardware features is available. We do this to try to deal more flexibly with the wide variety of cards that are out there. Some examples:
  • You'll get hardware accelerated runway lights if your card supports pixel shaders and sprites (virtually all shader-enabled cards have sprites).
  • You'll get sun-glare effects if your card supports pixel counting (virtually all modern cards can do this).
  • The non-pixel-shader rendering code will show more detail if your card supports more texture units (this is only an issue with very old hardware).
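To make the "bucket plus a-la-carte" idea concrete, here is a small sketch. Every name in it is hypothetical (none of this is the sim's real code), and "shaders on by default for later-generation cards" is my inference from the first-gen rule above. I left the texture-unit case out because the text doesn't give a threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuCaps:
    # Hypothetical field names that mirror the capabilities named in the text.
    pixel_shaders: bool      # False for the non-pixel-shader bucket
    first_gen_shaders: bool  # True for the "first-gen" bucket
    sprites: bool
    pixel_counting: bool


def enabled_features(caps: GpuCaps) -> dict:
    """Map raw capabilities to features, one rule per feature."""
    return {
        "shader_checkbox_shown": caps.pixel_shaders,
        "simplified_shaders": caps.pixel_shaders and caps.first_gen_shaders,
        # Assumption: only later-generation shader cards boot with shaders on.
        "shaders_on_by_default": caps.pixel_shaders and not caps.first_gen_shaders,
        "hw_runway_lights": caps.pixel_shaders and caps.sprites,
        "sun_glare": caps.pixel_counting,
    }


# A first-generation shader card: shaders available, simplified, off by default.
first_gen = enabled_features(GpuCaps(True, True, True, True))
```

The point of the a-la-carte rules is that each feature asks only for what it needs, so an odd card with an unusual mix of capabilities still gets everything it can actually run.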
I've been looking over hardware profiles a lot lately, and I suspect that the next big "jump" in hardware will be the DX10-compliant cards (GeForce 8, Radeon HD). There's a lot of fine print in what the various cards can do between all of the pre-DX10 cards; at some point when we decide what menu of features we'll require for rendering, we need to simplify.

My guess is that when we start to have "really advanced" pixel shaders that require hardware more sophisticated than what we need now, we'll simply require a DX10 card. Otherwise we'll have to sort through 8 different profiles of fine print, only to attempt to partially support cards that probably won't be fast enough anyway.

(That is to say, a feature is only useful for us if it can run reasonably quickly. It doesn't make sense for us to try to make a special "simplified" version of a rendering feature for, say, the X850 if every X850 is going to turn it off every time for framerate reasons.)

If any of this turns into hardware buying advice, I suppose it would be this:
  • If you are deciding between a DX10 and DX9 card (e.g. between the HD2400 and X1900, or GeForce 8600 vs 7900) go for the newer generation DX10 cards (HD or GeForce 8); if the card has decent performance you'll also be setting yourself up for future features.
  • As always, pay attention to the fine print of the model numbers, particularly the configuration. Lower model number cards basically have fewer parallel components than higher number ones and that leads directly to lower framerate.
I see an ad online for the GeForce 8500 for $70 and 8600 for $90. But if you look at the links below, you'll see that the 8500 has only half the shaders of the 8600 - that's going to be a huge performance difference for $20.

(So the moral of this story is: try to get an HD or GeForce 8 card, but don't dip into the really low end cards because they're stripped down too far for X-Plane use.)

These pages (NV, ATI) on Wikipedia list specs for a whole pile of cards and can be useful to decode the fine print.

Thursday, February 07, 2008

GeForce 7 and Water Performance

A number of Windows and Linux GeForce 7 users have discovered that the command-line option --no_fbos improves their pixel-shader framerate a lot. Windows and Linux Radeon HD users have also discovered that --no_fbos cleans up artifacts in the water. Here's what's going on, at least as far as I can tell. (Drivers are black boxes to us app developers, so all we can do is theorize based on available data and often be proved wrong.)

Warning: this is going to get a bit technical.

FBO stands for framebuffer object, and simply put, it's an OpenGL extension that lets X-Plane build dynamic textures (textures that change per frame) by drawing directly into the texture using the GPU. Without FBOs we have to draw to the main screen and copy the results into the dynamic texture. (You don't see the drawing because we never tell the card "show the user".)

We like FBOs for a few reasons:
  • Most importantly, FBOs allow us to draw big images in one pass even if the screen is small. For example, if we have a 1024x1024 dynamic texture but the screen is 1024x768, then without FBOs we have to draw the image in two parts and stitch it together. That sucks. With FBOs we can just draw straight to the texture and not worry about our "workspace" being smaller than our texture. This is going to become a lot more important for future rendering features where we need really-frickin' big textures.
  • It's faster to draw to the texture than to copy to it.
  • If you're running the sim with FSAA, then we end up using FSAA to prepare all of those dynamic textures. In virtually all cases, we don't need the quality improvements of FSAA, so there's no point in taking the performance penalty. When we render right into the texture, FSAA is bypassed and we prep our dynamic textures a lot faster.
Since copying to a texture from the screen predates these new-fangled FBOs by several years, most drivers can copy from the screen to the texture very quickly; however we have hit at least one case where FBOs are much faster than copy-from-screen. That's really a rare bug, and as you'll see below, we see more weird behavior with FBOs.
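The stitching cost from the first bullet above is easy to put numbers on. This is just arithmetic under the assumption that the copy path draws the texture in screen-sized tiles; the function names are mine, not anything in the sim.

```python
import math


def copy_passes(tex_w, tex_h, screen_w, screen_h):
    """Screen-sized tiles needed to cover a texture when drawing via the screen."""
    return math.ceil(tex_w / screen_w) * math.ceil(tex_h / screen_h)


def passes_needed(tex_w, tex_h, screen_w, screen_h, use_fbo):
    """With an FBO we draw straight into the texture, so one pass is enough."""
    if use_fbo:
        return 1
    return copy_passes(tex_w, tex_h, screen_w, screen_h)


# The example from the text: a 1024x1024 texture on a 1024x768 screen.
without_fbo = passes_needed(1024, 1024, 1024, 768, use_fbo=False)
with_fbo = passes_needed(1024, 1024, 1024, 768, use_fbo=True)
```

Bump the texture to 2048x2048 on the same screen and the copy path needs six tiles, which is why bigger textures make FBOs matter more.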

When do we use FBOs vs. copying? Well, it's pretty random:
  • Pixel shader reflective water and fog use FBOs.
  • Cloud shadows and the sun reflection when pixel shaders are off do not use FBOs.
  • The airplane panel uses FBOs if the panel is 1024x1024 or smaller; if the panel is larger than 1024x1024 we draw from the screen and patch things together. So the P180 and the C172 are using different driver techniques!!
When you run X-Plane with --no_fbos, you instruct X-Plane to ignore the FBO capability of the driver, and we use copy-from-screen everywhere.

Mipmapping

There is one more element: mipmapping. A mip map is a series of smaller versions of a texture. Mipmapping allows the video card to rapidly find a texture that is about the size it needs. Here's an example: imagine you have a building with a 128x128 texture. If you park your plane by the building, the building might take up about 100x100 pixels on the screen; your 128x128 texture is a good fit.

Now taxi away from the building and watch it get smaller out your rear window. After a while the building is only taking up 8x8 pixels. What good is that 128x128 texture? It's much too big for the job. With mipmapping, the card has a bunch of reduced-size versions of your texture lying around...64x64, 32x32, 16x16, 8x8, 4x4, 2x2, 1x1. The video card realizes the building is tiny and grabs the 8x8 version.

Why not just use the 128x128 texture? Well, we'd only have two options with this texture:
  1. Examine all 16384 pixels of the texture to compute the 64 pixels on screen. That sucks...we're accessing VRAM 256 times for each pixel. Accessing VRAM is slow, so this would kill performance.
  2. Simply pick 64 pixels out of the 16384 based on whatever is nearby. This is what the card will do if mipmapping is not used (because option 1 is too slow) and it looks bad. Those 64 pixels may not be a good representation of the 16384 that make up your building side.
So mipmapping lets the video card grab a small number of pixels that still capture everything we need to know about the building at low res.
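The building example works out like this. The sketch below is a simplification (real hardware picks levels per pixel and can blend between two of them), and the function names are made up for illustration.

```python
def mip_chain(size):
    """Edge lengths of a square power-of-two texture's mip levels, largest first."""
    sizes = [size]
    while sizes[-1] > 1:
        sizes.append(sizes[-1] // 2)
    return sizes


def pick_level(size, on_screen):
    """Smallest level that still has at least one texel per screen pixel."""
    candidates = [s for s in mip_chain(size) if s >= on_screen]
    return candidates[-1] if candidates else size


def texels_per_pixel(tex_edge, on_screen):
    """How many texels land on each screen pixel if we sample this level fully."""
    return (tex_edge * tex_edge) / (on_screen * on_screen)


chain = mip_chain(128)             # 128 down to 1
parked = pick_level(128, 100)      # building fills about 100x100 pixels
far_away = pick_level(128, 8)      # building fills about 8x8 pixels
cost_without_mips = texels_per_pixel(128, 8)
cost_with_mips = texels_per_pixel(far_away, 8)
```

Without mipmaps, an exact answer for the far-away building would mean reading 256 texels per screen pixel; with the 8x8 level it is one.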

We don't mipmap our dynamic textures very often; the only ones that we do mipmap are the non-pixel-shader sun reflections and the pixel-shader sun reflections.

ATI

As far as we can tell, the current ATI Catalyst 8.1 drivers do not generate mipmaps correctly for an FBO-rendered texture. This is why without --no_fbos ATI users on Windows or Linux see very strange water artifacts. --no_fbos switches to the copy path, which works correctly.

At risk of further killing my track record of driver bugs in v9, we do think this is a bug. We have good contact with the ATI Linux driver guys so I have hopes of getting this fixed.

nVidia

It appears that the process of creating mipmaps for FBO textures is not accelerated by the hardware on the GeForce 7 GPU series. This is why GeForce 7 users are seeing such poor pixel shader performance, while GeForce 8 users aren't having problems.

Now poor performance is not a bug; there's nothing in the OpenGL spec that says "your graphics card has to do this function wickedly fast". Nonetheless, what we're seeing now is unusably slow. So more investigation is needed -- given that the no-FBO case runs so quickly, I suspect the hardware itself can do what we want and it's just a question of the driver enabling the functionality. But I don't know for sure.

Wednesday, February 06, 2008

The Limits of Orthophotos and Meshes in X-Plane

I get asked a lot about the limits of meshes and orthophotos in X-Plane. I'll try to answer this, but the answer isn't as simple as most people expect.

Texture Limits and Orthophotos

The maximum single texture size in X-Plane 8 is 1024x1024, and in X-Plane 9 it is 2048x2048.

I believe the maximum number of unique custom orthophotos that can be attached to a single DSF is at least 32768.

In practice, that number is pretty useless because X-Plane loads all textures for a DSF at the highest user-allowed res when the DSF is loaded. That means you tend to load a lot of textures. Every system is different and drivers have a lot to do with RAM efficiency, but generally you'll run out of virtual address space and crash the sim before you can attach 32768x2048x2048 of pixels.
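To see how far apart those numbers are, here is the arithmetic. I'm assuming DXT1-compressed DDS (4 bits per pixel), ignoring mipmaps, and using 2 GiB as the usual user address space for a 32-bit Windows process without /3GB; the function name is just for illustration.

```python
GIB = 1024 ** 3


def texture_bytes(count, edge, bits_per_pixel):
    """Total bytes for `count` square textures, ignoring mipmaps and overhead."""
    return count * edge * edge * bits_per_pixel // 8


# Every orthophoto slot filled with a 2048x2048 texture.
all_dxt1 = texture_bytes(32768, 2048, 4)           # compressed, 4 bits/pixel
all_uncompressed = texture_bytes(32768, 2048, 32)  # RGBA, 32 bits/pixel

# How many compressed 2048x2048 textures would fill an assumed 2 GiB address
# space if nothing else (mesh, objects, the sim itself) needed any of it.
one_dxt1 = texture_bytes(1, 2048, 4)
fit_in_2gib = (2 * GIB) // one_dxt1
```

So the container limit is about 64 GiB of compressed pixels, while the address space gives out somewhere around a thousand textures, and in practice well before that.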

X-Plane has no limits on how the texturing is applied - that is, you can use your 2048x2048 texture to cover an entire tile or a single meter. So again, the limiting factor on the resolution of your orthophotos is how much total area you want to cover and how much RAM you can spend (remember RAM is also used for mesh complexity, 3-d models, etc.).

You do not need to have enough VRAM to hold all loaded orthophotos; the video driver will page the textures into VRAM. Virtual address space is the limiting factor. How far you push it depends on a lot of subjective things:
  • If you expect your users to also run with a lot of trees, 3-d objects, cars on roads, and some plugins, you can't use a lot of RAM.
  • If you expect your users to have /3GB in their boot.ini and use nothing but your add-on, you can use a lot more RAM.
Generally the size of the DDS texture on disk is a good proxy for the virtual memory that is required to hold your textures.

It should be noted that these limits on texturing (due to X-Plane blindly loading a lot of stuff at once) affect all scenery types: objects, draped polygons, very complex airplanes, plugins, and not just terrain mesh orthophotos.

Getting Past the Texture Limit

It will take a future extension to the rendering engine to get past the current limits. Basically X-Plane will have to load textures at lower resolutions when they're farther away. I don't know when that is coming, but when it happens, it will increase the total amount of image data a DSF mesh can contain, because the limiting factor will be how much data is in the small area the user is looking at (since the rest can be stored at much lower res for far-away views). At that point the limiting bottleneck will be resolution (smaller means more data at once), not total image data.

Mesh Limits

Unfortunately, limits to the mesh are even more vague than limits to texture usage. X-Plane uses an adaptive mesh - basically you can put your vertices wherever you want. So the highest resolution you can achieve might be much smaller than 1 meter resolution, but you can only do this for a small area before the total mesh size gets too big. But this is okay - the intention of DSF is to let you put a lot of detail where you need it.

I believe that once again memory provides the first limitation to the mesh. That is - you'll run out of memory loading your insanely huge mesh long before you hit a limit to the DSF container structure. And once again, even the RAM limit isn't a hard limit because that virtual address space is shared with textures. Your mesh density limits actually go down when your textures go up because it's a zero-sum game.

Estimating Memory

Here are some ideas on how to estimate your memory footprint:
  • Run X-Plane over ocean to get an idea of the baseline memory use that the sim needs without extra scenery.
  • Load your mesh without textures (move the textures away) to find the cost of the mesh itself. (I am going on the assumption here that you can rescale your mesh using whatever mesh generation tool you're using).
  • The size of DDS textures is a good proxy for the memory used.
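If you want a quick number for that last bullet, something like this will total up the DDS files in a scenery pack. It is only a sketch: the folder you pass in is up to you, and the result is a proxy for texture memory, not a measurement of what the sim actually allocates.

```python
from pathlib import Path


def dds_bytes(scenery_dir):
    """Total size in bytes of all .dds files under scenery_dir (recursive)."""
    return sum(
        path.stat().st_size
        for path in Path(scenery_dir).rglob("*")
        if path.is_file() and path.suffix.lower() == ".dds"
    )


def dds_megabytes(scenery_dir):
    """Same total, in MiB, for easier comparison against your RAM budget."""
    return dds_bytes(scenery_dir) / (1024 * 1024)
```

Add the baseline and mesh-only figures from the first two bullets to this total to get a rough overall footprint.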

Saturday, January 26, 2008

Performance Wrap-up (for now)

The story on X-Plane performance is never over, but the chapter that is 9.00 pretty much is. I think we'll be RC in the next build (if all goes well). Certainly a lot of the things that are still performance "problems" will require changes larger than we can do in a late beta.

I say problems in quotes because a lot of what's been reported lately is in the form of: a huge screen res + a lot of shaders + a lot of FSAA = slow fps. That's not really a bug, that's an engine limitation. Now I want to make the engine as fast as possible, and a lot of this pixel shader stuff is new to 9.0, so if our track record for tuning stays the way it was for v8, we'll probably get some efficiency improvements later.

But unfortunately there's an underlying limitation: the new water and fog both cause the rendering engine to consume significantly more hardware resources than it would otherwise. Turn them on and you get prettier pictures at a price.

Just to post a few general things I've found:
  • X-Plane 9 will tell you where your GPU really stands. GPUs that were very adequate for X-Plane 8 (like the GeForce 6600 GT) will turn out to have nothing left in reserve for v9, while GPUs that were bored in v8 (the GeForce 8800 GTX for example) will show what they really have.
  • Generally the cost of going from no shaders to shaders with water reflections of "none" and no volumetric fog should be very low if your screen res and FSAA don't add up to something crazy (like 16x FSAA at 2048x2048).
  • If you do have serious performance hits, try --no_fbos in the command-line; some drivers seem to have trouble with them.
  • The P180's virtual cockpit is a lot more expensive than the other ones, because it has a huge panel that is used in 3-d. We'll hopefully rebuild the cockpit at some point.
  • Turning water reflections to "complete" is very expensive. Watch the water and use the lowest setting that looks good. You don't need complete reflections if there are a lot of waves!
  • Shaders, FSAA, and screen size are all pulling from the same set of resources - be careful about cranking up all three.
  • Check your v-sync - a lot of users whose vsync clamped them at 60 fps in v8 will be clamped at 20 in v9.
  • Do your testing with texture res set low, then crank texture res later; pixel shaders also require the allocation of VRAM that can't be purged (for things like reflection images) so running out of VRAM can show up in some weird ways performance-wise.
  • The new Intel iMacs have serious performance problems with shaders on. This is due to driver limitations; given the much better performance under BootCamp, I expect the Mac performance to get better when the drivers are updated. For now I'd keep shaders off.
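About the v-sync bullet above: the jump from 60 to 20 fps makes sense if each frame has to wait for a whole number of monitor refreshes. The sketch below assumes plain double-buffered v-sync on a 60 Hz display (no triple buffering); the function name is made up.

```python
import math


def vsync_fps(raw_fps, refresh_hz=60):
    """Displayed framerate when every frame waits for a whole number of refreshes."""
    refreshes_per_frame = math.ceil(refresh_hz / raw_fps)
    return refresh_hz / refreshes_per_frame


fast_v8 = vsync_fps(200)  # plenty of headroom: clamped to the refresh rate
just_missed = vsync_fps(59)  # barely misses one refresh, so it waits for two
heavy_v9 = vsync_fps(25)  # needs three refreshes per frame
```

Under this model a sim that could render 25 fps unclamped shows 20 fps with v-sync on, and one that could do 59 shows 30.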
For now, please hold off on sending me performance reports. I just don't have time to address them. In the future I will try to solicit very specific performance data points that we need to check. Perhaps in the future we can also set up a database of fps-test results to have a more comprehensive idea of how the hardware does its job.

I expect future features to appear in v9 that further eat hardware; those features will have an off switch. You may have to pick and choose what graphics you enable; there is no free lunch here. I also expect new graphics cards to emerge that make the GeForce 8800 GTX seem quaint!

Tuesday, January 15, 2008

Simple Optimizations for Airplanes and Objs

Just a few basic things:

For airplanes where you don't want to show the PlaneMaker part (because you've rebuilt the plane visually using OBJs):
  • In X-Plane 9, set the parts to be invisible - this is faster than drawing them with a transparent texture.
  • In X-Plane 8, if the texture is transparent, please downsize it! A 1024x1024 texture that's fully transparent is just a waste of VRAM!
For any version, any object, avoid using ATTR_diffuse_rgb when possible. You can get the same effect by tinting your texture and save unnecessary state thrash.

Wednesday, January 09, 2008

X-Plane Water - Now and the Future

Randy forwarded me a very detailed email message from the features list about water. First a few notes:
  1. I don't read the features list - I'm only blogging this because Randy forwarded it my way.
  2. Procedural water (that is, procedural waves with the sky as a reflection) is a default shader option because when we first looked at it, it seemed to be "low cost". Some hardware really chokes on this option. But one trend that's clear: I have not seen any hardware that can do volumetric fog but has trouble with water. When it comes to expensive computations, vol-fog is the heavy effect. If your card can run with vol-fog even remotely well, you're not going to get any kind of fps boost by turning off water. So the question of whether procedural water can be optional is one of whether the water looks good (which I will discuss below), not one of performance.
  3. I've received plenty of emails about how we can "cheat" on the reflections to make them faster. To be honest, these ideas aren't very useful...you really need to know how the rendering engine works on a low level to find good ways to cheat on reflection quality.
(It's not enough to make the reflection texture less good looking, you have to do so in a way that makes it take less time to render! We've basically already taken all the optimizations we can - that's what that water reflection detail popup does. Keep it on a lower setting.)

So with that in mind I'd like to bring up three issues with the reflective water and give you some idea of the roadmap. I should say that when I list features as coming in the 9.xx or 10.xx time frame what I really mean is: some depend on new global scenery and some do not. It's possible that we may do a ton of water work in 9.1 or we may not do any more for five years. I can't really make a good prediction on future features after 9.0, except that they'll be, um, really cool. :-)

Water Weather Settings (9.00)

X-Plane currently provides a global wave height setting. X-Plane 9 beta 15 ignores this setting and simply creates calm water. This is simply a link-up between the physics and the shader that I had not gotten to until now. I believe that beta 16 will address this; the water wave height will match what you dial in. This still is only a baby step, but there will at least be a workaround for the most common complaint (the ocean looks like glass) in that you can set the wave height to 10 meters.

Water Properties (9.xx, 10.xx)

Now Peter brings up an important point: how the water looks varies with its location. First I must point out an architectural issue: you can't do a great job of computing "fetch" (open runs where wind builds up waves) in the sim because the area adjacent to the current water may not be loaded. That is, if X-Plane doesn't have the Pacific Ocean loaded, how can it know that the waves hammering San Francisco can be pretty big? So I have always viewed fetch as something to pre-compute into a DSF mesh.

(This does bring up an architectural issue...how can we have properties on open ocean with no DSF mesh? The answer might end up being that we do have to provide water-only DSFs for bays and inlets that are large enough to cover a whole tile.)

Now it turns out that (secretly) the DSFs have contained a very crude version of fetch since 8.20. In version 8 we didn't really have the shading power to visualize it (I did have some experimental code once) and in version 9 it is not yet hooked up. The fetch calculation isn't very good, but it does exist. In the long term, we've talked within the company about including bathymetric data (water depth) and even water properties like clarity and the color of the goo in the water. Improving the water metadata would be a next-global-scenery feature, and we don't recut global scenery very often.

(The DSF format is flexible...you can encode just about as much extra data into the mesh for water as you want - it's not a format change.)

So I think you may see some improvement in the water as we utilize existing fetch data in the mesh, and some improvement as we encode more meta-data into the mesh. Note that to do any of this we need to change the sim a bit...right now it assumes a constant wave height - this would no longer be true. I think these kinds of improvements will start during the 9.x run but not be in 9.0.

Filtering Errors (9.xx)

This is the most technical issue with the water, relating to how the graphics hardware works. The problem is that the way the far-view of the water is computed is too reflective due to down-sampling.

In real life, if I am in front of a body of water with 6 inch waves, I will see two things:
  1. Near me, I will see the waves themselves. Part of the waves will be dark, because their surface normal faces me, so the Fresnel equation says I see down into the water, where I see the bottom (or if it's deep enough, darkness). Part of the waves will be at an angle to me and act reflective, picking up colors from the sky and maybe surrounding terrain. So my waves are going to be a mix of the sky color and some kind of dark color, with the color contrast allowing me to see the shape of the wave.
  2. Farther out, the waves will be smaller than my eyes can distinguish and I'll start to see a more consistent water color, which is a mix of all of the various sky color inputs and darkness. The particular mix might depend on the chop and shape of the waves and angle to the sky. The important thing is: there is scattering of the reflection at a level smaller than my visual acuity, so I see sky color, but I don't see a sky reflection, and that color is darkened by the Fresnel equation.
Now in X-Plane the problem is this: the water waves are built up procedurally from a noise texture. As we get farther away, that wave texture is reduced in quality. Unfortunately, the graphics hardware averages together the wave shape, not the resulting color from the wave shape.

So instead of getting sky + deep = darker blue for the water, I get peak + trough = flat water! In other words, at a far distance the waves are canceling each other out before color lookup, giving us a perfect mirror in the far-ground.
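The order-of-operations problem is easy to show with a toy. The "color lookup" below is NOT the Fresnel equation and not the sim's shader; it is a made-up nonlinear function (1.0 means a perfect mirror of the sky) chosen only so the two orderings give visibly different answers.

```python
def toy_mirror_strength(slope):
    """Toy stand-in for the per-pixel color lookup; flat water (slope 0) gives 1.0."""
    return 1.0 / (1.0 + 4.0 * slope * slope)


def mean(values):
    return sum(values) / len(values)


# Alternating sides of peaks and troughs that all fall inside one far-away pixel.
wave_slopes = [0.5, -0.5, 0.5, -0.5]

# What the hardware does: average the wave shape, then look up the color.
filter_then_shade = toy_mirror_strength(mean(wave_slopes))

# What the eye sees: each bit of wave gets its own color, then the colors blend.
shade_then_filter = mean([toy_mirror_strength(s) for s in wave_slopes])
```

In this toy the first ordering comes out as a perfect mirror because the slopes cancel to zero, while the second comes out at half strength. Any nonlinear lookup will show some gap like this; how big depends on the real shading math.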

This is fundamentally an implementation problem - I bring it up only because it's a counter-intuitive one. In the immediate future, the "glass lake" problem will become less severe because filtering only kicks in like this when the waves become less than one pixel - with the option for taller waves coming, the waves should be visible farther out. In the longer term we'll probably put in new shading code to address filtering problems.

Reflection Positioning Bugs (9.0, 9.xx)

As of beta 15 I thought I had fixed most of the reflection positioning bugs (that's what happens when something reflects in the wrong place in the water) - the geometry for this is made complicated by the Earth being round. I don't expect to nip all of these bugs in 9.0 but I do hope to get most of them, and I will keep working on this as bug reports come in.

Wave Shape (9.xx)

Finally, our wave shapes are quite primitive - it's just shaped procedural noise designed to look tolerably like water. We have a framework into which we can insert more complex wave equations (at the cost of some framerate). I don't know what the future will bring in this area. The v9 water sets out a new foundation onto which we can do more complex water. But we have to crawl before we can walk.

Tuesday, January 08, 2008

A New Broken Record

For years now I've been harping about ways to keep the number of batches down in your scenery. A batch is a single submission of triangles to the graphics card for drawing. Batches get rendered fast even if they contain a ton of triangles, but changing modes between batches is not very fast, so a few large batches is hugely better for performance than a large number of tiny batches.

To play that broken record one more time, there are two ways that you (a scenery designer) can cut down the number of batches:
  • Use a small number of larger textures instead of a large number of small textures, preferably sharing textures between similar scenery elements that are placed nearby. X-Plane will do its best to merge the content that uses those textures into single batches. We call this the "crayon rule."
  • Use fewer attributes in your objects. Attributes usually require a new batch (after the graphics card mode has been changed due to the attribute). So if you've got 1000 attributes in your object, you've got a problem.
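Here is a toy model of why shared textures help. It assumes a texture change is the only thing that forces a new batch (in reality attributes do too) and the texture names are invented; it is not how the sim schedules its drawing, just the counting argument.

```python
from itertools import groupby


def count_batches(draw_list):
    """One batch per run of consecutive draws that share the same texture."""
    return sum(1 for _ in groupby(draw_list))


# Six objects that alternate between two small textures.
interleaved = ["brick", "roof", "brick", "roof", "brick", "roof"]

# Same objects, grouped so that draws using one texture sit together.
grouped = sorted(interleaved)

# Same objects again, all packed into one shared texture.
shared = ["town_atlas"] * 6

batches_interleaved = count_batches(interleaved)
batches_grouped = count_batches(grouped)
batches_shared = count_batches(shared)
```

Six batches drop to two once the draws are grouped, and to one when everything lives on a single shared texture.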
Well, with X-Plane 9 I have a new broken record: avoid overdraw!

Overdraw is the process of drawing pixels on top of other pixels on screen. It happens any time we use blending to do translucency, and any time we use polygon offset to build the image in layers.

Overdraw is bad because with X-Plane 9's pixel shaders, most users are slowed down by the graphics card's ability to fill in pixels (pixel fill rate), with those complex shaders being run for every pixel. If you are at a screen-res of 1200x1024 looking at the ground with no objects, that might be 1.2 million pixels to fill. But if there is an overlay polygon covering the ground, we have 2.4 million pixels to fill! That's a huge framerate hit.
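The numbers above are simple multiplication, but it helps to see how fast they grow with layers. This assumes every layer covers the whole screen, which is the worst case; the function name is mine.

```python
def pixels_shaded(width, height, layers=1):
    """Pixels the shader runs on when `layers` full-screen layers are drawn."""
    return width * height * layers


ground_only = pixels_shaded(1200, 1024)
ground_plus_overlay = pixels_shaded(1200, 1024, layers=2)
```

That is 1,228,800 shaded pixels for the bare ground and 2,457,600 with one full overlay on top, so each extra full-screen layer costs as much as drawing the whole scene again.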

Right now there's not much you can do about overdraw. Once MeshTool comes out I will post some guidelines on how you can limit overdraw.

We took a step in the v9 global scenery to limit overdraw: in X-Plane 8 the global scenery tried to hide repetition of flat textures by drawing them over each other with offsets. In X-Plane 9 this is done in a pixel shader (e.g. the texture is analyzed and swizzled in the shader and then drawn once), cutting down the number of times we must draw.

If you turn pixel shaders on and off in a flat area like Kansas you might see this if you compare the screenshots - the farm textures are more repetitive without shaders. This gives faster fps to everyone (with or without shaders) by eliminating overdraw.

Sunday, January 06, 2008

I Broke Volumetric Fog

I'll try to get it fixed soon....it's a bit frustrating, because it's the second time in the last week that I made a change to X-Plane to try to improve performance, tested it on my hardware, then discovered in-field that it helps some machines but screws up a lot of others.

The fog was a screw-up though...it's broken on the 9600XT and I have one of those.

(I'm not entirely sure what the minimum graphics card for volumetric fog will turn out to be...right now we let you use it no matter how slow it makes the computer, but generally it's been sort of a performance problem...it's just a very expensive algorithm that needs some kind of restructuring.)

Saturday, January 05, 2008

Panel texture in weird places

X-Plane 8 didn't care much whether you used ATTR_cockpit in scenery objects or other strange places. It would simply show the cockpit panel texture, and if it hadn't been updated, you might see an old one, and if it had never been used, maybe you'd see the random (but colorful) contents of memory. Similarly if you could get close enough to another airplane to look in the window, you'd see your own panel, since there is only one panel texture (for the user's airplane) in the entire scenery system.

This is a bigger problem in X-Plane 9.
  • Because the panel texture can be expensive for big panels, we are a lot more aggressive about not setting up the panel texture if we can avoid it. This means that sometimes the texture doesn't exist at all. This is why in beta 14 you get an error if you do a formation flight having only been in "w" (forward 2-d) view...the panel texture doesn't yet exist, but the exterior view of the Cessna tow plane uses it.
  • With panel regions there can be up to four panel textures, so you can see the potential for anarchy.
  • Panel textures aren't even the same size any more, causing the wrong-panel-in-AI-plane problem to look even weirder than before.
So in beta 15, the panel texture is replaced with a dummy white texture for:
  • Any cockpit object for an AI plane.
  • Any scenery objects that are illegally using the panel texture.
This prevents crashes and other nasty stuff. If you want to make the panel be visible in your AI plane, consider using LOD to make a non-panel-texture "fake" cockpit image (at a very small res) at farther LODs. My guess is that in normal usage of the sim you'd really have to do something dangerous to get close enough to see the hack.

We did discuss live panels for all planes (for all of about 3 seconds), but the live panel texture in 3-d is so expensive that it'd be prohibitive to most users for even one AI airplane, let alone 20!

Thursday, January 03, 2008

Is Bigger Always Better?

We've been preaching "one big texture, not lots of little textures" for a while now, and generally speaking, packing a lot of art into one big texture makes life easier for X-Plane, because it can draw more triangles at once before it has to tell the card to change what it's doing. Inside the company we call this the "crayon rule".

Now the total set of geometry and textures that X-Plane needs to use for one frame is the "working set" - you can think of it as the crayons that you keep out of the box because you need them all the time. And as I said before, if the working set becomes too big, your framerate dies.

Now with large panels we're seeing a new phenomenon, one of the first cases where the crayon rule might not be true. The reason is the working set.

When you make an airplane with a large panel in version 9, you can either use ATTR_cockpit, which lets you use the entire panel as a texture, or you can use ATTR_cockpit_region, which will let you use several parts of the panel. Each ATTR_cockpit_region is a texture change, so that's more crayons. And yet ATTR_cockpit_region is usually faster.

The reason is two-fold:
  1. You can often use cockpit regions that don't cover the entire cockpit texture. Large panels are rounded up to 2048 if they are larger than 1024 in any dimension, so the "wasted space" in a 1600x1600 panel is actually quite huge. If you can get away with some smaller regions, your total panel texture area is smaller because there isn't wasted space due to this rounding, and you can also skip things like windows. Prepping the panel texture takes time, and it's done once for lit and once for non-lit elements, so it adds up!
  2. It turns out there are two categories of textures that contribute to the working set: static textures and dynamic ones, and their impact on VRAM is very different. Dynamic textures are much more expensive. The panel texture is dynamic and it's uncompressed, so it really costs a fortune. (32 MB of VRAM for 1600x1600. That's not a lot for a static texture but for a dynamic one that'll kill you.)
Here's the details on dynamic vs static textures: the OpenGL driver keeps a backup copy of a texture in main memory, so that if it has to purge VRAM (to make room for more stuff) it still has the texture. As it "swaps" textures, the process is to simply send textures as needed from main memory to VRAM. No big deal.

But with a dynamic texture, the texture has been modified in VRAM! So the copy in system memory is old and stale. Before it can purge the texture, the graphics card must send it back to main memory, consuming twice as much bus bandwidth as normal. (To free 16 MB of VRAM and refill it takes 32 MB of transfer: 16 MB to copy the old texture back to system RAM and another 16 MB to send the new textures to VRAM.) On non-PCIe cards, this back-transfer might run at 1/8th the speed of the transfer to the card, so this is even worse on AGP cards.
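To make the arithmetic concrete, here is a rough sketch of the bus traffic for one swap. The function names are made up for illustration, and the 8x read-back slowdown is just the worst-case figure quoted above for non-PCIe cards, not a measured number.

```python
def swap_traffic_mb(evicted_mb, incoming_mb, evicted_is_dynamic):
    """Bus traffic (MB) needed to purge one texture and load another."""
    # A static texture still has a valid backup in main memory, so it can
    # simply be dropped. A dynamic one must be copied back first.
    write_back = evicted_mb if evicted_is_dynamic else 0
    return write_back + incoming_mb


def relative_swap_time(evicted_mb, incoming_mb, evicted_is_dynamic,
                       readback_slowdown=8):
    """Swap time in 'upload MB' units; read-back is assumed N times slower."""
    write_back = evicted_mb if evicted_is_dynamic else 0
    return incoming_mb + write_back * readback_slowdown


print(swap_traffic_mb(16, 16, evicted_is_dynamic=False))  # 16
print(swap_traffic_mb(16, 16, evicted_is_dynamic=True))   # 32

slow = relative_swap_time(16, 16, evicted_is_dynamic=True)
fast = relative_swap_time(16, 16, evicted_is_dynamic=False)
print(slow / fast)  # 9.0
```

So under that assumed 8x penalty, purging a dynamic texture costs twice the traffic but roughly nine times the bus time of a static swap.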

Thus the driver does its best to not throw out dynamic textures. And this is why the panel texture is so expensive. That P180 will cause X-Plane to make two 16-MB dynamic textures, and those textures will cause 32 MB of VRAM to basically be off the table. That's less space for the other textures to swap in and out of. This kind of "permanent allocation" makes the VRAM budget tighter for all other drawing operations.
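Here is where the 16 MB and 32 MB figures come from, as a small sketch. It assumes 4 bytes per pixel (uncompressed RGBA) and that each dimension is rounded up to the next power of two on its own; the helper names are hypothetical and this is not X-Plane's actual code.

```python
MIB = 1024 * 1024


def round_up_pow2(n):
    """Smallest power of two that is >= n."""
    size = 1
    while size < n:
        size *= 2
    return size


def panel_texture_bytes(width, height, bytes_per_pixel=4):
    """VRAM for one uncompressed texture after power-of-two rounding."""
    return round_up_pow2(width) * round_up_pow2(height) * bytes_per_pixel


def wasted_fraction(width, height):
    """Share of the rounded-up texture that the panel does not use."""
    padded = round_up_pow2(width) * round_up_pow2(height)
    return 1 - (width * height) / padded


one_texture = panel_texture_bytes(1600, 1600)
print(one_texture // MIB)                     # 16
print(2 * one_texture // MIB)                 # 32 (lit + non-lit)
print(round(wasted_fraction(1600, 1600), 2))  # 0.39
```

For comparison, a hypothetical 1024x512 region comes out at 2 MB per texture under the same assumptions, with no padding at all.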

Given the right combination of large panels, large res, pixel shader effects (which make more dynamic textures), clouds, and FSAA, you can easily get even a 256 MB card to a state where the free space into which static textures are shuffled becomes horribly small, and the framerate just dies.

So the moral of the story is: yes, it can be worth 4 crayons (using panel regions) to avoid the huge cost of dynamic textures from large panels.

As to static textures (regular DDS files) that are 2048x2048 - the jury is still out but my guess is they don't represent a huge performance problem. As one user pointed out to me, they're only 2 MB when compressed (maybe more with alpha) so they're not insanely huge, and they can be swapped out.
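The 2 MB figure checks out if you assume the DDS file uses DXT block compression, where every 4x4 pixel block is stored in 8 bytes (DXT1) or 16 bytes (DXT3/DXT5, which carry a fuller alpha channel). This sketch ignores mipmaps, and which format a given file actually uses is an assumption here.

```python
MIB = 1024 * 1024

DXT1_BYTES_PER_BLOCK = 8    # 4x4 pixels, color with at most 1-bit alpha
DXT5_BYTES_PER_BLOCK = 16   # 4x4 pixels, color plus interpolated alpha


def dxt_bytes(width, height, bytes_per_block):
    """Size of the top mip level of a DXT-compressed texture."""
    # Dimensions are rounded up to whole 4x4 blocks.
    blocks_x = (width + 3) // 4
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * bytes_per_block


print(dxt_bytes(2048, 2048, DXT1_BYTES_PER_BLOCK) // MIB)  # 2
print(dxt_bytes(2048, 2048, DXT5_BYTES_PER_BLOCK) // MIB)  # 4
```

Either way that's a fraction of the 16 MB an uncompressed 2048x2048 texture would take at 4 bytes per pixel.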

Saturday, December 29, 2007

Happy New Years

Lori and I are about to leave for a New Years Eve ski trip, but before I shut down the laptop for the last time in 2007 I wanted to say: Happy New Years to everyone in the X-Plane community. I had a lot of fun working on X-Plane in 07 and hopefully the sim brought you enjoyment too. I think 2008 is going to be very exciting - version 9 plants the seeds for a lot of interesting new possibilities during the version run.

When I get back I'll post a bit on panel regions, for which I have the first performance numbers, as well as some of the strange effects FSAA has on the sim. We should have a progress report on Linux soon too.

See you next year!