Sunday, April 29, 2007

Turn Off "Draw Hi-Detailed World"

It looks so innocent, that one check box..."draw hi detailed world". What harm could it do?

Well, if you have a GeForce FX graphics card, quite a bit! If you have Vista it may not be a great idea either.

The "draw hi-detail world" setting turns on multiple rendering settings that look nice but hurt fps. There are two I can think of right now:
  • With the setting on, we draw 3-d structures for airport lights. This can slow down slower machines, but usually isn't the big problem.
  • With this setting on, we use pixel shaders to draw terrain - without it we use the traditional "fixed function" OpenGL pipeline.
It's this second behavior that causes all the misery. X-Plane won't use pixel shaders if your video card doesn't have them. But...what if your card has pixel shaders and they're just not very good?

I should say: I have no first-hand knowledge of how the GeForce FX series works, and what I am repeating is simply conjecture posted on the web, albeit conjecture that explains what we keep hearing from users. The GeForce FX (nVidia's first series of programmable pixel-shader based cards) is a hybrid card - half the card's transistors are dedicated to fixed-function drawing, and only half for shaders. Thus if we go into shader mode, we basically "lose" half the chip, and our performance tanks. (ATI built 100% programmable cards starting with their first entry, the 9700, and nVidia went this way with the 6000 series.)

So if you have an FX card, it tells X-Plane "I can do shaders". With "hi-detailed world" on, we take a huge performance hit. Simple solution for FX users: turn "hi-detailed world" off! Get your fps back!
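The decision described above can be sketched in a few lines. This is only an illustration of the behavior as described in this post, not X-Plane's actual code; the function name and the returned strings are made up for the example.

```python
def choose_terrain_path(card_reports_shaders: bool, hi_detail_world: bool) -> str:
    """Pick a terrain drawing path the way this post describes it.

    Hypothetical helper: the names here are illustrative, not X-Plane internals.
    """
    # Shaders are only used when the card advertises them AND the
    # "draw hi detailed world" box is checked.
    if card_reports_shaders and hi_detail_world:
        return "pixel_shaders"
    # Otherwise fall back to the traditional fixed-function pipeline.
    return "fixed_function"


# A GeForce FX reports shader support, so only the checkbox keeps it
# off the slow path.
print(choose_terrain_path(card_reports_shaders=True, hi_detail_world=False))
```

Note that nothing in this check asks whether the card's shaders are fast, only whether they exist. That is why the checkbox is the lever for FX users.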

I've also heard a bunch of reports that the new drivers for Vista have bugs...they seem to come out more when we use pixel shaders. Again - turn "hi detailed world" off and see if it helps!

Bottom line: "draw hi detailed world" is proving to be an aggressive setting - I recommend backing it down as the first step in trouble-shooting performance problems.

Saturday, April 21, 2007

Drawing on the Ground - Polygons or OBJs

With X-Plane 860 there are two ways to draw stuff on the ground:
  • Use an OBJ and ATTR_poly_os. This is the way most people have built custom taxiways, markings, etc.
  • Use a DSF overlay with a bezier polygon (which is controlled by a .pol or .lin file).
Which should you do? Hrm...
  • For any geometry that goes over a very large area, the DSF overlay with a bezier polygon is basically a requirement, because the polygon will "drape" over the ground. The OBJ might just stick out in the air.
  • If your element repeats a lot, the OBJ may save memory. (OBJs are stored in memory once and used many times - bezier polygons use RAM each time they're used.)
  • If the overlay is very simple, the DSF overlay may perform better.
  • It may be too expensive to build complex lines with DSF overlay lines. They're meant for taxi-lines, not full vector graphics!
So here are a few decisions I'd make:
  • Taxi lines - certainly DSF overlays if we're not using apt.dat 850.
  • Parking spot at an airport - use one large polygon with a texture, not a bunch of bezier lines.
  • Use a DSF overlay for that parking spot, even if we use it 60 or 100 times. The simplicity of the overlay (vs. the complexity of starting and stopping an OBJ from drawing) outweighs the cost of using it many times.
  • A very structurally complex overlay on the ground with thousands of triangles (that for some reason cannot be created with a texture) that is used a lot -- well, this is an "artificial" case I've made up, but in this case use an OBJ.

The Sordid History of ATTR_poly_os

I've blogged in the past about ATTR_poly_os...it's a tricky topic. ATTR_poly_os is a feature of the OBJ file format designed to let authors fix z-buffer thrash problems. Unfortunately, the cause of z-buffer thrash is pretty complex. To make things worse, it turns out I never finished my intended documentation on the subject. (I'm an idiot!)

The fundamental problem, I think, is that what we have now (ATTR_poly_os and ATTR_layer_group) provides a mechanism to correctly fix z-buffer thrash, but doesn't in any way enforce good behavior over bad behavior. The two attributes are very flexible, and if used together, can do all sorts of bad things. The problem is that ATTR_poly_os was thought up years before the layer-group mechanism, and thus they don't really reinforce each other.

So...here are a few simple rules to help with z-buffer thrash in X-Plane 860:
  1. Never use the names of objects or their order in the DSF to accomplish anything. X-Plane ignores both names and orders when processing your scenery.
  2. Do not move your polygons above the terrain to fix z-thrash. This won't work.
  3. When possible, divide your objects into ones that are 100% on-the-ground (and thus may z-thrash) and ones that are 100% 3-d above the ground (and will not thrash). I realize that more objects means slower fps...so this applies best when you have many objects and can pick how you divide them up.
  4. Always use ATTR_poly_os for any polygons that lie along the ground. Use the smallest number you can to fix the thrash.
  5. If you have an object with ATTR_poly_os geometry and non-poly_os geometry, make sure the ATTR_poly_os geometry is first!
Those five rules should keep you out of trouble.
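Rule 5 is easy to check mechanically. Here is a minimal sketch of such a check, assuming a simplified view of an OBJ8 file in which only ATTR_poly_os and TRIS commands matter; everything else (including other geometry commands) is ignored. The function name is hypothetical and this is not a tool we ship.

```python
def poly_os_geometry_first(obj_lines):
    """Return True if all offset geometry is drawn before any non-offset geometry.

    Simplified, hypothetical checker: it only understands
    'ATTR_poly_os <n>' and 'TRIS <offset> <count>' lines.
    """
    current_offset = 0          # polygon offset currently in effect
    seen_plain_geometry = False  # have we drawn anything with offset 0 yet?
    for raw in obj_lines:
        tokens = raw.split()
        if not tokens:
            continue
        if tokens[0] == "ATTR_poly_os" and len(tokens) > 1:
            current_offset = int(tokens[1])
        elif tokens[0] == "TRIS":
            if current_offset > 0:
                # Offset geometry after plain geometry breaks rule 5.
                if seen_plain_geometry:
                    return False
            else:
                seen_plain_geometry = True
    return True


good = ["ATTR_poly_os 2", "TRIS 0 6", "ATTR_poly_os 0", "TRIS 6 12"]
bad = ["TRIS 6 12", "ATTR_poly_os 2", "TRIS 0 6"]
print(poly_os_geometry_first(good), poly_os_geometry_first(bad))
```

In the "good" ordering the ground markings come first with a small offset, then the offset is turned back off before the 3-d geometry.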

What about ATTR_layer_group? Well, secretly X-Plane 860 will change the layer group of an object that is ATTR_poly_os for you. So as long as your object contains only offset geometry (this is what I recommend in rule 3) it will always be drawn before the rest of the objects, preventing artifacts.

You'll need ATTR_layer_group if you want to put objects underneath runways, or underneath taxiways, for example.

I am working on more comprehensive documentation on the topic, and appreciate any feedback on stuff that I've written that's unclear...the rules are complicated!

Sunday, April 15, 2007

DX10: Why I'm Not an Early Adopter

Before I begin, X-Plane uses OpenGL as its interface to 3-d hardware, not Direct3D. So when I talk about "X-Plane doesn't utilize DX10", isn't that meaningless? I mean, X-Plane has never supported any version of Direct3D.

But I like to use the term "DX9" and "DX10" anyway for this reason:
  • For all practical purposes, within the "games space", most advances to the Direct3D and OpenGL APIs that I care about are created for the purpose of exposing new hardware capabilities to applications. That is, the point of DX10 (including the new Direct3D) is to allow games to use the newest video cards more efficiently.
  • OpenGL is revised by adding "extensions", that is, independent features that can be mixed and matched. DirectX tends to have whole-API revisions. So I prefer DirectX because it puts a nice "number" on an entire set of technology. Since the graphics cards are revised in generations as new GPUs are designed, these generations match up reasonably well with the hardware.
So in this context, when I say "DX10" I really mean the very newest set of super-programmable cards, of which the GeForce 8800 is the first, and by use them I mean take advantage of some of these really great new features:
  • Instancing (the ability to draw a lot of objects with only one command to the card, which could relieve the CPU cost of huge numbers of objects).
  • Geometry shaders (the ability to do per-triangle and not just per-vertex calculations on the graphics card) which could move some of the logic for terrain generation to the graphics card. (We precompute this and save it in the DSF in X-Plane, so we use DVD space and RAM, while I believe MSFS does this kind of thing on the CPU.)
  • Better management of state changes (good for unloading the CPU).
  • A bunch of really interesting ways to work with data strictly on the card (don't know what it's good for yet, but it unlocks a lot of cool possibilities).
So why doesn't X-Plane utilize all of these new features? Or rather, when will we?

Well, my goal in working on X-Plane's rendering engine is to know these things are coming but not be an early adopter. The way I look at the economics of software development is: the amount of labor we can put into a release is somewhat constrained. If we put in more months between releases, we have fewer releases. If we have more programmers, we have to pay them more, and we have to charge more money. There's a lot of things you can say (or people have told us) about our business model. But I tend to view these as the invariant conditions we have to work with.

So what I worry about is efficiency: if we are limited to exactly X man-months of work per release, how can we make the best of them? Is being an early adopter the most efficient use of limited programmer resources when developing X-Plane? (We have to consider opportunity cost: what features won't be implemented because we spent time on early adoption of new graphics technologies.)

I think there are a few things going against early adoption, particularly for a small company like us where labor is at a premium. (Our list of things we can be doing is very long, so any new feature takes away from a lot of other good ideas.)
  • New graphics hardware isn't widely distributed among our user base. If we adopt DirectX-10 style features, this work helps a very small number of our users. Eventually everyone will have hardware like this, but we can cover a case by adapting the new technology later.
  • When new features come out, there is often vendor disagreement on how to code for them. It takes time to come up with cross-vendor standards. Consider that ATI hasn't come out with their DX10 hardware, and the OpenGL extensions to use these are all nVidia proposals. My guess is that ATI will have their own extensions, and the real ones that get used will be a mix of each. If we adopt now, we'll be "betting on the wrong horse" a few times -- code that will have to be rewritten, for a total loss of efficiency.
  • New drivers can be buggy. It takes a while for support for new features to be both universal and reliable. The earlier we jump in, the buggier the environment we develop in, and thus the more difficult it is for us to develop.
Of course, there's definitely a cost. The 8800 is capable of doing some amazing things, and X-Plane does not yet fully take advantage of it. But I do believe that in the long term we end up delivering more value to X-Plane users by taking a slower wait-and-see approach to new hardware.

(To GeForce 8800 users I can only say that the code we write now while waiting for the right environment may do some cool things too, so we're not just taking a vacation! And the GeForce 8800 also delivers an overall speed boost to the entire system.)

EDIT: the same logic applies to operating systems like Vista and OS X 10.5 to some extent, but I hesitate to bring this up because: a number of users are having problems with X-Plane and Vista, and it is not because we have delayed support for Vista. The real problem is that the graphics card drivers for Vista are still new and have some problems. I believe that the various Vista problems we're seeing will be addressed by code changes by ATI and nVidia.

Thursday, April 12, 2007

Will it Ever Be Done (WED)

The internal joke about WED (the new scenery editor) is that it stands for Will it Ever be Done. So I should say that, given my total inability to predict when it will be done, and how many delays and setbacks there have been, I don't expect you to believe anything I say until I actually ship something you can run (or at least post some screenshots*). I realize that I have destroyed my credibility about ship-dates by having no ability to predict ship dates for WED.

With that in mind, I am pleased to report that today I was able to run WED with a few features working in concert:
  • Multiple selection (map and hierarchy view).
  • Multiple undo everywhere.
  • Marquee multi-selection tool on the map view.
  • Tree-based hierarchy view with editable property fields.
WED still can't do anything remotely useful, but these four features represent a huge amount of infrastructure investment. Basically my goal is to provide some of these user-interface features (multiple undo, multiple select, multiple tools, etc.) for every single type of scenery component that you can edit.

Since apt.dat contains a lot of different components (just look at all the record types in the file format) and WED will also have to edit DSFs with a wide variety of data, it made sense to me to write a generic mechanism for these features that could be used over and over again without writing new code.

So my hope is that I'm just getting over the "hump" of writing infrastructure, and soon will be able to add two dozen more types of editable things to WED (other than my simple test objects) very quickly. We'll see if it pays off.

The plan is still for WED releases to be separate from X-Plane and to be open source (MIT/X11 license). I've been meaning to post in Chris's programming blog about some of the design ideas WED employs - it makes a good laboratory. Hopefully those posts can also provide some guidance if anyone decides to modify/work on/steal the WED source code. (It is my expectation that most programmers will not want to get into the guts of how WED works, as it is a fairly complex application. I think I can make simpler interfaces to things like import/export to make extending the program simpler.)

Anyway, WED is not the only thing I am working on right now, and you don't have to believe me, but the WED codebase is growing, and some of it even works!

* I am not posting screenshots because the program is running right now with "scab art" - that is, ugly green, red and purple boxes that will be replaced with nice PNGs once the layout settles down. I do not want to make our artists draw the UI components more than once, and I don't want to answer 1000 emails about "Why is it so ugly" by posting the scabs, which are perfectly adequate for my own coding purposes and not intended to ship.

Monday, April 09, 2007

I'm not a fan of SLI/CrossFire

When it comes to video cards, I've always been in the "don't spend more than $200" school of thought. My logic is: video card technology moves so fast that paying a lot of money for the "first six months" of any new technology level is very expensive. Bless all of you who are early adopters - you're helping keep nVidia and ATI humming, but it's an expensive hobby to maintain an up-to-date machine.

This is one of my favorite tables (there is a similar one on Wikipedia for ATI). It shows performance of graphics cards and when they came out. Compare the GeForce 7950 GT2 and the GeForce 8800 GTS. If you want 24,000 MT/S of fill rate, you could buy a top-of-the-line two-cards-in-one-via-SLI 7950, but if you waited six months, a SINGLE intermediate-speed 8800 would give you the same thing while supporting DX10 shaders (e.g. geometry shaders, instancing, and all that awesome stuff). The 7950 GT2 apparently retailed at $600+, which was a real discount compared to actually chaining two separate 7950's together (that'd get you up around $850). Look on newegg.com and you'll see that GeForce 8800 prices aren't that expensive (compared to an SLI combination). And the 7950 GT2's come down a lot from what it used to cost.

For another datapoint, compare the Geforce 7600's to the GeForce 6800's. The 6800 was the monster card when it came out, putting nVidia back in the number one spot. But the next-generation's intermediate range cards can do what was top-end before. (The 7600 can be had for a little over $100. Compare that to several hundred for the 6800 ultra about one year earlier.)

Simply put, you pay a huge premium to get a given performance level when it's new and top-end. Wait one generation of cards (by buying last-year's top end cards or this-year's middle-range cards) and you save a lot.

It's in this context that I don't believe that SLI makes a lot of sense. In an environment where (IMO, and my opinion only) the top-end video cards are already expensive for what they do, SLI simply makes the situation worse, by allowing you to spend double what the already-high-end cards cost to get performance that will be available in one card in the next generation.

To do the math, does it make sense to spend double the price on your video card to extend its useful life by six months? Only if you intend to change cards every six months.

(nVidia makes an argument that SLI allows developers to preview the next-gen hardware, and this is true. My strategy is different: simply run X-Plane slowly and assume that the next-generation hardware will go faster.)

I don't feel good about criticizing nVidia and ATI because overall I feel that their products provide an extraordinary value at a very good price, and the growth of performance in video cards has been astounding. Today's cards just hit it out of the park.

But SLI and CrossFire strike me as a solution looking for a problem. They solve the problem of making the most expensive cards more expensive, but I don't think they're the best way to spend money on a flight simulation system. (Better might be to not buy at the "SLI/Crossfire" level of video cards, meaning spending $700+ on your video cards, but rather to go down a level and upgrade your motherboard/CPU more frequently.)

Some users email me asking for video card recommendations, in particular whether to buy an SLI/Crossfire configuration. The bottom line is, it depends on how much you value your money vs. your graphics card performance. If money is no object, and you want maximum speed, SLI configurations will provide the fastest performance (by some marginal amount). I believe that a good value lies below $200.

On the other side of the equation, I do recommend that everyone spend at least $100 if you're going to buy a video card at all. Below $100 the price cuts come from remaindering really old inventory and removing parts from the card to save cost. For the savings of $25 you might lose half your card's performance or half of its VRAM when you get down to the really cheap cards.

The other thing I tell users is the truth: no one at Laminar Research has an SLI system, so the reports we get on SLI come from users. Some users have told us they've gotten some benefit at very high FSAA levels. But at this point a single 8800 will do the same thing. And SLI doesn't address CPU speed at all. Consider this list of features - nothing on the CPU side will get even remotely faster with SLI.

And in full disclosure: my two Macs have a Radeon X1600 Mobility and a Radeon 9600, and a GeForce 5200 FX sits on the shelf for testing purposes. (This isn't intentional bias toward ATI, it's what Apple ships.)

Sunday, April 08, 2007

Using Layer Groups

I can't say enough good things about Jonathan Harris (Marginal) -- his work on X-Plane has been fantastic, he is one of the most advanced third party scenery authors I know of, and when he sends us a bug report, it is usually so perfectly patched up that I'm looking at the bad line of code in minutes! (One of these days I'll post one of his bug reports -- he always isolates the bug in a simple package that makes it very clear, with no extra "stuff".)

He emailed me a while ago with some questions about layer groups, and I saved them to rewrite into a blog entry. A little bit of background:

X-Plane 850 and 860 introduce the concept of "layer groups", which provide a way to control the draw order of scenery to some extent. Objects naturally fall into a layer group based on their type (e.g. objects go into the "object" layer group by default, and runways always go into the runway layer group). However, some scenery elements let you customize their layer-group placement in two ways:
  • By changing which layer group the element goes to entirely, or
  • By providing a "bias number", which indicates that within the category of scenery elements, this one must be drawn early or late.
Layer groups let you do a number of useful things:
  • Make sure that polygons and objects are drawn under runways or over taxiways when needed.
  • Make sure that runway markings with polygon offset are drawn before 3-d objects.
Simply changing the order of objects in the DSF is not a reliable way to control draw order! Layer groups are. You can read about layer groups in the OBJ8 spec.

With that in mind, some Q and A. (I will elaborate on my answers from what I originally sent Jonathan.)

J: I assume that objects and polygons within a single layer can be drawn in any order - ie there's no defined drawing order between different scenery types.

B: You assume correctly! Within a layer group, X-Plane is free to reorder to improve fps. So you cannot rely on the draw order of any two scenery elements without assuring that they are in different layer groups, either by using different group names, or different relative offsets.

J: But is there a defined drawing order between objects/polygons and apt.dat-generated scenery?

B: Yes because airport scenery goes into specific layer groups! (In fact, all scenery has a "default" layer it goes into, and they usually vary by type of scenery element.)

J: eg if I have an object or polygon with ATTR_layer_group runways 0 will it be guaranteed to be drawn after the runway?

B: You're close. The runways go into the runways group so

ATTR_layer_group runways -1

will always be before runways and

ATTR_layer_group runways 1

will always be after.

ATTR_layer_group runways 0

is the default layer group for runways, so your object or polygon would be in the layer group with all of the other runways, and X-Plane would be free to change the order amongst your object and runways in any way that would optimize framerate.

J: If not, is there any other way to insert objects and/or polygons between the taxiways and the runways?

B: The numeric offset is provided for this. The "spacing" of the layer group numbers is such that you can have up to 5 groups before and after each "named" group. So anything from runways -5 to runways +5 is fair game. (In other words, you can separately control up to 5 different "layers" of elements with a well-defined order for any layer group name.)
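One way to picture the "spacing" is to turn a group name plus offset into a single sortable number. The sketch below is only a mental model, not how X-Plane is implemented: the function name is hypothetical, and GROUP_ORDER is a short illustrative subset whose relative order I am assuming for the example (check the OBJ8 spec for the real list).

```python
# Illustrative subset of named layer groups, earliest-drawn first (assumed order).
GROUP_ORDER = ["taxiways", "runways", "markings", "objects"]
MAX_OFFSET = 5  # offsets from -5 to +5 are allowed around each named group


def draw_key(group, offset=0):
    """Map (group name, offset) to an integer; lower keys draw earlier."""
    if not -MAX_OFFSET <= offset <= MAX_OFFSET:
        raise ValueError("offset must be between -5 and +5")
    # Each named group owns 11 slots: 5 before, itself, and 5 after.
    return GROUP_ORDER.index(group) * (2 * MAX_OFFSET + 1) + offset


# An element at "runways -1" lands after the taxiways but before the runways.
print(draw_key("taxiways") < draw_key("runways", -1) < draw_key("runways"))
```

Two elements with the same key are in the same layer group, so their relative draw order is not defined.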

Friday, April 06, 2007

X-Plane vs. Reality

A few days ago, Austin posted part of an, um, "animated" discussion between himself and an author regarding blade-theory vs. table-based flight models. (You can find it on the xplane-news yahoo group message archive.)

I'd like to ignore the whole "my flight model can beat up your flight model" thread and look at one of the side effects of physics vs. table based flight models.

In this previous post I commented on the nature of specifications in a flight simulator.

- Some data simulates real-world data. ("Reality-based") The sim has open authority to interpret this data for maximum "quality" and the standard is "how close to reality are we". The specification clearly cites reality as the authority on behavior.

- Some data is arbitrary and has a clearly defined interpretation. ("Specification based".) The behavior of the computer program is unambiguous.

What I find interesting is that a blade theory flight model is a "reality-based" flight model; a table-based flight model is a "specification-based" flight model.

What this means is that, just like reality-based specifications in the scenery system, you can't tune your flight model in X-Plane to achieve a desired end result without understanding the real-world meaning of the parameters you are changing, or you risk a compatibility problem with future versions of X-Plane.

Imagine that, for some reason, your plane seems to feel sluggish when turning. So you increase the area of the control surfaces and the problem goes away.

You can't do that with X-Plane's flight model! The area of the control surfaces means something other than "a variable you can change to affect how the plane turns". It has to match how the real-life airplane is built. If you increase the area, straying from reality, to "fix" a problem, what really happens is you create a new problem later when X-Plane goes to simulate your model.

Simply put, if you put intentional errors into your plane's flight model to compensate for limitations to the sim, any improvement in the simulation accuracy of X-Plane is almost guaranteed to make your plane fly worse in the future.

So one of the important differences between a table vs. blade-theory flight model is how you talk about "bugs". If your plane doesn't fly the way it used to in a table-based flight model, that's probably a bug (well, depending on how interpolation is done). In a blade-theory model it's not a bug per se.

In a table-based flight model, how the real plane flies is moot - the table is king. In a blade-theory flight model, if the real plane flies differently and the input parameters of the plane are the same, it is a bug, or perhaps a design limitation.
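To make "the table is king" concrete, here is a minimal sketch of a table lookup with linear interpolation. It is a generic illustration, not code from any particular simulator; the function name and the sample numbers (angle of attack in degrees against a lift coefficient) are hypothetical.

```python
from bisect import bisect_right


def table_lookup(xs, ys, x):
    """Linearly interpolate y at x from a table with ascending xs; clamp at the ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)        # first index whose x value is greater than x
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    t = (x - x0) / (x1 - x0)       # fraction of the way from x0 to x1
    return y0 + t * (y1 - y0)


# Made-up table: angle of attack (degrees) -> lift coefficient.
angles = [0.0, 5.0, 10.0]
lift = [0.0, 0.5, 1.0]
print(table_lookup(angles, lift, 2.5))
```

The table plus the interpolation rule fully determines the answer, which is why a change in output for the same table reads as a bug (or as a change in how interpolation is done).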

Thursday, April 05, 2007

CPU or GPU

If your X-Plane framerate is low, or you want to increase your rendering quality, you might think "time for a new graphics card." But is it?

Some rendering settings actually tax the CPU more than the GPU (graphics card). Here's a simple rule of thumb: if you increase the setting (and restart X-Plane) and your frame-rate does not go down, a new graphics card isn't going to make it go up!
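The rule of thumb can be written down as a tiny test. This is a sketch under assumptions: the function name is hypothetical, and the 5% tolerance for measurement noise is a number I picked for the example, not something X-Plane defines.

```python
def likely_gpu_bound(fps_before, fps_after, tolerance=0.05):
    """Return True if raising a graphics-card setting cost a noticeable share of fps.

    fps_before: framerate at the lower setting
    fps_after:  framerate after raising the setting and restarting
    tolerance:  fractional drop treated as noise (assumed value)
    """
    if fps_before <= 0:
        raise ValueError("fps_before must be positive")
    drop = (fps_before - fps_after) / fps_before
    return drop > tolerance


# fps barely moved: the card has spare capacity, so a faster card won't help.
print(likely_gpu_bound(60.0, 59.0))
# fps fell a lot: the card is the limit for this setting.
print(likely_gpu_bound(60.0, 40.0))
```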

For example, if you have one of those new-fangled GeForce 8800s, you may have noticed that when you turn on FSAA the framerate doesn't dip at all. That's because the 8800 is insanely overpowered for X-Plane (at normal monitor resolutions) and has plenty of extra capacity that will be sitting idle on an older PC. When you turn up FSAA, you are simply wasting less of the card's excess capacity. It goes without saying that if there were a card faster than the 8800, it wouldn't improve your fps any more than the 8800, it would simply be even more bored.

Here's a rough guide to which features tax the CPU vs GPU:

CPU-Intensive
  • World Level of Detail
  • Number of Objects
  • Draw Cars On Roads
  • Draw Birds (not that expensive for modern machines)
  • Draw Hi Detail World
  • World Field Of View (wider view means more CPU work!)
GPU-Intensive
  • Texture Resolution (requires more VRAM)
  • Screen Resolution
  • Full Screen Anti-Aliasing (FSAA)
  • Anisotropic Filtering (most cards can do at least 4x)
  • Draw Hi-Res Planet Textures From Orbit
  • Cloud Shadows and Reflections (not that expensive)
  • Draw Hi Detailed World

A few specific framerate-optimization warnings:
  • FSAA is equivalent to a higher screen resolution - that is, running at 2048x2048 and no FSAA is similar to running at 1024x1024 and 4x FSAA. Both of these tax the video card with virtually no CPU increase. This is probably the only setting that can be helped only with a video-card upgrade.
  • Texture resolution: do not worry if the total size of all textures loaded is larger than the VRAM of your card. To find out if more VRAM would help, measure frame-rate with your normal settings, with texture resolution down a notch, and with anisotropic filtering down a notch. If turning texture resolution down increases fps more than turning down anisotropic filtering, more VRAM may help. Machines with faster graphics busses (like PCIe x16) will be less sensitive to VRAM.
  • Most Important: do not ever turn "World Detail Distance" beyond the default setting - you will simply destroy your fps and chew up your CPU for no benefit. I strongly recommend trying "low" for this setting - if you like a lot of objects, this setting can make a big difference in performance.
  • The number of objects is virtually always a factor of how fast your CPU is, not your GPU -- that is, most GPUs can draw about a gajillion objects if the CPU could only get through them fast enough. If you are unhappy with the number of objects you can draw, do not expect a new graphics card to help - it probably won't.
  • Cars on roads hurt fps on machines that don't have the fastest CPU.
  • Draw Hi detail World is doubly dangerous - it uses both the CPU and GPU. Quite literally this is where we stash "luxurious" options. Everything this checkbox does chews up framerate. (If these options didn't, we'd leave them on all the time!) So you should not use this option if you aren't happy with fps, if you don't have a fast CPU, or if your graphics card isn't that modern. (HINT: if your nVidia card has "FX" in the title, don't use this!)
Start with the default settings and experiment - turn a setting up one notch, then restart, then turn it down and try another. Different machines will be faster for some things and slower for others.
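The FSAA comparison in the warnings above is just arithmetic. A rough sketch, assuming the card's work scales with width x height x FSAA samples and treating "no FSAA" as one sample per pixel; real hardware is more subtle than this, and the function name is hypothetical.

```python
def samples_per_frame(width, height, fsaa=1):
    """Rough count of samples the card fills per frame (1 sample per pixel with no FSAA)."""
    return width * height * max(fsaa, 1)


big_screen = samples_per_frame(2048, 2048)           # no FSAA
small_screen_4x = samples_per_frame(1024, 1024, 4)   # 4x FSAA
print(big_screen == small_screen_4x)
```

Both configurations come out to the same number of samples, which is why they load the video card similarly while leaving the CPU alone.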

EDIT: one user correctly determined (by observing CPU utilization relative to settings) that puff-style 3-d clouds bottleneck the GPU, not the CPU! This was not what I expected - when Austin originally wrote that code, our measurements indicated that sorting the puffs from far to near taxed the CPU a lot, making this CPU-intensive. At the time the old Rage 128s would also get bogged down by filling in translucent puffs as you flew right into the thick of the cloud.

Times have changed and neither the sorting nor the alpha-drawing is even remotely expensive on a modern machine. So I was surprised to see the CPU not being used. After some investigation, it turns out that while the CPU and GPU have gotten a lot faster over time, the communications channel between them has not. The result is that they both do their jobs really quickly, and as a result clog up the communications channel...the CPU simply can't talk to the GPU fast enough to get the clouds out.

This is a great find by this user, as this is something that didn't matter on old machines, but can be optimized in the future for new ones.
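For readers wondering what "sorting the puffs from far to near" looks like, here is a minimal sketch. It is a generic back-to-front sort for translucent drawing, not X-Plane's cloud code; the function name and the puff positions are made up.

```python
import math


def sort_far_to_near(puffs, camera):
    """Return puff positions ordered from farthest to nearest the camera.

    puffs:  list of (x, y, z) tuples (hypothetical data layout)
    camera: (x, y, z) position of the viewer
    """
    # Translucent puffs are drawn back to front so nearer puffs blend over farther ones.
    return sorted(puffs, key=lambda p: math.dist(p, camera), reverse=True)


puffs = [(0, 0, 1), (0, 0, 10), (0, 0, 5)]
print(sort_far_to_near(puffs, camera=(0, 0, 0)))
```

The sort itself is CPU work; what the user found is that on a modern machine the cost has moved to getting the results across to the card.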