Showing posts with label hardware. Show all posts

Friday, May 23, 2008

Drivers and Builds To Try

For those who posted comments, sorry it took so long to moderate them - for some reason my spam filter decided that notifications of comments are, well, spam, so I just found them now. I should have known people would have jumped into a Vista-bashing thread. :-)

There is an X-Plane 9.02 beta 1 posted - like 901 we've been pretty quiet about this, but you can get it by enabling "get betas" and running the X-Plane updater. Please give it a try. Like 901 it is a small change for the purpose of localization, but it actually has an interesting feature pair:
  • True-type fonts and
  • Unicode-aware.
This is part of some rework we did to provide better language support. So...you should be able to run X-Plane no matter what weird characters* are in your folder names, name your airplanes funny things, and see diacritical marks. 902 uses a font that provides all of the Latin and Greek/Cyrillic code pages.

Also I have heard reports of improvements based on drivers:
  • nVidia has 175.16 drivers out and they apparently address "stuttering" issues. The stuttering issue has been on my list to investigate because it happens under Windows but not Linux. If you have stuttering performance on high-end NV hardware, particularly with forests and Windows, please try 175.16 and let me know how it goes.
  • ATI has released Catalyst 8-5. Catalyst 8-3 and 8-4 were causing "incomplete framebuffer" errors for some users, but I was unable to reproduce it (after spending a good day trying to jam Windows XP onto an iMac already crammed with Windows and Linux....yet another episode of a Tale of Three Operating Systems). Anyway, at least one user reported the issue as fixed in Cat 8-5, so if you are having problems, please try the new driver set.
As always, bugs in the X-Plane beta should go to our bug report form, on the X-Plane contacts page.

* You might accuse me of being American-centric in calling diacritical marks and Greek letters weird - but the truth is I am computer-centric...anything that is not in the original ASCII set is weird. :-)

Friday, May 16, 2008

Commodification and Operating Systems

I'll warn you in advance: this is going to start off topic and go way off topic. "Catching up" with the changes to Mac OS, Windows and Linux has me thinking about the nature of technology. I feel a little bit guilty about this post because it's going to turn into a rant about Vista, and ranting about Vista is like shooting fish in a barrel. On the other hand, having used Vista, well, I have a lot of rant to give.

One of the most important things to understand about technology (and computers are no exception) is that changes in the scale of the technology change the very nature of the technology. That is, as you make computers faster and cheaper, at some point the sum of all of those small improvements changes the fundamental nature of the beast. We've seen this as the computer transformed from mainframe to desktop (which is really just a change in cost and size), finding an entirely new audience, and now again as the computer changes from what we now know as a computer to cell phones, MP3 players, and other small, mobile devices.

"Commodification" is what happens when, as things get better, cheaper, faster, etc., consumers stop caring about the marginal improvement. Back in the days of Windows 95 and 386's, there were ways you could improve the operating system and hardware in substantial ways; a doubling in processor speed and a rewrite of the operating system got you protected memory, which meant less data loss.

A few years ago we reached the point where desktop hardware became commodified. For the average user, 1.8 vs 2.2 GHz makes no difference at all. It's a question of how quickly your computer can wait for keystrokes and data from the internet. (Answer: even a lowly Celeron is light-years faster than the I/O devices it typically has to talk to. Even if you're the last kid in your class at Harvard, you're going to be bored discussing politics with a bunch of four-year-olds.) At that point things became very difficult for major vendors like IBM (sold out), HP and Compaq (merged), Gateway (bought out of its misery), etc. The price of a desktop plummeted from over $1000 to less than $400.

I believe we've reached the point where operating systems have become a commodity as well;
  • Every major operating system has all of the features of a "real" operating system - that is, protected memory, virtual memory, plug & play driver support, etc.
  • The performance for normal applications is just about the same; there are some specific variations that matter in the server market, but for all practical purposes the operating system is not in the way, and the machine is much faster than users need anyway.
  • Every major operating system has a similarly designed GUI experience that, once you get used to the quirks of where the close box is, is just about the same, more or less. (Mac users - keep your pants on. :-)
And this is why life is not so good for Microsoft. In a non-commodified market, you can charge a premium for incremental improvements over the competition. That's a game Microsoft can play - they have a lot of capital to invest in changing their operating system, as long as they are rewarded with a lot of cash for doing it. (And normally they are - about six billion dollars for a major OS revision, I'm told.)

The problem is that operating systems are now a commodity. Simply put, users don't need a new operating system. There are no big ticket features missing from OS X 10.4, Windows XP, or Linux 2.6. This makes Microsoft's business model fundamentally vulnerable to Linux for the first time. If the name of the game is:
  • Keep costs down, as low as possible.
  • Incrementally improve quality very slowly without ever causing the pain of a major OS upgrade.
That's a game Linux, with their army of distributed bug fixers and free source code, is going to win.

When I looked at Windows XP and Ubuntu 6.06 I was afraid that Linux wouldn't gain traction in the desktop market. I blamed the adoption of X11, the KDE/GNOME schism, and the Linux community's being made up of shell nerds for the merely tolerable desktop experience.

But look where we are now: Vista is a vehicle for bloat. Combine "we make money by shipping major features" and "there are no more major features to ship" and you get Vista...an attempt to change a lot of things when you should have left things alone.*

By comparison, Ubuntu pretty much just works - you put the live CD on your machine, it asks you some questions and installs...it knows about more hardware, has fewer bugs, more drivers, and a better user experience. In a commodified operating-system space, the only thing to do is try to avoid a bad user experience - if you can't offer a really juicy carrot to users, try to avoid hitting them with a stick.

And it is in this environment that the Mac is actually gaining market share. Apple's business model has always been at odds with the industry. Complete vertical integration meant higher costs, lack of market share, and out-of-date technology - back when having more for less meant something, that was a real weakness, and explains why the Mac never dominated in market share.

But what a difference a decade makes! Hardware is now commodified (and Apple is integrated at the system-building level, leveraging cheap third party parts like they always should have). Operating systems are commodified. But on the one frontier left, quality of user experience, Apple's vertical integration gives it an immense advantage.

The question is: why does an operating system "just work"?
  • Vista: it doesn't. There are too many systems and not enough testers and engineers trying to solve the problem.
  • Linux: massive distributed engineering. For any given hardware system, eventually a Linux nerd will integrate it. Anyone can solve the problem of poor user experience.
  • Apple: they have it easy. With only half a dozen machines in production (and maybe another two dozen legacy configurations to support) they have a much smaller configuration space to worry about than anyone else.
I don't know what Microsoft's future is, but it can't be very good. At some point they are going to have to transition from a "major revision for cash" to an "incremental tuning" approach to operating systems. As long as they have market share, they still get the "Windows tax" - that is, their OEM pricing from major vendors on every new computer that is built. It's going to be harder and harder to convince the entire world to make a major jump (see how well XP to Vista went). In this situation, they'd be better off with a more solid operating system. It's unfortunate that they're going to have to try to sustain market share with Vista.

Their best-case scenario is that they eventually get Vista back to an XP-quality experience, in which case all they've done is spend a huge amount of R&D money and piss off a lot of customers to maintain the status quo.

* I have mixed opinions on Vista's video-driver-model change. But that's a different post.

Monday, May 12, 2008

Multi-Core Texture Loading

In a previous post I discussed the basic ideas behind using multiple threads in an application to get better performance out of a multi-core machine.

Now before I begin, I need to disclaim some things, because I get very nervous posting anything involving hardware. This blog is me running my mouth, not buying advice; if you are the kind of person who would be grumpy if you bought a $3000 PC and found that it wouldn't let you do X with X-Plane (where X includes run at a certain rendering setting, framerate, or make your laundry) my advice is very simple: don't spend $3000. So...
  • I do not advocate buying the biggest fastest system you can get; you pay a huge premium to be at the top of the hardware curve, particularly for game-oriented technologies like fast-clock CPUs and high-end GPUs.
  • I do not advocate buying the Mac Pro with your own money; it's too expensive. I have one because my work pays for it.
  • 8 cores are not necessary to enjoy X-Plane. See above about paying a lot of money for that last bit of performance.
Okay...now that I have enough crud posted to be able to say "I told you so"...

My goal in reworking the threading system inside X-Plane for 920 (or whatever the next major patch is called) is, among other things, to get X-Plane's work to span across as many cores as you have, rather than across as many tasks as are going on. (See my previous post for more on this.)

Today I got just one bit of the code doing this: the texture loader. The texture loader's job is to load textures from the hard drive to the video card (using the CPU, via main memory) while you fly. In X-Plane 901 it will use up to one core to do this, that core also being shared with building forests and airports.

With the new code, it will load as many textures at a time as it can, using as many cores as you have. I tested this on RealScenery's Seattle-Tacoma custom scenery package - the package is an ENV with about 1.5 GB of custom PNGs, covering about half of the ENV tile with non-repeating orthophotos.

On my Mac Pro, 901 will switch to KSEA from LOWI in about one minute - the vast majority of the time is spent loading about 500 PNG files. The CPU monitor shows one core maxed out. With the new code, the load takes fourteen seconds, with all eight cores maxed out.

(This also means that the time from when the scenery shifts to when the new scenery has its textures loaded would be about fourteen seconds, rather than a minute, which means very fast flight is unlikely to get to the new area before the textures are loaded and see a big sea of gray.)
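
As a rough sanity check on those timings, the speedup and the share of the load that did not parallelize can be estimated in a few lines of Python. This is only a sketch: both timings are approximate ("about one minute" and fourteen seconds), Amdahl's law is an assumed model rather than anything measured inside the sim, and the function names are made up for illustration.

```python
def speedup(old_seconds, new_seconds):
    """How many times faster the new code is."""
    return old_seconds / new_seconds


def implied_serial_fraction(observed_speedup, cores):
    """Solve Amdahl's law, S = 1 / (s + (1 - s) / N), for the serial fraction s."""
    return (1.0 / observed_speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    s = speedup(60.0, 14.0)  # roughly one minute vs roughly fourteen seconds
    print(f"speedup: {s:.1f}x on 8 cores")
    print(f"parallel efficiency: {s / 8:.0%}")
    print(f"implied serial fraction: {implied_serial_fraction(s, 8):.0%}")
```

With these inputs the speedup comes out a little over 4x on eight cores, which under the assumed model means something like an eighth of the original load time is still serial (or limited by something other than the CPU, such as the disk).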

Things to note:
  • Even if we don't expect everyone to have eight cores, knowing that the code can run on a lot of cores proves the design - the more the code can "spread out" over a lot of cores, the more likely the sim will use all hardware available.
  • Even if you only have two or four cores, there's a win here.
  • Texture load time is only a factor for certain types of scenery; we'll need to keep doing this type of work in a number of cases.
This change is the first case where X-Plane will actually spread out to eight cores for a noticeable performance gain. Of course the long-term trend will be more efficient use of multi-core hardware in more cases.

Saturday, May 03, 2008

A Tale of Three Operating Systems, Part II (Why You Need Bootcamp)

A while ago I put three operating systems on my laptop. With the Mac Pro I've done the same thing - it's a huge win to be able to cover such a wide swath of OS/GPU/CPU combinations with fewer machines. Last time it was OS X 10.4, Windows XP SP2, and Ubuntu 6.06. This time I repeated the process with OS X 10.5.2, Windows Vista RTM, and Ubuntu 8.04. Random observations:
  • Linux really just keeps getting stronger. I've always been a bit skeptical about Linux as a desktop environment, particularly as a Windows/Mac developer (that is to say, I'm spoiled by free high quality IDEs and debuggers that require no setup to use the platform SDK, comprehensive platform documentation in one location, etc.). But Linux installation is becoming more plug & play and trouble-free each time I make myself a live CD.
  • Windows Vista is a train wreck. I feel a little bit lame blogging this, as taking pot-shots at Vista is sort of like shooting fish in a barrel. But the contrast between Ubuntu, which has become easier to use over a year and a half, and Windows, which has not, is stark.
  • There are some positive things to say about Vista. The partition-aware installer is a real convenience for multi-booters. And once you figure out where everything has been moved to and go back to "classic" views, the OS is tolerable. But you'll still find plenty of things that will make you want to tear your hair out. My recommendation: stick with XP. (Duh.)
Now on to the performance numbers. These numbers are the Xp900 time demo fps tests 1, 2 and 3. Each set of 3 numbers is from the three phases.
      Test 1        Test 2        Test 3
MAC    49/ 60/ 62    38/ 43/ 44    21/ 20/ 21
WIN   121/128/133   114/115/119    77/ 75/ 82
LIN   143/144/157   130/123/132    92/104/113
That's not a typo. Linux is beating out Vista, but both are absolutely killing OS X. What's going on here? I don't know. But there appears to be something that isn't well optimized in the GeForce 8 drivers on OS X.
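
To put a number on "killing", here is a small Python sketch that averages the three phases of each test and compares Windows and Linux against OS X. The figures are copied from the table above; the averaging is just one reasonable way to summarize them, and the names are illustrative.

```python
from statistics import mean

# Frame rates copied from the table above: three tests, three phases each.
FPS = {
    "MAC": [(49, 60, 62), (38, 43, 44), (21, 20, 21)],
    "WIN": [(121, 128, 133), (114, 115, 119), (77, 75, 82)],
    "LIN": [(143, 144, 157), (130, 123, 132), (92, 104, 113)],
}


def ratio_vs_mac(os_name):
    """Per-test ratio of average fps against OS X on the same machine."""
    return [mean(test) / mean(mac) for test, mac in zip(FPS[os_name], FPS["MAC"])]


if __name__ == "__main__":
    for name in ("WIN", "LIN"):
        print(name, [f"{r:.1f}x" for r in ratio_vs_mac(name)])
```

On these numbers both operating systems run at more than twice the OS X frame rate in every test, and the gap widens as the tests get heavier.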

I suspect Apple will close this gap eventually; don't bother asking me for status information on this because if they ever tell me what's going on, I'll be bound by NDA not to tell you.

For now my recommendation is: consider dual-booting into Linux - it's pretty easy to install Ubuntu and you'll get great X-Plane performance. With good drivers, the Mac Pro and 8800 are just monstrous.

Tuesday, April 29, 2008

Too Many Computers

The New Mac Pro came today. I'm very excited about this computer because it may be the first computer that can render an entire set of global scenery on its own in under three days. We've always had to apply multiple computers to the problem, sometimes in different cities, and synchronize the data. Being able to do the render on one box is a nice simplification.

A few immediate observations:
  • Man is it expensive. If you aren't Warren Buffett and your work isn't buying it, think long and hard about why you need it. The iMac is a more reasonable X-Plane computer. (And from what I understand, both Radeon HDs and GeForce 8's have driver problems on OS X; until Apple fixes this, consider BootCamp. With the new iMacs you can pick which next-gen card with "work-in-progress" drivers you want.)
  • Buy third party parts...not only will you save money, but you'll get the joy of installing them. Since the "blue and white G3" Apple's cases have been on par with well-designed PC cases for accessibility. The Mac Pro is nicer though - hard drives and memory are all installed on rail-guided slide-in parts; two hard drives and two DIMMs took about 5 minutes and a single screw-driver.
My office is going to explode. Here's a rough list of all of the Macs we have in the house now:
  • Mac Pro (the new scenery rendering machine, also graphics with 8800, will be triple-boot).
  • Aluminum iMac (mostly for testing graphics code on Radeon HD, but it's now the DVD burning station, runs Ubuntu 7.10).
  • Mac Book Pro (old X1600-based, portable dev machine, triple boot but LILO is dead).
  • G5 (the old rendering machine, kept around to regress bugs on PPC, R300 chipset, OS X 10.3, etc.).
  • Dell P-IV 2.6 GHz - mostly used to record music, but it does have a GeForce 6 in it, and it's a good low-end test system. (A 2.6 GHz P-IV doesn't get you real far, the FSB is slow, and it's only got 4x AGP.)
  • G4 laptop (800 MHz) - case is falling apart, but it's there if I ever need to try to run on a really low-end system. Actually at this point the laptop is so far below min specs that it probably isn't worth it.
  • Mac Book - got this for my wife, but if I begged she'd let me regress Intel X3100 bugs on it.
So there are two points to note here, and neither of them is "Ben has a lot of computers" or "Ben's office is a mess."
  1. X-Plane 9's use of graphics hardware has caught up enough with the bleeding edge that I now basically have to have "one of everything" to really debug the sim. We'll have some major updates to shaders in future patches, so having accumulated all of this hardware in-house (a lot in the last few months) will help me debug these things faster. A lot of these setups are due to "X doesn't work" bug reports specific to certain hardware/OS combinations.
  2. It's really handy that Macs now have x86 chips because it cuts down the number of boxes. The new Mac Pro will give me an nVidia DX10-type platform not only for OS X but also for Vista (first Vista machine in the house, not because I want Vista, but because someone in the company should have at least one copy) and Linux too, if I can make it work.*
So the next time we release a beta and it immediately crashes your computer, please bear with us. It really did work bug free on some computers we own...we're getting closer to a complete matrix to catch more of these incompatibilities early on. But even with that huge mess of hardware we're still missing a lot of combinations.

* Triple-booting Mac/Win/Linux is a much bigger PITA than double boot. The problem is that the old MBR-type setups only give you four partitions, of which one gets eaten for the firmware. That leaves you three partitions and three operating systems, but by default Linux wants a second one for swap (a good idea). So you need to either stick a fifth partition on and fix the MBR by hand or reconfigure Linux to use on-main-volume swap-space. In summary, with only two operating systems, either Windows or Linux "just installs", but to put three on one drive, you get into customization.

Friday, April 25, 2008

Threads and Cores

Now that multi-core machines are mainstream, you'll hear a lot of talk about "threads". What is a thread, and how does it relate to using more cores?

Definitions

A "core" is an execution unit in a CPU capable of doing one thing. So an 8-core machine might have two CPUs, each with four cores, and it can do eight tasks at once.

A "thread" is a single stream of work inside an application - every application has at least one thread. Basically a two-threaded application can do two things at once (think of driving and talking on your cellular phone at the same time).

Now here's the key: only one core can run a thread at one time. In other words, if you have an eight core machine and a one-thread application, only one core can run that application, and the other seven cores have to find something else to do (like run other applications, or do nothing).

Two more notes: a thread can be "blocked" - this means it's waiting for something to happen. Blocked threads don't use a core and don't do anything. For example, if a thread asks for a file from disk, it will "block" until the disk drive comes up with the data. (By CPU standards, disk drives are slower than snails, so the CPU takes a nap while it waits.)

So if you want to use eight cores, it's not enough to have eight threads - you have to have eight unblocked threads!

If there are more unblocked threads than cores, the operating system makes them take turns, and the effect is for each of them to run slower. So if we have an application with eight unblocked threads and one core, it will still run, but at one eighth the speed of an eight core machine.

It's not quite that simple, there are overheads that come into play. But for practical purposes we can say:
  • If you have more unblocked threads than cores, the execution speed of those threads slows down.
  • If you have more cores than unblocked threads, some of those cores are doing nothing.
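
The two rules above boil down to a little arithmetic. Here is a minimal Python sketch of that simplified model; it ignores the overheads just mentioned, and the function names are invented for illustration.

```python
def per_thread_speed(unblocked_threads, cores):
    """Fraction of full speed each unblocked thread gets (overheads ignored)."""
    if unblocked_threads <= 0:
        return 0.0
    return min(1.0, cores / unblocked_threads)


def idle_cores(unblocked_threads, cores):
    """Cores left with nothing to run."""
    return max(0, cores - unblocked_threads)


if __name__ == "__main__":
    for threads, cores in [(8, 1), (8, 8), (1, 8), (3, 2)]:
        print(threads, "threads on", cores, "cores:",
              f"{per_thread_speed(threads, cores):.3f} speed each,",
              idle_cores(threads, cores), "idle cores")
```

Eight unblocked threads on one core each get one eighth of a core, while one thread on eight cores runs at full speed and leaves seven cores idle.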
Trivial Threads

When a thread is blocked, it does not use any cores. So while X-Plane has a lot of threads, most of them are blocked either most or all of the time. For all practical purposes we don't need to count them when asking "how many cores do we use". For example, G1000 support is done on a thread so that we keep talking to the G1000 even if the sim is loading scenery. But the G1000 thread spends about 99.9% of its time blocked (waiting for the next time it needs to talk) and only 0.1% actually communicating.
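
You can see the "blocked threads cost nothing" effect for yourself with a small Python experiment. This is a generic sketch, not X-Plane code: the worker simply waits on an event that never fires (standing in for something like the G1000 thread waiting for its next update), and we compare wall-clock time with the CPU time the whole process consumed.

```python
import threading
import time


def measure_blocked_thread(wait_seconds=0.2):
    """Return (wall_seconds, cpu_seconds) for a thread that only waits."""
    wake_up = threading.Event()        # never set, so wait() simply times out
    worker = threading.Thread(target=wake_up.wait, args=(wait_seconds,))

    wall_start = time.perf_counter()
    cpu_start = time.process_time()    # CPU time summed over the whole process
    worker.start()
    worker.join()                      # the main thread blocks here too
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return wall_used, cpu_used


if __name__ == "__main__":
    wall, cpu = measure_blocked_thread()
    print(f"wall clock: {wall:.3f}s, CPU used: {cpu:.3f}s")
```

The wall-clock time should be about the length of the wait, while the CPU time should be a tiny fraction of it, because neither thread was occupying a core while blocked.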

What Threads Are Floating Around

So with those definitions, what threads are floating around X-Plane? Here's a short list from throwing the debugger on X-Plane 9.0. (Some threads may be missing because they are created as needed.)
  • X-Plane's "main" thread which does the flight model, drawing, and user interface processing.
  • A thread that can be used to debug OpenGL (made by the video driver, it blocks all the time).
  • Two "worker" threads that can do any task that X-Plane wants to "farm out" to other cores. (Remember, if we want to use more cores, we need to use more threads.)
  • The DSF tile loader (blocks most of the time, loads DSF tiles while you fly).
  • At least 3 threads made by the audio driver (they all block most of the time).
  • At least four threads made by the operating system's user interface code (they block most of the time).
  • The G1000 worker thread (blocks most of the time, or all the time if you don't have the G1000 support option).
  • The QuickTime thread (only exists when QuickTime recording is going on).
So if there's anything to take away from this it is: X-Plane has a lot of threads, but most of them block most of the time.

Core Use Now


So how many cores can we use at once? We only need to look at threads that aren't blocked to add it up. In the worst flying case I can think of:
  1. The main thread is rendering while
  2. The DSF tile loader is loading a just-loaded tile while
  3. One of the pool threads is building forests while
  4. You are recording a QuickTime movie (so the QT thread is compressing data).
Yep. If you really, really put your mind to it, you can use four cores at once. :-) Of course, two cores is a lot more common (DSF tile loading or forests, but not both at once, and no QuickTime).

Core Use In the Future

Right now some of X-Plane's threads are "task" oriented (e.g. this thread only loads DSF tiles), while others can do any work that comes up (the "pool threads", it's like having a pool car at the company, anyone can take one as needed). The problem with this scheme is that sometimes there will be too many threads and sometimes too few.
  • If you have a dual-core machine, having forests building and DSF loading at the same time is bad - with the main thread that's three threads, two cores; each one runs at two-thirds speed. But you don't want the main thread to slow down by a third, that's a fps hit.
  • If you have a four-core machine, then when the DSF tile is not loading, you have cores being wasted.
Our future design will allow any task to be performed on a "pool thread". The advantage of this is that we'll execute as many tasks as we have cores. So if you have a dual-core machine, when a DSF tile load task comes along while forests are being built, the one pool thread will alternate tasks, leaving one core to do nothing but render (at max fps). If you have a four-core machine, the DSF load and forests can run at the same time (on two pool threads) and you'll have faster load times.*

* Who cares about load time if you're flying? Well, if you crank up the settings a lot and fly really fast, the loader can get behind, and you'll see trees missing. X-Plane is always building trees in front of you as you fly and deleting the ones behind you. So using more cores to build the forests faster means you're less likely to fly right out of the forest zone at high settings.
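
For the curious, the pool-thread idea described in this post can be sketched in Python with the standard library's thread pool. This is an illustration only, not the sim's implementation: sizing the pool at "cores minus one" is an assumption drawn from the dual-core example, and load_dsf_tile and build_forests are hypothetical stand-ins for real work. (Python threads also don't run CPU-heavy Python code in parallel the way native threads do, so treat this as a picture of the structure, not a benchmark.)

```python
import os
from concurrent.futures import ThreadPoolExecutor


def make_pool(core_count=None):
    """One pool thread per core, minus one core reserved for the main thread."""
    cores = core_count or os.cpu_count() or 1
    workers = max(1, cores - 1)
    return ThreadPoolExecutor(max_workers=workers), workers


def load_dsf_tile(name):   # hypothetical stand-in for real work
    return f"loaded {name}"


def build_forests(name):   # hypothetical stand-in for real work
    return f"forests for {name}"


if __name__ == "__main__":
    pool, workers = make_pool()
    with pool:
        # Any kind of task goes into the same queue; no thread is tied to one job.
        jobs = [pool.submit(load_dsf_tile, "tile_a"),
                pool.submit(build_forests, "tile_a")]
        for job in jobs:
            print(job.result())
    print("pool threads:", workers)
```

With one pool thread the two tasks simply run one after the other; with more pool threads they can overlap, without any change to the code that submits them.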

Friday, March 21, 2008

More Threads - the Installer

Nine women cannot have a baby in one month - that's the classic example that gets thrown around computer science for the difficulty of parallelization - that is, just because we have ten times as many resources doesn't mean we're going to go ten times as fast.

Problems of scalability via parallelization have become very important for graphics engines as everybody and their mother now has at least two cores, and users with more serious hardware are going to have four.

I get asked a lot: "will X-Plane utilize X cores..." where X is a number larger than two. My general answer is: sometimes, maybe, and probably more in the future. I can't make strong predictions for what we'll ship in the future, but the general trend for the last 18 months has been us using more cores every time we go into a piece of code to do major architectural changes.

I've been doing a lot of work on the installer this week - the first major overhaul of the installer since we originally coded it all the way back at X-Plane 8.15. And the new installers and updaters will try to take advantage of multiple CPUs where possible. A few cases:
  • The X-Plane updater runs an MD5 checksum over the entire X-Plane folder to determine which version of the various file components you have and whether they need to be updated. This is not a fast process. I am working on threading this so that more CPUs can work on the problem at once. It looks like there will be only modest benefits from this because the process is also highly bottlenecked on the disk drive.
  • The installation engine from the DVD will use more than one CPU to decompress files. For zip compression this wasn't very important, but the scenery will be compressed via 7-zip compression to get us down to disk DVDs. 7-Zip compresses DSFs about 10% smaller per file than zip, but it's horribly slow to decompress, so being able to throw twice the CPU at it is a big win.
Now on one hand, our top performance goals are for the sim, not the installer. On the other hand, faster installations are good. But my main point here is: when we wrote new code four years ago, we assumed one CPU and a nice graphics card. We now assume at least two cores and possibly more, and that informs the design of every new feature, not just the rendering engine. If we don't create a multi-core-friendly design, we're effectively leaving at least 50% of the CPU on the table.
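
To make the checksum case concrete, here is a minimal Python sketch of hashing every file in a folder on several threads. It is not the updater's code; the function names and the default of four workers are assumptions for illustration. As noted above, the disk is often the real bottleneck, so more threads may help only modestly.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Hash one file in chunks so large files never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_folder(root, workers=4):
    """Map each file's path (relative to root) to its MD5, hashing in parallel."""
    root = Path(root)
    files = sorted(p for p in root.rglob("*") if p.is_file())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        digests = list(pool.map(md5_of_file, files))
    return {str(p.relative_to(root)): d for p, d in zip(files, digests)}
```

Calling md5_of_folder on an install folder returns a dictionary that could then be compared against a list of expected checksums to decide which files need updating.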

Sunday, March 02, 2008

Hardware Profiles

X-Plane 9 currently recognizes (roughly) three categories of graphics hardware:
  • Non-Pixel-Shader hardware (GeForce 2,3,4, Radeon 7000-9200)
  • First-Generation Shader Hardware (GeForce FX 5nnn, Radeon 9500-9800, X300-X600)
  • Later-Generation Shader Hardware (GeForce 6, 7, 8, Radeon X200, Radeon X700+)
That first bucket is pretty simple: those cards don't support programmable pixel shaders (as we know them today) and can't run any shader effects. The "use pixel shaders" check box doesn't appear in the rendering settings.

The distinction between the latter two is a little bit more subtle. Basically the first generation of pixel shader cards (the 9700 and friends) support only 96 instructions for each pixel shader; this puts us right on the edge of sometimes not being able to draw all of our effects; we have to simplify the water slightly to make it work. The next generation of chips (X850 and friends) doesn't have this limitation.

By comparison, while NVidia cards have been able to handle long shaders from day one, the GeForce 5's shader performance is really poor.

So we bucket all of these chips as "first-gen". When we detect this we:
  • Simplify shaders slightly (gets us out of trouble with the 9700).
  • Don't default to shaders being on when the sim is first booted (because the framerate will probably be unusably slow).
Even though the 9700 provided very usable shader performance in its day, by the standards of modern GPUs, this older chip isn't that fast, so it's probably for the best that we not enable reflective water by default on machines with these cards.

By comparison, X-Plane deals with almost all other capabilities on an a-la-carte basis; particular features are enabled if the right menu of hardware features is available. We do this to try to deal more flexibly with the wide variety of cards that are out there. Some examples:
  • You'll get hardware accelerated runway lights if your card supports pixel shaders and sprites (virtually all shader-enabled cards have sprites).
  • You'll get sun-glare effects if your card supports pixel counting (virtually all modern cards can do this).
  • The non-pixel-shader rendering code will show more detail if your card supports more texture units (this is only an issue with very old hardware).
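
Here is a rough Python sketch of how the two approaches described in this post (coarse buckets for shader generation, a-la-carte checks for everything else) might fit together. Every name in it is hypothetical, a real renderer would fill in the capability flags by querying the driver, and the "four or more texture units" threshold is an assumption picked only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass
class GpuCaps:
    """Hypothetical capability flags; a real renderer would query the driver."""
    pixel_shaders: bool = False
    first_gen_shaders: bool = False   # short-shader or slow-shader cards
    point_sprites: bool = False
    pixel_counting: bool = False
    texture_units: int = 2


def choose_features(caps):
    """Return the rendering features to enable for one card."""
    return {
        # Coarse bucket decisions.
        "shader_checkbox_visible": caps.pixel_shaders,
        "shaders_on_by_default": caps.pixel_shaders and not caps.first_gen_shaders,
        "simplified_shaders": caps.pixel_shaders and caps.first_gen_shaders,
        # A-la-carte decisions: each feature needs its own menu of capabilities.
        "hw_runway_lights": caps.pixel_shaders and caps.point_sprites,
        "sun_glare": caps.pixel_counting,
        "extra_detail_no_shaders": caps.texture_units >= 4,  # assumed threshold
    }


if __name__ == "__main__":
    first_gen = GpuCaps(pixel_shaders=True, first_gen_shaders=True,
                        point_sprites=True, pixel_counting=True, texture_units=8)
    for feature, enabled in choose_features(first_gen).items():
        print(f"{feature}: {enabled}")
```

The point of the a-la-carte half is that each feature looks only at the capabilities it actually needs, so an unusual card still gets every effect it can support.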
I've been looking over hardware profiles a lot lately, and I suspect that the next big "jump" in hardware will be the DX10-compliant cards (GeForce 8, Radeon HD). There's a lot of fine print in what the various cards can do between all of the pre-DX10 cards; at some point when we decide what menu of features we'll require for rendering, we need to simplify.

My guess is that when we start to have "really advanced" pixel shaders that require hardware more sophisticated than what we need now, we'll simply require a DX10 card. Otherwise we'll have to sort through 8 different profiles of fine print, only to attempt to partially support cards that probably won't be fast enough anyway.

(That is to say, a feature is only useful for us if it can run reasonably quickly. It doesn't make sense for us to try to make a special "simplified" version of a rendering feature for, say, the X850 if every X850 is going to turn it off every time for framerate reasons.)

If any of this turns into hardware buying advice, I suppose it would be this:
  • If you are deciding between a DX10 and DX9 card (e.g. between the HD2400 and X1900, or GeForce 8600 vs 7900) go for the newer generation DX10 cards (HD or GeForce 8); if the card has decent performance you'll also be setting yourself up for future features.
  • As always, pay attention to the fine print of the model numbers, particularly the configuration. Lower model number cards basically have fewer parallel components than higher number ones and that leads directly to lower framerate.
I see an ad online for the GeForce 8500 for $70 and 8600 for $90. But if you look at the links below, you'll see that the 8500 has only half the shaders of the 8600 - that's going to be a huge performance difference for $20.

(So the moral of this story is: try to get an HD or GeForce 8 card, but don't dip into the really low end cards because they're stripped down too far for X-Plane use.)

These pages (NV, ATI) on Wikipedia list specs for a whole pile of cards and can be useful to decode the fine print.

Sunday, February 17, 2008

Road Trip

I'm going to be more or less out of the office next week - the Amichai Margolis band is playing a series of shows in Florida. I'm going to have to take the laptop - we're too close to going final to miss a week of bug reports, but hopefully the next beta will stick for a while. (Initial reports indicate that the video driver initialization changes we made are fixing crashes and not causing new ones.) There will be a beta 23 shortly to address other bugs.

I wanted to take a moment to thank all of the users who have helped me debug video card compatibility problems remotely. I now have seven installed operating system configurations in my office, and it isn't nearly enough to see all of the problems we hit in the field. Looking back at my in-box this week is a reminder of how patient you all have been in trying test builds, sending log after log, helping get to the bottom of some very tricky issues.

Things are going to be a bit busy for the next few weeks - once I get back we'll be finishing up the last few bugs, so it will take a little while to get back to questions about scenery, plugins, etc.

Thursday, February 14, 2008

Instability in Version 9

One of the reasons why the X-Plane 9 betas have had so many more crash bugs than version 8 is that we introduced loading DSFs on a second core. This feature makes scenery loads much faster and (by using the second core) impacts fps less while they happen.

The problem is that I'm still fumbling with code that will allow this in all cases. (That would be three operating systems, two hardware vendors, and a myriad of drivers, some new, some quite prehistoric.)

Beta 22 will be out soon, and will contain the fourth major rewrite of the OpenGL setup code for X-Plane 9. So far the initial tests look good, but we never know until we let a lot of users try the code and find the new edge cases.

It's relatively easy to tell if your instability is related to the use of OpenGL with threads: simply run the sim with the --no_threaded_ogl option. If things become a lot more stable, it's a threaded GL problem. Mind you --no_threaded_ogl is more of a diagnostic than a workaround; without threaded OpenGL, the sim will pause when loading scenery.

(Also, to clarify, you'll find talk on discussion groups and game forums about "threaded drivers". Threads are a programming abstraction that can utilize multiple cores. What I am talking about is X-Plane using multiple threads to load scenery - in our case this requires interfacing with OpenGL. But a threaded driver is different - it's just a graphics driver that's been optimized for multicore machines. These two concepts are totally different; you don't need a threaded driver to use X-Plane 9, and a threaded driver won't make X-Plane 8 load without pauses.)
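The difference between loading on the main thread and loading on a worker thread can be sketched in a few lines. This is only an illustration of the general pattern, not X-Plane's code: there is no OpenGL here, `load_tile` is a hypothetical stand-in for an expensive scenery load, and the "frames" are just a counter.

```python
import queue
import threading
import time


def load_tile(name):
    """Hypothetical stand-in for an expensive scenery load."""
    time.sleep(0.05)
    return f"{name}:loaded"


def run_frames(tile_names, threaded=True, max_frames=1000):
    """Return (loaded_tiles, frames_drawn_while_loading)."""
    done = queue.Queue()
    loaded = []

    def worker():
        for name in tile_names:
            done.put(load_tile(name))

    if threaded:
        # The worker loads in the background; the main loop keeps drawing.
        threading.Thread(target=worker, daemon=True).start()
    else:
        # Loading on the main thread: no frames are drawn until it finishes.
        for name in tile_names:
            loaded.append(load_tile(name))

    frames = 0
    while len(loaded) < len(tile_names) and frames < max_frames:
        frames += 1          # "draw" one frame
        time.sleep(0.001)
        try:
            while True:      # collect whatever the worker has finished
                loaded.append(done.get_nowait())
        except queue.Empty:
            pass
    return loaded, frames
```

With `threaded=False` the frame counter stays at zero during the load, which is the pause you see with --no_threaded_ogl; with `threaded=True` frames keep being drawn while tiles arrive.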

Thursday, February 07, 2008

GeForce 7 and Water Performance

A number of Windows and Linux GeForce 7 users have discovered that the command-line option --no_fbos improves their pixel-shader framerate a lot. Windows and Linux Radeon HD users have also discovered that --no_fbos cleans up artifacts in the water. Here's what's going on, at least as far as I can tell. (Drivers are black boxes to us app developers, so all we can do is theorize based on available data and often be proved wrong.)

Warning: this is going to get a bit technical.

FBO stands for framebuffer object, and simply put, it's an OpenGL extension that lets X-Plane build dynamic textures (textures that change per frame) by drawing directly into the texture using the GPU. Without FBOs we have to draw to the main screen and copy the results into the dynamic texture. (You don't see the drawing because we never tell the card "show the user".)

We like FBOs for a few reasons:
  • Most importantly, FBOs allow us to draw big images in one pass even if the screen is small. For example, if we have a 1024x1024 dynamic texture but the screen is 1024x768, then without FBOs we have to draw the image in two parts and stitch it together. That sucks. With FBOs we can just draw straight to the texture and not worry about our "workspace" being smaller than our texture. This is going to become a lot more important for future rendering features where we need really-frickin' big textures.
  • It's faster to draw to the texture than to copy to it.
  • If you're running the sim with FSAA, then we end up using FSAA to prepare all of those dynamic textures. In virtually all cases, we don't need the quality improvements of FSAA, so there's no point in taking the performance penalty. When we render right into the texture, FSAA is bypassed and we prep our dynamic textures a lot faster.
Since copying to a texture from the screen predates these new-fangled FBOs by several years, most drivers can copy from the screen to the texture very quickly; however we have hit at least one case where FBOs are much faster than copy-from-screen. That's really a rare bug, and as you'll see below, we see more weird behavior with FBOs.
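The "draw in parts and stitch" cost from the first bullet above can be estimated with a little arithmetic. This sketch assumes the copy path simply tiles the texture with screen-sized pieces, which is a simplification; the function name is invented for the example.

```python
import math


def copy_passes(texture_size, screen_size):
    """Number of screen-sized pieces needed to cover a texture
    when drawing to the screen and copying (no FBOs).

    Assumes a plain grid of tiles, each at most as big as the screen.
    """
    tex_w, tex_h = texture_size
    scr_w, scr_h = screen_size
    return math.ceil(tex_w / scr_w) * math.ceil(tex_h / scr_h)


# The example from the text: a 1024x1024 texture on a 1024x768 screen.
print(copy_passes((1024, 1024), (1024, 768)))
```

With an FBO the answer is always one pass, no matter how small the screen is.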

When do we use FBOs vs. copying? Well, it's pretty random:
  • Pixel shader reflective water and fog use FBOs.
  • Cloud shadows and the sun reflection when pixel shaders are off do not use FBOs.
  • The airplane panel uses FBOs if the panel is 1024x1024 or smaller; if the panel is larger than 1024x1024 we draw from the screen and patch things together. So the P180 and the C172 are using different driver techniques!!
When you run X-Plane with --no_fbos, you instruct X-Plane to ignore the FBO capability of the driver, and we use copy-from-screen everywhere.
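The rules above boil down to a small decision table. Here is a sketch of it; the feature names and the function are hypothetical labels for this example, not identifiers from the sim.

```python
PANEL_FBO_LIMIT = 1024  # panels up to 1024x1024 use an FBO, per the list above


def render_path(feature, panel_size=None, no_fbos=False):
    """Return "fbo" or "copy" for a dynamic texture.

    feature is one of the illustrative labels below; panel_size is
    (width, height) and is only needed for "panel".
    """
    if no_fbos:
        # --no_fbos: ignore the driver's FBO support everywhere.
        return "copy"
    if feature in ("shader_water", "shader_fog"):
        return "fbo"
    if feature in ("cloud_shadows", "sun_reflection_no_shaders"):
        return "copy"
    if feature == "panel":
        width, height = panel_size
        fits = width <= PANEL_FBO_LIMIT and height <= PANEL_FBO_LIMIT
        return "fbo" if fits else "copy"
    raise ValueError(f"unknown feature: {feature}")
```

So a 1024x1024 panel and a 2048x1024 panel end up on different paths, and --no_fbos forces everything onto the copy path.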

Mipmapping

There is one more element: mipmapping. A mip map is a series of smaller versions of a texture. Mipmapping allows the video card to rapidly find a texture that is about the size it needs. Here's an example: imagine you have a building with a 128x128 texture. If you park your plane by the building, the building might take up about 100x100 pixels on the screen; your 128x128 texture is a good fit.

Now taxi away from the building and watch it get smaller out your rear window. After a while the building is only taking up 8x8 pixels. What good is that 128x128 texture? It's much too big for the job. With mipmapping, the card has a bunch of reduced-size versions of your texture lying around...64x64, 32x32, 16x16, 8x8, 4x4, 2x2, 1x1. The video card realizes the building is tiny and grabs the 8x8 version.

Why not just use the 128x128 texture? Well, we'd only have two options with this texture:
  1. Examine all 16384 pixels of the texture to compute the 64 pixels on screen. That sucks...we're accessing VRAM 256 times for each pixel. Accessing VRAM is slow, so this would kill performance.
  2. Simply pick 64 pixels out of the 16384 based on whatever is nearby. This is what the card will do if mipmapping is not used (because option 1 is too slow) and it looks bad. Those 64 pixels may not be a good representation of the 16384 that make up your building side.
So mipmapping lets the video card grab a small number of pixels that still capture everything we need to know about the building at low res.
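The building example can be worked through in code. This is a deliberately simplified sketch: real hardware picks (and blends) mip levels per pixel from texture coordinate derivatives, while this version just chooses one level for a square, power-of-two texture. Both function names are invented for the example.

```python
def mip_chain(size):
    """Edge lengths of every mip level for a square power-of-two texture."""
    levels = [size]
    while size > 1:
        size //= 2
        levels.append(size)
    return levels


def pick_level(texture_size, on_screen_size):
    """Smallest mip level that still covers the on-screen footprint."""
    chosen = texture_size
    for level in mip_chain(texture_size):
        if level >= on_screen_size:
            chosen = level
    return chosen


print(mip_chain(128))
print(pick_level(128, 100))  # parked next to the building
print(pick_level(128, 8))    # far away, building is 8x8 on screen
```

Up close the full 128x128 texture is used; once the building shrinks to 8x8 pixels, the 8x8 level is enough.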

We don't mipmap our dynamic textures very often; the only ones that we do mipmap are the non-pixel-shader sun reflections and the pixel-shader sun reflections.

ATI

As far as we can tell, the current ATI Catalyst 8.1 drivers do not generate mipmaps correctly for an FBO-rendered texture. This is why without --no_fbos ATI users on Windows or Linux see very strange water artifacts. --no_fbos switches to the copy path, which works correctly.

At risk of further killing my track record of driver bugs in v9, we do think this is a bug. We have good contact with the ATI Linux driver guys so I have hopes of getting this fixed.

nVidia

It appears that the process of creating mipmaps for FBO textures is not accelerated by the hardware on the GeForce 7 GPU series. This is why GeForce 7 users are seeing such poor pixel shader performance, while GeForce 8 users aren't having problems.

Now poor performance is not a bug; there's nothing in the OpenGL spec that says "your graphics card has to do this function wickedly fast". Nonetheless, what we're seeing now is unusably slow. So more investigation is needed -- given that the no-FBO case runs so quickly, I suspect the hardware itself can do what we want and it's just a question of the driver enabling the functionality. But I don't know for sure.

Saturday, January 26, 2008

Performance Wrap-up (for now)

The story on X-Plane performance is never over, but the chapter that is 9.00 pretty much is. I think we'll be RC in the next build (if all goes well). Certainly a lot of the things that are still performance "problems" will require changes larger than we can do in a late beta.

I say problems in quotes because a lot of what's been reported lately is in the form of: a huge screen res + a lot of shaders + a lot of FSAA = slow fps. That's not really a bug, that's an engine limitation. Now I want to make the engine as fast as possible, and a lot of this pixel shader stuff is new to 9.0, so if our track record for tuning stays the way it was for v8, we'll probably get some efficiency improvements later.

But unfortunately there's an underlying limitation: the new water and fog both cause the rendering engine to consume significantly more hardware resources than it would otherwise. Turn them on and you get prettier pictures at a price.

Just to post a few general things I've found:
  • X-Plane 9 will tell you where your GPU really stands. GPUs that were very adequate for X-Plane 8 (like the GeForce 6600 GT) will turn out to have nothing left in reserve for v9, while GPUs that were bored in v8 (the GeForce 8800 GTX for example) will show what they really have.
  • Generally the cost of going from no shaders to shaders with water reflections of "none" and no volumetric fog should be very low if your screen res and FSAA don't add up to something crazy (like 16x FSAA at 2048x2048).
  • If you do have serious performance hits, try --no_fbos in the command-line; some drivers seem to have trouble with them.
  • The P180's virtual cockpit is a lot more expensive than the other ones, because it has a huge panel that is used in 3-d. We'll hopefully rebuild the cockpit at some point.
  • Turning water reflections to "complete" is very expensive. Watch the water and use the lowest setting that looks good. You don't need complete reflections if there are a lot of waves!
  • Shaders, FSAA, and screen size are all pulling from the same set of resources - be careful about cranking up all three.
  • Check your v-sync - a lot of users whose vsync clamped them at 60 fps in v8 will be clamped at 20 in v9.
  • Do your testing with texture res set low, then crank texture res later; pixel shaders also require the allocation of VRAM that can't be purged (for things like reflection images) so running out of VRAM can show up in some weird ways performance-wise.
  • The new Intel iMacs have serious performance problems with shaders on. This is due to driver limitations; given the much better performance under BootCamp, I expect the Mac performance to get better when the drivers are updated. For now I'd keep shaders off.
For now, please hold off on sending me performance reports. I just don't have time to address them. In the future I will try to solicit very specific performance data points that we need to check. Perhaps in the future we can also set up a database of fps-test results to have a more comprehensive idea of how the hardware does its job.

I expect future features to appear in v9 that further eat hardware; those features will have an off switch. You may have to pick and choose what graphics you enable; there is no free lunch here. I also expect new graphics cards to emerge that make the GeForce 8800 GTX seem quaint!

ATI: 2. Ben: 0.

What a difference new drivers make. ATI's latest OpenGL drivers (Catalyst 8.1) seem to work quite well with X-Plane. On two fronts:
  • Linux. Turns out all you need to do to make X-Plane happy on Linux with ATI hardware is update the drivers. I'm running with the Cat 8.1 drivers on my MacBook Pro and things look good. Use Catalyst 7.11 drivers or newer! No more MALLOC_CHECK_=1 or --no_threaded_ogl. With the next beta, you won't have to use --force_run anymore.
  • Windows. We were getting reports of corrupt screens on startup, and with the Catalyst 8.1 drivers these reports became very frequent. Turns out our threaded OpenGL code was doing something naughty*. Beta 19 fixes this.
The only known issue I can think of is: if you see corrupt water reflections, run with --no_fbos.

* Well, the way you set up threaded OpenGL on Windows and Linux is not very well documented, so I say naughty in that we made the drivers unhappy. I have yet to find a document that states clearly whether what we were doing is correct or not. We had to guess.

Tuesday, January 08, 2008

A New Broken Record

For years now I've been harping about ways to keep the number of batches down in your scenery. A batch is a single submission of triangles to the graphics card for drawing. Batches get rendered fast even if they contain a ton of triangles, but changing modes between batches is not very fast, so a few large batches is hugely better for performance than a large number of tiny batches.

To play that broken record one more time, there are two ways that you (a scenery designer) can cut down the number of batches:
  • Use a small number of larger textures instead of a large number of small textures, preferably sharing textures between similar scenery elements that are placed nearby. X-Plane will do its best to merge the content that uses those textures into single batches. We call this the "crayon rule."
  • Use fewer attributes in your objects. Attributes usually require a new batch (after the graphics card mode has been changed due to the attribute). So if you've got 1000 attributes in your object, you've got a problem.
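Here is a toy model of why shared textures help. It assumes the only thing that forces a new batch is a texture change between consecutive draws, which ignores attributes and every other kind of state; the function and texture names are made up.

```python
from itertools import groupby


def count_batches(textures_in_draw_order):
    """Count batches when consecutive draws sharing a texture are merged."""
    return sum(1 for _ in groupby(textures_in_draw_order))


# Four objects alternating between two small textures: no merging possible.
print(count_batches(["hangar", "tower", "hangar", "tower"]))

# Same objects grouped by texture: two batches.
print(count_batches(sorted(["hangar", "tower", "hangar", "tower"])))

# Same objects sharing one big texture: one batch.
print(count_batches(["airport_atlas"] * 4))
```

Fewer, larger textures give the sim more chances to merge nearby content into a single batch.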
Well, with X-Plane 9 I have a new broken record: avoid overdraw!

Overdraw is the process of drawing pixels on top of other pixels on screen. It happens any time we use blending to do translucency, and any time we use polygon offset to build the image in layers.

Overdraw is bad because with X-Plane 9's pixel shaders, most users are slowed down by the graphics card's ability to fill in pixels (pixel fill rate), with those complex shaders being run for every pixel. If you are at a screen-res of 1200x1024 looking at the ground with no objects, that might be 1.2 million pixels to fill. But if there is an overlay polygon covering the ground, we have 2.4 million pixels to fill! That's a huge framerate hit.
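The arithmetic is simple enough to write down. This sketch assumes every layer covers the whole screen, as in the example above; the function name is just for illustration.

```python
def pixels_filled(width, height, layers=1):
    """Pixels the card must shade if `layers` full-screen layers are drawn."""
    return width * height * layers


print(pixels_filled(1200, 1024))      # bare ground
print(pixels_filled(1200, 1024, 2))   # ground plus one full overlay
```

Each full-screen layer adds another 1.2 million or so shader runs per frame.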

Right now there's not much you can do about overdraw. Once MeshTool comes out I will post some guidelines on how you can limit overdraw.

We took a step in the v9 global scenery to limit overdraw: in X-Plane 8 the global scenery tried to hide repetition of flat textures by drawing them over each other with offsets. In X-Plane 9 this is done in a pixel shader (i.e. the texture is analyzed and swizzled in the shader and then drawn once), cutting down the number of times we must draw.

If you turn pixel shaders on and off in a flat area like Kansas you might see this if you compare the screenshots - the farm textures are more repetitive without shaders. This gives faster fps to everyone (with or without shaders) by eliminating overdraw.

Thursday, January 03, 2008

Beware the Forceware

Brett and a few other users pointed out to me that the nVidia ForceWare 169.21 release appears to hose video for X-Plane. If you have an nVidia card, don't update, or you may want to back up a version.

This kind of thing has happened in the past - hopefully a revision will come out fairly soon. (But I do not have contact with the nvidia driver team for Windows on this...)

Also some users are seeing corrupt startup screens with ATI hardware - apparently some configs don't react well to us turning vsync off. Not quite sure what we'll do about that yet, but hopefully a fix will be in the next few betas.

Friday, December 28, 2007

Script Files and Options

Sometimes we find that users' machines cannot run without hiding OpenGL driver features from X-Plane. That is, the computer says it can support VBOs, but when the sim asks for a VBO, something really bad happens. X-Plane (since mid-version 8) accepts a series of command-line options that cause the sim to ignore the given feature.

These kinds of bugs come and go as drivers are updated, the sim changes which technology it uses, and hardware cycles through the user base. The biggest one we're seeing right now is that the new iMacs show runway lights as white squares unless the sim is run with the --no_sprites option.

We're trying a new way to address these problems. In the past we would give users the command-line option; now we are building double-clickable script files that launch the sim with the appropriate options. Theoretically this...
  • Is less error prone for users.
  • Is quicker for users who may have to use the command-line option every time they launch the sim until a driver update becomes available.
  • Is quicker for us (in that we spend less time mailing out instructions and helping users who are unfamiliar with a command-line environment).
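As a rough idea of what such a script file contains, here is a sketch that generates one. The sim path is a placeholder, the function is invented for this example, and the actual scripts we ship may look different; only the option names (like --no_sprites) come from the sim.

```python
def launcher_script(sim_path, options, windows=False):
    """Return the text of a tiny script that starts the sim with options.

    sim_path is a placeholder path to the sim executable.
    """
    args = " ".join(options)
    if windows:
        # Batch file; the path is quoted in case it contains spaces.
        return f'@echo off\n"{sim_path}" {args}\n'
    # Shell script for Mac and Linux.
    return f'#!/bin/sh\nexec "{sim_path}" {args}\n'


print(launcher_script("./X-Plane", ["--no_sprites"]))
```

The user double-clicks the file instead of typing the option by hand every time.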
We did consider some other options, but this seemed like the least evil. The runners-up:
  • Just turning off the hardware feature permanently. We ruled this out because the performance hit would be significant and affect all users.
  • Attempt to identify and auto-turn-off known bad options. We do this in some cases, but it requires changes to the sim code, so it does no good if a new video card comes out after a given X-Plane release and introduces new problems; it's not a total solution. Also, since a bug might be resolved in-field, if X-Plane auto-avoids certain configurations we have to patch the sim once the configuration starts to work again.
  • Provide a user interface to turn these options off inside the sim. This was ruled out for two reasons: first, some of these options cause the sim to crash before a user can ever get to the rendering settings. Second, turning off these kinds of options can really kill performance, so leaving the options in sight produces a whole new tech support problem. (A user tells us their framerate is awful at the lowest settings...perhaps they turned off the hardware acceleration options in an attempt to get the lowest settings...etc.)
We'll see how well this approach works. So far it seems to be working better than handing out command-line options.

(Of course if we can address the problem by working around it with a change to our code, we almost always choose this option.)

Friday, December 21, 2007

NVidia: 2 Ben: 0

I found the root cause of another NVidia specific bug, and once again it's my own stupid code. If you Google for driver bugs, you'll find plenty of grumpy developers ranting about how card X does this wrong thing and card Y does that wrong thing...I figure it's only fair to follow up and say "yep, that one was mine."

Like the previous nVidia-only crash, this was a case where X-Plane was always doing something wrong, but only some drivers had problems with the behavior. So the crash was NVidia-specific, but X-Plane caused.

I believe that this bug was manifesting itself either as a message that "scenery shift took more than 30 seconds" or some kind of crash. One of the problems was that the diagnostics for this particular bit of code were really bad. So we've improved things a bunch...
  • There is more careful error checking during scenery shift, and those error messages are reported.
  • If the sim does crash, some new code will output a crash log on Windows that helps us isolate what actually happened.
Beta 12 will be out soon with the fix for the bug that caused problems on NV hardware as well as the improved diagnostics. So you may find that the sim just works better, but if it does still crash or report errors, please tell us - now we'll have log files that will let us diagnose the problem a lot faster!

Friday, December 14, 2007

V-Sync - Problematic in Practice

To those who have sent performance info: thank you, but you probably won't hear from me for a week. I'm up to my eyeballs in reports and it's going to take a while to get through them.

I finally found the code that allows X-Plane to turn off V-Sync. This should help nVidia users who are having framerate problems.

The basic idea is this:
  • X-Plane tells the graphics card to draw a lot of stuff.
  • The video card accumulates this "todo" list and works on it while X-Plane runs.
  • X-Plane indicates that the entire frame to be seen is done and tells the card to show the results.
  • Eventually some time later the card finishes the todo list and then shows it to the user.
V-Sync relates to the question of when this last step happens. When V-Sync (vertical sync) is off, the card shows the results as soon as it is done drawing.

But when V-Sync is on, when the card finishes drawing the world, it then waits until the monitor is done drawing its frame, and then shows the results. Without V-Sync we can have a situation where the top half of the monitor is showing a new frame and the bottom half is showing an old frame. (This is called "tearing".)

So normally V-Sync is good because it prevents tearing. But the problem with V-Sync is that a frame can only be shown when the monitor refreshes. The video card has to wait until this happens, and this slows our framerate down.

In particular, most users have their monitors set to 60 hz. If X-Plane can only produce frames at 50 hz, the video card will have to further slow the framerate down to 30 hz (one X-Plane frame for every two monitor refreshes). If X-Plane falls below 30 hz, we end up with 20 hz (one X-Plane frame for three monitor refreshes), and if X-Plane goes below 20 hz, we would clamp at 15.

So when V-Sync is on, there can be large framerate hits for small losses of performance in the sim, and a real risk of getting locked around 20 fps.
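The clamping pattern follows from one rule: each frame has to wait for a whole number of monitor refreshes. Here is a sketch of that rule, assuming a steady sim framerate and plain double-buffered V-Sync (the function name is made up):

```python
import math


def vsync_fps(sim_fps, refresh_hz=60):
    """Framerate actually shown when every frame waits for a refresh."""
    refreshes_per_frame = math.ceil(refresh_hz / sim_fps)
    return refresh_hz / refreshes_per_frame


for fps in (60, 50, 29, 19):
    print(fps, "->", vsync_fps(fps))
```

Losing a single frame per second at the wrong moment (30 to 29, or 20 to 19) costs a whole step down the ladder.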

(The minimum framerate in X-Plane is intentionally set to 19 so that we won't fog up if the video card clamps us at 20 fps.)

So when beta 11 comes out, you may get some framerate back if you haven't already hacked your graphics card's control panel settings. If you still want v-sync, you can always set it this way in the control panel. But most users I've talked to are happy to have it off.

In an only vaguely related note, one of the reasons to have high frame rate is to have a smooth flight model. But Austin has now put a new setting in the operations-and-warnings dialog box: you can pick how many times per graphics frame the physics run. The normal ratio is 1:1, but for fighter and acrobatic pilots, you might find that you can get a nice feel at lower fps (20-30) by setting a higher ratio.
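The effect of that ratio on the flight model's time step is easy to see in a sketch. This is only the arithmetic, under the assumption that the physics steps are spread evenly across a frame; the function name is invented and this is not how the sim's internals are necessarily organized.

```python
def physics_step_seconds(graphics_fps, physics_per_frame=1):
    """Length of one physics step when several run per graphics frame."""
    return 1.0 / (graphics_fps * physics_per_frame)


# 20 fps with three physics steps per frame gives the same step
# size as 60 fps at the normal 1:1 ratio.
print(physics_step_seconds(20, 3))
print(physics_step_seconds(60, 1))
```

A smaller time step is what gives the smoother feel, even though the picture is still only updating 20 times a second.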

Thursday, November 29, 2007

Two Truths of Hardware

I would have to say my track-record at predicting hardware developments in the sim world has been pretty poor. But these two factors seem like they won't change for a bit:
  1. The amount of working data a system can crank through in one frame keeps increasing while the total amount of virtual address space stays the same (3 GB).
  2. The gap between the fastest and slowest systems from a finite time period keeps widening.
To elaborate on this first point, as video cards get faster (both GPU and internal bus/RAM speeds) and system RAM and buses get faster (graphics slot changes are rare, but with enough VRAM this is relatively moot), the amount of texture and geometry data that X-Plane can tell the card to draw in a 30th of a second keeps going up. So users are running with more trees, roads, 3-d, etc. than in the past.

But all of this stuff lives in memory, and even if a user has 8 GB, X-Plane can't load more than 3. Imagine what will happen when a graphics card can draw 3 GB of data in one frame? X-Plane will have to use all of its available memory for things you see right now or it will waste graphics power. This would mean purging from memory anything that isn't on screen!

On the second point, video card power doubles every, um, six to twelve months (perhaps more like 18, now that the card makers are hitting the same fabrication and power limits that the CPU makers have already hit, but this is all seat of my pants). So even if we only supported the last three generations of cards (we support at least seven!!) that gap in performance doubles!

This means that every year it requires a more flexible rendering engine to make a sim with decent frame-rates for old computers and up-to-date graphics on new ones.

Sunday, October 14, 2007

X-Plane 9: The Absurdity of Pretending

There have been plenty of rumors and semi-official posts regarding the upcoming major revision of X-Plane (X-Plane 9). I have been trying to keep my mouth shut about it...the problem with pre-announcing anything is that the upside to us is small (at best we do what we said) and the downside is large (at worst we don't do what we said and people get grumpy). No one complains if XP9 turns out to have no-pause scenery load and it's a surprise...but plenty of people complain if we say "there won't be pauses" and then there are.

But...the situation is becoming mildly absurd...plenty of info is out there, and saying "the upcoming major release", etc. just feels political and weaselly. Austin would be disgusted.

So listen: I am going to try to provide some info on X-Plane 9. This info is subject to change. This is what we think is going to happen to the best of our knowledge. The release is still a ways away and enormous changes will happen. When things change, do not bitch to me that "you promised X" would happen. I do not promise anything. This info is provided to try to help those making add-ons for X-Plane plan appropriately.

With that in mind...I will try to post some more details on the authoring environment in the next few days. For now, here's some very basic guidance on compatibility and hardware requirements:
  • The hardware requirements will be at least as high as X-Plane 8. If your machine is gasping and wheezing on 8, it's not going to be any better on 9.
  • X-Plane 9 will depend more heavily on pixel shaders. If your hardware doesn't have pixel shaders (GeForce 2-4, Radeon 7000-9200) or has really crappy shaders (GeForce FX series) you will miss out on a lot of the cool stuff in v9, and possibly have the scenery look worse (but faster) than v8 (as we move features from the CPU to pixel shaders).
  • Scenery that opens in X-Plane 8 should open in X-Plane 9 unmodified - if the scenery works in 8 but not 9, report it as a bug!
  • Plugins that work in v8 should work in v9 without modification.
Finally, we are trying to finish up X-Plane 861...this is a bug-fix patch for version 8 - it contains no new features, but it does fix a few nasty bugs, some of which cause crashes. So if there is any new feature, it's coming in 9, not 861. Version 8 has been out for a very long time, so I will accept no argument that v8 should have more features than it does now. It's been a long run!

(One of my main goals with 861 is to try to fix any weird behavior for third party scenery add-ons, so that a scenery add-on looks the same in v8 and v9. If we left the bug in 861 and fixed it in v9, authors would have to hack the scenery to make it work with v8, and then remove the hack and republish for v9. By trying to fix the authoring bugs in v8, at least when possible, it lets authors publish one package for both versions. Of course, v9 will have new features, so I expect some v9-only scenery will emerge pretty quickly.)