Wednesday, March 31, 2010

OS X 10.6.3 Performance

OS X 10.6.3 is out. Besides adding a bunch of OpenGL extensions*, it looks like vertex performance is improved on nVidia hardware. My quick tests compare 10.5.8 to 10.6.3 (since I no longer have a 10.6.2 partition) and show a 15-30% improvement. If you have 10.5 and an 8800 you may want to consider updating your OS.

I also discovered that --fps_test=3 produces unreliable results because...wait for it...the deer and birds are randomized. If they show up during the fps test, you get hit with a performance penalty. I am working to correct this and may have to recut the time demo to work around this behavior.

If you are trying to time the sim via --fps_test=3, I suggest running the test multiple times - you should see "fast" runs and "slow" runs depending on our feathered and four-legged friends.

Phoronix reported a performance penalty with the new update; I do not know the cause of this or whether the fps_test=3 bug could be causing it. But their test setup is very different from mine - a GeForce 9400 on a big screen, which really tests shading power. My setup (an 8800 on a small screen) tests vertex throughput, since that has been my main concern with NV drivers.

My suggestion is to use --fps_test=2 if you want to compare 10.6.2 vs. 10.6.3. I'll try to run some additional benchmarks soon!

EDIT: Follow-up. I set the X-Plane 945 time demo to 2560 x 1024, 16x FSAA, and all shaders on (i.e. let's use some fill rate). I put the Cirrus jet on runway 8 at LOWI, then paused the sim in the forward full-screen view with no HUD. In this configuration, I see these results:
Objects   10.5.8    10.6.3
none      85 fps    100 fps
a lot     46 fps    61 fps
tons      37 fps    42 fps
Note that in the "no objects" case the sim is fill-rate bound - in the other two it is vertex bound. So it looks to me like 10.6.3 is faster than 10.5.8 for both CPU use/object throughput and perhaps fill rate (or at least, fill-rate heavy cases don't appear to be worse).
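
To put those numbers in relative terms, here is a small Python sketch that converts the measured frame rates into a percentage gain and into frame time saved. The fps values are the ones quoted in the results; the helper names are made up for this example.

```python
# Frame rates from the results above: (10.5.8 fps, 10.6.3 fps).
results = {
    "none": (85, 100),
    "a lot": (46, 61),
    "tons": (37, 42),
}

def percent_gain(old_fps, new_fps):
    """Relative frame-rate improvement, in percent."""
    return (new_fps - old_fps) / old_fps * 100.0

def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps

gains = {name: percent_gain(old, new) for name, (old, new) in results.items()}

for name, (old, new) in results.items():
    saved = frame_time_ms(old) - frame_time_ms(new)
    print(f"{name:6s} {gains[name]:5.1f}% faster, {saved:.2f} ms saved per frame")
```

The gains work out to roughly 18%, 33% and 14%. Frame time is often the more honest measure, since the same fps difference means less at high frame rates than at low ones.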

* These extensions represent Apple and the graphics card company creating software interfaces to fully unlock the graphics card's abilities.

Friday, March 19, 2010

Weird Water

If you look at funky pictures of X-Plane online, a fair number of them will show incorrect water reflections. I am working on some bug fixes for the reflection code for 950. Bug fixes might not even be the right term. To understand the incorrect reflections, you have to understand what the water code can and cannot do.

The water reflection code renders a reflection based on a flat plane. This limitation comes from the mathematics of the algorithm - a compromise to have water reflections that run at "real-time" speeds. (Real-time is graphics nerd speak for 20-30 fps and not 1-2 fps.)

As it turns out, the Earth is not flat. So we can pick up a number of reflection "bugs", due to the limits of the approximation we are using:
  • Over very large distances, the flat plane is a bad approximation of a water surface. The flat plane simply can't be "right" everywhere for any large scene. This isn't a bug, it's a ceiling on our maximum quality - a design constraint.
  • If we have two water surfaces at different elevations (e.g. a river with a dam) we can't have our reflection plane match both. So some scenes may have wrong reflections with multi-level water. This too is a design constraint.
  • If our reflection plane is at the wrong height or the wrong slope, it is going to produce really weird results. The reflection plane being in the wrong place despite a small scene with one level of water - that'd be a bug.
  • There is an art to positioning the plane - if we have a large scene (so the round earth means there is no one perfect plane) some locations of the plane will look better than others.
Now one fall-out of all of this is that things are going to look better if water is really flat, which is not always the case (both for some parts of the global scenery with production errors and some third party scenery). Where the water is sloped or contains bumps, we hit the multi-level case where we should not and we face reflection plane placement problems.

Finally, if the scenery mesh contains slanted water, we're really going to be hosed - almost by definition if X-Plane uses a sane water reflection plane, it won't be sloped, and thus this sloped water is going to be unaligned with the reflection plane and produce something that looks really funky.
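
The multi-level case can be illustrated with a few lines of Python. This is only a sketch of planar reflection in general, not X-Plane's actual water code; the coordinates, the "y is up" convention and the two lake elevations are invented for the example.

```python
def reflect(point, plane_point, plane_normal):
    """Mirror a 3-d point across a plane given by a point on it and a unit normal."""
    # Signed distance from the plane to the point, measured along the normal.
    d = sum((p - q) * n for p, q, n in zip(point, plane_point, plane_normal))
    return tuple(p - 2.0 * d * n for p, n in zip(point, plane_normal))

UP = (0.0, 1.0, 0.0)            # y is "up" in this sketch

tree_top = (100.0, 40.0, 0.0)   # hypothetical object at 40 m elevation
lower_lake = (0.0, 0.0, 0.0)    # water below the dam, elevation 0 m
upper_lake = (0.0, 10.0, 0.0)   # water above the dam, elevation 10 m

seen = reflect(tree_top, lower_lake, UP)     # the single plane that was picked
wanted = reflect(tree_top, upper_lake, UP)   # what the upper lake should show
error = wanted[1] - seen[1]

print(seen, wanted, error)
```

With one plane placed at the lower lake, the reflection seen in the upper lake is off by twice the elevation difference between the two surfaces (20 m here for a 10 m dam), which is why a single flat plane cannot be right for both.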

So my work on 950 is aimed at making X-Plane less easily fooled by complex and incorrect scenes. (Note that X-Plane can't tell the difference between Norway, where we really have water at multiple elevations, and bad input data to MeshTool.)

Even with these fixes, sloped water is still going to look pretty strange (because it is strange). And even with these fixes, multi-level water will still have its reflections approximated at best. But hopefully the visuals from the sim will be less jittery while flying over tricky DSFs.

I'm hoping we'll have a beta 2 in the next week or two.

Monday, March 15, 2010

Musings on CSLs

Now that Wade has XSquawkBox 1.0.3 out, I've been thinking about CSLs - that is, the collections of simplified airplanes that XSquawkBox uses to draw the other users. The CSL system was invented back in the days of X-Plane 6, and it's getting a bit long in the tooth. You can't use OBJ8 files, and it doesn't understand a lot of the modern rendering tricks that authors use with the standard tool chain.

Plane-Maker has advanced quite a bit since then too - to make the original CSL, I had to create a special one-off hacked build of Plane-Maker to export aircraft as OBJs. This capability is now built into Plane-Maker, works a lot better, and even supports animation.

X-Plane now exports a native OBJ drawing interface to plugins. Besides giving plugins access to the fully optimized native OBJ draw code, this also means that plugins can draw objects with per-pixel lighting.

One more piece of the puzzle: in France, Austin announced that we were working on a new ATC engine. One goal of this new engine is to provide ATC to the AI planes, as well as your plane, so that the other aircraft interact seamlessly in one simulated environment. (In X-Plane 9, ATC only directs you, and the AI are rogue planes that try not to buzz you when you're on the runway.)

This makes me wonder: should there be a next-gen CSL format that is shared between X-Plane and third parties, based on X-Plane doing the rendering work?

Sunday, March 14, 2010

Best. Beta. Ever.

Wade just cut the final build for XSquawkBox 1.0.3, and of all of the betas I have been involved in, this was the best one ever. I say this because Wade ran the entire beta and fixed pretty much all of the bugs without me! That is to say, Wade made 1.0.3 happen. So huge thanks to Wade for making a new and much needed XSquawkBox version possible.

Here are some of the features:
  • Automatic download and update of the server list.
  • Dual VoIP radios.
  • Proxy client support for users with multiple copies of X-Plane for multiple views.
  • A ton of work on the UI - a lot of little things add up to make a much more mature, usable client.
Now if only I had time to fly on VATSIM...

New Toys

This isn't supposed to be a coding blog, but users do ask about DirectX vs. OpenGL, or sometimes start fights in the forums about which is better (and yes, my dad can beat up your dad!). In past posts I have tried to explain the relationship between OpenGL and DirectX and the effect of OpenGL versions on X-Plane.

At the Game Developers Conference 2010, OpenGL 4.0 was announced, and it looks to me like they released the OpenGL 3.3 specs at almost exactly the same time. So...is there anything interesting here?

A Quick Response

In understanding OpenGL 4.0, let's keep in mind how OpenGL works: OpenGL gains new capabilities by extensions. This is like a new item appearing on a menu at your favorite restaurant. Today we have two new specials: pickles in cream sauce, and fried potatoes. Fortunately, you don't have to order everything on the menu.

So what is OpenGL 4.0? It's a collection of extensions: if an implementation has all of them it can call itself 4.0. An application might not care. If we only want 2 of the 4 extensions, we're just going to look for those 2 extensions, not sweat what "version number" we have.
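
Here is a rough Python sketch of "look for the extensions, not the version number". It assumes the driver reports its extensions as one space-separated string (the classic glGetString(GL_EXTENSIONS) format); the reported string below is invented, and the extension names, while real ARB names, were picked purely for illustration and are not a list of what X-Plane checks.

```python
def parse_extensions(extension_string):
    """Split a space-separated extension string into a set of names."""
    return set(extension_string.split())

def has_all(available, wanted):
    """True only if every wanted extension is present."""
    return all(name in available for name in wanted)

# Hypothetical report from a driver; not taken from any real card.
reported = (
    "GL_ARB_vertex_buffer_object GL_ARB_draw_instanced "
    "GL_ARB_instanced_arrays GL_ARB_blend_func_extended"
)
available = parse_extensions(reported)

can_instance = has_all(available, ["GL_ARB_draw_instanced", "GL_ARB_instanced_arrays"])
can_tessellate = has_all(available, ["GL_ARB_tessellation_shader"])

print("instancing:", can_instance, "tessellation:", can_tessellate)
```

Comparing whole names in a set, rather than searching the raw string for a substring, avoids a false match when one extension name is a prefix of another.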

Now go back to OpenGL 3.0, and DirectX 10. When DX10 and the GeForce 8800 came out, nVidia published a series of OpenGL extensions that allowed OpenGL applications to use "cool DirectX 10 tricks". The problem was: the extensions were all NVidia-specific tricks. After a fairly long time, OpenGL's Architecture Review Board (ARB) picked up the specs, and eventually most of them made it into OpenGL 3.0 and 3.1. The process was very slow and very drawn out, with some of these "cool DirectX 10 tricks" only making it into "official" OpenGL now.

If there were OpenGL extensions for DirectX 10, who cares that the ARB was so slow to adopt these standards proposed by NVidia? Well, I do. If NVidia proposes an extension and then ATI proposes a different extension and the ARB doesn't come up with a unified official extension, then applications like X-Plane have to have different code for different video cards. Our work-load doubles, and we can only put in half as many new cool features. Applications like X-Plane depend on unity among the vendors, via the ARB making "official" extensions.

So the most interesting thing about OpenGL 4.0 is how quickly they* made official ARB extensions for OpenGL that match DirectX 11's capabilities. (NVidia hasn't even managed to ship a DirectX 11 card yet, ATI's HD5000 series has only been out for a few months, and OpenGL already has a spec.) OpenGL 4.0 exposes pretty much everything that is interesting in DirectX 11. By having official ARB extensions, developers like Laminar Research now know how we will take advantage of these new cards as we plan new features.

Things I Like

So are any of the new OpenGL 3.3 and 4.0 capabilities interesting? Well, there are three I like:
  1. Dual-source blending. It is way beyond this blog to explain what this is or why anyone would care, and it won't show up as a new OBJ ATTRibute or anything. But this extension does make it possible to optimize some bottlenecks in the internal rendering engine.

  2. Instancing. Instancing is the ability to draw a mesh more than one time (with slight variants in each copy) with only one instruction to the graphics card. Since many games (like X-Plane) are limited in their ability to use the CPU to talk to the graphics card (we are "CPU bound" when rendering) the ability to ask for more work with fewer requests is a huge win.

    There are a number of different ways to program "instancing" with OpenGL, but this particular extension is the one we prefer. It is not available on NVidia cards right now. So it's nice to see it make it into the core spec - this is a signal that this particular draw path is considered important and will get attention.

  3. The biggest feature in OpenGL 4.0 (and DirectX 11) is tessellation. Tessellation is the ability for the graphics card to turn a crude mesh with a few triangles into a detailed mesh with lots of triangles. You can see ATI demoing this capability here.

There are a lot of other extensions that make up OpenGL 3.3 and 4.0 but those are the big three for us.
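
To make the instancing point concrete, here is a toy Python model that only counts requests sent to a pretend graphics driver. FakeGPU and both of its methods are invented for this sketch and do not correspond to real OpenGL calls; the point is just the difference in request count for the same amount of drawn geometry.

```python
class FakeGPU:
    """Toy stand-in for a graphics driver that just counts requests."""

    def __init__(self):
        self.requests = 0
        self.meshes_drawn = 0

    def draw(self, mesh, offset):
        # One request draws one copy of the mesh.
        self.requests += 1
        self.meshes_drawn += 1

    def draw_instanced(self, mesh, offsets):
        # One request draws one copy per offset.
        self.requests += 1
        self.meshes_drawn += len(offsets)

# 1000 hypothetical tree positions on a grid.
offsets = [(x * 10.0, 0.0, z * 10.0) for x in range(50) for z in range(20)]

one_by_one = FakeGPU()
for offset in offsets:
    one_by_one.draw("tree", offset)

instanced = FakeGPU()
instanced.draw_instanced("tree", offsets)

print(one_by_one.requests, "requests vs", instanced.requests)
```

Both paths draw the same 1000 meshes; the instanced path does it with a single request, which is where the CPU savings would come from when the sim is CPU bound.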

* Who is "they" for OpenGL? Well, it's the Architecture Review Board (ARB) and the Khronos group, but in practice these groups are made up of employees from NVidia, ATI, Apple, Intel, and other companies, so it's really a collective of people involved in OpenGL use. There's a lot of input from hardware vendors, but if you read the OpenGL extensions, you'll sometimes see game development studios get involved; Transgaming and Blizzard show up every now and then.

Wednesday, March 10, 2010

I Feel Manipulated

Tom has a new video on YouTube of his just-finished Falco. The video shows what screen-shots cannot: that the mouse interactions on the plane are really well crafted.

If you're just discovering X-Plane (or just discovering that X-Plane's 3-d cockpits can be very interactive), here's X-Plane's "raw" capabilities for manipulation:
  • The simplest manipulations are based on mapping the mouse from the 3-d cockpit back to the 2-d panel. This can only be done when the 3-d cockpit is textured using a piece of 2-d panel. This is the oldest way to make a clickable cockpit in X-Plane, dating back to the original X-Plane 3-d cockpits. The advantage of this method is that it's very easy to set up; the disadvantage is that the mouse click gestures tend to be "flat" in their operation.
  • As Tom's plane demonstrates, you can manipulate just about any dataref or command via a drag along a specific axis. Axes are subject to animation, so there's a lot of potential for "grabbing" things with this interface.
  • X-Plane also supports direct "click" manipulation - this can be handy for buttons where you don't want to require the user to move the mouse around. There are several types of click manipulation.
Click and drag manipulations can be tied into the plugin system - your plugin sees a manipulation as a change to a plugin-created dataref. This makes it possible to create almost any imaginable mouse effect. If you don't want to write a plugin, you can still wire up the manipulators to any of X-Plane's datarefs (there are thousands) or commands (we're getting up toward the 1000 mark on these too).

To create manipulators on your cockpit, you can use the latest plugin for AC3D. A manipulator is a property on a mesh within your object - each mesh can have its own manipulation with its own properties.

X-Plane does not have an IK solver. Rather, movement of "stuff" in your cockpit is indirect.
  1. Your manipulator changes a dataref as the user drags along an axis.
  2. The dataref change shows as an animation on your mesh.
Fortunately, AC3D has a "Guess" button for the axis manipulators. If you set a mesh to be manipulated by dragging along an axis, the guess button will examine your animations and suggest an axis that will create the most "natural" looking animation for the manipulation. For example, if you have a throttle handle that rotates, the guess button will provide a drag axis perpendicular to the throttle (to push the levers); if you have a throttle lever that pushes, the guess button will make a drag axis that runs along the lever.
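
The two indirect steps (drag changes a dataref, dataref drives an animation) can be sketched in Python as below. This is an illustration of the idea only: the function names, the "units per meter" scaling, the throttle dataref and the angle range are all assumptions made up for the example, not the sim's actual manipulator math.

```python
def project_drag(drag, axis):
    """Length of a 3-d drag vector along a unit-length axis."""
    return sum(d * a for d, a in zip(drag, axis))

def apply_axis_manipulator(value, drag, axis, units_per_meter, lo, hi):
    """Step 1: turn a drag along the axis into a clamped dataref change."""
    value += project_drag(drag, axis) * units_per_meter
    return max(lo, min(hi, value))

def animate_rotation(value, lo, hi, angle_lo, angle_hi):
    """Step 2: map the dataref linearly onto a rotation angle in degrees."""
    t = (value - lo) / (hi - lo)
    return angle_lo + t * (angle_hi - angle_lo)

throttle = 0.0               # hypothetical dataref: 0 = idle, 1 = full
axis = (0.0, 0.0, -1.0)      # drag axis pointing "forward" in this sketch
drag = (0.02, 0.0, -0.05)    # mouse dragged 5 cm forward and 2 cm sideways

throttle = apply_axis_manipulator(throttle, drag, axis, 10.0, 0.0, 1.0)
angle = animate_rotation(throttle, 0.0, 1.0, -30.0, 30.0)

print(throttle, angle)
```

Only the part of the drag that lies along the axis counts, so the sideways motion is ignored. The mesh never follows the mouse directly; it moves because the dataref moved.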

Monday, March 08, 2010

Conformance Test

I've been working on a conformance test for X-Plane. The idea is simple, and not at all mine: X-Plane 945 can output a series of test images that are the same on each run. The images cover a variety of rendering conditions. If a video driver is broken, the images will be corrupted.
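
As a rough idea of how such a check could be automated, the Python sketch below compares the images from one run against a known-good reference run by hashing their bytes. The image names and byte contents are placeholders, and exact byte-for-byte matching is an assumption of this sketch; a real test harness might need a tolerance instead.

```python
import hashlib

def digest(image_bytes):
    """SHA-256 fingerprint of an image's raw bytes."""
    return hashlib.sha256(image_bytes).hexdigest()

def find_corrupted(reference, candidate):
    """Return names of images that are missing or differ from the reference run."""
    return sorted(
        name
        for name, data in reference.items()
        if name not in candidate or digest(candidate[name]) != digest(data)
    )

# Placeholder data standing in for real screenshots.
reference = {"image_1": b"\x10\x20\x30", "image_2": b"\x40\x50\x60"}
good_run = dict(reference)
bad_run = {"image_1": b"\x10\x20\x30", "image_2": b"\x40\x50\x61"}

print(find_corrupted(reference, good_run))
print(find_corrupted(reference, bad_run))
```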

You can learn more about how this works here. I am working on the 945 timedemo tarball now.

The main driver for this is to help NVidia, ATI, and Apple to integrate X-Plane into their dedicated testing. With X-Plane as part of their test systems, they can catch driver bugs the easy way - the day after the code is changed, rather than months later after a series of angry web posts. X-Plane 945 includes a number of new features as part of its framerate test to help with this process.

My hope is that this will benefit users (who will see fewer bugs) and the driver writers (who can get feedback on code changes in a uniform and reproducible manner). Here are the eight images in the sample conformance test I wrote, based on the LOWI custom airport scenery.

Saturday, March 06, 2010

Plugin-Drawn Objects Do Not Exist

Version 2.0 of the plugin API (available in all versions of X-Plane 9) introduces three new routines that, for the first time, allow plugins to work with OBJs.

XPLMLoadObject and XPLMUnloadObject load an OBJ file (a model mesh) into memory, and then purge it when done. That part most users understand. The routine that causes confusion is XPLMDrawObjects. When you draw an OBJ with XPLMDrawObjects, you are not creating anything long-lasting or persistent in the world. Your object will be visible for one frame on screen, and then will disappear unless you draw it again. Your object is not part of any of the physics calculations.

To put this in perspective: when you make a new window using the Widgets API, the window is persistent - it exists until (1) you delete it or (2) your plugin is unloaded for some reason. You don't have to do anything per frame to "maintain" the window - you make it and it exists.

Objects are not like that. You cannot make an object "exist" in the X-Plane world - you can only draw it once per frame using the drawing callbacks. Essentially the draw-objects API is a lower level API.

Building a Layered System

Plugins operate in a layered environment, with lower level code on the bottom and your plugin at the top. The layer stack might only be 2 layers deep (XPLM on the bottom, plugin on top), or there might be several layers. Consider XSquawkBox:
  • The UI is drawn using the XPWidgets API. The XPWidgets API gets its drawing from the XPUIGraphics API, and the XPUIGraphics API changes OpenGL state using the XPLMGraphics API. So we've built up a layered system: basic OpenGL supports drawing, drawing supports user interface, user interface supports the plugin.
  • Similarly, multiplayer is done using a library that isn't part of the basic plugin system (but is open source): libXplaneMP. So here the XPLM supports drawing airplanes, libXplaneMP uses that to create a full multiplayer API, then XSquawkBox uses it.
The alternative to a layered system would be a "monolithic" one. Under a monolithic system, the only API for airplanes would be libXplaneMP, and the only way to create user interface would be widgets. Sandy and I usually prefer the layered approach because it provides a lot more flexibility. If you like widgets, great, use them. If not, no problem - roll your own on top of XPLMDisplay.

The Plugin System Is a Foundation

When Sandy and I cannot provide all of the layers, we have a strong bias toward providing the bottom layer, for an obvious reason: if the bottom layer isn't in the plugin system, it may be impossible for anyone else to create it. So typically if we have a choice between a high level vs. low level API, we'll put the low level API in first.

This is precisely what is happening with object drawing - we have the low level API ("draw an object") but not the high level one ("create an object in the scenery system"). Since we have provided the lowest level, it is possible to code persistent objects in your plugin by layering on top of our API. By comparison, had we only provided "create an object" it would be pretty close to impossible to draw an object for one frame - if you didn't want a scenery object, the API would be inflexible and useless.
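
Here is a minimal sketch of that layering idea, written in Python rather than as a real plugin so that it can run on its own. PersistentObjects and fake_draw are invented for the example; in an actual plugin the low level call would be XPLMDrawObjects invoked from a drawing callback, and nothing here reproduces that routine's real signature.

```python
class PersistentObjects:
    """High level layer: remembers objects and redraws all of them every frame."""

    def __init__(self, draw_once):
        self._draw_once = draw_once   # low level "draw for one frame" routine
        self._objects = {}
        self._next_id = 0

    def create(self, model, position):
        handle = self._next_id
        self._next_id += 1
        self._objects[handle] = (model, position)
        return handle

    def destroy(self, handle):
        del self._objects[handle]

    def on_draw_callback(self):
        # Called once per frame; without this nothing stays on screen.
        for model, position in self._objects.values():
            self._draw_once(model, position)

frames = []   # records what the fake low level layer drew in each frame

def fake_draw(model, position):
    frames[-1].append((model, position))

world = PersistentObjects(fake_draw)
tower = world.create("tower.obj", (0.0, 0.0, 0.0))

for frame in range(3):
    frames.append([])
    world.on_draw_callback()
    if frame == 1:
        world.destroy(tower)   # the object "stops existing" after frame 1

print([len(drawn) for drawn in frames])
```

The object appears to exist only because the high level layer keeps asking the low level layer to draw it. Going the other way, building a one-frame draw out of a create/destroy API, would be much harder.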

On the Road a Lot

I've been on the road a lot for work, so my apologies to everyone whose email I am sitting on. Most of my time these days is being spent on new next-gen tech. But there are a few things I'm hoping to get done in the short term:
  1. Cut a new time-demo test. This might seem like a low priority item, but it's not. Apple, ATI and NVidia all run continuous automatic tests of their video drivers, with many applications and games. They have rooms full of computers that continuously run through 3 minute sections of Quake and Call of Duty, etc. If they introduce a driver bug while doing new development, these machines catch the problem immediately.

    The new time demo (based on 945) will have a number of features to make X-Plane a more useful test case. If we can make X-Plane into a test case, then they can catch bugs early, and that means you don't have to see them.

  2. Bring WED 1.1 to beta. The only thing holding it back is the DSF exporter, and I did have about two hours to poke at it last week. I'm hoping if I can find just a few more hours, I can finish off the exporter.

  3. Examine 950 bugs. I have half a dozen bug reports against 950 beta 1. 950 will be a small beta but also a slow one, because Austin and I have a lot of other things on our plates. If you haven't heard back from me on a bug report, probably it's still on my to-do list.

We'll see how much of that I can get to in the next week.

Tuesday, March 02, 2010

WorldEditor Export: File a Bug!

There's basically one reason why WorldEditor developer preview 2 is a "developer preview" and not a real beta: the DSF overlay export code isn't complete.

The problem is that, unlike an airport, an overlay has to be "cut" to the DSF tile boundaries. This is made slightly tricky by the fact that the overlay can have (1) bezier curved segments and (2) a UV map on those segments. My existing toolkit of polygon editing routines doesn't handle this case yet.
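
To see why cutting is needed at all, this Python sketch lists which tiles an overlay's bounding box touches. It assumes 1x1 degree tiles named by their south-west corner, and the bounding box values are made up; it deliberately does none of the hard part (splitting bezier segments and their UV maps at the tile edges).

```python
import math

def tiles_touched(west, south, east, north):
    """Return (lat, lon) south-west corners of every 1x1 degree tile the box overlaps."""
    return [
        (lat, lon)
        for lat in range(math.floor(south), math.ceil(north))
        for lon in range(math.floor(west), math.ceil(east))
    ]

# Hypothetical overlay polygon bounds, in degrees.
overlay_bounds = dict(west=11.2, south=47.1, east=12.3, north=47.4)
tiles = tiles_touched(**overlay_bounds)

print(tiles)
```

Any overlay that touches more than one tile has to be split into one piece per tile, and each piece must keep its curves and texture mapping consistent along the cut.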

I have no idea when I will have time to complete this code. It is the number one piece of code that, if I had a quiet single afternoon of unexpected time to code, I'd pound it out. If I were stuck in an airport with my laptop, I'd pound it out. It should give you some idea of how busy things are that it still isn't done.

In the meantime, there is the scenery tools bugbase. By filing a bug, your issue won't get lost even if it's a while before I get to it.

A few quick rants about the bug base:
  • Most likely the first thing I'll do when I do get to your bug is just ask you for more info. Consider the bug to be as much a business card (so I can make contact) as a bug report.
  • Some bugs may get kicked out.
  • Do not file X-Plane bugs in the scenery tools bug base! The scenery tools bug base is not where we store sim bugs.
Do not bother to ask for direct bug base access for X-Plane itself. You cannot have it. The ratio of submitted "bugs" for X-Plane to actual bugs is at least 10:1. That is, 90% of you think that you should file a bug when you have a tech support question. Now you might be in that 10% (particularly if you've made it all the way down to this blog post), but we can't set up open infrastructure with those numbers. My hope is that the scenery tools are self-selecting to the point where people who are using developer-preview tools know what a bug report is.

Another Reason To Use a Few Big Textures

The file loading code in 950 beta 1 for Windows is slower than 945. Sometimes. This will be "fixed" in beta 2. Here's what happened:

The scenery system uses a number of small files. .ter files, multiple images, .objs, etc. This didn't seem like a problem at first, and having everything in separate text files makes it easier to take apart a scenery pack and see what's going on.

The problem is that as computers get bigger and faster, rather than scenery packs growing bigger files, they are growing more files. The maximum texture size has doubled from 1024x1024 to 2048x2048. But with paged orthophotos, multicore, and a lot of VRAM, you could easily build a scenery pack with 10,000 images per DSF.

That's exactly what people are doing, and the problem is that loading all of those tiny files is slow. Your hard drive is the ultimate example of "cheaper by the dozen" - it can load a single huge file at a high sustained data rate. But the combination of opening and closing files and jumping between them is horribly inefficient. 10,000 tiny .ter files is a hard drive's worst nightmare.
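
A back-of-the-envelope model shows the effect. The Python sketch below charges a fixed cost per file plus a transfer cost per megabyte; the 10 ms per-file overhead, the 50 MB/s transfer rate and the 500 MB total are assumed round numbers for illustration, not measurements.

```python
def load_seconds(file_count, total_megabytes,
                 seek_seconds=0.010, megabytes_per_second=50.0):
    """Estimated load time: fixed overhead per file plus sustained transfer time."""
    return file_count * seek_seconds + total_megabytes / megabytes_per_second

# Same 500 MB of data, split two different ways.
many_small = load_seconds(10_000, 500.0)
few_large = load_seconds(100, 500.0)

print(f"10,000 files: {many_small:.0f} s, 100 files: {few_large:.0f} s")
```

Under these assumptions the same data takes about ten times longer to load as 10,000 files than as 100, and almost all of the difference is per-file overhead rather than actual data transfer.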

In 950 beta 1 I tried to rewrite part of the low level file code to be quicker on Windows. It appeared to run 20% faster on my test of the LOWI demo area, so I left it in beta 1, only to find out later that it was about 100% slower on huge orthophoto scenery packs. I will be removing these "optimizations" in beta 2 to get back to the same speed we had before. (None of this affects Mac/Linux - the change was only for Windows.)

The long term solution (which we may have some day) is to have some kind of "packing" format to bundle up a number of small files so that X-Plane can read them more efficiently. An uncompressed zip file (that is, a zip where the actual contents aren't compressed, just strung together) is one possible candidate - it would be easy for authors to work with and get the job done.
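
For a sense of how simple the uncompressed-zip idea is, here is a Python sketch using the standard zipfile module with ZIP_STORED (no compression). The file names and contents are placeholders, and this is not a format X-Plane reads; it only shows many small files being bundled into one blob and read back.

```python
import io
import zipfile

def pack_stored(files):
    """Bundle {name: bytes} into a single zip archive without compressing anything."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()

def read_all(packed):
    """Read every member back out of the packed archive."""
    with zipfile.ZipFile(io.BytesIO(packed)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}

# Placeholder stand-ins for many tiny scenery files.
files = {
    f"terrain/tile_{i:04d}.ter": f"hypothetical contents {i}\n".encode()
    for i in range(100)
}

packed = pack_stored(files)
unpacked = read_all(packed)

print(len(files), "files packed into", len(packed), "bytes")
```

Because the contents are stored rather than compressed, reading a member is just a seek and a copy, and authors can still open the pack with any ordinary zip tool.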

In the short term, for 950 beta 2, I am experimenting with code that loads only a fraction of the paged orthophoto textures ahead of time - this means that some (hopefully far away part) of the scenery will be "gray" until loaded, but the load time could be cut in half.

There is one thing you can do if you are making an orthophoto scenery pack: use the biggest textures you can. Not only is it good from a rendering perspective (fewer, larger textures means less CPU work telling the video card "it's time to change textures") but it's good for loading too - fewer larger textures means fewer, larger total files, which is good for your hard disk.

(Thanks to Cam and Eric for doing heavy performance testing on some of the 950 beta builds!)