I'll try to answer these questions, but a lot of it is fuzzy and open to debate, so...
Just out of interest, the D3D pipeline does basically the same things that the OpenGL pipeline does (except the stuff you mentioned in your article), using more or less the same functionality of the underlying graphics APIs, right?
More or less, yes.
But why do even many really old boards work fine with the d3d pipeline whereas the OpenGL pipeline only works on some cards and only with latest drivers installed?
I don't think this is an accurate statement. The D3D pipeline cannot be enabled on a number of boards due to quality and robustness issues, just as is currently the case for OGL, but the situation is improving for the latter since we've been pushing for more driver fixes on that side. There are a number of boards where the D3D pipeline does work quite well, but performance isn't as good as it could be because we're using the old DX7 interfaces. (For Dolphin, we expect to move up to DX9 and an STR architecture similar to the OGL pipeline, which should improve things, but there will still be a few boards where driver quality is too poor for us to enable the D3D pipeline.)
Is D3D easier to implement for driver programmers, or is OpenGL more complex?
A specific version of D3D may be easier for driver developers to implement, but the D3D architecture and interfaces change substantially from release to release, so I think they invest more time in the latest/greatest D3D release, and older versions go into sustaining mode fairly quickly. OpenGL as a whole plus all the relevant extensions is a big/complex chunk of work, but at least the driver teams haven't had to rewrite the whole thing every couple years, so while things were pretty dire 5 years ago on the OGL side, I think things have improved considerably.
.... and does every game that uses opengl have to work around all these driver bugs?
Or do they simply use a subset of the available functionality, avoiding the buggy bits that the Java pipeline seems to be using?
The use of OGL in Java2D is very simple compared to any modern 3D game, but we do tend to use some more basic APIs that aren't used in games all that often. For example, we've been fighting performance issues with intrasurface glCopyPixels() operations on certain Nvidia boards, and you ask yourself, how could something this simple be broken? The comparable API in Xlib (XCopyArea()) is used all the time by desktop applications for things like scrolling, so there's no way something like that would break on the X11 side. But the equivalent OGL operation probably didn't receive as much testing and it slipped through the cracks. The same story applies to other older APIs like glDrawPixels() and lesser-used parts of the pbuffer/FBO extensions that are important to us. Fortunately, we maintain a fairly extensive set of tests that catch these driver issues, and driver teams at Sun, Nvidia, ATI, etc. run these tests on a regular basis to prevent driver regressions. This has been a big help, and we're continuing to improve the test suite to catch more driver bugs. Sometimes things still slip through, but we're trying to find more ways to minimize those slips.
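To make the "simple" operation concrete: at the Java2D level, scrolling is typically done with Graphics.copyArea(), which is the kind of call that presumably ends up as an intrasurface copy in an accelerated pipeline. Here is a minimal sketch, assuming an offscreen BufferedImage (so it exercises the software loops, not any OGL driver). The class and method names are made up for illustration.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical example class; not part of Java2D itself.
class ScrollSketch {
    // Draws a red band at the top of a 100x100 image, then "scrolls" the
    // image contents down by dy pixels with a copy within the same surface.
    static BufferedImage scrolledImage(int dy) {
        BufferedImage img = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        try {
            g.setColor(Color.RED);
            g.fillRect(0, 0, 100, 5);               // rows 0..4 are red
            // Source and destination overlap, as they do when scrolling.
            g.copyArea(0, 0, 100, 100 - dy, 0, dy);
        } finally {
            g.dispose();
        }
        return img;
    }

    public static void main(String[] args) {
        BufferedImage img = scrolledImage(10);
        // The red band should now also appear 10 rows further down.
        System.out.println(Integer.toHexString(img.getRGB(50, 12)));
    }
}
```

The source rows are not cleared by the copy, so an app that scrolls this way still has to repaint the newly exposed strip itself.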
Sorry, rambled a bit... Going back to your question, I think most commercial OGL games/apps tend to work around bugs instead of filing bug reports with the appropriate driver team, which I think is a shame. We could certainly do the same (in some cases) in our OGL pipeline code, but I think that leads to an unmanageable mess, and it doesn't help raise the bar for driver quality. We've had pretty good success in filing driver bugs with various companies and seeing them fixed in a reasonable time frame, and we'd like to continue that approach. Sure, in some cases the workaround is simple and solves the problem, so we'll implement it in our code, but we still file a bug with the driver teams so that they can fix it properly.
Why does Mustang's J2D OGL pipeline kick butt yet the default DirectDraw pipeline is only about half the speed?
Is it because of the OGL pipeline's new STR improvement? It would be great if DirectDraw could get it too.
STR made a huge improvement for the OGL pipeline in terms of both robustness and performance. (Read my STR-Crazy and STR-Crazier blogs on the subject for more details.) The OGL pipeline can be faster than the DirectDraw pipeline because it implements many more hardware-accelerated operations, such as blending and transforms. On top of that, STR has helped reduce overhead when rendering lots of smaller primitives, which improves performance even more when compared to the non-STR pipelines. (The Direct3D pipeline, recently beefed up in Mustang, matches the speed of the OGL pipeline in many areas, but it doesn't have the benefit of STR yet.) We're thinking about how to apply the same concepts to our older pipelines such as the DirectDraw/Direct3D-based pipelines on Windows, hopefully in the Dolphin timeframe.
Well STR means just another thread which means more resource waste and slower startup.
I disagree. The cost of having the extra queue flushing thread is next to zero, and it has no impact on startup performance. In fact, that extra thread has enabled the whole STR architecture, which provides for ridiculously better rendering performance, especially for Swing apps.
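For readers who haven't seen the idea before, here is a rough sketch of the general shape: any thread may enqueue rendering operations, and one dedicated thread drains the queue and is the only one that talks to the native API. This is not the actual Java2D code; every name below (StrSketch, RenderOp, nativeLog) is hypothetical, and a list of strings stands in for the native calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration of a single-threaded rendering queue.
class StrSketch {
    /** A queued rendering operation; assumed interface, for illustration only. */
    interface RenderOp { void execute(List<String> nativeLog); }

    private static final RenderOp STOP = log -> { };   // sentinel to end the flusher
    private final BlockingQueue<RenderOp> queue = new LinkedBlockingQueue<>();
    private final List<String> nativeLog = new ArrayList<>(); // touched only by the flusher
    private final Thread flusher = new Thread(this::flushLoop, "queue-flusher");

    StrSketch() {
        flusher.setDaemon(true);
        flusher.start();
    }

    /** Called from any thread; cheap, since it only appends to the queue. */
    void enqueue(RenderOp op) { queue.add(op); }

    private void flushLoop() {
        List<RenderOp> batch = new ArrayList<>();
        try {
            while (true) {
                batch.add(queue.take());   // block until there is work
                queue.drainTo(batch);      // then grab everything pending in one go
                for (RenderOp op : batch) {
                    if (op == STOP) return;
                    op.execute(nativeLog); // "native" work happens on this thread only
                }
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Flushes everything queued so far, stops the flusher, and returns the log. */
    List<String> shutdownAndGetLog() {
        queue.add(STOP);
        try {
            flusher.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return nativeLog;
    }

    public static void main(String[] args) {
        StrSketch str = new StrSketch();
        for (int i = 0; i < 1000; i++) {
            final int n = i;
            str.enqueue(log -> log.add("fillRect " + n));
        }
        System.out.println(str.shutdownAndGetLog().size());
    }
}
```

The point of the sketch is that the flusher thread spends its time blocked in take() when there is nothing to do, and that lots of small operations get handed over in batches instead of one at a time.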
Considering that the DDraw pipeline is used mostly for GUI applications, this could be the reason why they chose not to make it STR.
No, it was just a matter of (people) resources. Making the existing DirectDraw pipeline (and I'm not talking about the D3D pipeline) work with STR techniques is a big project, and it wouldn't necessarily benefit as much as the OGL pipeline did. We used the OGL pipeline as a place to experiment with STR concepts, and it was a resounding success, so as I said above, we'd like to apply it to other areas of our implementation in the future where appropriate.