Java-Gaming.org Hi !
  Immediate mode rendering is dead  (Read 23714 times)
Offline princec

JGO Kernel


Medals: 409
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Posted 2010-01-18 12:53:26 »

Long live VBOs!

So, having more or less doubled the speed of my sprite engine merely by switching to VBO based rendering instead of using traditional heap based DirectByteBuffers, I find now that as soon as I perform immediate mode rendering in OpenGL, performance rapidly plummets back down to terrible.

It would appear that VBOs are kind of an all-or-nothing approach; any immediate mode rendering pretty much buggers the whole advantage of VBO usage up. So now I have to port all my text rendering, background rendering, capacitor zap rendering, building under attack rendering, and powerup beam in effect rendering to use VBOs instead of immediate mode. Bah.

Cas Smiley

Offline Spasi
« Reply #1 - Posted 2010-01-18 12:55:16 »

Thanks for the tip. Smiley
Offline ryanm

Senior Duke


Projects: 1
Exp: 15 years


Used to be bleb


« Reply #2 - Posted 2010-01-18 14:15:34 »

Does this stuff ever get publicly documented by driver writers?
Offline Riven
« League of Dukes »

JGO Overlord


Medals: 823
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #3 - Posted 2010-01-18 14:39:06 »

So, having more or less doubled the speed of my sprite engine merely by switching to VBO based rendering instead of using traditional heap based DirectByteBuffers, I find now that as soon as I perform immediate mode rendering in OpenGL, performance rapidly plummets back down to terrible.

Could you explain what you mean by a 'heap based DirectByteBuffer'? A DirectByteBuffer has its pointer outside of the heap. Naturally, the DirectByteBuffer instance will be on the heap, but that's also the case with the object returned from glMapBuffer().

glMapBuffer() will also stall the GPU until glUnmapBuffer() is called. So when you are filling your ByteBuffer, which might take some time, your GPU will be idling. No such problem with glBufferDataARB.

http://www.spec.org/gwpg/gpc.static/vbo_whitepaper.html

Offline Spasi
« Reply #4 - Posted 2010-01-18 15:03:29 »

Could you explain what you mean by a 'heap based DirectByteBuffer'? A DirectByteBuffer has its pointer outside of the heap. Naturally, the DirectByteBuffer instance will be on the heap, but that's also the case with the object returned from glMapBuffer().

I think he means a direct ByteBuffer that points to JVM-allocated memory, versus a direct ByteBuffer that points to driver-allocated memory (that's returned from glMapBuffer), which may be used for faster GPU transfers.

glMapBuffer() will also stall the GPU until glUnmapBuffer() is called. So when you are filling your ByteBuffer, which might take some time, your GPU will be idling. No such problem with glBufferDataARB.

http://www.spec.org/gwpg/gpc.static/vbo_whitepaper.html

That's implementation and usage dependent. It may stall or it may not stall. Cas may be lucky and his driver is clever enough to queue a copy instead of stalling the GPU. That's why I suggested in his other topic that he should explore an algorithmic solution first (using shaders, instancing etc), before trying a simple switch to VBOs (which he should do anyway).
Offline Riven
« League of Dukes »

JGO Overlord


Medals: 823
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #5 - Posted 2010-01-18 15:10:24 »

I think he means a direct ByteBuffer that points to JVM-allocated memory, versus a direct ByteBuffer that points to driver-allocated memory (that's returned from glMapBuffer), which may be used for faster GPU transfers.

1. Filling both has exactly the same performance.
2. I wonder how the CPU=>GPU copy could be faster for the driver-allocated memory. It might be guaranteed to be page-aligned, but we can guarantee the same with a JVM-allocated ByteBuffer. The check for alignment should be negligible compared to the data copy.


That's implementation and usage dependent. It may stall or it may not stall. Cas may be lucky and his driver is clever enough to queue a copy instead of stalling the GPU.

Wouldn't a copy (malloc) of the driver be much slower than providing your own ByteBuffer?



Any benchmarks to share?

Offline princec

JGO Kernel


Medals: 409
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #6 - Posted 2010-01-18 15:19:50 »

Heap-based direct byte buffer is that which is returned by ByteBuffer.allocateDirect() - it's still on the C heap (not the Java object heap). It's possible to create direct bytebuffers outside of the C heap in native code but not pure Java code. This is how the old NV_fence and AGP RAM allocation used to work.
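The distinction can be seen from plain Java without any OpenGL involved. The following is only a minimal sketch; the class and method names are made up for illustration, and it says nothing about where a driver-mapped buffer would live.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class DirectVsHeapBuffer {
    /** Labels a buffer by where its storage lives (illustrative helper, not an LWJGL API). */
    static String describe(ByteBuffer b) {
        return b.isDirect() ? "direct (native memory)" : "heap (byte[] on the Java object heap)";
    }

    public static void main(String[] args) {
        // Backed by a byte[] on the Java object heap.
        ByteBuffer heap = ByteBuffer.allocate(1024);

        // Backed by native memory outside the Java object heap, so native code
        // such as an OpenGL binding can be handed a stable address.
        ByteBuffer direct = ByteBuffer.allocateDirect(1024).order(ByteOrder.nativeOrder());

        System.out.println("allocate():       " + describe(heap));
        System.out.println("allocateDirect(): " + describe(direct));
    }
}
```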

glMapBuffer() will return a pointer to an address in the process's address space and not on the heap too, it'll be some weirdy location provided by the driver. Now, if I were calling glMapBuffer() on a buffer that overlapped some bit of memory the driver was currently trying to render, then I'd cause a GPU stall most likely. However, because of the way I do rendering - I map, rapidly fill the buffer with data, unmap, and then start rendering from it - I'm unlikely to cause any GPU stalling. In fact if the driver's worth its salt it'd be batching my state change calls and processing them asynchronously with the buffer DMA, effectively making nearly all the calls return immediately.

If I find that filling the byte buffer is taking too long I could double buffer my geometry data - that is, alternately swap between two VBOs. I might yet do this anyway.
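A rough sketch of that alternating scheme, with two direct ByteBuffers standing in for the two VBOs so it runs without a GL context. The class and its methods are hypothetical; a real engine would keep two GL buffer ids and map or upload to whichever one is current.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Alternates between two buffers so the next frame can be written while the previous one is still being read. */
class PingPongBuffers {
    private final ByteBuffer[] buffers = new ByteBuffer[2];
    private int writeIndex = 0;

    PingPongBuffers(int capacityBytes) {
        for (int i = 0; i < buffers.length; i++) {
            // Stand-ins for two VBOs.
            buffers[i] = ByteBuffer.allocateDirect(capacityBytes).order(ByteOrder.nativeOrder());
        }
    }

    /** Returns the buffer to fill this frame, reset and ready for writing. */
    ByteBuffer beginFrame() {
        ByteBuffer b = buffers[writeIndex];
        b.clear();
        return b;
    }

    /** Finishes the current buffer, switches to the other one, and returns the index just filled. */
    int endFrame() {
        int filled = writeIndex;
        buffers[filled].flip();
        writeIndex ^= 1;
        return filled;
    }

    public static void main(String[] args) {
        PingPongBuffers pp = new PingPongBuffers(64);
        for (int frame = 0; frame < 4; frame++) {
            ByteBuffer b = pp.beginFrame();
            b.putFloat(frame);
            System.out.println("frame " + frame + " wrote buffer " + pp.endFrame());
        }
    }
}
```

Whether this actually helps depends on the driver; with GL_STREAM_DRAW it may already be doing something similar behind the scenes.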

Cas Smiley

Offline princec

JGO Kernel


Medals: 409
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #7 - Posted 2010-01-18 15:22:47 »

1. Filling both has exactly the same performance.
Not so; a driver-provided pointer to a strange address over the bus somewhere can completely bypass clientside memory caches and thus eliminate cache pollution, a major factor in slowdown when copying data up to the card. Even the piddly 125kb of vertex data I copy up per frame buggers most CPUs' L1 caches.

At least that's my understanding of it, and seeing as everything's twice as fast since I made the change, I assume something good is happening Smiley

Cas Smiley

Offline princec

JGO Kernel


Medals: 409
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #8 - Posted 2010-01-18 15:25:26 »

2. I wonder how the CPU=>GPU copy could be faster for the driver-allocated memory. It might be guaranteed to be page-aligned, but we can guarantee the same with a JVM-allocated ByteBuffer. The check for alignment should be negligible compared to the data copy.

Wouldn't a copy (malloc) of the driver be much slower than providing your own ByteBuffer?
Again, this depends on the usage flags and how clever the driver is, but in my instance, I'm doing GL_STREAM_DRAW and GL_WRITE_ONLY, one of the optimally easy cases for allocation: the driver never needs to return the same pointer twice, or even more cleverly, it can provide the same clientside address space pointer, but pointing to a completely different serverside memory location, so that I can write to it unhindered.

Cas Smiley

Offline Spasi
« Reply #9 - Posted 2010-01-18 16:14:45 »

(I was writing this while Cas posted his 3 replies above, posting anyway)

1. Filling both has exactly the same performance.

Yes. But as I said in the other topic, for the amount of data Cas is generating per-frame, filling the buffer shouldn't be affecting the game's performance in any significant way.

2. I wonder how the CPU=>GPU copy could be faster for the driver-allocated memory. It might be guaranteed to be page-aligned, but we can guarantee the same with a JVM-allocated ByteBuffer. The check for alignment should be negligible compared to the data copy.

I'm not sure, but I think the GPU cannot perform DMA transfers from arbitrary memory locations. For example, on AGP cards, you need to use memory that's been reserved and allocated from the GART for the card to be able to use DMA. The JVM can't do that for you, but glMapBuffer can. The GART memory linearization has been moved to the GPU/driver on PCIe cards, but I think the same issue remains.

Wouldn't a copy (malloc) of the driver be much slower than providing your own ByteBuffer?

The GPU may have allocated a local copy of the VBO in GPU memory and do some kind of double buffering. Instead of stalling on glMapBuffer, it may continue rendering from the local copy, then update the VBO when rendering is done. It can't do that with user-allocated memory, not without having 3 copies of the VBO data. For example:

1) glBufferSubData is called, a user-allocated buffer is supplied.
2) the driver copies the user-allocated buffer somewhere else in system memory (possibly DMAable).
3) the system-memory => GPU-memory copy is performed to update the GPU-allocated VBO.

That's 3 copies of the VBO data. The driver cannot avoid #2, because the user may modify the user-allocated buffer data before the GPU transfer is performed. In the glMapBuffer case, data modification is controlled, you can't change anything outside a map/unmap pair or without a glBuffer(Sub)Data call. So, glMapBuffer gives you direct access to the #2 memory.
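The reason step 2 cannot be skipped can be shown with plain ByteBuffers. This is a simulation only: the two queue methods are invented stand-ins for what a driver might do internally, not real GL or LWJGL calls.

```java
import java.nio.ByteBuffer;

class DeferredUploadSketch {
    /** Queues the caller's memory by reference: no copy, but a later transfer sees later edits. */
    static ByteBuffer queueByReference(ByteBuffer src) {
        return src.duplicate();
    }

    /** Queues a private snapshot of the caller's memory: the extra copy described as step 2. */
    static ByteBuffer queueBySnapshot(ByteBuffer src) {
        ByteBuffer staging = ByteBuffer.allocateDirect(src.remaining());
        staging.put(src.duplicate()); // duplicate() leaves the caller's position untouched
        staging.flip();
        return staging;
    }

    public static void main(String[] args) {
        ByteBuffer user = ByteBuffer.allocateDirect(4);
        user.put(0, (byte) 1);

        ByteBuffer byRef = queueByReference(user);
        ByteBuffer bySnap = queueBySnapshot(user);

        // The application reuses its buffer for the next frame before the
        // (simulated) transfer has taken place.
        user.put(0, (byte) 2);

        System.out.println("by reference sees: " + byRef.get(0));  // 2
        System.out.println("snapshot sees:     " + bySnap.get(0)); // 1
    }
}
```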

That's all assuming the driver does the double buffer copy. If not, the GPU stalls and you have the same performance. Possibly slightly better with glMapBuffer in case what I wrote above about DMA isn't bullshit. Or well, the data transfer shouldn't be the bottleneck (very little data according to Cas, very high bandwidth available on modern cards), but stalling the GPU can be. glMapBuffer is supposed to help with avoiding such stalls (that's why the WRITE_ONLY flag exists).

Any benchmarks to share?

No, sorry. I may be talking out of my ass here, it's all based on random stuff I've read around and how I think GPUs/drivers work. I've said this before on the LWJGL forums, I haven't used glMapBuffer more than once, I even replaced it at some point with a better solution. My other warning also applies, VBOs are very GPU/vendor/driver sensitive, you cannot be sure that a rendering setup that performs great on your machine will be anywhere close to optimal for other machines too. On the other hand, I haven't done any VBO tests lately, drivers with VBO support have matured and things may be much better now.
Offline Spasi
« Reply #10 - Posted 2010-01-18 16:20:01 »

Anyway, I'd still be interested in a comparison of map/unmap with glBufferSubData in the context of a sprite engine. Cas should be able to test this very quickly.
Offline Riven
« League of Dukes »

JGO Overlord


Medals: 823
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #11 - Posted 2010-01-18 17:07:53 »

2048 tiny triangles (to take fillrate out of the equation)


glBufferDataARB => 50ms
                  for (int i = 0; i < 256; i++)
                  {
                     javaSideBuffer.clear();
                     glBufferDataARB(GL_ARRAY_BUFFER_ARB, javaSideBuffer, GL_STREAM_DRAW_ARB);

                     glVertexPointer(3, GL_FLOAT, 0, 0);
                     glColorPointer(3, GL_FLOAT, 0, byteCount >> 1);
                     glDrawArrays(GL_TRIANGLES, 0, quadCount * 3 * 2);
                  }


glBufferSubData() => 50ms
                  glBufferDataARB(GL_ARRAY_BUFFER_ARB, byteCount, GL_STREAM_DRAW_ARB);

                  for (int i = 0; i < 256; i++)
                  {
                     javaSideBuffer.clear();
                     glBufferSubDataARB(GL_ARRAY_BUFFER_ARB, 0, javaSideBuffer);

                     glVertexPointer(3, GL_FLOAT, 0, 0);
                     glColorPointer(3, GL_FLOAT, 0, byteCount >> 1);
                     glDrawArrays(GL_TRIANGLES, 0, quadCount * 3 * 2);
                  }


glMapBuffer() => 32ms
                  glBufferDataARB(GL_ARRAY_BUFFER_ARB, byteCount, GL_STREAM_DRAW_ARB);

                  for (int i = 0; i < 256; i++)
                  {
                     ByteBuffer driverSideBuffer;

                     driverSideBuffer = glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB, null);
                     javaSideBuffer.clear();
                     driverSideBuffer.put(javaSideBuffer);
                     glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);

                     glVertexPointer(3, GL_FLOAT, 0, 0);
                     glColorPointer(3, GL_FLOAT, 0, byteCount >> 1);
                     glDrawArrays(GL_TRIANGLES, 0, quadCount * 3 * 2);
                  }




And the boring code to fill the vertex array:
            float quadSize = 5.0f;
            int quadRepeat = 32;
            int quadCount = quadRepeat * quadRepeat;
            int byteCount = quadCount
                  * 2 /* triangles per quad */
                  * 3 /* vertices per triangle */
                  * 3 /* coordinates per vertex */
                  * 2 /* vertex+color */
                  * 4 /* float_sizeof */;
            ByteBuffer javaSideBuffer = BufferUtils.createByteBuffer(byteCount);

            // vertices
            {
               for (int x = 0; x < quadRepeat; x++)
               {
                  for (int y = 0; y < quadRepeat; y++)
                  {
                     float x0 = (x + 1) * quadSize;
                     float y0 = (y + 1) * quadSize;
                     float x1 = (x + 2) * quadSize;
                     float y1 = (y + 2) * quadSize;

                     javaSideBuffer.putFloat(x0).putFloat(y0).putFloat(0.0f);
                     javaSideBuffer.putFloat(x1).putFloat(y0).putFloat(0.0f);
                     javaSideBuffer.putFloat(x1).putFloat(y1).putFloat(0.0f);
                     javaSideBuffer.putFloat(x1).putFloat(y1).putFloat(0.0f);
                     javaSideBuffer.putFloat(x0).putFloat(y1).putFloat(0.0f);
                     javaSideBuffer.putFloat(x0).putFloat(y0).putFloat(0.0f);
                  }
               }
            }

            // colors
            {
               for (int x = 0; x < quadRepeat; x++)
               {
                  for (int y = 0; y < quadRepeat; y++)
                  {
                     float[] cornerA = new float[] { 1, 0, 0 }; // red
                     float[] cornerB = new float[] { 0, 1, 0 }; // green
                     float[] cornerC = new float[] { 0, 0, 1 }; // blue
                     float[] cornerD = new float[] { 1, 1, 0 }; // yellow

                     for (float v : cornerA)
                        javaSideBuffer.putFloat(v);
                     for (float v : cornerB)
                        javaSideBuffer.putFloat(v);
                     for (float v : cornerC)
                        javaSideBuffer.putFloat(v);

                     for (float v : cornerC)
                        javaSideBuffer.putFloat(v);
                     for (float v : cornerD)
                        javaSideBuffer.putFloat(v);
                     for (float v : cornerA)
                        javaSideBuffer.putFloat(v);
                  }
               }
            }
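For reference, the sizes implied by the fill code above can be worked out in isolation. The class below is only an illustrative restatement of that arithmetic; its names are not from the benchmark.

```java
class BenchmarkLayout {
    static final int QUAD_REPEAT = 32;
    static final int FLOAT_BYTES = 4;

    static int quadCount()     { return QUAD_REPEAT * QUAD_REPEAT; }   // 32 * 32 quads
    static int triangleCount() { return quadCount() * 2; }             // two triangles per quad
    static int vertexCount()   { return triangleCount() * 3; }         // three vertices per triangle

    /** Three position floats plus three colour floats per vertex. */
    static int byteCount()     { return vertexCount() * 3 * 2 * FLOAT_BYTES; }

    /** Colours are stored after all positions, so they start halfway through the buffer. */
    static int colorOffset()   { return byteCount() >> 1; }

    public static void main(String[] args) {
        System.out.println("quads       = " + quadCount());     // 1024
        System.out.println("triangles   = " + triangleCount()); // 2048
        System.out.println("vertices    = " + vertexCount());   // 6144
        System.out.println("byteCount   = " + byteCount());     // 147456
        System.out.println("colorOffset = " + colorOffset());   // 73728
    }
}
```

So each of the 256 iterations moves 147456 bytes (144 KiB), and the `byteCount >> 1` passed to glColorPointer is the byte offset of the first colour.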

Offline Spasi
« Reply #12 - Posted 2010-01-18 17:19:11 »

Could you try reusing the driverSideBuffer (store the reference and pass as the last argument to glMapBuffer)? Also, could you download a fresh LWJGL build and try the new glMapBuffer API (with an explicit length argument)?
Offline Riven
« League of Dukes »

JGO Overlord


Medals: 823
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #13 - Posted 2010-01-18 17:24:10 »

reusing glMapBuffer() => 32ms
                  glBufferDataARB(GL_ARRAY_BUFFER_ARB, byteCount, GL_STREAM_DRAW_ARB);

                  // map once up front; each iteration unmaps before drawing and remaps afterwards
                  ByteBuffer driverSideBuffer = glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB, null);

                  for (int i = 0; i < 256; i++)
                  {
                     javaSideBuffer.clear();
                     driverSideBuffer.clear();
                     driverSideBuffer.put(javaSideBuffer);
                     glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);

                     glVertexPointer(3, GL_FLOAT, 0, 0);
                     glColorPointer(3, GL_FLOAT, 0, byteCount >> 1);

                     glDrawArrays(GL_TRIANGLES, 0, quadCount * 3 * 2);

                     // pass the previous wrapper back in so it can be reused
                     driverSideBuffer = glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB, driverSideBuffer);
                  }

                  // the last iteration leaves the buffer mapped, so release it
                  glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);

Offline Riven
« League of Dukes »

JGO Overlord


Medals: 823
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #14 - Posted 2010-01-18 17:28:22 »

Could you try reusing the driverSideBuffer (store the reference and pass as the last argument to glMapBuffer)?
See above.

Also, could you download a fresh LWJGL build and try the new glMapBuffer API (with an explicit length argument)?

I downloaded it, put it in the build path, and I couldn't find a version of glMapBuffer with an explicit length parameter persecutioncomplex

Offline Riven
« League of Dukes »

JGO Overlord


Medals: 823
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #15 - Posted 2010-01-18 17:39:45 »

Hrr, I hate the JVM... I had a version that used alternating buffers, took 27ms, and when I launched it again, it was back as 32ms Sad
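Run-to-run swings like this are common with JIT-compiled code. One way to reduce the noise is to warm up first and report the median of several timed runs. The following is a sketch under assumptions: the harness and its names are hypothetical, and the workload is a plain buffer fill rather than the GL loop. With real GL calls you would also want a glFinish() before reading the clock, since the driver may queue work asynchronously.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class MedianTimer {
    /** Median of the samples; for an even count, the mean of the two middle values. */
    static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return (sorted.length % 2 == 1) ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    /** Runs the task untimed for warm-up, then times it repeatedly and returns the median in nanoseconds. */
    static long measure(Runnable task, int warmup, int runs) {
        if (runs < 1) throw new IllegalArgumentException("runs must be >= 1");
        for (int i = 0; i < warmup; i++) task.run(); // give the JIT a chance to compile the hot path
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - t0;
        }
        return median(samples);
    }

    public static void main(String[] args) {
        // Placeholder workload: fill a buffer the size of the benchmark's vertex data.
        ByteBuffer buf = ByteBuffer.allocateDirect(147456);
        Runnable fill = () -> {
            buf.clear();
            while (buf.remaining() >= 4) buf.putFloat(1.0f);
        };
        long ns = measure(fill, 50, 21);
        System.out.println("median fill time: " + (ns / 1000) + " us");
    }
}
```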

Offline Spasi
« Reply #16 - Posted 2010-01-18 17:57:28 »

I downloaded it, put it in the build path, and I couldn't find a version of glMapBuffer with an explicit length parameter persecutioncomplex

There's a glMapBufferARB(int target, int access, long length, ByteBuffer old_buffer) in ARBBufferObject, isn't there?
Offline Riven
« League of Dukes »

JGO Overlord


Medals: 823
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #17 - Posted 2010-01-18 17:58:48 »

There's a glMapBufferARB(int target, int access, long length, ByteBuffer old_buffer) in ARBBufferObject, isn't there?

Right, I had to restart Eclipse.

Offline Riven
« League of Dukes »

JGO Overlord


Medals: 823
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #18 - Posted 2010-01-18 18:00:58 »

Slightly better.


glMapBuffer(..., length, ....) =>30ms
               glBufferDataARB(GL_ARRAY_BUFFER_ARB, byteCount, GL_STREAM_DRAW_ARB);

               ByteBuffer driverSideBuffer = null;

               for (int i = 0; i < 256; i++)
               {
                  driverSideBuffer = glMapBufferARB(GL_ARRAY_BUFFER_ARB, GL_WRITE_ONLY_ARB, byteCount, driverSideBuffer);

                  javaSideBuffer.clear();
                  driverSideBuffer.clear();
                  driverSideBuffer.put(javaSideBuffer);
                  glUnmapBufferARB(GL_ARRAY_BUFFER_ARB);

                  glVertexPointer(3, GL_FLOAT, 0, 0);
                  glColorPointer(3, GL_FLOAT, 0, byteCount >> 1);
                  glDrawArrays(GL_TRIANGLES, 0, quadCount * 3 * 2);
               }

Offline Spasi
« Reply #19 - Posted 2010-01-18 18:02:53 »

Cool, thanks.
Offline VeaR

Junior Duke





« Reply #20 - Posted 2010-01-18 18:09:14 »

"Immediate mode rendering is dead"

It's been dead for a long time. Immediate mode rendering in OpenGL is when calls are made between glBegin() and glEnd() to specify each triangle's vertices one by one. It was the original rendering mode of OpenGL (back in 1993, if I remember).

You should have given the title:

"Vertex array rendering mode is dead, along with display list rendering"
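To put rough numbers on the difference, the per-frame call counts can be compared for the 1024-quad scene used in the benchmark earlier in this thread. This is back-of-the-envelope arithmetic only, assuming one glColor3f and one glVertex3f call per vertex inside a single glBegin()/glEnd() pair; the class is made up for illustration and measures nothing.

```java
class CallCountSketch {
    /** glBegin + glEnd, plus one colour call and one position call per vertex. */
    static long immediateModeCalls(int quads) {
        long vertices = quads * 6L; // two triangles per quad, three vertices each
        return 2 + 2 * vertices;
    }

    /** One upload, two pointer set-ups and one draw call, independent of the quad count. */
    static long bufferedCalls(int quads) {
        return 4;
    }

    public static void main(String[] args) {
        int quads = 32 * 32;
        System.out.println("immediate mode: " + immediateModeCalls(quads) + " calls"); // 12290
        System.out.println("buffered:       " + bufferedCalls(quads) + " calls");      // 4
    }
}
```

From Java every one of those calls also crosses the native boundary, which is part of why per-vertex submission tends to cost more there than in C.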

Offline princec

JGO Kernel


Medals: 409
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #21 - Posted 2010-01-18 18:18:12 »

Indeed. Although at least up until now, vertex array rendering was fast enough for my purposes, but now it seems I've hit the absolute limits, and the drivers are definitely pushing everybody to use VBOs or face drastically shit performance.

This is kinda significant because a) all my legacy code has a lot of immediate mode in it (I don't mean a lot of rendering, just a lot of places) and b) all the legacy tutorial code on the internets is immediate mode and it's all completely the wrong thing to be teaching people now

Anyway, back to major hackery to convert all my immediate mode stuff into VBO rendering... without breaking any of my legacy code... whilst still being integrated into the sprite engine layering system.... argh

Cas Smiley

Offline DzzD
« Reply #22 - Posted 2010-01-18 19:03:09 »

"Immediate mode rendering is dead"

It's been dead for a long time. Immediate mode rendering in OpenGL is when calls are made between glBegin() and glEnd() to specify each triangle's vertices one by one. It was the original rendering mode of OpenGL (back in 1993, if I remember).

You should have given the title:

"Vertex array rendering mode is dead, along with display list rendering"



hehe, and any chance of an immediate compiled display list bench? Smiley

Offline princec

JGO Kernel


Medals: 409
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #23 - Posted 2010-01-18 19:26:37 »

 Display lists used to just crash / not work most of the time I tried them on crappy Intel cards anyway. So I never used them.

All this stuff is basically written in stone for OpenGL3.1 anyway, so we might as well get used to it and start doing it right from now on anyway  Grin

Cas Smiley

Offline DzzD
« Reply #24 - Posted 2010-01-18 19:31:54 »

Display lists used to just crash / not work most of the time I tried them on crappy Intel cards anyway. So I never used them.

All this stuff is basically written in stone for OpenGL3.1 anyway, so we might as well get used to it and start doing it right from now on anyway  Grin

Cas Smiley
yup yup, was just a request for my personal knowledge

Offline xinaesthetic

Senior Duke


Medals: 1



« Reply #25 - Posted 2010-01-19 02:56:08 »

I must admit, I recommended display lists to someone on here just a few days ago, without warning them that they are actually already deprecated (which I knew full well)... they were using the JOGL TextRenderer and I wasn't really sure how that would interact.  Well, I felt guilty so I've at least now let them know.

In my own code, I have VBOs for the major stuff, with some immediate mode, the odd display list and one or two vertex arrays... I don't notice any particular difference to performance whether I have my various trivial immediate mode bits rendering or only VBO stuff - I am largely CPU bound, though.  In fact, I've just been experimenting rendering particles in immediate mode vs VA vs VBO and am seeing absolutely negligible differences... if anything, immediate mode maybe slightly ahead.  WTF?
Offline princec

JGO Kernel


Medals: 409
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #26 - Posted 2010-01-19 09:49:45 »

Maybe you have awesome drivers Smiley

I'm on an Nvidia 6510Go, with no video RAM - it's got nice drivers but performance wise it's barely any better than Intel rubbish.

Cas Smiley

Offline xinaesthetic

Senior Duke


Medals: 1



« Reply #27 - Posted 2010-01-19 11:30:31 »

Well, I've got the 190.89 drivers for an 8600M GT under 32bit Vista... certainly a bit better hardware wise Smiley

I'm still surprised that I fairly consistently see an increase of a few fps changing my particle system to immediate mode instead of VBO... never the other way around, it seems.  Maybe I'm confused somewhere down the line.
Offline jezek2
« Reply #28 - Posted 2010-01-19 11:54:40 »

Well, I've got the 190.89 drivers for an 8600M GT under 32bit Vista... certainly a bit better hardware wise Smiley

I'm still surprised that I fairly consistently see an increase of a few fps changing my particle system to immediate mode instead of VBO... never the other way around, it seems.  Maybe I'm confused somewhere down the line.

I observe the same thing (at least on a 6600GT). It seems that small dynamic geometry renders faster using immediate mode than using a VBO. But I haven't tried, e.g., filling the VBO at the start of the frame, rendering other stuff, and then using that VBO for rendering, which would be more GPU/pipeline friendly.
Offline princec

JGO Kernel


Medals: 409
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #29 - Posted 2010-01-19 11:58:14 »

I can't fathom any way a particle system could be faster using immediate mode rendering. Especially from Java. Unless you draw so few particles it amounts to a microbenchmark and that renders the results a bit suspect.

Cas Smiley
