Java-Gaming.org    
GPU, which amount of VBO data can be allocated?
blizzard

Senior Newbie





« Posted 2011-10-18 10:13:21 »

Hello

I have started working with VBOs and my first tests look good.
For further planning, the following question came up:
How much GPU memory can be allocated through VBOs?
I just want a rough feeling for the sizes that can be used.

I have an NVIDIA Quadro FX 770M GPU, and a utility program shows 512 MB available on the card.
Are all 512 MB available for use, and if not, how do I find out what is available?

Thanks

theagentd
« Reply #1 - Posted 2011-10-18 12:31:57 »

Your GPU has 512 MB of VRAM, and that's where your VBO data is stored. Remember that this memory is shared with everything else that is running, just like system RAM, so don't expect all 512 MB to be free. For example, Windows Aero constantly uses about 100-125 MB of VRAM. If you disable it, usage drops to ~25 MB, if I recall correctly. I recommend trying GPU-Z, a program that can monitor memory usage, at least on NVIDIA cards.
However, 512 MB is not the limit on how much VBO data you can have. If the graphics card runs out of memory, it starts swapping VRAM to system memory, similar to how your computer swaps system memory to the hard drive. This is obviously not great for performance, but it is not that bad either. The memory manager is intelligent enough to swap out unused things, and the PCI-E bus is fast enough to handle it pretty well. What happens when you overwhelm your VRAM depends on what you overwhelm it with. If you have 1 GB of unused (cached or preloaded or something) VBOs lying around, you won't see much of a performance drop (I estimate less than 10%). If you overwhelm it with textures, things get much worse, as the whole texture is needed in memory for a much longer time (a VBO is only read once per draw and can then be swapped out again).

Source of info: my own game. I load chunks of the game world into VBOs and only draw the ones that pass the frustum culling test. I used about 2 GB of VRAM, and my card only has 1.5 GB. xD

Fun fact: If you overwhelm your VRAM with a framebuffer (render target) you will freeze your computer. Yeah, the mouse and everything. Swapping memory that is so commonly used simply freezes the game. What? You wonder why I know that? Who wouldn't want to play a game with 32SxAA (= 2x2 driver supersampling + 8xMSAA) on a 2x2 supersampled RGBA_32F render target? I mean, come on! That's just 16x ordered grid supersampling with 8xMSAA and my 1.5GB VRAM ran out?!
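
A rough back-of-the-envelope check of that render target, as a sketch only. The 1920x1080 base resolution is an assumption (the post does not state it), the names `RenderTargetBudget` and `colorBytes` are hypothetical, and depth buffers and driver overhead are ignored.

```java
public class RenderTargetBudget {
    /** Bytes needed for the color samples of a supersampled, multisampled target. */
    static long colorBytes(int width, int height, int superSamplesPerPixel,
                           int msaaSamples, int bytesPerSample) {
        // long arithmetic: the result easily exceeds Integer.MAX_VALUE
        return (long) width * height * superSamplesPerPixel * msaaSamples * bytesPerSample;
    }

    public static void main(String[] args) {
        // 2x2 driver supersampling on a 2x2 supersampled target = 16 pixels per screen pixel
        int superSamples = (2 * 2) * (2 * 2);
        int msaa = 8;                 // 8xMSAA
        int bytesPerSample = 4 * 4;   // RGBA_32F: four 32-bit floats
        long bytes = colorBytes(1920, 1080, superSamples, msaa, bytesPerSample); // assumed resolution
        System.out.printf("%d bytes (~%.2f GiB)%n", bytes, bytes / (1024.0 * 1024.0 * 1024.0));
    }
}
```

Under these assumptions the color data alone comes to roughly 4 GB, which would explain why a 1.5 GB card ran out.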

Myomyomyo.
delt0r

JGO Knight


Medals: 26
Exp: 18 years


Computers can do that?


« Reply #2 - Posted 2011-10-18 13:58:04 »

Even if you have not run out of VRAM, the driver is free to *not* store your VBO on the card, for example in streaming mode.

I have no special talents. I am only passionately curious.--Albert Einstein
theagentd
« Reply #3 - Posted 2011-10-18 14:22:58 »

Ah, yeah, that makes sense too.
If you fire up the NVidia Control Panel and on the menu bar choose help and then System Information, you can find out exactly how much memory you're allowed to use. Mine says 1.5GB dedicated video memory, 3316 total available memory.

princec

JGO Kernel


Medals: 342
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #4 - Posted 2011-10-18 14:28:23 »

Last I read, using about 4-8MB for a streaming VBO was recommended. And of course for static geometry, whatever size it needs to be to hold the data. Create as many of them as you need.

Cas :)

theagentd
« Reply #5 - Posted 2011-10-18 14:37:44 »

I've noticed that using STREAM_DRAW doesn't affect FPS at all. I'm sending 12MB of data every frame for 1 million particles at 60 FPS. It's CPU-limited though, so it might just be that.

gouessej
« Reply #6 - Posted 2011-10-18 14:40:08 »

Hi

Please be careful about one thing. As far as I know, ATI and NVIDIA graphics cards do not behave in exactly the same way when they fail to store VBO data on the GPU. Some (but not all) ATI graphics cards simply return an error code, whereas most NVIDIA graphics cards try to store the data somewhere else and return an error code only when even that fails.

Quote from theagentd:
"I've noticed that using STREAM_DRAW doesn't affect FPS at all. I'm sending 12MB of data every frame for 1 million particles at 60 FPS. It's CPU-limited though, so it might just be that."

Actually, I think some of these flags have never been properly supported. I only see a difference between dynamic and static, except on very early VBO implementations (for example the ARB implementation in OpenGL 1.3).

princec
« Reply #7 - Posted 2011-10-18 14:42:10 »

Quote from theagentd:
"I've noticed that using STREAM_DRAW doesn't affect FPS at all. I'm sending 12MB of data every frame for 1 million particles at 60 FPS. It's CPU-limited though, so it might just be that."

You'll never really know until you profile the call to glDrawRangeElements :) With VBOs you will tend to find that glDrawRangeElements returns immediately. You will get blocked on glMapBuffer instead if your buffer is still in use, which it won't be, because you'll have called glUnmapBufferARB after you render with it.

Cas :)

theagentd
« Reply #8 - Posted 2011-10-18 14:53:22 »

Ah, I use glBufferData. glBufferData is a big performance hog in my particle test, though. Calling glBufferData 5 times per frame drops FPS to the low tens.

princec
« Reply #9 - Posted 2011-10-18 15:38:46 »

glMapBuffer's the best way to do it. Failing that, you should be calling glBufferSubData instead of glBufferData, as glBufferData causes the driver to discard and recreate the buffer.

Cas :)

princec
« Reply #10 - Posted 2011-10-18 15:42:56 »

FYI the reason that glMapBuffer is the best way to do it is that the buffer you receive from the call is mapped directly - with any luck! - to fast-writing DMA memory, which doesn't pollute your CPU data cache. What you're probably doing at the moment is filling up a big ByteBuffer in system RAM, trashing your caches completely along the way, then copying that entire buffer to the DMA staging area with glBufferData. Far better to simply ask OpenGL to give you a direct byte buffer straight into that RAM and not trash any caches or do any copying :)

Cas :)
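
The two data paths described above can be sketched without any OpenGL at all. This is only an illustration of the data flow, not a benchmark: the destination allocated with `ByteBuffer.allocateDirect` is a stand-in for whatever buffer a real glMapBuffer call would return, and `DirectWriteSketch`, `viaStaging`, `direct`, and `newDestination` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class DirectWriteSketch {
    static final int FLOATS = 1024;

    /** Path A: fill a staging buffer, then copy it into the destination (an extra pass over memory). */
    static void viaStaging(ByteBuffer destination) {
        ByteBuffer staging = ByteBuffer.allocateDirect(FLOATS * Float.BYTES)
                                       .order(ByteOrder.nativeOrder());
        for (int i = 0; i < FLOATS; i++) {
            staging.putFloat(i * 0.5f);
        }
        staging.flip();
        destination.put(staging);   // second pass: the bulk copy
        destination.flip();
    }

    /** Path B: write straight into the destination, sequentially and without reading it back. */
    static void direct(ByteBuffer destination) {
        for (int i = 0; i < FLOATS; i++) {
            destination.putFloat(i * 0.5f);
        }
        destination.flip();
    }

    /** Stand-in (assumption) for the buffer a real mapping call would hand back. */
    static ByteBuffer newDestination() {
        return ByteBuffer.allocateDirect(FLOATS * Float.BYTES).order(ByteOrder.nativeOrder());
    }

    public static void main(String[] args) {
        ByteBuffer a = newDestination();
        ByteBuffer b = newDestination();
        viaStaging(a);
        direct(b);
        System.out.println("identical contents: " + a.equals(b));
    }
}
```

Both paths end with the same bytes in the destination; the difference is that path B touches each value once instead of twice.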

blizzard
« Reply #11 - Posted 2011-10-18 15:44:42 »

Thank you for the information.

I have one further question: is it recommended to use glMapBuffer to update the data in my VBOs? For my first test I used glBufferData. It works, but is it a fast way to do this?

My English is not that good, so I'm not sure whether I have understood the performance hint from theagentd correctly :-)
blizzard
« Reply #12 - Posted 2011-10-18 15:50:19 »

That was the answer, princec :-)

thanks
theagentd
« Reply #13 - Posted 2011-10-18 16:15:46 »

Quote from princec:
"FYI the reason that glMapBuffer is the best way to do it is that the buffer you receive from the call is mapped directly - with any luck! - to fast-writing DMA memory, which doesn't pollute your CPU data cache. What you're probably doing at the moment is filling up a big ByteBuffer in system RAM, trashing your caches completely along the way, then copying that entire buffer to the DMA staging area with glBufferData. Far better to simply ask OpenGL to give you a direct byte buffer straight into that RAM and not trash any caches or do any copying :)"

I'm using Riven's MappedObject library for this. I'm sure glMapBuffer will work with it, but how do I do that? =S

blizzard
« Reply #14 - Posted 2011-10-18 16:22:37 »

I tried to use gl2.glMapBuffer ..

it crashes :-( .. does anybody have a short example?

I will try again tomorrow; I have to go now

bye
princec
« Reply #15 - Posted 2011-10-18 17:06:50 »

Riven's library should work great with mapped VBOs... provided you don't attempt to read any data, which you probably will do, by accident. I'd advise for now not using the mapping library and just concentrate on making sure you're writing the correct data in there.

"It crashes" btw is no help to anyone whatsoever trying to help you.

Cas :)

theagentd
« Reply #16 - Posted 2011-10-18 17:56:42 »

So I can't read the data back? Then it's useless for most programs using MappedObject. The reason I'm using it in the first place is to be able to permanently store the position and color in a ByteBuffer so I don't have to copy everything into it every frame.

princec
« Reply #17 - Posted 2011-10-18 18:03:03 »

It's still got its advantages and its place, but if you try mapping your objects directly into a mapped VBO you're fundamentally doing it wrong :) The VBO is purely for rendering data; you don't want to have, say, Sprites in there or anything. Having said that, mapped objects may potentially save you some space and should also, if you're processing them in nice, friendly, efficient linear ways, be cache-friendly.

Cas :)

theagentd
« Reply #18 - Posted 2011-10-18 18:07:10 »

I have one million particle positions and colors stored in a mapped ByteBuffer, so I don't have to update it in a Particle object and then put the updated position in a ByteBuffer for each particle each frame. I'm saving both performance and memory by doing that. How am I doing things wrong? >_>

lhkbob

JGO Knight


Medals: 32



« Reply #19 - Posted 2011-10-18 19:02:51 »

If you're storing the data in a GPU mapped buffer (and not one of Riven's mapped buffers), you are making the graphics card go through a lot more effort to manage the data within the VBO. If you make the mapped data readable, the graphics card has to synchronize every time you try to render with that data, which is inefficient.

It is best to separate the graphics card and CPU as much as possible because they work best in an asynchronous fashion.  If you can push requests/data to the card and then let the CPU run, the CPU works best and you're not forcing the GPU to do something it considers inefficient.  It also lets the GPU process all of the requests in the queue in an uninterrupted fashion, which is also good for performance.

theagentd
« Reply #20 - Posted 2011-10-18 19:54:28 »

That's basically why I'm using glBufferData instead of the Sub version. It's marginally but measurably faster.

princec
« Reply #21 - Posted 2011-10-18 20:43:46 »

What you should be doing is:

Set up OpenGL state for particle rendering
Update each particle sequentially
Map your VBO
Write each particle to the VBO
Unmap VBO ("orphan" it)
glDrawArrays
Swapbuffers

You should find that glDrawArrays and swapbuffers return immediately, allowing you to carry on updating the particles for the next frame straight away. Even the call to map the VBO shouldn't block, as the driver should be intelligent enough to give you a completely new buffer if it hasn't finished drawing the previous one. It should only eventually block at the second call to swapbuffers if it hasn't yet actually finished rendering from the first one - that is, when you are flat out.

Cas :)

Riven
« League of Dukes »

JGO Overlord


Medals: 743
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #22 - Posted 2011-10-19 00:32:11 »

In this case the 'only' gain of my MappedObject library is that the mandatory copy from app-data to mapped-buffer is blazing fast.

I benchmarked ByteBuffer.put(ByteBuffer) and sun.misc.Unsafe.copyMemory(p1,p2,len) and they were equally fast. On the other hand, pulling data from objects and pushing it into a buffer is significantly slower (2-3x as slow in my microbenchmark).

It's probably not where your bottleneck is, unless maybe in particle engines and sprite engines (whatever the difference is).
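
To make the two copy paths concrete, here is a minimal sketch using only java.nio. It does not use the MappedObject library and it is not a benchmark; the `Particle` class, `CopyPathsSketch`, and its methods are hypothetical names, and the 12-byte record (two floats plus one packed color int) is an assumption chosen for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class CopyPathsSketch {
    /** Hypothetical particle object; not part of any library. */
    static final class Particle {
        float x, y;
        int rgba;
        Particle(float x, float y, int rgba) { this.x = x; this.y = y; this.rgba = rgba; }
    }

    static final int BYTES_PER_PARTICLE = 2 * Float.BYTES + Integer.BYTES; // 12

    static ByteBuffer allocate(int particleCount) {
        return ByteBuffer.allocateDirect(particleCount * BYTES_PER_PARTICLE)
                         .order(ByteOrder.nativeOrder());
    }

    /** Per-field path: pull each field out of each object and push it into the buffer. */
    static void copyFromObjects(Particle[] particles, ByteBuffer target) {
        for (Particle p : particles) {
            target.putFloat(p.x).putFloat(p.y).putInt(p.rgba);
        }
        target.flip();
    }

    /** Bulk path: the data already lives in a buffer, so one put moves all of it. */
    static void copyFromBuffer(ByteBuffer source, ByteBuffer target) {
        target.put(source.duplicate()); // duplicate() leaves the source position untouched
        target.flip();
    }

    public static void main(String[] args) {
        Particle[] particles = {
            new Particle(1f, 2f, 0xFF0000FF),
            new Particle(3f, 4f, 0x00FF00FF)
        };
        ByteBuffer fromObjects = allocate(particles.length);
        copyFromObjects(particles, fromObjects);

        ByteBuffer fromBuffer = allocate(particles.length);
        copyFromBuffer(fromObjects, fromBuffer);

        System.out.println("same bytes: " + fromObjects.equals(fromBuffer));
    }
}
```

The per-field loop does one call per field per particle, while the bulk path is a single call regardless of particle count, which is where the speed difference reported above would come from.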

Hi, appreciate more people! Σ ♥ = ¾
theagentd
« Reply #23 - Posted 2011-10-19 06:01:49 »

I'll try that tonight. Got school in ten minutes... FUUUUUUUUUU---
EDIT: I made it in time, somehow. I'm awesome!
So I tried to implement it, submitting my data with glMapBuffer instead of glBufferData, and got a small but noticeable increase in performance. Approximately 61 --> 63 FPS. This is what I'm doing:
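// Each subtask binds its own VBO and uploads its slice of the particles.
for (int i = 0; i < threads; i++) {
    particleSubtasks[i].bindBuffer();
    // Map the bound VBO write-only, passing the previously returned buffer so LWJGL can reuse it.
    ByteBuffer mappedBuffer = GL15.glMapBuffer(GL15.GL_ARRAY_BUFFER, GL15.GL_WRITE_ONLY, particlesPerThread * particleByteSize, particleSubtasks[i].getOldMappedBuffer());

    // Bulk-copy this subtask's particle data into the mapped memory, then unmap before drawing.
    ByteBuffer particleData = particleSubtasks[i].getBuffer();
    mappedBuffer.put(particleData);
    GL15.glUnmapBuffer(GL15.GL_ARRAY_BUFFER);

    // Reset both buffers' positions so they can be reused next frame.
    mappedBuffer.flip();
    particleData.flip(); //So it doesn't crash during the next update xD

    // Interleaved layout: 2 floats of position at offset 0, 4 color bytes at offset 8.
    GL11.glVertexPointer(2, GL11.GL_FLOAT, particleByteSize, 0);
    GL11.glColorPointer(4, GL11.GL_UNSIGNED_BYTE, particleByteSize, 8);

    GL11.glDrawArrays(GL11.GL_POINTS, 0, particlesPerThread);
}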

Anything else I can optimize? =D
EDIT: Now I also reuse the mapped buffer instead of passing null to glMapBuffer... Code updated. No measurable difference though.
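
For readers following along, the interleaved layout implied by the pointer calls above (2 floats at offset 0, 4 unsigned bytes at offset 8) can be sketched with plain java.nio. The 12-byte stride is an inference from those offsets and from the "12MB for 1 million particles" figure earlier in the thread, not something the post states; `ParticleLayoutSketch` and its members are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ParticleLayoutSketch {
    static final int POSITION_OFFSET = 0;               // two floats: x, y
    static final int COLOR_OFFSET = 2 * Float.BYTES;    // 8: four unsigned bytes r, g, b, a
    static final int STRIDE = COLOR_OFFSET + 4;         // 12 bytes per particle (inferred)

    /** Writes one particle at the given index using absolute puts (position is not changed). */
    static void putParticle(ByteBuffer buf, int index, float x, float y,
                            int r, int g, int b, int a) {
        int base = index * STRIDE;
        buf.putFloat(base + POSITION_OFFSET, x);
        buf.putFloat(base + POSITION_OFFSET + Float.BYTES, y);
        buf.put(base + COLOR_OFFSET, (byte) r);
        buf.put(base + COLOR_OFFSET + 1, (byte) g);
        buf.put(base + COLOR_OFFSET + 2, (byte) b);
        buf.put(base + COLOR_OFFSET + 3, (byte) a);
    }

    /** Bytes uploaded per frame if every particle is resent. */
    static long bytesPerFrame(long particleCount) {
        return particleCount * STRIDE;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(2 * STRIDE).order(ByteOrder.nativeOrder());
        putParticle(buf, 0, 10f, 20f, 255, 128, 0, 255);
        putParticle(buf, 1, 30f, 40f, 0, 0, 255, 255);
        System.out.println("x of particle 1: " + buf.getFloat(1 * STRIDE + POSITION_OFFSET));
        System.out.println("bytes per frame for 1,000,000 particles: " + bytesPerFrame(1_000_000));
    }
}
```

With this layout, one million particles come to 12,000,000 bytes per frame, or about 720 MB per second at 60 FPS.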

Riven
« Reply #24 - Posted 2011-10-19 16:28:33 »

Where are you setting the reference you return in getOldMappedBuffer()?

theagentd
« Reply #25 - Posted 2011-10-19 18:15:55 »

Ah, I forgot to do that. -_- I even made a function for setting it in my ParticleSubtask, but forgot to call it. The program works fine, but I can't test the performance. My CPU has entered permanent power saving running at 800MHz, so I'm getting around 22 FPS instead of 64 FPS. I've contacted ASUS, but they'll probably just tell me to send in my computer. Considering my computer is pretty much worthless now, I don't really have a choice. If I have to send it in, I'll be sure to say farewell to everyone on JGO, because I probably won't be able to survive here in Japan for 1-2 months without a computer. Now I'm off to flash my BIOS.
EDIT: ThrottleStop, marry me and have my babies.
