Java-Gaming.org    
How do you handle an indexed VBO with multiple texture coordinates per vertex?
elias4444
Junior Member
« Posted 2010-01-15 01:14:04 »

Cas' sprites thread has me in an optimization frenzy with my own code now. :P

Currently, for my VBOs, I take each triangle from a mesh and systematically dump each vertex along with each color, normal, and texture coordinate into 4 respective arrays (vertex, color, normal, and texture). When updating for animated meshes, I loop through the whole list of vertices (including the duplicates) and update them to their new position (same thing with the other arrays if something changes). Then, for drawing, I bind each one and call glDrawArrays.

Now, I know for a fact that my updates would benefit from not having to loop through so many duplicates for the vertices. However, the texture coordinates are making this difficult. Each vertex can (and often does) have multiple texture coordinates depending on which face is calling it. How do you handle this in an indexed VBO?

Or am I just stuck having to duplicate each vertex over and over?

xinaesthetic
Senior Member
Medals: 1
« Reply #1 - Posted 2010-01-15 01:24:07 »

AFAIK each indexed vertex will have a complete set of attributes for that vertex... if you have two vertices which happen to have the same position / normal / whatever, but some different texture coordinates... then they are not the same vertex.
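
To illustrate the point above, here is a minimal sketch of building an index list where a vertex is shared only when all of its attributes match. The class `IndexedMeshBuilder`, its method names, and the 8-float layout (position, normal, one UV pair) are assumptions made for this example, not code from anyone in the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not from the thread): turns per-corner "triangle soup"
// into a list of unique vertices plus an index list for indexed drawing.
final class IndexedMeshBuilder {
    // Assumed layout per corner: x, y, z, nx, ny, nz, u, v.
    static final int FLOATS_PER_VERTEX = 8;

    final List<float[]> uniqueVertices = new ArrayList<>();
    final List<Integer> indices = new ArrayList<>();
    private final Map<String, Integer> lookup = new HashMap<>();

    // Adds one triangle corner and returns its index. An existing index is
    // reused only when every attribute matches, texture coordinates included.
    int addCorner(float... attrs) {
        if (attrs.length != FLOATS_PER_VERTEX) {
            throw new IllegalArgumentException("expected " + FLOATS_PER_VERTEX + " floats");
        }
        String key = Arrays.toString(attrs);
        Integer index = lookup.get(key);
        if (index == null) {
            index = uniqueVertices.size();
            uniqueVertices.add(attrs.clone());
            lookup.put(key, index);
        }
        indices.add(index);
        return index;
    }

    public static void main(String[] args) {
        IndexedMeshBuilder mesh = new IndexedMeshBuilder();
        // Same position, normal, and UV: both corners share one index.
        mesh.addCorner(0f, 0f, 0f, 0f, 0f, 1f, 0f, 0f);
        mesh.addCorner(0f, 0f, 0f, 0f, 0f, 1f, 0f, 0f);
        // Same position and normal but a different UV: a second vertex is created.
        mesh.addCorner(0f, 0f, 0f, 0f, 0f, 1f, 1f, 0f);
        System.out.println(mesh.uniqueVertices.size() + " unique vertices, indices " + mesh.indices);
    }
}
```

How much this saves depends on the mesh: corners that sit on a UV seam still end up duplicated, while corners inside a smoothly mapped region collapse to one entry.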
princec
JGO Kernel
Medals: 339
Projects: 3
Exp: 16 years
Eh? Who? What? ... Me?
« Reply #2 - Posted 2010-01-15 11:12:45 »

It's recommended you interleave your vertex data in a single buffer aligned to 32 bytes with 32 or 64 byte strides. That way you're not pulling data from 5 different locations for each vertex; it all streams from one spot. My sprite engine uses just a single VBO (I don't even have an index buffer, as I use glDrawArrays calls).

Also - your code in the Sprites! thread uploads data to the VBOs via Java native byte buffers. This is probably the worst way to do it in Java, as it involves manipulating data in system RAM. You don't need to ever touch system RAM - write your data directly to a mapped VBO byte buffer instead (wrapped in FloatBuffers and IntBuffers, naturally). This means data goes straight from your class members and calculations to the card; it doesn't go to system RAM first and then get copied up to the VBO.

And yes, you're stuck with duplicating vertices, but that's ok because a vertex is only 32 bytes, nothing to be scared of :)

Cas :)
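
As a rough illustration of the interleaved layout described above, the sketch below packs position, normal, and one texture coordinate pair into a 32-byte stride using only `java.nio`. The class name, the offsets, and the choice of attributes are assumptions for the example; a real engine with colors or extra UV sets would need a different stride.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical interleaved layout: 3 floats position, 3 floats normal,
// 2 floats texture coordinate = 8 floats = 32 bytes per vertex.
final class InterleavedVertexWriter {
    static final int STRIDE_BYTES = 32;
    static final int POSITION_OFFSET = 0;
    static final int NORMAL_OFFSET = 12;
    static final int TEXCOORD_OFFSET = 24;

    // Allocates a direct buffer in native byte order, large enough for vertexCount vertices.
    static ByteBuffer allocate(int vertexCount) {
        return ByteBuffer.allocateDirect(vertexCount * STRIDE_BYTES).order(ByteOrder.nativeOrder());
    }

    // Writes one complete vertex at its slot using absolute puts,
    // so the buffer's own position is left untouched.
    static void put(ByteBuffer dst, int vertexIndex,
                    float x, float y, float z,
                    float nx, float ny, float nz,
                    float u, float v) {
        int base = vertexIndex * STRIDE_BYTES;
        dst.putFloat(base + POSITION_OFFSET, x);
        dst.putFloat(base + POSITION_OFFSET + 4, y);
        dst.putFloat(base + POSITION_OFFSET + 8, z);
        dst.putFloat(base + NORMAL_OFFSET, nx);
        dst.putFloat(base + NORMAL_OFFSET + 4, ny);
        dst.putFloat(base + NORMAL_OFFSET + 8, nz);
        dst.putFloat(base + TEXCOORD_OFFSET, u);
        dst.putFloat(base + TEXCOORD_OFFSET + 4, v);
    }

    public static void main(String[] args) {
        ByteBuffer vertices = allocate(3);
        put(vertices, 0, 0f, 0f, 0f, 0f, 0f, 1f, 0f, 0f);
        put(vertices, 1, 1f, 0f, 0f, 0f, 0f, 1f, 1f, 0f);
        put(vertices, 2, 0f, 1f, 0f, 0f, 0f, 1f, 0f, 1f);
        System.out.println("bytes used: " + vertices.capacity());
    }
}
```

With this layout, the stride passed to the pointer setup calls would be `STRIDE_BYTES`, and each attribute's pointer offset would be the matching constant.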

elias4444
Junior Member
« Reply #3 - Posted 2010-01-15 17:49:40 »

Hmmm... I haven't done that before (writing directly to a mapped VBO byte buffer). I'm not sure how to pull that off with animated meshes, where I have to interpolate each vertex location per frame. That's why I keep a local FloatBuffer copy, so I can update the vertices and then dump it back into the VBO (I actually need two local copies, so I know the last frame position vs. the next frame position and can interpolate over time). Is there a way around that? Do you have an example?

princec
JGO Kernel
« Reply #4 - Posted 2010-01-15 19:06:05 »

Well, in that case you have even more reason not to use your own system RAM float buffers. Store your data normally in the heap - position A and position B - in float arrays. Each update, interpolate each coordinate between A and B and write the interpolated one to the mapped VBO.

There is also, however, an extension that can do this for you. And you can do something even more cleverer with shaders.

Cas :)
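
The keyframe approach described above could be sketched roughly as follows. This is only an illustration: `KeyframeLerp` and `writeInterpolated` are made-up names, and a plain `FloatBuffer` stands in for the float view of a mapped VBO so the example runs without a GL context. Note that it only ever writes to the destination and never reads from it.

```java
import java.nio.FloatBuffer;

// Hypothetical helper: keyframes live in ordinary heap arrays, and only the
// interpolated result is written to the destination buffer.
final class KeyframeLerp {
    // Writes frameA + (frameB - frameA) * t for every coordinate into dst.
    // t is expected to be in [0, 1]; dst is treated as write-only.
    static void writeInterpolated(float[] frameA, float[] frameB, float t, FloatBuffer dst) {
        if (frameA.length != frameB.length) {
            throw new IllegalArgumentException("keyframes must have the same length");
        }
        dst.clear();
        for (int i = 0; i < frameA.length; i++) {
            dst.put(frameA[i] + (frameB[i] - frameA[i]) * t);
        }
        dst.flip();
    }

    public static void main(String[] args) {
        float[] a = {0f, 2f, 4f};
        float[] b = {2f, 4f, 8f};
        // Stand-in for the float view of a mapped VBO (an assumption of this sketch).
        FloatBuffer target = FloatBuffer.allocate(a.length);
        writeInterpolated(a, b, 0.5f, target);
        System.out.println(target.get(0) + ", " + target.get(1) + ", " + target.get(2));
    }
}
```

In a real renderer the destination would be the buffer returned by the map call, and it would be unmapped before drawing.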

elias4444
Junior Member
« Reply #5 - Posted 2010-01-15 19:54:48 »

I'm actually trying to keep my engine out of shaders these days (I'm already using shaders for optional shadow mapping, bump mapping, and per-pixel lighting). I want to keep animation shader-free so I can support older graphics hardware.

Also, I did just try switching my engine over to using glMapBufferARB, but I'm having a rough time trying to write the data correctly to the mapped ByteBuffer (I'm using bytebuffername.putFloat(), but keep getting garbage back). I must still be doing something wrong.

Anyway, what extension are you talking about? I love learning new stuff and optimizing.

VeaR
Junior Member
« Reply #6 - Posted 2010-01-15 21:11:43 »

I haven't posted here, or worked with Java or gaming, for some time, but here is some old code of mine for doing vertex animation with mapped buffers:

http://code.google.com/p/vlengine/source/browse/trunk/vle_cleanup/src/com/vlengine/scene/animation/x/XSoftSkinner.java
http://code.google.com/p/vlengine/source/browse/trunk/vle_cleanup/src/com/vlengine/renderer/lwjgl/LWJGLRenderer.java#1206

As for "putFloat()", be sure that you are using direct buffers with native byte order. Also, don't ever read back data from a mapped buffer, only use it to send data to the GPU. If you need the data, then store it somewhere else first.
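
A small sketch of the byte-order point, using only `java.nio`: a freshly allocated `ByteBuffer` starts out big-endian regardless of the platform, so the order has to be set explicitly. The class and method names are made up for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper showing the default byte order of a new direct buffer
// and how to switch it to the platform's native order.
final class NativeOrderBuffers {
    // Allocates a direct buffer whose multi-byte puts use the native byte order.
    static ByteBuffer newNativeBuffer(int bytes) {
        return ByteBuffer.allocateDirect(bytes).order(ByteOrder.nativeOrder());
    }

    public static void main(String[] args) {
        ByteBuffer raw = ByteBuffer.allocateDirect(16);
        // A new ByteBuffer always starts as BIG_ENDIAN.
        System.out.println("default order: " + raw.order());

        ByteBuffer fixed = newNativeBuffer(16);
        System.out.println("native order:  " + fixed.order());

        // Views created after order(...) inherit that order.
        fixed.asFloatBuffer().put(0, 1.0f);
        System.out.println("first float:   " + fixed.getFloat(0));
    }
}
```

The same `order(ByteOrder.nativeOrder())` call applies to a buffer handed back by a map call, if the binding does not already set it.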
princec
JGO Kernel
« Reply #7 - Posted 2010-01-15 21:24:39 »

Make sure that buffer has order(ByteOrder.nativeOrder()) called on it first ;) I cannot fathom what feckwit designer decided that native bytebuffers would be created by default in big endian order instead of native order. I spent 5 hours staring at a black screen last night until I suddenly twigged.

I forget the name of the extension that did the animation interpolation... ARB_vertex_blend or something. I think it never really caught on because vertex shaders came along.

Cas :)

elias4444
Junior Member
« Reply #8 - Posted 2010-01-15 23:00:48 »

Ok, this is strange. I finally got it set up (yes, the byte order was what was messing it up, thanks). I'm using direct buffers and all, but it's actually running a bit slower than when I went through system memory. I've timed the different command calls, and the one that's hurting the most is actually: ARBVertexBufferObject.glUnmapBufferARB(ARBVertexBufferObject.GL_ARRAY_BUFFER_ARB). If I take that out, it runs a little faster than what I had before... of course, that would be sloppy of me now, wouldn't it?

princec
JGO Kernel
« Reply #9 - Posted 2010-01-15 23:50:41 »

You cannot actually draw anything using a mapped buffer. You must unmap it before calling glDrawArrays etc. or it's not supposed to work at all!

Cas :)

elias4444
Junior Member
« Reply #10 - Posted 2010-01-16 00:31:07 »

That is actually in the update routine, not the drawing. I haven't changed the draw routine at all.

Quote
You cannot actually draw anything using a mapped buffer. You must unmap it before calling glDrawArrays etc. or it's not supposed to work at all!
So I tried it... :P If I get rid of the unmap call, it matches speed with my previous method (and still draws even though I haven't unmapped).

It also doesn't seem to matter how I allocate the ByteBuffer. Once I call glMapBufferARB the first time, the function apparently builds its own ByteBuffer with whatever the LWJGL "default" is.

I'm still looking into the ARB_VERTEX_BLEND method, but it sounds like you were right about it moving to shaders.

EDIT:
Running my old method (loop through each vertex, interpolate, put() it into a FloatBuffer, then glBufferDataARB the buffer into the VBO): 1170fps
New method (open direct buffer, loop through and interpolate each vertex and putFloat() it into the VBO, then unmap buffer when done): 950fps

These were both done with the commonly available Dr. Freak MD2 model, with texturing, shadow mapping, and per-pixel lighting enabled (and bump mapping enabled, but no bump map available). A small scene of cubes (for shadows to drop on), and some font objects to see the current FPS were included as well.


princec
JGO Kernel
« Reply #11 - Posted 2010-01-16 00:48:25 »

Have you set the parameters of the VBO correctly for this scenario? (GL_STREAM_DRAW_ARB, GL_WRITE_ONLY_ARB)

Cas :)

elias4444
Junior Member
« Reply #12 - Posted 2010-01-16 01:26:14 »

My initial data-buffering call:
ARBVertexBufferObject.glBufferDataARB(ARBVertexBufferObject.GL_ARRAY_BUFFER_ARB, buffer, ARBVertexBufferObject.GL_STREAM_DRAW_ARB);

My mapbuffer call:
glMapBufferARB(ARBVertexBufferObject.GL_ARRAY_BUFFER_ARB, ARBVertexBufferObject.GL_WRITE_ONLY_ARB, mapped_buffer)

The initial buffering call doesn't seem to make a difference. Here are some run cases:
GL_STREAM_DRAW_ARB: 900fps
GL_DYNAMIC_DRAW_ARB: 900fps

I'm wondering if this has something to do with the way LWJGL uses a ByteBuffer to "map" the VBO.

Spasi
« Reply #13 - Posted 2010-01-16 11:30:03 »

You can read this thread for an explanation of what LWJGL does when you call glMapBuffer. I would also highly recommend grabbing a nightly build and using the new glMapBuffer API (with an explicit length argument); it should be faster.

Elias4444, I think you should do testing on more complex scenes. At 900fps you're basically doing the equivalent of a microbenchmark, you cannot reliably measure differences between rendering techniques.
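
One way to see why such high frame rates are hard to compare is to convert them to time per frame. The sketch below does that arithmetic for the 1170fps and 950fps figures quoted earlier in the thread; the class and method names are made up for the example.

```java
// Hypothetical helper: converts frames per second into milliseconds per frame
// so that two measurements can be compared as an absolute time cost.
final class FrameTime {
    static double millisPerFrame(double fps) {
        return 1000.0 / fps;
    }

    // Extra milliseconds spent per frame when going from fpsBefore to fpsAfter.
    static double costDeltaMillis(double fpsBefore, double fpsAfter) {
        return millisPerFrame(fpsAfter) - millisPerFrame(fpsBefore);
    }

    public static void main(String[] args) {
        double before = 1170.0; // figure reported for the glBufferDataARB method
        double after = 950.0;   // figure reported for the mapped-buffer method
        System.out.printf("before: %.3f ms/frame%n", millisPerFrame(before));
        System.out.printf("after:  %.3f ms/frame%n", millisPerFrame(after));
        System.out.printf("delta:  %.3f ms/frame%n", costDeltaMillis(before, after));
    }
}
```

The 220fps gap works out to roughly 0.2 ms per frame, which is small enough that timer resolution, driver behaviour, or background load could plausibly account for a good part of it.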

Cas, it's very interesting that you got such a huge performance boost simply from switching to mapped VBOs. Could you please confirm that it's reliable? For example, have you tested on different GPUs (from a different vendor mainly)?
princec
JGO Kernel
« Reply #14 - Posted 2010-01-16 11:37:22 »

Only got nvidias everywhere here. Won't really be able to get around to testing various configurations until beta in a few weeks' time. I'll wait to get a proper release of LWJGL rather than experiment with nightlies.

I wouldn't really call it a huge performance boost, just about 2-3x more sprites.

Cas :)

elias4444
Junior Member
« Reply #15 - Posted 2010-01-16 17:01:16 »

Wow, both the great Spasi and Cas have responded to my thread... I'm honored! ;D

So, as I understand this:
Quote
b) Don't use MapBuffer. Tbh, when I was working on Marathon I don't think I had used VBO mapping more than once. I think I dropped it after a while too. Especially for uniform buffer objects I'd use BufferSubData instead, I think it will be much faster. Mapped buffers are too sensitive to implementation details, it's quite hard to get the driver to behave and provide the expected performance. Also, doing it in Java is kinda awkward (any API that returns void pointers is bound to be).
I was already doing it the best way by using glBufferData calls over the mapbuffer method?

BTW, I discovered a few months back that I get a large performance boost by using glBufferData over glBufferSubData. Of course, this may be because I'm using smaller VBOs for the most part (I try to keep my scenes very simple).

Quote
Elias4444, I think you should do testing on more complex scenes. At 900fps you're basically doing the equivalent of a microbenchmark, you cannot reliably measure differences between rendering techniques.
True. I just haven't had time to put something more complex together. With all this optimization work I'm doing though, maybe it's time I built a little LWJGL benchmarker for different methods.

elias4444
Junior Member
« Reply #16 - Posted 2010-01-27 17:00:02 »

Ok, I've been running my new little Tommy Engine benchmarker on all sorts of machines for a few days now. I also updated it to use LWJGL 2.2.2.

New results on my personal machine:
Running my old method (loop through each vertex, interpolate, put() it into a FloatBuffer, then glBufferDataARB the buffer into the VBO): 575fps
New method (open direct buffer, loop through and interpolate each vertex and putFloat() it into the VBO): 512fps

If I close the direct buffers each frame, I lose another 50 to 75 fps (but for some reason, I can just keep it open, and they all seem to behave themselves).

Looks like I'll stick with the old method.

