WhiteHexagon
Jr. Member   Posts: 51
« on: 2006-07-04 18:31:40 »
Hi All,

I'm just testing out a profiler to try and get some more performance out of my game. I can already see that I need to add some more Display Lists for parts of the rendering, but one thing that surprised me was the following:

    method name                                            time (ms)     %    invocation count
    Model.draw(GL, int, int, int, boolean)                    24,687  93 %             426,877
    com.sun.opengl.impl.GLImpl.glBegin(int)                    4,937  19 %           2,561,262
    com.sun.opengl.impl.GLImpl.glEnd()                         3,265  12 %           2,561,262
    MyUtils.calcNormal(float[], float[], float[])              2,812  11 %           2,561,262
    com.sun.opengl.impl.GLImpl.glNormal3fv(float[], int)       2,453   9 %           2,561,262
    ...some other calls

Are glBegin and glEnd really so expensive? It seems so. Anyway, I can fix this issue no problem, I just thought the numbers were quite interesting.

Cheers
Peter
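A note on the profile above: calcNormal and glBegin each run 2,561,262 times, exactly six per Model.draw call, which suggests the face normals are recomputed on every draw. A minimal sketch of computing them once and reusing them, assuming an axis-aligned box and a cross-product normal like the calcNormal in the profile. The class BoxNormals and its precompute method are hypothetical names, not part of the original code:

```java
/** Hypothetical helper: computes the six face normals of a box once so they can be reused. */
final class BoxNormals {

    /** Unit normal of the triangle (a, b, c), from the cross product (b - a) x (c - a). */
    static float[] calcNormal(float[] a, float[] b, float[] c) {
        float ux = b[0] - a[0], uy = b[1] - a[1], uz = b[2] - a[2];
        float vx = c[0] - a[0], vy = c[1] - a[1], vz = c[2] - a[2];
        float nx = uy * vz - uz * vy;
        float ny = uz * vx - ux * vz;
        float nz = ux * vy - uy * vx;
        float len = (float) Math.sqrt(nx * nx + ny * ny + nz * nz);
        if (len == 0f) {
            return new float[] { 0f, 0f, 0f }; // degenerate triangle: no defined normal
        }
        return new float[] { nx / len, ny / len, nz / len };
    }

    /** Outward normals in the order: y = 0, y = w, x = 0, x = l, base, top. */
    static float[][] precompute(float l, float w, float h) {
        float[] p000 = { 0, 0, 0 }, p100 = { l, 0, 0 }, p010 = { 0, w, 0 }, p110 = { l, w, 0 };
        float[] p001 = { 0, 0, h }, p101 = { l, 0, h }, p011 = { 0, w, h }, p111 = { l, w, h };
        return new float[][] {
            calcNormal(p000, p100, p101), // side at y = 0
            calcNormal(p110, p010, p011), // side at y = w
            calcNormal(p010, p000, p001), // side at x = 0
            calcNormal(p100, p110, p111), // side at x = l
            calcNormal(p000, p110, p100), // base
            calcNormal(p001, p101, p111), // top
        };
    }
}
```

The result could be stored in a static field at start-up, so the per-frame code only passes the cached arrays to glNormal3fv and allocates nothing.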
whitehexagon.com - Building a new game world, one brick at a time.

emzic
« Reply #1 on: 2006-07-05 04:15:50 »
what profiler did you use?

Spasi
JGO Ninja   Posts: 566 Medals: 22
Molon Lave
« Reply #2 on: 2006-07-05 04:27:45 »
In immediate mode rendering the real work happens on glBegin and glEnd. Usually everything in between is buffered and submitted as a batch at glEnd. The glBegin overhead is probably GL state validation, pipeline flushing, etc.

WhiteHexagon
Jr. Member   Posts: 51
« Reply #3 on: 2006-07-05 04:28:43 »
YourKit - it's a bit pricey, but they do a 30-day evaluation for free.

WhiteHexagon
Jr. Member   Posts: 51
« Reply #4 on: 2006-07-05 05:07:24 »
Thanks Spasi, that would explain why none of the other gl methods show up too, but it makes it kinda hard to know what's slowing things down. Anyway, I got my scene rendering 3x faster already, but maybe someone has some tips for a novice OGL programmer.

The specific scene I was having problems with was displaying only around 5600 lego style bricks. Not that much to ask I thought, but it was really killing my fps.

    loop 5600 times:
        gl.glPolygonMode(GL.GL_FRONT_AND_BACK, GL.GL_FILL);
        drawSingleBrick using GL_QUADS and textured.
        gl.glPolygonMode(GL.GL_FRONT_AND_BACK, GL.GL_LINE);
        drawSingleBrick using GL_LINES

I've now split this into two iterations (one for solid drawing, one for wireframe highlight drawing), each of which gets compiled into a display list. Then I just call the two display lists. This approach seems to triple the performance, but I know that some of this is because the normal calculations and color lookups are only done once, during compilation.

But I'm still a bit confused. I would have thought that with a display list the code is then on the gcard, but my CPU is still maxing out at 98% during rendering. Is JOGL running these lists on the CPU? Should I be using VertexArrays for this type of stuff? Or would I have the same problem?

Any tips are really appreciated.

Cheers
Peter

java (build 1.5.0_06-b05)
jogl beta4
Win2k
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=GeForce 6200/AGP/SSE2
DRAWABLE_GL=com.sun.opengl.impl.GLImpl

WhiteHexagon
Jr. Member   Posts: 51
« Reply #6 on: 2006-07-05 07:42:01 »
Well, it's quite a low spec gcard, but I was down to 7fps, back to 20fps now after the changes, but I think I'm CPU bound for some reason. Thanks for the link.

This is the code for drawing the solid bricks (I call the same code for the wireframe, but using GL_LINES):

    gl.glBindTexture(GL.GL_TEXTURE_2D, textureStud);
    gl.glEnable(GL.GL_TEXTURE_2D);
    gl.glPolygonMode(GL.GL_FRONT_AND_BACK, GL.GL_FILL);

    loop 5600 times
        float tx = drawQx + (QUEST_TILE_LENGTH * x);
        float ty = drawQy + drawY;
        int height = mapData[qx][qy].height[x][y];
        float tz = ClientConstants.STD_BRICK_HEIGHT * height;
        gl.glColor3fv(questColorTable[height + -CorkConstants.MIN_MAP_HEIGHT], 0);
        gl.glTranslatef(tx, ty, tz);
        gl.glBegin(GL.GL_QUADS);
        drawQuestTile(gl, false);
        gl.glEnd();

        gl.glTranslatef(-tx, -ty, -tz);

And this is for a single brick:

    private static final void drawQuestTile(final GL gl, final boolean wireframe) {
        float BASE = 0.0f;
        float HEIGHT = STD_BRICK_HEIGHT;
        float WIDTH = 4;
        float LENGTH = 4;
        float[] normal;

        // side at y = 0
        normal = Utils.calcNormal(new float[] { 0, 0, BASE }, new float[] { LENGTH, 0, BASE }, new float[] { LENGTH, 0, HEIGHT });
        gl.glNormal3fv(normal, 0);
        gl.glVertex3f(0, 0, BASE);
        gl.glVertex3f(LENGTH, 0, BASE);
        gl.glVertex3f(LENGTH, 0, HEIGHT);
        gl.glVertex3f(0, 0, HEIGHT);

        // side at y = WIDTH
        normal = Utils.calcNormal(new float[] { LENGTH, WIDTH, BASE }, new float[] { 0, WIDTH, BASE }, new float[] { 0, WIDTH, HEIGHT });
        gl.glNormal3fv(normal, 0);
        gl.glVertex3f(LENGTH, WIDTH, BASE);
        gl.glVertex3f(0, WIDTH, BASE);
        gl.glVertex3f(0, WIDTH, HEIGHT);
        gl.glVertex3f(LENGTH, WIDTH, HEIGHT);

        // side at x = 0
        normal = Utils.calcNormal(new float[] { 0, WIDTH, BASE }, new float[] { 0, 0, BASE }, new float[] { 0, 0, HEIGHT });
        gl.glNormal3fv(normal, 0);
        gl.glVertex3f(0, WIDTH, BASE);
        gl.glVertex3f(0, 0, BASE);
        gl.glVertex3f(0, 0, HEIGHT);
        gl.glVertex3f(0, WIDTH, HEIGHT);

        // side at x = LENGTH
        normal = Utils.calcNormal(new float[] { LENGTH, 0, BASE }, new float[] { LENGTH, WIDTH, BASE }, new float[] { LENGTH, WIDTH, HEIGHT });
        gl.glNormal3fv(normal, 0);
        gl.glVertex3f(LENGTH, 0, BASE);
        gl.glVertex3f(LENGTH, WIDTH, BASE);
        gl.glVertex3f(LENGTH, WIDTH, HEIGHT);
        gl.glVertex3f(LENGTH, 0, HEIGHT);

        // base
        normal = Utils.calcNormal(new float[] { 0, 0, 0 }, new float[] { LENGTH, 0, 0 }, new float[] { LENGTH, WIDTH, 0 });
        gl.glNormal3fv(normal, 0);
        gl.glVertex3f(0, 0, BASE);
        gl.glVertex3f(LENGTH, 0, BASE);
        gl.glVertex3f(LENGTH, WIDTH, BASE);
        gl.glVertex3f(0, WIDTH, BASE);

        // top (the only textured face)
        normal = Utils.calcNormal(new float[] { 0, 0, HEIGHT }, new float[] { LENGTH, 0, HEIGHT }, new float[] { LENGTH, WIDTH, HEIGHT });
        gl.glNormal3fv(normal, 0);
        if (!wireframe) gl.glTexCoord2f(0.0f, 0.0f);
        gl.glVertex3f(0, 0, HEIGHT);
        if (!wireframe) gl.glTexCoord2f(QUEST_TILE_STUD_COUNT, 0.0f);
        gl.glVertex3f(LENGTH, 0, HEIGHT);
        if (!wireframe) gl.glTexCoord2f(QUEST_TILE_STUD_COUNT, QUEST_TILE_STUD_COUNT);
        gl.glVertex3f(LENGTH, WIDTH, HEIGHT);
        if (!wireframe) gl.glTexCoord2f(0.0f, QUEST_TILE_STUD_COUNT);
        gl.glVertex3f(0, WIDTH, HEIGHT);
    }

Spasi
JGO Ninja   Posts: 566 Medals: 22
Molon Lave
« Reply #7 on: 2006-07-05 10:37:58 »
There are three reasons for the low performance you're seeing:
1. drawQuestTile is making immediate mode calls (glVertex/Normal/TexCoord), which is the slowest way to submit vertices. You're creating a lot of arrays too, which contributes to the bad performance. With display lists, consider this problem solved.
2. You're submitting too many low polygon batches. 5600 objects is a big number, even for a high-end CPU. The overhead of each draw call is considerably larger than the GPU effort to render six quads. The GPU is basically sitting idle and waiting for the CPU. You may be able to solve this by packing groups of bricks (say 100 at a time) in a vertex array and drawing all of them at once.
3. GL_LINE drawing is not hardware accelerated on consumer-level GPUs.
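Point 2 above (packing groups of bricks into one vertex array per draw) can be sketched in plain Java. This is a minimal illustration under stated assumptions: unit-cube bricks, positions only, and heap buffers. BrickBatcher and its members are hypothetical names, and real GL code would normally need direct buffers plus normals and texture coordinates:

```java
import java.nio.FloatBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: packs bricks into fixed-size batches, one buffer per draw call. */
final class BrickBatcher {
    static final int FLOATS_PER_VERTEX = 3;
    static final int VERTS_PER_BRICK = 24; // 6 quads x 4 vertices

    // Corners of a unit cube and the four corner indices of each quad.
    private static final float[][] CORNERS = {
        { 0, 0, 0 }, { 1, 0, 0 }, { 1, 1, 0 }, { 0, 1, 0 },
        { 0, 0, 1 }, { 1, 0, 1 }, { 1, 1, 1 }, { 0, 1, 1 }
    };
    private static final int[][] FACES = {
        { 0, 1, 5, 4 }, { 2, 3, 7, 6 }, { 3, 0, 4, 7 },
        { 1, 2, 6, 5 }, { 0, 3, 2, 1 }, { 4, 5, 6, 7 }
    };

    /** positions[i] is the {x, y, z} origin of brick i; each returned buffer is one batch. */
    static List<FloatBuffer> build(float[][] positions, int bricksPerBatch) {
        List<FloatBuffer> batches = new ArrayList<>();
        for (int start = 0; start < positions.length; start += bricksPerBatch) {
            int end = Math.min(start + bricksPerBatch, positions.length);
            FloatBuffer buf = FloatBuffer.allocate(
                    (end - start) * VERTS_PER_BRICK * FLOATS_PER_VERTEX);
            for (int b = start; b < end; b++) {
                for (int[] face : FACES) {
                    for (int corner : face) {
                        // Bake the translation into the vertex instead of calling glTranslatef.
                        buf.put(positions[b][0] + CORNERS[corner][0]);
                        buf.put(positions[b][1] + CORNERS[corner][1]);
                        buf.put(positions[b][2] + CORNERS[corner][2]);
                    }
                }
            }
            buf.flip(); // ready for reading, e.g. by a vertex pointer call
            batches.add(buf);
        }
        return batches;
    }
}
```

With 5600 bricks and 100 bricks per batch this yields 56 buffers, so 56 draw calls instead of 5600.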

Spasi
JGO Ninja   Posts: 566 Medals: 22
Molon Lave
« Reply #8 on: 2006-07-05 10:44:06 »
For more details about #2, google for "instancing". It's the method provided by Direct3D to solve this problem. OpenGL does not support it because GL's overhead is generally much lower than D3D's, but it's still a problem in situations like yours. A technique called "pseudo-instancing" can be used in OpenGL, but that requires vertex shaders, which is probably too advanced for you right now.

WhiteHexagon
Jr. Member   Posts: 51
« Reply #9 on: 2006-07-05 11:48:17 »
Thanks for the great feedback!

#1 The code I posted also now has two display lists wrapping it, one for the solid bricks and one for the wireframe. That's where I got my first increase from 7fps to 20fps, but I still suffer 100% CPU load. Maybe I try array lists instead? Or do you think this might be a jogl issue? From my reading I thought Display Lists were compiled on the GPU and then used directly from there, so shouldn't I be getting almost 0 CPU load?

#2 If I'm using Display Lists or Array Lists I assume this is no longer an issue? I'd like to look more at vertex shaders in the near future, it sounds very powerful, but my priority is to get something basically playable and then start to improve it. I didn't want to worry too much about the performance, but the client has gone down from 60fps a few months ago to yesterday's low of 5fps, so I thought I'd better take a break and find out where the problem was before I go too far down the wrong path. The info here is really helping, thanks.

#3 Interesting. Is there another, better technique for highlighting the edges of the bricks instead of just drawing a black wireframe over the brick? (It doesn't seem to work very well anyway because of the zbuffer, unless I make the line width equal to 2.) You can see the effect I have on my front page: http://whitehexagon.com

Thanks
Peter

Spasi
JGO Ninja   Posts: 566 Medals: 22
Molon Lave
« Reply #10 on: 2006-07-05 13:44:28 »
Quote: "#1 The code I posted also now has two display lists wrapping it, one for the solid bricks and one for the wireframe, that's where I got my first increase from 7fps to 20fps, but still suffer 100% CPU load. Maybe I try array lists instead? or do you think this might be a jogl issue? From my reading I thought Display Lists were compiled on the GPU and then used directly from there, so shouldn't I be getting almost 0 CPU load?"

Yes, DLs are compiled and (probably) stored on the GPU and they are very fast. You've solved this problem; I just wanted to help you understand why this was an issue before moving to DLs. #2 is your big problem now.

Quote: "#2 If I'm using Display Lists or Array Lists I assume this is no longer an issue?"

Unfortunately it is. Actually, it's a problem no matter how you're rendering (DLs, VBOs, vertex arrays). It is caused by the pipelined way GPUs work. Each time you make a draw call, a lot of stuff happens (from state validation to, worst case, pipeline flushing). The overhead of each such call piles up to the point that the CPU struggles to keep up with the GPU (and usually fails). The problem isn't how you're rendering, but the massive number of 5600 draw calls.

FYI, most of the redesign in DX10 was done because of exactly this problem. Even the new geometry shaders, apart from the unique possibilities they offer, are meant to improve this situation.

So, you have to accept the fact that you can't possibly make 5600 draw calls. You either design your game around that, or use techniques like the one I described in my previous post. IIRC, current GPUs work better in batches of more than 500-1000 triangles, and the number of draw calls should be lower than 1000 (there are certain papers that have exact numbers).

Quote: "#3 Interesting. Is there another better technique for highlighting the edges of the bricks instead of just drawing a black wireframe over the brick. (it doesn't seem to work very well anyway because of the zbuffer unless I make the line width equal to 2). You can see the effect I have on my front page: http://whitehexagon.com"

Yeah, I know what you mean. In our terrain editor, I've solved the artifacts problem with a vertex shader (it can be done without a VS). I'm just pushing the terrain grid by a small amount towards the normal of each vertex and it looks great. I couldn't be bothered to search for non-GL_LINE line rendering (I'm using lines only in the editor for the terrain grid and debugging), but I think there are certain techniques you can investigate (using clever texturing IIRC).
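The "push the grid along the normal" idea mentioned above can also be done on the CPU when building the line geometry. A minimal sketch, assuming positions and unit normals are stored as parallel xyz arrays; NormalOffset and push are hypothetical names:

```java
/** Hypothetical helper: moves each vertex a small distance along its unit normal. */
final class NormalOffset {

    /**
     * Returns a copy of the xyz positions, each displaced by amount along the
     * matching xyz normal. Both arrays must have the same length.
     */
    static float[] push(float[] positions, float[] normals, float amount) {
        if (positions.length != normals.length) {
            throw new IllegalArgumentException("positions and normals must match in length");
        }
        float[] out = new float[positions.length];
        for (int i = 0; i < positions.length; i++) {
            out[i] = positions[i] + normals[i] * amount;
        }
        return out;
    }
}
```

Drawing the outline from the displaced copy keeps the lines slightly in front of the solid faces, which is what avoids the z-buffer fighting. The right offset depends on the scene scale and depth range, so the amount would need tuning.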

darkprophet
JGO Neuromancer   Posts: 1171
Go Go Gadget Arms
« Reply #11 on: 2006-07-05 13:50:44 »
You could use a vertex/fragment shader couple to do edge detection and blend that over the scene...GPGPU has a few code snippets about edge detection filter on the GPU
DP

kitfox
Full Member   Posts: 135
Java games rock!
« Reply #12 on: 2006-07-05 14:29:53 »
I'm working on a terrain editor myself. At the moment, I'm sending everything to the graphics card with individual calls to glVertex, glNormal and glTexCoord, and getting a pretty decent frame rate. Even when I throw in GL_LINEs to highlight the edges in my editor, the performance is reasonable. I don't think I have a super speedy machine, but 3200 triangles the slow way seems to be working for me.
Anyhow, I wrote it this way just to let me start debugging things quickly. I plan to move everything to VBOs. Now, will my VBO render faster if I write an algorithm to stripify my terrain, or should I just leave them as individual triangles?
I'm implementing a ROAM style algorithm, and it seems to be working, but I'm getting odd artifacts on the tessellated terrain. When smooth shaded with normals and a solid color, there are these star shapes surrounding concave or convex vertices. I'm pretty sure the normals are correct, but having these shapes in an otherwise smooth terrain breaks the visual continuity. Do I need to tweak the normals somehow?
I'm also curious about your idea of using a vertex shader to display geometry. Does this mean you would just upload a square grid of points as a single object and write a clever vertex shader to fold it into a terrain shape? Does this really give faster performance? How do you adjust for level of detail?

WhiteHexagon
Jr. Member   Posts: 51
« Reply #13 on: 2006-07-05 17:21:51 »
#2 For those numbers, what do you define as a draw call? Is one draw call = one gl.glDoSomething method, or one Begin/End block? Would splitting the display list into 5 smaller lists help in any way? Or is it really down to the number of gl.glDoSomething calls inside the display list? I guess that is currently 5600 * number of calls in drawQuestTile (35ish?) = 196,000.

I think I'm missing something here, so I shall do some more reading on this topic, because I think I need somehow to solve having this many bricks on screen, if not for the landscape then for sure once I have all the other scenery and creatures on screen. Maybe I will also try some VertexArrays here; I use them already for another part of the game and they seem quite performant. Since I only need one surface of the brick textured, maybe I can also draw those surfaces separately and render the sides of the bricks untextured. Lots of ideas... But it's been a long day and my head is buzzing with all this.

Thanks to everyone for the feedback. I shall have another read of this over the weekend and hopefully all will be clearer.

Niwak
Full Member   Posts: 111
« Reply #14 on: 2006-07-06 05:06:12 »
To WhiteHexagon: A draw call in your situation is "gl.glCallList". It doesn't matter how many glVertex,... calls are in the display list. The thing you should minimize is the number of gl.glCallList calls. Therefore splitting the display list is not a solution; you would make things worse.

Regarding your display list, there are "good habits" given by card manufacturers that perhaps you are not applying. Here are some:
- Don't perform state changes in a display list (like glBindTexture, glTranslate,...). This can render the display list rather inefficient, since it forces the driver to perform a state validation when you call the display list even if you did not change the state.
- Use a uniform vertex format, i.e. when you specify a vertex you should always provide the same information (for example: a normal + a texture coordinate + a vertex). You are not doing this, since in your snippet normals are specified once per face, texture coordinates just for one of the faces, and color seems to be one per model.

Anyway, your model is composed of only 6x4 = 24 vertices, which is very low. I'm not sure using 5600 display lists is a very efficient technique. You could try to create one FloatBuffer, put in it interleaved data for all your blocks, and when a block moves, just update its coordinates directly in the FloatBuffer and submit this to the GPU with a single glDrawArrays call. I think you would get a fairly higher frame rate (at least if not all blocks are moved each frame).
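The single interleaved FloatBuffer with in-place updates could look roughly like this. A minimal sketch, assuming a uniform per-vertex layout of 2 texture-coordinate floats, 3 normal floats and 3 position floats (the ordering OpenGL names GL_T2F_N3F_V3F); InterleavedBricks and translateBrick are hypothetical names, and a real renderer would normally allocate a direct buffer:

```java
import java.nio.FloatBuffer;

/** Hypothetical sketch: one interleaved buffer for all bricks, updated in place. */
final class InterleavedBricks {
    static final int STRIDE = 8;          // 2 texcoord + 3 normal + 3 position floats
    static final int POSITION_OFFSET = 5; // position follows texcoord and normal
    static final int VERTS_PER_BRICK = 24;

    final FloatBuffer data;

    InterleavedBricks(int brickCount) {
        // Heap buffer for illustration only; contents start zeroed.
        data = FloatBuffer.allocate(brickCount * VERTS_PER_BRICK * STRIDE);
    }

    /** Moves one brick by (dx, dy, dz), touching only that brick's position floats. */
    void translateBrick(int brick, float dx, float dy, float dz) {
        int first = brick * VERTS_PER_BRICK * STRIDE;
        for (int v = 0; v < VERTS_PER_BRICK; v++) {
            int p = first + v * STRIDE + POSITION_OFFSET;
            data.put(p, data.get(p) + dx);
            data.put(p + 1, data.get(p + 1) + dy);
            data.put(p + 2, data.get(p + 2) + dz);
        }
    }
}
```

Because every vertex carries the same attributes, the whole buffer can be submitted in one draw, and moving a brick rewrites 72 floats rather than rebuilding anything.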
To KitFox: I have spent some time implementing a terrain algorithm for my game. In this process, I initially tried ROAM. The result was that it was somewhat inefficient; the fact that you have to generate a new index array for each frame, with all the stripping problems, made it too CPU intensive for my game. I have moved to a very straightforward system similar to geomipmapping, which performs really well and was really easier and faster to implement. So, before wasting too much time on ROAM, I would suggest quickly trying a brute force system like geomipmapping to see if it fits your needs.
Vincent

cylab
JGO Kernel   Posts: 1909 Medals: 24
« Reply #15 on: 2006-07-06 05:08:09 »
Quote: "Is there another better technique for highlighting the edges of the bricks instead of just drawing a black wireframe over the brick."

You could texture your quads using an image that contains the highlight lines.
Mathias - I Know What [you] Did Last Summer!

WhiteHexagon
Jr. Member   Posts: 51
« Reply #16 on: 2006-07-07 17:54:28 »
Hi All,

I've done some work on this the last couple of days with some interesting results. I tried Niwak's idea of using a vertex array. For code simplicity I split the rendering into two VAs: one for the brick tops with a studded texture, and another VA for the brick sides (including the edges drawn into the texture as cylab suggested).

So the results: when displaying just the tops of the bricks the VA runs as I would expect, 5% CPU and around 60fps (I presume this is just limited by the vsync rate of my TFT, which is currently also 60). Is there a way to disable that? I remember with GL4Java there used to be a special call to disable the vsync limit; does JOGL have something similar?

When I try to display the second VA (the sides) as well, the CPU jumps up to 100% and the frame rate drops to 30fps (but that's still better than the 20% from using display lists!)

So I took out the rendering of the tops of the bricks for now, and just have the single vertex array for the sides of the bricks to try and optimize that. I found that by adjusting the count parameter on glDrawArrays I could find some switchover point where I start to become CPU bound. I'm drawing 2025 bricks, which worked out to about 32,400 vertices.

    vertices   CPU %
    10000        6
    15000       15
    17000       20
    20000       25
    21000       50
    22000       90
    23000       98

So it seems I can only draw around half the brick sides I need taking this approach. I was thinking that maybe I could use a QUAD_STRIP for the 4 sides, which would reduce the vertex count from 16 down to 10 per brick. Or I could even try to calculate a quad strip for a complete row of bricks across the whole map.

Question: would I still be able to change the color of each face using a quad strip, or will I end up with just a mess of blended colors because the vertices are shared? How would that work with textures if I wanted different textures per face? I realise I can have both textures in a single 256x256 texture and just cut out the piece I need for each face, but how would I specify this while drawing a quad strip? It seems impossible.

Anyway, overall things are getting better. I'm just still confused over why the VA rendering starts to impact the CPU performance so drastically, and not even linearly.

Cheers
Peter
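The 16-to-10 vertex saving from a quad strip can be made concrete. A minimal sketch, assuming the four sides are walked corner by corner and each corner emits a bottom and a top vertex, with the first corner repeated to close the loop; SideStrip and its methods are hypothetical names:

```java
/** Hypothetical sketch: the four sides of a box as one 10-vertex quad strip. */
final class SideStrip {

    /** Returns xyz triples: bottom/top pairs for 5 corners (the first is repeated). */
    static float[] build(float length, float width, float height) {
        float[][] corners = {
            { 0, 0 }, { length, 0 }, { length, width }, { 0, width }, { 0, 0 }
        };
        float[] out = new float[corners.length * 2 * 3];
        int i = 0;
        for (float[] c : corners) {
            out[i++] = c[0]; out[i++] = c[1]; out[i++] = 0f;     // bottom vertex
            out[i++] = c[0]; out[i++] = c[1]; out[i++] = height; // top vertex
        }
        return out;
    }

    /** Total vertices submitted for a given brick count and per-brick vertex count. */
    static int vertexCount(int bricks, int vertsPerBrick) {
        return bricks * vertsPerBrick;
    }
}
```

For the 2025 bricks mentioned above, vertexCount(2025, 16) gives 32,400 and vertexCount(2025, 10) gives 20,250. Note that the shared vertices in a strip also share their attributes, which is the trade-off the question about per-face colors is getting at.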

Spasi
JGO Ninja   Posts: 566 Medals: 22
Molon Lave
« Reply #17 on: 2006-07-08 05:50:44 »
Hi WhiteHexagon, I told you what to do in my second post:

Quote: "You may be able to solve this by packing groups of bricks (say 100 at a time) in a vertex array and drawing all of them at once."

There are two pieces of information in that sentence: a) use vertex arrays and b) don't pack all the bricks in a single VA, but rather groups of them.

bahuman
Full Member   Posts: 145
« Reply #18 on: 2006-07-08 08:27:56 »
Quote: "So the results: when displaying just the tops of the bricks the VA run as I would expect, 5% CPU and around 60fps (I presume this is just limited by the vsync rate of my TFT which is currently also 60). Is there a way to disable that? I remember with GL4Java there used to be a special call to disable the vsync limit, does JOGL have something similar?
So when I try to display the second VA (the sides) as well, the CPU jumps up to 100% and the frame rate drops to 30fps (but that's still better than the 20% from using display lists!)
So I took out the rendering of the tops of the bricks for now, and just have the single vertex array for the sides of the bricks to try and optimize that. I found that by adjusting the count parameter on glDrawArrays I could find some switchover point where I start to become CPU bound. I'm drawing 2025 bricks which worked out to about 32,400 vertices."

When you try the same measurement with only the tops of the bricks, do you get the same result?

Also: drawing each brick separately is not the most efficient. You can easily optimize this by constructing new display lists each time the user attaches a new brick to the construction. For example, if the user built a wall, you can put the entire wall in a display list, even if the layout of the wall may change within the next 30 seconds (another mouse click). 30 seconds is, ideally, about 1800 frames, so you'll have saved yourself a lot of transmits over the AGP (or PCI-e) pipe, even if it looks like a lot of code to execute.

Once you decide to put an entire wall in a display list, you could even cheat and use fewer vertices than you would for every brick separately, as long as you tile your texture correctly! (If your texture coordinates wrap, rather than clamp, the texture will repeat itself.)
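The "fewer vertices with a tiled texture" trick can be sketched as a single quad whose texture coordinates run past 1. This is a minimal illustration assuming the wall lies in the XZ plane, the brick texture covers exactly one brick face, and the texture wrap mode is set to repeat; WallQuad and build are hypothetical names:

```java
/** Hypothetical sketch: one quad standing in for a whole wall of identical brick faces. */
final class WallQuad {

    /**
     * Returns 4 vertices as {x, y, z, s, t}. Texture coordinates go from 0 to the
     * brick count in each direction, so a repeating texture shows one copy per brick.
     */
    static float[][] build(int bricksWide, int bricksHigh, float brickLength, float brickHeight) {
        float w = bricksWide * brickLength;
        float h = bricksHigh * brickHeight;
        return new float[][] {
            { 0, 0, 0, 0,          0 },
            { w, 0, 0, bricksWide, 0 },
            { w, 0, h, bricksWide, bricksHigh },
            { 0, 0, h, 0,          bricksHigh },
        };
    }
}
```

A wall 10 bricks wide and 3 high would otherwise need 30 front faces (120 vertices); here it is 4 vertices. The catch is that every brick in the merged quad gets the same color and texture.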

WhiteHexagon
Jr. Member   Posts: 51
« Reply #19 on: 2006-07-08 18:09:55 »
Hi Spasi,

I appreciate your help with this, but I didn't understand your earlier tip at first (see reply #13). But I'm learning slowerly. I was going to split my display list and that's where I got confused.

I've now tried your approach of batching the VA data. My data breaks down into 9 chunks quite nicely, so that's what I've tried first. I can see though that this is probably still too much data for the 100 or so items you mentioned, but now I'm drawing less data and no outlines... so I'm drawing 9x225 brick parts as below.

    bind top texture
    loop 9x:
        glDrawArrays[i] (225 single textured quads)

    bind side texture:
    loop 9x:
        glDrawArrays[i] (225 x 4 brick sides (no base))

So I presume if I take this approach I'm only doing 18 'draw calls'? For the brick tops I presume that quads will be split internally into two triangles, so that would be 450 triangles per call, right? And 1800 triangles for the 4 sides, which is more than the 1000 you mentioned. So, as expected, the CPU was still maxing out for the complete scene.

Next I tried to split the side drawing into two batches, east & north, and south & west.

    bind top texture
    loop 9x:
        glDrawArrays[i] (225 single textured quads)

    bind side texture:
    loop 9x:
        glDrawArrays[i] (225 x 2 brick sides (no base))
        glDrawArrays[i] (225 x 2 brick sides (no base))

That brings the triangle count down to 900 in each of those array lists. But sadly I'm still seeing a maxed out CPU and 30fps (the same as a single VA). Do you think I need to make these batches even smaller?

To bahuman: I could display all 2025 tops in a single VA at 60fps and 4% CPU. The problem was when I started drawing the brick sides as well. Then I seemed to reach some switchover point where the CPU started taking load.

Thanks again to everyone that's helping out on this, I hope I can show something nice at the end of it all.

Cheers
Peter
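The triangle and draw-call counts quoted in this post can be checked with a few lines of arithmetic. A minimal sketch, assuming each quad is split into two triangles as the post presumes; BatchMath and its methods are hypothetical names:

```java
/** Hypothetical sketch: triangle and draw-call arithmetic for the batching layouts above. */
final class BatchMath {

    /** Triangles submitted by one draw call covering the given number of quads per brick. */
    static int trianglesPerCall(int bricks, int quadsPerBrick) {
        return bricks * quadsPerBrick * 2; // each quad becomes two triangles
    }

    /** Draw calls per frame: every chunk issues its top calls plus its side calls. */
    static int drawCalls(int chunks, int topCallsPerChunk, int sideCallsPerChunk) {
        return chunks * (topCallsPerChunk + sideCallsPerChunk);
    }
}
```

For 225 bricks per chunk this gives 450 triangles per top call, 1800 per four-side call and 900 per two-side call. The first layout is 18 draw calls per frame and the second is 27, both far below the draw-call ceiling mentioned earlier in the thread.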

Spasi
JGO Ninja   Posts: 566 Medals: 22
Molon Lave
« Reply #20 on: 2006-07-10 06:53:10 »
Hi WhiteHexagon,

I was skeptical of the numbers you posted, so I created a little test to try the different methods. Here it is (requires Java 5.0+, LWJGL, GL1.3+, GL2.0 for the final test):

Bricks Test (439kb zip, includes source & win32 LWJGL binaries)

The test renders 18x18x18 "bricks" (6 QUADS, normals are specified but with no lighting or texturing). I assumed as a requirement that each brick is dynamic in position and appearance, so I'm rendering a total of 5832 unique bricks, while constantly animating their position and color. There are 3 tests currently (press SPACE to change the active test):

- Immediate mode rendering. Uses glTranslate and glBegin(GL_QUADS).
- Simple display list rendering. Uses glTranslate and glCallList.
- Display list rendering with pseudo-instancing. Uses a vertex shader to pass the brick position as a texture coordinate and glCallList.

I didn't have time to add a vertex array test, feel free to add one.

So, here are my results:

WinXP, Athlon XP 2800+, GeForce 6800GT AGP, Java 6 b90

               Client VM    Server VM
    1. IM        52 fps       65 fps
    2. DL       155 fps      170 fps
    3. PI       253 fps      253 fps

Orangy Tang
JGO Kernel   Posts: 2840 Medals: 26
Monkey for a head
« Reply #21 on: 2006-07-10 08:11:11 »
Quote: "#3 Interesting. Is there another better technique for highlighting the edges of the bricks instead of just drawing a black wireframe over the brick. (it doesn't seem to work very well anyway because of the zbuffer unless I make the line width equal to 2). You can see the effect I have on my front page: http://whitehexagon.com"

Quote: "Yeah, I know what you mean. In our terrain editor, I've solved the artifacts problem with a vertex shader (can be done without VS). I'm just pushing the terrain grid by a small amount towards the normal of each vertex and it looks great. I couldn't be bothered to search for non-GL_LINE line rendering (I'm using lines only in the editor for the terrain grid and debugging), but I think there are certain techniques you can investigate (using clever texturing IIRC)."

I don't know if this is still relevant to the discussion, but I found that there's quite a difference between rendering line primitives (e.g. glBegin(GL_LINES)) and line fill mode (e.g. glPolygonMode(GL_LINE)). Line fill/poly mode tends to be rather slow, but doing the equivalent work manually and just rendering line primitives is much faster. Seems rather odd to me, but it might be worth looking into.
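"Doing the equivalent work manually" for a brick outline amounts to listing its 12 edges as line segments instead of drawing the quads in line polygon mode. A minimal sketch that builds such a vertex list in plain Java; BoxEdges and build are hypothetical names, and the result is meant for a lines primitive where every two vertices form one segment:

```java
/** Hypothetical sketch: the 12 edges of a box as 24 line-segment endpoints. */
final class BoxEdges {
    // Pairs of corner indices, one pair per edge.
    private static final int[][] EDGES = {
        { 0, 1 }, { 1, 2 }, { 2, 3 }, { 3, 0 }, // base
        { 4, 5 }, { 5, 6 }, { 6, 7 }, { 7, 4 }, // top
        { 0, 4 }, { 1, 5 }, { 2, 6 }, { 3, 7 }  // vertical edges
    };

    /** Returns xyz triples, two per edge, for a length x width x height box at the origin. */
    static float[] build(float length, float width, float height) {
        float[][] c = {
            { 0, 0, 0 }, { length, 0, 0 }, { length, width, 0 }, { 0, width, 0 },
            { 0, 0, height }, { length, 0, height }, { length, width, height }, { 0, width, height }
        };
        float[] out = new float[EDGES.length * 2 * 3];
        int i = 0;
        for (int[] edge : EDGES) {
            for (int corner : edge) {
                out[i++] = c[corner][0];
                out[i++] = c[corner][1];
                out[i++] = c[corner][2];
            }
        }
        return out;
    }
}
```

This submits 24 vertices per brick for the outline and draws each edge once, whereas outlining six quads in polygon line mode draws every edge twice.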

Matzon
« League of Dukes » JGO Kernel   Posts: 1803 Medals: 8
I'm gonna wring your pants!
« Reply #22 on: 2006-07-10 10:55:20 »
WinXP, P4 3GHz, Ati x300 PCIe, Java 1.5.0_06
IM: 63
DL: 80
no difference between server and client (?)

elias
« Reply #23 on: 2006-07-10 11:12:57 »
Suse 10.1, radeon 9700 mobility, ATI drivers 8.26.18, Pentium M 1700 Mhz, java 1.5.0_07 (client and server the same):
IM: 54 DL: 110 DLadv: 180
Latest mustang gives me 66 for IM with -server, the rest are the same.
- elias

WhiteHexagon
Jr. Member   Posts: 51
« Reply #24 on: 2006-07-10 14:56:26 »
Hi Spasi,

A big thanks for writing that! And interesting to see others' results. Here's what I got on my hardware (GeForce 6200/AGP/SSE2, Java 1.5.0_06, Pentium M760, Win 2K). I only tried the client VM; I don't think my machine is quite server spec.

    IM: 74fps
    DL: 69fps
    PI: 73fps

The CPU was maxed out, but the display was very pretty. I have some lighting and textures which might be slowing my stuff down. Plus, as you can see, the machine spec is quite low. But obviously I'm doing something wrong in my code to be twice as slow, unless it's the way I'm configuring JOGL. I shall try to extract my drawing code into a testable unit, which should make it easier to test; currently it is quite heavily dependent on a whole bunch of other code.

Cheers
Peter

crash0veride007
JGO n00b   Posts: 40
ThE MaTriX HaS YoU!
« Reply #25 on: 2006-07-10 19:06:23 »
Windows XP x64, Dual 7900 GTX 512 Sli, Dual Opteron 252 @ 2.6Ghz, 8Gb Ram, Nvidia 91.31 Driver, Mustang Beta 2 32-bit VM
Client 32-bit VM IM: 106 FPS DL: 260 FPS PI: 328 FPS
Server 32-bit VM IM: 108 FPS DL: 264 FPS PI: 331 FPS

quintesse
Full Member   Posts: 124
Java games rock!
« Reply #26 on: 2006-07-11 07:26:49 »
WinXP, ATI mobility Radeon X700, no idea what processor it has but Windows says a 1.73GHz Intel, 1GB
java version "1.6.0-rc-fastdebug" Java(TM) SE Runtime Environment (build 1.6.0-rc-fastdebug-b88) Java HotSpot(TM) Client VM (build 1.6.0-rc-fastdebug-b88-debug, mixed mode)
IM: 83 fps DL: 139 fps PI: 171 fps
Server VM hardly makes a difference

joda
JGO n00b   Posts: 2
« Reply #27 on: 2006-07-19 17:37:44 »

WhiteHexagon
Jr. Member   Posts: 51
« Reply #28 on: 2006-07-31 15:43:42 »
Thanks joda, I currently use the polygon offset on another part of the engine, but never really understood what it was doing. It works okay for most cases, but I found a few gcards where I get a really nasty result when two polygons intersect, a kinda heavy saw tooth effect, if that makes sense. Using the stencil buffer technique looks a bit tricky, but maybe it's something I will brave once I learn a bit more about OpenGL.

--

Anyway, regarding the performance issue, I've managed to rip out most of my engine code into a standalone test now, apart from texture loading, where I use some custom file formats. Does anyone have a simple bit of code to generate a texture, i.e. one that doesn't need to load a texture from a file?

Thanks
Peter
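One way to get a texture without loading a file is to fill a pixel buffer procedurally. A minimal sketch that produces a grey checkerboard as tightly packed 8-bit RGB data; CheckerTexture and generate are hypothetical names, and the upload call itself (for example glTexImage2D with an RGB, unsigned-byte format) is left out because it depends on the binding in use:

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: builds a square checkerboard image as packed 8-bit RGB pixels. */
final class CheckerTexture {

    /**
     * size is the width and height in pixels; cell is the side of one checker square.
     * Rows are packed with no padding, 3 bytes per pixel.
     */
    static ByteBuffer generate(int size, int cell) {
        ByteBuffer pixels = ByteBuffer.allocateDirect(size * size * 3);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                boolean light = ((x / cell) + (y / cell)) % 2 == 0;
                byte v = (byte) (light ? 0xFF : 0x40);
                pixels.put(v).put(v).put(v); // same value for R, G and B
            }
        }
        pixels.flip(); // position 0, limit = bytes written
        return pixels;
    }
}
```

A call such as CheckerTexture.generate(64, 8) gives a 64x64 image with 8-pixel squares. Power-of-two sizes like this keep each row a multiple of 4 bytes, which matches the default unpack alignment in OpenGL.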

bahuman
Full Member   Posts: 145
« Reply #29 on: 2006-07-31 16:47:31 »
just give us some code that uses TextureIO.newTexture(new File("test.jpg"), true), and create a dummy jpg file. It shouldn't be that hard to grab a dummy jpg from the net, should it ? 