Java-Gaming.org    
  Performance  (Read 8504 times)
Offline WhiteHexagon

Senior Newbie





« Posted 2006-07-04 22:31:40 »

Hi All,

I'm just testing out a profiler to try and get some more performance out of my game.  I can already see that I need to add some more Display Lists for parts of the rendering, but one thing that surprised me was the following:

method name                                             time(ms)         invocation count
Model.draw(GL, int, int, int, boolean)                   24,687   93 %    426,877
  com.sun.opengl.impl.GLImpl.glBegin(int)                 4,937   19 %    2,561,262  
  com.sun.opengl.impl.GLImpl.glEnd()                      3,265   12 %    2,561,262  
  MyUtils.calcNormal(float[], float[], float[])           2,812   11 %    2,561,262  
  com.sun.opengl.impl.GLImpl.glNormal3fv(float[], int)    2,453    9 %    2,561,262  
  ...some other calls


Are glBegin and glEnd really so expensive?  It seems so. Anyway, I can fix this issue no problem; I just thought the numbers were quite interesting.

Cheers

Peter

whitehexagon.com - Building a new game world, one brick at a time.
Offline emzic

Senior Member





« Reply #1 - Posted 2006-07-05 08:15:50 »

what profiler did you use?

www.embege.com - personal website
webstart blendinspect - OpenGL BlendingModes visualization.
Online Spasi
« Reply #2 - Posted 2006-07-05 08:27:45 »

In immediate mode rendering the real work happens on glBegin and glEnd. Usually everything in between is buffered and submitted as a batch at glEnd. The glBegin overhead is probably GL state validation, pipeline flushing, etc.
Offline WhiteHexagon

Senior Newbie





« Reply #3 - Posted 2006-07-05 08:28:43 »

YourKit. It's a bit pricey, but they offer a free 30-day evaluation :)

whitehexagon.com - Building a new game world, one brick at a time.
Offline WhiteHexagon

Senior Newbie





« Reply #4 - Posted 2006-07-05 09:07:24 »

Thanks Spasi, that would explain why none of the other gl methods show up too :) but it makes it kinda hard to know what's slowing things down.

Anyway, I got my scene rendering 3x faster already, but maybe someone has some tips for a novice OGL programmer :)

The specific scene I was having problems with was displaying only around 5600 Lego-style bricks. Not that much to ask, I thought, but it was really killing my fps.

loop 5600 times:
    // draw solid brick
    gl.glPolygonMode(GL.GL_FRONT_AND_BACK, GL.GL_FILL);
    drawSingleBrick using GL_QUADS and textured
    // draw wireframe to highlight the edges in black
    gl.glPolygonMode(GL.GL_FRONT_AND_BACK, GL.GL_LINE);
    drawSingleBrick using GL_LINES


I've now split this into two iterations (one for solid drawing, one for wireframe highlight drawing), each of which gets compiled into a display list.  Then I just call the two display lists.  This approach seems to triple the performance, but I know that some of this is because the normal calculations and color lookups are only done once, during compilation.

But I'm still a bit confused.  I would have thought that with a display list the code then lives on the gcard, but my CPU is still maxing out at 98% during rendering.  Is JOGL running these lists on the CPU? Should I be using vertex arrays for this type of stuff, or would I have the same problem?

Any tips are really appreciated.

Cheers

Peter

java (build 1.5.0_06-b05)
jogl beta4
Win2k
GL_VENDOR=NVIDIA Corporation
GL_RENDERER=GeForce 6200/AGP/SSE2
DRAWABLE_GL=com.sun.opengl.impl.GLImpl

whitehexagon.com - Building a new game world, one brick at a time.
Offline bahuman

Junior Member





« Reply #5 - Posted 2006-07-05 10:15:54 »

I'm curious, what is your current FPS?

Oh, and maybe this might get you on your way:
http://www.opengl.org//resources/faq/technical/performance.htm

What is the opengl code for drawing the lego block? How many triangles does it contain ?
Offline WhiteHexagon

Senior Newbie





« Reply #6 - Posted 2006-07-05 11:42:01 »

Well it's quite a low spec gcard, but I was down to 7fps, back to 20fps now after the changes, but I think I'm CPU bound for some reason.

Thanks for the link.

This is the code for drawing the solid bricks; for the wireframe I call the same code but use GL_LINES.

        
gl.glBindTexture(GL.GL_TEXTURE_2D, textureStud);
gl.glEnable(GL.GL_TEXTURE_2D);
gl.glPolygonMode(GL.GL_FRONT_AND_BACK, GL.GL_FILL);

loop 5600 times
    float tx = drawQx + (QUEST_TILE_LENGTH * x);
    float ty = drawQy + drawY;
    int height = mapData[qx][qy].height[x][y];
    float tz = ClientConstants.STD_BRICK_HEIGHT * height;
    // adjust height to be zero based
    gl.glColor3fv(questColorTable[height - CorkConstants.MIN_MAP_HEIGHT], 0);

    // solid brick
    gl.glTranslatef(tx, ty, tz);
    gl.glBegin(GL.GL_QUADS);
    drawQuestTile(gl, false);
    gl.glEnd();

    gl.glTranslatef(-tx, -ty, -tz);


And this is for a single brick

    
private static final void drawQuestTile(final GL gl, final boolean wireframe) {
    float BASE = 0.0f; // base height...
    float HEIGHT = STD_BRICK_HEIGHT;
    float WIDTH = 4;
    float LENGTH = 4;
    float[] normal;

    // front
    normal = Utils.calcNormal(new float[] { 0, 0, BASE }, new float[] { LENGTH, 0, BASE }, new float[] { LENGTH, 0, HEIGHT });
    gl.glNormal3fv(normal, 0);
    gl.glVertex3f(0, 0, BASE);
    gl.glVertex3f(LENGTH, 0, BASE);
    gl.glVertex3f(LENGTH, 0, HEIGHT);
    gl.glVertex3f(0, 0, HEIGHT);

    // back
    normal = Utils.calcNormal(new float[] { LENGTH, WIDTH, BASE }, new float[] { 0, WIDTH, BASE }, new float[] { 0, WIDTH, HEIGHT });
    gl.glNormal3fv(normal, 0);
    gl.glVertex3f(LENGTH, WIDTH, BASE);
    gl.glVertex3f(0, WIDTH, BASE);
    gl.glVertex3f(0, WIDTH, HEIGHT);
    gl.glVertex3f(LENGTH, WIDTH, HEIGHT);

    // w end
    normal = Utils.calcNormal(new float[] { 0, WIDTH, BASE }, new float[] { 0, 0, BASE }, new float[] { 0, 0, HEIGHT });
    gl.glNormal3fv(normal, 0);
    gl.glVertex3f(0, WIDTH, BASE);
    gl.glVertex3f(0, 0, BASE);
    gl.glVertex3f(0, 0, HEIGHT);
    gl.glVertex3f(0, WIDTH, HEIGHT);

    // e end
    normal = Utils.calcNormal(new float[] { LENGTH, 0, BASE }, new float[] { LENGTH, WIDTH, BASE }, new float[] { LENGTH, WIDTH, HEIGHT });
    gl.glNormal3fv(normal, 0);
    gl.glVertex3f(LENGTH, 0, BASE);
    gl.glVertex3f(LENGTH, WIDTH, BASE);
    gl.glVertex3f(LENGTH, WIDTH, HEIGHT);
    gl.glVertex3f(LENGTH, 0, HEIGHT);

    // bottom
    normal = Utils.calcNormal(new float[] { 0, 0, 0 }, new float[] { LENGTH, 0, 0 }, new float[] { LENGTH, WIDTH, 0 });
    gl.glNormal3fv(normal, 0);
    gl.glVertex3f(0, 0, BASE);
    gl.glVertex3f(LENGTH, 0, BASE);
    gl.glVertex3f(LENGTH, WIDTH, BASE);
    gl.glVertex3f(0, WIDTH, BASE);

    // top (the only textured face; texture coordinates are skipped for the wireframe pass)
    normal = Utils.calcNormal(new float[] { 0, 0, HEIGHT }, new float[] { LENGTH, 0, HEIGHT }, new float[] { LENGTH, WIDTH, HEIGHT });
    gl.glNormal3fv(normal, 0);
    if (!wireframe) gl.glTexCoord2f(0.0f, 0.0f);
    gl.glVertex3f(0, 0, HEIGHT);
    if (!wireframe) gl.glTexCoord2f(QUEST_TILE_STUD_COUNT, 0.0f);
    gl.glVertex3f(LENGTH, 0, HEIGHT);
    if (!wireframe) gl.glTexCoord2f(QUEST_TILE_STUD_COUNT, QUEST_TILE_STUD_COUNT);
    gl.glVertex3f(LENGTH, WIDTH, HEIGHT);
    if (!wireframe) gl.glTexCoord2f(0.0f, QUEST_TILE_STUD_COUNT);
    gl.glVertex3f(0, WIDTH, HEIGHT);
}


whitehexagon.com - Building a new game world, one brick at a time.
Online Spasi
« Reply #7 - Posted 2006-07-05 14:37:58 »

There are three reasons for the low performance you're seeing:

1. drawQuestTile is making immediate mode calls (glVertex/Normal/TexCoord), which is the slowest way to submit vertices. You're creating a lot of arrays too, which contributes to bad performance. With display lists, consider this problem solved.

2. You're submitting too many low polygon batches. 5600 objects is a big number, even for a high-end CPU. The overhead of each draw call is considerably larger than the GPU effort to render six quads. The GPU is basically sitting idle and waiting for the CPU. You may be able to solve this by packing groups of bricks (say 100 at a time) in a vertex array and drawing all of them at once.

3. GL_LINE drawing is not hardware accelerated on consumer-level GPUs.
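
On point 1, the per-call array allocations can be avoided because a brick's faces are axis-aligned, so their normals never change. A minimal sketch, assuming a hypothetical `BrickNormals` class (not part of the posted code) and assuming that `Utils.calcNormal` is a normalized cross product; the brick size of 4 x 4 x 1 is also just an example:

```java
/** Hypothetical helper: computes a face normal once so it can be reused every frame. */
final class BrickNormals {
    /**
     * Normalized cross product of (b - a) and (c - a).
     * Assumed to match what Utils.calcNormal does; a degenerate triangle yields NaN.
     */
    static float[] calcNormal(float[] a, float[] b, float[] c) {
        float ux = b[0] - a[0], uy = b[1] - a[1], uz = b[2] - a[2];
        float vx = c[0] - a[0], vy = c[1] - a[1], vz = c[2] - a[2];
        float nx = uy * vz - uz * vy;
        float ny = uz * vx - ux * vz;
        float nz = ux * vy - uy * vx;
        float len = (float) Math.sqrt(nx * nx + ny * ny + nz * nz);
        return new float[] { nx / len, ny / len, nz / len };
    }

    // Computed once at class load; a draw method could pass this straight to glNormal3fv.
    static final float[] FRONT = calcNormal(
            new float[] { 0, 0, 0 }, new float[] { 4, 0, 0 }, new float[] { 4, 0, 1 });

    public static void main(String[] args) {
        System.out.println(FRONT[0] + ", " + FRONT[1] + ", " + FRONT[2]);
    }
}
```

The same idea applies to the other five faces: six constants replace eighteen array allocations per brick.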
Online Spasi
« Reply #8 - Posted 2006-07-05 14:44:06 »

For more details about #2, google for "instancing". It's the method provided by Direct3D to solve this problem. OpenGL does not support it because GL's overhead is generally much lower than D3D's, but it's still a problem in situations like yours. A technique called "pseudo-instancing" can be used in OpenGL, but that requires vertex shaders, which is probably too advanced for you right now.
Offline WhiteHexagon

Senior Newbie





« Reply #9 - Posted 2006-07-05 15:48:17 »

Thanks for the great feedback!

#1 The code I posted also now has two display lists wrapping it, one for the solid bricks and one for the wireframe. That's where I got my first increase from 7fps to 20fps, but I still suffer 100% CPU load. Maybe I should try array lists instead? Or do you think this might be a JOGL issue?  From my reading I thought Display Lists were compiled on the GPU and then used directly from there, so shouldn't I be getting almost 0 CPU load?

#2 If I'm using Display Lists or Array Lists I assume this is no longer an issue?

I'd like to look more at vertex shaders in the near future; they sound very powerful, but my priority is to get something basically playable and then start to improve it.  I didn't want to worry too much about the performance, but the client has gone down from 60fps a few months ago to yesterday's low of 5fps, so I thought I'd better take a break and find out where the problem was before I go too far down the wrong path.  The info here is really helping, thanks.

#3 Interesting. Is there a better technique for highlighting the edges of the bricks than just drawing a black wireframe over the brick? (It doesn't seem to work very well anyway because of the zbuffer unless I make the line width equal to 2.)  You can see the effect I have on my front page: http://whitehexagon.com

Thanks

Peter


whitehexagon.com - Building a new game world, one brick at a time.
Online Spasi
« Reply #10 - Posted 2006-07-05 17:44:28 »

Quote
#1 The code I posted also now has two display lists wrapping it, one for the solid bricks and one for the wireframe. That's where I got my first increase from 7fps to 20fps, but I still suffer 100% CPU load. Maybe I should try array lists instead? Or do you think this might be a JOGL issue?  From my reading I thought Display Lists were compiled on the GPU and then used directly from there, so shouldn't I be getting almost 0 CPU load?

Yes, DLs are compiled and (probably) stored on the GPU and they are very fast. You've solved this problem, I just wanted to help you understand why this was an issue before moving to DLs. #2 is your big problem now.

Quote
#2 If I'm using Display Lists or Array Lists I assume this is no longer an issue?

Unfortunately it is. Actually, it's a problem no matter how you're rendering (DLs, VBOs, vertex arrays). It is caused by the pipelined way GPUs work. Each time you make a draw call, a lot happens (from state validation to, worst case, pipeline flushing). The overhead of each such call piles up to the point that the CPU struggles to keep up with the GPU (and usually fails). The problem isn't how you're rendering, but the sheer number of draw calls: 5600.

FYI, most of the redesign in DX10 was done because of exactly this problem. Even the new geometry shaders, apart from the unique possibilities they offer, are meant to improve this situation.

So, you have to accept the fact that you can't possibly make 5600 draw calls. You either design your game around that, or use techniques like the one I described in my previous post. IIRC, current GPUs work better in batches of more than 500-1000 triangles and the number of draw calls should be lower than 1000 (there are certain papers that have exact numbers).
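
To put rough numbers on that guidance, the arithmetic can be sketched as below. The class and method names are hypothetical, and the figure of 12 triangles per brick assumes each of the six quads ends up as two triangles.

```java
/** Hypothetical helper for estimating draw-call counts when bricks are grouped into batches. */
final class BatchMath {
    static final int TRIANGLES_PER_BRICK = 12; // 6 quads, assumed to be 2 triangles each

    /** Number of draw calls needed when bricksPerBatch bricks share one call (rounds up). */
    static int drawCalls(int brickCount, int bricksPerBatch) {
        return (brickCount + bricksPerBatch - 1) / bricksPerBatch;
    }

    /** Triangles submitted by one full batch. */
    static int trianglesPerBatch(int bricksPerBatch) {
        return bricksPerBatch * TRIANGLES_PER_BRICK;
    }

    public static void main(String[] args) {
        System.out.println(drawCalls(5600, 1) + " calls with one brick per call");
        System.out.println(drawCalls(5600, 100) + " calls of "
                + trianglesPerBatch(100) + " triangles with 100 bricks per call");
    }
}
```

With 100 bricks per batch, 5600 bricks need 56 calls of 1200 triangles each, which sits inside both of the ranges quoted above.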

Quote
#3 Interesting. Is there a better technique for highlighting the edges of the bricks than just drawing a black wireframe over the brick? (It doesn't seem to work very well anyway because of the zbuffer unless I make the line width equal to 2.)  You can see the effect I have on my front page: http://whitehexagon.com

Yeah, I know what you mean. In our terrain editor, I've solved the artifacts problem with a vertex shader (can be done without VS). I'm just pushing the terrain grid by a small amount towards the normal of each vertex and it looks great. I couldn't be bothered to search for non-GL_LINE line rendering (I'm using lines only in the editor for the terrain grid and debugging), but I think there are certain techniques you can investigate (using clever texturing IIRC).
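
The same push-along-the-normal idea can also be applied on the CPU while building the line geometry. A minimal sketch, assuming flat `float[]` arrays of xyz positions and unit normals; `offsetAlongNormals` and the epsilon used in `main` are illustrative choices, not the code from the terrain editor:

```java
/** Hypothetical helper: lifts outline vertices slightly off the surface to avoid z-fighting. */
final class LineOffset {
    /** Returns a copy of positions with each vertex moved by eps along its (unit) normal. */
    static float[] offsetAlongNormals(float[] positions, float[] normals, float eps) {
        if (positions.length != normals.length || positions.length % 3 != 0) {
            throw new IllegalArgumentException("expected matching xyz arrays");
        }
        float[] out = new float[positions.length];
        for (int i = 0; i < positions.length; i++) {
            out[i] = positions[i] + eps * normals[i];
        }
        return out;
    }

    public static void main(String[] args) {
        float[] edge = { 0, 0, 1,  4, 0, 1 };      // one edge on the top face of a brick
        float[] up   = { 0, 0, 1,  0, 0, 1 };      // the top face normal for both endpoints
        float[] lifted = offsetAlongNormals(edge, up, 0.01f);
        System.out.println("z moved from " + edge[2] + " to " + lifted[2]);
    }
}
```

How large the offset needs to be depends on depth-buffer precision and viewing distance, so the value would have to be tuned per scene.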
Offline darkprophet

Senior Member




Go Go Gadget Arms


« Reply #11 - Posted 2006-07-05 17:50:44 »

You could use a vertex/fragment shader couple to do edge detection and blend that over the scene...GPGPU has a few code snippets about edge detection filter on the GPU

DP

Friends don't let friends make MMORPGs.

Blog | Volatile-Engine
Offline kitfox

Junior Member




Java games rock!


« Reply #12 - Posted 2006-07-05 18:29:53 »

I'm working on a terrain editor myself.  At the moment, I'm sending everything to the graphics card with individual calls to glVertex, glNormal and glTexCoord, and getting a pretty decent frame rate.  Even when I throw in GL_LINEs to highlight the edges in my editor, the performance is reasonable.  I don't think I have a super speedy machine, but 3200 triangles the slow way seems to be working for me.

Anyhow, I wrote it this way just to let me start debugging things quickly.  I plan to move everything to VBOs.  Now, will my VBO render faster if I write an algorithm to stripify my terrain, or should I just leave them as individual triangles? 

I'm implementing a ROAM style algorithm, and it seems to be working, but I'm getting odd artifacts on the tessellated terrain.  When smooth shaded with normals and a solid color, there are these star shapes surrounding concave or convex vertices.  I'm pretty sure the normals are correct, but having these shapes in an otherwise smooth terrain breaks the visual continuity.  Do I need to tweak the normals somehow?

I'm also curious about your idea of using a vertex shader to display geometry.  Does this mean you would just upload a square grid of points as a single object and write a clever vertex shader to fold it into a terrain shape?  Does this really give faster performance?  How do you adjust for level of detail?

Offline WhiteHexagon

Senior Newbie





« Reply #13 - Posted 2006-07-05 21:21:51 »

#2 For those numbers, what do you define as a draw call? Is one draw call = one gl.glDoSomething method, or one Begin/End block?
Would splitting the display list into 5 smaller lists help in any way, or is it really down to the number of gl.glDoSomething calls inside the display list?  I guess that is currently 5600 * number of calls in drawQuestTile (35ish?) = 196,000.

I think I'm missing something here, so I shall do some more reading on this topic, because I think I need somehow to solve having this many bricks on screen, if not for the landscape then for sure once I have all the other scenery and creatures on screen.  Maybe I will also try some vertex arrays here; I use them already for another part of the game and they seem to perform quite well.  Since I only need one surface of the brick textured, maybe I can also draw those surfaces separately and render the sides of the bricks untextured.  Lots of ideas... But it's been a long day and my head is buzzing with all this :)  Thanks to everyone for the feedback.  I shall have another read of this over the weekend and hopefully all will be clearer.

whitehexagon.com - Building a new game world, one brick at a time.
Offline Niwak

Senior Member


Medals: 1
Projects: 1



« Reply #14 - Posted 2006-07-06 09:06:12 »

To WhiteHexagon:
A draw call in your situation is "gl.glCallList". It doesn't matter how many glVertex, ... calls are in the display list.
The thing you should minimize is the number of gl.glCallList calls. Therefore splitting the display list is not a solution; you would make things worse. Regarding your display list, there are "good habits" given by card manufacturers that perhaps you are not applying. Here are some:
- Don't perform state changes in a display list (like glBindTexture, glTranslate, ...). This can render the display list rather inefficient, since it forces the driver to perform a state validation when you call the display list even if you did not change the state.
- Use a uniform vertex format, i.e. when you specify a vertex you should always provide the same information (for example: a normal + a texture coordinate + a vertex). You are not doing this, since in your snippet normals are specified once per face, texture coordinates just for one of the faces, and color seems to be one per model.

Anyway, your model is composed of only 6x4 = 24 vertices, which is very low. I'm not sure using 5600 display lists is a very efficient technique. You could try to create one FloatBuffer, put in it interleaved data for all your blocks, and when a block moves, just update its coordinates directly in the FloatBuffer and submit this to the GPU with a single glDrawArrays call. I think you would get a fairly higher frame rate (at least if not all blocks are moved each frame).
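
A rough sketch of that single-buffer layout, using only `java.nio` so it runs without a GL context. The `BrickBuffer` class, the brick dimensions and the normal-then-position ordering (which matches OpenGL's `GL_N3F_V3F` interleaved format) are assumptions for illustration; the draw call itself appears only as a comment.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

/** Hypothetical builder that packs many bricks into one interleaved vertex array. */
final class BrickBuffer {
    static final int FLOATS_PER_VERTEX = 6;   // nx, ny, nz, x, y, z
    static final int VERTS_PER_BRICK = 24;    // 6 quads x 4 corners

    // Four corners per face, given as unit-cube coordinates (x, y, z triples).
    private static final int[][] FACES = {
        {0,0,0, 1,0,0, 1,0,1, 0,0,1},   // front  (-y)
        {1,1,0, 0,1,0, 0,1,1, 1,1,1},   // back   (+y)
        {0,1,0, 0,0,0, 0,0,1, 0,1,1},   // west   (-x)
        {1,0,0, 1,1,0, 1,1,1, 1,0,1},   // east   (+x)
        {0,1,0, 1,1,0, 1,0,0, 0,0,0},   // bottom (-z)
        {0,0,1, 1,0,1, 1,1,1, 0,1,1},   // top    (+z)
    };
    private static final float[][] NORMALS = {
        {0,-1,0}, {0,1,0}, {-1,0,0}, {1,0,0}, {0,0,-1}, {0,0,1}
    };

    /** Packs every brick into one direct buffer; origins holds x, y, z per brick. */
    static FloatBuffer build(float[] origins, float length, float width, float height) {
        int bricks = origins.length / 3;
        FloatBuffer buf = ByteBuffer
                .allocateDirect(bricks * VERTS_PER_BRICK * FLOATS_PER_VERTEX * 4)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (int b = 0; b < bricks; b++) {
            for (int f = 0; f < FACES.length; f++) {
                for (int v = 0; v < 4; v++) {
                    buf.put(NORMALS[f][0]).put(NORMALS[f][1]).put(NORMALS[f][2]);
                    buf.put(origins[b * 3]     + FACES[f][v * 3]     * length);
                    buf.put(origins[b * 3 + 1] + FACES[f][v * 3 + 1] * width);
                    buf.put(origins[b * 3 + 2] + FACES[f][v * 3 + 2] * height);
                }
            }
        }
        buf.flip();
        // With JOGL the whole buffer could then be drawn in one call, roughly:
        //   gl.glInterleavedArrays(GL.GL_N3F_V3F, 0, buf);
        //   gl.glDrawArrays(GL.GL_QUADS, 0, bricks * VERTS_PER_BRICK);
        return buf;
    }
}
```

In this layout brick `b` starts at float index `b * 144`, so moving one brick means rewriting only its 24 positions with absolute `put(index, value)` calls.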

To KitFox:
I have spent some time implementing a terrain algorithm for my game. In this process, I initially tried ROAM. The result was that it was somewhat inefficient; the fact that you have to generate a new index array for each frame, with all the stripping problems, made it too CPU intensive for my game. I have moved to a very straightforward system similar to geomipmapping, which performs really well and was much easier and faster to implement. So, before wasting too much time on ROAM, I would suggest quickly trying a brute force system like geomipmapping to see whether it fits your needs.

    Vincent
Offline cylab

JGO Ninja


Medals: 43



« Reply #15 - Posted 2006-07-06 09:08:09 »

Quote
Is there another better technique for highlighting the edges of the bricks instead of just drawing a black wireframe over the brick.

You could texture your quad using an image containing that highlighting lines.

Mathias - I Know What [you] Did Last Summer!
Offline WhiteHexagon

Senior Newbie





« Reply #16 - Posted 2006-07-07 21:54:28 »

Hi All,

I've done some work on this the last couple of days with some interesting results.  I tried Niwak's idea of using a vertex array.  For code simplicity I split the rendering into two VAs: one for the brick tops with a studded texture, and another for the brick sides (including the edges drawn into the texture as cylab suggested).

So the results:  when displaying just the tops of the bricks the VA runs as I would expect, 5% CPU and around 60fps (I presume this is just limited by the vsync rate of my TFT, which is currently also 60).  Is there a way to disable that? I remember with GL4Java there used to be a special call to disable the vsync limit; does JOGL have something similar?

So when I try to display the second VA (the sides) as well, the CPU jumps up to 100% and the frame rate drops to 30fps (but that's still better than the 20fps from using display lists!)

So I took out the rendering of the tops of the bricks for now, and just have the single vertex array for the sides of the bricks to try and optimize that.  I found that by adjusting the count parameter on glDrawArrays I could find some switchover point where I start to become CPU bound.  I'm drawing 2025 bricks, which worked out to about 32,400 vertices.


vertices CPU%
10000   6
15000   15
17000   20
20000   25
21000   50
22000   90
23000   98
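
The non-linearity is easier to see as the CPU cost of each additional 1000 vertices. A small sketch using only the figures from the table above; the `CpuKnee` class and its method names are hypothetical:

```java
/** Hypothetical helper that looks for the knee in the measured vertices-vs-CPU figures. */
final class CpuKnee {
    static final int[] VERTICES    = { 10000, 15000, 17000, 20000, 21000, 22000, 23000 };
    static final int[] CPU_PERCENT = {     6,    15,    20,    25,    50,    90,    98 };

    /** CPU percentage points added per 1000 extra vertices between row i and row i + 1. */
    static double slope(int i) {
        return (CPU_PERCENT[i + 1] - CPU_PERCENT[i]) * 1000.0
                / (VERTICES[i + 1] - VERTICES[i]);
    }

    /** Index of the interval with the steepest rise. */
    static int steepestInterval() {
        int best = 0;
        for (int i = 1; i < VERTICES.length - 1; i++) {
            if (slope(i) > slope(best)) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        for (int i = 0; i < VERTICES.length - 1; i++) {
            System.out.println(VERTICES[i] + " -> " + VERTICES[i + 1] + ": " + slope(i));
        }
    }
}
```

Up to 20,000 vertices (1250 bricks at 16 vertices each) the cost stays under 3 points per 1000 vertices; between 20,000 and 22,000 it jumps to 25 and then 40 points.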


So it seems I can only draw around half the brick sides I need taking this approach.  I was thinking that maybe I could use a QUAD_STRIP for the 4 sides which would reduce the vertex count from 16 down to 10 per brick. 
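
The 10-vertex figure comes from walking around the brick once and repeating the first corner pair to close the loop. A minimal sketch, with a hypothetical `sideStrip` helper that only generates positions (normals, colours and texture coordinates are left out):

```java
/** Hypothetical helper: positions for the four sides of one brick as a single GL_QUAD_STRIP. */
final class BrickStrip {
    /** Returns 5 corner pairs = 10 vertices (xyz each), upper vertex first in each pair. */
    static float[] sideStrip(float length, float width, float base, float height) {
        // Corners in order around the brick; the first is repeated to close the loop.
        float[][] corners = { {0, 0}, {length, 0}, {length, width}, {0, width}, {0, 0} };
        float[] out = new float[corners.length * 2 * 3];
        int i = 0;
        for (float[] c : corners) {
            out[i++] = c[0]; out[i++] = c[1]; out[i++] = height;  // upper vertex
            out[i++] = c[0]; out[i++] = c[1]; out[i++] = base;    // lower vertex
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sideStrip(4, 4, 0, 1).length / 3 + " vertices per brick");
    }
}
```

Because corner vertices are shared between adjacent sides, any per-vertex attribute (normal, colour, texture coordinate) is shared along with them.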

Or I could even try and calculate a quad strip for a complete row of bricks across the whole map.  Question: would I still be able to change the color of each face using a quad strip, or will I end up with just a mess of blended colors because the vertices are shared?  How would that work with textures if I wanted different textures per face?  I realise I can have both textures in a single 256x256 texture and just cut out the piece I need for each face, but how would I specify this while drawing a quad strip? It seems impossible.

Anyway overall things are getting better.  I'm just still confused over why the VA rendering starts to impact the CPU performance so drastically, and not even linearly.

Cheers

Peter



whitehexagon.com - Building a new game world, one brick at a time.
Online Spasi
« Reply #17 - Posted 2006-07-08 09:50:44 »

Hi WhiteHexagon,

I told you what to do in my second post:

Quote
You may be able to solve this by packing groups of bricks (say 100 at a time) in a vertex array and drawing all of them at once.

There are two pieces of information in that sentence, a) use vertex arrays and b) don't pack all the bricks in a single VA, but rather groups of them.
Offline bahuman

Junior Member





« Reply #18 - Posted 2006-07-08 12:27:56 »

Quote
So the results:  when displaying just the tops of the bricks the VA runs as I would expect, 5% CPU and around 60fps (I presume this is just limited by the vsync rate of my TFT, which is currently also 60).  Is there a way to disable that? I remember with GL4Java there used to be a special call to disable the vsync limit; does JOGL have something similar?

So when I try to display the second VA (the sides) as well, the CPU jumps up to 100% and the frame rate drops to 30fps (but that's still better than the 20fps from using display lists!)

So I took out the rendering of the tops of the bricks for now, and just have the single vertex array for the sides of the bricks to try and optimize that.  I found that by adjusting the count parameter on glDrawArrays I could find some switchover point where I start to become CPU bound.  I'm drawing 2025 bricks, which worked out to about 32,400 vertices.

When you try the same measurement with only the top of the bricks, do you get the same result?

Also: drawing each brick separately is not the most efficient approach. You can easily optimize this by constructing new display lists each time the user attaches a new brick to the construction. For example, if the user built a wall, you can put the entire wall in a display list, even if the layout of the wall may change within the next 30 seconds (another mouse click). 30 seconds is, ideally, about 1800 frames, so you'll have saved yourself a lot of transmits over the AGP (or PCI-e) pipe, even if it looks like a lot of code to execute. Once you decide to put an entire wall in a display list, you could even cheat and use fewer vertices than you would for every brick separately, as long as you tile your texture correctly! (If your texture coordinates wrap rather than clamp, the texture will repeat itself.)
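
One way to read the "cheat" is to merge runs of neighbouring bricks of equal height into a single quad and let a repeating texture supply the per-brick pattern. A minimal sketch under that assumption; `Run` and `mergeRow` are hypothetical names, and the texture is assumed to use repeat wrapping:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that merges one row of the height map into as few quads as possible. */
final class RowMerger {
    /** A run of adjacent bricks sharing the same height; it can be drawn as one quad. */
    static final class Run {
        final int start, length, height;
        Run(int start, int length, int height) {
            this.start = start;
            this.length = length;
            this.height = height;
        }
        /** Texture coordinate at the far end of the quad: one texture repeat per brick. */
        float maxU() { return length; }
    }

    /** Splits a row of brick heights into runs of equal height. */
    static List<Run> mergeRow(int[] heights) {
        List<Run> runs = new ArrayList<Run>();
        int start = 0;
        for (int i = 1; i <= heights.length; i++) {
            if (i == heights.length || heights[i] != heights[start]) {
                runs.add(new Run(start, i - start, heights[start]));
                start = i;
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        int[] row = { 2, 2, 2, 3, 3, 1 };
        System.out.println(row.length + " bricks become " + mergeRow(row).size() + " quads");
    }
}
```

The saving depends entirely on how flat the map is; a row where every brick has a different height merges nothing.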
Offline WhiteHexagon

Senior Newbie





« Reply #19 - Posted 2006-07-08 22:09:55 »

Hi Spasi, I appreciate your help with this, but I didn't understand your earlier tip at first (see reply #13).  But I'm learning slowly :)  I was going to split my display list and that's where I got confused.  I've now tried your approach of batching the VA data.  My data breaks down into 9 chunks quite nicely, so that's what I've tried first.  I can see though that this is probably still too much data for the 100 or so items you mentioned, but now I'm drawing less data and no outlines...  so I'm drawing 9x225 brick parts as below.

bind top texture
loop 9x:
    glDrawArrays[i] (225 single textured quads)

bind side texture:
loop 9x:
    glDrawArrays[i]  (225 x 4 brick sides (no base))


So I presume if I take this approach I'm only doing 18 'draw calls'? For the brick tops I presume that quads will be split internally into two triangles, so that would be 450 triangles per call, right? And 1800 triangles for the 4 sides, which is more than the 1000 you mentioned.  So as expected the CPU was still maxing out for the complete scene.
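
The chunking above can be expressed as (first, count) pairs for `glDrawArrays`. A minimal sketch, assuming all quads are stored back to back in one array and split as evenly as possible; `DrawRanges` and its methods are hypothetical names:

```java
/** Hypothetical helper that splits one big quad array into ranges for separate draw calls. */
final class DrawRanges {
    /** Returns {first, count} vertex ranges, one per chunk. */
    static int[][] split(int totalQuads, int chunks) {
        int[][] ranges = new int[chunks][2];
        int first = 0;
        for (int c = 0; c < chunks; c++) {
            // Spread any remainder over the first chunks so sizes differ by at most one quad.
            int quads = totalQuads / chunks + (c < totalQuads % chunks ? 1 : 0);
            ranges[c][0] = first;
            ranges[c][1] = quads * 4;          // 4 vertices per quad
            first += quads * 4;
        }
        // Each range would map to: gl.glDrawArrays(GL.GL_QUADS, first, count);
        return ranges;
    }

    /** Triangles for a range, assuming the driver splits each quad into 2 triangles. */
    static int triangles(int[] range) {
        return range[1] / 4 * 2;
    }

    public static void main(String[] args) {
        System.out.println("tops:  " + triangles(split(2025, 9)[0]) + " triangles per call");
        System.out.println("sides: " + triangles(split(2025 * 4, 9)[0]) + " triangles per call");
    }
}
```

For 2025 bricks in 9 chunks this gives 450 triangles per call for the tops and 1800 for the sides, matching the figures in the post.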

So next I tried to split the side drawing into two batches, east & north, and south & west. 

bind top texture
loop 9x:
    glDrawArrays[i] (225 single textured quads)

bind side texture:
loop 9x:
    glDrawArrays[i]  (225 x 2 brick sides (no base))
    glDrawArrays[i]  (225 x 2 brick sides (no base))


That brings the triangle count down to 900 in each of those array lists.  But sadly I'm still seeing a maxed out CPU and 30fps (the same as a single VA).  Do you think I need to make these batches even smaller?

To bahuman:  I could display all 2025 tops in a single VA at 60fps and 4% CPU.  The problem was when I started drawing the brick sides as well.  Then I seemed to reach some switchover point where the CPU started taking load.

Thanks again to everyone who's helping out on this.  I hope I can show something nice at the end of it all :)

Cheers

Peter

whitehexagon.com - Building a new game world, one brick at a time.
Online Spasi
« Reply #20 - Posted 2006-07-10 10:53:10 »

Hi WhiteHexagon,

I was skeptical of the numbers you posted, so I created a little test to try the different methods. Here it is (requires Java 5.0+, LWJGL, GL1.3+, GL2.0 for the final test):

Bricks Test (439kb zip, includes source & win32 LWJGL binaries)

The test renders 18x18x18 "bricks" (6 QUADS, normals are specified but with no lighting or texturing). I assumed as a requirement that each brick is dynamic in position and appearance, so I'm rendering a total of 5832 unique bricks, while constantly animating their position and color.

There are 3 tests currently (press SPACE to change the active test):

- Immediate mode rendering. Uses glTranslate and glBegin(GL_QUADS).
- Simple display list rendering. Uses glTranslate and glCallList.
- Display list rendering with pseudo-instancing. Uses a vertex shader to pass the brick position as a texture coordinate and glCallList.

I didn't have time to add a vertex array test, feel free to add one.

So, here are my results:

WinXP, Athlon XP 2800+, GeForce 6800GT AGP, Java 6 b90

Test     Client VM   Server VM
------   ---------   ---------
1. IM     52 fps      65 fps
2. DL    155 fps     170 fps
3. PI    253 fps     253 fps
Offline Orangy Tang

JGO Kernel


Medals: 56
Projects: 11


Monkey for a head


« Reply #21 - Posted 2006-07-10 12:11:11 »

Quote
#3 Interesting. Is there a better technique for highlighting the edges of the bricks than just drawing a black wireframe over the brick? (It doesn't seem to work very well anyway because of the zbuffer unless I make the line width equal to 2.)  You can see the effect I have on my front page: http://whitehexagon.com

Quote
Yeah, I know what you mean. In our terrain editor, I've solved the artifacts problem with a vertex shader (can be done without VS). I'm just pushing the terrain grid by a small amount towards the normal of each vertex and it looks great. I couldn't be bothered to search for non-GL_LINE line rendering (I'm using lines only in the editor for the terrain grid and debugging), but I think there are certain techniques you can investigate (using clever texturing IIRC).

I don't know if this is still relevant to the discussion, but I found that there's quite a difference between rendering line primitives (e.g. glBegin(GL_LINES)) and line fill mode (e.g. glPolygonMode(GL_LINE)). Line fill/poly mode tends to be rather slow, but doing the equivalent work manually and just rendering line primitives is much faster.

Seems rather odd to me, but it might be worth looking into.

[ TriangularPixels.com - Play Growth Spurt, Rescue Squad and Snowman Village ] [ Rebirth - game resource library ]
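
Doing "the equivalent work manually" for a brick means emitting its 12 edges as line segments. A minimal sketch with a hypothetical `boxEdges` helper that produces positions suitable for a GL_LINES draw:

```java
/** Hypothetical helper: the outline of an axis-aligned box as independent line segments. */
final class BrickEdges {
    /** Returns 12 edges x 2 endpoints x 3 floats for a box with one corner at the origin. */
    static float[] boxEdges(float length, float width, float height) {
        float[][] c = {                       // 8 corners: bottom ring, then top ring
            {0, 0, 0}, {length, 0, 0}, {length, width, 0}, {0, width, 0},
            {0, 0, height}, {length, 0, height}, {length, width, height}, {0, width, height}
        };
        float[] out = new float[12 * 2 * 3];
        int n = 0;
        for (int i = 0; i < 4; i++) {
            int j = (i + 1) % 4;
            int[][] edges = { {i, j}, {i + 4, j + 4}, {i, i + 4} }; // bottom, top, vertical
            for (int[] e : edges) {
                for (int k = 0; k < 3; k++) out[n++] = c[e[0]][k];
                for (int k = 0; k < 3; k++) out[n++] = c[e[1]][k];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(boxEdges(4, 4, 1).length / 6 + " line segments");
    }
}
```

Each edge is emitted once here, whereas outlining the six quads separately touches every edge twice.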
Offline Matzon

JGO Knight


Medals: 19
Projects: 1


I'm gonna wring your pants!


« Reply #22 - Posted 2006-07-10 14:55:20 »

WinXP, P4 3GHz, Ati x300 PCIe, Java 1.5.0_06
IM: 63
DL: 80
no difference between server and client (?)

Offline elias

Senior Member





« Reply #23 - Posted 2006-07-10 15:12:57 »

Suse 10.1, radeon 9700 mobility, ATI drivers 8.26.18, Pentium M 1700 Mhz, java 1.5.0_07 (client and server the same):

IM: 54
DL: 110
DLadv: 180

Latest mustang gives me 66 for IM with -server, the rest are the same.

 - elias

Offline WhiteHexagon

Senior Newbie





« Reply #24 - Posted 2006-07-10 18:56:26 »

Hi Spasi, A big thanks for writing that!

And interesting to see others' results.  Here's what I got on my hardware  (GeForce 6200/AGP/SSE2,  Java 1.5.0_06,  Pentium M760,  Win 2K).  I only tried the client VM; I don't think my machine is quite server spec :)

IM: 74fps
DL: 69fps
PI: 73fps

The CPU was maxed out, but the display was very pretty :)
I have some lighting and textures which might be slowing my stuff down.  Plus, as you can see, the machine spec is quite low.  But obviously I'm doing something wrong in my code to be twice as slow, unless it's the way I'm configuring JOGL.

I shall try to extract my drawing code into a testable unit, which should make it easier to test; currently it is quite heavily dependent on a whole bunch of other code.

Cheers

Peter



whitehexagon.com - Building a new game world, one brick at a time.
Offline crash0veride007

Senior Newbie




ThE MaTriX HaS YoU!


« Reply #25 - Posted 2006-07-10 23:06:23 »

Windows XP x64, Dual 7900 GTX 512 Sli, Dual Opteron 252 @ 2.6Ghz, 8Gb Ram, Nvidia 91.31 Driver, Mustang Beta 2 32-bit VM

Client 32-bit VM
IM: 106 FPS
DL: 260 FPS
PI: 328 FPS

Server 32-bit VM
IM: 108 FPS
DL: 264 FPS
PI: 331 FPS
Offline quintesse

Junior Member




Java games rock!


« Reply #26 - Posted 2006-07-11 11:26:49 »

WinXP, ATI mobility Radeon X700, no idea what processor it has but Windows says a 1.73GHz Intel, 1GB

java version "1.6.0-rc-fastdebug"
Java(TM) SE Runtime Environment (build 1.6.0-rc-fastdebug-b88)
Java HotSpot(TM) Client VM (build 1.6.0-rc-fastdebug-b88-debug, mixed mode)

IM: 83 fps
DL: 139 fps
PI: 171 fps

Server VM hardly makes a difference
Offline joda

Junior Newbie





« Reply #27 - Posted 2006-07-19 21:37:44 »

Found this, I guess you asked for tricks for drawing borders:
http://www.opengl.org/resources/faq/technical/polygonoffset.htm
Offline WhiteHexagon

Senior Newbie





« Reply #28 - Posted 2006-07-31 19:43:42 »

Thanks joda,

I currently use the polygon offset on another part of the engine, but never really understood what it was doing :) It works okay for most cases, but I found a few gcards where I get really nasty results when two polygons intersect, a kind of heavy sawtooth effect, if that makes sense.  Using the stencil buffer technique looks a bit tricky, but maybe it's something I will brave once I learn a bit more about OpenGL.

--

Anyway, regarding the performance issue, I've managed to rip out most of my engine code into a standalone test now, apart from texture loading, where I use some custom file formats.  Does anyone have a simple bit of code to generate a texture, i.e. one that doesn't need to load a texture file?

Thanks

Peter

whitehexagon.com - Building a new game world, one brick at a time.
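
On the question of generating a texture without a file, the pixels can be built in memory. A minimal sketch that fills a direct `ByteBuffer` with an RGB checkerboard; the `CheckerTexture` class and the grey levels are arbitrary choices, and the upload call is shown only as a comment because it needs a live GL context:

```java
import java.nio.ByteBuffer;

/** Hypothetical helper that generates a checkerboard texture in memory. */
final class CheckerTexture {
    /** Builds size x size RGB pixels in squares of cell x cell, alternating white and grey. */
    static ByteBuffer create(int size, int cell) {
        ByteBuffer pixels = ByteBuffer.allocateDirect(size * size * 3);
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                boolean light = ((x / cell) + (y / cell)) % 2 == 0;
                byte v = (byte) (light ? 255 : 96);
                pixels.put(v).put(v).put(v);   // same value for R, G and B
            }
        }
        pixels.flip();
        // With JOGL the buffer could then be uploaded with something like:
        //   gl.glPixelStorei(GL.GL_UNPACK_ALIGNMENT, 1);
        //   gl.glTexImage2D(GL.GL_TEXTURE_2D, 0, GL.GL_RGB, size, size, 0,
        //                   GL.GL_RGB, GL.GL_UNSIGNED_BYTE, pixels);
        return pixels;
    }

    public static void main(String[] args) {
        System.out.println(create(64, 8).remaining() + " bytes of pixel data");
    }
}
```

A power-of-two size such as 64 keeps the texture usable on hardware that does not support other dimensions.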
Offline bahuman

Junior Member





« Reply #29 - Posted 2006-07-31 20:47:31 »

Just give us some code that uses TextureIO.newTexture(new File("test.jpg"), true), and create a dummy jpg file. It shouldn't be that hard to grab a dummy jpg from the net, should it? :D