Java-Gaming.org
  LWJGL 2D rendering optimization.  (Read 1192 times)
Geek_Link (Senior Newbie)
« Posted 2013-11-20 05:42:23 »

I am writing a 2D graphics engine for my game(s), for now using direct rendering with OpenGL. I want to optimize everything, but I have a lot of questions.

- Using VBOs: I know they are a lot faster than direct rendering, but I have also heard that this is not the case for moving objects. I can't think of anything in a game that isn't moving, aside from menus and UI. I have also seen people on this forum talking about using shaders to "move" the textures, but isn't that something not every computer can handle?

- 2D or 3D? I am not a fan of isometric 2D, but I want to make a tiled map with multiple layers of "blocks". Would it be a good idea to draw every tile with a perspective effect (at least 4 shapes per "cube": top, front, left and right sides, which is actually 8 triangles, 2 per shape), adjusted to the tile's location on screen as the map scrolls, or should I use some sort of 3D?

- What would be the best way to change the rendering order of my sprites, since it is 2D?

- Do you have any other optimization ideas? Everyone seems to point towards VBOs when asked for optimization ideas. I am not in trouble in terms of frame rate, but better safe than sorry.

Thanks!

trollwarrior1
« Reply #1 - Posted 2013-11-20 05:52:22 »

You can store all your cube, block or square data in a single VBO and render it at some offset using a shader. Unless your game is like Minecraft, where cubes change, this should be very easy to implement. Just put all the vertex, texture and color data into a single VBO and render it! For moving objects you can use immediate mode if you want.

I guess putting everything into one VBO would be called something like sprite batching.

Dxu1994
« Reply #2 - Posted 2013-11-20 05:59:33 »

For moving geometry, VAs (vertex arrays) are faster. For static geometry, VBOs are faster.

What I do in my game engine is initialize a ShortBuffer with 6 x MaxTextures worth of indices, then I bind this as GL_ELEMENT_ARRAY_BUFFER.
When I need to render a texture, I push its 4 vertices into the VA. I repeat this until I am done or until the VA overflows, then use the index buffer and GL_TRIANGLES to render the sprites.
Also, I use a dynamic TextureAtlas to remove the glBindTexture overhead when rendering different sprites.

Example of how I do it:

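    indexBufferId = GL15.glGenBuffers();
    // LWJGLState is not part of LWJGL; presumably the poster's own wrapper
    // that binds the buffer to GL_ELEMENT_ARRAY_BUFFER.
    LWJGLState.glBindIndexBuffer(indexBufferId);
    // Six indices (two triangles) per sprite, referencing four vertices.
    // VERTEXES_PER_SPRITE is used as the index stride, so it must be 6 here.
    for (int i = 0, j = 0; i < indexBufferCapacity; i += VERTEXES_PER_SPRITE, j += 4) {
        ind[i] = (short) j;
        ind[i + 1] = (short) (j + 2);
        ind[i + 2] = (short) (j + 1);
        ind[i + 3] = (short) j;
        ind[i + 4] = (short) (j + 3);
        ind[i + 5] = (short) (j + 2);
    }

    // Upload the indices once; they never change, hence GL_STATIC_DRAW.
    ib.put(ind, 0, indexBufferCapacity).flip();
    GL15.glBufferData(GL15.GL_ELEMENT_ARRAY_BUFFER, ib, GL15.GL_STATIC_DRAW);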


Then later..

    // Client-side vertex arrays: component count, stride, buffer.
    GL11.glVertexPointer(VERT_SIZE, 0, vb);
    GL11.glColorPointer(COLOR_SIZE, 0, cb);
    GL11.glTexCoordPointer(TEX_SIZE, 0, tb);

    //GL15.glBindBuffer(GL15.GL_ELEMENT_ARRAY_BUFFER, 0);
    //GL11.glDrawElements(GL11.GL_TRIANGLES, ib);
    // Indices are read from the bound index buffer, starting at byte offset 0.
    LWJGLState.glBindIndexBuffer(indexBufferId);
    GL11.glDrawElements(GL11.GL_TRIANGLES, vertIndex, GL11.GL_UNSIGNED_SHORT, 0);
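
The fill-then-flush pattern described in this reply can be sketched in plain Java, without any OpenGL calls, so that it runs anywhere. This is only an illustration under assumptions: the class name SpriteBatchSketch, its fields and the flush counters are hypothetical and not taken from the poster's engine. In a real LWJGL batcher, flush() is where the pointer setup and the glDrawElements call from the snippet above would go, and the buffer would be a direct buffer rather than a heap buffer.

```java
import java.nio.FloatBuffer;

/** Hypothetical CPU-side sprite batcher: collects quads and flushes when full. */
final class SpriteBatchSketch {
    static final int VERTS_PER_SPRITE = 4; // one quad
    static final int FLOATS_PER_VERT = 2;  // x and y only, to keep the sketch short

    private final FloatBuffer verts;
    private final int maxSprites;
    private int spriteCount = 0;

    int flushCount = 0;   // number of simulated draw calls
    int spritesDrawn = 0; // total sprites handed to flush()

    SpriteBatchSketch(int maxSprites) {
        this.maxSprites = maxSprites;
        // Heap buffer for the sketch; LWJGL itself requires direct buffers.
        this.verts = FloatBuffer.allocate(maxSprites * VERTS_PER_SPRITE * FLOATS_PER_VERT);
    }

    /** Appends one axis-aligned quad, flushing first if the buffer is full. */
    void draw(float x, float y, float w, float h) {
        if (spriteCount == maxSprites) {
            flush();
        }
        verts.put(x).put(y);         // corner 0
        verts.put(x + w).put(y);     // corner 1
        verts.put(x + w).put(y + h); // corner 2
        verts.put(x).put(y + h);     // corner 3
        spriteCount++;
    }

    /** Stand-in for the real draw call: counts the batch and resets the buffer. */
    void flush() {
        if (spriteCount == 0) {
            return;
        }
        verts.flip(); // the buffer would now be ready to hand to OpenGL
        flushCount++;
        spritesDrawn += spriteCount;
        verts.clear();
        spriteCount = 0;
    }
}
```

With a capacity of 100 sprites, drawing 250 sprites and then calling flush() once at the end of the frame results in three simulated draw calls instead of 250.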

Geek_Link (Senior Newbie)
« Reply #3 - Posted 2013-11-20 06:14:53 »

Quote from trollwarrior1: "Unless your game is minecraft where cubes change"
I plan on maybe implementing a Minecraft-like mode, not in the game itself but only to mess around with and as an immersive level editor. I want people to be able to create their own stories.

opiop65 (JGO Kernel)
« Reply #4 - Posted 2013-11-20 11:40:07 »

Well, then VBOs are far better. They are great for static geometry! You should create a sprite batcher. At the simplest level, a batcher takes a bunch of geometry, throws it into one buffer and sends it off to the GPU for rendering. It's way more efficient than drawing all the tiles separately, as switching buffers is expensive. For 2D, though, I seriously doubt you'll have many performance problems as long as you only render what's on the screen. Even immediate mode could handle something like this, but I would still recommend VBOs.
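
As a rough illustration of "only render what's on the screen" for a tile map, the range of visible tiles can be computed from the camera rectangle before any geometry is batched. This is a sketch under assumptions: the class VisibleTileRange and all its parameter names are hypothetical, the camera position is the top-left corner of the view in world pixels, and tiles are square.

```java
/** Hypothetical helper: which tile columns and rows intersect the camera view. */
final class VisibleTileRange {
    final int firstCol, lastCol, firstRow, lastRow; // inclusive bounds

    VisibleTileRange(float camX, float camY, float viewW, float viewH,
                     int tileSize, int mapCols, int mapRows) {
        // Convert the view rectangle from pixels to tile indices, then clamp to the map.
        firstCol = clamp((int) Math.floor(camX / tileSize), 0, mapCols - 1);
        firstRow = clamp((int) Math.floor(camY / tileSize), 0, mapRows - 1);
        lastCol = clamp((int) Math.floor((camX + viewW) / tileSize), 0, mapCols - 1);
        lastRow = clamp((int) Math.floor((camY + viewH) / tileSize), 0, mapRows - 1);
    }

    /** Number of tiles that would be submitted to the batcher this frame. */
    int tileCount() {
        return (lastCol - firstCol + 1) * (lastRow - firstRow + 1);
    }

    private static int clamp(int value, int min, int max) {
        return Math.max(min, Math.min(max, value));
    }
}
```

For an 800 x 600 view over a 1000 x 1000 map of 32-pixel tiles, this selects a few hundred tiles out of a million, so the cost per frame depends on the screen size rather than on the map size.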


davedes
« Reply #5 - Posted 2013-11-20 14:55:22 »

If you are doing 2D sprites, it's pretty hard to cache everything in VBOs since it's going to be constantly changing (position, UVs, alpha, etc). In some cases you can use multiple buffers and only update them when dirty, but it's a lot of work for little gain. Instead, I'd recommend making a single sprite batcher that is highly optimized for batch rendering of sprites in a texture atlas.

More info on sprite batching & meshes here:
https://github.com/mattdesl/lwjgl-basics/wiki/Sprite-Batching
https://github.com/mattdesl/lwjgl-basics/wiki/LibGDX-Meshes

VBOs are of course the best option, except on a few mobile devices where vertex arrays may be slightly better. On desktop you can use multiple mapped VBOs for a huge boost in performance:
http://www.java-gaming.org/index.php?topic=28209.0

Once you have a solid sprite batcher, you probably won't need to optimize any further. At this point you'll be able to render hundreds of thousands of sprites per frame -- any more will probably lead to bottlenecks in fill-rate (rather than vertex/draw call bound). But in some cases, like if you can't easily texture pack everything, then you could try another optimization which involves batching multiple textures into the same draw call. See my post here:
http://www.badlogicgames.com/forum/viewtopic.php?p=50933#p50933

Or if you want to take the other approach, and optimize for fill-rate and reduced overdraw, you should look into batching polygons to make up your transparent sprites. Lots more info here:
http://www.youtube.com/watch?v=fHFsQHvzfwo&feature=youtu.be&t=8m3s

Geek_Link (Senior Newbie)
« Reply #6 - Posted 2013-11-25 19:38:51 »

Hello, I've done my homework and reached some conclusions (maybe wrong ones) about rendering methods.

VAOs seem to be the best way to render both moving and static sprites, but not on old PCs, since they are not supported before OpenGL 3.0.

VBOs are great, but only for static sprites, and need to be mixed with VAs for moving sprites (VBOs for static and VAs for moving, or blending the two by moving the arrays from the VBO to the VA?).

Immediate mode: maybe as a last resort for old PCs?

Shaders seem to be a good but complicated way to render.

How can I manage the rendering order?
How can I render the same texture twice without creating another set of vertices in the VBO or VA?

Sometimes in OpenGL, more seems to be faster (e.g. two triangles vs. one quad; vertex buffer vs. vertex buffer + index buffer; rendering everything at once vs. rendering only what you need, such as recreating a VBO only to add one or two sprites).

As you can see, I have a lot of questions. Maybe I don't understand enough, or maybe I want more than I need?

davedes
« Reply #7 - Posted 2013-11-25 20:04:12 »

A vertex array is an old-school way of moving a bunch of data to the GPU. They are deprecated and have been replaced by VBOs. But on some drivers (like mobile Android), vertex arrays can still be used, and might even give a (very slight) boost compared to VBOs for dynamic data.

VAO is not a replacement for VBOs. In modern pipelines you need to use a VAO to set up your VBO state. But in GLES land you might not even be able to use VAOs. Either way, they really won't do much to benefit your VBO-based sprite batcher.

Immediate mode; just forget you ever learned that.

Maybe you are just in over your head, and over-thinking it. If you get a sprite batcher working like this then it will probably be fast enough to render more sprites than your game will ever need. If you can't manage to wrap your head around all this, then use the lwjgl-basics API or just pick up a more newb-friendly framework like LibGDX.

opiop65 (JGO Kernel)
« Reply #8 - Posted 2013-11-25 20:41:17 »

Shaders are complicated at times, but they give you more direct control over what the GPU does, which can reduce the time it takes to render vertices.

To render multiple textures at once you'll need a spritesheet. Think about a single texture: to render the entire texture you pass in coordinates like (0, 0), (1, 0), etc. With a spritesheet, you load one large texture, subdivide it, and use smaller numbers to specify the coordinates. Let's start with something simple. Say you have a 64 x 64 pixel spritesheet divided into a 2 x 2 grid of 32 x 32 textures, and you want the bottom-left one. You'll need to pass in these coordinates:
(0, 0) for bottom left
(0.5, 0) for bottom right
(0.5, 0.5) for top right
(0, 0.5) for top left

As you can see, since texture coordinates range from 0 to 1, you divide 1 by the number of textures per row (or per column) to get the width (or height) of each square texture. So with 3 textures per row, each would have a width and height of 0.33 repeating. Then, for your geometry, specify the correct texture coordinates from the spritesheet, bind the spritesheet, and boom, multiple textures!
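
The coordinates listed above, (0, 0) to (0.5, 0.5), are those of one cell in a 2 x 2 grid. The "divide 1 by the number of textures" rule can be sketched as a small helper. The class AtlasUv and its method are hypothetical names, and the sketch assumes a uniform grid with row 0 at the bottom, matching the bottom-left origin used in the list above.

```java
/** Hypothetical helper: texture coordinates of one cell in a uniform spritesheet grid. */
final class AtlasUv {
    /**
     * Returns {u0, v0, u1, v1} for the cell at (col, row) in a grid of
     * cols x rows equally sized textures. Row 0 is the bottom row.
     */
    static float[] cell(int col, int row, int cols, int rows) {
        float cellWidth = 1f / cols;  // width of one texture in UV space
        float cellHeight = 1f / rows; // height of one texture in UV space
        return new float[] {
            col * cellWidth,        // u0: left edge
            row * cellHeight,       // v0: bottom edge
            (col + 1) * cellWidth,  // u1: right edge
            (row + 1) * cellHeight  // v1: top edge
        };
    }
}
```

For example, AtlasUv.cell(0, 0, 2, 2) gives the left, bottom, right and top edges 0, 0, 0.5 and 0.5, which are the four corner coordinates from the list above.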

As for the question about more geometry being faster, this is partially true. You mentioned that 2 triangles are rendered faster than a single quad. Well, OpenGL (or the actual GPU, I don't know which) breaks geometry down into triangles before rendering. So your two triangles will be faster because the quad needs to be broken down first. Also, to a certain extent, the more geometry you send to the pipeline at once, the better. You see, setting up each rendering call is expensive, and if you only send a few vertices to be rendered, OpenGL is wasting lots of cycles. So you'll need to find the line between too much geometry and too little. This is why sprite batchers are great for rendering: they push the maximum number of vertices to the pipeline at once.

Hope I gave you correct information; I think that's all right!

Geek_Link (Senior Newbie)
« Reply #9 - Posted 2013-11-25 21:04:40 »

I find your answers really helpful, but please forget about sprite sheets/atlases; I already know the concept and how to implement it. I am not talking about rendering different sprites, but the same one multiple times. Is creating a new polygon for each instance of a sprite efficient enough, or is there a better way? Thanks!

opiop65 (JGO Kernel)
« Reply #10 - Posted 2013-11-25 21:20:50 »

Yes, you'll need to create a separate set of vertices for each entity. But you should just glTranslate them around instead of creating a new VBO or whatever each frame; that's very inefficient.

theagentd (JGO Bitwise Duke)
« Reply #11 - Posted 2013-11-27 11:19:20 »

Quote from Geek_Link: "I find your answers really helpful, but please forget about sprite sheets/atlases; I already know the concept and how to implement it. I am not talking about rendering different sprites, but the same one multiple times. Is creating a new polygon for each instance of a sprite efficient enough, or is there a better way? Thanks!"

That depends on what you're doing, or more precisely what your bottleneck is. First you should figure out what your bottleneck is, then optimize that part. After you're done, the bottleneck has hopefully shifted somewhere else, so at that point it's time to start focusing on the new bottleneck instead.

If you're rendering 10 000 small sprites, say 20x20 pixels each, you're actually not pushing your GPU very far at all. 10 000 quads = 20 000 triangles, which is nothing for a modern GPU. For comparison, the latest 3D games generally push a few million triangles per frame, and this number increases a lot if the game uses tessellation. In pixel coverage, you're only covering around 4 million pixels (10 000 x 20 x 20), which again is not really pushing your GPU. A 1080p screen has around 2 million pixels (2 megapixels), and even weak hardware can render at least a few fullscreen quads over such a screen. High-end GPUs can fill as much as 40 GIGApixels per second, or around 650 megapixels per frame at roughly 60 FPS, though low-end hardware has only a fraction of that. There is therefore a risk that, if you have a lot of huge transparent sprites for example, you'll be fill-rate or ROP-limited (think "blending limited"), but this is rarely the case and can usually be avoided pretty easily in the first place.
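
The arithmetic in the paragraph above can be written out as a quick back-of-the-envelope check. The class FillRateEstimate and its method names are hypothetical, and the 60 frames per second used in main is an assumption; the post gives only the per-second and per-frame figures.

```java
/** Hypothetical back-of-the-envelope estimates for sprite rendering load. */
final class FillRateEstimate {
    /** Each sprite is one quad, which is drawn as two triangles. */
    static long triangles(long sprites) {
        return sprites * 2;
    }

    /** Total pixels touched if every sprite is fully visible (overdraw included). */
    static long coveredPixels(long sprites, int widthPx, int heightPx) {
        return sprites * widthPx * heightPx;
    }

    /** Fill budget available for a single frame at the given frame rate. */
    static long pixelsPerFrame(long pixelsPerSecond, int framesPerSecond) {
        return pixelsPerSecond / framesPerSecond;
    }

    public static void main(String[] args) {
        long sprites = 10_000;
        System.out.println("triangles:      " + triangles(sprites));
        System.out.println("covered pixels: " + coveredPixels(sprites, 20, 20));
        System.out.println("1080p pixels:   " + (1920L * 1080L));
        // Assumed frame rate of 60 FPS with the 40 gigapixels/s figure from the post.
        System.out.println("budget/frame:   " + pixelsPerFrame(40_000_000_000L, 60));
    }
}
```

The covered-pixel count is about twice the pixel count of a 1080p screen, while the per-frame budget of a high-end GPU is over a hundred times larger, which is why the post argues that small sprites are rarely fill-rate limited.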

So in essence, you can be pretty sure that for rendering 2D sprites you'll almost always be CPU limited. Why? Because OpenGL has quite a bit of overhead for draw calls. If you tell OpenGL to draw a small sprite using one glDrawArrays() call for each sprite (or worse, using immediate mode) you're going to be massively CPU bottlenecked. In essence, your CPU is spending a LOT more time instructing your GPU what to do than it actually takes for your GPU to perform those commands, so that's what you should focus on optimizing. As many people have suggested, batching helps a lot for solving this. If you are rendering static geometry, just upload it once and render it using a single draw call each frame.

An interesting exception is tile rendering. If you have very small tiles, for example due to zooming out a lot, the sheer number of triangles needed to draw these tiles can be very high. In this case you can use a shader that allows you to render just a single quad over the whole screen. This shader then calculates which tile each pixel is inside and samples that tile's image in a tile map automatically. That pretty much allows you to have any number of tiles visible down to 1x1 pixel tiles (1080p screen = 2 million tiles in that case) while only ever needing one fullscreen quad.
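
The per-pixel calculation such a shader performs can be sketched on the CPU in plain Java. This is not shader code and not theagentd's implementation. The class TileLookupSketch, its parameters and the row-major int[][] map are hypothetical, and the sketch assumes square tiles and a scroll offset given in pixels. A fragment shader would do the same integer math per pixel and then sample the tile's image at the local offset.

```java
/** Hypothetical CPU version of the per-pixel tile lookup a fragment shader would do. */
final class TileLookupSketch {
    /**
     * Returns {tileX, tileY, localX, localY} for a screen pixel:
     * which tile the pixel falls in, and where inside that tile it lies.
     */
    static int[] locate(int pixelX, int pixelY, int scrollX, int scrollY, int tileSize) {
        int worldX = pixelX + scrollX;
        int worldY = pixelY + scrollY;
        // floorDiv/floorMod keep the result consistent for negative world coordinates.
        int tileX = Math.floorDiv(worldX, tileSize);
        int tileY = Math.floorDiv(worldY, tileSize);
        int localX = Math.floorMod(worldX, tileSize);
        int localY = Math.floorMod(worldY, tileSize);
        return new int[] { tileX, tileY, localX, localY };
    }

    /** Looks up the tile id in a row-major map, or -1 if the tile lies outside it. */
    static int tileIdAt(int[][] map, int tileX, int tileY) {
        if (tileY < 0 || tileY >= map.length || tileX < 0 || tileX >= map[tileY].length) {
            return -1;
        }
        return map[tileY][tileX];
    }
}
```

Because the lookup runs once per pixel rather than once per tile, the cost stays tied to the screen resolution no matter how far the view is zoomed out.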
