Thanks for your quick responses, I am going to do some frame rate tests comparing checking visibility first against not checking and just drawing everything...
You two are talking about culling, maybe this is more in response to jonjava, but in my case I have no control over culling. I am drawing to a canvas which, to my understanding, is simply a larger bitmap that I draw my sprites (little bitmaps) onto, and it does not use any OpenGL... Where OpenGL has its own thread for rendering, when drawing to a canvas the main thread does all the game logic as well as the drawing... In future development I plan to move to OpenGL, but I am relatively new to the field and decided to start with something relatively easy...
A canvas is basically a large image, but Java2D can indeed use OpenGL to increase performance. OpenGL doesn't "have" its own thread though. OpenGL is only usable from one thread (if you want good performance, that is). If you want to put that in a separate thread, fine, but that complicates a lot of things, especially loading textures and similar stuff. As you can't call any OpenGL commands from the main thread, most content loading has to be queued and processed in the OpenGL thread. I may be making this sound slightly more complicated than it is, but my point is that people who make "small" OpenGL programs generally don't use multiple threads. xd
OpenGL needs culling too though. OpenGL renders things using vertices. If you want to draw a quad (rectangle), you need to send 4 vertices (coordinates) to OpenGL. OpenGL does indeed not "draw" things that are outside the screen, but it does need to process the individual vertices to determine if they are inside or outside the screen. There is also the cost of sending the coordinates through the PCI-E bus to the graphics card (insanely fast, but you can still overwhelm it xD). For example, if you're going to do frustum culling in a 3D game (not drawing things that are outside the camera's field of view), you'd want to quickly check if whole objects are outside the view frustum using a bounding sphere (really fast), and then not draw the entire object consisting of many (possibly tens of thousands of) triangles. For just a few CPU cycles you can save the graphics card a lot of work. The culling can even be put in its own thread that simply marks each object as visible or invisible, but this is rarely needed unless you have very many small objects. In such a case it's better to use a hierarchical culling approach instead, i.e. checking larger areas to quickly rule out many objects and then progressively checking smaller and smaller areas/objects.
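To make the bounding sphere check a bit more concrete, here is a minimal sketch of the idea. The class and method names (SphereCulling, Plane, isSphereVisible) are made up for illustration and don't come from any engine. It assumes each frustum plane is stored as a*x + b*y + c*z + d = 0 with a normalized normal pointing into the frustum.

```java
// Hypothetical helper, not from any particular engine or library.
final class SphereCulling {

    /** Plane a*x + b*y + c*z + d = 0; (a, b, c) is assumed normalized and pointing inward. */
    static final class Plane {
        final float a, b, c, d;

        Plane(float a, float b, float c, float d) {
            this.a = a;
            this.b = b;
            this.c = c;
            this.d = d;
        }

        /** Positive on the inner side of the plane, negative on the outer side. */
        float signedDistance(float x, float y, float z) {
            return a * x + b * y + c * z + d;
        }
    }

    /**
     * Returns false only if the sphere lies completely behind at least one plane.
     * Spheres that merely touch or straddle a plane count as visible.
     */
    static boolean isSphereVisible(Plane[] frustum, float cx, float cy, float cz, float radius) {
        for (Plane p : frustum) {
            if (p.signedDistance(cx, cy, cz) < -radius) {
                return false; // entirely outside this plane, skip the whole object
            }
        }
        return true;
    }
}
```

That is one multiply-add per plane per object, which is the "few CPU cycles" part. The test is conservative: it can report a sphere near a frustum corner as visible when it is not, but it never culls something that should be drawn.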
Like I said, culling in a 2D game has the same goal, but it has to be done differently.
When using the graphics card there are basically two things that can be the bottleneck: you can be fill rate bound or vertex bound. Fill rate is basically how fast the graphics card can fill the pixels covered by a triangle. The ratio between the number of triangles and the total area of those triangles therefore matters: too many small triangles and you'll be vertex bound, while just a few very large triangles will make you fill rate bound. OpenGL also has support for vertex shaders and fragment (pixel) shaders, which are small programs or functions you write that are run on each vertex or pixel. These obviously make vertices and pixels have different costs and complicate things further. Now things aren't as bad as they seem. Newer (i.e. less than 10 years old) graphics cards use a unified architecture, meaning that the graphics card contains hundreds (NVidia) or thousands (AMD) of small processors that can each be used to process either vertices or fragments/pixels. This allows the GPU to balance, up to a point, between processing many vertices and many pixels. Older cards had separate processors for vertices and fragments, and obviously suffered a lot in extreme cases.
For a 2D game, things aren't that complicated though. Usually you use a number of sprites (images) that you simply draw on the screen using 4 vertices each. Compared to a 3D game, which wants triangles as small as possible to make things look round and smooth, these 2D images are relatively large. A 2D game is therefore fill rate bound in 99% of all cases. The sprites that are outside the screen are obviously not filled and don't cost much at all unless there are lots of them. Checking each and every sprite to see if it is inside the screen can be a very bad idea, as you might quickly get CPU bound instead, being unable to feed data to the graphics card fast enough.
To make you grasp the performance of a GPU, I used a small test program to test the vertex processing and fragment processing capabilities of my laptop's GPU, a GTX 460m. The test is very basic. I just checked how much I could draw while maintaining 60 FPS:
Results: I can draw approximately 9 584 640 vertices (3 194 880 triangles, 2 396 160 quads) at 60 FPS. Yes, you read that right. I can draw 9.5 million vertices as long as the triangles they form are small enough or outside the screen.
Fill performance is a lot harder to test. I created a 1024x1024 window and checked how many times I could draw a quad covering the whole screen. Results: 76 times. That's 79 691 776 pixels filled per frame at 60 FPS. HOWEVER!!! This number is actually completely irrelevant. Why? Because I'm not drawing a texture to the area. I'm just filling it with a single color. Texture performance is actually ANOTHER aspect of graphics cards! Using a texture to cover the triangle, performance drops to 32 overdraws, meaning 33 554 432 textured pixels filled. The problem is that this number doesn't mean much either. Texture performance actually depends on a lot of things, including how big the texture is and whether the samples are distributed evenly on the screen. Basically: It's complicated.
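For anyone who wants to redo the arithmetic, here is a tiny sketch. It assumes the 76 and 32 above are full-screen quads drawn per frame (that is my reading of the test), and the class and method names are just made up for the example.

```java
// Hypothetical helper that only reproduces the arithmetic from the post.
final class FillRateMath {

    /** Pixels written in one frame when the whole window is covered 'fullScreenQuads' times. */
    static long pixelsPerFrame(int width, int height, int fullScreenQuads) {
        // Cast first so the multiplication is done in 64 bits.
        return (long) width * height * fullScreenQuads;
    }

    /** Pixels written per second at a fixed frame rate. */
    static long pixelsPerSecond(long pixelsPerFrame, int framesPerSecond) {
        return pixelsPerFrame * framesPerSecond;
    }
}
```

pixelsPerFrame(1024, 1024, 76) gives 79 691 776 and pixelsPerFrame(1024, 1024, 32) gives 33 554 432, matching the figures above. Multiplying by 60 frames gives roughly 4.8 billion flat-colored and 2.0 billion textured pixels per second, with all the caveats already mentioned about what those numbers are worth.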
Today's graphics cards aren't made for just simply drawing textures to the screen. Sure, they are really good at it, but the real theoretical performance can only be achieved with more computationally expensive vertices and fragments. They are made to handle texturing of surfaces, but also lighting and other special effects. What I want to say is that if you double the cost of a fragment shader program for a new effect, it's not going to run exactly half as fast. My GPU stays at 60-65 degrees during these tests. When playing "real" games (Modern Warfare 2, Supreme Commander 2, Team Fortress) it goes to 82-85 degrees. I'm not fully utilizing my GPU at all with these tests.
The biggest problem with 2D games is definitely CPU performance. A single sprite is cheap for a GPU to render, so the CPU has to send vertices quickly to keep up. A big problem is texture switches. Changing the texture between each sprite drawn will reduce your performance by a very large factor, mostly because of CPU limitations. My test program was simply one OpenGL draw call for each layer (glDrawArrays()). Assembling buffers of sprite vertex data and texture coordinates, binding textures and issuing a draw call for each single sprite is going to be veeeeeery CPU intensive. Screen culling is therefore a good idea if the culling is less CPU intensive than actually drawing it. All in all, things aren't just vertices and fragments anymore.
There are many good methods to reduce the CPU cost in OpenGL. For example, you can batch up the coordinates of multiple sprites using the same image/texture and draw them all using a single draw call. You can also keep multiple sprites in the same texture and draw all those at the same time. Actually, one of the most expensive parts of drawing sprites is filling a ByteBuffer or FloatBuffer with the vertex data, but clever people like Riven have come up with nice solutions for that.
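A rough sketch of the batching idea, just to show the shape of it. Everything here (SpriteBatcher, Sprite, buildBatches) is a hypothetical name, it only writes positions (no texture coordinates), and it stops before any actual OpenGL calls. It also uses a plain heap FloatBuffer to stay self-contained; OpenGL bindings for Java typically want a direct buffer in native byte order instead.

```java
import java.nio.FloatBuffer;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: group sprites by texture so each group needs one bind and one draw call.
final class SpriteBatcher {

    static final int FLOATS_PER_SPRITE = 4 * 2; // 4 corners, x and y for each

    static final class Sprite {
        final int textureId;
        final float x, y, width, height;

        Sprite(int textureId, float x, float y, float width, float height) {
            this.textureId = textureId;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** Returns one vertex buffer per texture id, each ready to be read from position 0. */
    static Map<Integer, FloatBuffer> buildBatches(List<Sprite> sprites) {
        // Pass 1: bucket the sprites by the texture they use.
        Map<Integer, List<Sprite>> byTexture = new LinkedHashMap<>();
        for (Sprite s : sprites) {
            byTexture.computeIfAbsent(s.textureId, k -> new ArrayList<>()).add(s);
        }

        // Pass 2: write the quad corners of every sprite in a bucket into one buffer.
        Map<Integer, FloatBuffer> batches = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<Sprite>> entry : byTexture.entrySet()) {
            List<Sprite> group = entry.getValue();
            // Assumption: a heap buffer for the sketch; real GL code usually needs a direct buffer.
            FloatBuffer buffer = FloatBuffer.allocate(group.size() * FLOATS_PER_SPRITE);
            for (Sprite s : group) {
                buffer.put(s.x).put(s.y);
                buffer.put(s.x + s.width).put(s.y);
                buffer.put(s.x + s.width).put(s.y + s.height);
                buffer.put(s.x).put(s.y + s.height);
            }
            buffer.flip(); // switch from writing to reading
            batches.put(entry.getKey(), buffer);
        }
        return batches;
    }
}
```

With this layout you would bind each texture once and issue one draw call per map entry, instead of one bind and one draw call per sprite. Allocating new buffers every frame as done here is wasteful; reusing them between frames would be the obvious next step.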
Java2D is nowhere near as fast as raw OpenGL, even when it is accelerated by OpenGL. We don't know if culling is done for things drawn outside of the screen, regardless of whether it's rendered in software or with OpenGL. I wouldn't trust Java2D to do a very good job with this anyway. For things like map rendering though, calculating what to draw is easy, because you don't have to test every tile in the game.
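For a tile map, that calculation could look something like the sketch below. It assumes square tiles on a regular grid starting at world position (0, 0), with the camera position being the top-left corner of the view in pixels. TileRange and its fields are hypothetical names.

```java
// Hypothetical sketch: compute which tiles overlap the view instead of testing each tile.
final class TileRange {

    final int firstCol, lastCol, firstRow, lastRow; // all inclusive

    TileRange(int firstCol, int lastCol, int firstRow, int lastRow) {
        this.firstCol = firstCol;
        this.lastCol = lastCol;
        this.firstRow = firstRow;
        this.lastRow = lastRow;
    }

    /**
     * cameraX/cameraY: top-left corner of the view in world pixels.
     * If the view misses the map entirely, last < first and a loop over the range draws nothing.
     */
    static TileRange visible(int cameraX, int cameraY, int viewWidth, int viewHeight,
                             int tileSize, int mapCols, int mapRows) {
        // floorDiv rounds toward negative infinity, so negative camera positions work too.
        int firstCol = Math.max(0, Math.floorDiv(cameraX, tileSize));
        int firstRow = Math.max(0, Math.floorDiv(cameraY, tileSize));
        int lastCol = Math.min(mapCols - 1, Math.floorDiv(cameraX + viewWidth - 1, tileSize));
        int lastRow = Math.min(mapRows - 1, Math.floorDiv(cameraY + viewHeight - 1, tileSize));
        return new TileRange(firstCol, lastCol, firstRow, lastRow);
    }
}
```

You then loop over rows firstRow..lastRow and columns firstCol..lastCol and draw only those tiles, so the cost depends on the size of the view and not on the size of the map. This works the same whether you draw to a canvas or with OpenGL.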
Wow. I'm awesome at rambling on completely off-topic stuff. Crap. Sorry!!! >_<
EDIT: Some line breaks to make it more readable...