Java-Gaming.org    
  Game Inefficiencies  (Read 1274 times)
Offline Willchill
« Posted 2014-07-02 23:48:59 »

Hello,

Recently I've been developing a game and I've noticed it's using a ridiculous amount of system resources (30% CPU). What are the steps I can take to find these inefficiencies and resolve them?

Thanks.
Offline Rayvolution



« Reply #1 - Posted 2014-07-03 00:00:06 »

... by coding better! :D

There's no single magic answer; you'll have to give us code examples of your most resource-intensive code, and we can offer advice on refactoring/optimizing it. ;)

A more proper answer is probably "Use a profiler tool to find bottlenecks in your code" though.
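
As a stopgap before reaching for a real profiler, one rough sketch is to time sections of the game loop by hand with System.nanoTime(). SectionTimer and the section names are made up for illustration; a profiler such as JProfiler will give far more detail than this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: accumulates wall-clock time per named section.
final class SectionTimer {
    private final Map<String, Long> totals = new LinkedHashMap<>();

    // Runs the task and adds its elapsed nanoseconds to the named section.
    void time(String section, Runnable task) {
        long start = System.nanoTime();
        try {
            task.run();
        } finally {
            totals.merge(section, System.nanoTime() - start, Long::sum);
        }
    }

    // Total nanoseconds recorded for a section, or 0 if it was never timed.
    long totalNanos(String section) {
        return totals.getOrDefault(section, 0L);
    }

    // Prints all sections, most expensive first.
    void report() {
        totals.entrySet().stream()
              .sorted((a, b) -> Long.compare(b.getValue(), a.getValue()))
              .forEach(e -> System.out.printf("%-10s %8.3f ms%n",
                                              e.getKey(), e.getValue() / 1e6));
    }
}
```

In a game loop you would wrap the update and render steps in timer.time("update", ...) and timer.time("render", ...), then call report() every few seconds to see which section dominates.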

- Raymond "Rayvolution" Doerr.
Retro-Pixel Castles - Survival Sim/Builder/Roguelike!
LIVE-STREAMING DEVELOPMENT: http://www.twitch.tv/SG_Rayvolution
Offline Willchill
« Reply #2 - Posted 2014-07-03 01:04:55 »

The most CPU-intensive method is org.lwjgl.opengl.WindowsContextImplementation.nSwapBuffers. Next is my rendering method, specifically the one I use to render all of the blocks in the level; however, nSwapBuffers() uses 4 times as much processing power as the rendering method.

The method I call to render the tiles is on GitHub: https://github.com/WillchillDev/Game/blob/master/Game/src/me/willchill/game/level/Level.java

A screenshot of the stuff JProfiler is showing:
Offline BurntPizza
« Reply #3 - Posted 2014-07-03 01:32:31 »

I see in Block.render() that you are calling glBegin and glEnd for each block rendered. Not only is that inefficient (you could call them once for all blocks sharing a texture, IIRC), it is also the old, deprecated OpenGL pipeline. That's probably why so much time is showing up in nSwapBuffers.

The real "solution" besides batching your block renders is to upgrade to VBOs and such.
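
To make the batching idea a bit more concrete, here is a rough sketch of the CPU-side half of a VBO approach: pack every block's quad into one java.nio.FloatBuffer so the whole level can be handed to OpenGL in one go. The {x, y, size} block layout and the QuadBatch name are assumptions for illustration, not taken from the linked repository, and the actual upload and draw calls are left out.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper that builds vertex data for many quads at once.
final class QuadBatch {
    static final int FLOATS_PER_QUAD = 4 * 2; // 4 corners, x and y each

    // Packs one axis-aligned quad per block into a single direct buffer.
    // blocks[i] = {x, y, size}; this layout is an assumption.
    static FloatBuffer build(float[][] blocks) {
        FloatBuffer buf = ByteBuffer
                .allocateDirect(blocks.length * FLOATS_PER_QUAD * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        for (float[] b : blocks) {
            float x = b[0], y = b[1], s = b[2];
            buf.put(x).put(y);         // top-left
            buf.put(x + s).put(y);     // top-right
            buf.put(x + s).put(y + s); // bottom-right
            buf.put(x).put(y + s);     // bottom-left
        }
        buf.flip(); // switch from writing to reading
        return buf;
    }
}
```

With LWJGL the resulting buffer would then be uploaded once (for example with GL15.glBufferData) and drawn with a single draw call, instead of one glBegin/glEnd pair per block.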
Offline Willchill
« Reply #4 - Posted 2014-07-03 01:53:34 »

Quote from BurntPizza:
The real "solution" besides batching your block renders is to upgrade to VBOs and such.

Alright, thanks for the help. Are there any resources (books, videos, websites) you would recommend for learning modern OpenGL?
Offline EgonOlsen
« Reply #5 - Posted 2014-07-03 06:35:37 »

As long as your code doesn't pause while waiting for a vertical sync or a Thread.sleep(), you'll always end up with at least one CPU core used to ~100%. For that, it doesn't matter how "efficient" the code is. If it's more efficient, it might output higher frame rates but that doesn't change the cpu load.
If you are walking for one hour, you are walking for one hour. It doesn't matter if you are walking pretty fast or crawling on your knees. The distance after one hour will differ, but the actual load (your body used to 100% for moving around) doesn't differ.
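
A minimal sketch of what "pausing" can look like in plain Java, assuming a fixed target frame rate. FrameLimiter is a made-up name; LWJGL 2 already ships Display.sync(fps) and Display.setVSyncEnabled(true) for the same job, so treat this only as an illustration of the idea.

```java
// Hypothetical frame limiter: sleeps away whatever is left of each frame.
final class FrameLimiter {
    private final long frameNanos;
    private long nextFrame;

    FrameLimiter(int targetFps) {
        this.frameNanos = 1_000_000_000L / targetFps;
        this.nextFrame = System.nanoTime() + frameNanos;
    }

    // Call once per frame, after rendering. Sleeps until the next frame is due.
    void sync() {
        long remaining = nextFrame - System.nanoTime();
        if (remaining > 0) {
            try {
                Thread.sleep(remaining / 1_000_000L, (int) (remaining % 1_000_000L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
            }
        }
        // If we fell behind by more than a frame, don't try to catch up.
        nextFrame = Math.max(nextFrame + frameNanos, System.nanoTime());
    }
}
```

While the thread sleeps, the core is free for other work, which is what actually brings the CPU percentage down. Making the rendering faster only lengthens the sleep.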

Offline ra4king



« Reply #6 - Posted 2014-07-03 07:33:25 »

A good resource for learning modern OpenGL is the Arcsynthesis tutorials.
The C++ code has been ported to LWJGL: https://www.github.com/ra4king/LWJGL-OpenGL-Tutorials/

Offline Rayvolution



« Reply #7 - Posted 2014-07-03 09:26:51 »

Ah, I think I see what's probably wrong. You're using Slick, and you're drawing your images with draw instead of drawEmbedded. Slick can be laggy when you use draw to render a large number of images from the same texture. Any time you draw a collection of images off the same sprite sheet, or the same image multiple times (basically, any time you're using a single texture), you want to use drawEmbedded.

For example, say you are rendering 4 things on screen at once in a 2x2 grid, all from the same texture. You're probably doing something like this:

// Draws a 2x2 grid; every draw() call issues its own glBegin/glEnd pair.
for (int x = 0; x < 2; x++) {
   for (int y = 0; y < 2; y++) {
      myImage.draw(xCoord * x, yCoord * y);
   }
}


When you do that, you're basically calling glBegin and glEnd over and over, like this:

glBegin()
render image 1
glEnd()
glBegin()
render image 2
glEnd()
glBegin()
render image 3
glEnd()
glBegin()
render image 4
glEnd()

What you want to do is find any loops, or anywhere else in your program, where you're drawing from a single texture (this can even be an entire sprite sheet), and use drawEmbedded, like so:
// One startUse()/endUse() pair wraps the whole 2x2 batch.
myImage.startUse();
for (int x = 0; x < 2; x++) {
   for (int y = 0; y < 2; y++) {
      myImage.drawEmbedded(xCoord * x, yCoord * y);
   }
}
myImage.endUse();


What this translates to:
glBegin()
render image 1
render image 2
render image 3
render image 4
glEnd()

It'll give you a huge performance increase, assuming you have large collections of images on the same sprite sheet (like a sprite sheet for a tilemap, for example).
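
To see the difference in numbers without pulling in Slick, here is a toy model that only counts begin/end pairs. FakeImage is a made-up stand-in, not Slick's Image class; it mirrors just the draw / startUse / drawEmbedded / endUse call pattern described above.

```java
// Hypothetical stand-in for an image that counts glBegin/glEnd pairs.
final class FakeImage {
    int beginEndPairs = 0; // pairs that would have been issued
    int quads = 0;         // images actually drawn
    private boolean inUse = false;

    // Like draw(): every call pays for its own begin/end pair.
    void draw(float x, float y) {
        beginEndPairs++;
        quads++;
    }

    // Opens a batch: one begin/end pair covers everything until endUse().
    void startUse() {
        inUse = true;
        beginEndPairs++;
    }

    // Like drawEmbedded(): only valid between startUse() and endUse().
    void drawEmbedded(float x, float y) {
        if (!inUse) throw new IllegalStateException("call startUse() first");
        quads++;
    }

    void endUse() {
        inUse = false;
    }
}

final class BatchDemo {
    public static void main(String[] args) {
        FakeImage slow = new FakeImage();
        FakeImage fast = new FakeImage();

        fast.startUse();
        for (int x = 0; x < 2; x++) {
            for (int y = 0; y < 2; y++) {
                slow.draw(x * 32, y * 32);
                fast.drawEmbedded(x * 32, y * 32);
            }
        }
        fast.endUse();

        // prints "4 vs 1"
        System.out.println(slow.beginEndPairs + " vs " + fast.beginEndPairs);
    }
}
```

For a 2x2 grid the saving is tiny, but for a tile map with thousands of tiles the unbatched count grows with every tile while the batched count stays at one per texture.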

Also, if you aren't using sprite sheets in areas that you can, I highly recommend converting to sprite sheets instead of individual files.


Online theagentd
« Reply #8 - Posted 2014-07-03 13:41:46 »

Quote from EgonOlsen:
As long as your code doesn't pause while waiting for a vertical sync or a Thread.sleep(), you'll always end up with at least one CPU core used to ~100%. For that, it doesn't matter how "efficient" the code is. If it's more efficient, it might output higher frame rates but that doesn't change the cpu load.
If you are walking for one hour, you are walking for one hour. It doesn't matter if you are walking pretty fast or crawling on your knees. The distance after one hour will differ, but the actual load (your body used to 100% for moving around) doesn't differ.

To clarify on this...

Your GPU is currently the limit here. When you call Display.update() the driver makes sure that the GPU hasn't fallen too far behind. If it has, then the driver forces the CPU to wait until the GPU has caught up. Most drivers seem to implement this with a busy loop that uses 100% on one CPU core.

Myomyomyo.
Offline EgonOlsen
« Reply #9 - Posted 2014-07-03 15:11:34 »

Quote from theagentd:
Your GPU is currently the limit here. When you call Display.update() the driver makes sure that the GPU hasn't fallen too far behind. If it has, then the driver forces the CPU to wait until the GPU has caught up. Most drivers seem to implement this with a busy loop that uses 100% on one CPU core.

Mine (GTX 680) even spawns an additional thread so that 2 cores are 100% busy even if I limit the fps to 60. However, what I actually wanted to express is that looking at the CPU load while the game is running tells you nothing about the efficiency of the code. Or in other words: having one core fully loaded isn't necessarily a sign of bad coding. Cores are there to be used.

Online theagentd
« Reply #10 - Posted 2014-07-03 15:56:53 »

Quote from theagentd:
Your GPU is currently the limit here. When you call Display.update() the driver makes sure that the GPU hasn't fallen too far behind. If it has, then the driver forces the CPU to wait until the GPU has caught up. Most drivers seem to implement this with a busy loop that uses 100% on one CPU core.
Quote from EgonOlsen:
Mine (GTX 680) even spawns an additional thread so that 2 cores are 100% busy even if I limit the fps to 60. However, what I actually wanted to express is that looking at the CPU load while the game is running tells you nothing about the efficiency of the code. Or in other words: having one core fully loaded isn't necessarily a sign of bad coding. Cores are there to be used.

That's a feature of the Nvidia driver, not the GPU. Intel also has this feature, and I believe AMD does as well. The driver basically appends all OpenGL commands to a queue that a separate driver thread reads from and executes. This makes most OpenGL commands nearly free for the game's thread and gives you some extra CPU time to play with. The problem is mapping buffers: every time you call glMapBuffer() or any of its variations (regardless of whether you use GL_MAP_UNSYNCHRONIZED_BIT), the game's thread has to wait for the driver's thread to finish, so most of the benefit of the extra thread is lost. This is why the new persistently mapped buffers are so awesome: you can map a buffer once and keep it mapped forever, so you never have to synchronize with the driver's thread.
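
A toy model of that arrangement in plain Java, with no OpenGL involved: the "game" thread appends commands to a queue, a "driver" thread drains it, and sync() (standing in for something like mapping a buffer) makes the game thread wait. All names here are invented for illustration, and real drivers are far more involved than this.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical model of a driver that runs queued commands on its own thread.
final class CommandQueueModel {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread driver;
    private boolean running = true; // only touched on the driver thread

    CommandQueueModel() {
        driver = new Thread(() -> {
            try {
                while (running) {
                    queue.take().run(); // blocks while the queue is empty
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "driver-thread");
        driver.setDaemon(true);
        driver.start();
    }

    // Cheap for the caller: appends to the queue and returns immediately.
    void submit(Runnable command) {
        queue.add(command);
    }

    // Expensive for the caller: blocks until everything queued so far has run.
    void sync() {
        CountDownLatch done = new CountDownLatch(1);
        queue.add(done::countDown);
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Queues a final command that stops the driver loop, then waits for it.
    void shutdown() {
        queue.add(() -> running = false);
        try {
            driver.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In this model submit() is the "free" path and sync() is the stall. Persistently mapped buffers amount to never having to call sync() for buffer access.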

Offline ags1



« Reply #11 - Posted 2014-07-03 21:38:34 »

Quote from theagentd:
Most drivers seem to implement this with a busy loop that uses 100% on one CPU core.

That is crazy! I'm sure there is a good reason for that, I'd like to hear it...

Online theagentd
« Reply #12 - Posted 2014-07-03 22:45:05 »

Precision? It doesn't really matter.

Offline Rayvolution



« Reply #13 - Posted 2014-07-03 23:58:28 »

Quote from theagentd:
Most drivers seem to implement this with a busy loop that uses 100% on one CPU core.

Quote from ags1:
That is crazy! I'm sure there is a good reason for that, I'd like to hear it...

I think (don't quote me on it) CPUs and GPUs last longer when they're forced to always run at 100%. Something about transistor load. I don't know the details, or whether I'm even right; I just remember reading this somewhere like a decade ago.

Windows does this as well: if you look at Task Manager on older versions of Windows, there is the "System Idle Process" that's always maxed out at whatever percentage of the processor is currently not being used. Windows 7 (and possibly Vista) don't show it anymore, though.

Online theagentd
« Reply #14 - Posted 2014-07-04 00:32:02 »

Quote from Rayvolution:
I think (don't quote me on it) CPUs and GPUs last longer when they're forced to always run at 100%. Something about transistor load. I don't know the details, or whether I'm even right; I just remember reading this somewhere like a decade ago.
Windows does this as well: if you look at Task Manager on older versions of Windows, there is the "System Idle Process" that's always maxed out at whatever percentage of the processor is currently not being used. Windows 7 (and possibly Vista) don't show it anymore, though.

I find that hard to believe. If it were true, you'd be wasting a shitload of money and/or battery life on that "idle process". The System Idle Process is simply there to show you how much of the time the CPU idles (and it's still there in Windows 7).

Offline ra4king



« Reply #15 - Posted 2014-07-04 01:25:40 »

The System Idle Process is there to keep the CPU idle when the scheduler finds no threads ready to execute. That's why it's always shown as the percentage not being used, as there must always be a thread running on a CPU at all times. More information on Wikipedia.

Online theagentd
« Reply #16 - Posted 2014-07-04 02:51:39 »

Quote from ra4king:
The System Idle Process is there to keep the CPU idle when the scheduler finds no threads ready to execute. That's why it's always shown as the percentage not being used, as there must always be a thread running on a CPU at all times. More information on Wikipedia.

From that link:

Quote
Because of the idle process's function, its CPU time measurement (visible through Windows Task Manager) may make it appear to users that the idle process is monopolizing the CPU. However, the idle process does not use up computer resources (even when stated to be running at a high percent), but is actually a simple measure of how much CPU time is free to be utilized. If no ordinary thread is able to run on a free CPU, only then does the scheduler select that CPU's System Idle Process thread for execution. The idle process, in other words, is merely acting as a sort of placeholder during "free time".

In Windows 2000 and later the threads in the System Idle Process are also used to implement CPU power saving. The exact power saving scheme depends on the operating system version and on the hardware and firmware capabilities of the system in question. For instance, on x86 processors under Windows 2000, the idle thread will run a loop of halt instructions, which causes the CPU to turn off many internal components until an interrupt request arrives. Later versions of Windows implement more complex CPU power saving methods. On these systems the idle thread will call routines in the Hardware Abstraction Layer to reduce CPU clock speed or to implement other power-saving mechanisms.
You're right that it is indeed a real thread (which I didn't know), but it's not exactly a normal thread. My main point was that neither the CPU nor the GPU is unnecessarily burning energy because it's supposed to be good for them. CPUs and GPUs have massive power-saving functions so they don't have to run at 100% load all the time. These include shutting down unused parts of the processor, or even complete cores, and lowering the clock speed to a fraction of the maximum. My CPU idles at room temperature and my GPUs at 35 degrees. My CPU can drop down to 800 MHz instead of running at 3.9 GHz all the time. My GPUs' cores drop down to 135 MHz instead of 1.2 GHz, and their memory to 162 MHz from 1.75 GHz. Hardware makers are doing everything they can to decrease power usage and heat generation to get better battery life and smaller devices.

Offline Roquen
« Reply #17 - Posted 2014-07-04 11:05:13 »

Too lazy to find a good reference, but here: http://siyobik.info.gf/main/reference/instruction/PAUSE.  Just because the CPU is in theory running a tight loop doesn't mean all of the units are running.
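
For a Java-side view of the same idea: since Java 9 there is Thread.onSpinWait(), a hint that the caller is in a busy-wait loop, which HotSpot is generally understood to turn into PAUSE on x86. A minimal sketch, with made-up class and method names:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo of a polite busy-wait loop.
final class SpinWaitDemo {
    // Spins until the flag is set and returns how many iterations it took.
    static long spinUntil(AtomicBoolean flag) {
        long spins = 0;
        while (!flag.get()) {
            Thread.onSpinWait(); // hint only; it never blocks or yields
            spins++;
        }
        return spins;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean ready = new AtomicBoolean(false);
        Thread setter = new Thread(() -> {
            try {
                Thread.sleep(10); // pretend the "GPU" needs 10 ms to catch up
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            ready.set(true);
        });
        setter.start();
        long spins = spinUntil(ready);
        setter.join();
        System.out.println("spun " + spins + " times before the flag flipped");
    }
}
```

The core still shows up as 100% busy in Task Manager while this loop runs, but the hint lets the CPU ease off internally, which is the point of the post above.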