  Garbage collector tuning  (Read 6310 times)
Offline Gudradain
« Reply #30 - Posted 2012-07-31 13:45:25 »

Quote
"text"+n means 4 allocations right there (StringBuilder & char[], String & char[])

It's nice to realize that that code is so slow :) When you remove it the speed of the benchmark is increased by 50 times :) Also, the GC doesn't have to collect anything anymore. But object pooling is still slower.
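
To make the allocation count in the quote concrete, here is a rough sketch of what `"text"+n` expands to on the javac versions of that era (before Java 9 moved string concatenation to invokedynamic). The class and method names are made up for illustration, and the exact allocation count depends on the JDK's String and StringBuilder implementation.

```java
final class ConcatSketch {
    // What the programmer writes.
    static String concise(int n) {
        return "text" + n;
    }

    // Roughly what older javac versions generate for the line above.
    static String expanded(int n) {
        // Allocations 1 and 2: the builder object and its backing char[].
        StringBuilder sb = new StringBuilder();
        sb.append("text");
        sb.append(n);
        // Allocations 3 and 4: the String object and its own copy of the characters.
        return sb.toString();
    }

    public static void main(String[] args) {
        // Both forms produce the same text; only the visible allocation work differs.
        System.out.println(concise(7).equals(expanded(7)));
    }
}
```

Doing this once per loop iteration in a benchmark means the string garbage can easily dominate whatever the benchmark was meant to measure.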
Online nsigma
« Reply #31 - Posted 2012-07-31 13:45:58 »

Quote
No matter how you slice it...GC must walk memory (not free) and usually random memory (even less free) and when compacting must move memory (not free).

An object pool is not free either!  Which value of 'not free' is better comes down largely to circumstances and an evaluation of need.

With apologies for derailing a thread on the merits of bad benchmarks!  :P

Praxis LIVE - open-source intermedia toolkit and live interactive visual editor
Digital Prisoners - interactive spaces and projections
Online Roquen
« Reply #32 - Posted 2012-07-31 13:52:19 »

Yes indeed.  However I'm not the one claiming anything is free.  The only plus point I gave to pooling is (if properly set-up) it allows data-flow optimizations.  In general which is 'better' is:  it depends.
Offline princec

JGO Kernel


Medals: 404
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #33 - Posted 2012-07-31 14:21:15 »

Quote
"text"+n means 4 allocations right there (StringBuilder & char[], String & char[])

It's nice to realize that that code is so slow :) When you remove it the speed of the benchmark is increased by 50 times :) Also, the GC doesn't have to collect anything anymore. But object pooling is still slower.

I find it hard to believe that you could write an object pool that is slower than doing a new.
Also, the GC will eventually have to collect when you're doing new, and it is the unpredictable nature of "eventually" and the duration of said collection that pooling solves.

Cas :)
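
For readers who have not written one, the kind of pool being argued about can be sketched as below. This is a minimal, single-threaded illustration, not anyone's actual benchmark code: `SimplePool`, `Bullet` and the reset callback are hypothetical names, and it assumes Java 8 or later for the functional interfaces.

```java
import java.util.ArrayDeque;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical minimal pool; deliberately not thread-safe.
final class SimplePool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final Consumer<T> reset;

    SimplePool(Supplier<T> factory, Consumer<T> reset, int preallocate) {
        this.factory = factory;
        this.reset = reset;
        // Pay the allocation cost up front instead of during gameplay.
        for (int i = 0; i < preallocate; i++) {
            free.push(factory.get());
        }
    }

    // Hands out a pooled instance, or allocates one if the pool is empty.
    T obtain() {
        T item = free.poll();
        return item != null ? item : factory.get();
    }

    // Clears the instance and makes it available for reuse.
    void release(T item) {
        reset.accept(item);
        free.push(item);
    }

    int available() {
        return free.size();
    }
}

// Example pooled type: it has to be mutable so that it can be reset.
final class Bullet {
    float x, y;
    boolean alive;
}
```

Note that `obtain()` and `release()` still do work on every call (a deque operation plus the reset), which is the "not free either" cost mentioned earlier in the thread.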

Online nsigma
« Reply #34 - Posted 2012-07-31 14:48:48 »

Quote
I find it hard to believe that you could write an object pool that is slower than doing a new.

I don't!  It depends what changes (if any) you need to make to the object to make it suitable for pooling.  Some objects that might have been immutable now have to be mutable, which could add cost (and if you're working with threads at all, then immutable objects can be a very good thing).

I personally tend to think of pooling for memory purposes rather than objects (i.e. int[] / buffers for pixel data).  However, in that scenario you either end up using lots of extra memory, or you need to check dimensions - at some point the size you want is probably so small it's faster to create from scratch.
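
The trade-off described here (check dimensions, and skip the pool for small sizes) might look roughly like this. It is only a sketch: `IntBufferPool` and the `minPooledLength` threshold are invented for illustration, and a sensible threshold would have to be measured rather than guessed.

```java
import java.util.ArrayDeque;
import java.util.Iterator;

// Hypothetical pool of int[] buffers; not thread-safe.
final class IntBufferPool {
    private final ArrayDeque<int[]> free = new ArrayDeque<>();
    private final int minPooledLength;

    IntBufferPool(int minPooledLength) {
        this.minPooledLength = minPooledLength;
    }

    // Returns a buffer with at least `length` elements. Contents are NOT cleared.
    int[] obtain(int length) {
        if (length < minPooledLength) {
            // Small request: assume a fresh allocation is cheaper than searching.
            return new int[length];
        }
        // The "check dimensions" cost: scan for a buffer that is big enough.
        Iterator<int[]> it = free.iterator();
        while (it.hasNext()) {
            int[] candidate = it.next();
            if (candidate.length >= length) {
                it.remove();
                return candidate;
            }
        }
        return new int[length];
    }

    // Keeps large buffers for reuse and lets small ones become garbage.
    void release(int[] buffer) {
        if (buffer.length >= minPooledLength) {
            free.push(buffer);
        }
    }
}
```

Because a returned buffer may be larger than requested, callers have to track the logical size themselves, which is the extra-memory side of the trade-off.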

Offline Gudradain
« Reply #35 - Posted 2012-07-31 14:51:25 »

Quote
I find it hard to believe that you could write an object pool that is slower than doing a new.

I didn't even try :( I just wrote it and got those results...

Quote
Also, the GC will eventually have to collect when you're doing new, and it is the unpredictable nature of "eventually" and the duration of said collection that pooling solves.

Also, the GC is collecting the objects (as you can see from the console output I posted). One thing I have to agree with is the unpredictability when you don't have object pooling. With object pooling the pauses are usually between 60 and 80 ms in my example. Without object pooling, a pause can be as low as 2 ms or as high as 24 ms (12 times the lowest). But the pauses are still much shorter.
Online Roquen
« Reply #36 - Posted 2012-07-31 14:54:49 »

Never draw any conclusions from a flawed benchmark.
Offline Gudradain
« Reply #37 - Posted 2012-07-31 15:01:32 »

From Java 6 documentation

Quote
5. Available Collectors
The discussion to this point has been about the serial collector. The Java HotSpot VM includes three different collectors, each with different performance characteristics.

1. The serial collector uses a single thread to perform all garbage collection work, which makes it relatively efficient since there is no communication overhead between threads. It is best-suited to single processor machines, since it cannot take advantage of multiprocessor hardware, although it can be useful on multiprocessors for applications with small data sets (up to approximately 100MB). The serial collector is selected by default on certain hardware and operating system configurations, or can be explicitly enabled with the option -XX:+UseSerialGC.

2. The parallel collector (also known as the throughput collector) performs minor collections in parallel, which can significantly reduce garbage collection overhead. It is intended for applications with medium- to large-sized data sets that are run on multiprocessor or multi-threaded hardware. The parallel collector is selected by default on certain hardware and operating system configurations, or can be explicitly enabled with the option -XX:+UseParallelGC.

New: parallel compaction is a feature introduced in J2SE 5.0 update 6 and enhanced in Java SE 6 that allows the parallel collector to perform major collections in parallel. Without parallel compaction, major collections are performed using a single thread, which can significantly limit scalability. Parallel compaction is enabled by adding the option -XX:+UseParallelOldGC to the command line.

3. The concurrent collector performs most of its work concurrently (i.e., while the application is still running) to keep garbage collection pauses short. It is designed for applications with medium- to large-sized data sets for which response time is more important than overall throughput, since the techniques used to minimize pauses can reduce application performance. The concurrent collector is enabled with the option -XX:+UseConcMarkSweepGC.

And there is one new garbage collector in Java 7 : G1

Quote
The Garbage-First (G1) garbage collector is fully supported in Oracle JDK 7 update 4 and later releases. The G1 collector is a server-style garbage collector, targeted for multi-processor machines with large memories. It meets garbage collection (GC) pause time goals with high probability, while achieving high throughput. Whole-heap operations, such as global marking, are performed concurrently with the application threads. This prevents interruptions proportional to heap or live-data size.
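
Since the documentation quoted above says the default collector depends on the hardware and OS, it can be worth checking which collectors a given JVM actually ended up with. A small sketch using the standard `java.lang.management` API follows; the class name `ActiveCollectors` is made up, and the bean names printed differ between collectors and JVM versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    // One line per collector the running JVM exposes, with its running totals.
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```

Running it once with `-XX:+UseSerialGC` and once with `-XX:+UseParallelGC` should show different bean names, which confirms that the flag took effect.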
Offline Gudradain
« Reply #38 - Posted 2012-07-31 15:05:49 »

Does anyone know whether the JVM can now do this?

Quote
The JIT compiler can perform additional optimizations that can reduce the cost of object allocation to zero. Consider the code in Listing 2, where the getPosition() method creates a temporary object to hold the coordinates of a point, and the calling method uses the Point object briefly and then discards it. The JIT will likely inline the call to getPosition() and, using a technique called escape analysis, can recognize that no reference to the Point object leaves the doSomething() method. Knowing this, the JIT can then allocate the object on the stack instead of the heap or, even better, optimize the allocation away completely and simply hoist the fields of the Point into registers. While the current Sun JVMs do not yet perform this optimization, future JVMs probably will. The fact that allocation can get even cheaper in the future, with no changes to your code, is just one more reason not to compromise the correctness or maintainability of your program for the sake of avoiding a few extra allocations.

EDIT: It seems it does: Java 7 Enhancements
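
The "Listing 2" that the quote refers to is not reproduced in this thread. As a stand-in, here is a sketch of the pattern it describes, reusing the names `getPosition()`, `doSomething()` and `Point` from the quote; the `Sprite` class and the field values are assumptions added for illustration.

```java
// Small immutable value holder, as in the quoted description.
final class Point {
    final int x, y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// Hypothetical owner of the coordinates.
final class Sprite {
    private int x = 3;
    private int y = 4;

    // Returns a temporary copy so callers cannot touch the internal state.
    Point getPosition() {
        return new Point(x, y);
    }
}

final class EscapeSketch {
    // The Point is used briefly and never stored or returned, so no reference
    // to it leaves this method once getPosition() has been inlined.
    static int doSomething(Sprite s) {
        Point p = s.getPosition();
        return p.x * p.x + p.y * p.y;
    }

    public static void main(String[] args) {
        System.out.println(doSomething(new Sprite()));
    }
}
```

Whether the JIT really removes the allocation here depends on inlining decisions and the JVM version, so it is something to measure rather than assume.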
Online Roquen
« Reply #39 - Posted 2012-07-31 15:29:34 »

JDK 7 has escape analysis.
Offline princec

« Reply #40 - Posted 2012-07-31 15:42:27 »

Interestingly, though, watching the GC and heaps in jvisualvm on various games, including what I've got here, I never see any evidence of escape analysis actually doing anything. The heap still fills at the same rate.

Cas :)

Online Roquen
« Reply #41 - Posted 2012-07-31 15:53:53 »

I've never been motivated enough to test it...too much work.  But even if it worked ideally (note this is why contracts rule: @NoReference would be awesome)..it's still only useful in a subset of cases (notably tuples).
Offline keldon85

Senior Duke


Medals: 1



« Reply #42 - Posted 2012-07-31 15:59:21 »

Escape analysis might have eliminated the need for GC of your actual object (though not the strings) in the sans-pool approach (messing up the results completely). To really compare them you would want to create a benchmark that better reflects the object link structures you will have in a game situation.

The basic essence of how GC works is testing reachability from root nodes. Root nodes in this case are object references held by currently executing functions and by static class variables. The state of all object pointers needs to be frozen while this process is carried out, hence the use of stop-the-world GCs. That is costly, because increasing the size and complexity of object structures increases the cost of carrying out a GC mark and sweep.
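
The reachability test can be illustrated with a toy mark phase over an explicit object graph. This is only a model of the idea, not how HotSpot is implemented: `Node`, `MarkSketch` and the explicit root list are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Stand-in for a heap object that holds references to other objects.
final class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

final class MarkSketch {
    // Returns every node reachable from the roots. Anything not in the
    // returned set is unreachable and would be reclaimed by the sweep.
    static Set<Node> mark(List<Node> roots) {
        // Identity-based set: objects are tracked by reference, not equals().
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        ArrayDeque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            // add() returns false for nodes already visited, so cycles terminate.
            if (marked.add(current)) {
                pending.addAll(current.refs);
            }
        }
        return marked;
    }
}
```

The loop visits every live node and follows every reference once, which is why the cost grows with the size and complexity of the live object graph rather than with the amount of garbage.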

Now pooling wouldn't affect GC that much in that case, but it would eliminate the memory management cost. Of course, as you've seen, real implementations and technologies differ a lot:
 - there are incremental garbage collectors that don't need to stop the world and can operate in a separate thread, adjusting the time spent GC'ing based on the behaviour of the program
 - there are generational garbage collectors that in a sense partition the structure, traversing some partitions more often than others - reducing the cost of GC'ing significantly since it traverses far fewer nodes and the nodes-traversed/nodes-deleted ratio is much lower (and therefore more efficient)
 - we now have fancy algorithms that JIT compilers use to pretty much rewrite your code, sometimes eliminating your `new` call completely (exhibiting pool like behaviour, depending on how you look at it)

Either way, we've not yet reached a point where we don't need to manage memory or objects, especially for real time applications where predictability* is crucial. So it's not cut and dried whether pooling will or will not give you a boost. As your structures become more complex you will be able to see things in your system that G1 will not be privy to. And most important of all, trust the real world results more than theory.

In fact I have a funny story about using floats on an embedded system. A co-worker of mine insisted on using floating point rather than integers on a certain 100% integer-based processor, even though all the available documentation advised against it. He was not aware of, or bothered with, the technical workings of the device, so he ignored my warnings. To show him the penalty and risks associated with using floats, I made some benchmarks performing various arithmetic on random numbers. To my surprise the benchmarks showed the floating point operations to be faster than the integer operations in most (if not all) cases. I was baffled. I shared this information with other developers and they could not understand it, as it should just be impossible. There was no fault with the code and others could repeat the behaviour. Well, the moral here is that sometimes all theories and knowledge are trumped, so don't be too presumptuous. I might add (for reference) that we were using a fast floating point library with associated cache risks.

So because of the documentation, I'd hazard a guess that none of the developers on that system ever used floating point in their physics code, or even tested it. Dogma is a dangerous thing, so I'd say it would be good to listen to the real world results, or better yet, keep working on improving your benchmark and perhaps you will be able to share some interesting reference results with us.

*: I understand G1 has high predictability for suspending its GC thread, but that still doesn't change the fact that there are associated GC costs and allocation costs that can be reduced

Offline sproingie

JGO Kernel


Medals: 202



« Reply #43 - Posted 2012-07-31 17:26:33 »

Quote
- we now have fancy algorithms that JIT compilers use to pretty much rewrite your code, sometimes eliminating your `new` call completely (exhibiting pool like behaviour, depending on how you look at it)

Aside from stack allocation, which is a different thing entirely, I've only ever seen this behavior with String constants and small values of Integer or other boxed types.  I really don't believe this behavior is even possible with most objects: Java isn't referentially transparent, and even if there were a Sufficiently Smart Compiler™ that could infer it by way of noticing an aliased object is never mutated, such a compiler couldn't support separate compilation, which is a cornerstone of Java.

Offline krausest

Junior Duke


Exp: 15 years


I love YaBB 1G - SP1!


« Reply #44 - Posted 2012-07-31 17:58:21 »

I ripped a smaller test case out of a 3d demo to see if scalar replacement works. In my case it did an amazing job. Here's the link to my blog entry: http://www.stefankrause.net/wp/?p=64
Offline keldon85

« Reply #45 - Posted 2012-07-31 18:28:08 »

Quote
- we now have fancy algorithms that JIT compilers use to pretty much rewrite your code, sometimes eliminating your `new` call completely (exhibiting pool like behaviour, depending on how you look at it)

Quote
Aside from stack allocation, which is a different thing entirely, I've only ever seen this behavior with String constants and small values of Integer or other boxed types.  I really don't believe this behavior is even possible with most objects: Java isn't referentially transparent, and even if there were a Sufficiently Smart Compiler™ that could infer it by way of noticing an aliased object is never mutated, such a compiler couldn't support separate compilation, which is a cornerstone of Java.

I was alluding to Escape Analysis there. Google's V8 is an excellent example of what can be achieved with this technology. In fact they've taken it a step further and even get a fair amount of code compiled directly into integer operations (despite JavaScript's weak typing).

The main point, though, is just to draw attention to what's going on, to highlight the dangers of dogmatically ignoring pooling, and to encourage a benchmark that reflects the real case much more accurately.

Offline princec

« Reply #46 - Posted 2012-07-31 19:01:51 »

Excelsior JET does a very efficient job of escape analysis, replacing heap allocations with stack allocations. The problem is that JDK7 doesn't seem to be doing this yet, or if it is, it doesn't seem to have any noticeable effect on garbage that I've seen in some game code. Stefan's blog entry is pretty interesting though - it's clearly doing it there. I wonder if you increase the inlining size of the compiler whether it'd be more effective. I might try that on Project Zomboid later.

Cas :)

Offline Nate

JGO Kernel


Medals: 149
Projects: 4
Exp: 14 years


Esoteric Software


« Reply #47 - Posted 2012-07-31 23:11:00 »

Let alone Android, where the GC is poor and object allocation is slow.

Well, in the article's defense, it did say "object pooling is a performance loss for all but the most heavyweight objects on modern JVMs". :P

Offline kaffiene
« Reply #48 - Posted 2012-08-01 01:17:39 »

I have an OpenGL based game which creates tonnes of particles and bullets without any pooling and the worst GC pause I ever saw was about 1/100th of a second, with most pauses in the 1/1000th - 1/10000th of a second range.  That's using -Xincgc.



Online Roquen
« Reply #49 - Posted 2012-08-01 11:43:10 »

It's important to note that all reported "findings" should be taken with a large dash of salt.  And even when the findings are not suspect, they are only useful in the context in which they were written.  Most Java writing is about application and server programming, which has drastically different needs from computationally expensive programming (such as games or scientific code) and soft real-time programming (such as games), etc.

Also we seem to be mixing up terms (unless HotSpot is using non-standard terminology)..in terms of Java:

escape analysis: determine if a reference cannot escape its creating call frame.  If so then it can be stack allocated and some other optimization can be performed.
scalar replacement: determine if the object itself may be broken apart into scalar components.  At the extreme the object itself can be removed and its fields live on the stack and/or in registers.
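
A small sketch of the distinction drawn by these two definitions, with a made-up `Vec2` type and `EscapeCases` class: in the first method the object stays inside its creating frame, in the second it is published through a static field and therefore escapes. What the JIT actually does with either method is not guaranteed.

```java
// Hypothetical two-component value type.
final class Vec2 {
    final float x, y;

    Vec2(float x, float y) {
        this.x = x;
        this.y = y;
    }
}

final class EscapeCases {
    // Reachable from outside any single call frame.
    static Vec2 lastSeen;

    // The reference never leaves this frame, so the object is a candidate for
    // scalar replacement: its two fields could simply live in registers.
    static float lengthSquared(float x, float y) {
        Vec2 v = new Vec2(x, y);
        return v.x * v.x + v.y * v.y;
    }

    // Storing the reference in a static field lets it escape, so the object
    // has to exist on the heap after the method returns.
    static float lengthSquaredAndRemember(float x, float y) {
        Vec2 v = new Vec2(x, y);
        lastSeen = v;
        return v.x * v.x + v.y * v.y;
    }
}
```

Both methods compute the same value; only the second one forces a real heap object to exist after the call.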

@kaffiene: I'm not sure what you're saying here:  1/100 of a second is terrible.
Offline princec

« Reply #50 - Posted 2012-08-01 11:54:34 »

Indeed, 10ms is awful; we'd be looking at juddering frames with that sort of pause. The maximum we want to spend on GC in a frame is maybe 3ms on a 1.6GHz-class single core sort of system. This is one area that the G1 GC excels at: when you give it a target collection time parameter, it's really pretty good at achieving that, which means with a bit of pooling and careful coding not to generate too much crap in a frame, we're guaranteed a more or less rock steady 60Hz even on low-end systems. Awesome.

Cas :)
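
At 60Hz a frame lasts about 16.7 ms (1000 / 60), so the 3 ms figure above is a little under a fifth of the frame. One way to watch that budget from inside a game loop is to sample the JVM's GC counters once per frame, as sketched below. `GcFrameMeter` is a hypothetical helper, and the times come from `GarbageCollectorMXBean.getCollectionTime()`, which is an approximate accumulated figure and not necessarily identical to the pause the game thread experienced.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical per-frame GC meter; call sample() once per frame from the game loop.
final class GcFrameMeter {
    private long lastTotalMs = totalGcMillis();

    // Sum of the accumulated collection time over all collectors, in milliseconds.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 when the JVM does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    // GC time accumulated since the previous call.
    long sample() {
        long now = totalGcMillis();
        long delta = now - lastTotalMs;
        lastTotalMs = now;
        return delta;
    }

    // True when the sampled GC time fits within the per-frame budget.
    static boolean withinBudget(long gcMillis, long budgetMillis) {
        return gcMillis <= budgetMillis;
    }
}
```

In a loop this would be used as `long gc = meter.sample(); if (!GcFrameMeter.withinBudget(gc, 3)) { /* log it */ }`, with the 3 taken from the figure above.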

Offline Riven
« League of Dukes »

JGO Overlord


Medals: 816
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #51 - Posted 2012-08-01 11:54:51 »

The sweet spot of trouble is having (hundreds of) millions of objects that mostly need to be retained, while also generating many objects of which a small percentage makes it out of eden. The 'full gc' that eventually occurs will have to move around most data in the different heaps and rewrite all pointers to the moved data. This is relatively slow.



(offtopic)
Quote
It's important to note that all reported "findings" should be taken with a large dash of salt.

Classic misinterpretation of the expression... 'a grain of salt' is insignificant, a 'large dash of salt' is more significant, not less.

Hi, appreciate more people! Σ ♥ = ¾
Learn how to award medals... and work your way up the social rankings
Offline princec

« Reply #52 - Posted 2012-08-01 11:57:23 »

Haha pedant :) One more observation about games and GC: irregular, sometimes even long, GC pauses are acceptable, even in realtime arcade games. What is not acceptable is regular GC pauses, as they seriously ruin the experience by breaking the brain's ingenious ability to correctly predict motion.

Also of course games that don't rely on constant realtime action... who cares about GC :)

Cas :)

Online Roquen
« Reply #53 - Posted 2012-08-01 12:05:05 »

@Riven - with a "large dash of salt" you're gonna think about it more before swallowing it...at least you should. :)
Offline gimbal

JGO Knight


Medals: 25



« Reply #54 - Posted 2012-08-01 13:01:57 »

Quote
Haha pedant :)

If you need to be a pedantic ahole to be right, then wear the badge with pride!

EDIT: eeeeh, not that I'm calling anyone an ahole. Just speaking in general.
Offline Riven
« Reply #55 - Posted 2012-08-01 13:04:56 »

Quote
@Riven - with a "large dash of salt" you're gonna think about it more before swallowing it...at least you should. :)

Hence it's more significant.

Online Roquen
« Reply #56 - Posted 2012-08-01 13:53:53 »

Why can't we just have structures?
Online nsigma
« Reply #57 - Posted 2012-08-01 15:11:42 »

Out of interest, has anyone found -XX:MaxGCPauseMillis to be of any use whatsoever?

Offline princec

« Reply #58 - Posted 2012-08-01 15:22:29 »

Only really with the G1 GC.

Cas :)

Online nsigma
« Reply #59 - Posted 2012-08-01 15:31:52 »

Quote
Only really with the G1 GC.

Should that be read as "it's worth doing"?  I was under the impression it was meant to work elsewhere (with CMS I think), though I don't recall it having much effect the last time I tried.
