Java-Gaming.org
Tiny object performance overhead
Riven

« JGO Overlord »

Medals: 1336
Projects: 4
Exp: 16 years

 « Posted 2006-09-22 19:34:59 »

I read in some article* by a JVM engineer that creating new objects was 'almost at the cost of shifting a pointer'.

* I tried hard to find the article, but sometimes java.sun.com is kinda hard to wade through

Further, the GC is considered so intelligent and efficient that its effect should be 'noise' even in performance-critical code.

Combining these two would almost make you think that allocating and discarding tiny objects is nearly free, or at least has only a small impact.

I decided to test it in a real-world application whose bottleneck is a sphere<->triangle method.
Basic vector-math (Vec3) was implemented like:

```java
public static final Vec3 add(Vec3 a, Vec3 b) {
    return new Vec3(a.x + b.x, a.y + b.y, a.z + b.z);
}
```

When I was writing this code it seemed horribly inefficient.

The following code shows the algorithm:

```java
      // Method body: returns the point on triangle (a, b, c) closest to p.
      Vec3 ba = sub(b, a);
      Vec3 ca = sub(c, a);
      Vec3 pa = sub(p, a);

      float snom = dot(pa, ba);
      float tnom = dot(pa, ca);
      if (snom <= 0.0f && tnom <= 0.0f)
         return a; // closest to vertex a

      Vec3 cb = sub(c, b);
      Vec3 pb = sub(p, b);

      float unom = dot(pb, cb);
      float sdenom = dot(pb, sub(a, b));
      if (sdenom <= 0.0f && unom <= 0.0f)
         return b; // closest to vertex b

      Vec3 pc = sub(p, c);

      float tdenom = dot(pc, sub(a, c));
      float udenom = dot(pc, sub(b, c));
      if (tdenom <= 0.0f && udenom <= 0.0f)
         return c; // closest to vertex c

      Vec3 n = cross(ba, ca);
      Vec3 ap = sub(a, p);
      Vec3 bp = sub(b, p);

      float vc = dot(n, cross(ap, bp));
      if (vc <= 0.0f && snom >= 0.0f && sdenom >= 0.0f)
         return add(a, mul(snom / (snom + sdenom), ba)); // on edge ab

      Vec3 cp = sub(c, p);
      float va = dot(n, cross(bp, cp));
      if (va <= 0.0f && unom >= 0.0f && udenom >= 0.0f)
         return add(b, mul(unom / (unom + udenom), cb)); // on edge bc

      float vb = dot(n, cross(cp, ap));
      if (vb <= 0.0f && tnom >= 0.0f && tdenom >= 0.0f)
         return add(a, mul(tnom / (tnom + tdenom), ca)); // on edge ca

      // Inside the triangle: combine the vertices with barycentric weights.
      float u = va / (va + vb + vc);
      float v = vb / (va + vb + vc);
      float w = 1.0f - u - v;

      return add(add(mul(u, a), mul(v, b)), mul(w, c));
```
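The post only shows `add`; the other helpers used by the algorithm (`sub`, `mul`, `dot`, `cross`) are not listed. The following is a minimal sketch of what they might look like, assuming each one allocates a fresh result the same way `add` does. The wrapper class name `VecMathSketch` and the constructor shapes are assumptions made for illustration, not the author's code.

```java
// Hypothetical helpers assumed by the algorithm above; the thread never shows them.
final class VecMathSketch {
    /** Minimal 3-float vector (assumed shape: public fields plus two constructors). */
    static final class Vec3 {
        float x, y, z;

        Vec3() {
        }

        Vec3(float x, float y, float z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    // Every vector-valued operation allocates a new Vec3 for its result.
    static Vec3 add(Vec3 a, Vec3 b) {
        return new Vec3(a.x + b.x, a.y + b.y, a.z + b.z);
    }

    static Vec3 sub(Vec3 a, Vec3 b) {
        return new Vec3(a.x - b.x, a.y - b.y, a.z - b.z);
    }

    // Scalar first, matching calls such as mul(u, a) in the algorithm.
    static Vec3 mul(float s, Vec3 v) {
        return new Vec3(s * v.x, s * v.y, s * v.z);
    }

    static float dot(Vec3 a, Vec3 b) {
        return a.x * b.x + a.y * b.y + a.z * b.z;
    }

    static Vec3 cross(Vec3 a, Vec3 b) {
        return new Vec3(a.y * b.z - a.z * b.y,
                        a.z * b.x - a.x * b.z,
                        a.x * b.y - a.y * b.x);
    }
}
```

With helpers like these, one call of the algorithm can create well over a dozen short-lived `Vec3` instances, which is the cost the benchmark is trying to expose.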

The following is the version where all Vec3 methods are inlined:

```java
      float bax = b.x - a.x;
      float bay = b.y - a.y;
      float baz = b.z - a.z;
      float cax = c.x - a.x;
      float cay = c.y - a.y;
      float caz = c.z - a.z;
      float pax = p.x - a.x;
      float pay = p.y - a.y;
      float paz = p.z - a.z;

      float snom = pax * bax + pay * bay + paz * baz;
      float tnom = pax * cax + pay * cay + paz * caz;
      if (snom <= 0.0f && tnom <= 0.0f)
         return a;

      float abx = a.x - b.x;
      float aby = a.y - b.y;
      float abz = a.z - b.z;
      float cbx = c.x - b.x;
      float cby = c.y - b.y;
      float cbz = c.z - b.z;
      float pbx = p.x - b.x;
      float pby = p.y - b.y;
      float pbz = p.z - b.z;

      float unom = pbx * cbx + pby * cby + pbz * cbz;
      float sdenom = pbx * abx + pby * aby + pbz * abz;
      if (sdenom <= 0.0f && unom <= 0.0f)
         return b;

      float pcx = p.x - c.x;
      float pcy = p.y - c.y;
      float pcz = p.z - c.z;
      float acx = a.x - c.x;
      float acy = a.y - c.y;
      float acz = a.z - c.z;
      float bcx = b.x - c.x;
      float bcy = b.y - c.y;
      float bcz = b.z - c.z;

      float tdenom = pcx * acx + pcy * acy + pcz * acz;
      float udenom = pcx * bcx + pcy * bcy + pcz * bcz;
      if (tdenom <= 0.0f && udenom <= 0.0f)
         return c;

      float nx = bay * caz - baz * cay;
      float ny = baz * cax - bax * caz;
      float nz = bax * cay - bay * cax;

      float apx = a.x - p.x;
      float apy = a.y - p.y;
      float apz = a.z - p.z;
      float bpx = b.x - p.x;
      float bpy = b.y - p.y;
      float bpz = b.z - p.z;

      float APBPx = apy * bpz - apz * bpy;
      float APBPy = apz * bpx - apx * bpz;
      float APBPz = apx * bpy - apy * bpx;

      float vc = nx * APBPx + ny * APBPy + nz * APBPz;
      if (vc <= 0.0f && snom >= 0.0f && sdenom >= 0.0f)
      {
         Vec3 r = new Vec3();
         float t = snom / (snom + sdenom);
         r.x = bax * t + a.x;
         r.y = bay * t + a.y;
         r.z = baz * t + a.z;
         return r;
      }

      float cpx = c.x - p.x;
      float cpy = c.y - p.y;
      float cpz = c.z - p.z;

      float BPCPx = bpy * cpz - bpz * cpy;
      float BPCPy = bpz * cpx - bpx * cpz;
      float BPCPz = bpx * cpy - bpy * cpx;

      float va = nx * BPCPx + ny * BPCPy + nz * BPCPz;
      if (va <= 0.0f && unom >= 0.0f && udenom >= 0.0f)
      {
         Vec3 r = new Vec3();
         float t = unom / (unom + udenom);
         r.x = cbx * t + b.x;
         r.y = cby * t + b.y;
         r.z = cbz * t + b.z;
         return r;
      }

      float CPAPx = cpy * apz - cpz * apy;
      float CPAPy = cpz * apx - cpx * apz;
      float CPAPz = cpx * apy - cpy * apx;

      float vb = nx * CPAPx + ny * CPAPy + nz * CPAPz;
      if (vb <= 0.0f && tnom >= 0.0f && tdenom >= 0.0f)
      {
         Vec3 r = new Vec3();
         float t = (tnom / (tnom + tdenom));
         r.x = cax * t + a.x;
         r.y = cay * t + a.y;
         r.z = caz * t + a.z;
         return r;
      }

      float u = va / (va + vb + vc);
      float v = vb / (va + vb + vc);
      float w = 1.0f - u - v;

      Vec3 r = new Vec3();
      r.x = u * a.x + v * b.x + w * c.x;
      r.y = u * a.y + v * b.y + w * c.y;
      r.z = u * a.z + v * b.z + w * c.z;
      return r;
```

After warming both loops for several seconds, allowing the JVM to inline and optimize, these are the results:

Objects: 1548ms 1553ms 1551ms
Inlined: 505ms 500ms 558ms

This is clearly not 'noise' anymore: the object version takes roughly three times as long.

Some of you guys (to be honest, including me) would say: doh! - but I kinda started to believe they really reduced the overhead of objects. Sadly this doesn't seem to be the case as of yet.

Riven


 « Reply #1 - Posted 2006-09-22 19:48:07 »

Found Jeff's remarks on this topic:

http://wiki.java.net/bin/view/Games/JeffOnPerformance#Do_I_need_to_avoid_garbage_colle
Quote
This means you are free today to create objects just to pass in and out of method calls
or hold temporary values, a practice which makes your code a whole lot neater, less buggy,
and simpler to maintain.

I'll continue my search for the article about the pointer-shift...
I found a quote of it, on another website:
Quote
Garbage Collection

The garbage collector has been greatly improved: creating a new object is now an incredibly
cheap operation, in most cases equivalent to shifting a pointer in memory. Don't necessarily be
afraid of creating many short-lived objects, they will be garbage-collected very efficiently.

Matzon

JGO Knight

Medals: 19
Projects: 1

 « Reply #2 - Posted 2006-09-23 00:17:28 »

Are you sure that the methods are inlined? Otherwise you'd have method-call overhead in the object test.

kappa
« League of Dukes »

JGO Kernel

Medals: 120
Projects: 15

★★★★★

 « Reply #3 - Posted 2006-09-23 00:56:49 »

You know, I had the exact same impression that creating small objects was free. Just today I was trying to decide whether to pass 9 floats wrapped in objects or directly as floats:

```java
public void someMethod(float vec1x, float vec1y, float vec1z,
                       float vec2x, float vec2y, float vec2z,
                       float vec3x, float vec3y, float vec3z) {
}
```

or wrap the values in Vector3f objects:

```java
public void someMethod(Vector3f a, Vector3f b, Vector3f c) {
}
```

Clearly the second version is much nicer and cleaner, but it requires creating 3 more objects (of Vector3f). So according to your test, would the first version perform better?

Riven


 « Reply #4 - Posted 2006-09-23 08:44:06 »

Quote
are you sure that the methods are inlined? - else you'd have a method overhead in the object test

Yup, running with -Xprof shows no sign of these methods anymore; they are interpreted a few times, then disappear (0.4% of the ticks).

CommanderKeith
 « Reply #5 - Posted 2006-09-23 09:18:43 »

Very interesting stats.

Try Java 6, apparently 'small object creation' has become much more efficient.  see:

http://www.javalobby.org/java/forums/t66270.html

PS: I'm sure you know, but to avoid warming up loops, try the VM with the -server option (only works on Windows with the JDK VM, however).

Riven


 « Reply #6 - Posted 2006-09-23 09:19:11 »

Quote
You know, I had the exact same impression that creating small objects was free. Just today I was trying to decide whether to pass 9 floats wrapped in objects or directly as floats:

```java
public void someMethod(float vec1x, float vec1y, float vec1z,
                       float vec2x, float vec2y, float vec2z,
                       float vec3x, float vec3y, float vec3z) {
}
```

or wrap the values in Vector3f objects:

```java
public void someMethod(Vector3f a, Vector3f b, Vector3f c) {
}
```

Clearly the second version is much nicer and cleaner, but it requires creating 3 more objects (of Vector3f). So according to your test, would the first version perform better?

New Object Loop
```java
            int p = values.length - 1;
            while (p > 12)
            {
               Vec3 a = new Vec3(values[p--], values[p--], values[p--]);
               Vec3 b = new Vec3(values[p--], values[p--], values[p--]);
               Vec3 c = new Vec3(values[p--], values[p--], values[p--]);
               r += fancyCalc(a, b, c);
            }
```

Used Object Loop
```java
            int p = values.length - 1;
            while (p > 12)
            {
               a.load(values[p--], values[p--], values[p--]);
               b.load(values[p--], values[p--], values[p--]);
               c.load(values[p--], values[p--], values[p--]);
               r += fancyCalc(a, b, c);
            }
```

Many Floats Loop
```java
            int p = values.length - 1;
            while (p > 12)
            {
               r += fancyCalc(values[p--], values[p--], values[p--], values[p--], values[p--], values[p--], values[p--], values[p--], values[p--]);
            }
```

update: Float Array Loop
```java
            int p = values.length - 1;
            while (p > 12)
            {
               r += fancyCalc(values, p -= 9);
            }
```

                   Client VM 1.4   Server VM 1.4   Client VM 1.5   Server VM 1.5   Client VM 1.6   Server VM 1.6
New Object Loop    2266ms          1453ms          2188ms          1354ms          1427ms          1094ms
Used Object Loop   1404ms          656ms           1326ms          447ms           588ms           281ms
Many Floats Loop   1265ms          328ms           1278ms          246ms           420ms           230ms
Float Array Loop   ?ms             ?ms             1206ms          250ms           310ms           219ms

Fancy calc
```java
   private static final float fancyCalc(float ax, float ay, float az, float bx, float by, float bz, float cx, float cy, float cz)
   {
      float dotAB = ax * bx + ay * by + az * bz;
      float dotBC = bx * cx + by * cy + bz * cz;
      float dotCA = cx * ax + cy * ay + cz * az;

      return (dotAB + dotBC) * dotCA + (1.0f - dotCA);
   }

   private static final float fancyCalc(Vec3 a, Vec3 b, Vec3 c)
   {
      float dotAB = a.x * b.x + a.y * b.y + a.z * b.z;
      float dotBC = b.x * c.x + b.y * c.y + b.z * c.z;
      float dotCA = c.x * a.x + c.y * a.y + c.z * a.z;

      return (dotAB + dotBC) * dotCA + (1.0f - dotCA);
   }

   private static final float fancyCalc(float[] buf, int off)
   {
      float ax = buf[off + 0];
      float ay = buf[off + 1];
      float az = buf[off + 2];

      float bx = buf[off + 3];
      float by = buf[off + 4];
      float bz = buf[off + 5];

      float cx = buf[off + 6];
      float cy = buf[off + 7];
      float cz = buf[off + 8];

      float dotAB = ax * bx + ay * by + az * bz;
      float dotBC = bx * cx + by * cy + bz * cz;
      float dotCA = cx * ax + cy * ay + cz * az;

      return (dotAB + dotBC) * dotCA + (1.0f - dotCA);
   }
```

Of course the body of fancyCalc is a bit too large to measure only the overhead of the way it is invoked, but it's more 'real world' this way, instead of yet another 'micro benchmark'.
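The thread does not show the harness around these loops. A rough sketch of how two of the variants might be warmed up and then timed is given below. The class name `LoopBenchSketch`, the array size, the repetition counts and the input data are all assumptions, and the `Vec3` overload simply delegates to the float overload to keep the sketch short. Absolute numbers from a harness like this depend heavily on the VM and hardware.

```java
// Hypothetical harness; only the loop bodies and fancyCalc follow the post.
final class LoopBenchSketch {
    static final class Vec3 {
        final float x, y, z;

        Vec3(float x, float y, float z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    static float fancyCalc(float ax, float ay, float az,
                           float bx, float by, float bz,
                           float cx, float cy, float cz) {
        float dotAB = ax * bx + ay * by + az * bz;
        float dotBC = bx * cx + by * cy + bz * cz;
        float dotCA = cx * ax + cy * ay + cz * az;
        return (dotAB + dotBC) * dotCA + (1.0f - dotCA);
    }

    // Delegates instead of repeating the arithmetic (a shortcut for this sketch).
    static float fancyCalc(Vec3 a, Vec3 b, Vec3 c) {
        return fancyCalc(a.x, a.y, a.z, b.x, b.y, b.z, c.x, c.y, c.z);
    }

    /** "New Object Loop": three fresh Vec3 instances per iteration. */
    static float newObjectLoop(float[] values) {
        float r = 0f;
        int p = values.length - 1;
        while (p > 12) {
            Vec3 a = new Vec3(values[p--], values[p--], values[p--]);
            Vec3 b = new Vec3(values[p--], values[p--], values[p--]);
            Vec3 c = new Vec3(values[p--], values[p--], values[p--]);
            r += fancyCalc(a, b, c);
        }
        return r;
    }

    /** "Many Floats Loop": the same nine values passed as primitives. */
    static float manyFloatsLoop(float[] values) {
        float r = 0f;
        int p = values.length - 1;
        while (p > 12) {
            r += fancyCalc(values[p--], values[p--], values[p--],
                           values[p--], values[p--], values[p--],
                           values[p--], values[p--], values[p--]);
        }
        return r;
    }

    public static void main(String[] args) {
        float[] values = new float[9 * 100_000 + 13]; // assumed size
        for (int i = 0; i < values.length; i++) {
            values[i] = (i % 7) * 0.25f;
        }

        float sink = 0f; // printed at the end so the work cannot be discarded
        for (int i = 0; i < 500; i++) { // warm-up pass, not timed
            sink += newObjectLoop(values) + manyFloatsLoop(values);
        }

        long t0 = System.nanoTime();
        for (int i = 0; i < 500; i++) sink += newObjectLoop(values);
        long t1 = System.nanoTime();
        for (int i = 0; i < 500; i++) sink += manyFloatsLoop(values);
        long t2 = System.nanoTime();

        System.out.println("New Object Loop:  " + (t1 - t0) / 1_000_000 + "ms");
        System.out.println("Many Floats Loop: " + (t2 - t1) / 1_000_000 + "ms");
        System.out.println("(checksum " + sink + ")");
    }
}
```

Both loops read the same values in the same order, so they should return the same result; only the way the arguments reach `fancyCalc` differs.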

Riven


 « Reply #7 - Posted 2006-09-23 10:01:18 »

Quote
Very interesting stats.

Try Java 6, apparently 'small object creation' has become much more efficient.  see:

http://www.javalobby.org/java/forums/t66270.html

PS: I'm sure you know, but to avoid warming up loops, try the VM with the -server option (only works on Windows with the JDK VM, however).

The Server VM takes even longer to warm up. Anyway, I'm giving the VM more than enough time to warm up, so which VM is used doesn't really matter.

CommanderKeith
 « Reply #8 - Posted 2006-09-23 10:26:19 »

That is disappointing, I thought hotspot would turn the 'new object' code into the 'direct' code. Well, at least object creation has gotten better in Java 6. How badly do these bottlenecks affect you? In all of my games it's the blitting to the screen that takes most of the time.

I wonder why the 1.6 Client VM is so much quicker than the 1.5 equivalent when doing the 'direct' method?

PS: oops, I thought the server VM did all possible native code compilation AND inlining.  So inlining must still be done dynamically at runtime by the server VM

Riven


 « Reply #9 - Posted 2006-09-23 10:39:42 »

Offtopic:

I found out that
FloatBuffer.get(int) is more than twice as slow in 6.0 (compared to 1.5) (both client VM)

fancyCalc FloatBuffer client 1.5: ~1000ms
fancyCalc FloatBuffer client 1.6: ~2400ms <--- ?? serious regression

fancyCalc FloatBuffer server 1.5: ~285ms
fancyCalc FloatBuffer server 1.6: ~290ms

FloatBuffer = direct, native-ordered buffer

Martin Strand

Junior Devvie

 « Reply #10 - Posted 2006-09-23 11:00:08 »

Quote
PS: oops, I thought the server VM did all possible native code compilation AND inlining.  So inlining must still be done dynamically at runtime by the server VM

It does more aggressive inlining but still needs to warm up. You can use -Xcomp to have methods compiled the first time they're invoked; that would get rid of the warmup loop.
Riven


 « Reply #11 - Posted 2006-09-23 11:04:59 »

Quote
PS: oops, I thought the server VM did all possible native code compilation AND inlining.  So inlining must still be done dynamically at runtime by the server VM

Quote
It does more aggressive inlining but still needs to warm up. You can use -Xcomp to have methods compiled the first time they're invoked; that would get rid of the warmup loop.

Indeed, yet it often results in poorly optimized code, as the VM didn't have enough time to properly analyze the code paths and adjust the optimization process with that data.

g666

Junior Devvie

 « Reply #12 - Posted 2006-09-23 16:46:46 »

Well, I never saw any examples showing that you could now create small objects for not much more than the cost of reusing them, so I never believed it. I'm not sure why anyone did.

desperately seeking sanity
CommanderKeith
 « Reply #13 - Posted 2006-09-24 03:49:34 »

Well I've been told many times here (& read elsewhere) that object pooling doesn't give any performance boost - since the gc is so efficient & object creation is swift.

So object pooling can still be a good idea (if object creation is causing the bottleneck).

And what will you do with your vector-math code Riven - persevere with temporary objects or use primitives or object pooling?
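For reference, the object pooling discussed here could look something like the sketch below. It is deliberately simple and single-threaded; the class name `Vec3PoolSketch` and the `acquire`/`release` method names are assumptions made for illustration and do not come from any library mentioned in the thread.

```java
import java.util.ArrayDeque;

// Hypothetical single-threaded pool of reusable Vec3 instances.
final class Vec3PoolSketch {
    static final class Vec3 {
        float x, y, z;
    }

    // Instances that have been handed back and can be reused.
    private final ArrayDeque<Vec3> free = new ArrayDeque<>();

    /** Returns a recycled instance if one is available, otherwise a new one. */
    Vec3 acquire(float x, float y, float z) {
        Vec3 v = free.poll(); // null when the pool is empty
        if (v == null) {
            v = new Vec3();
        }
        v.x = x;
        v.y = y;
        v.z = z;
        return v;
    }

    /** Hands an instance back; the caller must not touch it afterwards. */
    void release(Vec3 v) {
        free.push(v);
    }
}
```

The cost of this approach is that every `acquire` needs a matching `release`, and using an instance after releasing it is a bug the compiler cannot catch.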

Riven


 « Reply #14 - Posted 2006-09-24 08:43:44 »

I wrote this vector-math code for this test only. I always had a gut-feeling it would be dead-slow, so this was the only place that created all those objects.

Next test I'll do will be with an ObjectPool. I have my doubts about your statement that object pooling is still feasible. We'll see.

Linuxhippy

Senior Devvie

Medals: 1

Java games rock!

 « Reply #15 - Posted 2006-09-24 15:06:42 »

However object pooling time is quite consistent and also does not hurt scalability on many CPUs.
I am working for a larger company and my job is to tune the stuff other (cheaper *lol*) programmers produce - if you're running on a 32-64 CPU machine, generating garbage is VERY expensive and HURTS concurrency a lot. However, managing memory yourself means ... well, you have to take care.
Have a look at javolution, a nice framework for fast object pooling :-)

lg Clemens
blahblahblahh

JGO Coder

Medals: 1

http://t-machine.org

 « Reply #16 - Posted 2006-09-25 08:19:55 »

Um, are you sure you've got the right end of the stick here?

I thought the claim was that the garbage collection of small objects is now practically free, as small as shifting a pointer.

Off the top of my head, it is clearly impossible for *object creation* to be that cheap - you have to initialise lots of data in memory (think how much data an object actually contains under the hood if it contains merely a simple float)

malloc will be first against the wall when the revolution comes...
rreyelts

Junior Devvie

There is nothing Nu under the sun

 « Reply #17 - Posted 2006-09-25 19:07:33 »

Quote
Off the top of my head, it is clearly impossible for *object creation* to be that cheap - you have to initialise lots of data in memory (think how much data an object actually contains under the hood if it contains merely a simple float)

Think about it in terms of allocation versus initialization blah^3. Allocation is reserving address space for the object, and initialization is assigning actual values to fields, etc... So, allocation can indeed be as fast as a pointer bump.

I've done this in C++ code where I've written custom allocators for a routine. The routine allocates some millions of nodes over its relatively short (1 second) execution time. The allocator has a pre-allocated memory pool. When it needs to allocate a node, it simply bumps a pointer. No nodes get deallocated until the very end of the routine, at which point they are all "deallocated" by simply resetting the pointer to the top of the pool. This reduced allocation/deallocation times to just about nil.

You can do something similar in Java by creating an object pool, but those objects are still something that the gc is aware of.
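The bump-pointer scheme described above can be imitated in Java without individual objects by carving slots out of one pre-allocated array. This is only a sketch of the idea, assuming vectors are stored as three consecutive floats; the class `FloatArenaSketch` and its methods are hypothetical names, not an existing API.

```java
// Hypothetical arena: allocation bumps an index, reset() frees everything at once.
final class FloatArenaSketch {
    private final float[] pool; // pre-allocated backing storage
    private int top;            // index of the next free slot (the "pointer")

    FloatArenaSketch(int capacity) {
        pool = new float[capacity];
    }

    /** Reserves n consecutive floats and returns the offset of the first one. */
    int allocate(int n) {
        if (n < 0 || n > pool.length - top) {
            throw new IllegalStateException("arena exhausted");
        }
        int offset = top;
        top += n; // the whole cost of allocation: one addition
        return offset;
    }

    /** Allocates a 3-float vector and stores its components. */
    int allocVec3(float x, float y, float z) {
        int offset = allocate(3);
        pool[offset] = x;
        pool[offset + 1] = y;
        pool[offset + 2] = z;
        return offset;
    }

    float get(int index) {
        return pool[index];
    }

    /** "Deallocates" everything by moving the pointer back to the start. */
    void reset() {
        top = 0;
    }
}
```

Because the "objects" here are just offsets into an array, the collector only ever sees the single backing array, at the price of giving up ordinary object syntax.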

Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
rreyelts


 « Reply #18 - Posted 2006-09-25 19:19:00 »

Quote
However object pooling time is quite consistent and also does not hurt scalability on many CPUs.

It depends on what your pools look like. If they're MT-safe, that definitely hurts scalability. For example, Java heaps are tuned to be extremely fast for multi-threaded allocations. (They blow the bog-standard C++ allocators out of the water). They can do all sorts of dirty tricks like segmenting different areas of heap address space per thread to reduce contention. There are tricks you can do with object pools too (like creating threadlocal pools, but the cost of a threadlocal lookup isn't zero), but they aren't trivial and do involve other kinds of overhead.

Most people can just forget about allocation and pooling unless they're creating millions of objects / second, or using a class that has heavyweight initialization (e.g. database connections).
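A thread-local pool of the kind mentioned above might be sketched as follows. It avoids locking by giving every thread its own free list, but each call still pays for a `ThreadLocal` lookup. The class and method names are assumptions for illustration only.

```java
import java.util.ArrayDeque;

// Hypothetical per-thread pool: no locks, but a ThreadLocal lookup on every call.
final class ThreadLocalPoolSketch {
    static final class Vec3 {
        float x, y, z;
    }

    // Each thread lazily gets its own, initially empty, free list.
    private static final ThreadLocal<ArrayDeque<Vec3>> POOL =
            ThreadLocal.withInitial(ArrayDeque::new);

    /** Takes an instance from the calling thread's pool, or allocates one. */
    static Vec3 acquire() {
        Vec3 v = POOL.get().poll(); // null when this thread's pool is empty
        return v != null ? v : new Vec3();
    }

    /** Returns an instance to the calling thread's pool. */
    static void release(Vec3 v) {
        POOL.get().push(v);
    }
}
```

An instance released on a different thread than the one that acquired it simply migrates to that thread's pool, which is harmless here but can unbalance the pools over time.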

princec

« JGO Spiffy Duke »

Medals: 1012
Projects: 3
Exp: 20 years

Eh? Who? What? ... Me?

 « Reply #19 - Posted 2006-09-26 10:02:17 »

Pools are for objects that are expensive to construct and/or initialise.

Cas

Riven


 « Reply #20 - Posted 2006-09-26 10:11:02 »

Apparently this object is expensive to construct and/or initialize:

```java
public class Vec3
{
  public float x,y,z;
}
```

See my benchmark: compare "new object" (1354ms) and "used object" (447ms).

Jeff

JGO Coder

Got any cats?

 « Reply #21 - Posted 2006-10-18 15:52:08 »

Ya gotta be *really* careful with micro-benchmarks in Java. It's very, very easy to get meaningless results.

I'm swamped now but if I find time later I'll take a look at this particular example.

Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Riven


 « Reply #22 - Posted 2006-10-18 15:56:18 »

I'm fully aware of that. I ran into this in real-world applications and turned it into a benchmark so I could share the results with you guys without uploading large packages of code with at least a dozen dependencies.

Even in real-world cases the results of the changes in architecture were very similar, so please don't think of it as yet-another-benchmark that the JVM isn't handling properly yet.

pepijnve

Junior Devvie

Java games rock!

 « Reply #23 - Posted 2006-10-19 06:54:17 »

I second Riven's comment on the expensive Vec3 construction. I recently reworked some C++ code that used some vector math classes like the Vec3 class. All the Vec3 operators (+-*/) allocated new Vec3 instances (stack allocation). Initially I converted this to 'new Vec3()' in the Java code, but the performance was terrible. The algorithm in question was causing lots of very short-lived Vec3 instances to be allocated inside inner loops. I then reworked the code to reuse Vec3 instances as much as possible. This improved performance a lot, but the elegance of the code dropped. Unfortunately I can't seem to find my test results...
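One common way to do this kind of reuse is to pass the destination vector in as a parameter instead of returning a new one. The sketch below shows that style; it is an assumed illustration of the technique, not the reworked code from the post, and the names `ReuseSketch` and `dst` are hypothetical.

```java
// Hypothetical "destination parameter" API: callers supply the result object.
final class ReuseSketch {
    static final class Vec3 {
        float x, y, z;

        Vec3 set(float x, float y, float z) {
            this.x = x;
            this.y = y;
            this.z = z;
            return this;
        }
    }

    /** dst = a + b. Safe even if dst is the same object as a or b. */
    static Vec3 add(Vec3 a, Vec3 b, Vec3 dst) {
        dst.x = a.x + b.x;
        dst.y = a.y + b.y;
        dst.z = a.z + b.z;
        return dst;
    }

    /** dst = a x b. Results go through locals first so aliasing cannot corrupt them. */
    static Vec3 cross(Vec3 a, Vec3 b, Vec3 dst) {
        float x = a.y * b.z - a.z * b.y;
        float y = a.z * b.x - a.x * b.z;
        float z = a.x * b.y - a.y * b.x;
        return dst.set(x, y, z);
    }
}
```

A caller would typically keep a few scratch vectors around, for example `cross(ba, ca, tmp)`, which is where the loss of elegance comes from: every expression now needs a named place to put its result.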
blahblahblahh


 « Reply #24 - Posted 2006-10-19 08:09:48 »

Quote
Off the top of my head, it is clearly impossible for *object creation* to be that cheap - you have to initialise lots of data in memory (think how much data an object actually contains under the hood if it contains merely a simple float)

Quote
Think about it in terms of allocation versus initialization blah^3. Allocation is reserving address space for the object, and initialization is assigning actual values to fields, etc... So, allocation can indeed be as fast as a pointer bump.

I've done this in C++ code where I've written custom allocators for a routine. The routine allocates some millions of nodes over its relatively short (1 second) execution time. The allocator has a pre-allocated memory pool. When it needs to allocate a node, it simply bumps a pointer. No nodes get deallocated until the very end of the routine, at which point they are all "deallocated" by simply resetting the pointer to the top of the pool. This reduced allocation/deallocation times to just about nil.

You can do something similar in Java by creating an object pool, but those objects are still something that the gc is aware of.

Bad description on my part, but what I was trying to do was allude to the fact that a java object that merely contains a float also contains many other bytes imposed by the language. Whereas you have to re-initialize only the float with pooling, with new'ing you have to initialize a bunch of other data.

So, I took "object creation" as used in this discussion to mean "allocation + initialization of required JVM/language/platform data".

No? Yes? Maybe?

princec


 « Reply #25 - Posted 2006-10-19 11:52:33 »

Once again, it is probably the case that escape analysis and stack allocation will cure most of this. Due in Java 7 isn't it? I've seen it working in Jet and it pretty much does the trick performance wise.

Cas

walter_bruce

Junior Newbie

Performance matters.

 « Reply #26 - Posted 2006-12-05 16:32:56 »

Escape analysis will help and be a nice addition but it is no panacea.  It should easily handle the trivial cases shown in typical microbenchmarks where an object is created, used once, and thrown away all within a single method.  For harder cases, for example where the temporary object is used to marshal arguments for a possibly polymorphic method call, it remains to be seen how often escape analysis can handle this for real code in large projects.  And of course, if the object has any significant lifespan, for example if the object is part of a larger object, then escape analysis cannot help.  It does not allow objects to be inlined into other objects.
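The distinction drawn above can be illustrated with two nearly identical methods. In the first, the temporary never leaves the method, which is the easy case for escape analysis; in the second, it is stored in a field and therefore has to exist on the heap. This is a sketch of the concept only: whether a particular JIT actually removes the allocation depends on the VM, its version and its inlining decisions. The class and method names are hypothetical.

```java
// Hypothetical illustration of an escaping versus a non-escaping temporary.
final class EscapeSketch {
    static final class Vec3 {
        final float x, y, z;

        Vec3(float x, float y, float z) {
            this.x = x;
            this.y = y;
            this.z = z;
        }
    }

    /** Last vector published by escaping(); exists only to make the temporary escape. */
    static Vec3 last;

    /** The temporary is created, used once and dropped inside this method. */
    static float nonEscaping(float ax, float ay, float az,
                             float bx, float by, float bz) {
        Vec3 sum = new Vec3(ax + bx, ay + by, az + bz);
        return sum.x * sum.x + sum.y * sum.y + sum.z * sum.z;
    }

    /** Same arithmetic, but the temporary is published through a static field. */
    static float escaping(float ax, float ay, float az,
                          float bx, float by, float bz) {
        Vec3 sum = new Vec3(ax + bx, ay + by, az + bz);
        last = sum; // the object is now reachable from outside the method
        return sum.x * sum.x + sum.y * sum.y + sum.z * sum.z;
    }
}
```

Both methods return the same value; the only difference is whether the `Vec3` can be observed after the call returns.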

The small object overhead is still significant and while escape analysis is a good thing, it only fixes one aspect of a wider problem.
CommanderKeith
 « Reply #27 - Posted 2007-04-30 03:22:44 »

Quote
I read in some article* of a JVM engineer that creating new objects was 'almost at the cost of shifting a pointer'.

* I tried hard to find the article, but sometimes java.sun.com is kinda hard to wade through

I think CaptainJester just found the article you were talking about:

Quote

The 1.0 and 1.1 JDKs used a mark-sweep collector, which did compaction on some -- but not all -- collections, meaning that the heap might be fragmented after a garbage collection. Accordingly, memory allocation costs in the 1.0 and 1.1 JVMs were comparable to that in C or C++, where the allocator uses heuristics such as "first-fit" or "best-fit" to manage the free heap space. Deallocation costs were also high, since the mark-sweep collector had to sweep the entire heap at every collection. No wonder we were advised to go easy on the allocator.

In HotSpot JVMs (Sun JDK 1.2 and later), things got a lot better -- the Sun JDKs moved to a generational collector. Because a copying collector is used for the young generation, the free space in the heap is always contiguous so that allocation of a new object from the heap can be done through a simple pointer addition, as shown in Listing 1. This makes object allocation in Java applications significantly cheaper than it is in C, a possibility that many developers at first have difficulty imagining. Similarly, because copying collectors do not visit dead objects, a heap with a large number of temporary objects, which is a common situation in Java applications, costs very little to collect; simply trace and copy the live objects to a survivor space and reclaim the entire heap in one fell swoop. No free lists, no block coalescing, no compacting -- just wipe the heap clean and start over. So both allocation and deallocation costs per object went way down in JDK 1.2.

http://www.java-gaming.org/forums/index.php?topic=16512.msg130580;topicseen#msg130580

Riven


 « Reply #28 - Posted 2007-04-30 11:15:29 »

Thanks for backing up that statement.

It simply shows there is more to object-creation than just allocation. Even an object with an 'empty' constructor has significant overhead.  At least the object-header has to be written (as it's not a struct) which might require fetching the class-id, or something else entirely...

t_larkworthy

Senior Devvie

Medals: 1
Projects: 1

 « Reply #29 - Posted 2007-04-30 12:16:55 »

Object pooling in JOODE has sped it up by an order of magnitude.

Runesketch: an Online CCG built on Google App Engine where players draw their cards and trade. Fight, draw or trade yourself to success.