  Interesting proposals: Java 9 and beyond  (Read 64529 times)
Online princec



« Reply #30 - Posted 2015-09-09 09:50:29 »

jemalloc is indeed a great solution, though as you say, I've no need of it particularly... I just allocate a giant VBO to render everything into and that's all I ever need for the duration of the game. Well, actually that's not quite what I do... I allocate VBOs in large chunks (4 MB or so at a time) and fill them up one at a time during a frame, and if I run out of space, I allocate another one, and never release them.
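
A minimal sketch of that chunking scheme, assuming plain direct ByteBuffers as a stand-in for VBOs (no GL context is involved). The class name ChunkedArena, its methods and the per-frame reset are hypothetical illustrations, not princec's actual code:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: hands out slices of large direct buffers and never releases the chunks. */
final class ChunkedArena {
    private final int chunkSize;
    private final List<ByteBuffer> chunks = new ArrayList<>();
    private int active = -1; // index of the chunk currently being filled, -1 before first use

    ChunkedArena(int chunkSize) {
        this.chunkSize = chunkSize;
    }

    /** Returns a slice of exactly {@code size} bytes, moving to the next chunk when the current one is full. */
    ByteBuffer allocate(int size) {
        if (size < 0 || size > chunkSize) {
            throw new IllegalArgumentException("size must be in [0, " + chunkSize + "]: " + size);
        }
        if (active < 0 || chunks.get(active).remaining() < size) {
            active++;
            if (active == chunks.size()) {
                // Out of space: allocate another chunk and keep it for the rest of the run.
                chunks.add(ByteBuffer.allocateDirect(chunkSize));
            }
        }
        ByteBuffer chunk = chunks.get(active);
        int start = chunk.position();
        ByteBuffer view = chunk.duplicate();
        view.limit(start + size);
        chunk.position(start + size);
        return view.slice().order(ByteOrder.nativeOrder());
    }

    /** Rewinds every chunk so the next frame reuses the same memory. */
    void resetForNextFrame() {
        for (ByteBuffer chunk : chunks) {
            chunk.clear();
        }
        active = chunks.isEmpty() ? -1 : 0;
    }

    int chunkCount() {
        return chunks.size();
    }

    public static void main(String[] args) {
        ChunkedArena arena = new ChunkedArena(4 * 1024 * 1024); // 4 MB chunks, as in the post
        arena.allocate(3 * 1024 * 1024);
        arena.allocate(3 * 1024 * 1024); // does not fit in the first chunk, so a second one is created
        System.out.println("chunks: " + arena.chunkCount()); // chunks: 2
    }
}
```

Slices handed out before resetForNextFrame() alias memory that will be overwritten in the next frame, so in this sketch they must not be kept across frames.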

Now I'm curious... other than the known irritation of MappedBuffers... how are others using buffers in their game code?

Cas :)

Offline nsigma
« Reply #31 - Posted 2015-09-09 10:35:11 »

When a direct NIO buffer is no longer referenced in your Java code, this is detected and the cleaner is called. It's not perfect, because you must be sure that your native code (OpenGL, for example) isn't still trying to access it (otherwise you risk a crash of the JVM).

How's that situation different to the default behaviour (if possibly more likely to be encountered)?  Surely either way you need to keep the reference alive if it's being accessed by native code?

Wait. Looking at the info from Aleksey Shipilev...HotSpot calls malloc?  Ha ha ha!!  You're all fired!

You ever get straight to the point? :P

Praxis LIVE - hybrid visual IDE for (live) creative coding
Offline gouessej
« Reply #32 - Posted 2015-09-09 11:03:30 »

Not sure why you've got such a pressing need to dispose of native buffers so urgently...? Apart from mapped files (a known bugbear) the general usage pattern of direct buffers, particularly in regard to games, is to map one big one at start up and then ... never touch it again until your game exits.
I know how to slice buffers, but they are used differently in numerous scenegraph APIs, especially in LibGDX, JMonkeyEngine and JogAmp's Ardor3D Continuation. I agree with Spasi's first paragraph: it just moves the problem, but it can work out better for you if you know how to write an efficient custom allocation/deallocation system.

How's that situation different to the default behaviour (if possibly more likely to be encountered)?  Surely either way you need to keep the reference alive if it's being accessed by native code?
I advise you to read the page about PhantomReference in the Java documentation. The garbage collector can detect that an object is useless but release its resources later. The PhantomReference can be used to execute pre-mortem cleanups in this case. The deallocator of the cleaner is quite robust, it is able to detect whether it has already been called, there is (almost?) no risk of double free.
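
As a rough illustration of the PhantomReference approach, the sketch below registers each buffer with a ReferenceQueue and runs a release action once the reference has been enqueued. BufferTracker, Handle and the Runnable release action are hypothetical names; the AtomicBoolean guard only mimics the "already called" check mentioned above and is not the JDK's Cleaner implementation:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.nio.ByteBuffer;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical tracker: runs a release action once after a buffer becomes unreachable. */
final class BufferTracker {

    /** Phantom reference that carries the release action and refuses to run it twice. */
    static final class Handle extends PhantomReference<ByteBuffer> {
        private final Runnable release;
        private final AtomicBoolean released = new AtomicBoolean();

        Handle(ByteBuffer buffer, ReferenceQueue<ByteBuffer> queue, Runnable release) {
            super(buffer, queue);
            this.release = release;
        }

        /** Returns true only for the single call that actually performed the release. */
        boolean releaseOnce() {
            if (released.compareAndSet(false, true)) {
                release.run();
                return true;
            }
            return false;
        }
    }

    private final ReferenceQueue<ByteBuffer> queue = new ReferenceQueue<>();
    // Handles must stay strongly reachable, or they could be collected before being enqueued.
    private final Set<Handle> live = ConcurrentHashMap.newKeySet();

    Handle track(ByteBuffer buffer, Runnable release) {
        Handle handle = new Handle(buffer, queue, release);
        live.add(handle);
        return handle;
    }

    /** Drains the queue without blocking and returns how many release actions ran. */
    int drain() {
        int count = 0;
        Reference<? extends ByteBuffer> ref;
        while ((ref = queue.poll()) != null) {
            Handle handle = (Handle) ref;
            live.remove(handle);
            if (handle.releaseOnce()) {
                count++;
            }
        }
        return count;
    }
}
```

The release action must not capture the buffer itself, otherwise the buffer never becomes unreachable and the reference is never enqueued. When the garbage collector gets around to enqueuing is up to the JVM, which is the timeliness concern raised in this thread.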

I'll just write the dirty code supporting Java 1.6, 1.7, 1.8 and 1.9. In my humble opinion, it would be better to implement a free() method in java.nio.DirectByteBuffer (like in Apache Harmony) so that developers don't need to manipulate the cleaners; it would also make it possible to avoid using sun.misc.Cleaner. Then I would have to rely on package-protected classes in java.nio, but I would no longer have to rely on Sun internal classes. I find this solution acceptable, especially if there is no clean API to replace it in Java 1.9; it would leave some more time to write one, maybe in Java 1.10?

Edit.: princec, look at Bits.reserveMemory() and maybe you'll understand a bit my position.

Julien Gouesse | Personal blog | Website | Jogamp
Offline nsigma
« Reply #33 - Posted 2015-09-09 11:22:15 »

How's that situation different to the default behaviour (if possibly more likely to be encountered)?  Surely either way you need to keep the reference alive if it's being accessed by native code?
I advise you to read the page about PhantomReference in the Java documentation. The garbage collector can detect that an object is useless but release its resources later. The PhantomReference can be used to execute pre-mortem cleanups in this case. The deallocator of the cleaner is quite robust, it is able to detect whether it has already been called, there is (almost?) no risk of double free.

That's not what I meant!  You suggest with that approach you need to keep a strong reference to the direct buffer to ensure it's not freed during native access.  I said that's no different to the default case.

Incidentally, surely in this scenario it's better just to remove GC and references from the equation?  eg. using the ability for native code to allocate a direct buffer without a cleaner and handle disposal manually?

Praxis LIVE - hybrid visual IDE for (live) creative coding
Online princec



« Reply #34 - Posted 2015-09-09 11:33:32 »

The code in Bits.reserveMemory is indeed completely terrible and cringeworthy, but we have to remember where the original Sun engineers were coming from when they designed native buffers: the problem as they have succinctly put it is that if a thread deallocates a direct buffer manually, then some other thread with a reference to that buffer can effectively read/write into that memory even though it is no longer owned. The solution, as they put it, was you'd need a check for dealloc on every call in ByteBuffer.... which would break any optimisation and slow everything down to the point where it no longer offered any advantage at all to use them. Of course us game developers couldn't care less about forcing checks but the JVM has uses outside games which involve security and guarantees.
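
To make the cost argument concrete, here is a minimal sketch of what "a check for dealloc on every call" could look like. CheckedBuffer is a hypothetical wrapper, and free() here only flips a flag, since the standard API offers no public way to release a direct buffer:

```java
import java.nio.ByteBuffer;

/** Hypothetical wrapper: every access first checks whether the memory was released. */
final class CheckedBuffer {
    private final ByteBuffer buffer;
    // volatile so that a free() on one thread becomes visible to accessors on other threads
    private volatile boolean freed;

    CheckedBuffer(int capacity) {
        this.buffer = ByteBuffer.allocateDirect(capacity);
    }

    /** The extra branch (and volatile read) paid on every single access. */
    private void checkAlive() {
        if (freed) {
            throw new IllegalStateException("buffer already freed");
        }
    }

    int getInt(int index) {
        checkAlive();
        return buffer.getInt(index);
    }

    void putInt(int index, int value) {
        checkAlive();
        buffer.putInt(index, value);
    }

    /** Marks the buffer as released; in this sketch the memory itself is still reclaimed by the GC. */
    void free() {
        freed = true;
    }

    public static void main(String[] args) {
        CheckedBuffer b = new CheckedBuffer(8);
        b.putInt(0, 42);
        System.out.println(b.getInt(0)); // 42
        b.free();
        try {
            b.getInt(0);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage()); // rejected: buffer already freed
        }
    }
}
```

Even this is not airtight: another thread can call free() between the check and the access, so a really safe version would need heavier synchronisation still.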

So while I did once rant and spit and curse and scratch about why we can't force buffers to deallocate... I eventually came to understand that if we're trying to allocate/deallocate them rapidly enough that it's a problem we were just using them in a manner that was not intended nor even strictly necessary in 99% of use cases. The only one left that bugs me is mapped files.

Relatedly, does anyone have any idea about the latency/jitter/timeliness of reference queues?

Cas :)

Offline gouessej
« Reply #35 - Posted 2015-09-09 11:57:31 »

You suggest with that approach you need to keep a strong reference to the direct buffer to ensure it's not freed during native access.
It's already done in JOGL (and probably in some other third party libraries). It stops keeping such references when the buffers are deleted.

Incidentally, surely in this scenario it's better just to remove GC and references from the equation?  eg. using the ability for native code to allocate a direct buffer without a cleaner and handle disposal manually?
But I have to communicate with Java; I can't do everything in native code.

Julien Gouesse | Personal blog | Website | Jogamp
Offline nsigma
« Reply #36 - Posted 2015-09-09 12:41:04 »

Incidentally, surely in this scenario it's better just to remove GC and references from the equation?  eg. using the ability for native code to allocate a direct buffer without a cleaner and handle disposal manually?
But I have to communicate with Java; I can't do everything in native code.

That's not what I said either! ;)  Check this code in JNA for example.  It returns a direct byte buffer that doesn't have a cleaner (see the constructor in DirectByteBuffer) and has to be freed manually.

Praxis LIVE - hybrid visual IDE for (live) creative coding
Offline Roquen




« Reply #37 - Posted 2015-09-09 14:25:39 »

You ever get straight to the point? :P
Points are boring little zero dimension things.  Meh. 

Half joking, half serious.  Joking because that choice shouldn't matter since they shouldn't be called that much.  Half serious because the default malloc/realloc/free are multi-threaded general purpose heaps with runtime configuration options and can often be completely patched-out by the user.  General purpose heaps generally suck at everything...they just suck less than special purpose allocators when the programmer is breaking the expected usage patterns.

So they depend on compiler (including version), OS (including version) and any insane things that a given user on the given system might have mucked around with.
Offline gouessej
« Reply #38 - Posted 2015-09-09 15:02:49 »

That's not what I said either! ;)  Check this code in JNA for example.  It returns a direct byte buffer that doesn't have a cleaner (see the constructor in DirectByteBuffer) and has to be freed manually.
Ok I can do something similar with JNI by calling NewDirectByteBuffer, can't I? Now I see what you mean.

I advise some developers here to have a look at this document, I have found it very helpful:
http://www.ibm.com/developerworks/library/j-nativememory-linux/index.html

Julien Gouesse | Personal blog | Website | Jogamp
Offline nsigma
« Reply #39 - Posted 2015-09-09 18:17:04 »

Ok I can do something similar with JNI by calling NewDirectByteBuffer, can't I? Now I see what you mean.

Yes!  I assume it's how LWJGL 3 is making use of jemalloc as per @spasi's post earlier.  Be a good utility for JOGL if it's not in there already?!  A library just providing that element of LWJGL with the different allocators looks like it could be useful too?

If you wanted it should still be possible to manage this via (Phantom)References rather than forcing the end user to manually free memory, while still staying away from any internal JVM code.

Praxis LIVE - hybrid visual IDE for (live) creative coding
Offline Spasi
« Reply #40 - Posted 2015-09-09 19:28:59 »

I assume it's how LWJGL 3 is making use of jemalloc as per @spasi's post earlier.

We did that in LWJGL 2. LWJGL 3 talks to native code exclusively with primitives and never calls NewDirectByteBuffer (or any other JNI function, except in callbacks). The buffer instances are constructed via .duplicate() and overriding the address/capacity fields with Unsafe (plus appropriate fallbacks when Unsafe is not available). Details here.
Offline gouessej
« Reply #41 - Posted 2015-09-09 21:14:49 »

Yes!  I assume it's how LWJGL 3 is making use of jemalloc as per @spasi's post earlier.  Be a good utility for JOGL if it's not in there already?!  A library just providing that element of LWJGL with the different allocators looks like it could be useful too?
There are some fundamental differences between those libraries concerning the use of third party libraries and external tools. The JogAmp APIs let the developer decide what to do when a buffer is destroyed and we don't plan to support any alternative means of creating direct NIO buffers.

If you wanted it should still be possible to manage this via (Phantom)References rather than forcing the end user to manually free memory, while still staying away from any internal JVM code.
The first part is already done in JogAmp's Ardor3D Continuation but I prefer detecting this case even before the garbage collector. I think that I will go on using Sun internal APIs at least for Java 1.7 and Java 1.8. If something better appears in the standard Java API, we'll use it in JogAmp. I remind you that some of our APIs must work in tons of environments (there is even at least one user who tries to run JOGL on Wii and PS3), we can't rely on a library that doesn't work under Android for example.

Julien Gouesse | Personal blog | Website | Jogamp
Offline theagentd
« Reply #42 - Posted 2015-09-10 00:13:18 »

I assume it's how LWJGL 3 is making use of jemalloc as per @spasi's post earlier.

We did that in LWJGL 2. LWJGL 3 talks to native code exclusively with primitives and never calls NewDirectByteBuffer (or any other JNI function, except in callbacks). The buffer instances are constructed via .duplicate() and overriding the address/capacity fields with Unsafe (plus appropriate fallbacks when Unsafe is not available). Details here.
Something I noticed in LWJGL 2 was that when mapping a buffer, the ByteBuffer returned was recreated if either the VBO address or VBO size was updated. Does this mean that the ByteBuffer can be reused indefinitely now?

Myomyomyo.
Offline Spasi
« Reply #43 - Posted 2015-09-10 08:04:50 »

Something I noticed in LWJGL 2 was that when mapping a buffer, the ByteBuffer returned was recreated if either the VBO address or VBO size was updated. Does this mean that the ByteBuffer can be reused indefinitely now?

Yes, the LWJGL 3 implementation doesn't care whether the address/size has changed or not.
Offline nsigma
« Reply #44 - Posted 2015-09-10 09:01:42 »

There are some fundamental differences between those libraries concerning the use of third party libraries and external tools. The JogAmp APIs let the developer decide what to do when a buffer is destroyed and we don't plan to support any alternative means of creating direct NIO buffers.

What, never? :P

Actually, I'm mainly thinking about situations where direct buffers are allocated outside of the library (JOGL) itself.  In those cases, it's not like it makes any difference to JOGL where the buffer comes from - eg. I'm passing GStreamer video direct to JOGL textures using JNA creating direct buffers - it's all just pointers!

It seems like various projects besides LWJGL are exploring alternative allocators - just saying that a tiny library that does just that, in a pluggable way, would seem to be useful.
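
Something along these lines, as a sketch only. BufferAllocator, JdkAllocator and Allocators are hypothetical names that exist in neither JOGL nor LWJGL; the default implementation simply delegates to ByteBuffer.allocateDirect() and leaves reclamation to the GC:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Objects;

/** Hypothetical pluggable allocator interface; not part of JOGL, LWJGL or the JDK. */
interface BufferAllocator {
    /** Returns a direct buffer of the requested capacity. */
    ByteBuffer allocate(int capacity);

    /** Releases a buffer previously obtained from this allocator. */
    void free(ByteBuffer buffer);
}

/** Default implementation: delegates to the JDK and lets the garbage collector reclaim the memory. */
final class JdkAllocator implements BufferAllocator {
    @Override
    public ByteBuffer allocate(int capacity) {
        return ByteBuffer.allocateDirect(capacity).order(ByteOrder.nativeOrder());
    }

    @Override
    public void free(ByteBuffer buffer) {
        // Nothing to do: the JDK frees the memory after the buffer becomes unreachable.
    }
}

/** Single switch point so that library code never hard-codes an allocation strategy. */
final class Allocators {
    private static volatile BufferAllocator current = new JdkAllocator();

    private Allocators() {
    }

    static BufferAllocator get() {
        return current;
    }

    static void install(BufferAllocator allocator) {
        current = Objects.requireNonNull(allocator, "allocator");
    }

    public static void main(String[] args) {
        ByteBuffer vertices = Allocators.get().allocate(64);
        System.out.println(vertices.isDirect() + " " + vertices.capacity()); // true 64
        Allocators.get().free(vertices);
    }
}
```

A native-backed implementation (jemalloc through JNI or JNA, say) would implement the same interface and be registered with Allocators.install(...), while callers keep using Allocators.get().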

I remind you that some of our APIs must work in tons of environments (there is even at least one user who tries to run JOGL on Wii and PS3), we can't rely on a library that doesn't work under Android for example.

If there's no malloc / free to fall back to, you might have bigger problems?! ;)  Mind you, you could also just fall back to allocateDirect().

We did that in LWJGL 2. LWJGL 3 talks to native code exclusively with primitives and never calls NewDirectByteBuffer (or any other JNI function, except in callbacks). The buffer instances are constructed via .duplicate() and overriding the address/capacity fields with Unsafe (plus appropriate fallbacks when Unsafe is not available). Details here.

Interesting!  My naive hope of ignoring Unsafe appears misguided. :D  Any benchmarks of the impacts?

Praxis LIVE - hybrid visual IDE for (live) creative coding
Offline gouessej
« Reply #45 - Posted 2015-09-10 10:49:50 »

What, never? :P
I don't know. We should talk about that to the other contributors. Personally, I don't want to provide an alternative allocator within JogAmp, I'm against this option.

Actually, I'm mainly thinking about situations where direct buffers are allocated outside of the library (JOGL) itself.  In those cases, it's not like it makes any difference to JOGL where the buffer comes from - eg. I'm passing GStreamer video direct to JOGL textures using JNA creating direct buffers - it's all just pointers!

It seems like various projects besides LWJGL are exploring alternative allocators - just saying that a tiny library that does just that, in a pluggable way, would seem to be useful.
We do nothing to prevent developers from using alternative allocators, as far as I know. As long as you don't pass arrays to methods that preferably accept direct NIO buffers, JOGL won't create any direct NIO buffers except when using some utilities. There is almost nothing to plug in, then. You can already use JNA, Apache DirectMemory or another library with JOGL; what more do you expect? I could file an RFE about com.jogamp.common.nio.Buffers to allow passing a custom allocator here:
https://github.com/sgothel/gluegen/blob/master/src/java/com/jogamp/common/nio/Buffers.java#L79

Julien Gouesse | Personal blog | Website | Jogamp
Offline Spasi
« Reply #46 - Posted 2015-09-10 11:26:46 »

We did that in LWJGL 2. LWJGL 3 talks to native code exclusively with primitives and never calls NewDirectByteBuffer (or any other JNI function, except in callbacks). The buffer instances are constructed via .duplicate() and overriding the address/capacity fields with Unsafe (plus appropriate fallbacks when Unsafe is not available). Details here.

Interesting!  My naive hope of ignoring Unsafe appears misguided. :D  Any benchmarks of the impacts?

It was not done for performance. LWJGL 3 has the explicit goal of having minimal native code, for two reasons:

- Attract more contributions from Java developers that have no C experience.
- Make the transition to Project Panama (JVM FFI) in Java 10 as painless as possible.

This design has the nice side-effect that the JVM is able to inline and optimize a lot of code that was previously hidden inside JNI functions. The JNI code now does nothing but call the native function. With Panama, the JVM will be able to inline all the way to the native function call, completely eliminating JNI overhead. I'm also hopeful that using Unsafe won't be necessary by then.

Finally, this has allowed us to deduplicate JNI methods, resulting in important space savings in the native binaries. Details here.

---

A note on LWJGL 3's jemalloc support: it's optional. You can delete the native binary and LWJGL will work. It will simply fall back to the system's malloc/free/etc. Which is what NIO/Unsafe uses, minus the VM housekeeping overhead.
Offline gouessej
« Reply #47 - Posted 2015-09-10 13:46:21 »

noctarius, is my suggestion (putting the call to sun.misc.Cleaner.clean() into java.nio.DirectByteBuffer.free()) completely nonsensical?

Julien Gouesse | Personal blog | Website | Jogamp
Offline noctarius



« Reply #48 - Posted 2015-09-10 17:00:04 »

noctarius, is my suggestion (putting the call to sun.misc.Cleaner.clean() into java.nio.DirectByteBuffer.free()) completely nonsensical?

Well, I think it'll never happen. It is still a specific use case, and you don't want people to expect that it has to be used. A method called free() always sounds like you actually have to free the buffer.

Offline Spasi
« Reply #49 - Posted 2015-09-12 20:02:42 »

Interesting!  My naive hope of ignoring Unsafe appears misguided. :D  Any benchmarks of the impacts?

I took some time to test this and have verified that the JVM is able to eliminate ByteBuffer allocations via escape analysis. Simple benchmark:

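public static void main(String[] args) {
   // warmup, so that the JIT has compiled testImpl() before timing starts
   // (testImpl() stands for whichever test method below is being measured)
   for ( int i = 0; i < 100; i++ ) {
      testImpl();
   }

   // bench: 1000 timed runs, reported in milliseconds
   long t = System.nanoTime();
   for ( int i = 0; i < 1000; i++ ) {
      testImpl();
   }
   t = System.nanoTime() - t;
   System.out.println("TIME: " + t / 1000 / 1000 + "ms");
}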

Tests and results:

// Reference implementation using Unsafe:
private static void testUnsafe() {
   long target = nje_malloc(8);
   for ( int i = 0; i < 10000; i++ ) {
      long source = nje_malloc(8);
      memPutInt(source + 0, 0xDEADBEEF);
      memPutInt(source + 4, 0xCAFEBABE);
      memCopy(source, target, 8);
      nje_free(source);
   }
   nje_free(target);
}
/*
TIME: 668ms
Heap
 PSYoungGen      total 38400K, used 2665K [0x00000007d5d00000, 0x00000007d8780000, 0x0000000800000000)
  eden space 33280K, 8% used [0x00000007d5d00000,0x00000007d5f9a6f0,0x00000007d7d80000)
  from space 5120K, 0% used [0x00000007d8280000,0x00000007d8280000,0x00000007d8780000)
  to   space 5120K, 0% used [0x00000007d7d80000,0x00000007d7d80000,0x00000007d8280000)
 ParOldGen       total 86016K, used 0K [0x0000000781800000, 0x0000000786c00000, 0x00000007d5d00000)
  object space 86016K, 0% used [0x0000000781800000,0x0000000781800000,0x0000000786c00000)
 PSPermGen       total 21504K, used 3082K [0x000000077c600000, 0x000000077db00000, 0x0000000781800000)
  object space 21504K, 14% used [0x000000077c600000,0x000000077c902958,0x000000077db00000)
*/


// ByteBuffer implementation, using je_malloc for malloc/free
private static void testLWJGL() {
   ByteBuffer target = memAlloc(8);
   for ( int i = 0; i < 10000; i++ ) {
      ByteBuffer source = memAlloc(8);
      source.putInt(0xDEADBEEF);
      source.putInt(0xCAFEBABE);
      source.flip();
      target.put(source);
      target.flip();
      source.flip();
      memFree(source);
   }
   memFree(target);
}
// Results with default JVM arguments
/*
TIME: 693ms
Heap
 PSYoungGen      total 38400K, used 5330K [0x00000007d5d00000, 0x00000007d8780000, 0x0000000800000000)
  eden space 33280K, 16% used [0x00000007d5d00000,0x00000007d62348c0,0x00000007d7d80000)
  from space 5120K, 0% used [0x00000007d8280000,0x00000007d8280000,0x00000007d8780000)
  to   space 5120K, 0% used [0x00000007d7d80000,0x00000007d7d80000,0x00000007d8280000)
 ParOldGen       total 86016K, used 0K [0x0000000781800000, 0x0000000786c00000, 0x00000007d5d00000)
  object space 86016K, 0% used [0x0000000781800000,0x0000000781800000,0x0000000786c00000)
 PSPermGen       total 21504K, used 3091K [0x000000077c600000, 0x000000077db00000, 0x0000000781800000)
  object space 21504K, 14% used [0x000000077c600000,0x000000077c904e90,0x000000077db00000)
*/


// Results with -XX:-DoEscapeAnalysis
/*
[GC [PSYoungGen: 33280K->400K(38400K)] 33280K->408K(124416K), 0.0009441 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 33680K->368K(38400K)] 33688K->384K(124416K), 0.0008054 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 33648K->352K(38400K)] 33664K->376K(124416K), 0.0006958 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 33632K->368K(71680K)] 33656K->392K(157696K), 0.0007315 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 66928K->352K(71680K)] 66952K->376K(157696K), 0.0009498 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 66912K->384K(133632K)] 66936K->408K(219648K), 0.0007236 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 133504K->32K(128512K)] 133528K->352K(214528K), 0.0007481 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 128032K->32K(123904K)] 128352K->352K(209920K), 0.0003752 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
[GC [PSYoungGen: 123424K->32K(119808K)] 123744K->352K(205824K), 0.0004305 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
TIME: 808ms
Heap
 PSYoungGen      total 119808K, used 42952K [0x00000007d5d00000, 0x00000007de000000, 0x0000000800000000)
  eden space 118784K, 36% used [0x00000007d5d00000,0x00000007d86ea2d8,0x00000007dd100000)
  from space 1024K, 3% used [0x00000007dde00000,0x00000007dde08000,0x00000007ddf00000)
  to   space 1024K, 0% used [0x00000007ddf00000,0x00000007ddf00000,0x00000007de000000)
 ParOldGen       total 86016K, used 320K [0x0000000781800000, 0x0000000786c00000, 0x00000007d5d00000)
  object space 86016K, 0% used [0x0000000781800000,0x0000000781850050,0x0000000786c00000)
 PSPermGen       total 21504K, used 3091K [0x000000077c600000, 0x000000077db00000, 0x0000000781800000)
  object space 21504K, 14% used [0x000000077c600000,0x000000077c904e90,0x000000077db00000)
*/

This optimization is not possible if you pass/return ByteBuffer instances to/from JNI methods, or use ByteBuffer.allocateDirect. For example, the same test with allocateDirect:

// ByteBuffer implementation, using ByteBuffer.allocateDirect for malloc
private static void testJava() {
   ByteBuffer target = ByteBuffer.allocateDirect(8).order(ByteOrder.nativeOrder());
   for ( int i = 0; i < 10000; i++ ) {
      ByteBuffer source = ByteBuffer.allocateDirect(8).order(ByteOrder.nativeOrder());
      source.putInt(0xDEADBEEF);
      source.putInt(0xCAFEBABE);
      source.flip();
      target.put(source);
      target.flip();
      source.flip();
   }
}
/*
[GC [PSYoungGen: 33280K->5104K(38400K)] 33280K->28272K(124416K), 0.0620751 secs] [Times: user=0.19 sys=0.00, real=0.06 secs]
[GC [PSYoungGen: 38384K->5104K(71680K)] 61552K->51640K(157696K), 0.0356250 secs] [Times: user=0.14 sys=0.00, real=0.04 secs]
[GC [PSYoungGen: 71664K->5104K(71680K)] 118200K->108720K(175616K), 0.0715550 secs] [Times: user=0.23 sys=0.00, real=0.07 secs]
[Full GC [PSYoungGen: 5104K->0K(71680K)] [ParOldGen: 103616K->59651K(166912K)] 108720K->59651K(238592K) [PSPermGen: 2530K->2529K(21504K)], 0.2818225 secs] [Times: user=1.02 sys=0.00, real=0.28 secs]
[GC [PSYoungGen: 66560K->5120K(101888K)] 126211K->121827K(268800K), 0.0715050 secs] [Times: user=0.31 sys=0.00, real=0.07 secs]
[Full GC [PSYoungGen: 5120K->0K(101888K)] [ParOldGen: 116707K->37099K(185344K)] 121827K->37099K(287232K) [PSPermGen: 2529K->2529K(21504K)], 0.1767194 secs] [Times: user=0.61 sys=0.00, real=0.18 secs]
[GC [PSYoungGen: 96768K->5120K(138240K)] 133867K->128955K(323584K), 0.0938719 secs] [Times: user=0.36 sys=0.00, real=0.09 secs]
[Full GC [PSYoungGen: 5120K->0K(138240K)] [ParOldGen: 123835K->51468K(253952K)] 128955K->51468K(392192K) [PSPermGen: 2529K->2529K(21504K)], 0.2487228 secs] [Times: user=0.88 sys=0.00, real=0.25 secs]
[GC [PSYoungGen: 133120K->70624K(237056K)] 184588K->122092K(491008K), 0.1287830 secs] [Times: user=0.38 sys=0.03, real=0.13 secs]
[GC [PSYoungGen: 206816K->72256K(238592K)] 258284K->123724K(492544K), 0.1258122 secs] [Times: user=0.36 sys=0.05, real=0.13 secs]
[GC [PSYoungGen: 208448K->72256K(293376K)] 259916K->123724K(547328K), 0.1384435 secs] [Times: user=0.39 sys=0.03, real=0.14 secs]
[GC [PSYoungGen: 261696K->100480K(293888K)] 313164K->151948K(547840K), 0.1869928 secs] [Times: user=0.59 sys=0.00, real=0.19 secs]
[GC [PSYoungGen: 289920K->100480K(344576K)] 341388K->151948K(598528K), 0.1922449 secs] [Times: user=0.53 sys=0.03, real=0.19 secs]
[GC [PSYoungGen: 329344K->121376K(352256K)] 380812K->172844K(606208K), 0.2226970 secs] [Times: user=0.67 sys=0.00, real=0.22 secs]
TIME: 4700ms
Heap
 PSYoungGen      total 352256K, used 282188K [0x00000007d5d00000, 0x00000007f8200000, 0x0000000800000000)
  eden space 228864K, 70% used [0x00000007d5d00000,0x00000007dfa0b370,0x00000007e3c80000)
  from space 123392K, 98% used [0x00000007e3c80000,0x00000007eb308000,0x00000007eb500000)
  to   space 137728K, 0% used [0x00000007efb80000,0x00000007efb80000,0x00000007f8200000)
 ParOldGen       total 253952K, used 51468K [0x0000000781800000, 0x0000000791000000, 0x00000007d5d00000)
  object space 253952K, 20% used [0x0000000781800000,0x0000000784a430a8,0x0000000791000000)
 PSPermGen       total 21504K, used 2536K [0x000000077c600000, 0x000000077db00000, 0x0000000781800000)
  object space 21504K, 11% used [0x000000077c600000,0x000000077c87a2e0,0x000000077db00000)
*/

Also tried with Cleaner.clean(): it runs in about 2900 ms, which is still about 4 times slower.
Offline nsigma
« Reply #50 - Posted 2015-09-13 09:58:02 »

Thanks @spasi  Guess I was more thinking about the difference in time of your second (jemalloc) test using MemoryAccessorUnsafe vs MemoryAccessorJNI though (which I think is roughly equivalent to targeting a specific VM vs targeting a generic VM?)

Well, I think it'll never happen. It is still a specific use case, and you don't want people to expect that it has to be used. A method called free() always sounds like you actually have to free the buffer.

Doesn't seem that much different to Closeable to me.  Some resources require explicit life-cycle management and direct buffers should have been one of them!

Praxis LIVE - hybrid visual IDE for (live) creative coding
Online princec



« Reply #51 - Posted 2015-09-13 10:31:28 »

Looking at the actual #asm output produced might be even more enlightening. (Roquen?)

Cas :)

Offline Icecore
« Reply #52 - Posted 2015-09-13 12:07:52 »

   for ( int i = 0; i < 10000; i++ ) {
      ByteBuffer source = ByteBuffer.allocateDirect(8).order(ByteOrder.nativeOrder());
      //source.free();
   }

  memory 98% used
Off-topic: lol, hm, where is the problem? XD

Last known State: Reassembled in Cyberspace
End Transmission....
..
.
Offline Riven


« Reply #53 - Posted 2015-09-13 13:05:13 »

Doesn't seem that much different to Closeable to me.  Some resources require explicit life-cycle management and direct buffers should have been one of them!
With the exception that mishandling Closeables will cause memory leaks, whereas mishandling malloc/free causes native crashes and/or security issues.

Offline gouessej
« Reply #54 - Posted 2015-09-13 14:58:03 »

With the exception that mishandling Closeables will cause memory leaks, whereas mishandling malloc/free causes native crashes and/or security issues.
The current code is already protected against double free (look for the word "paranoia" in the comments of the source code concerning direct NIO buffers, especially in the deallocator)  and mishandling direct NIO buffers can still cause memory leaks (on the native heap).

Julien Gouesse | Personal blog | Website | Jogamp
Offline nsigma
« Reply #55 - Posted 2015-09-13 19:05:19 »

mishandling direct NIO buffers can still cause memory leaks (on the native heap).

Not to mention segfaults.  Protecting against native crashes and other issues within the VM shouldn't be an issue - it's not really much different to the current housekeeping.  OTOH, once the direct buffer has been passed to native code it's already possible to trigger segfaults with mishandling.  I'd rather crashes that easily reproduce than ones at the whim of the garbage collector.

Praxis LIVE - hybrid visual IDE for (live) creative coding
Offline Roquen




« Reply #56 - Posted 2015-09-13 22:49:57 »

http://pastebin.java-gaming.org/7520d69513f1b
http://pastebin.java-gaming.org/06752405d3911
Offline Roquen




« Reply #57 - Posted 2015-09-13 23:33:39 »

Oh! Since I had JITWatch out, I just compiled this:

  private static void sum(float[] d, float[] a, float[] b)
  {
    int len = d.length;

    for(int i=0; i<len; i++) {
      d[i] = a[i]+b[i];
    }
  }


Annnddd....the top of the unrolled loop looks like:
             L0001: movdqu xmm0,xmmword ptr     ; Load 4 values from a
0x0000000002abc9b7: movdqu xmm1,xmmword ptr     ; Load 4 values from b
0x0000000002abc9be: addps  xmm1,xmm0            ; Yo! Add 4 values


and likewise for mul.  So there's at least some basic autovectorization in the current release build.
Offline Spasi
« Reply #58 - Posted 2015-09-14 11:08:29 »


Thanks! It helped me identify a few issues with the current implementation:

- Using .slice() is more efficient than .duplicate(). (improves the reflection fallback)
- Using sun.reflect.FieldAccessor directly eliminates some overhead from java.lang.reflect.Field. (improves the reflection fallback)
- Making the JEmalloc instance (that holds the function pointers) final eliminates an indirection and implicit NPE check. (improves all)
- Using Unsafe.allocateInstance eliminates any overhead from slice()/duplicate(). (improves the Unsafe implementation)
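
For anyone unfamiliar with the two calls, a small sketch of how slice() and duplicate() differ in standard NIO. It only shows the documented semantics and says nothing about LWJGL's internals; SliceVsDuplicate and describe() are names made up for the example:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical demo class comparing the views returned by duplicate() and slice(). */
final class SliceVsDuplicate {

    /** Returns {position, limit, capacity} of the given buffer. */
    static int[] describe(ByteBuffer buffer) {
        return new int[] { buffer.position(), buffer.limit(), buffer.capacity() };
    }

    public static void main(String[] args) {
        ByteBuffer base = ByteBuffer.allocateDirect(16);
        base.position(4);
        base.limit(12);

        // duplicate(): same position, limit and capacity as the original
        ByteBuffer dup = base.duplicate();
        System.out.println(Arrays.toString(describe(dup))); // [4, 12, 16]

        // slice(): starts at the original's position, so position is 0 and capacity is remaining()
        ByteBuffer sl = base.slice();
        System.out.println(Arrays.toString(describe(sl))); // [0, 8, 8]

        // Both views share the same memory: index 0 of the slice is index 4 of the base buffer.
        sl.put(0, (byte) 7);
        System.out.println(base.get(4)); // 7
    }
}
```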

With the above changes and after adding NPE checks to testUnsafe (the implicit NPE checks in testLWJGL cannot be removed), both tests get JIT compiled to identical code:

Unsafe: http://pastebin.java-gaming.org/520d9715f3b1a
LWJGL: http://pastebin.java-gaming.org/20d918f5b3a16

Also, JITWatch is awesome. I always wanted to try it and Roquen gave me an excuse. Use it!

Thanks @spasi  Guess I was more thinking about the difference in time of your second (jemalloc) test using MemoryAccessorUnsafe vs MemoryAccessorJNI though (which I think is roughly equivalent to targeting a specific VM vs targeting a generic VM?)

That's correct, MemoryAccessorJNI will work on any JVM. It is also going to be slow, which is why there's an appropriate warning if it ends up being used. How slow? As slow as NewDirectByteBuffer plus the overhead of an extra JNI method call. Why is NewDirectByteBuffer slow? Because it calls a package private DirectByteBuffer constructor reflectively and reflective constructor calls are horribly inefficient. That's why the MemoryAccessorReflect fallback uses slice() and then sets the appropriate field values via reflection. It's much faster, but also requires a real object instance and escape analysis can't do anything to improve that.
Offline Roquen




« Reply #59 - Posted 2015-09-14 11:10:45 »

The suggestions button is very useful.  For example, it tells you why some basic transforms are not being performed (like method X being too large to be inlined).

EDIT: and another useful and common problem:  THIS BRANCH IS RANDOM!! AHHHH!!!