BufferedImage backed by a ByteBuffer
Offline Marvin Fröhlich

Senior Devvie




May the 4th, be with you...


« Posted 2008-05-14 22:44:21 »

hi

I have written a BufferedImage extension that uses a special DataBuffer implementation, which stores its data directly in a direct ByteBuffer. I need this to avoid duplicating the data in memory. The image data has to be sent to OpenGL, which can only be done through a ByteBuffer. To keep the image updatable/redrawable, I need the link between the BufferedImage and the ByteBuffer, hence this solution.

I currently have a working solution, but it is incredibly slow. This led me to this nice section of these boards ;).

I have implemented it by creating a WritableRaster extension that takes a PixelInterleavedSampleModel (instantiated just as BufferedImage does) and my DirectDataBufferByte. The documentation says that a BufferedImage uses ByteInterleavedRaster for byte data to improve performance. I would like to use it, too, but it doesn't accept my DirectDataBufferByte (the one that is backed by a ByteBuffer). Unfortunately, the source of ByteInterleavedRaster and its parent classes doesn't seem to be available, so I cannot look up what they do to improve performance.
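The setup described above can be sketched roughly as follows. This is a minimal illustration, not the actual jagatoo code: the class names ByteBufferDataBuffer, PlainWritableRaster and DirectImageSketch are hypothetical, and the sketch assumes an opaque 3-byte RGB layout with band offsets (0, 1, 2). Every pixel access goes through DataBuffer.getElem/setElem, which is the generic (and presumably slow) path the post complains about.

```java
import java.awt.Point;
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;
import java.awt.image.PixelInterleavedSampleModel;
import java.awt.image.SampleModel;
import java.awt.image.WritableRaster;
import java.nio.ByteBuffer;

/** Hypothetical DataBuffer that reads and writes a ByteBuffer directly. */
final class ByteBufferDataBuffer extends DataBuffer {
    private final ByteBuffer buffer;

    ByteBufferDataBuffer(ByteBuffer buffer) {
        super(DataBuffer.TYPE_BYTE, buffer.capacity());
        this.buffer = buffer;
    }

    @Override
    public int getElem(int bank, int i) {
        return buffer.get(i) & 0xFF; // absolute read, widened as unsigned
    }

    @Override
    public void setElem(int bank, int i, int val) {
        buffer.put(i, (byte) val);   // absolute write
    }
}

/** WritableRaster's constructor is protected, so a trivial subclass is needed. */
final class PlainWritableRaster extends WritableRaster {
    PlainWritableRaster(SampleModel sampleModel, DataBuffer dataBuffer) {
        super(sampleModel, dataBuffer, new Point(0, 0));
    }
}

final class DirectImageSketch {
    /** Builds an opaque RGB image whose pixels live in the given buffer. */
    static BufferedImage createRgbImage(int width, int height, ByteBuffer target) {
        SampleModel sm = new PixelInterleavedSampleModel(
                DataBuffer.TYPE_BYTE, width, height, 3, width * 3, new int[] {0, 1, 2});
        ColorModel cm = new ComponentColorModel(
                ColorSpace.getInstance(ColorSpace.CS_sRGB), false, false,
                Transparency.OPAQUE, DataBuffer.TYPE_BYTE);
        WritableRaster raster = new PlainWritableRaster(sm, new ByteBufferDataBuffer(target));
        return new BufferedImage(cm, raster, false, null);
    }

    public static void main(String[] args) {
        ByteBuffer pixels = ByteBuffer.allocateDirect(2 * 1 * 3);
        BufferedImage image = createRgbImage(2, 1, pixels);
        image.setRGB(1, 0, 0x112233);
        // The second pixel starts at byte 3 and is stored in R, G, B order.
        System.out.printf("%02x %02x %02x%n", pixels.get(3), pixels.get(4), pixels.get(5));
    }
}
```

Drawing into the image (setRGB here, or a Graphics2D) writes straight into the ByteBuffer, so no second copy of the pixel data exists.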

The only difference between my DirectBufferedImage and the regular BufferedImage (with bytes) is the WritableRaster being used. But the original BufferedImage is 3x as fast as my DirectBufferedImage. And whether the data is written to a byte array or to a ByteBuffer doesn't make any difference in performance. So I guess I only need a better WritableRaster implementation.

I hope I was clear enough and someone can help me out here. If there are any further questions, please don't hesitate to ask.

Thanks in advance,

Marvin
Online Riven
« League of Dukes »

« JGO Overlord »


Medals: 835
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #1 - Posted 2008-05-15 19:16:12 »

First, make a tiny test case where the only thing done is writing to the BufferedImage with your DataBufferByte impl.

It should be a command-line program and do *really* nothing else.


Why am I stressing this? If any code in the JVM creates a HeapByteBuffer using ByteBuffer.allocate(...), direct buffers can slow down by a factor of 10, seriously - the JVM has to deoptimize from extremely fast pointer access to a jump table over the subclasses, now that there is more than one subclass of ByteBuffer in use.

Further, the Client VM is rather poor at accessing direct buffers. Switch to the Server VM, which can speed up buffer performance by a factor of 3 or so.

You might want to post your findings... ;)

Offline Linuxhippy

Senior Devvie


Medals: 1


Java games rock!


« Reply #2 - Posted 2008-05-22 17:58:18 »

Quote
Why am I stressing this? If any code in the JVM creates a HeapByteBuffer using ByteBuffer.allocate(...), direct buffers can slow down by a factor of 10, seriously - the JVM has to deoptimize from extremely fast pointer access to a jump table over the subclasses, now that there is more than one subclass of ByteBuffer in use.

If it were a "normal" method call, it would go from a monomorphic to a bimorphic call, which is something like if/else plus uncommon-trap handling.

However, this stuff is totally intrinsified (at least in the server VM), so I guess it will not be treated like a normal method.

Regards, Clemens
Online Riven


« Reply #3 - Posted 2008-05-22 18:36:12 »

Yeah, well, try and see.

I've seen horrific performance degradation simply because I put a ByteBuffer.allocate(1) as the first line in my program.
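A rough way to "try and see" is sketched below. It is an assumption-laden harness, not the program from the post: the class name BufferTypeExperiment, the "heap" command-line switch and the buffer size are all made up for illustration. Each mode should be run in a fresh JVM, and results will vary between VM versions.

```java
import java.nio.ByteBuffer;

final class BufferTypeExperiment {
    /** Fills the buffer `rounds` times with absolute puts; returns elapsed nanoseconds. */
    static long timeFill(ByteBuffer direct, int rounds) {
        int size = direct.capacity();
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < size; i++) {
                direct.put(i, (byte) (i + r));
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Run once without arguments and once with "heap", each in its own JVM.
        if (args.length > 0 && args[0].equals("heap")) {
            ByteBuffer.allocate(1); // brings the heap-backed ByteBuffer subclass into play
        }
        ByteBuffer direct = ByteBuffer.allocateDirect(196608);
        timeFill(direct, 500); // warm-up so the JIT has compiled the loop
        System.out.println(timeFill(direct, 500) / 1_000_000 + " ms");
    }
}
```

Comparing the two printed timings (and repeating with -client and -server where available) shows whether the effect exists on a given VM.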

Offline Marvin Fröhlich



« Reply #4 - Posted 2008-05-28 18:46:18 »

Hey. Thanks a lot for your replies, guys.

I do have a test case like this. Actually, I tested it like this before I posted my request. I even made a test case where I filled a ByteBuffer with random values on the one hand and a byte array on the other. The ByteBuffer and the array were both of size 196608, and I filled each of them 500 times. The ByteBuffer version was about 4% slower. So I guess everything is right with the "directness" of the ByteBuffer.
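A comparison along those lines might look like the sketch below. It is not the linked test case; the class name FillComparison and the fixed seed are assumptions, and only the size (196608) and round count (500) come from the post. Note that the cost of Random.nextInt() is included in both timings and may hide part of the difference.

```java
import java.nio.ByteBuffer;
import java.util.Random;

final class FillComparison {
    /** Fills a byte array with pseudo-random values; returns elapsed nanoseconds. */
    static long fillArray(byte[] target, int rounds, long seed) {
        Random random = new Random(seed);
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < target.length; i++) {
                target[i] = (byte) random.nextInt();
            }
        }
        return System.nanoTime() - start;
    }

    /** Same work, but written through absolute ByteBuffer.put calls. */
    static long fillBuffer(ByteBuffer target, int rounds, long seed) {
        Random random = new Random(seed);
        int size = target.capacity();
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < size; i++) {
                target.put(i, (byte) random.nextInt());
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int size = 196608;
        int rounds = 500;
        byte[] array = new byte[size];
        ByteBuffer direct = ByteBuffer.allocateDirect(size);

        // Warm-up pass so both loops are JIT-compiled before measuring.
        fillArray(array, rounds, 42L);
        fillBuffer(direct, rounds, 42L);

        System.out.println("byte[]:     " + fillArray(array, rounds, 42L) / 1_000_000 + " ms");
        System.out.println("ByteBuffer: " + fillBuffer(direct, rounds, 42L) / 1_000_000 + " ms");
    }
}
```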

Then I tried something else. I created a BufferedImage backed by my own instance of DataBufferByte, into which I passed my own byte array. I created it in exactly the same way as the original BufferedImage constructor does, and it is exactly as fast as a standard BufferedImage. But if I use byte offsets that differ from (2, 1, 0) (I would need (0, 1, 2) for OpenGL texture data), it is as slow as the ByteBuffer version.
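The experiment with band offsets can be reproduced with a sketch like this one (the class name BandOffsetSketch and the helper wrap are hypothetical). It also hints at one plausible explanation, which the thread itself does not confirm: with offsets (2, 1, 0) the image is recognized as TYPE_3BYTE_BGR, while (0, 1, 2) yields TYPE_CUSTOM, and Java2D's specialized loops are generally keyed to the standard image types.

```java
import java.awt.Point;
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;
import java.awt.image.DataBufferByte;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;

final class BandOffsetSketch {
    /** Wraps an existing byte[] as an opaque 3-byte image with the given band offsets. */
    static BufferedImage wrap(byte[] data, int width, int height, int[] bandOffsets) {
        DataBuffer db = new DataBufferByte(data, data.length);
        WritableRaster raster = Raster.createInterleavedRaster(
                db, width, height, width * 3, 3, bandOffsets, new Point(0, 0));
        ColorModel cm = new ComponentColorModel(
                ColorSpace.getInstance(ColorSpace.CS_sRGB), false, false,
                Transparency.OPAQUE, DataBuffer.TYPE_BYTE);
        return new BufferedImage(cm, raster, false, null);
    }

    public static void main(String[] args) {
        BufferedImage bgr = wrap(new byte[3], 1, 1, new int[] {2, 1, 0});
        BufferedImage rgb = wrap(new byte[3], 1, 1, new int[] {0, 1, 2});
        System.out.println("offsets (2,1,0): type " + bgr.getType());
        System.out.println("offsets (0,1,2): type " + rgb.getType());
    }
}
```

The printed values can be compared against BufferedImage.TYPE_3BYTE_BGR and BufferedImage.TYPE_CUSTOM.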

So it seems that BufferedImage doesn't use its best-optimized code if anything differs from the way it would be created by default.

Here is my test case:
http://jagatoo.svn.sourceforge.net/viewvc/jagatoo/trunk/test/src/org/jagatoo/test/util/image/DirectBufferedImageTest.java?view=markup
http://jagatoo.svn.sourceforge.net/viewvc/jagatoo/trunk/src/org/jagatoo/image/

Any idea how to make BufferedImage use its optimized code for ByteBuffers or for a (0, 1, 2, 3)-offset byte array, too?

Marvin
Online Riven


« Reply #5 - Posted 2008-05-28 22:26:54 »

Quote
Any idea how to make BufferedImage use its optimized code for ByteBuffers or for a (0, 1, 2, 3)-offset byte array, too?

Well, I wouldn't know, as BufferedImage is like a black box to me.

Do I understand correctly that a backing byte[] with swapped offsets is also acceptable to you (no ByteBuffer at all)? In that case, the byte[] has a really big chance of being copied in the JNI calls that read from the byte[] and send it to the gfx driver. The overhead would be comparable to simply copying (and swapping) the bytes in Java code. Then you can just as well render with the wrong offsets with full acceleration, then copy the bytes into a DirectByteBuffer and swap the bytes around.

You might want to measure the overhead of that - it might be much less than the factor 3 slowdown you're seeing now.


// try to get rid of "p++" as it 'prevents' out-of-order execution in the CPU.
bb.put(p+0, arr[p+3]);
bb.put(p+1, arr[p+2]);
bb.put(p+2, arr[p+1]);
bb.put(p+3, arr[p+0]);
p+=4;
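Put together, the "render normally, then copy and swap" approach could look like the following sketch. It assumes a standard TYPE_4BYTE_ABGR image as the render target and an RGBA layout on the OpenGL side; the class name SwizzleCopySketch and the method copyAsRgba are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;
import java.nio.ByteBuffer;

final class SwizzleCopySketch {
    /** Copies the image's A,B,G,R bytes into the buffer as R,G,B,A. */
    static void copyAsRgba(BufferedImage abgrImage, ByteBuffer rgba) {
        // For TYPE_4BYTE_ABGR the backing array holds A, B, G, R per pixel.
        byte[] arr = ((DataBufferByte) abgrImage.getRaster().getDataBuffer()).getData();
        for (int p = 0; p < arr.length; p += 4) {
            rgba.put(p + 0, arr[p + 3]); // R
            rgba.put(p + 1, arr[p + 2]); // G
            rgba.put(p + 2, arr[p + 1]); // B
            rgba.put(p + 3, arr[p + 0]); // A
        }
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(1, 1, BufferedImage.TYPE_4BYTE_ABGR);
        image.setRGB(0, 0, 0x80112233); // ARGB: alpha 0x80, red 0x11, green 0x22, blue 0x33
        ByteBuffer rgba = ByteBuffer.allocateDirect(4);
        copyAsRgba(image, rgba);
        System.out.printf("%02x %02x %02x %02x%n",
                rgba.get(0), rgba.get(1), rgba.get(2), rgba.get(3));
    }
}
```

The direct buffer can then be handed to the OpenGL binding, while all drawing happens on an image type that Java2D handles with its regular code paths.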



Offline Marvin Fröhlich



« Reply #6 - Posted 2008-05-28 23:47:28 »

Quote
Do I understand correctly that a backing byte[] with swapped offsets is also acceptable to you (no ByteBuffer at all)? In that case, the byte[] has a really big chance of being copied in the JNI calls that read from the byte[] and send it to the gfx driver. The overhead would be comparable to simply copying (and swapping) the bytes in Java code. Then you can just as well render with the wrong offsets with full acceleration, then copy the bytes into a DirectByteBuffer and swap the bytes around.

You might want to measure the overhead of that - it might be much less than the factor 3 slowdown you're seeing now.

Yes, actually this is pretty much my current solution ;). I am using the SharedBufferedImage (with wrong byte offsets) that you can find in the link above. These SharedBufferedImages are rarely used at the moment, so this is not a real problem anyway. The point is just that I would prefer the "perfect" solution, if it were possible :).

Quote
// try to get rid of "p++" as it 'prevents' out-of-order execution in the CPU.
bb.put(p+0, arr[p+3]);
bb.put(p+1, arr[p+2]);
bb.put(p+2, arr[p+1]);
bb.put(p+3, arr[p+0]);
p+=4;

Could you please explain that in a little more detail? What is out-of-order execution?

Marvin
Online Riven


« Reply #7 - Posted 2008-05-29 06:27:08 »

You might want to read the Wikipedia article on out-of-order execution.

It's basically like this:
Simple arithmetic instructions take only 1 clock cycle, while fetching a byte from memory can take up to a few dozen clock cycles. This is why the CPU changes the order in which instructions are executed (as long as the results are the same as with in-order execution). You can change the execution order of this example without a problem:
x = a + b;
y = b - a;

You can't however, change the execution-order of these instructions, without changing the outcome:
a = a + b;
y = b - a;


Let's say you have this code:
for(int p=0; p<len; p++)
{
   int q = p+3;
   bb.put(p++, arr[q--]);
   bb.put(p++, arr[q--]);
   bb.put(p++, arr[q--]);
   bb.put(p, arr[q]);
}


There, the effect of the 2nd line depends on the first line, so they can't be executed out of order. In C/C++ we have fancy compilers that optimize this away, but in Java I often* see a nice performance boost when turning the code into:
for(int p=0; p<len; p += 4)
{
   bb.put(p+0, arr[p+3]);
   bb.put(p+1, arr[p+2]);
   bb.put(p+2, arr[p+1]);
   bb.put(p+3, arr[p+0]);
}


Where all lines can be executed in any order, allowing the CPU to perform the operation in optimal order, depending on memory latency and whether the data is in cache or not.



* the JIT in the HotSpot VM is not always predictable, so your loop might suddenly be twice as fast, or you might not see that much of a difference.






Edit:
Further, the VM seems to benefit (10-20%) from manual loop-unrolling.

//if((len % 8) != 0)
if((len & 7) != 0)
  throw new IllegalStateException();
for(int p=0; p<len; p += 8)
{
   bb.put(p+0, arr[p+3]);
   bb.put(p+1, arr[p+2]);
   bb.put(p+2, arr[p+1]);
   bb.put(p+3, arr[p+0]);

   bb.put(p+4, arr[p+7]);
   bb.put(p+5, arr[p+6]);
   bb.put(p+6, arr[p+5]);
   bb.put(p+7, arr[p+4]);
}
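For anyone who wants to measure this, here are the three loop shapes from this post as self-contained methods, so that they can be checked for identical output before timing them. The class and method names are made up for illustration. Unlike the snippet above, the unrolled variant here accepts any length that is a multiple of 4 and handles one leftover group of four after the main loop; that tail handling is an addition of this sketch, not part of the original code.

```java
import java.nio.ByteBuffer;

final class ReverseQuadLoops {
    /** Variant with p++ / q--: each put depends on the previous index update. */
    static void dependent(ByteBuffer bb, byte[] arr) {
        for (int p = 0; p < arr.length; p++) {
            int q = p + 3;
            bb.put(p++, arr[q--]);
            bb.put(p++, arr[q--]);
            bb.put(p++, arr[q--]);
            bb.put(p, arr[q]);
        }
    }

    /** Variant with fixed offsets: the four puts do not depend on each other. */
    static void independent(ByteBuffer bb, byte[] arr) {
        for (int p = 0; p < arr.length; p += 4) {
            bb.put(p + 0, arr[p + 3]);
            bb.put(p + 1, arr[p + 2]);
            bb.put(p + 2, arr[p + 1]);
            bb.put(p + 3, arr[p + 0]);
        }
    }

    /** Manually unrolled variant: two groups per iteration plus an optional tail group. */
    static void unrolled(ByteBuffer bb, byte[] arr) {
        int len = arr.length;
        if ((len & 3) != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        int p = 0;
        for (; p + 8 <= len; p += 8) {
            bb.put(p + 0, arr[p + 3]);
            bb.put(p + 1, arr[p + 2]);
            bb.put(p + 2, arr[p + 1]);
            bb.put(p + 3, arr[p + 0]);

            bb.put(p + 4, arr[p + 7]);
            bb.put(p + 5, arr[p + 6]);
            bb.put(p + 6, arr[p + 5]);
            bb.put(p + 7, arr[p + 4]);
        }
        if (p < len) { // exactly one group of four is left over
            bb.put(p + 0, arr[p + 3]);
            bb.put(p + 1, arr[p + 2]);
            bb.put(p + 2, arr[p + 1]);
            bb.put(p + 3, arr[p + 0]);
        }
    }

    public static void main(String[] args) {
        byte[] arr = new byte[12];
        for (int i = 0; i < arr.length; i++) {
            arr[i] = (byte) i;
        }
        ByteBuffer bb = ByteBuffer.allocateDirect(arr.length);
        unrolled(bb, arr);
        for (int i = 0; i < arr.length; i++) {
            System.out.print(bb.get(i) + " ");
        }
        System.out.println();
    }
}
```

Whether any of the variants is actually faster depends on the VM and the JIT, as noted in the footnote above, so it is worth timing all three on the target machine.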

Offline Marvin Fröhlich



« Reply #8 - Posted 2008-05-29 20:02:47 »

Thanks. This is interesting stuff.

Marvin