Java-Gaming.org
  Quiet in here these days...  (Read 4947 times)
Offline princec

« JGO Spiffy Duke »


Medals: 434
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Posted 2007-02-06 21:48:56 »

... guess that means it's fast enough now.

Cas Smiley

Offline Matzon

JGO Knight


Medals: 19
Projects: 1


I'm gonna wring your pants!


« Reply #1 - Posted 2007-02-06 22:00:42 »

well 1.6 IS really nice - now we just need world+dog to upgrade........

Offline Riven
« League of Dukes »

« JGO Overlord »


Medals: 840
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #2 - Posted 2007-02-06 22:57:50 »

Good observation, wrong conclusion Smiley

All VM performance topics have been beaten to death...

What we need is:

  • {primitive}Buffer performance equal to {primitive}[] performance - in all cases.
  • SIMD:
    • the Intel core2duo handles 4 values in 1 operation
    • all other SIMD enabled CPUs handle 4 values in 2 operations
    • Sun Java VM can only handle 4 values in 4 operations

Java 6.0 is nice, indeed, but due to changes in the CPU implementations, the gap is widening...!


But why start yet-another-topic about it?

Offline Linuxhippy

Senior Devvie


Medals: 1


Java games rock!


« Reply #3 - Posted 2007-02-06 23:02:55 »

Quote from: Riven
Java 6.0 is nice, indeed, but due to changes in the CPU implementations, the gap is widening...!

Now that Hotspot is opensource, no one prevents you from implementing it  Tongue

1.) I have to admit that this would benefit some kinds of applications, but lobbying for it as one of the few things still missing in Java is a bit unrealistic.

2.) Don't you think that, especially for buffers, almost everything possible has already been done to achieve high performance in such a critical area?

Quote
Sun Java VM can only handle 4 values in 4 operations
Well, 4 values in 4 instructions. But that does not mean that e.g. a Core2Duo will need 4 cycles to process the data, since it's able to handle four integer instructions at once.

lg Clemens
Offline Riven
« League of Dukes »

« JGO Overlord »


Medals: 840
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #4 - Posted 2007-02-06 23:05:52 »

It is able, yes, but the VM isn't sending it the right instructions to do so, so even the core2duo is stuck at 4 in 4 instead of 4 in 1.



To show what I mean:
Doing the math in pure Java was 2.4x slower (40% due to pointer arithmetic) than using JNI and invoking a native method that used SIMD.
On a core2duo that difference might very well be 4.4x

I'm talking about simple loops like this:
// C, called through JNI:
for(int i=0; i<n; i++)
   (*dst++) = (*op1++) * (*op2++); // C-compiler turns this into SIMD

vs

// pure Java:
for(int i=0; i<n; i++)
   dst[i] = op1[i] * op2[i];

Offline Ken Russell

JGO Coder




Java games rock!


« Reply #5 - Posted 2007-02-06 23:10:34 »

Quote from: Riven
{primitive}Buffer performance equal to {primitive}[] performance - in all cases.

An RFE has been filed about this: 6509032. Check it out in Sun's bug database. If this were implemented then it should completely eliminate the performance difference between Buffer get()/put() operations and array operations. Please vote for it. Right now it doesn't even have a responsible engineer assigned to it.

Quote from: Riven
SIMD:
Sun Java VM can only handle 4 values in 4 operations

Work is ongoing in the Java HotSpot VM to enable better use of SIMD instructions. Stay tuned.
Offline Riven
« League of Dukes »

« JGO Overlord »


Medals: 840
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #6 - Posted 2007-02-06 23:31:13 »

In my recent brush with JVM reference tricking, I had some assumptions about how the VM handles its reads/writes for arrays and Buffers.


for an array:
array[i] = 5; // array is simply a pointer

thus: WRITE 5 AT ((int)array + i)


for a buffer:
buffer.put(i, 5); // buffer is simply a pointer, with 'base' at offset N, thus:

WRITE 5 AT (FETCH((int)buffer + base_field_offset) + i)


So how can a Buffer (pointer-to-a-pointer + offset) ever get as fast as an array (pointer + offset)?

Again, these are 'just' assumptions, don't be too harsh if I'm way off. Smiley

I voted for 6509032, read the description, and couldn't see how this was taken into account.
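
The two access paths guessed at above can be modelled in plain Java. This is only a toy: it treats a long[] as flat memory and counts the loads needed to resolve an address. `ToyMemory` and all of its methods are made up for illustration and say nothing about HotSpot's real object layout.

```java
final class ToyMemory {
    final long[] cells = new long[64]; // flat "memory", one cell per address
    int loads = 0;                     // counts address-resolving reads

    long load(int address) {
        loads++;
        return cells[address];
    }

    void store(int address, long value) {
        cells[address] = value;
    }

    // Array model: the reference itself is the base address.
    void arrayWrite(int array, int i, long value) {
        store(array + i, value);
    }

    // Buffer model: the reference points at an object whose field
    // (at baseFieldOffset) holds the base address of the data.
    void bufferWrite(int buffer, int baseFieldOffset, int i, long value) {
        int base = (int) load(buffer + baseFieldOffset); // extra dereference
        store(base + i, value);
    }

    public static void main(String[] args) {
        ToyMemory m = new ToyMemory();
        int array = 8;           // array data starts at address 8
        int buffer = 0;          // buffer object lives at address 0
        int baseFieldOffset = 2; // its 'base' field sits at address 2
        m.store(buffer + baseFieldOffset, 32); // buffer data starts at address 32

        m.arrayWrite(array, 3, 5);
        System.out.println("after array write, loads = " + m.loads);  // 0
        m.bufferWrite(buffer, baseFieldOffset, 3, 5);
        System.out.println("after buffer write, loads = " + m.loads); // 1
    }
}
```

In this model every buffer write costs one load more than an array write, which is the gap the question is about.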

Offline Riven
« League of Dukes »

« JGO Overlord »


Medals: 840
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #7 - Posted 2007-02-07 13:13:41 »

Hm... the base-pointer could be cached by the VM of course, in case of loops.

But the solution mentioned in 6509032 certainly does not address that.
It seems to assume the direct ByteBuffer (object!) is allocated in direct-memory.
Further, it only (seems to) equalize the performance of heap and direct buffers, not buffers vs. arrays.

Offline Ken Russell

JGO Coder




Java games rock!


« Reply #8 - Posted 2007-02-07 17:15:40 »

As you pointed out, the base pointer of Buffers (direct or non-direct) is immutable, so in the case of loops or rapidly repeated method calls (which will likely be inlined) you won't have to do the additional dereference of the Buffer because the base pointer's value will be fetched once at the top of the loop. This should make both heap-based and direct Buffers as fast as array accesses in all cases.
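
Written out by hand, fetching the base once at the top of the loop looks roughly like the sketch below. `FakeBuffer` is a hypothetical stand-in, not java.nio's implementation, and in real code the JIT would be expected to do this on the inlined accessors by itself. The counter is only there to make the difference visible.

```java
final class FakeBuffer {
    private final float[] backing; // immutable "base pointer"
    int baseReads = 0;             // how often the base was fetched

    FakeBuffer(int capacity) {
        backing = new float[capacity];
    }

    private float[] base() {
        baseReads++;
        return backing;
    }

    // Naive: dereference the buffer on every single access.
    void fillNaive(float value) {
        int n = backing.length;
        for (int i = 0; i < n; i++) {
            base()[i] = value;
        }
    }

    // Hoisted: the base never changes, so fetch it once before the loop.
    void fillHoisted(float value) {
        float[] b = base();
        for (int i = 0; i < b.length; i++) {
            b[i] = value;
        }
    }

    public static void main(String[] args) {
        FakeBuffer naive = new FakeBuffer(1000);
        naive.fillNaive(1f);
        FakeBuffer hoisted = new FakeBuffer(1000);
        hoisted.fillHoisted(1f);
        System.out.println(naive.baseReads + " vs " + hoisted.baseReads); // 1000 vs 1
    }
}
```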
Offline GKW

Senior Devvie




Revenge is mine!


« Reply #9 - Posted 2007-02-07 18:34:52 »

Quote from: Ken Russell
Work is ongoing in the Java HotSpot VM to enable better use of SIMD instructions. Stay tuned.

Blink twice if it's autovectorization and cough once if it is going to be in se7.
Offline Orangy Tang

JGO Kernel


Medals: 56
Projects: 11


Monkey for a head


« Reply #10 - Posted 2007-02-07 19:39:03 »

Better use of SIMD? I wasn't aware that it used it at all, when did that sneak in?

[ TriangularPixels.com - Play Growth Spurt, Rescue Squad and Snowman Village ] [ Rebirth - game resource library ]
Offline Riven
« League of Dukes »

« JGO Overlord »


Medals: 840
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #11 - Posted 2007-02-07 21:02:12 »

Java 1.4 (see the performance notes, somewhere out there on sun.com)

Offline krausest

Junior Devvie


Exp: 15 years


I love YaBB 1G - SP1!


« Reply #12 - Posted 2007-02-07 21:19:38 »

Let me add two questions:
Is it just me who is waiting every week for the next JDK 7 build, hoping it'll come with tiered compilation?
Anyone tried the IBM 6 pre-release? It has a not yet documented but nonetheless mentioned feature called "Data sharing between JVMs: Ahead Of Time (AOT) compiled code" - sounds interesting...
Offline Riven
« League of Dukes »

« JGO Overlord »


Medals: 840
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #13 - Posted 2007-02-07 21:20:17 »

I finally found out why my float[] vs FloatBuffer results always had a different winner in similar benchmarks...

I just finished my best performance benchmark to date, because each test case is run inside its own VM. Everything is warmed up properly by the server VM before the measurements begin.

Java 6.0 server VM:





Benchmark
Calculate the cross-product of 2 data-sets


Results
1/3rd of the benchmarks are 20-50% slower, while the data-set differs only 4K in size
1/12th of the benchmarks are 60-75% slower, while the data-set differs only 4K in size
The 3 types seem to have a distinct 'phase offset', causing the 'winner' to be fairly predictable for a certain data-set size.

After a certain data-set size, the int[]/float[] lose 50% performance, and lose their 'spikes'


Question
Whose fault is this? Hotspot? OS? CPU? RAM?
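
For anyone who wants to try something similar, here is a rough sketch of a one-case-per-VM benchmark. It is not the benchmark that produced the results above. `CrossBench`, the packed xyz layout, the run counts and the command-line arguments are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

final class CrossBench {
    // dst[k] = a[k] x b[k] for n packed xyz vectors in float[]
    static void cross(float[] a, float[] b, float[] dst, int n) {
        for (int k = 0; k < n; k++) {
            int i = k * 3;
            float ax = a[i], ay = a[i + 1], az = a[i + 2];
            float bx = b[i], by = b[i + 1], bz = b[i + 2];
            dst[i] = ay * bz - az * by;
            dst[i + 1] = az * bx - ax * bz;
            dst[i + 2] = ax * by - ay * bx;
        }
    }

    // Same computation through absolute FloatBuffer get/put
    static void cross(FloatBuffer a, FloatBuffer b, FloatBuffer dst, int n) {
        for (int k = 0; k < n; k++) {
            int i = k * 3;
            float ax = a.get(i), ay = a.get(i + 1), az = a.get(i + 2);
            float bx = b.get(i), by = b.get(i + 1), bz = b.get(i + 2);
            dst.put(i, ay * bz - az * by);
            dst.put(i + 1, az * bx - ax * bz);
            dst.put(i + 2, ax * by - ay * bx);
        }
    }

    // Direct buffer in native byte order
    static FloatBuffer direct(int floats) {
        return ByteBuffer.allocateDirect(floats * 4)
                .order(ByteOrder.nativeOrder()).asFloatBuffer();
    }

    // Usage: java CrossBench <array|buffer> <vectorCount>
    // Start a fresh JVM per (type, size) pair so cases cannot influence each other.
    public static void main(String[] args) {
        boolean useArray = args.length < 1 || args[0].equals("array");
        int n = args.length < 2 ? 4096 : Integer.parseInt(args[1]);

        float[] a = new float[n * 3], b = new float[n * 3], d = new float[n * 3];
        FloatBuffer fa = direct(n * 3), fb = direct(n * 3), fd = direct(n * 3);
        for (int i = 0; i < n * 3; i++) {
            a[i] = i % 7;
            b[i] = i % 5;
            fa.put(i, a[i]);
            fb.put(i, b[i]);
        }

        long best = Long.MAX_VALUE;
        for (int run = 0; run < 4000; run++) {
            long t0 = System.nanoTime();
            if (useArray) cross(a, b, d, n); else cross(fa, fb, fd, n);
            long t = System.nanoTime() - t0;
            if (run >= 2000 && t < best) best = t; // first half is warm-up only
        }
        System.out.println((useArray ? "float[]" : "FloatBuffer")
                + " n=" + n + " best=" + best + " ns");
    }
}
```

Sweeping the vector count in small steps from a shell script, one JVM launch per step, would be one way to look for the size-dependent spikes described above.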

Offline Chris61182

Junior Devvie





« Reply #14 - Posted 2007-02-07 21:32:22 »

Quote from: Ken Russell
Work is ongoing in the Java HotSpot VM to enable better use of SIMD instructions. Stay tuned.

If they just gave developers vector primitives, and the VM the appropriate backend compilers, most of the work would be done. With vector primitives the VM could compile to scalar code if the proper instruction set isn't there, to SSE2 on x86 machines, AltiVec on Power, and VIS on SPARC, all the while allowing the developers to do the vectorization themselves (and relatively easily at that). Because frankly, from what I've seen autovectorization is never likely to perform well enough.
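
To make the idea concrete: Java has no vector primitives, so the sketch below fakes one with an ordinary class. `Float4` and its methods are hypothetical, and a VM running this today just executes the scalar code. It only shows what the programming model might feel like, with the lane-wise `mul` being the operation a backend could map onto a single SIMD instruction.

```java
final class Float4 {
    final float x, y, z, w;

    Float4(float x, float y, float z, float w) {
        this.x = x; this.y = y; this.z = z; this.w = w;
    }

    // Load four consecutive floats starting at index i
    static Float4 load(float[] src, int i) {
        return new Float4(src[i], src[i + 1], src[i + 2], src[i + 3]);
    }

    // Write the four lanes back starting at index i
    void store(float[] dst, int i) {
        dst[i] = x; dst[i + 1] = y; dst[i + 2] = z; dst[i + 3] = w;
    }

    // Lane-wise multiply, written as the scalar fallback
    Float4 mul(Float4 o) {
        return new Float4(x * o.x, y * o.y, z * o.z, w * o.w);
    }

    // dst = op1 * op2, four lanes per step, scalar loop for the tail
    static void mul(float[] dst, float[] op1, float[] op2, int n) {
        int i = 0;
        for (; i + 4 <= n; i += 4) {
            load(op1, i).mul(load(op2, i)).store(dst, i);
        }
        for (; i < n; i++) {
            dst[i] = op1[i] * op2[i];
        }
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3, 4, 5, 6}, b = {2, 2, 2, 2, 2, 2}, d = new float[6];
        mul(d, a, b, 6);
        System.out.println(java.util.Arrays.toString(d)); // [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
    }
}
```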
Offline Linuxhippy

Senior Devvie


Medals: 1


Java games rock!


« Reply #15 - Posted 2007-02-07 22:29:57 »

I can only speak for GCC and the Intel C compiler, but it's quite tricky to get autovectorization to kick in for loops that do little more than one operation over a large set of data, and that often makes the code more complicated than it would be with vector primitives.

I think autovectorization is a great tool for C2 to generate even better code, because there may be loops out there that trigger this enhancement, but for programmers I guess explicit vector routines would be best. At least the programmer would have direct control over what happens and would not have to rely on some optimization magic to get stuff built the way it's intended.

lg Clemens
Offline Ken Russell

JGO Coder




Java games rock!


« Reply #16 - Posted 2007-02-08 02:17:42 »

Quote from: Orangy Tang
Better use of SIMD? I wasn't aware that it used it at all, when did that sneak in?

SSE2 instructions have been used in the HotSpot server VM and more recently the client VM for floating-point operations, but only with scalar values so far. What I meant by "better use" is using these instructions in vector form.
Offline Ken Russell

JGO Coder




Java games rock!


« Reply #17 - Posted 2007-02-08 02:19:07 »

Quote from: Riven
Whose fault is this? Hotspot? OS? CPU? RAM?

Sounds like a cache issue to me -- that in some situations data that needs to be in the cache is being evicted and reloaded from main memory.
Offline Ken Russell

JGO Coder




Java games rock!


« Reply #18 - Posted 2007-02-08 02:21:10 »

Quote from: GKW
Blink twice if it's autovectorization and cough once if it is going to be in se7.

I'm not involved with the development at all, so no comment from me, but I've pointed the responsible engineer at this thread so maybe we'll hear it from the source.
Offline Linuxhippy

Senior Devvie


Medals: 1


Java games rock!


« Reply #19 - Posted 2007-02-08 11:52:17 »

Quote from: Ken Russell
I'm not involved with the development at all, so no comment from me, but I've pointed the responsible engineer at this thread so maybe we'll hear it from the source.
That would be really great :-)
Offline Chris61182

Junior Devvie





« Reply #20 - Posted 2007-02-08 19:26:48 »

Quote from: Ken Russell
I'm not involved with the development at all, so no comment from me, but I've pointed the responsible engineer at this thread so maybe we'll hear it from the source.


I for one would love nothing more than to hear why we have yet to see vector primitives in Java. :-)
Offline rossk

Innocent Bystander





« Reply #21 - Posted 2007-02-08 21:01:12 »

One blink, and a muffled cough.<grin>

Auto-vectorization is being worked on. It's applied to loops whose array references have had their range checks successfully eliminated.

Most of the data types and operations that make sense for a platform will eventually be supported. For operations that don't have a direct Java equivalent, such as saturating adds, an idiom pattern match will be attempted. Some other operations may be too difficult for this method, requiring a new scalar intrinsic to be defined. The scalar intrinsic would then be vectorized.

Auto-vectorization certainly has its limits. The hope is that it will succeed on enough interesting cases to be a benefit.
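
As an illustration of the kind of code this is about, here is a counted loop whose bounds are fixed before it starts, with a saturating add written as a scalar idiom. The class and method names are made up, and whether any given HotSpot build eliminates the range checks or recognises the pattern is not something this sketch can promise.

```java
final class SaturatingAdd {
    // Adds two arrays of unsigned bytes element by element, clamping at 255.
    static void addSat(byte[] dst, byte[] a, byte[] b) {
        // One shared bound computed up front keeps every index in range
        // for all three arrays.
        int n = Math.min(dst.length, Math.min(a.length, b.length));
        for (int i = 0; i < n; i++) {
            int sum = (a[i] & 0xFF) + (b[i] & 0xFF); // widen to unsigned
            dst[i] = (byte) Math.min(sum, 255);       // the saturating idiom
        }
    }

    public static void main(String[] args) {
        byte[] a = {(byte) 200, 10};
        byte[] b = {(byte) 100, 20};
        byte[] d = new byte[2];
        addSat(d, a, b);
        System.out.println((d[0] & 0xFF) + " " + (d[1] & 0xFF)); // 255 30
    }
}
```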
Offline GKW

Senior Devvie




Revenge is mine!


« Reply #22 - Posted 2007-02-08 23:01:12 »

Is autovectorization going to be available to the client compiler or is it server only?  I can't wait to see how well it works or doesn't work!
Offline darkprophet

Senior Devvie




Go Go Gadget Arms


« Reply #23 - Posted 2007-02-09 02:41:49 »

Java 7 is hopefully going to have tiered compilation (already has?), so the client/server VM thing is no more....

Friends don't let friends make MMORPGs.

Blog | Volatile-Engine
Offline rreyelts

Junior Devvie




There is nothing Nu under the sun


« Reply #24 - Posted 2007-02-12 23:42:51 »

Quote from: rossk
Some other operations may be too difficult for this method, requiring a new scalar intrinsic to be defined.  The scalar intrinsic would then be vectorized.

Neat stuff. I'm a little confused about what you mean by this last part. By "intrinsic", do you mean a new VM instruction?

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline Riven
« League of Dukes »

« JGO Overlord »


Medals: 840
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #25 - Posted 2007-02-21 14:38:47 »

I filed an RFE to enable us to use SIMD within the VM - which got accepted by Sun.
You're more than welcome to add comments and suggestions!
Voting would be appreciated to bump its priority a bit.


Please vote here:

RFE: 6526380 Add API to access SIMD instructions



Thanks! Smiley
