Java-Gaming.org    
  New VM performance improvements  (Read 14737 times)
Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Posted 2004-10-25 19:59:44 »

Hi guys,
  Sorry for being away for so long, but I've been busy fixing bugs.  Anyway, I added another new intrinsic for the next version of Java.  Absolute value (Math.abs) is now intrinsified to use hardware when available.  For non-SSE machines that means using FABS, while SSE machines use andps or andpd (depending on single or double precision).  Client picks up this improvement along with Server.  SPARC, AMD64 and IA64 all have hardware implementations, so those benefit as well.  I'm open to suggestions for other improvements that might help.
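
A minimal sketch of why a bitwise AND can compute an absolute value, assuming IEEE 754 layout (which Java's float and double use): the sign lives in the top bit, so masking it off leaves the magnitude. This only illustrates the idea behind andps/andpd in plain Java; it is not the VM's intrinsic, and the class and method names are made up for the example.

```java
/** Illustration only: sign-bit masking written in plain Java. */
final class AbsBitMask {
    // Every bit set except the IEEE 754 sign bit.
    static final long DOUBLE_ABS_MASK = 0x7FFFFFFFFFFFFFFFL;
    static final int FLOAT_ABS_MASK = 0x7FFFFFFF;

    /** Clears the sign bit of a double (the job andpd does with a constant mask). */
    static double absDouble(double x) {
        return Double.longBitsToDouble(Double.doubleToRawLongBits(x) & DOUBLE_ABS_MASK);
    }

    /** Clears the sign bit of a float (the job andps does with a constant mask). */
    static float absFloat(float x) {
        return Float.intBitsToFloat(Float.floatToRawIntBits(x) & FLOAT_ABS_MASK);
    }

    public static void main(String[] args) {
        double[] samples = {-2.5, 0.0, -0.0, 3.75, Double.NEGATIVE_INFINITY};
        for (double s : samples) {
            System.out.println(s + " -> " + absDouble(s) + " (Math.abs: " + Math.abs(s) + ")");
        }
    }
}
```

In real code you would simply call Math.abs and let the VM pick the instruction.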
Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Reply #1 - Posted 2004-10-26 05:48:12 »

No suggestions, but a quick thank-you for keeping us informed of this.

Sooner or later we tend to notice these things in the bug-fix lists for later VMs, but it's great to get a heads-up on the stuff that matters, rather than have to manually trawl through hundreds of bugs to see what's happening :)

malloc will be first against the wall when the revolution comes...
Offline princec

JGO Kernel


Medals: 380
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #2 - Posted 2004-10-26 07:45:58 »

Could you explain what the mysterious two-tier compilation & threshold -XX flags are in the server VM...?

Cas :)

Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #3 - Posted 2004-10-26 14:41:19 »

Sure, those flags sorta kinda work.  It was the first attempt at tiered compilation.  Don't use them; well, you can if you want, but they just won't help.  Wait for the next version of Java; it'll have working tiered compilation and all my math goodies.
Offline selendic

Junior Member




Java games rock!


« Reply #4 - Posted 2004-10-26 14:52:23 »

Quote
Sure, those flags sorta kinda work.  It was the first attempt at tiered compilation.  Don't use them; well, you can if you want, but they just won't help.  Wait for the next version of Java; it'll have working tiered compilation and all my math goodies.



Is there any possible hint on when/if regular snapshots of 6.0 will be available, like for beta3 of Tiger?
Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #5 - Posted 2004-10-26 14:58:51 »

Dunno about the previews; we may or may not have them.  If I find out, I'll let you all know.
Offline princec

JGO Kernel


Medals: 380
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #6 - Posted 2004-10-26 15:18:41 »

Oh JOY! I think I moaned about the need for tiered compilation most vociferously in here, what, maybe four years ago or something! And finally we know it's being started on.

At this rate we'll see Structs by the end of the decade :D

Cas :)

Offline phazer

Junior Member




Come get some


« Reply #7 - Posted 2004-10-27 13:57:10 »

Excuse my ignorance, but what is "tiered compilation"?

Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #8 - Posted 2004-10-27 14:20:59 »

Tiered compilation is an extension of what HotSpot already does.  So let's step back and talk about how HotSpot and most JIT compilers work in general.  There are two phases in HotSpot: interpreted and compiled.  When you start up a Java program, HotSpot first interprets the code until a certain threshold is reached (1000 for Client, 10,000 for Server); then the method is compiled and the compiled version is used (faster than interpreted, usually a lot faster).  Tiered compilation basically adds a second layer, so that you have interpreted -> fast JIT but low-quality code -> slow JIT but awesome code.  Tiered compilation gives you the best of both worlds in that you get fast startup and good long-running performance.  Hope this helps...
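
The progression described above can be pictured with a toy model: a per-method counter that promotes the method to the next tier once it crosses a threshold. This is only a sketch under loud assumptions. The class, the tier names and both threshold values are hypothetical placeholders, and the real VM's promotion policy is not described in this thread.

```java
/** Toy model of counter-based tier promotion; not how HotSpot is implemented. */
final class TieredCounterSketch {
    enum Tier { INTERPRETED, FAST_JIT, OPTIMIZING_JIT }

    // Hypothetical thresholds, chosen only to make the example concrete.
    static final int FAST_JIT_THRESHOLD = 1_500;
    static final int OPTIMIZING_JIT_THRESHOLD = 10_000;

    private int invocations;

    /** Records one call of the modelled method and returns the tier it runs in afterwards. */
    Tier invoke() {
        invocations++;
        return currentTier();
    }

    /** Maps the invocation count to a tier: the hotter the method, the higher the tier. */
    Tier currentTier() {
        if (invocations >= OPTIMIZING_JIT_THRESHOLD) return Tier.OPTIMIZING_JIT;
        if (invocations >= FAST_JIT_THRESHOLD) return Tier.FAST_JIT;
        return Tier.INTERPRETED;
    }

    public static void main(String[] args) {
        TieredCounterSketch method = new TieredCounterSketch();
        Tier last = method.currentTier();
        for (int call = 1; call <= 12_000; call++) {
            Tier now = method.invoke();
            if (now != last) {
                System.out.println("call " + call + ": " + last + " -> " + now);
                last = now;
            }
        }
    }
}
```

The point of the middle tier is that a method leaves the interpreter early with cheap code, and only the methods that stay hot pay for the expensive compile.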
Offline selendic

Junior Member




Java games rock!


« Reply #9 - Posted 2004-10-27 16:23:45 »

Hmmm. Does that include merging the compilers? Will it use heuristics or command-line switches to kick in? And memory consumption will be higher, I suppose?
Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #10 - Posted 2004-10-27 16:27:17 »

All of that is unknown at this time, although memory consumption shouldn't be higher, since the JIT's memory usage (C heap) is significantly lower than the Java heap.  So I don't expect a memory increase, but more info will be available once the work is nearing completion.
Offline selendic

Junior Member




Java games rock!


« Reply #11 - Posted 2004-10-27 16:44:48 »

Thanks, really great news. Now that tiered compilation is covered, I hope escape analysis is next ;)
Offline GKW

Senior Member




Revenge is mine!


« Reply #12 - Posted 2004-10-27 16:56:56 »

Is this going to be added to 5.1 or 6.0?
Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #13 - Posted 2004-10-27 18:27:38 »

Definitely 6.0 or later...
Offline princec

JGO Kernel


Medals: 380
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #14 - Posted 2004-10-28 08:16:54 »

I don't suppose you'd care to champion Structs with me?

Cas :)

Offline ChrisRijk

Senior Newbie




Optimise or Die


« Reply #15 - Posted 2004-10-28 12:32:57 »

Quote
Tiered compilation is an extension of what HotSpot already does.  So let's step back and talk about how HotSpot and most JIT compilers work in general.  There are two phases in HotSpot: interpreted and compiled.  When you start up a Java program, HotSpot first interprets the code until a certain threshold is reached (1000 for Client, 10,000 for Server)


Isn't it 1,500 for client? Or is the -XX:CompileThreshold default for client on this page out of date:
http://java.sun.com/docs/hotspot/VMOptions.html


Quote
then the method is compiled and the compiled version is used (faster than interpreted, usually a lot faster).  Tiered compilation basically adds a second layer, so that you have interpreted -> fast JIT but low-quality code -> slow JIT but awesome code.  Tiered compilation gives you the best of both worlds in that you get fast startup and good long-running performance.  Hope this helps...


I'll understand if it is too early to say, but - would this be server VM only...? Or would client and server effectively merge with the new model...?
Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #16 - Posted 2004-10-28 14:13:18 »

Whoops, it's 1500 for x86 and 1000 for SPARC :)  I'm just so used to viewing SPARC files that I didn't notice the discrepancy :)
Offline pepe

Junior Member




Nothing unreal exists


« Reply #17 - Posted 2004-10-30 06:30:46 »

Thanks a lot for the info, Azeem !
Any insight about the previous 'new jvm improvements' topic and the bit shift/masking bench?

Home page: http://frederic.barachant.com
------------------------------------------------------
GoSub: java2D gamechmark http://frederic.barachant.com/GoSub/GoSub.jnlp
Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #18 - Posted 2004-10-30 14:23:32 »

Quote
Thanks a lot for the info, Azeem !
Any insight about the previous 'new jvm improvements' topic and the bit shift/masking bench?



Well, I'm working on JumpTables and I might be able to use SSE3 for some things.  My ultimate goal, though, is to be able to use SSE to do SIMD.  It would only help a limited set of code, and you'd have to write the code to a very narrow range, but I think I can figure out something.  Stay tuned.

P.S.  Anyone have any good SSE documents?
Offline pepe

Junior Member




Nothing unreal exists


« Reply #19 - Posted 2004-11-03 10:59:04 »

Quote

Stay tuned.

For sure !!

Home page: http://frederic.barachant.com
------------------------------------------------------
GoSub: java2D gamechmark http://frederic.barachant.com/GoSub/GoSub.jnlp
Offline Spasi
« Reply #20 - Posted 2004-11-03 12:17:47 »

Quote
It would only help a limited set of code, and you'd have to write the code to a very narrow range


We'd appreciate a document describing what those narrow ranges are/will be.

PS: Thanks for your efforts and for keeping us informed.
Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #21 - Posted 2004-11-03 13:58:25 »

Quote


We'd appreciate a document describing what those narrow ranges are/will be.

PS: Thanks for your efforts and for keeping us informed.



Well, from what I understand, SIMD only works on sets of similar data.  I'm not an expert at all, but SIMD (Single Instruction, Multiple Data) lets you apply the same operation to several pieces of data at once; that much I know from my computer science classes a long time ago.  But I'm unsure how Intel implemented this in SSE/SSE2.  I'm still researching this :)  Again, if anyone has good documentation on SSE or a recommendation for a book, I'm listening!
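
To make "same operation, different data" concrete, here is a small sketch in plain Java. An SSE register is 128 bits wide, so it holds four 32-bit floats, and a packed add such as ADDPS adds four pairs in one instruction. The loop below is ordinary scalar Java unrolled by four to show which elements would share one such instruction; the class name and the unrolling are illustrative assumptions, not something the VM requires or is known to recognize.

```java
import java.util.Arrays;

/** Scalar Java that mimics the shape of a 4-lane packed add. */
final class LaneWiseAdd {
    static final int LANES = 4; // four 32-bit floats fit in one 128-bit register

    /** Adds a and b element by element into out, four elements per "vector step". */
    static void add(float[] a, float[] b, float[] out) {
        int n = Math.min(out.length, Math.min(a.length, b.length));
        int i = 0;
        // Each iteration of this loop corresponds to one packed add on four lanes.
        for (; i + LANES <= n; i += LANES) {
            out[i]     = a[i]     + b[i];
            out[i + 1] = a[i + 1] + b[i + 1];
            out[i + 2] = a[i + 2] + b[i + 2];
            out[i + 3] = a[i + 3] + b[i + 3];
        }
        // Leftover elements that do not fill a whole register are handled one at a time.
        for (; i < n; i++) {
            out[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3, 4, 5, 6};
        float[] b = {10, 20, 30, 40, 50, 60};
        float[] out = new float[a.length];
        add(a, b, out);
        System.out.println(Arrays.toString(out));
    }
}
```

All four lanes must hold the same type and receive the same operation, which is why SIMD suits arrays of like data.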
Offline Mark Thornton

Senior Member





« Reply #22 - Posted 2004-11-03 17:03:13 »

How about this for a start:
http://www.cortstratton.org/articles/OptimizingForSSE.php

https://shale.intel.com/SoftwareCollege/CourseDetails.asp?courseID=23
Offline crystalsquid

Junior Member




... Boing ...


« Reply #23 - Posted 2004-11-03 17:10:39 »

I'm sure if you ask Intel they will be forthcoming. They do some very nice docs as well as training courses several times a year.

From my previous experience with it, the data types must be the same for all elements, and unless you pre-format the data into something suitable, you end up using as many instructions to shuffle/pack the data into the form you need in SSE as it would save :(

For example, a matrix multiply operation is not much faster in SSE because you have to transpose a matrix which takes quite a few cycles to do.
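
One common way to "pre-format" data is to store each component in its own array (planar, or structure-of-arrays) rather than interleaved. A rough sketch, assuming 3-component points packed as x0,y0,z0,x1,...; the class and method names are invented for the example. The deinterleave step is the shuffle cost mentioned above, and it only pays off if the planar form is reused across many operations.

```java
/** Sketch of interleaved versus planar storage for 3-component points. */
final class SoaLayoutSketch {
    /** Splits interleaved x0,y0,z0,x1,... into planar arrays {xs, ys, zs}. */
    static float[][] deinterleave(float[] xyz) {
        int count = xyz.length / 3;
        float[][] planes = new float[3][count];
        for (int i = 0; i < count; i++) {
            planes[0][i] = xyz[3 * i];
            planes[1][i] = xyz[3 * i + 1];
            planes[2][i] = xyz[3 * i + 2];
        }
        return planes;
    }

    /** Squared length per point; every array is read at the same index with the same operation. */
    static float[] lengthSquared(float[] xs, float[] ys, float[] zs) {
        float[] out = new float[xs.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = xs[i] * xs[i] + ys[i] * ys[i] + zs[i] * zs[i];
        }
        return out;
    }

    public static void main(String[] args) {
        float[] interleaved = {1, 2, 2, 3, 4, 0};
        float[][] p = deinterleave(interleaved);
        float[] result = lengthSquared(p[0], p[1], p[2]);
        System.out.println(result[0] + ", " + result[1]);
    }
}
```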
Offline princec

JGO Kernel


Medals: 380
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #24 - Posted 2004-11-03 18:53:38 »

Specifically, it's mostly matrix4 and vector4 operations that need SIMD acceleration as the highest priority, so it might be rather useful if these made it into the Java language as primitives and then got intrinsified.
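
For reference, this is the kind of operation meant: a 4x4 matrix times a 4-component vector. The sketch assumes a column-major float[16] (the layout OpenGL uses) and a hypothetical class name. Written as a sum of scaled columns, each step multiplies one four-float column by a single scalar, which is the shape that maps naturally onto four-wide SIMD without a transpose.

```java
/** Sketch: 4x4 matrix times vec4, with the matrix stored column-major in a float[16]. */
final class Mat4Vec4Sketch {
    /** Returns m * v, accumulated as v[0]*column0 + v[1]*column1 + v[2]*column2 + v[3]*column3. */
    static float[] transform(float[] m, float[] v) {
        float[] out = new float[4];
        for (int col = 0; col < 4; col++) {
            float s = v[col];
            // One column is four adjacent floats: scale it and add it to the result.
            for (int row = 0; row < 4; row++) {
                out[row] += s * m[col * 4 + row];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Identity matrix with a translation of (10, 20, 30) in the last column.
        float[] m = {
            1, 0, 0, 0,
            0, 1, 0, 0,
            0, 0, 1, 0,
            10, 20, 30, 1
        };
        float[] moved = transform(m, new float[] {1, 2, 3, 1});
        System.out.println(moved[0] + ", " + moved[1] + ", " + moved[2] + ", " + moved[3]);
    }
}
```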

After that, sound and video decoding are the main uses, and signal processing.

Cas :)

Offline Mithrandir

Senior Member




Cut from being on the bleeding edge too long


« Reply #25 - Posted 2004-11-04 12:28:59 »

You don't even need to go that far. If we could get a version of the javax.vecmath package that used native code for everything, that would make a huge performance difference to a lot of applications.

The site for 3D Graphics information http://www.j3d.org/
Aviatrix3D JOGL Scenegraph http://aviatrix3d.j3d.org/
Programming is essentially a markup language surrounding mathematical formulae and thus, should not be patentable.
Offline Spasi
« Reply #26 - Posted 2004-11-04 12:49:22 »

Quote
If we could get a version of the javax.vecmath package that used native code for everything, that would make a huge performance difference to a lot of applications.


That would indeed be a quick'n'dirty solution (and would help lots of apps), but no, I'd hate that. It's a bad API and a bad implementation. The effort should be better spent on a more generic solution.

I'm not sure about primitive types either. I'd prefer it if the VM could analyze the code and optimize on any possible opportunity (even non-vecmath code). I do realize though how difficult that is...
Offline blahblahblahh

JGO Coder


Medals: 1


http://t-machine.org


« Reply #27 - Posted 2004-11-04 16:33:24 »

Quote


That would indeed be a quick'n'dirty solution (and would help lots of apps), but no, I'd hate that. It's a bad API and a bad implementation.


It's a heck of a lot less bad than generics, and for an awful lot of applications more valuable (*especially* if they were to do a little bit of updating and unify vecmath a little with J2D's Point classes etc).

No disagreement on the implementation - but there are better impls available already for free.

/me ducks and runs for cover

malloc will be first against the wall when the revolution comes...
Offline swpalmer

JGO Coder


Exp: 12 years


Where's the Kaboom?


« Reply #28 - Posted 2004-11-06 17:05:03 »

Quote
After that, sound and video decoding are the main uses, and signal processing.


I think that level is beyond what Azeem is doing at the compiler level, but just for the record I will pipe up with my usual complaint re the lack of optimization in existing native code of the JRE.

SSE must be used for JPEG decoding and encoding; to not do so is to throw performance out the window.  It's like purposely using a bubble sort :) when every other sort algorithm would result in a massive performance improvement.

SSE can also be used to get massive performance gains in software loops that do image scaling and blitting.


Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #29 - Posted 2004-11-07 14:50:42 »

Quote


I think that level is beyond what Azeem is doing at the compiler level, but just for the record I will pipe up with my usual complaint re the lack of optimization in existing native code of the JRE.

SSE must be used for JPEG decoding and encoding; to not do so is to throw performance out the window.  It's like purposely using a bubble sort :) when every other sort algorithm would result in a massive performance improvement.

SSE can also be used to get massive performance gains in software loops that do image scaling and blitting.



Yeah, it's definitely beyond just one person to get those kinds of changes into a JDK.  I really can only make changes in the VM.  Plus, I was thinking more along the lines of a loop optimization that recognizes certain types of loops and emits instructions appropriately (in this case the SIMD instructions).
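
The thread does not say which loop shapes would qualify, so the following is only a guess at the general distinction such an optimization would need to draw. In the first loop every iteration is independent, so several elements could in principle be processed by one SIMD instruction. In the second, each iteration needs the result of the previous one, so it cannot be split into lanes directly. Class and method names are hypothetical.

```java
import java.util.Arrays;

/** Two loop shapes: independent iterations versus a loop-carried dependency. */
final class LoopShapes {
    /** Independent iterations: data[i] depends only on its own old value. */
    static void scale(float[] data, float factor) {
        for (int i = 0; i < data.length; i++) {
            data[i] *= factor;
        }
    }

    /** Loop-carried dependency: data[i] needs the freshly written data[i - 1]. */
    static void prefixSum(float[] data) {
        for (int i = 1; i < data.length; i++) {
            data[i] += data[i - 1];
        }
    }

    public static void main(String[] args) {
        float[] a = {1, 2, 3};
        scale(a, 2f);
        System.out.println(Arrays.toString(a));

        float[] b = {1, 2, 3};
        prefixSum(b);
        System.out.println(Arrays.toString(b));
    }
}
```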