Java-Gaming.org    
  Beware of anti-optimizations  (Read 6438 times)
Offline Roquen
« Posted 2012-03-28 14:38:22 »

I've noted a fair number of anti-optimizations in some library code.  Remember that just because a source file shipping with Java has an implementation doesn't mean that it will be called on any hardware that you care about.  Examples:

Using SWAR or Hacker's Delight bit twiddling for stuff based on leading zeros, trailing zeros or population count (floorlog2, nextpow2, etc.).  ARM and Intel natively support these operations and HotSpot treats the corresponding library methods as intrinsics.
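
To make the point concrete, here is a minimal sketch contrasting a hand-rolled floor(log2) with one built on Integer.numberOfLeadingZeros. The class and method names (BitOps, floorLog2Manual, floorLog2, nextPow2) are made up for illustration, and whether the JIT really emits a single instruction depends on the VM and the CPU.

```java
final class BitOps {
    // Hand-rolled floor(log2(x)) by binary search over the bits.
    // Only meaningful for x != 0 (x is treated as unsigned).
    static int floorLog2Manual(int x) {
        int r = 0;
        if ((x & 0xFFFF0000) != 0) { x >>>= 16; r |= 16; }
        if ((x & 0xFF00) != 0)     { x >>>= 8;  r |= 8;  }
        if ((x & 0xF0) != 0)       { x >>>= 4;  r |= 4;  }
        if ((x & 0xC) != 0)        { x >>>= 2;  r |= 2;  }
        if ((x & 0x2) != 0)        { r |= 1; }
        return r;
    }

    // Same result for x != 0, using the library method that HotSpot
    // can treat as an intrinsic.
    static int floorLog2(int x) {
        return 31 - Integer.numberOfLeadingZeros(x);
    }

    // Smallest power of two >= x, assuming 1 <= x <= 2^30.
    static int nextPow2(int x) {
        return 1 << (32 - Integer.numberOfLeadingZeros(x - 1));
    }

    public static void main(String[] args) {
        int x = 1000;
        System.out.println(floorLog2Manual(x) + " " + floorLog2(x) + " " + nextPow2(x));
    }
}
```

The library version is shorter, and on hardware with a count-leading-zeros instruction it leaves the VM free to use it.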

Directly calling StrictMath instead of Math.  StrictMath is always a software implementation. The Math equivalents are intrinsics and don't really forward to StrictMath (ignore ARM no-fp here...but regardless, the software Math version should be much faster).
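
A small sketch of the two call styles, with a hypothetical class name (MathVsStrict). Both methods compute a sine; the only difference is which class is called. The two results are each specified to be accurate to within 1 ulp, so they may differ in the last bits but not by more than that.

```java
final class MathVsStrict {
    // Lets the VM pick its own implementation (possibly an intrinsic).
    static double viaMath(double x) {
        return Math.sin(x);
    }

    // Forces the reproducible, bit-for-bit specified implementation.
    static double viaStrict(double x) {
        return StrictMath.sin(x);
    }

    public static void main(String[] args) {
        double x = 0.5;
        System.out.println(viaMath(x) + " vs " + viaStrict(x));
    }
}
```

Unless you actually need identical results on every machine, the Math call is the one to write.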
Offline Roquen
« Reply #1 - Posted 2012-03-29 15:33:00 »

Follow-up (couldn't find this info straight away) you can see the (I think full) set of HotSpot intrinsics in the following files:

http://hg.openjdk.java.net/jdk8/awt/hotspot/file/d61761bf3050/src/share/vm/classfile/vmSymbols.hpp
http://hg.openjdk.java.net/jdk8/awt/hotspot/file/d61761bf3050/src/share/vm/opto/library_call.cpp
Offline Roquen
« Reply #2 - Posted 2012-05-02 14:34:07 »

Follow-up 2:  Since some people actually read this, I'll augment it with slightly more accurate info.

HotSpot has a set of classes which it's aware of and treats as special cases.  Here we're only concerned with methods.  One thing it can do is specify a "patch-out" method, which is simply a native method which replaces any Java-implemented version (the method could also be marked native itself...I don't know of any specific cases where this is true).  In that case the native routine is called (without JNI overhead) instead of compiling and using the Java-based one.  Why do this?  One big reason is ease of porting: adding patch-outs can be deferred until needed, and until then the software 'fall-back' should just work. Another is regression testing if some part of the compiler base goes haywire during development.  The second thing it can do is mark a method as "intrinsic" (which will probably always have a patch-out version as well).  This means that the compiler is actually aware of the method as if it were a built-in function, which allows more aggressive optimizations to be performed.

Also, above I incorrectly stated that StrictMath is always software based.  I forgot that some of the methods require properly rounded results, and as such, if a native implementation always returns bit-identical results for all inputs then it can be patched out (again, I have not checked that this ever occurs).
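
One case where the "properly rounded" requirement is visible from the API docs is sqrt: both Math.sqrt and StrictMath.sqrt are specified to return the double closest to the true square root, so they must agree bit for bit. A minimal sketch (the class name SqrtCheck is hypothetical; this says nothing about which implementation the VM actually uses underneath):

```java
final class SqrtCheck {
    // True if Math.sqrt and StrictMath.sqrt return the same bit pattern for x.
    static boolean sameBits(double x) {
        long a = Double.doubleToLongBits(Math.sqrt(x));
        long b = Double.doubleToLongBits(StrictMath.sqrt(x));
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sameBits(2.0));
    }
}
```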
Offline theagentd
« Reply #3 - Posted 2012-05-02 16:02:59 »

StrictMath is faster than Math == long is faster than int. That should be obvious from the name. >_>

Myomyomyo.
Offline ra4king



« Reply #4 - Posted 2012-05-02 20:27:17 »

StrictMath is faster than Math == long is faster than int. That should be obvious from the name. >_>
It's not. See this post.

long vs int? They both have the exact same methods.....

Offline Riven


« Reply #5 - Posted 2012-05-03 06:42:42 »

StrictMath is faster than Math == long is faster than int. That should be obvious from the name. >_>
It's not. See this post.

long vs int? They both have the exact same methods.....


Offline ra4king



« Reply #6 - Posted 2012-05-03 06:51:17 »

StrictMath is faster than Math == long is faster than int. That should be obvious from the name. >_>
It's not. See this post.

long vs int? They both have the exact same methods.....


Am I missing something? O_o

Offline Riven


« Reply #7 - Posted 2012-05-03 08:27:21 »

;)

Offline theagentd
« Reply #8 - Posted 2012-05-03 08:53:24 »

StrictMath has more features than Math. It's more precise ---> slower.
Long has more features than int. It has a larger range of values ---> slower.

Okay, not the best comparison, but that's not really the point. Math is supposed to be optimized with native code, while StrictMath ensures that you get the exact same result on every computer that runs it. From just that we can easily draw the conclusion that Math >= StrictMath in performance. They may be equally fast if the native Math version produces the same value as the StrictMath version, in which case both should use the native version. However, if StrictMath is faster, it is both more precise (they can't be the same function if Math is slower) AND faster, and that should be treated as a bug in the VM. Math should never be slower than StrictMath.

Even the long/int comparison holds here. Future/current (?) CPUs might be able to do 64-bit math in a single cycle, but a long should NEVER be faster than an int.

Myomyomyo.
Offline Riven


« Reply #9 - Posted 2012-05-03 09:17:53 »

Well, naturally 64-bit CPUs already perform operations on longs in just as many ticks as on ints (and shorts, and bytes). It's just that the memory (cache) bandwidth is saturated more quickly with longs, for obvious reasons.

Offline Roquen
« Reply #10 - Posted 2012-05-03 12:00:59 »

Yes, exactly.  A fair number of the scalar (SISD, in SSEx) ops on doubles and floats have the same cycle counts, but data motion will typically make the execution time vary drastically.
Offline princec



« Reply #11 - Posted 2012-05-03 12:03:17 »

Likely to not be as drastic as a simple doubling of memory bandwidth time though, depending on the actual locations of the data involved, which would then be all about cache filling time wouldn't it? In which case, I think, your typical case is likely to see next to no difference in speed a lot of the time on 64-bit CPUs.

Cas :)

Offline ra4king



« Reply #12 - Posted 2012-05-03 13:42:23 »

Awww I see the joke now :D

....well that was embarrassing

Offline Roquen
« Reply #13 - Posted 2012-05-03 16:48:19 »

Likely to not be as drastic as a simple doubling of memory bandwidth time though, depending on the actual locations of the data involved, which would then be all about cache filling time wouldn't it? In which case, I think, your typical case is likely to see next to no difference in speed a lot of the time on 64-bit CPUs.
Man, this is nearly impossible to generalize.  One thing to remember is that parts of the memory architecture are resources shared between all cores (and, of course, all active threads and processes), and the memory motion is typically going in both directions.  Even when data is in the L1 cache (if my memory serves), moving it from L1 to a register isn't free.  The micro-op accesses a port which (if free) delivers 32 bits in 5 cycles and 64 bits in 10.  On the store side, the store buffer fills more quickly when moving more data, and execution stalls if it fills up.

In my experience there are drastic speed differences between using 32 vs. 64 bit types.  With certainty I can say that moving less data will virtually never slow you down and moving more will the majority of the time.  Of course don't read into this that I'm suggesting to always use the smallest possible data size.  Do what works for you.
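
To put a number on "moving less data", here is a rough sketch with made-up names (DataSize, payloadBytes, sum) that only compares the payload size of the same values stored as int[] versus long[]. It ignores object headers and is not a benchmark; actual timings depend on the cache and memory behaviour discussed above.

```java
final class DataSize {
    // Payload bytes of an array of n elements of the given element size
    // (array object header not counted).
    static long payloadBytes(int n, int elementBytes) {
        return (long) n * elementBytes;
    }

    // Same loop over 32-bit elements...
    static long sum(int[] a) {
        long s = 0;
        for (int v : a) s += v;
        return s;
    }

    // ...and over 64-bit elements: twice the bytes pulled through the caches.
    static long sum(long[] a) {
        long s = 0;
        for (long v : a) s += v;
        return s;
    }

    public static void main(String[] args) {
        int n = 1 << 20;
        System.out.println(payloadBytes(n, Integer.BYTES) + " bytes as int[]");
        System.out.println(payloadBytes(n, Long.BYTES) + " bytes as long[]");
    }
}
```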