Java-Gaming.org    
  Micro benchmarks  (Read 5607 times)
Offline altair

Senior Newbie





« Posted 2003-05-24 02:15:07 »

Hi guys,

I was trying to make speed improvements in some code and got unexpected results. First look at some code:

     int[] iv = new int[1000];
     iv[0] = 8;
     iv[1] = 9;

     long before = System.currentTimeMillis();

     for (int ii=0; ii<100000; ii++)
     {
       for (int i=2; i<iv.length; i++)
       {
         iv[i] = (5 * iv[i-1] - 10 * iv[i-2]) / 3;
         iv[i] *= iv[i];
       }
     }

     long after = System.currentTimeMillis();
     System.out.println(after-before);

     before = System.currentTimeMillis();

     for (int ii=0; ii<100000; ii++)
     {
       int previous = iv[1];

       for (int i=2; i<iv.length; i++)
       {
         int v = (5 * previous - 10 * iv[i-2]) / 3;
         v *= v;
         iv[i] = v;
         previous = v;
       }
     }

     after = System.currentTimeMillis();
     System.out.println(after-before);

     float[] fv = new float[1000];
     fv[0] = 8.0f;
     fv[1] = 9.0f;

     before = System.currentTimeMillis();

     for (int ii=0; ii<100000; ii++)
     {
       for (int i=2; i<fv.length; i++)
       {
         fv[i] = (5.0f * fv[i-1] - 10.0f * fv[i-2]) * 0.333333333f;
         fv[i] *= fv[i];
       }
     }

     after = System.currentTimeMillis();
     System.out.println(after-before);

     before = System.currentTimeMillis();

     for (int ii=0; ii<100000; ii++)
     {
       float previous = fv[1];

       for (int i=2; i<fv.length; i++)
       {
         float f = (5.0f * previous - 10.0f * fv[i-2]) * 0.333333333f;
         f *= f;
         fv[i] = f;
         previous = f;
       }
     }

     after = System.currentTimeMillis();
     System.out.println(after-before);


OK, basically it is some math computation inside loops.
I ran these tests several times in the same process and got the following results (in milliseconds):

int : 5203
int aliased : 4750
float : 4172
float aliased : 5031

The absolute values by themselves are not really important but they reveal a pattern.

First conclusion: the floats are faster than the ints on my platform (WinXP, Athlon XP, Java 1.4.2beta) !!! Incredible, isn't it?

Naively, I thought otherwise. I guess the SSE2 instructions now used by the JVM kick in to dramatically boost float performance. It would be interesting to see the results on other platforms (SSE or no SSE). That is good news for OpenGL Java ;-)

Second conclusion: aliasing is always better with ints but counterproductive with floats, which also was not obvious to me.

I am perfectly aware that these results should be taken with a grain of salt (if you slightly modify the code inside the loop you may end up with other conclusions).

The bottom line: it is going to be very difficult to optimize code, since an optimization on one platform may end up decreasing performance on another platform. The only way to know is to test!


I'd be interested in other people testing with other hardware, OS and JVM versions.
Offline AndersDahlberg

Junior Member





« Reply #1 - Posted 2003-05-24 03:57:58 »

Tried this test on my machine (adding the usual repeat loop).

Java 1.4.2 beta, Red Hat 9.0, Athlon 900:

int: 7769
int aliased: 7409
float: 4250
float aliased: 5273

int: 8080
int aliased: 7172
float: 4325
float aliased: 5263

int: 8075
int aliased: 7160
float: 4322
float aliased: 5268

int: 8072
int aliased: 7173
float: 4312
float aliased: 5282

java -server:
int: 3530
int aliased: 2501
float: 8488
float aliased: 7738

int: 1937
int aliased: 1779
float: 8154
float aliased: 7718

int: 1983
int aliased: 1764
float: 8171
float aliased: 7722

int: 1945
int aliased: 1767
float: 8189
float aliased: 7712

??? server versus client... ???
ROFL - benchmarks are fun :)

For even more laughs :o I tried it with gcj 3.2.2:
int: 7893
int aliased: 4297
float: 9014
float aliased: 5995

int: 9787
int aliased: 4584
float: 9833
float aliased: 6528

Pretty bad, huh? But wait, there is more.
compiled it with  "gcj -msse2 -m3dnow -O2 -o test --main=Test Test.java":
int: 2698
int aliased: 1798
float: 3997
float aliased: 1892

int: 2684
int aliased: 1790
float: 3987
float aliased: 1900

gcj rocks on this test ;D - is it cheating or is Sun Java just being slow? Maybe I should try it with IBM's VM, as they (?) are faster on calculations ;)
Offline altair

Senior Newbie





« Reply #2 - Posted 2003-05-24 05:57:54 »

Geeeeee !
Your results are puzzling because in the end you do not know what to shoot for.
With both integers and floats, the timings span (roughly) from 2 to 9 seconds! And the floats can be faster or slower than the ints.
I note that with the new JVM, the floats 'can be' REALLY fast ...
How can you be sure that the optimizations you made on your platform are relevant at all ?
The only constant thing for sure is that aliased integers are always faster than non aliased.

Food for thought, people ...
Offline aikarele

Senior Newbie





« Reply #3 - Posted 2003-05-24 11:04:49 »

I did some microbenchmarking too (with J2SE 1.4.2 + WinXP) and I found out that floating point division was working faster than integer division. However, floating point multiplication was working slower than integer multiplication.

By the way, you can find an interesting table comparing the relative performance of common operations on PIII-733 using C++ here (appendix B):
http://www.tantalon.com/pete/cppopt/appendix.htm

It states that "In fact, floating-point division is as fast or faster than integer division". So this is not a Java only "feature". Somebody should do a table like that for Java and post it here ;)
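
A rough sketch of how such a table could be started in Java. The class name `OpTable`, the iteration count, and the loop bodies are all made up for illustration, and `System.nanoTime()` assumes a JVM newer than the 1.4.x releases discussed in this thread. Each loop is a dependent chain, so it measures latency rather than throughput, and the numbers should be treated as indicative only; a dedicated harness such as JMH would be the more careful route.

```java
// Sketch only: OpTable and its method names are hypothetical, not from the thread.
final class OpTable {
    static final int N = 5_000_000;   // iterations per operation (arbitrary choice)
    static volatile long sink;        // results land here so the JIT cannot discard the loops

    static long timeIntDiv() {
        int x = Integer.MAX_VALUE;
        long t0 = System.nanoTime();
        for (int i = 1; i <= N; i++) x = x / 3 + i;
        long t1 = System.nanoTime();
        sink += x;
        return t1 - t0;
    }

    static long timeFloatDiv() {
        float f = 1e9f;
        long t0 = System.nanoTime();
        for (int i = 1; i <= N; i++) f = f / 3f + i;
        long t1 = System.nanoTime();
        sink += (long) f;
        return t1 - t0;
    }

    static long timeIntMul() {
        int x = 1;
        long t0 = System.nanoTime();
        for (int i = 1; i <= N; i++) x = x * 3 + i;      // wraps on overflow, harmless for timing
        long t1 = System.nanoTime();
        sink += x;
        return t1 - t0;
    }

    static long timeFloatMul() {
        float f = 1f;
        long t0 = System.nanoTime();
        for (int i = 1; i <= N; i++) f = f * 0.5f + i;   // factor below 1 keeps the value finite
        long t1 = System.nanoTime();
        sink += (long) f;
        return t1 - t0;
    }

    public static void main(String[] args) {
        for (int run = 0; run < 3; run++) {              // later runs are the more trustworthy ones
            System.out.println("int div:   " + timeIntDiv() / 1_000_000 + " ms");
            System.out.println("float div: " + timeFloatDiv() / 1_000_000 + " ms");
            System.out.println("int mul:   " + timeIntMul() / 1_000_000 + " ms");
            System.out.println("float mul: " + timeFloatMul() / 1_000_000 + " ms");
        }
    }
}
```

The float loops deliberately stay finite; a chain that runs off to infinity would time a special case rather than ordinary arithmetic.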

Offline Abuse

JGO Coder


Medals: 10


falling into the abyss of reality


« Reply #4 - Posted 2003-05-24 11:33:16 »

Isn't it normal that floating point is better for division/multiplication and integer is better for addition/negation?

I thought it was common knowledge ???

Make Elite IV:Dangerous happen! Pledge your backing at KICKSTARTER here! https://dl.dropbox.com/u/54785909/EliteIVsmaller.png
Offline aikarele

Senior Newbie





« Reply #5 - Posted 2003-05-24 13:17:26 »

Quote
Isn't it normal that floating point is better for division/multiplication and integer is better for addition/negation?

I thought it was common knowledge ???


Read the earlier posts. It depends. For example, I just said on my earlier post that "floating point multiplication is SLOWER than integer multiplication".
Offline Abuse

JGO Coder


Medals: 10


falling into the abyss of reality


« Reply #6 - Posted 2003-05-24 17:33:03 »

oh yeah, soz :D

well... you've proved 1 thing.

Micro-optimisations are a waste of time :D

and attempting to benchmark micro-optimisations is an even larger waste of time ;)

Make Elite IV:Dangerous happen! Pledge your backing at KICKSTARTER here! https://dl.dropbox.com/u/54785909/EliteIVsmaller.png
Offline jbanes

JGO Coder


Projects: 1


"Java Games? Incredible! Mr. Incredible, that is!"


« Reply #7 - Posted 2003-05-24 17:48:30 »

If I may, I think the real reason that floats appear to be so much faster has less to do with SIMD/SSE, and more to do with floating-point coprocessors. The idea that only integers should be used in games comes from back in the days of 386 machines where floating point had to be simulated in software. Unfortunately, it was one of those ideas that the development community never got past. I remember only one game that ever advertised the fact that it used floating point math. It was some sort of 3D car/shoot'em up game that ran on 486s and Pentiums. It actually ran quite well, but the market (to my knowledge) never picked up on this little tidbit.

Java Game Console Project
Last Journal Entry: 12/17/04
Offline swpalmer

JGO Coder




Where's the Kaboom?


« Reply #8 - Posted 2003-05-24 20:01:21 »

The fact that the server VM did float operations so much slower than the client VM is a performance issue that I would file a bug report on.

Performance in that area should be the same or better.

GCC on intel is known to suck.  But for floats MSVC 6.0 is also known to suck.  The Intel compiler produces much better code, although I hear that MS has caught up with the .net compiler.

I like that the server compiler outperformed GCJ on ints... this sort of information is helpful in dispelling the myth that Java is just all around "slow".

Here are some numbers from the above code for Mac OS X 1GHz PowerPC G4
client VM

test repeated 4 times.. these are the typical results...

int: 3809
int a: 2276
float: 4653
float a: 3410

int: 3803
int a: 2268
float: 4619
float a: 3424

With -server ALL numbers are HIGHER by about 100-200.

Offline altair

Senior Newbie





« Reply #9 - Posted 2003-05-24 20:23:55 »

swpalmer,

On Mac, the floats are somewhat slower than the ints, but all in all, the JVM/Mac OS X/G4 combination kicks butt !!!

Abuse,

micro-optimizations may be a waste of time but benchmarking them is not IMO: you are always safe with aliasing int arrays, and the floats are 'usually' pretty fast (at least on the 3 configs that have been tested so far).

The -server option seems to dramatically decrease the speed (unless the micro benchmark was too short for the JVM optimizations to kick in).

I would not have bet on these facts before.
Offline AndersDahlberg

Junior Member





« Reply #10 - Posted 2003-05-25 00:07:07 »

Well, my take on this issue would be that java is really fast on even this microbenchmark!

As we all know (? ;D ), any Java VM becomes more competitive the more complex the problem it faces. As this test is a very easy one, the gcj native compiler should IMHO outperform the Sun and IBM Java VMs (as it is able to compile with all optimizations, including the "easy to calculate" ones). As I (and you) noticed, Sun Java did quite well, despite the big difference between java -server and client! As Altair said, I believe this big server/client difference to be a Java 1.4.2 bug. Could anyone test this with another Java version?

Anyway, I don't really understand you, Altair, when you said I didn't know what I was testing. I was just running your test on a different architecture and trying to get as much data as possible. Then it's up to the experts to try to get something worth mentioning out of this data! 8)
I.e. I'm not "shooting" for anything except more data!
Offline swpalmer

JGO Coder




Where's the Kaboom?


« Reply #11 - Posted 2003-05-25 00:47:22 »

Quote
On Mac, the floats are somewhat slower than the ints, but all in all, the JVM/Mac OS X/G4 combination kicks butt !!!


I think this is more because the PowerPC is likely much easier to compile decent code for, since it has a much better design than intel (despite being behind in terms of clock speed).  I mean, when you don't have to worry so much about what you will keep in registers and what you need to shove on the stack because you actually have a decent amount of general purpose registers... well it just seems like optimizing on that architecture would be easier.

Offline altair

Senior Newbie





« Reply #12 - Posted 2003-05-25 01:19:34 »

AndersDahlberg

I did not make myself clear.
" Your results are puzzling because in the end 'we' do not know what to shoot for ".

I meant that it is not obvious from your results what to target to achieve the best speed: use floats or integers? It depends on the platform (hardware, JVM, OS). Alias or not alias? It all depends on the code inside the loop AND the platform.

Offline AndersDahlberg

Junior Member





« Reply #13 - Posted 2003-05-25 04:46:57 »

altair: Ok, then we understand each other :)

...for my part I don't really care which one is faster - it will probably never become a big issue for me anyway (1x or 2x slower on a test like this is "almost nothing" ;) )
Offline Mark Thornton

Senior Member





« Reply #14 - Posted 2003-05-28 09:02:22 »

On my 3.06GHz P4 (Windows XP) I get

int: 2374
int aliased: 1421
float: 843
float aliased: 828
float (div): 1374
float aliased (div): 1343

The extra pair of float results are using /3f instead of *0.333...
These results are with the server VM, for the client VM the results are:

int: 4716
int aliased: 4372
float: 153369
float aliased: 154689
float (div): 155082
float aliased (div): 156864

Ouch!

The results for double are essentially the same as for float (both client and server).

Offline Kevdog

Junior Member





« Reply #15 - Posted 2003-05-28 22:30:38 »

When you guys are using the -server option, are you putting code in there to "warm it up" before actually running it?

Maybe run the tests twice and throw out the first results?
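
Something along these lines, as a minimal sketch: run the task a few times untimed, then keep only the later timings. The names `WarmupTimer`, `measure` and `intLoop` are hypothetical, the repeat counts are arbitrary, and `System.nanoTime()` plus the lambda syntax assume a much newer JVM than the 1.4.x releases tested in this thread.

```java
// Sketch only: WarmupTimer, measure and intLoop are made-up names.
final class WarmupTimer {
    static volatile int sink;   // keeps the benchmark result observable

    // Runs the task `warmup` times untimed, then returns the timings (ms) of `runs` measured passes.
    static long[] measure(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();                       // results of these passes are thrown away
        }
        long[] ms = new long[runs];
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            task.run();
            ms[i] = (System.nanoTime() - t0) / 1_000_000L;
        }
        return ms;
    }

    // The non-aliased int loop from the first post, with a configurable repeat count.
    static int[] intLoop(int repeats) {
        int[] iv = new int[1000];
        iv[0] = 8;
        iv[1] = 9;
        for (int ii = 0; ii < repeats; ii++) {
            for (int i = 2; i < iv.length; i++) {
                iv[i] = (5 * iv[i - 1] - 10 * iv[i - 2]) / 3;
                iv[i] *= iv[i];
            }
        }
        return iv;
    }

    public static void main(String[] args) {
        long[] ms = measure(() -> sink = intLoop(10_000)[999], 2, 3);
        for (int i = 0; i < ms.length; i++) {
            System.out.println("run " + i + ": " + ms[i] + " ms");
        }
    }
}
```

If the measured runs still drift, raising the warm-up count is the obvious knob to turn.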

There are only 10 types of people, those who understand binary and those who don't!
Offline Mark Thornton

Senior Member





« Reply #16 - Posted 2003-05-29 08:56:43 »

Quote
When you guys are using the -server option, are you putting code in there to "warm it up" before actually running it?

Maybe run the tests twice and throw out the first results?


Yes, but the length of the loop is sufficiently long that the changes aren't large.
Offline erikd

JGO Ninja


Medals: 15
Projects: 4
Exp: 14 years


Maximumisness


« Reply #17 - Posted 2003-05-30 18:51:11 »

Quote
Yes, but the length of the loop is sufficiently long that the changes aren't large.

There's not much to warm up in a micro benchmark so that makes sense :)

Erik

Offline Kevdog

Junior Member





« Reply #18 - Posted 2003-05-30 20:54:56 »

Okay, just had to give it a try on my work machine:
P3 1Ghz
512MB mem
WinNT

java 1.4.1_01 -client
Run #1
int: 6409
int alias: 5939
float: 66125
float alias: 64413

Run #2
int: 6819
int alias: 5799
float: 64873
float alias: 64703

Run #3
int: 6850
int alias: 5858
float: 64834
float alias: 65914

-server
Run #1
int: 5668
int alias: 5658
float: 62330
float alias: 62229

Run #2
int: 5598
int alias: 6259
float: 66416
float alias: 66996

Run #3
int: 5828
int alias: 6079
float: 62210
float alias: 62239

Looks like under WinNT the SSE instructions aren't being used?  Float math is horrible!  Maybe that's why some of the demo games run very slow and jerky on my system.  Hopefully we're upgrading to WinXP by the end of the year!

There are only 10 types of people, those who understand binary and those who don't!
Offline altair

Senior Newbie





« Reply #19 - Posted 2003-05-31 02:15:57 »

Unlike previous results, the results on the P3/NT would be enough to ban floats (or remove this platform from the targets). You did not give the version of the JVM though (upgrading could help improve the score).

Consider a game heavily using floats: it would fly on the Mac and the fast P4s (with XP) but would be unplayable on a 'slow' PC with NT. Less so with integers.

"Write once run anywhere" seems really not to be an easy task as far as performance is concerned ...
Offline swpalmer

JGO Coder




Where's the Kaboom?


« Reply #20 - Posted 2003-05-31 04:00:02 »

Quote
You did not give the version of the JVM though (upgrading could help improve the score).


Look again it says 1.4.1_01

Offline Mark Thornton

Senior Member





« Reply #21 - Posted 2003-05-31 11:59:46 »

Quote
Okay, just had to give it a try on my work machine:
P3 1Ghz
512MB mem
WinNT

java 1.4.1_01 -client
...
Looks like under WinNT the SSE instructions aren't being used?  Float math is horrible!  Maybe that's why some of the demo games run very slow and jerky on my system.  Hopefully we're upgrading to WinXP by the end of the year!


The use of SSE is new in 1.4.2 beta and even then only in the server version.
Offline princec

JGO Kernel


Medals: 282
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #22 - Posted 2003-05-31 13:26:39 »

> only in the server version

Fools. Java gaming once again takes a poke in the eye.

Cas :(

Offline Mark Thornton

Senior Member





« Reply #23 - Posted 2003-05-31 15:20:54 »

One serious problem with this benchmark is that the floating point calculation overflows (and then becomes NaN). The processor timing for the subsequent operations may not be very representative of normal calculation.
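
A small sketch to make the overflow visible: it runs one pass of the float recurrence from the first post and reports where the values first become infinite and where they first become NaN. The class and method names (`NanCheck`, `onePass`, `firstInfinite`, `firstNaN`) are hypothetical. The values roughly square at every step, so they leave the float range within the first handful of elements.

```java
// Sketch only: NanCheck and its method names are made up for illustration.
final class NanCheck {
    // One pass of the float loop from the first post.
    static float[] onePass() {
        float[] fv = new float[1000];
        fv[0] = 8.0f;
        fv[1] = 9.0f;
        for (int i = 2; i < fv.length; i++) {
            fv[i] = (5.0f * fv[i - 1] - 10.0f * fv[i - 2]) * 0.333333333f;
            fv[i] *= fv[i];
        }
        return fv;
    }

    // Index of the first infinite element, or -1 if there is none.
    static int firstInfinite(float[] fv) {
        for (int i = 0; i < fv.length; i++) {
            if (Float.isInfinite(fv[i])) return i;
        }
        return -1;
    }

    // Index of the first NaN element, or -1 if there is none.
    static int firstNaN(float[] fv) {
        for (int i = 0; i < fv.length; i++) {
            if (Float.isNaN(fv[i])) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        float[] fv = onePass();
        System.out.println("first infinite element: index " + firstInfinite(fv));
        System.out.println("first NaN element:      index " + firstNaN(fv));
    }
}
```

Once an element is NaN, every later element is NaN too, so almost all of the 1000 iterations per pass are spent on NaN arithmetic.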
Offline Mark Thornton

Senior Member





« Reply #24 - Posted 2003-05-31 15:29:55 »

I've now changed the constants slightly and found that on my P2/400 the floating point time changes from 155 seconds down to 13 seconds.
Curiously that 155 second time for the original benchmark is almost the same as for the 3.06GHz P4 I have at work when SSE is not used. Evidently the SSE path is much faster at handling NaN values than the ordinary case, but this doesn't tell us much about real life floating point speed.
Offline genepi

Senior Newbie




azerty


« Reply #25 - Posted 2003-06-01 00:30:45 »

And with benchmarks, ALWAYS check the results when you compare the timing! http://developer.java.sun.com/developer/bugParade/bugs/4860749.html is my experience with JDK 1.4.2Beta... :o
Though speedy, the double calculations were sometimes wrong on the Windows platform... :P
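
In that spirit, a minimal sketch of such a check for the loops in the first post: compute the plain and the aliased variant and confirm the arrays match before trusting any timing comparison. The class and method names are hypothetical, and each variant is run for a single pass only. `java.util.Arrays.equals` treats two NaN floats as equal, which matters here because the float data is mostly NaN.

```java
// Sketch only: ResultCheck and its method names are made up for illustration.
final class ResultCheck {
    static int[] plainInt() {
        int[] iv = new int[1000];
        iv[0] = 8;
        iv[1] = 9;
        for (int i = 2; i < iv.length; i++) {
            iv[i] = (5 * iv[i - 1] - 10 * iv[i - 2]) / 3;
            iv[i] *= iv[i];
        }
        return iv;
    }

    static int[] aliasedInt() {
        int[] iv = new int[1000];
        iv[0] = 8;
        iv[1] = 9;
        int previous = iv[1];                 // local copy of iv[i-1]
        for (int i = 2; i < iv.length; i++) {
            int v = (5 * previous - 10 * iv[i - 2]) / 3;
            v *= v;
            iv[i] = v;
            previous = v;
        }
        return iv;
    }

    static float[] plainFloat() {
        float[] fv = new float[1000];
        fv[0] = 8.0f;
        fv[1] = 9.0f;
        for (int i = 2; i < fv.length; i++) {
            fv[i] = (5.0f * fv[i - 1] - 10.0f * fv[i - 2]) * 0.333333333f;
            fv[i] *= fv[i];
        }
        return fv;
    }

    static float[] aliasedFloat() {
        float[] fv = new float[1000];
        fv[0] = 8.0f;
        fv[1] = 9.0f;
        float previous = fv[1];               // local copy of fv[i-1]
        for (int i = 2; i < fv.length; i++) {
            float f = (5.0f * previous - 10.0f * fv[i - 2]) * 0.333333333f;
            f *= f;
            fv[i] = f;
            previous = f;
        }
        return fv;
    }

    public static void main(String[] args) {
        System.out.println("int variants agree:   " + java.util.Arrays.equals(plainInt(), aliasedInt()));
        System.out.println("float variants agree: " + java.util.Arrays.equals(plainFloat(), aliasedFloat()));
    }
}
```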
Offline Mark Thornton

Senior Member





« Reply #26 - Posted 2003-06-02 09:54:44 »

Quote
> only in the server version

Fools. Java gaming once again takes a poke in the eye.

Cas :(


The standard fp performance is reasonable provided that you avoid the NaN case. Of course the -server version is faster (wouldn't be much point otherwise), but that is also true of the integer results.
So perhaps we should look for a benchmark which does something realistic with the values finite and preferably non zero. My slight variation in this benchmark results in the values converging to zero which isn't ideal either (too easy). In this case on my P4 the floating point calculation is faster than the integer method.

Any suggestions for a relevant calculation which has both integer and fp forms?
Offline princec

JGO Kernel


Medals: 282
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #27 - Posted 2003-06-02 11:20:56 »

Mandelbrot?
And of course, vertex transformation is where it's at these days. And the reason why the client FP performance isn't reasonable at all, just slow :-[


Cas :)
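
A rough sketch of what a Mandelbrot kernel with both an fp form and an integer form could look like. The integer form uses 16.16 fixed point with a `long` intermediate for the multiply; that format, the grid, and all names (`MandelKernels`, `escapeFloat`, `escapeFixed`, `toFixed`, `mul`) are assumptions made for illustration, not something proposed in the thread. The values stay finite and mostly non-zero, which is the property asked for above.

```java
// Sketch only: all names and the 16.16 format are illustrative assumptions.
final class MandelKernels {
    static final int FRAC = 16;                        // fractional bits of the fixed-point format

    static int toFixed(double v) { return (int) Math.round(v * (1 << FRAC)); }

    // Fixed-point multiply; the long intermediate avoids overflow of the raw product.
    static int mul(int a, int b) { return (int) (((long) a * b) >> FRAC); }

    // Number of iterations before |z|^2 exceeds 4, capped at maxIter (float form).
    static int escapeFloat(float cr, float ci, int maxIter) {
        float zr = 0f, zi = 0f;
        int n = 0;
        while (n < maxIter) {
            float zr2 = zr * zr, zi2 = zi * zi;
            if (zr2 + zi2 > 4f) break;
            zi = 2f * zr * zi + ci;
            zr = zr2 - zi2 + cr;
            n++;
        }
        return n;
    }

    // Same iteration in 16.16 fixed point; cr and ci are already in fixed-point form.
    static int escapeFixed(int cr, int ci, int maxIter) {
        final int four = 4 << FRAC;
        int zr = 0, zi = 0;
        int n = 0;
        while (n < maxIter) {
            int zr2 = mul(zr, zr), zi2 = mul(zi, zi);
            if (zr2 + zi2 > four) break;
            zi = 2 * mul(zr, zi) + ci;
            zr = zr2 - zi2 + cr;
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        final int size = 64, maxIter = 256;
        long sumFloat = 0, sumFixed = 0;
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                double cr = -2.0 + 3.0 * x / size;     // real axis from -2 to 1
                double ci = -1.5 + 3.0 * y / size;     // imaginary axis from -1.5 to 1.5
                sumFloat += escapeFloat((float) cr, (float) ci, maxIter);
                sumFixed += escapeFixed(toFixed(cr), toFixed(ci), maxIter);
            }
        }
        System.out.println("float iterations: " + sumFloat);
        System.out.println("fixed iterations: " + sumFixed);
    }
}
```

The two totals will not match exactly because the fixed-point form rounds differently, but they should be close enough to show that both kernels do comparable work. Timing each kernel separately over the same grid would then give an int-versus-float comparison that avoids the NaN problem.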

Offline Mark Thornton

Senior Member





« Reply #28 - Posted 2003-06-02 12:35:00 »

Quote
Mandelbrot?
And of course, vertex transformation is where it's at these days.  


Probably a bad example as you would really like the transformations to be done by all those transform pipelines on the graphics card.

I wonder if the graphics card FP could usefully be used for general purpose fp --- I seem to recall that the Sony game systems being hooked up as a 'supercomputer' were doing something like that.
Offline cfmdobbie

Senior Member




Who, me?


« Reply #29 - Posted 2003-06-02 14:47:52 »

Coincidentally, there was a thread about that a couple of days ago!

The conclusion was "technically yes, but getting the results out again is too slow", I believe.

Hellomynameis Charlie Dobbie.