Java-Gaming.org    
  So now we're on a microbenchmarking spree  (Read 4066 times)
Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Posted 2004-01-15 08:10:26 »

After reading the bad trig scores, I tried to create a little class for faster math using look-up tables and floats (currently only sin, cos, tan). And I tried to benchmark them in different ways: Using the client and server, using -Xcomp and without.

The results in ms:

Server with -Xcomp:
TOTAL: 35501

Server without -Xcomp
TOTAL: 19258

Client with -Xcomp:
TOTAL: 22102

Client without -Xcomp:
TOTAL: 20350

I found these results very surprising indeed.
The results are a mix between java.lang.Math and my own FMath class. Interestingly, java.lang.Math executes faster on the client than on the server JVM here but my FMath class is faster on the server.

Now, before you all tell me "It's a microbenchmark and you have to know what you're measuring": you're probably right, although I don't rule out that I'm dealing with one or more deficiencies, especially in the server VM.
So the real question here is: what am I measuring? Is there a way to know at all without knowing the intimate dirty secrets of HotSpot?
I mean, if there isn't, microbenchmarking is *absolutely useless* if you're not a HotSpot developer (which is probably old news). That's too bad, because microbenchmarking can IMHO be very useful for measuring your own code.
Even macro benchmarks seem hard to draw any reasonable conclusions from then...

(I'll post the code in a minute)

Erik

Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #1 - Posted 2004-01-15 08:25:09 »

public class Test {

    public static void main(String[] args) {
        FMath.init();
        
        long totalStart = System.currentTimeMillis();
        
        for (int ii = 0; ii < 20; ii++) {
            System.out.println("FMath");
            long start = System.currentTimeMillis();

            float result = 0;

            for (int i = 0; i < 1000000; i++) {
                float r = (float)(i / 1000000f) * FMath.PI_2;
                result += FMath.sin(r);
                result += FMath.cos(r);
                result += FMath.tan(r);
            }

            System.out.println(result);
            System.out.println(System.currentTimeMillis() - start);

            System.out.println("Math");

            start = System.currentTimeMillis();

            result = 0;

            for (int i = 0; i < 1000000; i++) {
                float r = (float)(i / 1000000f) * FMath.PI_2;
                result += Math.sin(r);
                result += Math.cos(r);
                result += Math.tan(r);
            }

            System.out.println(result);
            System.out.println(System.currentTimeMillis() - start);
        }
        System.out.println("***TOTAL: " + (System.currentTimeMillis() - totalStart));
    }
}

Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #2 - Posted 2004-01-15 08:27:03 »

public class FMath {
   
    public static int PRECISION = 0x100000;
    public static final float PI = (float)java.lang.Math.PI;
   
    public static final float PI_2 = PI*2;
   
    private static float RAD_SLICE = PI_2 / PRECISION;
   
    private static float[] sinTable;
    private static float[] cosTable;
    private static float[] tanTable;
       
    public static void init() {
        RAD_SLICE = PI_2 / PRECISION;
        sinTable = new float[PRECISION];
        cosTable = new float[PRECISION];
        tanTable = new float[PRECISION];
        for (int i = 0; i < PRECISION; i++) {
            float rad = (float)i * RAD_SLICE;
            sinTable[i] = (float)java.lang.Math.sin(rad);
            cosTable[i] = (float)java.lang.Math.cos(rad);
            tanTable[i] = (float)java.lang.Math.tan(rad);
        }
    }
   
    private static final int radToIndex(float radians) {
        //return (int)(((radians % PI_2)/PI_2) * PRECISION);
        // The bit mask wraps the index into the table; this only works while PRECISION is a power of two.
        return (int)((radians / PI_2) * (float)PRECISION) & (PRECISION-1);
    }
   
    public static float sin(float radians) {
        return sinTable[radToIndex(radians)];
    }

    public static float cos(float radians) {
        return cosTable[radToIndex(radians)];
    }
   
    public static float tan(float radians) {
        return tanTable[radToIndex(radians)];
    }
   
}

Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #3 - Posted 2004-01-15 08:28:58 »

But again, the real question is not what's wrong with the code, but whether there is a way to learn how to draw conclusions from Java benchmarks...

Offline princec

JGO Kernel


Medals: 343
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #4 - Posted 2004-01-15 08:36:19 »

If you've got the time would you write a JNI version of FMath that calls #asm functions and see how that works out? Wink

Oh and here's something that might explain the server result - once upon a time when I wrote some terrain demo or other it ran incredibly slowly and I couldn't figure out why. Turns out that it was using strictfp but I can't see any way from the command line of telling the VM which maths to use.

Cas Smiley

Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #5 - Posted 2004-01-15 10:42:30 »

Quote
If you've got the time would you write a JNI version of FMath that calls #asm functions and see how that works out?  


Well, it's better to fix performance problems ourselves than to wait for Sun for faster math. Sun's got fair reasons for Math to be dog slow but unfortunately that doesn't help us at all. We don't need the accuracy we just want raw speed. So maybe, yeah I will. LWJBM (Light Weight Java BadMath)?  Grin

But I'm wondering how I should benchmark it  Wink Roll Eyes

Offline shawnkendall

Senior Member





« Reply #6 - Posted 2004-01-15 12:32:21 »

Suggestions...

1) Run a "warm up" phase...
This performs the benchmark >50,000 times (I use 100,000 when I can wait) exactly as it will run in the final benchmark. Usually, to guarantee this, put the meat of the benchmark in a method that is called in a loop and

2) Create NO objects in the benchmark.  Unless of course you are benchmarking object creation/GC stuff.

3) To find true deltas between alternate tests, first compute the test with no operations ( or minimum ) and use that as a baseline cost of calling the method and moving data.  If this baseline ends up being 0, then it was optimized out and the benchmark is too simple.

4) Pre-compute test data and store it in a table. This keeps the test as focused as possible and also prevents the data generation code from being optimized out: the compiler cannot know how the table will change, so it must fetch the data each time.

5) Use the unofficial, higher-precision sun.misc.Perf high-res timer for timing.

6) Make SURE nothing else is running on your system. Alternatively, set the Java process priority to highest.

Remember the goal is to test your code, not the GC or JIT.  
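
A minimal sketch of how points 1 to 5 might fit together, assuming a modern JVM (Java 8 or later) where System.nanoTime() can stand in for the unofficial sun.misc.Perf timer. The class name WarmedBenchmark, the Kernel interface, the table size and the warm-up counts are all made up for illustration; none of them come from the code posted in this thread.

```java
// Hypothetical harness: names, sizes, and iteration counts are illustrative assumptions.
class WarmedBenchmark {
    static final int N = 1 << 16;
    // Point 4: inputs are precomputed so the timed loops only read a table.
    static final float[] INPUT = new float[N];
    static {
        for (int i = 0; i < N; i++) {
            INPUT[i] = (float) (i * 2.0 * Math.PI / N);
        }
    }

    interface Kernel { float run(); }

    // Point 3: baseline that only walks the table and accumulates.
    static float baseline() {
        float sum = 0f;
        for (int i = 0; i < N; i++) sum += INPUT[i];
        return sum;
    }

    // The operation under test, on top of the same loop.
    static float sinKernel() {
        float sum = 0f;
        for (int i = 0; i < N; i++) sum += (float) Math.sin(INPUT[i]);
        return sum;
    }

    // Points 1, 2, 5: warm up untimed, allocate nothing in the timed part, use nanoTime.
    static long bestNanos(Kernel k, int warmups, int runs) {
        float sink = 0f;
        for (int i = 0; i < warmups; i++) sink += k.run();
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            sink += k.run();
            best = Math.min(best, System.nanoTime() - t0);
        }
        // Use the accumulated value so the JIT cannot discard the kernel calls.
        if (Float.isNaN(sink)) System.out.println("unexpected NaN");
        return best;
    }

    public static void main(String[] args) {
        long base = bestNanos(WarmedBenchmark::baseline, 500, 50);
        long sin = bestNanos(WarmedBenchmark::sinKernel, 500, 50);
        System.out.println("baseline ns: " + base);
        System.out.println("Math.sin ns: " + sin + " (delta " + (sin - base) + ")");
    }
}
```

Reporting the best of several runs is only one possible choice; a median would be less sensitive to a single lucky run. If the baseline comes out at or near zero, point 3 applies and the loop was probably optimized away.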

If I think of more... :-)

Shawn Kendall
Cosmic Interactive, LLC
http://www.facebook.com/BermudaDash
Offline shawnkendall

Senior Member





« Reply #7 - Posted 2004-01-15 12:41:25 »

7) Don't call System.out until the end of each benchmark. System.out.println generates garbage for collection. Sometimes you can get away with it because the GC won't happen during the benchmark, but be aware that it could.

8) Wait/Sleep. This is one I do, but I'm not positive it's needed now. Perhaps Jeff or others can verify.
I put waits/sleeps in between tests so the VM/GC/JIT gets execution time for whatever it wants, so that it doesn't have to grab that time in the middle of some other library call that waits/sleeps. Of course the VM can't just block your code anywhere, but in large benchmarks with libraries you never know where one is lying, so I give the VM plenty of time and chance to do what it wants outside of my tests.
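
Points 7 and 8 could be sketched roughly like this, again assuming Java 8 or later. DeferredReport, the sqrt workload and the pause length are hypothetical choices made for the example; whether the pause helps at all is the open question raised in point 8.

```java
// Hypothetical sketch: names and the pause length are illustrative assumptions.
class DeferredReport {
    // Stand-in workload; any allocation-free kernel would do.
    static double work(int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) sum += Math.sqrt(i);
        return sum;
    }

    static long[] timeRuns(int runs, int n, long pauseMillis) {
        long[] nanos = new long[runs]; // allocated once, before any timing starts
        double sink = 0;
        for (int r = 0; r < runs; r++) {
            long t0 = System.nanoTime();
            sink += work(n);
            nanos[r] = System.nanoTime() - t0;
            try {
                Thread.sleep(pauseMillis); // point 8: idle time between tests
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        // Reading sink keeps the work() calls from being treated as dead code.
        if (sink < 0) throw new IllegalStateException("sqrt sum cannot be negative");
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRuns(10, 1_000_000, 50);
        // Point 7: all printing happens after the last measurement.
        for (int r = 0; r < nanos.length; r++) {
            System.out.println("run " + r + ": " + nanos[r] / 1_000_000.0 + " ms");
        }
    }
}
```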

Shawn Kendall
Cosmic Interactive, LLC
http://www.facebook.com/BermudaDash
Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #8 - Posted 2004-01-15 13:09:40 »

Thanks. Those seem like good suggestions.  Smiley

BTW, anyone got a clue why in this test:
a) java.lang.Math performs faster on the client? (might have something to do with strictfp, but why does this then only affect the server?)
b) -Xcomp slows things down? (especially on the server)
...or so it seems anyway  Wink...

Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #9 - Posted 2004-01-15 18:56:22 »

Anyway, I'll do some more testing using Shawn's suggestions, and if I can really pin the problem down to a server deficiency in Math (which is what it looks like now), I'll file a bug report.

Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #10 - Posted 2004-01-15 19:20:38 »

Hmmm.... Just tested the benchmark at home using 1.4.1_01 and the results were as expected: the server being slightly faster than the client (without -Xcomp, didn't test with it).

Could someone else with 1.4.2_03 maybe run this benchmark and see whether the Math output on the server is also slower? Maybe it's some weird local problem on my machine, although that machine is just 3 days old and everything is freshly installed...

Erik

Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #11 - Posted 2004-01-15 19:25:18 »

on 1.4.1, -Xcomp on the server still seriously degrades performance in my benchmark...  :-/ (>twice as slow)

Offline Jeff

JGO Coder




Got any cats?


« Reply #12 - Posted 2004-01-15 20:25:41 »

These don't surprise me at all. It looks exactly like I would expect.  Am I missing something?

I assume bigger is better. If not, then your results look 100% backwards and I would check that assumption.

-Xcomp is always better than without it, on either VM.

WITHOUT -Xcomp, client hits its compile threshold sooner and thus produces better numbers.

Just shows the real need to properly warm up your VM.

Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  It's likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline swpalmer

JGO Coder




Where's the Kaboom?


« Reply #13 - Posted 2004-01-16 03:02:21 »

Quote
These don't surprise me at all. It looks exactly like I would expect.  Am I missing something?

I assume bigger is better. If not, then your results look 100% backwards and I would check that assumption.

-Xcomp is always better than without it, on either VM.

WITHOUT -Xcomp, client hits its compile threshold sooner and thus produces better numbers.

Just shows the real need to properly warm up your VM.

Huh? Maybe you are reading the numbers backwards - lower is better (right?)
WITHOUT -Xcomp it reaches the compile threshold SOONER??  I thought with -Xcomp it hit the compile threshold IMMEDIATELY.

The thing that was weird is:

Server with -Xcomp:
TOTAL: 35501

is slower than

Server without -Xcomp
TOTAL: 19258

And that the server VM is slower than the client VM with -Xcomp... unless the actual compile times are taking up (too much) time in the benchmark.. then I would expect the server VM to take more time optimizing than the client VM.. and yet the code likely won't come out that different.

Could that be it or am I missing something?

Offline Jeff

JGO Coder




Got any cats?


« Reply #14 - Posted 2004-01-16 03:40:09 »

Quote

Huh? Maybe you are reading the numbers backwards - lower is better (right?)
WITHOUT -Xcomp it reaches the compile threshold SOONER??  I thought with -Xcomp it hit the compile threshold IMMEDIATELY.


Sorry, I wasn't clear.  If both Server and Client are run without -Xcomp, Client will hit its threshold sooner and perform better on short-run tests.  That was my point.

As I say, if bigger is better the numbers are exactly the relationship I expect.  If smaller is better the numbers are entirely BACKWARDS, which makes me question that assumption.

Guess I should try to spring some cycles to look at the benchmark itself if I can.


Offline Jeff

JGO Coder




Got any cats?


« Reply #15 - Posted 2004-01-16 03:44:55 »

Hmm, you're right about the measurement.  Which does make it odd, but I'm sure it's explainable.

First thing is to take the loop out of the main.  Having the loop in the main means on-stack-replacement which can cause odd things to happen benchmark-wise as we've already seen.

Second thing to do is to call the test multiple times.  I wouldn't trust any number until you've seen it settle into returning the same number (or within a few hundred ms of the same number) a few times.  This will ensure you aren't seeing any compile times or other warm-up issues in your numbers.
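
A rough sketch of the "call it until it settles" idea, assuming Java 8 or later. SettleDetector, the window size and the tolerance are hypothetical names and values picked for the example, and the sin loop is just a stand-in for whatever is being measured.

```java
// Hypothetical sketch: class name, window size, and tolerance are illustrative assumptions.
class SettleDetector {
    static volatile double sink; // keeps the kernel's result observable

    // True when every timing in the window lies within toleranceMillis of the others.
    static boolean isStable(long[] recentMillis, long toleranceMillis) {
        if (recentMillis.length == 0) return false;
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
        for (long t : recentMillis) {
            min = Math.min(min, t);
            max = Math.max(max, t);
        }
        return max - min <= toleranceMillis;
    }

    // The test lives in its own method, so repeated calls can run the normally
    // compiled method instead of depending on on-stack replacement inside main.
    static void kernel() {
        double sum = 0;
        for (int i = 0; i < 1_000_000; i++) sum += Math.sin(i * 1e-6);
        sink = sum;
    }

    // Returns the last timing once the window has settled, or -1 if it never did.
    // window must be at least 1.
    static long runUntilStable(int window, long toleranceMillis, int maxRuns) {
        long[] recent = new long[window];
        for (int run = 0; run < maxRuns; run++) {
            long t0 = System.nanoTime();
            kernel();
            long millis = (System.nanoTime() - t0) / 1_000_000;
            recent[run % window] = millis;
            if (run + 1 >= window && isStable(recent, toleranceMillis)) return millis;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("settled at ms: " + runUntilStable(5, 2, 200));
    }
}
```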



Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #16 - Posted 2004-01-16 06:37:27 »

The test is already running 20 times. After 3 runs or so, the numbers become stable. I'll try to get the test out of main and see if it helps.

Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #17 - Posted 2004-01-16 07:02:41 »

Ok, changed the Test to this:
public class Test {
   
    static final int PRECISION = 1000000;
   
    private void fmath() {
        System.out.println("FMath");
        long start = System.currentTimeMillis();
       
        float result = 0;
       
        for (int i = 0; i < 1000000; i++) {
            float r = (float)(i / 1000000f) * FMath.PI_2;
            result += FMath.sin(r);
            result += FMath.cos(r);
            result += FMath.tan(r);
        }
       
        System.out.println(result);
        System.out.println(System.currentTimeMillis() - start);
       
    }
   
    public void math() {
        System.out.println("Math");
       
        long start = System.currentTimeMillis();
       
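        // NOTE: a long here, while fmath() uses a float; each += below truncates the double sum back to a long.
        long result = 0;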
       
        for (int i = 0; i < 1000000; i++) {
            float r = (float)(i / 1000000f) * FMath.PI_2;
            result += Math.sin(r);
            result += Math.cos(r);
            result += Math.tan(r);
        }
       
        System.out.println(result);
        System.out.println(System.currentTimeMillis() - start);  

       
    }
   
    public void benchmark() {
        long totalStart = System.currentTimeMillis();
       
        fmath();
        fmath();
        fmath();
        math();
        math();
        math();
       
       
        for (int ii = 0; ii < 17; ii++) {
            fmath();
            math();
        }
        System.out.println("***TOTAL: " + (System.currentTimeMillis() - totalStart));
    }

    public static void main(String[] args) {
        FMath.init();
        Test test = new Test();
        test.benchmark();
    }
}




Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #18 - Posted 2004-01-16 07:28:32 »

With the above code, the numbers start to make somewhat more sense.

server:
FMath: 70
Math: 1232
***TOTAL: 26098

server with -Xcomp:
FMath: 70
Math: 1242
***TOTAL: 26498

client:
FMath: 211
Math: 1031
***TOTAL: 25036

client with -Xcomp:
FMath: 210
Math: 1031
***TOTAL: 24915

Observations:
* client is surprisingly a little bit faster than the server in the total score.
* FMath is (as expected) a lot faster on the server
* Math surprisingly performs slower on the server.
* -Xcomp doesn't make a notable difference anymore, which I suppose is as expected in a little benchmark like this.

Offline crystalsquid

Junior Member




... Boing ...


« Reply #19 - Posted 2004-01-16 10:24:53 »

Ooh! FMath looks nice & fast Smiley I may have to use that (as long as you don't mind Erik)

I will try to get around to adding a sqrt/rsqrt to see if it is any faster as well, and I'll post the results if it works.

You can also increase precision on the result by interpolating between the nearest two table entries, although this may slow it down to nearly the original math speed.
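
For illustration, linear interpolation between the two nearest entries might look like the sketch below. It is not part of FMath: LerpSinTable, the 4096-entry table and the extra guard entry are assumptions made for the example, and it covers sin only.

```java
// Hypothetical sketch: table size and names are illustrative, not taken from FMath.
class LerpSinTable {
    static final int SIZE = 4096; // power of two so the index can be masked
    static final float SCALE = (float) (SIZE / (2.0 * Math.PI));
    // One extra entry so TABLE[i + 1] is always valid.
    static final float[] TABLE = new float[SIZE + 1];
    static {
        for (int i = 0; i <= SIZE; i++) {
            TABLE[i] = (float) Math.sin(i * 2.0 * Math.PI / SIZE);
        }
    }

    static float sin(float radians) {
        float pos = radians * SCALE;        // position in table units
        int floor = (int) Math.floor(pos);  // floor, so negative angles work too
        float frac = pos - floor;           // fractional distance to the next entry
        int i = floor & (SIZE - 1);         // wrap into [0, SIZE)
        return TABLE[i] + frac * (TABLE[i + 1] - TABLE[i]);
    }

    // Largest absolute difference from Math.sin over evenly spaced angles in [0, 2*PI).
    static double maxError(int samples) {
        double worst = 0;
        for (int k = 0; k < samples; k++) {
            float x = (float) (k * 2.0 * Math.PI / samples);
            worst = Math.max(worst, Math.abs(sin(x) - Math.sin(x)));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("max abs error: " + maxError(100_000));
    }
}
```

The extra floor, subtraction, multiply and second table read are the cost being traded for accuracy here; a much smaller table can then reach a given precision, which may matter for cache behaviour as much as for memory.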

- Dom
Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #20 - Posted 2004-01-16 11:30:41 »

Of course I don't mind  Smiley but you have to test it, cos I didn't really check the results (it might well be totally wrong), so some bugfixing may be involved  Grin
I just did it to see if this would be a workaround for Math's slowness.

EDIT: I just did some checks. Precision is good until ~4 positions after the point. You can of course make the precision higher at the cost of memory use, but hey, it's 17.6 times faster than Math  Smiley
Interpolation is indeed an option, although it will surely cost.
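
The precision-versus-memory trade-off can be checked with a small sketch like the one below. It rebuilds a truncating sin table with an adjustable size rather than using FMath itself, so TableErrorCheck and its methods are hypothetical. For reference, three float tables at the posted PRECISION of 0x100000 come to 3 x 1,048,576 x 4 bytes = 12 MiB.

```java
// Hypothetical sketch: measures a truncating lookup rebuilt here with an
// adjustable size; it is not the FMath class itself.
class TableErrorCheck {
    // size must be a power of two so the index can be wrapped with a mask.
    static double maxSinError(int size, int samples) {
        float[] table = new float[size];
        for (int i = 0; i < size; i++) {
            table[i] = (float) Math.sin(i * 2.0 * Math.PI / size);
        }
        double worst = 0;
        for (int k = 0; k < samples; k++) {
            double x = k * 2.0 * Math.PI / samples;
            int index = (int) (x / (2.0 * Math.PI) * size) & (size - 1);
            worst = Math.max(worst, Math.abs(table[index] - Math.sin(x)));
        }
        return worst;
    }

    // Bytes needed for three float tables (sin, cos, tan) of the given size.
    static long tableBytes(int size) {
        return 3L * size * Float.BYTES;
    }

    public static void main(String[] args) {
        for (int size = 1 << 10; size <= 1 << 20; size <<= 2) {
            System.out.println(size + " entries: max sin error " + maxSinError(size, 1_000_000)
                    + ", " + tableBytes(size) / 1024 + " KiB for three tables");
        }
    }
}
```

Because the lookup truncates to the entry below, the sin error is bounded by roughly one table step (2*PI divided by the table size), so each doubling of the table should about halve the worst case.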

Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #21 - Posted 2004-01-16 16:45:49 »

BTW it seems to me that Math.sqrt is pretty fast already, although I haven't checked against C. Anyway I doubt if it could be made really faster using java alone.

for (int i = 0; i < 1000000; i++) {
    float r = (float) (i / 1000000f) * FMath.PI_2;
    result += Math.sqrt(r);
}

is done in 31 ms on my machine. Well, on the client that is. On the server it's 47ms strangely enough  :-/

Strange thing is that if I redirect FMath.sqrt() to StrictMath.sqrt() (just like Math does), the FMath version is way slower than Math (Math being ~30x faster) Huh
Something to do with strictfp perhaps?

Offline Mark Thornton

Senior Member





« Reply #22 - Posted 2004-01-16 17:09:49 »

Quote

Strange thing is that if I redirect FMath.sqrt() to StrictMath.sqrt() (just like Math does), the FMath version is way slower than Math (Math being ~30x faster) Huh
Something to do with strictfp perhaps?

Not strange at all. The StrictMath version really does do a JNI call to some complex C code, while the Math version will use the Intel sqrt instruction inlined.
Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #23 - Posted 2004-01-16 17:15:37 »

Ah, that explains the comment in Math.sqrt()  Smiley
So the implementation in math is really like 'overridden' by HotSpot if I understand correctly?

Offline Mark Thornton

Senior Member





« Reply #24 - Posted 2004-01-16 18:44:23 »

Just out of curiosity, I implemented a simple polynomial approximation to sin, accurate to about 2e-4. With code to reduce arguments to PI/2 it was about 5 times faster than Math.sin. If it could assume the arguments were in the range 0 .. 2*PI, then it was about 9 times faster.
The range of arguments used in the test was 0 .. 2*PI. For arguments restricted to the range 0 .. PI/2 the advantage is smaller, because the Math.sin code doesn't need to use its expensive argument reduction in this interval.
Note that the argument reduction used by Math.sin becomes more expensive with larger arguments, as it has to use ever higher precision values of PI (up to ~1024 bits as I recall). This means that the benchmark which used arguments up to 1e6 was particularly cruel to the Java implementation.
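
The polynomial itself was not posted, so the following is only a stand-in: a degree-7 Taylor polynomial with a cheap fold into -PI/2 .. PI/2, which happens to land in the same accuracy range (worst case roughly 1.6e-4 at PI/2). PolySin and its method names are hypothetical, and no speed claim is implied.

```java
// Hypothetical sketch: a degree-7 Taylor polynomial, chosen for illustration only.
class PolySin {
    static final double TWO_PI = 2.0 * Math.PI;
    static final double HALF_PI = 0.5 * Math.PI;

    // Polynomial intended for [-PI/2, PI/2]: x - x^3/3! + x^5/5! - x^7/7!
    static double sinCore(double x) {
        double x2 = x * x;
        return x * (1.0 + x2 * (-1.0 / 6.0 + x2 * (1.0 / 120.0 - x2 / 5040.0)));
    }

    // Cheap argument reduction, adequate only for moderately sized arguments.
    static double sin(double x) {
        x -= TWO_PI * Math.floor((x + Math.PI) / TWO_PI); // now in [-PI, PI)
        if (x > HALF_PI) {
            x = Math.PI - x;   // sin(PI - x) == sin(x)
        } else if (x < -HALF_PI) {
            x = -Math.PI - x;  // sin(-PI - x) == sin(x)
        }
        return sinCore(x);
    }

    // Largest absolute difference from Math.sin over evenly spaced angles in [0, 2*PI).
    static double maxError(int samples) {
        double worst = 0;
        for (int k = 0; k < samples; k++) {
            double x = k * TWO_PI / samples;
            worst = Math.max(worst, Math.abs(sin(x) - Math.sin(x)));
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("max abs error on [0, 2*PI): " + maxError(100_000));
    }
}
```

The floor-based reduction uses a plain double value of PI, which is exactly the shortcut Math.sin does not take; that keeps it cheap but means accuracy degrades for very large arguments.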
Offline Jeff

JGO Coder




Got any cats?


« Reply #25 - Posted 2004-01-16 21:24:29 »

Quote
Ah, that explains the comment in Math.sqrt()  Smiley
So the implementation in math is really like 'overridden' by HotSpot if I understand correctly?


Yes, the VM knows how to do math primitives directly in the code, as opposed to treating them as method calls.



Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #26 - Posted 2004-01-16 22:03:41 »

nifty  Smiley
Would you happen to know a possible explanation of the lower performance of Math using the server VM in my benchmark by any chance?
All mysteries would be solved then  Wink
Or should I report a bug and see what happens?

Offline Jeff

JGO Coder




Got any cats?


« Reply #27 - Posted 2004-01-16 23:49:25 »

Quote
nifty  Smiley
Would you happen to know a possible explanation of the lower performance of Math using the server VM in my benchmark by any chance?
All mysteries would be solved then  Wink
Or should I report a bug and see what happens?


I know some folks I can ask. I'll try to get to it.  I'm kinda bogged down right now with getting the Big Secret Surprise ready for GDC...


Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #28 - Posted 2004-01-17 17:09:05 »

OK, thanks, and no hurry. Good luck with the preparations for GDC.

Offline grayarea

Junior Newbie





« Reply #29 - Posted 2004-01-17 22:17:20 »

Here's a bug from BugParade I found on a discussion at TheServerSide.com about the same benchmark study: http://developer.java.sun.com/developer/bugParade/bugs/4857011.html
I don't know if this particular bug was posted in this forum before. The evaluation makes the reasoning behind Java 1.4 trig functions' implementation pretty clear. A combination of using narrow ranges and table lookups seems to be the way.