Java-Gaming.org    
Featured games (91)
games approved by the League of Dukes
Games in Showcase (577)
games submitted by our members
Games in WIP (498)
games currently in development
News: Read the Java Gaming Resources, or peek at the official Java tutorials
 
    Home     Help   Search   Login   Register   
Pages: [1]
  ignore  |  Print  
  Fast Trig on the x86  (Read 4179 times)
0 Members and 1 Guest are viewing this topic.
Offline Jeff

JGO Coder




Got any cats?


« Posted 2005-12-05 03:30:07 »

Hey guys,

As soem of you may or may not know there is an issue in java where trig calculatiosn over 45 degrees or under -45 degrees ar esignificantly slower in Java then C on the x86 processor. This is due to the fact that the x86 prcoessor is *broken* and its trig intrinsics fail Java's accuracy gaurantees for anything otuside of that range, forcing Java to fall back on sofwtare calculation of those functions.

I've just posted my first gem on my gems page, which is tested code for Sin and Cos which will take an arbitrary Sin or Cos calculation and reduce it to one in that range, ensuring speedy calculation.

Enjoy.

http://wiki.java.net/bin/view/Games/JeffGems


Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline darkprophet

Senior Member




Go Go Gadget Arms


« Reply #1 - Posted 2005-12-05 15:32:48 »

It would be nice if you posted some benchmarks too, against trig tables and Math.sin/cos/tan


Friends don't let friends make MMORPGs.

Blog | Volatile-Engine
Offline Riven
« League of Dukes »

JGO Overlord


Medals: 605
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #2 - Posted 2005-12-05 15:35:49 »

Trig tables will beat the ... out of sin/cos. (somewhere around factor 7 IIRC)

If your lookup-table is of length 100, you do:
entry = (( (int)(angle * scale) % 100) + 100) % 100;
...to catch all angles, even negative ones, without if-statements



Will post the benchmarks when I find time... at work now Lips Sealed

Hi, appreciate more people! Σ ♥ = ¾
Learn how to award medals... and work your way up the social rankings
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline DzzD
« Reply #3 - Posted 2005-12-05 17:39:43 »

I just made a very simple benchmarks of those functions (loop 2000000 ops ):

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
28  
29  
30  
31  
32  
      long startTime=0;
      long endTime=0;
      int time=0;
      int nbOperation=2000000;  
      double result=0.0;
      double a=3*Math.PI;
     
      startTime=System.currentTimeMillis();
      for(int n=0;n<nbOperation;n++)
      {
         result+=Math.cos(a);
         a+=10;
      }
      endTime=System.currentTimeMillis();
      time=(int)(endTime-startTime);
      System.out.println("Time=" + time);
      System.out.println("Op/s=" + (nbOperation*1000)/time);
      System.out.println("result=" + result);
     
      result=0.0;
      a=3*Math.PI;
      startTime=System.currentTimeMillis();
      for(int n=0;n<nbOperation;n++)
      {
         result+=FastTrig.FastCos(a);
         a+=10;
      }
      endTime=System.currentTimeMillis();
      time=(int)(endTime-startTime);
      System.out.println("Time=" + time);
      System.out.println("Op/s=" + (nbOperation*1000)/time);
      System.out.println("result=" + result);  




FastTrig.FastCos is about one(0.9) to ten times faster than Math.cos !! of course it depend on tests ranges and especially the parameter "a"

final result is not the same but after 2000000 cosinus additions there is just a very very little difference.

Bruno
 

Offline Jeff

JGO Coder




Got any cats?


« Reply #4 - Posted 2005-12-05 22:21:47 »


FastTrig.FastCos is about one(0.9) to ten times faster than Math.cos !! of course it depend on tests ranges and especially the parameter "a"

Well... I'd expct it to be *slightly* slower then Math.cos for anythign in the -45/+45 degree range because  there is no benefit and youa re adding a tiny handful of additional ops.

I'd be *very* curious to see a  distribution of speed comparisons by "a"...

Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline Jeff

JGO Coder




Got any cats?


« Reply #5 - Posted 2005-12-05 23:36:34 »

Okay,  I wrote this little benchmark...

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
28  
29  
30  
31  
32  
33  
34  
35  
36  
37  
38  
39  
40  
41  
42  
43  
44  
45  
46  
47  
48  
49  
50  
51  
52  
53  
54  
55  
56  
57  
58  
59  
60  
61  
62  
63  
64  
65  
66  
67  
68  
69  
70  
public class FastTrigBenchmark {
   
    static double[][] benchmarkCos(double startvalue, double endvalue, double increment, long samples){
          double[][] results = new double[(int)Math.ceil((endvalue-startvalue+1)/increment)][2];
          int datapt = 0;
          for(double testval=startvalue;testval<=endvalue;testval+=increment){
             long time = System.nanoTime();
             for(int i=0;i<samples;i++){
                Math.cos(testval);
             }
             results[datapt][0] = time - System.nanoTime();
             time = System.nanoTime();
             for(int i=0;i<samples;i++){
                FastTrig.fastCos(testval);
             }
             results[datapt++][1] = time - System.nanoTime();
          }
          return results;
       }

   /**
    * @param args
    */

   public static void main(String[] args) {
      double start = -Math.PI*10;
      double end = Math.PI*10;
      double incr = Math.PI/180.0;// check by degree
     //warmup
     System.out.println("Warming up");
      benchmarkCos(start,end,incr,1000);
      //now do for real
     System.out.println("Collecting values");
      double[][] result = benchmarkCos(start,end,incr,1000);
      // print raw output
     System.out.println("Raw Results");
      double val = start;
      for(int i=0;i<result.length;i++){
         System.out.print("PI*"+val/Math.PI+":");
         System.out.print("Cos:"+result[i][0]);
         System.out.println("FastCos:"+result[i][1]);
         val += incr;
      }
   
      System.out.println("Deltas as % (FastCos = X % Cos)");
      val = start;
      for(int i=0;i<result.length;i++){
         System.out.print("PI*"+val/Math.PI+":");        
         System.out.println((result[i][1]/result[i][0])*100+"%");
         val += incr;
      }

      System.out.println("total times as %: (FastCos = x % cos)");
      double cosAcc = 0;
      double fastCosAcc = 0;
      for(int i=0;i<result.length;i++){
         cosAcc += result[i][0];
         fastCosAcc += result[i][1];
      }
      System.out.println((fastCosAcc/cosAcc)*100+"%");

   }

}
[quote][/quote]

On the whole, voer the entire distribution, we're seeing about the same speed (it varies a bit for reasons Im unsure of)

What i find mreo inetresting though is the breakdown of the bands. It suggests there migth be some subtle tuning I can do yet...

Im trying to get Star office to graph it. if I can make that work ill post.

Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline Jeff

JGO Coder




Got any cats?


« Reply #6 - Posted 2005-12-05 23:46:01 »

Trig tables will beat the ... out of sin/cos. (somewhere around factor 7 IIRC)

If your lookup-table is of length 100, you do:
entry = (( (int)(angle * scale) % 100) + 100) % 100;
...to catch all angles, even negative ones, without if-statements



Will post the benchmarks when I find time... at work now Lips Sealed

Intresting!  i must admit im having a bit of trouble figuring out how your lookup works.. just trouble picturing it in my head...

But it sure looks more efficient!  Are you assuming a 0-360 degree table?

Edit:Oh ,I got it, most of that is jsut to movve the index out of the negs Smiley
What is "scale" in the above?

Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline DzzD
« Reply #7 - Posted 2005-12-05 23:47:09 »

Quote
Well... I'd expct it to be *slightly* slower

yes about 0.9


Quote
re adding a tiny handful of additional ops

additional ops are negligable with cos (eg: += can be performed about 60000000/s when cos is about 1000000), and you need a value that is updated in the loop and used outside to be sure that optimiser will not change benchmark result.

it seem that if you calculate random cos and sin fasttrig become faster (in the outside range) but if you calculate continuous cos like cos,0.1 cos,0.2cos etcc (a+=0.1) it become equals (a bit less 0.9 due to over op)

maybe a cache or something like that (serie cache)

Bruno

Offline Jeff

JGO Coder




Got any cats?


« Reply #8 - Posted 2005-12-05 23:49:12 »

Good catch.  Il have to rewrite a gaussain random benchmark at some point...

Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline princec

JGO Kernel


Medals: 282
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #9 - Posted 2005-12-06 20:27:49 »

I use "Yakly degrees" myself - instead of needing to do % modulus ops I can just bitmask 'em instead - there are conveniently 0x10000 Yakly degrees in a circle Wink

One thing that people using LUTs forget is that it can be a very, very expensive operation if you're not lucky, as if your LUT finds itself outside the L1 cache, or worse, the L2 cache, you're looking at a bus wait. Aggh!

Cacs Smiley

Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline kaffiene
« Reply #10 - Posted 2005-12-06 23:05:57 »

I use "Yakly degrees" myself - instead of needing to do % modulus ops I can just bitmask 'em instead - there are conveniently 0x10000 Yakly degrees in a circle Wink

Ummm... wassat?  Wink
Offline Jeff

JGO Coder




Got any cats?


« Reply #11 - Posted 2005-12-07 00:19:30 »

I use "Yakly degrees" myself - instead of needing to do % modulus ops I can just bitmask 'em instead - there are conveniently 0x10000 Yakly degrees in a circle Wink

Ummm... wassat?  Wink

If i understand it, its an arbitrary angle division that happens to fit into base 2 well.

Look at it this way:  A circle is....

360 degrees
or
(approximately)6.28 Radians
or
0x10000 Yaklys


Get it?  Smiley


Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline Ask_Hjorth_Larsen

Junior Member




Java games rock!


« Reply #12 - Posted 2005-12-07 06:59:51 »

If you use table look-ups, you should linearize between look-up points to ensure that the curve is continuous. This should be relatively cheap.
Offline rolz

Junior Member




Chief Technopolitan


« Reply #13 - Posted 2005-12-07 12:02:23 »

FastTrig is stable 20-30% slower than conventional Math.cos()/sin()  on my environment.
btw, I have used values between -10000 .. 10000 rads for angle.



Conventional math: 0.577s
Fast math: 0.794s

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
28  
29  
30  
31  
32  
33  
34  
35  
/**
 * User: Andrei_Karneyenka
 */

public class TrigTest extends TestCase {


    public void testSinCos() throws Exception {

        double[]values = new double[2000];

        for (int i = 0; i < values.length; i++) {
            values[i] = 10000 - Math.random() * 20000;
        }

        TimerCounter timer = new TimerCounter();
        for (int i = 0; i < values.length; i++) {
            for (int j = 0; j < values.length; j++) {
                double value = values[j];
                Math.cos(value);
                Math.sin(value);
            }
        }
        System.err.println("Conventional math: "+timer.getElapsedTime());

        for (int i = 0; i < values.length; i++) {
            for (int j = 0; j < values.length; j++) {
                double value = values[j];
                Trig.cos(value);
                Trig.sin(value);
            }
        }
        System.err.println("Fast math: "+timer.getElapsedTime());

    }
}


Offline Jeff

JGO Coder




Got any cats?


« Reply #14 - Posted 2005-12-07 19:56:48 »

FastTrig is stable 20-30% slower than conventional Math.cos()/sin()  on my environment.
btw, I have used values between -10000 .. 10000 rads for angle.



Conventional math: 0.577s
Fast math: 0.794s

This has already been covered.  Linearly advancing benchmarks are a special case for Math.cos/sin and much faster then random data points.

I have a note  on this at the end of the article.

Seperately I dont see any warm-up on your benchmark.

Benchmarks without warm-up are fairly useless on J2SE due to the impact of the JIT, both in time burned for compilation and the subsequent speed up.

Depending on exactly when the JIT chose to do compilation and on-stack repalcement your benchmark would produce wildly different results.

Writing good micro-benchmarks under J2SE is a rather difficult and specialized art requiring a lot of understanding of what the VM is doing underneath.


Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline Riven
« League of Dukes »

JGO Overlord


Medals: 605
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #15 - Posted 2005-12-07 21:00:39 »

Trig tables will beat the ... out of sin/cos. (somewhere around factor 7 IIRC)

If your lookup-table is of length 100, you do:
entry = (( (int)(angle * scale) % 100) + 100) % 100;
...to catch all angles, even negative ones, without if-statements



Will post the benchmarks when I find time... at work now Lips Sealed

Intresting!  i must admit im having a bit of trouble figuring out how your lookup works.. just trouble picturing it in my head...

But it sure looks more efficient!  Are you assuming a 0-360 degree table?

Edit:Oh ,I got it, most of that is jsut to move the index out of the negs Smiley
What is "scale" in the above?

Let say you want 10 accurate values between an angle of 4 and 5 (thus: 4.0, 4.1, .., 4.9) (doesn't matter whether those are degrees or radians)
For simplicity sake we do it in degrees here:
we have 10 * 360 (3600) entries in the lookup-table
we then do the following:

1  
2  
3  
public static final float fastSin(float degrees) {
   return lookupTable[  (((int)(degrees * 10) % 3600)+3600)%3600  ];
}


... IIRC Smiley

Hi, appreciate more people! Σ ♥ = ¾
Learn how to award medals... and work your way up the social rankings
Offline whome

Junior Member




Carte Noir Java


« Reply #16 - Posted 2005-12-11 13:20:25 »

"we have 10 * 360 (5760) entries in the lookup-table"
Where did you get this 5760 entries in an array table if we have 0-360 degrees with 10 step accurate scale?
Offline Riven
« League of Dukes »

JGO Overlord


Medals: 605
Projects: 4
Exp: 16 years


Hand over your head.


« Reply #17 - Posted 2005-12-11 18:12:54 »

I appearantly like to change some values in my examples without changing everything else.

I started with 16 * 360... will change it.

Hi, appreciate more people! Σ ♥ = ¾
Learn how to award medals... and work your way up the social rankings
Offline chaosdeathfish

Junior Member


Projects: 1



« Reply #18 - Posted 2005-12-11 23:53:50 »

Personally I just use lookup tables for acos (which seems horrendously slow) - sin and cos don't cause me enough problems for me to need LUTs for them.
Pages: [1]
  ignore  |  Print  
 
 
You cannot reply to this message, because it is very, very old.

 

Add your game by posting it in the WIP section,
or publish it in Showcase.

The first screenshot will be displayed as a thumbnail.

xsi3rr4x (19 views)
2014-04-15 18:08:23

BurntPizza (15 views)
2014-04-15 03:46:01

UprightPath (28 views)
2014-04-14 17:39:50

UprightPath (13 views)
2014-04-14 17:35:47

Porlus (29 views)
2014-04-14 15:48:38

tom_mai78101 (54 views)
2014-04-10 04:04:31

BurntPizza (111 views)
2014-04-08 23:06:04

tom_mai78101 (212 views)
2014-04-05 13:34:39

trollwarrior1 (181 views)
2014-04-04 12:06:45

CJLetsGame (187 views)
2014-04-01 02:16:10
List of Learning Resources
by Longarmx
2014-04-08 03:14:44

Good Examples
by matheus23
2014-04-05 13:51:37

Good Examples
by Grunnt
2014-04-03 15:48:46

Good Examples
by Grunnt
2014-04-03 15:48:37

Good Examples
by matheus23
2014-04-01 18:40:51

Good Examples
by matheus23
2014-04-01 18:40:34

Anonymous/Local/Inner class gotchas
by Roquen
2014-03-11 15:22:30

Anonymous/Local/Inner class gotchas
by Roquen
2014-03-11 15:05:20
java-gaming.org is not responsible for the content posted by its members, including references to external websites, and other references that may or may not have a relation with our primarily gaming and game production oriented community. inquiries and complaints can be sent via email to the info‑account of the company managing the website of java‑gaming.org
Powered by MySQL Powered by PHP Powered by SMF 1.1.18 | SMF © 2013, Simple Machines | Managed by Enhanced Four Valid XHTML 1.0! Valid CSS!