Java-Gaming.org    
Featured games (79)
games approved by the League of Dukes
Games in Showcase (477)
Games in Android Showcase (107)
games submitted by our members
Games in WIP (534)
games currently in development
News: Read the Java Gaming Resources, or peek at the official Java tutorials
 
    Home     Help   Search   Login   Register   
Pages: [1]
  ignore  |  Print  
  Floating Point Versus Fixed Point  (Read 3480 times)
0 Members and 1 Guest are viewing this topic.
Offline sunet2000

Senior Newbie




I want my mumart account back!


« Posted 2007-06-25 23:57:26 »

Not so long ago I made a statement about how I found using fixed point arithmetic to be faster when doing linear interpolation of audio in Java. This has been bugging me ever since, as in the floating-point code I hadn't really tried to do anything fast. So I've conducted a little test to see if floating point arithmetic really is slower.

Here's the code I used, it's a simple class that generates a sine wave at the specified frequency (specified as input samples/output sample) by linear-interpolating a 1024 entry wavetable. It's not the most efficient algorithm, as there is an end-of-waveform check in the inner loop which could be hoisted with some clever arithmetic, but it's relatively easy to understand. There is an implementation of the algorithm for 15 bit fixed point, 32 bit float, and 64 bit double types.

I had stated that you need a modulo-divide to get the fractional part when using floating point arithmetic. Actually this is not true, you can just subtract the integer part instead.

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
28  
29  
30  
31  
32  
33  
34  
35  
36  
37  
38  
39  
40  
41  
42  
43  
44  
45  
46  
47  
48  
49  
50  
51  
52  
53  
54  
55  
56  
57  
58  
59  
60  
61  
62  
63  
64  
65  
66  
67  
68  
69  
70  
71  
72  
73  
74  
75  
76  
77  
78  
79  
80  
81  
82  
83  
84  
85  
86  
87  
88  
89  
90  
91  
92  
93  
94  
95  
96  
97  
98  
99  
100  
101  
102  
103  
104  
105  
106  
public class Interpolator {
   private static final int
      FP_SHIFT = 15,
      FP_ONE = 1 << FP_SHIFT,
      FP_MASK = FP_ONE - 1,
      BUF_LEN = 1024,
      FP_BUF_LEN = BUF_LEN << FP_SHIFT;

   private short[] wav_buf;

   public Interpolator() {
      int idx;
      /* Initialise a sine wave as the input.*/
      wav_buf = new short[ BUF_LEN + 1 ];
      for( idx = 0; idx < BUF_LEN; idx++ ) {
         wav_buf[ idx ] = ( short ) ( Math.sin( Math.PI * 2 * idx / BUF_LEN ) * 32767 );
      }
      wav_buf[ 0 ] = wav_buf[ BUF_LEN ];
   }

   public double interpolate_double( int[] mix_buf, int length, double step, double phase ) {
      int idx, i;
      double c, m, x;
      for( idx = 0; idx < length; idx++ ) {
         i = ( int ) phase;
         c = wav_buf[ i ];
         m = wav_buf[ i + 1 ] - c;
         x = phase - i;
         mix_buf[ idx ] += ( int ) ( m * x + c );
         phase += step;
         if( phase >= BUF_LEN ) {
            phase -= BUF_LEN;
         }
      }
      return phase;
   }

   public float interpolate_float( int[] mix_buf, int length, float step, float phase ) {
      int idx, i;
      float c, m, x;
      for( idx = 0; idx < length; idx++ ) {
         i = ( int ) phase;
         c = wav_buf[ i ];
         m = wav_buf[ i + 1 ] - c;
         x = phase - i;
         mix_buf[ idx ] += ( int ) ( m * x + c );
         phase += step;
         if( phase >= BUF_LEN ) {
            phase -= BUF_LEN;
         }
      }
      return phase;
   }

   public int interpolate_int( int[] mix_buf, int length, int step, int phase ) {
      int idx, i, c, m, x;
      for( idx = 0; idx < length; idx++ ) {
         i = phase >> FP_SHIFT;
         c = wav_buf[ i ];
         m = wav_buf[ i + 1 ] - c;
         x = phase & FP_MASK;
         mix_buf[ idx ] += ( m * x >> FP_SHIFT ) + c;
         phase += step;
         if( phase >= FP_BUF_LEN ) {
            phase -= FP_BUF_LEN;
         }
      }
      return phase;
   }

   public static void main( String[] args ) throws Exception {
      int idx, num_cycles;
      int phase_i, pi_i;
      float phase_f, pi_f;
      double phase_d;
      long time;
      Interpolator interpolator;
      int[] mix_buf;

      num_cycles = 100000;

      interpolator = new Interpolator();
      mix_buf = new int[ BUF_LEN ];
      phase_d = phase_f = phase_i = 0;
      pi_i = ( int ) ( Math.PI * FP_ONE );
      pi_f = ( float ) Math.PI;

      time = System.currentTimeMillis();
      for( idx = 0; idx < num_cycles; idx++ ) {
         phase_i = interpolator.interpolate_int( mix_buf, BUF_LEN, pi_i, phase_i );
      }
      System.out.println( "INT: " + ( System.currentTimeMillis() - time ) );

      time = System.currentTimeMillis();
      for( idx = 0; idx < num_cycles; idx++ ) {
         phase_f = interpolator.interpolate_float( mix_buf, BUF_LEN, pi_f, phase_f );
      }
      System.out.println( "FLOAT: " + ( System.currentTimeMillis() - time ) );

      time = System.currentTimeMillis();
      for( idx = 0; idx < num_cycles; idx++ ) {
         phase_d = interpolator.interpolate_double( mix_buf, BUF_LEN, Math.PI, phase_d );
      }
      System.out.println( "DOUBLE: " + ( System.currentTimeMillis() - time ) );
   }
}


And here are the results from my 1.5Ghz AMD Sempron (basically an Athlon XP with the same amount of cache as an Athlon Thunderbird) using Java 1.6.0. Values are in milliseconds:

INT: 1453
FLOAT: 2641
DOUBLE: 5922

So it appears that, in Java at least, integer arithmetic is faster than floating point arithmetic on my machine. I'm a bit confused as to why 32 bit floating point is so much faster than 64 bit. I only counted 6 FP values, which should all fit in the 7 x87 registers. Perhaps there is still some unavoidable register swapping that could account for it. Or maybe there are still improvements to be made to the JVM's floating point support Smiley

Even so, the slowest result is still 6 seconds for over 100 MILLION samples of output. That's enough for over 300 waveforms generated in real-time at 48khz. There are roughly 10 operations in the inner loop, which I work out to be about 170 megaflops. Some way off the numbers given for my machine (Sisoft Sandra suggests 2700 megaflops) but for real-world performance I've got no complaints Smiley

Any thoughts?

Cheers,
Martin
Offline sunet2000

Senior Newbie




I want my mumart account back!


« Reply #1 - Posted 2007-06-26 13:37:48 »

Because the waveform was a sine wave in this test I thought it might be useful to see whether Math.sin() was a practical alternative, so I added the following code, which ought to be equivalent. The phase check is not really necessary (certainly not inside the loop), but in the interest of fairness I left it in:

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
   private static final double
      RECIP_BUF_LEN = 1.0 / BUF_LEN;

   public double sine_wave( int[] mix_buf, int length, double step, double phase ) {
      int idx;
      double two_pi;
      two_pi = Math.PI * 2;
      step *= two_pi * RECIP_BUF_LEN;
      for( idx = 0; idx < length; idx++ ) {
         mix_buf[ idx ] += ( int ) ( Math.sin( phase ) * 32767 );
         phase += step;
         if( phase > two_pi ) {
            phase -= two_pi;
         }
      }
      return phase;
   }


And the results:

1  
2  
3  
4  
INT: 1485
FLOAT: 2594
DOUBLE: 5812
SINE: 20125


So use a sine table Smiley
Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #2 - Posted 2007-06-26 14:06:15 »

Quote
So use a sine table  Smiley

heh, yeah I could've told you that without the test  Wink
I'm actually surprised using fixed point math and a table is 'only' about 13 times faster  Grin

Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline brackeen

Junior Member





« Reply #3 - Posted 2007-06-26 21:13:53 »

A lot of it has to do with converting ints to floats and floats to ints. For example, if wav_buf and mix_buf were both float[] arrays, the interpolate_float() function would be faster.

Here are the results on my machine - 2.16 GHz Intel Core Duo 2
Mac OS X, Java 5:
INT: 931
FLOAT: 2642
DOUBLE: 2649

Windows XP, Java 6 (under Parallels):
INT: 671
FLOAT: 771
DOUBLE: 831

I don't know why the results are so different.

Edit: Results on Mac OS X + Java 5 with the -server switch:
INT: 678
FLOAT: 1081
DOUBLE: 1185

Results on Windows + Java 6  with the -server switch:
INT: 631
FLOAT: 1101
DOUBLE: 1212

Offline keldon85

Senior Member


Medals: 1



« Reply #4 - Posted 2007-06-26 22:09:11 »

I think that the OSX Java is built by Apple and not Sun which is why there are speed differences; but when you've delved into hardware-limited platforms these common speed practises become second nature. Look up tables, integer maths (and not losing accuracy from premature shifting) soon come naturally, you almost have to unlearn and rethink maths.

You may also find that it makes sense to have sin/cos look up tables with the length of powers of two's; degrees and radians have no use to you in any way and when you think about it you'll be much better off that way. If you are really thinking about speed then you may also want to consider cache hits also; with audio processing you're doing the same operation over and over again on constantly changing data. You may find that 512kb sized buffers works best!

Offline erikd

JGO Ninja


Medals: 16
Projects: 4
Exp: 14 years


Maximumisness


« Reply #5 - Posted 2007-06-26 22:28:18 »

On modern CPU's I think fixed point math doesn't make that much sense anymore.

Quote
I think that the OSX Java is built by Apple and not Sun which is why there are speed differences;
Hm, why would 'made by Apple instead of Sun' imply lower performance?

Offline sunet2000

Senior Newbie




I want my mumart account back!


« Reply #6 - Posted 2007-06-26 23:27:04 »

A lot of it has to do with converting ints to floats and floats to ints. For example, if wav_buf and mix_buf were both float[] arrays, the interpolate_float() function would be faster.

Here are the results on my machine - 2.16 GHz Intel Core Duo 2
Mac OS X, Java 5:
INT: 931
FLOAT: 2642
DOUBLE: 2649

I'll look into the cost of float/int conversion. The Core 2 result is interesting. My guess is that Java is using SSE2 on that platform.

EDIT:

It's just occured to me that this could explain the difference in performance between float and double on my machine. Since my chip only has SSE1 (which only deals with single-precision floating point), the JVM might be using that for the float method, but falling back to the x87 for the double method. This would also affect the performance of Math.sin().
Offline brackeen

Junior Member





« Reply #7 - Posted 2007-06-26 23:59:50 »

Quote
My guess is that Java is using SSE2 on that platform.
You're probably right (or SSE3, or whatever else). Also, looking at the results from using the -server switch, I would guess that some optimizations previously found in Java 5's HotSpot server VM got added to the HotSpot client VM in Java 6.
Offline sunet2000

Senior Newbie




I want my mumart account back!


« Reply #8 - Posted 2007-06-27 00:15:20 »

OK, so I've added the following method (and generated a float array called wav_buf_float[] ):

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
   public float interpolate_float( float[] mix_buf, int length, float step, float phase ) {
      int idx, i;
      float c, m, x;
      for( idx = 0; idx < length; idx++ ) {
         i = ( int ) phase;
         c = wav_buf_float[ i ];
         m = wav_buf_float[ i + 1 ] - c;
         x = phase - i;
         mix_buf[ idx ] += m * x + c;
         phase += step;
         if( phase >= BUF_LEN ) {
            phase -= BUF_LEN;
         }
      }
      return phase;
   }


The results are as follows (FLOAT2 is the result for the float-array method):

1  
2  
3  
4  
5  
INT: 1516
FLOAT: 2547
FLOAT2: 2406
DOUBLE: 6078
SINE: 20703


So there is an extra cost to float/int conversion, but it only amounts to 5% or so. It's nice to know that for the basic operations floating point is not far off the performance of integer code.

On modern CPU's I think fixed point math doesn't make that much sense anymore.

Yeah, although if you don't mind the limited accuracy it can sometimes provide some extra performance.
Offline brackeen

Junior Member





« Reply #9 - Posted 2007-06-27 01:32:04 »

I really should be working, but for whatever reason I can't stop myself for being interested in this stuff Smiley

So I tried a third float interpolate method, which uses fixed point for step & phase, but floats for the buffers. The only int-to-float conversion is setting x:

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
        public int interpolate_float3( float[] mix_buf_float, int length, int step, int phase ) {
      for( int idx = 0; idx < length; idx++ ) {
         int i = phase >> FP_SHIFT;
         float c = wav_buf_float[ i ];
         float m = wav_buf_float[ i + 1 ] - c;
         float x = phase & FP_MASK;
         mix_buf_float[ idx ] += ( m * x / FP_ONE + c );
         phase += step;
         if( phase >= FP_BUF_LEN ) {
            phase -= FP_BUF_LEN;
         }
      }
      return phase;
   }


Mac Java 5 results:
INT: 953
FLOAT: 2667
FLOAT2: 1692
FLOAT3: 1108
DOUBLE: 2676

Windows Java 6 results:
INT: 660
FLOAT: 732
FLOAT2: 811
FLOAT3: 661
DOUBLE: 821
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline keldon85

Senior Member


Medals: 1



« Reply #10 - Posted 2007-06-27 02:34:40 »

On modern CPU's I think fixed point math doesn't make that much sense anymore.
Hm, why would 'made by Apple instead of Sun' imply lower performance?
I wasn't implying that Apple's implementation must be slower, just stating that it is a different implementation altogether.

And although you find that addition/multiplications may be faster with integers, I find that floating point divides are much quicker on my system (Athlon XP 64).

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
 // reciprical divisions

public int interpolate_float4( float[] mix_buf_float, int length, int step, int phase ) {
   float fp_one_rec = 1f/FP_ONE;   // reciprical
  for( int idx = 0; idx < length; idx++ ) {
      int i = phase >> FP_SHIFT;
      float c = wav_buf_float[ i ];
      float m = wav_buf_float[ i + 1 ] - c;
      float x = phase & FP_MASK;
      mix_buf_float[ idx ] += ( m * x * fp_one_rec + c );
      phase += step;
      if( phase >= FP_BUF_LEN ) {
         phase -= FP_BUF_LEN;
      }
   }
   return phase;
}


Windows Java 6 results (Athlon XP 64+ 3200):
INT: 781
FLOAT 1: 2235
FLOAT 2: 1390
FLOAT 3: 1328
FLOAT 4: 1157
DOUBLE: 2000

Offline brackeen

Junior Member





« Reply #11 - Posted 2007-06-27 03:26:51 »

I think the main bottleneck with interpolate_float2 is the float-to-int conversion. I'm not sure if it's a HotSpot issue, an Intel issue, or just an issue in general.

I tried interpolate_float4 - didn't see an improvement on my Intel machine, but it's interesting to see the speed up on AMD.

Edit: Seriously, I've got typo-fever today.
Offline keldon85

Senior Member


Medals: 1



« Reply #12 - Posted 2007-06-27 08:59:32 »

What's striking is the difference in the differences (if you get what I mean). Floats are much slower on my processor whereas integer multiplications tend to have little difference between the two. Having said that are you running a dual core processor?

Offline sunet2000

Senior Newbie




I want my mumart account back!


« Reply #13 - Posted 2007-06-27 15:52:07 »

I think the main bottleneck with interpolate_float2 is the float-to-int conversion.

Perhaps on Intel chips. On my machine the conversion has a cost, but it's relatively small. I found I could actually get more performance by using a float mix_buf[], but keeping the short wav_buf[]. Perhaps the memory bandwidth reduction made up for the difference. Another possibility is that int->float is fast, but float->int is slower.

EDIT: Sorry, I wasn't reading your post carefully enough. I think you made the same point Smiley
Offline sunet2000

Senior Newbie




I want my mumart account back!


« Reply #14 - Posted 2007-06-27 15:56:45 »

So I tried a third float interpolate method, which uses fixed point for step & phase, but floats for the buffers. The only int-to-float conversion is setting x:

Yeah, I've used that method in some previous code. Didn't benchmark it, but I did assume it would be faster than indexing through a floating point value.

I can confirm though that rounding-and-subtracting to get the fractional part is a lot faster than modulo-division.
Offline keldon85

Senior Member


Medals: 1



« Reply #15 - Posted 2007-06-27 18:42:41 »


I can confirm though that rounding-and-subtracting to get the fractional part is a lot faster than modulo-division.

Well naturally, bearing in mind that modulo-division requires a division!

Pages: [1]
  ignore  |  Print  
 
 
You cannot reply to this message, because it is very, very old.

 

Add your game by posting it in the WIP section,
or publish it in Showcase.

The first screenshot will be displayed as a thumbnail.

pw (35 views)
2014-07-24 01:59:36

Riven (33 views)
2014-07-23 21:16:32

Riven (21 views)
2014-07-23 21:07:15

Riven (24 views)
2014-07-23 20:56:16

ctomni231 (55 views)
2014-07-18 06:55:21

Zero Volt (47 views)
2014-07-17 23:47:54

danieldean (38 views)
2014-07-17 23:41:23

MustardPeter (43 views)
2014-07-16 23:30:00

Cero (59 views)
2014-07-16 00:42:17

Riven (56 views)
2014-07-14 18:02:53
HotSpot Options
by dleskov
2014-07-08 03:59:08

Java and Game Development Tutorials
by SwordsMiner
2014-06-14 00:58:24

Java and Game Development Tutorials
by SwordsMiner
2014-06-14 00:47:22

How do I start Java Game Development?
by ra4king
2014-05-17 11:13:37

HotSpot Options
by Roquen
2014-05-15 09:59:54

HotSpot Options
by Roquen
2014-05-06 15:03:10

Escape Analysis
by Roquen
2014-04-29 22:16:43

Experimental Toys
by Roquen
2014-04-28 13:24:22
java-gaming.org is not responsible for the content posted by its members, including references to external websites, and other references that may or may not have a relation with our primarily gaming and game production oriented community. inquiries and complaints can be sent via email to the info‑account of the company managing the website of java‑gaming.org
Powered by MySQL Powered by PHP Powered by SMF 1.1.18 | SMF © 2013, Simple Machines | Managed by Enhanced Four Valid XHTML 1.0! Valid CSS!