Floating Point Versus Fixed Point
sunet2000

Senior Newbie

I want my mumart account back!

 « Posted 2007-06-25 21:57:26 »

Not so long ago I made a statement about how I found using fixed point arithmetic to be faster when doing linear interpolation of audio in Java. This has been bugging me ever since, as in the floating-point code I hadn't really tried to do anything fast. So I've conducted a little test to see if floating point arithmetic really is slower.

Here's the code I used, it's a simple class that generates a sine wave at the specified frequency (specified as input samples/output sample) by linear-interpolating a 1024 entry wavetable. It's not the most efficient algorithm, as there is an end-of-waveform check in the inner loop which could be hoisted with some clever arithmetic, but it's relatively easy to understand. There is an implementation of the algorithm for 15 bit fixed point, 32 bit float, and 64 bit double types.

I had stated that you need a modulo-divide to get the fractional part when using floating point arithmetic. Actually this is not true, you can just subtract the integer part instead.
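To illustrate the point (my sketch, not part of the test class below): both forms give the same fractional part, but the subtraction only costs a cast and a subtract, while `%` on doubles is an fmod-style remainder.

```java
public class FracDemo {
    public static void main(String[] args) {
        double phase = 1023.75;
        double viaModulo = phase % 1.0;           // modulo-divide: a remainder op per sample
        double viaSubtract = phase - (int) phase; // cast and subtract: much cheaper
        System.out.println(viaModulo);   // 0.75
        System.out.println(viaSubtract); // 0.75
    }
}
```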

public class Interpolator {
   private static final int
      FP_SHIFT = 15,
      FP_ONE = 1 << FP_SHIFT,
      FP_MASK = FP_ONE - 1,
      BUF_LEN = 1024,
      FP_BUF_LEN = BUF_LEN << FP_SHIFT;

   private short[] wav_buf;

   public Interpolator() {
      int idx;
      /* Initialise a sine wave as the input. */
      wav_buf = new short[ BUF_LEN + 1 ];
      for( idx = 0; idx < BUF_LEN; idx++ ) {
         wav_buf[ idx ] = ( short ) ( Math.sin( Math.PI * 2 * idx / BUF_LEN ) * 32767 );
      }
      /* Duplicate the first sample at the end so the interpolators can
         safely read wav_buf[ i + 1 ] at the last index. */
      wav_buf[ BUF_LEN ] = wav_buf[ 0 ];
   }

   public double interpolate_double( int[] mix_buf, int length, double step, double phase ) {
      int idx, i;
      double c, m, x;
      for( idx = 0; idx < length; idx++ ) {
         i = ( int ) phase;
         c = wav_buf[ i ];
         m = wav_buf[ i + 1 ] - c;
         x = phase - i;
         mix_buf[ idx ] += ( int ) ( m * x + c );
         phase += step;
         if( phase >= BUF_LEN ) {
            phase -= BUF_LEN;
         }
      }
      return phase;
   }

   public float interpolate_float( int[] mix_buf, int length, float step, float phase ) {
      int idx, i;
      float c, m, x;
      for( idx = 0; idx < length; idx++ ) {
         i = ( int ) phase;
         c = wav_buf[ i ];
         m = wav_buf[ i + 1 ] - c;
         x = phase - i;
         mix_buf[ idx ] += ( int ) ( m * x + c );
         phase += step;
         if( phase >= BUF_LEN ) {
            phase -= BUF_LEN;
         }
      }
      return phase;
   }

   public int interpolate_int( int[] mix_buf, int length, int step, int phase ) {
      int idx, i, c, m, x;
      for( idx = 0; idx < length; idx++ ) {
         i = phase >> FP_SHIFT;
         c = wav_buf[ i ];
         m = wav_buf[ i + 1 ] - c;
         x = phase & FP_MASK;
         mix_buf[ idx ] += ( m * x >> FP_SHIFT ) + c;
         phase += step;
         if( phase >= FP_BUF_LEN ) {
            phase -= FP_BUF_LEN;
         }
      }
      return phase;
   }

   public static void main( String[] args ) throws Exception {
      int idx, num_cycles;
      int phase_i, pi_i;
      float phase_f, pi_f;
      double phase_d;
      long time;
      Interpolator interpolator;
      int[] mix_buf;
      num_cycles = 100000;
      interpolator = new Interpolator();
      mix_buf = new int[ BUF_LEN ];
      phase_d = phase_f = phase_i = 0;
      pi_i = ( int ) ( Math.PI * FP_ONE );
      pi_f = ( float ) Math.PI;
      time = System.currentTimeMillis();
      for( idx = 0; idx < num_cycles; idx++ ) {
         phase_i = interpolator.interpolate_int( mix_buf, BUF_LEN, pi_i, phase_i );
      }
      System.out.println( "INT: " + ( System.currentTimeMillis() - time ) );
      time = System.currentTimeMillis();
      for( idx = 0; idx < num_cycles; idx++ ) {
         phase_f = interpolator.interpolate_float( mix_buf, BUF_LEN, pi_f, phase_f );
      }
      System.out.println( "FLOAT: " + ( System.currentTimeMillis() - time ) );
      time = System.currentTimeMillis();
      for( idx = 0; idx < num_cycles; idx++ ) {
         phase_d = interpolator.interpolate_double( mix_buf, BUF_LEN, Math.PI, phase_d );
      }
      System.out.println( "DOUBLE: " + ( System.currentTimeMillis() - time ) );
   }
}

And here are the results from my 1.5 GHz AMD Sempron (basically an Athlon XP with the same amount of cache as an Athlon Thunderbird) using Java 1.6.0. Values are in milliseconds:

INT: 1453
FLOAT: 2641
DOUBLE: 5922

So it appears that, in Java at least, integer arithmetic is faster than floating point arithmetic on my machine. I'm a bit confused as to why 32 bit floating point is so much faster than 64 bit. I only counted 6 FP values, which should all fit in the 8 x87 registers. Perhaps there is still some unavoidable register swapping that could account for it. Or maybe there are still improvements to be made to the JVM's floating point support.

Even so, the slowest result is still about 6 seconds for over 100 million samples of output. That's enough for over 300 waveforms generated in real time at 48 kHz. There are roughly 10 operations in the inner loop, which I work out to be about 170 megaflops. That's some way off the numbers given for my machine (SiSoft Sandra suggests 2700 megaflops), but for real-world performance I've got no complaints.
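A quick sanity check on that figure (my arithmetic: 100,000 cycles of 1024 samples, assuming ~10 operations per inner-loop pass and the 5.9 second DOUBLE time above):

```java
public class MflopsCheck {
    public static void main(String[] args) {
        long samples = 100000L * 1024;   // num_cycles * BUF_LEN
        long ops = samples * 10;         // ~10 ops per inner-loop pass
        double seconds = 5.922;          // DOUBLE result above
        System.out.println(ops / seconds / 1e6); // roughly 170 megaflops
    }
}
```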

Any thoughts?

Cheers,
Martin
sunet2000


 « Reply #1 - Posted 2007-06-26 11:37:48 »

Because the waveform was a sine wave in this test I thought it might be useful to see whether Math.sin() was a practical alternative, so I added the following code, which ought to be equivalent. The phase check is not really necessary (certainly not inside the loop), but in the interest of fairness I left it in:

   private static final double
      RECIP_BUF_LEN = 1.0 / BUF_LEN;

   public double sine_wave( int[] mix_buf, int length, double step, double phase ) {
      int idx;
      double two_pi;
      two_pi = Math.PI * 2;
      step *= two_pi * RECIP_BUF_LEN;
      for( idx = 0; idx < length; idx++ ) {
         mix_buf[ idx ] += ( int ) ( Math.sin( phase ) * 32767 );
         phase += step;
         if( phase > two_pi ) {
            phase -= two_pi;
         }
      }
      return phase;
   }

And the results:

INT: 1485
FLOAT: 2594
DOUBLE: 5812
SINE: 20125

So use a sine table
erikd

JGO Ninja

Medals: 16
Projects: 4
Exp: 14 years

Maximumisness

 « Reply #2 - Posted 2007-06-26 12:06:15 »

Quote
So use a sine table

heh, yeah I could've told you that without the test
I'm actually surprised using fixed point math and a table is 'only' about 13 times faster

brackeen

Junior Devvie

 « Reply #3 - Posted 2007-06-26 19:13:53 »

A lot of it has to do with converting ints to floats and floats to ints. For example, if wav_buf and mix_buf were both float[] arrays, the interpolate_float() function would be faster.

Here are the results on my machine - 2.16 GHz Intel Core 2 Duo
Mac OS X, Java 5:
INT: 931
FLOAT: 2642
DOUBLE: 2649

Windows XP, Java 6 (under Parallels):
INT: 671
FLOAT: 771
DOUBLE: 831

I don't know why the results are so different.

Edit: Results on Mac OS X + Java 5 with the -server switch:
INT: 678
FLOAT: 1081
DOUBLE: 1185

Results on Windows + Java 6  with the -server switch:
INT: 631
FLOAT: 1101
DOUBLE: 1212

keldon85

Senior Devvie

Medals: 1

 « Reply #4 - Posted 2007-06-26 20:09:11 »

I think the OS X Java is built by Apple rather than Sun, which is why there are speed differences. Once you've delved into hardware-limited platforms these speed practices become second nature: look-up tables and integer maths (without losing accuracy to premature shifting) soon come naturally; you almost have to unlearn and rethink maths.

You may also find that it makes sense to give sin/cos look-up tables a power-of-two length; degrees and radians are of no use to you here, and when you think about it you'll be much better off that way. If you are really thinking about speed, you may also want to consider cache behaviour; with audio processing you're doing the same operation over and over again on constantly changing data. You may find that 512 KB buffers work best!
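A minimal sketch of the power-of-two table idea (names and sizes are mine, not keldon85's): with a table length that is a power of two, a single bit mask replaces both the wrap check and any degrees/radians conversion in the inner loop.

```java
public class MaskedSineTable {
    static final int LEN = 1024;       // power of two
    static final int MASK = LEN - 1;   // 0x3FF
    static final short[] TABLE = new short[LEN];
    static {
        for (int i = 0; i < LEN; i++) {
            TABLE[i] = (short) (Math.sin(Math.PI * 2 * i / LEN) * 32767);
        }
    }
    // Phase is a plain int; '& MASK' wraps it for free.
    static short sine(int phase) {
        return TABLE[phase & MASK];
    }
    public static void main(String[] args) {
        System.out.println(sine(256));       // quarter cycle: 32767
        System.out.println(sine(256 + LEN)); // same value after wrap-around
    }
}
```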

erikd


 « Reply #5 - Posted 2007-06-26 20:28:18 »

On modern CPUs I think fixed point math doesn't make that much sense anymore.

Quote
I think that the OSX Java is built by Apple and not Sun which is why there are speed differences;
Hm, why would 'made by Apple instead of Sun' imply lower performance?

sunet2000


 « Reply #6 - Posted 2007-06-26 21:27:04 »

Quote
A lot of it has to do with converting ints to floats and floats to ints. For example, if wav_buf and mix_buf were both float[] arrays, the interpolate_float() function would be faster.

Here are the results on my machine - 2.16 GHz Intel Core 2 Duo
Mac OS X, Java 5:
INT: 931
FLOAT: 2642
DOUBLE: 2649

I'll look into the cost of float/int conversion. The Core 2 result is interesting. My guess is that Java is using SSE2 on that platform.

EDIT:

It's just occurred to me that this could explain the difference in performance between float and double on my machine. Since my chip only has SSE1 (which only handles single-precision floating point), the JVM might be using that for the float method, but falling back to x87 for the double method. This would also affect the performance of Math.sin().
brackeen


 « Reply #7 - Posted 2007-06-26 21:59:50 »

Quote
My guess is that Java is using SSE2 on that platform.
You're probably right (or SSE3, or whatever else). Also, looking at the results from using the -server switch, I would guess that some optimizations previously found in Java 5's HotSpot server VM got added to the HotSpot client VM in Java 6.
sunet2000


 « Reply #8 - Posted 2007-06-26 22:15:20 »

OK, so I've added the following method (and generated a float array called wav_buf_float[] ):

   public float interpolate_float( float[] mix_buf, int length, float step, float phase ) {
      int idx, i;
      float c, m, x;
      for( idx = 0; idx < length; idx++ ) {
         i = ( int ) phase;
         c = wav_buf_float[ i ];
         m = wav_buf_float[ i + 1 ] - c;
         x = phase - i;
         mix_buf[ idx ] += m * x + c;
         phase += step;
         if( phase >= BUF_LEN ) {
            phase -= BUF_LEN;
         }
      }
      return phase;
   }

The results are as follows (FLOAT2 is the result for the float-array method):

INT: 1516
FLOAT: 2547
FLOAT2: 2406
DOUBLE: 6078
SINE: 20703

So there is an extra cost to float/int conversion, but it only amounts to 5% or so. It's nice to know that for the basic operations floating point is not far off the performance of integer code.

Quote
On modern CPUs I think fixed point math doesn't make that much sense anymore.

Yeah, although if you don't mind the limited accuracy it can sometimes provide some extra performance.
brackeen


 « Reply #9 - Posted 2007-06-26 23:32:04 »

I really should be working, but for whatever reason I can't stop myself from being interested in this stuff.

So I tried a third float interpolate method, which uses fixed point for step & phase, but floats for the buffers. The only int-to-float conversion is setting x:

   public int interpolate_float3( float[] mix_buf_float, int length, int step, int phase ) {
      for( int idx = 0; idx < length; idx++ ) {
         int i = phase >> FP_SHIFT;
         float c = wav_buf_float[ i ];
         float m = wav_buf_float[ i + 1 ] - c;
         float x = phase & FP_MASK;
         mix_buf_float[ idx ] += ( m * x / FP_ONE + c );
         phase += step;
         if( phase >= FP_BUF_LEN ) {
            phase -= FP_BUF_LEN;
         }
      }
      return phase;
   }

Mac Java 5 results:
INT: 953
FLOAT: 2667
FLOAT2: 1692
FLOAT3: 1108
DOUBLE: 2676

Windows Java 6 results:
INT: 660
FLOAT: 732
FLOAT2: 811
FLOAT3: 661
DOUBLE: 821
keldon85


 « Reply #10 - Posted 2007-06-27 00:34:40 »

Quote
On modern CPUs I think fixed point math doesn't make that much sense anymore.
Quote
Hm, why would 'made by Apple instead of Sun' imply lower performance?
I wasn't implying that Apple's implementation must be slower, just stating that it is a different implementation altogether.

And although you may find that additions and multiplications are faster with integers, I find that floating point divides are much quicker on my system (Athlon 64).

   // reciprocal division
   public int interpolate_float4( float[] mix_buf_float, int length, int step, int phase ) {
      float fp_one_rec = 1f / FP_ONE;   // reciprocal
      for( int idx = 0; idx < length; idx++ ) {
         int i = phase >> FP_SHIFT;
         float c = wav_buf_float[ i ];
         float m = wav_buf_float[ i + 1 ] - c;
         float x = phase & FP_MASK;
         mix_buf_float[ idx ] += ( m * x * fp_one_rec + c );
         phase += step;
         if( phase >= FP_BUF_LEN ) {
            phase -= FP_BUF_LEN;
         }
      }
      return phase;
   }

Windows Java 6 results (Athlon 64 3200+):
INT: 781
FLOAT 1: 2235
FLOAT 2: 1390
FLOAT 3: 1328
FLOAT 4: 1157
DOUBLE: 2000

brackeen


 « Reply #11 - Posted 2007-06-27 01:26:51 »

I think the main bottleneck with interpolate_float2 is the float-to-int conversion. I'm not sure if it's a HotSpot issue, an Intel issue, or just an issue in general.

I tried interpolate_float4 - didn't see an improvement on my Intel machine, but it's interesting to see the speed up on AMD.

Edit: Seriously, I've got typo-fever today.
keldon85


 « Reply #12 - Posted 2007-06-27 06:59:32 »

What's striking is the difference in the differences (if you get what I mean). Floats are much slower on my processor, whereas the integer timings show little difference between the two machines. Having said that, are you running a dual core processor?

sunet2000


 « Reply #13 - Posted 2007-06-27 13:52:07 »

Quote
I think the main bottleneck with interpolate_float2 is the float-to-int conversion.

Perhaps on Intel chips. On my machine the conversion has a cost, but it's relatively small. I found I could actually get more performance by using a float mix_buf[], but keeping the short wav_buf[]. Perhaps the memory bandwidth reduction made up for the difference. Another possibility is that int->float is fast, but float->int is slower.
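The two conversion directions can be timed separately with a crude micro-benchmark along these lines (mine, not from the thread; HotSpot warm-up and dead-code elimination make such numbers rough at best, so treat them as indicative only):

```java
public class CastCost {
    public static void main(String[] args) {
        final int N = 50000000;
        float fsum = 0f;
        long t = System.currentTimeMillis();
        for (int i = 0; i < N; i++) {
            fsum += i;                 // int -> float conversion each pass
        }
        System.out.println("int->float: " + (System.currentTimeMillis() - t) + " ms");
        int isum = 0;
        float f = 0f;
        t = System.currentTimeMillis();
        for (int i = 0; i < N; i++) {
            isum += (int) f;           // float -> int conversion each pass
            f += 1.5f;
        }
        System.out.println("float->int: " + (System.currentTimeMillis() - t) + " ms");
        System.out.println(fsum + " " + isum); // keep both sums live for the JIT
    }
}
```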

sunet2000


 « Reply #14 - Posted 2007-06-27 13:56:45 »

Quote
So I tried a third float interpolate method, which uses fixed point for step & phase, but floats for the buffers. The only int-to-float conversion is setting x:

Yeah, I've used that method in some previous code. Didn't benchmark it, but I did assume it would be faster than indexing through a floating point value.

I can confirm though that rounding-and-subtracting to get the fractional part is a lot faster than modulo-division.
keldon85


 « Reply #15 - Posted 2007-06-27 16:42:41 »

Quote
I can confirm though that rounding-and-subtracting to get the fractional part is a lot faster than modulo-division.

Well naturally, bearing in mind that modulo-division requires a division!

