Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Posted
2003-02-26 17:04:57 » |
|
Could it be possible to add this into the standard lwgl package? This is the c++ way: 1 2 3 4 5 6 7 8 9
| float InvSqrt (float x) { float xhalf = 0.5f*x; int i = *(int*)&x; i = 0x5f3759df - (i >> 1); x = *(float*)&i; x = x*(1.5f - xhalf*x*x); return x; } |
I was recently asking this
|
|
|
|
|
Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Reply #1 - Posted
2003-02-26 17:13:47 » |
|
And the Lomont way, which is supposebly better, I'm just picking these up from [/code] float InvSqrt_Lomont(float x) { float xhalf = 0.5f*x; int i = *(int*)&x; i = 0x5f375a86 - (i>>1); x = *(float*)&i; x = x*(1.5f-xhalf*x*x); // x = x*(1.5f-xhalf*x*x); // added precision return x; } // InvSqrt_Lomont
[code]
I'm not sure if this can really boost anything because of native calls, or even if this can be converted to java, but I think its worth a try.
EDIT: This is obviously in wrong place, if admin sees this please move to lwgl forum.
|
|
|
|
|
princec
|
 |
«
Reply #2 - Posted
2003-02-26 18:47:16 » |
|
Seems like a fine addition to me. If you've got the time some performance comparisons would be nice in real world situations (eg. normalise an array of 1,000,000 vector3fs). Cas 
|
|
|
|
Games published by our own members! Check 'em out!
|
|
Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Reply #3 - Posted
2003-02-27 09:15:11 » |
|
My problem is that I don't have decent c++ compiler available on my home computer. So it mean it will take at least 2 weeks to get back to home and then get c++ compiler.
Anyone here who has the possibility of doing this??
|
|
|
|
|
princec
|
 |
«
Reply #4 - Posted
2003-02-27 10:08:56 » |
|
I didn't see any reason for it not to be implemented in Java.... Cas 
|
|
|
|
Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Reply #5 - Posted
2003-02-27 10:43:26 » |
|
Umm what? I thought you can't do "binary logic" in java? Now what exactly have I missed? For example I have avsolutely no idea how to express if this is achievable then of course I will do the tests!
|
|
|
|
|
alexz
Senior Newbie 
Java games rock!
|
 |
«
Reply #6 - Posted
2003-02-27 10:53:19 » |
|
IMO you can do this via NIO (direct ByteBuffer with IntBuffer and FloatBuffer attached)... 
|
|
|
|
|
NielsJ
Senior Newbie 
Java games rock!
|
 |
«
Reply #7 - Posted
2003-02-27 11:53:20 » |
|
hmm.. I'm not sure I see the point in involving buffers - it's a simple float-in, float-out method call  ... Depending on the implementation of FloatToIntBits and IntBitsToFloat, a pure java implementation might also be viable. --- modified: sorry, didn't read all posts  , the code in java would be: float f = 2; int x = Float.floatToIntBits(f); f = Float.intBitsToFloat(x);
|
|
|
|
|
Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Reply #8 - Posted
2003-02-27 12:23:14 » |
|
hmm.. I'm not sure I see the point in involving buffers - it's a simple float-in, float-out method call  ... Depending on the implementation of FloatToIntBits and IntBitsToFloat, a pure java implementation might also be viable. --- modified: sorry, didn't read all posts  , the code in java would be: float f = 2; int x = Float.floatToIntBits(f); f = Float.intBitsToFloat(x); I don't think this works... when used in the method I get .33295277 from 9... This is well beyond my knowledge, my initial thought was to call it through JNI, but if someone can do pure or "somewhat" pure java implementation it would be better, wouldn't it?
|
|
|
|
|
alexz
Senior Newbie 
Java games rock!
|
 |
«
Reply #9 - Posted
2003-02-27 13:42:11 » |
|
Ok! I've done this quick C++ to Java conversion...  Here are two test functions: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
| private static void testInvSqrt() { Vector3f v = new Vector3f(1, 2, 3); float sum = 0;
long start = System.currentTimeMillis(); for (int i = 0; i < 1000000; ++i) { v.x += i * 0.001f; v.y += i * 0.001f; v.z += i * 0.001f; sum += (float)(1.0 / Math.sqrt(v.x * v.x + v.y * v.y + v.z * v.z)); } long end = System.currentTimeMillis(); System.out.println("1.0 / Math.sqrt(): " + (end - start) + "ms, sum = " + sum); }
private static void testInvSqrtNEW() { Vector3f v = new Vector3f(1, 2, 3); float sum = 0;
long start = System.currentTimeMillis(); for (int i = 0; i < 1000000; ++i) { v.x += i * 0.001f; v.y += i * 0.001f; v.z += i * 0.001f; sum += newInvSqrt(v.x * v.x + v.y * v.y + v.z * v.z); } long end = System.currentTimeMillis(); System.out.println("newInvSqrt(): " + (end - start) + "ms, sum = " + sum); } |
Here is the newInvSqrt() implementation: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
| static ByteBuffer byteBuf = ByteBuffer.allocateDirect(4) .order(ByteOrder.nativeOrder()); static IntBuffer intBuf = byteBuf.asIntBuffer(); static FloatBuffer floatBuf = byteBuf.asFloatBuffer();
private static int f2i(float f) { floatBuf.put(0, f); return intBuf.get(0); }
private static float i2f(int i) { intBuf.put(0, i); return floatBuf.get(0); }
private static float newInvSqrt(float x) { float xhalf = 0.5f*x; int i = f2i(x); i = 0x5f3759df - (i >> 1); x = i2f(i); x = x*(1.5f - xhalf*x*x); return x; } |
Here is the test invocation code: 1 2 3 4 5
| for (int i = 0; i < 10; ++i) { testInvSqrt(); testInvSqrtNEW(); } |
Here is the result: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
| 1.0 / Math.sqrt(): 271ms, sum = 27.828238 newInvSqrt(): 280ms, sum = 27.803202 1.0 / Math.sqrt(): 261ms, sum = 27.828238 newInvSqrt(): 160ms, sum = 27.803202 1.0 / Math.sqrt(): 270ms, sum = 27.828238 newInvSqrt(): 160ms, sum = 27.803202 1.0 / Math.sqrt(): 261ms, sum = 27.828238 newInvSqrt(): 160ms, sum = 27.803202 1.0 / Math.sqrt(): 260ms, sum = 27.828238 newInvSqrt(): 161ms, sum = 27.803202 1.0 / Math.sqrt(): 270ms, sum = 27.828238 newInvSqrt(): 160ms, sum = 27.803202 1.0 / Math.sqrt(): 261ms, sum = 27.828238 newInvSqrt(): 160ms, sum = 27.803202 1.0 / Math.sqrt(): 280ms, sum = 27.828238 newInvSqrt(): 171ms, sum = 27.803202 1.0 / Math.sqrt(): 260ms, sum = 27.828238 newInvSqrt(): 160ms, sum = 27.803202 1.0 / Math.sqrt(): 271ms, sum = 27.828238 newInvSqrt(): 160ms, sum = 27.803202 |
I think the code is clear and doesn't need my explanation... 
|
|
|
|
|
Games published by our own members! Check 'em out!
|
|
alexz
Senior Newbie 
Java games rock!
|
 |
«
Reply #10 - Posted
2003-02-27 13:46:52 » |
|
The code tag skewed my code line indents... 
|
|
|
|
|
alexz
Senior Newbie 
Java games rock!
|
 |
«
Reply #11 - Posted
2003-02-27 13:54:29 » |
|
The result of using native Float.floatToIntBits() and Float.intBitsToFloat() instead of NIO: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
| 1.0 / Math.sqrt(): 270ms, sum = 27.828238 newInvSqrt(): 441ms, sum = 27.803202 1.0 / Math.sqrt(): 250ms, sum = 27.828238 newInvSqrt(): 391ms, sum = 27.803202 1.0 / Math.sqrt(): 250ms, sum = 27.828238 newInvSqrt(): 381ms, sum = 27.803202 1.0 / Math.sqrt(): 260ms, sum = 27.828238 newInvSqrt(): 381ms, sum = 27.803202 1.0 / Math.sqrt(): 260ms, sum = 27.828238 newInvSqrt(): 381ms, sum = 27.803202 1.0 / Math.sqrt(): 250ms, sum = 27.828238 newInvSqrt(): 391ms, sum = 27.803202 1.0 / Math.sqrt(): 250ms, sum = 27.828238 newInvSqrt(): 391ms, sum = 27.803202 1.0 / Math.sqrt(): 250ms, sum = 27.828238 newInvSqrt(): 391ms, sum = 27.803202 1.0 / Math.sqrt(): 250ms, sum = 27.828238 newInvSqrt(): 391ms, sum = 27.803202 1.0 / Math.sqrt(): 250ms, sum = 27.828238 newInvSqrt(): 380ms, sum = 27.803202 |
|
|
|
|
|
cfmdobbie
|
 |
«
Reply #12 - Posted
2003-02-27 14:08:32 » |
|
Hrm, I thought Newton-Raphson required guessing a possible result, then looping refining your guess until the required precision is reached? The results I get with the following conversion of the above code is pretty good, accurate to two dp initially, but slipping thereafter: 1 2 3 4 5 6 7 8 9
| public static float sqrt(float x) { float xhalf = 0.5f * x ; int i = Float.floatToIntBits(x) ; i = 0x5f375a86 - (i >> 1) ; x = Float.intBitsToFloat(i) ; x *= (1.5f - xhalf * x * x) ; return 1/x ; } |
What's the name of this algorithm, and where did you find it?
|
Hellomynameis Charlie Dobbie.
|
|
|
NielsJ
Senior Newbie 
Java games rock!
|
 |
«
Reply #13 - Posted
2003-02-27 14:34:32 » |
|
Amazing performance from the Float.xxxBits methods, eh?
How can it possibly be slower then NIO? It should at the very least be the same.
|
|
|
|
|
Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Reply #14 - Posted
2003-02-27 14:35:48 » |
|
Good job alexz, I'm getting vastly different performance, in a loop I get java.lang.Math.sqrt(): 1862 fos3d.util.Math.invSqrt(): 7331 With code like this 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
| long starttime = 0, entime = 0; starttime = System.currentTimeMillis(); for(int i = 0; i<100000000; i++){ float f = 1/(float)java.lang.Math.sqrt(9d); } entime = System.currentTimeMillis(); System.out.println("JAVA FUNC: " + (entime - starttime) ); float f = 0; long starttime2 = 0, entime2 = 0; starttime2 = System.currentTimeMillis(); for(int i = 0; i<100000000; i++){ f = Math.sqrt(9f); } entime2 = System.currentTimeMillis(); System.out.println("My SHIT FUNC: " + (entime2 - starttime2) ); |
And the results are not what they are supposed to resemple. From 9f i get 2.996579, which is pretty good.
|
|
|
|
|
princec
|
 |
«
Reply #15 - Posted
2003-02-27 15:02:46 » |
|
I like this new sqrt method so much I've added it to the LWJGL Math class. It's accurate to at least 3dp and the accuracy can be tuned by simply repeating the last line of the equation over and over as many times as you like. What's the theory behind it? Cas 
|
|
|
|
Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Reply #16 - Posted
2003-02-27 15:39:03 » |
|
I like this new sqrt method so much I've added it to the LWJGL Math class. It's accurate to at least 3dp and the accuracy can be tuned by simply repeating the last line of the equation over and over as many times as you like. What's the theory behind it? Cas  Awesome, I wish I could explain.
|
|
|
|
|
qonjure
Junior Newbie
When nature calls we all shall drown.
|
 |
«
Reply #17 - Posted
2003-02-27 20:03:01 » |
|
|
|
|
|
|
alexz
Senior Newbie 
Java games rock!
|
 |
«
Reply #18 - Posted
2003-02-27 21:20:14 » |
|
Ooops... Do you remember my previous result? 1.0 / Math.sqrt(): 261ms, sum = 27.828238 newInvSqrt(): 160ms, sum = 27.803202
This was done on P3 600 MHz (JDK 1.4.1)... And here is the result on P4 1.6 GHz (JDK 1.4.1): 1.0 / Math.sqrt(): 70ms, sum = 27.828238 newInvSqrt(): 110ms, sum = 27.803202
Magic! Isn't it? 
|
|
|
|
|
Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Reply #19 - Posted
2003-02-28 09:07:10 » |
|
I get same kind of result, my Athlon 1.33ghz gets ~1100ms from math square and about ~7400ms from inv sqrt.
|
|
|
|
|
Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Reply #20 - Posted
2003-02-28 09:08:02 » |
|
Are you using JRE 1.4.1 with JIT???
|
|
|
|
|
ryanm
« League of Dukes » Senior Member    Projects: 1
Used to be bleb
|
 |
«
Reply #21 - Posted
2003-02-28 09:20:51 » |
|
1.0 / Math.sqrt(): 261ms, sum = 27.828238 newInvSqrt(): 160ms, sum = 27.803202
This was done on P3 600 MHz (JDK 1.4.1)... And here is the result on P4 1.6 GHz (JDK 1.4.1):
1.0 / Math.sqrt(): 70ms, sum = 27.828238 newInvSqrt(): 110ms, sum = 27.803202
looks like the Math.sqrt() is being compiled to some fancy-dan hardware sqrt operation on newer processors. It's nice to know that the JVM does this where it can, but also a pain in the arse if you want top performance across a wide range of target hardware. At last! My methodology of coding to the api, and avoiding nasty hacks, in the slim hope that Sun would someday optimise it for me, is vindicated:D. Sort of :-/.
|
|
|
|
|
alexz
Senior Newbie 
Java games rock!
|
 |
«
Reply #22 - Posted
2003-02-28 10:08:50 » |
|
Yes, I used JRE with JIT... Actually there are lot of C++ optimization tricks, which don't work in Java (didn't at all or did work prior to some "point")... I believe that JVM has special path for some of java.lang.Math methods as well as for other java.* ones, so it effectively inlines them in the final native code. In this case making your own implementation of a standard method (sqrt(), for instance) doesn't make sence especially in a long run. So you can think about java.* classes as they have superior performance already!  So my opinion is that the only optimization at algorithm level is fully under your control and low-level tricks are... for JVM!  OR You can keep Java code at high-level (object-level) part and C++ code at low-level (polygon-level, drawing) part of your game (engine) with all of the tricks, etc. I think that this is the optimal way for "hard-core" Java gaming at the moment! 
|
|
|
|
|
Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Reply #23 - Posted
2003-02-28 10:38:47 » |
|
The new computers (I think it was included with MMX) have pipeline slot for straight repeated arithmetic calculations(exponent, sqrt, pi, and whatnot), but it doesn't make sense the jump is involved with 600mhz pentium to 1.4ghz, since technically they both be MMX. Does anyone know whether the new media processors have sqrt slot built into the cycle?
I have over 514% difference here in the sqrt calculations, so it must mean that the JIT really is doing its work, which is always good to know. However, the speed could be increased.
But still, integer 9^(1/2) is faster than (int)Math.sqrt(9)
Has anyone tested this with older versions of Java?
|
|
|
|
|
elias
|
 |
«
Reply #24 - Posted
2003-02-28 18:11:38 » |
|
But still, integer 9^(1/2) is faster than (int)Math.sqrt(9)
I don't hope you litterally mean "9^(1/2)", because ^ is xor in java, not a power function. - elias
|
|
|
|
rgeimer
Senior Newbie 
|
 |
«
Reply #25 - Posted
2003-02-28 20:05:35 » |
|
The new computers (I think it was included with MMX) have pipeline slot for straight repeated arithmetic calculations(exponent, sqrt, pi, and whatnot), but it doesn't make sense the jump is involved with 600mhz pentium to 1.4ghz, since technically they both be MMX. Does anyone know whether the new media processors have sqrt slot built into the cycle? Perhaps there is something in SSE2, which is included in the P4 but not the P3? http://www.tommesani.com/InstructionSetCPU.html
|
|
|
|
|
Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Reply #26 - Posted
2003-03-01 04:35:16 » |
|
I don't hope you litterally mean "9^(1/2)", because ^ is xor in java, not a power function.
- elias
Well I do mean that, because I tested it and it was about 1000 ms faster. I bet you can test that your self easily as well. However, whatever your result might be I'm 100% sure that it is faster.
|
|
|
|
|
elias
|
 |
«
Reply #27 - Posted
2003-03-01 08:39:04 » |
|
I could and would, if only ^ were the power operator, but it isn't. ^ in java is the boolean xor operator. Take this example: 1 2 3 4 5 6 7
| public class Test { public static void main(String[] args) { float wrong_sqrt = 9^(2); float real_sqrt = 9*9; System.out.println(wrong_sqrt + " != " + real_sqrt); } } |
gives me: 1 2
| [elias@ip172 snot]$ javac Test.java && java Test 11.0 != 81.0 |
see http://java.sun.com/docs/books/jls/second_edition/html/expressions.doc.html#5233 for more information - elias
|
|
|
|
Captain-Goatse
Junior Member  
I suck at teh 2D. XBOX IS BIG LOL!111
|
 |
«
Reply #28 - Posted
2003-03-01 11:14:46 » |
|
OK, thank god it came clear now, I have always been under impression that ^is truly ^
|
|
|
|
|
Viro
Senior Newbie 
Java games rock!
|
 |
«
Reply #29 - Posted
2004-05-30 12:30:47 » |
|
The difference between the speed of the Math.sqrt on a P4 and P3 is probably due to the difference in the way the fsqrt opcode is implemented on the processors. On the P4, the fsqrt instruction takes 38 clock ticks to complete while on the P3, it takes 57(!!) clock ticks. That would explain why you see much better results with Math.sqrt on a P4. My guess is that the JVM emits an fsqrt instruction for each call to Math.sqrt. More information about the performance of these floating point instructions on various processors can be found at http://www.aceshardware.com/Spades/read.php?article_id=25000194
|
|
|
|
|
|