Java-Gaming.org    
Featured games (79)
games approved by the League of Dukes
Games in Showcase (477)
Games in Android Showcase (107)
games submitted by our members
Games in WIP (535)
games currently in development
News: Read the Java Gaming Resources, or peek at the official Java tutorials
 
    Home     Help   Search   Login   Register   
Pages: 1 [2]
  ignore  |  Print  
  Finding index of single 1 bit in a long  (Read 4649 times)
0 Members and 1 Guest are viewing this topic.
Offline crystalsquid

Junior Member




... Boing ...


« Reply #30 - Posted 2003-12-17 15:31:43 »

Quote
I think you can do this without any branching.. theoretically it could be faster.


Thats what I said, back before this thread was so succesfully jacked  Tongue

I had it down to (for just one test block):
Quote
   b = a&0x0000ffff;
   b = ~((-b)>>31);  // 0 if b was non-zero
   r |= 16 & b;


which is 6 logical ops per bit of the output:
  AND, NEGATE,SHIFT,INVERT,AND,OR
and you have:

Quote
 m2 = ((v & 0xFFFF0000)-1+v);
 m3 = (m1-m2)>>31;
 i += m3 & 16;


which I make as 7 ops:
   AND,DEC,ADD,SUB,SHIFT,AND,ADD

Although Im not entirely sure how your version works - I love to see bit tricks I don't understand. Gives me a puzzle to solve  Wink

- Dom
Offline swpalmer

JGO Coder




Where's the Kaboom?


« Reply #31 - Posted 2003-12-17 16:55:07 »

Ah..  sorry i must have skimmed through the posts too fast.. you've beaten me by 1 op.. our last lines are equivalent as I could use OR instead.. however looking at your code gave me an idea to simplify mine..

1  
2  
int m1 = ((x & 0x0000FFFF) - 1) >> 31;
i |= 16 & m1;


I think that should work.  AND, SUB, SHIFT, AND, OR... down to 5 ops.

The AND and the -1 basically create a negative number if none of the bits we are testing were set, and it makes sure that if the high bit was the one that was set that it will not be set anymore. The shift is used to create a mask relying on the high bit being sign extended after the arithmetic shift.  After that it is the simple ORing of the bit in the result..

*edit*  fixed mask - needs to be the same as yours now otherwise the result needs to be subtracted from 31 Smiley

Offline crystalsquid

Junior Member




... Boing ...


« Reply #32 - Posted 2003-12-17 18:16:23 »

Shocked Sweet!
Woz - any chance of a timing test for this version? Wink

- Dom
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline Woz

Senior Newbie




A Troll who lives in a hole


« Reply #33 - Posted 2003-12-18 11:17:18 »

Here you go Dom,

It takes anywhere between 1-2 hours to gather the results for each test, so you'll have to forgive the slight delay.

Some slight code modifications on the each bitIndex() were required, as Dom wrote bitIndex6() off the top of his head, and Pamler's initial version was only accepted integers as parameters.

So here are the modifed versions:
1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
28  
29  
30  
31  
32  
33  
34  
35  
36  
37  
38  
39  
40  
41  
42  
43  
  public static int bitIndex6(long x)
  {
    int r = 0;
    int a = (int)x;
    int b = (int)(x>>32);

//    if( a == 0 )
//    {
//        a = (int)(x >> 32);
//        r += 32;
//    }

    int c = ( a >> 31) | (( -a ) >> 31);  // 0 if a is 0
   a = (a & c) | (b & (~c));  // a = b if a was 0
   r = 32 & (~c);

    //if ((a & 0xFFFF0000) != 0) r += 16;
   b = a & 0x0000ffff;
    b = ~(( -b ) >> 31);  // 0 if b was non-zero
   r |= 16 & b;

    //if ((a & 0xFF00FF00) != 0) r += 8;
   b = a & 0x00ff00ff;
    b = ~(( -b ) >> 31);  // 0 if b was non-zero
   r |= 8 & b;

    //if ((a & 0xF0F0F0F0) != 0) r += 4;
   b = a & 0x0f0f0f0f;
    b = ~(( -b ) >> 31);  // 0 if b was non-zero
   r |= 4 & b;

    //if ((a & 0xCCCCCCCC) != 0) r += 2;
   b = a & 0x33333333;
    b = ~(( -b ) >> 31);  // 0 if b was non-zero
   r |= 2 & b;

    //if ((a & 0xAAAAAAAA) != 0) r++;
   b = a & 0x55555555;
    b = ~(( -b ) >> 31);  // 0 if b was non-zero
   r |= 1 & b;

    return r;
  }


1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
  public static int bitIndex8(long x)
  {
    int r = 0;
    int a = (int)x;
    int b = (int)(x>>32);
    int c = ( a >> 31) | (( -a ) >> 31);  // 0 if a is 0

    a = (a & c) | (b & (~c));  // a = b if a was 0
   r = 32 & (~c);

    b = ((a & 0x0000FFFF) - 1) >> 31;
    r |= 16 & b;

    b = ((a & 0x00FF00FF) - 1) >> 31;
    r |= 8 & b;

    b = ((a & 0x0F0F0F0F) - 1) >> 31;
    r |= 4 & b;

    b = ((a & 0x33333333) - 1) >> 31;
    r |= 2 & b;

    b = ((a & 0x55555555) - 1) >> 31;
    r |= 1 & b;

    return r;
  }



And a most evil looking version, in an attempt to (some what doubtfully) get the method to inline.
1  
2  
3  
4  
5  
6  
7  
8  
  public static int bitIndex9(long x)
  {
    int a = (int)x;
    int c = (a | -a) >> 31;
    a = (a & c) | (((int)(x >> 32)) & ~c);

    return ((32 & (~c)) | (16 & (((a & 0x0000FFFF) - 1) >> 31)) | (8 & (((a & 0x00FF00FF) - 1) >> 31)) | (4 & (((a & 0x0F0F0F0F) - 1) >> 31)) | (2 & (((a & 0x33333333) - 1) >> 31)) | (1 & (((a & 0x55555555) - 1) >> 31)) );
  }


Results in next post.

------
Woz.
Offline Woz

Senior Newbie




A Troll who lives in a hole


« Reply #34 - Posted 2003-12-18 11:56:48 »

bitIndex6()














456.719
457
ibm13index6e.csv
Thread priority = 5
Classic VM
IBM Corporation
1.3.0
1.3.0
IBM Corporation
486.462
486
ms114index6e.csv
Thread priority = 5
null
null
null
1.1.4
Microsoft Corp.
361.481
361
sun12index6e.csv
Thread priority = 5
Classic VM
Sun Microsystems Inc.
1.2
1.2
Sun Microsystems Inc.
591.169
591
sun131index6e.csv
Thread priority = 5
Java HotSpot(TM) Client VM
Sun Microsystems Inc.
1.3.1-b24
1.3.1
Sun Microsystems Inc.
728.77
729
sun141index6e.csv
Thread priority = 5
Java HotSpot(TM) Client VM
Sun Microsystems Inc.
1.4.1-b21
1.4.1
Sun Microsystems Inc.
606.939
607
sun142index6e.csv
Thread priority = 5
Java HotSpot(TM) Client VM
Sun Microsystems Inc.
1.4.2_01-b06
1.4.2_01
Sun Microsystems Inc.

bitIndex8()












418.503
418
ibm13index8e.csv
Thread priority = 5
Classic VM
IBM Corporation
1.3.0
1.3.0
IBM Corporation
455.988
456
ms114index8e.csv
Thread priority = 5
null
null
null
1.1.4
Microsoft Corp.
337.237
337
sun12index8e.csv
Thread priority = 5
Classic VM
Sun Microsystems Inc.
1.2
1.2
Sun Microsystems Inc.
567.88
568
sun131index8e.csv
Thread priority = 5
Java HotSpot(TM) Client VM
Sun Microsystems Inc.
1.3.1-b24
1.3.1
Sun Microsystems Inc.
627.207
627
sun141index8e.csv
Thread priority = 5
Java HotSpot(TM) Client VM
Sun Microsystems Inc.
1.4.1-b21
1.4.1
Sun Microsystems Inc.
574.385
574
sun142index8e.csv
Thread priority = 5
Java HotSpot(TM) Client VM
Sun Microsystems Inc.
1.4.2_01-b06
1.4.2_01
Sun Microsystems Inc.

Evil looking bitIndex9()












22.033
22
ibm13index9e.csv
Thread priority = 5
Classic VM
IBM Corporation
1.3.0
1.3.0
IBM Corporation
463.16
463
ms114index9e.csv
Thread priority = 5
null
null
null
1.1.4
Microsoft Corp.
322.418
322
sun12index9e.csv
Thread priority = 5
Classic VM
Sun Microsystems Inc.
1.2
1.2
Sun Microsystems Inc.
523.457
523
sun131index9e.csv
Thread priority = 5
Java HotSpot(TM) Client VM
Sun Microsystems Inc.
1.3.1-b24
1.3.1
Sun Microsystems Inc.
523.457
523
sun131index9e.csv
Thread priority = 5
Java HotSpot(TM) Client VM
Sun Microsystems Inc.
1.3.1-b24
1.3.1
Sun Microsystems Inc.
513.895
514
sun142index9e.csv
Thread priority = 5
Java HotSpot(TM) Client VM
Sun Microsystems Inc.
1.4.2_01-b06
1.4.2_01
Sun Microsystems Inc.


It would appear that IBM's JVM optimized bitIndex9() out!

As a whole they appear to be slower (than bitIndex5() ) under microbench mark conditions. Personnally I was expecting that the JIT compilers would be considerably slower due to most JIT's being nothing more than a Interpreter without the overheads of interpreting (!). Interpreters dont reorder code they execute each byte code as it comes.

All but one JVM is able to beat the original bitIndex2().

Note that the 1.4.2 -server -Xcompile scores are not present, as I have not finished playing with it yet.
I have tested it on bitTest2, 4 & 5.
Using -Xcompile made no difference to the client JVM as neither did changing the amount of "warm up" time.

The server version when using just the -server switch manages to optimize out the the benchmark completely giving a score of zero. When combined with -Xcompile a score is presented, it has the greatest gain (percentage increase over bitIndex2() ) in the order of  190% for bitIndex4(), which is faster that bitIndex5(). But is using 308 clocks to do it.

Obviously the microbenchmark is going to have to be modified in order to stop the code being optimized out.

It would be interesting to try them on a P4 which has a very long pipeline, so surely bitIndex9() would be faster than bitIndex5().

Cold fish all round then!
-------
Woz.
Offline swpalmer

JGO Coder




Where's the Kaboom?


« Reply #35 - Posted 2003-12-18 12:27:35 »

My code is a bit tighter than your bitIndex9...

1  
2  
3  
4  
5  
6  
7  
8  
      public static int getIndex(int v)
      {
            return ((((v & 0x55555555)-1)>>31) & 1)
                 | ((((v & 0x33333333)-1)>>31) & 2)
                 | ((((v & 0x0F0F0F0F)-1)>>31) & 4)
                 | ((((v & 0x00FF00FF)-1)>>31) & 8)
                 | ((((v & 0x0000FFFF)-1)>>31) & 16);
      }


Which you can trivially extend to work on a long intead of an int.

** EDIT ** Actually... hmmm.. nevermind...I see what you are doing...

Note: if you run the 'long' version on a G5 Mac it will use 64bit instructions and registers.  Without 64 bit native operations your 'a' and 'c' calculations might make a difference.  Though I think that can be written a little bit more efficiently too.
Once you know to OR in 32 you can do something like this and follow through with the 'int' version.
1  
2  
3  
4  
5  
6  
7  
int bitIndex(long v)
{
...
int i = (int)(v | (v >>> 32));
...
do int version on i
}

Offline princec

JGO Kernel


Medals: 343
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #36 - Posted 2003-12-18 13:30:05 »

Quick tip for microbenchmarkers:
Just after you calculate your total time, System.out.println() the sum of the results of calling the function (for example) which prevents the methods being optimized away.

Cas Smiley

Offline crystalsquid

Junior Member




... Boing ...


« Reply #37 - Posted 2003-12-18 17:35:07 »

Woz: Cheers for the timings (& the fixes) Smiley
Shame they're not any faster  :-/ Interesting though. I wonder if its not managing to interleave the dependancy chains for each output bit.

- Dom
Offline swpalmer

JGO Coder




Where's the Kaboom?


« Reply #38 - Posted 2003-12-18 23:23:08 »

What does the test harness look like?
You would think that the branch prediction of modern processors would be totally defeated by this stuff since each branch has exactly a 50:50 chance of being taken.
I would like to try this on the Mac to see if the PowerPC handles things differently.

Offline AndersDahlberg

Junior Member





« Reply #39 - Posted 2003-12-19 13:27:22 »

Would be interesting to see results from a Crusoe processor running a similar test Smiley
Pages: 1 [2]
  ignore  |  Print  
 
 
You cannot reply to this message, because it is very, very old.

 

Add your game by posting it in the WIP section,
or publish it in Showcase.

The first screenshot will be displayed as a thumbnail.

pw (37 views)
2014-07-24 01:59:36

Riven (38 views)
2014-07-23 21:16:32

Riven (26 views)
2014-07-23 21:07:15

Riven (28 views)
2014-07-23 20:56:16

ctomni231 (59 views)
2014-07-18 06:55:21

Zero Volt (50 views)
2014-07-17 23:47:54

danieldean (42 views)
2014-07-17 23:41:23

MustardPeter (44 views)
2014-07-16 23:30:00

Cero (60 views)
2014-07-16 00:42:17

Riven (57 views)
2014-07-14 18:02:53
HotSpot Options
by dleskov
2014-07-08 03:59:08

Java and Game Development Tutorials
by SwordsMiner
2014-06-14 00:58:24

Java and Game Development Tutorials
by SwordsMiner
2014-06-14 00:47:22

How do I start Java Game Development?
by ra4king
2014-05-17 11:13:37

HotSpot Options
by Roquen
2014-05-15 09:59:54

HotSpot Options
by Roquen
2014-05-06 15:03:10

Escape Analysis
by Roquen
2014-04-29 22:16:43

Experimental Toys
by Roquen
2014-04-28 13:24:22
java-gaming.org is not responsible for the content posted by its members, including references to external websites, and other references that may or may not have a relation with our primarily gaming and game production oriented community. inquiries and complaints can be sent via email to the info‑account of the company managing the website of java‑gaming.org
Powered by MySQL Powered by PHP Powered by SMF 1.1.18 | SMF © 2013, Simple Machines | Managed by Enhanced Four Valid XHTML 1.0! Valid CSS!