Hmm. Can this little piece be optimized further?
Offline barfy

Junior Member




The evidence of things not seen


« Posted 2004-06-23 10:00:59 »

// pixel is a byte array
// currentTextureMap is a byte array

int newpix = (pixel[index] & 0xFF) + (currentTextureMap[tIndex] & 0xFF);
                               
if(newpix > 255)
     newpix = 255;
                               
pixel[index] = (byte) newpix;


Can this section of code be made any faster? This is pretty crucial since this portion gets executed many times per frame.

I was wondering if there's a way to limit the variable newpix (using bit operations?) so that it is always between 0 and 255 (if the value is greater than 255, then it should be clamped to 255). Then I could simply discard the if statement that checks for overflow.



Offline Herkules

Senior Member




Friendly fire isn't friendly!


« Reply #1 - Posted 2004-06-23 10:41:28 »

int newpix = (pixel[index] + currentTextureMap[tIndex]) & 0xFF;  


How's that?

HARDCODE    --     DRTS/FlyingGuns/JPilot/JXInput  --    skype me: joerg.plewe
Offline erikd

JGO Ninja


Medals: 15
Projects: 4
Exp: 14 years


Maximumisness


« Reply #2 - Posted 2004-06-23 10:43:41 »

As far as I know, it can't be replaced by a simple bit operation, but the if statement is very cheap anyway. The only possible speed optimization I can see would be an unsigned byte type in Java (am I right? You would be able to skip the masking when converting to an int), but that doesn't help you much, does it? ;)

Offline erikd

JGO Ninja


Medals: 15
Projects: 4
Exp: 14 years


Maximumisness


« Reply #3 - Posted 2004-06-23 10:45:09 »

Quote
int newpix = (pixel[index] + currentTextureMap[tIndex]) & 0xFF;  


How's that?


I don't think the results will be the same, or will they?

Offline Herkules

Senior Member




Friendly fire isn't friendly!


« Reply #4 - Posted 2004-06-23 10:52:41 »

pixel[index] = (byte)(pixel[index] + currentTextureMap[tIndex]);    


This should do it in one step.

Offline barfy

Junior Member




The evidence of things not seen


« Reply #5 - Posted 2004-06-23 10:53:14 »

Quote
int newpix = (pixel[index] + currentTextureMap[tIndex]) & 0xFF;  


How's that?


Nope, that gives a different result. For example, if pixel[index] = -1 and currentTextureMap[tIndex] = -1,

1. Then the result in my original code would be 510, which is then truncated by that if statement to 255.

2. Your version would give 254.  

Sigh, what I would really need is an unsigned byte type. But thanks anyway.
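
To make the difference concrete, here is a small sketch that runs both expressions on the example above, where both bytes hold -1 (bit pattern 0xFF, i.e. 255 when read as unsigned). The class and method names are made up for illustration and do not come from the thread.

```java
// Illustrative only: class and method names are hypothetical.
final class MaskOrderDemo {
    // Original approach: widen each byte to 0..255 first, add, then clamp.
    static int addThenClamp(byte p, byte t) {
        int sum = (p & 0xFF) + (t & 0xFF);
        return sum > 255 ? 255 : sum;
    }

    // Suggested shortcut: add the signed bytes, then mask.
    // This wraps around modulo 256 instead of clamping.
    static int addThenMask(byte p, byte t) {
        return (p + t) & 0xFF;
    }

    public static void main(String[] args) {
        byte p = -1, t = -1;
        System.out.println(addThenClamp(p, t)); // 255
        System.out.println(addThenMask(p, t));  // 254
    }
}
```

The two versions agree whenever the unsigned sum stays at or below 255; they only diverge on overflow, which is exactly the case the clamp exists for.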
Offline Herkules

Senior Member




Friendly fire isn't friendly!


« Reply #6 - Posted 2004-06-23 10:54:05 »

pixel[index] & 0xFF


This doesn't do anything because pixel is already a byte! Just a waste of cycles.

Offline barfy

Junior Member




The evidence of things not seen


« Reply #7 - Posted 2004-06-23 10:57:14 »

Quote
pixel[index] & 0xFF


This doesn't do anything because pixel is already a byte! Just a waste of cycles.


Actually pixel[index] is a SIGNED byte, so pixel[index] & 0xFF converts it to the equivalent of an UNSIGNED byte (the result is actually an int in the range 0-255).

However, merely casting pixel[index] to an int using (int)pixel[index] would sign-extend it, so a negative byte stays negative.
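
A minimal sketch of the two widening conversions, with hypothetical helper names chosen for this example:

```java
// Illustrative only: class and method names are hypothetical.
final class SignExtensionDemo {
    // A plain cast sign-extends, so the result stays in -128..127.
    static int widenByCast(byte b) {
        return (int) b;
    }

    // Masking with 0xFF promotes to int and clears the upper 24 bits,
    // so the result is in 0..255.
    static int widenByMask(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;
        System.out.println(widenByCast(b)); // -1
        System.out.println(widenByMask(b)); // 255
    }
}
```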
Offline Herkules

Senior Member




Friendly fire isn't friendly!


« Reply #8 - Posted 2004-06-23 11:01:34 »

I see, you're right... & 0xff implicitly casts to int though...

Offline erikd

JGO Ninja


Medals: 15
Projects: 4
Exp: 14 years


Maximumisness


« Reply #9 - Posted 2004-06-23 11:03:00 »

Therefore we need to bug Sun to reconsider http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4186775

Offline barfy

Junior Member




The evidence of things not seen


« Reply #10 - Posted 2004-06-23 11:09:42 »

Quote
As far as I know, it can't be replaced by a simple bit operation but the if statement is very cheap anyway.


Not so cheap when it's executed more than 10,000 times per frame. I ran a profiler and this section is taking up 70% of my processor time.

Even just 1 or 2 fewer instructions in the code make a substantial difference.
Offline crystalsquid

Junior Member




... Boing ...


« Reply #11 - Posted 2004-06-23 11:14:36 »

You are right that you want to dispose of the branch if possible (conditional branches are nasty for most CPUs), so to get round it you do:

newpix |= ((255-newpix)>>31);


If newpix <= 255, then 255-newpix is non-negative. The shift right by 31 propagates the sign bit through the whole int, giving 0. ORing this in changes nothing.

If newpix > 255, the shift yields 0xffffffff, and ORing it in makes newpix 0xffffffff as well, so when you cast back to a byte you get 0xff, the clamped value you were after.

This takes 3 logical ops (3 cycles), compared to a branch misprediction (~30 cycles) whenever the clamping is used. So if you clamp ~10% of pixels this way, the chances are that it will be around the same speed :)

Your original looks a little odd though: are your arrays really byte arrays? Are you not dealing with multiple colour channels at least? If you are trying to do additive or alpha blending, there are slightly more efficient ways to do this, as you can usually get away with handling R and B combined in one int, saving you 1/3rd of the work for a blend.

Hope this helps,

- Dom
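
The trick above can be checked against the plain if statement with a short sketch. The class and method names are assumptions made for this example, and the branchless version is only meant for inputs in 0..510, the range a sum of two unsigned bytes can take.

```java
// Illustrative only: class and method names are hypothetical.
final class BranchlessClamp {
    // Valid for 0 <= newpix <= 510 (sum of two unsigned bytes).
    static byte clampBranchless(int newpix) {
        // (255 - newpix) is negative only when newpix > 255; the arithmetic
        // shift then smears the sign bit into all 32 bits.
        newpix |= (255 - newpix) >> 31;
        return (byte) newpix;
    }

    // Reference version using the original branch.
    static byte clampWithBranch(int newpix) {
        if (newpix > 255) {
            newpix = 255;
        }
        return (byte) newpix;
    }
}
```

Looping over every value from 0 to 510 and comparing the two methods is a cheap way to confirm they agree before benchmarking; which one is faster depends on the CPU and the JIT, so it has to be measured.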
Offline Mark Thornton

Senior Member





« Reply #12 - Posted 2004-06-23 11:39:30 »

Quote
This takes 3 logical ops (3 cycles)
Could be a lot more on some CPUs that don't have a barrel shifter (Pentium 4?).

Offline barfy

Junior Member




The evidence of things not seen


« Reply #13 - Posted 2004-06-23 11:57:52 »

Quote
You are right that you want to dispose of the branch if possible (conditional branches are nasty for most CPUs), so to get round it you do:

newpix |= ((255-newpix)>>31);


If newpix <= 255, then 255-newpix is non-negative. The shift right by 31 propagates the sign bit through the whole int, giving 0. ORing this in changes nothing.

If newpix > 255, the shift yields 0xffffffff, and ORing it in makes newpix 0xffffffff as well, so when you cast back to a byte you get 0xff, the clamped value you were after.


Thanks. Unfortunately, it's slower now :D. Although that's really quite an elegant way of skirting the conditional statement.

I think the issue is not so much branch misprediction or cache misses, but the sheer number of instructions that get executed per frame.

EDIT: I'm using a P4, so part of the slowdown could be due to the issue Mark described in his post above.

Quote

Your original looks a little odd though: are your arrays really byte arrays? Are you not dealing with multiple colour channels at least? If you are trying to do additive or alpha blending, there are slightly more efficient ways to do this, as you can usually get away with handling R and B combined in one int, saving you 1/3rd of the work for a blend.

Hope this helps,

- Dom



I'm actually working with 8-bit IndexColorModels and a DataBuffer.Byte pixel array. The addition that you see is just adding corresponding pixel values from a pre-defined 8-bit texture map to the DataBuffer.Byte pixel array.

Hmm. What you suggested got me thinking though. I wonder if I could use a DataBuffer.Int with the IndexColorModel so that 4 8-bit pixel values can be combined in an int, and then perform the adding on the int instead...

Thanks :)
Offline Herkules

Senior Member




Friendly fire isn't friendly!


« Reply #14 - Posted 2004-06-23 12:33:06 »

DataBuffer.MMX would be helpful :)

Offline phazer

Junior Member




Come get some


« Reply #15 - Posted 2004-06-23 12:42:58 »

You could also try this:
// a[i] = i for i < 256 and a[i] = 255 for 256 <= i <512

pixel[index] = a[newpix];


Don't know if there will be a speed increase, but it's worth a shot. The array is so small it will probably fit inside the L1 cache.
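
A minimal sketch of the lookup-table idea, assuming both arrays have the same length. The class, field, and method names are made up for this example.

```java
// Illustrative only: class, field and method names are hypothetical.
final class ClampTable {
    // 512 entries cover every possible sum of two unsigned bytes (0..510).
    static final byte[] CLAMP = new byte[512];

    static {
        for (int i = 0; i < CLAMP.length; i++) {
            // Identity below 256, saturated to 255 (0xFF) from 256 upwards.
            CLAMP[i] = (byte) (i < 256 ? i : 255);
        }
    }

    // Adds texture into pixel in place; assumes equal array lengths.
    static void addInPlace(byte[] pixel, byte[] texture) {
        for (int i = 0; i < pixel.length; i++) {
            pixel[i] = CLAMP[(pixel[i] & 0xFF) + (texture[i] & 0xFF)];
        }
    }
}
```

The table replaces the compare with a memory load plus whatever bounds check the JIT cannot eliminate, so any gain over the if statement has to be measured rather than assumed.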

Offline erikd

JGO Ninja


Medals: 15
Projects: 4
Exp: 14 years


Maximumisness


« Reply #16 - Posted 2004-06-23 14:41:04 »

Quote
What you suggested got me thinking though. I wonder if I could use a DataBuffer.Int with the IndexColorModel so that 4 8-bit pixel values can be combined in an int, and then perform the adding on the int instead...  


I'm guessing you would have to do an awful lot of masking instead to prevent overflows to 'bleed' into the wrong bits, or am I missing something?
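
For a sense of how much masking that involves, here is a sketch of a saturating add over four unsigned 8-bit values packed into one int. It is a generic bit-twiddling technique, not code from this thread, and the names are hypothetical. It only covers the arithmetic; whether such a packed buffer can be paired with an IndexColorModel is a separate question.

```java
// Illustrative only: class and method names are hypothetical.
final class PackedAdd {
    // Adds four unsigned 8-bit lanes packed into each int,
    // clamping every lane to 255 independently.
    static int addSaturated4(int a, int b) {
        final int HIGH = 0x80808080;                    // top bit of each lane
        int low = (a & ~HIGH) + (b & ~HIGH);            // 7-bit adds; cannot carry across lanes
        int sum = low ^ ((a ^ b) & HIGH);               // fold in the top bits without carrying
        int carry = ((a & b) | ((a | b) & low)) & HIGH; // lanes whose true sum exceeded 255
        int mask = (carry >>> 7) * 0xFF;                // 0xFF in each overflowing lane
        return sum | mask;
    }
}
```

That comes to roughly a dozen logical operations for four pixels, with no loss of precision and no bleeding between lanes.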

Offline tom
« Reply #17 - Posted 2004-06-23 15:09:19 »

Here is some code that adds the RGB components of an int using the same method as crystalsquid. There is some loss in precision, and the 4th component needs to be handled separately :(
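// Adds the R, G and B channels of two packed ints, saturating each channel.
// The lowest bit of every channel is dropped to leave room for the carry.
public final static int addSaturated(int a, int b)
{
    a &= 0xfefefe;
    b &= 0xfefefe;
    int ab = a + b;
    int sign = ab & 0x01010100;           // carry bit of each channel
    int sum = (sign - (sign >> 8)) | ab;  // 0xff in every channel that overflowed
    return sum;
}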

Offline barfy

Junior Member




The evidence of things not seen


« Reply #18 - Posted 2004-06-23 17:19:21 »

Quote
You could also try this:
// a[i] = i for i < 256 and a[i] = 255 for 256 <= i <512

pixel[index] = a[newpix];


Don't know if there will be a speed increase, but it's worth a shot. The array is so small it will probably fit inside the L1 cache.


That gives roughly the same speed as the "if" statement, maybe a little slower. Probably because of the array bounds check on each random access.
Offline erikd

JGO Ninja


Medals: 15
Projects: 4
Exp: 14 years


Maximumisness


« Reply #19 - Posted 2004-06-23 17:21:10 »

Overflows are now going to the wrong color and even to the 4th byte (i.e. addSaturated(0xff00ff, 0xff00ff) results in 0x1ff01ff) so the result can be slightly wrong. Maybe this isn't a problem though, but then again maybe it is...
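
One possible way to keep the stray carry bits out of the result is to clear them before ORing in the saturation mask. This is a sketch of a variant of the routine above, not the original author's code. The method name is hypothetical, and the variant still drops the lowest bit of each channel.

```java
// Illustrative only: class and method names are hypothetical.
final class SaturatedRgb {
    // Variant of addSaturated that removes the carry bits from the sum.
    static int addSaturatedMasked(int a, int b) {
        a &= 0xfefefe;
        b &= 0xfefefe;
        int ab = a + b;
        int carry = ab & 0x01010100;     // one bit per overflowing channel
        int fill = carry - (carry >> 8); // 0xff in each overflowing channel
        return (ab ^ carry) | fill;      // clear the carries, then saturate
    }
}
```

With this change, addSaturatedMasked(0xff00ff, 0xff00ff) gives 0xff00ff instead of 0x1ff01ff, at the cost of one extra XOR.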

Offline barfy

Junior Member




The evidence of things not seen


« Reply #20 - Posted 2004-06-23 17:33:14 »

Quote


I'm guessing you would have to do an awful lot of masking instead to prevent overflows to 'bleed' into the wrong bits, or am I missing something?


Anyway it seems that you can't get the "multiple pixels packed into an int" idea to work with an IndexColorModel, which is unfortunately what I am using.
Offline Abuse

JGO Coder


Medals: 10


falling into the abyss of reality


« Reply #21 - Posted 2004-06-24 14:49:24 »

Silly suggestion, but are you running these tests on the server VM?

Make Elite IV:Dangerous happen! Pledge your backing at KICKSTARTER here! https://dl.dropbox.com/u/54785909/EliteIVsmaller.png
Offline barfy

Junior Member




The evidence of things not seen


« Reply #22 - Posted 2004-06-24 19:51:48 »

Quote
Silly suggestion, but are you running these tests on the server VM?


I'm testing the performance on the client VM because as far as I know, there doesn't seem to be a way to run the app with the server VM via webstart... or is there?
Offline erikd

JGO Ninja


Medals: 15
Projects: 4
Exp: 14 years


Maximumisness


« Reply #23 - Posted 2004-06-24 22:08:40 »

Even if there is (I don't think there is a non-hackish one), you can be 99.99% sure the user is running your game on the client VM.

Offline princec

JGO Kernel


Medals: 284
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #24 - Posted 2004-06-25 09:38:35 »

Unless you, ahh, ship a JRE embedded in the game with the server VM ;)

Cas :)

Offline Mark Thornton

Senior Member





« Reply #25 - Posted 2004-06-25 10:05:28 »

Quote

the app with the server VM via webstart... or is there?
From 1.5 webstart supports arbitrary VM arguments which ought to include -server. No doubt it will only work if the selected JRE has the server VM installed, but perhaps you could have an installable extension which contained the server VM dll and copied it to the right place in the chosen JRE. Obviously the .jar file would have to be signed, but it looks feasible.

-server is listed as supported here
http://java.sun.com/j2se/1.5.0/docs/guide/javaws/developersguide/syntax.html

To assist in adding the server JVM, the ExtensionInstallerService stuff looks ideal.

It would be really helpful if Sun would wrap up the server JVM as an extension JNLP and host it somewhere.