Java-Gaming.org    
Concrete benefits of SSE and SSE2 instructions in Java
Offline TheAnalogKid

JGO Coder


Projects: 2



« Posted 2005-08-30 20:25:19 »

Hi all,

I've done multiple searches in the forums about SSE and SSE2 to learn the concrete benefits of using these instruction sets. I know that FP computations get a performance boost from them, but I'm confused about how they could boost graphics performance, since FPUs already do graphics acceleration. Could someone clarify all this please?

Thanks!

Offline Linuxhippy

Senior Member


Medals: 1


Java games rock!


« Reply #1 - Posted 2005-08-30 23:37:28 »

Quote
how it could boost graphics performance since FPUs already do graphics acceleration

I do not really understand this. What do you mean by saying that FPUs do graphics acceleration? An FPU is a unit in the processor and has nothing to do with the graphics card at all - it makes no difference whether you have a GF7800 or a Tseng board in your computer. (Btw, does anybody remember the Tseng VGA boards?)

However, to come back to SSE:
SSE is a SIMD instruction set. SIMD means Single Instruction, Multiple Data: one instruction operates on more than one data element in one step. SSE allows you, e.g., to do 4 multiplications in one instruction.
If HotSpot detects that it can optimize code to use SSE, the resulting code will need fewer instructions - that's it.

However, this mostly matters for maths/algorithmic code, which most of today's games are not.
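
To make the "4 multiplications in one instruction" idea concrete, here is a minimal sketch in plain Java that models what a packed SSE multiply does to four floats at once. The names `Lanes4`, `mul4` and `multiply` are made up for illustration. This only models the result; whether HotSpot turns any given loop into real SSE instructions depends on the VM version and is not something this code controls.

```java
// Hypothetical illustration: models the effect of one packed (SIMD) multiply.
// A real SSE instruction would handle the four lanes in a single step;
// plain Java can only describe the result, not force that instruction.
final class Lanes4 {
    static final int WIDTH = 4; // four 32-bit floats fit in one 128-bit SSE register

    /** Multiplies a[off..off+3] by b[off..off+3] and stores into out[off..off+3]. */
    static void mul4(float[] a, float[] b, float[] out, int off) {
        out[off]     = a[off]     * b[off];
        out[off + 1] = a[off + 1] * b[off + 1];
        out[off + 2] = a[off + 2] * b[off + 2];
        out[off + 3] = a[off + 3] * b[off + 3];
    }

    /** Multiplies whole arrays in blocks of four, with a scalar loop for the leftover tail. */
    static float[] multiply(float[] a, float[] b) {
        if (a.length != b.length) throw new IllegalArgumentException("length mismatch");
        float[] out = new float[a.length];
        int i = 0;
        for (; i + WIDTH <= a.length; i += WIDTH) {
            mul4(a, b, out, i);      // one "SIMD step": four products
        }
        for (; i < a.length; i++) {
            out[i] = a[i] * b[i];    // remaining 0..3 elements, one at a time
        }
        return out;
    }
}
```

With six elements, the first four go through one `mul4` call and the last two through the scalar tail loop.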

lg Clemens
Offline TheAnalogKid

JGO Coder


Projects: 2



« Reply #2 - Posted 2005-08-31 01:41:46 »

Quote
I do not really understand this. What do you mean by saying that FPUs do graphics acceleration?
Oops! I made a typo here! I meant GPU of course. I already know that the FPU is not related to graphics acceleration at all.

Quote
SSE is a SIMD instruction set. SIMD means Single Instruction, Multiple Data: one instruction operates on more than one data element in one step. SSE allows you, e.g., to do 4 multiplications in one instruction.
If HotSpot detects that it can optimize code to use SSE, the resulting code will need fewer instructions - that's it.
I know what SIMD is, but I was wondering how a game could benefit from it. What are the concrete uses?

Offline Jeff

JGO Coder




Got any cats?


« Reply #3 - Posted 2005-08-31 07:06:20 »

Well... lesee

On a system that doesn't do triangle transform on the GPU it helps there, but of course you're right, that's becoming less and less the case.

Physics however is a current big sucker of computation power.
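
As a sketch of the kind of CPU-side work meant here, this is a vertex transform in plain Java: each output component is four multiplies and three adds, which is the shape of work packed SSE instructions were aimed at. `VertexTransform`, the row-major matrix layout and the x, y, z, w vertex packing are assumptions made for this example, not taken from any actual engine.

```java
// Hypothetical example: transform packed (x, y, z, w) vertices on the CPU.
final class VertexTransform {
    /**
     * Transforms packed xyzw vertices by a 4x4 row-major matrix m (16 floats).
     * Returns a new array; the input is left untouched.
     */
    static float[] transform(float[] m, float[] verts) {
        if (m.length != 16 || verts.length % 4 != 0) {
            throw new IllegalArgumentException("expected 16 matrix floats and xyzw vertices");
        }
        float[] out = new float[verts.length];
        for (int v = 0; v < verts.length; v += 4) {
            float x = verts[v], y = verts[v + 1], z = verts[v + 2], w = verts[v + 3];
            for (int row = 0; row < 4; row++) {
                int r = row * 4;
                // One matrix row times the vertex: 4 multiplies, 3 adds
                out[v + row] = m[r] * x + m[r + 1] * y + m[r + 2] * z + m[r + 3] * w;
            }
        }
        return out;
    }
}
```

For a mesh with thousands of vertices this inner computation is repeated unchanged for every vertex, which is why it was a classic SIMD target before GPUs took over transform work.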


Got a question about Java and game programming?  Just new to the Java Game Development Community?  Try my FAQ.  Its likely you'll learn something!

http://wiki.java.net/bin/view/Games/JeffFAQ
Offline TheAnalogKid

JGO Coder


Projects: 2



« Reply #4 - Posted 2005-08-31 14:46:47 »

Thanks Jeff!
Quote
On a system that doesn't do triangle transform on the GPU it helps there
But I guess it's unlikely that such a system will have a CPU with at least SSE instructions. And MMX is not very useful, as it doesn't allow a program to use floating-point primitives without losing the performance boost of SIMD. But what about 3DNow! on AMD CPUs? I don't know. I think it was there before SSE.
Now I understand how it can boost performance in video games. And talking about game physics, do you know if we could eventually see physics hardware accelerators embedded in video cards?

Offline Linuxhippy

Senior Member


Medals: 1


Java games rock!


« Reply #5 - Posted 2005-08-31 16:19:46 »

Now I understand how it can boost performance in video games. And talking about game physics, do you know if we could eventually see physics hardware accelerators embedded in video cards?

Not really - since it would not "fit" well into a GPU. Game physics is often coupled very closely to the game engine and so needs to communicate a lot with RAM/CPU, which is something GPUs are not good at, since there is a lot of communication overhead between them.

lg Clemens
Offline TheAnalogKid

JGO Coder


Projects: 2



« Reply #6 - Posted 2005-08-31 18:13:38 »

Yes very good point!

Offline Raghar

Junior Member




Ue ni taete 'ru hitomi ni kono mi wa dou utsuru


« Reply #7 - Posted 2005-08-31 20:07:43 »

If you do

for (int c = 0; c < array.length; c++)
    array[c] = array2[c] * array3[c];

SSE should speed up your application considerably, especially if the arrays are aligned on a 16-byte boundary.

So it's also important in video compression, various simultaneous tasks, and for cycle expansion
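
The loop above is a good candidate because every iteration is independent of the others. Here is a short sketch of the contrast, with made-up names (`LoopShapes`, `elementwiseProduct`, `runningSum`): the first method has the shape a vectorizing compiler can split into groups of elements, while the second carries a value from one iteration to the next and cannot be split that simply. Note that plain Java gives no control over array alignment, so the 16-byte point is up to the VM.

```java
// Hypothetical illustration of which loop shapes suit SIMD.
final class LoopShapes {
    /** Independent iterations: out[c] depends only on a[c] and b[c]. */
    static float[] elementwiseProduct(float[] a, float[] b) {
        int n = Math.min(a.length, b.length);
        float[] out = new float[n];
        for (int c = 0; c < n; c++) {
            out[c] = a[c] * b[c];
        }
        return out;
    }

    /** Loop-carried dependency: out[c] needs the sum up to c - 1, so iterations form a chain. */
    static float[] runningSum(float[] a) {
        float[] out = new float[a.length];
        float acc = 0f;
        for (int c = 0; c < a.length; c++) {
            acc += a[c];
            out[c] = acc;
        }
        return out;
    }
}
```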
Offline darkprophet

Senior Member




Go Go Gadget Arms


« Reply #8 - Posted 2005-08-31 20:11:50 »

Quote
Game physics is often coupled very closely to the game engine and so needs to communicate a lot with RAM/CPU, which is something GPUs are not good at, since there is a lot of communication overhead between them.

Not really... Take a look at ODE, Newton, Tokamak, Novodex (they have actually implemented hardware physics like you describe), TrueAxis, Havok, MathLib et al... They have all decoupled the game engine from the physics. In fact I'll go as far as to say that it's considered wrong design for a physics engine to be dependent on any game engine, because it doesn't need to be.

Friends don't let friends make MMORPGs.

Blog | Volatile-Engine
Offline Linuxhippy

Senior Member


Medals: 1


Java games rock!


« Reply #9 - Posted 2005-08-31 20:22:02 »

Not really... Take a look at ODE, Newton, Tokamak, Novodex (they have actually implemented hardware physics like you describe), TrueAxis, Havok, MathLib et al... They have all decoupled the game engine from the physics. In fact I'll go as far as to say that it's considered wrong design for a physics engine to be dependent on any game engine, because it doesn't need to be.

I did not know about that at all. However, I just wonder what benefit there would be in having a physics instruction set inside the GPU - wouldn't it by design fit much better into the CPU, if someone really wants to implement it in HW?
On the other hand, I didn't even know that physics consumes so many cycles these days, so I am anything but experienced in terms of game programming ;-)

lg Clemens
Offline darkprophet

Senior Member




Go Go Gadget Arms


« Reply #10 - Posted 2005-08-31 20:24:34 »

http://www.ageia.com/ - Enjoy a hardware-based physics engine that can also run on the CPU if the PPU isn't found.
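
A rough sketch of how a decoupled physics layer with a CPU fallback could look from the game's side. Everything here (`PhysicsBackend`, `CpuPhysics`, `PhysicsSelector`) is hypothetical and is not the API of Ageia's product or of any other real library; the point is only that the game talks to an interface and falls back to the CPU when no dedicated physics hardware is found.

```java
import java.util.Optional;

/** Hypothetical engine-independent physics interface; not the API of any real library. */
interface PhysicsBackend {
    String name();

    /** Advances positions by velocity * dt, in place. */
    void step(float[] pos, float[] vel, float dt);
}

/** Plain CPU implementation, always available. */
final class CpuPhysics implements PhysicsBackend {
    public String name() { return "cpu"; }

    public void step(float[] pos, float[] vel, float dt) {
        for (int i = 0; i < pos.length; i++) {
            pos[i] += vel[i] * dt;
        }
    }
}

final class PhysicsSelector {
    /** Uses the accelerator if one was detected, otherwise falls back to the CPU. */
    static PhysicsBackend choose(Optional<PhysicsBackend> accelerator) {
        return accelerator.orElseGet(CpuPhysics::new);
    }
}
```

The game code only ever calls `step`, so it does not need to know which backend was chosen.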

Friends don't let friends make MMORPGs.

Blog | Volatile-Engine
Offline tom
« Reply #11 - Posted 2005-08-31 20:49:21 »

The VM might be using SSE2 instructions instead of the FPU, that is, if they can make it run faster on single data (not SIMD). SSE has an advantage since its registers are not stack based. Maybe Jeff can ask the VM guys if SSE is used?

I find it extremely unlikely that the VM can produce SIMD code. If it does, it is only in very special cases. So you will not see any benefits of SIMD in Java. You could write a native library that takes advantage of SSE, but it is only beneficial if there is enough data to overcome the JNI overhead.

I'm sure SIMD instructions can be used whenever you need to do some serious number crunching. In games that might be:
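
The JNI trade-off can be put as a back-of-the-envelope formula: a native call pays off once the fixed call overhead is smaller than the time saved per element times the number of elements. The sketch below only does that arithmetic. The class name `JniBreakEven` is invented, and the costs passed in are placeholders you would have to measure yourself, not real figures.

```java
// Hypothetical cost model; all inputs are values you would have to measure.
final class JniBreakEven {
    /**
     * Smallest element count n for which
     *   overheadNs + n * nativePerElemNs < n * javaPerElemNs.
     * Returns -1 if the native path is never faster.
     */
    static long minElements(double overheadNs, double javaPerElemNs, double nativePerElemNs) {
        double savedPerElem = javaPerElemNs - nativePerElemNs;
        if (savedPerElem <= 0) {
            return -1; // native code is not faster per element, so it can never win
        }
        // n must be strictly greater than overhead / saved
        return (long) Math.floor(overheadNs / savedPerElem) + 1;
    }
}
```

For example, with an assumed overhead of 100 ns and one nanosecond saved per element, the native path only wins for more than 100 elements per call.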
-Sound (software mixers, softsynths, special fx)
-Physics
-AI
-Vertex manipulations, like in some shadow algorithms?
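
A software mixer is a typical instance of the first item in the list: the same multiply-add is applied to every sample, with no dependency between samples. A rough sketch, assuming 16-bit signed PCM in `short[]` buffers and invented names (`SoftMixer`, `mix`):

```java
// Hypothetical software mixer for two 16-bit PCM buffers.
final class SoftMixer {
    /** Mixes two buffers with per-source gain, clamping to the 16-bit range. */
    static short[] mix(short[] a, float gainA, short[] b, float gainB) {
        int n = Math.min(a.length, b.length);
        short[] out = new short[n];
        for (int i = 0; i < n; i++) {
            float s = a[i] * gainA + b[i] * gainB;
            // Saturate instead of letting the value wrap around
            if (s > Short.MAX_VALUE) s = Short.MAX_VALUE;
            if (s < Short.MIN_VALUE) s = Short.MIN_VALUE;
            out[i] = (short) s;
        }
        return out;
    }
}
```

SSE2 has packed 16-bit adds with saturation, so a hand-written native mixer could process eight samples per instruction; the Java loop above just shows the data-parallel shape.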

Offline TheAnalogKid

JGO Coder


Projects: 2



« Reply #12 - Posted 2005-08-31 21:27:41 »

SSE and SSE2 instructions have been used, when available, by the server VM since 1.4.2. See the SDK doc: http://java.sun.com/j2se/1.4.2/changes.html#vm

Quote
I'm sure SIMD instructions can be used whenever you need to do some serious number crunching. In games that might be:
-Sound (software mixers, softsynths, special fx)

I know a lot of things about sound but don't you think that DSPs are better suited for this kind of task?

Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #13 - Posted 2005-09-07 21:35:38 »

SSE and SSE2 instructions are used on single data (so no SIMD stuff here).  The biggest advantages are as follows (at least for the VM)

- More registers (albeit single and double precision fpu registers)
- Don't use the FPU stack (except for trig/transcendentals)

Those above alone are worth quite a bit on FPU heavy programs.  Easing of register pressure especially on register starved Intel CPUs is a big win.
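
If you want to see which SSE level a HotSpot VM is configured to use, one option is to read the `UseSSE` flag through `HotSpotDiagnosticMXBean`. A minimal sketch follows; the flag exists on x86 HotSpot builds, but it may be absent on other architectures or other VMs, so the code treats a missing flag as "unknown". The class name `SseLevel` is made up.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

final class SseLevel {
    /** Returns the value of the HotSpot UseSSE flag, or empty if it cannot be read. */
    static Optional<String> query() {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return Optional.empty();
            }
            return Optional.of(bean.getVMOption("UseSSE").getValue());
        } catch (RuntimeException e) {
            // Flag not present (e.g. non-x86 build) or bean unavailable on this VM
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println("UseSSE = " + query().orElse("unknown"));
    }
}
```

This only reports the configured level; it says nothing about whether a particular method was compiled with scalar or packed instructions.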
Offline TheAnalogKid

JGO Coder


Projects: 2



« Reply #14 - Posted 2005-09-07 22:11:39 »

Sorry but according to wikipedia SSE and SSE2 are actually SIMDs:

SSE:
Quote
SSE (Streaming SIMD Extensions) is a SIMD (Single Instruction, Multiple Data) instruction set designed by Intel, and introduced in their Pentium III series processors as a reply to AMD's 3DNow! debuted a year earlier.

SSE2:
Quote
SSE2 is one of the IA-32 SIMD instruction sets, designed by Intel. It extends the earlier version SSE instruction set, and is intended to fully supplant MMX.

So who is right?

Offline tom
« Reply #15 - Posted 2005-09-07 23:28:38 »

Yes, SSE and SSE2 are of course SIMD instruction sets. I've not used them, so I don't know the details of how you load the registers etc. But nothing prevents you from using only a single element of data. Even though the instruction operates on multiple elements, you just ignore the results for the elements you don't use.

Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #16 - Posted 2005-09-08 18:18:37 »

Sorry but according to wikipedia SSE and SSE2 are actually SIMDs:

Right, but the VM doesn't use the SIMD portion of SSE/SSE2.  You can use SSE/SSE2 registers and instructions on single data.
Offline Raghar

Junior Member




Ue ni taete 'ru hitomi ni kono mi wa dou utsuru


« Reply #17 - Posted 2005-09-08 20:00:13 »

SIMD means single instruction, multiple data. If you filled the rest of the data with 0, you'd be computing single instruction, single data. However, Intel SSE2 doesn't work this way.

SSE and SSE2 have two types of instructions for the majority of work. One operates on the full xmm register, the other only on the data element in the least significant position, like XXXO (O is the computed data element). Of course some instructions aren't exactly computation intensive, like XOR and AND, so they are always done on the full SSE2 register. SQRT and DIV are not as friendly to the CPU, so there are IIRC 3 types of such instructions: one for the exact result, a second for a fast approximation, and a third for a result that is, in most situations, nearly exact.
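
The two instruction flavours can be modelled in a few lines of Java. In the sketch below a `float[4]` stands in for an xmm register, with index 0 as the least significant element. `mulPacked` behaves like a packed multiply (MULPS) and `mulScalar` like the scalar form (MULSS), which computes only the low element and passes the other three through from the destination operand. The class and method names are illustrative only.

```java
// Hypothetical model of an xmm register as float[4], index 0 = least significant element.
final class XmmModel {
    /** Packed form: all four elements are multiplied (models MULPS). */
    static float[] mulPacked(float[] dst, float[] src) {
        return new float[] {
            dst[0] * src[0], dst[1] * src[1], dst[2] * src[2], dst[3] * src[3]
        };
    }

    /** Scalar form: only element 0 is computed; elements 1..3 keep dst's values (models MULSS). */
    static float[] mulScalar(float[] dst, float[] src) {
        return new float[] { dst[0] * src[0], dst[1], dst[2], dst[3] };
    }
}
```

The scalar form is the one a VM can use as a drop-in replacement for x87 arithmetic: one value in, one value out, the rest of the register ignored.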

The biggest benefit of SSE2 instructions is freeing the MMX registers for integer-only work, namely boolean operator work and scratchpad work, without needing to mess with the CPU state via the EMMS instruction. If all FPU work is done in SSE2 registers, then the MMX register state should never change, so there are no accidental stalls, and you get a nice 8 64-bit registers on a 32-bit computer. It also reduces pollution of the L1 cache. Note however that latency might be higher when accessing xmm registers than when accessing standard registers. And of course there is the problem of aligned/unaligned memory loads. (Aligned is twice as fast as unaligned; ideally they should have nearly the same latency.)

(I hope that the above short introduction to SSE2 is without too many errors. I didn't verify it against the Intel manuals.)

I strongly recommend not taking names like SIMD or vector instructions too literally. They are often used just for marketing purposes. For example, SSE2 instructions are sometimes referred to as vector instructions, yet I have never seen a command like "DOT" or "normalize" in Intel's documentation, and nobody has a serious need for them. (Yes, I know this misnomer originated from math and from attempts to unnecessarily import math terms into other areas.)
Also note that Wikipedia isn't exactly a better resource than Intel programs for explaining work on SSE registers, and Intel manuals for P4 family SSE3 assembly instructions.

BTW Azeem Jiva
Is the JVM able to reduce cache pollution? For example, r/w to volatile members should evade the cache completely (on a multi-CPU computer). And what about prefetching?
Offline Raghar

Junior Member




Ue ni taete 'ru hitomi ni kono mi wa dou utsuru


« Reply #18 - Posted 2005-09-09 18:50:11 »

SSE and SSE2 instructions are used on single data (so no SIMD stuff here).  The biggest advantages are as follows (at least for the VM)

- More registers (albeit single and double precision fpu registers)
- Don't use the FPU stack (except for trig/transcendentals)

Those above alone are worth quite a bit on FPU heavy programs.  Easing of register pressure especially on register starved Intel CPUs is a big win.

Actually xmm registers are 128 bits wide. It doesn't matter whether they hold 2 x 64-bit FP values or 4 x 32-bit ints, so it's somewhat misleading to call them FPU registers.
Look at an instruction like paddd xmm1, mem
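
To illustrate that the register is just 128 bits whose meaning depends on the instruction, here is a small sketch that stores two doubles in a 16-byte buffer and reads the same bytes back as four ints, plus a model of PADDD as four independent 32-bit adds. The buffer stands in for an xmm register, and `RegisterViews` and its methods are invented for this example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical model: a 16-byte buffer standing in for one 128-bit xmm register.
final class RegisterViews {
    /** Stores two doubles in 16 bytes (little-endian, as on x86) and reads them back as four ints. */
    static int[] doublesAsInts(double lo, double hi) {
        ByteBuffer reg = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN);
        reg.putDouble(0, lo);
        reg.putDouble(8, hi);
        return new int[] { reg.getInt(0), reg.getInt(4), reg.getInt(8), reg.getInt(12) };
    }

    /** Models PADDD: four independent 32-bit integer adds with wraparound. */
    static int[] paddd(int[] a, int[] b) {
        return new int[] { a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3] };
    }
}
```

The same 16 bytes are valid input for either kind of instruction; only the interpretation changes.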
Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #19 - Posted 2005-09-14 17:01:11 »

Alright, so they aren't FPU registers :) I think of them as that, but you're right, you can put anything you want in them.