Java-Gaming.org    
  floating performance  (Read 13440 times)
Offline K.I.L.E.R

Senior Member




Java games rock!


« Reply #30 - Posted 2005-03-03 09:30:14 »

No idea what XCompile does.
In the end floats are converted to doubles.

Quote



I thought -Xcompile just makes the VM do its optimizations on the first pass so you don't need to wait for a wind-up for the JIT? If so, then it just means your microbenchmark isn't running optimized like a real-world app would be.

If not, what does it do?


Vorax:
Is there a name for a "redneck" programmer?

Jeff:
Unemployed. ;)
Offline EgonOlsen
« Reply #31 - Posted 2005-03-03 09:49:45 »

Quote
No idea what XCompile does.
It forces hotspot to compile all methods before using them. That way, you can make sure that you are not measuring compile time instead of execution time or that you aren't suffering from different compilation behaviours for whatever reason.

Edit: For those who are interested in what hotspot does when: Start your app with -XX:+PrintCompilation
Combine that with -Xcompile and see what happens...
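
To make the point about measuring compile time versus execution time concrete, here is a minimal sketch of a float-vs-double micro-benchmark with an explicit warm-up phase. The class name, method names, loop body and iteration counts are all made up for illustration; nothing here is taken from the benchmark discussed earlier in the thread.

```java
// Hypothetical micro-benchmark: all names and constants are assumptions for this sketch.
final class FloatVsDouble {

    // Sum of i * 0.5 for i in [0, n), accumulated in single precision.
    static float sumFloat(int n) {
        float acc = 0f;
        for (int i = 0; i < n; i++) {
            acc += i * 0.5f;
        }
        return acc;
    }

    // The same computation accumulated in double precision.
    static double sumDouble(int n) {
        double acc = 0.0;
        for (int i = 0; i < n; i++) {
            acc += i * 0.5;
        }
        return acc;
    }

    public static void main(String[] args) {
        final int n = 10_000_000;
        double sink = 0; // consume the results so the loops cannot be discarded

        // Warm-up passes: give the JIT a chance to compile both methods
        // before anything is timed.
        for (int i = 0; i < 20; i++) {
            sink += sumFloat(n) + sumDouble(n);
        }

        long t0 = System.nanoTime();
        sink += sumFloat(n);
        long t1 = System.nanoTime();
        sink += sumDouble(n);
        long t2 = System.nanoTime();

        System.out.println("float:  " + (t1 - t0) / 1_000_000.0 + " ms");
        System.out.println("double: " + (t2 - t1) / 1_000_000.0 + " ms");
        System.out.println("(sink = " + sink + ")");
    }
}
```

Running it once plainly and once with -XX:+PrintCompilation should show whether the timed section overlaps with compilation. As far as I know, the flag referred to in this thread as -Xcompile is spelled -Xcomp on the HotSpot launcher, so check your VM's option list before relying on either name.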

Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #32 - Posted 2005-03-03 12:29:44 »

Quote
OK! So with the Athlon64s - with SSE2 support - can I take it that the JVM will (indeed) use SSE2-style registers, so that the double performance of the Athlon64s will be comparable to P4s?

In general, is the SSE2 performance of the Athlon64 as good as the P4's? And specifically, when running Java apps with the -server option?

And for the couple of reasons mentioned earlier, 1) some CPUs use extended 80-bit precision and 2) some do all the computations in doubles and reconvert to floats (the IBM RS6000 workstation, IIRC, used to do that), is it worth the trouble to stick to floats for speed benefits if memory size is not a consideration?



First a little intro on how the VM works:

The JVM has two sections of code (basically): platform independent and platform dependent code. The platform independent stuff covers things that operate on the bytecodes, the IR, and then the optimizations (parsing, constant folding, loop opts, register allocation, etc).

The platform dependent stuff is basically match rules for instructions. So if the VM requires a Multiply Node (MulNode), the VM matches that to the appropriate rule in the particular architecture. Now this matching part is where the AMD64 hasn't been fully optimized. It's mostly there, but there are parts missing, and things we don't do, etc. So yes, the VM uses SSE2 for AMD64 machines, but we might be doing a few things suboptimally.

I've also heard that the Athlons (XP and 64) have slower SSE performance compared to P4s. It may no longer be true in later revisions of the chip, etc. Heck, I may have heard incorrectly as well. But anyway, as far as the JVM is concerned the AMD64 is just another chip; most of the optimizations are platform independent.

Oh, don't forget the AMD64 VM is a 64-bit VM, while the X86 VM is a 32-bit VM. Internally that means the 64-bit VM has to handle larger pointers, etc. However, the VM gains 8 registers on the AMD64, so overall there is a win in performance.
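
The "match rules" idea can be pictured with a toy lookup table. This is only a sketch: the enum names, the rule table and the `match` method are hypothetical and are not HotSpot's real data structures. The only real-world facts used are the instruction mnemonics (x87 `fadd`/`fmul`, scalar SSE `addss`/`mulss`, and their SSE2 double-precision forms `addsd`/`mulsd`).

```java
import java.util.EnumMap;
import java.util.Map;

// Toy illustration only: hypothetical types, NOT HotSpot's actual matcher.
// A platform-independent IR node is looked up in a per-architecture rule table.
final class MatchRuleSketch {

    enum IrNode { ADD_F, MUL_F, ADD_D, MUL_D }

    enum Arch { X86_X87, X86_SSE2 }

    private static final Map<Arch, Map<IrNode, String>> RULES = new EnumMap<>(Arch.class);

    static {
        // x87 register-stack arithmetic uses the same mnemonics for float and double.
        Map<IrNode, String> x87 = new EnumMap<>(IrNode.class);
        x87.put(IrNode.ADD_F, "fadd");
        x87.put(IrNode.MUL_F, "fmul");
        x87.put(IrNode.ADD_D, "fadd");
        x87.put(IrNode.MUL_D, "fmul");
        RULES.put(Arch.X86_X87, x87);

        // Scalar SSE distinguishes single (ss) from double (sd); the sd forms came with SSE2.
        Map<IrNode, String> sse2 = new EnumMap<>(IrNode.class);
        sse2.put(IrNode.ADD_F, "addss");
        sse2.put(IrNode.MUL_F, "mulss");
        sse2.put(IrNode.ADD_D, "addsd");
        sse2.put(IrNode.MUL_D, "mulsd");
        RULES.put(Arch.X86_SSE2, sse2);
    }

    // Returns the instruction mnemonic that the given architecture's rule selects for a node.
    static String match(Arch arch, IrNode node) {
        String insn = RULES.get(arch).get(node);
        if (insn == null) {
            throw new IllegalStateException("no match rule for " + node + " on " + arch);
        }
        return insn;
    }

    public static void main(String[] args) {
        for (Arch arch : Arch.values()) {
            for (IrNode node : IrNode.values()) {
                System.out.println(arch + " " + node + " -> " + match(arch, node));
            }
        }
    }
}
```

The optimizations described above would run on the `IrNode` side of this split; only the table lookup is architecture-specific, which is why a missing or weak rule on one platform can hurt without the rest of the compiler changing.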
Offline Vorax

Senior Member


Projects: 1


System shutting down in 5..4..3...


« Reply #33 - Posted 2005-03-03 13:08:25 »

Quote

It forces hotspot to compile all methods before using them. That way, you can make sure that you are not measuring compile time instead of execution time or that you aren't suffering from different compilation behaviours for whatever reason.

Edit: For those who are interested in what hotspot does when: Start your app with -XX:+PrintCompilation
Combine that with -Xcompile and see what happens...


Ok, that's what I thought it did.

Offline NVaidya

Junior Member




Java games rock!


« Reply #34 - Posted 2005-03-03 20:05:03 »

Quote


First a little intro on how the VM works:

The JVM has two sections of code (basically): platform independent and platform dependent code. The platform independent stuff covers things that operate on the bytecodes, the IR, and then the optimizations (parsing, constant folding, loop opts, register allocation, etc).

The platform dependent stuff is basically match rules for instructions. So if the VM requires a Multiply Node (MulNode), the VM matches that to the appropriate rule in the particular architecture. Now this matching part is where the AMD64 hasn't been fully optimized. It's mostly there, but there are parts missing, and things we don't do, etc. So yes, the VM uses SSE2 for AMD64 machines, but we might be doing a few things suboptimally.

I've also heard that the Athlons (XP and 64) have slower SSE performance compared to P4s. It may no longer be true in later revisions of the chip, etc. Heck, I may have heard incorrectly as well. But anyway, as far as the JVM is concerned the AMD64 is just another chip; most of the optimizations are platform independent.


Firstly, it's great to have you around here :).

Hmm... so maybe I wasn't way off in suspecting that the AMD64 wasn't giving as good a performance boost as I thought it would, going by the gaming benchmarks dished out by the hardware review sites. And yes, I've also heard that AMD's SSE implementation still lags Intel's. There is a C-based benchmark called ScienceMark (http://www.sciencemark.org), developed by Dr. Wilkens, who I understand is currently with AMD, which can be used for testing, among other things, the SSE and SSE2 performance of the CPU. You have probably heard of it.

Hopefully you folks could get around to implementing the optimized SSE2 for the AMD64 also. What would it take to do that? An RFE perhaps. That takes time, doesn't it? And with AMD's Venice core scheduled to be released this or next quarter, there will be interest too in SSE3-type optimizations.

Thanks

Gravity Sucks !
Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #35 - Posted 2005-03-04 01:57:24 »

Quote


Hopefully you folks could get around to implementing the optimized SSE2 for the AMD64 also. What would it take to do that? An RFE perhaps. That takes time, doesn't it? And with AMD's Venice core scheduled to be released this or next quarter, there will be interest too in SSE3-type optimizations.

Thanks


An RFE is not needed; we know about this and several people (including myself) are working on it. Granted, it's not high priority, as other work is taking up our time. But we'll get to it. I did some research into SSE3, and those instructions don't seem to be well suited to the VM. The only instruction that I could think of might be FISTTP, but I haven't fully explored that area yet, so I'm not sure what else might come in handy.
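
For context on why FISTTP might be the interesting one: it is an x87 store-integer instruction that always truncates, regardless of the rounding mode set in the FPU control word. Java's float-to-int cast also truncates toward zero, so the connection seems plausible, though that is my reading and not something the post states. The sketch below (class and method names are hypothetical) just shows the language-level semantics that any generated code has to honour, including the NaN and out-of-range cases that a bare truncating instruction does not cover by itself.

```java
// Hypothetical helper class; it only demonstrates what the Java language
// requires of a float-to-int cast.
final class TruncatingCast {

    static int toInt(float f) {
        return (int) f; // narrowing primitive conversion: rounds toward zero
    }

    public static void main(String[] args) {
        float[] samples = { 2.7f, -2.7f, Float.NaN, 1e20f, -1e20f };
        for (float f : samples) {
            System.out.println(f + " -> " + toInt(f));
        }
    }
}
```

Positive and negative values both lose their fraction, NaN becomes 0, and values outside the int range clamp to Integer.MAX_VALUE or Integer.MIN_VALUE.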
Offline K.I.L.E.R

Senior Member




Java games rock!


« Reply #36 - Posted 2005-03-04 02:01:29 »

What's RFE stand for?

Vorax:
Is there a name for a "redneck" programmer?

Jeff:
Unemployed. ;)
Offline trembovetski

Senior Member




If only I knew what I'm talking about!


« Reply #37 - Posted 2005-03-04 03:57:05 »

RFE == Request for enhancement
Offline Raghar

Junior Member




Ue ni taete 'ru hitomi ni kono mi wa dou utsuru


« Reply #38 - Posted 2005-03-04 18:46:16 »

I reran the tests and the result was 30 s for floats, 9 s for doubles, on a Prescott Celeron D underclocked to 1.8 GHz. JRockit showed similar performance, just 3x as slow. It seems they didn't do any SSE2 optimizations.
I might try some ASM programs; I would just need to know how to set the FP precision in assembly, as I've never needed to go down from doubles.



Re FP16.
NVIDIA views FP as a number from -1.0 to 1.0. This isn't necessarily very compatible with other FP formats. Raster drawing has a somewhat limited target.
Offline Raghar

Junior Member




Ue ni taete 'ru hitomi ni kono mi wa dou utsuru


« Reply #39 - Posted 2005-03-05 18:18:44 »

I tried it with SciMark and I get nearly 2 GFLOPS in double precision and more than 2.4 GFLOPS in single precision. It's underclocked to 1.8 GHz.
Quite puzzling.

BTW, why does the JVM use MSVC 6.0 for compiling? Isn't there an MSVC Toolkit 2003 (7.1)?
Offline trembovetski

Senior Member




If only I knew what I'm talking about!


« Reply #40 - Posted 2005-03-05 23:54:39 »

We're working on moving to VC 2003 from vc6 for mustang. The 'free toolkit' from MS is not usable, as it's missing some important and widely used libraries, not available for free.

Anyway, keep an eye on mustang builds (https://mustang.dev.java.net/).
Offline Mark Thornton

Senior Member





« Reply #41 - Posted 2005-03-06 12:29:03 »

Quote
The 'free toolkit' from MS is not usable, as it's missing some important and widely used libraries, not available for free.

Now there is a surprise :-(
Offline rreyelts

Junior Member




There is nothing Nu under the sun


« Reply #42 - Posted 2005-03-07 15:50:37 »

Quote
We're working on moving to VC 2003 from vc6 for mustang. The 'free toolkit' from MS is not usable, as it's missing some important and widely used libraries, not available for free.

I'm continually amazed how many people out there are still using VC++ 6.0. That compiler was initially released circa 10 years ago, and it hasn't been patched in like 5. My guess is that the compiler allowed such incredibly broken behavior that the migration efforts to move up to a relatively conforming compiler are very significant.

Off the top of my head, I remember problems with template member functions, for loop conformance, auto_ptrs in containers (no warnings), multiple inheritance with virtual base classes, improperly namespaced classes/functions, non-inlined template definitions (ODR issues)...

It's an example of how Microsoft has really hurt cross-platform development. They focused so much on supporting their class libraries and proprietary funky compiler extensions, meanwhile putting no effort into fixing a horribly broken compiler. :(

I think that they should have been sued in countries like Germany which have very strict laws about advertising and standards conformance. Had they had to put a sticker on their box from the get go which said that their compiler had 152 known compliance issues, I think their market share wouldn't have been so high.

God bless,
-Toby Reyelts

About me: http://jroller.com/page/rreyelts
Jace - Easier JNI: http://jace.reyelts.com/jace
Retroweaver - Compile on JDK1.5, and deploy on 1.4: http://retroweaver.sf.net.
Offline Raghar

Junior Member




Ue ni taete 'ru hitomi ni kono mi wa dou utsuru


« Reply #43 - Posted 2005-03-07 23:03:36 »

Quote
We're working on moving to VC 2003 from vc6 for mustang. The 'free toolkit' from MS is not usable, as it's missing some important and widely used libraries, not available for free.


What libraries?
Offline trembovetski

Senior Member




If only I knew what I'm talking about!


« Reply #44 - Posted 2005-03-08 03:06:44 »

Switching to a new compiler is not a simple task. Just consider the cost alone: we'd need to buy a license for everyone who compiles the code. That's a lot of people in our case.

Another issue is that we already know the bugs in the old compiler, and switching to the new one is always risky since it's very likely that we'll run into new compiler bugs - and believe me, we run into them every time we try even a new compiler revision (i.e. there's a reason the current jdk requires vc6 sp3, not 4), since the jdk is such a huge codebase. So prior to the switch, lots of testing needs to be done - functional, performance, footprint.

And then there's the problem that the new version of the compiler may not even be compatible - as is the case with vc7 (there's a page from ms with the list of incompatibilities), so we actually have to port our code to the new compiler (which in some jdk areas is harder than in others).

But anyway, we're making the switch this time (which I personally am very happy about, as I finally get to use the new tools instead of the 6-year-old v.studio).

Dmitri
Offline phazer

Junior Member




Come get some


« Reply #45 - Posted 2005-03-08 07:11:34 »

Quote
Switching to a new compiler is not a simple task. Just consider the cost alone: we'd need to buy a license for everyone who compiles the code. That's a lot of people in our case.


There's always GCC + Eclipse ;D Works as well as VS in my opinion.

Offline Mark Thornton

Senior Member





« Reply #46 - Posted 2005-03-08 08:28:04 »

Quote

What libraries?

http://weblogs.asp.net/brianjo/archive/2004/04/17/115335.aspx

Scroll to the bottom and look at the last two entries. Evidently the omission of certain libraries is deliberate.
Offline swpalmer

JGO Coder




Where's the Kaboom?


« Reply #47 - Posted 2005-03-10 19:14:20 »

Quote
There's always GCC + Eclipse ;D Works as well as VS in my opinion.


GCC produces extremely poor code for Intel.  In fact last I heard Intel's own compiler was significantly better than what MS was offering, but that was in the VC6 days.

I'm not sure how vc7 stacks up to the Intel compiler, but with vc6 I knew some guys that did very performance sensitive image processing applications and all of their release builds were done with the Intel compiler because they got a significant boost.

Offline phazer

Junior Member




Come get some


« Reply #48 - Posted 2005-03-14 05:45:54 »

Quote


GCC produces extremely poor code for Intel.  


Please provide some links. GCC 3 optimizes much better than the old GCC.

Quote

In fact last I heard Intel's own compiler was significantly better than what MS was offering, but that was in the VC6 days.

I'm not sure how vc7 stacks up to the Intel compiler, but with vc6 I knew some guys that did very performance sensitive image processing applications and all of their release builds were done with the Intel compiler because they got a significant boost.


From what I've read, Intel's compiler is the best (which makes sense), followed by MS and then GCC 3. I think the difference between MS and GCC is small though.

Offline Mark Thornton

Senior Member





« Reply #49 - Posted 2005-03-14 06:11:21 »

Quote
From what I've read, Intel's compiler is the best (which makes sense), followed by MS and then GCC 3. I think the difference between MS and GCC is small though.

In any case the performance of the C++ compiler used to compile a JVM is probably not as important as it is in other applications --- hopefully most of the time will be spent in code generated by the JVM (which will be the same regardless of the C++ compiler used).
Offline princec

JGO Kernel


Medals: 378
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #50 - Posted 2005-03-14 07:21:16 »

Hm, except that in the forthcoming hybrid VM it'd certainly be nice if the optimising JIT compiler part were a bit faster than it currently is.

Cas :)

Offline K.I.L.E.R

Senior Member




Java games rock!


« Reply #51 - Posted 2005-03-14 08:17:37 »

Shouldn't the JIT be built in ASM?
Performance reasons?

Vorax:
Is there a name for a "redneck" programmer?

Jeff:
Unemployed. ;)
Offline princec

JGO Kernel


Medals: 378
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #52 - Posted 2005-03-14 08:58:29 »

Only nutters and Swedes try to write code in assembly language.

Cas :)

Offline kevglass

JGO Kernel


Medals: 164
Projects: 23
Exp: 18 years


Coder, Trainee Pixel Artist, Game Reviewer


« Reply #53 - Posted 2005-03-14 09:08:59 »

Or people trying to prove how great they are.

Kev

Offline Markus_Persson

JGO Wizard


Medals: 15
Projects: 19


Mojang Specifications


« Reply #54 - Posted 2005-03-14 10:28:27 »

Isn't that what princec said?

Play Minecraft!
Offline princec

JGO Kernel


Medals: 378
Projects: 3
Exp: 16 years


Eh? Who? What? ... Me?


« Reply #55 - Posted 2005-03-14 10:37:08 »

Sort of :) It helps to live in perpetual darkness hundreds of miles from any girls of course. The mind plays tricks.

Cas :)

Offline Azeem Jiva

Junior Member




Java VM Engineer, Sun Microsystems


« Reply #56 - Posted 2005-03-14 14:30:48 »

Quote

In any case the performance of the C++ compiler used to compile a JVM is probably not as important as it is in other applications --- hopefully most of the time will be spent in code generated by the JVM (which will be the same regardless of the C++ compiler used).



Correct, the JVM spends most of its time in generated code. We've upgraded the Solaris C++ compilers several times over the past couple of years, and we've never seen any substantial improvements in performance for the VM. Oh, and the new Sun C++ Compilers for Solaris (SS10) are better than the Intel C++ compilers for the X86 :)
Offline Raghar

Junior Member




Ue ni taete 'ru hitomi ni kono mi wa dou utsuru


« Reply #57 - Posted 2005-03-14 15:55:51 »

I'm not a Swede, and I consider macro assembly simpler and more intuitive than C++. BTW, they have women in Sweden; they just don't care about them as much as the women would like, or so they say.
While writing the JIT in assembly wouldn't be very beneficial, it could possibly help by giving cleaner code and smaller executables. It would, however, help if JIT writers had heavy experience with optimization in assembly, for example in keeping critical parts of the code in L1 cache, or in estimating whether one operation in SSE2 registers would be faster than 4 operations in general-purpose registers.
It certainly wouldn't hurt if someone played with assembly for a week and created a GUI application or a JNI library in it. At least he wouldn't talk nonsense like the link above: "there is no lib.exe, so we can't create DLLs". Every respectable programmer uses link.exe with a nice library definition file. (It's much cleaner than decorating method names with compiler hints.)
That reminds me... What would the compiled version of this code look like:
for (int c1 = 0; c1 < array.length; c1++) {
    // something
}
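
The question is not answered in the thread. As a rough aid, here is the loop in runnable form next to a hand-hoisted variant; the class name, method names and the summing body are assumptions standing in for "something". An optimizing JIT is generally expected to treat `array.length` as loop-invariant for this canonical loop shape and can often drop the per-element bounds check, so the two versions usually end up equivalent, but that is a general expectation and not a statement about any particular VM build.

```java
// Hypothetical example: "something" is replaced by a simple sum.
final class ArrayLoop {

    // The loop exactly as written in the post, re-reading array.length in the condition.
    static long sum(int[] array) {
        long total = 0;
        for (int c1 = 0; c1 < array.length; c1++) {
            total += array[c1];
        }
        return total;
    }

    // The same loop with the length read once, i.e. what a compiler's
    // loop-invariant hoisting would do to the version above.
    static long sumHoisted(int[] array) {
        long total = 0;
        final int len = array.length;
        for (int c1 = 0; c1 < len; c1++) {
            total += array[c1];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = { 1, 2, 3, 4 };
        System.out.println(sum(data) + " " + sumHoisted(data));
    }
}
```

Starting the VM with -XX:+PrintCompilation, as suggested earlier in the thread, at least shows when each method gets compiled.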


The difference between GCC and MSVC isn't small, however. In a test of mine from half a year ago, GCC was close to the JVM, or rather somewhat below it for my purpose (translated code that is not easily optimisable); MSVC was 1/5 faster.
Offline swpalmer

JGO Coder




Where's the Kaboom?


« Reply #58 - Posted 2005-03-15 02:43:59 »

Quote
Correct, the JVM spends most of its time in generated code. We've upgraded the Solaris C++ compilers several times over the past couple of years, and we've never seen any substantial improvements in performance for the VM. Oh, and the new Sun C++ Compilers for Solaris (SS10) are better than the Intel C++ compilers for the X86 :)


The important bits here would be in things like software blitting loops and that sort of thing, which are native anyway (I assume). But to be honest, that is where going to assembler would make a lot of sense. Coding software blits, stretches etc. should be done in vectorized code... hand-tuned SSE2 and the like. This is one of my ongoing rants that started when I realized how pathetically slow the JPEG loader is in the JRE. I bet going to a proper JPEG loader (e.g. Intel's old jpeg library, or using their new DSP code that is optimized for the SSE2 instructions) would improve GUI startup time for an application that used JPEG images for button icons - just because the current loader is so extremely slow. (Can you tell I have a need to play motion JPEG in my Java UI? :) And no, JMF is too broken to use for that sort of thing.)

I'm not sure how much assembler would help things like the ZIP deflating code, but something as fundamental as loading your resources from a compressed JAR should be optimized.

Offline swpalmer

JGO Coder




Where's the Kaboom?


« Reply #59 - Posted 2005-03-15 02:48:03 »

Quote
Please provide some links. GCC 3 optimizes much better than the old GCC.


I don't have links; this was information from coworkers who had experience with MSVC, GCC, and Intel compilers. And as I said, it is out of date, as they were comparing to VC6.

Of course I've also seen that Java 1.4.2 is faster than VC6 for some simple things - like basic conversion from YUV to RGB colors and printing the result. :) Microbenchmarks can be fun :)
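
For anyone wanting to try that kind of microbenchmark, here is a minimal sketch of a per-pixel YUV to RGB conversion. The post does not say which YUV variant was used, so this assumes the full-range JFIF (BT.601) YCbCr equations; the class and method names are hypothetical.

```java
// Hypothetical sketch assuming full-range JFIF / BT.601 YCbCr input (0..255 per channel).
final class YuvToRgb {

    private static int clamp(int v) {
        return v < 0 ? 0 : (v > 255 ? 255 : v);
    }

    // Converts one pixel and packs the result as 0xRRGGBB.
    static int toRgb(int y, int cb, int cr) {
        double d = cb - 128;
        double e = cr - 128;
        int r = clamp((int) Math.round(y + 1.402 * e));
        int g = clamp((int) Math.round(y - 0.344136 * d - 0.714136 * e));
        int b = clamp((int) Math.round(y + 1.772 * d));
        return (r << 16) | (g << 8) | b;
    }

    public static void main(String[] args) {
        // Neutral chroma (128, 128) should give a pure grey of the same luma.
        System.out.println(Integer.toHexString(toRgb(128, 128, 128)));
    }
}
```

A fair timing run would call `toRgb` over a large pixel array after a warm-up phase and keep the printing outside the timed region, since console output tends to dominate otherwise.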
