Java-Gaming.org    
  What is this OpenCL sorcery?  (Read 1547 times)
Offline DrewLols

Senior Member


Medals: 1
Projects: 1


Noob going through metamorphosis...


« Posted 2014-02-01 00:39:54 »

Hah!  Okay, so I've finally got a decent handle on writing multithreaded game loops. I actually just finished a processor class similar to the concurrent package's ExecutorService.  In my case, I can see a performance boost on tasks that take at least 1 millisecond to complete on a single core, but I digress.  I consider this relevant because graphics cards tend to have thousands of cores compared to my 'pitiful' oct core cpu.  Do they call them oct cores?  Meh...

What I'm confused about is how hardware that is created for the sole purpose of accelerating graphics can accelerate cpu instructions.  I've seen, for instance, that OpenCL can be used to process elements in an array assuming that order is arbitrary.  It's like a for loop with i as your index, but you don't really know what the value of i is.  You just know that i will go over all indices at some point, and you're free to do your processing utilizing this black box sorcery that we call OpenCL.
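A rough CPU-side analogy for that "loop where you don't know i" idea, sketched with Java's parallel streams rather than OpenCL itself. The class and method names are made up for illustration, and nothing here touches the GPU; it only shows the shape of the programming model, where the body is written once per index and the runtime decides the order.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class UnorderedLoop {
    // Squares every element. Each index is handled independently, in no
    // guaranteed order, and possibly on different threads.
    static float[] squareAll(float[] in) {
        float[] out = new float[in.length];
        IntStream.range(0, in.length)
                 .parallel()
                 // The lambda plays the role of a kernel body;
                 // i is roughly what OpenCL calls the global work-item id.
                 .forEach(i -> out[i] = in[i] * in[i]);
        return out;
    }

    public static void main(String[] args) {
        float[] result = squareAll(new float[] {1f, 2f, 3f, 4f});
        System.out.println(Arrays.toString(result));
    }
}
```

This only works because no iteration reads what another iteration writes, which is the same restriction that makes the OpenCL version possible.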

Now, I can't argue with results!  I think it would be interesting if Java were to include an OpenCL binding in its distributions for zealous performance freaks like myself.  However!  I always thought that the GPU was made for a more particular task (rendering), and that's why it has always been faster at rendering, and only rendering.  That's the only thing that perplexes me.

Edit:  I have seen a couple of OpenCL bindings.  Even LWJGL has it included, which I find interesting.  It's still not ideal seeing as it requires JNI, but it's still a perk.  I needed to make this relevant to Java somehow, so that's why I mentioned this xD

Did you know that 90% of statistics are wrong?
Offline Danny02
« Reply #1 - Posted 2014-02-01 00:59:45 »

GPUs have had a unified shader architecture for a long time now (for Nvidia, since the GeForce 8000 series), which means that there are only a few special-purpose calculation units left on these chips. With DX10 cards, this architecture became accessible to the public for the first time.

That GPUs are better equipped than CPUs for graphics calculations comes from the different type/concept of their architecture, not, as in the old days, from specialized hardware built for a single purpose.

GPUs don't have thousands of cores, but a few hundred, which can handle a lot (this is where the thousands come into play) of lightweight threads. Another difference is the GPU's fast memory access. The GPU can access its gigabytes of VRAM several times faster than the CPU can access RAM. Combine this with multiple levels of intelligent caches and you have a beast of a machine that can crunch through gigabytes of data extremely fast.


ps: there are OpenCL bindings for Java, just as there are for OpenGL. There is also a quite handy lib from AMD called Aparapi, which can convert normal Java bytecode to OpenCL kernels on the fly, with a fallback to a normal Java thread pool.
Offline HeroesGraveDev

JGO Kernel


Medals: 238
Projects: 11
Exp: 2 years


┬─┬ノ(ಠ_ಠノ)(╯°□°)╯︵ ┻━┻


« Reply #2 - Posted 2014-02-01 01:00:38 »

GPUs are good at math.
CPUs are good at branching.

Offline DrewLols



« Reply #3 - Posted 2014-02-01 01:12:56 »

Ah!  I have heard of Aparapi.  I would still like to see OpenCL code run on the JVM, though.  Then maybe the "Java is slow" myth would end?  Hah!  Just kidding...

Offline theagentd
« Reply #4 - Posted 2014-02-01 02:26:57 »

GPUs are becoming faster in a way that consumer CPUs can't until we get proper threading in our day-to-day programs. CPU performance isn't improving the way it did 10 years ago. Back then we were still following Moore's "law", where the number of transistors per unit area (and usually also performance) doubled every two years, leading to an exponential increase in performance. Then we hit somewhere around 2.5-3.0 GHz and the performance increase pretty much stopped. Sure, we're still getting slight increases in clock speeds and better architectures with sophisticated branch prediction etc., but all in all we're not seeing the same rate of increase nowadays. Instead, we've switched to having more cores. The reason lies in heat. Double the clock rate and increase the voltage to keep the CPU stable and you get between 4x and 8x as much heat. That's where multicore solutions come in.
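The 4x to 8x figure lines up with the usual rule of thumb that dynamic power scales roughly as P ~ C * V^2 * f. That approximation, and the voltage factors of sqrt(2) and 2 used below, are assumptions added for illustration and not numbers taken from any datasheet. A tiny sketch of the arithmetic (class and method names are hypothetical):

```java
public class ClockHeat {
    // Relative dynamic power under the approximation P ~ C * V^2 * f.
    // Both arguments are scale factors relative to a baseline chip,
    // so (1.0, 1.0) means "unchanged" and returns 1.0.
    static double relativePower(double clockScale, double voltageScale) {
        return clockScale * voltageScale * voltageScale;
    }

    public static void main(String[] args) {
        // Double the clock, voltage untouched: about 2x the heat.
        System.out.println(relativePower(2.0, 1.0));
        // Double the clock, voltage up by sqrt(2): about 4x.
        System.out.println(relativePower(2.0, Math.sqrt(2.0)));
        // Double the clock, voltage doubled as well: about 8x.
        System.out.println(relativePower(2.0, 2.0));
    }
}
```

Read the other way around, halving the clock and lowering the voltage to match frees up enough of the power budget to run several slower cores in place of one fast one.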

Given this, it's obvious that instead of having one fast core, it'd be much more efficient to have 4-8 cores running concurrently at half the clock speed. However, CPUs and most programming languages only utilize a single core unless you manually split up the work between cores. Games and other consumer programs have traditionally been pretty bad at scaling beyond one or two cores, so the CPU makers have been forced to squeeze as much performance as possible out of each core, despite the fact that it's inefficient to do so.
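To make "manually split up the work" concrete, here is a minimal sketch using the standard java.util.concurrent classes. The SplitSum class and its chunking scheme are illustrative assumptions, not code from anyone in this thread, and it assumes workers is at least 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SplitSum {
    // Sums the array by handing each worker thread one contiguous chunk.
    static long parallelSum(int[] data, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (data.length + workers - 1) / workers; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(data.length, w * chunk);
                final int to = Math.min(data.length, from + chunk);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = from; i < to; i++) {
                        sum += data[i];
                    }
                    return sum;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that chunk is done
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[1_000_000];
        java.util.Arrays.fill(data, 1);
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(parallelSum(data, cores));
    }
}
```

All of the chunking, submitting, and joining is bookkeeping the programmer has to write by hand, which is part of why so much software never bothers.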

The first "graphics cards" were simply specialized additional single core CPUs you could plug into your motherboard which the (main) CPU could offload 2D graphics tasks to. They weren't even that fast, usually using less power than the main CPU, but they could give a solid performance boost since the CPU was free to do other things. As graphics got more advanced and started to venture into proper 3D, the manufacturers realized that rasterizing and shading were extremely easy to parallelize. Two pixels can be independently calculated on two different cores. (This was before graphics cards also handled transforming vertices.) Then the rendered resolution started to increase to the point where the number of pixels was so large that using multiple cores became more viable. 15 years ago Nvidia coined the term "GPU" when it released the GeForce 256, which featured a grand total of 4 pixel pipelines. It also featured hardware support for vertex transformations, but this hardware was slower than a decent CPU. After this, the number of pixel shaders and vertex shaders gradually increased, and for Nvidia the number of pixel shaders and vertex shaders eventually reached 24 and 8 respectively in the GeForce 7900 GTX in 2006 (which is a slightly faster version of the PS3's GPU). At this point, GPUs were becoming so powerful, and had so much memory, that new rendering techniques started appearing.

The flexibility of programmable shaders led to a new lighting technique called "deferred shading". Deferred shading splits up lighting into two passes. In the first pass, the geometry pass, you store the data you'll need for lighting (diffuse color, normals, shininess...) for each pixel into a huge buffer. In the second pass, the lighting is done by rendering the volume of the light, reading the lighting data for each pixel the volume intersects and computing the lighting. The key here is the unbalanced workload. In the first pass, the number of triangles processed was huge, but the pixel shader was essentially just a copy which depended on bandwidth, not number crunching. In the second pass, the workload flipped. Now we had very few triangles, but we had millions of pixels to light instead. In essence, only half the GPU was working at any given time. The first pass had heavy load on the vertex shaders and the second one had heavy load on pixel shaders. This led Nvidia and AMD to move on to a unified architecture where GPUs had only one type of shader/core, which could handle both vertex and pixel shading, instead of two. That allowed the GPU to load-balance between vertex and pixel processing and adapt to the uneven load. Now, I'm not saying that deferred shading was the only reason they made this move, but for games that was probably the biggest reason.
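The two passes can be mimicked on the CPU with a toy sketch like the one below. It is only an illustration under heavy assumptions: a 4x4 "screen", made-up surface values, a rectangle standing in for the light volume, and a single light pointing straight down the z axis. A real renderer would do this in shaders, and all names here are hypothetical.

```java
public class DeferredSketch {
    static final int W = 4, H = 4;

    // The "huge buffer" (G-buffer): per-pixel data saved by the geometry pass.
    static final float[] albedo = new float[W * H];
    static final float[] normalZ = new float[W * H];

    // Pass 1: store surface attributes only. No lighting math happens here,
    // which is why this pass is mostly limited by bandwidth.
    static void geometryPass() {
        for (int p = 0; p < W * H; p++) {
            albedo[p] = 0.5f;  // made-up diffuse colour
            normalZ[p] = 1.0f; // made-up normal facing the camera
        }
    }

    // Pass 2: for every pixel covered by the light's screen rectangle
    // [x0, x1) x [y0, y1), read the G-buffer and compute simple diffuse lighting.
    static float[] lightingPass(float intensity, int x0, int y0, int x1, int y1) {
        float[] lit = new float[W * H];
        for (int y = y0; y < y1; y++) {
            for (int x = x0; x < x1; x++) {
                int p = y * W + x;
                lit[p] = albedo[p] * Math.max(0f, normalZ[p]) * intensity;
            }
        }
        return lit;
    }

    public static void main(String[] args) {
        geometryPass();
        float[] lit = lightingPass(2.0f, 0, 0, 2, 2); // light covers the top-left 2x2 pixels
        System.out.println(java.util.Arrays.toString(lit));
    }
}
```

Even in this toy version the imbalance is visible: the first pass does almost no arithmetic, and the second does all of it.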

And that's pretty much where we are today. GPUs still contain a lot of fixed functionality hardware, like rasterizers that "fill" pixels that are covered by triangles, and raster output units which handle blending and the conversion and writing of the resulting pixel color, but the thousands of cores that GPUs have are now so flexible that they can be used for almost anything that can be parallelized. People are running physics engines, ray tracing, etc on GPUs nowadays. A simple for-loop where each object is processed independently can easily be run on multiple cores. It's what's called an embarrassingly parallel problem, which is basically a problem that can be split up into a large number of independent tasks that can be processed in parallel. Like pixel shading.

Well, I hope someone found that interesting. I just enjoy writing this stuff, I guess.

Myomyomyo.
Offline DrewLols



« Reply #5 - Posted 2014-02-01 03:00:22 »

Yikes!  You wrote more than I did!  That's some dedication.  To be honest with you, I don't know too much about the history of hardware in general.  I'm just a college student.  I DID know about the slowing down of Moore's law, though.  I've heard that a modern transistor gate is... 20 atoms across, unless I'm wrong.

On an almost unrelated note, this flash submission has a transistor gate in there somewhere... among other mind blowing things...
http://htwins.net/scale2/

I couldn't resist.

Offline gouessej
« Reply #6 - Posted 2014-02-11 12:32:35 »

Then maybe the "Java is slow" myth would end?  Hah!  Just kidding...
This myth is still around, but it was already very wrong even before 2004. Moreover, JogAmp has a nice OpenCL binding (JOCL) supporting both desktop and mobile environments, including Android.

Offline theagentd
« Reply #7 - Posted 2014-02-11 13:19:06 »

Moreover, JogAmp has a nice OpenCL binding (JOCL) supporting both desktop and mobile environments including Android
OpenCL on Android?!

Offline SHC
« Reply #8 - Posted 2014-02-11 13:28:36 »

OpenCL on Android?!

This post shows how to use OpenCL in an Android app.

http://www.pgroup.com/lit/articles/insider/v4n2a3.htm

Offline gouessej
« Reply #9 - Posted 2014-02-11 17:30:08 »

Moreover, JogAmp has a nice OpenCL binding (JOCL) supporting both desktop and mobile environments including Android
OpenCL on Android?!
Yes, but it will become more and more interesting over time as mobile GPUs get more capable. The main contributor of JogAmp and maintainer of JOCL did a very nice job.

Offline Gibbo3771
« Reply #10 - Posted 2014-03-24 10:37:12 »

*Snip*

Dat read. Worth a read for anyone here.

"This code works flawlessly first time and exactly how I wanted it"
Said no programmer ever
Offline BurntPizza
« Reply #11 - Posted 2014-03-24 16:10:38 »

Thanks for that Gibbo, somehow I missed that post.

Agent, do you have a blog somewhere? 'Cause I for one would read the heck out of it.
Quote
Well, I hope someone found that interesting. I just enjoy writing this stuff, I guess.

Keep doin' what you do. It's good stuff.
Offline theagentd
« Reply #12 - Posted 2014-03-24 17:27:57 »

Agent, do you have a blog somewhere? 'Cause I for one would read the heck out of it.
Quote
Well, I hope someone found that interesting. I just enjoy writing this stuff, I guess.

Keep doin' what you do. It's good stuff.
Sadly I don't really do blogs. Maybe I should? xd

Offline Hermasetas

Senior Member


Medals: 6
Projects: 2
Exp: 3 years


I do gamez, yes!


« Reply #13 - Posted 2014-03-26 04:22:51 »

Sadly I don't really do blogs. Maybe I should? xd

Oh please do!
Your post was really well written and interesting.
Offline Roquen
« Reply #14 - Posted 2014-03-26 10:36:40 »

Let me note again.  Do you "really" want OpenCL and/or automagic moving of general purpose code to the GPU?  For a game runtime?  Most likely you don't.  If you're attempting to push the limit, you're going to need to juice out GPU cycles to render your scenes.  Chances are you're going to want to do the opposite... take things that are easy to compute on the GPU and perform some of them on the CPU instead (like software occlusion).