What I did today  (Read 1914409 times)
Offline dime26
« Reply #5610 - Posted 2017-07-15 11:49:32 »

Quote from: theagentd
Spent approximately 20 hours split between today and yesterday working on........ something I'm not going to reveal right now. It's AFAIK something new that hasn't been done before with the quality I'm getting. I'll be writing one of my super long posts on this tomorrow, but right now it's like 10 in the morning and I REALLY need to sleep.

Reminded me of Silicon Valley  Smiley
Online theagentd
« Reply #5611 - Posted 2017-07-16 02:41:44 »

OK, I feel ready to show this shit off. Spent essentially the entirety of today on it too. =___=

Basically, I've been spending quite a lot of time trying to get good quality motion blur and depth of field working. We really needed motion blur for WSW, so I tried very hard to implement the (then very new) algorithm described in A Reconstruction Filter for Plausible Motion Blur, an excellent algorithm. We used it for quite some time with great success, but it wasn't perfect. Some nice improvements and optimizations were floating around on the internet, which I implemented as I found them, but there was always one glaring issue with that algorithm: it had a really bad discontinuity at the edges of objects, which can clearly be seen in the bottom picture of Figure 5 in the original paper.
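For anyone unfamiliar with the technique: the core of the reconstruction filter is to blur each pixel along the dominant velocity of its tile, weighting each sample by a soft depth/velocity comparison. A minimal GLSL sketch of that idea (illustrative names only, not the actual WSW code):

uniform sampler2D colorTex;    // sharp scene color
uniform sampler2D depthTex;    // linear depth
uniform sampler2D tileVelTex;  // dominant velocity per tile, dilated

const int SAMPLES = 16;

float cone(float dist, float speed) { // how far a sample's blur "reaches"
    return clamp(1.0 - dist / max(speed, 1e-4), 0.0, 1.0);
}

vec3 motionBlur(vec2 uv, vec2 texelSize) {
    vec2 tileVel = texture(tileVelTex, uv).xy; // motion in pixels
    float centerDepth = texture(depthTex, uv).x;
    vec3 sum = texture(colorTex, uv).rgb;
    float weightSum = 1.0;
    for (int i = 0; i < SAMPLES; i++) {
        float t = ((float(i) + 0.5) / float(SAMPLES)) * 2.0 - 1.0; // -1..1 along the velocity
        vec2 suv = uv + tileVel * t * texelSize;
        float dist = length(tileVel) * abs(t);
        // Samples in front of us blur over us based on their own velocity;
        // samples behind us only show through based on our own velocity.
        // This hard front/back split is where the edge discontinuity comes from.
        float w = texture(depthTex, suv).x < centerDepth
                ? cone(dist, length(texture(tileVelTex, suv).xy))
                : cone(dist, length(tileVel));
        sum += texture(colorTex, suv).rgb * w;
        weightSum += w;
    }
    return sum / weightSum;
}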

Next Generation Post Processing in Call of Duty: Advanced Warfare was the next big source of information for me. I STRONGLY recommend that everyone interested in postprocessing take a look at that presentation, as it contains a huge number of really great images, videos, explanations and even shader code. For motion blur, they added a weight renormalization pass to fix the depth discontinuity, making the motion blur much more convincing around the edges of moving objects. However, the really amazing thing in that presentation was their depth of field implementation. It had extremely good quality and worked similarly to the motion blur: it split the scene into a background and a foreground, defined per tile. It looked really great in the screenshots and the example video in the slides, but... I just couldn't get it to look good.

The foreground/background classification didn't seem to work well at all. As the classification was done based on the minimum depth of a tile, a tile with a low minimum depth could cause neighboring tiles to classify the entirety of their contents as background. This caused tile-sized snaps and shifts as the scene moved, which was just unacceptable to me. So I tried doing the classification per pixel instead, which solved some problems, but now an out-of-focus foreground object couldn't blur out over other in-focus objects. Murr murr...

It took me a very long time to find a solution, but two days ago I finally did. The problem with the foreground/background system is that the pixel you're looking at needs to go into either the foreground or the background. If it goes into the foreground, things that are in front of the pixel can't blur over it. Similarly, if it's in the background that also causes issues in certain cases. The solution was to add a third class, the "focus", which involves things at around the same depth as the pixel we're looking at. This fixed the problem in all cases!
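In code, that three-way split might look something like this (a hedged sketch; the threshold and names are invented for illustration):

const float FOCUS_RANGE = 0.5; // depth band treated as "same depth", in view-space units

// 0 = foreground (may blur over the center pixel), 1 = focus, 2 = background
int classify(float sampleDepth, float centerDepth) {
    if (sampleDepth < centerDepth - FOCUS_RANGE) return 0; // in front: allowed to cover us
    if (sampleDepth > centerDepth + FOCUS_RANGE) return 2; // behind: only fills uncovered alpha
    return 1;                                              // same depth: blended with us directly
}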

No DoF:


DoF, focus on head:


DoF, focus further away:


There was just one major issue: the DoF didn't work together with the motion blur algorithm! No matter which one you apply first, as soon as you've blurred the image, the depth buffer becomes useless for classification. This means color bleeding over edges wherever motion blur and depth of field are both applied to the same place.

Now, the story above isn't entirely chronological. Some months ago, I theorized it would be possible to combine both DoF and motion blur into a single postprocessing pass. They're both blurs; I just need to figure out how to get them to work together. How hard can it be? ...... Don't get me started, it was f**king hard, but two days ago I managed to get it right. DoF's classification system is essentially a more advanced version of what the original motion blur algorithm does, so using it for motion blur is fine (see the sketch at the end of this post). I had a crazy amount of issues with the alpha calculations between the layers, and even more issues with the sampling patterns, but in the end I actually managed to get it working. Behold, the results of my work.

Combined DoF and motion blur.



I'm... exhausted as hell. But it was worth it I guess. Now there's just a massive amount of work left to optimize this. If anyone cares enough to want to implement this themselves, I can write an article on how the DoF+MB works.
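Until then, the observation that makes the combination thinkable is that both effects are just differently shaped blurs. One illustrative way to unify the sampling shapes (a sketch of the idea only, not the actual implementation):

// DoF smears a pixel over a disc of radius coc; motion blur smears it along a
// line of length mvLen. Sweeping the disc along the motion vector covers both
// shapes with a single sampling loop.
vec2 unifiedSampleOffset(vec2 discSample,  // point in the unit disc
                         float t,          // -0.5..0.5, position along the motion
                         vec2 motionDir,   // normalized dominant motion direction
                         float mvLen,      // motion blur length in pixels
                         float coc) {      // circle-of-confusion radius in pixels
    return discSample * coc + motionDir * (mvLen * t);
}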


Offline Slyth2727
« Reply #5612 - Posted 2017-07-16 03:11:07 »

Incredible @theagentd. The combination of the blur passes is a really neat concept, I'm definitely going to play around with that (if you don't mind!). As always, amazing work.
As a side note, I learned about and played with metaballs yesterday. Currently working on a DE (distance estimator) formula for them so I can raytrace 3D ones. Trying to work it out myself for the practice and fun.
https://www.shadertoy.com/view/4s2fDz
Offline cylab
« Reply #5613 - Posted 2017-07-16 12:09:55 »

@theagentd video or it didn't happen  Tongue Pointing

Seriously. Great stuff. If you come around captioning a vid, would be great and much appreciated!

Online theagentd
« Reply #5614 - Posted 2017-07-16 20:43:51 »

Quote from: Slyth2727
Incredible @theagentd. The combination of the blur passes is a really neat concept, I'm definitely going to play around with that (if you don't mind!). As always, amazing work.
Thanks!

Quote from: cylab
@theagentd video or it didn't happen  Tongue Pointing

Seriously. Great stuff. If you come around captioning a vid, would be great and much appreciated!
Thanks! What do you mean by "captioning a video"? A video with explanations...?

Offline Riven (Administrator)
« Reply #5615 - Posted 2017-07-16 22:41:41 »

I think he meant capturing Pointing

Offline orange451
« Reply #5616 - Posted 2017-07-17 05:03:05 »

Reworked the decal shader in my engine. Decals can now modify specularity and glossiness. So... SSR can reflect from decals now!





I'm still trying to figure out how to correctly get them to modify the normals. The problem is that decals are technically "planes", so if they are on a non-planar surface (but still have depth so they draw with "thickness"), they will break the normal of the surface (the decal normals point in the direction of the plane, not the way the underlying surface is really oriented).

Offline Stampler
« Reply #5617 - Posted 2017-07-17 13:21:08 »

Started working on a small Android game with two friends. We used to talk about making games a lot in school but never started. Now we have jobs and actually use the skills we learn there to develop in our free time.


Online theagentd
« Reply #5618 - Posted 2017-07-18 02:10:28 »

Jesus H. Christ, I've been in a constant cold sweat / depression over finding out that there was a case my combined DoF/MB implementation couldn't handle. Lay sleepless at night, almost walked into shit while going to the store. x___X

If the entire scene is out of focus, the result should approach a simple circle blur (think box blur, but with a circle shape instead), but my triple-classification system didn't approach that for blurry scenes. This resulted in weird outlines of the out-of-focus objects on the screen. I could tweak the ranges of the different layers, but that essentially amounted to disabling the third layer, which brought the original issues back. The problem was simply inherent to using a third classification, and therefore unfixable. Hence, I went back to basics and reimplemented the original depth of field from the paper, with the additions I had made to support motion blur. I tried using the minimum depth of each tile again, but that just confirmed the original issue I had discovered there:



The per-tile depth had to go; there was no way that was ever going to work. Each sample needed to be classified relative to the current pixel. So I changed the depth I did the classification against to the depth of the center pixel, but this just amplified the issue further: objects could no longer blur over their edges at all if the background was sharp. Well, shit. OK, I'll just change the center pixel to be classified as background too. Wait a minute, that actually looks... correct? With just two classification types? Then I enabled motion blur, and was immediately shot down as massive dark strips appeared around the edges of motion-blurred objects. Crap. But... how can that even be possible? Motion blur and DoF are essentially the same thing; the only difference is the shape of the blur area and how the blur radius/length is calculated. I banged my head against it and finally found it: a really stupid bug in the background weight calculation of motion-blurred pixels. Arghhhh!!! Shocked

I fixed it, and I'm now back to only two classifications, and it seems to handle every case I expect it to handle. It still suffers from the inherent limitations of working in screen space as a postprocessing effect: a blurry object essentially becomes see-through around the edges, which should reveal the background, but that information simply isn't there in the original sharply rendered image. Hence, the background is reconstructed by taking the colors of nearby pixels and averaging them together. You can actually see that in effect in the image above: the reconstructed background around the edge of the blurry robot looks good in most cases, but particularly at the horizon, as well as in some other minor cases, the reconstruction simply doesn't look correct. There's no way of fixing that without actually feeding more data into the blurring algorithm, which would require generating that additional data as well, using something like depth peeling (and then lighting and processing all the layers). That, or raytracing, and I'm pretty sure neither of those two will happen anytime soon.

In addition, there are some limitations in how the motion blur works. It still suffers from the same limitations as the original algorithm, as it only blurs in one direction at a time: the dominant motion vector of each tile. This can cause issues with overlapping motions around edges, which become more apparent the higher maxMotionBlurDistance and maxDofRadius are. "Luckily", higher distances/radii cause such huge performance hits that these will stay limited anyway. =P

I'll be making a video after doing some more testing and reimplementing some of the optimizations I made, but I'm gonna take it easy today and go to bed early (and maybe even get some proper sleep).

Offline Slyth2727
« Reply #5619 - Posted 2017-07-18 04:53:13 »

Whipped up a little orbit trap shader after reading up on it. I'm also trying to figure out how I can get a linearly increasing value in pure GLSL, or at least a somewhat smooth sine function that doesn't speed up a ridiculous amount. I was thinking that I could use sin + cos (ya know a circle) and step across a circle instead of the exponential endpoints that just sin produces but I can't seem to get that working. Any ideas?

Orbit shader for those interested: https://www.shadertoy.com/view/lsjBD1
Offline orange451
« Reply #5620 - Posted 2017-07-18 23:31:39 »

Revisiting the gen code for my project. Making it room-based now to allow for a more realistic building feel Smiley


Offline matt_p
« Reply #5621 - Posted 2017-07-19 09:27:37 »

Added a progress bar for the generic loading screen


Added health bars to menu


... and battle Smiley


Also, something not screenshotable (I'll try to capture a GIF at some point): facesets for dialog boxes can now be animated, in two modes:
Either all the time (e.g. one normal frame and one closed-eyes frame, displaying the first one most of the time and occasionally switching to the second for a second)
Or only while the text changes (e.g. an animation that moves the mouth, so the mouth only moves while the text is appearing and updating)
Offline J0
« Reply #5622 - Posted 2017-07-19 10:43:14 »

Not exactly today, but over the course of the past week I've really started improving my test webcomic website, most notably adding comments and animated panels Cheesy Still a lot of bugs, but I believe I'm on the right path Smiley

Offline orange451
« Reply #5623 - Posted 2017-07-19 23:03:15 »

Starting to re-implement the gen!



This is going to take a LONG time!

Offline kingroka123
« Reply #5624 - Posted 2017-07-20 04:21:59 »

Here's a quick and dirty look at the game I've been working on for the past month.
<a href="http://www.youtube.com/v/YMPYCGtUCSA?version=3&amp;hl=en_US&amp;start=" target="_blank">http://www.youtube.com/v/YMPYCGtUCSA?version=3&amp;hl=en_US&amp;start=</a>

Deep (working name) is a roguelike dungeon crawler with a robust and creative spell system (which I am still developing, I'll make a post here once the spell system is finalized).
Offline dime26
« Reply #5625 - Posted 2017-07-20 07:52:13 »

Big refactor job of the current code base, creating classes etc. Also blogged for the first time in a while, to capture the posts I have made to this thread recently: https://carelesslabs.wordpress.com/2017/07/19/zero-hour-gamedev.

Not quite ready to post my game to the "wip-games-tools-toy-projects" section just yet.

Offline Slyth2727
« Reply #5626 - Posted 2017-07-20 23:09:19 »

Just moved into my first dorm room. Going to court in a couple weeks to try to get a couple possession charges dropped and if that succeeds I'll have completely moved on to my new life. Things are going great right now!
Online theagentd
« Reply #5627 - Posted 2017-07-21 05:56:05 »

Depth of field + motion blur video! Sorry about the Skype notification sound(s) in the video! >___< The video can be streamed from Drive using the YouTube video player, but I recommend downloading it for full 60 FPS quality.

https://drive.google.com/open?id=0B0dJlB1tP0QZZHNOTHdiN3hCTms

Notes:
 - There's some aliasing of sharp (and blurry) objects in the scene. This is because my antialiasing is not compatible with the DoF/motion blur at the moment, but I will probably fix that one way or another.
 - The motion vectors go crazy when I move the camera during slow-motion, as the camera believes it's rotating at an extremely high speed. It's not a problem with the algorithm.

Offline orange451
« Reply #5628 - Posted 2017-07-21 16:07:46 »

Finally released an alpha trailer for my game:
<a href="http://www.youtube.com/v/3WeDVtZAdSk?version=3&amp;hl=en_US&amp;start=" target="_blank">http://www.youtube.com/v/3WeDVtZAdSk?version=3&amp;hl=en_US&amp;start=</a>

Offline FabulousFellini
« Reply #5629 - Posted 2017-07-21 16:15:49 »

Quote from: orange451
Finally released an alpha trailer for my game:
http://www.youtube.com/v/3WeDVtZAdSk?version=3&hl=en_US&start=

Wow I'm really impressed dude, that looks great!

Offline LiquidNitrogen
« Reply #5630 - Posted 2017-07-21 23:51:07 »

I've been wanting to try making heightmap terrain for a long time. It's still pretty rough, but it runs really fast and is satisfying to walk around. Now that I've learnt how to use meshes, they will be very useful for other things.

Offline SkyAphid
« Reply #5631 - Posted 2017-07-22 06:18:15 »



All of this time spent animating, and I've only got about two minutes done! It takes a lot of work to animate interacting objects. Also, please ignore the low quality textures on the model in theagentd's videos - those are very old UVs and textures made for Terminus that were merely for testing. I promise it'll look even better in the final version!

Offline Icecore
« Reply #5632 - Posted 2017-07-22 18:17:19 »

I also have a data parser. It's something similar to Xtext; I wanted to use that long ago, but couldn't find a way to use tab separators for classes, so I decided to make something similar but simpler. I thought it would take 2-3 weeks, then 2-3 months, and in the end it took 4 months to get to an at least runnable model (it has gone through 3 generations with different syntax ^^).

Here's an image of the parser syntax parsing itself (the current file) ^^ (it's Notepad++ color highlighting)

It's not really fast Sad, only 3-4 MB/s.

P.S. The milestone for every custom language is a parser that can parse itself Wink At this point it can only parse the data syntax, without runnable functions. I hope one day I reach that milestone =)

Update: Oops, I looked at the git log and it took not 4 months but 10 months, OMG. I started it in 09.2016, lol, so long; how did it take this long...
It started from this:
http://www.java-gaming.org/topics/what-i-did-today/33622/msg/361350/view.html#msg361350

And I don't even want to talk about the time these took:
http://www.java-gaming.org/topics/what-i-did-today/33622/msg/347342/view.html#msg347342
http://www.java-gaming.org/topics/what-i-did-today/33622/msg/346244/view.html#msg346244
http://www.java-gaming.org/topics/what-i-did-today/33622/msg/346336/view.html#msg346336

It's like Doom Limbo XD

Offline Opiop
« Reply #5633 - Posted 2017-07-22 19:11:02 »

I'll be recording my first EP with my band August 6th-14th and will be taking that entire time off work! I'm really excited, I haven't taken any kind of a vacation since I started two years ago (wow, time flies!). Also playing some shows/festivals very soon, can't wait to start playing out more after we finish recording. And then we get to make music videos, so I get to be an actor! Smiley Very exciting.
Online theagentd
« Reply #5634 - Posted 2017-07-22 20:44:29 »

Stumbled upon some interesting results while optimizing the depth of field/motion blur shader.

Basically, there are three relevant things that can cause a bottleneck in the blur shader:

1. Too many ALU instructions, AKA too much mathz!!!11
2. Too many texture instructions. Each instruction, regardless of what it actually fetches, has a minimum cost.
3. Bad texture sample spatial cache locality. This is worsened if you're sampling a format which takes up a lot of memory in the texture cache.

The blur shader has three different paths:
a. The most expensive path is needed when there is a big difference between the circles-of-confusion or the motion vectors inside a tile. This path needs two texture samples per sample (1 for color, 1 for depth+CoC+motion vector length), and has quite a lot of math.
b. If the blur is uniform for the entire tile (and the neighboring tiles), there's no point in doing all the fancy depth sorting and layer blending, so in that case I can just run a simple blur on the scene color directly. This version only reads the color texture and requires less math.
c. If a tile is completely in focus and has no major motion, the shader early-outs, as no blurring is needed. The shader is obviously incredibly fast when this path is used, so for the tests I will be disabling this path.

To rule out texture cache problems and get a baseline performance reading, I ran path A and path B with zero CoCs and zero motion vectors. This causes all the texture samples to end up at the exact same place, so the texture cache should be able to do its work perfectly. Testing this, I got around 4.5 ms for the expensive path and 2.2 ms for the cheaper path. Sadly, it's hard to tell what the bottleneck is from this data alone. The fact that the cheaper path takes half the time could mean that the bottleneck is the sheer number of texture samples (performance doubled with half as many samples), or it could just mean that the cheaper path has half as many math instructions. Next, I tested the shader with a 16 pixel CoC radius for every pixel on the screen, meaning that the samples are distributed over a 33x33 pixel area (over 1000 possible pixels). This caused the fast path to increase from 2.2 ms to 8.2 ms. Even worse, the expensive path went from 4.5 ms all the way up to 21.5 ms! Ouch!

As the blur size increases, the samples get less and less spatially coherent. However, there is no performance loss whatsoever as long as the samples fit in the texture cache; after that point, performance gets a lot worse very quickly. That threshold also depends on the size of the texture format being sampled, as a bigger format takes up more space in the texture cache ---> fewer texels can fit in it ---> the cache gets a "shorter memory". In my case, both the color texture and the depth+CoC+motion vector length texture are GL_RGBA16F textures, meaning they should be taking up 8 bytes per sample each. As we can see, the fast path suffered a lot less from the worse cache coherency, only taking around 3.75x more time, while the slow path took around 4.75x as much time! This is because the fast path only needs to sample the color, and hence has less data competing for space in the cache.

Here's where it gets interesting: as a first test, I tried changing the texture format of the color texture from GL_RGBA16F (8 bytes per texel) to GL_R11F_G11F_B10F (4 bytes per texel). I expected this to possibly even double the performance of the fast path, as it'd halve the size of each texel, but... absolutely nothing happened. It performed exactly the same as GL_RGBA16F. The explanation for this is that the texture cache always stores data "uncompressed". For example, for DXT1-compressed textures, the texture colors of each 4x4 block are compressed to 2-bit indices used to interpolate between two 5-6-5-bit colors. This interpolation does not produce exact 8-bit values, but the GPU will round the values to the closest 8-bit values and store the result in the texture cache with 8 bits of precision per color channel. On the other hand, RGTC-compressed textures work similarly to DXT1, but use 3-bit indices and 8-bit colors. This gives RGTC textures up to ~10-11 bits of precision in practice, and uncompressed RGTC is actually stored at 16-bit precision in the texture cache to allow you to take advantage of this. Hence, it makes sense that the GPU cannot store 11/10-bit floats in the texture cache directly, and instead has to store them as 16-bit floats. In addition, the texture cache can't handle 6-byte texels, so they're padded to 8 bytes, giving us the same cache footprint as GL_RGBA16F! Bandwidth isn't the issue here; we're simply suffering from cache misses, and the latency of fetching that data is killing us, regardless of how much data is being fetched! Testing with GL_RG16F and GL_R32F instead, the performance of the fast path improved from 8.2 ms to 6.0 ms! The slow path only improved from 21.5 to 19.4 ms, as the other texture is still GL_RGBA16F.

So, to improve performance I really want to reduce the size of each texel. Since GL_R11F_G11F_B10F is out of the question for the color texture, I'll need to do some manual packing. Luckily, I don't need any filtering for these textures, so sacrificing filterability is not a problem. My current plan is to emulate something similar to GL_RGB9_E5 (as that format can't be rendered to) by storing the color in RGB and an exponent in the alpha channel. This should allow me to store an HDR color with just 32 bits of data. The depth+CoC+MV texture is simpler: just store depth as a 16-bit float, and the CoC and motion vector length values can be stored at 8 bits of precision each, no problem. Since I'll need to do the manual packing with bit manipulation anyway, storing these values in 32-bit uint textures is easiest. This means I can pack all this data into a single GL_RG32UI texture, which will halve the number of texture samples AND halve the texel size for the slow path, but it'd also force the fast path to fetch the extra blur data it doesn't need. Hence, I will probably output just the color to a second GL_R32UI texture that the fast path can use, to halve the size of each texel for that one too. I already do a fullscreen packing pass to generate the depth+CoC+MV GL_RGBA16F texture, which only takes 0.08 ms right now, so adding another output texture (increasing bandwidth by 50%) shouldn't cause any significant slowdown compared to the gains the blur pass will see.
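A sketch of what that packing could look like in GLSL (the exact bit layout is a guess; packHalf2x16 is a standard GLSL 4.2+ built-in, the rest of the names are illustrative):

// Pack an HDR color as 8-8-8-bit mantissas with a shared 5-bit exponent,
// similar in spirit to GL_RGB9_E5 but writable to a plain uint target.
uint packColorSharedExp(vec3 color) {
    float maxc = max(max(color.r, color.g), max(color.b, 1e-6));
    int e = clamp(int(ceil(log2(maxc))), -16, 15);
    uvec3 m = uvec3(round(clamp(color / exp2(float(e)), 0.0, 1.0) * 255.0));
    return (uint(e + 16) << 24) | (m.r << 16) | (m.g << 8) | m.b;
}

// Pack 16-bit float depth + 8-bit CoC + 8-bit motion vector length.
uint packDepthCocMv(float depth, float coc01, float mvLen01) {
    uint d = packHalf2x16(vec2(depth, 0.0)) & 0xFFFFu;
    uint c = uint(round(clamp(coc01, 0.0, 1.0) * 255.0));
    uint m = uint(round(clamp(mvLen01, 0.0, 1.0) * 255.0));
    return (d << 16) | (c << 8) | m;
}
// Both values together fit one GL_RG32UI texel; the color alone fits GL_R32UI.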

I will also port the blur shader to a compute shader. Currently, in my fragment shader I pick which path to take based on the tile the pixel falls into. This assumes that the fragments are processed in blocks that align with these tiles. If that isn't the case, it can have severe performance impacts, as some pixels may end up having to execute multiple paths. By using a compute shader, I can ensure that the compute shader work groups align perfectly with the classification tiles, so each work group only ever needs to execute one of the 3 paths.
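The skeleton of such a compute shader port might look like this (a sketch under assumed names; fastBlur/slowBlur are stand-ins for the real paths):

#version 430
layout(local_size_x = 16, local_size_y = 16) in; // one work group per 16x16 tile

layout(binding = 0, rgba16f) uniform writeonly image2D outColor;
layout(binding = 1) uniform usampler2D tilePathTex; // 0 = early out, 1 = fast, 2 = slow

vec3 fastBlur(vec2 fragCoord) { /* simple uniform color blur */ return vec3(0.0); }
vec3 slowBlur(vec2 fragCoord) { /* full layered, depth-sorted blend */ return vec3(0.0); }

void main() {
    // replaces gl_FragCoord.xy from the fragment shader version
    vec2 fragCoord = vec2(gl_GlobalInvocationID.xy) + 0.5;
    uint path = texelFetch(tilePathTex, ivec2(gl_WorkGroupID.xy), 0).x;
    if (path == 0u) return; // the whole group early-outs together
    vec3 result = (path == 1u) ? fastBlur(fragCoord) : slowBlur(fragCoord);
    imageStore(outColor, ivec2(gl_GlobalInvocationID.xy), vec4(result, 1.0));
}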

Offline Apo
« Reply #5635 - Posted 2017-07-24 14:34:23 »

Next two games are ready for my advent calendar =)



Online theagentd
« Reply #5636 - Posted 2017-07-25 17:30:09 »

TL;DR: Compute shaders are frigging awesome!

Today I ported my blur shader to a compute shader. It does everything exactly the same, but it's significantly faster!



My blur shader is optimized using a kind of tile classification system. For each 16x16 tile on the screen, I calculate:
 - the dominant motion vector (DMV) direction and length of the tile
 - the highest circle of confusion (CoC = depth of field blur radius) of the tile
 - some additional data.
These tiles are then dilated based on their "reach". The point is to allow objects to blur over their borders by making sure the neighboring tiles do the blur calculations as well. Then, based on the dilated DMV length and max CoC of the tile, I check if the tile can be optimized. A sample count is calculated based on the blur area, so smaller blurs use fewer samples. In addition, I pick a "path" for the blur shader per tile, as I've mentioned before (see the sketch after the recap); here's a recap:
 - If the max CoC and DMV length is less than 0.5, the tile is completely sharp, so the blur shader can early out.
 - If the CoC and the DMV length doesn't vary much in the tile, a fast path is used instead as we don't need any fancy blending just to get an even blur for that.
 - Otherwise, the slow path is used as the tile contains complex geometry that needs to be blurred carefully.
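Such a classification pass maps naturally onto a compute shader of its own: one work group per tile, reducing in shared memory. A hedged sketch (all names illustrative, and the dilation against neighboring tiles would happen in a later pass):

#version 430
layout(local_size_x = 16, local_size_y = 16) in; // one work group per 16x16 tile

layout(binding = 0) uniform sampler2D cocMvTex; // per pixel: xy = motion vector, z = CoC
layout(binding = 1, rgba16f) uniform writeonly image2D tileTex; // one texel per tile

shared vec3 sData[256]; // xy = motion vector, z = CoC

void main() {
    uint idx = gl_LocalInvocationIndex;
    sData[idx] = texelFetch(cocMvTex, ivec2(gl_GlobalInvocationID.xy), 0).xyz;
    barrier();
    // parallel reduction: keep the longest motion vector and the largest CoC
    for (uint s = 128u; s > 0u; s >>= 1u) {
        if (idx < s) {
            vec3 a = sData[idx];
            vec3 b = sData[idx + s];
            vec2 mv = dot(a.xy, a.xy) > dot(b.xy, b.xy) ? a.xy : b.xy;
            sData[idx] = vec3(mv, max(a.z, b.z));
        }
        barrier();
    }
    if (idx == 0u)
        imageStore(tileTex, ivec2(gl_WorkGroupID.xy), vec4(sData[0], 0.0));
}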

The reason for doing this classification per tile is that it's cheaper to do, and that shaders are executed in large groups anyway. If pixel 1 in a group picks the fast path, pixel 2 picks the slow path and the rest want to early-out, the entire shader group needs to run all 3 paths for all pixels, as they all have to run in lockstep. This is actually slower than just running the slow path for all pixels. Hence, the idea is that by using 16x16 tiles, entire groups of pixels should be able to run only one path.

To test this out, I created a specific scenario: the entire scene is placed in focus with no motion (meaning the shader can early-out immediately), but tiles in a checkerboard pattern are forced to the slow path (in other words, have optimizations disabled). If the way the GPU groups pixels is not EXACTLY aligned with the 16x16 tiles, it will need to execute the slow path for the entire screen, and the performance will reflect that. The key here is that compute shaders allow me to manually define the work group size, so I can force the work groups to align perfectly with the tiles. Let's see how this performs:

 - Fragment shader, all tiles using slow path: 3.94 ms
 - Fragment shader, checkerboard early out/slow: 3.94 ms (ouch!)
 - Compute shader, all tiles using slow path: 3.94 ms
 - Compute shader, checkerboard early out/slow: 2.07 ms (yes!)

In other words, the 16x16 tiles did NOT line up with the way the GPU placed pixels into groups, but with a compute shader they obviously do! This should give a nice (but obviously smaller) performance boost in real world scenarios!



In addition, I made some surprising findings while just playing around with it! Here's the performance of the compute shader compared to the fragment shader version when the entire screen is completely out of focus (max blur radius for every single pixel):
 - Fast path: 50% faster (15.0 ms ---> 10.0 ms)
 - Slow path: 17% faster (22.5 ms ---> 19.2 ms)

In other words, for some reason the raw performance of the shader is significantly better. This doesn't seem to be due to better ALU performance, but rather better texture sampling performance. There's no incoherent branching going on here either; every single pixel/invocation is executing the same thing (either the fast or the slow path). I guess since compute shaders bypass a lot of hardware in the GPU compared to a fullscreen triangle, maybe using a compute shader simply freed up cache space or changed the way the pixels are grouped together. I'm afraid I don't have a very good explanation for this, but damn is it awesome! =P



For a scene with mixed {early out/fast path/slow path} with dynamic sample count based on blur area (in other words, a typical scene in a normal game with all optimizations on):
 - 74% faster (7.5ms ---> 4.3 ms)

This is generating the exact same image with the same code, just executed as a compute shader instead of a fragment shader. The only difference is that I use gl_FragCoord.xy in the fragment shader version, which I need to calculate manually in the compute shader as (vec2(gl_GlobalInvocationID.xy) + 0.5). Also, this is BEFORE adding the data packing described above. I expect that optimization to almost double the performance of both the fast and the slow path in the worst case scenario above (fast path 10 ---> 5 ms, slow path 20 ---> 10 ms), for a tiny performance cost in other scenarios.

EDIT: Here's a debug image showing the tile classification. Red = slow path, green = fast path, blue = early out. In addition, the color intensity represents the sample count; bright = more samples (up to 128).

Offline Icecore
« Reply #5637 - Posted 2017-07-25 18:06:10 »

Spent all day fixing 2 bugs in the core Eclipse search
(org.eclipse.jdt.core_3.10.0.v20140604-1726.jar).

Disclaimer: I have no idea how Eclipse works. I opened a random plugin in the Eclipse SDK and started debugging a running copy of Eclipse, spending many hours manually trying to find the error and fix it. After 6-7 hours I found the line of code; if you comment it out, the bug disappears (who knows what I may have broken with this small change, but I don't care). After that I discovered a similar bug and fixed it the same way, by commenting out a line, in about 2 hours.

I don't want to report it to Eclipse, because my fixes may be total garbage (and I have no idea how to report to them properly). And if they fixed it, it would only be in the latest version, and I hate the latest versions of Eclipse. (Same as in IDEA, some bugs last for ages without fixes; Eclipse currently has 300 open bugs.) I must also say that the Eclipse source code is a nightmare, and I'm very happy that Java allows changing source code on the fly ^^

If anyone is interested, here is the code with the bugs and the fixes (maybe it helps someone manually fix it in the version they use). (Open the plugin list in the SDK, right-click on the plugin, import plug-in as source, and debug it.)

package pac;

import pac.b1.ns_33;
import pac.b1.c1.ns_1;

public interface ns extends ns_1, ns_33 {
}

///
package pac.b1;

import pac.ns;

public interface ns_33 {
   class ns_33_C implements ns {
   }
}

///
package pac.b1.c1;

import static pac.ns.*;

public interface ns_1 {
   class ns_1_C {
      ns_33_C b11; // searching for references to this doesn't work
   }
}

The fix, in org.eclipse.jdt.internal.compiler.lookup.ClassScope (comment out this block):
   //      if (needToTag) {
   //         for (int i = 0; i < nextPosition; i++)
   //            interfacesToVisit[i].tagBits |= TagBits.HasNoMemberTypes;
   //      }

package tt;

public interface __ww {
   public static abstract class _WWW_P {
   }
}

///
package tt;

import tt.__ww._WWW_P;

public interface _P_Seq {
   static class _Seq {
      public _WWW_P[] st; // searching for references to this doesn't work
   }
}

///
The fix, in org.eclipse.jdt.internal.core.SearchableEnvironment (comment out the enclosing-type check):
   IRestrictedAccessTypeRequestor typeRequestor = new IRestrictedAccessTypeRequestor() {
      public void acceptType(int modifiers, char[] packageName, char[] simpleTypeName, char[][] enclosingTypeNames, String path, AccessRestriction access) {
         if (excludePath != null && excludePath.equals(path))
            return;
/*         if (!findMembers && enclosingTypeNames != null && enclosingTypeNames.length > 0)
            return; // accept only top level types
*/
         storage.acceptType(packageName, simpleTypeName, enclosingTypeNames, modifiers, access);
      }
   };

Offline LiquidNitrogen
« Reply #5638 - Posted 2017-07-26 12:41:09 »

figured out how to calculate normals!
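For reference, one common way to do this for a heightmap (a sketch of the standard central-differences approach, not necessarily the method used here):

uniform sampler2D heightTex;
uniform vec2 texelSize;  // 1.0 / heightmap resolution
uniform float gridScale; // world-space distance between adjacent height samples

vec3 heightmapNormal(vec2 uv) {
    float hL = texture(heightTex, uv - vec2(texelSize.x, 0.0)).r;
    float hR = texture(heightTex, uv + vec2(texelSize.x, 0.0)).r;
    float hD = texture(heightTex, uv - vec2(0.0, texelSize.y)).r;
    float hU = texture(heightTex, uv + vec2(0.0, texelSize.y)).r;
    // cross product of the two surface tangents reduces to this closed form (y is up)
    return normalize(vec3(hL - hR, 2.0 * gridScale, hD - hU));
}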

Offline cybrmynd
« Reply #5639 - Posted 2017-07-26 13:57:42 »

I played some live music at a bar serving pizza. Not as cool as procedurally generated geometry Pointing but hey, it was a good turnout.