Java-Gaming.org
  Show Posts
Pages: [1] 2 3 ... 123
1  Games Center / WIP games, tools & toy projects / Re: Generated Graph optimized A* Pathfinding on: 2017-12-13 17:22:58
My open areas were not rectangular, but organic. Try a big diamond-shaped room and then let my grandchildren know the results.  Wink
2  Games Center / WIP games, tools & toy projects / Re: Generated Graph optimized A* Pathfinding on: 2017-12-13 16:59:45
I've actually tried out this exact pathfinding system myself. It's REALLY nice and clean and is a LOT faster for stuff like long corridors. It pretty much generates completely optimal paths in contrast with a grid-based pathfinder. However, it really can't handle open spaces well, as you end up with all corners connecting to all other corners over extreme distances. These checks are expensive and slow. In my case I needed to be able to modify the terrain efficiently and there was a very big risk of large open rooms with massive amounts of edges, so I ended up dropping it.
3  Java Game APIs & Engines / Engines, Libraries and Tools / LibGDX jars not part of the generated project and instead deep in .gradle folder on: 2017-11-07 22:13:05
So I'm trying to lay the foundation for a small school project for my Game Design class. I said that we could use LibGDX for Android compatibility, and now I'm in charge of setting it up of course. The problem is that I just cannot fathom what I'm supposed to do to get this project pushed completely to GitHub so we can actually collaborate. The LibGDX project creator generated dependencies like this:

Is there any way I can make LibGDX generate sane library dependencies or should I just clone my hard drive and mail it to my teammates?
4  Discussions / Miscellaneous Topics / Re: What I did today on: 2017-10-22 21:57:08
@theagentd

Ah, so the scenario is essentially 1 source of gravity where the precise positions of the ships/particles affected by it come into play? But then why are the calculations so taxing? What makes this so different from anything else? The precision? The magnitude/amount of bodies? Why wouldn't e.g. a QuadTree based solution work?
Simulating the physics takes 0.27 ms for ~1 million ships, and this is GPU bandwidth limited, so I can have up to 8 sources of gravity before I get any drop in performance. If it's just the simulation, it can easily be done for over 10 million ships. The problem is really the collision detection. Hierarchical data structures are usually not very efficient on the GPU, and constructing them on the CPU would require reading the data back, constructing the quad tree, then uploading it again to the GPU, which is gonna be too slow. In addition, actually querying the quad tree on the GPU will be very slow as well; GPUs can't do recursion, and computations happen in lockstep in workgroups, so any kind of branching or uneven looping will be very inefficient. It's generally a better idea to use a more fixed data structure, like a grid, but that's a bad match in this case. The large scale of the world, the extremely fast speed of the ships and the fact that ships will very likely be clumped up into fleets mean that even a uniform grid would be way too slow.

The idea of sorting the ships along one axis and checking for overlap of their swept positions (basically treating each ship as a line from its previous position to its current position) was actually inspired by Box2D's broadphase. I concluded that sorting was a simpler problem to solve than creating and maintaining a spatial data structure (especially on the GPU), but after testing it out more I'm not sure it's a good solution in this case. For a fleet orbiting in close formation, there's a huge spike in sorting cost when the fleet reaches the leftmost and rightmost edges of the orbit and the order of the entire fleet reverses. There are also problems when two large fleets (one moving left and the other right) cross each other, again because the two fleets first intermix and then swap positions in the list once they've crossed... Finally, there's a huge problem with fleets just travelling around together. A fleet of 10 000 ships moving very quickly together will have overlapping swept positions, so all 10 000 ships will be collision tested against each other.
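For anyone curious, the sort-and-sweep broadphase described above looks something like this on the CPU. This is a minimal serial sketch with made-up names, not the actual GPU implementation:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SweepAndPrune {
    // minX/maxX are the swept X extents of each ship (from its previous
    // position to its current position). Returns index pairs whose X intervals
    // overlap; only those pairs need a real collision test.
    static List<int[]> overlappingPairs(double[] minX, double[] maxX) {
        int n = minX.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Sort ships by the left edge of their swept interval. On
        // almost-sorted data this sort is the cheap part.
        Arrays.sort(order, (a, b) -> Double.compare(minX[a], minX[b]));
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            int a = order[i];
            // Walk right only while intervals can still overlap ours.
            for (int j = i + 1; j < n && minX[order[j]] <= maxX[a]; j++) {
                pairs.add(new int[]{a, order[j]});
            }
        }
        return pairs;
    }
}
```

The fleet problem is visible right in the inner loop: if 10 000 swept intervals overlap, that loop degenerates to all-pairs for the whole fleet.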

I got a lot of thoughts on this problem, so if you want to have more of a discussion about this, I'd love to exchange ideas and thoughts on this through some kind of chat instead.
5  Discussions / Miscellaneous Topics / Re: What I did today on: 2017-10-22 19:28:54
@jonjava

It's not an N-body simulation. Celestial bodies (stars, planets, moons) affect each other, but ships are only pulled by celestial bodies. Ships don't pull each other either.
6  Discussions / Miscellaneous Topics / Re: What I did today on: 2017-10-21 23:28:53
Life's been tough on me the last few weeks, especially the last few days, so I decided to do some extra fun coding this weekend.

3-4 years ago I made some threads about an extreme scale space game with realistic Newtonian physics. The game would require simulating a huge number of objects affected by gravity, with extreme speed collision detection. I am talking 100k+ ships, each orbiting a planet at 10km/second, with accurate collision detection. The technical challenges are enormous. After some spitballing here on JGO, I ended up implementing a test program using fixed-point values (64-bit longs) to represent positions and velocities to get a constant amount of precision regardless of distance from origin. Simple circle-based collision detection was handled by sorting the ships along the X-axis, then checking collisions only for ships that overlap along the X-axis. The whole thing was completely multi-threaded, and I even tried out Riven's mapped struct library to help with cache locality. Even sorting was multithreaded using my home-made parallel insertion sort algorithm, tailor-made for almost-sorted data sets (the order along the X-axis did not change very quickly). It scaled well with more cores, but was still very heavy for my poor i7.
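The reason insertion sort pays off here is that its cost tracks how far elements are from their final positions, so an almost-sorted array is nearly linear to sort. A plain serial sketch (the actual version was parallel and home-made):

```java
class NearlySortedSort {
    // Plain insertion sort: roughly O(n + d) where d is the total displacement
    // of elements from their sorted positions, so an almost-sorted array
    // (ships barely changing X order between frames) sorts in near-linear time.
    static void sort(long[] a) {
        for (int i = 1; i < a.length; i++) {
            long v = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > v) {
                a[j + 1] = a[j];  // shift larger elements right
                j--;
            }
            a[j + 1] = v;
        }
    }
}
```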

I realized that the only way to get decent performance for this problem on a server would be to run the physics simulation on the GPU. With a magnitude higher performance and bandwidth, the GPU should be able to easily beat this result as long as the right algorithms are used. The physics simulation is easy enough as it's an embarrassingly parallel problem and fits perfectly on the GPU. The collision detection (sorting + neighbor check) is a whole different game. GPU sorting is NOT a fun topic, at least if you ask me. The go-to algorithm for this is a parallel GPU radix sort, but with 64-bit keys that's very expensive. Just like my parallel insertion sort took advantage of the almost-sorted nature of the data, I needed something similar that could run on the GPU. That's when I stumbled upon a simple GPU selection sort algorithm.

The idea is simple. For each element, loop over the entire array of elements to sort and calculate how many elements should be in front of this element. You now know the new index of your element, so move it directly to that index. Obviously, this is O(n^2), so it doesn't scale too well. However, the raw power of the GPU can compensate for that to some extent. 45*1024 = 46 080 elements can be sorted at ~60 FPS, regardless of how sorted the array is. By using shared memory as a cache, performance almost triples to 160 FPS, allowing me to sort 80*1024 = 81 920 elements at 60 FPS. Still not fast enough. Anything above 200k elements runs a big risk of causing the driver to time out and restart...
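In serial Java, the per-element rank computation looks like this. Each iteration of the outer loop is what one GPU thread would do; the class and method names are made up for illustration:

```java
class RankSort {
    // Selection ("rank") sort: each element's output index is the number of
    // elements that must precede it. Ties are broken by original index so
    // equal keys still get distinct output slots (and the sort stays stable).
    static long[] sort(long[] in) {
        long[] out = new long[in.length];
        for (int i = 0; i < in.length; i++) {   // on the GPU: one thread each
            int rank = 0;
            for (int j = 0; j < in.length; j++) {
                if (in[j] < in[i] || (in[j] == in[i] && j < i)) rank++;
            }
            out[rank] = in[i];  // scatter directly to the final position
        }
        return out;
    }
}
```

Note there's no data-dependent branching per element beyond the comparison itself, which is why this maps so well to lockstep GPU execution despite being O(n^2).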

Enter block-based selection sort for almost sorted data-sets! The idea is to split the list up into blocks of 256 elements, then calculate the bounds of the values of each block. This allows us to skip an entire block of 256 values if its bounds don't intersect with those of the current block we're processing. Most likely, only the blocks in the immediate vicinity of each block need to be taken into consideration when sorting, while the rest can be skimmed over. Obviously, this makes performance data-dependent, and the worst case is still the same as vanilla GPU selection sort if all blocks intersect with each other (which is essentially guaranteed for a list of completely random values). However, for almost sorted data sets this is magnitudes faster!
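The block-skipping trick boils down to two small pieces: per-block value bounds, and an interval intersection test. A CPU sketch (names made up; the real thing runs in a compute shader with 256-element blocks):

```java
class BlockBounds {
    // Compute [min, max] value bounds for each block of blockSize elements.
    static long[][] bounds(long[] a, int blockSize) {
        int blocks = (a.length + blockSize - 1) / blockSize;
        long[][] b = new long[blocks][2];
        for (int i = 0; i < blocks; i++) {
            long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
            int end = Math.min(a.length, (i + 1) * blockSize);
            for (int j = i * blockSize; j < end; j++) {
                min = Math.min(min, a[j]);
                max = Math.max(max, a[j]);
            }
            b[i][0] = min;
            b[i][1] = max;
        }
        return b;
    }

    // If two blocks' value ranges don't intersect, every element of one block
    // is smaller (or larger) than every element of the other, so the whole
    // block's contribution to an element's rank is known without reading its
    // individual elements.
    static boolean intersects(long[] p, long[] q) {
        return p[0] <= q[1] && q[0] <= p[1];
    }
}
```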

To simulate an almost sorted data-set, an array is filled with elements like this:
for(int i = 0; i < NUM_KEYS; i++){
   data.putLong(i*8, i + r.nextInt(1000));
}

This gives us an almost sorted array with quite a lot of elements with the exact same value, to test the robustness of the sort. The block-based selection sort algorithm is able to sort a 2048*1024 = 2 097 152 element list... at 75 FPS, way over the target of 100 000 elements. It's time to implement a real physics simulation based on this!



Let's define the test scenario. 1024*1024 = 1 048 576 ships are in perfect circular orbits around the earth. The orbit heights range from low earth orbit (International Space Station height) to geosynchronous orbit. Approximately half of the ships are orbiting clockwise, the other half counterclockwise. The size of the earth, the mass, the gravity calculations, etc are physically accurate and based on real-life measurements.

Going back to my original threaded CPU implementation, it really can't handle one million ships very well. Just the physics simulation of the ships takes 20.43ms, and sorting another 18.75ms. Collision detection then takes another 10.16ms.

The compute shader implementation is a LOT faster. Physics calculations take only 0.27ms, calculating block bounds another 0.1ms and finally sorting takes 2.07ms. I have not yet implemented the final collision detection pass, but I have no reason to expect it to be inefficient on the GPU, so I'm optimistic about the final performance of the GPU implementation.



Each ship is drawn as a point. The color depends on the current index in the list of the ship, so the perfect gradient means that the list is perfectly sorted along the X-axis. 303 FPS, with rendering taking up 0.61ms, 370 FPS without rendering.
7  Discussions / Miscellaneous Topics / Re: What I did today on: 2017-10-16 14:53:13
I got fiber installed a couple of days ago.

[Before/after speed test screenshots]
8  Java Game APIs & Engines / Engines, Libraries and Tools / Re: SSAO in LibGDX sans Deferred Rendering? on: 2017-09-30 01:12:59
I've been super busy, sorry.

I didn't realize the random vectors essentially filled the same purpose as the random rotations. You can drop the rotation matrix I gave you and just use the random vector texture you had. Please post how you sample from it. I recommend a simple texelFetch() with the coordinates &-ed to keep them in range.
9  Java Game APIs & Engines / Engines, Libraries and Tools / Re: SSAO in LibGDX sans Deferred Rendering? on: 2017-09-27 14:23:37
I think the reason you're getting wrong results is that you do the matrix multiplication the wrong way around. Remember that matA*matB != matB*matA. However, I've been thinking about this, and I think it's possible to simplify it.

What we really want to do is rotate the samples around the Z-axis. If we look at the raw sample offsets, this just means rotating the XY coordinates, leaving Z intact. Such a rotation matrix should be much easier to construct:
   float angle = rand(texCoords) * PI2;
   float s = sin(angle);
   float c = cos(angle);
   mat3 rotation = mat3(
      c, -s, 0,
      s,  c, 0,
      0, 0, 1
   );
   //We want to do kernelMatrix * (rotation * samplePosition) = (kernelMatrix * rotation) * samplePosition
   mat3 finalRotation = kernelMatrix * rotation;


This should be faster and easier to get right!
10  Java Game APIs & Engines / Engines, Libraries and Tools / Re: SSAO in LibGDX sans Deferred Rendering? on: 2017-09-26 18:29:47
A couple of tips:

 - The code you have is using samples distributed over a half sphere. Your best bet is a modified version of best candidate sampling over a half sphere, which would require some modification of the JOML code to get.

 - I'd ditch the rotation texture if I were you. Just generate a random angle using this snippet that everyone is using, then use that angle to create a rotation matrix around the normal (You can check the JOML source code on how to generate such a rotation matrix that rotates around a vector). You can then premultiply the matrix you already have with this rotation matrix, keeping the code in the sample loop the exact same.

 - To avoid processing the background, enable the depth test, set the depth func to GL_GREATER and draw your fullscreen SSAO quad at depth = 1.0, so the test passes only where scene geometry wrote a smaller depth. It is MUCH more efficient to cull pixels with the depth test than with an if-statement in the shader. With an if-statement, the fragment shader has to be run for every single pixel, and if just one pixel in a workgroup enters the if-statement the entire workgroup has to run it. By using the depth test, the GPU can avoid running the fragment shader completely for pixels that fail the test, and patch together full workgroups from the pixels that do pass. This massively improves the culling performance.

 - You can use smoothstep() to get a smoother depth range test of each sample at a rather small cost.

 - It seems like you're storing your normals in a GL_RGB8 texture, which means that you have to transform them from (0.0 - 1.0) to (-1.0 - +1.0). I recommend using GL_RGB8_SNORM, which stores each value as a normalized signed byte, allowing you to write out the normal in the -1.0 to +1.0 range and sample it like that too. Not a huge deal of course, but it gives you better precision and a little bit better performance.
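To illustrate the smoothstep() range test from the tips above, here's the same math on the CPU side. The smoothstep() matches GLSL's definition; the rangeWeight() helper and its parameters are made up for illustration:

```java
class RangeCheck {
    // GLSL-style smoothstep: 0 below edge0, 1 above edge1, smooth Hermite
    // interpolation in between.
    static float smoothstep(float edge0, float edge1, float x) {
        float t = Math.max(0f, Math.min(1f, (x - edge0) / (edge1 - edge0)));
        return t * t * (3f - 2f * t);
    }

    // Hypothetical helper: fade a sample's occlusion contribution out smoothly
    // as the depth difference approaches the maximum test range, instead of a
    // hard cutoff that produces visible edges.
    static float rangeWeight(float depthDelta, float maxRange) {
        return 1f - smoothstep(0.5f * maxRange, maxRange, Math.abs(depthDelta));
    }
}
```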
11  Java Game APIs & Engines / Engines, Libraries and Tools / Re: SSAO in LibGDX sans Deferred Rendering? on: 2017-09-26 04:05:01
I'm not sure what your "kernel" is. Are those the sample locations for your SSAO? I'd recommend precomputing some good sample positions instead of randomly generating them, as you're gonna get clusters and inefficiencies from a purely random distribution. JOML has some sample generation classes in the org.joml.sampling package that may or may not be of use to you.

It doesn't look like you're using your noise texture correctly. A simple way of randomly rotating the samples is to place random normalized 3D vectors in your noise texture, then reflect() each sample against that vector. I'm not sure how you're using your random texture right now, but it doesn't look right at all. If you let me take a look at your GLSL code for that, I can help you fix it.
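For reference, reflect() is just I - 2*dot(N, I)*N. A tiny CPU-side sketch of reflecting a sample offset against a random unit vector (class and method names made up):

```java
class SampleRotate {
    // GLSL-style reflect(): I - 2*dot(N, I)*N, where n must be a unit vector.
    // Reflecting every sample offset against a per-pixel random vector
    // effectively randomizes the sample pattern per pixel.
    static float[] reflect(float[] i, float[] n) {
        float d = i[0] * n[0] + i[1] * n[1] + i[2] * n[2];
        return new float[]{
            i[0] - 2f * d * n[0],
            i[1] - 2f * d * n[1],
            i[2] - 2f * d * n[2]
        };
    }
}
```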
12  Java Game APIs & Engines / Engines, Libraries and Tools / Re: SSAO in LibGDX sans Deferred Rendering? on: 2017-09-25 14:11:52
To fix the SSAO going too far up along the cube's edges, you need to reduce the depth threshold.

I can also see some banding in your SSAO. If you randomly rotate the sample locations per pixel, you can trade that banding for noise instead, which is much less jarring to the human eye.
13  Java Game APIs & Engines / Engines, Libraries and Tools / Re: SSAO in LibGDX sans Deferred Rendering? on: 2017-09-24 18:18:23
Renderbuffers are a bit of a legacy feature. They are meant for exposing formats the GPU can render to but can't read in a shader (read: multisampled stuff). The thing is that multisampled textures are supported by all OGL3 GPUs, so renderbuffers no longer serve any real purpose. If you do the FBO setup yourself, you can attach a GL_DEPTH_COMPONENT24 texture as the depth attachment and read it in a shader.
14  Java Game APIs & Engines / Engines, Libraries and Tools / Re: SSAO in LibGDX sans Deferred Rendering? on: 2017-09-24 13:14:04
Is LibGDX using a renderbuffer?
15  Java Game APIs & Engines / Engines, Libraries and Tools / Re: SSAO in LibGDX sans Deferred Rendering? on: 2017-09-23 14:07:54
You do not need to store depth in a color texture. You can simply take the depth texture you use as the depth buffer and bind it like any other texture. The depth value between 0.0 and 1.0 is returned in the first color channel (red channel) when you sample the texture with texture() or texelFetch().
16  Java Game APIs & Engines / Engines, Libraries and Tools / Re: SSAO in LibGDX sans Deferred Rendering? on: 2017-09-23 12:59:11
The traditional purpose of doing a depth pre-pass is to avoid shading pixels twice. By rendering the depth first, the actual shading can be done with GL_EQUAL depth testing, meaning each pixel is only shaded once. The depth pre-pass also rasterizes at twice the speed, as GPUs have optimized depth-only rendering for shadow maps, so by adding a cheap pre-pass you can eliminate overdraw in the shading.

To also output normals, you need to have a color buffer during the depth pre-pass, meaning you'll lose the double speed rasterization, but that shouldn't be a huge deal. You can store normal XYZ in the color, while depth can be read from the depth buffer itself and doesn't need to be explicitly stored.

If you have a lot of vertices, rendering the scene twice can be very expensive. In that case, it's possible to do semi-deferred rendering where you do lighting as you currently do but also output the data you need for SSAO afterwards. This requires using an FBO with multiple render targets, but it's not that complicated. The optimal strategy depends on the scene you're trying to render.
17  Java Game APIs & Engines / Engines, Libraries and Tools / Re: SSAO in LibGDX sans Deferred Rendering? on: 2017-09-23 07:33:36
Traditional SSAO doesn't require anything but a depth buffer. However, normals help quite a bit in improving quality/performance. You should be able to output normals from your forward pass into a second render target. It is also possible to reconstruct normals by analyzing the depth buffer, but this can be inaccurate if you have lots of depth discontinuities (like foliage).

EDIT: Technically, SSAO is <ambient> occlusion, meaning it should only be applied to the ambient term of the lighting equation. The only way to get "correct" SSAO is therefore to do a depth prepass (preferably output normal too), compute SSAO, then render the scene again with GL_EQUAL depth testing while reading SSAO from the current pixel. If you already do a depth prepass, this should essentially be free. If not, maybe you should! It could improve your performance.
18  Game Development / Game Play & Game Design / Re: Graphics Backend Abstraction on: 2017-09-06 11:57:31
<--- You can't contain this! =P
19  Java Game APIs & Engines / Java 2D / Re: Writing Java2D BufferedImage Data to Opengl Texture Buffer on: 2017-08-21 02:03:28
Am I insane? -> yes. But is it really crazier than drawing 2D stuff in OpenGL?
Yeah, it is. OpenGL is used for 2D stuff all the time. GPUs can really only draw 2D stuff in the first place; you project your 3D triangles to 2D triangles, so it makes a lot of sense to use the GPU for accelerating 2D stuff as well.
20  Java Game APIs & Engines / OpenGL Development / Re: [LWJGL] [JOML] Memory Behavior on: 2017-08-12 13:32:40
You can release the memory allocated on the native heap when you no longer need a direct NIO buffer by calling the cleaner yourself, I assume that it's what theagentd meant in his second suggestion.
Nope, I'm not a big fan of the Cleaner "hack". You'll generally get much better performance by managing memory yourself with malloc/free. In LWJGL, that can be done with MemoryUtil.memAlloc() and MemoryUtil.memFree(), which under the hood uses the best library for the job (usually Jemalloc).
21  Java Game APIs & Engines / OpenGL Development / Re: [LWJGL] [JOML] Memory Behavior on: 2017-08-10 01:30:55
This comes from you allocating a lot of native memory buffers using NIO, possibly using BufferUtils.create***Buffer() calls. All those objects are related to managing native memory, which lies outside the Java memory heap. You should:
 - try to reuse native memory buffers
 - manage native memory yourself instead to avoid GC overhead.
22  Java Game APIs & Engines / Engines, Libraries and Tools / Re: LibGDX How do I get the depth buffer from FrameBuffer? on: 2017-08-05 19:47:02
Ok, I will try that, but first I want to clarify something:
If my clipping range is between 1.0f and 1000.0f, then the normalised depth value from the shader (0.5 in this case) means 500f?
Anyway, thanks for help.
No, that is not the case. The depth value you get is calculated in a specific way to give more precision closer to the camera. This means that 0.5 will refer to something very close to the near plane, around 2.0 for your 1.0 to 1000.0 range. How this is calculated depends on your projection matrix (the clipping range you mentioned). It is possible to "linearize" the depth value if you know the far and near planes.
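A sketch of that linearization, assuming a standard OpenGL perspective projection (names made up). For near = 1.0 and far = 1000.0, a depth-buffer value of 0.5 maps to an eye-space distance of roughly 2.0, not 500:

```java
class DepthUtil {
    // Convert a [0, 1] depth-buffer value back to linear eye-space distance,
    // assuming a standard OpenGL perspective projection with the given
    // near/far clipping planes.
    static float linearize(float depth, float near, float far) {
        float ndc = 2f * depth - 1f; // window-space depth back to [-1, 1] NDC
        return (2f * near * far) / (far + near - ndc * (far - near));
    }
}
```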
23  Java Game APIs & Engines / Engines, Libraries and Tools / Re: LibGDX How do I get the depth buffer from FrameBuffer? on: 2017-08-05 14:32:43
the framebuffer should be configured with a depth-attachment attached to it, which is a texture after all.

internal-format should be GL_DEPTH_STENCIL or GL_DEPTH_COMPONENT and format should go like GL_DEPTH24_STENCIL8 or GL_DEPTH_COMPONENT24.

now when you render the color-attachment in a fullscreen-quad/triangle pass, just throw in the depth-texture like you do with any other texture.

next, sampling the depth-texture - i cannot rember the "proper" way.
You got the internal format and format switched around. Internal format (the format the GPU stores the data in) should be GL_DEPTH_COMPONENT16, GL_DEPTH_COMPONENT24 or GL_DEPTH_COMPONENT32F, or if you also need a stencil buffer, GL_DEPTH24_STENCIL8 or GL_DEPTH32F_STENCIL8. Format (the format of the data you pass in to initialize the texture in glTexImage2D()) shouldn't really matter as you're probably just passing in null there and clearing the depth buffer before first use anyway, but you need to pass in a valid enum even if you pass null, so GL_DEPTH_COMPONENT is a good choice.

To give you a list of steps:

1. Create a depth texture with one of the internal formats listed above.
2. Attach the depth texture to the FBO using glFramebufferTexture2D() to attachment GL_DEPTH_ATTACHMENT or GL_DEPTH_STENCIL_ATTACHMENT if your depth texture also has stencil bits.
3. Render to the FBO with GL_DEPTH_TEST enabled. If it's not enabled, the depth buffer will be completely ignored (neither read nor written to). If you don't need the depth "test" and just want to write depth to the depth buffer, you still need to enable GL_DEPTH_TEST and set glDepthFunc() to GL_ALWAYS.
4. Once you're done rendering to your FBO, you bind the texture to a texture unit and add a uniform sampler2D to your shader for the extra depth buffer.
5. Sample the depth texture like a color texture in your shader. A depth texture is treated as a single-component texture, meaning that the depth value is returned in the red channel (float depthValue = texture(myDepthSampler, texCoords).r;). The depth is returned as a normalized float value from 0.0 to 1.0, where 0.0 is the near plane and 1.0 is the far plane.
24  Discussions / Miscellaneous Topics / Re: What I did today on: 2017-07-25 17:30:09
Deleted.
25  Discussions / Miscellaneous Topics / Re: What I did today on: 2017-07-22 20:44:29
Deleted.
26  Discussions / Miscellaneous Topics / Re: What I did today on: 2017-07-21 05:56:05
Deleted.
27  Discussions / Miscellaneous Topics / Re: What I did today on: 2017-07-18 02:10:28
Deleted.
28  Game Development / Newbie & Debugging Questions / Re: Game "running out of memory" with 32-bit JRE's on: 2017-07-17 07:20:57
Is there a specific reason why you want to support 32-bit JVMs?
29  Discussions / Miscellaneous Topics / Re: What I did today on: 2017-07-16 20:43:51
Deleted.
30  Discussions / Miscellaneous Topics / Re: What I did today on: 2017-07-16 02:41:44
Deleted.