  Show Posts
Pages: [1] 2 3 ... 96
1  Discussions / General Discussions / Re: Rayvolution's JGO Appreciation Thread (AKA: Free copies of Retro-Pixel Castles!) on: 2015-05-27 12:29:19
Yay! I would very much like a key! =D
2  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-05-26 19:44:07
Improved the antialiasing of WSW. I had been "cheating" when upsampling the transparent effects. The whole thing was extremely complicated since I had the current frame, the previous frame, motion vectors, MSAA depth values, etc, and the shader was sampling 10 different textures and a total of ~40+ texture samples for each pixel. To simplify a bit, the transparency was overlaid after the anti-aliasing had been applied, so it did not correctly block individual samples. If an object was standing in front of a fog cloud, the edge samples of the object would get occluded by the fog behind it while the background along the same edge behind the fog would be slightly visible. Upscaling the transparency for each sample turned out to be prohibitively expensive though.

At the same time, I realized that the performance I was getting did not make sense. The shader that did anti-aliasing also did motion blur. Adding motion blur had a cost of around 0.3ms, while adding anti-aliasing had a cost of 1.5ms. However, adding both made the shader skyrocket to 3-4ms, far over the sum of the two. Somehow, GPUs couldn't handle the massive shader (~1500-2000 instructions and 40+ texture samples), most likely due to not being able to have enough shader invocations in registers at the same time which reduced the occupancy...

I decided to extract the motion blur to a separate pass before applying anti-aliasing. I realized that motion blur was essentially a transparent effect that was applied in-between other transparent effects and the opaque geometry, so the first pass both calculates motion blur and upsamples the transparency effects and outputs a preblended version of this. The anti-aliasing shader can then use this precomputed texture to achieve anti-aliasing of transparent effects the same way as it achieves anti-aliasing of opaque objects. Sadly, this didn't work as well as I hoped, so in the end I had to do some vast rewrites to it, but I eventually got it working as well as possible.

Splitting the shader into 2 smaller ones did help with performance, but the cost of accurate transparency anti-aliasing killed most actual gains. At 2x AA, performance is down, at 4x AA performance is the same and at 8x AA performance is up a bit. The quality when opaque objects are standing in front of transparent objects is up a lot though.



No more dark or bright halos around objects. The single black pixel at his hand is due to the lack of transparency data (transparency is done at half resolution, so not all pixels have accurate data).
3  Game Development / Networking & Multiplayer / Re: NitroNet - New, High-Level Networking Library on: 2015-05-25 11:25:14
Why do people use thread per connection? I've been told a million times, it's not recommended unless you're doing some sort of Chat program.
A multiplayer game with 8-16 players could easily be implemented with threads without any disadvantage.
4  Discussions / General Discussions / Re: Java 8 - Stream vs Loop on: 2015-05-25 02:23:54
for (Object i : list) {
    if (i.check()) {
        return true;
    }
}
return false;

This is worse than
for (int i = 0; i < list.size(); i++) {
    if (list.get(i).check()) {
        return true;
    }
}
return false;

It generates an Iterator instance and uses it to loop over the list, which means overhead and garbage.

The lambda version most likely does the same.
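For anyone who wants to measure this rather than guess, here are the loop forms above as one compilable sketch. The Checkable interface and the class name are made up for the example (the post only shows a check() method), and the indexed version is only a fair comparison on RandomAccess lists such as ArrayList, since get(i) on a LinkedList walks the list.

```java
import java.util.List;

final class LoopForms {
    /** Hypothetical element type; the post only shows a check() method. */
    interface Checkable {
        boolean check();
    }

    /** Enhanced for loop: compiles to list.iterator() plus hasNext()/next() calls. */
    static boolean anyForEach(List<? extends Checkable> list) {
        for (Checkable c : list) {
            if (c.check()) {
                return true;
            }
        }
        return false;
    }

    /** Indexed loop: no Iterator object, but relies on get(i) being cheap. */
    static boolean anyIndexed(List<? extends Checkable> list) {
        for (int i = 0; i < list.size(); i++) {
            if (list.get(i).check()) {
                return true;
            }
        }
        return false;
    }

    /** Stream form with a method reference, for comparison with the lambda version. */
    static boolean anyStream(List<? extends Checkable> list) {
        return list.stream().anyMatch(Checkable::check);
    }
}
```

All three return the same result; whether the Iterator allocation actually shows up as garbage depends on the JIT, so profile before rewriting loops.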
5  Discussions / General Discussions / Re: Four hour downtime as mysql tumbled on: 2015-05-23 20:50:21
You responded within seconds of my Skype message, so you're not entirely correct. =P
6  Discussions / Miscellaneous Topics / Re: I'm baaaaack... on: 2015-05-22 19:22:49
They can't do fallout or skyrim, but they can do flappy bird...
I think the real problem here is that people put games like Skyrim and Fallout in the same category as Flappy Bird.
7  Java Game APIs & Engines / OpenGL Development / Re: texture problems on: 2015-05-22 19:10:24
tex is incomplete since the default min filtering of textures (GL_NEAREST_MIPMAP_LINEAR) assumes that you have mipmaps. Therefore your texture is considered incomplete and you can't sample it. Either generate mipmaps for it or change the min filter.
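To illustrate why a texture with only its base level uploaded fails under a mipmapping min filter, here is a small pure-Java model of the completeness rule. It is a simplified sketch, not driver behaviour: it assumes a 2D texture with the default base level and max level settings, and the class and method names are invented for this example.

```java
final class MipmapCompleteness {
    /** Levels in a full mipmap chain for a width x height texture: floor(log2(largest side)) + 1. */
    static int fullChainLevels(int width, int height) {
        int largest = Math.max(width, height);
        if (largest < 1) {
            throw new IllegalArgumentException("texture must be at least 1x1");
        }
        return 32 - Integer.numberOfLeadingZeros(largest);
    }

    /** Simplified rule: a mipmapping min filter needs every level of the chain to be defined. */
    static boolean isSampleable(boolean minFilterUsesMipmaps, int definedLevels, int width, int height) {
        if (definedLevels < 1) {
            return false;
        }
        return !minFilterUsesMipmaps || definedLevels >= fullChainLevels(width, height);
    }
}
```

In LWJGL the two fixes correspond to calling GL30.glGenerateMipmap(GL11.GL_TEXTURE_2D) after uploading the image, or GL11.glTexParameteri(GL11.GL_TEXTURE_2D, GL11.GL_TEXTURE_MIN_FILTER, GL11.GL_LINEAR) to pick a filter that does not read mipmaps.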
8  Game Development / Networking & Multiplayer / Re: NitroNet - New, High-Level Networking Library on: 2015-05-22 16:46:32
When it comes to performance, how does this compare to Kryonet and other solutions? Does it support multithreading?
9  Game Development / Shared Code / Re: A convenient method that tells you if a rectangle is intersecting a line on: 2015-05-22 13:52:30
Assuming you're going to test more than don't do it this way.
This may be my 39 degree fever speaking, but damn are you annoying. At least clarify what the hell you mean.
10  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-05-18 19:25:45
I rewrote the model renderer's octree since I had done some optimization on the physics engine's quadtree that I could carry over. Cut the culling cost of models by a lot and allowed for a few interesting optimizations. I also added proper threading for some trivial parts of the octree updating each frame, one of the few if not the only significant single-threaded parts in the entire engine. I then tried threading the octree generation itself and basically hit a wall. With some clever rewriting to eliminate most synchronized blocks, improved reuse of data between frames, and a few other tricks I might be able to get half-decent scaling out of it, but simply put it's always difficult or impossible to get good scaling when writing to the same place in memory using multiple cores. In the end I basically said "f*ck it" and simply went with the easiest solution: Multiple octrees! =D Instead of trying to stuff thousands of instances into the same octree, I simply divided up the instances and placed them in 2-4 separate octrees. Building two independent octrees has pretty much perfect scaling, so there's no problem there, but traversing 4 octrees with 1/4th as many instances in them is slower than 1 octree with all the instances... It's a simple solution that helped make the octree generation fast enough to run in parallel to better threaded tasks.
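The "multiple octrees" idea can be sketched roughly like this. This is not the engine's code: CellIndex is a deliberately simple stand-in (a hash map of grid cells) used in place of a real octree, and every name here is hypothetical. The point is only that each task writes to its own structure, so no locking is needed during the build.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SplitSpatialIndex {
    /** Stand-in for an octree: buckets instance ids by integer grid cell. */
    static final class CellIndex {
        final Map<Long, List<Integer>> cells = new HashMap<>();

        void insert(int id, float x, float y, float z, float cellSize) {
            // Pack three 21-bit cell coordinates into one long key.
            long cx = ((long) Math.floor(x / cellSize)) & 0x1FFFFF;
            long cy = ((long) Math.floor(y / cellSize)) & 0x1FFFFF;
            long cz = ((long) Math.floor(z / cellSize)) & 0x1FFFFF;
            long key = (cx << 42) | (cy << 21) | cz;
            cells.computeIfAbsent(key, k -> new ArrayList<>()).add(id);
        }

        int instanceCount() {
            int total = 0;
            for (List<Integer> bucket : cells.values()) {
                total += bucket.size();
            }
            return total;
        }
    }

    /** Splits the instances (packed as x,y,z triples) into `parts` ranges and builds one index per range in parallel. */
    static List<CellIndex> build(float[] xyz, int parts, float cellSize) {
        int count = xyz.length / 3;
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<CompletableFuture<CellIndex>> futures = new ArrayList<>();
            for (int p = 0; p < parts; p++) {
                final int start = (int) ((long) count * p / parts);
                final int end = (int) ((long) count * (p + 1) / parts);
                futures.add(CompletableFuture.supplyAsync(() -> {
                    CellIndex index = new CellIndex(); // private to this task, no shared writes
                    for (int i = start; i < end; i++) {
                        index.insert(i, xyz[3 * i], xyz[3 * i + 1], xyz[3 * i + 2], cellSize);
                    }
                    return index;
                }, pool));
            }
            List<CellIndex> result = new ArrayList<>();
            for (CompletableFuture<CellIndex> future : futures) {
                result.add(future.join());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

The trade-off described above still applies: a query has to visit every index in the returned list, so culling gets a bit slower in exchange for a build that scales.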
11  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-05-13 20:11:42
Shadow mapping.
12  Game Development / Performance Tuning / Re: Performance Slowdown (LWJGL, OSX) on: 2015-05-10 20:51:49
Alright, thanks for the info.
13  Game Development / Performance Tuning / Re: Performance Slowdown (LWJGL, OSX) on: 2015-05-10 19:26:12
Nice catch! Like I said, swapBuffers() generally works as a "filler" when the GPU and/or driver thread still has work to do, so the fact that it barely appears in the OSX profile log points to something else taking a lot of time. There's a big chance that this bug is in LWJGL 3 too. Would be nice if someone could confirm that. Still, the stuttering that Morgan is getting is probably coming from his own code, although the CPU performance handicap from that bug is probably making the problem more noticeable since he has a smaller time budget to play with.
14  Game Development / Performance Tuning / Re: Performance Slowdown (LWJGL, OSX) on: 2015-05-10 18:20:45
Yes, the driver collects a number of draw calls and ships them all off after a certain threshold is reached, but it does not usually buffer all commands for a whole frame. On mobile, this may be different though. In those cases, the GPUs are often tile-based deferred renderers, which means that they do buffer up all rendered draw calls and state and then render the frame when it is resolved, but this can also happen on FBO switches and on read-backs of FBO textures, etc, so they still do not usually buffer a whole "frame". That being said, my knowledge of mobile is limited.

What I can say for certain is that in no way does the driver always defer all draw commands until you finish the frame, as there are numerous ways to trigger driver thread flushes (mapping a buffer) and entire GPU flushes (reading back data).
15  Game Development / Performance Tuning / Re: Performance Slowdown (LWJGL, OSX) on: 2015-05-10 17:41:55
I haven't looked at your code, but some drivers like postponing the rendering work up to the point when the result needs to be displayed, and that would be at swapBuffers, since then the rendering results must be produced and made visible on the screen.
I have not heard about any drivers doing this. What I have heard of is that drivers offload all OpenGL calls to a separate internal driver thread, which essentially makes the draw calls almost free on the game's thread. When the driver determines that the driver thread has fallen too far behind (either if the driver thread can't process the commands in time or if the GPU is not fast enough to consume them), it forces the game's thread to wait for the driver thread, which in my experience happens in nSwapBuffers(). Most drivers seem to implement this as a busy loop. I have never seen nUpdate() appearing in VisualVM since that one is usually free, which indicates an LWJGL problem.
16  Game Development / Performance Tuning / Re: Performance Slowdown (LWJGL, OSX) on: 2015-05-10 16:37:53
If you have a CPU bottleneck, the GPU timer queries are worthless. The GPU will be limited by how fast you can provide commands to it and will idle when no commands are present, hence the values will be inflated by this.

This looks like a pretty clear LWJGL bug/problem. Try to update LWJGL.

Stuttering on the other hand could be coming from something in your code. VisualVM will give you the average performance of the application, but it won't show you spikes. If suddenly you have a 20-40ms spike every second, it'll only take 2-4% of the CPU time but cause easily noticeable stuttering. TerrainSet.refreshAllMeshes() seems like the obvious place to start looking. Time it using System.nanoTime() and see if the spikes are your problem.
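A minimal way to catch such spikes with System.nanoTime() could look like the sketch below. SpikeTimer and its methods are names invented for this example, and the threshold is whatever you consider a spike for your frame budget.

```java
final class SpikeTimer {
    private final long thresholdNanos;
    private long worstNanos;
    private int spikes;

    SpikeTimer(double thresholdMillis) {
        this.thresholdNanos = (long) (thresholdMillis * 1_000_000.0);
    }

    /** Runs the task, measures it with System.nanoTime() and returns the elapsed nanoseconds. */
    long time(Runnable task) {
        long start = System.nanoTime();
        task.run();
        long elapsed = System.nanoTime() - start;
        record(elapsed);
        return elapsed;
    }

    /** Tracks the worst case and counts every measurement above the threshold. */
    void record(long elapsedNanos) {
        worstNanos = Math.max(worstNanos, elapsedNanos);
        if (elapsedNanos > thresholdNanos) {
            spikes++;
        }
    }

    int spikeCount() {
        return spikes;
    }

    double worstMillis() {
        return worstNanos / 1_000_000.0;
    }

    /** Share of total time taken by one spike of spikeMillis every periodMillis. */
    static double spikeShare(double spikeMillis, double periodMillis) {
        return spikeMillis / periodMillis;
    }
}
```

Wrapping the suspect call, for example timer.time(() -> terrainSet.refreshAllMeshes()) with a 10 ms threshold, and printing spikeCount() and worstMillis() once a second shows the worst case instead of the average. spikeShare(20, 1000) gives 0.02, which is the 2% mentioned above and the reason a sampling profiler barely notices such spikes.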
17  Game Development / Performance Tuning / Re: Performance Slowdown (LWJGL, OSX) on: 2015-05-09 22:31:09
That those two functions take a lot of time indicates a GPU bottleneck, OR a driver CPU bottleneck. If you have an Nvidia GPU, you can disable Threaded Optimization in the Nvidia Control Panel, which will disable the driver multithreading and might give you more information about a possible CPU bottleneck. You can also check the GPU load with GPU-Z and see if it's at 95%+, in which case your GPU is the bottleneck. Lastly, to figure out what exactly is slow, you can use GPU timer queries.
18  Discussions / Miscellaneous Topics / Re: Most unusual/weird syntax features in non-joke languages on: 2015-05-09 09:20:25
It's actually all quite commonplace in functional langs, but currying/partials/implicit-args/operators-are-just-functions/etc do appear magical at first to the uninitiated.

But one must approach with caution:
How do I the opposite of medal you for showing me that nightmare?
19  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-05-07 12:56:14
Found a bug in my test program (not the engine) which switched around the LOD models, so the low-resolution models were being used up close and for distant instances the high-resolution models were used, cutting my FPS by 1/3rd... >___<
20  Game Development / Newbie & Debugging Questions / Re: Instanced Rendering LWJGL on: 2015-05-05 18:55:31
You can use gl_InstanceID, a built-in GLSL vertex shader input (not a uniform), to access instance-specific values, for example data in a texture buffer.
21  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-05-05 17:40:19
Did some nice thread profiling of WSW.

Without threading (off the charts!):

With threading:

Task details:

Red, green and blue (x4 or x5) are physics updates. We do 250 updates per second to get good collision detection, so multiple ones are done for each frame rendered.
 - Red is the first threaded pass, where the position is updated based on velocity and then body-ground collisions are solved.
 - Green is the second pass where body-body collisions and body-triangle collisions are solved.
 - Blue is the third pass where body-ground collisions are solved again to eliminate any risk of falling through the ground (working on eliminating this).
 - Lastly, there's a fourth pass (white) which is so fast it's not visible. It's a single-threaded task which updates the body data structure in a thread-safe manner, but since it only processes bodies that actually need updating (checked in pass 4), it's almost instant.
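
Running a fixed number of physics updates per second independently of the frame rate is usually done with a time accumulator. The sketch below assumes that approach (the post does not show the actual loop), uses the 250 updates per second figure from above, and invents the class and method names. The leftover time is exposed as an interpolation factor, which is the kind of value a transform interpolation step between the previous and current physics state would use.

```java
final class FixedStepLoop {
    static final double STEP_SECONDS = 1.0 / 250.0; // 250 physics updates per second

    private double accumulator;

    /** Consumes one rendered frame's worth of time and runs as many fixed updates as fit; returns that count. */
    int advance(double frameSeconds, Runnable physicsStep) {
        accumulator += frameSeconds;
        int steps = 0;
        while (accumulator >= STEP_SECONDS) {
            physicsStep.run();
            accumulator -= STEP_SECONDS;
            steps++;
        }
        return steps;
    }

    /** Fraction in [0, 1) of a step that has passed since the last physics update. */
    double alpha() {
        return accumulator / STEP_SECONDS;
    }

    /** Blends the previous and current physics value for rendering. */
    static double lerp(double previous, double current, double alpha) {
        return previous + (current - previous) * alpha;
    }
}
```

At 60 FPS a frame of about 16.7 ms fits four 4 ms updates with some time left over, which carries into the next frame. A frame shorter than one step runs no update at all and only moves alpha, which is what lets rendering go above 250 FPS without the motion looking stepped.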

The last tasks are for the rendering of the game.
 - Purple is actually part of the physics engine. It interpolates transforms for each body so the instance gets a properly interpolated matrix (allowing >250 FPS and reducing stutter at <250 FPS).
 - Then there's a gap, which comes from updating some rendering data (updating the animation frame of all 3D models, the camera) which is single-threaded in this threading program.
 - Yellow (single-threaded) is the light culling. It's fast so no real need to thread it.
 - Brown is 3D model "resetting" which prepares the 3D models for being culled by updating their bounding box. It also handles instance ID handling for SRAA.
 - Black is shadowed light matrix generation, which generates views for shadow maps to be rendered.
 - Reddish (single-threaded) is octree generation for model culling.
 - Light blue is terrain culling. It determines what terrain "tiles" are visible and constructs instance lists for each view (each light in that scene has 6 shadow maps which are also handled here).
 - Pinkish is model culling. Constructs instance lists for all views for 3D models.
 - Dark blue is terrain tile data packing in VBOs (an invisible green task maps the VBO after culling is done).
 - Red is model skeleton calculation and VBO uploading of instance data and visibility data for each view.
( - Last blue task is a rogue tile data packing subtask.)
 - Finally, the last black task (running on the top OpenGL thread) sends off all draw calls for the entire scene to the driver's internal thread.

There are some other tasks too but they are too fast/simple to mention (3 buffer mapping and 3 buffer unmapping tasks)

1 thread: 14 FPS
8 threads: 48 FPS (Fraps screwed up the FPS in the screenshot...)
= 3.43x scaling on a quad core with Hyper-threading. Quite good considering all the draw calls are single-threaded due to how OpenGL works and the driver only has 1 internal thread.

Realized that the skeleton calculation was way too slow. Turns out I was calculating skeletons for instances that eventually were prevented from being rendered due to being LODed out. Added an extra LOD check during culling and the FPS went to 25 with 1 thread and 82 with 8 threads, or 3.28x scaling. Scaling dropped due to a bigger percentage of the frame time being spent in single-threaded parts.

Turns out I actually got GPU limited after the skeleton animation optimization at 8 threads. Reducing the resolution a bit under 1920x1080 gave me an FPS of 26 with 1 thread and 89 with 8 threads, which equals 3.42x scaling like before!
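
The scaling figures above are simply multi-threaded FPS divided by single-threaded FPS. As a rough sanity check, Amdahl's law can turn a measured speedup into an implied single-threaded fraction of the frame. The sketch below is only an estimate: treating the CPU as exactly 4 cores (ignoring what Hyper-threading adds) is an assumption, and the class and method names are made up.

```java
final class ThreadScaling {
    /** Speedup as used in the post: multi-threaded FPS over single-threaded FPS. */
    static double speedup(double fpsSingle, double fpsMulti) {
        return fpsMulti / fpsSingle;
    }

    /** Amdahl's law: speedup = 1 / (s + (1 - s) / n) for serial fraction s on n cores. */
    static double amdahl(double serialFraction, int cores) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / cores);
    }

    /** Amdahl's law solved for s: s = (1 / speedup - 1 / n) / (1 - 1 / n). */
    static double serialFraction(double speedup, int cores) {
        double inverseCores = 1.0 / cores;
        return (1.0 / speedup - inverseCores) / (1.0 - inverseCores);
    }
}
```

speedup(14, 48), speedup(25, 82) and speedup(26, 89) reproduce the 3.43x, 3.28x and 3.42x figures. Under the 4-core assumption, the lower 3.28x corresponds to a larger implied serial fraction than 3.43x, which matches the explanation that a bigger share of the frame was spent in single-threaded (or GPU-bound) work.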
22  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-05-05 16:45:58
Thanks. Seems like the apartment's mostly okay, and it doesn't look like all hope is out for the summer job, although it won't be what I expected and it seems like it's not as much as it initially was. Well, better than nothing...

As long as you've got some source of income, you'll be fairly better off than a lot of other people. I'm sure there are plenty of jobs available for programmers though, especially talented ones?
My only income is student aid, and I don't get that during the summer. The problem is that I declined my previous summer job to get that one... Luckily it seems like not all hope is lost with the new one, although it seems like it'll be a different task than what I had in mind.
23  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-05-04 17:24:24
Lost my summer job and the apartment I plan on living in in a few weeks had a break-in.

That really sucks man.. I'm sorry about that. I hope you're feeling better, that must've been a pretty tough blow.
Thanks. Seems like the apartment's mostly okay, and it doesn't look like all hope is out for the summer job, although it won't be what I expected and it seems like it's not as much as it initially was. Well, better than nothing...
24  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-05-03 15:09:17
Lost my summer job and the apartment I plan on living in in a few weeks had a break-in.
25  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-05-03 09:03:52
I rewrote some of my mapped VBO managing code and improved the two slowest renderers in Insomnia. Before, every view (camera/shadow map) had its own set of VBOs for what was visible for that view. Each view frustum culled terrain and 3D models, constructed a list of instances that were visible, mapped a buffer, placed the data in the buffer, unmapped the buffer, then rendered itself. All views like these were calculated in parallel to the extent that this was possible (cull views and construct lists in parallel, map buffers in OpenGL thread, data uploaded in parallel, buffers unmapped in OpenGL thread, render everything in OpenGL thread). This had a number of problems that were quite difficult to detect.

1. Mapping a lot of buffers is slow as hell. In my stress test, I had almost 1000 point lights visible with 6 shadow maps each, resulting in over 10 000 VBOs being mapped each frame. Persistent mapped buffers "fixed" this as the map operation could be avoided, but that doesn't help on OGL3 GPUs, where every one of those map calls still causes a shitload of driver overhead. Mapping a lot of buffers also kills the driver's internal multithreading, as each map operation causes synchronization inside the driver.

2. Due to how my code was structured, each pass needed to fetch render data (visibility lists, VBO) of each view from a tiny little HashMap. It turned out that the simple map and unmap passes run on the OpenGL thread were locking up the OpenGL thread for much longer than they should be due to the overhead of having to fetch the render data.

3. My engine can seamlessly switch between using unsynchronized VBOs and persistently mapped VBOs. It turns out that glMapBufferRange()'s ability to reuse the previous ByteBuffer instance is very limited.
The old_buffer argument can be null, in which case a new ByteBuffer will be created, pointing to the returned memory. If old_buffer is non-null, it will be returned if it points to the same mapped memory and has the same capacity as the buffer object, otherwise a new ByteBuffer is created.
In other words, it can only reuse the previous ByteBuffer if you map the exact same number of bytes each frame and the driver gives you the exact same memory address (which it should unless you reallocate the VBO). When using unsynchronized VBOs, the number of visible objects for each view changed pretty much every frame, causing new ByteBuffers to be allocated each frame. In my stress test, this amounted to over 10 MB of garbage per second in the form of ByteBuffers, Cleaners and Deallocators.

The solution was pretty simple, but required some rewriting. I modified my renderers to use a shared VBO system, so instead of each view mapping its own buffers, it "reserves" a part of a shared VBO instead. Since I'm only mapping a handful of VBOs each frame now, the driver overhead is minimal. Fewer mapped VBOs also means fewer ByteBuffers created each frame. In addition, the mapping and unmapping code for unsynchronized VBOs does not need to query the render data of each view anymore since it only needs to know the total amount of data needed, removing a lot of HashMap overhead. Although I expected this to yield a noticeable performance increase when using unsynchronized VBOs, I did not expect unsynchronized VBOs to become as fast as persistent VBOs. In addition, both unsynchronized and persistent VBOs are significantly faster than before.
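
The reservation idea can be sketched in plain Java, with a ByteBuffer standing in for the one mapped VBO. This is an illustration under assumptions, not the engine's code: SharedBufferAllocator and its methods are hypothetical names, and alignment and per-frame buffer rotation are left out. Views reserve byte ranges first (from any thread), the total is then known for a single map call, and each view writes through its own slice.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

final class SharedBufferAllocator {
    private final AtomicInteger next = new AtomicInteger();

    /** Thread-safe: claims `bytes` bytes of the shared buffer and returns the start offset. */
    int reserve(int bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must not be negative");
        }
        return next.getAndAdd(bytes);
    }

    /** Total bytes reserved so far, i.e. how much has to be mapped this frame. */
    int totalBytes() {
        return next.get();
    }

    /** Starts a new frame; all previous reservations are forgotten. */
    void reset() {
        next.set(0);
    }

    /** Independent view of one reserved range, so workers never share position or limit state. */
    static ByteBuffer slice(ByteBuffer mapped, int offset, int bytes) {
        ByteBuffer duplicate = mapped.duplicate();
        duplicate.clear();                 // position 0, limit = capacity
        duplicate.limit(offset + bytes);
        duplicate.position(offset);
        return duplicate.slice().order(mapped.order());
    }
}
```

With this layout the map and unmap steps only need totalBytes(), not any per-view lookup, and the number of buffers mapped per frame no longer depends on the number of views.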

Note: Threading in the below table refers to the Threaded Optimization setting in the Nvidia Control Panel, which controls driver multithreading. Both Intel and AMD also have multithreaded drivers, but do not allow the user to override the driver's automatic selection.

Technique                     | Old FPS | New FPS | % improvement
Unsynchronized, threading off | 34 FPS  | 52 FPS  | 53%
Unsynchronized, threading on  | 19 FPS  | 62 FPS  | 226%
Persistent, threading off     | 46 FPS  | 53 FPS  | 15%
Persistent, threading on      | 57 FPS  | 64 FPS  | 12%

This is mostly an optimization for computers with older GPUs that don't support persistent VBOs, where the improvement is 82% (34 FPS --> 62 FPS), but even computers with GPUs that support persistent buffers got a 12% increase from the improved parallelism due to the reduced HashMap overhead. Using a single thread in Insomnia and disabling threaded optimization so it only uses 1 thread, I get 21 FPS. With 8 threads on my Hyperthreaded quad core and threaded optimization, I get 64 FPS = 3.05x scaling.
26  Discussions / Miscellaneous Topics / Re: What do you define as a programming language? on: 2015-05-02 13:46:47
Oh, god, no...
27  Discussions / Miscellaneous Topics / Re: What do you define as a programming language? on: 2015-05-02 02:43:27
EDIT: aha!
Sigh. You can't possibly imply that not being able to make a game in the language isn't a drawback. Pointing out objective drawbacks isn't ridicule. Yes, my tone was immature, but my point still stands. I'm saying that if I can't make a game in a programming language, then it is objectively worse than a Turing complete language, and hence it's not a "real" programming language. Turing completeness is the best threshold we have for when we go from data to code. If you choose to disregard something that is widely accepted, then this whole discussion is pointless.
28  Discussions / Miscellaneous Topics / Re: What do you define as a programming language? on: 2015-05-02 01:14:07
The definition of a programming language is a Turing complete language. Languages/formats/whatever that aren't Turing complete cannot be used to implement all algorithms, so we don't count them as programming languages.

See I disagree, what about deliberately non-complete systems such as total functional languages?
So... you can't make a game in them since games aren't provably terminating. Sounds like a shitty paradigm. It's like saying "What about Java without loops that cannot be provably terminating?". You can't take a programming language, chop off its limbs one after another and then still call it a programming language.
29  Discussions / Miscellaneous Topics / Re: screen scraping as a hacking method? on: 2015-05-01 23:58:29
If it's a simple color-clicker, it could be SCAR. That's actually how I got into programming. That being said, Chrislo's suggestion sounds more likely.

30  Discussions / Miscellaneous Topics / Re: What do you define as a programming language? on: 2015-05-01 23:50:48
This is kind of like asking what everyone's stance on global warming is; it doesn't really matter what your opinion is since there are pretty much facts about it. The definition of a programming language is a Turing complete language. Languages/formats/whatever that aren't Turing complete cannot be used to implement all algorithms, so we don't count them as programming languages.