  Show Posts
1  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-03-26 05:11:40
I've been admiring and tweaking my new SSAO effect. Doing SSAO at half resolution with high-quality grass straws covering half the screen led to the SSAO becoming an aliased, shimmering mess, AKA eye cancer. After working on it over the weekend, I finally got all 3 passes of the shader fast enough to run at full resolution, and the difference is HUGE. My goal was to get it down to 1-2 ms at 1920x1080, and I managed to get it to around 1.3 ms, while at 2560x1440 I ended up at 2.1 ms. The best part is that the new shader has a million times better texture cache coherency. The old one would choke once it started sampling random pixels in a 25+ pixel radius around each pixel due to texture cache thrashing, but the new algorithm does some clever precomputation on the depth buffer to improve performance. Together with ALU optimizations, the new shader is over 2x faster than the old shader's absolute best case, and 20-30x faster than its worst case. The bilateral blur shader also got a ~2x speedup thanks to simplifications of the algorithm and to packing the SSAO value and the depth value it needs into a single texture, halving the number of texture samples. All in all, it works, and it looks gorgeous now that I don't have a radius limit. But now to the pictures!


This is with SSAO off. When in the shadow of the sun, it gets extremely hard to see where things are and what shape they have. We can't really see if the building is floating in the air or if it actually touches the ground. The corner right behind MORS (the character) is almost invisible due to the flat colors, and the only reason you guys even know there's a corner there is thanks to the blue stripe following the wall. MORS himself also looks flat and boring. His legs are essentially just two shades of gray, and his arms also look extremely flat. The detail simply isn't there when there's no direct lighting to bring it out.


With SSAO, the scene gets a whole new level of depth. Not only is the vertical corner behind MORS clearly visible, we suddenly see an additional little horizontal crease to the bottom left of the building that was completely invisible before. The grass where the building stands is slightly darker too, showing that the building is indeed standing on the ground. MORS also looks significantly better. The area under his arms and around his neck and feet are significantly darker due to being occluded, bringing out a lot more detail in the lighting.

Of course, this effect is completely dynamic and isn't precomputed in any way. It does have some limitations since it works on screen-space information, i.e. the depth buffer. For example, the background seen between his legs is too bright: his legs block the depth information needed to detect that the background should be occluded, so there's no way to know. Still, the overall quality of the scene is massively improved.
2  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-03-18 05:34:10
No shit. Every time we release a new major build I have to go through my shaders and code to Intel-proof it. I don't know what I expected this time to be honest. I don't think I've ever managed to compile a GLSL 400+ shader at all on Intel in the first place. Intel + OpenGL = ButIPoopFromThereException.
3  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-03-18 04:58:37
Today I tried to compile Intel's own Adaptive Order-Independent Transparency algorithm on an Intel graphics card and immediately regretted my decision when the entire driver crashed during shader linking. I think this is pretty much the ultimate proof of what kind of state Intel drivers are in.
4  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-03-15 00:58:12
Did you try sorting those quads from front to back? Alpha testing disables the early depth test but does not disable hi-z. It probably doesn't matter, since your grass blades seem to be a lot thinner than the hi-z resolution. Is that foliage deferred or forward?

I haven't tested sorting them; they're rendered in the order they're culled from the quadtree, which essentially means they're sorted in one direction. I don't get a noticeable difference by rotating the camera, which should swing between essentially best- and worst-case ordering depending on the angle.

Everything's deferred. 4 render targets.
5  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-03-14 21:47:01
I improved the performance of the grass rendering a lot. This scene was running at 44 FPS. With the new optimizations, the exact same scene, down to the pixel, runs at exactly 60 FPS.



The grass is drawn as a shitload of simple quads. This causes heavy overdraw, to the tune of 34 million fragments, or 16x+ overdraw per pixel at 1920x1080, before depth and alpha testing. Since alpha testing is used, early depth testing is disabled, which means that even though something like 80-90% of those fragments never end up on the screen, they still cost a lot of performance. Add the fact that the grass is rendered to a G-buffer consisting of 3-4 render targets and you end up with extremely bad performance. What I did was separate the grass rendering into two passes. First I do a depth-only pass using a simple shader that just alpha-tests the texture; this is reasonably fast even without early-Z since the shader is so trivial. In the second pass, I DISABLE alpha testing and change the depth test function to GL_EQUAL, then render the grass as usual. The result is pixel-identical to the single-pass version, but performance improves a lot even though everything is rendered twice, because the expensive G-buffer shader now gets early depth testing.
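Here's a minimal sketch of the two passes in LWJGL 2 terms; the shader handles and drawGrassQuads() are placeholders for the real grass renderer, not actual WSW code:

import static org.lwjgl.opengl.GL11.*;
import static org.lwjgl.opengl.GL20.glUseProgram;

public class GrassTwoPass {
    // depthOnlyProgram: trivial shader that samples the grass texture and
    // discards fragments below the alpha threshold.
    // gBufferProgram: the full G-buffer shader, with alpha testing removed.
    static int depthOnlyProgram, gBufferProgram;

    static void renderGrass() {
        // Pass 1: depth-only, alpha-tested. No color writes, so only the
        // cheap shader pays the cost of early-Z being disabled.
        glColorMask(false, false, false, false);
        glDepthMask(true);
        glDepthFunc(GL_LEQUAL);
        glUseProgram(depthOnlyProgram);
        drawGrassQuads();

        // Pass 2: full G-buffer shading. With alpha testing off, early depth
        // testing is active again, and GL_EQUAL makes sure only the fragments
        // that survived pass 1 run the expensive shader.
        glColorMask(true, true, true, true);
        glDepthMask(false); // depth has already been laid down
        glDepthFunc(GL_EQUAL);
        glUseProgram(gBufferProgram);
        drawGrassQuads();

        // Restore default state.
        glDepthMask(true);
        glDepthFunc(GL_LEQUAL);
    }

    static void drawGrassQuads() { /* issue the actual grass draw calls here */ }
}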
6  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-03-12 16:05:31
I did a lot of stuff today and yesterday, but one of the more fun things was adding detection of whether the driver is overriding the number of multisampling samples. Basically, I create two small multisampled textures with different sample counts. If both textures end up with the same sample count when queried, it's an indicator that the driver is ignoring the requested count and forcing one from a user setting. Since many of my shaders loop over the number of samples, a driver forcing 8 samples while the game settings say 2 would mean the shaders only process the first 2 samples and ignore the rest. With the change, the sample count is updated to the driver's reported count whenever an override is detected.
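A minimal sketch of that check, assuming LWJGL 2 and GL32-style multisampled textures (the texture size and the two probe sample counts are arbitrary):

import static org.lwjgl.opengl.GL11.*;
import static org.lwjgl.opengl.GL32.*;

public class MsaaOverrideDetector {

    /** Returns the sample count the shaders should loop over. */
    public static int effectiveSampleCount(int requestedSamples) {
        int a = allocatedSamples(2);
        int b = allocatedSamples(4);
        // Two different requests coming back identical is a strong hint that
        // the driver is forcing a sample count from a user setting.
        return (a == b) ? a : requestedSamples;
    }

    private static int allocatedSamples(int samples) {
        int tex = glGenTextures();
        glBindTexture(GL_TEXTURE_2D_MULTISAMPLE, tex);
        glTexImage2DMultisample(GL_TEXTURE_2D_MULTISAMPLE, samples, GL_RGBA8, 16, 16, false);
        int actual = glGetTexLevelParameteri(GL_TEXTURE_2D_MULTISAMPLE, 0, GL_TEXTURE_SAMPLES);
        glDeleteTextures(tex);
        return actual;
    }
}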
7  Game Development / Newbie & Debugging Questions / Re: Check if it's a combo/streak on: 2015-03-09 13:12:11
An algorithm that can be extended to any combo length would be one that counts consecutive elements, first on rows and then on columns.

private final static int COMBO_COUNT = 3;

for(int y = 0; y < numRows; y++){
    Tile element = grid[y][0];
    int occurrences = 1;
    for(int x = 1; x < numCols; x++){
        Tile t = grid[y][x];
        if(t.equals(element)){
            occurrences++;
        }else{
            element = t;
            occurrences = 1;
        }
        if(occurrences >= COMBO_COUNT){
            //Combo of length COMBO_COUNT on row
        }
    }
}


You then need a second block like this for detecting combos in columns (see the sketch below). This should be faster, especially for higher combo counts. In the nested loop others have suggested, an average element is processed COMBO_COUNT*2 times; put another way, each iteration of the nested loop checks COMBO_COUNT*2 tiles. My code only needs to check each element once per dimension, i.e. twice for a 2D array, regardless of combo count.
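For completeness, the matching column pass (same grid/Tile assumptions as the row version above):

for(int x = 0; x < numCols; x++){
    Tile element = grid[0][x];
    int occurrences = 1;
    for(int y = 1; y < numRows; y++){
        Tile t = grid[y][x];
        if(t.equals(element)){
            occurrences++;
        }else{
            element = t;
            occurrences = 1;
        }
        if(occurrences >= COMBO_COUNT){
            //Combo of length COMBO_COUNT in column
        }
    }
}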
8  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-03-07 15:25:34
Changed my resolution to 1920x1080 on my monitor whose native resolution is 1600x900
You can enable DSR or the AMD equivalent, you know.
9  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-03-05 20:33:09
I wrote a new heat spread compute shader. This one has two main improvements. First, instead of each object having up to 4 neighbors it is connected to, I now store a list of neighbors, so the maximum number of neighbors is basically only limited by performance and/or int address precision, whichever becomes a problem first. This allows for more varied shapes than simple tiles, and also enables optimizations, since a large object can be represented by a single component instead of a huge number of tiles filling its volume. Secondly, the shader now takes not only the thermal conductivity into account, but also the thermal capacity (basically how much energy is needed to raise the temperature of an object by 1 degree). Here are some pretty pictures of heat spread in action! (A CPU sketch of the update rule follows the pictures.)

Initial conditions. The white object is 100 degrees hot, the rest is 0.


The objects closest to the hot objects immediately start heating up as well.


The heat has dispersed enough to not be very concentrated anymore.


The temperature has mostly evened out over all 256 objects in the system.


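A plain-Java sketch of what one step of the update boils down to, assuming a flattened neighbor-list layout; all names here are illustrative, not the shader's actual bindings:

public class HeatStep {
    // Edges for object i live at neighborStart[i] .. neighborStart[i] + neighborCount[i].
    static void step(float[] temp, float[] newTemp,
                     int[] neighborStart, int[] neighborCount,
                     int[] neighborIndex, float[] conductivity,
                     float[] capacity) {
        for (int i = 0; i < temp.length; i++) {
            float energy = 0;
            for (int n = 0; n < neighborCount[i]; n++) {
                int edge = neighborStart[i] + n;
                int j = neighborIndex[edge];
                // Energy flows along the temperature gradient, scaled by the
                // conductivity of this particular connection.
                energy += conductivity[edge] * (temp[j] - temp[i]);
            }
            // Thermal capacity converts transferred energy into a temperature
            // change: a higher capacity means a smaller change per unit energy.
            newTemp[i] = temp[i] + energy / capacity[i];
        }
    }
}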
EDIT: Haha, I hit the Texture Buffer limit of addressable texels... 134 217 728... That's really low... q.q
10  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-03-01 22:00:51
In addition to a lot of bug fixing on WSW, I did some compute shader experiments.

Basically I developed a generic compute shader for "simulating" some super basic heat conductivity. No actual graphics for it yet, though. The idea is to supply the compute shader with a massive list of tiles. Each tile has a heat value (temperature/energy/whatever), the tile indices of its 4 neighbors, and the conductivity coefficients to those neighbors. Heat spreading between tiles is simulated using something similar to a Gaussian blur, with the weights modified by conductivity. My current performance tests look promising: at 60 updates per second I can update 64 000 000 tiles, or put another way, 3 841 707 432 tiles per second. Now, I wonder what it'll be useful for... =3
11  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-03-01 13:14:08
Did some basic 2D bloom. I think it brightens the game up quite a bit. Pun intended.

With:
http://puu.sh/ghvFx/2720d0c38f.jpg
Without:
http://puu.sh/ghvGO/94e37f6095.jpg
I don't think that's how bloom works. I can't really see the difference except the brightness.
12  Java Game APIs & Engines / Engines, Libraries and Tools / Re: Java OpenGL Math Library (JOML) on: 2015-02-27 17:52:19
For tools...sure.  There are existing libraries.  Runtime:  You don't need doubles.

(EDIT: the extra precision is more often the interesting part...not the wider range)
By large range of values, I meant that you might have relatively large values (= a wider range of values) but you still need high precision, so we're basically saying the same thing.
13  Java Game APIs & Engines / Engines, Libraries and Tools / Re: Java OpenGL Math Library (JOML) on: 2015-02-27 14:53:17
Doubles are for serious computing.
Or, you know, a large range of values.
14  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-02-26 11:09:58
That would work too, of course.
15  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-02-26 08:11:22
TL;DR:
 - Don't assume anything about the ordering of vertex attributes! Nvidia assigns them in the order they're defined, but AMD does not!
 - Make sure you set vertex attribute and uniform locations BEFORE linking the shader! This is a silent "error" as OpenGL expects you to relink the shaders to apply those settings.
 - If your shader only outputs a vec3 instead of all 4 color channels, Nvidia and AMD behave differently. Nvidia outputs 0.0 for the alpha value while AMD outputs 1.0.

Fixed two critical bugs that only affected AMD GPUs!

The first one was a bug in my particle rendering code that caused the particles to appear with random colors (usually bright magenta) and random sizes. The problem was with the vertex attributes, but it was only visible on AMD GPUs. In essence, I set the vertex attribute locations AFTER linking the shader, so they were never applied; AMD's GLSL compiler simply assigned different locations, so the particle data was read incorrectly.
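The fix is purely a matter of ordering: bind the locations first, then link. A minimal LWJGL 2 sketch, with example attribute names:

import static org.lwjgl.opengl.GL20.*;

public class ShaderSetup {
    // Locations bound after glLinkProgram() silently do nothing until the
    // program is relinked, which is exactly the trap described above.
    static void bindAttributesAndLink(int program) {
        glBindAttribLocation(program, 0, "in_position"); // example names
        glBindAttribLocation(program, 1, "in_color");
        glBindAttribLocation(program, 2, "in_size");
        glLinkProgram(program);
    }
}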

Secondly, another piece of useful information for the future! My SRAA was not working correctly on AMD cards, and I tracked the problem down to my lighting code. In essence, SRAA works by rendering the scene twice. In the first pass the scene is rendered and lit as normal, but we also store a semi-unique triangle ID for each pixel. In the second pass, we render ONLY triangle IDs to a multisampled texture. In a resolve pass, the non-multisampled lighting result is "upsampled" to MSAA quality by matching triangle IDs. In practice it's a lot more complicated, as I use both the current and previous frames in the upsampling, but that's the basic idea.

Anyway, the point here is how I stored the triangle ID in the first pass. I render and light the scene to a GL_RGBA16F render target, with accumulated light intensity stored in RGB and the triangle ID stored in A, the alpha channel. During lighting, the shader only outputs an RGB color:
out vec3 fragColor;

In this case, Nvidia's shader compiler set alpha to 0.0, and since I was using additive blending this left the triangle ID unmodified. AMD's compiler however set alpha to 1.0, which corrupted the triangle IDs completely, leaving my SRAA resolve shader utterly confused. Since the value of an unwritten channel is undefined, the fix is simply to output all four channels explicitly.
16  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-02-24 05:12:03
Implemented AOIT to compare it to my own OIT algorithm... Too tired for screenshots though. x___x
17  Java Game APIs & Engines / Engines, Libraries and Tools / LWJGL problem with shared contexts on: 2015-02-22 01:32:17
For various reasons we recently moved our asset loading code to its own threads. These threads have their own shared OpenGL contexts, but it seems like we don't actually need that, so we'll probably make them normal threads. Anyway, the point is that sometimes when a shared context is made current on a new thread, it crashes due to some kind of concurrency issue. This is with the latest stable version of LWJGL 2. The problem seems related to the capabilities checking that the context switch triggers.

Quote
java.lang.IllegalStateException: Function is not supported
   at org.lwjgl.BufferChecks.checkFunctionAddress(BufferChecks.java:58)
   at org.lwjgl.opengl.GL11.glGetError(GL11.java:1301)
   at org.lwjgl.opengl.Util.checkGLError(Util.java:57)
   at org.lwjgl.opengl.GLContext.getSupportedExtensions(GLContext.java:280)
   at org.lwjgl.opengl.ContextCapabilities.initAllStubs(ContextCapabilities.java:5802)
   at org.lwjgl.opengl.ContextCapabilities.<init>(ContextCapabilities.java:6240)
   at org.lwjgl.opengl.GLContext.useContext(GLContext.java:374)
   at org.lwjgl.opengl.ContextGL.makeCurrent(ContextGL.java:195)
   at org.lwjgl.opengl.DrawableGL.makeCurrent(DrawableGL.java:110)
   at org.lwjgl.opengl.SharedDrawable.makeCurrent(SharedDrawable.java:47)
   at engine.gl.GLThread.run(GLThread.java:60)

Quote
org.lwjgl.LWJGLException: GL11 not supported
   at org.lwjgl.opengl.ContextCapabilities.initAllStubs(ContextCapabilities.java:5806)
   at org.lwjgl.opengl.ContextCapabilities.<init>(ContextCapabilities.java:6240)
   at org.lwjgl.opengl.GLContext.useContext(GLContext.java:374)
   at org.lwjgl.opengl.ContextGL.makeCurrent(ContextGL.java:195)
   at org.lwjgl.opengl.DrawableGL.makeCurrent(DrawableGL.java:110)
   at org.lwjgl.opengl.SharedDrawable.makeCurrent(SharedDrawable.java:47)
   at engine.gl.GLThread.run(GLThread.java:60)

Creating a simple test program to reproduce the problem failed as usual...
18  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-02-17 19:24:31
I found some nice optimizations for bounds testing in shaders. Basically my motion blur was blurring over the edge of the screen, resulting in either black color if I used texelFetch() which gave a dark aura around the screen when in motion, or being clamped to the edge pixel if I used texture() which inflated the weights of the edge pixels. Neither of these looked very good, so I decided to simply remove the samples that fell outside the screen. However, detecting these scenarios was expensive.
...

I have seen this kind of pattern multiple times. It's HLSL but maps well to hardware.

if (dot(1.0, saturate(texCoords) - texCoords) != 0.0)
  samples += 1.0;


I tried a similar version too:
samples += float(texCoords == clamp(texCoords, vec2(0), resolution));


8.30 pixels per second.

EDIT: If your coordinates are normalized and you use clamp(<value>, 0.0, 1.0), the clamp becomes free, in which case this one is ALMOST as fast as the float multiplied version (version 4).

EDIT2: Turns out that clamp DIDN'T become free. Since the texture coordinate varying input is never modified, there is no previous instruction that clamp can piggyback on, so it needs to add a MOV instruction with CLAMP to get the same result. It's faster than a MIN and a MAX, but still slower than the optimized float multiplied thingy.
19  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-02-17 15:49:19
I found some nice optimizations for bounds testing in shaders. Basically my motion blur was blurring over the edge of the screen, resulting in either black color if I used texelFetch() which gave a dark aura around the screen when in motion, or being clamped to the edge pixel if I used texture() which inflated the weights of the edge pixels. Neither of these looked very good, so I decided to simply remove the samples that fell outside the screen. However, detecting these scenarios was expensive.

Simply using an if-statement to test whether the coordinates are inside the screen was dead slow as this was compiled to 1 branch per sample.
if(texCoords.x >= 0 && texCoords.y >= 0 && texCoords.x < resolution.x && texCoords.y < resolution.y){
   samples += 1.0;
}


However, there's a trick you can use. By casting the resulting boolean to a float, we can convert the boolean to 1.0 if it's true and 0.0 if it's false! Exactly what we want!
samples += float(texCoords.x >= 0 && texCoords.y >= 0 && texCoords.x < resolution.x && texCoords.y < resolution.y);

Sadly, this still compiles to the same thing as the if-statement. That's weird since I know that GPUs have specific instructions to set the value of a register based on a simple comparison. As an example, this line:
float x = float(someValue < 1);

compiles to this instruction:
x: SETGT       R0.x,  1.0f,  PV0.x

It would seem that the boolean &&-operators are messing this up, causing the compiler to fall back to branches. Let's try casting the result of each comparison to a float and then using simple float multiplies to effectively AND them together.
samples += float(texCoords.x >= 0)*float(texCoords.y >= 0)*float(texCoords.x < resolution.x)*float(texCoords.y < resolution.y);

Bam! The comparison compiles to 2 SETGE (greater-equals) instructions, 2 SETGT (greater-than) instructions and 3 multiplies. I need to do this 16 times per pixel, once for each sample, so this saves a load of work! There is one final optimization we can make to improve this code on AMD's vector GPUs. AMD's older GPUs are a bit funny in that they run each shader on 4 or 5 different cores at the same time, trying to extract as much parallelism as possible. This code:
float x = a + b + c + d;

would fit this extremely badly. GLSL requires GPUs to preserve the order of operations, so none of these additions can run in parallel. First we do a+b, then (a+b)+c, then finally ((a+b)+c)+d, which requires 3 cycles. If we add some "unnecessary" parentheses, we can encourage vector GPUs to do these additions in parallel without affecting the performance of scalar GPUs that don't have this problem:
float x = (a + b) + (c + d);

This only takes 2 cycles, as a+b and c+d can both be calculated in the first cycle, and then (a+b)+(c+d) can be calculated in the second cycle, making this chain of addition 50% faster. Doing this for the bounds testing gives this code:
samples += (float(texCoords.x >= 0)*float(texCoords.y >= 0))*(float(texCoords.x < resolution.x)*float(texCoords.y < resolution.y));


Theoretical performance of the 4 versions of bounds checking done for 16 samples on a Radeon HD 6870:
1. 2.04 pixels per second
2. 2.02 pixels per second
3. 11.20 pixels per second
4. 11.79 pixels per second

All in all, that's a 5.78x improvement compared to a naive if-statement.
20  Game Development / Newbie & Debugging Questions / Re: Byte buffer crashing on: 2015-02-11 12:11:30
Checking for errors using glGetError() can ruin performance...
Sorry for saying that, but saying this right after spasi pointed out to check for a GL error is just stupid.

As long as there is evidence of your code having a programming error in it, you should not give a damn about performance at all, but instead should focus on getting rid of the source of the error. Therefore you most certainly want to insert an error check after each and every single GL call...

If I wanted my program to crash without a decent stack trace in a way that's hard to debug I wouldn't use Java in the first place. I can check for null every time I map a buffer, but having to call glGetError() if debug callbacks aren't supported is a nightmare.
21  Game Development / Newbie & Debugging Questions / Re: Byte buffer crashing on: 2015-02-11 11:45:04
Checking for errors using glGetError() can ruin performance since it causes a driver thread sync. Since there is no way (?) to figure out the address of the returned buffer, I'd strongly prefer it returning null. That makes it much clearer what's going on.
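For reference, the kind of per-call check being debated looks roughly like this (a sketch assuming LWJGL 2; the performance cost is the driver thread sync mentioned above):

import static org.lwjgl.opengl.GL11.*;

public class GLDebug {
    public static final boolean DEBUG = true; // disable for release builds

    // Sprinkle after GL calls while debugging; every call syncs with the
    // driver thread, which is why leaving it on ruins performance.
    public static void checkError(String location) {
        if (!DEBUG) return;
        int error = glGetError();
        if (error != GL_NO_ERROR) {
            throw new IllegalStateException("OpenGL error 0x"
                    + Integer.toHexString(error) + " after " + location);
        }
    }
}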
22  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-02-10 03:15:56
Rewrote my texture streaming to accommodate a few new things based on my findings in this thread: http://www.java-gaming.org/topics/files-on-your-harddrive-can-be-mapped-and-passed-directly-into-textures-buffers/35504/msg/336278/view.html#msg336278. I ended up not using mapped files, but I did rewrite it to use FileChannels instead of InputStreams, which should give a decent performance boost anyway. The biggest change lies in how textures can be sourced. Before, the streamer simply looked in a directory you specified, which got awfully messy once you had 50 textures or so. Now it's possible to put the textures in multiple folders, or even pack them together into a single massive file, and feed the streamer a TextureSource object for that folder/file, and tadaa, the streamer can magically find them!
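The source abstraction itself is tiny; hypothetically it could look something like this (invented names for illustration, not the actual streamer API):

import java.io.IOException;
import java.nio.channels.ReadableByteChannel;

// Hypothetical sketch -- the real TextureSource API may differ.
interface TextureSource {
    // Whether this folder or pack file contains the named texture.
    boolean contains(String name);

    // Opens a channel positioned at the start of the texture's data.
    ReadableByteChannel open(String name) throws IOException;
}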
23  Game Development / Shared Code / Re: Files on your harddrive can be mapped and passed directly into textures/buffers on: 2015-02-04 00:55:52
Riven, you mentioned writing my own file mapping implementation using JNI. That sounds a bit annoying. Isn't there any simple implementation of it out there that I can use?
24  Game Development / Shared Code / Re: Files on your harddrive can be mapped and passed directly into textures/buffers on: 2015-02-03 20:29:39
I'll look into this more and update the first post to reflect this discussion once I've gotten rid of the 25cm of snow that fell outside our house...
25  Game Development / Shared Code / Re: Files on your harddrive can be mapped and passed directly into textures/buffers on: 2015-02-03 19:48:22
Is there a serious risk of an out-of-memory crash unless I do that?
26  Game Development / Shared Code / Re: Files on your harddrive can be mapped and passed directly into textures/buffers on: 2015-02-03 19:22:06
Ah, right. =3 No problem.
27  Game Development / Shared Code / Re: Files on your harddrive can be mapped and passed directly into textures/buffers on: 2015-02-03 19:00:12
But when are you passing textures between processes? Threads maybe, but not processes.

Although I have seen some talk about spinning up multiple JVMs in this fashion just so that each JIT can focus on a core area of the code, etc., which can be profitable if the IPC has low enough overhead (mmap'ed files). Probably only useful in very particular circumstances. Maybe texture decompression from a conventional format would be in this category.
I'm not following. I just want to read raw data from the harddrive and pass it to OpenGL in the most efficient way possible. Since mapping the file seems like not only the fastest but also the simplest and most memory efficient way, why is it not fit for streaming? I don't have multiple processes.
28  Game Development / Shared Code / Re: Files on your harddrive can be mapped and passed directly into textures/buffers on: 2015-02-03 18:12:10
Probably not applicable to texture streaming (unless you're evil)...
Why not? My streamable textures are just each mipmap's compressed texture data dumped to a big file. Being able to map a whole mipmap and pass it in sounds immensely useful, and decompressing the texture is simply too slow, especially since S3TC and BPTC textures don't compress very well.
29  Game Development / Shared Code / Files on your harddrive can be mapped and passed directly into textures/buffers on: 2015-02-03 17:38:16
EDIT:
Don't do this! It has some severe problems which can cause your program to grind to a halt after mapping a lot of files. See the discussion below for more details! I recommend reading with FileChannels into a temporary direct buffer instead of mapping the files!

Hello.

Today I tried out something interesting. I was working on making it possible to source my streamable textures from multiple sources (asset files, texture packs and raw files in a directory) when I stumbled upon an interesting function. FileChannels allow you to get a MappedByteBuffer for a certain range of a file on your harddrive. This byte buffer is direct, meaning that it's possible to pass it directly into OpenGL functions like glTexImage2D() and glBufferData(). For texture data, this can be pretty worthless. Most of the time the data is stored in some kind of image format (PNG or even JPG) which needs to be decompressed before it can be passed into glTexImage2D(), but for my streaming system this decompression was way too slow. My streamable texture files contain the raw image data compressed using S3TC or BPTC which I simply dump into glCompressedTexImage2D(). To test out the potential gains of using mapped files, I've developed a small test program which compares the CPU performance of 3 different ways of loading raw texture data from a file.

The first way of loading stuff is with old-school input streams. This requires a lot of copies, since FileInputStreams work with byte[]s, not ByteBuffers. We have to read the texture data into a byte[], copy it to a direct ByteBuffer and then pass it to glTexImage2D().
   private static byte[] bytes = new byte[DATA_LENGTH];
   private static ByteBuffer buffer = BufferUtils.createByteBuffer(DATA_LENGTH);

   private static long loadStream() throws Exception {
      long startTime = System.nanoTime();

      FileInputStream fis = new FileInputStream(RAW_FILE);
     
      int read = 0;
      while(read < DATA_LENGTH){
         int r = fis.read(bytes, read, DATA_LENGTH-read);
         if(r == -1){
            throw new IOException();
         }
         read += r;
      }
      buffer.put(bytes).flip();
      glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 512, 512, 0, GL_RGBA, GL_UNSIGNED_BYTE, buffer);
     
      fis.close();
     
      return System.nanoTime() - startTime;
   }


The second way is to use NIO and FileChannels. FileChannels work on ByteBuffers directly, so we can use a direct ByteBuffer from the start!
   private static ByteBuffer buffer = BufferUtils.createByteBuffer(DATA_LENGTH);

   private static long loadChannel() throws Exception{
      long startTime = System.nanoTime();

      FileInputStream fis = new FileInputStream(RAW_FILE);
      FileChannel fc = fis.getChannel();
     
      while(buffer.hasRemaining()){
         fc.read(buffer);
      }
      buffer.flip();
      glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 512, 512, 0, GL_RGBA, GL_UNSIGNED_BYTE, buffer);
     
      fc.close();
      fis.close();
     
      return System.nanoTime() - startTime;
   }


The last and most awesome way is to use NIO FileChannels to map part of the file as a MappedByteBuffer. This is just so magical and simple.
   private static long loadMapped() throws Exception {
      long startTime = System.nanoTime();

      FileInputStream fis = new FileInputStream(RAW_FILE);
      FileChannel fc = fis.getChannel();
     
      MappedByteBuffer mbb = fc.map(MapMode.READ_ONLY, 0, fc.size());
      glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, 512, 512, 0, GL_RGBA, GL_UNSIGNED_BYTE, mbb);
     
      fc.close();
      fis.close();

      return System.nanoTime() - startTime;
   }


As you can see, there's some timing code in each function. The average time taken by these operations over a few thousand runs in a loop (all reading the same file over and over again, so this is not representative of IO performance, only CPU overhead) is listed below:

Stream:  0.657057 ms
Channel: 0.207856 ms
Mapped:  0.169004 ms

So it's not only simple as hell to do, it's faster as well and doesn't require any temporary memory!
30  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-02-01 17:02:07
Gah. Which JDK was it?

Cas
An ancient Java 7 Update 45 one...