  Show Posts
Pages: [1] 2 3 ... 104
1  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-08 03:00:11
Aha, that explains it! Yes, this is more in line with what I expected. Thank you.

I found a bug in my implementation. To floor the index, I simply do (int)index. That cast truncates toward zero instead of rounding down, so it produces wrong results for negative angles. A correct implementation should use (float)Math.floor(index) instead.

I also found an optimization:
sin1 + (sin2 - sin1) * alpha
is faster than the original
alpha * sin1 + (1 - alpha) * sin2
by a solid 10-15%.

My full code:

   private static final int SIN_BITS, SIN_MASK, SIN_COUNT;
   private static final float radToIndex, cosOffset;
   public static final float[] sin;
   static {
      SIN_BITS = 9;
      SIN_MASK = ~(-1 << SIN_BITS);
      SIN_COUNT = SIN_MASK + 1;
      float radFull = (float) (Math.PI * 2.0);
      radToIndex = SIN_COUNT / radFull;
      cosOffset = (float)(Math.PI / 2);
      //One extra entry so that sin[i+1] is always valid when interpolating
      sin = new float[SIN_COUNT+1];
      for (int i = 0; i <= SIN_COUNT; i++) {
         sin[i] = (float) Math.sin(Math.PI * 2 * i / SIN_COUNT);
      }
   }

   public static final float sin(float rad) {
      float index = rad * radToIndex;
      //float floor = (float)Math.floor(index); //Correct
      float floor = (int)index;                 //Fast, only for positive angles
      float alpha = index - floor;
      //With the Math.floor() variant, use (int)floor here instead of (int)(index)
      int i = (int)(index) & SIN_MASK;
      float sin1 = sin[i+0];
      float sin2 = sin[i+1];
      return sin1 + (sin2 - sin1) * alpha;
   }

   public static final float cos(float rad) {
      return sin(rad + cosOffset);
   }
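
For anyone who wants to check the accuracy question themselves, here is a minimal sketch of the Math.floor() variant together with a brute-force error measurement against Math.sin(). The class name InterpolatedSin and the maxError() helper are made up for this example and are not part of the code above; the table layout (9 bits plus one extra entry) is the same.

```java
final class InterpolatedSin {
    private static final int SIN_BITS = 9;
    private static final int SIN_MASK = ~(-1 << SIN_BITS);
    private static final int SIN_COUNT = SIN_MASK + 1;
    private static final float RAD_TO_INDEX = (float) (SIN_COUNT / (Math.PI * 2.0));
    // One extra entry so TABLE[i + 1] is valid for the last index.
    private static final float[] TABLE = new float[SIN_COUNT + 1];

    static {
        for (int i = 0; i <= SIN_COUNT; i++) {
            TABLE[i] = (float) Math.sin(Math.PI * 2 * i / SIN_COUNT);
        }
    }

    /** Linearly interpolated table lookup that also works for negative angles. */
    static float sin(float rad) {
        float index = rad * RAD_TO_INDEX;
        float floor = (float) Math.floor(index); // rounds down, unlike an (int) cast
        float alpha = index - floor;             // always in [0, 1)
        int i = (int) floor & SIN_MASK;          // wraps negative indices into the table
        float s1 = TABLE[i];
        float s2 = TABLE[i + 1];
        return s1 + (s2 - s1) * alpha;
    }

    /** Largest absolute difference to Math.sin() over evenly spaced samples. */
    static float maxError(float from, float to, int steps) {
        float worst = 0f;
        for (int k = 0; k <= steps; k++) {
            float rad = from + (to - from) * k / steps;
            float err = Math.abs(sin(rad) - (float) Math.sin(rad));
            worst = Math.max(worst, err);
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("max abs error on [-10, 10]: " + maxError(-10f, 10f, 100_000));
    }
}
```

Note that the table index is taken from the floored value, not from the raw index, so the two neighbours that get interpolated always match alpha.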
2  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-08 01:42:07
Interesting. I wasn't expecting mine to beat the SOA. Are you sure you're not getting an unrealistic amount of compound error?
3  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-08 01:31:12
Right. What about precision? What's the accuracy of those methods? It'd be interesting to see the performance/quality tradeoffs they make.
4  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-08 01:27:59
What does "Score error" mean?
5  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-07 22:15:51
Disclaimer: I saw high variance between runs, which I attribute to CPU thermal throttling. Had to let the laptop cool down a bit before getting consistent results. This might be affecting you too.

Use ThrottleStop. It saved me when my computer was throttling during games. You can even underclock your computer with it to get consistent results.
6  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-06 12:10:14
..the worst possible professors at my uni...
They're paid to help you.  Any time I type in information here I'm putting my hand in my pocket and throwing cash out the window the entire time I'm typing.  Seems pretty understandable that my willingness to do so is rather limited.
Wow, I didn't even see this until now. Is that a joke? I'm being serious. What do you even think the purpose of this forum is?

I use and enjoy this forum for a number of reasons. People make interesting things that inspire me. I can post about things I've coded that I'm proud of and get feedback and acknowledgement from people who actually understand and appreciate what I'm doing, which is a rarity IRL for me. I learn new things about math, HotSpot, LWJGL and uncountable other things all the time here. I can follow and discuss the development of things I admire or personally care about. Even just explaining things to others helps me get a better grasp of those concepts myself.

That being said, how do your posts fit into that? Although you seem to want to help, your answers are too brief and unexplained to help much without a direct neural link to your brain. Your explanations are like a summary of your knowledge that only someone who already has similar knowledge would understand. What annoys me the most is that you literally just told me that your time is extremely valuable, yet you expect every single person who reads (and actually wants to understand) your comments to spend a significant amount of time Googling to decode your posts.

I think it's sad, because if you're half as knowledgeable as you seem to be then I have lots to learn from you. I may not be interested in learning everything (AKA advanced quaternion math), but we often comment in the same threads which means our interests do overlap. IMO the main point of this forum is sharing knowledge. Maybe you don't think that you're "receiving" as much as you're "giving" to this forum since you already know a lot, but there are lots of other points that even it out IMO. It's not a zero-sum game.

Sorry for the off-topic post...

EDIT: claim ---> seem
7  Java Game APIs & Engines / OpenGL Development / Re: Global Illumination via Voxel Cone Tracing in LWJGL on: 2015-10-06 11:35:37
No update? =<
8  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-05 18:42:43
To decode this a bit for the people who are interested: I believe this is what Roquen is trying to say.

If you precompute cos(x), sin(x), cos(y), sin(y) and save these values for each of the N objects, you only need to compute sin(t) and cos(t) once per render cycle, because these are the only values that change. As you already have cos(x), sin(x), cos(y), sin(y) saved, you just need to perform 4 multiplications and 2 additions per object per render cycle. Comparing both methods you get (theagentd vs Roquen)

2N Lookups + 2N additions vs 4N multiplications + 2N Additions + 2 Lookups = 2N Lookups vs 4N Multiplications + 2 Lookups

Assuming multiplications are more than 2x quicker than lookups and N is sufficiently large you will save time using this method.

Hope this helps clear things up a bit
Thank you, that is a great idea. I will implement that.
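
A rough sketch of that idea, assuming the per-object phases (the x and y above) stay fixed between frames. The class name PhaseShiftedSine and its method names are invented for illustration; Math.sin()/Math.cos() could be swapped for a table lookup.

```java
final class PhaseShiftedSine {
    private final float[] cosPhase;
    private final float[] sinPhase;

    /** Precomputes cos(phase) and sin(phase) once per object. */
    PhaseShiftedSine(float[] phases) {
        cosPhase = new float[phases.length];
        sinPhase = new float[phases.length];
        for (int i = 0; i < phases.length; i++) {
            cosPhase[i] = (float) Math.cos(phases[i]);
            sinPhase[i] = (float) Math.sin(phases[i]);
        }
    }

    /**
     * Fills out[i] with sin(t + phases[i]) using
     * sin(t + x) = cos(x)sin(t) + cos(t)sin(x).
     * Only two trig calls are made per invocation, regardless of N.
     * out must have the same length as the phases array.
     */
    void evaluate(float t, float[] out) {
        float sinT = (float) Math.sin(t);
        float cosT = (float) Math.cos(t);
        for (int i = 0; i < out.length; i++) {
            out[i] = cosPhase[i] * sinT + cosT * sinPhase[i];
        }
    }
}
```

For the grass case this would mean one instance for the x phases and one for the y phases, both evaluated with the same t each frame.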
9  Game Development / Game Play & Game Design / Re: What makes a good simulation game? on: 2015-10-05 15:04:43
In what sense is Goat Simulator an actual "simulator"? It doesn't simulate anything realistic.
10  Java Game APIs & Engines / OpenGL Development / Re: Global Illumination via Voxel Cone Tracing in LWJGL on: 2015-10-05 01:40:37
This is really cool!

Are you using sparse textures? What memory usage are you getting? How are you voxelizing the game world? Are you revoxelizing dynamic objects each frame? Performance at different resolutions?
11  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-04 23:10:40
I should care... but your "explanation" is like the worst possible professors at my uni who supposedly know a lot about their topics but can't teach for shit all combined into one. I can't follow your reasoning and it's too time consuming to decode your intentions. Sorry.
12  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-04 16:26:17
sin(t+x) = cos(x)sin(t)+cos(t)sin(x)
sin(t+y) = cos(y)sin(t)+cos(t)sin(y)

per sim step is cos(t) & sin(t) which is really one trig op...not that it would matter.

{....cos(x),sin(x),cos(y),sin(y)...} packed array.  Speculate loading and skips the trig table lookup.  More accurate, but that probably doesn't matter either.  So 2x (2 look-up, 2 mul, 1 add) per entry.
Uhm, I simply use sin() to get a wavy motion. Each grass patch has a spring system, and the wind force calculation is based on a sine function. There is no rotation here.
13  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-04 13:27:15
So given 3 values (x, y, time) I can compute sin(x+time) and sin(y+time) with a single trigonometry function?
14  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-04 04:45:58
12-bit interpolated: 4.43 ms
12-bit raw value: 3.31 ms

16-bit interpolated: 5.20 ms
16-bit raw value: 4.25 ms

I totally did not see that coming. Weird. I don't think I got values like these earlier.
15  Game Development / Performance Tuning / Re: Sharing data between threads on: 2015-10-03 16:34:00
Indeed, writing to the same cache line from two different threads causes a cache flush on pretty much each write. I've had code that was reduced to zero scaling with 8 threads because they were all incrementing the same integer counter (an old performance measurement variable that wasn't even synchronized). I removed that single x++ and scaling went up to 3.5x on 4 Hyperthreaded cores with 8 threads.
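
If such a counter is actually needed, one common way around the problem is to count in a thread-local variable and publish the result once at the end. This is only a sketch of that pattern; LocalCounterDemo and countWithLocalTotals() are hypothetical names, and the empty loop stands in for real work.

```java
final class LocalCounterDemo {
    /** Each worker counts privately and writes to shared memory exactly once. */
    static long countWithLocalTotals(int threads, int incrementsPerThread) {
        long[] results = new long[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                long local = 0; // private to this thread, no cache line is contended
                for (int i = 0; i < incrementsPerThread; i++) {
                    local++;
                }
                results[id] = local; // single shared write per thread
            });
            workers[t].start();
        }
        long total = 0;
        try {
            for (int t = 0; t < threads; t++) {
                workers[t].join(); // join() also makes the worker's write visible here
                total += results[t];
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total;
    }
}
```

java.util.concurrent.atomic.LongAdder is a ready-made alternative when the count has to be readable while the workers are still running.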
16  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-03 15:55:27
public class FastSineWindFunction implements WindFunction{

   public float getWindX(float time, float x, float y) {
      return FastTrigonometry2.sin(time + x);
   }

   public float getWindY(float time, float x, float y) {
      return FastTrigonometry2.sin(time + y);
   }
}

I also have an implementation which does Math.sin().
17  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-03 14:32:44
But regarding @Roquen's comment, can you probably give a vague percentage of how much faster the table-based lookup is compared to other methods in WSW maybe?
The only time I use lots of sin() calculations is for our swinging grass to simulate a wind force based on the position of each straw.

Math.sin(): 15.01ms
9-bit lookup with interpolation: 4.45ms
6-bit lookup with interpolation: 4.45ms

This is for the entirety of the grass calculation, so it does lots of other things too.
18  Game Development / Shared Code / Re: Extremely Fast sine/cosine on: 2015-10-03 04:43:45
I solved that.

I also expanded the sin[] to hold an extra value when the last element of the array is indexed like philfrei did. IMO this is the best performance/quality tradeoff.
19  Discussions / Java Gaming Wiki / Re: Math: Inequality properties on: 2015-10-01 15:08:40
You can use sqrt() as an example for a function that can be removed on both sides of an inequality.
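
A typical use of that property is a distance check, sketched here. DistanceCheck and withinRadius() are made-up names, and the comparison is only equivalent when the radius is non-negative, since squaring both sides is only order-preserving for non-negative values.

```java
final class DistanceCheck {
    /**
     * Equivalent to Math.sqrt(dx*dx + dy*dy) <= radius for radius >= 0,
     * because sqrt is monotonically increasing on non-negative inputs.
     */
    static boolean withinRadius(float dx, float dy, float radius) {
        return dx * dx + dy * dy <= radius * radius;
    }
}
```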
20  Java Game APIs & Engines / OpenGL Development / 16-bit float conversion Java code? on: 2015-09-30 16:31:34
Hey. I'm interested in improving the packing of my vertex data to reduce its size further. In some cases I have stuff like HDR colors that don't necessarily need 32 bits of precision. 16-bit half-floats would work just as well and save quite a bit of space. The problem is converting 32-bit float values to 16-bit values on the Java side. My Google-fu tells me that this is more complicated than I expected if I'm going to handle special values like infinities, NaN and denormals. Has anyone already implemented this perhaps?
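
For reference, a minimal sketch of one way to do the float-to-half direction by hand, with round-to-nearest-even and handling for infinities, NaN and half-precision denormals. HalfFloat and toHalf() are names invented for this example, and NaN payloads are not preserved (every NaN becomes the same quiet NaN). Newer JDKs (20 and later) also provide Float.floatToFloat16(), which is probably the better choice where available.

```java
final class HalfFloat {
    /** Converts a 32-bit float to IEEE 754 binary16 bits, rounding to nearest even. */
    static short toHalf(float f) {
        int bits = Float.floatToIntBits(f);
        int sign = (bits >>> 16) & 0x8000;
        int exp = (bits >>> 23) & 0xFF;
        int man = bits & 0x7FFFFF;

        if (exp == 0xFF) {
            // Infinity keeps a zero mantissa, NaN gets a non-zero one.
            return (short) (sign | 0x7C00 | (man != 0 ? 0x200 : 0));
        }
        int e = exp - 127 + 15; // rebias the exponent from float to half
        if (e >= 31) {
            return (short) (sign | 0x7C00); // too large: overflow to infinity
        }
        if (e <= 0) {
            if (e < -10) {
                return (short) sign; // too small even for a denormal: signed zero
            }
            man |= 0x800000;           // make the implicit leading 1 explicit
            int shift = 14 - e;        // 14..24 bits are dropped
            int half = man >>> shift;
            int rem = man & ((1 << shift) - 1);
            int halfway = 1 << (shift - 1);
            if (rem > halfway || (rem == halfway && (half & 1) != 0)) {
                half++;                // may round up into the smallest normal
            }
            return (short) (sign | half);
        }
        int half = (e << 10) | (man >>> 13);
        int rem = man & 0x1FFF;
        if (rem > 0x1000 || (rem == 0x1000 && (half & 1) != 0)) {
            half++;                    // a carry into the exponent is intended
        }
        return (short) (sign | half);
    }
}
```

The rounding carry is deliberately allowed to spill into the exponent bits: rounding the largest mantissa up yields the next power of two, or infinity at the top of the range.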
21  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-09-30 10:49:26
Nice job mate. Farming is a fun thing to do. You should make it seem like the plants grow in the background by saving the last time that the player quit the game and then finding the difference between the current time and the last time and updating the growth of the plants based on that.
If I didn't explain that well:
System.currentTimeMillis() - previousRecordedLogoutTime = difference
Heh, then you can get a negative growth by setting back your clock. =P
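
A small sketch of how the elapsed time could be guarded against that. OfflineGrowth, elapsedOfflineMillis() and the maxOfflineMillis cap are all assumptions added for the example; the cap also limits what a player gains by setting the clock forward instead.

```java
final class OfflineGrowth {
    /**
     * Time to credit for growth while the game was closed.
     * Negative differences (clock set back) are treated as zero,
     * and the result is capped at maxOfflineMillis.
     */
    static long elapsedOfflineMillis(long logoutMillis, long nowMillis, long maxOfflineMillis) {
        long elapsed = nowMillis - logoutMillis;
        return Math.max(0L, Math.min(elapsed, maxOfflineMillis));
    }
}
```

In the game this would be called with System.currentTimeMillis() as nowMillis and the saved logout time as logoutMillis.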
22  Game Development / Game Mechanics / Re: Loading Graphics Smoothly | LWJGL on: 2015-09-28 22:22:50
OpenGL is "bound" to a single thread. This means that you need to do all OpenGL calls from the main thread, which obviously also is the one that draws the loading screen (using OpenGL). It is possible to create a second context for the loading thread that shares (some) data with the main context. However, due to an unfathomably stupid design decision only OpenGL objects that contain data are shared. For example textures and VBOs are shared, but FBOs and VAOs aren't because they only reference other objects. If you design your engine/game around this from the start, it is relatively easy to work around this (load VBOs and textures in the shared context, create the cheap FBOs and VAOs in the main thread). Another much more complicated solution would be to have the second thread place work that the main thread has to do in a queue which the main thread queries each time it redraws the loading screen. This is much more time-consuming to implement but can work really well.
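
The queue approach could look roughly like this. It is only a sketch: MainThreadQueue, post() and runPending() are hypothetical names, and the actual OpenGL calls would go inside the Runnables that the loading thread submits.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

final class MainThreadQueue {
    private final ConcurrentLinkedQueue<Runnable> tasks = new ConcurrentLinkedQueue<>();

    /** Called from the loading thread: schedule work that needs the OpenGL context. */
    void post(Runnable task) {
        tasks.add(task);
    }

    /**
     * Called from the main thread once per loading-screen frame.
     * Runs at most maxTasks tasks so the loading screen keeps redrawing,
     * and returns how many were actually executed.
     */
    int runPending(int maxTasks) {
        int ran = 0;
        Runnable task;
        while (ran < maxTasks && (task = tasks.poll()) != null) {
            task.run();
            ran++;
        }
        return ran;
    }
}
```

The loading thread does the disk reads and decoding, then posts a small task that only performs the upload; limiting the number of tasks per frame keeps each redraw short.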
23  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-09-26 12:10:21
Technically yesterday, but I wrote a small program that can convert a normal map (back) to a height map.

We need more details. How does it work? What assumptions are needed?
It basically treats the normal map as a gradient map for an unknown height map. This is a tiny bit simplified, but if the X-derivative is negative (slope to the right) I raise the left neighbor and lower the right neighbor a tiny bit. If it's positive, I lower the left and raise the right. I do the same for the Y coordinate with the top and bottom neighbor. After that, I apply a small smoothing pass. Repeat.

I'm currently working on rewriting the first pass to be threadable, which will allow me to port it to the GPU. There are some small problems right now with some depths getting deeper and deeper and some hills getting taller and taller. I need to take the current gradient into consideration when modifying the height map.
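
One possible way to take the current gradient into account, sketched under simplifying assumptions: the normal map has already been turned into target height differences between neighboring texels (gx, gy), and each neighbor pair is nudged by the difference between the target and the current slope rather than by a fixed amount. This is a variation for illustration, not the program described above; HeightFromGradient and reconstruct() are invented names and there is no smoothing pass.

```java
final class HeightFromGradient {
    /**
     * gx[y][x] is the target value of h[y][x+1] - h[y][x],
     * gy[y][x] is the target value of h[y+1][x] - h[y][x].
     * Returns a height map that matches those differences up to a constant offset.
     */
    static double[][] reconstruct(double[][] gx, double[][] gy, int width, int height, int sweeps) {
        double[][] h = new double[height][width];
        for (int s = 0; s < sweeps; s++) {
            // Horizontal neighbor pairs.
            for (int y = 0; y < height; y++) {
                for (int x = 0; x + 1 < width; x++) {
                    double error = gx[y][x] - (h[y][x + 1] - h[y][x]);
                    h[y][x] -= 0.5 * error;     // move both texels by half the error,
                    h[y][x + 1] += 0.5 * error; // so this pair now matches its target
                }
            }
            // Vertical neighbor pairs.
            for (int y = 0; y + 1 < height; y++) {
                for (int x = 0; x < width; x++) {
                    double error = gy[y][x] - (h[y + 1][x] - h[y][x]);
                    h[y][x] -= 0.5 * error;
                    h[y + 1][x] += 0.5 * error;
                }
            }
        }
        return h;
    }
}
```

Because each correction shrinks to zero as a pair approaches its target slope, hills and valleys stop growing once the gradients agree. The updates are applied in place, so this form is sequential; a threadable version would need to accumulate corrections into a separate buffer.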
24  Game Development / Newbie & Debugging Questions / Re: Are people overreacting in the negative performance of GC? on: 2015-09-25 18:53:14
Object pooling is inefficient except in a few particular cases since Java 1.4. Android is another story...
What are you smoking? Pooling objects is perfectly valid if you want to avoid garbage even on desktop. The goal isn't to improve average performance, it's to reduce stuttering.
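
As an illustration of the kind of pooling meant here, a bare-bones sketch. SimplePool, obtain() and release() are hypothetical names; the pool is not thread-safe and does not reset objects, so callers have to clear state themselves.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

final class SimplePool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Reuses a released instance if one is available, otherwise allocates. */
    T obtain() {
        T obj = free.pollLast();
        return obj != null ? obj : factory.get();
    }

    /** Hands an instance back so that a later obtain() does not allocate. */
    void release(T obj) {
        free.addLast(obj);
    }
}
```

Once the pool has warmed up, the steady state allocates nothing, which is what keeps the collector from kicking in mid-frame.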
25  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-09-25 16:18:10
Another cool image:

26  Discussions / Miscellaneous Topics / Re: What I did today on: 2015-09-25 11:42:28
Technically yesterday, but I wrote a small program that can convert a normal map (back) to a height map.

27  Discussions / General Discussions / Re: Threads, games and running on all CPU's on: 2015-09-20 21:21:23
The programmer is trying to find out how many cores are available but is being told how many threads are available. If your program can use a new thread as easily as it can a core then it doesn't matter, i.e. for most video processing, audio processing and your program. Most games don't benefit from HT so the program is being told misleading info from the call, i.e. twice as many cores as there really are. If you then kick off additional processes to make use of those "cores" then the processor would have to swap those processes in and out. This may not be a huge overhead but it is an overhead.
With HT each core can have 2 threads loaded into two different sets of registers at the same time and work on either one of them. One of the main points of HT is to AVOID the overhead of swapping threads. The other is to allow more efficient use of the CPU's hardware since it can sometimes execute two instructions in parallel. That most games don't benefit from HT is a symptom of them not being able to utilize multiple threads efficiently in the first place, something that as far as I know only the Battlefield series does at all, and those games get HUGE wins from HT. My old laptop with a HT dual-core gained around 50% higher framerates from HT as it was CPU limited. The only time Hyperthreading would hurt performance would be if you're thrashing the CPU cache. In those cases doing that from twice as many threads may actually hurt performance, but that very rarely happens if you know what you're doing.

Scaling in WSW:
Only physical cores:
1 core: 12 FPS (1x)
2 cores: 22 FPS (1.83x)
4 cores: 36 FPS (3x)

Using Hyperthreading:
1 core: 15 FPS (1.25x)
2 cores: 27 FPS (2.25x)
4 cores: 43 FPS (3.58x)

The last time I did anything heavy in this area was ~5-6 years ago on 4 execution port hardware.  What I was seeing in my use-cases was a hyperthreaded core would hit around .10-.15 the computation throughput of a full core.  In some specialized cases which involved heavy tweaking it could be bumped up to around .30.  These numbers are more or less in line with what others were seeing at the same time.
In some very specific cases the scaling is far better than that, but on average you're right. HT helps a lot when you have a lot of cache misses or branch prediction failures, and cache misses are much more abundant in Java than in C, so Hyperthreading helps hiding those problems.

If AMD's next architecture will implement a similar technology that could actually do wonders for them, considering their shitty memory controller holds them back so hard. Or it won't. Who knows?

I have no specific expectation about how these numbers might look on newer 6 execution port hardware...too many variables, but as a guess it's probably a bit higher.  So attempting to develop a scaling scheme where you don't know if the cores are all full or if half are virtual sucks because it's a huge difference in expected computational power.  Of course I'm not blowing off adaptive scaling.  Perhaps of more interest is that in the hyperthreaded case I switched to using thread affinities which helped in that case and blindly doing the same hurt in the all full core case.
I'm not entirely sure what you mean, but the thread scheduler knows the difference between virtual and physical cores and prefers physical cores.

But really I don't understand your point.  I'm saying that having more (and accurate) information is a better situation...even if any individual never makes use of it.
We have a name for unused information: useless information. I am asking you because I genuinely can't see when you would ever be able to change anything to take that into consideration.
28  Discussions / General Discussions / Re: Threads, games and running on all CPU's on: 2015-09-20 20:14:24
HT is different to having a separate core. The main reason why they're different is that full use of HT can only be made if you have relatively little code and a lot of parallel data to process. That's why it's more suited to video and audio work assuming the code is suitably written. In most games it's far better to have a separate core than to have HT, and this shows in most benchmarks where performance is similar between Intel i5's (4 core 4 threads) and i7's (4 core 8 threads). If HT can't be used by a game then it's important for the game to know how many real processors it has rather than virtual processors.

That's not to say all cores are equal. Intel chips have been better than AMD since Sandybridge and AMD looks unlikely to catch up any time soon. A 4 core i5 will usually beat an 8 core FX chip - I've also seen a budget 2 core G3258 outperform the FX chips. AMD chips are good value though which is why people go for them. The big problem with having more cores is that they generate more heat so you need to run them slower. The other problem for AMD is that their FX chips share certain components (i.e. floating point, instruction decoding) between each pair of cores so you're effectively running half as many cores as they state.

So is it important to know the difference between HT and real cores? certainly not to me.

I know all that. I've done extensive benchmarking with Hyperthreading and have personally written code that gets 6.5x scaling on a hyperthreaded quad core. What I don't know is why you would NOT want to use Hyperthreading if it's available. Hyperthreading especially helps Java programs as we don't have the same control over memory as C has. Let me rephrase the question: Why would I care if the cores are logical or physical? Why should that change the behavior of my program?
29  Discussions / General Discussions / Re: Threads, games and running on all CPU's on: 2015-09-19 23:27:38
There's quite a difference between say: 2 full cores with hyperthreading vs. 4 full cores without hyperthreading.  Ideally you want both pieces of information.
My last try: Why? In what situation would you want to treat those differently?
30  Discussions / General Discussions / Re: Threads, games and running on all CPU's on: 2015-09-19 11:55:42
The best solution is almost always to use an embarrassingly parallel algorithm and split up the workload between N threads, where N=Runtime.getRuntime().availableProcessors().

My instinct (based on no hard evidence whatsoever ;) ) would be to go with maximum N-1.  There are going to be lots of other things going on you don't directly control, OS stuff, VM stuff (GC, JIT?), audio playback, etc.  Some (most?) of those are going to have priorities higher than your threads.
If you've coded the OpenGL parts correctly the driver will use a second thread to actually process OpenGL commands in so the main thread isn't blocked for too long. Even with that thread, sound threads etc, it is faster to use all processors in my experience. Sound threads should already be running at a high priority to prevent stuttering, and they'll spend most of their time waiting for the harddrive or idling with full buffers.

note: availableProcessors is the number of virtual CPUs.
Your point being? Why would you not want to run your code on all logical CPU cores?
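
To make the N = availableProcessors() split concrete, here is a minimal sketch that sums an array in N contiguous chunks. ParallelSum and sum() are made-up names, the workload is just a placeholder for something embarrassingly parallel, and a real game would keep the thread pool alive across frames instead of creating one per call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelSum {
    /** Splits data into one contiguous chunk per logical processor and sums them in parallel. */
    static double sum(float[] data) {
        int n = Runtime.getRuntime().availableProcessors(); // logical (virtual) CPUs
        ExecutorService pool = Executors.newFixedThreadPool(n);
        try {
            List<Future<Double>> parts = new ArrayList<>();
            for (int t = 0; t < n; t++) {
                // Chunk boundaries computed in long to avoid int overflow.
                final int start = (int) ((long) data.length * t / n);
                final int end = (int) ((long) data.length * (t + 1) / n);
                Callable<Double> task = () -> {
                    double local = 0; // thread-private accumulator
                    for (int i = start; i < end; i++) {
                        local += data[i];
                    }
                    return local;
                };
                parts.add(pool.submit(task));
            }
            double total = 0;
            for (Future<Double> part : parts) {
                total += part.get(); // waits for that chunk to finish
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```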