Java-Gaming.org
 1 
 on: 2017-10-23 01:15:37 
Started by appel - Last post by Ranger
Ah, that sucks.  Next time maybe go with the 10 year option.  ;)  Thanks for all the years hosting it, I really enjoyed entering.

...now to take the link off the "Professional Interests" section of my CV.  :)

 2 
 on: 2017-10-22 22:21:46 
Started by BurntPizza - Last post by jonjava
@theagentd

Ah, so the scenario is essentially one source of gravity, where the precise positions of the ships/particles affected by it come into play? But then why are the calculations so taxing? What makes this so different from anything else? The precision? The magnitude/number of bodies? Why wouldn't e.g. a QuadTree-based solution work?
Simulating the physics takes 0.27 ms for ~1 million ships, and this is GPU bandwidth limited, so I can have up to 8 sources of gravity before I get any drop in performance. If it's just the simulation, it can easily be done for over 10 million ships. The problem is really the collision detection. Hierarchical data structures are usually not very efficient on the GPU, and constructing them on the CPU would require reading the data back, constructing the quad tree, then uploading it again to the GPU, which is going to be too slow. In addition, actually querying the quad tree on the GPU will be very slow as well; GPUs can't do recursion, and computations happen in lockstep in workgroups, so any kind of branching or uneven looping will be very inefficient. It's generally a better idea to use a more fixed data structure, like a grid, but that's a bad match in this case. The large scale of the world, the extremely fast speed of the ships, and the fact that ships will very likely be clumped up into fleets mean that even a uniform grid will be way too slow.

The idea of sorting the ships along one axis and checking for overlap of their swept positions (basically treating each ship as a line from its previous position to its current position) was actually inspired by Box2D's broadphase. I concluded that sorting was a simpler problem to solve than creating and maintaining a spatial data structure (especially on the GPU), but after testing it out more I'm not sure it's a good solution in this case. For a fleet orbiting in close formation, there's a huge spike in sorting cost when the fleet reaches the leftmost or rightmost edge of the orbit and the order of the entire fleet reverses. There are also problems when two large fleets (one moving left and the other right) cross each other, again because the two fleets first intermix and then swap positions in the list once they've crossed... Finally, there's a huge problem with fleets simply travelling around together. A fleet of 10 000 ships moving very quickly together will have overlapping swept positions, so all 10 000 ships will be collision tested against each other.

I've got a lot of thoughts on this problem, so if you want to have more of a discussion about it, I'd love to exchange ideas through some kind of chat instead.

A very insightful clarification covering several levels of analysis, thank you~

My time is limited as of late, and I'm not sure I'd be able to contribute anything meaningful to the discussion anyway. But it's an interesting topic, and I don't think anyone here would mind seeing more posts about it in the future.

 3 
 on: 2017-10-22 21:57:08 
Started by BurntPizza - Last post by theagentd
@theagentd

Ah, so the scenario is essentially one source of gravity, where the precise positions of the ships/particles affected by it come into play? But then why are the calculations so taxing? What makes this so different from anything else? The precision? The magnitude/number of bodies? Why wouldn't e.g. a QuadTree-based solution work?
Simulating the physics takes 0.27 ms for ~1 million ships, and this is GPU bandwidth limited, so I can have up to 8 sources of gravity before I get any drop in performance. If it's just the simulation, it can easily be done for over 10 million ships. The problem is really the collision detection. Hierarchical data structures are usually not very efficient on the GPU, and constructing them on the CPU would require reading the data back, constructing the quad tree, then uploading it again to the GPU, which is going to be too slow. In addition, actually querying the quad tree on the GPU will be very slow as well; GPUs can't do recursion, and computations happen in lockstep in workgroups, so any kind of branching or uneven looping will be very inefficient. It's generally a better idea to use a more fixed data structure, like a grid, but that's a bad match in this case. The large scale of the world, the extremely fast speed of the ships, and the fact that ships will very likely be clumped up into fleets mean that even a uniform grid will be way too slow.

The idea of sorting the ships along one axis and checking for overlap of their swept positions (basically treating each ship as a line from its previous position to its current position) was actually inspired by Box2D's broadphase. I concluded that sorting was a simpler problem to solve than creating and maintaining a spatial data structure (especially on the GPU), but after testing it out more I'm not sure it's a good solution in this case. For a fleet orbiting in close formation, there's a huge spike in sorting cost when the fleet reaches the leftmost or rightmost edge of the orbit and the order of the entire fleet reverses. There are also problems when two large fleets (one moving left and the other right) cross each other, again because the two fleets first intermix and then swap positions in the list once they've crossed... Finally, there's a huge problem with fleets simply travelling around together. A fleet of 10 000 ships moving very quickly together will have overlapping swept positions, so all 10 000 ships will be collision tested against each other.
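As a toy illustration of that swept-position test (plain single-threaded Java with hypothetical names, nothing like the actual GPU code), the interval each ship sweeps along the X-axis and the overlap check could look like this:

```java
public class SweptOverlap {
    /** X interval swept by a ship moving from prevX to currX, padded by its radius. */
    public static double[] sweptInterval(double prevX, double currX, double radius) {
        return new double[]{ Math.min(prevX, currX) - radius,
                             Math.max(prevX, currX) + radius };
    }

    /** True if two swept intervals overlap, i.e. the pair needs a narrow-phase test. */
    public static boolean overlaps(double[] a, double[] b) {
        return a[0] <= b[1] && b[0] <= a[1];
    }
}
```

This also makes the fleet problem described above obvious: every ship in a tightly packed, fast-moving fleet sweeps nearly the same interval, so every pair passes this test.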

I've got a lot of thoughts on this problem, so if you want to have more of a discussion about it, I'd love to exchange ideas through some kind of chat instead.

 4 
 on: 2017-10-22 19:56:36 
Started by BurntPizza - Last post by jonjava
@theagentd

Ah, so the scenario is essentially one source of gravity, where the precise positions of the ships/particles affected by it come into play? But then why are the calculations so taxing? What makes this so different from anything else? The precision? The magnitude/number of bodies? Why wouldn't e.g. a QuadTree-based solution work?

 5 
 on: 2017-10-22 19:28:54 
Started by BurntPizza - Last post by theagentd
@jonjava

It's not an N-body simulation. Celestial bodies (stars, planets, moons) affect each other, but ships are only pulled by celestial bodies. Ships don't pull each other either.
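In other words, each ship only sums the pull of the celestial bodies, which is O(ships × bodies) rather than O(n²). A minimal sketch of that per-ship acceleration (hypothetical Body class and field names, not the actual code from the project):

```java
public class ShipGravity {
    /** Hypothetical celestial body: position plus mu = G * mass. */
    public static final class Body {
        public final double x, y, mu;
        public Body(double x, double y, double mu) { this.x = x; this.y = y; this.mu = mu; }
    }

    /** Acceleration on a ship at (sx, sy): only bodies pull ships; ships ignore each other. */
    public static double[] acceleration(double sx, double sy, Body[] bodies) {
        double ax = 0, ay = 0;
        for (Body b : bodies) {
            double dx = b.x - sx, dy = b.y - sy;
            double r2 = dx * dx + dy * dy;
            double invR = 1.0 / Math.sqrt(r2);
            double a = b.mu / r2;      // magnitude: |a| = mu / r^2
            ax += a * dx * invR;       // direction: unit vector toward the body
            ay += a * dy * invR;
        }
        return new double[]{ ax, ay };
    }
}
```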

 6 
 on: 2017-10-22 15:08:17 
Started by BurntPizza - Last post by jonjava
@theagentd

How "realistic" does it have to be, though? Wouldn't something like the Barnes-Hut algorithm be "good enough" for most cases?

http://arborjs.org/docs/barnes-hut
https://en.wikipedia.org/wiki/Barnes%E2%80%93Hut_simulation
http://www.cs.princeton.edu/courses/archive/fall03/cs126/assignments/barnes-hut.html

 7 
 on: 2017-10-22 14:42:04 
Started by BurntPizza - Last post by princec
I await the day when my CPU also has 1000 cores. Surely not too long now.

Cas :)

 8 
 on: 2017-10-21 23:28:53 
Started by BurntPizza - Last post by theagentd
Life's been tough on me the last few weeks, especially the last few days, so I decided to do some extra fun coding this weekend.

3-4 years ago I made some threads about an extreme scale space game with realistic Newtonian physics. The game would require simulating a huge number of objects affected by gravity, with extreme speed collision detection. I am talking 100k+ ships, each orbiting a planet at 10km/second, with accurate collision detection. The technical challenges are enormous. After some spitballing here on JGO, I ended up implementing a test program using fixed-precision values (64-bit longs) to represent positions and velocities to get a constant amount of precision regardless of distance from origin. Simple circle-based collision detection was handled by sorting the ships along the X-axis, then checking collisions only for ships that overlap along the X-axis. The whole thing was completely multi-threaded, and I even tried out Riven's mapped struct library to help with cache locality. Even sorting was multithreaded using my home-made parallel insertion sort algorithm, tailor-made for almost-sorted data sets (the order along the X-axis did not change very quickly). It scaled well with more cores, but was still very heavy for my poor i7.
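A single-threaded toy version of that sort-then-scan collision check (a hypothetical Ship class with doubles, whereas the real implementation used fixed-precision longs, struct mapping, and multithreading) might look like:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SweepAndPrune {
    /** Hypothetical ship: a circle at (x, y) with a collision radius. */
    public static final class Ship {
        public final double x, y, radius;
        public Ship(double x, double y, double radius) { this.x = x; this.y = y; this.radius = radius; }
    }

    /** Returns index pairs (into the X-sorted order) of colliding ships. */
    public static List<int[]> findOverlaps(List<Ship> ships) {
        // Sort by the left edge of each ship's X interval.
        List<Ship> sorted = new ArrayList<>(ships);
        sorted.sort(Comparator.comparingDouble(s -> s.x - s.radius));

        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            Ship a = sorted.get(i);
            for (int j = i + 1; j < sorted.size(); j++) {
                Ship b = sorted.get(j);
                // Once b begins past a's right edge, no later ship can overlap a.
                if (b.x - b.radius > a.x + a.radius) break;
                double dx = a.x - b.x, dy = a.y - b.y, r = a.radius + b.radius;
                if (dx * dx + dy * dy <= r * r) pairs.add(new int[]{ i, j });
            }
        }
        return pairs;
    }
}
```

The inner loop only scans ships whose X intervals actually overlap, which is why keeping the list almost sorted between frames matters so much.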

I realized that the only way to get decent performance for this problem on a server would be to run the physics simulation on the GPU. With an order of magnitude higher performance and bandwidth, the GPU should be able to easily beat this result as long as the right algorithms are used. The physics simulation is easy enough, as it's an embarrassingly parallel problem and fits the GPU perfectly. The collision detection (sorting + neighbor check) is a whole different game. GPU sorting is NOT a fun topic, at least if you ask me. The go-to algorithm for this is a parallel GPU radix sort, but with 64-bit keys that's very expensive. Just like my parallel insertion sort took advantage of the almost-sorted nature of the data, I needed something similar that could run on the GPU. That's when I stumbled upon a simple GPU selection sort algorithm.

The idea is simple. For each element, loop over the entire array and calculate how many elements should come before it. You now know the new index of your element, so move it directly to that index. Obviously this is O(n^2), so it doesn't scale too well. However, the raw power of the GPU can compensate for that to some extent: 45*1024 = 46 080 elements can be sorted at ~60 FPS, regardless of how sorted the array is. By using shared memory as a cache, performance almost triples to 160 FPS, allowing me to sort 80*1024 = 81 920 elements at 60 FPS. Still not fast enough. Anything above 200k elements runs a big risk of causing the driver to time out and restart...
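The counting step can be sketched in plain Java (each outer-loop iteration corresponds to one GPU thread; ties are broken by original index so equal keys still get distinct destinations):

```java
public class SelectionRank {
    /** Sorts by computing, for each key, its destination index, then scattering. */
    public static long[] sortByRank(long[] keys) {
        long[] out = new long[keys.length];
        for (int i = 0; i < keys.length; i++) {        // one "GPU thread" per element
            int rank = 0;
            for (int j = 0; j < keys.length; j++) {
                // Count elements that must precede keys[i]; break ties by index.
                if (keys[j] < keys[i] || (keys[j] == keys[i] && j < i)) rank++;
            }
            out[rank] = keys[i];                       // scatter directly to destination
        }
        return out;
    }
}
```

Unlike swap-based sorts, every element is written exactly once and no thread depends on another's result, which is what makes it trivially parallel.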

Enter block-based selection sort for almost-sorted data sets! The idea is to split the list into blocks of 256 elements, then calculate the bounds of the values in each block. This allows us to skip an entire block of 256 values if its bounds don't intersect those of the current block we're processing. Most likely, only the blocks in the immediate vicinity of each block need to be taken into consideration when sorting, while the rest can be skimmed over. Obviously this makes the cost data dependent, and the worst case is still the same as vanilla GPU selection sort if all blocks intersect each other (which is essentially guaranteed for a list of completely random values). However, for almost-sorted data sets this is orders of magnitude faster!
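A CPU sketch of that blocking trick (my own illustrative reconstruction, not the actual compute shader): a block whose maximum is below the current key contributes its whole size to the rank, a block whose minimum is above it contributes nothing, and only mixed blocks are scanned element by element.

```java
public class BlockSelectionSort {
    static final int BLOCK = 256;

    /** Selection sort that skips whole blocks using per-block value bounds. */
    public static long[] sortByRank(long[] keys) {
        int numBlocks = (keys.length + BLOCK - 1) / BLOCK;
        long[] blockMin = new long[numBlocks];
        long[] blockMax = new long[numBlocks];
        for (int b = 0; b < numBlocks; b++) {          // pass 1: bounds of each block
            long min = Long.MAX_VALUE, max = Long.MIN_VALUE;
            for (int j = b * BLOCK; j < Math.min((b + 1) * BLOCK, keys.length); j++) {
                min = Math.min(min, keys[j]);
                max = Math.max(max, keys[j]);
            }
            blockMin[b] = min;
            blockMax[b] = max;
        }
        long[] out = new long[keys.length];
        for (int i = 0; i < keys.length; i++) {        // pass 2: rank each element
            long key = keys[i];
            int rank = 0;
            for (int b = 0; b < numBlocks; b++) {
                int end = Math.min((b + 1) * BLOCK, keys.length);
                if (blockMax[b] < key) { rank += end - b * BLOCK; continue; } // all precede key
                if (blockMin[b] > key) continue;                              // none precede key
                for (int j = b * BLOCK; j < end; j++)                         // mixed: scan block
                    if (keys[j] < key || (keys[j] == key && j < i)) rank++;
            }
            out[rank] = key;
        }
        return out;
    }
}
```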

To simulate an almost sorted data set, an array is filled with elements like this:

for(int i = 0; i < NUM_KEYS; i++){
    data.putLong(i*8, i + r.nextInt(1000));
}

This gives us an almost sorted array with quite a lot of elements sharing the exact same value, to test the robustness of the sort. The block-based selection sort algorithm is able to sort a 2048*1024 = 2 097 152 element list... at 75 FPS, way over the target of 100 000 ships. It's time to implement a real physics simulation based on this!



Let's define the test scenario. 1024*1024 = 1 048 576 ships are in perfect circular orbits around the earth. The orbit heights range from low earth orbit (International Space Station height) to geosynchronous orbit. Approximately half of the ships are orbiting clockwise, the other half counterclockwise. The size of the earth, the mass, the gravity calculations, etc are physically accurate and based on real-life measurements.
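For a circular orbit, the initial speed follows directly from balancing gravity against centripetal acceleration: v = sqrt(mu / r). A minimal sketch with real Earth constants (the constants and altitudes are standard published values, not numbers from the original post):

```java
public class OrbitSetup {
    static final double MU_EARTH = 3.986004418e14; // standard gravitational parameter, m^3/s^2
    static final double EARTH_RADIUS = 6_371_000;  // mean radius, m

    /** Speed of a circular orbit at a given altitude above the surface: v = sqrt(mu / r). */
    public static double circularOrbitSpeed(double altitudeMeters) {
        return Math.sqrt(MU_EARTH / (EARTH_RADIUS + altitudeMeters));
    }
}
```

This gives roughly 7.7 km/s at ISS altitude (~408 km) down to about 3.1 km/s at geosynchronous altitude (~35 786 km), which is the speed range the million test ships cover.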

Going back to my original threaded CPU implementation, it really can't handle one million ships very well. Just the physics simulation of the ships takes 20.43 ms, sorting another 18.75 ms, and collision detection another 10.16 ms.

The compute shader implementation is a LOT faster. Physics calculations take only 0.27ms, calculating block bounds another 0.1ms and finally sorting takes 2.07ms. I have not yet implemented the final collision detection pass, but I have no reason to expect it to be inefficient on the GPU, so I'm optimistic about the final performance of the GPU implementation.



Each ship is drawn as a point. The color depends on the current index in the list of the ship, so the perfect gradient means that the list is perfectly sorted along the X-axis. 303 FPS, with rendering taking up 0.61ms, 370 FPS without rendering.

 9 
 on: 2017-10-21 17:22:27 
Started by dime26 - Last post by Opiop
Can I get added? @TobySimone. Thanks! :D

Scratch that, I'm already added  persecutioncomplex

 10 
 on: 2017-10-20 22:39:34 
Started by BurntPizza - Last post by philfrei
Non-Java, but programming related: I received my first paycheck yesterday from a startup I have been working with (with a request not to deposit it until tomorrow). Not much money, but they are basically paying me to learn as I go and giving me valuable work experience. The project involves VR (using JavaScript and the A-Frame library).

We had a meeting yesterday where I had to show the work done so far. Fortunately, I think I chose well and got a few key things working very smoothly, rather than trying to do too much. For example, an animation (fade to black) for transitioning between "rooms" that people neither noticed nor remarked on, because it was glitch-free (not easy to achieve).

Museum staff felt good enough about the progress that my boss was approached about working on a grant for an additional project, which pleased him immensely.

I want to comment on JavaScript and on adding operator overloading to Java.

With physical work (e.g., involving sacks of cement), the size of the bags has evolved to something that a good strong male worker can hoist around (I'm thinking of the 100# bags I dealt with years ago). Now, if the work force were made up of people of smaller physical stature (e.g., included many women), it would be more efficient if the bags were sized to a slightly lower maximum. Does this make sense?

The point I want to make is that there is a mental correlation: working memory (similar to, but not to be confused with, short-term memory). Young brains can juggle more items at one time than older brains. Maybe I can only track 5 or 6 thoughts at the same time, rather than the 8 or 9 a twenty-something programmer can. I think it must be tempting to push the limits of what one can juggle, but having too much to keep track of at the same time ends up being counter-productive. Operator overloading, it seems to me, adds to the load on human "working memory." A better way to leverage complexity is "chunking," which I take to mean encapsulating or grouping functionality in a single entity (e.g., a class, a function, a subroutine).

I don't often see discussions about the merit of languages talk much about the human "working-memory" demands of that language.

JavaScript makes pretty high demands, it seems to me. I'm still not able to easily tell something that is usually pretty obvious in Java: a given variable's scope and whether its use in a statement is subject to closure or not. Maybe you've seen the articles on "favorite interview questions" as examples. I can link one if needed. I truly hope Java doesn't go down the path of adding functionality that increases the human working-memory load.
