In an attempt to promote my threading library
I made a small project which simulates a 2D ball park, Ball Ocean. The program simulates thousands of small balls and does collision detection between each ball and with the screen edges. The collision detection and movement updating is fully threaded and scales very well with the number of cores. The updating can be done both with my library and an optimized single-threaded version without any synchronization for a fair comparison.
First I planned out my game.
- I need to find potential collision pairs quickly, so I decided to use a grid to be able to quickly find nearby balls.
- I need to update my balls position based on their current velocity, and also update their location in the grid if they move over a tile border.
- I need to check for collisions between balls, and add a force to them when they collide.
- It has to have exactly the same result as a single-threaded implementation, e.g. be fully deterministic.
I then split it up into 2 main tasks:
Update position based on velocity.
Check collision against screen edges.
Check if the tile is still in the correct grid tile and move it to the correct tile if it has moved out of its old tile.
For each circle, query the grid for nearby circles and check for collisions against them.
When a collision is found, add some velocity to it to bounce it away from what it collided with.
All of these are fully threadable except for one thing: Updating the grid. When a ball has moved out of its current tile and we need to update the grid, we need to remove the ball from its old tile and add it to it's new tile. This is not thread-safe, and even if we use synchronization we won't get any guarantees on which order the balls will end up in inside each tile. We therefore break out that part of the task and run that single-threaded to guarantee that they have the correct order and avoid synchronization. This led to the following implementation of two tasks: UpdateTask and CollisionTask.
UpdateTask first loops over each ball and updates its position using its current velocity. It then checks for collisions against the screen edges and bounces them off of them. It also checks if a ball has moved outside its current tile. If it has, it adds the ball to a thread local list (there's one list per thread). When the multithreaded updating is done, it updates the grid location of all balls which had moved outside their tiles in the finish() method of the task which is called from a single thread after the task is done.
CollisionTask is run after UpdateTask has finished and queries the grid for nearby circles. It then checks for collisions against each circle that the grid callback returns and adds a force (= a speed) away from the other ball.
There are also some additional features: You can push away circles by holding down the mouse button (singlethreaded though). How many balls there are in a grid tile is visualized by a red color. Multithreading and rendering (to benchark raw CPU performance) can be toggled.
The grid-size and velocities are optimized for balls with a radius of 5 pixels right now. On a 1080p screen I can fit 45 000 balls which stack up and fill around 95% of the screen. The following is the FPS I measured (with Fraps) after the balls had stabilized after a few seconds. My CPU is an i7 860 quad core with hyperthreading.
Single-threaded: 26 FPS
8 threads: 110 FPS
Scaling: 4,23x faster.
Rendering OFF (benchmarking pure CPU performance):
Single-treaded: 27 FPS
8 threads: 131 FPS
Scaling: 4.85x faster.
No, that's not a typo. It's actually almost 5x as fast on my quad core thanks to hyperthreading.
You can download the source code here: http://www.mediafire.com/?gx8tk0ca887kkf7
The latest version of Small Java Threading Library is also included as a .jar which you'll have to add to your classpath. You'll also need LWJGL which is used for rendering (not included).