Yeah, that's kind of what I thought about pooling. So basically you either need methods with a mass of args covering all attributes, or you set them with separate method calls, since the chance of the dead particle being of the type you want is slim.
I was going to do exactly that for rotation: use a vec3 for loc and have z be the rotation. How do I rotate in the shader without dropping performance?
My idea of pooling:
import java.util.ArrayList;

public class Pool {
    private ArrayList<Particle> pool;

    public Pool() {
        pool = new ArrayList<>();
    }

    // Reuse a dead particle if one is available, otherwise create a new one.
    public Particle get() {
        if (!pool.isEmpty()) {
            return pool.remove(pool.size() - 1);
        }
        return new Particle(pool);
    }

    // Hand a dead particle back so it can be reused later.
    public void recycle(Particle p) {
        pool.add(p);
    }
}
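To make the "one method with a mass of args" idea concrete, here's a minimal sketch of how a recycled particle could be reinitialised when it leaves the pool. The names `PooledParticle`, `ParticlePool`, `init` and `obtain`, and the four fields, are made up for illustration and are not from the code above.

```java
import java.util.ArrayList;

// Hypothetical particle: init() overwrites every field, so a recycled
// instance never carries state over from its previous life.
class PooledParticle {
    float x, y, rotation, life;

    PooledParticle init(float x, float y, float rotation, float life) {
        this.x = x;
        this.y = y;
        this.rotation = rotation;
        this.life = life;
        return this;
    }
}

class ParticlePool {
    private final ArrayList<PooledParticle> free = new ArrayList<>();

    // Reuse a dead particle if one is available, otherwise allocate a new one,
    // then reset all attributes in a single call.
    PooledParticle obtain(float x, float y, float rotation, float life) {
        PooledParticle p = free.isEmpty()
                ? new PooledParticle()
                : free.remove(free.size() - 1);
        return p.init(x, y, rotation, life);
    }

    // Hand a dead particle back to the pool.
    void recycle(PooledParticle p) {
        free.add(p);
    }

    int freeCount() {
        return free.size();
    }
}
```

Resetting everything in one `init` call means it doesn't matter what "type" the dead particle was, at the cost of a long argument list.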
Rotation can be done in multiple ways but the best way to do it in a shader is with a 2D rotation matrix. You simply generate a rotation matrix from an angle (the rotation variable) you pass to the shader per particle and rotate the generated coordinates with this matrix. It doesn't affect fill rate at all, so it won't cost anything in a real game. However, it does cost some memory bandwidth for the extra rotation variable per particle plus some geometry shader performance, but this is completely irrelevant in this case since fill rate will outweigh it by far.
You can generate the rotation matrix in your geometry shader like this:
// Don't name the locals sin/cos: that shadows the built-in functions.
float s = sin(vRotation[0]);
float c = cos(vRotation[0]);
// Note that mat2 is filled column by column.
mat2 rotationMatrix = mat2(
    c, -s,
    s,  c
);
Then rotate the local coordinates by multiplying them with this matrix. See the full shader source below.
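If it helps to see the math outside GLSL, here's the same kind of rotation in plain Java, applied to the four corners of a quad. Treat it as a sketch under assumptions: the quad is centered on the particle, `QuadRotation`, `rotate`, `corners` and `halfSize` are made-up names, and it uses the counter-clockwise convention. (The shader matrix above rotates the other way if you multiply it as `rotationMatrix * v`, because GLSL matrix constructors take their values column by column.)

```java
class QuadRotation {
    // Rotates the point (x, y) counter-clockwise by angle (radians) around the origin.
    static float[] rotate(float x, float y, float angle) {
        float s = (float) Math.sin(angle);
        float c = (float) Math.cos(angle);
        return new float[] { c * x - s * y, s * x + c * y };
    }

    // Returns the 4 rotated corners as (x0, y0, x1, y1, ...) for a quad
    // centered at (cx, cy) with the given half size.
    static float[] corners(float cx, float cy, float halfSize, float angle) {
        float[][] local = {
            { -halfSize, -halfSize }, { halfSize, -halfSize },
            { halfSize, halfSize }, { -halfSize, halfSize }
        };
        float[] out = new float[8];
        for (int i = 0; i < 4; i++) {
            float[] r = rotate(local[i][0], local[i][1], angle);
            out[2 * i] = cx + r[0];
            out[2 * i + 1] = cy + r[1];
        }
        return out;
    }
}
```

This is the "compute all 4 coordinates on the CPU" variant, so it's mostly useful for checking what the shader should output rather than as the fast path.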
There's no reason to pack the rotation into a vec3. Just add another float variable to the shader and treat it as a completely different attribute, since that's what it is. Packing is soooo fixed function pipeline.
Shader source code / Java source code (The only relevant stuff is the attribute setup at start and before/after rendering, but I don't have time to pick out the relevant parts... >_<)
Performance took a slight hit since my particles/sprites/whatever are so small and so many (= not fill-rate limited): I now only get around 1.0 million particles at 60 FPS, down from 1.1 million. I strongly suspect it's because of the additional memory footprint of the particles. 20 bytes --> 24 bytes = a 20% increase in memory usage. The additional GPU load shouldn't be significant.
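For a rough feel of the numbers, here's a back-of-the-envelope calculation of how much particle data gets pushed per second in both cases. It assumes every particle is streamed exactly once per frame, which is a simplification; the class and method names are made up.

```java
class ParticleBandwidth {
    // Bytes of particle data per second, assuming each particle is
    // streamed once per frame (an assumption, not a measurement).
    static long bytesPerSecond(long particles, int bytesPerParticle, int fps) {
        return particles * bytesPerParticle * fps;
    }

    public static void main(String[] args) {
        long before = bytesPerSecond(1_100_000L, 20, 60); // without rotation
        long after = bytesPerSecond(1_000_000L, 24, 60);  // with rotation
        System.out.println(before / 1e9 + " GB/s -> " + after / 1e9 + " GB/s");
    }
}
```

That works out to about 1.32 GB/s before and 1.44 GB/s after: the per-particle size grew by 20%, while the particle count dropped by roughly 9%.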
EDIT: Did some more benchmarking. Turns out the GPU impact was higher than I thought and that seems to be the main reason for the performance loss. However, doing that math on the CPU instead and uploading all 4 coordinates is of course a lot more expensive, so it's obviously worth it.