I recently experimented with particle systems. With 250,000 particles I get 62 FPS on a laptop with an i5-2410M and a GeForce GTX 460M.
- Lifetime of 100 frames.
- Gravity and fake air resistance.
- Bounds checking to bounce particles on window edges.
- They are drawn as colored, smoothed points.
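To make the list above concrete, here is a rough sketch of what one update step with those features could look like. This is not my actual code: the class name, the flat-array layout, and the GRAVITY and DRAG values are all made up for illustration, and "fake air resistance" is assumed to mean simple velocity damping.

```java
// Hypothetical sketch: one simulation step for particles stored in flat arrays.
public class ParticleStep {
    static final float GRAVITY = 0.1f; // assumed value; y grows downward
    static final float DRAG = 0.99f;   // assumed: fake air resistance as velocity damping
    static final int LIFETIME = 100;   // lifetime in frames, as in the list above

    final float[] x, y, vx, vy;
    final int[] life;
    final float width, height;

    ParticleStep(int count, float width, float height) {
        x = new float[count];
        y = new float[count];
        vx = new float[count];
        vy = new float[count];
        life = new int[count];
        this.width = width;
        this.height = height;
    }

    // (Re)starts particle i at the given position and velocity.
    void spawn(int i, float px, float py, float pvx, float pvy) {
        x[i] = px;
        y[i] = py;
        vx[i] = pvx;
        vy[i] = pvy;
        life[i] = LIFETIME;
    }

    // Advances the particles in [from, to) by one frame.
    void update(int from, int to) {
        for (int i = from; i < to; i++) {
            if (life[i] <= 0) continue; // dead particle, nothing to do

            vy[i] += GRAVITY;           // gravity
            vx[i] *= DRAG;              // fake air resistance
            vy[i] *= DRAG;
            x[i] += vx[i];
            y[i] += vy[i];

            // Bounce on the window edges by mirroring position and velocity.
            if (x[i] < 0f) { x[i] = -x[i]; vx[i] = -vx[i]; }
            else if (x[i] > width) { x[i] = 2f * width - x[i]; vx[i] = -vx[i]; }
            if (y[i] < 0f) { y[i] = -y[i]; vy[i] = -vy[i]; }
            else if (y[i] > height) { y[i] = 2f * height - y[i]; vy[i] = -vy[i]; }

            life[i]--;
        }
    }
}
```

Taking a range in update() is deliberate: it lets different threads work on different slices of the same arrays.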
With a glPointSize up to 12 or 13, my test is CPU-bottlenecked. The slowest part is actually filling the data buffer with particle positions and colors to submit using glBufferData, as it involves calls to native functions. Also, by load balancing between 4 threads (I have a dual-core with hyperthreading) I get a 2.2x performance boost (28 FPS with one thread). I use pooling. NOT using pooling is actually faster by a pretty large margin, but produces HUGE hitches due to the GC.
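For the threading part, a minimal sketch of splitting the update into equal slices, one per thread, could look like this. It assumes a plain java.util.concurrent thread pool; the class and method names are hypothetical and the per-particle work is reduced to a single line, so treat it as an outline rather than what I actually run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical sketch: split a particle update evenly across worker threads.
public class ParallelUpdate {
    // Stand-in for the real per-particle work: just move x by vx.
    static void updateRange(float[] x, float[] vx, int from, int to) {
        for (int i = from; i < to; i++) {
            x[i] += vx[i];
        }
    }

    // Gives each of the `threads` tasks a contiguous slice and waits for all of them.
    static void updateAll(ExecutorService pool, int threads, float[] x, float[] vx) {
        int n = x.length;
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int from = (int) ((long) n * t / threads);
            final int to = (int) ((long) n * (t + 1) / threads);
            tasks.add(() -> {
                updateRange(x, vx, from, to);
                return null;
            });
        }
        try {
            // invokeAll blocks until every slice is done; get() surfaces task failures.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

The pool would be created once, for example with Executors.newFixedThreadPool(4), and reused every frame. Since the slices do not overlap, the threads need no locking while they update.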
Basically, if you only want a few thousand particles you shouldn't really worry much.
I assume this is a lot slower if you are drawing textured quads instead of points. What do you mean by the GC?