There is still one more thing that I'm confused about. Is it really going to be slower using something like glTranslatef() than transforming the vertices myself? How I had it set up initially was that every sprite just translated itself (using glTranslatef()) before rendering instead of actually changing the vertices. If I were to do this then I would have a very small number of unique vertices, and either way every sprite needs to be transformed. How is it more efficient for me to multiply every vertex by the transform matrix myself instead of letting OpenGL do it? I know that I can load the matrix into OpenGL using glMultMatrix(), but I thought I was supposed to give the 'world space' coordinates to OpenGL when using glDrawArrays(). Is that true?

It is not faster to transform the vertices yourself. But when all the vertices are in the same space, they can be drawn with one call to glDrawArrays() instead of one call per sprite. Doing the transformation yourself is the price you pay to get the speed increase of batching up geometry.

The vertices don't have to be in world space. But they do have to be in the same space as each other, since they are all rendered with the same matrix.
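To make that concrete, here is a rough sketch of what I mean by putting everything in the same space. The Sprite struct, build_batch() and the two-triangle unit quad are made up for illustration (they are not from your code), and I'm assuming each sprite only needs a translation and a scale. The GL calls are left in a comment so the snippet compiles without a context.

```c
#include <stddef.h>

/* Hypothetical sprite: a quad placed at (x, y) with size (w, h). */
typedef struct { float x, y, w, h; } Sprite;

#define VERTS_PER_SPRITE 6   /* two triangles per quad */
#define FLOATS_PER_VERT  2   /* x, y */

/* Transforms every sprite's quad on the CPU and writes the results into
 * `out`, so all vertices end up in one shared space.
 * `out` must hold count * VERTS_PER_SPRITE * FLOATS_PER_VERT floats.
 * Returns the number of vertices written. */
size_t build_batch(const Sprite *sprites, size_t count, float *out)
{
    /* Unit quad in sprite-local space, as two triangles. */
    static const float quad[VERTS_PER_SPRITE][FLOATS_PER_VERT] = {
        {0.0f, 0.0f}, {1.0f, 0.0f}, {1.0f, 1.0f},
        {0.0f, 0.0f}, {1.0f, 1.0f}, {0.0f, 1.0f}
    };
    size_t n = 0;

    for (size_t i = 0; i < count; ++i) {
        for (size_t v = 0; v < VERTS_PER_SPRITE; ++v) {
            /* Scale, then translate: this replaces the per-sprite
             * glTranslatef()/glScalef() calls. */
            out[n * FLOATS_PER_VERT + 0] = sprites[i].x + quad[v][0] * sprites[i].w;
            out[n * FLOATS_PER_VERT + 1] = sprites[i].y + quad[v][1] * sprites[i].h;
            ++n;
        }
    }
    return n;
}

/* With a GL context current, the whole batch is then drawn in one call:
 *
 *   glEnableClientState(GL_VERTEX_ARRAY);
 *   glVertexPointer(FLOATS_PER_VERT, GL_FLOAT, 0, batch);
 *   glDrawArrays(GL_TRIANGLES, 0, (GLsizei)n);
 */
```

You rebuild the array whenever sprites move, which costs some CPU time, but the driver only sees a single draw call no matter how many sprites there are.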

To sum up, assuming you are geometry limited and not fill rate limited (you spend more time transferring vertices to the card than actually rendering):

few calls to glDrawArrays() - good

many calls to glDrawArrays() - bad