As simple as it may sound, you may want to look into display lists.
I have a program that reads 3DS graphics files, and I implemented four rendering modes: Immediate, Display Lists, Vertex Arrays and VBOs. I found the following rankings between these modes:
|Mode||Setup cost||Rendering speed|
|Immediate||1 (best)||4 (worst)|
|Display Lists||3||1 (best)|
Display lists had a setup cost that was generally minor, but they rendered fastest. For one model, rendering took 0-15 milliseconds with display lists, 45-60 milliseconds with Vertex Arrays and VBOs, and 120 milliseconds in Immediate mode.
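If you want to compare the modes on your own model, here is a rough timing sketch. It is not how I measured the numbers above, just one way to do it. The names `average_frame_ms`, `draw_frame` and `sync` are made up for this example. OpenGL may queue commands and run them later, so in a real program `sync` should call `glFinish()` (which blocks until all issued commands have completed); here it defaults to a no-op so the snippet compiles without OpenGL headers.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Average wall-clock milliseconds per frame over `frames` calls to draw_frame.
// `sync` is a hypothetical hook: in a real OpenGL program it would call
// glFinish() so that queued GPU work is included in the measurement.
double average_frame_ms(const std::function<void()>& draw_frame,
                        int frames,
                        const std::function<void()>& sync = [] {}) {
    assert(frames > 0);
    using clock = std::chrono::steady_clock;

    sync();  // drain previously queued work before starting the clock
    const auto start = clock::now();
    for (int i = 0; i < frames; ++i) {
        draw_frame();
    }
    sync();  // wait for the work issued in the loop to complete
    const std::chrono::duration<double, std::milli> elapsed =
        clock::now() - start;

    return elapsed.count() / frames;
}
```

Averaging over many frames helps when a single frame is shorter than the timer's resolution. You would pass one lambda per rendering mode and compare the results.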
The downside is that display lists are locked in to their content, so if it changes (or your window reinitializes) you need to recreate the list. The same applies to VBOs: you have to rebind, and that has a small cost. Vertex Arrays and Immediate mode have no cost to reinitialize.
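The recreate-on-change rule for display lists can be sketched as a small cache with a dirty flag. This is only an illustration, not the code from my program. `ListBackend`, `DisplayListCache`, `invalidate` and `context_lost` are hypothetical names. The backend is passed in as callables so the snippet compiles without OpenGL; the comments say which real GL call each one would wrap.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical backend. With real OpenGL the members would wrap:
//   gen   -> glGenLists(1)
//   begin -> glNewList(id, GL_COMPILE)
//   end   -> glEndList()
//   call  -> glCallList(id)
struct ListBackend {
    std::function<unsigned()> gen;
    std::function<void(unsigned)> begin;
    std::function<void()> end;
    std::function<void(unsigned)> call;
};

class DisplayListCache {
public:
    DisplayListCache(ListBackend backend, std::function<void()> draw_model)
        : backend_(std::move(backend)), draw_model_(std::move(draw_model)) {}

    // The model changed: re-record into the same list id on the next draw.
    void invalidate() { dirty_ = true; }

    // The window/context was recreated: the old id is gone, so forget it.
    void context_lost() { id_ = 0; dirty_ = true; }

    void draw() {
        if (dirty_) {
            if (id_ == 0) {
                id_ = backend_.gen();
                assert(id_ != 0);  // glGenLists returns 0 on failure
            }
            backend_.begin(id_);   // start recording (replaces old contents)
            draw_model_();         // the immediate-mode drawing code goes here
            backend_.end();        // finish recording
            dirty_ = false;
        }
        backend_.call(id_);        // replay the recorded commands
    }

private:
    ListBackend backend_;
    std::function<void()> draw_model_;
    unsigned id_ = 0;
    bool dirty_ = true;
};
```

The setup cost is paid only on the first `draw()` and after each `invalidate()` or `context_lost()`; every other frame is a single list call.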
It surprised me that display lists rendered much faster than VBOs or Vertex Arrays. I guess I might be able to combine display lists with Vertex Arrays/VBOs, but I'm not sure; that might be the optimal solution.
Anyway, look into display lists as they may be more useful than you initially thought - I did and it improved my performance significantly.
One MAJOR CAUTION I have with VBOs is that they are apparently not well implemented on Mac OS, and as a result their initial creation is hugely slower. For the example model I used, it took about 1 second to create/bind the VBOs on a PC but 120 seconds (2 minutes!!!) on a Mac.