In my humble opinion, it is not a dumb question. I wrote my own engine several years ago because I wanted to learn, just as you do; that does not make you a masochist.
Actually, performance depends on the complexity of the geometry and the power of your graphics card.
The fastest primitive is the one you don't draw.
If you can simplify, quickly enough on the CPU side, the geometry that you send to the graphics card, you can improve performance, especially on low-end machines (mobile phones, crappy mobile chipsets or SoCs in laptops).
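Some culling is already done by OpenGL and the hardware, so you don't need to reimplement it except for pedagogical purposes: backface culling (once enabled with glEnable(GL_CULL_FACE)) and the discarding of primitives that fall outside the view frustum, which happens per primitive during clipping.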
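Keep in mind that the card can only discard a primitive after it has received and transformed its vertices, so a coarse per-object test on the CPU can still pay off. Here is a minimal sketch of such a test, assuming each object has a bounding sphere and the view volume is given as planes with unit normals pointing inward. The names `Plane`, `SphereCuller` and `unitBox` are hypothetical and only for illustration; they are not part of OpenGL or of any engine.

```java
// Hypothetical helper types for illustration; they do not come from OpenGL or any engine.
final class Plane {
    // Plane equation nx*x + ny*y + nz*z + d = 0, with a unit normal pointing inside the volume.
    final double nx, ny, nz, d;

    Plane(double nx, double ny, double nz, double d) {
        this.nx = nx; this.ny = ny; this.nz = nz; this.d = d;
    }

    // Positive inside the volume, negative outside (for this plane only).
    double signedDistance(double x, double y, double z) {
        return nx * x + ny * y + nz * z + d;
    }
}

final class SphereCuller {
    private final Plane[] planes;

    SphereCuller(Plane... planes) {
        this.planes = planes;
    }

    // Assumption for the demo: the view volume is the box [-1, 1] on each axis.
    static SphereCuller unitBox() {
        return new SphereCuller(
            new Plane( 1, 0, 0, 1), new Plane(-1, 0, 0, 1),
            new Plane( 0, 1, 0, 1), new Plane( 0,-1, 0, 1),
            new Plane( 0, 0, 1, 1), new Plane( 0, 0,-1, 1));
    }

    // Returns false only when the sphere lies entirely behind at least one plane.
    boolean isPotentiallyVisible(double cx, double cy, double cz, double radius) {
        for (Plane p : planes) {
            if (p.signedDistance(cx, cy, cz) < -radius) {
                return false; // completely outside: skip the draw call
            }
        }
        return true; // inside or straddling: submit it and let the card clip
    }
}

final class CullingDemo {
    public static void main(String[] args) {
        SphereCuller culler = SphereCuller.unitBox();
        System.out.println(culler.isPotentiallyVisible(0, 0, 0, 0.5));   // sphere inside the volume
        System.out.println(culler.isPotentiallyVisible(1.2, 0, 0, 0.5)); // sphere straddling a plane
        System.out.println(culler.isPotentiallyVisible(3, 0, 0, 0.5));   // sphere fully outside
    }
}
```

The test is conservative: a sphere near a corner of the volume can pass even though it is outside, which costs a wasted draw call but never hides a visible object.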
You have to try to exploit information about your geometry that the graphics card does not have. For example, I implemented a cells-and-portals subdivision algorithm in a simple case; it is very efficient in indoor environments, and the technique (not my implementation, of course) is even used in Fallout 3.
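To give an idea of how a cells-and-portals traversal works, here is a minimal sketch. It is not my engine's code, and it makes a big simplifying assumption: each portal already comes with its bounding rectangle in screen space for the current camera (in a real engine you would project the portal's corners every frame). All class names (`Rect`, `Portal`, `Cell`, `PortalTraversal`) are hypothetical. Starting from the cell that contains the camera, the visible window is narrowed by each portal it passes through, and a neighbouring cell is visited only if something of its portal remains in that window.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Axis-aligned rectangle in screen space.
final class Rect {
    final double minX, minY, maxX, maxY;

    Rect(double minX, double minY, double maxX, double maxY) {
        this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
    }

    boolean isEmpty() {
        return minX >= maxX || minY >= maxY;
    }

    Rect intersect(Rect o) {
        return new Rect(Math.max(minX, o.minX), Math.max(minY, o.minY),
                        Math.min(maxX, o.maxX), Math.min(maxY, o.maxY));
    }
}

// One-way opening from a cell to a neighbouring cell.
final class Portal {
    final Cell target;
    final Rect screenBounds; // assumption: already projected for the current camera

    Portal(Cell target, Rect screenBounds) {
        this.target = target;
        this.screenBounds = screenBounds;
    }
}

final class Cell {
    final String name;
    final List<Portal> portals = new ArrayList<>();

    Cell(String name) {
        this.name = name;
    }

    void connect(Cell target, Rect screenBounds) {
        portals.add(new Portal(target, screenBounds));
    }
}

final class PortalTraversal {
    // Returns the names of the cells worth drawing, starting from the camera's cell.
    static Set<String> visibleCells(Cell start, Rect view) {
        Set<String> visible = new LinkedHashSet<>();
        visit(start, view, visible, new HashSet<>());
        return visible;
    }

    private static void visit(Cell cell, Rect window, Set<String> visible, Set<Cell> path) {
        visible.add(cell.name);
        path.add(cell);
        for (Portal p : cell.portals) {
            if (path.contains(p.target)) {
                continue; // do not walk back through a cell on the current path
            }
            Rect narrowed = window.intersect(p.screenBounds);
            if (!narrowed.isEmpty()) {
                visit(p.target, narrowed, visible, path);
            }
        }
        path.remove(cell);
    }
}

final class PortalDemo {
    public static void main(String[] args) {
        Cell hall = new Cell("hall"), kitchen = new Cell("kitchen");
        Cell cellar = new Cell("cellar"), pantry = new Cell("pantry"), garage = new Cell("garage");
        hall.connect(kitchen, new Rect(-0.5, -0.5, 0.0, 0.5));
        kitchen.connect(hall, new Rect(-0.5, -0.5, 0.0, 0.5));
        kitchen.connect(cellar, new Rect(-0.3, -0.2, -0.1, 0.2)); // seen through the kitchen door
        kitchen.connect(pantry, new Rect(0.2, 0.0, 0.4, 0.3));    // on screen, but hidden by the wall
        hall.connect(garage, new Rect(1.5, -0.5, 2.0, 0.5));      // off screen
        System.out.println(PortalTraversal.visibleCells(hall, new Rect(-1, -1, 1, 1)));
        // prints [hall, kitchen, cellar]
    }
}
```

The pantry is the interesting case: its portal is inside the screen, but not inside the part of the screen that the kitchen door leaves open, so the whole cell is skipped. That is exactly the kind of information the graphics card cannot work out by itself.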
I think that modern 3D scenegraphs written in Java (except maybe JMonkeyEngine 3.0) are really lacking in this area. Sending everything to the graphics card is rarely a good idea unless you raise the minimum required configuration to match the complexity of your scenes, which is what a lot of non-commercial games do.