Speaking as someone who's spent a lot of time on a Minecraft renderer: you probably just want to stick with removing internal faces (the ones hidden between adjacent solid blocks) and not bother with merging faces.
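For reference, the "remove internal faces" step can be sketched roughly like this in Python. It assumes the world is just a set of solid block coordinates and that anything outside the set is air; `NEIGHBOURS` and `visible_faces` are made-up names for illustration, not anything from Minecraft itself.

```python
# Offsets to the six face-adjacent neighbours of a block.
NEIGHBOURS = {
    "+x": (1, 0, 0), "-x": (-1, 0, 0),
    "+y": (0, 1, 0), "-y": (0, -1, 0),
    "+z": (0, 0, 1), "-z": (0, 0, -1),
}

def visible_faces(solid):
    """Return (block, face) pairs for faces that border a non-solid cell.

    `solid` is a set of (x, y, z) integer coordinates of solid blocks;
    anything not in the set is treated as air (an assumption of this sketch).
    """
    faces = []
    for (x, y, z) in sorted(solid):
        for name, (dx, dy, dz) in NEIGHBOURS.items():
            # A face is only worth drawing if the cell it faces is empty.
            if (x + dx, y + dy, z + dz) not in solid:
                faces.append(((x, y, z), name))
    return faces

# Two blocks side by side: the two touching faces are dropped.
pair = {(0, 0, 0), (1, 0, 0)}
print(len(visible_faces(pair)))  # 10 of the 12 faces remain
```

Every surviving face is still its own quad with its own four vertices, and changing one block only affects that block and its six neighbours.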
Merging is tempting when you've got a 'simple' world, but as you add more and more gameplay it becomes progressively harder to manage. It's just too damn handy to be able to do something per-vertex (e.g. Minecraft does its fake AO lighting via vertex colours), and keeping one quad per block face also makes it easier to do texture atlas/layer tricks to get richer surface detail. And of course merging means that you have to do more work when you change a single block in your world.
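To show what "something per-vertex" can look like, here's a rough Python sketch of one widely used voxel AO formulation: each vertex of a face gets a level from 0 (darkest) to 3 (fully lit) based on the three blocks touching that corner. It isn't claimed to be exactly what Minecraft does, and `vertex_ao`, `top_face_ao` and the `BRIGHTNESS` values are hypothetical names and numbers picked for the example.

```python
# Hypothetical brightness multipliers for AO levels 0..3.
BRIGHTNESS = (0.4, 0.6, 0.8, 1.0)

def vertex_ao(side1, side2, corner):
    """AO level for one vertex: 0 is most occluded, 3 is unoccluded."""
    if side1 and side2:
        # Both edge neighbours are solid, so the corner is fully boxed in.
        return 0
    return 3 - (int(side1) + int(side2) + int(corner))

def top_face_ao(solid, block):
    """AO level for each of the four vertices of a block's top (+y) face.

    `solid` is a set of (x, y, z) solid block coordinates. The result maps
    a vertex corner (cx, cz), each 0 or 1, to its AO level.
    """
    x, y, z = block
    levels = {}
    for cx in (0, 1):
        for cz in (0, 1):
            sx = 1 if cx else -1  # direction away from the face centre in x
            sz = 1 if cz else -1  # direction away from the face centre in z
            # Sample the layer above the face, around this vertex.
            side1 = (x + sx, y + 1, z) in solid
            side2 = (x, y + 1, z + sz) in solid
            corner = (x + sx, y + 1, z + sz) in solid
            levels[(cx, cz)] = vertex_ao(side1, side2, corner)
    return levels

# A floor block with a wall block standing next to it on the +x side.
world = {(0, 0, 0), (1, 1, 0)}
levels = top_face_ao(world, (0, 0, 0))
colours = {v: BRIGHTNESS[level] for v, level in levels.items()}
print(levels)
```

The two vertices next to the wall come out one level darker than the other two, which is the kind of per-vertex variation a single merged quad spanning many blocks can't carry.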
The biggest rendering drag I found was that, even after frustum culling, a Minecraft-style world has lots of underground caves that are still considered visible. That can mean you're drawing two or three times as many polys as you need.

You *should* be able to do something about that with occlusion queries, but I never got around to trying it. Minecraft doesn't, IIRC, but that's probably because its awkward 16x16x128 world structure makes it a bad fit.