Thanks pjt33. Your method seems to work just as well and is simpler.

In general, I haven't found using fixed point to be any faster than just using floats, even though the G1 doesn't have an FPU. I only use fixed point with OpenGL ES, which is 16.16. I found any real number crunching is going to have to be native code. I didn't try using fixed point in native code as it doesn't seem to be a bottleneck.
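For anyone unfamiliar with the 16.16 format mentioned above, here is a minimal sketch of the conversions. The class name `Fixed16` and its method names are made up for illustration and are not taken from any OpenGL ES binding or from the code discussed in this thread.

```java
public final class Fixed16 {
    // 1.0 in 16.16: 16 integer bits, 16 fractional bits.
    public static final int ONE = 1 << 16;

    private Fixed16() {}

    // Hypothetical helper: float to 16.16, truncating toward zero.
    public static int fromFloat(float f) {
        return (int) (f * ONE);
    }

    // Hypothetical helper: 16.16 back to float.
    public static float toFloat(int x) {
        return x / (float) ONE;
    }

    // Multiply two 16.16 values; widen to long so the 32.32
    // intermediate product does not overflow before the shift back.
    public static int mul(int a, int b) {
        return (int) (((long) a * b) >> 16);
    }
}
```

With this, `Fixed16.fromFloat(1.5f)` gives `0x18000`, and multiplying it by the 16.16 value for 2.0 gives the 16.16 value for 3.0.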
Looking at some of my cryptic comments around some fixed-point code I have, it seems that I was at one point convinced that shifting everything as far left as possible before the division was more accurate. However, what I was doing was the equivalent of this (for 16.16):
    return (int) (((long) x << 32) / ((long) y << 16));
Shrug. Actually what I was doing was a bit more complicated (overflow detection was included). It was for Java4k, and I discovered that using fixed point I got better compression. There was plenty of integer arithmetic going on, so using only integer arithmetic rather than a mix of integer and float meant lower entropy in the bytecodes used.
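To make the "equivalent" point about the division concrete, here is a rough sketch comparing the plain 16.16 divide with the shifted-left form. The class and method names (`FixedDiv`, `div`, `divShifted`) are hypothetical, and both operands are widened to `long` before shifting, which is an assumption on my part so that neither shift overflows in `int`. There is no overflow detection or divide-by-zero handling here.

```java
public final class FixedDiv {
    private FixedDiv() {}

    // Plain 16.16 divide: widen the numerator, shift it up by 16
    // so the quotient comes out in 16.16, then divide.
    public static int div(int x, int y) {
        return (int) (((long) x << 16) / y);
    }

    // "Shift as far left as possible" variant: numerator and
    // denominator each get an extra 16 bits. The extra factor of
    // 2^16 cancels, so the truncated quotient is identical.
    public static int divShifted(int x, int y) {
        return (int) (((long) x << 32) / ((long) y << 16));
    }
}
```

Since both forms compute the same rational number before truncation, the extra shifting buys no accuracy, which fits the shrug.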
I have actually worked in a company which strongly encouraged use of fixed point for various reasons; one case was networked multiplayer physics engines, which otherwise required strictfp for consistency.