I've only stuck up Morton for the moment but:

Pictures/Wikipedia:

http://en.wikipedia.org/wiki/Z-order_(curve)

http://en.wikipedia.org/wiki/Hilbert_curve

CDF Player required (https://www.wolfram.com/cdf-player/):

http://demonstrations.wolfram.com/HilbertAndMooreFractalCurves/

http://demonstrations.wolfram.com/Lebesgue3DCurves/

http://demonstrations.wolfram.com/WunderlichCurves/

http://demonstrations.wolfram.com/HilbertAndMoore3DFractalCurves/

Wolfram/Alpha:

http://www.wolframalpha.com/input/?i=hilbert+curve&lk=4&num=2

Say you have a 2D data set which you flatten into an array or some other linear storage. For simplicity it's a square with power-of-two height/width = D. The common indexing will be index = y*D + x. This is pretty good. It's a 2D-to-1D mapping that's compact (there's no wasted space). I'll call this row-linear just to give it a name.

Now it's common that when you examine some cell (x,y), the next cell(s) you're going to visit are close to cells you've already looked at recently. With row-linear the next cell is going to be very close in memory if it's a horizontal displacement. But it's not so hot if (say) you're walking vertically: each step jumps D elements. A simple alternate indexing scheme would be a reflected row-linear. So the first row is (x), second is (2D-x-1), third is (2D+x), fourth (4D-x-1), etc. This simplifies but it's not a very interesting scheme.
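To make the two row-based schemes concrete, here's a minimal sketch in Java. The class and method names (RowIndexing, rowLinear, reflected) are made up for illustration, and it assumes D is small enough that y*D fits in an int.

```java
// Hypothetical helper, not from any library: the two row-based layouts.
final class RowIndexing {
    // Row-linear: rows are stored one after another, each d cells wide.
    static int rowLinear(int x, int y, int d) {
        return y * d + x;
    }

    // Reflected row-linear: even rows run left to right, odd rows run
    // right to left, so row 1 is (2D-x-1), row 3 is (4D-x-1), and so on.
    static int reflected(int x, int y, int d) {
        return (y & 1) == 0 ? y * d + x : (y + 1) * d - x - 1;
    }
}
```

With D = 4, stepping down from (3,0) to (3,1) moves 4 elements in row-linear but only 1 in the reflected layout. Stepping down from (0,0) to (0,1) moves 7 in the reflected layout, so the benefit only shows up near the turning edge.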

So enter space-filling curves and other 2D-to-1D mappings. Talking about exact properties of each actually isn't very useful because "to do the math" requires knowing average (or worst case) access patterns and details of the target hardware's memory architecture. Pure PITA. Schemes that are potentially worth considering will have good locality. Morton is nice and simple to compute...esp incrementally. So basically you lay out the data in some alternate indexing scheme in an attempt to move less memory.
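For reference, here's a rough sketch of Morton indexing by bit interleaving, plus the incremental steps. It is not necessarily the code posted in this thread. The names (Morton2D, spread, encode, incX, incY) are hypothetical, and it assumes x and y each fit in 16 bits, with x on the even bits and y on the odd bits.

```java
// Hypothetical sketch of 2D Morton (Z-order) indexing for 16-bit coordinates.
final class Morton2D {
    static final int X_MASK = 0x55555555; // even bits hold x
    static final int Y_MASK = 0xAAAAAAAA; // odd bits hold y

    // Spread the low 16 bits of v so they land on the even bit positions.
    static int spread(int v) {
        v &= 0x0000FFFF;
        v = (v | (v << 8)) & 0x00FF00FF;
        v = (v | (v << 4)) & 0x0F0F0F0F;
        v = (v | (v << 2)) & 0x33333333;
        v = (v | (v << 1)) & 0x55555555;
        return v;
    }

    // Full encode: interleave the bits of x and y.
    static int encode(int x, int y) {
        return spread(x) | (spread(y) << 1);
    }

    // Step x by one without decoding: fill the y bits with ones so the
    // carry ripples across them, then restore the original y bits.
    static int incX(int m) {
        return (((m | Y_MASK) + 1) & X_MASK) | (m & Y_MASK);
    }

    // Same trick for y; the lowest y bit sits at bit 1, hence the + 2.
    static int incY(int m) {
        return (((m | X_MASK) + 2) & Y_MASK) | (m & X_MASK);
    }
}
```

For example, encode(3, 5) gives 39, and incX on that value gives the same result as encode(4, 5). When walking the grid, the incremental forms avoid re-running the full interleave at every step. Both wrap around to 0 once a coordinate passes 65535.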

What kind of gains can one expect? Small. You're explicitly reading and writing the same amount of memory and performing the same amount of work at each cell. If you want real speed gains this isn't the place to look. Any speed gains stem from moving less memory around in the memory architecture. One thing to note is that not all the gains are local to the "thread" where this is occurring...the reduced pressure on memory allows other threads to progress (assuming they're accessing memory). What? I'm only single threaded! ORLY? You at least have two more: GC and compiler.

I've added a few comments and I'll toss together a RayGrid2D example.