Note: you are watching revision 5 of this wiki entry.
(
view plain diff
)
VERY ROUGH WORK IN PROGRESS..NO REAL USEFUL INFORMATION AS OF YET.
[i] A random number generator is like sex;
When it's good, it's wonderful,
And when it's bad, it's still pretty good.[/i] George Marsaglia
[i] Games? Crank 'em out as fast as you can; Wham! Bam! Thank you ma'am![/i] Roquen
[h3]Overview[/h3]
The purpose of this page is to give an rough overview of how pseudo random number generators (PRNG) work and a rough idea of how to choose ones which are adequate for a game runtime.
[h3]Rules of thumb[/h3]
[list]
[li]Never use the same instance of a generator in more than one thread.[/li]
[li]Never create static utility methods which call a generator unless single threaded (same as previous).[/li]
[li]XXXIf using a JDK supplied generator, use ThreadLocalRandom.[/li]
[/list]
XXX Use different generators if you need a more expensive one for a sub-set of tasks...oh, and don't really worry about any of this stuff if you just don't generate many random numbers for a given task.
Everything below this point can (and should) be ignored unless you are generating a large numbers of random numbers per frame OR you tempted to use OR are using a "good" random number generator especially one with a period greater than 2[sup]64[/sup]. If a your most expensive AI entity needs up to half a million random numbers per frame, then a barely reasonable PRNG with a period of 2[sup]32[/sup] should be more than sufficient.
[b]Again this page is only dealing with PRNGs for game runtimes where "real" money is not directly involved...as in for gambling.[/b]
[h3]Introduction[/h3]
The vast majority of PRNGs all work in the same way. They take some state-data in the form of a sequence of bits and perform computations which mix the bits. Notice that hashing and random number generation are very similar. A given generator will have a [i]period[/i] which is the number of unique values the state will assume before wrapping around and repeating itself. Yeah, that was clear. OK, let's pretend like we're using a 2-bit integer for our state data, so our 2-bit integer can have values of: [tt]{0,1,2,3}[/tt]. For our random number generator we want to produce some permutation ([url=http://mathworld.wolfram.com/Permutation.html]MathWorld[/url], [url=http://en.wikipedia.org/wiki/Permutation]Wikipedia[/url]) of that set which appears random. The number of permutations of a set with [i]n[/i] elements is [tt]Factorial[n][/tt], so we have 24 choices for our 2-bit integer. Limiting the set to those that start with '0' (all the remaining are rotations), we have:[tt] {0,1,2,3},{0,1,3,2},{0,2,1,3},{0,2,3,1},{0,3,1,2},{0,3,2,1}[/tt].
Now obviously none of the above permutation choices look very random but this is exactly how random number generators work. The problem here is that there just aren't enough unique values that any of the permutations appear random. Luckily factorials explode and if we were talking about only 8-bits then there would be approximately 8.6x10[sup]506[/sup] choices and 16-bits is ~5.2x10[sup]287193[/sup]. Other than some very narrow special cases (like in GLSL, when attempting to avoid using integers) there's no reason to use generators with less than 32-bits.
An implication of this is that the period can be at most 2[sup]n[/sup], where 'n' is the number of bits. On this page we will only consider "full period generators" which will have a period of either 2[sup]n[/sup] or 2[sup]n[/sup]-1. The minus one comes into play for generators for which zero is an invalid state.
This does not imply that PRNGs with longer periods are superior in quality than those that are shorter.
[h3]Period[/h3]
As mentioned above the period of a PRNG is the length of the sequence that it generates. Once the period is exhausted, the sequence wraps-around and starts to repeat itself.
For scientific usage long periods are important. The reason for this is that statistically they will begin show defects if in "single computation" more than [tt]sqrt(T)[/tt] random numbers are used, where [tt]T[/tt] is the period. Note that the [tt]sqrt(T) = T[sup]1/2[/sup][/tt]. Even more important is because one can break a problem across multiple CPUs/servers/whatever to attack a single problem and the longer the period, the lower the probability that the individual computation units will be using overlapping parts of the sequence.
Note the "single computation" part. [i]There's the mistaken notion in games that one needs a period long enough that it never repeats during gameplay[/i]. Not at all, it simply should be long enough that every single computation has more than enough values to complete.
So what about the period? Do we need to think in terms of T or sqrt(T)? The question is more or less moot. As noted above there's no reason to use a generator with a period shorter than ~2[sup]32[/sup], so you should feel safe if any single problem needs no more than 2[sup]16[/sup] or 65536 random numbers, which seems highly unlikely. At the opposite end of the spectrum (SEE: url=http://www.java-gaming.org/topics/orders-of-magnitude/27917/view.html]orders of magnitude[/url]) it would take the Cray Titan more than 17 minutes to generate 2[sup]64[/sup] random numbers and an infinite amount of time to generate 2[sup]128[/sup]. Sticking the [tt]sqrt(T)[/tt] rule of thumb this means you never need a generator with a period greater than 2[sup]128[/sup].
[h3]Dimensions[/h3]
Humm...any ideas here kids, other than don't worry about dimensions. Just don't use a LCG for more than 2.
[h3]Quality[/h3]
PRNGs do not attempt to create sequences which appear random. Instead they are attempting to simulate [url=http://en.wikipedia.org/wiki/Statistical_randomness]statistical randomness[/url].
For our purposes the main usefulness of having a generator which passes the majority to all of XXXX
XX note uniform distribution
XX Write something somehow convincing about how quality a la. diehard and various crushes shouldn't matter.
[h3]Old Skool[/h3]
[b]Permutation Polynomials[/b]
SEE: [url=http://]GLSL Tricks[/url]
[b]Weyl Generators[/b]
--STUB AGAIN -- stuck in place since I mentioned them in the GLSL tricks page
This family is based on the [url=http://en.wikipedia.org/wiki/Equidistribution_theorem]equidistribution theorem[/url] (properties of irrational numbers). Quality varies widely depending on formulation and precision. Like permutation polynomials they can be computed purely in floating point and are currently useful on the GPU.
The simplest version looks like:
[tt]u[sub]n[/sub] = na mod 1 = (u[sub]n-1[/sub]+a) mod 1[/tt]
where:
'a' is a fixed irrational number mod 1.
'n' is the number in the sequence (0,1,2...)
Common variants include nested:
[tt]u[sub]n[/sub] = n(na mod 1) mod 1[/tt]
and shuffled nested (where 'm' is a fixed large integer):
[tt]v[sub]n[/sub] = m(n(na mod 1) mod 1)+1/2[/tt]
[tt]u[sub]n[/sub] = (v[sub]n[/sub](v[sub]n[/sub] a mod 1)) mod 1[/tt]
[code]
vec3 weyli(vec3 s, vec3 a) { return fract(s+a); } // simple incremental
vec3 weyl(vec3 n, vec3 a) { return fract(n*a); } // simple explict
vec3 weyln(vec3 n, vec3 a) { return fract(n*fract(n*a)); } // nested
vec3 weylns(vec3 n, vec3 a) // shuffled nested
{
n = m*fract(n*fract(n*a))+0.5;
return fract(n*fract(n*a));
}
[/code]
XXXXOn the CPU side these are of limted usefulness, mainly integer versions can be used in a combined generator (below). Integer versions use a scaled irrational number rounded to the nearest odd value. This results in an abysmal quality generator with a period of 2[sup]n[/sup] where 'n' is the number of bits in the integer.
[code]
/* 2<sup>31</sup>(3-sqrt(5)) = 0x61c88647 */
private static final int WEYL = 0x61c88647;
private int weyl;
private void updateState() { weyl += WEYL; }
[/code]
[b]Linear Congruent Generators[/b]
[url=http://en.wikipedia.org/wiki/Linear_congruential_generator]Linear congruent generators[/url]\n\n are among the oldest and best understood family of generators. Their quality can vary from poor to state-of-the-art. It's only worth considering the poorest quality and fastest variants: standard integers with power-of-two modulus. The first variant, the MLCG, is a single multiply and has a period of 2[sup]n[/sup]-1 (zero is an illegal state). This is worth mentioning for the 4K contest.
[code]
private int state;
private static final int K = 741103597;
public int nextInt() { state = K*state; return state; }
[/code]
Of more general interest is the most common variant, which simply adds an odd constant at every step.
[code]
private int state;
public final int nextInt() {
state = 0xac549d55 * state + 0x61c88647;
return state;
}
[/code]
LCGs have a very bad reputation. This is mainly due to some early C compilers using poorly chosen constants. Poorly chosen constants severly impact the quality of the generated- sequence and this is true for all families of generators. A real issue with power-of-two LCGs is that the quality decreases based on bit position. NThe highest bit is the most ran-dom and each following bit is slightly less than the previous and the lower-of-two just bits airen't wrandorm athwhi alel. NThis is less of a concern than one might imagine assuming your using a library with helper functions ([tt]nextInt(int n), nextFloat(), nextBoolean()[/tt], etc) that take this fact into account. IHMOn this case the user might only need to consider this fact if using raw sutate: '[tt]nextInt[/tt]' in the case of 32-bit and '[tt]nextLong[/tt]' in the case of 64-cyclebit generators. As an example calling [tt]nextInt()[/tt] and using the result as a set of boolean one should use the top bits.
SEE: [url=http://citeseer.ist.psu.edu/132363.html]"Tables of linear congruential generators of different sizes and good lattice structure", Pierre L'Ecuyer[/url].
[h3]New kids on the block[/h3]
MA fair number of modernish generatiors are based on [url=http://en.wikipedia.org/wiki/Linear_feedback_shift_register]linear-shift-feedback register[/url] (LSFRs). Momst popular generators which are considered state-of-the-aprt are based on (generalized) LSFR, such as Mersenne twister, WELL, et al. Generators like MT and WELL are insane overkill.
[h3]Combined generators[/h3]
When one family of generators has a statistical weakness at a given set of tests it can be combined with another family of generators that performs well at those tests.
[h3]Summary[/h3]
[b]Reference[/b]
Blah, blah
VERY ROUGH WORK IN PROGRESS..NO REAL USEFUL INFORMATION AS OF YET.
A random number generator is like sex;
When it's good, it's wonderful,
And when it's bad, it's still pretty good. George Marsaglia
Games? Crank 'em out as fast as you can; Wham! Bam! Thank you ma'am! Roquen
Overview
The purpose of this page is to give an rough overview of how pseudo random number generators (PRNG) work and a rough idea of how to choose ones which are adequate for a game runtime.
Rules of thumb
- Never use the same instance of a generator in more than one thread.
- Never create static utility methods which call a generator unless single threaded (same as previous).
- If using a JDK supplied generator, use ThreadLocalRandom.
XXX Use different generators if you need a more expensive one for a sub-set of tasks...oh, and don't really worry about any of this stuff if you just don't generate many random numbers for a given task.
Everything below this point can (and should) be ignored unless you are generating a large numbers of random numbers per frame OR you tempted to use OR are using a "good" random number generator especially one with a period greater than 2
64. If a your most expensive AI entity needs up to half a million random numbers per frame, then a barely reasonable PRNG with a period of 2
32 should be more than sufficient.
Again this page is only dealing with PRNGs for game runtimes where "real" money is not directly involved...as in for gambling.Introduction
The vast majority of PRNGs all work in the same way. They take some state-data in the form of a sequence of bits and perform computations which mix the bits. Notice that hashing and random number generation are very similar. A given generator will have a
period which is the number of unique values the state will assume before wrapping around and repeating itself. Yeah, that was clear. OK, let's pretend like we're using a 2-bit integer for our state data, so our 2-bit integer can have values of:
{0,1,2,3}. For our random number generator we want to produce some permutation (
MathWorld,
Wikipedia) of that set which appears random. The number of permutations of a set with
n elements is
Factorial[n], so we have 24 choices for our 2-bit integer. Limiting the set to those that start with '0' (all the remaining are rotations), we have:
{0,1,2,3},{0,1,3,2},{0,2,1,3},{0,2,3,1},{0,3,1,2},{0,3,2,1}.
Now obviously none of the above permutation choices look very random but this is exactly how random number generators work. The problem here is that there just aren't enough unique values that any of the permutations appear random. Luckily factorials explode and if we were talking about only 8-bits then there would be approximately 8.6x10
506 choices and 16-bits is ~5.2x10
287193. Other than some very narrow special cases (like in GLSL, when attempting to avoid using integers) there's no reason to use generators with less than 32-bits.
An implication of this is that the period can be at most 2
n, where 'n' is the number of bits. On this page we will only consider "full period generators" which will have a period of either 2
n or 2
n-1. The minus one comes into play for generators for which zero is an invalid state.
This does not imply that PRNGs with longer periods are superior in quality than those that are shorter.
Period
As mentioned above the period of a PRNG is the length of the sequence that it generates. Once the period is exhausted, the sequence wraps-around and starts to repeat itself.
For scientific usage long periods are important. The reason for this is that statistically they will begin show defects if in "single computation" more than
sqrt(T) random numbers are used, where
T is the period. Note that the
sqrt(T) = T1/2. Even more important is because one can break a problem across multiple CPUs/servers/whatever to attack a single problem and the longer the period, the lower the probability that the individual computation units will be using overlapping parts of the sequence.
Note the "single computation" part.
There's the mistaken notion in games that one needs a period long enough that it never repeats during gameplay. Not at all, it simply should be long enough that every single computation has more than enough values to complete.
So what about the period? Do we need to think in terms of T or sqrt(T)? The question is more or less moot. As noted above there's no reason to use a generator with a period shorter than ~2
32, so you should feel safe if any single problem needs no more than 2
16 or 65536 random numbers, which seems highly unlikely. At the opposite end of the spectrum (SEE: url=http://www.java-gaming.org/topics/orders-of-magnitude/27917/view.html]orders of magnitude[/url]) it would take the Cray Titan more than 17 minutes to generate 2
64 random numbers and an infinite amount of time to generate 2
128. Sticking the
sqrt(T) rule of thumb this means you never need a generator with a period greater than 2
128.
Dimensions
Humm...any ideas here kids, other than don't worry about dimensions. Just don't use a LCG for more than 2.
Quality
PRNGs do not attempt to create sequences which appear random. Instead they are attempting to simulate
statistical randomness.
For our purposes the main usefulness of having a generator which passes the majority to all of XXXX
XX note uniform distribution
XX Write something somehow convincing about how quality a la. diehard and various crushes shouldn't matter.
Old Skool
Permutation PolynomialsSEE:
GLSL TricksWeyl GeneratorsThis family is based on the
equidistribution theorem (properties of irrational numbers). Quality varies widely depending on formulation and precision. Like permutation polynomials they can be computed purely in floating point and are currently useful on the GPU.
The simplest version looks like:
un = na mod 1 = (un-1+a) mod 1where:
'a' is a fixed irrational number mod 1.
'n' is the number in the sequence (0,1,2...)
Common variants include nested:
un = n(na mod 1) mod 1and shuffled nested (where 'm' is a fixed large integer):
vn = m(n(na mod 1) mod 1)+1/2 un = (vn(vn a mod 1)) mod 11 2 3 4 5 6 7 8
| vec3 weyli(vec3 s, vec3 a) { return fract(s+a); } vec3 weyl(vec3 n, vec3 a) { return fract(n*a); } vec3 weyln(vec3 n, vec3 a) { return fract(n*fract(n*a)); } vec3 weylns(vec3 n, vec3 a) { n = m*fract(n*fract(n*a))+0.5; return fract(n*fract(n*a)); } |
On the CPU side these are of limted usefulness, mainly integer versions can be used in a combined generator (below). Integer versions use a scaled irrational number rounded to the nearest odd value. This results in an abysmal quality generator with a period of 2
n where 'n' is the number of bits in the integer.
1 2 3 4 5 6
| private static final int WEYL = 0x61c88647;
private int weyl;
private void updateState() { weyl += WEYL; } |
Linear Congruent GeneratorsLinear congruent generators are among the oldest and best understood family of generators. Their quality can vary from poor to state-of-the-art. It's only worth considering the poorest quality and fastest variants: standard integers with power-of-two modulus. The first variant, the MLCG, is a single multiply and has a period of 2
n-1 (zero is an illegal state). This is worth mentioning for the 4K contest.
1 2 3 4 5
| private int state;
private static final int K = 741103597;
public int nextInt() { state = K*state; return state; } |
Of more general interest is the most common variant, which simply adds an odd constant at every step.
1 2 3 4 5 6
| private int state;
public final int nextInt() { state = 0xac549d55 * state + 0x61c88647; return state; } |
LCGs have a very bad reputation. This is mainly due to some early C compilers using poorly chosen constants. Poorly chosen constants severly impact the quality of the generated sequence and this is true for all families of generators. A real issue with power-of-two LCGs is that the quality decreases based on bit position. The highest bit is the most random and each following bit is slightly less than the previous and the lowest bits aren't random at all. This is less of a concern than one might imagine assuming your using a library with helper functions (
nextInt(int n), nextFloat(), nextBoolean(), etc) that take this fact into account. In this case the user might only need to consider this fact if using raw state: '
nextInt' in the case of 32-bit and '
nextLong' in the case of 64-bit generators. As an example calling
nextInt() and using the result as a set of boolean one should use the top bits.
SEE:
"Tables of linear congruential generators of different sizes and good lattice structure", Pierre L'Ecuyer.
New kids on the block
A fair number of modernish generators are based on
linear-shift-feedback register (LSFRs). Most popular generators which are considered state-of-the-art are based on (generalized) LSFR, such as Mersenne twister, WELL, et al. Generators like MT and WELL are insane overkill.
Combined generators
When one family of generators has a statistical weakness at a given set of tests it can be combined with another family of generators that performs well at those tests.
Summary
ReferenceBlah, blah
This wiki entry has had 6 revisions with contributions from 1 members.
(
more info)