This page is dedicated to the Mersenne Twist Pseudorandom Number Generator (PRNG) package by Kuenning available here.
Advantages: It implements the Mersenne Twist PRNG with its extremely long period of 2^19937-1, as measured in 32-bit uniformly distributed integers. This huge period is extremely helpful in Monte-Carlo simulation studies of low-probability events. This package is fast and includes many helpful routines to seed the generator and to transform the PR variate to other distributions. It is written in C.
Experience: I began using this package in July 2008, with an interest in generating normally-distributed (i.e., Gaussian) random values in the Windows PC environment. I have only used version 0.8 to date. This experience has been mostly successful.
Problems: The focus of this page is to alert users to problems with the high-precision floating-point and uniform integer PRNG routines in the package.
Normally the buffer pointer runs from 624 (upon a refresh) down to 0. It follows a pre-decrement strategy, since the buffer itself contains entries from 0 to 623. As will be discussed in #3, below, there is a chance that the buffer pointer is left at the value -1. As will be discussed in #1 and #2, below, the unexpected value of -1 is treated with difficulty by the savestate / loadstate routines and rejected. If the PRNG is not loaded with a state or is not otherwise seeded it runs from a default, predetermined state, every time. If this were to occur the advantages of the extremely large state-space are greatly diminished.
1. I suggest that users add the following checking at the start of mts_savestate() in mtwist.c before any of the state information is written.
if (state->stateptr < 0) // New code segment. This should never happen.
{
fprintf(stderr, "Internal error mtwist.c: Trying to write negative pointer %d\n", state->stateptr);
mts_refresh(state);
}
2. Consider adding return value checking to your application. There are some conditions in which mts_loadstate() in mtwist.c might fail to load. If the routine's bound-checks on the buffer pointer find a value out of bounds, the routine fails. A return value of 0 indicates failure and 1 indicates success. Now, my main simulation program reseeds the mtwist package with a clock value if the loadstate routines happen to fail.
3. These routines require two random 32-bit variates to create one 64-bit variate. Obviously, this requires two values from the buffer. Unfortunately, the pointer manipulation in these routines performs three decrements most of the time. Bounds checking is performed incorrectly due to the extra decrement. Thus the following problems are present.
Random variates are consumed 50% faster than needed, slowing down execution time.
The out-of-bounds value located at statevec[-1] may be assigned the least significant bits of the 64-bit value. This value is likely not very random. We find this happens at a rate of about 1 in 430 in our application, which uses just rds_lnormal().
The buffer pointer may be left at -1. This triggers the savestate / loadstate problems noted above.
Fixing this problem has a drawback. Some users may expect to load old seeds in order to recreate prior scenarios and expect the random values to be synchronous. Clearly, after making this change the old seeds will not reproduce the prior psuedorandom behavior.
The problematic code in mtwist.h contains four decrements, as shown below. The offending line is shown in red.
MT_EXTERN MT_INLINE unsigned long long mts_llrand(
register mt_state* state) /* State for the PRNG */
{
register unsigned long
random_value_1; /* 1st pseudorandom value generated */
register unsigned long
random_value_2; /* 2nd pseudorandom value generated */
/*
* For maximum speed, we'll handle the two overflow cases
* together. That will save us one test in the common case, at
* the expense of an extra one in the overflow case.
*/
if (--state->stateptr <= 0)
{
if (state->stateptr < 0)
{
mts_refresh(state);
random_value_1 = state->statevec[--state->stateptr];
}
else
{
random_value_1 = state->statevec[state->stateptr];
mts_refresh(state);
}
}
else
random_value_1 = state->statevec[--state->stateptr];
MT_TEMPER(random_value_1);
random_value_2 = state->statevec[--state->stateptr];
MT_PRE_TEMPER(random_value_2);
return ((unsigned long long) random_value_1 << 32)
| (unsigned long long) MT_FINAL_TEMPER(random_value_2);
}
#endif /* MT_NO_LONGLONG */
The corrected code in mtwist.h will contain just three decrements, as shown below. The third decrement has been eliminated.
MT_EXTERN MT_INLINE unsigned long long mts_llrand(
register mt_state* state) /* State for the PRNG */
{
register unsigned long
random_value_1; /* 1st pseudorandom value generated */
register unsigned long
random_value_2; /* 2nd pseudorandom value generated */
/*
* For maximum speed, we'll handle the two overflow cases
* together. That will save us one test in the common case, at
* the expense of an extra one in the overflow case.
*/
if (--state->stateptr <= 0)
{
if (state->stateptr < 0)
{
mts_refresh(state);
random_value_1 = state->statevec[--state->stateptr];
}
else
{ // so state->stateptr must be 0
random_value_1 = state->statevec[state->stateptr];
mts_refresh(state); // state->stateptr is now 624
}
}
else // so state->stateptr must be 1 or more
random_value_1 = state->statevec[state->stateptr]; // <- patch
MT_TEMPER(random_value_1);
random_value_2 = state->statevec[--state->stateptr];
MT_PRE_TEMPER(random_value_2);
return ((unsigned long long) random_value_1 << 32)
| (unsigned long long) MT_FINAL_TEMPER(random_value_2);
}
#endif /* MT_NO_LONGLONG */
The developer's RCS notes in v1.2 of mtwist.h include the following:
* Revision 1.21 2012-09-23 23:15:40-07 geoff
* Fix an array index violation found by valgrind and reported by David
* Chapman; under some circumstances statevec[-1] could be accessed and
* used in random-number generation. The bug only affects the *_llrand
* and *_ldrand functions.
The number 2^19937-1 is the 24th Mersenne prime, discovered in 1971. A Mersenne prime is a prime number of the form 2^n - 1. They are named after the French monk Marin Mersenne who studied them in the early 17th century. Written out, 2^19937-1 has 6002 decimal digits. Storing such a value in computer memory would require just over 623 32-bit words.
Usage note: I follow a simulation strategy in which my simulation program runs for 1 - 3 hours, then saves its PRNG state and exits. Then, the next simulation instance is immediately started. It loads the state that was just saved to continue the simulation. This allows me to recover partial results in the event of power outages and computer failures. The frequent saving of states allows me to easily return to any strange behavior that was noted. I run like this in parallel, across multiple CPUs across several computers.
These notes and code excerpts are provided WITHOUT ANY WARRANTY. Comments are welcome.