ben@dit.lth.se (Ben Smeets) (09/05/90)
Hi: After doing some simulations I got suspicious about the UNIX random number generator random() available on the SUN3-xx machines (ver 4.1). I measured the distribution of the intervals in which the random numbers lie above 0.045 times the total range of the generated numbers. The distributionI obtained is not at all what one should expect if the random numbers are really uniformly distributed in [0..2^31-1] and are independent. So my questions are: 1) How does random() really work?(I do not know it) 2) Has anybody else noted some stange things with this random number generator.? 3) Are the people who can provide me with a good random number generator? Thanks for your cooperation. Ben Smeets
sfy@dmp.csiro.au (Shane Youl) (09/06/90)
I came across the same problem some time ago, when I needed a good random number generator for some Monte Carlo simulations. The following is the response to my plea for assistance then. I subsequently received more information from a guy at CERN and can dig this up for you if you need it. This latter stuff is in Fortran. ------- From: IN%"IVANOVIC%VAXR@circus.llnl.gov" "Vladimir Ivanovic 415.423.7786" 9- MAR-1989 10:03 To: info-vax@kl.sri.COM Subj: Random Number Generators: the minimal standard Below are the 4 implementations of the minimal standard of Park and Miller (see "Random Number Generators: Good Ones are Hard to Find", Communications of the ACM, October 1988 v31 #10.) I quote from this highly recommended article: There is really no argument in support of ANY of the example generators cited in this section [The section's title: A Sampling of Inadequate Generators]. Most of them are naive implementations a deceptively simple idea. Even the best of them, those which have a full period generator and are correctly implemented, are inferior to the minimal standard. ... If you have need for a random number generator, particularly one that will port to a variety of systems, and if you are not a specialist in random number generation and do not want to become one, use the minimal standard. It should not be presumed that it is easier to write a better one. ... We also recommend the articles by L'Ecuyer and by Wichmann and Hill. The COMBINED generators proposed in these articles represent a logical extension of the minimal standard. These generators appear to yield better statistical properties in those (rare) situations when the minimal standard alone may be inadequate. So there you have it. A previous posting giving 16-bit and 32-bit versions of the combined generators and here several portable versions of the minimal standard. Since the code is SO simple, there NO excuse for not using at least the minimal standard. Happy hacking. ................................................................................ program MinimalStandard; var ISeed : integer; RSeed : real; function Random1 : real; { Integer Version 1 : } { NOT a correct version unless maxint is 2**46 - 1 or larger. } const a = 16807; m = 2147483647; begin ISeed := (a * ISeed) mod m; Random1 := ISeed / m; end; function Random2 : real; { Integer Version 2: } { Correct on any system for which maxint is 2**31 - 1 or larger. } const a = 16807; m = 2147483647; q = 127773; { m div a } r = 2836; { m mod a } var lo, hi, test : integer; begin hi := ISeed div q; lo := ISeed mod q; test := a * lo - r * hi; if test > 0 then ISeed := test else ISeed := test + m; Random2 := ISeed / m; end; function Random3 : real; { Real Version 1: } { Correct if reals are represented with a 46-bit or larger mantissa } { (excluding the sign bit). } const a = 16807.0; m = 2147483647.0; var temp : real; begin temp := a * RSeed; RSeed := temp - m * Trunc(temp / m); Random3 := RSeed / m; end; function Random4 : real; { Real Version 2: } { Correct on any system for which reals are represented with a 32-bit } { or larger mantissa (including the sign bit). } const a = 16807.0; m = 2147483647.0; q = 127773.0; { m div a } r = 2836.0; { m mod a } var lo, hi, test : real; begin hi := Trunc(RSeed / q); lo := RSeed - q * hi; test := a * lo - r * hi; if test > 0.0 then RSeed := test else RSeed := test + m; Random4 := RSeed / m; end; begin { Your code here... } end. -- ____ _____ ____ ____ Shane Youl / \ / / / \ / \ CSIRO Division of Mineral Products / /_____ / /_____/ / / PO Box 124 Port Melbourne 3207 / / / / \ / / AUSTRALIA \____/ _____/ / / \ \____/ Internet : sfy@dmp.CSIRO.AU Phone : +61-3-647-0211 SCIENCE ADVANCING AUSTRALIA
staff@cadlab.sublink.ORG (Alex Martelli) (09/06/90)
ben@dit.lth.se (Ben Smeets) writes: >Hi: >After doing some simulations I got suspicious about the UNIX random >number generator random() available on the SUN3-xx machines (ver 4.1). ... > 2) Has anybody else noted some stange things with > this random number generator.? Don't know about this particular one, but random number generators are notoriously easy-to-botch... > 3) Are the people who can provide me with a good random > number generator? Knuth, "Art of Computer programming", volume 2, is a great reference. Here's an implementation which is SLOW (boy, is it ever!) but gives random uniform variates with GREAT spectral characteristics, and (very important to ME) the SAME sequence of numbers on ANY platform whatsoever (should not be hard to recode in C): FUNCTION RAN1(IDUM) * * PORTABLE random-number generation routine, computationally costly * but with nonpareil characteristics. From "Numerical recipes", p.196; * identifiers as in the original; added explicit declarations & SAVE. * * Call with IDUM < 0 to re-seed (not needed the first time), IDUM == 0 * for normal operation. * * Note the SAME sequence is generated on ANY machine (crucial for use * of random numbers for any intra-machine test!). * REAL*4 RAN1,R(97),RM1,RM2 INTEGER*4 IDUM,M1,M2,M3,IA1,IA2,IA3,IC1,IC2,IC3,IFF,IX1,IX2,IX3,J PARAMETER (M1=259200,IA1=7141,IC1=54773,RM1=1.0/M1) PARAMETER (M2=134456,IA2=8121,IC2=28411,RM2=1.0/M2) PARAMETER (M3=243000,IA3=4561,IC3=51349) SAVE IFF, R, IX1, IX2, IX3 DATA IFF /0/ * * Initialize on first call, or whenever the parm is negative * IF(IDUM.lt.0 .or. IFF.eq.0) THEN IFF=1 * seed first routine IX1=MOD(IC1-IDUM,M1) IX1=MOD(IA1*IX1+IC1,M1) * seed second routine IX2=MOD(IX1,M2) IX1=MOD(IA1*IX1+IC1,M1) * seed third routine IX3=MOD(IX1,M3) * fill table w. 1st/2nd routines DO 11, J=1,97 IX1=MOD(IA1*IX1+IC1,M1) IX2=MOD(IA2*IX2+IC2,M2) * combine hi with lo R(J)=(FLOAT(IX1)+FLOAT(IX2)*RM2)*RM1 11 CONTINUE IDUM=1 ENDIF NCALLS = NCALLS + 1 * * Starting point, save when (re)initializing: get next number on * each of the three sequences * IX1=MOD(IA1*IX1+IC1,M1) IX2=MOD(IA2*IX2+IC2,M2) IX3=MOD(IA3*IX3+IC3,M3) * take index from 3rd sequence J=1+(97*IX3)/M3 * sanity check, may omit in production IF(J.GT.97.OR.J.LT.1) THEN WRITE(*,*) 'RAN1: J=',J,' IX3=',IX3 STOP ENDIF * return this entry of table... RAN1=R(J) * ...and refill it R(J)=(FLOAT(IX1)+FLOAT(IX2)*RM2)*RM1 RETURN END -- Alex Martelli - CAD.LAB s.p.a., v. Stalingrado 45, Bologna, Italia Email: (work:) staff@cadlab.sublink.org, (home:) alex@am.sublink.org Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434; Fax: ++39 (51) 366964 (work only; any time of day or night).
gwyn@smoke.BRL.MIL (Doug Gwyn) (09/07/90)
In article <1990Sep6.032718.16696@dmp.csiro.au> sfy@dmp.csiro.au (Shane Youl) writes: > (see "Random Number Generators: Good Ones are Hard to > Find", Communications of the ACM, October 1988 v31 #10.) I found it hard to believe that CACM accepted that article, which was full of unjustified assertions and did not at all advance the art of designing pseudo-random sequence generators.
njk@diku.dk (Niels J|rgen Kruse) (09/13/90)
I am crossposting this to comp.lang.c because interest in random number generators surface there regularly. ben@dit.lth.se (Ben Smeets) writes: >After doing some simulations I got suspicious about the UNIX random >number generator random() available on the SUN3-xx machines (ver 4.1). >I measured the distribution of the intervals in which the random numbers >lie above 0.045 times the total range of the generated numbers. The distribution >I obtained is not at all what one should expect if the random numbers are >really uniformly distributed in [0..2^31-1] and are independent. This sounds a lot like the Gap test to me. Is it? This is very interesting, your test apply to the most significant bits, which should be the best in an additive congruential generator like random(). Your test must be very sensitive or there must be a bug in it. Could you mail me the source for your test? I would like to reproduce your results. >So my questions are: > 1) How does random() really work?(I do not know it) The SunOS random is the same as the Bsd random, at least it produce identical streams and has the same bugs. What i say is based on the Bsd random. With the default statesize, the algorithm is s[i] = s[i-3] + s[i-31]. The number returned is s[i]>>1, so that 31 bits remain in the output. Initialization of s[1] .. s[31] is done with a LCG with power of 2 modulus (Ick!). > 2) Has anybody else noted some stange things with > this random number generator.? HahAAa! Have i noticed something? Nyah nyah nyah. OK here goes: 1) There is a severe shortage of seeds that produce streams which are reasonably independent of each other. If ``random()&01'' is used to produce a bitstream, as recommended in the manual for random(), there is effectively only 4 different seeds available, as all seeds that are equivalent modulo 4 result in identical streams. In general, seeds that agree on the N least significant bits result in streams that agree number for number on the N-1 least significant bits. The reason for this misery is the sloppy initialization. Random() is not the only PRNG with a deficiency like this, the Fortran implementation of the subtractive method recommended by Knuth in ACP vol 2 ed 2, p 171-172, has the same problem. IMHO, a good PRNG should a) allow seed vectors of arbitrary length, rather than just a single word of seed material. b) produce streams that can be multiplexed together to form a stream of no worse quality than the constituent streams, even when the seed vectors differ in but a single bit located anywhere. 2) The least significant bits of random() are detectably nonrandom, when more than about 5e7 numbers from the stream are examined. The test used is a 4 dimensional recurrence frequency test of the 2 least significant bits (%). This is due, not to the sloppy initialization, but to the poor choice 31 3 63 of trinomials for random(), x + x + 1 and x + x + 1 for the 2 largest state table sizes. These are among the worst possible choices, according to my tests. The morale is, that you can't just grab any trinomials and that period is a poor measure of quality. IMHO, a good PRNG should have all bits of equal, high quality. 3) Then there is the setstate bug. Setstate() is used to swap states to start generating from another stream. This involves encoding the positions of 2 pointers into the state table in a single word of the table and doing the reverse for the new state. I had to chuckle when i saw the comment preceding setstate in the source code. I quote: * Note that due to the order in which things are done, it is OK to call * setstate() with the same state as the current state. The code then proceeds to do things in the WRONG order. The effect of this is that if you preceed each call to random() with a call to setstate() and the state happens to be one and the same, then the 2 pointers into the state table are only advanced on every *second* number generated, rather than each time. Needless to say, this result in a stream of rather inferior quality. The funny thing is that the author had considered just this pattern of use, yet failed to see the bug. I guess it only goes to show yet again that You can eyeball a piece of code till you bleed, but there is no substitute for testing. Presence or absence of the setstate bug can be determined with this little program: -------CUT----------CUT---------CUT--------- #include <stdio.h> typedef union { long tab[32]; char c[sizeof( union { long tab[32]; } )]; } ran_s; extern long random(); extern char *initstate(), *setstate(); static ran_s Trs; #define ran_open(seed) ( \ (void) initstate ((unsigned)(seed),Trs.c,sizeof(Trs)), Trs \ ) #define get_r(r) ( (void) setstate ((r).c), random() ) main() { ran_s r = ran_open(0); char *result = "NOT"; unsigned long tab[100]; int i; for (i = 0; i < 100; i++) tab[i] = get_r(r); srandom(0); for (i = 0; i < 100; i++) if (tab[i] != random()) result = "IS"; printf ("Setstate bug %s present.\n",result); exit(0); } -------CUT----------CUT---------CUT--------- IMHO, this setstate nonsense stink. A good interface design for a PRNG should allow the desired stream to be specified every time a random number is requested at no additional cost. > 3) Are the people who can provide me with a good random > number generator? I have one that i think is good, but i am loath to part with it, as it is still under development and i haven't nearly tested it as much as i would like. However, if your test really flunk random() and not mine, i will mail you a copy, although i will have to tear my heart out. (%): proceed as follows. Take the 2 least significant bits of each number, 4 at a time to form 8 bit samples. Take 50 samples at a time to form tuples. Count the number of different samples in each tuple and subtract from 50. This is the number of recurrences in each tuple. Count the number of times there are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and >= 11 recurrences and compare with the numbers predicted for a truly random stream. The proper probabilities can be computed using the formula given in ACP vol 2 ed 2 under the Collision test. Perform Chi square analysis with 11 degrees of freedom. -- Niels J|rgen Kruse DIKU Graduate njk@diku.dk