ejkst@unix.cis.pitt.edu (Eric J. Kennedy) (09/12/89)
I was having a little trouble with ARexx's random number generator, so I performed the following small test. The 1 to 45 range is the range in my application, so I kept it for the test. I'm using the time in seconds as the seed, which strikes me as being the most variable easily obtainable number. All the program does is calculate 10000 pseudo-random numbers, and do a histogram of sorts. This is using ARexx 1.10. /****************************************/ /* test ARexx's random number generator */ r=time('S') do index=1 to 45 /* initialize array */ hist.index=0 end do index=1 to 10000 r=random(1,45,r) hist.r = hist.r + 1 end do index=1 to 45 say index hist.index end /****************************************/ A typical output is as follows. (x is the variable, 1 to 45; n is the number of observations.) x n x n x n 1 0 16 182 31 364 2 182 17 363 32 544 3 0 18 544 33 182 4 181 19 726 34 544 5 2 20 545 35 181 6 181 21 181 36 183 7 0 22 364 37 183 8 1 23 0 38 181 9 181 24 364 39 1 10 0 25 185 40 1 11 1 26 727 41 0 12 181 27 182 42 0 13 546 28 727 43 0 14 363 29 2 44 0 15 544 30 181 45 0 Does anybody else notice a slight problem here? Is there a better method of using the generator that will avoid this problem, or is this inherent in the ARexx random number generator? -- Eric Kennedy ejkst@cis.unix.pitt.edu
cosell@bbn.com (Bernie Cosell) (09/13/89)
In article <19504@unix.cis.pitt.edu> ejkst@unix.cis.pitt.edu (Eric J. Kennedy) writes: } }I was having a little trouble with ARexx's random number generator, so I }performed the following small test. The 1 to 45 range is the range in my }application, so I kept it for the test. I'm using the time in seconds as the }seed, which strikes me as being the most variable easily obtainable number. } }All the program does is calculate 10000 pseudo-random numbers, and do a }histogram of sorts. } }This is using ARexx 1.10. } }/****************************************/ }/* test ARexx's random number generator */ } } }r=time('S') } }do index=1 to 10000 } r=random(1,45,r) } hist.r = hist.r + 1 }end Does this work? It strikes me that you'll surely get fairly odd behavior by resetting the random number seed EVERY time to the same value. I'm a little amazed that you got ANY different values out of the thing! I initialize the random generator by just doing: randu(time('S')) ONCE in my program [a talmud-like careful reading of the random number routines description lead me to believe that there is only ONE random number generator, and so seed'ing the 'randu' one will do for 'random'. I'll try your little test at home tonight with your seed-setup replaced with it done the way I usually do it and see whether that affects the results. /Bernie\
c162-de@zooey.Berkeley.EDU (David Navas) (09/13/89)
In article <19504@unix.cis.pitt.edu> ejkst@unix.cis.pitt.edu (Eric J. Kennedy) writes: >All the program does is calculate 10000 pseudo-random numbers, and do a >histogram of sorts. Hmmm, yeah. Well, in one of the 'miga rags is an article on random numbers -- including Arexx random numbers. The conclusion was that the Arexx random number generator was really poor. [Unless staying up all night, and then being caught in a lab fire totally threw me for a loop and I don't remember what I'm talking about :-)] >Eric Kennedy >ejkst@cis.unix.pitt.edu David Navas c162-de@zooey
cosell@bbn.com (Bernie Cosell) (09/13/89)
In article <19504@unix.cis.pitt.edu> ejkst@unix.cis.pitt.edu (Eric J. Kennedy) writes: I changed your program to have the following difference: }r=time('S') ... }do index=1 to 10000 } r=random(1,45,r) } hist.r = hist.r + 1 r=time('S') randu(r) ... do index=1 to 10000 r=random(1,45) hist.r = hist.r + 1 end And I get the following result: 1 239 2 204 3 196 4 206 5 247 6 221 7 205 8 200 9 244 10 207 11 191 12 220 13 223 14 242 15 240 16 245 17 224 18 229 19 225 20 227 21 223 22 230 23 218 24 220 25 257 26 224 27 208 28 225 29 231 30 235 31 216 32 203 33 229 34 205 35 212 36 235 37 193 38 239 39 232 40 236 41 229 42 224 43 209 44 211 45 221 /Bernie\
mwm@eris.berkeley.edu (Mike (I'll think of something yet) Meyer) (09/13/89)
In article <19504@unix.cis.pitt.edu> ejkst@unix.cis.pitt.edu (Eric J. Kennedy) writes:
<I was having a little trouble with ARexx's random number generator, so I
<performed the following small test. The 1 to 45 range is the range in my
<application, so I kept it for the test. I'm using the time in seconds as the
<seed, which strikes me as being the most variable easily obtainable number.
Your use of the random number generator is incorrect. You should only
initialize it once. Typical output from a version that doesn't do that
(with performance tweaks, and output changed to self-format and be
runnable from inside of mg) is shown below.
As a general rule, serious applications should never use the random
number generator provided with a language. You should either use one
from a "known-good" library (i.e. - IMSL, if there's a random number
generator in it), or roll your own. Rolling your own can be done in an
afternoon, including chasing down and reading the references. Knuth's
"The Art of Computer Programming" vol. 2 - "Seminumerical Algorithms"
- is the canonical reference, and has lots of theory plus various
algorithms for testing random-number generators. R. G. Dromey's "How
to Solve It By Computer" (part of the Prentice-Hall International
Series in CS, edited by C.A.R. Hoare) extracts the important part for
the do-it-yourselfer into a few pages that describe the best algorithm
from Knuth, what the parameters are, and how to choose them.
<mike
hist.=0 /* initialize array */
call random 1, 45, time('S') /* initialize random number generator */
do index=1 to 10000
r=random(1,45)
hist.r = hist.r + 1
end
out = ""
do index=1 to 45
out = out || right(index, 2) right(hist.index, 5)
if index // 4 = 0 then out = out || '0a'x
else out = out || '09'x
end
return out
1 222 2 220 3 233 4 235
5 207 6 193 7 207 8 231
9 211 10 210 11 241 12 209
13 248 14 238 15 237 16 224
17 229 18 207 19 251 20 210
21 214 22 210 23 207 24 192
25 239 26 227 27 230 28 230
29 214 30 218 31 234 32 223
33 238 34 234 35 205 36 252
37 222 38 221 39 200 40 196
41 210 42 229 43 204 44 257
45 231
--
Il brilgue: les toves lubricilleux Mike Meyer
Se gyrent en vrillant dans le guave, mwm@berkeley.edu
Enmimes sont les gougebosqueux, ucbvax!mwm
Et le momerade horsgrave. mwm@ucbjade.BITNET
walton@tybalt.caltech.edu (Steve Walton) (09/21/89)
In article <1989Sep13.032352.10321@agate.berkeley.edu> mwm@eris.berkeley.edu (Mike (I'll think of something yet) Meyer) writes: >As a general rule, serious applications should never use the random >number generator provided with a language... >"The Art of Computer Programming" vol. 2 - "Seminumerical Algorithms" >- is the canonical reference, Well, I find Knuth pretty tough sledding. I'm not sure one can make such a "general rule." One advantage of a language's included generator is that it is (usually) fast. A trick recommended by the authors of _Numerical Recipes_ (in addition to their own, portable generators) is to "scramble " your generator by calling it twice; the first time is used as an index to an array of numbers from the generator. In ARexx this might be: if (firsttime) then /* Initialize table. */ do i=1 to 97 /* Exact value 97 is unimportant. */ table.i = randu() /* Fill table with random values. */ end j = random(1, 97) /* Random index into table. */ returnvalue = table.j /* Return that table location. */ table.j = randu() /* Refill table entry. */ return returnvalue NumRecs says that this should do well unless your generator is a real botch. Steve Walton, normally ecphssrw@afws.csun.edu [Anyone know of a Unix guru willing to work at CSUN?]
mwm@eris.berkeley.edu (Mike (I'll think of something yet) Meyer) (09/22/89)
In article <12003@cit-vax.Caltech.Edu> walton@tybalt.caltech.edu.UUCP (Steve Walton) writes: <In article <1989Sep13.032352.10321@agate.berkeley.edu> mwm@eris.berkeley.edu (Mike (I'll think of something yet) Meyer) writes: <>As a general rule, serious applications should never use the random <>number generator provided with a language... <>"The Art of Computer Programming" vol. 2 - "Seminumerical Algorithms" <>- is the canonical reference, < <Well, I find Knuth pretty tough sledding. I think I said that. In any case, I gave a second reference that gave a cookbook approach. <I'm not sure one can make such a "general rule." I do - if you include the part you elided about "known good" random number generators. The one provided by a language may qualify, but usualy the documentation is completely inadequate - neglecting even the most basic things, like the length of the cycle. You also have to look at "serious" applications. Definitions of "serious" vary, but the one I was thinking of was needing a few hundred thousand random numbers for a monte carlo simulation, the results of which were going to be used as input to decide where a plant worth a couple of hundred million dollars was going to be located. The documentation for rand() in the Lattice 5 manual implies that I might get a 64K cycle if I use 16 bit ints. This is clearly unacceptable for this application. For the last application I used a random number generator for, it was just fine (a tod clock would have been equally usefull, but that's another story). The documentation for drand() is better - it gives me enough data to decide that it's at least follows the cookbook. But it would be nice if I didn't have to do that determination myself. <One advantage of a language's included generator is that it is (usually) fast. This, of course, is immaterial if the generator is otherwise unnaceptable. And that's the real catch with them - you usually don't have enough information to make that decision for serious applications. <A trick recommended by the authors of <_Numerical Recipes_ (in addition to their own, portable generators) is to <"scramble " your generator by calling it twice; the first time is <used as an index to an array of numbers from the generator. Knuth recommends a better version of this - use two generators, one to generate indices, one to generate values. This will solve a lot of problems - bad low-order bits (drand in Lattice 5.0 suffers from that; the low-order bit has a cycle only 2^31 as long as the the high-order bit), high sequential correlation, and simliar things. However, it won't increase the cycle length (though it will add a non-cyclic sequence to the beginning twice as long as the number of elements in the "scramble array"), and the NumRecipes hack could reduce it's length by half (it will do that to drand). Add to the request for random.library - that it be properly documented. Length of the cycle, how far apart the elements are known to be basically uncorellated, and some measure of how good the low order bits are would seem to be a bare minimum. <mike -- I'm gonna lasso you with my rubberband lazer, Mike Meyer Pull you closer to me, and look right to the moon. mwm@berkeley.edu Ride side by side when worlds collide, ucbvax!mwm And slip into the Martian tide. mwm@ucbjade.BITNET
ccplumb@rose.waterloo.edu (Colin Plumb) (09/25/89)
In article <1989Sep22.082034.1405@agate.berkeley.edu> mwm@eris.berkeley.edu (Mike (I'll think of something yet) Meyer) writes: >> A trick recommended by the authors of >> _Numerical Recipes_ (in addition to their own, portable generators) is to >> "scramble " your generator by calling it twice; the first time is >> used as an index to an array of numbers from the generator. > > Knuth recommends a better version of this - use two generators, one > to generate indices, one to generate values. No, if you read Knuth, he says using the same generator is better worst- case and at least as good on average. I happen to like his additive generator (x_n = x_{n-24}+x_{n-55}). -- -Colin