[comp.sys.amiga.tech] ARexx psuedo-random number generator

ejkst@unix.cis.pitt.edu (Eric J. Kennedy) (09/12/89)

I was having a little trouble with ARexx's random number generator, so I
performed the following small test.  The 1 to 45 range is the range in my
application, so I kept it for the test.  I'm using the time in seconds as the
seed, which strikes me as being the most variable easily obtainable number.

All the program does is calculate 10000 pseudo-random numbers, and do a
histogram of sorts.  

This is using ARexx 1.10.

/****************************************/
/* test ARexx's random number generator */


r=time('S')

do index=1 to 45    /* initialize array */
  hist.index=0
end

do index=1 to 10000 
  r=random(1,45,r)
  hist.r = hist.r + 1
end

do index=1 to 45
  say index hist.index
end
/****************************************/

A typical output is as follows.  (x is the variable, 1 to 45; n is the
number of observations.)


 x   n                x   n                  x   n
 
1  0                 16 182                 31 364  
2  182               17 363                 32 544  
3  0                 18 544                 33 182  
4  181               19 726                 34 544  
5  2                 20 545                 35 181  
6  181               21 181                 36 183  
7  0                 22 364                 37 183  
8  1                 23 0                   38 181  
9  181               24 364                 39 1    
10 0                 25 185                 40 1    
11 1                 26 727                 41 0    
12 181               27 182                 42 0    
13 546               28 727                 43 0    
14 363               29 2                   44 0    
15 544               30 181                 45 0    


Does anybody else notice a slight problem here?  Is there a better method of
using the generator that will avoid this problem, or is this inherent in the
ARexx random number generator?


-- 
Eric Kennedy
ejkst@cis.unix.pitt.edu

cosell@bbn.com (Bernie Cosell) (09/13/89)

In article <19504@unix.cis.pitt.edu> ejkst@unix.cis.pitt.edu (Eric J. Kennedy) writes:
}
}I was having a little trouble with ARexx's random number generator, so I
}performed the following small test.  The 1 to 45 range is the range in my
}application, so I kept it for the test.  I'm using the time in seconds as the
}seed, which strikes me as being the most variable easily obtainable number.
}
}All the program does is calculate 10000 pseudo-random numbers, and do a
}histogram of sorts.  
}
}This is using ARexx 1.10.
}
}/****************************************/
}/* test ARexx's random number generator */
}
}
}r=time('S')
}
}do index=1 to 10000 
}  r=random(1,45,r)
}  hist.r = hist.r + 1
}end


Does this work?  It strikes me that you'll surely get fairly odd behavior by
resetting the random number seed EVERY time to the same value.  I'm a little
amazed that you got ANY different values out of the thing!  I initialize the
random generator by just doing:
    randu(time('S'))
ONCE in my program [a talmud-like careful reading of the random number
routines description lead me to believe that there is only ONE random number
generator, and so seed'ing the 'randu' one will do for 'random'.

I'll try your little test at home tonight with your seed-setup replaced with
it done the way I usually do it and see whether that affects the results.

  /Bernie\

c162-de@zooey.Berkeley.EDU (David Navas) (09/13/89)

In article <19504@unix.cis.pitt.edu> ejkst@unix.cis.pitt.edu (Eric J. Kennedy) writes:
>All the program does is calculate 10000 pseudo-random numbers, and do a
>histogram of sorts.  
Hmmm, yeah.  Well, in one of the 'miga rags is an article on random numbers --
including Arexx random numbers.  The conclusion was that the Arexx random
number generator was really poor.  [Unless staying up all night, and then
being caught in a lab fire totally threw me for a loop and I don't remember
what I'm talking about :-)]

>Eric Kennedy
>ejkst@cis.unix.pitt.edu

David Navas
c162-de@zooey

cosell@bbn.com (Bernie Cosell) (09/13/89)

In article <19504@unix.cis.pitt.edu> ejkst@unix.cis.pitt.edu (Eric J. Kennedy) writes:

I changed your program to have the following difference:

}r=time('S')
...
}do index=1 to 10000 
}  r=random(1,45,r)
}  hist.r = hist.r + 1

r=time('S')
randu(r)
...
do index=1 to 10000 
  r=random(1,45)
  hist.r = hist.r + 1
end

And I get the following result:


1 239           2 204           3 196           4 206           5 247
6 221           7 205           8 200           9 244           10 207
11 191          12 220          13 223          14 242          15 240
16 245          17 224          18 229          19 225          20 227
21 223          22 230          23 218          24 220          25 257
26 224          27 208          28 225          29 231          30 235
31 216          32 203          33 229          34 205          35 212
36 235          37 193          38 239          39 232          40 236
41 229          42 224          43 209          44 211          45 221

/Bernie\

mwm@eris.berkeley.edu (Mike (I'll think of something yet) Meyer) (09/13/89)

In article <19504@unix.cis.pitt.edu> ejkst@unix.cis.pitt.edu (Eric J. Kennedy) writes:
<I was having a little trouble with ARexx's random number generator, so I
<performed the following small test.  The 1 to 45 range is the range in my
<application, so I kept it for the test.  I'm using the time in seconds as the
<seed, which strikes me as being the most variable easily obtainable number.

Your use of the random number generator is incorrect. You should only
initialize it once. Typical output from a version that doesn't do that
(with performance tweaks, and output changed to self-format and be
runnable from inside of mg) is shown below.

As a general rule, serious applications should never use the random
number generator provided with a language. You should either use one
from a "known-good" library (i.e. - IMSL, if there's a random number
generator in it), or roll your own. Rolling your own can be done in an
afternoon, including chasing down and reading the references.  Knuth's
"The Art of Computer Programming" vol.  2 - "Seminumerical Algorithms"
- is the canonical reference, and has lots of theory plus various
algorithms for testing random-number generators. R. G. Dromey's "How
to Solve It By Computer" (part of the Prentice-Hall International
Series in CS, edited by C.A.R. Hoare) extracts the important part for
the do-it-yourselfer into a few pages that describe the best algorithm
from Knuth, what the parameters are, and how to choose them.

	<mike

hist.=0                         /* initialize array */
call random 1, 45, time('S')    /* initialize random number generator */

do index=1 to 10000
        r=random(1,45)
        hist.r = hist.r + 1
        end

out = ""
do index=1 to 45
        out = out || right(index, 2) right(hist.index, 5)
        if index // 4 = 0 then out = out || '0a'x
        else out = out || '09'x
        end
return out

 1   222         2   220         3   233         4   235
 5   207         6   193         7   207         8   231
 9   211        10   210        11   241        12   209
13   248        14   238        15   237        16   224
17   229        18   207        19   251        20   210
21   214        22   210        23   207        24   192
25   239        26   227        27   230        28   230
29   214        30   218        31   234        32   223
33   238        34   234        35   205        36   252
37   222        38   221        39   200        40   196
41   210        42   229        43   204        44   257
45   231
--
Il brilgue: les toves lubricilleux			Mike Meyer
Se gyrent en vrillant dans le guave,			mwm@berkeley.edu
Enmimes sont les gougebosqueux,				ucbvax!mwm
Et le momerade horsgrave.				mwm@ucbjade.BITNET

walton@tybalt.caltech.edu (Steve Walton) (09/21/89)

In article <1989Sep13.032352.10321@agate.berkeley.edu> mwm@eris.berkeley.edu (Mike (I'll think of something yet) Meyer) writes:
>As a general rule, serious applications should never use the random
>number generator provided with a language...
>"The Art of Computer Programming" vol.  2 - "Seminumerical Algorithms"
>- is the canonical reference,

Well, I find Knuth pretty tough sledding.  I'm not sure one can make
such a "general rule."  One advantage of a language's included generator
is that it is (usually) fast.  A trick recommended by the authors of
_Numerical Recipes_ (in addition to their own, portable generators) is to
"scramble " your generator by calling it twice;  the first time is
used as an index to an array of numbers from the generator.  In ARexx
this might be:

if (firsttime) then		/* Initialize table. */
do i=1 to 97			/* Exact value 97 is unimportant. */
   table.i = randu()		/* Fill table with random values. */
end
j = random(1, 97)		/* Random index into table. */
returnvalue = table.j		/* Return that table location. */
table.j = randu()		/* Refill table entry. */
return returnvalue

NumRecs says that this should do well unless your generator is a real botch.

Steve Walton, normally ecphssrw@afws.csun.edu
[Anyone know of a Unix guru willing to work at CSUN?]

mwm@eris.berkeley.edu (Mike (I'll think of something yet) Meyer) (09/22/89)

In article <12003@cit-vax.Caltech.Edu> walton@tybalt.caltech.edu.UUCP (Steve Walton) writes:
<In article <1989Sep13.032352.10321@agate.berkeley.edu> mwm@eris.berkeley.edu (Mike (I'll think of something yet) Meyer) writes:
<>As a general rule, serious applications should never use the random
<>number generator provided with a language...
<>"The Art of Computer Programming" vol.  2 - "Seminumerical Algorithms"
<>- is the canonical reference,
<
<Well, I find Knuth pretty tough sledding.

I think I said that. In any case, I gave a second reference that gave
a cookbook approach.

<I'm not sure one can make such a "general rule."

I do - if you include the part you elided about "known good" random
number generators. The one provided by a language may qualify, but
usualy the documentation is completely inadequate - neglecting even
the most basic things, like the length of the cycle.

You also have to look at "serious" applications. Definitions of
"serious" vary, but the one I was thinking of was needing a few
hundred thousand random numbers for a monte carlo simulation, the
results of which were going to be used as input to decide where a
plant worth a couple of hundred million dollars was going to be
located.

The documentation for rand() in the Lattice 5 manual implies that I
might get a 64K cycle if I use 16 bit ints. This is clearly
unacceptable for this application. For the last application I used a
random number generator for, it was just fine (a tod clock would have
been equally usefull, but that's another story).

The documentation for drand() is better - it gives me enough data to
decide that it's at least follows the cookbook. But it would be nice
if I didn't have to do that determination myself.

<One advantage of a language's included generator is that it is (usually) fast.

This, of course, is immaterial if the generator is otherwise
unnaceptable. And that's the real catch with them - you usually don't
have enough information to make that decision for serious
applications.

<A trick recommended by the authors of
<_Numerical Recipes_ (in addition to their own, portable generators) is to
<"scramble " your generator by calling it twice;  the first time is
<used as an index to an array of numbers from the generator.

Knuth recommends a better version of this - use two generators, one
to generate indices, one to generate values.

This will solve a lot of problems - bad low-order bits (drand in
Lattice 5.0 suffers from that; the low-order bit has a cycle only 2^31
as long as the the high-order bit), high sequential correlation, and
simliar things. However, it won't increase the cycle length (though it
will add a non-cyclic sequence to the beginning twice as long as the
number of elements in the "scramble array"), and the NumRecipes hack
could reduce it's length by half (it will do that to drand).

Add to the request for random.library - that it be properly
documented.  Length of the cycle, how far apart the elements are known
to be basically uncorellated, and some measure of how good the low
order bits are would seem to be a bare minimum.

	<mike
--
I'm gonna lasso you with my rubberband lazer,		Mike Meyer
Pull you closer to me, and look right to the moon.	mwm@berkeley.edu
Ride side by side when worlds collide,			ucbvax!mwm
And slip into the Martian tide.				mwm@ucbjade.BITNET

ccplumb@rose.waterloo.edu (Colin Plumb) (09/25/89)

In article <1989Sep22.082034.1405@agate.berkeley.edu> mwm@eris.berkeley.edu (Mike (I'll think of something yet) Meyer) writes:
>> A trick recommended by the authors of
>> _Numerical Recipes_ (in addition to their own, portable generators) is to
>> "scramble " your generator by calling it twice;  the first time is
>> used as an index to an array of numbers from the generator.
>
> Knuth recommends a better version of this - use two generators, one
> to generate indices, one to generate values.

No, if you read Knuth, he says using the same generator is better worst-
case and at least as good on average.  I happen to like his additive
generator (x_n = x_{n-24}+x_{n-55}).
-- 
	-Colin