[sci.math] Need formula for Normal Distribution Simulation

rms@hpfcpf.HP.COM (10/28/86)

The following is called the Box-Muller algorithm, and does a very
good job of generating normal random deviates:

   1. Generate two uniform random numbers, U1 and U2.

   2. Let V1 = 2 * U1 - 1, and V2 = 2 * U2 - 1.

   3. Let S = V1^2 + V2^2.

   4. If S >= 1, go back to step 1.

   5. Else, let T = SQRT(-2 * LN(S)/S).

   6. Let X1 = V1 * T, and X2 = V2 * T.

Now X1 and X2 are from a normal distribution having mean 0 and standard
deviation 1.  To get mean A and standard deviation S, multiply both
by S and add A.

Rick Scherer
Statistician
Hewlett-Packard Co.
Ft. Collins, CO

martin@entropy.UUCP (Don Martin) (10/30/86)

In article <5270001@hpfcpf.HP.COM>, rms@hpfcpf.HP.COM writes:
> The following is called the Box-Muller algorithm, and does a very
> good job of generating normal random deviates:
> 
>    1. Generate two uniform random numbers, U1 and U2.
> 
etc.

You need to be careful with the Box-Muller if you use
either a linear congruential or a multiplicative
congruential random number generator for U1 and U2.
Since these are the usual generators included in most
systems the problem is common. What happens is that
the sequential dependency in U1 and U2 cuase spikes
in the tail of the distribution. This was published
in Applied Statistics about 7 years ago. To fix this
either:

1) see the Applied Stat. article about exchanging orders.
2) put in a shuffle generator.
3) throw away one uniform random between U1 and U2
4) use two seperate uniform generators.
5) use another type of generator such as a shift register.

Also you can just generat X1 and not X2.

Hope this helps avoid a potential problem.

Donald C. Martin

wjohnson@cod.UUCP (Roger W. Johnson) (10/30/86)

In article <5270001@hpfcpf.HP.COM> rms@hpfcpf.HP.COM writes:
>The following is called the Box-Muller algorithm, and does a very
>good job of generating normal random deviates:
>
>   1. Generate two uniform random numbers, U1 and U2.
>
>   2. Let V1 = 2 * U1 - 1, and V2 = 2 * U2 - 1.
>
>   3. Let S = V1^2 + V2^2.
>
>   4. If S >= 1, go back to step 1.
>
>   5. Else, let T = SQRT(-2 * LN(S)/S).
>
>   6. Let X1 = V1 * T, and X2 = V2 * T.
>
>Now X1 and X2 are from a normal distribution having mean 0 and standard
>deviation 1.  To get mean A and standard deviation S, multiply both
>by S and add A.
>
>Rick Scherer
>Statistician
>Hewlett-Packard Co.
>Ft. Collins, CO

The variables X1 and X2 are also independent (if U1 and U2 are independ-
ent).  The above "polar method for normal variates" is discussed on pg 117
of 'The Art of Computer Programming:  Seminumerical Algorithms', (vol 2),
2nd ed by Donald Knuth.  Other ways of generating normal variates are 
discussed in this text.


-- 
Roger Johnson/Naval Ocean Systems Center/Code 421/San Diego, Ca 92152
             /2150 Pacific Beach Drive/Apt 207/San Diego, Ca 92109

Reply to:  wjohnson@cod.UUCP

mikeb@natmlab.OZ (Michael Buckley) (10/31/86)

John Best has written an article titled 'Some easily programmed pseudo-
random normal generators' (Aust. Computer J., 11, 1979, pp.60-62) in which
he compares six different methods. His comparison is based on the average
number of uniform variates required to get one normal variate, and on
the average number of mathematical function (e.g. sin, log) evaluations
required, since these things take time to do -- much more than
arithmetic operations.

The Box-Muller method requires two mathematical function evaluations
per normal variate, which makes it slow compared to its competitors.
The Marsaglia-Bray polar method is quite a lot better, though John
finds that (given the relative speeds of uniform variate generation
and mathematical function evaluation on the machine he was then using)
two of the other four methods were slightly faster still.

I can send more detail by email if required, or a photocopy of John's
paper by paper mail.

--
Michael Buckley, CSIRO Div Maths & Stats, PO Box 218, Lindfield, NSW, Australia.
PHONE:	+61 2 467 6066			ACSNET:	mikeb@natmlab.oz		
ARPA: mikeb%natmlab.oz@seismo.css.gov	JANET:	natmlab.oz!mikeb
UUCP:	....{seismo,hplabs,mcvax,ukc,nttlab}!munnari!natmlab.oz!mikeb

hes@ecsvax.UUCP (Henry Schaffer) (11/04/86)

Take the sum of 12 random numbers (the kind that are usually produced,
i.e., a uniform distribution over the 0,1 interval) and subtract the
number 6.0.  This gives a good approximation to a random deviate with
mean = 0 and standard deviation or variance = 1.

Then multiply it by the desired standard deviation and add in the
desired mean (do it in that order!) and you've got it.

This is a surprisingly good method which can be quite fast if the
uniform random generator used gives more than one random number per
call -- otherwise the time is dominated by subroutine linkage business.

--henry schaffer  n c state univ

gunnar@hafro.UUCP (Gunnar Stefansson) (11/06/86)

My mailer can't seem to find this address any better than other
mailers, so here we go to the net.

I have my doubts about using the Box-Muller and like transforms,
which some people have recommended.  The reasons are at least twofold:

1.  It is very ineffective -- there are sines, cosines, logs and
square roots to be evaluated every time.  So it takes MUCH longer
than a random number generator should.  Marsaglia's improvement of
the method is somewhat more effective, but there are still logs and
square roots to be taken every time.

2. It is not at all obvious what happens if your uniform input is not
uniform enough. So to start with, you must be reasonably sure that
your uniform numbers are really iid uniform. Of course, when using
the Unix rand() (or whatever), you're probably not getting very good
random numbers.

Adding up 12 uniforms may be good enough, but that is unlikely. Be
careful that the resulting distribution will have zero probability in
the tails. So if you are interested in e.g. the distribution of a max
of things, it seems likely that it will get messed up.

I'd like to recommend the book 'Statistical Computing' by Kennedy and
Gentle as a general intro -- it's more recent than Knuth. In it,
you'll find lots of good algorithms, many of which can be very well
programmed in C (as opposed to the Fortran that they use). The
feedback-shift-register methods for uniforms seem to be quite
adequate and cheap to use. Further, there are several algorithms for
transforming to normal given, most of them cheaper than the
Box-Muller transform. One in particular (the Marsaglia-Bray
"convenient" method, p. 203) is simple and fast).

I have implemented a u(0,1) and a n(0,1) from that book. Writing the
programs in C turned out to be much simpler than the Fortran
counterparts.

--Gunnar

-----------------------------------------------------------------------------
Gunnar Stefansson                       {mcvax,seismo}!enea!hafro!gunnar 
Marine Research Institute, Reykjavik    gunnar@hafro.uucp
-- 

-----------------------------------------------------------------------------
Gunnar Stefansson                       {mcvax,seismo}!enea!hafro!gunnar 
Marine Research Institute, Reykjavik    gunnar@hafro.UUCP