[sci.math] Need formula for Normal Distributio

kenny@uiucdcsb.cs.uiuc.edu (10/28/86)

Excuse me for posting to the net; my mailer can't find a return path to your
site.

Date: Tue, 28 Oct 86 11:25:48 CST
From: kenny@b (Kevin Kenny)
Message-Id: <8610281725.AA17014@b.cs.uiuc.edu>
To: helm!dlbaer
Subject: Need formula for Normal Distributio

/* Written  1:33 am  Oct 26, 1986 by dlbaer@helm.UUCP in uiucdcsb:net.math */
/* ---------- "Need formula for Normal Distributio" ---------- */
>I need a formula for a simulation for a normal distribution. Send
>email please.

There are any number of ways of generating random variables having a
Gaussian distribution.  Probably the simplest is the ``polar method''
of Box, Muller and Marsaglia.

	V1 = 2*RND - 1 <-----------------+
	V2 = 2*RND - 1                   |
	S = V1*V1 + V2*V2                |
	IF S >= 1 OR S = 0 THEN GOTO ----+
	X1 = V1 * SQRT (-2 * LOG(S) / S)
	X2 = V2 * SQRT (-2 * LOG(S) / S)

(Note that LOG here refers to the natural logarithm; some BASICs use
LN to denote this function.)
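
For readers without a BASIC at hand, the same loop can be sketched in
Python.  This is only an illustration of the polar method described
above, not a tuned implementation; the function name polar_normal_pair
and the rng parameter are made up for this example.  The extra test for
s == 0 is a defensive assumption so that the logarithm and the division
are always defined.

```python
import math
import random


def polar_normal_pair(rng=random):
    """Return two independent standard normal deviates (polar method)."""
    while True:
        v1 = 2.0 * rng.random() - 1.0   # uniform on [-1, 1)
        v2 = 2.0 * rng.random() - 1.0
        s = v1 * v1 + v2 * v2
        # Keep only points strictly inside the unit circle; s == 0 is
        # also rejected so that log(s) and the division by s are defined.
        if 0.0 < s < 1.0:
            break
    factor = math.sqrt(-2.0 * math.log(s) / s)   # math.log is the natural log
    return v1 * factor, v2 * factor


if __name__ == "__main__":
    random.seed(1986)
    print(polar_normal_pair())
```

Each call consumes at least two uniform numbers and hands back two
normal deviates, so a caller that needs only one at a time may want to
save the second for the next request.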

X1 and X2 are independent, normally distributed random variables with
mean 0 and standard deviation 1.  To get a variable Y with mean M and
standard deviation S (the desired deviation, not the S computed in the
loop above) from either X, do

	Y = S * X + M
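
As a rough Python sketch of this rescaling step: the function name
scaled_normal is invented for the example, and random.gauss(0, 1) from
the standard library is used purely as a stand-in source for a
standard normal X.

```python
import random


def scaled_normal(mean, std_dev, rng=random):
    """Shift and scale a standard normal deviate: Y = S * X + M."""
    x = rng.gauss(0.0, 1.0)      # stand-in for X1 or X2 (mean 0, deviation 1)
    return std_dev * x + mean


if __name__ == "__main__":
    random.seed(1986)
    print([round(scaled_normal(50.0, 10.0), 2) for _ in range(5)])
```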

There are any number of other methods; most of them are reviewed in
Knuth, Donald E.  _The_Art_of_Computer_Programming._  Volume 2:
_Seminumerical_Algorithms._  2nd ed.  Reading, Massachusetts: Addison-Wesley,
1981, pp. 117-127.  Any decent library will have this reference.

Kevin Kenny			UUCP: {ihnp4,pur-ee,convex}!uiucdcs!kenny
Department of Computer Science	ARPA: kenny@B.CS.UIUC.EDU (kenny@UIUC.ARPA)
University of Illinois		CSNET: kenny@UIUC.CSNET
1304 W. Springfield Ave.
Urbana, Illinois, 61801		Voice: (217) 333-7980

sarwate@uicsl.UUCP (11/05/86)

The problem posed is to find a function F(X,S,A) so that when X1, X2, . . . Xn
are n (successive?) outputs of the BASIC random number generator, the numbers
Y1=F(X1,S,A), Y2=F(X2,S,A), etc., form a sample with mean A and standard deviation
S. It is also desired that the Y's be in the range 0-100 (A is in that range,
as is S), and there is also a reference to a normal (Gaussian?) distribution.

Two solutions to this problem have been proposed. However, neither fully solves
the original problem. In fact, the original problem may not have a solution at
all. It is difficult to tell because the problem itself is not properly posed,
and it is not clear what exactly the author of the problem wants.

If the Y's are indeed meant to represent samples from a Gaussian distribution,
then one cannot guarantee that they will be in the range 0-100. Of course, in
practice, if S is small enough, the Y's will (with very high probability)
satisfy this restriction.  On the other hand, is S=9 considered small in
comparison with A=90?  And if so, how can one guarantee that the sample values
are in the range 0-100?
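
To put a number on the A=90, S=9 case, the chance that a single Gaussian
sample falls outside 0-100 can be computed from the normal distribution
function.  The sketch below is only illustrative; the function name
prob_outside_range and its parameters are invented here.  For these
values it gives a probability of roughly 13 percent, almost all of it
from samples above 100, so S=9 is not "small enough" in this sense.

```python
import math


def prob_outside_range(mean, std_dev, low=0.0, high=100.0):
    """Probability that a normal(mean, std_dev) sample lies outside [low, high]."""
    def cdf(x):
        # Normal distribution function expressed through erfc.
        return 0.5 * math.erfc(-(x - mean) / (std_dev * math.sqrt(2.0)))
    return cdf(low) + (1.0 - cdf(high))


if __name__ == "__main__":
    print(prob_outside_range(90.0, 9.0))
```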

It is not clear how either of the methods proposed by Kenny and Schaffer will
give a sample with average EXACTLY A (if indeed this is what is desired). Given
a large enough sample, the average value will be APPROXIMATELY A and the
standard deviation will be APPROXIMATELY S, but that is all that one can expect.
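
The point about APPROXIMATELY can be checked empirically.  The sketch
below draws samples of increasing size and prints the sample mean and
sample standard deviation; the function name sample_stats is made up,
and random.gauss is used only as a convenient stand-in for any of the
proposed generators.  The statistics drift toward A and S as n grows
but are not expected to hit them exactly.

```python
import math
import random


def sample_stats(n, mean, std_dev, rng):
    """Draw n normal samples; return (sample mean, sample standard deviation)."""
    ys = [rng.gauss(mean, std_dev) for _ in range(n)]
    avg = sum(ys) / n
    var = sum((y - avg) ** 2 for y in ys) / (n - 1)
    return avg, math.sqrt(var)


if __name__ == "__main__":
    rng = random.Random(1986)
    for n in (10, 1000, 100000):
        avg, sd = sample_stats(n, 50.0, 10.0, rng)
        print(n, round(avg, 3), round(sd, 3))
```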