[comp.lang.c] Standard deviation *again*

wyatt@cfa.harvard.EDU (Bill Wyatt) (03/24/89)

vevea@paideia.uchicago.edu (Jack L. Vevea):
> In article <2221@maccs.McMaster.CA> cs3b3aj@maccs.UUCP (Stephen M. Dunn) writes:
>>
>>1)  assuming values are in array x, there are n values, and the mean is 
>>    known and is in the variable mean: 
>>
>>temp = 0;
>>for (i = 1; i <= n; i++)
>>   temp += (x [i] - mean) ^ 2;
>>std_dev = sqrt (temp / n);

This is the RMS (root-mean-square) deviation about the mean, i.e. the
population standard deviation, not the sample STD. Divide by (n-1)
instead of n to get the sample STD.

>>2)  assuming that the SQUARES of the values are in array x, there are n 
>>    values, and the mean is known and is in the variable mean: 
>>
>>temp = 0;
>>for (i = 1; i <= n; i++)
>>   temp += x [i] ^ 2;
>>std_dev = sqrt ((temp - (mean ^ 2) / n) / n);
                                     ^
                                     should be `*'

The premise is misstated: what matters is that `temp' ends up holding the
SUM of the squares of the values. Given that, the STD is (almost) as
stated (see below).


> Try this:
> 
> for(i=0;i<n;i++) {
> 	temp1 += x * x;
> 	temp2 += x; }
> 
> sd = sqrt(temp1 - (temp2 * temp2 / n)) / (n-1);
> 
> 
> (Note division by n-1, not n; although the original poster didn't
> give much information on what he planned to do with this, he almost
> certainly wants _sample_ sd's, not population as the above quoted
> formulae assume.)

Yes, but the division by (n-1) belongs inside the square root. The formula is

std = sqrt( ( sum(x**2) - mean**2 * n ) / (n - 1) )

Of course, none of this addresses the original question of computability
in the face of finite precision.


Bill Wyatt, Smithsonian Astrophysical Observatory

UUCP:   {husc6,cmcl2,mit-eddie}!harvard!cfa!wyatt
ARPA:   wyatt@cfa.harvard.edu
SPAN:   cfa::wyatt 
BITNET: wyatt@cfa

-- 

Bill Wyatt, Smithsonian Astrophysical Observatory

UUCP:  {husc6,cmcl2,mit-eddie}!harvard!cfa!wyatt
ARPA:  wyatt@cfa.harvard.edu
 (or)  wyatt%cfa@harvard.harvard.edu
SPAN:  cfairt::wyatt 
BITNET: wyatt@cfa2