kat3@ihlpl.ATT.COM (Craig) (03/27/89)
> Sure; and you could just as easily have picked another data > set that produced a correct result from the onepass program, and an > incorrect result from twopass. Which is why, as has been pointed out > before, neither algorithm should be used in a general standard > deviation program. After thinking about the response to my posting on the computation of the standard deviation, I think a little background on my interest in the twopass algorithm is needed. The following provide cases where, if it is numerically stable,the twopass algorithm should be considered. 1) If your machine does vector calculations, then the twopass algorithm would be faster than the numerically stable onepass algorithm. 2) If you already know the "mean," and just wish to estimate the standard deviation, you are using the second half of the twopass algorithm. 3) It is unclear, if all data resides in core, whether a numerically stable onpass algorithm, or the twopass algorithm is faster. In any case, if the twopass algorithm is numerically unstable, I would like an example of a dataset where it fails. The second reason for making an issue about claims that the twopass algorithm is numerically unstable is the fact that it is not documented in the literature. The following recent statistics book claims the algorithm is stable: Thisted, Ronald A., "Elements of Statistical Computing," 1988, Chapman and Hall, pp. 11. Finally, I give below, in pseudo-code, an extremely slow, but very accurate algorithm for computing the standard deviation: 3) sort(x); sum = 0.0; for (i=0; i<N; i++) sum += x[i]; xbar = sum/N; for (i=0; i<N; i++) z[i] = (x[i] - xbar)*(x[i] - xbar); sort(z); sqsum = 0.0; for (i=0; i<N; i++) sqsum += z[i]; std_dev = sqrt(sqsum/N); Once again, note that this algorithm would be faster on a vectorized machine. In any case, algorithms which are slow on a sequential machine may have to be reconsidered with the advent of new hardware. The only question which must be addressed is whether the algorithm gives numerically correct results. Bob Craig