markh@csd4.csd.uwm.edu (Mark William Hopkins) (04/14/91)
   I have a particular algorithm in mind.  It takes a set, S, of N-vectors
and partitions it into sets S1, S2 such that

	sigma(S1) + sigma(S2) is minimized

where

	sigma(S) = total square deviation of S
	(that is, sum x in S: |x - average(S)|^2)

It runs sequentially through S and calculates S1 and S2 in one pass.
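The objective above can be sketched in a few lines of Python. The function names `sigma` and `partition_cost` are mine, not from the posting; vectors are represented as tuples of floats.

```python
def sigma(S):
    """Total square deviation of a set S of N-vectors:
    sum over x in S of |x - average(S)|^2."""
    if not S:
        return 0.0
    n = len(S[0])
    mean = [sum(x[i] for x in S) / len(S) for i in range(n)]
    return sum(sum((x[i] - mean[i]) ** 2 for i in range(n)) for x in S)

def partition_cost(S1, S2):
    """The quantity to be minimized over all partitions (S1, S2) of S."""
    return sigma(S1) + sigma(S2)
```

For small S one can verify a candidate partition by brute force, comparing `partition_cost` against every other two-way split.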
markh@csd4.csd.uwm.edu (Mark William Hopkins) (04/14/91)
In article <11014@uwm.edu> markh@csd4.csd.uwm.edu (Mark William Hopkins) writes:
>   I have a particular algorithm in mind.  It takes a set, S, of N-vectors
> and partitions it into sets S1, S2 such that
>
>	sigma(S1) + sigma(S2) is minimized ...

I forgot to finish the previous posting.  The question is: does anyone know
of such an algorithm?
wido@isgtec.uucp (Wido Menhardt) (04/15/91)
In article <11014@uwm.edu> markh@csd4.csd.uwm.edu (Mark William Hopkins) writes:
>
>   I have a particular algorithm in mind.  It takes a set, S, of N-vectors
> and partitions it into sets S1, S2 such that
>
>	sigma(S1) + sigma(S2) is minimized
> where
>	sigma(S) = total square deviation of S
>	(that is, sum x in S: |x - average(S)|^2)
>
> It runs sequentially through S and calculates S1 and S2 in one pass.

The algorithm which solves this problem suboptimally and iteratively is
known as K-MEANS.  It is very popular, and it works as follows:

  1) Assign each vector of S randomly (or systematically) to either S1 or S2.
  2) Compute the mean vectors for S1 and S2.
  3) Re-assign each vector of S to the class (S1, S2) whose mean vector is
     closer (this takes care of the square deviation criterion).
  4) Go to 2).  Stop the iteration if nothing changes anymore.

The result depends on (1).  I would be interested to hear about a one-pass
algorithm: after all, you do not know average(S1) in advance ....

-wido.
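Steps 1) through 4) above can be sketched in Python for the two-class case. This is only an illustration of the iteration as described, not anyone's production code; the names (`two_means`, `dist2`) and the guard against a class emptying out are my own additions.

```python
import random

def mean(vectors):
    """Component-wise average of a non-empty list of equal-length tuples."""
    n = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(n))

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def two_means(S, seed=0):
    """K-means with k = 2, following steps 1)-4) of the posting.
    Returns (S1, S2).  As noted, the result depends on the initial
    random assignment."""
    rng = random.Random(seed)
    # 1) Assign each vector of S randomly to either S1 or S2.
    labels = [rng.randrange(2) for _ in S]
    if len(set(labels)) == 1:          # guard: keep both classes non-empty
        labels[0] = 1 - labels[0]
    while True:
        S1 = [x for x, l in zip(S, labels) if l == 0]
        S2 = [x for x, l in zip(S, labels) if l == 1]
        if not S1 or not S2:           # degenerate case: one class emptied
            return S1, S2
        # 2) Compute the mean vectors for S1 and S2.
        m1, m2 = mean(S1), mean(S2)
        # 3) Re-assign each vector to the class whose mean vector is closer.
        new_labels = [0 if dist2(x, m1) <= dist2(x, m2) else 1 for x in S]
        # 4) Stop the iteration if nothing changes anymore.
        if new_labels == labels:
            return S1, S2
        labels = new_labels
```

On well-separated data the iteration converges in a few passes, but in general it only finds a local minimum of sigma(S1) + sigma(S2), which is the suboptimality mentioned above.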