joe@uw-evolution.UUCP (Joe Felsenstein) (11/22/88)
In reply to Dan Davison's statement that he constructs trees from distance matrices by applying a clustering algorith such as UPGMA to the matrix, and prefers this to the approach I described, of trying to fit the observed table of distances by one generated from the tree: An interesting property of UPGMA clustering is that, for the given topology that it comes up with, the clustering levels (in effect the branch lengths of the tree) are assigned by least squares, that is, they achieve a least squares fit of observed to expected distances. Thus what Dan has been doing and what I envisaged are very closely similar, except that if one conceives the task more generally as achieving a least squares fit to the observed distances, one can then think of trying other tree topologies as well. Thus I think the least squares approach is the more general and more or less sort of includes UPGMA. It will often but not always give the same tree. If one does not restrict the search to trees that have all tips level with each other (ultrametrics), then one can also fit cases that have unequal rates of evolution in different lineages. That involves even fewer assumptions than the "minimum number" Dan asserts his approach makes. Methods such as UPGMA are simply defined as the result of applying a certain algorithm. Methods such as least squares optimize an objective function. It is easier to investigate their properties. Fortunately in this case the two are closely related. Joe Felsenstein Dept. of Genetics SK-50, Univ. of Washington, Seattle WA 98195 BITNET: FELSENST@UWALOCKE INTERNET: joe@evolution.ms.washington.edu or: uw-evolution!joe@entropy.ms.washington.edu UUCP: ... uw-beaver!uw-entropy!uw-evolution!joe