dbd%benden@LANL.GOV (Dan Davison) (11/20/88)
There have been several articles recently that readers of the bboard may want to take note of. The first is titled "Perils of molecular introspection", Nature 8 September 1988, p. 118, by Joe Felsenstein. This article has the first clear statement (to me, anyway) of the difference between parsimony and distance in molecular phylogenetics:

"Two computational methods have dominated the reconstruction of molecular phylogenies: parsimony and distance. The parsimony method finds the evolutionary tree that requires the fewest changes of nucleotides to explain evolution of the observed sequences. Distance methods compute a table of pairwise numbers of differences and try to fit this to expected pairwise distances computed from the tree" [p.118]

Dr. Felsenstein then goes on to discuss Jim Lake's recent method of operator invariants. Those who found his Molecular Biology and Evolution paper [MBE 4:167 1987] impenetrable will find this discussion refreshingly clear.

I do have one quibble with the definition given above for the "distance" method. It is very likely that I've been doing it wrong for years, but the way I have used distance information is to construct the table Joe describes, then use a clustering algorithm (almost always UPGMA, the unweighted pair-group method) to construct a tree from these data. That is, the tree is built directly from the table by clustering, rather than by fitting the table to distances expected from a candidate tree.

I prefer this method because it makes a minimum number of assumptions about the data, the main ones being (1) that rates of mutation across the lineages in the tree are equivalent, and (2) that you've done a decent (not necessarily a "correct") alignment. "Decent" in this case means the resulting trees are reasonably insensitive to minor changes in the alignments.

Does this seem wildly incorrect?

dan davison
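
For concreteness, the procedure described above (build the table of pairwise differences, then cluster it with UPGMA) might be sketched as follows in Python. This is only a minimal illustration under stated assumptions: the function names `pairwise_differences` and `upgma` and the four toy sequences are invented here, the sequences are assumed to be already aligned and of equal length, and raw mismatch counts are used with no correction for multiple substitutions at a site.

```python
from itertools import combinations

def pairwise_differences(seqs):
    """Table of mismatch counts between aligned, equal-length sequences."""
    n = len(seqs)
    table = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        diff = sum(a != b for a, b in zip(seqs[i], seqs[j]))
        table[i][j] = table[j][i] = diff
    return table

def upgma(names, table):
    """UPGMA clustering; returns a nested (left, right, height) tuple."""
    trees = {i: name for i, name in enumerate(names)}   # cluster id -> subtree
    sizes = {i: 1 for i in trees}                       # cluster id -> leaf count
    d = {(i, j): float(table[i][j])
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)
    while len(trees) > 1:
        i, j = min(d, key=d.get)          # closest pair of clusters
        height = d[(i, j)] / 2            # equal-rates assumption: split evenly
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        for k in trees:
            if k in (i, j):
                continue
            dik = d[(min(i, k), max(i, k))]
            djk = d[(min(j, k), max(j, k))]
            d[(k, next_id)] = (sizes[i] * dik + sizes[j] * djk) / (sizes[i] + sizes[j])
        # Drop every entry that still refers to the two merged clusters.
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        merged = (trees[i], trees[j], height)
        merged_size = sizes[i] + sizes[j]
        for old in (i, j):
            del trees[old], sizes[old]
        trees[next_id], sizes[next_id] = merged, merged_size
        next_id += 1
    return next(iter(trees.values()))

# Toy data, invented for illustration only.
names = ["A", "B", "C", "D"]
seqs = ["AAAAAAAA", "AAAAAAAC", "AAAACCGG", "AAAATTGG"]
table = pairwise_differences(seqs)
tree = upgma(names, table)
print(tree)  # (('A', 'B', 0.5), ('C', 'D', 1.0), 2.0)
```

Placing each join at half the distance between the two clusters is where assumption (1) enters: it is only sensible if both lineages have changed at the same rate since they diverged.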