joe@entropy.ms.washington.edu (05/14/88)
From: Joe Felsenstein <uw-evolution!joe@entropy.ms.washington.edu>

In response to the posting by Jack Kramer on multidimensional pattern recognition:

David Sankoff did two things long ago that are relevant to your inquiry. One was to show that we can incorporate an arbitrary distance D(i,j) between pairs of amino acids into sequence alignment algorithms -- we don't just have to score them as the same or different. This distance could in principle take into account any multidimensional data you wanted it to. The reference -- not terribly readable -- is Sankoff and Rousseau, Mathematical Programming, 9:240-246, 1975. A related paper is Sankoff, SIAM Journal on Applied Mathematics, 28:35-42, 1975.

The other thing he pointed out back then was that, contrary to your assertion, one should NOT rigidly separate the stages of assessment of homology and inference of phylogenies. Sooner or later the people who are so heavily into multiple sequence alignment will discover, to their amazement, that he did the essential work on this in the mid '70s. One should not align a set of sequences considering them symmetrically; one has to take account of the fact that they may be related, and come to you in clusters. I believe that Russell Doolittle has been saying something similar recently, and I think David Lipman is too. There is a more recent chapter on this by Sankoff and Cedergren in the Sankoff and Kruskal book on Time Warps, String Edits, and Macromolecules.

One has to carry out alignment at the same time as estimating the phylogeny, finding that alignment and that phylogeny that optimize whatever criterion you are using (such as likelihood or parsimony). Only then can you see whether there is significant evidence for homology.

This of course ducks the issue of what is a good distance matrix D(i,j) between amino acids, as well as all the algorithmic practicality issues. Therein lies much work.

Joe Felsenstein, Dept. of Genetics SK-50, Univ. of Washington, Seattle WA 98195
BITNET:   FELSENST@LOCKE.HS.WASHINGTON.EDU
     or   uw-entropy!uw-evolution!joe%beaver.cs.washington.edu@UWAVM
     or   uw-entropy!uw-evolution!joe%beaver.cs@UWAVM.ACS.WASHINGTON.EDU
     or   uw-entropy!uw-evolution!joe%beaver.cs.washington.edu@CUNYVM.CUNY.EDU
INTERNET: uw-entropy!uw-evolution!joe@beaver.cs.washington.edu
     or   uw-evolution.uucp!joe%entropy.ms@beaver.cs.washington.edu
UUCP:     ... uw-beaver!uw-entropy!uw-evolution!joe
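
To illustrate the first point in the post (an arbitrary distance D(i,j) in place of a same/different score), here is a minimal sketch of pairwise global alignment by the standard dynamic-programming recurrence. It is not taken from the papers cited above. The feature vectors, the Euclidean choice of D, the linear gap cost, and the names FEATURES, D and align_cost are all assumptions made for illustration; the numbers are invented, not measured properties of amino acids.

```python
from math import dist

# Toy two-dimensional feature vectors, invented for illustration only.
# Any multidimensional description of the residues could be used instead.
FEATURES = {
    "A": (0.0, 0.0),
    "V": (1.0, 0.0),
    "L": (2.0, 0.0),
    "D": (0.0, 3.0),
    "E": (1.0, 3.0),
}


def D(i, j):
    """Assumed distance between residues i and j: Euclidean distance
    between their (hypothetical) feature vectors."""
    return dist(FEATURES[i], FEATURES[j])


def align_cost(s, t, D, gap):
    """Minimum total cost of a global alignment of sequences s and t.

    Aligning residue i with residue j costs D(i, j); aligning a residue
    with a gap costs `gap` (a simple linear gap cost, assumed here).
    """
    m, n = len(s), len(t)
    # cost[a][b] = best cost of aligning s[:a] with t[:b]
    cost = [[0.0] * (n + 1) for _ in range(m + 1)]
    for a in range(1, m + 1):
        cost[a][0] = a * gap
    for b in range(1, n + 1):
        cost[0][b] = b * gap
    for a in range(1, m + 1):
        for b in range(1, n + 1):
            cost[a][b] = min(
                cost[a - 1][b - 1] + D(s[a - 1], t[b - 1]),  # pair two residues
                cost[a - 1][b] + gap,                        # gap in t
                cost[a][b - 1] + gap,                        # gap in s
            )
    return cost[m][n]
```

Calling align_cost("AVL", "AEL", D, 2.0) scores the middle column by how far apart V and E are in the toy feature space rather than as a flat mismatch. This covers only the pairwise case; it does not attempt the joint estimation of alignment and phylogeny argued for in the post.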