[bionet.molbio.evolution] Response to Jack Kramer

joe@entropy.ms.washington.edu (05/14/88)

From: Joe Felsenstein <uw-evolution!joe@entropy.ms.washington.edu>


In response to the posting by Jack Kramer on multidimensional pattern
recognition: 

David Sankoff did two things long ago that are relevant to your
inquiry.  One was to show that we can incorporate an arbitrary
distance D(i,j) between pairs of amino acids into sequence alignment
algorithms -- we don't just have to score them as the same or
different.  This distance could in principle take into account any
multidimensional data you wanted it to.  The reference -- not terribly
readable -- is Sankoff and Rousseau, Mathematical Programming,
9: 240-246, 1975.  A related paper is Sankoff, SIAM Journal on Applied
Mathematics, 28: 35-42, 1975.
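
As a rough illustration of that first point, here is a minimal Python
sketch of global alignment by dynamic programming in which a
substitution is charged an arbitrary distance D(i,j) instead of a
same/different score.  The names align_cost and toy_dist, the linear
gap cost, and the toy distance values are all assumptions made up for
this example; none of them comes from the papers cited above.

```python
def align_cost(a, b, dist, gap):
    """Minimum total cost of globally aligning sequences a and b.

    dist(x, y) is any nonnegative distance between residues x and y.
    gap is the cost of aligning one residue against a gap (a linear
    gap cost is an assumption of this sketch).
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap
    for j in range(1, m + 1):
        cost[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + dist(a[i - 1], b[j - 1]),  # substitute
                cost[i - 1][j] + gap,                           # gap in b
                cost[i][j - 1] + gap,                           # gap in a
            )
    return cost[n][m]


# Hypothetical distances: a few "similar" pairs are cheap, the rest cost 1.
TOY_D = {frozenset("IL"): 0.2, frozenset("DE"): 0.3}


def toy_dist(x, y):
    if x == y:
        return 0.0
    return TOY_D.get(frozenset((x, y)), 1.0)
```

Calling align_cost("ILD", "LLE", toy_dist, 1.0) charges the I/L and
D/E pairs their small distances rather than a full mismatch.  Any
multidimensional measure of amino acid similarity could be plugged in
as dist without changing the recursion.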

The other thing he pointed out back then was that, contrary to your
assertion, one should NOT rigidly separate the stages of assessment
of homology and inference of phylogenies.  Sooner or later the people
who are so heavily into multiple sequence alignment will discover, to
their amazement, that he did the essential work on this in the mid '70s.
One should not align a set of sequences as if they were all related
symmetrically; one has to take account of the fact that they may be
related to different degrees, and come to you in clusters.  I believe
that Russell Doolittle has been saying something similar recently, and
I think David Lipman is too.

There is a more recent chapter on this by Sankoff and Cedergren in the
book edited by Sankoff and Kruskal, Time Warps, String Edits, and
Macromolecules.

One has to carry out alignment at the same time as estimating the
phylogeny, finding the alignment and the phylogeny that together
optimize whatever criterion you are using (such as likelihood or
parsimony).
Only then can you see whether there is significant evidence for homology.
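
The full joint problem is far more than fits in a few lines.  As a
minimal sketch of one ingredient only, assuming the tree is already
fixed and we look at a single already-aligned column, the following
Python finds the cheapest assignment of states to the interior nodes
under an arbitrary distance D(i,j), by dynamic programming up the tree.
The nested-tuple tree encoding and the names site_cost and
parsimony_cost are hypothetical, chosen just for this example.

```python
def site_cost(tree, states, dist):
    """Return {state: minimum cost of this subtree if its root has state}.

    tree is either a leaf (a one-character string giving the observed
    state) or a tuple (left, right) of two subtrees.
    """
    if isinstance(tree, str):
        # A leaf is fixed to its observed state; other states are impossible.
        return {s: (0.0 if s == tree else float("inf")) for s in states}
    left, right = (site_cost(child, states, dist) for child in tree)
    table = {}
    for s in states:
        # Choose the best state for each child independently, given s here.
        table[s] = (min(dist(s, t) + left[t] for t in states)
                    + min(dist(s, t) + right[t] for t in states))
    return table


def parsimony_cost(tree, states, dist):
    """Minimum total cost over all assignments of states to interior nodes."""
    return min(site_cost(tree, states, dist).values())
```

With a 0/1 distance this reduces to counting changes, so the tree
(("A", "A"), ("C", "G")) over the states "ACGT" costs 2.  Searching over
trees and over alignments at the same time, which is the hard part, is
not attempted here.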

This of course ducks the issue of what is a good distance matrix D(i,j)
between amino acids, as well as all the algorithmic practicality
issues.  Therein lies much work.

Joe Felsenstein,
Dept. of Genetics SK-50, Univ. of Washington, Seattle WA 98195
 BITNET:   FELSENST@LOCKE.HS.WASHINGTON.EDU
      or   uw-entropy!uw-evolution!joe%beaver.cs.washington.edu@UWAVM
      or   uw-entropy!uw-evolution!joe%beaver.cs@UWAVM.ACS.WASHINGTON.EDU
      or   uw-entropy!uw-evolution!joe%beaver.cs.washington.edu@CUNYVM.CUNY.EDU
 INTERNET:  uw-entropy!uw-evolution!joe@beaver.cs.washington.edu
       or   uw-evolution.uucp!joe%entropy.ms@beaver.cs.washington.edu
 UUCP:      ... uw-beaver!uw-entropy!uw-evolution!joe