[bionet.molbio.evolution] More DNA hybridization discussion

dbd%benden@LANL.GOV (Dan Davison) (10/10/88)

[Sorry about the duplicate posting & mailing.  Hit the wrong key--dbd]

Some more points about the Sarich, Sibley, Ahlquist saga:

Joe Felsenstein writes:
      Dan -- care to expand on your remarks?

OK, here goes.

  (2)[An open question is] Whether DNA hybridization data on relationships
     is fatally flawed owing to presence of some repeated sequences.[...]
     Here I would ask Dan Davison for some details on why he thinks
     "fractured, highly repetitive elements" would cause trouble.  If 
     there are enough different repeated sequences involved, then we
     should still be able to use them to measure average divergence
     of sequences.

Before presenting my reasons, a bit of background.  I was a grad student
at SUNY Stony Brook and that is the home of Farris, Rohlf, and Sokal
(although I was not in that department [whew!]).  Therefore I was
frequently around the cladist vs. pheneticist (sp?) discussions/wars.
I also was in laboratories that were doing DNA hybridization of the
Britten (Cot) type, although I did *not* do any of that myself.  I then
went to the U of Houston, where I spent two years looking at small
ribosomal RNA sequences and constructing phylogenetic trees from
sequence distance data.

Now to the meat of the subject.  I do not specifically believe that DNA
hybridization data on relationships is fatally flawed owing to the presence
of some repeated sequences.  This certainly is not the case in enterobacteria
and the fruit fly, the only systems in which I have any experience.  My
point is that when sequences, either single-copy or middle repetitive
(rRNAs), are too similar to each other *bulk DNA hybridization* will not
be able to make useful differentiation between the sequences.  For example,
in the work of Woese et al. the RNA fingerprinting method for systematics
breaks down at precisely those points most interesting (to me anyway).
Their method is extremely powerful in elucidating distant relationships;
it doesn't work within, say, E. coli vs. Shigella sp. vs. Salmonella sp.
There are significant methodological differences between RNA fingerprinting
and DNA hybridization; but I think the point remains valid.

The comment I made about "fractured, highly repetitive sequences" comes from
work done on some strains of Shigella dysenteriae and E. coli.  We were
examining the copy number of IS1 from a variety of natural isolates of E. coli
(the famous Milkman collection) and some other enterobacteria.  One strain
of Shigella had about 250-300 copies of the element, whereas most E. coli
have no more than 10-12.  That population consisted of two sequences:
one with 55% homology to IS1 and another with 99% similarity (from 
sequencing).  I was either told or led to believe that this species in
particular would make hash of any Cot attempts.

In summary, my experience in the Drosophila and E. coli systems, as well
as two years looking at 130 16S rRNA sequences, leads me to believe that
hybridization won't cut it for very close relationships (>= 95% similarity).
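
To put the 95% figure in concrete terms, here is a minimal sketch of how
percent similarity and the corresponding distance could be computed for two
already-aligned sequences.  It is an illustration only, not the method used
in any of the work discussed here; the function names and the toy sequences
are hypothetical, and gaps are simply treated as ordinary characters.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which two sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned positions that differ (uncorrected distance)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


# Toy, made-up sequences (20 positions each); not real rRNA or IS1 data.
ref = "ACGTACGTACGTACGTACGT"
close = "ACGTACGTACGTACGTACGA"  # differs from ref at one position

print(percent_identity(ref, close))
print(p_distance(ref, close))
```

At 95% similarity only one position in twenty differs, so a bulk measurement
has to resolve a very small signal to separate such sequences.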

  (3) Whether there is a fatal flaw in the use of any data like
      DNA hybridization which must be analyzed as distances rather than
      by reference to individual sites.  Sarich is NOT raising this issue
      -- he is a long-term DEFENDER of use of distance measures and methods.
      But without knowing Marks, I get the impression that he comes from
      the "phylogenetic systematics" tradition which considers that there
      is something fatally wrong with distances.  In any case, although the
      issue is not being raised by Sarich and Marks, it is clear that many
      phylogenetic systematists consider this to be the issue and that any
      discomfiture of Sibley reinforces this position.  Even if everything
      Sarich and Marks say is accepted I don't see that this point follows
      at all.  I have argued (in papers controversying with Farris) that
      there is not a fatal flaw in distance methods.

Agreed that Sarich is a defender of distance methods.  Recall his article
in Nature, a book review, titled "The Sound of Distance Drums" where he
was vigorous, if not insulting, in his defense of distance methods.
Personally, I have made several attempts to understand what Farris and
others are arguing about; I have never been able to figure out what the
problem is.  I just concluded after one talk with Farris that the
systematists didn't understand molecular data.

As you say, even if Sarich and Marks are correct it certainly does not
speak to the "fatally wrong with distance" situation.

dan davison