[bionet.molbio.evolution] Features needed in phylogeny programs

joe@GENETICS.WASHINGTON.EDU (Joe Felsenstein) (11/22/90)

In the interest of generating more discussion in the molecular-evolution
newsgroup, I would be interested in seeing postings (rather than e-mail)
on the question:


  What features are needed in future programs for inferring phylogenies?


I am at work on the next version of my own package and it would help me to
hear what people think is needed (no guarantees that I will take any of the
suggestions, of course).

Please do *not* cross-post responses to Usenet netnews sci.bio as the
noise/signal ratio is too high there and we will be swamped with useless
responses.

-----
Joe Felsenstein, Dept. of Genetics, Univ. of Washington, Seattle, WA 98195
 Internet/ARPANet: joe@genetics.washington.edu     (IP No. 128.208.128.1)
 Bitnet/EARN:      felsenst@uwalocke
 UUCP:             ... uw-beaver!evolution.genetics!joe

wrp@biochsn.acc.Virginia.EDU (William R. Pearson) (11/24/90)

	1. A method for generating distances from aligned protein
sequences.

	2. A method for calculating numbers of changes and including
them in parsimony trees.

	3. Make general versions of drawgram and drawtree that simply
output vectors and labels, but do not attempt to make fonts. For
publication figures, I prefer to use Postscript directly, for better
control over linewidths, font sizes and types, etc.  In addition, with
Postscript (or some general format), I can edit the position of the
labels so that they do not overlap.  In this case, less is more.  (It
would be very nice to have an option to output angles and lengths,
since they are easier to edit, and preserve distances better, than
pairs of x,y coordinates.)
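
To illustrate the angles-and-lengths idea in item 3, here is a minimal
sketch in Python.  It assumes a hypothetical record layout of
(label, parent, angle in degrees, branch length); none of these names
come from drawgram or drawtree.  Branch lengths are kept as-is while the
x,y endpoints are derived only at drawing time, so editing an angle
never distorts a distance.

```python
import math

def branch_endpoint(x0, y0, angle_deg, length):
    """Return the (x, y) end of a branch that starts at (x0, y0)."""
    theta = math.radians(angle_deg)
    return (x0 + length * math.cos(theta), y0 + length * math.sin(theta))

def layout(branches, origin=(0.0, 0.0)):
    """Convert (label, parent, angle_deg, length) records to coordinates.

    A parent of None means the branch starts at the origin.  Parents must
    be listed before their children.  Returns {label: (x, y)}.
    """
    pos = {None: origin}
    for label, parent, angle_deg, length in branches:
        px, py = pos[parent]
        pos[label] = branch_endpoint(px, py, angle_deg, length)
    del pos[None]
    return pos

if __name__ == "__main__":
    demo = [("A", None, 0, 2.0), ("B", "A", 90, 1.0)]
    for name, (x, y) in layout(demo).items():
        print(f"{name}: x={x:.3f} y={y:.3f}")
```

Moving a label then means changing one angle; the length, and so the
distance it represents, is untouched.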


	I have not had such serious problems with changing sequence
formats, because the text editor that I use (emacs) allows editing of
rectangles.  I have had a lot of problems with trees, however, and
some sort of tree editor or verifier, or some check that the tree
includes the same number of species as the data, would be very
helpful.
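
A rough sketch of the kind of check meant here, in Python.  It assumes
the tree is written in nested-parenthesis (Newick-style) notation with
unquoted labels, and that the species names from the data are already in
a list; the function names are made up for illustration.

```python
import re

def newick_leaves(newick):
    """Return the leaf labels of a nested-parenthesis tree string.

    A leaf label starts right after '(' or ',' and runs up to the next
    ':', ',', ')' or ';'.  Labels after ')' (internal nodes) and numbers
    after ':' (branch lengths) are therefore skipped.
    """
    return re.findall(r"[(,]\s*([^\s():,;]+)", newick)

def check_tree(newick, species):
    """Compare the leaves of a tree with the species in the data."""
    leaves = newick_leaves(newick)
    leaf_set, species_set = set(leaves), set(species)
    return {
        "missing_from_tree": sorted(species_set - leaf_set),
        "not_in_data": sorted(leaf_set - species_set),
        "duplicated": sorted({n for n in leaves if leaves.count(n) > 1}),
    }

if __name__ == "__main__":
    report = check_tree("((A:0.1,B:0.2):0.3,C:0.4);", ["A", "B", "C", "D"])
    print(report)
```

Three empty lists mean the tree and the data agree on the species set.
Quoted labels and comments in the tree string are not handled.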

Bill Pearson

joe@GENETICS.WASHINGTON.EDU (Joe Felsenstein) (12/04/90)

Thanks to many people for the useful suggestions for changes in my own
package.  I can see that there is a lot of concern about format conversion,
and I hope in my next version to allow either interleaved or sequential
format, at the user's option.  I will think about all the suggestions.
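
For readers unsure of the difference between the two layouts, a small
Python sketch of converting interleaved to sequential follows.  The
layout is an assumption, not a specification: a header line giving the
number of species and sites, names padded to 10 characters in the first
block only, later blocks holding continuation lines in the same species
order, and blank lines between blocks.

```python
def interleaved_to_sequential(text, name_width=10):
    """Join the blocks of an interleaved alignment, one line per species."""
    lines = text.splitlines()
    ntaxa, nsites = (int(tok) for tok in lines[0].split()[:2])
    body = [ln for ln in lines[1:] if ln.strip()]   # drop blank separators
    if len(body) % ntaxa != 0:
        raise ValueError("number of data lines is not a multiple of ntaxa")

    names, seqs = [], []
    for i, ln in enumerate(body):
        if i < ntaxa:
            # First block: fixed-width name, then the start of the sequence.
            names.append(ln[:name_width].strip())
            seqs.append(ln[name_width:].replace(" ", ""))
        else:
            # Later blocks: sequence only, same species order as block one.
            seqs[i % ntaxa] += ln.replace(" ", "")

    for name, seq in zip(names, seqs):
        if len(seq) != nsites:
            raise ValueError(f"{name}: expected {nsites} sites, got {len(seq)}")

    out = [f"{ntaxa} {nsites}"]
    out += [name.ljust(name_width) + seq for name, seq in zip(names, seqs)]
    return "\n".join(out) + "\n"

if __name__ == "__main__":
    demo = ("2 8\n" + "Alpha".ljust(10) + "ACGT\n" + "Beta".ljust(10)
            + "TTGA\n\nGGCC\nAACC\n")
    print(interleaved_to_sequential(demo), end="")
```

The length check catches the commonest hand-editing slip, a block with a
line missing or out of order.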

However, I was really not trying to get a discussion of features in my
package going.  I was trying to raise a more general question, without
reference to any particular package or features thereof.  So let me re-ask
the question in hopes of stimulating a more general, wide-ranging discussion:

      What features are needed in future phylogeny packages?

I am particularly interested in suggestions of kinds of analyses no one
may have thought of, kinds of questions that could be asked of data that
are not now askable in existing programs.  I am *not* trying to get lots of
postings referring to my own package.


debry@ds1.scri.fsu.edu (Ron DeBry) (12/05/90)

One thing that current phylogeny packages do not do is alignments.  It
is clearly unrealistic to expect that nucleotide sequences from a large
number of species can be properly aligned _before_ the phylogeny has
been estimated, but that is what all the packages assume.  As available
computing power increases, I think we should start to look at ways to
combine alignment and phylogeny estimation.

Jotun Hein has begun to address this issue with an algorithm that
alternates the two steps: align, build a parsimony tree, realign, rebuild
the tree, and so on.  We have tried to use his program on some of Larry
Abele's decapod rDNA data; the alignment looked OK at the end, but the
phylogeny didn't make any sense compared to anything else that had been
done with the same data.  (We also haven't had any luck contacting Hein.
If anyone knows a reliable means to get in touch with him via e-mail,
please let me know.)

While a maximum likelihood method which estimates both alignment and
phylogeny simultaneously is no doubt possible, the calculations are
probably beyond even Crays for now.  In the meantime, I think that a
program which alternates parsimony-based alignment with likelihood-based
phylogeny estimation would be a help.
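
The control flow of such an alternating scheme can be sketched in a few
lines of Python.  This is only the outer loop under stated assumptions:
`align` and `build_tree` are hypothetical callables supplied by the user
(a tree-guided aligner and a tree estimator), and the stopping rule,
"the tree did not change", is one simple choice among several.  It is
not a description of Hein's program.

```python
def alternate(seqs, align, build_tree, max_rounds=10):
    """Alternate alignment and tree estimation until the tree is stable.

    align(seqs, tree)     -> alignment (tree is None in the first round)
    build_tree(alignment) -> tree, comparable with ==
    Returns (alignment, tree, rounds_used).
    """
    tree = None
    alignment = None
    for round_no in range(1, max_rounds + 1):
        alignment = align(seqs, tree)       # realign, guided by current tree
        new_tree = build_tree(alignment)    # re-estimate tree from alignment
        if new_tree == tree:                # nothing changed: stop here
            return alignment, tree, round_no
        tree = new_tree
    return alignment, tree, max_rounds      # gave up after max_rounds

if __name__ == "__main__":
    # Toy stand-ins that only exercise the loop; they do no real analysis.
    toy_align = lambda seqs, tree: ("aln", tree)
    toy_tree = lambda aln: "T1" if aln[1] is None else "T2"
    print(alternate(["ACGT", "ACGA"], toy_align, toy_tree))
```

Nothing guarantees that such a loop converges to a sensible answer; the
round limit is there because the two steps could in principle cycle.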

Ron DeBry

debry@ds1.scri.fsu.edu

debry@fsu (BitNet)