BIOCUKM@osucc.bitnet (04/30/91)
Arabidopsis molecular biologists and geneticists should look at the recent paper by Kemmerer, Lei and Wu (J. Mol. Evol. 32:227-237, 1991). The paper reports the isolation and nucleotide sequencing of the cytochrome c gene of Arabidopsis. The unusual aspect of the paper is that the predicted amino acid sequence is much more closely related to those of the cytochromes c of Neurospora and yeasts than it is to the cytochrome c sequences of another member of the crucifer family, cauliflower, and, indeed, of all flowering plants whose cytochrome c sequences are known.

When I first saw the abstract, I thought that this would be another example of grand conclusions from uncertain trees. Calculated phylogenetic trees have uncertainties associated with the placement of branches. Those uncertainties are rarely included in diagrams and are often ignored when drawing conclusions. On reading the paper, I found the authors well aware of these difficulties. They analyzed the sequence data in many different ways. Each way produced the same highly significant conclusion. A visual examination of the amino acid sequences leads to the same conclusion, that the cytochrome c of the sequenced gene is more similar to Neurospora and yeast sequences than it is to those of higher plants.

What's going on? Several explanations come to mind.

1. The sequenced gene could encode a minor cytochrome c of plants, one not detected by protein sequencing. A S. cerevisiae probe was used to screen an Arabidopsis library. All 4 hybridizing library clones were the same. Thus, the gene is the only Arabidopsis cytochrome c gene recognized by the yeast probe. Perhaps the genes for the major cytochromes c of plants are not recognized by the S. cerevisiae probe. A difficulty with this and other technical explanations is that similar conclusions about ancestry were reached with histone H3 comparisons (though these are less certain).

2. The sequenced gene was from a contaminant in the DNA preparation.
This is unlikely since a fragment of the cloned gene recognizes a fragment of the same size in blots of Arabidopsis DNA.

3. Lateral transfer of the cytochrome c gene (either from yeast to Arabidopsis or, I suppose, vice versa). This possibility, raised in the paper, seems unlikely since the transferred gene would have to have completely replaced an ancestral gene.

4. My off-the-cuff favorite hypothesis is that in studying molecular evolution over such large distances we are really not studying the evolution of the molecules themselves. We are really studying the evolution of the fitness landscape. Each of the cytochromes c has evolved to be optimally suited for its particular ecological niche. Perhaps the relevant features of the Arabidopsis cytochrome c ecological niche are smallness and rapid growth. Those properties seem more like those of the yeasts than of cotton, mung bean, cauliflower, sesame, sunflower, wheat, buckwheat and Ginkgo. Against this view is the presence of Chlamydomonas on the higher plant branch.

5. A synthesis of explanations 3 and 4 would be that somewhere in Arabidopsis' ancestry it was in close association (symbiosis?) with a fungus. The association was so close that heterokaryons formed. With time, one of the copies of each gene that was present in both the plant and the fungal progenitor was lost. Selection may have favored the fungal cytochrome c.

What do the rest of you think? This is a question of economic, as well as scientific, importance. The paper makes the suggestion that Arabidopsis may not be a good model for higher plants. If this view takes hold, there goes the funding!

Ulrich Melcher
Department of Biochemistry
Oklahoma State University
Stillwater OK 74078 USA
BIOCUKM@OSUCC (Bitnet) 405-744-6210
KELLOGG@FRODO.MGH.HARVARD.EDU (05/01/91)
I'm not sure I'd be all that worried about the weird placement of Arabidopsis on the cytochrome c tree, and I certainly wouldn't question the utility of Arabidopsis as a model system. There are several reasons:

1. Trees based on cytochrome c tend to be peculiar in general. For example, if you check the trees published by Syvanen et al. (JME, 1989, ref. in the Kemmerer paper) you can find all sorts of oddities. In their fig. 5 they show a minimal eukaryotic cytochrome c tree - the two fish don't come out together; on the vertebrate clade, the frog is the most basal group, more primitive even than the carp; sesame (a dicot) comes out with rice (a monocot); the grasses (wheat, maize and rice) form a paraphyletic rather than monophyletic group.

2. The tree published by Kemmerer et al. is not particularly robust. The numbers beside the branches are counts of bootstrap replicates, not mutations. Thus the only groupings that appear in at least half of the bootstrap samples are the fungus + Arabidopsis clade and the higher plant clade. Those who choose to interpret bootstrap samples in terms of statistical significance would say those groupings are significant at the 50% level - i.e. not very significant. If you want to use a 95% level, then only the Neurospora + Arabidopsis grouping is significant and the rest of the tree is phylogenetically meaningless. The statistical interpretation of bootstraps is under heavy fire at the moment, so I wouldn't push it too far; however, if the animal grouping only appears in 19 out of 50 replicates, it doesn't seem overly convincing.

3. Based on the phenograms, the Arabidopsis sequence is not particularly similar to that of Neurospora - the similarity is much less than that among the higher plants. So it may not be much like a higher plant cytochrome, but it isn't all that much like a Neurospora cytochrome either.
Arabidopsis is on a long branch, and it has been extensively documented that "long branches attract" in phylogenetic analyses (the so-called Felsenstein zone). The best way to correct that problem is to increase the sampling density around the problematic taxon.

The implication of 1-3 is that cytochrome c may not be much use as an indicator of relationship. On the other hand, the fact that similarities in the molecules do not appear to be determined primarily by phylogeny means that there must be some really interesting molecular biology going on - this gets back to the fitness landscape described by Ulrich Melcher. Possibly selection has been strong enough to effectively wipe out much of the phylogenetic information in the molecule.

The other possibility is one mentioned by Kemmerer et al. in their companion paper in Mol. Biol. and Evol., in which they suggest that the problem is comparison of paralogous genes. If there are several copies of the cytochrome c gene, then each copy will have its own phylogeny. If the sequence for Arabidopsis is from one copy of the gene, and at least some of the other plant sequences are from another copy (or, worse yet, from several other copies), then you are comparing apples and oranges - you get a funny tree; the tree won't correspond either to a gene phylogeny or to an organismic phylogeny. I saw a tree a few days ago on sequences of glucanases in which that was clearly what had happened.

Given all this, I suppose that lateral gene transfer can't be ruled out, but I'm not sure it's the most compelling explanation of the pattern. My conclusion is that Arabidopsis is probably a fine model for higher plants - just that attempts to generalize results from Arabidopsis need to be checked. This is hardly a radical suggestion. The stronger conclusion though is that cytochrome c is probably not a good molecule for exploring organismic phylogeny. Other people have reached this conclusion before.
Maybe what we need now is some exploration of why it isn't much use.

Elizabeth Kellogg
Dept. of Molecular Biology, Mass. General Hospital, Boston, MA; and
Arnold Arboretum of Harvard University, 22 Divinity Ave., Cambridge, MA 02138
kellogg@frodo.mgh.harvard.edu
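
A footnote on the bootstrap point (item 2 of the reply above): the numbers on such a tree count how often a grouping reappears when the alignment columns are resampled with replacement. The Python sketch below is only an illustration of that counting, not the procedure used by Kemmerer et al. The four sequences are made up (they are not real cytochrome c data), the names taxon_A to taxon_D and all function names are hypothetical, and "support" is simplified to "the partner is the unique nearest neighbor by number of mismatches" instead of building a tree for each replicate.

```python
import random

# Toy, made-up aligned sequences (hypothetical; NOT real cytochrome c data).
ALIGNMENT = {
    "taxon_A": "GDVEKGKKIF",
    "taxon_B": "GDVEKGKKLF",
    "taxon_C": "GSAKKGATLF",
    "taxon_D": "GSAKKGANLF",
}


def hamming(a, b):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def resample_columns(alignment, rng):
    """One bootstrap replicate: draw alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


def nearest_neighbor(alignment, focal):
    """Return the unique closest taxon to `focal`, or None if there is a tie."""
    others = [name for name in alignment if name != focal]
    dists = {name: hamming(alignment[focal], alignment[name]) for name in others}
    best = min(dists.values())
    winners = [name for name in others if dists[name] == best]
    return winners[0] if len(winners) == 1 else None


def bootstrap_support(alignment, focal, partner, replicates=50, seed=0):
    """Count the replicates in which `partner` is the unique nearest neighbor."""
    rng = random.Random(seed)
    hits = sum(
        nearest_neighbor(resample_columns(alignment, rng), focal) == partner
        for _ in range(replicates)
    )
    return hits, hits / replicates


if __name__ == "__main__":
    hits, proportion = bootstrap_support(ALIGNMENT, "taxon_A", "taxon_B")
    print(f"taxon_A + taxon_B recovered in {hits} of 50 replicates ({proportion:.2f})")
```

On the same scale, the animal grouping that appears in 19 of 50 replicates corresponds to a proportion of 19/50 = 0.38, which is why it is hard to call it well supported.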