[bionet.molbio.genbank] The Matter of Motifs

reisner@ee.su.oz.au (Alex Reisner) (02/10/91)

The Matter of Motifs.

1. The greater degree of ambiguity in nucleotide sequences compared to
sequences of amino acid residues makes the use of consensus sequences of
nucleotides or the derivation of motifs from nucleotide sequences more
difficult and more prone to misuse.  But it doesn't make them useless.
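
To put a rough number on point 1, here is a minimal sketch of how alphabet
size alone changes the chance background.  It assumes letters are uniform
and independent, which real sequences are not, and the function name and
the example lengths are mine, chosen only for illustration.

```python
def expected_chance_matches(alphabet_size: int, motif_length: int,
                            sequence_length: int) -> float:
    """Expected number of exact chance matches of one fixed motif.

    Assumption: every letter is drawn uniformly and independently,
    which is a simplification of real sequence composition.
    """
    # Number of start positions at which the motif could begin.
    positions = max(sequence_length - motif_length + 1, 0)
    # Chance that one given position matches the motif exactly.
    per_position = (1.0 / alphabet_size) ** motif_length
    return positions * per_position


# Illustrative values only: a 6-letter motif in 1,000,000 letters.
dna = expected_chance_matches(4, 6, 1_000_000)
protein = expected_chance_matches(20, 6, 1_000_000)
print(f"nucleotide: {dna:.2f}  amino acid: {protein:.4f}")
```

Under these assumptions a six-letter nucleotide motif turns up by chance
some thousands of times more often than a six-residue protein motif, which
is why nucleotide motifs need more supporting information, not why they
should be discarded.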

2.  The more pertinent information beyond the sequence itself, whether
nucleotide or amino acid, that can be used to determine motifs, the
more useful the motifs become.

3. Michael Sternberg (Nature, 349(1991)111) analyzes the use of Amos
Bairoch's PROSITE database.  In his table Sternberg demonstrates the
utility of PROSITE, which allows any user to determine the probability of
a false positive when a match within the test sequence is obtained with a
motif in PROSITE.
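
As a back-of-envelope illustration of what a false-positive probability
for a pattern can look like, the sketch below treats a motif as a list of
allowed residues per position.  It is not the method behind Sternberg's
table or anything PROSITE itself provides; it assumes uniform,
independent residue frequencies, and the toy pattern and function names
are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def site_probability(pattern):
    """Chance that one start position matches every pattern position.

    Assumption: residues are uniform and independent (a simplification).
    """
    p = 1.0
    for allowed in pattern:
        # Fraction of the 20 residues accepted at this position.
        p *= len(set(allowed) & set(AMINO_ACIDS)) / len(AMINO_ACIDS)
    return p


def prob_at_least_one_chance_match(pattern, sequence_length):
    """Approximate chance of one or more matches in a random sequence.

    Treats start positions as independent, which is only approximate
    because overlapping windows share residues.
    """
    positions = max(sequence_length - len(pattern) + 1, 0)
    return 1.0 - (1.0 - site_probability(pattern)) ** positions


# Hypothetical toy pattern: C, any, any, C, one of LIVM, H.
toy_pattern = ["C", AMINO_ACIDS, AMINO_ACIDS, "C", "LIVM", "H"]
p_site = site_probability(toy_pattern)
p_seq = prob_at_least_one_chance_match(toy_pattern, 300)
print(f"per position: {p_site:.2e}  per 300-residue sequence: {p_seq:.4f}")
```

The number that matters in practice is the one measured against a real
database, which is what a carefully defined motif collection supplies; the
estimate above only shows why a loosely defined pattern can match by chance.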

	Sternberg concludes, "Motifs must be carefully defined as in
PROSITE rather than being developed *a posteriori*.  It is not valid to find
a weak sequence similarity to identify two proteins and then arbitrarily
to identify common residues to establish a motif that by chance is not
expected to occur in a sequence database."


	Perhaps one last point is worth making.  There should be a
distinction between the representation of a motif as fact and the use of a
*quasi* motif to assist in analysis of data.  So, for example, the
'patterns' derived for the MCBRR application 'plsearch' using BLAST on
Swiss-Prot are useful to the research worker even though they are not
rigorously determined motifs as Sternberg advocates.  Making databases of
'patterns' available to workers in the field is useful; the fact that the
results obtained from their use should not be treated as sacrosanct doesn't
negate that conclusion.

Alex Reisner
Aust. Nat. Genomic Information Centre