[bionet.molbio.genbank] Databases of Subsequences

GARAVELL@gunbrf.bitnet (02/07/91)

In article <9102061944.AA15931@genbank.bio.net>
toms@fcs260c2.ncifcrf.gov (Tom Schneider) wrote:

> GenBank already has the feature table; this would be completely sufficient to
> satisfy everyone IF ONLY IT WERE KEPT UP TO DATE.  There are already hundreds
> of binding sites of various kinds, but only certain kinds are recorded in the
> database.  You don't need to go looking to create yet another feature of
> GenBank; all you need to do is use what is there.
> If you create consensus sequences you will be doing a huge disservice to
> molecular biology by perpetrating this poor method.

The PIR announced late last year that we were looking at making a PIR database
for protein sequence features and motifs, and solicited comments.  Tom, would
you feel the same way about a protein database similar to the one Jamie Hayden
says that GenBank is considering for nucleic acids?
------------------------------------------------------------------------
                                 Dr. John S. Garavelli
                                 Database Coordinator
                                 Protein Identification Resource
                                 National Biomedical Research Foundation
                                 Washington, DC  20007
                                 POSTMASTER@GUNBRF.BITNET

toms@fcs260c2.ncifcrf.gov (Tom Schneider) (02/08/91)

In article <9102062217.AA04849@genbank.bio.net> GARAVELL@gunbrf.bitnet writes:
>In article <9102061944.AA15931@genbank.bio.net>
>toms@fcs260c2.ncifcrf.gov (Tom Schneider) wrote:
>
>> GenBank already has the feature table; this would be completely sufficient to
>> satisfy everyone IF ONLY IT WERE KEPT UP TO DATE.  There are already hundreds
>> of binding sites of various kinds, but only certain kinds are recorded in the
>> database.  You don't need to go looking to create yet another feature of
>> GenBank; all you need to do is use what is there.
>> If you create consensus sequences you will be doing a huge disservice to
>> molecular biology by perpetrating this poor method.
>
>The PIR announced late last year that we were looking at making a PIR database
>for protein sequence features and motifs, and solicited comments.  Tom, would
>you feel the same way about a protein database similar to the one Jamie Hayden
>says that GenBank is considering for nucleic acids?

John:  Yes.  It's the same principle.  If a consensus is constructed
and made 'official', then better methods (i.e., a frequency matrix) will
tend to be suppressed in people's minds.  The dang consensus would have to
be dropped eventually, since it isn't a good method, so why start now?
This applies to any motif - DNA, RNA, protein - you can think of.
This does NOT mean it isn't a good idea to make a carefully culled
collection from which people could do anything they want.  But that amounts
to hard scientific work, with some judgement required.  Also, to avoid
data redundancy, you would want to do it by pointers to the database
or by instructions for extracting the appropriate sequences (as in the
Delila system).
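
To make the consensus-versus-frequency-matrix point concrete, here is a
minimal sketch.  The five aligned sites are made-up illustration data, and
the function names are hypothetical.  The matrix is a plain table of
relative base frequencies per position, which may be simpler than what is
meant above by "frequency matrix".

```python
from collections import Counter

# Hypothetical aligned sites of equal length (made-up data, for illustration).
SITES = ["TATAAT", "TATGAT", "TACAAT", "TATACT", "GATAAT"]
BASES = "ACGT"


def frequency_matrix(sites):
    """Return one dict per alignment column mapping base -> relative frequency."""
    n = len(sites)
    matrix = []
    for column in zip(*sites):
        counts = Counter(column)
        matrix.append({base: counts[base] / n for base in BASES})
    return matrix


def consensus(matrix):
    """Collapse the matrix to the single most frequent base at each position."""
    return "".join(max(BASES, key=lambda b: col[b]) for col in matrix)


def score(matrix, candidate):
    """Multiply the per-position frequencies of the candidate's bases."""
    product = 1.0
    for col, base in zip(matrix, candidate):
        product *= col[base]
    return product


matrix = frequency_matrix(SITES)
cons = consensus(matrix)

print(cons)                                # TATAAT
print(sum(site == cons for site in SITES)) # 1
print(score(matrix, "TATGAT"))             # nonzero: a real site, but not the consensus
print(score(matrix, "TTTAAT"))             # 0.0: T never occurs at the second position
```

In this toy set only one of the five sites equals the consensus string, so
an exact consensus match would miss the other four.  The matrix keeps the
per-position variation that the consensus throws away, and it can still be
collapsed to a consensus later if someone wants one; the reverse is not
possible.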

>                                 Dr. John S. Garavelli
>                                 Database Coordinator
>                                 Protein Identification Resource
>                                 National Biomedical Research Foundation
>                                 Washington, DC  20007
>                                 POSTMASTER@GUNBRF.BITNET

  Tom Schneider
  National Cancer Institute
  Laboratory of Mathematical Biology
  Frederick, Maryland  21702-1201
  toms@ncifcrf.gov