gribskov@FCRFV1.NCIFCRF.GOV ("Gribskov, Michael") (08/23/90)
Sorry for the delayed reply; I assumed that there would be much more traffic on this question. There are several projects underway to gather libraries, dictionaries, or databases of protein structural and/or sequence motifs. My apologies in advance if I have misrepresented any of the material below -- obviously you should contact the original authors for details. Some of the projects I know of are:

Randy Smith
Molecular Biology Computer Research Resource
RSMITH @ MBCRR.HARVARD.EDU

I think these patterns derive from alignment and clustering of all protein sequences in the protein database. They are regular expressions somewhat like the PROSITE patterns. Available by ftp, and from the EMBL and U. Houston list servers (name PLSEARCH).
Ref: Smith and Smith (1990) PNAS 87, 118-122.

John Wooton
National Center for Biotechnology Information
WOOTON @ NCBI.NLM.NIH.GOV

Patterns derived from both structure and sequence. About 150 patterns, I think. I'm not sure of the availability.

Michael Gribskov (myself)
National Cancer Institute
GRIBSKOV @ NCIFCRF.GOV

Scoring matrices including position-specific gap weights (Profiles), which have been proven to detect sequence and structural patterns in a statistically significant way. Currently about 30 profiles, mostly of patterns of supersecondary structure, domains, and sequence motifs. In the near future the corresponding alignments will also be available. Publicly available by E-mail and tape.
Ref: Gribskov et al. (1990) Met. Enzymol. 183, 146-159.

Michael Gribskov
NCI - FCRDC
GRIBSKOV @ NCIFCRF.GOV
(301) 846-5031