lfk@athena.mit.edu (Lee F Kolakowski) (07/13/90)
I have written some software that runs under Unix(tm) or MSDOS and
searches a protein sequence for all of the patterns described in
Amos Bairoch's Prosite database.

It requires the following:

    AWK (new awk or Gnu GAWK; GAWK is free, ftp'able software)
    The prosite database file prosite.doc

Basically, I have translated all the patterns to Unix style regular
expressions, and created some simple awk scripts to search protein
sequences.

NOTE: the MKS (Mortice Kern Systems) version of AWK for MSDOS machines
is required for MSDOS use. The MSDOS version of GAWK can't handle 300+
regular expressions. I have no links with MKS except that it is a good
product.

I would like to post a shell archive of this to bionet.software, if
there is interest.

Here is a brief example of the output (short form):

Prosite Database -- Release 5.0 of April 1990
Copyright: Amos Bairoch
ProSearch Software -- Release 0.1beta -- Copyright: Lee Kolakowski

The following patterns are in < bov.ops >:

Access#   From->To   Name
_______   ________   ____
PS00001   2->6       ASN_GLYCOSYLATION
PS00001   16->20     ASN_GLYCOSYLATION
PS00001   201->205   ASN_GLYCOSYLATION
PS00005   14->17     PKC_PHOSPHO_SITE
PS00005   230->233   PKC_PHOSPHO_SITE
PS00005   244->247   PKC_PHOSPHO_SITE
PS00006   22->26     CK2_PHOSPHO_SITE
PS00006   194->198   CK2_PHOSPHO_SITE
PS00006   199->203   CK2_PHOSPHO_SITE
PS00006   230->234   CK2_PHOSPHO_SITE
PS00006   339->343   CK2_PHOSPHO_SITE
PS00007   21->30     TYR_PHOSPHO_SITE
PS00008   89->95     MYRISTYL
PS00008   121->127   MYRISTYL
PS00008   157->163   MYRISTYL
PS00008   183->189   MYRISTYL
PS00013   157->168   PROKAR_LIPOPROTEIN
PS00237   68->85     G_PROTEIN_RECEPTOR
PS00238   296->314   OPSIN

Please e-mail me if you are interested.

--
Frank Kolakowski
======================================================================
|lfk@athena.mit.edu              || Lee F. Kolakowski                |
|lfk@eastman2.mit.edu            || M.I.T.                           |
|kolakowski@wccf.mit.edu         || Dept of Chemistry                |
|lfk@mbio.med.upenn.edu          || Room 18-506                      |
|lfk@hx.lcs.mit.edu              || 77 Massachusetts Ave.            |
|AT&T: 1-617-253-1866            || Cambridge, MA 02139              |
|--------------------------------------------------------------------|
|                         #include <woes.h>                          |
|                          One-Liner Here!                           |
======================================================================
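
The approach described in the post (translate each Prosite pattern to a
Unix-style regular expression, then scan the sequence with it) can be
sketched roughly as follows. This is a Python illustration only, not the
author's awk scripts. The function names, the toy sequence, and the
1-based inclusive From/To convention are assumptions made here (the To
column in the sample output may follow a different convention). The three
pattern strings are written in PROSITE's pattern notation and should be
checked against prosite.doc.

```python
import re

# Accession numbers and names as in the sample output; pattern strings in
# PROSITE notation (verify against prosite.doc before relying on them).
EXAMPLE_PATTERNS = [
    ("PS00001", "ASN_GLYCOSYLATION", "N-{P}-[ST]-{P}."),
    ("PS00005", "PKC_PHOSPHO_SITE", "[ST]-x-[RK]."),
    ("PS00006", "CK2_PHOSPHO_SITE", "[ST]-x(2)-[DE]."),
]


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a regular expression string."""
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):   # pattern anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):     # pattern anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (2) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":               # any residue
            atom = "."
        elif core.startswith("["):    # any one of the listed residues
            atom = core
        elif core.startswith("{"):    # any residue except those listed
            atom = "[^" + core[1:-1] + "]"
        else:                         # a single literal residue
            atom = re.escape(core)
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"
        parts.append(prefix + atom + repeat + suffix)
    return "".join(parts)


def find_sites(sequence, patterns=EXAMPLE_PATTERNS):
    """Return (accession, start, end, name) for every match.

    Positions are 1-based and inclusive (an assumed convention). The
    lookahead wrapper lets overlapping matches be reported as well.
    """
    hits = []
    for accession, name, pattern in patterns:
        regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
        for m in regex.finditer(sequence):
            hits.append((accession, m.start(1) + 1, m.end(1), name))
    return hits


# Hypothetical toy sequence, not taken from bov.ops.
TOY_SEQUENCE = "MNGSEATAREL"

for accession, start, end, name in find_sites(TOY_SEQUENCE):
    print(f"{accession}  {start}->{end}  {name}")
```

Compiling one expression per pattern keeps each hit tied to its
accession number, which is what the short-form report needs.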