[bionet.software] ProSite Database

lfk@athena.mit.edu (Lee F Kolakowski) (07/13/90)

I have written some software, running under Unix(tm) or MSDOS, that
searches a protein sequence for all of the patterns described in
Amos Bairoch's Prosite database.

It requires the following:
	AWK (new awk or Gnu GAWK, GAWK is free, ftp'able software)
	The prosite database file prosite.doc

Basically, I have translated all the patterns to Unix-style regular
expressions and written some simple awk scripts that search protein
sequences with them.
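
To give a rough idea of what such a translation involves, here is a
minimal sketch. It is written in Python rather than awk and is not the
ProSearch code. The function names prosite_to_regex and find_sites are
made up for illustration. It assumes the usual Prosite pattern
notation (elements separated by "-", "x" for any residue, "[ST]" for
alternatives, "{P}" for excluded residues, "(n)" or "(n,m)" for repeat
counts) and ignores anchors. The positions it reports are 1-based with
an inclusive end, which is my own choice and need not agree with the
From->To columns in the sample output further down.

```python
import re


def prosite_to_regex(pattern):
    """Translate a Prosite-style pattern into a regular expression string.

    Illustrative only: handles x, [..], {..} and repeat counts, not anchors.
    """
    pieces = []
    for element in pattern.rstrip(".").split("-"):
        # Split an element such as "x(2,4)" into its core and repeat counts.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.groups()
        if core == "x":
            core = "."                        # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # excluded residues
        # "[ST]" and single letters are already valid regex syntax.
        if high is not None:
            core += "{" + low + "," + high + "}"
        elif low is not None:
            core += "{" + low + "}"
        pieces.append(core)
    return "".join(pieces)


def find_sites(sequence, pattern):
    """Return (start, end) pairs, 1-based and inclusive, for every match.

    A lookahead is used so that overlapping matches are all reported.
    """
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # N-{P}-[ST]-{P} is the N-glycosylation site pattern (PS00001).
    print(prosite_to_regex("N-{P}-[ST]-{P}"))
    print(find_sites("MNGTEGP", "N-{P}-[ST]-{P}"))
```

Run on the toy sequence MNGTEGP, the sketch turns the pattern into
N[^P][ST][^P] and finds a single site spanning residues 2 to 5.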

NOTE: the MKS (Mortice Kern Systems) version of AWK is required for
use on MSDOS machines. The MSDOS version of GAWK can't handle 300+
regular expressions. I have no connection with MKS; I simply think it
is a good product.

I would like to post a shell archive of this to bionet.software, if
there is interest. 


Here is a brief example of the output (short form):

Prosite Database -- Release 5.0 of April 1990 Copyright: Amos Bairoch
ProSearch Software -- Release 0.1beta -- Copyright: Lee Kolakowski
The following patterns are in < bov.ops >:

Access#	    From->To	Name
_______	    ________	____
PS00001	        2->6	ASN_GLYCOSYLATION
PS00001	      16->20	ASN_GLYCOSYLATION
PS00001	    201->205	ASN_GLYCOSYLATION
PS00005	      14->17	PKC_PHOSPHO_SITE
PS00005	    230->233	PKC_PHOSPHO_SITE
PS00005	    244->247	PKC_PHOSPHO_SITE
PS00006	      22->26	CK2_PHOSPHO_SITE
PS00006	    194->198	CK2_PHOSPHO_SITE
PS00006	    199->203	CK2_PHOSPHO_SITE
PS00006	    230->234	CK2_PHOSPHO_SITE
PS00006	    339->343	CK2_PHOSPHO_SITE
PS00007	      21->30	TYR_PHOSPHO_SITE
PS00008	      89->95	MYRISTYL
PS00008	    121->127	MYRISTYL
PS00008	    157->163	MYRISTYL
PS00008	    183->189	MYRISTYL
PS00013	    157->168	PROKAR_LIPOPROTEIN
PS00237	      68->85	G_PROTEIN_RECEPTOR
PS00238	    296->314	OPSIN


Please e-mail me if you are interested.

--

Frank Kolakowski 

======================================================================
|lfk@athena.mit.edu                     ||      Lee F. Kolakowski    |
|lfk@eastman2.mit.edu                   ||	M.I.T.		     |
|kolakowski@wccf.mit.edu                ||	Dept of Chemistry    |
|lfk@mbio.med.upenn.edu		        ||	Room 18-506	     |
|lfk@hx.lcs.mit.edu                     ||	77 Massachusetts Ave.|
|AT&T:  1-617-253-1866                  ||	Cambridge, MA 02139  |
|--------------------------------------------------------------------|
|                         #include <woes.h>         		     |
|		           One-Liner Here!                           |
======================================================================