KRISTOFFERSON@BIONET-20.ARPA (12/10/87)
From: <BARKER%GUNBRF.BITNET@WISCVM.WISC.EDU>
REPLY TO LAWRENCE ABOUT SEQUENCE SIGNALS
I would be glad to cooperate on a database of functional signals in
protein sequences. Seems to me we need some "ground rules". First,
we need a way to represent the patterns in computer-compatible form.
Next, we should define the type of data to be included. Following is
a table that I made to illustrate some patterns; however, some of the
patterns reported are more complex and cannot be represented with
only the symbols shown. I will be out of the country until 15 Dec.
but would like to receive suggestions.
Table 5. Examples of Sequence Patterns
I. Simple patterns, see Key
Sequence              Function                                Reference

RGD                   Cell attachment
NX[S,T]               Asn-linked carbohydrate binding
KDEL*                 Retention of protein in endoplasmic     Munro & Pelham
                      reticulum
+GXXX[S,T,A?]         Amino-terminal myristylation            Chow et al.
<2/3>3[D,E]XSGXG      Ser-linked glycosaminoglycan binding    Bourdon et al.
+M[A,G,P,S,T,V?]      Removal of initiator Met                Flinta et al.
+M[K,R,L]             Retention of initiator Met              Flinta et al.
                      (eukaryotes)
+M[K,R,L,F,I]         Retention of initiator Met              Flinta et al.
                      (prokaryotes)
+M[D,E,N]             Acetylation of retained initiator Met
{GXX}                 Collagen-like helix
CXXCH                 Heme-binding site, cytochromes
+[D,E,F,I,K,L,Q,R,Y]  Shorter half-life, intracellular        Bachmair et al.
                      proteins
+[A,G,M,S,T,V]        Longer half-life, intracellular         Bachmair et al.
                      proteins

II. Shorter consensus patterns

CXXC(2-8)XCXXXCP      Iron-sulfur cluster binding site,
                      prokaryote ferredoxins
GKS[K,R]GFGFVXF       Binding of single-stranded RNA          Chung & Wooley
CXCXXGXXGXXC          EGF type A, second half                 Blomquist et al.
III. More complex patterns
1. Regions of 10 or more residues enriched in [P,E,S,T], having basic
residues at the boundaries but not within the region, are found in
intracellular proteins with short half-lives (Rogers et al.)
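
A rule of this kind does not fit the simple symbols, but it can be
written as a short program. The Python sketch below is only an
illustration and is not the procedure of Rogers et al. It rests on
several assumptions that are not in the text above: "basic" is taken to
mean K, R, or H; a basic residue is required on both sides of the
region; only the whole stretch between two basic residues is tested;
and "enriched" is taken to mean that at least half of the residues are
P, E, S, or T. The name pest_candidates and the 0.5 threshold are
hypothetical.

```python
import re

BASIC = "KRH"         # assumption: residues counted as basic
PEST = set("PEST")    # residues the region should be enriched in
MIN_LEN = 10          # "10 or more residues", from the text
MIN_FRACTION = 0.5    # hypothetical cutoff for "enriched"

# A run of at least MIN_LEN non-basic residues that is immediately
# preceded and immediately followed by a basic residue.
REGION = re.compile(rf"(?<=[{BASIC}])[^{BASIC}]{{{MIN_LEN},}}(?=[{BASIC}])")


def pest_candidates(seq, min_fraction=MIN_FRACTION):
    """Return (start, end, fraction) for each candidate region.

    start and end are 0-based, end-exclusive offsets into seq;
    fraction is the share of P, E, S, T residues in the region.
    """
    hits = []
    for m in REGION.finditer(seq.upper()):
        region = m.group()
        fraction = sum(res in PEST for res in region) / len(region)
        if fraction >= min_fraction:
            hits.append((m.start(), m.end(), fraction))
    return hits
```

Because only complete stretches between basic residues are scored, an
enriched sub-region inside a longer, less enriched stretch would be
missed; a sliding window would be needed to catch those.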
Key:
X              Any amino acid
[S,T]          Either S or T
+              Amino end
*              Carboxyl end
3X             XXX
(2-6)X         XX, XXX, ...., or XXXXXX
<2/3>3[D,E]    Two of the next 3 residues are either D or E
{GXX}          Repeated pattern
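
As one possible computer-compatible form, the symbols in the Key can be
translated mechanically into regular expressions. The Python sketch
below is an illustration under stated assumptions, not a proposed
standard: the function name to_regex is hypothetical, "X" is mapped to
any uppercase letter, "<2/3>" is read as "at least two of three", and
"{GXX}" is read as two or more consecutive copies. The "?" that appears
in entries such as [S,T,A?] is not defined in the Key, so the sketch
rejects it rather than guess its meaning.

```python
import re
from itertools import combinations

ANY = "[A-Z]"                      # assumption: X = any uppercase letter
ELEM = r"(?:\[[A-Z,]+\]|[A-Z])"    # one residue or a bracketed set

TOKEN = re.compile(rf"""
    (?P<kofn><(?P<k>\d+)/(?P<n>\d+)>\d+(?P<kset>{ELEM}))     # <2/3>3[D,E]
  | (?P<range>\((?P<lo>\d+)-(?P<hi>\d+)\)(?P<relem>{ELEM}))  # (2-6)X
  | (?P<count>(?P<cnt>\d+)(?P<celem>{ELEM}))                 # 3X
  | (?P<repeat>\{{(?P<body>[^{{}}]+)\}})                     # {{GXX}}
  | (?P<set>\[[A-Z,]+\])                                     # [S,T]
  | (?P<nterm>\+)                                            # amino end
  | (?P<cterm>\*)                                            # carboxyl end
  | (?P<res>[A-Z])                                           # single residue
""", re.VERBOSE)


def elem(e):
    """Translate one residue symbol or bracketed set."""
    if e == "X":
        return ANY
    if e.startswith("["):
        return "[" + e[1:-1].replace(",", "") + "]"
    return e


def k_of_n(k, n, e):
    """At least k of the next n residues match e (assumed reading)."""
    alts = ["".join(e if i in chosen else ANY for i in range(n))
            for chosen in combinations(range(n), k)]
    return "(?:" + "|".join(alts) + ")"


def to_regex(pattern):
    """Translate a pattern written with the Key's symbols to a regex."""
    out, pos = [], 0
    while pos < len(pattern):
        m = TOKEN.match(pattern, pos)
        if m is None:
            raise ValueError(f"unsupported symbol at {pos}: {pattern[pos:]!r}")
        if m.group("kofn"):
            out.append(k_of_n(int(m["k"]), int(m["n"]), elem(m["kset"])))
        elif m.group("range"):
            out.append(f"{elem(m['relem'])}{{{m['lo']},{m['hi']}}}")
        elif m.group("count"):
            out.append(f"{elem(m['celem'])}{{{m['cnt']}}}")
        elif m.group("repeat"):
            out.append(f"(?:{to_regex(m['body'])}){{2,}}")
        elif m.group("set"):
            out.append(elem(m["set"]))
        elif m.group("nterm"):
            out.append("^")
        elif m.group("cterm"):
            out.append("$")
        else:
            out.append(elem(m["res"]))
        pos = m.end()
    return "".join(out)
```

For example, to_regex("NX[S,T]") gives N[A-Z][ST] and to_regex("KDEL*")
gives KDEL$; the result can be passed to re.search to scan a sequence
written in one-letter code.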
References:
Bachmair, A., Finley, D., and Varshavsky, A., Science 234: 179-186, 1986.
Blomquist, M.C., Hunt, L.T., and Barker, W.C., Proc. Nat. Acad. Sci. USA
81: 7363-7367, 1984.
Bourdon, M.A., Krusius, T., Campbell, S., Schwartz, N.B., and Ruoslahti, E.,
Proc. Nat. Acad. Sci. USA 84: 3194-3198, 1987.
Chow, M., Newman, J.F.E., Filman, D., Hogle, J.M., Rowlands, D.J., and Brown,
F., Nature 327: 482-486, 1987.
Chung, S.Y., and Wooley, J., Proteins: Structure, Function, and Genetics,
1: 195-210, 1986.
Flinta, C., Persson, B., Jornvall, H., and von Heijne, G., Eur. J. Biochem.
154: 193-196, 1986.
Munro, S., and Pelham, H.R.B., Cell 48: 899-907, 1987.
Rogers, S., Wells, R., and Rechsteiner, M., Science 234: 364-368, 1986.

KRISTOFFERSON@BIONET-20.ARPA (12/10/87)
From: David Kristofferson <Kristofferson@BIONET-20.ARPA>

The QUEST program on BIONET has a complex query language capable of
representing many (possibly all) of the patterns or "keys" indicated in
Winona Barker's message. Several predefined patterns are in .KEY files
which are available in the <IG> directory. The QUEST query language is
documented in the IG Reference Manual and also under HELP TOPICS inside
the QUEST program itself. The BIONET consultants are also available to
assist users in designing keys.

Sincerely,

Dave Kristofferson
BIONET Resource Manager

ARPANET Address: kristofferson@bionet-20.arpa
BITNET Address: kristofferson%bionet-20.arpa@wiscvm.bitnet
-------