KRISTOFFERSON@BIONET-20.ARPA (12/10/87)
From: <BARKER%GUNBRF.BITNET@WISCVM.WISC.EDU>

REPLY TO LAWRENCE ABOUT SEQUENCE SIGNALS

I would be glad to cooperate on a database of functional signals in protein sequences. It seems to me that we need some "ground rules". First, we need a way to represent the patterns in computer-compatible form. Next, we should define the type of data to be included.

Following is a table that I made to illustrate some patterns; however, some of the patterns reported are more complex and cannot be represented with only the symbols shown. I will be out of the country until 15 Dec. but would like to receive suggestions.

Table 5. Examples of Sequence Patterns

I. Simple patterns (see Key)

Sequence               Function                                                  Reference
RGD                    Cell attachment
NX[S,T]                Asn-linked carbohydrate binding
KDEL*                  Retention of protein in endoplasmic reticulum             Munro & Pelham
+GXXX[S,T,A?]          Amino-terminal myristylation                              Chow et al.
<2/3>3[D,E]XSGXG       Ser-linked glycosaminoglycan binding                      Bourdon et al.
+M[A,G,P,S,T,V?]       Removal of initiator Met                                  Flinta et al.
+M[K,R,L]              Retention of initiator Met (eukaryotes)                   Flinta et al.
+M[K,R,L,F,I]          Retention of initiator Met (prokaryotes)                  Flinta et al.
+M[D,E,N]              Acetylation of retained initiator Met
{GXX}                  Collagen-like helix
CXXCH                  Heme-binding site, cytochromes
+[D,E,F,I,K,L,Q,R,Y]   Shorter half-life, intracellular proteins                 Bachmair et al.
+[A,G,M,S,T,V]         Longer half-life, intracellular proteins                  Bachmair et al.

II. Shorter consensus patterns

Sequence               Function                                                  Reference
CXXC(2-8)XCXXXCP       Iron-sulfur cluster binding site, prokaryote ferredoxins
GKS[K,R]GFGFVXF        Binding of single-stranded RNA                            Chung & Wooley
CXCXXGXXGXXC           EGF type A, second half                                   Blomquist et al.

III. More complex patterns

1. Regions of 10 or more residues enriched in [P,E,S,T], having basic residues at the boundaries but not within the region, are found in intracellular proteins with short half-lives (Rogers et al.)

Key:

X              Any amino acid
[S,T]          Either S or T
+              Amino end
*              Carboxyl end
3X             XXX
(2-6)X         XX, XXX, ...., or XXXXXX
<2/3>3[D,E]    Two of the next 3 residues are either D or E
{GXX}          Repeated pattern

References:

Bachmair, A., Finley, D., and Varshavsky, A., Science 234: 179-186, 1986.
Blomquist, M.C., Hunt, L.T., and Barker, W.C., Proc. Nat. Acad. Sci. USA 81: 7363-7367, 1984.
Bourdon, M.A., Krusius, T., Campbell, S., Schwartz, N.B., and Ruoslahti, E., Proc. Nat. Acad. Sci. USA 84: 3194-3198, 1987.
Chow, M., Newman, J.F.E., Filman, D., Hogle, J.M., Rowlands, D.J., and Brown, F., Nature 327: 482-486, 1987.
Chung, S.Y., and Wooley, J., Proteins: Structure, Function, and Genetics 1: 195-210, 1986.
Flinta, C., Persson, B., Jornvall, H., and von Heijne, G., Eur. J. Biochem. 154: 193-196, 1986.
Munro, S., and Pelham, H.R.B., Cell 48: 899-907, 1987.
Rogers, S., Wells, R., and Rechsteiner, M., Science 234: 364-368, 1986.

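
As an illustration of how the symbols in the Key could be put into computer-compatible form, the following is a minimal sketch in Python that translates a pattern written in this notation into a regular expression. It is not part of the original proposal, and the function names (to_regex, element) are hypothetical. Two points the Key leaves open are handled by assumption: a "?" inside brackets (as in [S,T,A?]) is taken to mark a doubtful residue that is still accepted, and {GXX} is taken to mean two or more consecutive copies. The <2/3> form is read as "at least two of the next three".

```python
import re
from itertools import combinations

# One alternative per symbol in the Key; ELEM is a single residue,
# X, or a bracketed set such as [S,T] or [S,T,A?].
ELEM = r"(?:\[[A-Z,?]+\]|[A-Z])"
TOKEN = re.compile(
    r"(?P<mofn><(?P<m>\d+)/(?P<n>\d+)>\d+(?P<mel>" + ELEM + r"))"
    r"|(?P<range>\((?P<lo>\d+)-(?P<hi>\d+)\)(?P<rel>" + ELEM + r"))"
    r"|(?P<count>(?P<k>\d+)(?P<cel>" + ELEM + r"))"
    r"|(?P<repeat>\{(?P<body>[^{}]+)\})"
    r"|(?P<nterm>\+)"
    r"|(?P<cterm>\*)"
    r"|(?P<el>" + ELEM + r")"
)


def element(text):
    """Translate one residue, X, or bracketed set into regex form."""
    if text == "X":
        return "."
    if text.startswith("["):
        # Assumption: '?' marks a doubtful residue; it is still accepted.
        return "[" + "".join(c for c in text if c.isalpha()) + "]"
    return text


def to_regex(pattern):
    """Translate a pattern in the Key's notation into a regex string."""
    out = []
    pos = 0
    while pos < len(pattern):
        tok = TOKEN.match(pattern, pos)
        if tok is None:
            raise ValueError("cannot parse %r at position %d" % (pattern, pos))
        if tok.group("mofn"):
            need, span = int(tok.group("m")), int(tok.group("n"))
            hit = element(tok.group("mel"))
            # Enumerate which positions must match; the rest are free.
            alts = [
                "".join(hit if i in chosen else "." for i in range(span))
                for chosen in combinations(range(span), need)
            ]
            out.append("(?:" + "|".join(alts) + ")")
        elif tok.group("range"):
            out.append("%s{%s,%s}" % (element(tok.group("rel")),
                                      tok.group("lo"), tok.group("hi")))
        elif tok.group("count"):
            out.append("%s{%s}" % (element(tok.group("cel")), tok.group("k")))
        elif tok.group("repeat"):
            # Assumption: "repeated" means at least two consecutive copies.
            out.append("(?:" + to_regex(tok.group("body")) + "){2,}")
        elif tok.group("nterm"):
            out.append("^")
        elif tok.group("cterm"):
            out.append("$")
        else:
            out.append(element(tok.group("el")))
        pos = tok.end()
    return "".join(out)


if __name__ == "__main__":
    for p in ["NX[S,T]", "KDEL*", "+GXXX[S,T,A?]", "CXXC(2-8)XCXXXCP",
              "<2/3>3[D,E]XSGXG", "{GXX}"]:
        print(p, "->", to_regex(p))
```

A translated pattern can then be applied to a one-letter sequence string with re.search(to_regex(pattern), sequence). The region-based signal under III (Rogers et al.) depends on composition rather than position and is outside what this sketch handles.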
KRISTOFFERSON@BIONET-20.ARPA (12/10/87)
From: David Kristofferson <Kristofferson@BIONET-20.ARPA>

The QUEST program on BIONET has a complex query language capable of representing many (possibly all) of the patterns or "keys" indicated in Winona Barker's message. Several predefined patterns are in .KEY files which are available in the <IG> directory.

The QUEST query language is documented in the IG Reference Manual and also under HELP TOPICS inside the QUEST program itself. The BIONET consultants are also available to assist users in designing keys.

Sincerely,

Dave Kristofferson
BIONET Resource Manager

ARPANET Address: kristofferson@bionet-20.arpa
BITNET Address: kristofferson%bionet-20.arpa@wiscvm.bitnet
-------