BAIROCH@cgecmu51.bitnet (Amos Bairoch) (04/12/90)
ENZYME DATA BANK PRE-ANNOUNCEMENT
=================================

A new "secondary" data bank is being established. It is called the 'ENZYME'
data bank and it contains the following data for each type of enzyme:

   1) EC number.
   2) Recommended name.
   3) Alternative names (if any).
   4) Catalytic activity.
   5) Cofactors (if any).
   6) Pointers to the SWISS-PROT entry or entries corresponding to that
      enzyme (if any).

We think that the ENZYME data bank will be useful to anybody working with
enzymes and will allow programs to be developed that can help with the
creation of new metabolic pathways.

With the ENZYME data bank the current situation, in terms of data bank
interconnections, will now be the following:

            +------+      +------------+
            |      |      |            | <--> ENZYME
    EPD <-- | EMBL | <--> | SWISS-PROT | ---> PDB
            |      |      |            | <--> PROSITE
            +------+      +------------+

IMPACT ON SWISS-PROT

This new data bank will have the following impact on SWISS-PROT:

   1) The existence of this data bank will make the ECINDEX.TXT document
      obsolete and it will thus be discarded.

   2) Instead of having CC (comments) lines with the topics:

         CC   -!- CATALYTIC ACTIVITY: description_of_catalytic_activity.
         CC   -!- COFACTOR: description_of_cofactor.

      the enzyme entries in SWISS-PROT will have two new types of lines:

         CA   Description_of_catalytic_activity.
         CF   Description_of_cofactor.

      These lines will be generated automatically at each release of
      SWISS-PROT from the information stored in the ENZYME data bank.

The introduction of the new line types is planned for release 16 of
SWISS-PROT (October 1990).

CREATION AND MAINTENANCE

How will this data bank be created and maintained? The majority of the data
in the ENZYME data bank comes from the IUPAC/IUB 1984 enzyme nomenclature
book [1] and its two supplements (1986 and 1989) [2,3].
Unfortunately these documents do not seem to be available on any computer
medium, and we were forced to type in the information for all the enzymes
that are represented in SWISS-PROT. There are 3056 different EC numbers; the
information concerning 30% of these enzymes has already been entered. We
have decided to type in the rest of the data (optical reading of the
documents has been attempted, but is not reliable enough).

The full data bank will probably be available in late autumn. Preliminary
versions will be distributed along with SWISS-PROT, starting with the next
release (release 14 in mid-April).

This data bank will be very easy to maintain. Except for error corrections
or new information concerning cofactors, updates of the enzyme list will
only occur when a new supplement is published (every two or three years).
The pointers to SWISS-PROT are not a problem either: the program that used
to build the ECINDEX file now automatically creates the DR lines in the
ENZYME data bank. This program will be run at every release of SWISS-PROT.

PRELIMINARY FORMAT DESCRIPTION

Global format: EMBL/SWISS-PROT like.

Line-types:

   ID   Identification line.
        Contains the EC number of the enzyme.
   DE   Description line(s).
        Contains the recommended name of the enzyme.
   AN   Alternative name(s) line(s).
        Contains the alternative name(s) of the enzyme.
   CA   Catalytic activity line(s).
        Contains the description of the catalytic activity. The format
        used is that of IUPAC/IUB.
   CF   Cofactor(s) line(s).
        Description of known cofactors.
   CC   Comments line(s).
        Free text comments.
   DR   Data bank cross-reference line(s).
        Cross-reference to the SWISS-PROT entries corresponding to the
        enzyme described.
   //   Entry termination line.

SAMPLE ENTRY

ID   1.14.17.3
DE   PEPTIDYLGLYCINE MONOOXYGENASE.
AN   PEPTIDYL ALPHA-AMIDATING ENZYME.
CA   PEPTIDYLGLYCINE + ASCORBATE + O(2) = PEPTIDYL(2-HYDROXYGLYCINE) +
CA   DEHYDROASCORBATE + H(2)O.
CC   THE PRODUCT IS UNSTABLE AND DISMUTATES TO GLYOXYLATE AND THE
CC   CORRESPONDING DESGLYCINE PEPTIDE AMIDE.
CF   COPPER.
DR   P10731, AMD$BOVIN ;  P14925, AMD$RAT   ;  P08478, AMD1$XENLA;
DR   P12890, AMD2$XENLA;
//

[1] Enzyme Nomenclature, NC-IUB, Academic Press, New York, (1984).
[2] Supp. 1: Corrections and Additions, Eur. J. Biochem. 157:1-26(1986).
[3] Supp. 2: Corrections and Additions, Eur. J. Biochem. 179:489-533(1989).

----------------------------------

This is a pre-announcement; feedback is welcomed and encouraged.

*****************************************************************************
* Amos Bairoch                * Email: bairoch@cgecmu51                     *
* Dept. Medical Biochemistry  * Tel  : +(41 22) 61 84 92                    *
* CMU                         ***********************************************
* 1, rue Michel Servet        * Greer's third law:                          *
* 1211 Geneva 4               * To err is human, but to really foul things  *
* Switzerland                 * up you need a computer.                     *
*****************************************************************************
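
Editorial illustration (not part of the original announcement): the
preliminary format above is simple enough that a reader could be sketched
in a few lines. The following is a minimal sketch only, assuming that each
line starts with a two-character line code followed by whitespace, that
continuation lines repeat the code, and that DR items are "accession, name"
pairs separated by semicolons, as in the sample entry. The names
parse_entries and cross_references are hypothetical, and the column spacing
in SAMPLE is an assumption.

```python
from collections import defaultdict

# The sample entry from the announcement; exact column spacing is assumed.
SAMPLE = """\
ID   1.14.17.3
DE   PEPTIDYLGLYCINE MONOOXYGENASE.
AN   PEPTIDYL ALPHA-AMIDATING ENZYME.
CA   PEPTIDYLGLYCINE + ASCORBATE + O(2) = PEPTIDYL(2-HYDROXYGLYCINE) +
CA   DEHYDROASCORBATE + H(2)O.
CC   THE PRODUCT IS UNSTABLE AND DISMUTATES TO GLYOXYLATE AND THE
CC   CORRESPONDING DESGLYCINE PEPTIDE AMIDE.
CF   COPPER.
DR   P10731, AMD$BOVIN ;  P14925, AMD$RAT   ;  P08478, AMD1$XENLA;
DR   P12890, AMD2$XENLA;
//
"""


def parse_entries(text):
    """Yield one dict per entry, mapping line code -> joined line text."""
    fields = defaultdict(list)
    for line in text.splitlines():
        if line.startswith("//"):
            # Entry termination line: emit what has been collected so far.
            if fields:
                yield {code: " ".join(parts) for code, parts in fields.items()}
            fields = defaultdict(list)
        elif line.strip():
            # Split the line code from the rest; repeated codes are
            # treated as continuation lines and joined with a space.
            code, _, rest = line.partition(" ")
            fields[code].append(rest.strip())


def cross_references(entry):
    """Split the DR text into (accession, entry_name) pairs."""
    pairs = []
    for item in entry.get("DR", "").split(";"):
        if item.strip():
            accession, name = (part.strip() for part in item.split(","))
            pairs.append((accession, name))
    return pairs


entry = next(parse_entries(SAMPLE))
print(entry["ID"], entry["CF"])
print(cross_references(entry))
```

Joining continuation lines with a single space keeps a two-line CA
description readable as one reaction string; a reader that needs the
original line breaks would keep the list of parts instead.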