[bionet.molbio.genbank.updates] E.coli tryptophan operon: entire DNA sequence.

GenBank-Updates@genbank.bio.net (04/12/91)

LOCUS       ECOTGP       7539 bp ds-DNA             BCT       12-APR-1991
DEFINITION  E.coli tryptophan operon: entire DNA sequence.
ACCESSION   J01714 M12471 M12472 M25593 M59208
KEYWORDS    anthranilate isomerase; anthranilate synthetase; attenuator;
            glutamine amidotransferase; isomerase; leader peptide;
            phosphoribosyl anthranilate synthetase; synthetase; transferase;
            trp operon; trpA gene; trpB gene; trpC gene; trpD gene; trpE gene;
            tryptophan synthetase.
SOURCE      Escherichia coli RNA and DNA.
  ORGANISM  Escherichia coli
            Prokaryota; Bacteria; Gracilicutes; Scotobacteria;
            Facultatively anaerobic rods; Enterobacteriaceae.
REFERENCE   1  (bases 5917 to 6133)
  AUTHORS   Platt,T. and Yanofsky,C.
  TITLE     An intercistronic region and ribosome-binding site in bacterial
            messenger RNA
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 72, 2399-2403 (1975)
  STANDARD  full staff_review
REFERENCE   2  (bases 84 to 141)
  AUTHORS   Bennett,G.N., Schweingruber,M.E., Brown,K.D., Squires,C. and
            Yanofsky,C.
  TITLE     Nucleotide sequence of region preceding trp mRNA initiation site
            and its role in promoter and operator function
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 73, 2351-2355 (1976)
  STANDARD  full staff_review
REFERENCE   3  (bases 117 to 310)
  AUTHORS   Squires,C., Lee,F., Bertrand,K., Squires,C.L., Bronson,M.J. and
            Yanofsky,C.
  TITLE     Nucleotide sequence of the 5' end of tryptophan messenger RNA of
            Escherichia coli
  JOURNAL   J. Mol. Biol. 103, 351-381 (1976)
  STANDARD  full staff_review
REFERENCE   4  (bases 230 to 272)
  AUTHORS   Bertrand,K., Korn,L.J., Lee,F. and Yanofsky,C.
  TITLE     The attenuator of the tryptophan operon of Escherichia coli:
            heterogeneous 3'-OH termini in vivo and deletion mapping of
            functions
  JOURNAL   J. Mol. Biol. 117, 227-247 (1977)
  STANDARD  full staff_review
REFERENCE   5  (bases 230 to 272)
  AUTHORS   Stauffer,G.V., Zurawski,G. and Yanofsky,C.
  TITLE     Single base-pair alterations in the Escherichia coli trp operon
            leader region that relieve transcription termination at the trp
            attenuator
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 75, 4833-4837 (1978)
  STANDARD  full staff_review
REFERENCE   6  (bases 6707 to 6863)
  AUTHORS   Wu,A.M. and Platt,T.
  TITLE     Transcription termination: nucleotide sequence at 3' end of
            tryptophan operon in Escherichia coli
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 75, 5442-5446 (1978)
  STANDARD  full staff_review
REFERENCE   7  (bases 0 to 0)
  AUTHORS   Bennett,G.N., Schweingruber,M.E., Brown,K.D., Squires,C. and
            Yanofsky,C.
  TITLE     Nucleotide sequence of the promoter-operator region of the
            tryptophan operon of Escherichia coli
  JOURNAL   J. Mol. Biol. 121, 113-137 (1978)
  STANDARD  full staff_review
REFERENCE   8  (bases 36 to 136)
  AUTHORS   Brown,K.D., Bennett,G.N., Lee,F., Schweingruber,M.E. and Yanofsky,C.
  TITLE     RNA polymerase interaction at the promoter-operator region of the
            tryptophan operon of Escherichia coli and Salmonella typhimurium
  JOURNAL   J. Mol. Biol. 121, 153-177 (1978)
  STANDARD  simple staff_entry
REFERENCE   9  (bases 2351 to 2503)
  AUTHORS   Miozzari,G.F. and Yanofsky,C.
  TITLE     Gene fusion during the evolution of the tryptophan operon in
            enterobacteriaceae
  JOURNAL   Nature 277, 486-489 (1979)
  STANDARD  full staff_review
REFERENCE   10 (bases 5932 to 6809)
  AUTHORS   Nichols,B.P. and Yanofsky,C.
  TITLE     Nucleotide sequences of trpA of Salmonella typhimurium
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 76, 5244-5248 (1979)
  STANDARD  full staff_review
REFERENCE   11 (bases 117 to 256)
  AUTHORS   Oxender,D.L., Zurawski,G. and Yanofsky,C.
  TITLE     Attenuation in the Escherichia coli tryptophan operon: role of RNA
            secondary structure involving the tryptophan codon region
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 76, 5524-5528 (1979)
  STANDARD  full staff_review
REFERENCE   12 (bases 6707 to 7335)
  AUTHORS   Wu,A.M., Chapman,A.B., Platt,T., Guarente,L.P. and Beckwith,J.
  TITLE     Deletions of distal sequence affect termination of transcription at
            the end of the tryptophan operon in E. coli
  JOURNAL   Cell 19, 829-836 (1980)
  STANDARD  full staff_review
REFERENCE   13 (bases 230 to 296)
  AUTHORS   Farnham,P.J. and Platt,T.
  TITLE     A model for transcription termination suggested by studies on the
            trp attenuator in vitro using base analogs
  JOURNAL   Cell 20, 739-748 (1980)
  STANDARD  full staff_review
REFERENCE   14 (bases 4810 to 6003)
  AUTHORS   Crawford,I.P., Nichols,B.P. and Yanofsky,C.
  TITLE     Nucleotide sequence of the trpB gene in Escherichia coli and
            Salmonella typhimurium
  JOURNAL   J. Mol. Biol. 142, 489-502 (1980)
  STANDARD  full staff_review
REFERENCE   15 (bases 1761 to 2443)
  AUTHORS   Nichols,B.P., Miozzari,G.F., van Cleemput,M., Bennett,G.N. and
            Yanofsky,C.
  TITLE     Nucleotide sequences of the trpG regions of Escherichia coli,
            Shigella dysenteriae, Salmonella typhimurium and Serratia
            marcescens
  JOURNAL   J. Mol. Biol. 142, 503-517 (1980)
  STANDARD  full staff_review
REFERENCE   16 (bases 3422 to 4824)
  AUTHORS   Christie,G.E. and Platt,T.
  TITLE     Gene structure in the tryptophan operon of Escherichia coli:
            nucleotide sequence of trpC and the flanking intercistronic regions
  JOURNAL   J. Mol. Biol. 142, 519-530 (1980)
  STANDARD  full staff_review
REFERENCE   17 (bases 5932 to 6809)
  AUTHORS   Schneider,W.P., Nichols,B.P. and Yanofsky,C.
  TITLE     Procedure for production of hybrid genes and proteins and its use
            in assessing significance of amino acid differences in homologous
            tryptophan synthetase alpha polypeptides
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 78, 2169-2173 (1981)
  STANDARD  full staff_review
REFERENCE   18 (bases 6807 to 6856; 7057 to 7119)
  AUTHORS   Wu,A.M., Christie,G.E. and Platt,T.
  TITLE     Tandem termination sites in the tryptophan operon of Escherichia
            coli
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 78, 2913-2917 (1981)
  STANDARD  full staff_review
REFERENCE   19 (bases 279 to 1843)
  AUTHORS   Nichols,B.P., van Cleemput,M. and Yanofsky,C.
  TITLE     Nucleotide sequence of Escherichia coli trpE: anthranilate
            synthetase component I contains no tryptophan residues
  JOURNAL   J. Mol. Biol. 146, 45-54 (1981)
  STANDARD  full staff_review
REFERENCE   20 (sites)
  AUTHORS   Yanofsky,C., Platt,T., Crawford,I.P., Nichols,B.P., Christie,G.E.,
            Horowitz,H., van Cleemput,M. and Wu,A.M.
  TITLE     The complete nucleotide sequence of the tryptophan operon of
            Escherichia coli
  JOURNAL   Nucleic Acids Res. 9, 6647-6668 (1981)
  STANDARD  full staff_review
REFERENCE   21 (bases 2504 to 3436)
  AUTHORS   Horowitz,H., Christie,G.E. and Platt,T.
  TITLE     Nucleotide sequence of the trpD gene, encoding anthranilate
            synthetase component II of Escherichia coli
  JOURNAL   J. Mol. Biol. 156, 245-256 (1982)
  STANDARD  full staff_review
REFERENCE   22 (bases 57 to 137)
  AUTHORS   Windass,J.D., Newton,C.R., De Maeyer-Guignard,J., Moore,V.E.,
            Markham,A.F. and Edge,M.D.
  TITLE     The construction of a synthetic Escherichia coli trp promoter and
            its use in the expression of a synthetic interferon gene
  JOURNAL   Nucleic Acids Res. 10, 6639-6657 (1982)
  STANDARD  full staff_review
REFERENCE   23 (sites)
  AUTHORS   Kolter,R. and Yanofsky,C.
  TITLE     Genetic analysis of the tryptophan operon regulatory region using
            site-directed mutagenesis
  JOURNAL   J. Mol. Biol. 175, 299-312 (1984)
  STANDARD  full staff_review
REFERENCE   24 (bases 1 to 350)
  AUTHORS   Kane,J.F., Balaban,S.M. and Bogosian,G.
  TITLE     Commercial production of bovine somatotropin in Escherichia coli
  JOURNAL   (in) Sikes,C.S. and Wheeler,A.P. (Eds.);
            Surface reactive peptides and polymers. Discovery and
            commercialization.:  In press,
            American Chemical Society, Washington, D.C. (1990)
  STANDARD  simple staff_entry
COMMENT
            [Nucleic Acids Res. 9, 6647-6668 (1981)]  review; bases 77 to 6809;
            compiled.
            [J. Mol. Biol. 175, 299-312 (1984)]  sites; mutational analysis of
            the regulatory region.
            
               The tryptophan operon of E.coli consists of a repressor (trpR),
            a promoter (trpP), an operator (trpO), an attenuator which is part
            of a leader peptide region (trpL) and five structural genes:
            trpE (anthranilate synthetase), trpD (glutamine amidotransferase
            and anthranilate 5-phosphoribosylpyrophosphate
            phosphoribosyltransferase), trpC (phosphoribosyl anthranilate
            isomerase-indole glycerol phosphate synthetase), trpB (tryptophan
            synthetase beta) and trpA (tryptophan synthetase alpha).
            
               The promoter region covers approximately 40 bases upstream from
            the mRNA initiation site (75-116); the operator lies approximately
            20 bases upstream, with two-fold axes of symmetry around 104-105
            and 109-110 ([Proc. Natl. Acad. Sci. U.S.A. 73, 2351-2355 (1976)],
            [J. Mol. Biol. 121, 113-137 (1978)],[J. Mol. Biol. 156, 245-256
            (1982)]). The attenuator region is the first 140 nucleotides
            (117-256) of the mRNA leader, a G-C rich region with a two-fold
            axis of symmetry around base 240 and an A-T rich region with its
            axis about bases 259-260; it provides a second site for control of
            transcription ([J. Mol. Biol. 117, 227-247 (1977)],[Proc. Natl.
            Acad. Sci. U.S.A. 75, 4833-4837 (1978)],[Proc. Natl. Acad. Sci.
            U.S.A. 76, 5524-5528 (1979)],[Cell 20, 739-748 (1980)]). Two mRNA
            termination regions are reported: trpT (bases 6807-6856) and trpT'
            (bases 7057-7119), the first of which bears some similarity to the
            attenuator region ([Proc. Natl. Acad. Sci. U.S.A. 78, 2913-2917
            (1981)]). A chi site for recombination is localized between bases
            2492 and 2501 and the trp-P2 promoter is located between bases
            3240 and 3280 ([J. Mol. Biol. 156, 245-256 (1982)]).
            
               The trpE gene is unusual in that it codes for no tryptophan
            residues ([J. Mol. Biol. 146, 45-54 (1981)]). The two enzymatic
            functions coded by the trpG and trpD genes in S.marcescens are
            coded by the single trpD gene in E.coli and other
            enterobacteriaceae. This appears to have occurred via base changes
            at sites 2420 and 2438. The intercistronic regions for the
            structural genes show little superfluity: the trpE-trpD and
            trpB-trpA boundaries consist of 'tgatg'; the trpD-trpC boundary is
            'taaatgatg' and the trpC-trpB boundary is 'taaggaaaggaacaatg'. All
            the cistrons show a high degree of homology with their correlates
            among the enterobacteriaceae. Sequence discrepancies in early work
            ([J. Mol. Biol. 103, 351-381 (1976)]) are corrected in later work
            from the same laboratory ([Proc. Natl. Acad. Sci. U.S.A. 76,
            5524-5528 (1979)],[Nucleic Acids Res. 9, 6647-6668 (1981)]).
            [Proc. Natl. Acad. Sci. U.S.A. 78, 2169-2173 (1981)] also reports
            the sequence of the S.typhimurium trpA region. [Nucleic Acids Res.
            9, 6647-6668 (1981)] compiles sequences from [J. Mol. Biol. 121,
            113-137 (1978)],[Nature 277, 486-489 (1979)],[Proc. Natl. Acad.
            Sci. U.S.A. 76, 5244-5248 (1979)],[J. Mol. Biol. 142, 519-530
            (1980)],[J. Mol. Biol. 142, 489-502 (1980)],[J. Mol. Biol. 142,
            503-517 (1980)],[J. Mol. Biol. 146, 45-54 (1981)],[J. Mol. Biol.
            156, 245-256 (1982)].
FEATURES             Location/Qualifiers
     -35_signal      285..290
     protein_bind    301..318
                     /bound_moiety="trpR regulatory protein"
                     /evidence=EXPERIMENTAL
     -10_signal      309..315
     mRNA            321..461
                     /note="trp mRNA (alt.) [Proc. Natl. Acad. Sci. U.S.A. 73,
                     2351-2355 (1976)],[J. Mol. Biol. 103, 351-381 (1976)],[J.
                     Mol. Biol. 121, 113-137 (1978)],[Proc. Natl. Acad. Sci.
                     U.S.A. 76, 5524-5528 (1979)],[Nucleic Acids Res. 10,
                     6639-6657 (1982)]"
     mRNA            321..7046
                     /note="trp mRNA (alt.) [Proc. Natl. Acad. Sci. U.S.A. 73,
                     2351-2355 (1976)],[J. Mol. Biol. 103, 351-381 (1976)],
                     [Proc. Natl. Acad. Sci. U.S.A. 75, 5442-5446 (1978)],[J.
                     Mol. Biol. 121, 113-137 (1978)],[Proc. Natl. Acad. Sci.
                     U.S.A. 76, 5524-5528 ("
     CDS             347..391
                     /note="trp operon leader peptide (putative)"
                     /codon_start=347
     CDS             483..2045
                     /note="anthranilate synthetase component I /nomgen='trpE'"
                     /codon_start=483
     CDS             2045..3640
                     /note="anthranilate synthetase component II: glutamine
                     amidotransferase and phosphoribosyl anthranilate
                     synthetase /nomgen='trpD'"
                     /codon_start=2045
     CDS             3644..5002
                     /note="anthranilate isomerase /nomgen='trpC'"
                     /codon_start=3644
     CDS             5413..6207
                     /note="tryptophan synthetase beta subunit /nomgen='trpB'"
                     /codon_start=5413
     CDS             6207..7013
                     /note="tryptophan synthetase alpha subunit /nomgen='trpA'"
                     /codon_start=6207
BASE COUNT     1779 a   1980 c   2022 g   1754 t      4 others
ORIGIN      9 bp upstream from HhaI site [J. Mol. Biol. 121, 113-137 (1978)].
        1 ccgggaataa gattcaacgc cagtcccgaa cgtgaaattt cctctcttgc tggcgcgatt
       61 gcagctgtgg tgtcatggtc ggtgatcgcc agggtgccga cgcgcatctc gactgcacgg
      121 tgcaccaatg cttctggcgt caggcagcca tcggaagctg tggtatggct gtgcaggtcg
      181 taaatcactg cataattcgt gtcgctcaag gcgcactccc gttctggata atgttttttg
      241 cgccgacatc ataacggttc tggcaaatat tctgaaatga gctgttgaca attaatcatc
      301 gaactagtta actagtacgc aagttcacgt aaaaagggta tcgacaatga aagcaatttt
      361 cgtactgaaa ggttggtggc gcacttcctg aaacgggcag tgtattcacc atgcgtaaag
      421 caatcagata cccagcccgc ctaatgagcg ggcttttttt tgaacaaaat tagagaataa
      481 caatgcaaac acaaaaaccg actctcgaac tgctaacctg cgaaggcgct tatcgcgaca
      541 atcccaccgc gctttttcac cagttgtgtg gggatcgtcc ggcaacgctg ctgctggaat
      601 ccgcagatat cgacagcaaa gatgatttaa aaagcctgct gctggtagac agtgcgctgc
      661 gcattacagc tttaggtgac actgtcacaa tccaggcact ttccggcaac ggcgaagccc
      721 tcctggcact actggataac gccctgcctg cgggtgtgga aagtgaacaa tcaccaaact
      781 gccgtgtgct gcgcttcccc cctgtcagtc cactgctgga tgaagacgcc cgcttatgct
      841 ccctttcggt ttttgacgct ttccgtttat tgcagaatct gttgaatgta ccgaaggaag
      901 aacgagaagc catgttcttc agcggcctgt tctcttatga ccttgtggcg ggatttgaag
      961 atttaccgca actgtcagcg gaaaataact gccctgattt ctgtttttat ctcgctgaaa
     1021 cgctgatggt gattgaccat cagaaaaaaa gcacccgtat tcaggccagc ctgtttgctc
     1081 cgaatgaaga agaaaaacaa cgtctcactg ctcgcctgaa cgaactacgt cagcaactga
     1141 ccgaagccgc gccgccgctg ccagtggttt ccgtgccgca tatgcgttgt gaatgtaatc
     1201 agagcgatga agagttcggt ggcgtagtgc gtttgttgca aaaagcgatt cgcgctggag
     1261 aaattttcca ggtggtgcca tctcgccgtt tctctctgcc ctgcccgtca ccgctggcgg
     1321 cctattacgt gctgaaaaag agtaatccca gcccgtacat gttttttatg caggataatg
     1381 atttcaccct atttggcgcg tcgccggaaa gctcgctcaa gtatgatgcc accagccgcc
     1441 agattgagat ctacccgatt gccggaacac gcccacgcgg tcgtcgcgcc gatggttcac
     1501 tggacagaga tctcgacagc cgtattgaac tggaaatgcg taccgatcat aaagagctgt
     1561 ctgaacatct gatgctggtt gatctcgccc gtaatgatct ggcacgcatt tgcacccccg
     1621 gcagccgcta cgtcgccgat ctcaccaaag ttgaccgtta ttcctatgtg atgcacctcg
     1681 tctctcgcgt agtcggcgaa ctgcgtcacg atcttgacgc cctgcacgct tatcgcgcct
     1741 gtatgaatat ggggacgtta agcggtgcgc cgaaagtacg cgctatgcag ttaattgccg
     1801 aggcggaagg tcgtcgccgc ggcagctacg gcggcgcggt aggttatttc accgcgcatg
     1861 gcgatctcga cacctgcatt gtgatccgct cggcgctggt ggaaaacggt atcgccaccg
     1921 tgcaagcggg tgctggtgta gtccttgatt ctgttccgca gtcggaagcc gacgaaaccc
     1981 gtaacaaagc ccgcgctgta ctgcgcgcta ttgccaccgc gcatcatgca caggagactt
     2041 tctgatggct gacattctgc tgctcgataa tatcgactct tttacgtaca acctggcaga
     2101 tcagttgcgc agcaatgggc ataacgtggt gatttaccgc aaccatatac cggcgcaaac
     2161 cttaattgaa cgcttggcga ccatgagtaa tccggtgctg atgctttctc ctggccccgg
     2221 tgtgccgagc gaagccggtt gtatgccgga actcctcacc cgcttgcgtg gcaagctgcc
     2281 cattattggc atttgcctcg gacatcaggc gattgtcgaa gcttacgggg gctatgtcgg
     2341 tcaggcgggc gaaattctcc acggtaaagc ctccagcatt gaacatgacg gtcaggcgat
     2401 gtttgccgga ttaacaaacc cgctgccggt ggcgcgttat cactcgctgg ttggcagtaa
     2461 cattccggcc ggtttaacca tcaacgccca ttttaatggc atggtgatgg cagtacgtca
     2521 cgatgcggat cgcgtttgtg gattccagtt ccatccggaa tccattctca ccacccaggg
     2581 cgctcgcctg ctggaacaaa cgctggcctg ggcgcagcat aaactagagc cagccaacac
     2641 gctgcaaccg attctggaaa aactgtatca ggcgcagacg cttagccaac aagaaagcca
     2701 ccagctgttt tcagcggtgg tgcgtggcga gctgaagccg gaacaactgg cggcggcgct
     2761 ggtgagcatg aaaattcgcg gtgagcaccc gaacgagatc gccggggcag caaccgcgct
     2821 actggaaaac gcagcgccgt tcccgcgccc ggattatctg tttgctgata tcgtcggtac
     2881 tggcggtgac ggcagcaaca gtatcaatat ttctaccgcc agtgcgtttg tcgccgcggc
     2941 ctgtgggctg aaagtggcga aacacggcaa ccgtagcgtc tccagtaaat ctggttcgtc
     3001 cgatctgctg gcggcgttcg gtattaatct tgatatgaac gccgataaat cgcgccaggc
     3061 gctggatgag ttaggtgtat gtttcctctt tgcgccgaag tatcacaccg gattccgcca
     3121 cgcgatgccg gttcgccagc aactgaaaac ccgcaccctg ttcaatgtgc tggggccatt
     3181 gattaacccg gcgcatccgc cgctggcgtt aattggtgtt tatagtccgg aactggtgct
     3241 gccgattgcc gaaaccttgc gcgtgctggg gtatcaacgc gcggcggtgg tgcacagcgg
     3301 cgggatggat gaagtttcat tacacgcgcc gacaatcgtt gccgaactgc atgacggcga
     3361 aattaaaagc tatcagctca ccgcagaaga ctttggcctg acaccctacc accaggagca
     3421 actggcaggc ggaacaccgg aagaaaaccg tgacatttta acacgtttgt tacaaggtaa
     3481 aggcgacgcc gcccatgaag cagccgtcgc tgcgaacgtc gccatgttaa tgcgcctgca
     3541 tggccatgaa gatctgcaag ccaatgcgca aaccgttctt gaggtactgc gcagtggttc
     3601 cgcttacgac agagtcaccg cactggcggc acgagggtaa atgatgcaaa ccgttttagc
     3661 gaaaatcgtc gcagacaagg cgatttgggt agaagcccgc aaacagcagc aaccgctggc
     3721 cagttttcag aatgaggttc agccgagcac gcgacatttt tatgatgcgc tacagggtgc
     3781 gcgcacggcg tttattctgg agtgcaagaa agcgtcgccg tcaaaaggcg tgatccgtga
     3841 tgatttcgat ccagcacgca ttgccgccat ttataaacat tacgcttcgg caatttcggt
     3901 gctgactgat gagaaatatt tcaggggtag ctttaatttc ctccccatcg tcagccaaat
     3961 cgccccgcag ccgattttat gtaaagactt cattatcgac ccttaccaga tctatctggc
     4021 gcgctattac caggccgatg cctgcttatt aatgctttca gtactggatg acgaccaata
     4081 tcgccagctt gccgccgtcg ctcacagtct ggagatgggg gtgctgaccg aagtcagtaa
     4141 tgaagaggaa caggagcgcg ccattgcatt gggagcaaag gtcgttggca tcaacaaccg
     4201 cgatctgcgt gatttgtcga ttgatctcaa ccgtacccgc gagcttgcgc cgaaactggg
     4261 gcacaacgtg acggtaatca gcgaatccgg catcaatact tacgctcagg tgcgcgagtt
     4321 aagccacttc gctaacggtt ttctgattgg ttcggcgttg atggcccatg acgatttgca
     4381 cgccgccgtg cgccgggtgt tgctgggtga gaataaagta tgtggcctga cgcgtgggca
     4441 agatgctaaa gcagcttatg acgcgggcgc gatttacggt gggttgattt ttgttgcgac
     4501 atcaccgcgt tgcgtcaacg ttgaacaggc gcaggaagtg atggctgcgg caccgttgca
     4561 gtatgttggc gtgttccgca atcacgatat tgccgatgtg gtggacaaag ctaaggtgtt
     4621 atcgctggtg gcagtgcaac tgcatggtaa tgaagaacag ctgtatatcg atacgctgcg
     4681 tgaagctctg ccagcacatg ttgccatctg gaaagcatta agcgtcggtg aaaccctgcc
     4741 cgcccgcgag tttcagcacg ttgataaata tgttttagac aacggccagg gtggaagcgg
     4801 gcaacgtttt gactggtcac tattaaatgg tcaaacgctt ggcaacgttc tgctggcggg
     4861 gggcttaggc gcagataact gcgtggaagc ggcacaaacc ggctgcgccg gacttgattt
     4921 taattctgct gtagagtcgc aaccgggcat caaagacgca cgtcttttgg cctcggtttt
     4981 ccagacgctg cgcgcatatt aaggaaagga acaatgacaa cattacttaa cccctatttt
     5041 ggtgagtttg gcggcatgta cgtgccacaa atcctgatgc ctgctctgcg ccagctggaa
     5101 gaagcttttg tcagtgcgca aaaagatcct gaatttcagg ctcagttcaa cgacctgctg
     5161 aaaaactatg ccgggcgtcc aaccgcgctg accaaatgcc agaacattac agccgggacg
     5221 aacaccacgc tgtatctcaa gcgtgaagat ttgctgcacg gcggcgcgca taaaactaac
     5281 caggtgctgg ggcaggcgtt gctggcgaag cggatgggta aaaccgaaat catcgccgaa
     5341 accggtgccg gtcagcatgg cgtggcgtcg gccctggcca gcgccctgct cggcctgaaa
     5401 tgccgtattt atatgggtgc caaagacgtt gaacgccagt cgcctaacgt ttttcgtatg
     5461 cgcttaatgg gtgcggaagt gatcccggtg catagcggtt ccgcgacgct gaaagatgcc
     5521 tgtaacgagg cgctgcgcga ctggtccggt agttacgaaa ccgcgcacta tatgctgggc
     5581 accgcagctg gcccgcatcc ttatccgacc attgtgcgtg agtttcagcg gatgattggc
     5641 gaagaaacca aagcgcagat tctggaaaga gaaggtcgcc tgccggatgc cgttatcgcc
     5701 tgtgttggcg gcggttcgaa tgccatcggc atgtttgctg atttcatcaa tgaaaccaac
     5761 gtcggcctga ttggtgtgga gccaggtggt cacggtatcg aaactggcga gcacggcgca
     5821 ccgctaaaac atggtcgcgt gggtatctat ttcggtatga aagcgccgat gatgcaaacc
     5881 gaagacgggc agattgaaga atcttactcc atctccgccg gactggattt cccgtctgtc
     5941 ggcccacaac acgcgtatct taacagcact ggacgcgctg attacgtgtc tattaccgat
     6001 gatgaagccc ttgaagcctt caaaacgctg tgcctgcacg aagggatcat cccggcgctg
     6061 gaatcctccc acgccttggc ccatgcgttg aaaatgatgc gcgaaaaccc ggataaagag
     6121 cagctactgg tggttaacct ttccggtcgc ggcgataaag acatcttcac cgttcacgat
     6181 attttgaaag cacgagggga aatctgatgg aacgctacga atctctgttt gcccagttga
     6241 aggagcgcaa agaaggcgca ttcgttcctt tcgtcacgct cggtgatccg ggcattgagc
     6301 agtcattgaa aattatcgat acgctaattg aagccggtgc tgacgcgctg gagttaggta
     6361 tccccttctc cgacccactg gcggatggcc cgacgattca aaacgccact ctgcgcgcct
     6421 ttgcggcagg tgtgactccg gcacaatgtt ttgaaatgct ggcactgatt cgccagaaac
     6481 acccgaccat tcccattggc ctgttgatgt atgccaatct ggtgtttaac aaaggcattg
     6541 atgagtttta tgcccagtgc gaaaaagtcg gcgtcgattc ggtgctggtt gccgatgtgc
     6601 cagttgaaga gtccgcgccc ttccgccagg ccgcgttgcg tcacaacgtc gcacctatct
     6661 tcatctgccc gccaaatgcc gatgacgacc tgctgcgcca gatagcctct tacggtcgtg
     6721 gttacaccta tttgctgtca cgagcaggcg tgaccggcgc agaaaaccgc gccgcgttac
     6781 ccctcaatca tctggttgcg aagctgaaag agtacaacgc tgcacctcca ttgcagggat
     6841 ttggtatttc cgccccggat caggtaaaag cagcgattga tgcaggagct gcgggcgcga
     6901 tttctggttc ggccattgtt aaaatcatcg agcaacatat taatgagcca gagaaaatgc
     6961 tggcggcact gaaagttttt gtacaaccga tgaaagcggc gacgcgcagt taatcccaca
     7021 gccgccagtt ccgctggcgg cattttaact ttctttaatg aagccggaaa aatcctaaat
     7081 tcatttaata tttatctttt taccgtttcg cttaccccgg tcgatcgtyr acttacgtca
     7141 tttttccgcc caacagtaat ataaacaaac aaattaaacc cgcaacataa caccagtaaa
     7201 atcaataatt ttctctaagt cacttattcc tcaggtaatt cttaatatat ccagaatgtt
     7261 cctcaaaata tattttccct ctatcttctc gttgcgctta atttgactaa ttctcattag
     7321 cgactaattt taatgagtgt cgacacacaa cactcatatt aatgaaacaa tgcaacgcaa
     7381 cgggagaaat aacatggccg aacatcgtgg tggttcagga aatttcgccg aagaccgtga
     7441 gaaggcatcc gacgcagccg taaaggcggt cagcatagcg gcggtaattt taaaaatgat
     7501 cgcaacgcgc atctgaagcg ggtaaaaaag gcggtyrac
//
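
The LOCUS length, the BASE COUNT line and the CDS coordinates of the record above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not a GenBank parser: the numbers are transcribed by hand from the record, the helper names are hypothetical, and the label "trpL" for the leader peptide CDS is an assumption of this sketch (the FEATURES table does not give that CDS a /nomgen).

```python
# Values transcribed by hand from the ECOTGP record above (LOCUS,
# BASE COUNT and the CDS lines of the FEATURES table).
LOCUS_LENGTH = 7539
BASE_COUNT = {"a": 1779, "c": 1980, "g": 2022, "t": 1754, "others": 4}

# (label, first base, last base); "trpL" is this sketch's own label for
# the leader peptide CDS, the other labels come from the /nomgen notes.
CDS = [
    ("trpL", 347, 391),
    ("trpE", 483, 2045),
    ("trpD", 2045, 3640),
    ("trpC", 3644, 5002),
    ("trpB", 5413, 6207),
    ("trpA", 6207, 7013),
]


def base_count_total(counts):
    """Sum every entry of the BASE COUNT line, including 'others'."""
    return sum(counts.values())


def cds_length(start, end):
    """Length in bases of an inclusive, 1-based span start..end."""
    return end - start + 1


def shared_boundaries(cds):
    """Return (upstream, downstream, base) for each pair of consecutive
    CDS spans where the last base of one is the first base of the next."""
    return [
        (a[0], b[0], a[2])
        for a, b in zip(cds, cds[1:])
        if a[2] == b[1]
    ]


if __name__ == "__main__":
    print("base count total:", base_count_total(BASE_COUNT))
    for name, start, end in CDS:
        length = cds_length(start, end)
        print(name, length, "bp,", length // 3, "codons, remainder", length % 3)
    print("shared boundary bases:", shared_boundaries(CDS))
```

As listed in the FEATURES table, each CDS span is a whole number of codons, and the two pairs that share a base (trpE/trpD at 2045, trpB/trpA at 6207) are the two boundaries the COMMENT describes as 'tgatg', where the final "a" of the stop codon is also the first base of the next start codon.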

GenBank-Updates@genbank.bio.net (05/14/91)

LOCUS       ECOTGP       7539 bp ds-DNA             BCT       14-MAY-1991
DEFINITION  E.coli tryptophan operon: entire DNA sequence.
ACCESSION   J01714 M12471 M12472 M25593 M59208
KEYWORDS    anthranilate isomerase; anthranilate synthetase; attenuator;
            glutamine amidotransferase; isomerase; leader peptide;
            phosphoribosyl anthranilate synthetase; synthetase; transferase;
            trp operon; trpA gene; trpB gene; trpC gene; trpD gene; trpE gene;
            tryptophan synthetase.
SOURCE      Escherichia coli RNA and DNA.
  ORGANISM  Escherichia coli
            Prokaryota; Bacteria; Gracilicutes; Scotobacteria;
            Facultatively anaerobic rods; Enterobacteriaceae.
REFERENCE   1  (bases 5917 to 6133)
  AUTHORS   Platt,T. and Yanofsky,C.
  TITLE     An intercistronic region and ribosome-binding site in bacterial
            messenger RNA
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 72, 2399-2403 (1975)
  STANDARD  full staff_review
REFERENCE   2  (bases 84 to 141)
  AUTHORS   Bennett,G.N., Schweingruber,M.E., Brown,K.D., Squires,C. and
            Yanofsky,C.
  TITLE     Nucleotide sequence of region preceding trp mRNA initiation site
            and its role in promoter and operator function
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 73, 2351-2355 (1976)
  STANDARD  full staff_review
REFERENCE   3  (bases 117 to 310)
  AUTHORS   Squires,C., Lee,F., Bertrand,K., Squires,C.L., Bronson,M.J. and
            Yanofsky,C.
  TITLE     Nucleotide sequence of the 5' end of tryptophan messenger RNA of
            Escherichia coli
  JOURNAL   J. Mol. Biol. 103, 351-381 (1976)
  STANDARD  full staff_review
REFERENCE   4  (bases 230 to 272)
  AUTHORS   Bertrand,K., Korn,L.J., Lee,F. and Yanofsky,C.
  TITLE     The attenuator of the tryptophan operon of Escherichia coli:
            heterogeneous 3'-OH termini in vivo and deletion mapping of
            functions
  JOURNAL   J. Mol. Biol. 117, 227-247 (1977)
  STANDARD  full staff_review
REFERENCE   5  (bases 230 to 272)
  AUTHORS   Stauffer,G.V., Zurawski,G. and Yanofsky,C.
  TITLE     Single base-pair alterations in the Escherichia coli trp operon
            leader region that relieve transcription termination at the trp
            attenuator
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 75, 4833-4837 (1978)
  STANDARD  full staff_review
REFERENCE   6  (bases 6707 to 6863)
  AUTHORS   Wu,A.M. and Platt,T.
  TITLE     Transcription termination: nucleotide sequence at 3' end of
            tryptophan operon in Escherichia coli
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 75, 5442-5446 (1978)
  STANDARD  full staff_review
REFERENCE   7  (bases 0 to 0)
  AUTHORS   Bennett,G.N., Schweingruber,M.E., Brown,K.D., Squires,C. and
            Yanofsky,C.
  TITLE     Nucleotide sequence of the promoter-operator region of the
            tryptophan operon of Escherichia coli
  JOURNAL   J. Mol. Biol. 121, 113-137 (1978)
  STANDARD  full staff_review
REFERENCE   8  (bases 36 to 136)
  AUTHORS   Brown,K.D., Bennett,G.N., Lee,F., Schweingruber,M.E. and Yanofsky,C.
  TITLE     RNA polymerase interaction at the promoter-operator region of the
            tryptophan operon of Escherichia coli and Salmonella typhimurium
  JOURNAL   J. Mol. Biol. 121, 153-177 (1978)
  STANDARD  simple staff_entry
REFERENCE   9  (bases 2351 to 2503)
  AUTHORS   Miozzari,G.F. and Yanofsky,C.
  TITLE     Gene fusion during the evolution of the tryptophan operon in
            enterobacteriaceae
  JOURNAL   Nature 277, 486-489 (1979)
  STANDARD  full staff_review
REFERENCE   10 (bases 5932 to 6809)
  AUTHORS   Nichols,B.P. and Yanofsky,C.
  TITLE     Nucleotide sequences of trpA of Salmonella typhimurium
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 76, 5244-5248 (1979)
  STANDARD  full staff_review
REFERENCE   11 (bases 117 to 256)
  AUTHORS   Oxender,D.L., Zurawski,G. and Yanofsky,C.
  TITLE     Attenuation in the Escherichia coli tryptophan operon: role of RNA
            secondary structure involving the tryptophan codon region
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 76, 5524-5528 (1979)
  STANDARD  full staff_review
REFERENCE   12 (bases 6707 to 7335)
  AUTHORS   Wu,A.M., Chapman,A.B., Platt,T., Guarente,L.P. and Beckwith,J.
  TITLE     Deletions of distal sequence affect termination of transcription at
            the end of the tryptophan operon in E. coli
  JOURNAL   Cell 19, 829-836 (1980)
  STANDARD  full staff_review
REFERENCE   13 (bases 230 to 296)
  AUTHORS   Farnham,P.J. and Platt,T.
  TITLE     A model for transcription termination suggested by studies on the
            trp attenuator in vitro using base analogs
  JOURNAL   Cell 20, 739-748 (1980)
  STANDARD  full staff_review
REFERENCE   14 (bases 4810 to 6003)
  AUTHORS   Crawford,I.P., Nichols,B.P. and Yanofsky,C.
  TITLE     Nucleotide sequence of the trpB gene in Escherichia coli and
            Salmonella typhimurium
  JOURNAL   J. Mol. Biol. 142, 489-502 (1980)
  STANDARD  full staff_review
REFERENCE   15 (bases 1761 to 2443)
  AUTHORS   Nichols,B.P., Miozzari,G.F., van Cleemput,M., Bennett,G.N. and
            Yanofsky,C.
  TITLE     Nucleotide sequences of the trpG regions of Escherichia coli,
            Shigella dysenteriae, Salmonella typhimurium and Serratia
            marcescens
  JOURNAL   J. Mol. Biol. 142, 503-517 (1980)
  STANDARD  full staff_review
REFERENCE   16 (bases 3422 to 4824)
  AUTHORS   Christie,G.E. and Platt,T.
  TITLE     Gene structure in the tryptophan operon of Escherichia coli:
            nucleotide sequence of trpC and the flanking intercistronic regions
  JOURNAL   J. Mol. Biol. 142, 519-530 (1980)
  STANDARD  full staff_review
REFERENCE   17 (bases 5932 to 6809)
  AUTHORS   Schneider,W.P., Nichols,B.P. and Yanofsky,C.
  TITLE     Procedure for production of hybrid genes and proteins and its use
            in assessing significance of amino acid differences in homologous
            tryptophan synthetase alpha polypeptides
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 78, 2169-2173 (1981)
  STANDARD  full staff_review
REFERENCE   18 (bases 6807 to 6856; 7057 to 7119)
  AUTHORS   Wu,A.M., Christie,G.E. and Platt,T.
  TITLE     Tandem termination sites in the tryptophan operon of Escherichia
            coli
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 78, 2913-2917 (1981)
  STANDARD  full staff_review
REFERENCE   19 (bases 279 to 1843)
  AUTHORS   Nichols,B.P., van Cleemput,M. and Yanofsky,C.
  TITLE     Nucleotide sequence of Escherichia coli trpE: anthranilate
            synthetase component I contains no tryptophan residues
  JOURNAL   J. Mol. Biol. 146, 45-54 (1981)
  STANDARD  full staff_review
REFERENCE   20 (sites)
  AUTHORS   Yanofsky,C., Platt,T., Crawford,I.P., Nichols,B.P., Christie,G.E.,
            Horowitz,H., van Cleemput,M. and Wu,A.M.
  TITLE     The complete nucleotide sequence of the tryptophan operon of
            Escherichia coli
  JOURNAL   Nucleic Acids Res. 9, 6647-6668 (1981)
  STANDARD  full staff_review
REFERENCE   21 (bases 2504 to 3436)
  AUTHORS   Horowitz,H., Christie,G.E. and Platt,T.
  TITLE     Nucleotide sequence of the trpD gene, encoding anthranilate
            synthetase component II of Escherichia coli
  JOURNAL   J. Mol. Biol. 156, 245-256 (1982)
  STANDARD  full staff_review
REFERENCE   22 (bases 57 to 137)
  AUTHORS   Windass,J.D., Newton,C.R., De Maeyer-Guignard,J., Moore,V.E.,
            Markham,A.F. and Edge,M.D.
  TITLE     The construction of a synthetic Escherichia coli trp promoter and
            its use in the expression of a synthetic interferon gene
  JOURNAL   Nucleic Acids Res. 10, 6639-6657 (1982)
  STANDARD  full staff_review
REFERENCE   23 (sites)
  AUTHORS   Kolter,R. and Yanofsky,C.
  TITLE     Genetic analysis of the tryptophan operon regulatory region using
            site-directed mutagenesis
  JOURNAL   J. Mol. Biol. 175, 299-312 (1984)
  STANDARD  full staff_review
REFERENCE   24 (bases 1 to 350)
  AUTHORS   Kane,J.F., Balaban,S.M. and Bogosian,G.
  TITLE     Commercial production of bovine somatotropin in Escherichia coli
  JOURNAL   (in) Sikes,C.S. and Wheeler,A.P. (Eds.);
            Surface reactive peptides and polymers. Discovery and
            commercialization.:  In press,
            American Chemical Society, Washington, D.C. (1990)
  STANDARD  simple staff_entry
COMMENT
            [Nucleic Acids Res. 9, 6647-6668 (1981)]  review; bases 77 to 6809;
            compiled.
            [J. Mol. Biol. 175, 299-312 (1984)]  sites; mutational analysis of
            the regulatory region.
            
               The tryptophan operon of E.coli consists of a repressor (trpR),
            a promoter (trpP), an operator (trpO), an attenuator which is part
            of a leader peptide region (trpL) and five structural genes: trpE
            (anthranilate synthetase), trpD (glutamine amidotransferase and
            anthranilate 5-phosphoribosylpyrophosphate
            phosphoribosyltransferase), trpC (phosphoribosyl anthranilate
            isomerase-indole glycerol phosphate synthetase), trpB (tryptophan
            synthetase beta) and trpA (tryptophan synthetase alpha).
            
               The promoter region covers approximately 40 bases upstream from
            the mRNA initiation site (75-116); the operator approximately 20
            bases upstream with two-fold axes of symmetry around 104-105 and
            109-110 ([Proc. Natl. Acad. Sci. U.S.A. 73, 2351-2355 (1976)],[J.
            Mol. Biol. 121, 113-137 (1978)],[J. Mol. Biol. 156, 245-256
            (1982)]). The attenuator region is the first 140 nucleotides
            (117-256) of the mRNA leader, a G-C rich region with a two-fold
            axis of symmetry around base 240 and an A-T rich region with its
            axis about bases 259-260; it provides a second site for control of
            transcription ([J. Mol. Biol. 117, 227-247 (1977)],[Proc. Natl.
            Acad. Sci. U.S.A. 75, 4833-4837 (1978)],[Proc. Natl. Acad. Sci.
            U.S.A. 76, 5524-5528 (1979)],[Cell 20, 739-748 (1980)]). Two mRNA
            termination regions are reported: trpT (bases 6807-6856) and trpT'
            (bases 7057-7119), the first of which bears some similarity to the
            attenuator region ([Proc. Natl. Acad. Sci. U.S.A. 78, 2913-2917
            (1981)]). A chi site for recombination is localized between bases
            2492 and 2501 and the trp-P2 promoter is located between bases
            3240 and 3280 ([J. Mol. Biol. 156, 245-256 (1982)]).
            
               The trpE gene is unusual in that it codes for no tryptophan
            residues ([J. Mol. Biol. 146, 45-54 (1981)]). The two enzymatic
            functions coded by the trpG and trpD genes in S.marcescens are
            coded by the single trpD gene in E.coli and other
            Enterobacteriaceae. This appears to have occurred via base changes
            at sites 2420 and 2438. The intercistronic regions for the
            structural genes show little superfluity: the trpE-trpD and
            trpB-trpA boundaries consist of 'tgatg'; the trpD-trpC boundary is
            'taaatgatg' and the trpC-trpB boundary is 'taaggaaaggaacaatg'. All
            the cistrons show a high degree of homology with their correlates
            among the Enterobacteriaceae. Sequence discrepancies in early work
            ([J. Mol. Biol. 103, 351-381 (1976)]) are corrected in later work
            from the same laboratory ([Proc. Natl. Acad. Sci. U.S.A. 76,
            5524-5528 (1979)],[Nucleic Acids Res. 9, 6647-6668 (1981)]).
            [Proc. Natl. Acad. Sci. U.S.A. 78, 2169-2173 (1981)] also
            sequenced the S.typhimurium trpA region. [Nucleic Acids Res. 9,
            6647-6668 (1981)] compiles sequences from [J. Mol. Biol. 121,
            113-137 (1978)],[Nature 277, 486-489 (1979)],[Proc. Natl. Acad.
            Sci. U.S.A. 76, 5244-5248 (1979)],[J. Mol. Biol. 142, 519-530
            (1980)],[J. Mol. Biol. 142, 489-502 (1980)],[J. Mol. Biol. 142,
            503-517 (1980)],[J. Mol. Biol. 146, 45-54 (1981)],[J. Mol. Biol.
            156, 245-256 (1982)].
FEATURES             Location/Qualifiers
     -35_signal      285..290
     protein_bind    301..318
                     /bound_moiety="trpR regulatory protein"
                     /evidence=EXPERIMENTAL
     -10_signal      309..315
     mRNA            321..461
                     /note="trp mRNA (alt.) [Proc. Natl. Acad. Sci. U.S.A. 73,
                     2351-2355 (1976)],[J. Mol. Biol. 103, 351-381 (1976)],[J.
                     Mol. Biol. 121, 113-137 (1978)],[Proc. Natl. Acad. Sci.
                     U.S.A. 76, 5524-5528 (1979)],[Nucleic Acids Res. 10,
                     6639-6657 (1982)]"
     mRNA            321..7046
                     /note="trp mRNA (alt.) [Proc. Natl. Acad. Sci. U.S.A. 73,
                     2351-2355 (1976)],[J. Mol. Biol. 103, 351-381 (1976)],
                     [Proc. Natl. Acad. Sci. U.S.A. 75, 5442-5446 (1978)],[J.
                     Mol. Biol. 121, 113-137 (1978)],[Proc. Natl. Acad. Sci.
                     U.S.A. 76, 5524-5528 ("
     CDS             347..391
                     /note="trp operon leader peptide (putative)"
                     /codon_start=347
     CDS             483..2045
                     /note="anthranilate synthetase component I /nomgen='trpE'"
                     /codon_start=483
     old_sequence    1991..1991
                     /location=J. Mol. Biol. 142, 503-517 (1980) | 27..27
     old_sequence    1997..1997
                     /location=J. Mol. Biol. 142, 503-517 (1980) | 33..33
     CDS             2045..3640
                     /note="anthranilate synthetase component II: glutamine
                     amidotransferase and phosphoribosyl anthranilate
                     synthetase /nomgen='trpD'"
                     /codon_start=2045
     CDS             3644..5002
                     /note="anthranilate isomerase /nomgen='trpC'"
                     /codon_start=3644
     CDS             5014..6207
                     /note="tryptophan synthetase beta subunit /nomgen='trpB'"
                     /codon_start=5014
     conflict        6153..6153
                     /location=Proc. Natl. Acad. Sci. U.S.A. 78, 2169-2173
                     (1981) | 18..18
     CDS             6207..7013
                     /note="tryptophan synthetase alpha subunit /nomgen='trpA'"
                     /codon_start=6207
BASE COUNT     1779 a   1980 c   2022 g   1754 t      4 others
ORIGIN      9 bp upstream from HhaI site [J. Mol. Biol. 121, 113-137 (1978)].
        1 ccgggaataa gattcaacgc cagtcccgaa cgtgaaattt cctctcttgc tggcgcgatt
       61 gcagctgtgg tgtcatggtc ggtgatcgcc agggtgccga cgcgcatctc gactgcacgg
      121 tgcaccaatg cttctggcgt caggcagcca tcggaagctg tggtatggct gtgcaggtcg
      181 taaatcactg cataattcgt gtcgctcaag gcgcactccc gttctggata atgttttttg
      241 cgccgacatc ataacggttc tggcaaatat tctgaaatga gctgttgaca attaatcatc
      301 gaactagtta actagtacgc aagttcacgt aaaaagggta tcgacaatga aagcaatttt
      361 cgtactgaaa ggttggtggc gcacttcctg aaacgggcag tgtattcacc atgcgtaaag
      421 caatcagata cccagcccgc ctaatgagcg ggcttttttt tgaacaaaat tagagaataa
      481 caatgcaaac acaaaaaccg actctcgaac tgctaacctg cgaaggcgct tatcgcgaca
      541 atcccaccgc gctttttcac cagttgtgtg gggatcgtcc ggcaacgctg ctgctggaat
      601 ccgcagatat cgacagcaaa gatgatttaa aaagcctgct gctggtagac agtgcgctgc
      661 gcattacagc tttaggtgac actgtcacaa tccaggcact ttccggcaac ggcgaagccc
      721 tcctggcact actggataac gccctgcctg cgggtgtgga aagtgaacaa tcaccaaact
      781 gccgtgtgct gcgcttcccc cctgtcagtc cactgctgga tgaagacgcc cgcttatgct
      841 ccctttcggt ttttgacgct ttccgtttat tgcagaatct gttgaatgta ccgaaggaag
      901 aacgagaagc catgttcttc agcggcctgt tctcttatga ccttgtggcg ggatttgaag
      961 atttaccgca actgtcagcg gaaaataact gccctgattt ctgtttttat ctcgctgaaa
     1021 cgctgatggt gattgaccat cagaaaaaaa gcacccgtat tcaggccagc ctgtttgctc
     1081 cgaatgaaga agaaaaacaa cgtctcactg ctcgcctgaa cgaactacgt cagcaactga
     1141 ccgaagccgc gccgccgctg ccagtggttt ccgtgccgca tatgcgttgt gaatgtaatc
     1201 agagcgatga agagttcggt ggcgtagtgc gtttgttgca aaaagcgatt cgcgctggag
     1261 aaattttcca ggtggtgcca tctcgccgtt tctctctgcc ctgcccgtca ccgctggcgg
     1321 cctattacgt gctgaaaaag agtaatccca gcccgtacat gttttttatg caggataatg
     1381 atttcaccct atttggcgcg tcgccggaaa gctcgctcaa gtatgatgcc accagccgcc
     1441 agattgagat ctacccgatt gccggaacac gcccacgcgg tcgtcgcgcc gatggttcac
     1501 tggacagaga tctcgacagc cgtattgaac tggaaatgcg taccgatcat aaagagctgt
     1561 ctgaacatct gatgctggtt gatctcgccc gtaatgatct ggcacgcatt tgcacccccg
     1621 gcagccgcta cgtcgccgat ctcaccaaag ttgaccgtta ttcctatgtg atgcacctcg
     1681 tctctcgcgt agtcggcgaa ctgcgtcacg atcttgacgc cctgcacgct tatcgcgcct
     1741 gtatgaatat ggggacgtta agcggtgcgc cgaaagtacg cgctatgcag ttaattgccg
     1801 aggcggaagg tcgtcgccgc ggcagctacg gcggcgcggt aggttatttc accgcgcatg
     1861 gcgatctcga cacctgcatt gtgatccgct cggcgctggt ggaaaacggt atcgccaccg
     1921 tgcaagcggg tgctggtgta gtccttgatt ctgttccgca gtcggaagcc gacgaaaccc
     1981 gtaacaaagc ccgcgctgta ctgcgcgcta ttgccaccgc gcatcatgca caggagactt
     2041 tctgatggct gacattctgc tgctcgataa tatcgactct tttacgtaca acctggcaga
     2101 tcagttgcgc agcaatgggc ataacgtggt gatttaccgc aaccatatac cggcgcaaac
     2161 cttaattgaa cgcttggcga ccatgagtaa tccggtgctg atgctttctc ctggccccgg
     2221 tgtgccgagc gaagccggtt gtatgccgga actcctcacc cgcttgcgtg gcaagctgcc
     2281 cattattggc atttgcctcg gacatcaggc gattgtcgaa gcttacgggg gctatgtcgg
     2341 tcaggcgggc gaaattctcc acggtaaagc ctccagcatt gaacatgacg gtcaggcgat
     2401 gtttgccgga ttaacaaacc cgctgccggt ggcgcgttat cactcgctgg ttggcagtaa
     2461 cattccggcc ggtttaacca tcaacgccca ttttaatggc atggtgatgg cagtacgtca
     2521 cgatgcggat cgcgtttgtg gattccagtt ccatccggaa tccattctca ccacccaggg
     2581 cgctcgcctg ctggaacaaa cgctggcctg ggcgcagcat aaactagagc cagccaacac
     2641 gctgcaaccg attctggaaa aactgtatca ggcgcagacg cttagccaac aagaaagcca
     2701 ccagctgttt tcagcggtgg tgcgtggcga gctgaagccg gaacaactgg cggcggcgct
     2761 ggtgagcatg aaaattcgcg gtgagcaccc gaacgagatc gccggggcag caaccgcgct
     2821 actggaaaac gcagcgccgt tcccgcgccc ggattatctg tttgctgata tcgtcggtac
     2881 tggcggtgac ggcagcaaca gtatcaatat ttctaccgcc agtgcgtttg tcgccgcggc
     2941 ctgtgggctg aaagtggcga aacacggcaa ccgtagcgtc tccagtaaat ctggttcgtc
     3001 cgatctgctg gcggcgttcg gtattaatct tgatatgaac gccgataaat cgcgccaggc
     3061 gctggatgag ttaggtgtat gtttcctctt tgcgccgaag tatcacaccg gattccgcca
     3121 cgcgatgccg gttcgccagc aactgaaaac ccgcaccctg ttcaatgtgc tggggccatt
     3181 gattaacccg gcgcatccgc cgctggcgtt aattggtgtt tatagtccgg aactggtgct
     3241 gccgattgcc gaaaccttgc gcgtgctggg gtatcaacgc gcggcggtgg tgcacagcgg
     3301 cgggatggat gaagtttcat tacacgcgcc gacaatcgtt gccgaactgc atgacggcga
     3361 aattaaaagc tatcagctca ccgcagaaga ctttggcctg acaccctacc accaggagca
     3421 actggcaggc ggaacaccgg aagaaaaccg tgacatttta acacgtttgt tacaaggtaa
     3481 aggcgacgcc gcccatgaag cagccgtcgc tgcgaacgtc gccatgttaa tgcgcctgca
     3541 tggccatgaa gatctgcaag ccaatgcgca aaccgttctt gaggtactgc gcagtggttc
     3601 cgcttacgac agagtcaccg cactggcggc acgagggtaa atgatgcaaa ccgttttagc
     3661 gaaaatcgtc gcagacaagg cgatttgggt agaagcccgc aaacagcagc aaccgctggc
     3721 cagttttcag aatgaggttc agccgagcac gcgacatttt tatgatgcgc tacagggtgc
     3781 gcgcacggcg tttattctgg agtgcaagaa agcgtcgccg tcaaaaggcg tgatccgtga
     3841 tgatttcgat ccagcacgca ttgccgccat ttataaacat tacgcttcgg caatttcggt
     3901 gctgactgat gagaaatatt tcaggggtag ctttaatttc ctccccatcg tcagccaaat
     3961 cgccccgcag ccgattttat gtaaagactt cattatcgac ccttaccaga tctatctggc
     4021 gcgctattac caggccgatg cctgcttatt aatgctttca gtactggatg acgaccaata
     4081 tcgccagctt gccgccgtcg ctcacagtct ggagatgggg gtgctgaccg aagtcagtaa
     4141 tgaagaggaa caggagcgcg ccattgcatt gggagcaaag gtcgttggca tcaacaaccg
     4201 cgatctgcgt gatttgtcga ttgatctcaa ccgtacccgc gagcttgcgc cgaaactggg
     4261 gcacaacgtg acggtaatca gcgaatccgg catcaatact tacgctcagg tgcgcgagtt
     4321 aagccacttc gctaacggtt ttctgattgg ttcggcgttg atggcccatg acgatttgca
     4381 cgccgccgtg cgccgggtgt tgctgggtga gaataaagta tgtggcctga cgcgtgggca
     4441 agatgctaaa gcagcttatg acgcgggcgc gatttacggt gggttgattt ttgttgcgac
     4501 atcaccgcgt tgcgtcaacg ttgaacaggc gcaggaagtg atggctgcgg caccgttgca
     4561 gtatgttggc gtgttccgca atcacgatat tgccgatgtg gtggacaaag ctaaggtgtt
     4621 atcgctggtg gcagtgcaac tgcatggtaa tgaagaacag ctgtatatcg atacgctgcg
     4681 tgaagctctg ccagcacatg ttgccatctg gaaagcatta agcgtcggtg aaaccctgcc
     4741 cgcccgcgag tttcagcacg ttgataaata tgttttagac aacggccagg gtggaagcgg
     4801 gcaacgtttt gactggtcac tattaaatgg tcaaacgctt ggcaacgttc tgctggcggg
     4861 gggcttaggc gcagataact gcgtggaagc ggcacaaacc ggctgcgccg gacttgattt
     4921 taattctgct gtagagtcgc aaccgggcat caaagacgca cgtcttttgg cctcggtttt
     4981 ccagacgctg cgcgcatatt aaggaaagga acaatgacaa cattacttaa cccctatttt
     5041 ggtgagtttg gcggcatgta cgtgccacaa atcctgatgc ctgctctgcg ccagctggaa
     5101 gaagcttttg tcagtgcgca aaaagatcct gaatttcagg ctcagttcaa cgacctgctg
     5161 aaaaactatg ccgggcgtcc aaccgcgctg accaaatgcc agaacattac agccgggacg
     5221 aacaccacgc tgtatctcaa gcgtgaagat ttgctgcacg gcggcgcgca taaaactaac
     5281 caggtgctgg ggcaggcgtt gctggcgaag cggatgggta aaaccgaaat catcgccgaa
     5341 accggtgccg gtcagcatgg cgtggcgtcg gccctggcca gcgccctgct cggcctgaaa
     5401 tgccgtattt atatgggtgc caaagacgtt gaacgccagt cgcctaacgt ttttcgtatg
     5461 cgcttaatgg gtgcggaagt gatcccggtg catagcggtt ccgcgacgct gaaagatgcc
     5521 tgtaacgagg cgctgcgcga ctggtccggt agttacgaaa ccgcgcacta tatgctgggc
     5581 accgcagctg gcccgcatcc ttatccgacc attgtgcgtg agtttcagcg gatgattggc
     5641 gaagaaacca aagcgcagat tctggaaaga gaaggtcgcc tgccggatgc cgttatcgcc
     5701 tgtgttggcg gcggttcgaa tgccatcggc atgtttgctg atttcatcaa tgaaaccaac
     5761 gtcggcctga ttggtgtgga gccaggtggt cacggtatcg aaactggcga gcacggcgca
     5821 ccgctaaaac atggtcgcgt gggtatctat ttcggtatga aagcgccgat gatgcaaacc
     5881 gaagacgggc agattgaaga atcttactcc atctccgccg gactggattt cccgtctgtc
     5941 ggcccacaac acgcgtatct taacagcact ggacgcgctg attacgtgtc tattaccgat
     6001 gatgaagccc ttgaagcctt caaaacgctg tgcctgcacg aagggatcat cccggcgctg
     6061 gaatcctccc acgccttggc ccatgcgttg aaaatgatgc gcgaaaaccc ggataaagag
     6121 cagctactgg tggttaacct ttccggtcgc ggcgataaag acatcttcac cgttcacgat
     6181 attttgaaag cacgagggga aatctgatgg aacgctacga atctctgttt gcccagttga
     6241 aggagcgcaa agaaggcgca ttcgttcctt tcgtcacgct cggtgatccg ggcattgagc
     6301 agtcattgaa aattatcgat acgctaattg aagccggtgc tgacgcgctg gagttaggta
     6361 tccccttctc cgacccactg gcggatggcc cgacgattca aaacgccact ctgcgcgcct
     6421 ttgcggcagg tgtgactccg gcacaatgtt ttgaaatgct ggcactgatt cgccagaaac
     6481 acccgaccat tcccattggc ctgttgatgt atgccaatct ggtgtttaac aaaggcattg
     6541 atgagtttta tgcccagtgc gaaaaagtcg gcgtcgattc ggtgctggtt gccgatgtgc
     6601 cagttgaaga gtccgcgccc ttccgccagg ccgcgttgcg tcacaacgtc gcacctatct
     6661 tcatctgccc gccaaatgcc gatgacgacc tgctgcgcca gatagcctct tacggtcgtg
     6721 gttacaccta tttgctgtca cgagcaggcg tgaccggcgc agaaaaccgc gccgcgttac
     6781 ccctcaatca tctggttgcg aagctgaaag agtacaacgc tgcacctcca ttgcagggat
     6841 ttggtatttc cgccccggat caggtaaaag cagcgattga tgcaggagct gcgggcgcga
     6901 tttctggttc ggccattgtt aaaatcatcg agcaacatat taatgagcca gagaaaatgc
     6961 tggcggcact gaaagttttt gtacaaccga tgaaagcggc gacgcgcagt taatcccaca
     7021 gccgccagtt ccgctggcgg cattttaact ttctttaatg aagccggaaa aatcctaaat
     7081 tcatttaata tttatctttt taccgtttcg cttaccccgg tcgatcgtyr acttacgtca
     7141 tttttccgcc caacagtaat ataaacaaac aaattaaacc cgcaacataa caccagtaaa
     7201 atcaataatt ttctctaagt cacttattcc tcaggtaatt cttaatatat ccagaatgtt
     7261 cctcaaaata tattttccct ctatcttctc gttgcgctta atttgactaa ttctcattag
     7321 cgactaattt taatgagtgt cgacacacaa cactcatatt aatgaaacaa tgcaacgcaa
     7381 cgggagaaat aacatggccg aacatcgtgg tggttcagga aatttcgccg aagaccgtga
     7441 gaaggcatcc gacgcagccg taaaggcggt cagcatagcg gcggtaattt taaaaatgat
     7501 cgcaacgcgc atctgaagcg ggtaaaaaag gcggtyrac
//
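
The feature coordinates in the record above can be checked against the
sequence with a few lines of code. What follows is a minimal sketch in
Python, not part of the GenBank entry. It assumes that feature locations
are 1-based and inclusive, that the number at the start of each ORIGIN
line is the position of the first base on that line, and that the standard
genetic code applies. The names parse_origin, span and translate are
hypothetical helpers written for this illustration; the three sequence
lines are copied from the ORIGIN block, and the spans 347..391 (leader
peptide CDS) and 2043..2047 (end of the trpE CDS through the trpD start
codon) are derived from the FEATURES table.

```python
# Illustrative sketch only: helper names are hypothetical, not a library API.

# Standard genetic code, codons enumerated in t, c, a, g order (assumption:
# the standard code is appropriate for these E. coli coding regions).
_BASES = "tcag"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
_CODONS = [x + y + z for x in _BASES for y in _BASES for z in _BASES]
CODON_TABLE = dict(zip(_CODONS, _AMINO))

# Three lines copied from the ORIGIN block of the record.
ORIGIN_EXCERPT = [
    "      301 gaactagtta actagtacgc aagttcacgt aaaaagggta tcgacaatga aagcaatttt",
    "      361 cgtactgaaa ggttggtggc gcacttcctg aaacgggcag tgtattcacc atgcgtaaag",
    "     2041 tctgatggct gacattctgc tgctcgataa tatcgactct tttacgtaca acctggcaga",
]

# Values copied from the BASE COUNT and LOCUS lines of the record.
BASE_COUNT = {"a": 1779, "c": 1980, "g": 2022, "t": 1754, "others": 4}
LOCUS_LENGTH = 7539


def parse_origin(lines):
    """Map each 1-based sequence position to its base."""
    bases = {}
    for line in lines:
        fields = line.split()
        start = int(fields[0])          # position of the first base on the line
        for offset, base in enumerate("".join(fields[1:])):
            bases[start + offset] = base
    return bases


def span(bases, first, last):
    """Return the bases from first to last, both inclusive (1-based)."""
    return "".join(bases[p] for p in range(first, last + 1))


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with y or r) give 'X'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


bases = parse_origin(ORIGIN_EXCERPT)
leader_peptide = translate(span(bases, 347, 391))   # CDS 347..391
trpE_trpD_junction = span(bases, 2043, 2047)        # trpE stop + trpD start

print(leader_peptide)
print(trpE_trpD_junction)
print(sum(BASE_COUNT.values()) == LOCUS_LENGTH)
```

With only an excerpt loaded, span raises KeyError for any coordinate
outside the copied lines; feeding the complete ORIGIN block to parse_origin
removes that limit.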