[bionet.genome.arabidopsis] table

deanre%anders.dnet@SERVER.UGA.EDU (06/05/91)

     The following table lists some common restriction enzyme recognition 
sites in the Arabidopsis genome. It was compiled using Jamie Cuticchia's
computer program and data I obtained through searches.  The table itself was
made by John McDowell from the program's printouts of the data.

             ENZYME         RECOGNITION SITE    FREQUENCY (%)   AVE SIZE (BP)

1            Apa I          GGGCCC              0.0026          38,460
2            Xma III        CGGCCG              0.0053          18,870
3            Sma I          CCCGGG              0.0066          15,150
4            Sac II         CCGCGG              0.0092          10,860
5            Kpn I          GGTACC              0.0118          8,470
6            Xho I          CTCGAG              0.0144          6,940
7            Bam HI         GGATCC              0.0144          6,940
8            Xba I          TCTAGA              0.0158          6,330
9            SalI/HincII    GTCGAC              0.0158          6,320
10           Spe I          ACTAGT              0.0197          5,070
11           Sac I          GAGCTC              0.0249          4,016
12           Pst I          CTGCAG              0.0289          3,460
13           Eco RV         GATATC              0.0302          3,310
14           Eco RI         GAATTC              0.0386          2,590
15           Cla I          ATCGAT              0.0394          2,530
16           Hind III       AAGCTT              0.0617          1,620
17           AhaIII/DraI    TTTAAA              0.0703          1,422
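
The two numeric columns appear to be related by AVE SIZE = 100 / FREQUENCY,
which suggests the frequency is given in percent (sites per 100 bp).  That
reading is an inference from the numbers, not something stated by the program's
output.  A minimal Python sketch of the arithmetic under that assumption (the
function name is made up for illustration):

```python
# Assumption: FREQUENCY is a percentage (sites per 100 bp), inferred from the table.
def average_fragment_size(frequency_percent):
    """Mean spacing in bp between sites that occur at the given percent frequency."""
    return 100.0 / frequency_percent

# A few rows copied from the table: (enzyme, frequency, listed average size in bp)
rows = [
    ("Apa I", 0.0026, 38460),
    ("Bam HI", 0.0144, 6940),
    ("Hind III", 0.0617, 1620),
]

for enzyme, freq, listed in rows:
    computed = average_fragment_size(freq)
    print(f"{enzyme:10s} computed {computed:9.0f} bp, listed {listed} bp")
```

The listed sizes look rounded or truncated to the nearest 10 bp, so small
differences from the computed values are to be expected.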


If you have any questions, my email address is 
DEANRE%gandal.dnet@SERVER.uga.edu

The table was put together using the known sequences of Arabidopsis as found
in GenBank and EMBL.  The computer program (which fits a Markov chain) takes 
these sequences and counts trinucleotides, tetranucleotides, and 
hexanucleotides (comparing the observed counts with those expected).  John took
the hexanucleotide counts and looked for common restriction sites, which he then
put in table form.
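
For anyone who wants to try this kind of count on their own sequences, here is
a rough Python sketch.  It is not Jamie's program.  The order of the Markov
chain is not stated above, so the sketch assumes a first-order chain, where the
expected count of a word is the product of its dinucleotide counts divided by
the product of the counts of its interior bases.  All function names and the
toy sequence are made up for illustration.

```python
from collections import Counter

def count_kmers(seq, k):
    """Count all overlapping k-mers in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def expected_count_markov1(word, mono, di):
    """Expected count of word under a first-order Markov chain (an assumption),
    estimated from mononucleotide and dinucleotide counts."""
    numerator = 1.0
    for i in range(len(word) - 1):
        numerator *= di[word[i:i + 2]]      # each adjacent pair in the word
    denominator = 1.0
    for base in word[1:-1]:
        denominator *= mono[base]           # each interior base
    return numerator / denominator if denominator else 0.0

def site_report(seq, site):
    """Return (observed count, expected count, observed frequency in percent)."""
    seq = seq.upper()
    mono = count_kmers(seq, 1)
    di = count_kmers(seq, 2)
    observed = count_kmers(seq, len(site))[site]
    expected = expected_count_markov1(site, mono, di)
    positions = len(seq) - len(site) + 1
    return observed, expected, 100.0 * observed / positions

# Toy sequence for illustration only, not real Arabidopsis data.
toy = "GAATTCAAGCTTGAATTCTTTAAAGGATCCGAATTC"
observed, expected, percent = site_report(toy, "GAATTC")
print(f"Eco RI: observed {observed}, expected {expected:.2f}, {percent:.2f}%")
```

The sites in the table are all palindromes, so counting on one strand is
enough; a non-palindromic site would also need its reverse complement counted.
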
                                         Best of luck,
                                         Rob