deanre%anders.dnet@SERVER.UGA.EDU (06/05/91)
The following is a table of some common restriction enzyme recognition sites in the Arabidopsis genome. The table was composed using Jamie Cuticchia's computer program and data I obtained through searches. The table itself was made by John McDowell using the information from the computer printouts of the data.

     ENZYME            RECOGNITION SITE   FREQUENCY   AVE SIZE (BP)
  1  Apa I             GGGCCC             0.0026      38,460
  2  Xma III           CGGCCG             0.0053      18,870
  3  Sma I             CCCGGG             0.0066      15,150
  4  Sac II            CCGCGG             0.0092      10,860
  5  Kpn I             GGTACC             0.0118       8,470
  6  Xho I             CTCGAG             0.0144       6,940
  7  Bam HI            GGATCC             0.0144       6,940
  8  Xba I             TCTAGA             0.0158       6,330
  9  Sal I/Hinc II     GTCGAC             0.0158       6,320
 10  Spe I             ACTAGT             0.0197       5,070
 11  Sac I             GAGCTC             0.0249       4,016
 12  Pst I             CTGCAG             0.0289       3,460
 13  Eco RV            GATATC             0.0302       3,310
 14  Eco RI            GAATTC             0.0368       2,590
 15  Cla I             ATCGAT             0.0394       2,530
 16  Hind III          AAGCTT             0.0617       1,620
 17  Aha III/Dra I     TTTAAA             0.0703       1,422

If you have any questions, my email address is DEANRE%gandal.dnet@SERVER.uga.edu

The table was put together using the known sequences of Arabidopsis as found in Genbank and Uembl. The computer program (which fits a Markov chain) takes these sequences and tallies trinucleotide, tetranucleotide, and hexanucleotide counts, comparing the observed counts with those expected at random. John took the hexanucleotide counts, looked for common restriction sites, and put them in table form.

Best of luck,
Rob
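
For readers who want to try the hexanucleotide counting described above on their own sequences, here is a minimal Python sketch. It is not Jamie Cuticchia's program and does not fit a Markov chain; it only counts occurrences of a recognition site and converts a frequency into an average fragment size. The function names are made up for illustration. The conversion assumes the FREQUENCY column is a percentage of positions, which is a guess based on most rows of the table roughly matching 100 / frequency, not something stated in the post.

```python
def count_site(sequence, site):
    """Count occurrences of `site` in `sequence`, allowing overlaps."""
    sequence = sequence.upper()
    site = site.upper()
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        # Step forward by one base so overlapping matches are also found.
        start = sequence.find(site, start + 1)
    return count


def site_frequency_percent(sequences, site):
    """Percentage of possible start positions at which `site` occurs.

    Every site in the table is palindromic, so scanning one strand
    is enough for those enzymes.
    """
    positions = sum(max(len(s) - len(site) + 1, 0) for s in sequences)
    if positions == 0:
        return 0.0
    hits = sum(count_site(s, site) for s in sequences)
    return 100.0 * hits / positions


def average_fragment_size(frequency_percent):
    """Average spacing between sites in bp (ASSUMES frequency is in percent)."""
    if frequency_percent <= 0:
        raise ValueError("frequency must be positive")
    return 100.0 / frequency_percent
```

For example, `average_fragment_size(0.0703)` gives about 1,422 bp, in line with the Aha III/Dra I row, and `site_frequency_percent(my_sequences, "GAATTC")` would give an Eco RI frequency for a list of sequence strings `my_sequences`.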