GenBank-Updates@genbank.bio.net (05/30/91)
LOCUS       HUMKPNI03    6238 bp ds-DNA             PRI       30-MAY-1991
DEFINITION  Human KpnI repetitive sequence (T-betaG41) 3kb downstream of
            beta-globin gene
ACCESSION   X03145
KEYWORDS    Kpn repetitive sequence; inverted repeat; repetitive sequence;
            unidentified reading frame.
SOURCE      Homo sapiens DNA.
  ORGANISM  Homo sapiens
            Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia;
            Theria; Eutheria; Primates; Haplorhini; Catarrhini; Hominidae.
REFERENCE   1  (bases 1 to 6238)
  AUTHORS   Hattori,M., Hidaka,S. and Sakaki,Y.
  TITLE     Sequence analysis of a KpnI family member near the 3'end of
            human beta-globin gene
  JOURNAL   Nucleic Acids Res. 13, 7813-7827 (1985)
  STANDARD  full automatic
COMMENT     Data kindly reviewed (30-SEP-1986) by Y. Sakaki
            From EMBL entry HSKPNI03; dated 06-JUL-1989.
FEATURES             Location/Qualifiers
     misc_feature    1..6125
                     /note="T-betaG41 seq. (A-rich 3'-end excluded)"
     CDS             192..773
                     /note="pot. ORF VI"
                     /codon_start=192
     repeat_unit     211..220
                     /note="inverted repeat a"
     repeat_region   566..576
                     /note="direct repeat 1"
     repeat_region   573..584
                     /note="direct repeat 2"
     repeat_unit     698..706
                     /note="inverted repeat a'"
     repeat_unit     808..817
                     /note="inverted repeat b"
     repeat_region   822..832
                     /note="direct repeat 3"
     repeat_region   1009..1024
                     /note="direct repeat 4"
     repeat_region   1116..1127
                     /note="direct repeat 5"
     repeat_region   1312..1322
                     /note="direct repeat 6"
     repeat_region   1316..1326
                     /note="direct repeat 7"
     misc_feature    1913..1918
                     /note="seq. homologous to polyA signal"
     repeat_unit     1924..1933
                     /note="inverted repeat b'"
     CDS             2196..2999
                     /note="pot. ORF I"
                     /codon_start=2196
     repeat_region   2305..2315
                     /note="direct repeat 8"
     misc_feature    2441..2446
                     /note="seq. homologous to TATA-box"
     repeat_region   2815..2825
                     /note="direct repeat 9"
     misc_feature    2909..2914
                     /note="seq. homologous to polyA signal"
     repeat_region   2938..2948
                     /note="direct repeat 10"
     repeat_region   2967..2977
                     /note="direct repeat 11"
     repeat_region   3088..3098
                     /note="direct repeat 11"
     misc_feature    3277..3282
                     /note="seq. homologous to polyA signal"
     repeat_region   3278..3288
                     /note="direct repeat 12"
     CDS             3282..3696
                     /note="pot. ORF II"
                     /codon_start=3282
     misc_feature    3400..3409
                     /note="(PyPu)n, pot. Z-DNA seq."
     repeat_region   3452..3462
                     /note="direct repeat 13"
     repeat_region   3671..3681
                     /note="direct repeat 8"
     repeat_region   3691..3701
                     /note="direct repeat 3"
     CDS             3785..4204
                     /note="pot. ORF III"
                     /codon_start=3785
     repeat_region   3865..3876
                     /note="direct repeat 10"
     misc_feature    3950..3955
                     /note="seq. homologous to polyA signal"
     repeat_region   4136..4146
                     /note="direct repeat 6"
     misc_feature    4290..4295
                     /note="seq. homologous to TATA-box"
     repeat_region   4329..4339
                     /note="direct repeat 13"
     repeat_region   4336..4346
                     /note="direct repeat 1"
     misc_feature    4394..4399
                     /note="seq. homologous to polyA signal"
     repeat_region   4444..4454
                     /note="direct repeat 7"
     misc_feature    4454..4461
                     /note="seq. homologous to enhancer core cons. seq."
     misc_feature    4472..4996
                     /note="pot. ORF IV"
     repeat_region   4690..4701
                     /note="direct repeat 5"
     repeat_region   5010..5021
                     /note="direct repeat 4"
     repeat_region   5116..5127
                     /note="direct repeat 2"
     CDS             5428..5883
                     /note="pot. ORF V"
                     /codon_start=5428
     misc_feature    5647..5652
                     /note="seq. homologous to TATA-box"
     misc_feature    5670..5685
                     /note="(PyPu)n, pot. Z-DNA sequence"
     misc_feature    complement(5686..5691)
                     /note="seq. homologous to polyA signal"
     repeat_region   5753..5763
                     /note="direct repeat 9"
     misc_feature    5769..5780
                     /note="(PyPu)n, pot. Z-DNA seq."
     repeat_region   5800..5810
                     /note="direct repeat 12"
     misc_feature    5882..5887
                     /note="seq. homologous to polyA signal"
     misc_feature    6066..6083
                     /note="(PyPu)n, pot. Z-DNA seq."
     misc_feature    6078..6084
                     /note="seq. homologous to TATA-box"
     misc_feature    6097..6108
                     /note="(PyPu)n, pot. Z-DNA seq."
BASE COUNT     2414 a   1366 c   1230 g   1228 t
ORIGIN
        1 ggcggtggag ccaagatgac cgaataggaa cagctccagt ctatagctcc catcgtgagt
       61 gacgcagaag acgggtgatt tctgcatttc caactgaggt accaggttca tctcacaggg
      121 aagtgccagg cagtgggtgc aggacagtag tgcagtgcac tgtgcatgag ccgaagcagg
      181 gcgaggcatc acctcacccg ggaagcacaa ggggtcaggg aattcccttt cctagtcaaa
      241 gaaaagggtg acagatggca cctggaaaat cgggtcactc ccgccctaat actgcgctct
      301 tccaacaagc ttaacaaatg gcacaccagg agattatatc ccatgcctgg ctcagagggt
      361 cctacgccca tggagcctcg ctcattgcta gcacagcagt ctgaggtcaa actgcaaggt
      421 ggcagtgagg ctgggggagg ggtgcccacc attgtccagg cttgagcagg taaacaaagc
      481 cgcctggaag ctcgaactgg gtggagccca ccacagctca aggaggcctg cctgcctctg
      541 taggctccac ctctaggggc agggcacaga caaacaaaag acaacaagaa cctctgcaga
      601 cttaaatgtc cctgtctgac agctttgaag agagtagtgg ttctcccagc acatagcttc
      661 agatctgaga acaggcagac tgcctcctca agtgggtccc tgacccccga gtagcctaac
      721 tgggaggcat cccccagtag ggcggactga cacctcacat ggctggtact cctctaagac
      781 aaaacttcca gaggaatgat caggcagcag catttgcggt tcaccaatat ccactgttct
      841 gcagccaccg ctgctgatac ccaggaaaac agcatctgga gtggacctcc agtaaactcc
      901 aacagacctg cagctgaggg tcctgactgt tagaaggaaa actaacaaac agaaaggaca
      961 tccacaccaa aaacccatct gtacatcacc atcatcaaag accaaaggta gataaaacca
     1021 taaagatggg gaaaaagcag agcagaaaaa ctggacactc taaaaatgag agtgcctctc
     1081 cttctccaaa gtaacgcagc tcctcaccag caatggaaca aagctgggca gagaatgact
     1141 ttgacgagtt gagagaggaa ggcttcagaa gatcaaacta ctccaagcta aaggaggaag
     1201 ttcgaacaaa cggcaaagaa gtaaaaaact ttgaaaaaaa attagatgaa tggataacta
     1261 gaataaccaa tgcacagaag tccttaaagg acctgatgga gctgaaaacc aaggcaggag
     1321 aactacgtga caaatacaca agcctcagta accgatgaga tcaactggaa gaaagggtat
     1381 caatgacgga agatgaaatg aatgaaatga agcatgaaga gaagtttaga gaaaaaagaa
     1441 taaaaagaaa cgaacaaagc ctccaagaaa tatgggacta tgtgaaaaga ccaaatctac
     1501 atctaattgg tgtagctgaa agtgatgggg agaatggaac caagttggaa aacactctgc
     1561 aggatattat ccaggagaac ttccccaatc tagcaaggca gcccaaattc acattcagga
     1621 aatacagaga acgccacaaa gatactccta gagaaaagca actccaagac acataactga
     1681 cagattcacc aaagttgaaa tgaaggaaaa aatgttaagg gcagccagag agaaaggtcg
     1741 ggttacccac aaagggaagc ccatcagact aacagctgat ctatcggcag aaactctaca
     1801 agccagaaga aagtgggggc caatattcaa cattgttaaa gaaaagaatt ttcggcccag
     1861 aatttcatat ccagccaaac taagcttcat aagcattgga gaaataaaat cctttacaga
     1921 caagcaaatg ctgagagatt ttgtcaccac caggcctgcc ctacaagagc tcctgaagga
     1981 agcactaaac atggaaagga acaactagta tcagccactg caaaaacatg ccaaattgta
     2041 aacgaccatc aaggctagga agaaactgca tcaaggagca aaataaccag ctaacatcat
     2101 aatgacagga tcaaattcat acataacaat actcacctta aatgtaaata ggctaaatgc
     2161 tccaattaaa agacacagac tggcaaattg gataaggagt caagacccat ctgtcgttat
     2221 gtattcagga aacccatctc acgtgcagag acacacatag gctcgaaata aaaggatgga
     2281 ggaatatcta ccaagcaaat ggaaaacaaa aaaaggcagg ggttgcaatc ctagtctctg
     2341 ataaaacaga ttttaaacca acaaagatca aaagagacaa agaaggccat tacataatgg
     2401 caaagggatc tattcaagaa gaagaactaa ctatactaaa tatatatgca cccaatacag
     2461 gagcacccag attcataaaa caagtcctga gtgacctaca aagagactta gatgcccaca
     2521 caataataat gggagacttt aacaccccac tgtcaacatt agacagatca acgagacaga
     2581 aagttaacaa ggatatccag gaattggact cagctctgca ccaagcagac ctaatagaca
     2641 tctacagaac tctccacccc aaatcaacag aatatacatt cttttcagca ccacaccaca
     2701 cctattccaa aactgaccac atagttggaa gtaaagctct cctcagcaaa tgtaaaagaa
     2761 cagaaactat aacaaactgt ctctcagacc acagtgcaat caaactagaa ctcaggatta
     2821 agaaactcac tcaaaaccac tcagctacat ggaaactgaa cagcctgctc ctgaatgact
     2881 actgggtaca taacaaaatg aaggcagaaa taaagatgtt ctttgaaaca acgagaacaa
     2941 agacacaaca caccagaatc tctgagacac attcaaagca gtgtgtagag ggaaatttat
     3001 agcactaaat gcccacaagg gaaagcagga aagatctaaa attgacaccc taacatcaca
     3061 attaaaaaac tagagaagca ggagcaaaca cattcaaaag ctaacagaag acaagaaata
     3121 actaagatca gagcagaagt gaagaagata gagacacaaa aaacccttca aaaaaatcaa
     3181 tgaatccaga agctgttttt ttgaaaagat caacaaaatt gatagactgc tagcaagact
     3241 aataaagaag aaaggggaga agaatcaaat agacgcaata aaaaatgaca cggggtatca
     3301 ccactgatcc cacagaaata caaactaccg tcagagaata ctataaacac ctctacgcaa
     3361 ataaactaga aaatctagaa gaaatggata aattcctcga cacatacact ctgccaagac
     3421 taaaccagga agaagttgta tctctgaata gaccaataac aggctctgaa attgaggcaa
     3481 taattaatag cttatcaacc aaaaaaagtc cgggaccagt aggattcata gccgaattct
     3541 accagaggta caaggaggag ctggtaccat tccttctgaa actattccaa tcaatagaaa
     3601 aagagggaat cctccctaac tcattttatg aggccagcat catcctgata ccaaagcctg
     3661 acagagacac aacaaaaaaa gagaatgtta caccaatatc cttgatgaac atcgatgcaa
     3721 aaatcctcaa taaaatactg gcaaactgaa tccagcagca catcaaaaag cttatcctcc
     3781 atgatcaagt gggcttcatc cctgccatgc aaggctggtt caacatacga aatcaataaa
     3841 cataatccag catataaaca gaaccaaaga cacaaaccat atgattatct caatagatgc
     3901 agaaaaggcc tttgacaaaa ttcaacaatg cttcatgcta aaaactctca ataaattagg
     3961 tattgatggg acatatctca aaataataag agctatctat gacaaaccca cagccaatat
     4021 catactgagt ggacaaaaac tggaagcatt ccctttgaaa actggcacaa ggcagggatg
     4081 ccctctctca ccactcctat tcaacatagt gttggaagtt ctggccaggg caatcaggca
     4141 ggagaaggaa ataaagggca ttcaattagg aaaagaggaa ggtgaaattg tccctgtttg
     4201 cagatgacat gattgtatat ctagaaaacc ccattgtctc agcccaaaat ctccttaagc
     4261 tgataagcaa cttcagcaaa gtctcaggat ataaaatcag tgtgcaaaaa tcacaagtat
     4321 tcctatgcac caataacaga caaacagaga gccaaatcat gagtgaactc ccattcacaa
     4381 ttgcttcaaa gagaataaaa tacctaggaa tccaacttac aagggatgtg aaggacctct
     4441 tcaaggagaa ctacaaacca ctgctcaatg aaataaaaga ggatacaaac aaatggaaga
     4501 acattccatg cttatgggta ggaagaatca tatcgtgaaa atggtcatac tgcccaaggt
     4561 aatttataga ttcaatgcca tccccatcaa gctaccaatg actttcttca cagaactgga
     4621 aaaaactact ttaaagttca tatggaatca aaaaagagcc cacatcacca aggcaatcct
     4681 aagccaaaag aacaaagctg gaggcatcac gctacctgac ttcaaactat actacaatgc
     4741 tacggtaacc aaaacagcat ggtactggta ccaaaacaga gatctagacc aatggaacag
     4801 aacagagccc tcagaaataa tgccgcatat ctacaactat ccgatctttg acaaacctga
     4861 gagaaacaag caatggggaa aggattccct atttaataaa tggtgctggg aaaactggct
     4921 agccatatgt agaaagctga aactggatcc ttccttacac cttatacaaa aattaattca
     4981 agatggatta aagacttaaa cattagacct aaaaccataa aaaccctaga aaaaaaccta
     5041 ggcaatacca ttcaggacat aggcatgggc aaggacttca tgtctaaaac accaaaacga
     5101 atggcaacaa aagacaaaat ggacaaacgg gatctaatta aactaaagag cttctgcaca
     5161 gctaaagaaa ctaccatcag agtgaacagg caacctacaa aatgggagaa aatttttgca
     5221 atctactcat ctgacaaagg gctaatatcc agaatctaca atgaactcaa acaaatttac
     5281 aagaaaaaac aaacaacccc atcaaaaagt gggcaaagga tatgaacaga cacttctcaa
     5341 aagaagacat ttatgtaatc aaaaaacaca tgaaaaaatg ctcatcatca ctagccatca
     5401 gagaaatgca aatcaaaacc acaatgagat accatctcac accagttaga atggcgatca
     5461 ttaaaaagtc aggaaacaac aggtgctgga gaggatgtgg agaaacagga acaactttta
     5521 cactgttggt gggactgtaa actagttcaa ccattgcgga agtcagtgtg gcaattcctc
     5581 aggaatctag aactagaaat accatttgac ccagccatcc cattactggg tagataccca
     5641 aaggattata aatcatgctg ctataaagac acatgcacac gtatgtttat tgcagcacta
     5701 ttcacaatag caaagacttg gaaccaaccc aaatgtccaa caacgataga ttggattaag
     5761 aaaatgtggc acatatacac catggaatac tatgcagcca taaaaaatga tgagttcatg
     5821 tcctttgtag ggacatggat gaagctggaa actatcattc tcagcaaact atcacaagga
     5881 caataaacca aacaccgcat gttctcactc ataggtggga attgaacaat gagaacacat
     5941 ggacacatga agaggaacat cacactctgg ggactgttat ggggtggggg gcaggggcag
     6001 ggatagcact aggagatata cctaatgcta aatgacgagt taatgggtgc agcacaccaa
     6061 catggcacat gtatacatat ataacaaacc tgccgttgtg cacatgtacc ctaaaacttg
     6121 aagtataata ataaaaaaaa gttatcctat taaaactgat ctcacacatc cgtagagcca
     6181 ttatcaagtc tttctctttg aaacagacag aaatttagtg ttttctcagt cagttaac
//