GenBank-Updates@genbank.bio.net (05/29/91)
LOCUS       HIVHTLV3     9748 bp ss-mRNA             VRL       28-MAY-1991
DEFINITION  Human T-cell leukaemia type III (HTLV-III) proviral genome (AIDS
            virus for acquired immune deficiency syndrome)
ACCESSION   X01762
KEYWORDS    acquired immune deficiency syndrome; direct repeat; endonuclease;
            glycoprotein; inverted repeat; protease; provirus;
            reverse transcriptase; terminal repeat.
SOURCE      Human immunodeficiency virus type 1 RNA.
  ORGANISM  Human immunodeficiency virus type 1
            Viridae; ss-RNA enveloped viruses; Positive strand RNA virus;
            Retroviridae; Lentivirinae.
REFERENCE   1  (bases 1 to 9748)
  AUTHORS   Ratner,L., Haseltine,W., Patarca,R., Livak,K.J., Starcich,B.,
            Josephs,S.J., Doran,E.R., Rafalski,J.A., Whitehorn,E.A.,
            Baumeister,K., Ivanoff,L., Petteway,S.R., Pearson,M.L.,
            Lautenberger,J.A., Papas,T.S., Ghrayeb,J., Chang,N.T., Gallo,R.C.
            and Wong-Staal,F.
  TITLE     Complete nucleotide sequence of the AIDS virus, HTLV-III
  JOURNAL   Nature 313, 277-284 (1985)
  STANDARD  full automatic
REFERENCE   2  (sites)
  AUTHORS   Muesing,M.A., Smith,D.H., Cabradilla,C.D., Benton,C.V., Kasky,L.A.
            and Capon,D.J.
  TITLE     Nucleic acid structure and expression of the human AIDS/
            lymphadenopathy retrovirus
  JOURNAL   Nature 313, 450-458 (1985)
  STANDARD  full staff_review
COMMENT     SWISS-PROT; P03350; GAG$HIV1P.
            SWISS-PROT; P03368; POL$HIV1P.
            SWISS-PROT; P03376; ENV$HIV1P.
            SWISS-PROT; P03401; VIF$HIV10.
            SWISS-PROT; P03405; NEF$HIV1P.
            SWISS-PROT; P04607; TAT$HIV1P.
            SWISS-PROT; P04617; REV$HIV1P.
            SWISS-PROT; P05922; VPU$HIV1P.
            SWISS-PROT; P05926; VPR$HIV1X.
            From EMBL entry REHTLV3; dated 11-AUG-1990.
FEATURES             Location/Qualifiers
     misc_feature    1..634
                     /note="long terminal repeat"
     repeat_unit     1..2
                     /note="inverted repeat"
     promoter        427..430
                     /note="TATA-box"
     misc_feature    453..453
                     /note="U3 region"
     misc_feature    454..551
                     /note="R region"
     misc_RNA        454..454
                     /note="cap site"
     misc_feature    552..634
                     /note="U5 region"
     repeat_unit     633..634
                     /note="inverted repeat"
     misc_feature    635..653
                     /note="tRNA binding site (tRNA-Lys)"
     CDS             787..2321
                     /note="gag precursor polypeptide"
                     /codon_start=787
     CDS             787..1182
                     /note="gag p17"
                     /codon_start=787
     CDS             1183..2321
                     /note="gag p24 and gag p15 for major capsid protein and
                     for put. retroviral nucleic acid binding protein
                     (NBP)(ref.2) (boundaries not defined)"
                     /codon_start=1183
     repeat_region   1968..2002
                     /note="direct repeat"
     repeat_region   2031..2065
                     /note="direct repeat"
     CDS             2081..5125
                     /note="pol precursor polypeptides put. protease at 5'
                     terminus reverse transcriptase put. endonuclease at 3'
                     terminus"
                     /codon_start=2081
     repeat_region   2128..2163
                     /note="direct repeat"
     repeat_region   2164..2176
                     /note="direct repeat"
     CDS             5040..5648
                     /note="SOR short open reading frame pot.
                     vestigial env gene"
                     /codon_start=5040
     CDS             6323..8821
                     /note="env-lor precursor polypeptide"
                     /codon_start=6323
     CDS             6323..8821
                     /note="envelope glycoprotein"
                     /codon_start=6323
     misc_feature    7786..7787
                     /note="put.peptide cleavage site"
     CDS             7787..8821
                     /note="put.lor transmembrane protein"
                     /codon_start=7787
     misc_feature    9098..9103
                     /note="poly purine stretch"
     repeat_region   9115..9748
                     /note="long terminal repeat"
     misc_feature    9115..9567
                     /note="U3 region"
     misc_feature    9568..9665
                     /note="R region"
     misc_feature    9641..9646
                     /note="polyadenylation signal"
     misc_feature    9666..9748
                     /note="U5 region"
     repeat_unit     9747..9748
                     /note="inverted repeat"
BASE COUNT     3431 a   1781 c   2368 g   2168 t
ORIGIN
        1 tggaagggct aattcactcc caacgaagac aagatatcct tgatctgtgg atctaccaca
       61 cacaaggcta cttccctgat tagcagaact acacaccagg gccagggatc agatatccac
      121 tgacctttgg atggtgctac aagctagtac cagttgagcc agagaagtta gaagaagcca
      181 acaaaggaga gaacaccagc ttgttacacc ctgtgagcct gcatggaatg gatgacccgg
      241 agagagaagt gttagagtgg aggtttgaca gccgcctagc atttcatcac atggcccgag
      301 agctgcatcc ggagtacttc aagaactgct gacatcgagc ttgctacaag ggactttccg
      361 ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga gccctcagat
      421 cctgcatata agcagctgct ttttgcctgt actgggtctc tctggttaga ccagatctga
      481 gcctgggagc tctctggcta actagggaac ccactgctta agcctcaata aagcttgcct
      541 tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact ctggtaacta gagatccctc
      601 agaccctttt agtcagtgtg gaaaatctct agcagtggcg cccgaacagg gacctgaaag
      661 cgaaagggaa accagagctc tctcgacgca ggactcggct tgctgaagcg cgcacggcaa
      721 gaggcgaggg gcggcgactg gtgagtacgc caaaaatttt gactagcgga ggctagaagg
      781 agagagatgg gtgcgagagc gtcagtatta agcgggggag aattagatcg atgggaaaaa
      841 attcggttaa ggccaggggg aaagaaaaaa tataaattaa aacatatagt atgggcaagc
      901 agggagctag aacgattcgc agttaatcct ggcctgttag aaacatcaga aggctgtaga
      961 caaatactgg gacagctaca accatccctt cagacaggat cagaagaact tagatcatta
     1021 tataatacag tagcaaccct ctattgtgtg catcaaagga tagagataaa agacaccaag
     1081 gaagctttag acaagataga ggaagagcaa aacaaaagta agaaaaaagc
acagcaagca 1141 gcagctgaca caggacacag cagtcaggtc agccaaaatt accctatagt gcagaacatc 1201 caggggcaaa tggtacatca ggccatatca cctagaactt taaatgcatg ggtaaaagta 1261 gtagaagaga aggctttcag cccagaagta atacccatgt tttcagcatt atcagaagga 1321 gccaccccac aagatttaaa caccatgcta aacacagtgg ggggacatca agcagccatg 1381 caaatgttaa aagagaccat caatgaggaa gctgcagaat gggatagagt acatccagtg 1441 catgcagggc ctattgcacc aggccagatg agagaaccaa ggggaagtga catagcagga 1501 actactagta cccttcagga acaaatagga tggatgacaa ataatccacc tatcccagta 1561 ggagaaattt ataaaagatg gataatcctg ggattaaata aaatagtaag aatgtatagc 1621 cctaccagca ttctggacat aagacaagga ccaaaagaac cttttagaga ctatgtagac 1681 cggttctata aaactctaag agccgagcaa gcttcacagg aggtaaaaat tggatgacag 1741 aaaccttgtt ggtccaaaat gcgaacccag attgtaagac tattttaaaa gcattgggac 1801 cagcggctac actagaagaa atgatgacag catgtcaggg agtaggagga cccggccata 1861 aggcaagagt tttggctgaa gcaatgagcc aagtaacaaa tacagctacc ataatgatgc 1921 agagaggcaa ttttaggaac caaagaaaga tggttaagtg tttcaattgt ggcaaagaag 1981 ggcacacagc cagaaattgc agggccccta ggaaaaaggg ctgttggaaa tgtggaaagg 2041 aaggacacca aatgaaagat tgtactgaga gacaggctaa ttttttaggg aagatctggc 2101 cttcctacaa gggaaggcca gggaattttc ttcagagcag accagagcca acagccccac 2161 catttcttca gagcagacca gagccaacag ccccaccaga agagagcttc aggtctgggg 2221 tagagacaac aactccccct cagaagcagg agccgataga caaggaactg tatcctttaa 2281 cttccctcag atcactcttt ggcaacgacc cctcgtcaca ataaagatag gggggcaact 2341 aaaggaagct ctattagata caggagcaga tgatacagta ttagaagaaa tgagtttgcc 2401 aggaagatgg aaaccaaaaa tgataggggg aattggaggt tttatcaaag taagacagta 2461 tgatcagata ctcatagaaa tctgtggaca taaagctata ggtacagtat tagtaggacc 2521 tacacctgtc aacataattg gaagaaatct gttgactcag attggttgca ctttaaattt 2581 tcccattagc cctattgaga ctgtaccagt aaaattaaag ccaggaatgg atggcccaaa 2641 agttaaacaa tggccattga cagaagaaaa aataaaagca ttagtagaaa tttgtacaga 2701 aatggaaaag gaagggaaaa tttcaaaaat tgggcctgag aatccataca atactccagt 2761 atttgccata aagaaaaaag acagtactaa atggagaaaa ttagtagatt tcagagaact 
2821 taataagaga actcaagact tctgggaagt tcaattagga ataccacatc ccgcagggtt 2881 aaaaaagaaa aaatcagtaa cagtactgga tgtgggtgat gcatattttt cagttccctt 2941 agatgaagac ttcaggaagt atactgcatt taccatacct agtataaaca atgagacacc 3001 agggattaga tatcagtaca atgtgcttcc acagggatgg aaaggatcac cagcaatatt 3061 ccaaagtagc atgacaaaaa tcttagagcc ttttaaaaaa caaaatccag acatagttat 3121 ctatcaatac atggatgatt tgtatgtagg atctgactta gaaatagggc agcatagaac 3181 aaaaatagag gagctgagac aacatctgtt gaggtgggga cttaccacac cagacaaaaa 3241 acatcagaaa gaacctccat tcctttggat gggttatgaa ctccatcctg ataaatggac 3301 agtacagcct atagtgctgc cagaaaaaga cagctggact gtcaatgaca tacagaagtt 3361 agtggggaaa ttgaattggg caagtcagat ttacccaggg attaaagtaa ggcaattatg 3421 taaactcctt agaggaacca aagcactaac agaagtaata ccactaacag aagaagcaga 3481 gctagaactg gcagaaaaca gagagattct aaaagaacca gtacatggag tgtattatga 3541 cccatcaaaa gacttaatag cagaaataca gaagcagggg caaggccaat ggacatatca 3601 aatttatcaa gagccattta aaaatctgaa aacaggaaaa tatgcaagaa tgaggggtgc 3661 ccacactaat gatgtaaaac aattaacaga ggcagtgcaa aaaataacca cagaaagcat 3721 agtaatatgg ggaaagactc ctaaatttaa actacccata caaaaggaaa catgggaaac 3781 atggtggaca gagtattggc aagccacctg gattcctgag tgggagtttg ttaatacccc 3841 tcctttagtg aaattatggt accagttaga gaaagaaccc atagtaggag cagaaacctt 3901 ctatgtagat ggggcagcta acagggagac taaattagga aaagcaggat atgttactaa 3961 caaaggaaga caaaaggttg tccccctaac taacacaaca aatcagaaaa ctgagttaca 4021 agcaatttat ctagctttgc aggattcagg attagaagta aacatagtaa cagactcaca 4081 atatgcatta ggaatcattc aagcacaacc agataaaagt gaatcagagt tagtcaatca 4141 aataatagag cagttaataa aaaaggaaaa ggtctatctg gcatgggtac cagcacacaa 4201 aggaattgga ggaaatgaac aagtagataa attagtcagt gctggaatca ggaaaatact 4261 atttttagat ggaatagata aggcccaaga tgaacatgag aaatatcaca gtaattggag 4321 agcaatggct agtgatttta acctgccacc tgtagtagca aaagaaatag tagccagctg 4381 tgataaatgt cagctaaaag gagaagccat gcatggacaa gtagactgta gtccaggaat 4441 atggcaacta gattgtacac atttagaagg aaaagttatc ctggtagcag ttcatgtagc 4501 
cagtggatat atagaagcag aagttattcc agcagaaaca gggcaggaaa cagcatattt 4561 tcttttaaaa ttagcaggaa gatggccagt aaaaacaata catacagaca atggcagcaa 4621 tttcaccagt gctacggtta aggccgcctg ttggtgggcg ggaatcaagc aggaatttgg 4681 aattccctac aatccccaaa gtcaaggagt agtagaatct atgaataaag aattaaagaa 4741 aattatagga caggtaagag atcaggctga acatcttaag acagcagtac aaatggcagt 4801 attcatccac aattttaaaa gaaaaggggg gattgggggg tacagtgcag gggaaagaat 4861 agtagacata atagcaacag acatacaaac taaagaatta caaaaacaaa ttacaaaaat 4921 tcaaaatttt cgggtttatt acagggacag cagaaatcca ctttggaaag gaccagcaaa 4981 gctcctctgg aaaggtgaag gggcagtagt aatacaagat aatagtgaca taaaagtagt 5041 gccaagaaga aaagcaaaga tcattaggga ttatggaaaa cagatggcag gtgatgattg 5101 tgtggcaagt agacaggatg aggattagaa catggaaaag tttagtaaaa caccatatgt 5161 atgtttcagg gaaagctagg ggatggtttt atagacatca ctatgaaagc cctcatccaa 5221 gaataagttc agaagtacac atcccactag gggatgctag attggtaata acaacatatt 5281 ggggtctgca tacaggagaa agagactggc atttgggtca gggagtctcc atagaatgga 5341 ggaaaaagag atatagcaca caagtagacc ctgaactagc agaccaacta attcatctgt 5401 attactttga ctgtttttca gactctgcta taagaaaggc cttattagga cacatagtta 5461 gccctaggtg tgaatatcaa gcaggacata acaaggtagg atctctacaa tacttggcac 5521 tagcagcatt aataacacca aaaaagataa agccaccttt gcctagtgtt acgaaactga 5581 cagaggatag atggaacaag ccccagaaga ccaagggcca cagagggagc cacacaatga 5641 atggacacta gagcttttag aggagcttaa gaatgaagct gttagacatt ttcctaggat 5701 ttggctccat ggcttagggc aacatatcta tgaaacttat ggggatactt gggcaggagt 5761 ggaagccata ataagaattc tgcaacaact gctgtttatc cattttcaga attgggtgtc 5821 gacatagcag aataggcgtt actcgacaga ggagagcaag aaatggagcc agtagatcct 5881 agactagagc cctggaagca tccaggaagt cagcctaaaa ctgcttgtac caattgctat 5941 tgtaaaaagt gttgctttca ttgccaagtt tgtttcataa caaaagcctt aggcatctcc 6001 tatggcagga agaagcggag acagcgacga agacctcctc aaggcagtca gactcatcaa 6061 gtttctctat caaagcagta agtagtacat gtaatgcaac ctatacaaat agcaatagta 6121 gcattagtag tagcaataat aatagcaata gttgtgtggt ccatagtaat catagaatat 6181 aggaaaatat 
taagacaaag aaaaatagac aggttaattg atagactaat agaaagagca 6241 gaagacagtg gcaatgagag tgaaggagaa atatcagcac ttgtggagat gggggtggag 6301 atggggcacc atgctccttg ggatgttgat gatctgtagt gctacagaaa aattgtgggt 6361 cacagtctat tatggggtac ctgtgtggaa ggaagcaacc accactctat tttgtgcatc 6421 agatgctaaa gcatatgata cagaggtaca taatgtttgg gccacacatg cctgtgtacc 6481 cacagacccc aacccacaag aagtagtatt ggtaaatgtg acagaaaatt ttaacatgtg 6541 gaaaaatgac atggtagaac agatgcatga ggatataatc agtttatggg atcaaagcct 6601 aaagccatgt gtaaaattaa ccccactctg tgttagttta aagtgcactg atttgaagaa 6661 tgatactaat accaatagta gtagcgggag aatgataatg gagaaaggag agataaaaaa 6721 ctgctctttc aatatcagca caagcataag aggtaaggtg cagaaagaat atgcattttt 6781 ttataaactt gatataatac caatagataa tgatactacc agctatacgt tgacaagttg 6841 taacacctca gtcattacac aggcctgtcc aaaggtatcc tttgagccaa ttcccataca 6901 ttattgtgcc ccggctggtt ttgcgattct aaaatgtaat aataagacgt tcaatggaac 6961 aggaccatgt acaaatgtca gcacagtaca atgtacacat ggaattaggc cagtagtatc 7021 aactcaactg ctgttaaatg gcagtctggc agaagaagag gtagtaatta gatctgccaa 7081 tttcacagac aatgctaaaa ccataatagt acagctgaac caatctgtag aaattaattg 7141 tacaagaccc aacaacaata caagaaaaag tatccgtatc cagagaggac cagggagagc 7201 atttgttaca ataggaaaaa taggaaatat gagacaagca cattgtaaca ttagtagagc 7261 aaaatggaat aacactttaa aacagataga tagcaaatta agagaacaat ttggaaataa 7321 taaaacaata atctttaagc agtcctcagg aggggaccca gaaattgtaa cgcacagttt 7381 taattgtgga ggggaatttt tctactgtaa ttcaacacaa ctgtttaata gtacttggtt 7441 taatagtact tggagtacta aagggtcaaa taacactgaa ggaagtgaca caatcaccct 7501 cccatgcaga ataaaacaaa ttataaacat gtggcaggaa gtaggaaaag caatgtatgc 7561 ccctcccatc agtggacaaa ttagatgttc atcaaatatt acagggctgc tattaacaag 7621 agatggtggt aatagcaaca atgagtccga gatcttcaga cctggaggag gagatatgag 7681 ggacaattgg agaagtgaat tatataaata taaagtagta aaaattgaac cattaggagt 7741 agcacccacc aaggcaaaga gaagagtggt gcagagagaa aaaagagcag tgggaatagg 7801 agctttgttc cttgggttct tgggagcagc aggaagcact atgggcgcag cgtcaatgac 7861 gctgacggta caggccagac 
aattattgtc tggtatagtg cagcagcaga acaatttgct 7921 gagggctatt gaggcgcaac agcatctgtt gcaactcaca gtctggggca tcaagcagct 7981 ccaggcaaga atcctggctg tggaaagata cctaaaggat caacagctcc tggggatttg 8041 gggttgctct ggaaaactca tttgcaccac tgctgtgcct tggaatgcta gttggagtaa 8101 taaatctctg gaacagattt ggaataacat gacctggatg gagtgggaca gagaaattaa 8161 caattacaca agcttaatac actccttaat tgaagaatcg caaaaccagc aagaaaagaa 8221 tgaacaagaa ttattggaat tagataaatg ggcaagtttg tggaattggt ttaacataac 8281 aaattggctg tggtatataa aattattcat aatgatagta ggaggcttgg taggtttaag 8341 aatagttttt gctgtacttt ctgtagtgaa tagagttagg cagggatatt caccattatc 8401 gtttcagacc cacctcccaa tcccgagggg acccgacagg cccgaaggaa tagaagaaga 8461 aggtggagag agagacagag acagatccat tcgattagtg aacggatcct tagcacttat 8521 ctgggacgat ctgcggagcc tgtgcctctt cagctaccac cgcttgagag acttactctt 8581 gattgtaacg aggattgtgg aacttctggg acgcaggggg tgggaagccc tcaaatattg 8641 gtggaatctc ctacagtatt ggagtcagga gctaaagaat agtgctgtta gcttgctcaa 8701 tgccacagct atagcagtag ctgaggggac agatagggtt atagaagtag tacaaggagc 8761 ttatagagct attcgccaca tacctagaag aataagacag ggcttggaaa ggattttgct 8821 ataagatggg tggcaagtgg tcaaaaagta gtgtggttgg atggcctgct gtaagggaaa 8881 gaatgagacg agctgagcca gcagcagatg gggtgggagc agcatctcga gacctagaaa 8941 aacatggagc aatcacaagt agcaacacag cagctaacaa tgctgattgt gcctggctag 9001 aagcacaaga ggaggaggag gtgggttttc cagtcacacc tcaggtacct ttaagaccaa 9061 tgacttacaa ggcagctgta gatcttagcc actttttaaa agaaaagggg ggactggaag 9121 ggctaattca ctcccaacga agacaagata tccttgatct gtggatctac cacacacaag 9181 gctacttccc tgattagcag aactacacac cagggccagg gatcagatat ccactgacct 9241 ttggatggtg ctacaagcta gtaccagttg agccagagaa gttagaagaa gccaacaaag 9301 gagagaacac cagcttgtta caccctgtga gcctgcatgg aatggatgac ccggagagag 9361 aagtgttaga gtggaggttt gacagccgcc tagcatttca tcacatggcc cgagagctgc 9421 atccggagta cttcaagaac tgctgacatc gagcttgcta caagggactt tccgctgggg 9481 actttccagg gaggcgtggc ctgggcggga ctggggagtg gcgagccctc agatcctgca 9541 tataagcagc tgctttttgc ctgtactggg 
tctctctggt tagaccagat ctgagcctgg 9601 gagctctctg gctagctagg gaacccactg cttaagcctc aataaagctt gccttgagtg 9661 cttcaagtag tgtgtgcccg tctgttgtgt gactctggta actagagatc cctcagaccc 9721 ttttagtcag tgtggaaaat ctctagca //