GenBank-Updates@genbank.bio.net (05/29/91)
LOCUS HIVHTLV3 9748 bp ss-mRNA VRL 28-MAY-1991
DEFINITION Human T-cell leukaemia type III (HTLV-III) proviral genome (AIDS
virus for acquired immune deficiency syndrome)
ACCESSION X01762
KEYWORDS acquired immune deficiency syndrome; direct repeat; endonuclease;
glycoprotein; inverted repeat; protease; provirus;
reverse transcriptase; terminal repeat.
SOURCE Human immunodeficiency virus type 1 RNA.
ORGANISM Human immunodeficiency virus type 1
Viridae; ss-RNA enveloped viruses; Positive strand RNA virus;
Retroviridae; Lentivirinae.
REFERENCE 1 (bases 1 to 9748)
AUTHORS Ratner,L., Haseltine,W., Patarca,R., Livak,K.J., Starcich,B.,
Josephs,S.J., Doran,E.R., Rafalski,J.A., Whitehorn,E.A.,
Baumeister,K., Ivanoff,L., Petteway,S.R., Pearson,M.L.,
Lautenberger,J.A., Papas,T.S., Ghrayeb,J., Chang,N.T., Gallo,R.C.
and Wong-Staal,F.
TITLE Complete nucleotide sequence of the AIDS virus, HTLV-III
JOURNAL Nature 313, 277-284 (1985)
STANDARD full automatic
REFERENCE 2 (sites)
AUTHORS Muesing,M.A., Smith,D.H., Cabradilla,C.D., Benton,C.V., Kasky,L.A.
and Capon,D.J.
TITLE Nucleic acid structure and expression of the human AIDS/
lymphadenopathy retrovirus
JOURNAL Nature 313, 450-458 (1985)
STANDARD full staff_review
COMMENT SWISS-PROT; P03350; GAG$HIV1P. SWISS-PROT; P03368; POL$HIV1P.
SWISS-PROT; P03376; ENV$HIV1P. SWISS-PROT; P03401; VIF$HIV10.
SWISS-PROT; P03405; NEF$HIV1P. SWISS-PROT; P04607; TAT$HIV1P.
SWISS-PROT; P04617; REV$HIV1P. SWISS-PROT; P05922; VPU$HIV1P.
SWISS-PROT; P05926; VPR$HIV1X.
From EMBL entry REHTLV3; dated 11-AUG-1990.
FEATURES Location/Qualifiers
misc_feature 1..634
/note="long terminal repeat"
repeat_unit 1..2
/note="inverted repeat"
promoter 427..430
/note="TATA-box"
misc_feature 1..453
/note="U3 region"
misc_feature 454..551
/note="R region"
misc_RNA 454..454
/note="cap site"
misc_feature 552..634
/note="U5 region"
repeat_unit 633..634
/note="inverted repeat"
misc_feature 635..653
/note="tRNA binding site (tRNA-Lys)"
CDS 787..2321
/note="gag precursor polypeptide"
/codon_start=787
CDS 787..1182
/note="gag p17"
/codon_start=787
CDS 1183..2321
/note="gag p24 and gag p15 for major capsid protein and
for put. retroviral nucleic acid binding protein
(NBP)(ref.2) (boundaries not defined)"
/codon_start=1183
repeat_region 1968..2002
/note="direct repeat"
repeat_region 2031..2065
/note="direct repeat"
CDS 2081..5125
/note="pol precursor polypeptides: put. protease at 5'
terminus; reverse transcriptase; put. endonuclease at 3'
terminus"
/codon_start=2081
repeat_region 2128..2163
/note="direct repeat"
repeat_region 2164..2176
/note="direct repeat"
CDS 5040..5648
/note="SOR short open reading frame pot. vestigial env
gene"
/codon_start=5040
CDS 6323..8821
/note="env-lor precursor polypeptide"
/codon_start=6323
CDS 6323..8821
/note="envelope glycoprotein"
/codon_start=6323
misc_feature 7786..7787
/note="put. peptide cleavage site"
CDS 7787..8821
/note="put. lor transmembrane protein"
/codon_start=7787
misc_feature 9098..9103
/note="poly purine stretch"
repeat_region 9115..9748
/note="long terminal repeat"
misc_feature 9115..9567
/note="U3 region"
misc_feature 9568..9665
/note="R region"
misc_feature 9641..9646
/note="polyadenylation signal"
misc_feature 9666..9748
/note="U5 region"
repeat_unit 9747..9748
/note="inverted repeat"
BASE COUNT 3431 a 1781 c 2368 g 2168 t
ORIGIN
1 tggaagggct aattcactcc caacgaagac aagatatcct tgatctgtgg atctaccaca
61 cacaaggcta cttccctgat tagcagaact acacaccagg gccagggatc agatatccac
121 tgacctttgg atggtgctac aagctagtac cagttgagcc agagaagtta gaagaagcca
181 acaaaggaga gaacaccagc ttgttacacc ctgtgagcct gcatggaatg gatgacccgg
241 agagagaagt gttagagtgg aggtttgaca gccgcctagc atttcatcac atggcccgag
301 agctgcatcc ggagtacttc aagaactgct gacatcgagc ttgctacaag ggactttccg
361 ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga gccctcagat
421 cctgcatata agcagctgct ttttgcctgt actgggtctc tctggttaga ccagatctga
481 gcctgggagc tctctggcta actagggaac ccactgctta agcctcaata aagcttgcct
541 tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact ctggtaacta gagatccctc
601 agaccctttt agtcagtgtg gaaaatctct agcagtggcg cccgaacagg gacctgaaag
661 cgaaagggaa accagagctc tctcgacgca ggactcggct tgctgaagcg cgcacggcaa
721 gaggcgaggg gcggcgactg gtgagtacgc caaaaatttt gactagcgga ggctagaagg
781 agagagatgg gtgcgagagc gtcagtatta agcgggggag aattagatcg atgggaaaaa
841 attcggttaa ggccaggggg aaagaaaaaa tataaattaa aacatatagt atgggcaagc
901 agggagctag aacgattcgc agttaatcct ggcctgttag aaacatcaga aggctgtaga
961 caaatactgg gacagctaca accatccctt cagacaggat cagaagaact tagatcatta
1021 tataatacag tagcaaccct ctattgtgtg catcaaagga tagagataaa agacaccaag
1081 gaagctttag acaagataga ggaagagcaa aacaaaagta agaaaaaagc acagcaagca
1141 gcagctgaca caggacacag cagtcaggtc agccaaaatt accctatagt gcagaacatc
1201 caggggcaaa tggtacatca ggccatatca cctagaactt taaatgcatg ggtaaaagta
1261 gtagaagaga aggctttcag cccagaagta atacccatgt tttcagcatt atcagaagga
1321 gccaccccac aagatttaaa caccatgcta aacacagtgg ggggacatca agcagccatg
1381 caaatgttaa aagagaccat caatgaggaa gctgcagaat gggatagagt acatccagtg
1441 catgcagggc ctattgcacc aggccagatg agagaaccaa ggggaagtga catagcagga
1501 actactagta cccttcagga acaaatagga tggatgacaa ataatccacc tatcccagta
1561 ggagaaattt ataaaagatg gataatcctg ggattaaata aaatagtaag aatgtatagc
1621 cctaccagca ttctggacat aagacaagga ccaaaagaac cttttagaga ctatgtagac
1681 cggttctata aaactctaag agccgagcaa gcttcacagg aggtaaaaat tggatgacag
1741 aaaccttgtt ggtccaaaat gcgaacccag attgtaagac tattttaaaa gcattgggac
1801 cagcggctac actagaagaa atgatgacag catgtcaggg agtaggagga cccggccata
1861 aggcaagagt tttggctgaa gcaatgagcc aagtaacaaa tacagctacc ataatgatgc
1921 agagaggcaa ttttaggaac caaagaaaga tggttaagtg tttcaattgt ggcaaagaag
1981 ggcacacagc cagaaattgc agggccccta ggaaaaaggg ctgttggaaa tgtggaaagg
2041 aaggacacca aatgaaagat tgtactgaga gacaggctaa ttttttaggg aagatctggc
2101 cttcctacaa gggaaggcca gggaattttc ttcagagcag accagagcca acagccccac
2161 catttcttca gagcagacca gagccaacag ccccaccaga agagagcttc aggtctgggg
2221 tagagacaac aactccccct cagaagcagg agccgataga caaggaactg tatcctttaa
2281 cttccctcag atcactcttt ggcaacgacc cctcgtcaca ataaagatag gggggcaact
2341 aaaggaagct ctattagata caggagcaga tgatacagta ttagaagaaa tgagtttgcc
2401 aggaagatgg aaaccaaaaa tgataggggg aattggaggt tttatcaaag taagacagta
2461 tgatcagata ctcatagaaa tctgtggaca taaagctata ggtacagtat tagtaggacc
2521 tacacctgtc aacataattg gaagaaatct gttgactcag attggttgca ctttaaattt
2581 tcccattagc cctattgaga ctgtaccagt aaaattaaag ccaggaatgg atggcccaaa
2641 agttaaacaa tggccattga cagaagaaaa aataaaagca ttagtagaaa tttgtacaga
2701 aatggaaaag gaagggaaaa tttcaaaaat tgggcctgag aatccataca atactccagt
2761 atttgccata aagaaaaaag acagtactaa atggagaaaa ttagtagatt tcagagaact
2821 taataagaga actcaagact tctgggaagt tcaattagga ataccacatc ccgcagggtt
2881 aaaaaagaaa aaatcagtaa cagtactgga tgtgggtgat gcatattttt cagttccctt
2941 agatgaagac ttcaggaagt atactgcatt taccatacct agtataaaca atgagacacc
3001 agggattaga tatcagtaca atgtgcttcc acagggatgg aaaggatcac cagcaatatt
3061 ccaaagtagc atgacaaaaa tcttagagcc ttttaaaaaa caaaatccag acatagttat
3121 ctatcaatac atggatgatt tgtatgtagg atctgactta gaaatagggc agcatagaac
3181 aaaaatagag gagctgagac aacatctgtt gaggtgggga cttaccacac cagacaaaaa
3241 acatcagaaa gaacctccat tcctttggat gggttatgaa ctccatcctg ataaatggac
3301 agtacagcct atagtgctgc cagaaaaaga cagctggact gtcaatgaca tacagaagtt
3361 agtggggaaa ttgaattggg caagtcagat ttacccaggg attaaagtaa ggcaattatg
3421 taaactcctt agaggaacca aagcactaac agaagtaata ccactaacag aagaagcaga
3481 gctagaactg gcagaaaaca gagagattct aaaagaacca gtacatggag tgtattatga
3541 cccatcaaaa gacttaatag cagaaataca gaagcagggg caaggccaat ggacatatca
3601 aatttatcaa gagccattta aaaatctgaa aacaggaaaa tatgcaagaa tgaggggtgc
3661 ccacactaat gatgtaaaac aattaacaga ggcagtgcaa aaaataacca cagaaagcat
3721 agtaatatgg ggaaagactc ctaaatttaa actacccata caaaaggaaa catgggaaac
3781 atggtggaca gagtattggc aagccacctg gattcctgag tgggagtttg ttaatacccc
3841 tcctttagtg aaattatggt accagttaga gaaagaaccc atagtaggag cagaaacctt
3901 ctatgtagat ggggcagcta acagggagac taaattagga aaagcaggat atgttactaa
3961 caaaggaaga caaaaggttg tccccctaac taacacaaca aatcagaaaa ctgagttaca
4021 agcaatttat ctagctttgc aggattcagg attagaagta aacatagtaa cagactcaca
4081 atatgcatta ggaatcattc aagcacaacc agataaaagt gaatcagagt tagtcaatca
4141 aataatagag cagttaataa aaaaggaaaa ggtctatctg gcatgggtac cagcacacaa
4201 aggaattgga ggaaatgaac aagtagataa attagtcagt gctggaatca ggaaaatact
4261 atttttagat ggaatagata aggcccaaga tgaacatgag aaatatcaca gtaattggag
4321 agcaatggct agtgatttta acctgccacc tgtagtagca aaagaaatag tagccagctg
4381 tgataaatgt cagctaaaag gagaagccat gcatggacaa gtagactgta gtccaggaat
4441 atggcaacta gattgtacac atttagaagg aaaagttatc ctggtagcag ttcatgtagc
4501 cagtggatat atagaagcag aagttattcc agcagaaaca gggcaggaaa cagcatattt
4561 tcttttaaaa ttagcaggaa gatggccagt aaaaacaata catacagaca atggcagcaa
4621 tttcaccagt gctacggtta aggccgcctg ttggtgggcg ggaatcaagc aggaatttgg
4681 aattccctac aatccccaaa gtcaaggagt agtagaatct atgaataaag aattaaagaa
4741 aattatagga caggtaagag atcaggctga acatcttaag acagcagtac aaatggcagt
4801 attcatccac aattttaaaa gaaaaggggg gattgggggg tacagtgcag gggaaagaat
4861 agtagacata atagcaacag acatacaaac taaagaatta caaaaacaaa ttacaaaaat
4921 tcaaaatttt cgggtttatt acagggacag cagaaatcca ctttggaaag gaccagcaaa
4981 gctcctctgg aaaggtgaag gggcagtagt aatacaagat aatagtgaca taaaagtagt
5041 gccaagaaga aaagcaaaga tcattaggga ttatggaaaa cagatggcag gtgatgattg
5101 tgtggcaagt agacaggatg aggattagaa catggaaaag tttagtaaaa caccatatgt
5161 atgtttcagg gaaagctagg ggatggtttt atagacatca ctatgaaagc cctcatccaa
5221 gaataagttc agaagtacac atcccactag gggatgctag attggtaata acaacatatt
5281 ggggtctgca tacaggagaa agagactggc atttgggtca gggagtctcc atagaatgga
5341 ggaaaaagag atatagcaca caagtagacc ctgaactagc agaccaacta attcatctgt
5401 attactttga ctgtttttca gactctgcta taagaaaggc cttattagga cacatagtta
5461 gccctaggtg tgaatatcaa gcaggacata acaaggtagg atctctacaa tacttggcac
5521 tagcagcatt aataacacca aaaaagataa agccaccttt gcctagtgtt acgaaactga
5581 cagaggatag atggaacaag ccccagaaga ccaagggcca cagagggagc cacacaatga
5641 atggacacta gagcttttag aggagcttaa gaatgaagct gttagacatt ttcctaggat
5701 ttggctccat ggcttagggc aacatatcta tgaaacttat ggggatactt gggcaggagt
5761 ggaagccata ataagaattc tgcaacaact gctgtttatc cattttcaga attgggtgtc
5821 gacatagcag aataggcgtt actcgacaga ggagagcaag aaatggagcc agtagatcct
5881 agactagagc cctggaagca tccaggaagt cagcctaaaa ctgcttgtac caattgctat
5941 tgtaaaaagt gttgctttca ttgccaagtt tgtttcataa caaaagcctt aggcatctcc
6001 tatggcagga agaagcggag acagcgacga agacctcctc aaggcagtca gactcatcaa
6061 gtttctctat caaagcagta agtagtacat gtaatgcaac ctatacaaat agcaatagta
6121 gcattagtag tagcaataat aatagcaata gttgtgtggt ccatagtaat catagaatat
6181 aggaaaatat taagacaaag aaaaatagac aggttaattg atagactaat agaaagagca
6241 gaagacagtg gcaatgagag tgaaggagaa atatcagcac ttgtggagat gggggtggag
6301 atggggcacc atgctccttg ggatgttgat gatctgtagt gctacagaaa aattgtgggt
6361 cacagtctat tatggggtac ctgtgtggaa ggaagcaacc accactctat tttgtgcatc
6421 agatgctaaa gcatatgata cagaggtaca taatgtttgg gccacacatg cctgtgtacc
6481 cacagacccc aacccacaag aagtagtatt ggtaaatgtg acagaaaatt ttaacatgtg
6541 gaaaaatgac atggtagaac agatgcatga ggatataatc agtttatggg atcaaagcct
6601 aaagccatgt gtaaaattaa ccccactctg tgttagttta aagtgcactg atttgaagaa
6661 tgatactaat accaatagta gtagcgggag aatgataatg gagaaaggag agataaaaaa
6721 ctgctctttc aatatcagca caagcataag aggtaaggtg cagaaagaat atgcattttt
6781 ttataaactt gatataatac caatagataa tgatactacc agctatacgt tgacaagttg
6841 taacacctca gtcattacac aggcctgtcc aaaggtatcc tttgagccaa ttcccataca
6901 ttattgtgcc ccggctggtt ttgcgattct aaaatgtaat aataagacgt tcaatggaac
6961 aggaccatgt acaaatgtca gcacagtaca atgtacacat ggaattaggc cagtagtatc
7021 aactcaactg ctgttaaatg gcagtctggc agaagaagag gtagtaatta gatctgccaa
7081 tttcacagac aatgctaaaa ccataatagt acagctgaac caatctgtag aaattaattg
7141 tacaagaccc aacaacaata caagaaaaag tatccgtatc cagagaggac cagggagagc
7201 atttgttaca ataggaaaaa taggaaatat gagacaagca cattgtaaca ttagtagagc
7261 aaaatggaat aacactttaa aacagataga tagcaaatta agagaacaat ttggaaataa
7321 taaaacaata atctttaagc agtcctcagg aggggaccca gaaattgtaa cgcacagttt
7381 taattgtgga ggggaatttt tctactgtaa ttcaacacaa ctgtttaata gtacttggtt
7441 taatagtact tggagtacta aagggtcaaa taacactgaa ggaagtgaca caatcaccct
7501 cccatgcaga ataaaacaaa ttataaacat gtggcaggaa gtaggaaaag caatgtatgc
7561 ccctcccatc agtggacaaa ttagatgttc atcaaatatt acagggctgc tattaacaag
7621 agatggtggt aatagcaaca atgagtccga gatcttcaga cctggaggag gagatatgag
7681 ggacaattgg agaagtgaat tatataaata taaagtagta aaaattgaac cattaggagt
7741 agcacccacc aaggcaaaga gaagagtggt gcagagagaa aaaagagcag tgggaatagg
7801 agctttgttc cttgggttct tgggagcagc aggaagcact atgggcgcag cgtcaatgac
7861 gctgacggta caggccagac aattattgtc tggtatagtg cagcagcaga acaatttgct
7921 gagggctatt gaggcgcaac agcatctgtt gcaactcaca gtctggggca tcaagcagct
7981 ccaggcaaga atcctggctg tggaaagata cctaaaggat caacagctcc tggggatttg
8041 gggttgctct ggaaaactca tttgcaccac tgctgtgcct tggaatgcta gttggagtaa
8101 taaatctctg gaacagattt ggaataacat gacctggatg gagtgggaca gagaaattaa
8161 caattacaca agcttaatac actccttaat tgaagaatcg caaaaccagc aagaaaagaa
8221 tgaacaagaa ttattggaat tagataaatg ggcaagtttg tggaattggt ttaacataac
8281 aaattggctg tggtatataa aattattcat aatgatagta ggaggcttgg taggtttaag
8341 aatagttttt gctgtacttt ctgtagtgaa tagagttagg cagggatatt caccattatc
8401 gtttcagacc cacctcccaa tcccgagggg acccgacagg cccgaaggaa tagaagaaga
8461 aggtggagag agagacagag acagatccat tcgattagtg aacggatcct tagcacttat
8521 ctgggacgat ctgcggagcc tgtgcctctt cagctaccac cgcttgagag acttactctt
8581 gattgtaacg aggattgtgg aacttctggg acgcaggggg tgggaagccc tcaaatattg
8641 gtggaatctc ctacagtatt ggagtcagga gctaaagaat agtgctgtta gcttgctcaa
8701 tgccacagct atagcagtag ctgaggggac agatagggtt atagaagtag tacaaggagc
8761 ttatagagct attcgccaca tacctagaag aataagacag ggcttggaaa ggattttgct
8821 ataagatggg tggcaagtgg tcaaaaagta gtgtggttgg atggcctgct gtaagggaaa
8881 gaatgagacg agctgagcca gcagcagatg gggtgggagc agcatctcga gacctagaaa
8941 aacatggagc aatcacaagt agcaacacag cagctaacaa tgctgattgt gcctggctag
9001 aagcacaaga ggaggaggag gtgggttttc cagtcacacc tcaggtacct ttaagaccaa
9061 tgacttacaa ggcagctgta gatcttagcc actttttaaa agaaaagggg ggactggaag
9121 ggctaattca ctcccaacga agacaagata tccttgatct gtggatctac cacacacaag
9181 gctacttccc tgattagcag aactacacac cagggccagg gatcagatat ccactgacct
9241 ttggatggtg ctacaagcta gtaccagttg agccagagaa gttagaagaa gccaacaaag
9301 gagagaacac cagcttgtta caccctgtga gcctgcatgg aatggatgac ccggagagag
9361 aagtgttaga gtggaggttt gacagccgcc tagcatttca tcacatggcc cgagagctgc
9421 atccggagta cttcaagaac tgctgacatc gagcttgcta caagggactt tccgctgggg
9481 actttccagg gaggcgtggc ctgggcggga ctggggagtg gcgagccctc agatcctgca
9541 tataagcagc tgctttttgc ctgtactggg tctctctggt tagaccagat ctgagcctgg
9601 gagctctctg gctagctagg gaacccactg cttaagcctc aataaagctt gccttgagtg
9661 cttcaagtag tgtgtgcccg tctgttgtgt gactctggta actagagatc cctcagaccc
9721 ttttagtcag tgtggaaaat ctctagca
//