werner@aecom.YU.EDU (Craig Werner) (01/09/88)
HIV: Up close and personal
An intimate look at the AIDS agent.

This article is about the molecular biology of the Human Immunodeficiency Virus, the causative agent of AIDS. It is only tangentially related to the disease itself, but only by understanding the complexity of the virus can one ever hope to counteract it. And HIV is the most complex retrovirus known. It is, in William Haseltine's words, the Rolls-Royce of retroviruses, and as such is fascinating in its own right, even more so because of the disease that it causes.

HIV is a retrovirus. By this it is meant that it contains a single-stranded RNA genome, and that as an obligate step in its life cycle it transcribes this RNA into DNA, which then serves as a template for both messages and progeny genomes. This DNA is inserted into the cell's chromosomes, which means that retroviruses are capable of the ultimate in persistent infections. They will remain a part of the cell until that cell dies.

Most retroviruses have the following structure:

              gag             pol             env
    R-U5--psi----------------------------------------------U3-R-(A)n

In the transcription process, the ends are partially duplicated, so that the integrated form contains identical ends of the structure U3-R-U5, known as the Long Terminal Repeat, or LTR. The LTR contains a terminator, an enhancer, and the promoter. Transcription begins at the 5' promoter and ends at the 3' terminator. The 3' enhancer and promoter are not used by the retrovirus, but can cause trouble by activating adjacent genes.

The other four necessary features are:

psi: a site on the RNA necessary for the RNA to be packaged into retroviral particles.

and three genes:

gag: (stands for group antigen, a term which is now archaic) This encodes the capsid protein(s), which bind and package the RNA genome.

pol: Polymerase. Pol is actually made as a fusion protein with gag, a gag-pol polyprotein, which is then cleaved.
There are four functions of the pol gene, and in some retroviruses they are cleaved into four separate proteins, in others fewer. They are:

    Reverse Transcriptase: both an RNA-directed and a DNA-directed DNA polymerase. It transcribes single-stranded RNA into double-stranded DNA.
    RNaseH: chews up the RNA in an RNA:DNA hybrid, which is an intermediate.
    Protease: cleaves itself out of the polyprotein, separates gag and pol, and performs sub-cleavages as well.
    Integrase: responsible for inserting the dsDNA copy of the retrovirus into host DNA.

env: The envelope gene, responsible for binding the receptor on the cells that the virus infects. Made as a singly-spliced message.

Some retroviruses contain a fourth gene, known as an oncogene, which causes tumors in certain animals and transformation in cell culture. More frequently, an oncogene replaces one or more of the three genes. In this case, the virus can only function when it is present in a mixed population with normal virus (otherwise known as 'helper' virus). Such normal retroviruses, and their oncogene-containing counterparts, are classed as 'oncoviruses' within the family Retroviridae. All of these viruses can cause tumors. Oncogenes are nothing more than normal cellular growth genes that have been picked up by retroviruses. Normal retroviruses can activate latent cellular genes; however, the time scale is months to years, rather than days as with viruses that already contain an oncogene.

Interestingly, it has never been demonstrated that a retrovirus of type Oncovirus has been responsible for any human cancer. In fact, there are no known human oncoviruses.

Human Retroviruses

The human retroviruses are few in number, and they are all more complicated than those described above. The known human retroviruses are HTLV-I, HTLV-II, HTLV-V, HIV-1, and HIV-2. (One isolate of HIV-1 is called HTLV-III; HTLV-IV is closely related to HIV-2.) I'll just mention HTLV-I and HIV-1.
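
Before turning to the human viruses, the generic genome layout and the LTR duplication described above can be sketched in a few lines of Python. This is only bookkeeping of segment order, not a model of how reverse transcription actually works; the names RNA_GENOME and provirus_from_rna are made up for this illustration, and the poly(A) tail is left out.

```python
# Illustrative only: tracks the ORDER of genome segments, nothing more.
# RNA_GENOME and provirus_from_rna are hypothetical names for this sketch.

# RNA genome layout from the diagram in the text, without the (A)n tail.
RNA_GENOME = ["R", "U5", "psi", "gag", "pol", "env", "U3", "R"]


def provirus_from_rna(segments):
    """Return the segment order of the integrated DNA form.

    Assumes an RNA layout of the form R-U5-...-U3-R. Both ends of the
    result carry the full Long Terminal Repeat, U3-R-U5.
    """
    if segments[:2] != ["R", "U5"] or segments[-2:] != ["U3", "R"]:
        raise ValueError("expected an RNA layout of the form R-U5-...-U3-R")
    ltr = ["U3", "R", "U5"]
    body = segments[2:-2]  # everything between the terminal R-U5 and U3-R
    return ltr + body + ltr


if __name__ == "__main__":
    # Prints: U3-R-U5-psi-gag-pol-env-U3-R-U5
    print("-".join(provirus_from_rna(RNA_GENOME)))
```

The point of the sketch is simply that the integrated form begins and ends with the same U3-R-U5 unit, whereas the RNA genome has only R-U5 at one end and U3-R at the other.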

HTLV-I, in addition to all the aforementioned genes (gag, pol, env), contains two other genes: X and rex. The enhancer in HTLV-I is actually quite poor. However, the X gene product can bind to the enhancer, and this combination turns the HTLV-I promoter into one of the strongest known. X is part of a family known as trans-activating factors. The rex gene regulates X activity (REgulation X). A small amount of rex activates X. A large amount of rex inhibits X. Hence rex and X keep each other within a range.

The X gene (and to some extent rex directly) can activate expression of Interleukin-2 (IL-2) and the IL-2 Receptor. This is the mechanism by which it causes Adult T-cell Leukemia directly. In fact, HTLV-I is a lousy virus at reproducing itself. Its main mechanism of replication is to stimulate the division of the cells that it has become a part of.

HTLV-I is also capable of activating T-cells. This is not a function of infection: it works with inactivated virus. Nor is it a function of immune response: HTLV-I activates monoclonal T-cells known to be specific for other antigens, such as tetanus toxoid. It is a general phenomenon that retroviruses cannot complete infection in stationary cells, only in dividing ones. HTLV-I has gotten around this by gaining the ability to activate cells, which it can then go on to infect. HIV-1 cannot do this. Since most T-cells are resting cells, HIV alone is only weakly infectious. However, 100% infection can be achieved by co-culturing HIV-1 and HTLV-I. This is particularly important since a large number of double infections have been documented in intravenous drug users (in New York particularly).

HIV

HIV is even more complicated than HTLV-I. To date, 7 genes are known, and an eighth is suspected.
The seven that are known are:

    gag, pol, and env
    tat and art/trs (weak homologues of X and rex)
    A/sor
    B/3'orf

tat and art/trs (the naming has varied as two labs discovered the same gene at different times) are doubly spliced, and they overlap with each other and with env. A/sor overlaps the end of the pol gene, and lies mostly between pol and env. B/3'orf overlaps the end of the envelope gene and runs into the 3' LTR.

What they do:

Regulatory genes:

tat: trans-activation. Absolutely required for viral replication. Without it, one gets only minimal RNA production and no virus. It also accelerates viral protein production. It interacts via a site called tar, present in both DNA and RNA, by an unknown mechanism, one that cannot be duplicated in vitro. William Haseltine describes its action as "Post-transcriptional and pre-translational."

art/trs: art stands for Anti-Repression Transactivator. It determines differential expression. Recall that HIV has only one promoter. Hence all seven genes (six messages, actually, since gag and pol are transcribed together) must be regulated from it. Without art/trs, only the spliced messages are produced, mostly tat and art/trs, and B/3'orf, and some env, but no gag-pol. It may be keyed into a general mechanism whereby genes with introns are not allowed to leave the nucleus prior to splicing.

sor: Without sor, viruses are produced that look normal in every way, but that are not infectious as pure viral filtrate. (They are infectious with co-cultivation.) It makes a p23 (a protein with an apparent molecular weight of 23,000), which is present in the filtrate of virally-infected cultures, but has yet to be detected in pure virus. (A common dilemma. As Malcolm Martin posed the problem: "You can't really get sufficient purified virus, and since in this case that would mean growing at least 10 liters of AIDS-infectious cells, I'm not sure we really want to.")

B/3'orf: This slows down viral production about 10-fold in T4 cells.
It resembles the known oncogenes src and ras somewhat, but it behaves as an anti-oncogene. Removing the gene eliminates the ability of the virus to enter latency. So the gene is probably quite important in the larger disease process.

Structural genes:

pol: The typical 4 functions, as described above for all retroviruses, split into three proteins. RevT and RNaseH are in one protein.

gag: Cleaved into 4 proteins. All are required for viral production.

env: This is a glycoprotein that is found on the outside of the virus particle. It is made as a gp160, and then cleaved into a gp120 and a gp41, the latter of which is anchored in the membrane of the virus. Actually, I should say that env is a GLYCOprotein. In fact, over 50% of its weight is carbohydrate: stripped of its sugars, gp120 is really a p55.

The envelope gene has six functions (phenotypes), and they are arranged quite nicely in 6 linear domains.

    1. T4-binding. This is found in at least three non-contiguous sequences separated by hypervariable regions. They probably form three faces of a binding pocket, while the hypervariable regions are exposed on the surface.
    2. Fusion: near the site where gp41 and gp120 are clipped, which is very similar to the way Influenza mediates fusion.
    3. gp41-gp120 noncovalent interaction. Normally, up to 50% of gp120 goes on to float free. Mutations can lead to 100% floating off. There is no covalent (i.e. disulfide) bonding holding them together.
    4. Transmembrane: serves as an anchor. In its absence, the gp120:gp41 complex floats free, leaving the virus with a naked lipid envelope.
    5. Processing: affects the gp160 -> gp120:gp41 cleavage.
    6. Cytoplasmic: at least in the infected cell. In the virion itself (which has no cytoplasm), it interacts with gag.

gp120 contains the T4 binding site. It interacts with T4, during which time gp41 is inserted into the cell membrane, leading to membrane fusion and viral entry into the cell.

The envelope gene of HIV-1 has a unique property among viruses.
It appears to have several toxic functions of its own. Under certain circumstances, it has a lytic function. At neutral pH, by itself, it will bind to T4 and lead to syncytia formation and cell death. The envelope gene will also prevent chick sensory nerve growth in nerve cells that are dependent on Neuroleukin, but not in those dependent on nerve growth factor. It is believed to compete with Neuroleukin in some as yet undefined fashion.

Mechanism of Disease

HIV causes what can best be described as a 'controlled persistent infection.' It has two main defined mechanisms of killing.

The first is direct, and only affects activated T cells. It appears to be due to an intracellular interaction between gp120-gp41 and the T4 molecule. It is a function of the product of the concentrations of the two, or as William Haseltine puts it, it occurs when "the virus produces a little too much of itself in a cell with a little too much receptor." It is exquisitely selectively cytopathic: it affects a particular lineage, in a particular state of differentiation, in a particular state of activation. Macrophages and HeLa cells produce virus, but do not die.

The second means of killing involves syncytia formation. A single infected T-cell expressing the envelope gene (not necessarily making complete viral particles) is capable of fusing with 100-500 uninfected T-cells, resulting in their death. This interaction is very easy to block with some antibodies to T4, such as OKT4A. OKT4A blocks HIV infection of all cell types. Soluble T4 molecules (produced recombinantly) also block gp120 binding, viral infectivity, and syncytia formation in T cells.

However, none of the natural sera which show protection show absolute protection. Virus production is slowed down, but if the experiments are extended long enough, virus is eventually produced, even in the presence of "protective" antibody.

The envelope gene contains two immunodominant regions.
In most viruses (such as picornaviruses, which cause colds, and Influenza virus), the immunodominant regions differ between virus isolates. Although HIV has a high rate of variability, its immunodominant regions are in fact conserved, which is rather unprecedented. This is because the two immunodominant regions are:

    1. The site of interaction between gp120 and gp41. This is not exposed when the two proteins are associated, as in infectious virions. However, during assembly, up to 50% of all gp120 dissociates, allowing these antibodies to form.
    2. The cleavage site on gp120. This is conserved because it is important in the gp160 to gp120:gp41 processing. However, once processing is accomplished, this region is peripheral and unimportant.

Hence antibodies to neither of these regions will do any good, yet most antibodies in infected persons are made to these two regions. Antibodies are probably not made to the rest of the molecule because it is either not accessible or, more likely, because it is all sugar-coated. Recall that over half the envelope is host-encoded glycosylation, and one does not generally make antibodies to self.

Pathogenesis (mechanism of disease)

In fact, concerning the mechanism of disease, what we do not know about HIV and AIDS almost overwhelms what little we do know. As mentioned above, the envelope protein is directly toxic to T4 cells, even without infection being necessary. It also appears to interfere with Neuroleukin stimulation of nerve viability.

The only cells directly killed by HIV are T4-positive lymphocytes, T4 being predominantly a marker of helper/inducer function. The virus is capable of growing in other cell types without causing the death of the cell. Using a transgenic mouse model of just the viral LTR (promoter) linked to an assayable gene (done because the virus will not infect mice directly), it was shown that the virus is expressed in the thymus (where T-cells mature), heart, eye (which is part of the central nervous system), and tail.
The tail expression is actually expression in the skin, specifically in the Langerhans cells (which are antigen-presenting cells). The expression of the viral promoter in the skin is actually 1000 times higher than that in T cells in this model, which lacks the tat gene and so is not necessarily physiologically relevant. Expression also occurs in T-cells and macrophages, but only after these are stimulated.

In vitro, the most efficient cell lines used to grow HIV are derived from the human colon, mainly because these cell lines produce respectable amounts of virus without being killed by the virus. This has intriguing implications, in that the seemingly higher efficiency with which the virus is spread by anal sex may be due to secondary infection; that is, the virus may infect the colonic mucosa first, and then spread to the immune system from the endogenous colonic infection, rather than directly. It is known that a subset of colon cells express low levels of the CD4 (T4) molecule that serves as the HIV receptor. It would be very difficult to demonstrate this directly, however.

The course of infection is as follows: Immediately after infection, HIV shows up in the central nervous system first, in the bloodstream soon afterwards, and then in most cases is cleared (and undetectable or barely detectable) for a period of anywhere from 1-10 years. An acute flu-like syndrome may accompany initial infection. After infection, antibodies to both env and gag are made. At some point in time, the virus ends its latency, followed shortly afterwards by a drop in the T-cell count, and then, with variable delay, the onset of CNS symptoms or Opportunistic Infections. Antibodies to gag drop as the viral titer rises, while the antibodies to the envelope remain high. Since, as you recall, these are mainly directed at conserved non-essential or hidden regions, they are generally not protective, and may not even prove protective against an ab initio infection.
Symptomatic disease absolutely depends on continued viral replication. Generally death results from opportunistic infection, although as progress is made in treating and preventing opportunistic infections, an increasing number of deaths are the result of AIDS encephalopathy, the direct infection of the brain by HIV.

The presence of HIV in the body and its continued replication cause a slow, progressive degeneration of the immune and central nervous systems, which has been designated the Acquired Immune Deficiency Syndrome, or AIDS.

Sources:

Everything above has a reference somewhere, I'm sure. Many of these I could actually find again if I tried. However, much of the general material of what I said comes from personal notes from Jack Lenz's course in "Animal Virology" at Albert Einstein College of Medicine, and from the following two seminars given as part of the Public Health Research Institute's 45th anniversary research symposium on "Retroviruses and Disease":

    Dr. Malcolm A. Martin, "Structure and Functions of the HIV genome."
    Dr. William Haseltine, "Genetic Regulation and pathogenesis of the AIDS virus."

--
Craig Werner (future MD/PhD, 3 years down, 4 to go)
werner@aecom.YU.EDU -- Albert Einstein College of Medicine
(1935-14E Eastchester Rd., Bronx NY 10461, 212-931-2517)
"If you think you might faint, don't worry; you can always go into psychiatry."