[sci.med.aids] HIV: a detailed description of the AIDS agent

werner@aecom.YU.EDU (Craig Werner) (01/09/88)

HIV: Up close and personal
An intimate look at the AIDS agent.

	This article is about the molecular biology of the Human
Immunodeficiency virus, the causative agent of AIDS.  It is only
tangentially related to the disease itself, but only by understanding
the complexity of the virus can one ever hope to counteract it. And
HIV is the most complex retrovirus known. It is, in William Haseltine's 
words, the Rolls-Royce of retroviruses, and as such, is fascinating
in its own right, even more so because of the disease that it causes.

	HIV is a retrovirus. By this it is meant that it contains
a single stranded RNA genome, and that as an obligate step in its
life cycle it transcribes this RNA into DNA, which then serves as
a template for both messages and progeny genomes.  This DNA is
inserted into the cell's chromosomes, which means that retroviruses
are capable of the ultimate in persistent infections. They will
remain a part of the cell until that cell dies.
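
The copying step described above can be sketched in a few lines of Python.
This is only a toy illustration of base pairing, not a model of the real
enzyme; the sequence fragment and the function names are made up for the
example.

```python
# Toy model of the retroviral copying step: RNA genome -> double-stranded DNA.
# Sequences are written 5'->3'. The example fragment is hypothetical.

RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_transcribe(rna):
    """Return the complementary (minus-strand) DNA, written 5'->3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna))

def second_strand(minus_dna):
    """Return the plus-strand DNA complementary to the minus strand."""
    return "".join(DNA_TO_DNA[base] for base in reversed(minus_dna))

genome_rna = "GGUCUCUCUGGUUAGACCAGAUC"   # hypothetical fragment
minus = reverse_transcribe(genome_rna)
plus = second_strand(minus)

# The plus strand carries the same information as the RNA, with T for U.
assert plus == genome_rna.replace("U", "T")
```

The two strands together stand in for the double-stranded DNA copy that
goes on to be inserted into the host chromosome.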

	Most retroviruses have the following structure:

                     gag            pol          env
	R-U5--psi----------------------------------------------U3-R-(A)n

In the transcription process, the ends are partially duplicated, so 
that the integrated form contains identical ends of the structure:

	U3-R-U5,  known as the Long Terminal Repeat, or LTR.

	The LTR contains a terminator, an enhancer, and the promoter.
Transcription begins at the 5' promoter and ends at the 3' terminator.
The 3' enhancer and promoter are not used by the retrovirus, but can
cause trouble by activating adjacent genes.
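
The end duplication described above can be sketched as follows. This is a
toy rearrangement of region labels only, assuming the generic layout in the
diagram; the function name is invented for the example and the poly(A) tail
is left out.

```python
# Toy illustration of how reverse transcription duplicates the genome ends.
# Region names follow the diagram above; these are labels, not sequence.

def provirus_from_genome(genome):
    """Given RNA regions [R, U5, ..., U3, R], return the DNA provirus layout."""
    if genome[:2] != ["R", "U5"] or genome[-2:] != ["U3", "R"]:
        raise ValueError("expected a genome of the form R-U5-...-U3-R")
    body = genome[2:-2]              # psi and the genes
    ltr = ["U3", "R", "U5"]          # Long Terminal Repeat
    return ltr + body + ltr

genome = ["R", "U5", "psi", "gag", "pol", "env", "U3", "R"]
provirus = provirus_from_genome(genome)
print("-".join(provirus))
# prints: U3-R-U5-psi-gag-pol-env-U3-R-U5
```

Both ends of the integrated form come out identical, which is why the 3'
copy carries an enhancer and promoter that the virus itself does not use.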

The other four necessary features are:

psi: 	a site on the RNA necessary for the RNA to be packaged
	into retroviral particles.
and three genes:
gag:	(stands for group antigen, a term which is now archaic) This
	is the capsid protein(s), which bind and package the RNA genome.
pol:	Polymerase.  Pol is actually made as a fusion protein with gag,
	a gag-pol polyprotein, which is then cleaved.  There are four
	functions of the pol gene, and in some retroviruses they are
	cleaved into four separate proteins, in others less. They are:
	Reverse Transcriptase: both an RNA and a DNA-directed DNA
		polymerase. It transcribes single-stranded RNA into 
		double stranded DNA.
	RNaseH: chews RNA in an RNA:DNA hybrid, which is an intermediate.
	Protease: cleaves itself out of polyprotein, and separates gag
		and pol, and performs sub-cleavages as well.
	Integrase: responsible for inserting dsDNA copy of retrovirus
		into host DNA.
env:	The envelope gene, responsible for binding the receptor on the
		cells that it infects. Made as a singly-spliced message.

Some retroviruses contain a fourth gene, known as an oncogene, which
causes tumors in certain animals, and transformation in cell culture.
More frequently, an oncogene replaces one or more of the three genes.
In this case, the virus can only function when it is present in a 
mixed population with normal virus (otherwise known as 'helper' virus).
Such normal retroviruses, and their oncogene containing counterparts,
are classed as 'oncoviruses' within the family Retroviridae.
	All of these viruses can cause tumors.  Oncogenes are nothing
more than normal cellular growth genes that have been picked up by
retroviruses.  Normal retroviruses can activate latent cellular genes;
however, the time scale is months to years rather than days, as with
viruses that already contain an oncogene.

	Interestingly, it has never been demonstrated that a retrovirus
of type Oncovirus has been responsible for any human cancer. In fact,
there are no known human oncoviruses.


Human Retroviruses

	The human retroviruses are few in number, and they are all more
complicated than those described above.  The known human retroviruses
are HTLV-I, HTLV-II, HTLV-V, HIV-1, and HIV-2. (One isolate of HIV-1
is called HTLV-III; HTLV-IV is closely related to HIV-2.) I'll just
mention HTLV-I and HIV-1.

HTLV-I, in addition to all the aforementioned genes (gag, pol, env),
contains two other genes: X and rex.
	The enhancer in HTLV-I is actually quite poor.  However, the
X gene can bind to the enhancer and this combination turns the HTLV-I
promoter into one of the strongest known. X is part of a family known
as trans-activating factors.
	The rex gene regulates X activity (REgulation X). A small
amount of rex activates X. A large amount of rex inhibits X. Hence
rex and X keep each other within a range.
	The X gene (and to some extent rex directly) can activate
expression of Interleukin-2 (IL-2) and the IL-2 Receptor. This is
the mechanism by which it causes Adult T-cell Leukemia directly.
In fact, HTLV-I is a lousy virus at reproducing itself. Its main
mechanism of replication is to stimulate the division of the cells
that it has become a part of.

	HTLV-I is also capable of activating T-cells.  This is not
a function of infection. It works with inactivated virus.  Nor is
it a function of immune response. HTLV-I activates monoclonal
T-cells known to be specific for other antigens such as tetanus
toxoid.
	It is a general phenomenon that retroviruses cannot complete
infection in stationary cells, only dividing ones. HTLV-I has gotten
around this by gaining the ability to activate cells, which it can
then go on to infect.
	HIV-1 cannot do this.  Since most T-cells are resting cells,
HIV alone is only weakly infectious.  However, 100% infection can be
achieved by co-culturing HIV-1 and HTLV-I.  This is particularly
important since a large number of double infections have been
documented in intravenous drug users (in New York particularly).



HIV
	HIV  is even more complicated than HTLV-I. To date, 7 genes
are known, and an eighth is suspected.

	The seven that are known are:

	gag, pol, and env
	tat and art/trs   	(weak homologues of X and rex)
	A/sor
	B/3'orf

tat and art/trs (the naming has varied as two labs discovered the same
gene at different times) are double spliced and they overlap with 
each other and with env.  A/sor overlaps the end of the pol gene,
and lies mostly between pol and env.  B/3'orf overlaps the end of
the envelope gene and runs into the 3'LTR.

What they do:
Regulatory genes:
tat: 	trans-activation. Absolutely required for viral replication. 
	Without it, one gets only minimal RNA production and no virus.
	It also accelerates viral protein production.  It interacts
	via a site called tar, present in both DNA and RNA, by an
	unknown mechanism, one that cannot be duplicated in-vitro.
	William Haseltine describes its action as "Post-transcriptional
	and pre-translational."
art/trs: art stands for Anti-Repression Transactivator.  It determines
	differential expression. Recall that HIV has only one promoter.
	Hence all seven (six actually, gag and pol are transcribed
	together), must be regulated.  Without art/trs, only the
	spliced messages are produced, mostly tat and art/trs, and 
	B/3'orf, and some env, but no gag-pol.  It may be keyed into
	a general mechanism whereby genes with introns are not allowed
	to leave the nucleus prior to splicing.
sor:	Without sor, viruses are produced that look normal in every way,
	but that are not infectious as pure viral filtrate. (They are
	infectious with co-cultivation.) It makes a p23 (a protein with
	an apparent molecular weight of 23,000), which is present in
	the filtrate of virally-infected cultures but has yet to be
	detected in pure virus. (A common dilemma - as Malcolm Martin
	posed the problem: "You can't really get sufficient purified
	virus, and since in this case that would mean growing at least
	10 liters of AIDS-infectious cells, I'm not sure we really
	want to.")
B/3'orf: This slows down viral production about 10-fold in T4 cells.
	It resembles the known oncogenes src and ras somewhat, but
	it behaves as an anti-oncogene.  Removing the gene eliminates
	the ability of the virus to enter latency.  So the gene is
	probably quite important in the larger disease process.
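
The art/trs behaviour described above can be summarized as a small lookup.
This is only a sketch of the text's description, not of the underlying
mechanism: the splice-class labels are simplified, A/sor is left out because
its splicing is not described here, and the function name is invented.

```python
# Toy lookup of which HIV messages appear with and without art/trs,
# following the description above. Splice classes are simplified labels.

MESSAGES = {
    "gag-pol": "unspliced",
    "env": "singly spliced",
    "tat": "doubly spliced",
    "art/trs": "doubly spliced",
    "B/3'orf": "spliced",
}

def messages_produced(art_present):
    """Return the messages expected to be made, per the description above."""
    if art_present:
        return sorted(MESSAGES)
    # Without art/trs only the spliced messages are produced.
    return sorted(name for name, kind in MESSAGES.items() if kind != "unspliced")

print(messages_produced(art_present=False))
print(messages_produced(art_present=True))
```

Note that the text says only "some" env is made without art/trs; a yes/no
lookup like this cannot show that difference in amount.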

Structural genes:
pol:	Typical 4 functions, as described above for all retroviruses.
	Split into three proteins. RevT and RNaseH are in one protein.
gag:	Cleaved into 4 proteins. All are required for viral production.

env:	This is a glycoprotein, that is found on the outside of the
	virus particle. It is made as a gp160, and then cleaved into
	a gp120 and a gp41, which is anchored into the membrane of the
	virus. Actually I should say that env is a GLYCOprotein. In fact,
	over 50% of its weight is carbohydrate: gp120 is really a p55.

	The envelope gene has six functions (phenotypes) and they are
	arranged quite nicely in 6 linear domains.
	1. T4-binding.  This is found in at least three non-contiguous
		sequences separated by hypervariable regions. They
		probably form three faces of a binding pocket, while
		the hypervariable regions are exposed to the surface.
	2. Fusion: near the site where gp41 and gp120 are clipped, which
		is very similar to the way Influenza mediates fusion.
	3. gp41-gp120 noncovalent interaction.  Normally, up to 50% of
		gp120 goes on to float free. Mutations can lead to
		100% floating off. There is no covalent (i.e. Disulfide)
		bonding holding them together.
	4. Transmembrane - serves as an anchor. In its absence, the
		gp120:gp41 complex floats free, leaving the virus
		with a naked lipid envelope.
	5. Processing: affecting the gp160 -> gp120:gp41 cleavage.
	6. Cytoplasmic: at least in the infected cell. In the virion
		itself (which has no cytoplasm), interacts with gag.

	gp120 contains the T4 binding site. It interacts with T4, during
	which time gp41 is inserted into the cell membrane, leading to
	membrane fusion, and viral entry into the cell.
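
The remark above that gp120 is "really a p55" can be checked with simple
arithmetic. This sketch takes the apparent weights in the names at face
value (120 kD glycosylated, 55 kD protein alone), which is an approximation.

```python
# Back-of-envelope check of the carbohydrate share of gp120, assuming the
# apparent weights implied by the names gp120 and p55.

glycosylated_kd = 120
protein_only_kd = 55

carbohydrate_kd = glycosylated_kd - protein_only_kd
carbohydrate_fraction = carbohydrate_kd / glycosylated_kd

print(f"carbohydrate: {carbohydrate_kd} kD, {carbohydrate_fraction:.0%} of total")
# prints: carbohydrate: 65 kD, 54% of total
```

That is consistent with the statement that over 50% of the weight is
carbohydrate.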

The envelope gene of HIV-1 has a unique property among viruses.
It appears to have several toxic functions of its own.  Under 
certain circumstances, it has a lytic function.  At neutral pH,
by itself, it will bind to T4, lead to syncytia formation, and
cell death. 
	The envelope gene will also prevent chick sensory nerve growth
in nerve cells that are dependent on Neuroleukin, but not those
dependent on nerve growth factor. It is believed to compete with
Neuroleukin in some as yet undefined fashion.



Mechanism of Disease
	HIV causes what best can be described as a 'controlled persistent
infection.'  It has two main defined mechanisms of killing.  The first
is direct, and only affects activated T cells.  It appears to be due
to an intracellular interaction between gp120-gp41 and the T4 molecule.
It is a function of a product of the concentrations of the two, or
as William Haseltine puts it, when, "the virus produces a little too
much of itself in a cell with a little too much receptor." It is
exquisitely selectively cytopathic: it affects a particular lineage,
in a particular state of differentiation, in a particular state of
activation. Macrophages and HeLa cells produce virus, but do not die.
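
The "product of the concentrations" idea above can be sketched as a toy
threshold rule. The units, the example values, and the threshold are all
hypothetical, chosen only to show why a cell can make a lot of virus and
still survive if it carries little receptor.

```python
# Toy model of the "product of concentrations" idea for direct killing.
# Units, values, and the threshold are hypothetical illustrations.

def is_cytopathic(envelope_level, t4_level, threshold=1.0):
    """True when envelope x T4 exceeds an assumed threshold."""
    return envelope_level * t4_level > threshold

# High virus output with little receptor: the cell survives.
print(is_cytopathic(envelope_level=5.0, t4_level=0.1))   # False
# Moderate virus output with abundant receptor: the cell dies.
print(is_cytopathic(envelope_level=2.0, t4_level=2.0))   # True
```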
	The second means of killing involves syncytia formation. A
single infected T-cell making the envelope gene (not necessarily 
complete viral particles) is capable of fusing with 100-500 uninfected
T-cells, resulting in their death.  This interaction is very easy
to block with some antibodies to T4, such as OKT4A.  OKT4A blocks
HIV infection of all cell types. Soluble T4 molecules (produced
recombinantly) also block gp120 binding, viral infectivity, and 
syncytia formation in T cells.
	However, all natural sera which show protection don't really
show absolute protection.  Virus production is slowed down, but if
the experiments are extended long enough virus is eventually produced,
even in the presence of "protective" antibody.

	The envelope gene contains two immunodominant regions.  In
most viruses (such as picornaviruses, which cause colds, and Influenza
virus), the immunodominant regions differ between virus isolates.  Although
HIV has a high rate of variability, its immunodominant regions are in fact
conserved, which is rather unprecedented.
	This is because the two immunodominant regions are:
	1. The site of interaction between gp120 and gp41. This is not
	exposed when the two proteins are associated as in infectious
	virions. However, during assembly, up to 50% of all gp120
	dissociates, allowing these antibodies to form.
	2. The cleavage site on gp120.  This is conserved because it
	is important in the gp160 to gp120:gp41 processing. However,
	once processing is accomplished, this region is peripheral
	and unimportant.
Hence antibodies to neither of these regions will do any good, yet most
antibodies in infected persons are made to these two regions.  Antibodies
are probably not made to the rest of the molecule because it is either
not accessible, or more likely, because it is all sugar-coated. Recall
that over half the envelope is host-encoded glycosylation, and one
does not generally make antibodies to self.


Pathogenesis (mechanism of disease)

	In fact, concerning the mechanism of disease, what we do not
know about HIV and AIDS almost overwhelms what little we do know.
	As mentioned above the envelope protein is directly toxic to
T4 cells, even without infection being necessary.  It also appears
to interfere with Neuroleukin stimulation of nerve viability.
	The only cells directly killed by HIV are T4-positive
lymphocytes (T4 is predominantly a marker of helper/inducer
function).  The virus is capable of growing in other cell types without
causing the death of the cell.  Using a transgenic mouse model of
just the viral LTR (promoter) linked to an assayable gene (done
because the virus will not infect mice directly), it was  
shown that the virus is expressed in the thymus (where T-cells
mature), heart, eye (which is part of the central nervous system),
and tail.  The tail expression is actually expression in the skin,
the Langerhans cells (which are antigen presenting cells)
specifically. The expression of the viral promoter in the skin is 
actually 1000 times higher than that in T cells in this model, 
which lacks the tat gene, so is not necessarily physiologically
relevant.  Expression also occurs in T-cells and macrophages, but only
after these are stimulated. 
	In-vitro, the most efficient cell lines used to grow HIV
are derived from the human colon, mainly because these cell lines
produce respectable amounts of virus without being killed by the virus.
	This has intriguing implications, in that the seemingly
higher efficiency by which the virus is spread by anal sex 
may be due to secondary infection, that is, the virus may 
infect the colonic mucosa first, and then spread to the immune
system from the endogenous colonic infection, rather than directly.
It is known that a subset of colon cells express low levels of the
CD4 (T4) molecule that serves as the HIV receptor. It would be very
difficult to demonstrate this directly, however.

The course of infection is as follows:
	
	Immediately after infection, HIV shows up in the central 
nervous system first, in the bloodstream soon afterwards, and then in 
most cases is cleared (and undetectable or barely detectable) for a 
period of anywhere from 1-10 years.  An acute flu-like syndrome may
accompany initial infection. After infection antibodies to both
env and gag are made.  At some point in time, the virus ends its
latency, followed shortly afterwards by a drop in the T-cell count,
and then with variable delay, the onset of CNS symptoms, or Opportunistic
Infections.  Antibodies to gag drop as the viral titer rises, but the
antibodies to the envelope remain high. Since, as you recall, they
are mainly to conserved non-essential or hidden regions, they are
generally not protective, and may not even prove protective against
an ab initio infection.  Symptomatic disease absolutely depends on
continued viral replication. Generally death results from opportunistic
infection, although as progress is made in treating and preventing
opportunistic infections, an increasing number of deaths are the
result of AIDS encephalopathy, the direct infection of the brain
by HIV.
	The presence in the body and continued viral replication of
HIV causes a slow, progressive degeneration of the immune and central
nervous systems, which has been designated the Acquired Immune
Deficiency Syndrome, or AIDS.



Sources:
	Everything above has a reference somewhere, I'm sure. Many of
these I could actually find again if I tried. However, much of the
general material of what I said comes from personal notes from Jack Lenz's
course in "Animal Virology" at Albert Einstein College of Medicine, 
and the following two seminars given as part of the Public Health
Research Institute's 45th anniversary research symposium on
"Retroviruses and Disease."
	Dr. Malcolm A. Martin, "Structure and Functions of the HIV genome."
and	Dr. William Haseltine, "Genetic Regulation and pathogenesis
		of the AIDS virus."
-- 
	        Craig Werner   (future MD/PhD, 3 years down, 4 to go)
	     werner@aecom.YU.EDU -- Albert Einstein College of Medicine
              (1935-14E Eastchester Rd., Bronx NY 10461, 212-931-2517)
"If you think you might faint, don't worry; you can always go into psychiatry."