[sci.virtual-worlds] an introduction to VR

MWELLS@FALCON.AAMRL.WPAFB.AF.MIL (Maxwell Wells) (04/11/91)

The following are some excerpts from a report put together as a primer for
"management".  I post it in response to the requests for primers which have
appeared recently.  I hope it is of use to someone.  I work as a contract
researcher at the Armstrong Aerospace Medical Research Laboratory, hence the
Air Force bias.  If I have neglected anyone or any product, it is through 
ignorance rather than malice.  I would appreciate any comments and information 
which would help me to make it better. 

[Moderator(Mark): This looks like it would be a great start for an FAQ
 posting.  After whatever improvements are made to it, perhaps I could
 add it to the archive and post it monthly-- what do you think?  Let me
 (madsax@milton.u.washington.edu) know.]

Maxwell Wells 
Senior Scientist 
Logicon Technical Services Inc 
P.O. Box 317258
Dayton, OH 45431 
Tel 513 255 5215 
MWELLS@FALCON.AAMRL.WPAFB.AF.MIL


        1.0     INTRODUCTION AND BACKGROUND 

Virtual reality may be considered to have been born in the middle 1960s, based
on the work of Ivan Sutherland from the University of Utah.  A paper, published
in 1972 by D.L. Vickers, one of Sutherland's colleagues, describes "an
interactive computer graphics system utilizing a head-mounted display and
wand.  The display, worn like a pair of eyeglasses, gives an illusion to the
observer that he is surrounded by three-dimensional, computer-generated
objects."  The three components of a virtual reality system are: a DISPLAY, a
TRANSDUCER and an IMAGE GENERATOR.  The display used in Vickers' early work
consisted of two cathode ray tubes (CRTs).  The head movement transducer
consisted of six rotary pulse generators attached to a telescoping steel
shaft which was connected to the head-mounted display.  The third component,
the image generator, was a motley collection of a PDP-10 computer, a matrix
multiplier, a clipping divider and a vector generator. 


Developments in relevant technology occurred over the next 10 years, driven by
a variety of sources.  Small electromagnetic CRTs were produced by companies
such as Thorn, Thomas and Hughes with military funding (AAMRL was instrumental
in providing some of this funding).  A strong market for consumer electronics
resulted in the production of small flat-panel displays.  A military
requirement for helmet-mounted sights drove the development of head movement
transducers, primarily by Honeywell and Polhemus. The Honeywell approach used
helmet-mounted IR sources, a set of cockpit-mounted IR sensors and a
triangulation technique.  Polhemus produced the Spasyn system, in which a
radiated magnetic field produced a current in three orthogonal coils mounted
on the helmet.  The personal computing revolution in the 70s and 80s made
available fast, cheap, digital image generation. 

        2.0     CURRENT STATE OF TECHNOLOGY 

This section will define the requirements and summarize the state of the art
of each component. 

                2.1     DISPLAYS 

Using the verb display to mean "to present information to the eye or the mind", 
the noun display may be interpreted to mean a device which presents 
information, irrespective of the sensory modality.  Thus, reference will be 
made to visual, auditory and tactile displays (there have been limited 
attempts to produce displays for the senses of taste and smell). 

Visual displays 

Most effort and success have been associated with visual displays.  It is now
possible to purchase, off the shelf, monochrome electromagnetic CRTs as small
as 18 mm in diameter, with 875 line resolution, but at a high price (>$5k).
Flat-panel color LCD displays are produced in large quantities for consumer
electronics products (eg the Sony Watchman), and at low prices (less than
$100).  CRTs have the advantages of small display area and high light output.
This makes them suitable for display designs with folded optical paths.  LCD
displays provide a flat-panel display of low weight and optional color, but
with poor resolution and relatively low light output. Several attempts have
been made to produce a cheap head-mounted display using LCD technology.
Successful prototypes have been produced by NASA (using both LCD technology
and cheap CRTs from a video camera) and The Air Force Institute of Technology
(AFIT).  Accounts of home produced systems occur regularly on the network. The
company VPL markets the "EyePhone", which consists of two color LCD screens
mounted in front of the eyes with lenses and prisms to produce a binocular
image at optical infinity. 

Systems with high resolution use miniature CRTs as image sources.  Perhaps the
most capable of such systems is the Visually Coupled Airborne Systems
Simulator (VCASS) at the Armstrong Aerospace Medical Research Laboratory.
VCASS consists of two 18mm CRTs and pancake window optics, providing a
binocular FOV 120 deg wide and 60 deg high.  The system was produced with
military funding, and no expense was spared in assembling it from what was,
10 years ago, state of the art technology.  VCASS includes high bandwidth
video amplifiers for raster and
direct draw image production, programmable analog circuits for pre-distorting
the images (to account for optical distortion in the light path), internal sync
generators (to drive the 1000 line tubes) and numerous other options.
Helmet-mounted displays have also been produced by Honeywell, for use in the
Apache attack helicopter, and by GEC, for use in various British research
efforts and in the F-16 night attack system "Falcon Eye".  Ferranti (now part
of GEC) and
Kaiser Electronics have also produced successful devices. 
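
The pre-distortion mentioned above for VCASS was done with programmable analog
circuits, but the idea can be illustrated digitally.  What follows is only a
minimal sketch, assuming a simple one-coefficient radial distortion model for
the optics.  The model, the coefficient k and the function names are
assumptions made for illustration and are not details of VCASS.

```python
def optics_distort(x, y, k):
    """Assumed optics model: a point at radius r is displaced to r * (1 + k * r^2)."""
    r2 = x * x + y * y
    s = 1.0 + k * r2
    return x * s, y * s


def predistort(x, y, k, iterations=20):
    """Find the point (u, v) to draw so that the optics move it to (x, y).

    Uses fixed-point iteration, which converges when k * r^2 is small.
    """
    u, v = x, y
    for _ in range(iterations):
        s = 1.0 + k * (u * u + v * v)
        u, v = x / s, y / s
    return u, v


if __name__ == "__main__":
    u, v = predistort(0.5, 0.3, 0.2)
    print("draw at", (u, v), "-> seen at", optics_distort(u, v, 0.2))
```

Drawing each point at its pre-distorted position means that the distortion
added by the optics cancels out, so the viewer sees the intended geometry.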


Auditory displays. 

Headphones, the technology necessary to couple the ears to a sound source,
have been getting smaller, lighter and better.  An auditory display for use in
virtual reality requires a headphone and a simulated sound source.  The
current state of technology of sound image generators will be described in
more detail under the heading "image generators".  Most interest has been
focussed on 3D sound generators, for simulating the spatial location of a
sound source.   However, consideration has been given to sound quality and
content, for producing, for example, auditory icons (sound of a gas tank being
sucked dry) or a realistic computer generated voice. 


Tactile displays 

Although some effort has gone into tactile displays, much remains to be done.
Stick shakers (to indicate an impending stall in an aircraft) and shaped knobs
(to aid control recognition) have met with some success, but person-mounted
tactile displays have not. AAMRL experimented with pneumatic bladders on the
ends of the fingers, for indicating the surface of virtual objects, but the
lags in the system limited its utility.  There have been some attempts to use
vibro-tactile stimulation to simulate touch, but with limited success.  The
most successful method to date for giving tactile feedback has been with
mechanical exoskeletons which give feedback of, for example, molecular
repulsion in systems designed to aid in the synthesis of molecules. 

                2.2     TRANSDUCERS 


In order for a person to interact with a virtual environment, their actions
need to be communicated to the virtual reality generator.  The transducer
converts an action into a form which can be interpreted by a computer. Actions
include movements (of the head, eyes, hands and body), speech and brain
activity. 

Movement transducers 

The measurement of head movement provides signals which allow the image
generator to produce an output appropriate to where the head is pointing. Head
movement has been measured optically, acoustically, mechanically and
magnetically, or with combinations of these methods.  The most widespread
system is the Polhemus Spasyn system which uses a varying magnetic field to
induce a current in 3 orthogonal coils.  This technique is insensitive to most
interference and only requires a very small sensor to be mounted on the head.
However, its accuracy is affected by metal and most environments have to be
extensively mapped.  Another magnetic system, "The Bird" uses a DC magnetic
field and is gaining popularity because of its lower cost. 

Hand movement provides signals which the computer needs to allow the operator
to manipulate objects.  The most popular system has been the "Data Glove", which
consists of optical fibers sewn into a tight glove.  Finger movement causes
bending of the fibers and a change in the amount of light which they transmit.
The Mattel Powerglove, a more recent development, uses the same technology.
With the Data Glove, hand movements are measured using a magnetic sensor
mounted on the glove.  The Powerglove uses ultrasonic transmitters and
receivers and a triangulation technique to achieve the same purpose. 
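
The ultrasonic approach described above can be sketched as follows.  This is
an illustration under stated assumptions and not the Powerglove's actual
algorithm: it assumes one transmitter, three receivers at known positions in
a plane, and measured times of flight.  The receiver layout, the speed of
sound and the function name are all hypothetical.  Working from distances in
this way is often called trilateration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 C (assumed value)


def locate_transmitter(t1, t2, t3, d, e):
    """Estimate the transmitter position from three times of flight (seconds).

    Assumed receiver layout: (0, 0, 0), (d, 0, 0) and (0, e, 0), with the
    transmitter somewhere in front of the receiver plane (z >= 0).
    """
    r1, r2, r3 = (SPEED_OF_SOUND * t for t in (t1, t2, t3))
    # Subtracting the sphere equations pairwise gives x and y directly.
    x = (r1 * r1 - r2 * r2 + d * d) / (2.0 * d)
    y = (r1 * r1 - r3 * r3 + e * e) / (2.0 * e)
    # Clamp at zero so that measurement noise cannot produce a negative root.
    z = math.sqrt(max(r1 * r1 - x * x - y * y, 0.0))
    return x, y, z


if __name__ == "__main__":
    true_pos = (0.2, 0.1, 0.5)
    receivers = [(0.0, 0.0, 0.0), (0.6, 0.0, 0.0), (0.0, 0.4, 0.0)]
    times = [math.dist(true_pos, r) / SPEED_OF_SOUND for r in receivers]
    print(locate_transmitter(*times, d=0.6, e=0.4))
```

With only three receivers in one plane the solution is ambiguous between the
front and the back of the plane, which is why the sketch assumes z >= 0.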

A range of "3D mice" is currently available on the market.  The principle of
each is the same, namely to produce multi-axis control inputs into a computer
for the manipulation of virtual objects.  Realizations of these concepts have
been available for some time, having been developed for aviation (eg 4-axis
hand-controllers for helicopters) and other applications (eg control of earth
moving and logging equipment). 

Body movement transduction is necessary for a person to move naturally through
the virtual environment.  A body glove, using the same technology as the Data
Glove, has been attempted, but movements are restricted by the wire
connection.  Walking, for example through the design of a building, has been
simulated by a steerable treadmill.  A fixed bicycle, rowing boat or car would
provide similar effects for appropriate environments. Currently, large
movements through the virtual environment are commonly achieved by having the
person fly, using hand motions to control the direction and speed of travel. 

Eye movements may be of some importance to virtual reality, but probably more
as a control input than as a prerequisite for achieving an effective
simulation. The state of the art for eye movement transducers is quite
advanced, having been driven by medical/psychological research.  Options
include transducing electrical activity around the eye, corneal reflection (of
various sorts) and direct imaging followed by video image processing. 

Speech transducers. 

Direct Voice Input (DVI) has been under development for a number of years,
driven by potential commercial (word processing) and military (control input)
applications.  Although far from mature technology, existing systems are
capable of recognizing a limited vocabulary and are being marketed for the
home computer market.  DVI in virtual reality may be another tool for
increasing the naturalness with which a person interacts with a machine. 

Thought transducers. 

Although bordering on the realm of science fiction, some research has been
conducted on the possibility of using "thoughts" (actually, measurable brain
activity) for controlling things.  Some of that research has taken place at
AAMRL, where success was claimed for an experiment in which operators were
trained to conduct a compensatory tracking task using brain electrical
activity.  The potential use of electrical brain activity as a control input
is limited by the lag imposed by the requisite filtering of the raw signals.
The Air Force are interested in this area because of the potential for
monitoring, and having systems respond to, the psychological condition of
their pilots while they are engaged in high-stress activities. It may be
possible to find a use for thought transducers in virtual reality at some time
in the future. 

                2.3     IMAGE GENERATION 

Image -  "A reproduction or imitation of the form of a person or a thing.  An
exact likeness."  Visual image generation is just one part of what has to be
achieved.  Artificial stimulation to the other senses must also be considered.
However, since the image of something in a particular sensory modality is
only as useful as the ability to display to that modality, most effort has
been expended on visual and auditory images. 

Visual image generation. 

The objectives of visual image generation in virtual reality are similar to
the objectives in aircraft simulation, in terms of complexity and realism.
However, virtual reality can have the added requirement that the user makes
head movements, which means faster scene changes. Effective visual simulation
requires computational complexity (for effective perspective, hidden line
removal etc) and computational speed (for acceptable scene update rates).  The
required computing power is now available in workstation-level machines.  The
most popular platform is the Silicon Graphics workstation.  Enthusiasts have
reported
successful attempts at simple virtual realities using the Amiga PC. 
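
The link between head movement and perspective described above can be shown
with a small sketch.  It is a simplification under stated assumptions: only
head yaw is considered (no pitch or roll), the axes are x to the right, y up
and z forward, and the function names and focal length are hypothetical.

```python
import math


def world_to_head(point, head_pos, yaw_rad):
    """Express a world point in the head's frame (positive yaw turns the head right)."""
    px, py, pz = (point[i] - head_pos[i] for i in range(3))
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    x_head = px * c - pz * s   # component along the head's "right" direction
    z_head = px * s + pz * c   # component along the head's "forward" direction
    return x_head, py, z_head


def project(point, head_pos, yaw_rad, focal=1.0):
    """Perspective-project a world point; returns None if it is behind the viewer."""
    x, y, z = world_to_head(point, head_pos, yaw_rad)
    if z <= 0.0:
        return None
    return focal * x / z, focal * y / z


if __name__ == "__main__":
    target = (0.0, 0.0, 5.0)
    for yaw_deg in (0, 5, 10):
        print(yaw_deg, project(target, (0.0, 0.0, 0.0), math.radians(yaw_deg)))
```

As the head turns to the right, the projected point moves to the left of the
image, which is the scene change that has to be recomputed on every update.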

Audio image generation. 

Spatially distinct sounds are important attributes of a convincing virtual
reality.  Sounds are subconsciously localized and used by people in the real
world, for perceiving warnings, for distinguishing conversations against a
noisy background (the cocktail party phenomenon) and for orientation.  Several
components of a sound are used to determine its spatial origin.  The time
difference between a sound reaching each of our ears, and the amplitude of the
sound in each ear, are important cues for determining the origin in azimuth.
In addition, the complex folds in our outer ears (pinnae) change the sound
quality on its journey to the cochlea.  It is these changes which allow
localization in elevation, monaural sound localization, and the ability to
distinguish sounds emanating from the front and the back. 
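
To give a feel for the size of the time difference cue, here is a minimal
sketch using a common spherical-head approximation (often attributed to
Woodworth).  The head radius, the speed of sound and the function name are
assumed values chosen for illustration.  The sketch does not model amplitude
differences or the effect of the outer ear.

```python
import math

HEAD_RADIUS = 0.0875     # metres, an assumed "average" head
SPEED_OF_SOUND = 343.0   # m/s in air at roughly 20 C (assumed value)


def interaural_time_difference(azimuth_rad):
    """Approximate arrival-time difference between the ears, in seconds.

    azimuth_rad is 0 straight ahead and +pi/2 directly to one side.
    Spherical-head approximation, valid for |azimuth| <= pi/2.
    """
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (azimuth_rad + math.sin(azimuth_rad))


if __name__ == "__main__":
    for deg in (0, 30, 60, 90):
        itd_us = interaural_time_difference(math.radians(deg)) * 1e6
        print(f"{deg:3d} deg: {itd_us:6.1f} microseconds")
```

Under these assumptions the difference is zero straight ahead and grows to
well under a millisecond for a source directly to one side.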

To produce a successful virtual auditory world, a space stabilized sound has
to be generated.  This requires a sound source, knowledge of head position,
and a device to alter the sound quality and present the appropriately
transformed sounds to each ear.  Research in auditory localization is at
a respectable stage of development, and the relevant principles and data have
been used to produce a variety of 3D sound image generators.  AAMRL have had
experience with two types of auditory image generator, an electro-mechanical
"head box" and a solid state, high speed, digital signal processor. 
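
The role of head position in producing a space stabilized sound, as described
above, can be sketched in one small function.  This is an illustration only,
assuming angles in degrees and considering azimuth alone.  The function name
and angle convention are hypothetical.

```python
def relative_azimuth(source_az_deg, head_yaw_deg):
    """Direction of a world-fixed source relative to the nose, in [-180, 180).

    Both angles are measured in the same world frame, so turning the head
    toward the source drives the result toward zero.
    """
    return (source_az_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


if __name__ == "__main__":
    source = 30.0  # world-fixed source, 30 degrees from the reference direction
    for yaw in (0.0, 30.0, 90.0):
        print(f"head yaw {yaw:5.1f} -> source appears at {relative_azimuth(source, yaw):6.1f}")
```

The relative angle, and not the world angle, is what selects the filtering
applied for each ear, which is why the sound appears to stay put as the head
turns.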

Two successful solid state devices are the Gehring research device and the
Convolvotron.  Both work on the same principle of providing filters to
simulate the filtering effect of the pinnae on sounds emanating from discrete
locations in space.  Intermediate locations are simulated with some sort of
interpolation.  The change in sound quality produced by the pinnae, as a
function of sound source location, varies across individuals. Therefore, a
device which is designed to suit the "average person" will be unacceptable for
a large part of the population.  The issue of individual head related transfer
functions, or "ear prints" is something for which more development and
research are needed. 
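
The principle described above, filters for discrete locations with
interpolation in between, can be sketched as follows.  This is not the
algorithm of either device.  It assumes simple linear interpolation between
the two nearest stored filters, and the filter values in the demonstration
are toy placeholders and not measured data.  All names are hypothetical.

```python
def convolve(signal, kernel):
    """Direct-form convolution of two lists of samples."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def interpolate_filter(filters, azimuth_deg):
    """Linearly blend the two stored filters nearest to azimuth_deg.

    filters maps azimuth (degrees) to an equal-length list of filter taps.
    Angles outside the stored range use the nearest stored filter.
    """
    angles = sorted(filters)
    if azimuth_deg <= angles[0]:
        return list(filters[angles[0]])
    if azimuth_deg >= angles[-1]:
        return list(filters[angles[-1]])
    for lo, hi in zip(angles, angles[1:]):
        if lo <= azimuth_deg <= hi:
            w = (azimuth_deg - lo) / (hi - lo)
            return [(1.0 - w) * a + w * b for a, b in zip(filters[lo], filters[hi])]


if __name__ == "__main__":
    # Toy placeholder taps for one ear; real filters would come from measurements.
    toy_filters = {0.0: [1.0, 0.0], 90.0: [0.0, 1.0]}
    taps = interpolate_filter(toy_filters, 45.0)
    print(taps, convolve([1.0, 2.0], taps))
```

A real system would hold one such filter set per ear, and possibly per
listener, which is where the individual differences noted above come in.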


REFERENCES

Vickers, D.L. (1972)  Sorcerer's apprentice: head-mounted display and wand.  
In A Symposium on Visually Coupled Systems: Development and Application.  
Birt, J.A. and Task, H.L. (eds) AMD TR 73-1.

Reference to the above work, including a photograph of the equipment, is 
available in:

Fisher, S.S.  (1990) Virtual Interface Environments.  In The Art of 
Human-Computer Interface Design, edited by Brenda Laurel, Addison-Wesley 
Publishing Co.

rick@hanauma.stanford.edu (Richard Ottolini) (04/12/91)

In article <1991Apr10.230954.28253@milton.u.washington.edu>
MWELLS@FALCON.AAMRL.WPAFB.AF.MIL (Maxwell Wells) writes:
>Virtual reality may be considered to have been born in the middle 1960s, based

I would put the origin back four thousand years or so, to when architects
designed buildings to give their inhabitants special effects.  Examples
include the artificial caves of the Neolithic in Europe, the massive temples
and pyramids of Egypt, and the perspective paintings on the walls of Pompeii,
to name a few.

With regard to current times, some movies and theme park rides have
transported me to alternative worlds.