[sci.virtual-worlds] Post-Last words on Cray-Connection Machine debate?

cdshaw@cs.UAlberta.CA (Chris Shaw) (03/31/91)

In article x Herb Taylor writes:
>My personal SPEED Hall of Fame goes something like this: Nolan Ryan,
>Seymore Cray, Danny Hillis...

I'd add James Clark (SGI) and Fuchs/Poulton/Eyles/et al (UNC-CH) to that list.
The reason is that these people and their organizations have produced
machines that deliver 100,000+ real-time polygons per second, while Cray
and Hillis have not.

>No one argues that a Cray is NOT a very, very fast machine - but
>that does not mean that it is an effective VW machine.
...for the money.

In fact, I would say that both the Cray series of machines and the CM are
not what you want, and certainly not what the average installation can afford.

>With the exception of polygonally rendered virtual worlds there do not appear
>to be that many places where floating point operation is critical.

This is like saying "with the exception of the need for stopping, cars don't
need brakes". :-) You need good floating point in the rendering pipeline,
and the rasterization architecture must be able to put the transformed
points on the screen. More general-purpose architectures such as the
Cray and the CM are not optimized for graphics, and are therefore not
suitable, given the cost.
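
To be concrete, here's a rough sketch (plain C, nothing vendor-specific,
numbers made up) of what the transform stage does to every single vertex
before rasterization even starts:

    /* Rough sketch of the per-vertex transform stage of a polygon
     * pipeline.  Every operation here is floating point; the rasterizer
     * then has to get the resulting screen coordinates onto pixels. */
    #include <stdio.h>

    typedef struct { float x, y, z, w; } Vec4;

    /* Multiply a homogeneous point by a 4x4 modelview-projection matrix. */
    static Vec4 transform(float m[4][4], Vec4 p)
    {
        Vec4 r;
        r.x = m[0][0]*p.x + m[0][1]*p.y + m[0][2]*p.z + m[0][3]*p.w;
        r.y = m[1][0]*p.x + m[1][1]*p.y + m[1][2]*p.z + m[1][3]*p.w;
        r.z = m[2][0]*p.x + m[2][1]*p.y + m[2][2]*p.z + m[2][3]*p.w;
        r.w = m[3][0]*p.x + m[3][1]*p.y + m[3][2]*p.z + m[3][3]*p.w;
        return r;
    }

    int main(void)
    {
        /* Identity stands in for a real modelview-projection matrix. */
        float mvp[4][4] = {{1,0,0,0},{0,1,0,0},{0,0,1,0},{0,0,0,1}};
        Vec4 v = { 1.0f, 2.0f, -5.0f, 1.0f };
        Vec4 c = transform(mvp, v);
        /* Perspective divide: yet more floating point per vertex. */
        printf("screen-space: %f %f\n", c.x / c.w, c.y / c.w);
        return 0;
    }

That's 16 multiplies, 12 adds and a divide or two per vertex, and at
100,000+ polygons per second the transform stage alone is doing millions
of floating point operations per second.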

>Other approaches forego polygon rendering
>entirely and hence possess little or no floating point.

Other approaches such as what, I wonder?

>Certainly, input and display processing is primarily fixed point digital
>signal processing.

An interesting statement. What's the context that makes this true?

>A second question concerns whether the Cray's essentially "single
>processor" model is right for VW. First, consider a VW display at
>1Kx1Kx30fps (they'll be here sooner then we think...).

We have one in our lab. It's called a Personal Iris (4D/20). And we have a
(4-year-old) Iris 3130 that is 768x1024 at 30fps. The issue is 30 frames of
WHAT per second? In fact, Ivan Sutherland was getting 30 fps on a vector
system with head-mounted display in 1968. Assuming that the quoted bandwidth
of 40 MBytes per second for the CM2 is accurate, 30 frames per second with an
8-bit frame buffer gives 1,333,333 pixels per frame, which is 1024 by 1302.
The Personal Iris in our lab has about the same nominal figure at 1/150
of the cost. Buy two and you get stereo, with about $1.8 million left over.
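
Spelled out, assuming the 40 MByte figure and one byte per pixel:

    /* Back-of-the-envelope check of the frame-buffer arithmetic above,
     * assuming 40 MBytes/s out of the CM2 and an 8-bit frame buffer. */
    #include <stdio.h>

    int main(void)
    {
        double bytes_per_sec   = 40e6;   /* assumed CM2 figure */
        double frames_per_sec  = 30.0;
        double bytes_per_pixel = 1.0;    /* 8-bit frame buffer */

        double pixels = bytes_per_sec / frames_per_sec / bytes_per_pixel;
        printf("pixels per frame:  %.0f\n", pixels);           /* ~1333333 */
        printf("rows at 1024 wide: %.0f\n", pixels / 1024.0);  /* ~1302    */
        return 0;
    }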

>If the data stream included MPEG-like compression then there
>could be >> 1000 operations per pixel.

What do you need MPEG for???

>...high resolution (HDTV)...
>We believe that for virtual worlds to approach PROCESSED HDTV resolutions
>between 1,000 and 10,000 operations per pixel will be required.

I suppose my basic question is "What does HDTV have to do with VR?". My first
guess at an answer is "nothing". Why? Because the core of VR is interactive
computer graphics. Not image processing, not HDTV. True, there may be some
benefit in designing an architecture that allocates one processor per pixel.

However, the challenge is then "how do I get my virtual geometric model into
each pixel processor?". One answer to that question was designed and built by
Henry Fuchs/John Poulton/John Eyles and team over the last 10 years at
UNC Chapel Hill. Pixel-Planes is the name of the system, and they're currently
on version 5. The core of their machine is a logic-enhanced frame buffer, which
has a tiny little processor attached to each pixel. An ingenious tree
arrangement connects pairs of pixels, then pairs of pairs, and so on, such that
each interior tree node has an adder for the left child and none for the
right child. The result is that the tree evaluates a planar equation using
bit-serial multiplication. Planar equations are sent to the roots of all these
trees, and each pixel evaluates them. Thus, they can define polygons by planes
that cross zero at the polygon's edges. (See Fuchs et al, 1982 Conference on
Advanced Research in VLSI; Fuchs et al, SIGGRAPH 85, pp 111-120; Fuchs &
Goldfeather, IEEE CG&A, Jan 1986; plus the SIGGRAPH 89 hardware session.)
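
Here's a software caricature of the idea (NOT the Pixel-Planes hardware
itself, which evaluates bit-serially at every pixel simultaneously; the
triangle below is picked by hand for illustration): each pixel evaluates
the same three edge planes f(x,y) = A*x + B*y + C, and it is inside the
polygon when all three come out non-negative.

    /* Software caricature of the Pixel-Planes idea: every pixel runs
     * the same planar edge tests, and a pixel is inside the triangle
     * when all three edge functions are >= 0. */
    #include <stdio.h>

    typedef struct { float a, b, c; } Plane;   /* f(x,y) = a*x + b*y + c */

    static int inside(const Plane e[3], float x, float y)
    {
        int i;
        for (i = 0; i < 3; i++)
            if (e[i].a * x + e[i].b * y + e[i].c < 0.0f)
                return 0;
        return 1;
    }

    int main(void)
    {
        /* Edge planes for a small triangle with corners (0,0), (8,0), (0,8). */
        Plane tri[3] = { {  1,  0, 0 },     /* x >= 0      */
                         {  0,  1, 0 },     /* y >= 0      */
                         { -1, -1, 8 } };   /* x + y <= 8  */
        int x, y;
        for (y = 9; y >= 0; y--) {          /* every "pixel" runs the same test */
            for (x = 0; x < 10; x++)
                putchar(inside(tri, (float)x, (float)y) ? '#' : '.');
            putchar('\n');
        }
        return 0;
    }

The hardware wins because the tree broadcasts the A, B, C coefficients once
and all 1.3 million pixels answer at the same time, instead of looping over
pixels the way the sketch does.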

Here there is massive parallelism: for a 1024 by 1280 display, there are
1024 * 1280 = 1,310,720 processors. The parallelism is (arguably) in the
right place, in the pixel. Of course, the frame buffer has many bits per pixel
(as much as 128 bits in the latest version, I think), and scan-out circuitry
is included in each frame buffer chip. It's built, it works, and they might
even demo it at SIGGRAPH this year.

>** TCP/IP performance, Cray to Cray, of over 300 Megabits per second!

300,000,000 / (8 * 1024 * 1280) is about 28 frames per second of raw 8-bit
1024x1280 video, and that's with the link doing nothing else. Not good.

>In a VR system each component must perform its
>functions continuously with LITTLE OR NO LATENCY.

Bingo! Everyone gets a prize!!

The real issue, once graphics performance is settled by buying yourself
graphics machines, is how to construct the operating system and the
VR Toolkit so that latency is minimized. One answer is to throw money
at the problem. Another is to wait for CPUs to speed up and get cheaper.
The third technique is to get your hands dirty building a system and
see where the bottlenecks are. Find out how EVERY application can be
speeded up. Make it automatic. Make it easy. Ten years ago few people
knew anything about how to build windowing systems. Now you ftp X from
somewhere. The same can happen with VR.

> <Good examples of the real-time requirements deleted>.

>None of the previous discussion of world processing considered the
>cost of transferring processed world data to a display or I/O facility.

I think the misconception here is that high bandwidth is needed to transmit
frame buffers from a renderer to a display device. This is not what you want
to do. The rendering box must be tightly coupled with the frame buffer,
which is why I'm wondering why Crays and CM2's are being suggested as VR
engines. Certainly, they would make good simulator boxes, but you should leave
the rendering to the machines that are designed to render quickly, otherwise
the latency will kill you. 

In other words... DON'T SEND VIDEO, SEND THE MODEL!   In fact, send the model
once, and after that, send only the changes to the model. It's often tempting
for manufacturers of highly parallel machines to claim that their box is good
for graphics. The proper question to ask is "What type of graphics?". If the
answer is "Real Time", then get out your skeptic's hat and start asking hard
questions about the rasterization and frame buffer architecture. If you get
blank looks, forget it. If they say "Good for producing animations", then
the claim is probably on the mark, but remember that you can produce
animations overnight, since latency is not a problem. This is not the case
for VR systems, since minimizing latency is what the whole game is about.
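
To give a feel for the difference (the message layout below is made up for
illustration, it's not anybody's actual protocol):

    /* Hypothetical sketch of "send the model, then send only changes":
     * the simulation host ships object transforms, not pixels. */
    #include <stdio.h>
    #include <stdint.h>

    typedef struct {
        uint32_t object_id;       /* which object in the already-shipped model */
        float    position[3];     /* new position                              */
        float    orientation[4];  /* new orientation (quaternion)              */
    } ModelDelta;                 /* 32 bytes per changed object               */

    int main(void)
    {
        double deltas_per_frame = 1000.0;   /* assumed number of moving objects */
        double fps = 30.0;
        double delta_rate = sizeof(ModelDelta) * deltas_per_frame * fps;
        double video_rate = 1024.0 * 1280.0 * 1.0 * fps;  /* 8-bit frame buffer */
        printf("model deltas: %.1f KBytes/s\n", delta_rate / 1024.0);
        printf("raw video:    %.1f KBytes/s\n", video_rate / 1024.0);
        return 0;
    }

Even with generous guesses about how much of the model changes per frame,
the delta stream is hundreds of kilobytes per second versus tens of
megabytes per second for raw video.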

>However, a single processor would seem to suffer a significant memory I/O
>bottleneck unless a significant "back door" is in place. Presumedly systems 
>such as Alan describes on Cray provide an equivalent facility.

I suspect you're wrong on this count.

>For example, to project images from a remotely sensed combustion experiment
>will require hundreds of frames per second of aquisition - but in a burst
>of only a few seconds duration. In order to walk through the data set
>we must have considerable flexibility in the play back frame rate.

This is very lovely, and presumably the Princeton Engine does a fine job.
BUT. This isn't Virtual Reality (TM). This is more like hypermedia than
anything else. Where's the interaction? Can you change your point of view?
Can you change the experiment as it runs? No. The situation you describe 
allows N camera views at pre-programmed locations. If you want a new view
that your camera(s) didn't get, you have to run the experiment again. You
probably don't need a head-mounted display for this work, and you would
probably find that you're doing teleoperation if the camera angle is (say)
DataGlove controlled.

There's nothing wrong with this, but it ain't virtual reality, because the
level of interaction is severely limited. The only thing you can change is
the frame rate, and that's not what I call interaction with a model!

>Although high speed disk I/O may be crucial to virtual world data archives,
>it is not clear how it impacts the management of the world itself.

The flight simulator people have been putting most of their effort into
fast database access over the last 10 or so years. Given a simulated plane
flying at Mach 2, 100 feet above satellite-sensed terrain, you get the idea
that there's a lot of terrain data to be accessed. But again, the terrain data
is canned, and the simulation is of plane behaviour. Flight simulators are
essentially sophisticated viewers of texture-mapped polygonal meshes.

>World data originates from real-time sources such as camaras
>or data gloves, must be processed continuously in real-time and is
>displayed on real-time display systems.

As I say above, I think that the camera input issue is a red herring, unless
you're doing user sensing with a set of cameras, in which case you're in
the image processing/image understanding camp. This is what Myron Krueger does
in his Videoplace project, and the technology for that has been around for
at least 15 years. The Princeton Engine could probably be put to excellent use
in this area.

The DataGlove can be suitably driven by a Mac. The volume of data (not counting
the Polhemus Isotrak sensor) is as much as 15 bytes/sample * 60 samples/second,
or 900 bytes per second. With Isotrak data it's another 14 bytes per sample
for a total of 1740 bytes per second. Nominally, a 19.2Kbaud line will do for
this application, and with the DataGlove that's all you get. Higher bandwidth
communication would be nice for latency purposes, but it's not necessary.
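
For the record, the arithmetic (the byte counts are the ones quoted above):

    /* Check of the DataGlove/Isotrak data-rate arithmetic above. */
    #include <stdio.h>

    int main(void)
    {
        int glove_bytes   = 15;   /* bytes per DataGlove sample        */
        int isotrak_bytes = 14;   /* bytes per Polhemus Isotrak sample */
        int samples_per_s = 60;

        int total = (glove_bytes + isotrak_bytes) * samples_per_s;   /* 1740 */
        /* A 19.2 Kbaud serial line moves roughly 1920 bytes/second
           (10 bits per byte with start/stop bits), so this fits. */
        printf("%d bytes/second\n", total);
        return 0;
    }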

>-herb taylor 
-- 
Chris Shaw     University of Alberta
cdshaw@cs.UAlberta.ca           Now with new, minty Internet flavour!
CatchPhrase: Bogus as HELL !