cdshaw@cs.UAlberta.CA (Chris Shaw) (03/31/91)
In article x Herb Taylor writes:

>My personal SPEED Hall of Fame goes something like this: Nolan Ryan,
>Seymore Cray, Danny Hillis...

I'd add James Clark (SGI) and Fuchs/Poulton/Eyles/et al (UNC-CH) to that
list. The reason is that these people and their organizations have produced
machines that deliver 100,000+ real-time polygons per second, while Cray
and Hillis haven't.

>No one argues that a Cray is NOT a very, very fast machine - but
>that does not mean that it is an effective VW machine.

...for the money. In fact, I would say that both the Cray series of
machines and the CM are not what you want, and certainly not what the
average installation can afford.

>With the exception of polygonally rendered virtual worlds there do not appear
>to be that many places where floating point operation is critical.

This is like saying "with the exception of the need for stopping, cars
don't need brakes". :-) You need good floating point in the rendering
pipeline, and the rasterization architecture must be able to put the
transformed points on the screen. More general-purpose architectures such
as the Cray and the CM are not optimized for graphics, and are therefore
not suitable, given the cost.

>Other approaches forego polygon rendering
>entirely and hence possess little or no floating point.

Other approaches such as what, I wonder?

>Certainly, input and display processing is primarily fixed point digital
>signal processing.

An interesting statement. What's the context that makes this true?

>A second question concerns whether the Cray's essentially "single
>processor" model is right for VW. First, consider a VW display at
>1Kx1Kx30fps (they'll be here sooner then we think...).

We have one in our lab. It's called a Personal Iris (4D/20). We also have
a four-year-old Iris 3130 that is 768x1024 at 30fps. The issue is 30
frames of WHAT per second? In fact, Ivan Sutherland was getting 30 fps on
a vector system with a head-mounted display in 1968.
Assuming that the stated bandwidth of 40 MBytes per second for the CM2 is
accurate, 30 frames per second with an 8-bit frame buffer gives 1,333,333
pixels per frame, which is 1024 by 1302. The Personal Iris in our lab has
about the same nominal statistic at 1/150th the cost. Buy two and you get
stereo, with about $1.8 million left over.

>If the data stream included MPEG-like compression then there
>could be >> 1000 operations per pixel.

What do you need MPEG for???

>...high resolution (HDTV)...
>We believe that for virtual worlds to approach PROCESSED HDTV resolutions
>between 1,000 and 10,000 operations per pixel will be required.

I suppose my basic question is "What does HDTV have to do with VR?". My
first guess at an answer is "nothing". Why? Because the core of VR is
interactive computer graphics. Not image processing, not HDTV.

True, there may be some benefit in designing an architecture that
allocates one processor per pixel. However, the challenge is then "how do
I get my virtual geometric model into each pixel processor?". One answer
to that question was designed and built by Henry Fuchs, John Poulton,
John Eyles and team over the last 10 years at UNC Chapel Hill.
Pixel-Planes is the name of the system, and they're currently on
version 5.

The core of their machine is a logic-enhanced frame buffer, which has a
tiny processor attached to each pixel. An ingenious tree arrangement
connects pairs of pixels, then pairs of pairs, and so on, such that each
interior tree node has an adder for the left child and none for the
right child. The result is that the tree evaluates a planar equation
using bit-serial multiplication. Planar equations are sent to the roots
of all these trees, and each pixel evaluates one. Thus, they can define
polygons by planes that cross zero at the polygon's edges.
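The zero-crossing idea is easy to sketch. Here is a toy serial version
(my own illustration, not the bit-serial adder-tree hardware; the plane
coefficients and the triangle are made up for the example):

```python
# Toy sketch of the Pixel-Planes idea: every pixel evaluates the same
# planar equation F(x, y) = A*x + B*y + C, and a polygon edge is the
# locus where some plane crosses zero. A convex polygon is the set of
# pixels where all of its edge planes are non-negative. (The real
# hardware evaluates F at all pixels simultaneously via the adder tree;
# this serial loop only shows the arithmetic.)

def inside(x, y, edge_planes):
    """True if pixel (x, y) is on the non-negative side of every edge plane."""
    return all(a * x + b * y + c >= 0 for (a, b, c) in edge_planes)

# Edge planes for a right triangle with corners (0,0), (4,0), (0,4):
#   x >= 0,  y >= 0,  and  x + y <= 4  (i.e. -x - y + 4 >= 0)
triangle = [(1, 0, 0), (0, 1, 0), (-1, -1, 4)]

covered = [(x, y) for y in range(5) for x in range(5)
           if inside(x, y, triangle)]
```

The point of the architecture is that the per-pixel work is identical
everywhere, so broadcasting one (A, B, C) triple rasterizes an edge
across the whole screen at once.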
(See Fuchs et al., 1982 Conference on Advanced Research in VLSI; Fuchs
et al., SIGGRAPH 85, pp. 111-120; Fuchs & Goldfeather, IEEE CG&A, Jan
1986; plus the SIGGRAPH 89 hardware session.)

Here there is massive parallelism: for a 1024 by 1280 display, there are
1024 * 1280 = 1,310,720 processors. The parallelism is (arguably) in the
right place, at the pixel. Of course, the frame buffer has many bits per
pixel (as much as 128 bits in the latest version, I think), and scan-out
circuitry is included in each frame buffer chip. It's built, it works,
and they might even demo it at SIGGRAPH this year.

>** TCP/IP performance, Cray to Cray, of over 300 Megabits per second!

300,000,000 / (8 * 1024 * 1280) = 28 frames per second. Not good.

>In a VR system each component must perform its
>functions continuously with LITTLE OR NO LATENCY.

Bingo! Everyone gets a prize!! The real issue, once graphics performance
is settled by buying yourself graphics machines, is how to construct the
operating system and the VR toolkit such that latency is minimized. One
answer is to throw money at the problem. Another is to wait for CPUs to
speed up and get cheaper. The third technique is to get your hands dirty
building a system and see where the bottlenecks are. Find out how EVERY
application can be sped up. Make it automatic. Make it easy. Ten years
ago few people knew anything about how to build windowing systems. Now
you ftp X from somewhere. The same can happen with VR.

> <Good examples of the real-time requirements deleted>.
>None of the previous discussion of world processing considered the
>cost of transferring processed world data to a display or I/O facility.

I think the misconception here is that high bandwidth is needed to
transmit frame buffers from a renderer to a display device. This is not
what you want to do. The rendering box must be tightly coupled with the
frame buffer, which is why I'm wondering why Crays and CM2's are being
suggested as VR engines.
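The bandwidth arithmetic above can be checked in a few lines (figures as
quoted in this thread: 40 MBytes/s for the CM2, 300 Mbits/s TCP/IP Cray
to Cray, an 8-bit 1024x1280 frame buffer):

```python
# Check of the frame-buffer bandwidth arithmetic quoted above.
CM2_BYTES_PER_SEC = 40_000_000
FPS = 30
pixels_per_frame = CM2_BYTES_PER_SEC // FPS    # 8-bit pixels, 1 byte each
rows_at_1024 = pixels_per_frame // 1024        # screen rows of 1024 pixels

CRAY_BITS_PER_SEC = 300_000_000
frame_bits = 8 * 1024 * 1280                   # one 8-bit 1024x1280 frame
cray_fps = CRAY_BITS_PER_SEC // frame_bits     # frames that fit per second
```

The moral is the same either way: shipping raw frames over even a very
fast link barely reaches 30 fps at one byte per pixel, with nothing left
over for depth, color, or stereo.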
Certainly, they would make good simulator boxes, but you should leave
the rendering to the machines that are designed to render quickly;
otherwise the latency will kill you. In other words: DON'T SEND VIDEO,
SEND THE MODEL! In fact, send the model once, and after that, send only
the changes to the model.

It's often tempting for manufacturers of highly parallel machines to
claim that their box is good for graphics. The proper question to ask is
"What type of graphics?". If the answer is "real time", then get out
your skeptic's hat and start asking hard questions about the
rasterization and frame buffer architecture. If you get blank looks,
forget it. If they say "good for producing animations", then the claim
is probably on the mark, but remember that you can produce animations
overnight, since latency is not a problem. This is not the case for VR
systems, since minimizing latency is what the whole game's about.

>However, a single processor would seem to suffer a significant memory I/O
>bottleneck unless a significant "back door" is in place. Presumedly systems
>such as Alan describes on Cray provide an equivalent facility.

I suspect you're wrong on this count.

>For example, to project images from a remotely sensed combustion experiment
>will require hundreds of frames per second of aquisition - but in a burst
>of only a few seconds duration. In order to walk through the data set
>we must have considerable flexibility in the play back frame rate.

This is very lovely, and presumably the Princeton Engine does a fine
job. BUT. This isn't Virtual Reality (TM). This is more like hypermedia
than anything else. Where's the interaction? Can you change your point
of view? Can you change the experiment as it runs? No. The situation you
describe allows N camera views at pre-programmed locations. If you want
a view that your camera(s) didn't get, you have to run the experiment
again.
You probably don't need a head-mounted display for this work, and you
would probably find that you're doing teleoperation if the camera angle
is (say) DataGlove-controlled. There's nothing wrong with this, but it
ain't virtual reality, because the level of interaction is severely
limited. The only thing you can change is the frame rate, and that's not
what I call interaction with a model!

>Although high speed disk I/O may be crucial to virtual world data archives,
>it is not clear how it impacts the management of the world itself.

The flight simulator people have been putting most of their effort into
fast database access over the last 10 or so years. Given a simulated
plane flying at Mach 2, 100 feet above satellite-sensed terrain, you get
the idea that there's a lot of terrain data to be accessed. But again,
the terrain data is canned, and the simulation is of plane behaviour.
Flight simulators are essentially sophisticated viewers of
texture-mapped polygonal meshes.

>World data originates from real-time sources such as camaras
>or data gloves, must be processed continuously in real-time and is
>displayed on real-time display systems.

As I say above, I think the camera input issue is a red herring, unless
you're doing user sensing with a set of cameras, in which case you're in
the image processing/image understanding camp. This is what Myron
Krueger does in his Videoplace project, and the technology for that has
been around for at least 15 years. The Princeton box could probably be
put to excellent use in this area.

The DataGlove can be suitably driven by a Mac. The volume of data (not
counting the Polhemus Isotrak sensor) is at most 15 bytes/sample * 60
samples/second, or 900 bytes per second. With Isotrak data it's another
14 bytes per sample, for a total of 1740 bytes per second. Nominally, a
19.2 Kbaud line will do for this application, and with the DataGlove
that's all you get.
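The glove arithmetic works out as follows (sample sizes as stated above;
the serial-line figure assumes the usual 10 bits per byte once start and
stop bits are counted):

```python
# Quick check of the DataGlove/Isotrak data-rate arithmetic above.
SAMPLES_PER_SEC = 60
GLOVE_BYTES_PER_SAMPLE = 15      # as stated, excluding the Isotrak
ISOTRAK_BYTES_PER_SAMPLE = 14

glove_rate = GLOVE_BYTES_PER_SAMPLE * SAMPLES_PER_SEC              # bytes/s
total_rate = (GLOVE_BYTES_PER_SAMPLE
              + ISOTRAK_BYTES_PER_SAMPLE) * SAMPLES_PER_SEC        # bytes/s

# A 19.2 Kbaud serial line moves roughly 1920 bytes/s
# (10 bits per byte including framing), so it has headroom:
line_rate = 19200 // 10
```

So input-device bandwidth is three orders of magnitude below the
frame-buffer numbers argued about earlier; the serial line is not where
the VR bottleneck lives.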
Higher bandwidth communication would be nice for latency purposes, but
it's not necessary.

>-herb taylor

--
Chris Shaw                      University of Alberta
cdshaw@cs.UAlberta.ca           Now with new, minty Internet flavour!
CatchPhrase: Bogus as HELL !