[comp.graphics] Algebra For Stereo-Scopic Image Generation

haitex@crash.cts.com (Wade Bickel) (01/28/88)

        Thanks Leo for pointing out this discussion in amiga/comp-amiga.
     I did not have direct access to this discussion and didn't know it
     even existed.
     
     	First, I'd like to mention that Haitex will be releasing 3-D 
     glasses for the Amiga within about a month.  I like our glasses
     better than the Sega glasses because they have bigger lenses, and the
     lenses are on a visor rather than "glasses" (kinda like a welding visor)
     which is more comfortable, especially if you already wear glasses.
     This is one of the things I've been working on for the past few months.
     
        The following is a letter I sent to the makers of a ray-tracing program
     which will directly support the glasses.  I hope this helps those of
     you working on 3-D projects.
     
        Note for Leo:   This entire methodology was developed by looking at
     the point of a pencil held between myself and a window, and looking at
     a branch outside the window.  This methodology is totally intuitive, and
     avoids the need to perform any rotations to achieve the parallax effect.
     Furthermore, I have been able to bring objects a significant distance 
     out of the screen with no terrible ugliness.  I'd like to talk to you
     about all this in more detail so why don't you send me your ph# or
     call me [(619) 421-5602] and I'll call you right back so I pay for the
     call.
        


FROM:      Wade W. Bickel, Haitex Resources.		DATE: 12-10-88
TO:	   XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
SUBJECT:   Parallax Rendering Algebra.
   
       In this letter I will try to go through the mathematics involved in
    creating a parallax stereo pair.  I am not a mathematician so please
    excuse me if my terminology is nonstandard.
      
         For computational simplicity I use the following rather convoluted
      system for generating a stereo display.  The idea is that the
      imaginary 3-D grid (the axes) stays fixed, while the points to be
      translated and the region they are translated onto (which
      represents the screen) can both be freely re-positioned.  If you
      picture the display as a window, we are electing to move the
      viewer's head, the window, and the objects outside that window,
      rather than re-align the axes.
        The logic can be recast as a system of moving axes if you wish,
      but this system eliminates the need to compute the lengths of some
      projections using the distance formulae, and is therefore the one
      that I use.
      
	   
	     1)	    Points to be translated are represented in Cartesian
	     	 (x-y-z) coordinates.
	   
	     2)     The viewpoint is assumed to be centered at the origin
	     	 and facing in the direction of positive z.  To visualize
		 this, you the viewer, are at the origin, the direction
		 you're facing is positive z, and negative values of x lie
		 to your left.
	      
	     3)	    A plane lies perpendicular to the z-axis (ie: parallel
	         to the x-y plane) at some distance in the positive z 
		 direction.  Within this plane is a rectangular region, not
		 quite centered about the z-axis, and aligned with the x and
		 y axes, which represents the screen.  For our purposes we
		 will call this region either the "left-screen" or the 
		 "right-screen", or just "screen" for the general case.  
		    The screen is offset with respect to x depending on which
		 view (left or right) is being rendered.  For the left eye
		 the screen is offset to the right.  Thus if one were to 
		 display the point [0,0,zP] (ie: where the z-axis intersects
		 the region) for the left-screen the point would appear to
		 the left of the center of the actual display bitmap.  For
		 the right eye the region is offset to the left.  These
		 offsets will be referred to as the "lefteye-offset" and 
		 "righteye-offset" and are calculated to be one half the 
		 distance between the viewer's eyes, in whatever units of
		 measure have meaning to the programmer/user.

		    
	     4)	    Points to be translated are assumed to have a positive 
	     	 z coordinate; in other words, you should clip with respect
		 to z before trying to translate a point.  Clipping with
		 respect to x-y should probably be done after translation.
		    When translating a point for the left eye, that point
		 should be shifted right by twice the amount the screen 
		 region was shifted to the right (ie: the total distance
		 between the eyes).  Let's call this the "EyeSpacing".  For
		 the right eye the point is shifted left by the same
		 amount.  The net effect of shifting both the screen and
		 the point is to create the proper perspective for each
		 eye.
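
	   As a rough sketch of the four conditions above (Python is used
	   here purely for illustration; EyeView, make_views and clip_z are
	   made-up names, not part of any Haitex code), the per-eye constants
	   and the z clip might be set up like this:

```python
from dataclasses import dataclass


@dataclass
class EyeView:
    """Per-eye constants for one view (illustrative names only)."""
    screen_offset: float  # x shift of the screen region: half the eye spacing
    point_shift: float    # x shift applied to each point: the full eye spacing


def make_views(eye_spacing):
    """Return (left, right) views; positive x lies to the viewer's right."""
    half = eye_spacing / 2.0
    # Left eye: screen and points both shift right (+x).
    left = EyeView(screen_offset=+half, point_shift=+eye_spacing)
    # Right eye: the mirror image, everything shifts left (-x).
    right = EyeView(screen_offset=-half, point_shift=-eye_spacing)
    return left, right


def clip_z(points):
    """Condition 4: keep only points with a positive z coordinate."""
    return [p for p in points if p[2] > 0]
```

	   The units of eye_spacing are whatever units the scene itself
	   uses.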

      
         Given these conditions and relationships, to translate a point onto 
      the screen the following algebra is used:
      
	
	    The point to be translated (always positive z) and the view point
	 (always [0,0,0]) define a line.  These two points also form a right
	 triangle, call it triangle "A", with the z-axis.  The line also 
	 passes through the plane in which the screen (as previously defined)
	 lies, and this intersection is the one we wish to find.  Note that a
	 triangle is formed between this unknown point, the viewpoint, and
	 the z-axis, call it triangle "B".
	 
	    Triangles A and B are similar right triangles.  Since we know 
	 the z components of both triangles, and everything about triangle
	 A, we can calculate the desired intersection as follows:
	 
	      let
		
		   VP = the viewpoint, [0,0,0];
		   P  = the point to be translated (+z);
		   S  = the plane on which the screen lies (perp. to z);
	           Q  = point on the line defined by VP and P, and on
		          plane S.
		
	      the following relationships are true   
		
		   Qx/Qz = Px/Pz;  -->  Qx = (Px * Qz)/Pz;    

		   Qy/Qz = Py/Pz;  -->  Qy = (Py * Qz)/Pz;
		   
		   			(note avoidance of distance formulae
					    calculations because VP = [0,0,0])

	      Where Qx and Qy are the respective X and Y coordinates to
	   possibly be displayed.  This 2-D point should be checked to
	   see if it is within the "screen" (as previously described) and
	   if so it should be displayed.
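
	      The similar-triangle relationships above come down to one
	   multiply and one divide per coordinate.  A minimal sketch in
	   Python (project and in_screen are hypothetical names, and the
	   screen is assumed to be described by its x offset plus a half
	   width and half height, which is only one possible choice):

```python
def project(p, screen_z):
    """Project point p = (Px, Py, Pz) onto the plane z = screen_z.

    The viewpoint is the origin, so Qx/Qz = Px/Pz and Qy/Qz = Py/Pz
    with Qz = screen_z.  No distance formula is needed.
    """
    px, py, pz = p
    if pz <= 0:
        raise ValueError("clip against z before projecting")
    qx = (px * screen_z) / pz
    qy = (py * screen_z) / pz
    return qx, qy


def in_screen(q, x_offset, half_width, half_height):
    """True if q lies inside a screen region centered at (x_offset, 0)."""
    qx, qy = q
    return abs(qx - x_offset) <= half_width and abs(qy) <= half_height
```

	      For example, project((2.0, 4.0, 10.0), 5.0) gives (1.0, 2.0):
	   the point is twice as far away as the screen, so its x and y are
	   halved.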
	   
	      
	      Implementation of this is not nearly as difficult as all this
	   implies.  Simply do the following.
	   
	   		1) for each view add or subtract the constant
			     "EyeSpacing" to the x-component of points
			     to be translated.  Hopefully this can be done
			     at a very low level of the point's position
			     calculation.  (Add for left eye view, subtract
			     for right eye view).

			2) translate the point as described earlier.  Qx and
			     Qy represent the actual translation to the
			     plane on which the screen lies.  At this point
			     determine if this point lies in the appropriate
			     "screen".  [Remember that the screen is a region
			     which is shifted left or right by half the
			     EyeSpacing (or: left/right-offset)]. If so then
			     the point should be rendered.
			     
			3) When rendering, the points must be recentered to
			     account for the fact that [0,0] on the display
			     is the upper left corner, and we want it to
			     be the pseudo screen center.  At this point,
			     adjust the x compensation by one half the
			     EyeSpacing (move the center left for the
			     left eye, right for the right eye).  This should
			     be an integer operation.
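
	      Put together, the three steps might look like the sketch
	   below.  This is Python for illustration only, not the Modula-2
	   I actually use.  The function name, the px_per_unit scale factor,
	   the half_w/half_h screen size and the flipped y axis (rows
	   growing downward from the upper left corner) are all assumptions
	   made for the example.

```python
LEFT, RIGHT = +1, -1  # sign of the x shifts: the left eye shifts right (+x)


def stereo_pixel(p, eye, eye_spacing, screen_z, half_w, half_h,
                 px_per_unit, width_px, height_px):
    """Return (col, row) for point p in one eye's view, or None if clipped."""
    px, py, pz = p
    if pz <= 0:
        return None                      # clip against z first

    # Step 1: shift the point by the full eye spacing.
    px += eye * eye_spacing

    # Step 2: project onto the screen plane, then test against the
    # screen region, which is shifted by half the eye spacing.
    qx = (px * screen_z) / pz
    qy = (py * screen_z) / pz
    offset = eye * eye_spacing / 2.0
    if abs(qx - offset) > half_w or abs(qy) > half_h:
        return None

    # Step 3: recenter.  The per-eye center column is an integer that
    # can be computed once per view rather than once per point.
    center_col = width_px // 2 - round(offset * px_per_unit)
    center_row = height_px // 2
    col = center_col + round(qx * px_per_unit)
    row = center_row - round(qy * px_per_unit)   # assumed: rows grow downward
    return col, row
```

	      With an EyeSpacing of 2, the screen plane at z = 5, 10 pixels
	   per unit and a 320x200 bitmap, the point [0,0,5] lands on column
	   170 for the left eye and 150 for the right eye, while [0,0,10]
	   lands on column 160 in both views.  In this sketch, then, points
	   at twice the screen distance show no left/right separation.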

	      
	
	      I hope I've not gotten anything backwards.  Sorry I've made it
	   seem overly complicated, but I tend to think of this in pictures,
	   not language.
	   
	   
	     --------------------------------------------------------
	     
	      Now on to my question:  Is it not worthwhile to think of objects
	   in polar coordinates?   It seems to me that much of the math
	   required to do the real-time calculations necessary to provide 
	   rotation of both the objects and the viewpoint(s) would be 
	   alleviated by defining objects this way as opposed to Cartesian
	   coordinates.  I am about to give up on this approach due to some
	   flaw in my calculations, which took me about a week to work out,
	   and translations in Cartesian coords are well documented.  Seems
	   a shame though as it should reduce the needed calculations to less
	   than half.
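
	      To make the question concrete, here is a minimal sketch
	   (Python, illustration only) of a rotation about the vertical
	   (y) axis done both ways.  The function names and the choice of
	   measuring the angle from +z toward +x are assumptions for the
	   example; this is not the calculation I was stuck on.

```python
import math


def to_cylindrical(p):
    """(x, y, z) -> (r, theta, y), theta measured from +z toward +x."""
    x, y, z = p
    return math.hypot(x, z), math.atan2(x, z), y


def rotate_cylindrical(c, phi):
    """Rotating about the y axis is a single addition in this form."""
    r, theta, y = c
    return r, theta + phi, y


def to_cartesian(c):
    """(r, theta, y) -> (x, y, z); costs one sine and one cosine."""
    r, theta, y = c
    return r * math.sin(theta), y, r * math.cos(theta)


def rotate_cartesian(p, phi):
    """The same rotation done directly on x and z (four multiplies)."""
    x, y, z = p
    c, s = math.cos(phi), math.sin(phi)
    return x * c + z * s, y, -x * s + z * c
```

	      The rotation itself is cheaper in the polar form, but the
	   point still has to be converted back to x-y-z before it can be
	   projected as described earlier, so whether it comes out ahead
	   overall depends on how cheaply the sine and cosine can be had
	   (a lookup table, for instance).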
	   
	      If anyone would care to discuss the topic I might give it one
	   more shot.   Should I post here or keep this to E-mail?
	   
	   
	      All (civil) comments are welcome.
	   
	   						Thanks,
							
							     Wade W. Bickel
							     Haitex Resources.
 	 
										
								

        PS:   In order to get to this news-group I have to utilize the
            Unix side of the local node.  Not knowing my way around in
            here yet, would someone please mail me the Stereo-vision
            topics found here for a while (and perhaps post letters for
            me as well, which I would send via mail)?

        PPS:  Sandra (?? I think that was the name of the original poster),
            If you'd like I could mail you some Modula-2 examples to perform
            the translations.  Of course, if you want controlled rotation,
            that's what I'm working on now so you'd have to wait.