[net.graphics] ray casting

fredc@bmcg.UUCP (Fred Cordes) (09/09/85)

  I first saw ray casting/ray tracing at SIGGRAPH 83. I listened to James
  Kajiya's paper presentation but was unable to appreciate much more than 
  his joke about the tera-flop club. I've got the proceedings from that
  SIGGRAPH and '85 with papers about ray tracing that I'm really having
  trouble understanding. Can anyone out there offer a few pointers or a
  simpler explanation? I've been working with 2D graphics for a couple of
  years now and I understand the basics of transformations and object 
  descriptions based on vectors.

  The problem is I can't get a handle on how objects are described when they're
  ray traced. I understand the idea of casting rays from the eye/viewpoint to
  individual pixels, but I simply can't extract how objects are described from
  the references to Jacobians and Newton's method in the papers. Anyone care 
  to post a generalized explanation to the net?

  Thanks much, Fred Cordes

derek@uwvax.UUCP (Derek Zahn) (09/16/85)

>   ... Anyone care to post a generalized explanation to the net?
> 
>   Thanks much, Fred Cordes

I have been experimenting with ray tracing a bit recently, so I will
post something (although my experience is doubtless less than that of
others who read this, so any gurus out there, please post expansions
and clarifications).

Ray-tracing is a rendering technique.  It is a method of transforming
objects (stored in any of a multitude of ways) into an image -- typically
a RESOLUTION_X x RESOLUTION_Y array of pixel color values.

Consider the screen as a window looking out on the object-space world.
It has a size, and each of its corners has a location in space.  Now,
consider your eye.  It also has a location in space.  Now, divide the
window into a grid, where each point on the grid represents one pixel
on the screen.

Basically, the idea is to treat each of these pixels as a separate
entity.  I will describe a procedure for one pixel, but it is clear that
it generalizes easily to all pixels in the image space.

Construct a ray that begins at the eye and travels through the point
in space (grid location) representing the given pixel.  A representation
of this ray as an origin point and a direction vector is easy to compute.
Now (this is the inefficient but simple method), for each object that
has been entered into an object database before the actual ray-tracing
started, perform an intersection calculation.  This tells us which object
was hit by the ray.  If none of the intersection calculations produced
an intersection, then the ray hits nothing.  If more than one gave an
intersection, choose the one that intersected closest to the eye.
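
In code, that inner loop might look something like the sketch below (C,
with a made-up Object record that knows how to intersect itself with a
ray; all of the names here are just for illustration):

    #include <stddef.h>

    typedef struct { double x, y, z; } Vec;
    typedef struct { Vec origin; Vec dir; } Ray;

    /* Hypothetical object record: each object knows how to intersect
       itself with a ray, returning the parametric distance t > 0 of the
       nearest hit, or a negative number if the ray misses. */
    typedef struct Object {
        double (*intersect)(const struct Object *self, const Ray *ray);
        /* color, reflectance, etc. would also live here */
    } Object;

    /* Return the closest object hit by the ray, or NULL if it hits nothing. */
    Object *closest_hit(Object *objects[], int n_objects, const Ray *ray,
                        double *t_hit)
    {
        Object *best = NULL;
        double best_t = 1.0e30;           /* "infinitely" far away */
        int i;

        for (i = 0; i < n_objects; i++) {
            double t = objects[i]->intersect(objects[i], ray);
            if (t > 0.0 && t < best_t) {  /* a hit, and nearer than any so far */
                best_t = t;
                best = objects[i];
            }
        }
        if (best != NULL)
            *t_hit = best_t;
        return best;
    }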

Example:  Suppose that we have an object database that consists of one
sphere and one triangle.  For each of these, some data must be stored.
For the sphere, we need to store the coordinates of the center and the
value of the radius.  For the triangle, the vertices must be kept.
In addition, other values used in coloring and texturing the object
can be kept, such as the color (or the name of a color function), the
amount of light that is reflected diffusely and specularly, etc.
The intersection with the sphere is found by simple algebraic substitution
of the ray equation into the equation of the sphere and solving for the
ray parameter.  The intersection calculation for the triangle is a bit
more complex (though not necessarily slower) -- one way is to find the
intersection with the plane that contains the triangle (another simple
algebra exercise), and then to determine whether or not this point lies
inside the triangle itself.
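
Since the triangle case came up, here is a rough C sketch of the two
steps just described (hit the plane first, then test whether the point
lies inside all three edges).  This is just one arrangement of the
algebra, not taken from any particular paper:

    #include <math.h>

    typedef struct { double x, y, z; } Vec;

    static Vec vsub(Vec a, Vec b) { Vec r = {a.x-b.x, a.y-b.y, a.z-b.z}; return r; }
    static double vdot(Vec a, Vec b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
    static Vec vcross(Vec a, Vec b) {
        Vec r = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
        return r;
    }

    /* Intersect a ray (origin o, direction d) with the triangle (v0,v1,v2).
       Returns the distance t along the ray, or -1.0 for a miss. */
    double ray_triangle(Vec o, Vec d, Vec v0, Vec v1, Vec v2)
    {
        Vec n = vcross(vsub(v1, v0), vsub(v2, v0));   /* plane normal */
        double denom = vdot(n, d);
        if (fabs(denom) < 1e-12)
            return -1.0;                  /* ray is parallel to the plane */

        double t = vdot(n, vsub(v0, o)) / denom;
        if (t <= 0.0)
            return -1.0;                  /* plane is behind the eye */

        /* p is the hit point on the plane */
        Vec p = { o.x + t*d.x, o.y + t*d.y, o.z + t*d.z };

        /* Inside test: p must lie on the interior side of each edge,
           i.e. each edge cross product points the same way as the normal. */
        if (vdot(n, vcross(vsub(v1, v0), vsub(p, v0))) < 0.0) return -1.0;
        if (vdot(n, vcross(vsub(v2, v1), vsub(p, v1))) < 0.0) return -1.0;
        if (vdot(n, vcross(vsub(v0, v2), vsub(p, v2))) < 0.0) return -1.0;

        return t;
    }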

Once we have the point at which the ray intersected some object, we are
ready to determine what color to make this pixel.  This is based on
many factors.  First, we need to know the surface normal vector for the
intersected object at this point.  We also need to know the color of the
object here.  Now, for each of the light sources in the light source
database, we want to trace another ray from the light source to the
original intersection point.  If some other object gets in the way before
the ray reaches our point, then the contribution of this light source
to the color of the pixel is nil (the point is in shadow).

Otherwise, compute the contribution of this light source (I am going to
hand-wave that for now; it is a bit complex, and this is already longer
than I had thought it would be).  Once the contributions of all light
sources have been summed, the pixel color values are determined, and we
can move on to the next pixel.
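
In code, the summation might look roughly like this (a sketch only: the
Light record and the in_shadow() routine are made-up names, and a real
shader adds specular terms, distance attenuation, and so on):

    #include <math.h>

    typedef struct { double x, y, z; } Vec;
    typedef struct { Vec position; double intensity; } Light;

    static double vdot(Vec a, Vec b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    /* Hypothetical: trace a ray from 'point' toward the light and report
       whether anything in the object database blocks it.  Its body would
       be the same nearest-hit loop used for the eye rays. */
    extern int in_shadow(Vec point, Vec light_pos);

    /* Sum the diffuse (Lambert) contribution of every light at a surface
       point with unit normal n.  Returns a scalar brightness to be
       multiplied by the object's own color. */
    double shade_diffuse(Vec point, Vec n, const Light lights[], int n_lights)
    {
        double sum = 0.0;
        int i;

        for (i = 0; i < n_lights; i++) {
            Vec to_light;
            double len, cos_angle;

            if (in_shadow(point, lights[i].position))
                continue;                 /* this light contributes nothing */

            to_light.x = lights[i].position.x - point.x;
            to_light.y = lights[i].position.y - point.y;
            to_light.z = lights[i].position.z - point.z;
            len = sqrt(vdot(to_light, to_light));
            to_light.x /= len;  to_light.y /= len;  to_light.z /= len;

            cos_angle = vdot(n, to_light);
            if (cos_angle > 0.0)          /* surface faces the light */
                sum += lights[i].intensity * cos_angle;
        }
        return sum;
    }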

That's it.  Now, bear in mind that this is a VERY simple ray-tracing
algorithm.  Almost everything I just said could be changed to produce
more realistic images or to generate them faster, but these improvements
are embellishments of the same basic framework.
For example, the above does not deal at all with reflection or refraction
(although these are easy; just recursive calls to the ray-tracing routine
with a new ray that starts at the intersection point and whose direction
depends on the normal vector, index of refraction, etc), or fuzzy shadows,
or lots of other good_stuff().
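
To show how small the reflection case really is, here is a sketch of the
mirror bounce (trace() stands for whatever routine casts a ray and returns
a color -- the same one used for the eye rays -- and the names are again
made up):

    typedef struct { double x, y, z; } Vec;
    typedef struct { Vec origin; Vec dir; } Ray;
    typedef struct { double r, g, b; } Color;

    static double vdot(Vec a, Vec b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    /* Hypothetical top-level routine: cast a ray into the scene and
       return the color seen along it. */
    extern Color trace(Ray ray, int depth);

    /* Reflect an incoming direction d about the unit surface normal n:
       r = d - 2 (d . n) n */
    static Vec reflect(Vec d, Vec n)
    {
        double k = 2.0 * vdot(d, n);
        Vec r = { d.x - k*n.x, d.y - k*n.y, d.z - k*n.z };
        return r;
    }

    /* The mirror contribution at a hit point is just another, recursive,
       trace -- started a hair off the surface so it does not immediately
       re-hit the object it is leaving. */
    Color mirror_term(Vec hit_point, Vec incoming_dir, Vec normal, int depth)
    {
        Ray r;
        Vec dir = reflect(incoming_dir, normal);

        r.origin.x = hit_point.x + 1e-6 * dir.x;
        r.origin.y = hit_point.y + 1e-6 * dir.y;
        r.origin.z = hit_point.z + 1e-6 * dir.z;
        r.dir = dir;
        return trace(r, depth + 1);
    }

The refraction ray is built the same way; only the direction formula
changes (it uses the index of refraction via Snell's law), and the
recursion is usually cut off at some maximum depth.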

The things that you read dealing with Jacobians and the like are all
refinements of the ray-tracing procedure.  Such refinements can generally
be divided into several areas:

1) Object modeling -- this area consists of various different surface
types, and methods of ray-tracing them.  For example, fractally-generated
terrain is interesting.  How can one efficiently intersect a ray with
a fractal mountain consisting of 20,000 triangles?  Also, modeling of
free-form surfaces is interesting.  Points in space may be connected
with parametric cubic spline functions; intersecting these patches is
not always easy; this is one place where you see Newton's iteration.
Also, sophisticated ways of combining these methods are needed for
realistic modeling of objects like trees.
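
To give a flavor of where Newton's iteration shows up: if a surface can
be written implicitly as F(x,y,z) = 0, then along the ray you are looking
for a root of g(t) = F(origin + t*direction), and Newton's method refines
a guess for t.  This is only a sketch of the idea (real patch intersectors
iterate on the surface parameters too, and getting good starting guesses
is most of the work):

    #include <math.h>

    typedef struct { double x, y, z; } Vec;

    /* Hypothetical implicit surface and its gradient; e.g. for a unit
       sphere, F = x*x + y*y + z*z - 1 and gradF = (2x, 2y, 2z). */
    extern double F(Vec p);
    extern Vec    gradF(Vec p);

    static double vdot(Vec a, Vec b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

    /* Newton's iteration for the intersection parameter t:
       g(t)  = F(origin + t*dir)
       g'(t) = gradF(origin + t*dir) . dir
       Returns the refined t, or -1.0 if it fails to converge. */
    double newton_intersect(Vec origin, Vec dir, double t_guess)
    {
        double t = t_guess;
        int i;

        for (i = 0; i < 20; i++) {
            Vec p = { origin.x + t*dir.x, origin.y + t*dir.y, origin.z + t*dir.z };
            double g  = F(p);
            double gp = vdot(gradF(p), dir);

            if (fabs(g) < 1e-9)
                return t;                 /* converged */
            if (fabs(gp) < 1e-12)
                break;                    /* derivative vanished -- give up */
            t -= g / gp;                  /* the Newton step */
        }
        return -1.0;
    }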

2) Surface modeling -- once an intersection point is determined, how
do we color it to provide a realistic image?  Many methods exist for
mapping 2-dimensional color/texture maps onto 3-D objects, and others
exist for using 3-D functions to determine color/texture directly.
Recent work indicates that controlled stochastic methods of surface
structure/color/detail generation may provide significant increases
in visual realism and complexity at moderate cost (this is a particular
interest of mine).  All told, the variations of surface modeling techniques
are as numerous as the variations of real surfaces.
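
As a tiny example of the "3-D function" idea, a solid checkerboard needs
no 2-D mapping at all -- the hit point itself picks the color.  A sketch
(the colors and the unit-cube scale are arbitrary choices of mine):

    #include <math.h>

    typedef struct { double x, y, z; } Vec;
    typedef struct { double r, g, b; } Color;

    /* Solid (3-D) checker texture: the color depends only on which unit
       cube of space the surface point falls in, so the object never needs
       2-D texture coordinates assigned to it. */
    Color checker_texture(Vec p)
    {
        Color red   = { 0.9, 0.1, 0.1 };
        Color white = { 0.9, 0.9, 0.9 };
        int n = (int)floor(p.x) + (int)floor(p.y) + (int)floor(p.z);
        return (n % 2 == 0) ? red : white;
    }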

3) Photographic quality -- computer generated images typically suffer
from resolution problems.  Only part of this is due to the limited
screen resolution.  Antialiasing attempts to provide smoother images,
and methods for doing this are interesting areas of research.  On a
related note, typically real pictures have significant amounts of "dirt"
or "noise" in them (you may think of it as disharmonious surface
detail) which computer generated images usually lack.  I am
not aware of any work being done to model this phenomenon; is any being
done?
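
On the antialiasing point, the most straightforward (and most expensive)
remedy is supersampling: fire several rays per pixel at slightly different
spots inside the pixel and average the results.  A sketch, assuming a
made-up trace_primary(x, y) that traces one eye ray through fractional
pixel coordinates:

    typedef struct { double r, g, b; } Color;

    /* Hypothetical: trace one eye ray through screen location (x, y),
       where x and y are measured in (fractional) pixels. */
    extern Color trace_primary(double x, double y);

    /* Brute-force antialiasing: average an n x n grid of samples spread
       across the pixel instead of a single ray through its center. */
    Color supersample(int px, int py, int n)
    {
        Color sum = { 0.0, 0.0, 0.0 };
        int i, j;

        for (i = 0; i < n; i++) {
            for (j = 0; j < n; j++) {
                double x = px + (i + 0.5) / n;    /* sample positions     */
                double y = py + (j + 0.5) / n;    /* inside the pixel     */
                Color c = trace_primary(x, y);
                sum.r += c.r;  sum.g += c.g;  sum.b += c.b;
            }
        }
        sum.r /= n * n;  sum.g /= n * n;  sum.b /= n * n;
        return sum;
    }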

4) Efficient image generation -- this area has received a great deal of
attention, but there is still much to be done.  Basically, how can we
generate complex pictures in as little time as possible?  To think about
this, we must examine the method of producing a ray-traced image; that
will point to the areas where the time can be cut down.

A) The image
	It may be possible to cut down the number of rays that are cast.  For
	example, if all pixels of a certain area are the same color, there is no
	need to trace them all.  I am not aware of any published works describing
	implementations of this idea, although I am sure that there are some.

B) The ray
	1) The number of intersection calculations
		Various ways of subdividing the object space have been proposed.
		Basically, the idea is to divide space (or hierarchically defined
		objects) into areas of interest.  Each object is associated with
		only the areas of space which it intersects.  Then, a ray need
		only intersect objects in the spatial areas through which it passes.
	2) Efficiency of an intersection calculation
		For some objects this is not a big deal, but for other, more
		complex objects (like procedurally defined objects or free-form
		surfaces composed of many patches) it can be.

C) The point
	Once the intersection is calculated, how can we provide visually complex
	surface detail at reasonable cost?


Well, I am done.  In case anybody has read this far, I would be very 
interested in discussion of ray-tracing topics on this newsgroup.

I am sorry for my horrible grammar and style, and also for not citing
references.  I cannot remember where many of these ideas came from.
I would be interested in building a ray-tracing bibliography on the
newsgroup -- if one already exists, could someone tell me about it?

Now, I hope that this will spark some interest in ray-tracing discussion
on this group, or at least force someone who knows what s/he is talking
about to correct my blunders.

derek

-- 
Derek Zahn @ wisconsin
...!{allegra,heurikon,ihnp4,seismo,sfwin,ucbvax,uwm-evax}!uwvax!derek
derek@wisc-rsch.arpa

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (09/16/85)

>   The problem is I can't get a handle on how objects are described when they're
>   ray traced.

The description of 3-D objects is pretty much independent of ray tracing.
Two common techniques are "combinatorial geometry" (typified by PADL-2),
wherein solid primitives are combined with set operations (intersection,
union, difference) to make up a composite object, and "boundary
representation" (actually several different techniques), wherein the
surface of an object is represented as a collection of surface patches,
each of which is simply described (e.g. as a parametric bicubic).  There
are some more sophisticated approaches, too.
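
To make the combinatorial-geometry idea concrete, such an object is
typically stored as a tree whose leaves are primitives and whose interior
nodes are set operations.  A minimal sketch of such a record in C (the
field names are mine, not PADL-2's):

    /* A combinatorial (CSG) solid: either a primitive, or a set
       operation applied to two sub-solids. */
    enum csg_op { CSG_PRIMITIVE, CSG_UNION, CSG_INTERSECTION, CSG_DIFFERENCE };

    struct solid {
        enum csg_op op;

        /* used when op == CSG_PRIMITIVE */
        enum { PRIM_SPHERE, PRIM_CYLINDER, PRIM_BOX } prim_type;
        double params[10];      /* center, radius, extents, etc. */

        /* used for the set operations */
        struct solid *left;
        struct solid *right;
    };

    /* Example: a plate with a hole is difference(box, cylinder) -- a
       CSG_DIFFERENCE node whose children are a box leaf and a cylinder
       leaf. */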

Just how you trace rays depends on how you represent the object.  In the
case of combinatorial geometry, the simplest approach is to intersect the
ray with every primitive solid (cylinder, cube, etc.) and perform Boolean
operations on the pieces of the ray resulting from each primitive.  In the
case of a surface description, one might be able to simply solve the
simultaneous equations for the ray and for the surface to determine the
point of intersection.  For more details, see Foley & Van Dam or some of
the more tutorial articles in past issues of IEEE Computer Graphics &
Applications.
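
The "Boolean operations on the pieces of the ray" work on intervals of
the ray parameter t: each primitive reduces the ray to an [enter, exit]
span, and the set operations combine spans.  The following sketch handles
only the simplest case, where each solid contributes at most one span
(a general implementation keeps lists of spans per solid):

    /* A piece of the ray lying inside a solid, as a span of the ray
       parameter t.  An empty span has enter > exit. */
    typedef struct { double enter, exit; } Span;

    static const Span EMPTY_SPAN = { 1.0, -1.0 };

    static int span_is_empty(Span s) { return s.enter > s.exit; }

    /* Intersection of two solids: the ray must be inside both, so the
       spans are clipped against each other. */
    Span span_intersect(Span a, Span b)
    {
        Span r;
        r.enter = (a.enter > b.enter) ? a.enter : b.enter;
        r.exit  = (a.exit  < b.exit)  ? a.exit  : b.exit;
        return r;
    }

    /* Difference a - b.  A full implementation must handle b splitting
       a into two pieces, which is exactly why real CSG tracers keep
       lists of spans; here the back piece is simply dropped. */
    Span span_difference(Span a, Span b)
    {
        if (span_is_empty(a) || span_is_empty(b) ||
            b.exit <= a.enter || b.enter >= a.exit)
            return a;                     /* b misses a entirely */
        if (b.enter <= a.enter && b.exit >= a.exit)
            return EMPTY_SPAN;            /* b swallows a */
        if (b.enter <= a.enter) {         /* b cuts away the front of a */
            a.enter = b.exit;
            return a;
        }
        a.exit = b.enter;                 /* b cuts away the back (or middle) */
        return a;
    }

The visible surface along the ray is then the smallest positive "enter"
value among the spans that survive at the root of the tree.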

brad@gcc-bill.ARPA (Brad Parker) (09/17/85)

In article <1858@bmcg.UUCP> fredc@bmcg.UUCP (Fred Cordes) writes:
>
>  I first saw ray casting/ray tracing at SIGGRAPH 83. I listened to James
>  Kajiya's paper...
(He's the only one crazy enough to ray trace fractals!)

> The problem is I can't get a handle on how objects are described when they're
>  ray traced.
>  Thanks much, Fred Cordes

This will probably be one of 10,000 replies, and I am probably the last
person who should try to explain this, but after listening to Turner Whitted
last year, I went out and wrote code to do this...

The idea I use to explain this is the so-called "pin hole camera". Imagine a
camera which is made from a shoe box. The "film" is taped to one end of
the inside of the box, and a small "pin hole" is made in the other end.
All light striking the film passes through the small hole.
	Your code creates an "image plane" which corresponds to the
film. An easy way to do this is to create an array in which each element
corresponds to one pixel. Your code sends a "ray" from each element in
the array (or pixel) through the "pin hole" into the object space. When
the ray hits an object in the object space, you calculate the intensity
of the light at the intersection point and color the pixel in the array.
	Rays are described as a starting point and a direction. The starting
point is easy (x in array, y in array, z = -(distance from image plane to
pin hole)). The direction is a vector from the image plane point towards
the pin hole. To find intersections in the object space, you must
describe each object in such a way that you can find the intersection
of the object and a line described as a point and direction vector.
	Start with a sphere (ever wonder why so many ray traced pictures have
spheres in them?). x**2 + y**2  + z**2 = r**2. Describe your ray as
x = x0+x1*t, y = y0+y1*t, z = z0+z1*t. (x0, y0, z0) is the starting point and
(x1,y1,z1) is the direction vector. Substitute the ray as a function
of t into the equation of a sphere. Solve for t. Once you have the
algebraic equation for the intersection in t, you are set. For each ray,
solve for t. If t is real (not imaginary), you have an intersection.
	Substitute t back into the ray equation and find the 3d point of the
intersection. At this point, it is wise to calculate the normal vector
at the point of intersection, as this is critical to calculating the
illumination value (this is done by taking partial derivatives of the
sphere equation in x,y, and z). See any good graphics text on this.
You should also send a ray toward the light source to determine if the
intersection point is in a shadow or is directly illuminated.
There you have it, simple ray tracing (sorry to those who already know this).
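
For anyone who wants that algebra spelled out, substituting the ray into
the sphere gives a quadratic in t, so the whole intersection boils down to
the quadratic formula plus a couple of sign checks.  A sketch in C (my own
arrangement, with the sphere centered at (cx,cy,cz) and radius r):

    #include <math.h>

    /* Intersect the ray (x0,y0,z0) + t*(x1,y1,z1) with the sphere of
       radius r centered at (cx,cy,cz).  Substituting the ray into
       (x-cx)^2 + (y-cy)^2 + (z-cz)^2 = r^2 gives a*t^2 + b*t + c = 0.
       Returns the nearest positive t, or -1.0 if there is no real,
       positive root (the ray misses, or the sphere is behind us). */
    double ray_sphere(double x0, double y0, double z0,
                      double x1, double y1, double z1,
                      double cx, double cy, double cz, double r)
    {
        double ox = x0 - cx, oy = y0 - cy, oz = z0 - cz;
        double a = x1*x1 + y1*y1 + z1*z1;
        double b = 2.0 * (ox*x1 + oy*y1 + oz*z1);
        double c = ox*ox + oy*oy + oz*oz - r*r;
        double disc = b*b - 4.0*a*c;
        double sq, t1, t2;

        if (disc < 0.0)
            return -1.0;                  /* roots are imaginary: no hit */

        sq = sqrt(disc);
        t1 = (-b - sq) / (2.0*a);         /* nearer root */
        t2 = (-b + sq) / (2.0*a);         /* farther root */

        if (t1 > 0.0) return t1;
        if (t2 > 0.0) return t2;          /* the eye is inside the sphere */
        return -1.0;
    }

    /* The (unnormalized) surface normal at a hit point (px,py,pz) is just
       the gradient of the sphere equation: (px-cx, py-cy, pz-cz). */
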
	Other objects are similar: you need only determine a method of
solving for the intersection "t" along the ray and of finding the
normal vector at the point of intersection. Turner Whitted supposedly
published some "cook book" code for this whole process at this year's SIGGRAPH.
	You can produce code which treats the "image plane" as a normalized
coordinate space (-1 < x < 1 and -1 < y < 1) and map the resultant pixel
locations into your bitmap. Doing things in a normalized way has many
advantages.
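
A sketch of that mapping (my own conventions: the eye at the origin
looking down +z, with the image plane one "focal length" in front of the
eye rather than behind the pinhole, so the picture comes out right side up):

    typedef struct { double x, y, z; } Vec;
    typedef struct { Vec origin; Vec dir; } Ray;

    /* Map pixel (i, j) of an xres-by-yres image to normalized image-plane
       coordinates in [-1, 1], then build the eye ray through that point.
       Flip the sign of ny if your picture still comes out upside down. */
    Ray eye_ray(int i, int j, int xres, int yres, double focal)
    {
        Ray ray;
        double nx = 2.0 * (i + 0.5) / xres - 1.0;   /* -1 .. 1 across  */
        double ny = 1.0 - 2.0 * (j + 0.5) / yres;   /* -1 .. 1, y up   */

        ray.origin.x = 0.0;  ray.origin.y = 0.0;  ray.origin.z = 0.0;
        ray.dir.x = nx;
        ray.dir.y = ny;
        ray.dir.z = focal;      /* normalize before use if your
                                   intersection code expects a unit vector */
        return ray;
    }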

Good luck. P.S.: Your pictures will probably come out upside down the first
time. I assume this is homage paid to some graphics deity who no
doubt resides somewhere in Utah ;-)
(wow - did I really type all this?)
-- 

J Bradford Parker
uucp: seismo!harvard!gcc-bill!brad

"She said you know how to spell AUDACIOUSLY? I could tell I was in love...
You want to go to heaven? or would you rather not be saved?" - Lloyd Coal

peterson@utah-cs.UUCP (John W Peterson) (09/17/85)

In article <318@gcc-bill.ARPA> brad@gcc-bill (Brad Parker) writes:
>
> "The idea I use to explain this is the so called "pin hole camera"..."
> "...Ps: Your pictures will probably come out upside down the first time..."

Your pictures don't have to come out upside down.  The pinhole camera
method uses a model like:
				        /^
image	|\	/		    |\/  | 
plane->	|  \ +/   --> To image      v/\  |
	|  /  \			        \|
	|/   |  \              So things get turned
             |                 upside down like this.
focal point--+                 

A simpler method is to place the focal point at the eye, so the 
model looks like this:

	   /
         /			       /^
       /+		             /  |
     /  |			   /^   |
eye +   |   ---> To image        /  |   |
     \  |			 \  |   |
       \+			   \|   |
        ^\			     \  |
	|  \			       \|
image---+                        Now things stay
plane				 right side up.

This also has the advantage of making the geometry involved more intuitive.

-jwp