[net.ai] Computer Vision, Pattern Recognition

sher@rochester.UUCP (07/15/85)

From: sher

I recently found out that discussion on computer vision related issues
is going on in net.graphics.  Since I am doing a thesis on low-level
computer vision I found this interesting.  I didn't read net.graphics
because image creation is a subject that is very peripheral to my
interests.  My research is about feature detection in images using
probabilistic models and detectors that return probabilities.  I am
also interested in doing vision on parallel machines, and I assisted
with the design of the WARP, a parallel pipelined machine (at CMU)
that will be devoted to image processing applications.  If we want to
get our own newsgroup I guess we will have to generate discussion, so
here are some topics and my ideas on them:

Reconstruction vs. Recognition-Based Systems:

Many people (especially people at MIT) believe that a fundamental step
in computer vision is to reconstruct some set of intrinsic parameters
such as surface orientation, texture, illumination, reflectivity.
This concept has so many references I pale before listing them (any
textbook on computer vision should cover this nicely).  Other people
feel that, since the purpose of computer vision systems is to recognize
a restricted set of situations, complete reconstruction at all points
in the image is wasteful and unnecessary.  Most bottom-up research is
reconstructive and most top-down research isn't.  It should be clear
that the reconstructivist position makes little sense in very restrictive
domains such as many industrial vision systems (sometimes called
verification vision systems).  In my opinion general-purpose vision
will require complete reconstruction only at the very lowest levels,
and as the routines get more high-level the reconstruction will be
increasingly incomplete.  Actually my opinion is much more complicated
than this, but it should provoke discussion among the interested.

Generalized Image Storage Format?

In Ballard & Brown, Computer Vision, a generalized image is defined
as an iconic, array-like structure containing information relevant to
an image (paraphrase, not quote).  Examples of generalized images are
Fourier-transformed images, edge images, stereo pairs, circle location
points, image histograms...  If there were generally accepted formats
for generalized images then I could use edge recognition programs
written at CMU and image interpretation routines written at U. Mass to
test my texture recognition routines written here at U. Rochester.  As
far as I can tell, every university stores raw images differently, and
for other generalized images every program stores them differently.
This, I believe, acts as a gigantic brake on vision research.
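
Just to make the idea concrete, here is roughly what a shared format
would have to carry.  The header fields and kind codes below are
purely hypothetical (they are not from any existing standard); a real
proposal would also need agreement on byte order, element types, etc.:

/* Hypothetical header for a "generalized image" file.  None of these
   field names or kind codes come from an existing standard; they only
   illustrate what a shared format would have to record. */
#include <stdio.h>

#define GIM_INTENSITY 0   /* raw gray-level image                   */
#define GIM_EDGE      1   /* edge map                               */
#define GIM_FOURIER   2   /* Fourier-transformed image              */
#define GIM_HISTOGRAM 3   /* image histogram                        */

struct gim_header {
    int kind;         /* one of the GIM_* codes above               */
    int rows, cols;   /* array dimensions                           */
    int bands;        /* e.g. 2 for a stereo pair                   */
    int bits;         /* bits per element (8, 16, 32, ...)          */
    int is_float;     /* nonzero if elements are floating point     */
};

/* Write the header followed by row-major data; a reader at another
   site needs only the struct definition to interpret the file. */
int gim_write(FILE *fp, struct gim_header *h, void *data, long nbytes)
{
    if (fwrite(h, sizeof(*h), 1, fp) != 1) return -1;
    if (fwrite(data, 1, (size_t)nbytes, fp) != (size_t)nbytes) return -1;
    return 0;
}

Even something this crude, agreed on across sites, would let an edge
map written at one university be read at another without a hand-built
conversion step.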

Parallelism and Computer Vision:

I have recently completed a TR studying the effect of differing
architectures on low-level computer vision.  I compared the CMU WARP
and the BBN Butterfly.  The interpretation task was pattern
recognition using convolution-based techniques on edge images.
The architectural features that affected the choice of implementation
of the routines were, in order of importance:
1. Relative speeds of instructions (floating point vs fixed point vs
memory access)
2. Local memory available per processor
3. Interconnection net between processors
Actually the effect of the interconnection net was completely masked
by the first two issues.  My research thus indicates that the
interconnection net is not a significant issue as far as architectures
for computer vision are concerned.  
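
To make the workload concrete, the task above boils down to an inner
loop like the one below.  This is only an illustrative sketch, not the
routines from the TR; the kernel size and names are made up.  The point
is that the cost of the floating-point multiply-adds and whether a
processor's tile (plus the kernel) fits in its local memory dominate
the running time, while the interconnection net only matters at tile
borders:

/* Illustrative convolution of one processor's tile of an edge image
   with a K x K kernel (tile borders ignored for brevity). */
#define K 5                          /* kernel is K x K */

void convolve_tile(const float *img, const float *kern, float *out,
                   int rows, int cols)
{
    int r, c, i, j;

    for (r = K/2; r < rows - K/2; r++)
        for (c = K/2; c < cols - K/2; c++) {
            float acc = 0.0f;
            for (i = 0; i < K; i++)
                for (j = 0; j < K; j++)
                    acc += kern[i * K + j] *
                           img[(r + i - K/2) * cols + (c + j - K/2)];
            out[r * cols + c] = acc;
        }
}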

This seems enough to spark some discussion (though I've been wrong
before).  Any more and people won't read it anyway.  
-David Sher
sher@rochester
seismo!rochester!sher

nather@utastro.UUCP (Ed Nather) (07/15/85)

> Reconstruction vs. Recognition-Based Systems:
> 
> Many people (especially people at MIT) believe that a fundamental step
> in computer vision is to reconstruct some set of intrinsic parameters
> such as surface orientation, texture, illumination, reflectivity.

I'm not sure where it fits into the theory, but we have in operation an
"image re-recognition" system that works fine for our (very restricted)
astronomical image fields.  We construct (from the original image) a set
of r-theta tables representing the distance and angle of each "nearby"
star image to our target position, as well as the distance and angle of
the "neighbors" of every star image in the original field.  The number
of neighbors is an adjustable parameter, depending on the "richness" --
the density of star images -- of the field.

When another image of this field is presented (at a later time, and
usually offset in X and Y) we can identify the target location in the
(offset) field by comparing the r-theta values from the new image with
the stored tables, using simple table look-up.  Cross correlation is
not needed.  We can then locate the target position and center it.
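
As a rough sketch of that matching step (my reconstruction from the
description above, not the actual code -- the entry layout and
tolerances are guesses):

/* Hypothetical r-theta matcher.  Each table entry holds the distance
   and angle from the target (or a reference star) to one neighbor.
   A star in the new, translated frame is accepted as the target when
   enough of its neighbor vectors match stored entries within
   tolerance; since the offset is a pure translation, both r and
   theta are preserved. */
#include <math.h>

struct rtheta { double r, theta; };

/* Distance and angle from star (x0,y0) to neighbor (x1,y1). */
struct rtheta make_entry(double x0, double y0, double x1, double y1)
{
    struct rtheta e;
    e.r     = hypot(x1 - x0, y1 - y0);
    e.theta = atan2(y1 - y0, x1 - x0);
    return e;
}

/* Count observed neighbor vectors that match stored table entries. */
int count_matches(const struct rtheta *table, int ntab,
                  const struct rtheta *obs, int nobs,
                  double rtol, double ttol)
{
    int i, j, hits = 0;

    for (i = 0; i < nobs; i++)
        for (j = 0; j < ntab; j++) {
            double dtheta = atan2(sin(obs[i].theta - table[j].theta),
                                  cos(obs[i].theta - table[j].theta));
            if (fabs(obs[i].r - table[j].r) < rtol && fabs(dtheta) < ttol) {
                hits++;
                break;
            }
        }
    return hits;
}

The candidate star with the largest match count is taken as the target.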

I realize this is a very limited application -- it only works on images
composed of point sources of light -- but the idea of transforming the
original image into "symbolic" form for comparison and recognition may have
some wider use.  The trick would be to find a transformation that retains
most of the information needed for recognition, and discards most of the
rest.  In this example, the chosen algorithm is very efficient.  For a star
field of average richness, only a few hundred bytes suffice to hold all of
the transformed information.  A 100 megabyte disk could hold all of the
"electronic finding charts" ever used in astronomy on this planet.

> Generalized Image Storage Format?
> 
> As far as I can tell, every university stores raw images differently, and
> for other generalized images every program stores them differently.
> This, I believe, acts as a gigantic brake on vision research.

Astronomers faced a similar problem, and seem to have solved it.  We can
trade images of star fields with other observatories if we just write them
onto mag tape in FITS tape format -- a generalized bit-mapped image transfer
system.  I can point you to a technical description of FITS if you're
interested.
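
For anyone who hasn't run into it: a FITS file starts with a header
made of 80-character ASCII "card images" packed into 2880-byte
records and ending with an END card, with the binary image data
starting on the next 2880-byte boundary.  A minimal header dump looks
roughly like this (error handling and keyword parsing omitted):

/* Print FITS header cards: 80-character records in 2880-byte blocks,
   terminated by an END card.  A real reader would parse BITPIX,
   NAXIS, etc. to find the size and type of the data that follows. */
#include <stdio.h>
#include <string.h>

void dump_fits_header(FILE *fp)
{
    char block[2880], card[81];
    int i, done = 0;

    while (!done && fread(block, 1, 2880, fp) == 2880)
        for (i = 0; i < 2880; i += 80) {
            memcpy(card, block + i, 80);
            card[80] = '\0';
            printf("%s\n", card);
            if (strncmp(card, "END     ", 8) == 0) { done = 1; break; }
        }
    /* Image data begins at the next 2880-byte boundary, i.e. right
       where fp is now positioned. */
}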

> This seems enough to spark some discussion (though I've been wrong
> before).  Any more and people won't read it anyway.  

Probably true.  I'm aware of three "automated telescope" projects in
astronomy that required image recognition to work, and all were total
failures.  A little coaxing would bring out details of this past history,
in hopes we won't be compelled to repeat it.

-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather%utastro.UTEXAS@ut-sally.ARPA