[net.graphics] Computer Vision, Pattern Recognition

sher@rochester.UUCP (07/15/85)

From: sher

I recently found out that discussion of computer vision related issues
is going on in net.graphics.  Since I am doing a thesis on low-level
computer vision I found this interesting.  I hadn't been reading
net.graphics because image creation is a subject that is very
peripheral to my interests.  My research is on feature detection in
images using probabilistic models and detectors that return
probabilities.  I am also interested in doing vision on parallel
machines, and I assisted with the design of the WARP, a parallel
pipelined machine at CMU that will be devoted to image processing
applications.  If we want to get our own newsgroup I guess we will
have to generate discussion, so here are some topics and my ideas on
them:
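To make the phrase "detectors that return probabilities" concrete, here is a minimal sketch of what such a detector might look like: the posterior probability of a feature given a filter response, computed by Bayes' rule under assumed Gaussian models.  All the parameter values here are made up for illustration; they are not from my TR.

```python
import math

def gaussian(x, mean, std):
    """Gaussian density, used as a likelihood model for filter responses."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def p_edge(response, prior=0.1,
           edge_mean=5.0, edge_std=2.0,
           noise_mean=0.0, noise_std=1.0):
    """Posterior probability of an edge given a filter response (Bayes' rule).

    Rather than thresholding the response into a yes/no answer, the
    detector reports how probable the feature is, so later stages can
    combine evidence instead of committing early.
    """
    num = gaussian(response, edge_mean, edge_std) * prior
    den = num + gaussian(response, noise_mean, noise_std) * (1 - prior)
    return num / den

# A strong response should yield a high edge probability, a weak one a low one:
strong = p_edge(5.0)
weak = p_edge(0.0)
```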

Reconstruction vs Recognition Based systems:

Many people (especially people at MIT) believe that a fundamental step
in computer vision is to reconstruct some set of intrinsic parameters
such as surface orientation, texture, illumination, and reflectivity.
This concept has so many references I pale before listing them (any
textbook on computer vision should cover this nicely).  Other people
feel that since the purpose of computer vision systems is to recognize
a restricted set of situations, complete reconstruction at all points
in the image is wasteful and unnecessary.  Most bottom-up research is
reconstructive, and most top-down research isn't.  It should be clear
that the reconstructivist position makes little sense in very
restricted domains such as many industrial vision systems (sometimes
called verification vision systems).  In my opinion general purpose
vision will require complete reconstruction only at the very lowest
levels; as the routines get more high-level, the reconstruction will
be more or less incomplete.  Actually my opinion is much more
complicated than this, but it should provoke discussion among the
interested.

Generalized Image Storage Format?

In Ballard & Brown, Computer Vision, a generalized image is defined
as an iconic-like array containing information relevant to an image
(paraphrase, not quote).  Examples of generalized images are Fourier-
transformed images, edge images, stereo pairs, circle location points,
image histograms...  If there were generally accepted formats for
generalized images, then I could use edge recognition programs written
at CMU and image interpretation routines written at U. Mass to test my
texture recognition routines written here at U. Rochester.  As far as
I can tell, every university stores images differently, and every
program stores other generalized images differently.  This, I believe,
acts as a gigantic brake on vision research.
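To suggest what a shared format might need, here is a toy sketch: a small header (what kind of generalized image, its dimensions, how many values per pixel) attached to a flat array.  This is a hypothetical design of my own for discussion, not an existing standard or anyone's actual format.

```python
class GeneralizedImage:
    """A typed array with enough metadata that programs could exchange it.

    kind names the generalized image, e.g. "intensity", "edge",
    "fourier", "histogram"; bands is the number of values per pixel
    (an edge image might carry magnitude and direction, so bands=2).
    """

    def __init__(self, kind, width, height, bands, data):
        if len(data) != width * height * bands:
            raise ValueError("data size does not match header")
        self.kind = kind
        self.width = width
        self.height = height
        self.bands = bands
        self.data = data

    def at(self, x, y, band=0):
        # Row-major addressing, bands interleaved per pixel.
        return self.data[(y * self.width + x) * self.bands + band]

# A 2x2 edge image storing (magnitude, direction) at each pixel:
edges = GeneralizedImage("edge", 2, 2, 2,
                         [1.0, 0.0,   0.5, 1.57,
                          0.0, 0.0,   0.9, 3.14])
```

The point is not this particular layout but that the header carries enough information for a program that has never seen the producer's code to interpret the array.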

Parallelism and Computer Vision:

I have recently completed a TR studying the effect of differing
architectures on low-level computer vision.  I compared the CMU WARP
and the BBN Butterfly.  The interpretation task was pattern
recognition using convolution-based techniques on edge images.
The architectural features that affected the choice of implementation
of the routines were, in order of importance:
1. Relative speeds of instructions (floating point vs. fixed point vs.
memory access)
2. Local memory available per processor
3. Interconnection net between processors
Actually, the effect of the interconnection net was completely masked
by the first two issues.  My research thus indicates that the
interconnection net is not a significant issue as far as architectures
for computer vision are concerned.
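For readers unfamiliar with the interpretation task, here is a toy serial version of convolution-based matching: slide a small template over an edge image and record the response at each position (this is the inner loop one would distribute across processors on a machine like the WARP or the Butterfly).  The example data is made up; it is only meant to show the computation.

```python
def correlate(image, template):
    """Valid-mode 2D correlation: response of template at each position.

    image and template are lists of rows of edge magnitudes.  Each
    output entry is the sum of elementwise products of the template
    with the image patch under it -- the basic operation whose cost
    (floating point speed, local memory for the patch) dominated the
    implementation choices discussed above.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    out = []
    for y in range(ih - th + 1):
        row = []
        for x in range(iw - tw + 1):
            s = 0.0
            for dy in range(th):
                for dx in range(tw):
                    s += image[y + dy][x + dx] * template[dy][dx]
            row.append(s)
        out.append(row)
    return out

# A vertical-edge template responds strongly where the edge image
# contains a vertical line of edge magnitudes:
edge_image = [[0, 1, 0],
              [0, 1, 0],
              [0, 1, 0]]
template = [[0, 1],
            [0, 1],
            [0, 1]]
responses = correlate(edge_image, template)
```

Note that every output position touches th*tw pixels independently of the others, which is why instruction speed and per-processor memory dominate and the interconnect matters so little here.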

This seems enough to spark some discussion (though I've been wrong
before).  Any more and people won't read it anyway.  
-David Sher
sher@rochester
seismo!rochester!sher