[comp.ai.neural-nets] NNs in 2D Shape Recognition

Conrad.Bullock@comp.vuw.ac.nz (conrad Bullock) (05/12/91)

Greetings.
I am working on an honours project, aiming to apply neural networks to
recognising simple shapes in two-dimensional space, independent of
position, noise, rotation, magnification, and other transforms.
Does anyone have any good references in relevant work, particularly in
rotation-invariant recognition?

Thank you.
-- 
Conrad Bullock                     | Domain:   conrad@comp.vuw.ac.nz
Victoria University of Wellington, |     or:   conrad@cavebbs.gen.nz
New Zealand.                       | Fidonet:  3:771/130
                                   | BBS:      The Cave BBS +64 4 643429

ins_atge@jhunix.HCF.JHU.EDU (Thomas G Edwards) (05/14/91)

In article <1991May12.115515.7741@comp.vuw.ac.nz> Conrad.Bullock@comp.vuw.ac.nz (conrad Bullock) writes:
>Greetings.
>I am working on an honours project, aiming to apply neural networks to
>recognising simple shapes in two-dimensional space, independent of
>position, noise, rotation, magnification, and other transforms.
>Does anyone have any good references in relevant work, particularly in
>rotation-invariant recognition?

Get Zemel and Hinton "Discovering Viewpoint-Invariant Relationships
 That Characterize Objects" from the /pub/neuroprose dir of
 cheops.cis.ohio-state.edu via anon ftp.

Two networks are trained on images of the same object with various
orientations, positions, and sizes.  The networks are trained to
have high mutual information between their four outputs, which,
if properly trained, must represent a coding of the orientation,
position, and size of the object.  While extracting that recoding
might not be easy, we can use this property to have a network
trained in this way reject other shapes it is exposed to, since the
outputs will no longer agree on the position, orientation, and
size of the object.  Thus multiple nets trained in this way can
compete to see which pair has the highest mutual information:
the pair with the highest mutual information will be the
pair trained on the shape of the test object.

-Thomas Edwards
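
A rough sketch of this pair-competition idea (an illustrative toy, not code
from the Zemel and Hinton paper, which trains the nets by maximizing mutual
information directly): estimate the mutual information between the two
outputs of each trained pair on the test views, and let the pair with the
highest estimate claim the shape.  Here the "trained" pair is faked as two
noisy functions of a hidden orientation.

```python
import numpy as np

def mutual_info(x, y, bins=8):
    """Histogram-based estimate of mutual information (in nats)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 5000)   # hidden orientation of the test shape

# Pair trained on this shape: both outputs track the same hidden parameter.
out_a1 = np.sin(theta) + 0.05 * rng.normal(size=theta.size)
out_a2 = np.sin(theta) + 0.05 * rng.normal(size=theta.size)

# Pair trained on some other shape: its outputs share no information here.
out_b1 = rng.normal(size=theta.size)
out_b2 = rng.normal(size=theta.size)

scores = {"pair_A": mutual_info(out_a1, out_a2),
          "pair_B": mutual_info(out_b1, out_b2)}
winner = max(scores, key=scores.get)      # the pair that "recognizes" the input
```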

guedalia@bimacs.BITNET (David Guedalia) (05/15/91)

In article <1991May12.115515.7741@comp.vuw.ac.nz> Conrad.Bullock@comp.vuw.ac.nz (conrad Bullock) writes:
>Greetings.
>I am working on an honours project, aiming to apply neural networks to
>recognising simple shapes in two-dimensional space, independent of
>position, noise, rotation, magnification, and other transforms.
>Does anyone have any good references in relevant work, particularly in
>rotation-invariant recognition?
>
   I think Fukushima's Neocognitron et al. covers exactly those points:
rotation, scaling and the like.  One reference can be found in Neural
Networks 1987 (sorry, forgot which volume and issue).

  david

peretz@gradient.cis.upenn.edu (Samuel R. Peretz) (05/17/91)

>Conrad.Bullock@comp.vuw.ac.nz (conrad Bullock) writes:
>I am working on an honours project, aiming to apply neural networks to
>recognising simple shapes in two-dimensional space, independent of
>position, noise, rotation, magnification, and other transforms.
>Does anyone have any good references in relevant work, particularly in
>rotation-invariant recognition?

Me too!  Anyone responding by e-mail, could you send me a copy?

				Thanks,
				Sam



	<=======================================================>
	< Samuel R. Peretz			      		>
	< 126 Anatomy/Chemistry Bldg.		   \ /  	>
	< University of Pennsylvania		 ------- 	>
	< Inst. for Neurological Sciences	| 0   0 |	>
	< (215) 898-8048			|   V   |	>
	< peretz@grad1.cis.upenn.edu		|  ===  |	>
	< aka sam@retina.anatomy.upenn.edu	 -------	>
	< aka srp@vision5.anatomy.upenn.edu			>
	<=======================================================>




disbrow@skipper.dfrf.nasa.gov (Jim Disbrow) (05/18/91)

In article <43385@netnews.upenn.edu>, peretz@gradient.cis.upenn.edu (Samuel R. Peretz) writes:
|> >Conrad.Bullock@comp.vuw.ac.nz (conrad Bullock) writes:
|> >I am working on an honours project, aiming to apply neural networks to
|> >recognising simple shapes in two-dimensional space, independent of
|> >position, noise, rotation, magnification, and other transforms.
|> >Does anyone have any good references in relevant work, particularly in
|> >rotation-invariant recognition?
|> 
|> Me too!  Anyone responding by e-mail, could you send me a copy?
|> 
|> 				Thanks,
|> 				Sam

It sure seems to me this could be of benefit to a lot of us. If there
are any responses, please post. 
Thanx,
Jim Disbrow
disbrow@skipper.dfrf.nasa.gov

Conrad.Bullock@comp.vuw.ac.nz (Conrad Bullock) (05/19/91)

In article <926@skipper.dfrf.nasa.gov> disbrow@skipper.dfrf.nasa.gov writes:
>In article <43385@netnews.upenn.edu>, peretz@gradient.cis.upenn.edu (Samuel R. Peretz) writes:
>|> >Conrad.Bullock@comp.vuw.ac.nz (conrad Bullock) writes:
>|> >I am working on an honours project, aiming to apply neural networks to
>|> >recognising simple shapes in two-dimensional space, independent of
>|> >position, noise, rotation, magnification, and other transforms.
>|> >Does anyone have any good references in relevant work, particularly in
>|> >rotation-invariant recognition?
>|> 
>|> Me too!  Anyone responding by e-mail, could you send me a copy?
>
>It sure seems to me this could be of benefit to a lot of us. If there
>are any responses, please post. 

I will post a summary of the many useful responses that I received,
probably within a couple of days. Thank you to everyone for their
help.
-- 
Conrad Bullock                     | Domain:   conrad@comp.vuw.ac.nz
Victoria University of Wellington, |     or:   conrad@cavebbs.gen.nz
New Zealand.                       | Fidonet:  3:771/130
                                   | BBS:      The Cave BBS +64 4 643429

alpowell@images.cs.und.ac.za (05/20/91)

ins_atge@jhunix.HCF.JHU.EDU (Thomas G Edwards) writes:

>In article <1991May12.115515.7741@comp.vuw.ac.nz> Conrad.Bullock@comp.vuw.ac.nz (conrad Bullock) writes:
>>Greetings.
>>I am working on an honours project, aiming to apply neural networks to
>>recognising simple shapes in two-dimensional space, independent of
>>position, noise, rotation, magnification, and other transforms.
>>Does anyone have any good references in relevant work, particularly in
>>rotation-invariant recognition?

>Get Zemel and Hinton "Discovering Viewpoint-Invariant Relationships
> That Characterize Objects" from the /pub/neuroprose dir of
> cheops.cis.ohio-state.edu via anon ftp.
>-Thomas Edwards

Hi. I've been trying to get hold of this paper (and a few others in
the neuroprose directory) but unfortunately do not have direct ftp
access. I've used the BITNET ftp service but unfortunately it isn't 
available to me any more. Does anyone know of any other way I can 
get hold of the paper? Could anyone e-mail them to me (yes I know it's
a lot to ask)? Any other suggestions?

Thanks, Alan
(alpowell@images.cs.und.ac.za)

Conrad.Bullock@comp.vuw.ac.nz (Conrad Bullock) (05/24/91)

A while back I posted the following query to comp.ai.neural-nets.

>Greetings.
>I am working on an honours project, aiming to apply neural networks to
>recognising simple shapes in two-dimensional space, independent of
>position, noise, rotation, magnification, and other transforms.
>Does anyone have any good references in relevant work, particularly in
>rotation-invariant recognition?

Here's a summary of the responses that I received.

----------------------------------------------------------------------------

From: mark@swanee.ee.uwa.oz.au (Mark Morrison)

Conrad,
      I worked on 2-D shape recognition for about 8 months when I was
looking for something to do for a PhD.  I was recognising simple 2-D
shapes, such as squares, triangles and circles, initially, and then 
extended this to recognition of silhouettes of more complex 3-D objects,
including objects such as beakers, test tubes and pipette stands...

I employed a back-propagation neural network to do the classification.
The initial invariant feature extraction was performed by using a two
dimensional fourier transform and then integrating the magnitude of the
transform over a number of concentric rings to gett rottational invariance.
Scale invariance could be acheived by integrating over wedges, rather than
rings.  I managed to get basically 100% recognition with a set of 6 objects.
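
The ring-integration step can be sketched in a few lines of NumPy (a
reconstruction of the idea as described above, not Mark's actual code):
rotating an image rotates its Fourier magnitude by the same angle, so
summing the magnitude over concentric rings about the centre yields
rotation-invariant features.

```python
import numpy as np

def ring_features(img, n_rings=8):
    """Sum the centred FFT magnitude over concentric rings; rotating the
    image rotates |F| identically, leaving the ring sums unchanged."""
    mag = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    ring = np.minimum((r / r.max() * n_rings).astype(int), n_rings - 1)
    feats = np.bincount(ring.ravel(), weights=mag.ravel(), minlength=n_rings)
    return feats / feats.sum()            # normalise out overall image energy

# A horizontal bar and the same bar rotated 90 degrees give near-equal features.
img = np.zeros((64, 64))
img[28:36, 16:48] = 1.0
rot = np.rot90(img)
```

(Integrating over wedges instead of rings, as Mark notes, trades this for
scale invariance.)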

This is only a brief summary, but the work was based primarily on a 
conference paper -

	Glover, David E. (1988) "An optical Fourier/Electronic
        neurocomputer automated inspection system", Proc. 1988
        IEEE Int. Conf. on Neural Networks, vol 1.

and a follow-up journal article -

        Glover, David E. (1989). "Optical processing and neuro-
        computing in an automated inspection system.", Journal
        of Neural Network Computing, Fall 1989, pp 17-38.

I hope this is of some assistance to you.

                                          Regards,
 
                                          Mark Morrison,
                                          Dept. of Electrical & Electronic Eng.
                                          University of Western Australia.

[Thank you - I have been hoping to avoid having to use a conventional
 image filter (such as a Fourier transform) in order to achieve rotation
 invariance, but that may be the way to go.]
                                          
----------------------------------------------------------------------------

From: ins_atge@jhunix.hcf.jhu.edu (Thomas G Edwards)

There has recently been some interesting work in training two
separate (yet fairly simple... usually two-layer) networks, which
have as input slightly different views of the two-D space, to
maximize the mutual information of their outputs (which amounts to
back-propagation with a slightly different error function than
normally used).

These nets are then trained using different orientations of
a particular shape, say a square.  The outputs of the two nets,
since they are constrained to have high mutual information, can only
be a representation of the orientation of the square.  Unfortunately,
deciphering the nets' representation of orientation into your
representation of orientation is not easy.

However, if we train a pair of nets for a square, a pair for a triangle,
and a pair for a line, we can figure out what shape an object is
by seeing which pair of nets has the highest mutual information between
its outputs... the pair which does should be the pair which was trained
on that object.

Similarly, the pairs of nets could be trained on both different orientations
and sizes.

I'll try to dig up the reference for this paper.

-Tom

[I have asked for these references to be dug up - no response yet.]

----------------------------------------------------------------------------

[Thomas Bruel posted this summary of non-neural approaches early in May]

From: tmb@ai.mit.edu (Thomas M. Breuel)
Organization: MIT Artificial Intelligence Lab

>		Point Set Matching
>		------------------
>		(Barrodale Computing Services Ltd, May 1991).
>
>  Problem Definition: 
>    We are given two sets of points in the plane. These points could represent
>    two `simplified' images or output from some sensors. The first set
>    contains M points. The second set is similar to the first set, except
>    that some of the points from the first set are missing and some new
>    points, not in the first set, are present. The second set contains N
>    points. The positions of the points in the second set are, within a
>    given tolerance, the same as common points in the first set. However,
>    within this tolerance fairly large local distortions can occur.
>
>    The problem has three parts:
>      1. Find all the points in the first set which do not have a match in
>      the second set.
>
>      2. Find all points in the second set which do not have a match in
>      the first set.
>
>      3. For all points in the first set which have a common point in the
>      second set find the correct match.
>
>  Questions:
>    We are interested in hearing from anyone who has worked on the above
>    problem or has worked on related problems. We are also interested in
>    looking at the possibility of using artificial intelligence
>    techniques, like neural networks, for solving the problem.

[since this question seems to come up from time to time, I'm posting
this response]

The following papers will give you a good start at the literature
(Eric Grimson's book has an extensive bibliography of the pre-1990
work on the subject; you should look there for other references):

   Alt H., Mehlhorn K., Wagener H., Welzl E., 1988, Congruence,
   Similarity, and Symmetries of Geometric Objects., Discrete and
   Computational Geometry.
   
   Baird H. S., 1985, Model-Based Image Matching Using Location, MIT
   Press, Cambridge, MA.
   
   Breuel T. M., 1991, An Efficient Correspondence Based Algorithm for 2D
   and 3D Model Based Recognition, In Proceedings IEEE Conf. on Computer
   Vision and Pattern Recognition.
   
   Cass T. A., 1990, Feature Matching for Object Localization in the
   Presence of Uncertainty, In Proceedings of the International
   Conference on Computer Vision, Osaka, Japan, IEEE, Washington, DC.
   
   Grimson E., 1990, Object Recognition by Computer, MIT Press,
   Cambridge, MA.

State-of-the-art algorithms running on a SparcStation can find optimal
solutions (either maximal size of match at given error or minimum
error at given size of match) to this kind of bounded error
recognition problem on the average in under a minute, for models
consisting of hundreds of points and images consisting of 1000-2000
unlabeled, oriented features.
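
The bounded-error matching problem quoted above can be illustrated, far
short of the optimal algorithms in the papers cited, with a greedy
nearest-neighbour pass (a toy baseline, not from any of the references):

```python
import numpy as np

def match_points(a, b, tol):
    """Greedily pair each point in `a` with its nearest unused point in `b`
    within `tol`; return the matches and the unmatched points on each side."""
    used, matches = set(), []
    for i, p in enumerate(a):
        d = np.hypot(*(b - p).T)          # distances from p to every point in b
        for j in np.argsort(d):
            if j not in used and d[j] <= tol:
                matches.append((i, int(j)))
                used.add(int(j))
                break
    unmatched_a = [i for i in range(len(a)) if i not in {m[0] for m in matches}]
    unmatched_b = [j for j in range(len(b)) if j not in used]
    return matches, unmatched_a, unmatched_b

# Four square corners plus an outlier, versus a jittered partial copy.
a = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [5, 5]])
b = np.array([[0.02, 0.01], [0.98, 1.01], [0.01, 0.99], [9, 9]])
matches, miss_a, miss_b = match_points(a, b, tol=0.1)
```

Greedy matching can be suboptimal when tolerance regions overlap; the
Baird, Cass and Breuel references treat the problem exactly.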

----------------------------------------------------------------------------

From BMather%TSS%SwRI05@D26VS046A.CCF.SwRI.EDU Fri May 17 10:11:29 1991

We at Southwest Research Institute have developed a geometric shape
recognition system which can identify simple geometric shapes (star, square,
circle, triangle, rectangle) in binary images. The shapes can have binary
noise on top of them, but they cannot be occluded (or overlapped).

We preprocess the image to obtain the area, perimeter, and the first 3
moments. Polynomial combinations of these are used as input to a standard
BP network. We have about 20 hidden nodes. Recognition is nearly 100%
and fails when the rectangles or triangles fold into a line.
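
The preprocessing step can be sketched as follows (a guess at the kind of
features described; the exact SwRI recipe and the polynomial combinations
fed to the BP net are not given in the post):

```python
import numpy as np

def shape_features(img):
    """Area, a crude perimeter, and scale-normalised second-order central
    moments of a binary image."""
    fg = img.astype(bool)
    ys, xs = np.nonzero(fg)
    area = len(xs)
    # perimeter estimate: foreground pixels with a background 4-neighbour
    b = np.pad(fg, 1)
    interior = b[:-2, 1:-1] & b[2:, 1:-1] & b[1:-1, :-2] & b[1:-1, 2:]
    perimeter = int(area - (fg & interior).sum())
    cx, cy = xs.mean(), ys.mean()
    # dividing mu_pq by area**(1 + (p+q)/2) removes the dependence on scale
    mu = lambda p, q: float(((xs - cx) ** p * (ys - cy) ** q).sum()
                            / area ** (1 + (p + q) / 2))
    return [area, perimeter, mu(2, 0), mu(0, 2), mu(1, 1)]

# A filled 10x10 square: 100 pixels of area, 36 boundary pixels.
img = np.zeros((16, 16))
img[3:13, 3:13] = 1
features = shape_features(img)
```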

I can send more if you are interested. I was the Principal Investigator on
this project.

Dr. Bruce C. Mather
Sr. Research Engineer
Southwest Research Institute

----------------------------------------------------------------------------

From: Giorgos Bebis <bebis@csi.forth.gr>

Hello Conrad,

I am working on the field of object recognition using Neural Nets.
Some useful references are the following :

1. A. Khotanzad and J. Lu "Classification of Invariant Image Representation
using a Neural Network", IEEE Transc. on Acoustics, Speech, and Signal
Processing, vol. 38, No. 6, June 1990.

2. L. Gupta, M. Sayeh and R. Tammana "A Neural Network approach for
Robust Shape Classification", Pattern Recognition vol. 23, No. 6, 1990.

3. H. Wechsler and G. Zimmerman "2-D Invariant object recognition
using distributed associative memory", IEEE Transactions on Pattern
Analysis and Machine Intelligence (I don't have further information).

4. G. Bebis and G. Papadourakis "Model Based Object Recognition Using
Artificial Neural Networks", accepted for presentation at the International
Conference on Artificial Neural Networks (ICANN-91), to be held at Helsinki
University of Technology, Espoo, Finland, June 1991 (also accepted for
publication in Pattern Recognition).

G. Bebis, S. Orphanoudakis, and G. Papadourakis "Model-Based Object
Recognition Using Multiresolution Segmentation and Neural Network Models",
to be published.

Hope this helps....

George Bebis,                        
Dept. of Computer Science,
University of Crete, 
PO BOX 1470, Iraklion, Crete, GREECE         E-mail : bebis@csi.forth.gr

----------------------------------------------------------------------------

From: guy@minster.york.ac.uk

@article{
  kn:Fukushima-75, 
  author = "K. Fukushima", 
  title = "Cognitron: A Self-Organising Multilayered Neural Network", 
  journal = "Biological Cybernetics", 
  year = "1975", 
  volume = "20", 
  pages = "121-36" 
  }
% Describes a BackProp type cone architecture. It predates BP and uses a
% different learning algorithm. It is used for image processing.

@article{
  kn:Fukushima-80, 
  author = "K. Fukushima", 
  title = "Neocognitron: A Self-Organising Neural Network Model for a Mechanism of Pattern Recognition Unaffected by a Shift in Position", 
  journal = "Biological Cybernetics", 
  year = "1980", 
  volume = "36", 
  pages = "193-202" 
  }
% Like Cognitron, but also uses weight sharing. 

@article{
  kn:Giles-87, 
  author = "C.L. Giles and T. Maxwell", 
  title = "Learning, Invariance and Generalization in High-Order Neural Networks", 
  journal = "Applied Optics", 
  year = "1987", 
  volume = "26", 
  number = "3", 
  pages = "4972-8" 
  }
% Shows that squashed polynomial activation functions can give invariance wrt
% some transformations of the data vector. 

@article{
  kn:Reid-89,
  author = "M.B. Reid and L. Spirkovska and E. Ochoa",
  title = "Simultaneous Position, Scale and Rotation Invariant Pattern Classification Using 3rd order Neural Networks",
  journal = "International Journal of Neural Networks: Research and Applications" ,
  year = "1989", 
  volume = "1", 
  number = "3", 
  pages = "154-9"
  }
% Uses third-order units with weight sharing to get simultaneous position,
% scale and rotation invariance for 2-D patterns.
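
The weight-sharing trick behind these invariance results can be shown with
a toy second-order sketch (an illustration, not from either paper): if the
shared weight on each pair of inputs depends only on the distance between
their positions, the summed response is unchanged under translation and
rotation of a point pattern.  The Reid, Spirkovska and Ochoa paper shares
third-order weights by the included angles of point triples, which adds
scale invariance.

```python
import numpy as np

def second_order_features(points, n_bins=8, d_max=7.0):
    """Accumulate one shared weight per distance bin over all input pairs;
    translation and rotation preserve pairwise distances, so the feature
    vector is invariant to both."""
    feats = np.zeros(n_bins)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = np.hypot(*(points[i] - points[j]))
            feats[min(int(d / d_max * n_bins), n_bins - 1)] += 1.0
    return feats

# A 3-4-5 triangle, then the same triangle rotated and translated.
pts = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
th = 0.7
R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
moved = pts @ R.T + np.array([5.0, -2.0])
```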

@article{
  kn:Seibert, 
  author = "M. Seibert and A.M. Waxman",
  title = "Spreading Activation Layers, Visual Saccades, and Invariant Representations for Neural Pattern Recognition Systems",
  journal = "Neural Networks",
  year = "1989",
  volume = "2",
  number = "1",
  pages = "9-27"
  }
% Uses complex log mapping and centroids to get recognition of silhouettes 
% invariant wrt translation, rotation and scale. 
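
The complex-log idea can be sketched as a histogram on a log-polar grid
about the shape's centroid (a sketch of the mapping only; the Seibert and
Waxman system does considerably more): rotation of the shape becomes a
circular shift along the angle axis, and scaling a shift along the
log-radius axis.

```python
import numpy as np

def log_polar_map(img, n_r=16, n_t=16):
    """Histogram a binary shape's pixels over log-radius and angle about
    the centroid; rotations circularly shift the angle axis."""
    ys, xs = np.nonzero(img)
    dy, dx = ys - ys.mean(), xs - xs.mean()
    r, theta = np.hypot(dx, dy), np.arctan2(dy, dx)
    keep = r > 1e-9                      # a pixel at the centroid has no angle
    logr, theta = np.log(r[keep]), theta[keep]
    hist, _, _ = np.histogram2d(logr, theta, bins=[n_r, n_t],
                                range=[[logr.min(), logr.max()],
                                       [-np.pi, np.pi]])
    return hist / hist.sum()

# An L-shape and its 90-degree rotation: the maps differ by a 4-bin shift
# along the angle axis (16 bins spanning 360 degrees).
img = np.zeros((32, 32))
img[4:20, 4:8] = 1
img[16:20, 8:24] = 1
sig = log_polar_map(img)
sig_rot = log_polar_map(np.rot90(img))
```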

----------------------------------------------------------------------------

From: guedalia@bimacs.BITNET (David Guedalia)

   I think Fukushima's Neocognitron et al. covers exactly those points:
rotation, scaling and the like.  One reference can be found in Neural
Networks 1987 (sorry, forgot which volume and issue).

  david

[I've looked at the Neocognitron, but it appears to handle only limited
 degrees of rotation - am I incorrect here?]

----------------------------------------------------------------------------

From: ins_atge@jhunix.HCF.JHU.EDU (Thomas G Edwards)

Get Zemel and Hinton "Discovering Viewpoint-Invariant Relationships
 That Characterize Objects" from the /pub/neuroprose dir of
 cheops.cis.ohio-state.edu via anon ftp.
-Thomas Edwards

[I have this - it is a good paper. The filename is zemel.unsup-recog.ps.Z]

----------------------------------------------------------------------------

From: ahmad@ICSI.Berkeley.EDU (Subutai Ahmad)

**  You may be interested in the following references:

Giles, C.L., Griffin, R.D., and Maxwell, T.  Encoding Geometric
Invariances in Higher Order Neural Networks.  Neural Information
Processing Systems  - 1987. pp 301-309

Giles, C.L., Sun, G.Z., Chen, H.H., Lee, Y.C., and Chen, D. Higher
Order Recurrent Networks and Grammatical Inference. Neural Information
Processing Systems '89.  Morgan Kaufmann, San Mateo, Ca., 1990.

Maxwell, T., Giles, C.L., Lee, Y.C., and Chen, H.H.  Transformation
Invariance Using Higher Order Correlations in Neural Net
Architectures.  Proceedings of the IEEE International Conference on
Systems, Man and Cybernetics, October 1986. 

Honavar, V., and Uhr, L.  Generation, Local Receptive Fields and
Global Convergence Improve Perceptual Learning in Connectionist
Networks.  IJCAI-89, Vol1, Pg 180. 

Zemel, R., Mozer, M., and Hinton, G.  TRAFFIC: Recognizing Objects
Using Hierarchical Reference Frame Transformations.  Touretzky, D.S.
(ed.), Advances in Neural Information Processing Systems 2 1989.


**  Also, for a paper which discusses why techniques like the above are
**  likely to be no good for even simple problems with high resolution
**  images, see:

Ahmad, S. and Omohundro, S. Equilateral Triangles: A Challenge for
Connectionist Vision. In: Proceedings of the 12th Annual meeting of the
Cognitive Science Society, MIT, 1990.

Ahmad, S. and Omohundro, S. A Network for Extracting the Locations of
Point Clusters Using Selective Attention, International Computer
Science Institute, Berkeley Tech Report No. TR-90-011, 1990.

**  Zemel et al, and the Ahmad & Omohundro papers are available from the
**  /pub/neuroprose dir of cheops.cis.ohio-state.edu via anon ftp.

[Note from Conrad - The second Ahmad and Omohundro paper is there, but
 I couldn't find the first one. The Zemel paper quoted in this message
 is not there, only the paper "Discovering Viewpoint-Invariant
 Relationships That Characterize Objects", mentioned above.]
-- 
Conrad Bullock                     | Domain:   conrad@comp.vuw.ac.nz
Victoria University of Wellington, |     or:   conrad@cavebbs.gen.nz
New Zealand.                       | Fidonet:  3:771/130
                                   | BBS:      The Cave BBS +64 4 643429

Conrad.Bullock@comp.vuw.ac.nz (Conrad Bullock) (05/25/91)

Well, it had to happen - I left (at least) one reply out of my summary
on NNs in 2D Shape Recognition. So here's the 'patch' -

----------------------------------------------------------------------------

Date: Sun, 19 May 91 12:39:43 -0400
From: Jonathan Marshall <marshall@cs.unc.edu>

See:
        Coolen & Kuijk, "A learning mechanism for invariant pattern
recognition in neural networks," Neural Networks, vol.2, no.6, 1989.

        Foldiak, "Learning invariance from transformation sequences,"
to appear in Neural Computation, vol.3, no.2, 1991.

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
=                                                                       =
=   Jonathan A. Marshall                          marshall@cs.unc.edu   =
=   Department of Computer Science                                      =
=   CB 3175, Sitterson Hall                                             =
=   University of North Carolina                  Office 919-962-1887   =
=   Chapel Hill, NC 27599-3175, U.S.A.               Fax 919-962-1799   =
=                                                                       =
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

----------------------------------------------------------------------------

Sorry about that!
-- 
Conrad Bullock                     | Domain:   conrad@comp.vuw.ac.nz
Victoria University of Wellington, |     or:   conrad@cavebbs.gen.nz
New Zealand.                       | Fidonet:  3:771/130
                                   | BBS:      The Cave BBS +64 4 643429