kilmer@hq.af.mil (09/19/90)
I was wondering if anyone had done a study of network paradigms that are particularly well suited for scale and rotation invariance. My problem involves identifying similar input patterns of different scale, such as a letter 'A' in two different font sizes (e.g. 14 or 16 point Courier), or different rotations (e.g. landscape vs. portrait, or a 23-degree angle).

I created a backprop net that accepts an 8x14 binary input matrix, and created a font within this matrix that the net was to associate with an 8-bit output (representing the character as a number, i.e. ASCII). I had 94 characters in the font. The network learned the 94 character associations fine, and I tested it with 3-8% noise levels with very good success at recovering the correct binary output. For larger fonts, I was simply going to scale the character down to fit into the 8x14 matrix. This should work... I haven't tried it yet. As for the rotation problem, I wasn't sure how to approach it (short of rotating the input object until I got a positive match).

While I was working on this I decided to try to approach the problem from a different angle. I was first going to teach the network different fonts until it had learned as many as I had access to, but wondered whether this was a dead end. Wouldn't a net that was able to extract the various features of an object, and output which features it has identified, be better than teaching it all fonts? Specifically, one that extracts features regardless of size or rotation. I have heard of something known as a neocognitron that was able to correctly identify disproportioned input, or something like that, but I haven't been able to find any info on it. Does anyone out there have any, or is anyone doing research in this area? I would appreciate any reply.

Thanks, Richard
--
.-------------------------------------------------------------------------.
| Richard Kilmer                        Kilmer@Opsnet-Pentagon.af.mil     |
| VAX Systems Analyst                   (AKA Kilmer@26.24.0.26)           |
|    .--->Look to the future --.   "But when hope has gone away           |
|    |                         |    In a night or in a day                |
|    `--- Through the past <---'    In a vision or in none                |
|                                   Is it therefore the less gone?"       |
`-------------------------------------------------------------------------'
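[The backprop setup described in the post above can be sketched in a few lines of NumPy. This is not Richard's code: the hidden-layer size, learning rate, and the random stand-in patterns are all guesses; only the shapes (8x14 binary input, 8-bit output code) follow the post.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes follow the post: 8x14 binary input matrix -> 8-bit output code.
# Hidden size and learning rate are guesses, not the poster's values.
n_in, n_hid, n_out = 8 * 14, 40, 8
W1 = rng.normal(0.0, 0.1, (n_in, n_hid))
W2 = rng.normal(0.0, 0.1, (n_hid, n_out))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# A few random "characters" and 8-bit target codes stand in for the
# 94-character font.
X = rng.integers(0, 2, (10, n_in)).astype(float)
Y = rng.integers(0, 2, (10, n_out)).astype(float)

def step(lr=0.5):
    """One full-batch gradient step; returns the mean squared error."""
    global W1, W2
    h = sigmoid(X @ W1)
    y = sigmoid(h @ W2)
    err = y - Y
    d2 = err * y * (1.0 - y)          # backprop through the output sigmoid
    d1 = (d2 @ W2.T) * h * (1.0 - h)  # ... and through the hidden layer
    W2 = W2 - lr * h.T @ d2
    W1 = W1 - lr * X.T @ d1
    return float((err ** 2).mean())

losses = [step() for _ in range(500)]
print(losses[-1] < losses[0])  # True: error drops as associations are learned
```

Note that nothing in this architecture encodes scale or rotation; it can only memorize the raster patterns it is trained on, which is exactly the limitation the rest of the thread addresses.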
schraudo@beowulf.ucsd.edu (Nici Schraudolph) (10/01/90)
One way to solve or circumvent the scale/translation/rotation invariance problem in visual recognition is through appropriate preprocessing of the inputs. I've seen an example of this approach at IJCNN'90 (San Diego):

David Casasent and Etienne Barnard, "Adaptive Clustering Neural Net for Piecewise Nonlinear Discriminant Surfaces", Proc. IJCNN'90, p. I-423 (also a paper by the same authors in IJCNN'89 (Washington), p. I-111).

They first perform a 2-D Fourier transform on the image (which gives them translation invariance), then use input neurons with ring- and wedge-shaped receptive fields on the transformed image. The "ring neurons" are scale sensitive but rotation invariant, whereas the "wedge neurons" are rotation sensitive but scale invariant. The right mix of these may provide a good feature space for this kind of recognition task.
--
Nicol N. Schraudolph, C-014          "Big Science, hallelujah.
University of California, San Diego   Big Science, yodellayheehoo."
La Jolla, CA 92093-0114                            - Laurie Anderson.
nici%cs@ucsd.{edu,bitnet,uucp}
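[For concreteness, here is a minimal NumPy sketch of the preprocessing chain described above: FFT magnitude followed by ring and wedge sums. The bin counts and test image are arbitrary choices; only the translation-invariance property is checked, since that one is exact for circular shifts.]

```python
import numpy as np

def ring_wedge_features(img, n_rings=8, n_wedges=8):
    """Ring and wedge sums over the 2-D FFT magnitude.

    The FFT magnitude is invariant to (circular) translation; summing it
    over concentric rings gives rotation-invariant, scale-sensitive
    features, and summing over angular wedges gives scale-invariant,
    rotation-sensitive ones. Bin counts here are arbitrary.
    """
    F = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    h, w = F.shape
    y, x = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r = np.hypot(y - cy, x - cx)
    theta = np.arctan2(y - cy, x - cx)  # angle in [-pi, pi]
    r_edges = np.linspace(0.0, r.max() + 1e-9, n_rings + 1)
    t_edges = np.linspace(-np.pi, np.pi + 1e-9, n_wedges + 1)
    rings = np.array([F[(r >= r_edges[i]) & (r < r_edges[i + 1])].sum()
                      for i in range(n_rings)])
    wedges = np.array([F[(theta >= t_edges[i]) & (theta < t_edges[i + 1])].sum()
                       for i in range(n_wedges)])
    return rings, wedges

# Translating the image leaves both feature sets unchanged.
img = np.zeros((32, 32))
img[8:16, 8:12] = 1.0
rings_a, wedges_a = ring_wedge_features(img)
rings_b, wedges_b = ring_wedge_features(np.roll(img, (5, 7), axis=(0, 1)))
print(np.allclose(rings_a, rings_b), np.allclose(wedges_a, wedges_b))  # True True
```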
reiner@isy.liu.se (Reiner Lenz) (10/01/90)
We studied the problem of invariance in pattern recognition in both a top-down and a bottom-up fashion.

A) In the top-down approach you know that you want to recognize patterns independent of some group of transformations. Using some theory you can show that the transformation group gives you the desired feature extraction process. For example: 2-D rotation invariance leads to the Fourier transform in polar coordinates, 3-D rotation invariance leads to surface harmonics, scale invariance leads to the Mellin transform, etc.

Ref.:

@article{Len_jos:89,
  author  = "Reiner Lenz",
  title   = "A Group Theoretical Model of Feature Extraction",
  journal = josaa,
  volume  = "6",
  number  = "6",
  pages   = "827-834",
  year    = "1989"
}

@book{Len:90ln,
  author    = "Reiner Lenz",
  title     = "Group Theoretical Methods in Image Processing",
  publisher = "Springer Verlag",
  series    = "Lecture Notes in Computer Science (Vol. 413)",
  address   = "Heidelberg, Berlin, New York",
  year      = "1990"
}

@article{Len:90,
  author  = "Reiner Lenz",
  title   = "Group-Invariant Pattern Recognition",
  journal = "Pattern Recognition",
  volume  = "23",
  number  = "1/2",
  pages   = "199-218",
  year    = "1990"
}

A generalization is the following: not all patterns in a group are equally important. This is the case for scale invariance, since patterns scaled by very small or very large factors are not very similar to the original pattern. How the theory must be modified in this case is described in one of our internal reports.

Ref.:

@techreport{Len_prob:91,
  author      = "Reiner Lenz",
  title       = "On probabilistic Invariance",
  institution = {Link\"oping University, ISY, S-58183 Link\"oping},
  note        = "Internal Report",
  year        = "1991"
}

B) We also investigated the problem in a bottom-up fashion. We design a learning filter system that consists of a fixed number of filter functions. Then we train this system with examples of the pattern class that we want to recognize.
The learning rule is designed in such a way that the resulting system produces filter functions with a minimum loss of information and a maximum concentration of the feature components. Examples show that this system learns the Fourier transform from examples of rotated patterns.

Ref.:

@inproceedings{Len:90ijcnn,
  author    = {Reiner Lenz and Mats \"Osterberg},
  title     = "Learning Filter Systems",
  booktitle = "Proc. Int. Joint Conference on Neural Networks, San Diego",
  year      = "1990"
}
--
"Kleinphi macht auch Mist"
Reiner Lenz               | Dept. EE.
                          | Linkoeping University
email: reiner@isy.liu.se  | S-58183 Linkoeping/Sweden
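[A quick NumPy illustration of the top-down idea above (polar Fourier / Mellin transforms): resample the image onto a log-polar grid, where rotation and uniform scaling of the input become shifts, then take an FFT magnitude along the angle axis to cancel rotation. This is a generic sketch of the principle, not Lenz's method; grid sizes and the nearest-neighbour sampling are arbitrary choices.]

```python
import numpy as np

def log_polar(img, n_r=32, n_t=32):
    """Nearest-neighbour resampling of an image onto a log-polar grid.

    In these coordinates a rotation of the input becomes a circular shift
    along the angle axis, and a uniform scaling becomes a shift along the
    log-radius axis -- which is why polar Fourier and Mellin transforms
    appear in the group-theoretic treatment. Grid sizes are arbitrary.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    rr = np.exp(np.linspace(0.0, np.log(r_max), n_r))[:, None]
    theta = np.linspace(0.0, 2.0 * np.pi, n_t, endpoint=False)
    yy = np.clip(np.round(cy + rr * np.sin(theta)).astype(int), 0, h - 1)
    xx = np.clip(np.round(cx + rr * np.cos(theta)).astype(int), 0, w - 1)
    return img[yy, xx]

img = np.zeros((65, 65))
img[20:30, 28:36] = 1.0
lp = log_polar(img)
# FFT magnitude along the angle axis cancels rotation, which is now just
# a circular shift of the columns:
desc = np.abs(np.fft.fft(lp, axis=1))
shifted = np.abs(np.fft.fft(np.roll(lp, 5, axis=1), axis=1))
print(np.allclose(desc, shifted))  # True
```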
manning@nntp-server.caltech.edu (Evan Marshall Manning) (10/01/90)
I have an interesting-looking, >1 year old preprint here entitled "Simultaneous Position, Scale, and Rotation Invariant Pattern Classification using Third-Order Neural Networks". It's by Max B. Reid, Lilly Spirkovska, and Ellen Ochoa at the Intelligent Systems Technology Branch, NASA Ames Research Center, Moffett Field, CA 94035. The preprint claims it was to be published in The International Journal of Neural Networks - Research and Applications; I gather it should have been printed by now. I also have a similar article on pages I-689 to I-692 of the proceedings of the IJCNN, June 1989.

Hope this helps.

-- Evan
***************************************************************************
Your eyes are weary from staring at the CRT for so  | Evan M. Manning
long.  You feel sleepy.  Notice how restful it is   | is
to watch the cursor blink.  Close your eyes.  The   | manning@gap.cco.caltech.edu
opinions stated above are yours.  You cannot        | manning@mars.jpl.nasa.gov
imagine why you ever felt otherwise.                | gleeper@tybalt.caltech.edu
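[The trick behind those third-order nets can be shown directly: the interior angles of the triangle formed by any three input points are unchanged by translation, uniform scaling, and rotation, so a third-order unit whose weights are shared across triples with equal angles responds invariantly. Below is a hedged sketch of just that geometric property (the point set and bin count are made up, and a real net would of course learn weights over the triples rather than histogram them).]

```python
import numpy as np
from itertools import combinations

def angle_signature(points, bins=6):
    """Histogram of the largest interior angle over all point triples.

    Interior angles of a triangle are unchanged by translation, uniform
    scaling, and rotation -- the property third-order (triple-product)
    networks exploit by sharing weights across triples with equal angles.
    The bin count here is an arbitrary choice.
    """
    hist = np.zeros(bins)
    for a, b, c in combinations(points, 3):
        a, b, c = map(np.asarray, (a, b, c))
        v = [b - a, c - b, a - c]  # directed edges around the triangle
        angles = []
        for i in range(3):
            u, w = -v[i - 1], v[i]  # the two edges meeting at one vertex
            cosang = np.dot(u, w) / (np.linalg.norm(u) * np.linalg.norm(w))
            angles.append(np.arccos(np.clip(cosang, -1.0, 1.0)))
        hist[min(int(max(angles) / np.pi * bins), bins - 1)] += 1
    return hist

pts = [(0.0, 0.0), (3.0, 1.0), (1.0, 4.0), (2.0, 3.0)]
# Apply an arbitrary similarity transform: rotate 23 deg, scale 1.7, shift.
t = np.deg2rad(23.0)
R = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
moved = [1.7 * (R @ np.asarray(p)) + np.array([5.0, -2.0]) for p in pts]
print(np.allclose(angle_signature(pts), angle_signature(moved)))  # True
```

The price of this invariance is the combinatorial blow-up in the number of triples, which is why the published work restricts itself to small input fields.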
minsky@media-lab.MEDIA.MIT.EDU (Marvin Minsky) (10/02/90)
And be sure to read the original classic -- Pitts and McCulloch 1947, reprinted in W. S. McCulloch's "Embodiments of Mind," an MIT Press book. Although it is from the pre-computer age, it has a nice clear explanation of the relevant invariant (Haar) measure theory. I don't recall any good analysis therein of how to prune off the irrelevant parts of the group to make things converge.