Laws@KL.SRI.COM (Ken Laws) (09/03/87)
The current networks will generally fail to recognize shifted patterns. All of the recognition networks I have seen (including the optical implementations) correlate the image with a set of templates and then use a winner-take-all subnetwork or a feedback enhancement to select the best-matching template. Vision researchers were doing this kind of matching (for character recognition, with the character known to be centered in the visual field) back in the 50s and early 60s. Position independence was then added by convolving the image and template, essentially performing the match at every possible shift. This was rather expensive, so Fourier, Hough, and hierarchical matching techniques were introduced. Then came edge detection, shape description, and many other paradigms. We don't have all the answers yet, but we've come a long way from the type of matching currently implemented in neural networks.

The advantage of the networks, particularly those implemented in analog hardware, is speed. IF you have a problem for which alignment is known, or IF you have time or hardware to try all possible alignments, or IF your network is complex enough to store all templates at a sufficient number of shifts, neural networks may be able to give you an off-the-shelf recognizer that bypasses the need to research all of the pattern recognition literature of the last decade.

I suspect that the above conditions will actually hold in a fair number of engineering situations. Indeed, many of these applications have already been identified by the signal processing community. Neural networks offer a trainable alternative to DSP or acoustic convolution chips. Where rules and explanations are appropriate, designers will use expert systems; otherwise they will use neural networks and similar systems. Only the most difficult and important applications will require development of customized reasoning systems such as numerical or object-oriented simulations.

                                        -- Ken
-------
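[Editor's note: a minimal sketch of the two matching schemes Ken describes -- plain template correlation with winner-take-all, and position independence by matching at every shift. All names and the 3x3 patterns are illustrative, not from any particular network.]

```python
import numpy as np

def correlate_templates(image, templates):
    """Dot-product correlation of an image with each stored template."""
    return np.array([np.sum(image * t) for t in templates])

def winner_take_all(scores):
    """Select the index of the best-matching template."""
    return int(np.argmax(scores))

def shift_invariant_match(image, template):
    """Position independence by brute force: correlate the template
    against the image at every possible shift, keep the best score."""
    ih, iw = image.shape
    th, tw = template.shape
    best = 0
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            best = max(best, np.sum(image[r:r+th, c:c+tw] * template))
    return best

# Two toy templates: a vertical bar and a horizontal bar.
bar_v = np.array([[0,1,0],[0,1,0],[0,1,0]])
bar_h = np.array([[0,0,0],[1,1,1],[0,0,0]])

# Centered input: plain correlation picks the right template.
scores = correlate_templates(bar_v, [bar_v, bar_h])
assert winner_take_all(scores) == 0

# Shifted input: embed the bar off-center in a larger field.
field = np.zeros((5, 5), dtype=int)
field[1:4, 1] = 1              # vertical bar, left of center
assert shift_invariant_match(field, bar_v) == 3   # full match found
```

The brute-force loop is exactly the expense Ken notes: the work grows with the number of shifts tried, which is what the Fourier, Hough, and hierarchical techniques were introduced to avoid.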
mikek@boulder.UUCP (Mike Kranzdorf) (09/04/87)
The second reference above is correct, but fails to mention work by Fukushima and Mozer. These multi-layer networks are able to form an internal distributed representation of a pattern on an input retina. They demonstrate very good shift and scale invariance. The new and improved neocognitron (Fukushima) can even recognize multiple patterns on the retina.

--mike
mikek@boulder.colorado.edu
maiden@SDCSVAX.UCSD.EDU (VLSI Layout Project) (09/07/87)
In article <12331701930.42.LAWS@KL.SRI.Com> AIList-Request@SRI.COM writes:
>The current networks will generally fail to recognize shifted patterns.
>All of the recognition networks I have seen (including the optical
>implementations) correlate the image with a set of templates and then
>use a winner-take-all subnetwork or a feedback enhancement to select
>the best-matching template.
[some lines deleted]
> -- Ken
>-------

There are a number of networks that will recognize shifts in position. Among them are optical implementations (see SPIE by Psaltis at Caltech) and the Neocognitron (Biol. Cybern. by Fukushima). The first Neocognitron article dates to 1978; the latest article is 1987. There have been a number of improvements, including shifts in attention.

Edward K. Y. Jung
------------------------------------------------------------------------
1. If the answer to life, the universe and everything is "42"...
2. And if the question is "what is six times nine"...
3. Then God must have 13 fingers.
------------------------------------------------------------------------
UUCP: {seismo|decwrl}!sdcsvax!maiden
ARPA: maiden@sdcsvax.ucsd.edu
sandon@dartmouth.EDU (Peter Sandon) (09/12/87)
I did not read the Byte article either. However, assuming that the network under discussion had no way to represent the similarity relationship among different nodes that represent translated versions of the same feature, it is not surprising that it would have a difficult time generalizing from a given pattern to an 'unaligned' version of that pattern.

Rumelhart pointed out to Banks that what is needed are many sets of units having similar weight patterns, that is, weights that are sensitive to translated versions of a given pattern. In addition, the relationship between these similar units must be represented. Rumelhart suggests adding units as needed but does not mention how to relate these additional units to the trained unit.

Fukushima did something similar in his Neocognitron, by broadcasting a learned weight set to an entire layer of units which were then all connected to an OR unit. This OR unit then represented the fact that all the units represented the same feature, modulo translation. Of course, broadcasting weights requires more global control than many would like, and the OR is not quite the relation we want for patterns of any complexity.

In 1981, Hinton suggested a means of separately representing shape and translation in a network, such that 'unaligned' patterns could be recognized. In my thesis, I implemented a modified version of that network scheme, in order to demonstrate that a network can generalize object recognition across translation. The network that I implemented is five layers deep, which proved too much for standard backpropagation (the generalized delta rule) and for my extensions to the GDR. However, generalization across translation can be demonstrated in a subnetwork of this network. I am working on further improvements to backpropagation that will allow the entire network to be trained.

It is important to recognize that there are many useless generalizations that might be made, and a few useful ones.
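[Editor's note: a sketch of the weight-sharing-plus-OR idea described above. This is not Fukushima's actual architecture -- all names, the 2x2 kernel, and the threshold are made up for illustration: one learned weight set is applied at every position, and an OR unit pools the positional detectors so that detection survives translation.]

```python
import numpy as np

def shared_weight_layer(image, weights, threshold):
    """Broadcast one weight set to a whole layer: the same kernel is
    applied at every position (one positional unit per placement)."""
    ih, iw = image.shape
    kh, kw = weights.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1), dtype=int)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = int(np.sum(image[r:r+kh, c:c+kw] * weights) >= threshold)
    return out

def or_unit(layer_outputs):
    """Fires if any positional unit detected the feature, so the
    response is the same feature 'modulo translation'."""
    return bool(np.any(layer_outputs))

weights = np.array([[1, 1],
                    [1, 1]])   # toy detector for a 2x2 block of ones

img = np.zeros((4, 4), dtype=int)
img[2:4, 0:2] = 1              # feature in the lower-left corner
assert or_unit(shared_weight_layer(img, weights, threshold=4))

img2 = np.zeros((4, 4), dtype=int)
img2[0:2, 2:4] = 1             # same feature, translated
assert or_unit(shared_weight_layer(img2, weights, threshold=4))
```

The sketch also makes the two objections concrete: the "broadcast" in `shared_weight_layer` is the global control many would rather avoid, and `or_unit` throws away all position information, which is why OR is not quite the relation wanted for complex patterns.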
The Hamming distance between two 'T's that are offset from one another is much greater than that between a 'T' and a 'C' that is offset such that it overlaps much of the 'T'. What is the 'correct' generalization to be made when trying to classify these patterns? In order to get the desired generalization, the network must be biased toward developing representations in which the Hamming distances (of the intermediate representations) between within-class patterns are small compared to those between other patterns. Generalization based on similarity will then be appropriate. Without such biases, 'good' generalization would be quite surprising.

--Pete Sandon
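[Editor's note: a concrete check of the Hamming-distance point above, using made-up 5-pixel-wide letter bitmaps (the glyph shapes and grid size are this illustration's, not Sandon's). Two offset 'T's indeed differ in more pixels than a 'T' and a 'C' drawn over nearly the same cells.]

```python
import numpy as np

def hamming(a, b):
    """Number of pixels in which two binary images differ."""
    return int(np.sum(a != b))

def letter_T(grid, col):
    """Draw a 5-wide 'T' (top bar plus center stem) starting at `col`."""
    img = np.zeros(grid, dtype=int)
    img[0, col:col+5] = 1      # top bar
    img[1:5, col+2] = 1        # stem
    return img

def letter_C(grid, col):
    """Draw a 5-wide 'C' (top bar, left side, bottom bar) at `col`."""
    img = np.zeros(grid, dtype=int)
    img[0, col:col+5] = 1      # top bar
    img[1:4, col] = 1          # left side
    img[4, col:col+5] = 1      # bottom bar
    return img

grid = (5, 7)
t0 = letter_T(grid, 0)         # 'T' at the left edge
t2 = letter_T(grid, 2)         # same 'T', shifted two columns right
c0 = letter_C(grid, 0)         # 'C' overlapping the first 'T'

assert hamming(t0, t2) == 12   # the two 'T's differ in 12 pixels
assert hamming(t0, c0) == 10   # the overlapping 'T' and 'C' in only 10
assert hamming(t0, t2) > hamming(t0, c0)
```

So a classifier that generalizes purely by raw pixel similarity would group the 'T' with the 'C' rather than with its shifted copy, which is exactly why the intermediate representations must be biased as Sandon argues.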