neuron-request@HPLABS.HP.COM (Neuron-Digest Moderator Peter Marvit) (11/20/88)
Neuron Digest   Saturday, 19 Nov 1988   Volume 4 : Issue 26

Today's Topics:
          Re: separability and unbalanced data discussion
          Re: ICNN refs for boundary effects in training
              Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions

Send submissions, questions, address maintenance and requests for old issues
to "neuron-request@hplabs.hp.com" or
"{any backbone,uunet}!hplabs!neuron-request"

------------------------------------------------------------

Subject: Re: separability and unbalanced data discussion
From: jose@tractatus.bellcore.com (Stephen J Hanson)
Date: Sun, 13 Nov 88 13:01:41 -0500

See Hanson, S. J. & Burr, D., "Minkowski-r Backpropagation: Learning in
Connectionist Networks with Non-Euclidean Error Metrics," in D. Anderson
(Ed.), Neural Information Processing: Natural and Synthetic, AIP, 1988.
We look at some similar cases.

Steve

------------------------------

Subject: Re: ICNN refs for boundary effects in training
From: CHRISLEY%VAX.OXFORD.AC.UK@CUNYVM.CUNY.EDU
Date: 14 Nov 88 11:20:48 +0000

Rockwell.henr has asked for a reference to a paper at ICNN about boundary
effects in training.  Specifically, he mentioned work suggesting that
including as many boundary cases as possible in the training input improves
performance.

Let me suggest Prof. Kohonen's paper on LVQ2 (I do not have my proceedings
with me, so I can't give the exact reference).  This work showed that by
looking only at the training cases near boundaries, one improves accuracy.
Kohonen's formalism makes it very easy to determine which input patterns are
boundary cases; I can't imagine a formalism in which this could be more
perspicuous.

Let me also suggest that this work is relevant to the "LMS fails to separate"
discussion, especially Lippmann's comment.  The motivation for the comparison
between the LVQ and LMS algorithms undertaken in (Kohonen, Barna & Chrisley)
was that LVQ aims directly at minimizing the misclassification rate.  That is
why we thought we would get better performance on some tasks by using an
algorithm designed to minimize misclassifications explicitly, and not
implicitly through LMS.

Ron Chrisley
New College, Oxford OX1 3BN, England
and Xerox PARC SSL, 3333 Coyote Hill Rd, Palo Alto, CA 94304, USA

------------------------------

Subject: Learning arbitrary transfer functions
From: aam9n@uvaee.ee.virginia.EDU (Ali Minai)
Organization: EE Dept, U of Virginia, Charlottesville
Date: 10 Nov 88 18:54:52 +0000

[[ Editor's Note: This series of entries was culled from the USENET
comp.ai.neural-nets bboard.  I have lightly edited some of the entries for
spelling and duplications.  Readers of this Digest may find it interesting
to see the progression.  Please feel free to add your information or your
opinions for future issues. -PM ]]

I am looking for any references that might deal with the following problem:

    y = f(x);  f(x) is nonlinear in x

    Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}

Can the network now produce ym given xm, even if it has never seen the pair
before?  That is, given a set of input/output pairs for a nonlinear function,
can a multi-layer neural network be trained to induce the transfer function
by being shown the data?  What are the network requirements?  What are the
limitations, if any?

Are there theoretical bounds on the order, degree or complexity of learnable
functions for networks of a given type?  Note that I am speaking here of
*continuous* functions, not discrete-valued ones, so there is no immediate
congruence with classification.  Any attempt to "discretize" or "digitize"
the function leads to problems, because the resolution then becomes a factor,
leading to misclassification unless the discretizing scheme was chosen
initially with careful knowledge of the function's characteristics, which
defeats the whole purpose.  It seems to me that in order to induce the
function correctly, the network must be shown real values, rather than some
binary-coded version (e.g. in terms of basis vectors).  Also, given that
neurons have a logistic transfer function, is there a theoretical limit on
what kinds of functions *can* be induced by collections of such neurons?

All references, pointers, comments, advice, admonitions are welcome.

Thanks in advance,
Ali

Ali Minai
Dept. of Electrical Engg.
Thornton Hall
University of Virginia
Charlottesville, VA 22901
aam9n@uvaee.ee.Virginia.EDU
aam9n@maxwell.acc.Virginia.EDU
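
[[ Editor's Note: For readers who want to experiment with Ali's question,
here is a minimal sketch (mine, not any contributor's, in modern Python/NumPy
notation) of the setup he describes: a one-hidden-layer network with logistic
hidden units and a linear output, trained by gradient descent on (x, y)
samples of a nonlinear function and then queried at points it has never seen.
The choice of function, network size and learning rate are arbitrary
illustrations, not recommendations. -PM ]]

    # Minimal sketch: fit y = f(x) with a one-hidden-layer net (logistic
    # hidden units, linear output) trained by plain gradient descent.
    # All constants here are illustrative, not prescriptive.
    import numpy as np

    rng = np.random.default_rng(0)

    def f(x):                     # the "unknown" nonlinear transfer function
        return np.sin(3.0 * x) + 0.5 * x**2

    x_train = rng.uniform(-1.0, 1.0, size=(200, 1))  # training pairs (xi, yi)
    y_train = f(x_train)
    x_test  = rng.uniform(-1.0, 1.0, size=(20, 1))   # pairs never seen

    H = 20                                           # hidden units
    W1 = rng.normal(0.0, 0.5, (1, H)); b1 = np.zeros(H)
    W2 = rng.normal(0.0, 0.5, (H, 1)); b2 = np.zeros(1)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    lr = 0.05
    for epoch in range(20000):
        h = sigmoid(x_train @ W1 + b1)               # forward pass
        y_hat = h @ W2 + b2
        err = y_hat - y_train                        # derivative of squared error
        dW2 = h.T @ err / len(x_train); db2 = err.mean(0)
        dh = err @ W2.T * h * (1.0 - h)              # back through logistic layer
        dW1 = x_train.T @ dh / len(x_train); db1 = dh.mean(0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    pred = sigmoid(x_test @ W1 + b1) @ W2 + b2       # interpolation test
    print("mean |error| on unseen points:",
          float(np.abs(pred - f(x_test)).mean()))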

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: aboulang@bbn.com (Albert Boulanger)
Date: 11 Nov 88 16:33:46 +0000

Check out the following report:

  "Nonlinear Signal Processing Using Neural Networks: Prediction and
  System Modeling"
  Alan Lapedes and Robert Farber
  Los Alamos Tech Report LA-UR-87-2662

There was also a description of this work at the last Denver conference on
Neural Networks.  Lapedes has a nice demonstration of recovering the logistic
map given a chaotic time series of the map.  He has also done this with the
Mackey-Glass time-delay equation.  It is rumored that techniques like this
are being used by companies to predict the stock market.  (Doyne Farmer as
well as James Crutchfield have non-neural, dynamical-systems techniques for
doing this; cf. "Equations of Motion From a Data Series," James Crutchfield &
Bruce McNamara, Complex Systems, Vol. 3, June 1987, pp. 417-452.)

Albert Boulanger
BBN Systems & Technologies Corp.
10 Moulton St.
Cambridge MA, 02138
aboulanger@bbn.com

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: dhw@itivax.UUCP (David H. West)
Organization: Industrial Technology Institute
Date: 11 Nov 88 22:30:09 +0000

In a previous article, aam9n@uvaee.ee.virginia.EDU (Ali Minai) writes:
>That is, given a set of input/output pairs for a nonlinear function, can a
>multi-layer neural network be trained to induce the transfer function
                                                  ^^^
An infinite number of transfer functions are compatible with any finite data
set.  If you really prefer some of them to others, this information needs to
be available in computable form to the algorithm that chooses a function.  If
you don't care too much, you can make an arbitrary choice (and live with the
result); you might, for example, use the (unique) Lagrange interpolation
polynomial of order n-1 that passes through your data points, simply because
it's easy to find in reference books and familiar enough not to surprise
anyone.  It happens to be easier to compute without a neural network,
though :-)  If you want ym for a given xm to be relatively independent of
(sufficiently large) n, however, you need in general to know something about
the domain, and (n-1)th (i.e. variable) order polynomial interpolation is
almost certainly not what you want.

- -David West              dhw%iti@umix.cc.umich.edu
                           {uunet,rutgers,ames}!umix!itivax!dhw
CDSL, Industrial Technology Institute, PO Box 1485, Ann Arbor, MI 48106
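
[[ Editor's Note: for concreteness, a tiny sketch (mine) of the Lagrange
interpolating polynomial David mentions -- the unique polynomial of degree
n-1 through n data points.  It needs no network at all, and it also
illustrates his caveat: its behaviour between and beyond the data points
depends entirely on this arbitrary choice of function class. -PM ]]

    # The (unique) Lagrange interpolating polynomial of degree n-1 through
    # n points -- the "arbitrary but easy" choice mentioned above.
    def lagrange_interpolate(xs, ys, x):
        """Evaluate at x the degree n-1 polynomial through (xs[i], ys[i])."""
        total = 0.0
        n = len(xs)
        for i in range(n):
            # basis polynomial L_i(x): equals 1 at xs[i], 0 at every other xs[j]
            term = ys[i]
            for j in range(n):
                if j != i:
                    term *= (x - xs[j]) / (xs[i] - xs[j])
            total += term
        return total

    # Example: four samples of some nonlinear function, queried at an unseen x.
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [0.0, 0.8, 0.9, 0.1]
    print(lagrange_interpolate(xs, ys, 1.5))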

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: efrethei@afit-ab.arpa (Erik J. Fretheim)
Organization: Air Force Institute of Technology; WPAFB, OH
Date: 15 Nov 88 14:14:08 +0000

Regarding the reply by dhw@itivax.UUCP (David H. West):
>>That is, given a set of input/output pairs for a nonlinear function, can a
>>multi-layer neural network be trained to induce the transfer function
>>                                                 ^^^

I don't know about nonlinear functions in general, but I did try to train a
net (back-prop) to compute sine(X) given X.  I trained it for two weeks
straight (as virtually the sole user) on an ELXSI.  The result was that,
carrying the solution to 5 significant decimal places, I got a correct answer
40% of the time.  Although this is somewhat better than random chance, it is
not good enough to be useful.  I will also note that the solution did not
improve dramatically in the last week of training, so I feel I can safely
assume that the error rate would not have decreased with further training.

I also tried the same problem using a two's-complement input/output
representation and got about the same results in about the same amount of
training; the binary representation needed a few more nodes, though.  I was
not able to spot any significant or meaningful pattern in the errors the net
was making, and I do not feel that reducing the number of significant decimal
places would help (even if it were meaningful), as the errors were not
consistently in the last couple of digits but rather were spread throughout
the number (in both binary and decimal representations).

Based on these observations, I don't think a net can be expected to compute
an arbitrary function to any meaningful precision.  Sure, it can do 1 + 1 and
other simple things, but it trips when it hits something that cannot be
exhaustively (or nearly exhaustively) trained.  Just my opinion, but ...

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: tomh@proxftl.UUCP (Tom Holroyd)
Organization: Proximity Technology, Ft. Lauderdale
Date: 15 Nov 88 20:59:46 +0000

Another paper is "Learning with Localized Receptive Fields," by John Moody
and Christian Darken, Yale Computer Science, PO Box 2158, New Haven, CT
06520, available in the Proceedings of the 1988 Connectionist Models Summer
School, published by Morgan Kaufmann.

They use a population of self-organizing local receptive fields that cover
the input domain, where each receptive field learns the output value for the
region of the input space covered by that field.  K-means clustering is used
to find the receptive field centers, and interpolation is done via a weighted
average of nearby fields.  They report convergence about 1000 times faster
than back-prop with conjugate gradient.

Tom Holroyd
UUCP: {uflorida,uunet}!novavax!proxftl!tomh
The white knight is talking backwards.
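
[[ Editor's Note: a rough sketch (mine, in modern Python/NumPy notation) of
the kind of localized-receptive-field scheme Tom describes above: k-means
places the receptive-field centers, each field stores a local output value,
and queries are answered by a distance-weighted average of nearby fields.
The Gaussian weighting, the constants, and the one-dimensional setting are
my own simplifications, not Moody and Darken's exact algorithm. -PM ]]

    # Rough sketch of localized receptive fields: k-means finds the centers,
    # each field keeps the mean output of the region it covers, prediction is
    # a distance-weighted average of field outputs.  Constants are illustrative.
    import numpy as np

    rng = np.random.default_rng(1)

    def kmeans(x, k, iters=50):
        centers = x[rng.choice(len(x), k, replace=False)]
        for _ in range(iters):
            labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = x[labels == j].mean()
        return centers

    def fit_fields(x, y, centers):
        # each receptive field stores the mean output of the points nearest it
        labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        return np.array([y[labels == j].mean() if np.any(labels == j) else 0.0
                         for j in range(len(centers))])

    def predict(xq, centers, values, width=0.2):
        # weighted average of field outputs, weights falling off with distance
        w = np.exp(-((xq[:, None] - centers[None, :]) / width) ** 2)
        return (w * values).sum(axis=1) / w.sum(axis=1)

    # toy data: samples of y = sin(2*pi*x) on [0, 1], then query unseen points
    x = rng.uniform(0.0, 1.0, 300)
    y = np.sin(2.0 * np.pi * x)
    centers = kmeans(x, k=15)
    values = fit_fields(x, y, centers)
    print(predict(np.array([0.05, 0.33, 0.71]), centers, values))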

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: hwang@taiwan.Princeton.EDU (Jenq-Neng Hwang)
Organization: Princeton University, Princeton NJ
Date: 15 Nov 88 22:08:20 +0000

A. Lapedes and R. Farber of Los Alamos National Lab have a technical report,
LA-UR-87-2662, entitled "Nonlinear Signal Processing Using Neural Networks:
Prediction and System Modeling," which tried to solve the problem mentioned.
They also have a paper entitled "How Neural Nets Work," pp. 442-456, in Proc.
IEEE Conf. on Neural Information Processing Systems -- Natural and Synthetic,
Denver, November 1987.

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: hall@nvuxh.UUCP (Michael R Hall)
Organization: Bell Communications Research
Date: 16 Nov 88 21:04:58 +0000

Regarding the article by aam9n@uvaee.ee.virginia.EDU (Ali Minai):
>I am looking for any references that might deal with the following
>problem:
>
> [etc.]
>
>Note that I am speaking here of *continuous* functions, not discrete-valued
>ones, so there is no immediate congruence with classification.

The problem you raise is not just a neural net problem.  Your function
learning problem has been termed "concept learning" by some researchers
(e.g. Larry Rendell); a concept is a function.  There are many non-neural
learning algorithms (e.g. PLS1) that are designed to learn concepts.  My
opinion is that concept learning algorithms generally learn concepts better,
more easily, and faster than neural nets.  (Anybody willing to pit their
neural net against my implementation of PLS to learn a concept from natural
data?)  Neural nets are more general than concept learning algorithms, so it
is only natural that they should not learn concepts as quickly (in terms of
exposures) or as well (in terms of accuracy after a given number of
exposures).

Valiant and friends have come up with theories of the sort you desire, but
only for Boolean concepts (binary y's in your notation) and for learning
algorithms in general, not neural nets in particular.  "Graded concepts" are
continuous.  To my knowledge, no work has addressed the theoretical
learnability of graded concepts.  Before trying to come up with theoretical
learnability results for neural networks, one should probably address the
graded concept learning problem in general.  The Valiant approach of a
Probably Approximately Correct (PAC) learning criterion should be applicable
to graded concepts.

- --
Michael R. Hall                                | Bell Communications Research
"I'm just a symptom of the moral decay that's  | hall%nvuxh.UUCP@bellcore.COM
gnawing at the heart of the country" -The The  | bellcore!nvuxh!hall

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: rao@enuxha.eas.asu.edu (Arun Rao)
Organization: Arizona State Univ, Tempe
Date: 17 Nov 88 16:00:26 +0000

Regarding the article by aam9n@uvaee.ee.virginia.EDU (Ali Minai):
>I am looking for any references that might deal with the following
>problem:
>
>y = f(x); f(x) is nonlinear in x
>
>Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
>
>Can the network now produce ym given xm, even if it has never seen the
>pair before?

Sounds like a standard interpolation problem to me, though a good deal of
effort has been expended to make neural networks interpolate.  Any elementary
book on numerical analysis will treat this problem, but the author of the
above probably knows this.  I would be interested in other ramifications of
the above problem which are not readily amenable to classical techniques.

- Arun Rao

------------------------------

End of Neurons Digest
*********************