aam9n@uvaee.ee.virginia.EDU (Ali Minai) (11/11/88)
I am looking for any references that might deal with the following
problem:

y = f(x); f(x) is nonlinear in x

Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}

Can the network now produce ym given xm, even if it has never seen the
pair before?

That is, given a set of input/output pairs for a nonlinear function, can a
multi-layer neural network be trained to induce the transfer function
by being shown the data? What are the network requirements? What are
the limitations, if any? Are there theoretical bounds on the order,
degree or complexity of learnable functions for networks of a given
type?

Note that I am speaking here of *continuous* functions, not
discrete-valued ones, so there is no immediate congruence with
classification. Any attempt to "discretize" or "digitize" the function
leads to problems because the resolution then becomes a factor, leading
to misclassification unless the discretizing scheme was chosen initially
with careful knowledge of the function's characteristics, which defeats
the whole purpose. It seems to me that in order to induce the function
correctly, the network must be shown real values, rather than some
binary-coded version (e.g. in terms of basis vectors).

Also, given that neurons have a logistic transfer function, is there a
theoretical limit on what kinds of functions *can* be induced by
collections of such neurons?

All references, pointers, comments, advice, admonitions are welcome.

Thanks in advance,

Ali

Ali Minai
Dept. of Electrical Engg.
Thornton Hall
University of Virginia
Charlottesville, VA 22901
aam9n@uvaee.ee.Virginia.EDU
aam9n@maxwell.acc.Virginia.EDU
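[A minimal modern sketch of the setup being asked about, added for illustration; it is not from the thread. The target f(x) = x^2, the network size, and the learning rate are arbitrary choices: one hidden layer of logistic units is trained by plain batch gradient descent on real-valued pairs, then queried at a point it never saw.]

```python
import numpy as np

# Real-valued training pairs (x_i, y_i) -- no discretization.
# Hypothetical target: f(x) = x^2 on [-1, 1].
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y = x ** 2

# One hidden layer of 20 logistic units, one linear output unit.
W1 = rng.normal(0.0, 1.0, (1, 20)); b1 = np.zeros(20)
W2 = rng.normal(0.0, 1.0, (20, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(20000):                 # plain batch gradient descent
    h = sigmoid(x @ W1 + b1)           # hidden-layer activations
    pred = h @ W2 + b2                 # network output
    err = pred - y                     # gradient of MSE/2 w.r.t. pred
    gW2 = h.T @ err / len(x); gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * h * (1.0 - h)  # back-propagated error
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

# Query a point never seen during training.
xm = np.array([[0.33]])
ym = sigmoid(xm @ W1 + b1) @ W2 + b2
```

Whether ym is close to f(xm) is exactly the generalization question the post raises; nothing in the training procedure guarantees it.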
aboulang@bbn.com (Albert Boulanger) (11/12/88)
Check out the following report:

"Nonlinear Signal Processing Using Neural Networks: Prediction and
System Modelling"
Alan Lapedes and Robert Farber
Los Alamos Tech report LA-UR-87-2662

There was also a description of this work at the last Denver conference
on Neural Networks. Lapedes has a nice demonstration of recovering the
logistic map given a chaotic time series of the map. He has also done
this with the Mackey-Glass time-delay equation. It is rumored that
techniques like this (Doyne Farmer as well as James Crutchfield have
non-neural dynamical-systems techniques for doing this, cf. "Equations
of Motion From a Data Series", James Crutchfield & Bruce McNamara,
Complex Systems, Vol #3, June 1987, 417-452) are being used by companies
to predict the stock market.

Albert Boulanger
BBN Systems & Technologies Corp.
10 Moulton St.
Cambridge MA, 02138
aboulanger@bbn.com
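[The logistic-map demonstration mentioned above is easy to reproduce in spirit. This is an illustrative sketch in the Crutchfield & McNamara "equations of motion from data" style, not Lapedes' actual method: it fits the map with a least-squares polynomial rather than a neural network.]

```python
import numpy as np

# Chaotic series from the logistic map x_{t+1} = r * x_t * (1 - x_t), r = 4.
r = 4.0
x = np.empty(500)
x[0] = 0.3
for t in range(len(x) - 1):
    x[t + 1] = r * x[t] * (1.0 - x[t])

# Recover the map from the data alone: least-squares fit of x_{t+1}
# as a degree-2 polynomial in x_t.
a, b, c = np.polyfit(x[:-1], x[1:], deg=2)
# The true map is -4*x^2 + 4*x + 0, and the fit recovers those
# coefficients up to rounding error.
```

The same pairs (x_t, x_{t+1}) are what a neural network would be trained on for one-step prediction; the polynomial fit just makes the recovered dynamics explicit.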
dhw@itivax.UUCP (David H. West) (11/12/88)
In article <399@uvaee.ee.virginia.EDU> aam9n@uvaee.ee.virginia.EDU (Ali Minai) writes:
>I am looking for any references that might deal with the following
>problem:
>
>y = f(x); f(x) is nonlinear in x
>
>Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
>
>Can the network now produce ym given xm, even if it has never seen the
>pair before?
>
>That is, given a set of input/output pairs for a nonlinear function, can a
>multi-layer neural network be trained to induce the transfer function
                                                ^^^
An infinite number of transfer functions are compatible with any finite
data set. If you really prefer some of them to others, this information
needs to be available in computable form to the algorithm that chooses a
function. If you don't care too much, you can make an arbitrary choice
(and live with the result); you might for example use the (unique)
Lagrange interpolation polynomial of order n-1 that passes through your
data points, simply because it's easy to find in reference books, and
familiar enough not to surprise anyone. It happens to be easier to
compute without a neural network, though :-)

If you want ym for a given xm to be relatively independent of
(sufficiently large) n, however, you need in general to know something
about the domain, and (n-1)th (i.e. variable-) order polynomial
interpolation is almost certainly not what you want.

-David West        dhw%iti@umix.cc.umich.edu
{uunet,rutgers,ames}!umix!itivax!dhw
CDSL, Industrial Technology Institute, PO Box 1485, Ann Arbor, MI 48106
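[The Lagrange choice mentioned above, as a short sketch; the sample points and the target f(x) = x^2 + 1 are made up for illustration.]

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the unique order-(n-1) Lagrange interpolating polynomial
    through the points (xs[i], ys[i]) at the query point x."""
    total = 0.0
    for i in range(len(xs)):
        term = ys[i]
        for j in range(len(xs)):
            if j != i:
                # Basis polynomial L_i: 1 at xs[i], 0 at every other xs[j].
                term *= (x - xs[j]) / (xs[i] - xs[j])
        total += term
    return total

# Four samples of f(x) = x**2 + 1. Since f has degree 2 <= 3, the cubic
# interpolant reproduces f exactly everywhere, not just at the data.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 5.0, 10.0]
```

For a target that is *not* a low-degree polynomial, raising n makes the interpolant oscillate between the data points, which is exactly the instability the second paragraph warns about.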
efrethei@afit-ab.arpa (Erik J. Fretheim) (11/15/88)
In article <378@itivax.UUCP> dhw@itivax.UUCP (David H. West) writes:
In article <399@uvaee.ee.virginia.EDU> aam9n@uvaee.ee.virginia.EDU (Ali Minai) writes:
>I am looking for any references that might deal with the following
>problem:
>
>y = f(x); f(x) is nonlinear in x
>
>Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
>
>Can the network now produce ym given xm, even if it has never seen the
>pair before?
>
>That is, given a set of input/output pairs for a nonlinear function, can a
>multi-layer neural network be trained to induce the transfer function

I don't know about non-linear functions in general, but I did try to
train a net (back prop) to learn to compute sine(X) given X. I trained
it for two weeks straight (virtually sole user) on an ELXSI. The result
was that in carrying the solution to 5 significant decimal places I got
a correct solution 40% of the time. Although this is somewhat better
than random chance, it is not good enough to be useful. I will also
note that the solution did not improve dramatically in the last week of
training, so I feel I can safely assume that the error rate would not
decrease.

I also tried the same problem using a two's complement input/output and
was able to get about the same results in about the same amount of
training. The binary representation needed a few more nodes, though.

I was not able to spot any significant or meaningful patterns in the
errors the net was making, and do not feel that reducing the number of
significant decimal places would help (even if it were meaningful), as
the errors made were not consistently in the last couple of digits, but
rather were spread throughout the number (in both binary and decimal
representations).

Based on these observations, I don't think a net can be expected to
produce any meaningful function. Sure, it can do 1 + 1 and other simple
things, but it trips when it hits something not easily exhaustively (or
nearly exhaustively) trained. Just my opinion, but ...
tomh@proxftl.UUCP (Tom Holroyd) (11/16/88)
Another paper is "Learning with Localized Receptive Fields," by John
Moody and Christian Darken, Yale Computer Science, PO Box 2158, New
Haven, CT 06520, available in the Proceedings of the 1988 Connectionist
Models Summer School, published by Morgan Kaufmann.

They use a population of self-organizing local receptive fields that
cover the input domain, where each receptive field learns the output
value for the region of the input space covered by that field. K-means
clustering is used to find the receptive field centers, and
interpolation is done via a weighted average of nearby fields. They
report convergence 1000 times faster than back-prop with conjugate
gradient.

Tom Holroyd
UUCP: {uflorida,uunet}!novavax!proxftl!tomh

The white knight is talking backwards.
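[The scheme described above can be sketched in a few lines. This is an illustrative toy in the Moody-Darken spirit, not their code: k-means places the field centers, each field stores the mean target over the points it captures, and prediction is a normalized weighted average of nearby fields. The task (y = sin x), the number of fields, and the shared width are arbitrary choices.]

```python
import numpy as np

# Hypothetical 1-D task: learn y = sin(x) from scattered samples.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 2.0 * np.pi, 200)
y = np.sin(x)

# K-means (a few Lloyd iterations) places the receptive-field centers.
k = 10
centers = np.array(sorted(rng.choice(x, k, replace=False)))
for _ in range(20):
    assign = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
    for j in range(k):
        members = x[assign == j]
        if len(members):
            centers[j] = members.mean()

# Each field's output value: mean target over the points it captures.
values = np.array([y[assign == j].mean() if np.any(assign == j) else 0.0
                   for j in range(k)])
width = (x.max() - x.min()) / k        # shared receptive-field width

def predict(q):
    """Normalized weighted average of nearby fields."""
    w = np.exp(-0.5 * ((q - centers) / width) ** 2)
    return float(np.sum(w * values) / np.sum(w))
```

Only the field values (and centers) are learned, each from local data, which is where the large speed advantage over global back-prop comes from.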
hwang@taiwan.Princeton.EDU (Jenq-Neng Hwang) (11/16/88)
A. Lapedes and R. Farber from Los Alamos National Lab have a technical
report LA-UR-87-2662, entitled "Nonlinear Signal Processing Using Neural
Networks: Prediction and System Modelling", which tried to solve the
problem mentioned. They also have a paper entitled "How Neural Nets
Work", pp. 442-456, in "Proc. IEEE Conf. on Neural Information
Processing Systems -- Natural and Synthetic, Denver, November 1987".
hall@nvuxh.UUCP (Michael R Hall) (11/17/88)
In article <399@uvaee.ee.virginia.EDU> aam9n@uvaee.ee.virginia.EDU (Ali Minai) writes:
>I am looking for any references that might deal with the following
>problem:
>
>y = f(x); f(x) is nonlinear in x
>
>Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
>
>Can the network now produce ym given xm, even if it has never seen the
>pair before?
>
>That is, given a set of input/output pairs for a nonlinear function, can a
>multi-layer neural network be trained to induce the transfer function
>by being shown the data? What are the network requirements? What
>are the limitations, if any? Are there theoretical bounds on
>the order, degree or complexity of learnable functions for networks
>of a given type?
>
>Note that I am speaking here of *continuous* functions, not discrete-valued
>ones, so there is no immediate congruence with classification.

The problem you raise is not just a neural net problem. Your function
learning problem has been termed "concept learning" by some researchers
(e.g. Larry Rendell). A concept is a function. There are many
non-neural learning algorithms (e.g. PLS1) that are designed to learn
concepts. My opinion is that concept learning algorithms generally work
better, more easily, and faster than neural nets for learning concepts.
(Anybody willing to pit their neural net against my implementation of
PLS to learn a concept from natural data?) Neural nets are more general
than concept learning algorithms, and so it is only natural that they
should not learn concepts as quickly (in terms of exposures) or as well
(in terms of accuracy after a given number of exposures).

Valiant and friends have come up with theories of the sort you desire,
but only for boolean concepts (binary y's in your notation) and learning
algorithms in general, not neural nets in particular. "Graded concepts"
are continuous. To my knowledge, no work has addressed the theoretical
learnability of graded concepts.
Before trying to come up with theoretical learnability results for
neural networks, one should probably address the graded concept learning
problem in general. The Valiant approach of a Probably Approximately
Correct (PAC) learning criterion should be applicable to graded
concepts.
--
Michael R. Hall                               | Bell Communications Research
"I'm just a symptom of the moral decay that's | hall%nvuxh.UUCP@bellcore.COM
gnawing at the heart of the country" -The The | bellcore!nvuxh!hall
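[For the boolean case, the Valiant framework mentioned above does give concrete numbers. A standard textbook bound, sketched here for illustration (not something from the thread): for a finite hypothesis class H, a learner that outputs any hypothesis consistent with m >= (1/eps)(ln|H| + ln(1/delta)) examples is, with probability at least 1 - delta, within error eps of the target concept.]

```python
import math

def pac_sample_bound(h_size, eps, delta):
    """Samples sufficient for a consistent learner over a finite
    hypothesis class of size h_size to have error <= eps
    ("approximately correct") with probability >= 1 - delta
    ("probably")."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / eps)

# E.g. a class of 1000 hypotheses, 10% error tolerance, 95% confidence:
m = pac_sample_bound(1000, 0.1, 0.05)
```

Note the bound grows only logarithmically in |H| and 1/delta but linearly in 1/eps; the open question in the post is what replaces it when the concept is graded rather than boolean.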
rao@enuxha.eas.asu.edu (Arun Rao) (11/18/88)
In article <399@uvaee.ee.virginia.EDU> aam9n@uvaee.ee.virginia.EDU (Ali Minai) writes:
>I am looking for any references that might deal with the following
>problem:
>
>y = f(x); f(x) is nonlinear in x
>
>Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
>
>Can the network now produce ym given xm, even if it has never seen the
>pair before?

Sounds like a standard interpolation problem to me, though a good deal
of effort has been expended to make neural networks interpolate. Any
elementary book on numerical analysis will treat this problem, but the
author of the above probably knows this. I would be interested in other
ramifications of the above problem which are not readily amenable to
classical techniques.

- Arun Rao
joe@amos.ling.ucsd.edu (Shadow) (11/18/88)
In article <399@uvaee.ee.virginia.EDU>, aam9n@uvaee.ee.virginia.EDU writes:
>>I am looking for any references that might deal with the following
>>problem:
>>
>>y = f(x); f(x) is nonlinear in x
>>
>>Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
>>
>>Can the network now produce ym given xm, even if it has never seen the
>>pair before?
>>
>>That is, given a set of input/output pairs for a nonlinear function, can a
>>multi-layer neural network be trained to induce the transfer function

My response:

1. Neural nets are an attempt to model brain-like learning (at least in
theory). So how do humans learn nonlinear functions? You learn that
x^2, for instance, is X times X. And how about X times Y? How do humans
learn that? You memorize it for single digits, and for more than a
single digit you multiply streams of digits together in a carry routine.

2. So the problem is a little more complicated. You might imagine a
network which can perfectly learn nonlinear functions if it has at its
disposal various useful sub-networks (e.g., a network can learn x^n if
it has at its disposal some mechanism and architecture suitable for
multiplying x & x). (Imagine a sub-network behaving as a single unit,
receiving input and producing output in a predictable mathematical
manner.)

(promoting thought)
What is food without the hunger ?
What is light without the darkness ?
And what is pleasure without pain ?

joe@amos.ling.ucsd.edu