phbd641@ccwf.cc.utexas.edu (David Chao) (11/08/90)
In Richard Lippmann's review article in IEEE ASSP Magazine (4, April 1987), he makes reference to "Mathematical Developments Arising from Hilbert Problems," American Mathematical Society (ed. Browder), 1976. In the article on Hilbert's 13th problem, reference is made to Kolmogorov's proof concerning the approximation of any continuous function of N variables using only linear summations and nonlinear but continuously increasing functions of only one variable. (Shades of typical backprop n.n.'s with logistic activation functions!)

In any case, I have this book on recall at my own library and doubt whether it will arrive in time for a seminar I'm giving next Wednesday. If anyone has any information - in particular about Kolmogorov's theorem - I'd appreciate an email response. Thanks!

David Chao
Department of Physics
University of Texas at Austin
dchao@ccwf.cc.utexas.edu or phbd641@ccwf.cc.utexas.edu
(512) 471-4039
pluto@zaius.ucsd.edu (Mark Plutowski) (11/09/90)
phbd641@ccwf.cc.utexas.edu (David Chao) writes:

>In Richard Lippmann's review article in IEEE ASSP Magazine (4, April 1987),
>he makes reference to "Mathematical Developments Arising from Hilbert
>Problems," American Mathematical Society (ed. Browder), 1976.
>In the article on Hilbert's 13th problem, reference is made to
>Kolmogorov's proof concerning the approximation of any continuous function of
>N variables using only linear summations and nonlinear but
>continuously increasing functions of only one variable.

David, could you clarify this last sentence for me? Do you mean a monotonically increasing sequence of continuous functions of one variable? Or do you mean that each function of one variable is continuous, with its value monotonically increasing as its argument tends to infinity? I expect the latter to be the correct interpretation, but am not certain, since the statement here does not say how many functions are required.

(BTW: the version I heard required on the order of 2N+1 functions -- is this correct? This would be equivalent to requiring 2N+1 hidden units and a single linear output unit. However, as I understand it, in Kolmogorov's theorem each hidden unit may be required to have a different, and unknown, transfer function. Hence, the theorem proves the existence of such a neural network, but not constructively.)

Thanks,
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
M.E. Plutowski                          INTERNET: pluto%cs@ucsd.edu
UCSD, Computer Science and Engineering 0114
9500 Gilman Drive
La Jolla, California 92093-0114
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
bmoghad@gmuvax2.gmu.edu (bmoghad) (11/09/90)
In article <39365@ut-emx.uucp> phbd641@ccwf.cc.utexas.edu (David Chao) writes:
>
>In the article on Hilbert's 13th problem, reference is made to
>Kolmogorov's proof concerning the approximation of any continuous function of
>N variables using only linear summations and nonlinear but
>continuously increasing functions of only one variable. (shades of
>typical backprop n.n.'s with logistic activation functions!).
>
>If anyone has any information - in particular
>about Kolmogorov's theorem - I'd appreciate an email response.

I believe you are referring to Kolmogorov's neural network existence theorem, which claims that any continuous nonlinear mapping can be achieved with a 3-layer network. This is the modern-day neural network extension of Hilbert's 13th problem and other conjectures in functional decomposition that have been around for a long time. Here are a few references which should help:

1. Robert Hecht-Nielsen, "Kolmogorov's Mapping Neural Network Existence
   Theorem," Proceedings of the 1st Int'l Conference on Neural Networks,
   San Diego, June 21-24, IEEE, 1987.
2. Kolmogorov's Theorem (1957), Transactions of the A.M.S., 28,
   pp. 55-59 (1963).
3. Sprecher's Theorem (1965), Transactions of the A.M.S., 115, pp. 340-355.
4. Irie-Miyake's Theorem (1988), IEEE Int'l J. Conf. on Neural Networks, I,
   pp. 641-648.
5. Funahashi's Theorem (1989), Neural Networks, 2, pp. 183-192.

=====================================================================
Baback Moghaddam                      | Center for Excellence in C3I
Dept. of Electrical & Computer Engr.  | bmoghad@gmuvax2.gmu.edu
George Mason University               | Tel: (703) 764-6283
Fairfax, Virginia 22030-4444          |
=====================================================================
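[To make the flavor of these approximation results concrete: the theorems above are existence results, but the weaker "universal approximation" claim (Funahashi-style, not Kolmogorov's exact representation) can be illustrated numerically by training a one-hidden-layer sigmoid network on a smooth one-variable function. A minimal NumPy sketch; all names and hyperparameters are illustrative, not drawn from any of the cited papers.]

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: a smooth function of one variable on [0, 1].
def target(x):
    return np.sin(2 * np.pi * x)

# One-hidden-layer net: linear sums + sigmoid "transfer functions",
# with a single linear output unit.
H = 30                                   # hidden units
W1 = rng.normal(0, 10, (H, 1))           # input-to-hidden weights
centers = rng.uniform(0, 1, H)           # place sigmoid transitions in [0, 1]
b1 = -(W1.ravel() * centers)
W2 = rng.normal(0, 1, H)                 # hidden-to-output weights
b2 = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.linspace(0, 1, 200).reshape(-1, 1)
y = target(x).ravel()

lr = 0.05
for step in range(30000):                # plain full-batch gradient descent
    h = sigmoid(x @ W1.T + b1)           # hidden activations, shape (200, H)
    pred = h @ W2 + b2
    err = pred - y
    gW2 = h.T @ err / len(x)             # mean-squared-error gradients
    gb2 = err.mean()
    gh = np.outer(err, W2) * h * (1 - h)
    gW1 = gh.T @ x / len(x)
    gb1 = gh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = np.mean((sigmoid(x @ W1.T + b1) @ W2 + b2 - y) ** 2)
print(f"final MSE: {mse:.4f}")
```

[Note the contrast with Kolmogorov's theorem: here the transfer function is fixed (logistic) and the weights are fitted, whereas Kolmogorov fixes the weights and inner functions universally and hides all function-specific information in the outer function.]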
barry@pico.math.ucla.edu (Barry Merriman) (11/10/90)
In article <pluto.658100520@zaius> pluto@zaius.ucsd.edu (Mark Plutowski) writes:
>phbd641@ccwf.cc.utexas.edu (David Chao) writes:
>
>>In the article on Hilbert's 13th problem, reference is made to
>>Kolmogorov's proof concerning the approximation of any continuous function of
>>N variables using only linear summations and nonlinear but
>>continuously increasing functions of only one variable.
>
>(BTW: the version I heard required on the order of 2N+1 functions -- is this
>correct? This would be equivalent to requiring 2N+1 hidden units, and a single
>linear output unit.

The ``best version'' requires 2n+1 monotonic, (Lipschitz-1) continuous functions of one variable (``transfer functions''), n constants (``weights''), which are independent of the continuous function of n variables being represented, and one continuous function of one variable, g, which has no special form and is thus not representable by a single standard transfer function. This extra function, g, would have to be represented by a net of its own for this to be a standard network. So, it's like having n inputs, 2n+1 hidden units, one special g-unit, and a summation unit as an output, with a special O(n) connectivity pattern.

>However, as I understand it, in Kolmogorov's theorem
>each hidden unit may be required to have a different, and unknown, transfer
>function. Hence, the theorem proves the existence of such a neural network,
>but not constructively.)

No---the proof of Kolmogorov's theorem _is_ fully constructive (though he does a version that requires about 2n^2 neurons). Of course, he uses limiting procedures in the constructions. (Amazingly, it only takes Kolmogorov about 4 pages---though it is very dense.) Still, note that this theorem is essentially useless for practical purposes, because the function g, while continuous, can be extremely rough, and should typically require about const^n values on its domain to be well-represented.
(I.e., to represent g via a standard network would require exponentially---in n---many neurons. To see this, note that a general smooth function of n variables requires values on a fine grid in n-space to be well approximated. If the grid has spacing h, this is like (1/h)^n values. In the above representation, all this info gets shoved into g, so you would expect g to have substantial variations on a length scale of h^n, if the original function varied on a scale h in n dimensions (and depended strongly on all its variables!). Thus the complexity of computing g becomes overwhelming.)

The point is that NN's are useful for representing very special functions of n variables, namely those that can actually be specified by little information, say O(n) or O(n^2). Kolmogorov's theorem makes no attempt to single out such functions, and just lumps them in with general functions. Thus, it has nothing special to tell us about the utility of neural nets.

--
Barry Merriman
UCLA Dept. of Math
UCLA Inst. for Fusion and Plasma Research
barry@math.ucla.edu (Internet)
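[The structure Merriman describes -- n inputs, 2n+1 hidden units, a shared g-unit, and a summation output -- and the (1/h)^n counting argument can be sketched in a few lines of Python. The functions phi and g here are mere stubs: the true inner functions and the outer function g of Kolmogorov's theorem are not known in closed form.]

```python
import math

# Shape of Kolmogorov's representation (names illustrative only):
#
#   f(x_1, ..., x_n) = sum_{q=0}^{2n} g( sum_{p=1}^{n} c_p * phi_q(x_p) )
#
# i.e. n inputs, 2n+1 "hidden" units, one shared outer function g,
# and a summation unit as output, with O(n) connectivity.
def kolmogorov_forward(x, phi, g, c):
    n = len(x)
    assert len(c) == n  # one universal constant ("weight") per input
    return sum(g(sum(c[p] * phi(q, x[p]) for p in range(n)))
               for q in range(2 * n + 1))

# The practical obstruction: tabulating a generic smooth function of
# n variables to grid spacing h needs about (1/h)**n values -- and in
# the representation above, all of that information is folded into the
# single (potentially very rough) function g.
def grid_points(n, h):
    return math.ceil(1 / h) ** n

print(grid_points(2, 0.01))    # a modest table for n = 2
print(grid_points(10, 0.01))   # 100**10 = 10**20: hopeless to tabulate
```

[With identity stubs (phi(q, x) = x, g(s) = s, c = all ones), the forward pass just computes (2n+1) times the sum of the inputs, which makes the wiring easy to check by hand.]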