[comp.ai.neural-nets] Searching for 13th Hilbert Problem article

phbd641@ccwf.cc.utexas.edu (David Chao) (11/08/90)

In Richard Lippmann's review article in IEEE ASSP Magazine (vol. 4, April 1987),
he makes reference to "Mathematical Developments Arising from Hilbert
Problems," American Mathematical Society (ed. Browder), 1976.

In the article on Hilbert's 13th problem, reference is made to 
Kolmogorov's proof concerning the approximation of any continuous function of
N variables using only linear summations and nonlinear but
continuously increasing functions of only one variable. (shades of
typical backprop n.n.'s with logistic activation functions!).
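
As best I can reconstruct it, the claimed representation has roughly the form

    f(x_1,\dots,x_N) = \sum_{q=1}^{2N+1} \Phi_q\Bigl( \sum_{p=1}^{N} \phi_{q,p}(x_p) \Bigr)

with each \Phi_q and \phi_{q,p} a continuous function of one variable, but I
may well have the details wrong, which is part of why I am asking.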

In any case, I have this book on recall at my own library and doubt
whether it will arrive in time for a seminar I'm giving next 
Wednesday.   If anyone has any information - in particular
about Kolmogorov's theorem - I'd appreciate an email response.

Thanks!

David Chao
Department of Physics
University of Texas at Austin
dchao@ccwf.cc.utexas.edu
  or
phbd641@ccwf.cc.utexas.edu
(512) 471-4039

pluto@zaius.ucsd.edu (Mark Plutowski) (11/09/90)

phbd641@ccwf.cc.utexas.edu (David Chao) writes:

>In Richard Lippmann's review article in IEEE ASSP Magazine (vol. 4, April 1987),
>he makes reference to "Mathematical Developments Arising from Hilbert
>Problems," American Mathematical Society (ed. Browder), 1976.

>In the article on Hilbert's 13th problem, reference is made to 
>Kolmogorov's proof concerning the approximation of any continuous function of
>N variables using only linear summations and nonlinear but
>continuously increasing functions of only one variable. 

David, could you clarify this last sentence for me?   Do you mean a
monotonically increasing sequence of continuous functions of one variable? 
Or, do you mean that each function of one variable is continuous, with its 
value monotonically increasing as its argument tends to infinity?  
I expect the latter to be the correct interpretation, but am not certain, 
since the statement here does not say how many functions are required. 

(BTW: the version I heard required on the order of 2N+1 functions -- is this
correct? This would be equivalent to requiring 2N+1 hidden units, and a single 
linear output unit.  However, as I understand it, in Kolmogorov's theorem 
each hidden unit may be required to have a different, and unknown, transfer 
function.  Hence, the theorem proves the existence of such a neural network, 
but not constructively.)
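
For concreteness, here is the sort of wiring I have in mind: N inputs feeding
2N+1 hidden units, each hidden unit with its own transfer function, and a
single linear output unit that just sums the hidden values.  The shifted
logistics and the 1/(q+p+1) weights below are purely illustrative stand-ins,
not anything the theorem itself specifies (a rough C sketch):

    #include <stdio.h>
    #include <math.h>

    #define N 3                 /* number of inputs  */
    #define H (2*N + 1)         /* 2N+1 hidden units */

    /* Stand-in transfer functions: a logistic shifted by the unit index q.
       In Kolmogorov's theorem each hidden unit may need its own (and
       generally unknown) continuous function; these are placeholders
       chosen only to show the wiring. */
    static double transfer(int q, double s)
    {
        return 1.0 / (1.0 + exp(-(s + (double) q)));
    }

    /* Forward pass: hidden unit q forms a weighted sum of the inputs and
       applies its own transfer function; the output unit simply adds up
       the hidden values (a single linear output unit). */
    static double forward(const double x[], double w[H][N])
    {
        double y = 0.0;
        int p, q;

        for (q = 0; q < H; q++) {
            double s = 0.0;
            for (p = 0; p < N; p++)
                s += w[q][p] * x[p];
            y += transfer(q, s);
        }
        return y;
    }

    int main(void)
    {
        double x[N] = { 0.1, 0.5, 0.9 };
        double w[H][N];
        int p, q;

        for (q = 0; q < H; q++)        /* arbitrary illustrative weights */
            for (p = 0; p < N; p++)
                w[q][p] = 1.0 / (double) (q + p + 1);

        printf("output = %f\n", forward(x, w));
        return 0;
    }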

Thanks,

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
M.E. Plutowski            		   INTERNET: pluto%cs@ucsd.edu 
UCSD,  Computer Science and Engineering 0114
9500 Gilman Drive
La Jolla, California 92093-0114
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

bmoghad@gmuvax2.gmu.edu (bmoghad) (11/09/90)

In article <39365@ut-emx.uucp> phbd641@ccwf.cc.utexas.edu (David Chao) writes:
>
>In the article on Hilbert's 13th problem, reference is made to 
>Kolmogorov's proof concerning the approximation of any continuous function of
>N variables using only linear summations and nonlinear but
>continuously increasing functions of only one variable. (shades of
>typical backprop n.n.'s with logistic activation functions!).
>
>If anyone has any information - in particular
>about Kolmogorov's theorem - I'd appreciate an email response.


 I believe you are referring to Kolmogorov's neural network existence
theorem, which claims that any continuous non-linear mapping can be achieved
with a 3-layer network. This is the modern-day neural network extension of
Hilbert's 13th problem and other conjectures in functional decomposition
that have been around for a long time.
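
Schematically (my own notation, not a quote from the Hecht-Nielsen paper
below): the hidden units compute

    z_q = \sum_{p=1}^{n} \phi_{q,p}(x_p),        q = 1,\dots,2n+1

with the inner functions \phi_{q,p} continuous and independent of the mapping
being implemented, and each of the m output units computes

    y_j = \sum_{q=1}^{2n+1} g_j(z_q),            j = 1,\dots,m

with the outer functions g_j continuous and depending on the mapping. Hence
the layer sizes: n inputs, 2n+1 hidden units, m outputs.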
 
 Here are a few references which should help:

1. Robert Hecht-Nielsen, "Kolmogorov's Mapping Neural Network Existence
   Theorem", Proceedings of the 1st Int'l Conference on Neural Networks",
   San Diego, June 21-24, IEEE, 1987.

2. Kolmogorov's Theorem (1957)
   Transactions of A.M.S., 28, pp. 55-59 (1963)

3. Sprecher's Theorem (1965)
   Transactions of A.M.S., 115, pp. 340-355 

4. Irie-Miyake's Theorem (1988)
   IEEE Int'l J. Conf. on Neural Networks, I, pp. 641-648

5. Funahashi's Theorem (1989)
   Neural Networks, 2, pp. 183-192

=====================================================================
Baback Moghaddam                      |
Center for Excellence in C3I          |   bmoghad@gmuvax2.gmu.edu
Dept. of Electrical & Computer Engr.  |
George Mason University               |   Tel: (703) 764-6283
Fairfax, Virginia  22030-4444         |
=====================================================================
 

barry@pico.math.ucla.edu (Barry Merriman) (11/10/90)

In article <pluto.658100520@zaius> pluto@zaius.ucsd.edu (Mark Plutowski) writes:
>phbd641@ccwf.cc.utexas.edu (David Chao) writes:
>
>>In the article on Hilbert's 13th problem, reference is made to 
>>Kolmogorov's proof concerning the approximation of any continuous function of
>>N variables using only linear summations and nonlinear but
>>continuously increasing functions of only one variable. 
>
>(BTW: the version I heard required on the order of 2N+1 functions -- is this
>correct? This would be equivalent to requiring 2N+1 hidden units, and a single 
>linear output unit.  

The ``best version'' requires 2n+1 monotonic, (Lipschitz-1) continuous functions
of one variable (``transfer functions''), n constants (``weights'') which are
independent of the continuous function of n variables being represented, and one
continuous function of one variable, g, which has no special form and is thus
not representable by a single standard transfer function. This extra function,
g, would have to be represented by a net of its own for this to be a standard
network. So it's like having n inputs, 2n+1 hidden units, one special g-unit,
and a summation unit as an output, with a special O(n) connectivity pattern.
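
In symbols (my notation: phi_q are the 2n+1 monotone transfer functions,
lambda_p the n weights, g the special outer function):

    f(x_1,\dots,x_n) = \sum_{q=0}^{2n} g\Bigl( \sum_{p=1}^{n} \lambda_p \phi_q(x_p) \Bigr)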

>However, as I understand it, in Kolmogorov's theorem 
>each hidden unit may be required to have a different, and unknown, transfer 
>function.  Hence, the theorem proves the existence of such a neural network, 
>but not constructively.)

No, the proof of K. _is_ fully constructive (though he does a version that
requires about 2n^2 neurons). Of course, he uses limiting procedures in
the construction. (Amazingly, it only takes K. about 4 pages, though
it is very dense.)

Still, note that this theorem is essentially useless for practical
purposes, because the function g, while continuous, can be
extremely rough, and would typically require about const^n values on its
domain to be well represented. (That is, to represent g via a standard
network would require exponentially many neurons in n. To see
this, note that a general smooth function of n variables requires values
on a fine grid in n-space to be well approximated. If the grid has spacing
h, this is like (1/h)^n values. In the above representation, all this
info gets shoved into g, so you would expect g to have substantial
variations on a length scale of h^n if the original function varied
on a scale h in n dimensions (and depended strongly on all its variables!).
Thus the complexity of computing g becomes overwhelming.) The point is
that NNs are useful for representing very special functions of n variables,
namely those that can actually be specified by little information, say O(n)
or O(n^2). Kolmogorov's theorem makes no attempt to single out such functions,
and just lumps them in with general functions. Thus, it has nothing
special to tell us about the utility of neural nets.
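
To put numbers on the grid-counting argument, here is a throwaway C program;
the grid spacing h = 0.01 is an arbitrary illustrative choice:

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        double h = 0.01;   /* grid spacing (illustrative) */
        int n;

        /* A general function of n variables needs roughly (1/h)^n sample
           values to be pinned down on a grid of spacing h; in the
           Kolmogorov representation all of that information gets packed
           into the single outer function g. */
        for (n = 1; n <= 6; n++)
            printf("n = %d : about %.0f grid values\n", n, pow(1.0 / h, n));

        return 0;
    }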





--
Barry Merriman
UCLA Dept. of Math
UCLA Inst. for Fusion and Plasma Research
barry@math.ucla.edu (Internet)