ck@rex.cs.tulane.edu (Cris Koutsougeras) (08/29/89)
Let me make this a collective response to keep the number of messages smaller ...

>From: kroger@titan.tsd.arlut.utexas.edu (Jim Kroger)
>
>Hi, Chris. I am very interested in understanding what you mean by
>learning a function (continuous or otherwise) in the following....

Well, it is understood that the Rumelhart net and Back Prop constitute a general tool for curve fitting. Given a certain configuration of the net and its weights and thresholds/biases, one can derive (but do not try it if you have many units) a closed-form expression for the input-output function. That is the function "learned" by the process of adaptation.

Now, the process of adaptation is based on the samples of the training set. These samples can be seen not only as "input pattern - output class" but more generally as "input vector value - output vector value". In the case we were discussing, teaching a function means sampling a given function and using the samples as the training set. The expectation is to see the net learn the sampled function while trying to fit a curve to the samples. In other words, the net's final transfer function should be mathematically equivalent to (or an approximation of) the sampled one. From there all the difficult problems start, which are basically due to the degree of non-linearity needed or present, etc.

>
>From: Ali Minai <aam9n@uvaee.ee.virginia.edu>
>
>
>This is in response to your posting on the net about neural nets
>learning a step function. If you have any results on this, I would
>be very interested in references, comments etc. Thanks.
>

All those who have sent me personal requests will get my response. I cannot respond "yesterday" since classes are starting now, but I will. It would save some effort, though, if you are able to retrieve the following publications. If you are not, then let me know.

George, R., B. Geraci, R. Shrikanth and C. Koutsougeras, "A Methodical Study of the Rumelhart Model", 5th IASTED Intern. Conf. Expert Systems and Neural Networks, Hawaii, Aug.
1989.

The effect of more or fewer units than required in a (general) network is discussed in:

Koutsougeras, C. and C.A. Papachristou, "A Neural Network Model for Discrete Mappings", in Proceedings of the IEEE International Conference on Languages for Automation (LFA '88), August 1988.

Other references:

Duda & Hart, "Pattern Classification and Scene Analysis", John Wiley & Sons, 1973.

Koutsougeras, C. and C.A. Papachristou, "Training of A Neural Network Model for Pattern Classification Based on an Entropy Measure", in Proceedings of the IEEE International Conference on Neural Networks (ICNN '88), IEEE, July 1988.

Also Alan Heirich writes:

>This statement makes me uncomfortable. It has been proven (White) that a
>two-layer feed forward network with a sigmoid activation function (such as
>you find in standard back propagation networks) can approximate any Borel
>measurable function to an arbitrary degree of precision. So, for all
>intents and purposes, such a network can match a perfect step function to
>any discernible precision.
>

That is what I was trying to point out. The more non-linearity you introduce, the better the precision, but there is no way to get an exact fit. The amount of non-linearity can be increased by introducing more units per layer or more layers, although additional layers can easily get the net out of hand. In fact, Ken-ichi Funahashi, in "On the approximate realization of continuous mappings by neural networks" (Neural Networks Vol.2 No. # '89), has proven that arbitrary approximation is possible with only one hidden layer. That is, one needs to add units only in the hidden layer. Nevertheless, we agree on the arbitrary approximation; the point here is only that it is an approximation and not an exact fit for the particular function.

Cris Koutsougeras
Computer Science Dpt
Tulane University
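
P.S. To make the "approximation, not exact fit" point concrete, here is a minimal sketch in Python. It is not taken from any of the papers cited above and it does not train a net. It assumes a single sigmoid unit whose input weight (called `gain` here, a name made up for this illustration) is increased by hand, and it compares the unit's output against a unit step at 0. The sample grid and the gain values are also arbitrary choices.

```python
import math

def sigmoid(z):
    # Logistic function, written so that exp() never overflows.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def step(x):
    # Target function: unit step with the jump at x = 0.
    return 1.0 if x >= 0 else 0.0

def max_error(gain, xs):
    # Largest absolute gap between sigmoid(gain * x) and the step over xs.
    return max(abs(sigmoid(gain * x) - step(x)) for x in xs)

# Sample points in [-1, 1] that stay at least 0.05 away from the jump
# (illustrative grid, an assumption of this sketch).
xs = [i / 100.0 for i in range(-100, 101) if abs(i) >= 5]

# Hypothetical weight magnitudes for the single unit.
errors = {gain: max_error(gain, xs) for gain in (1, 10, 100, 1000)}

for gain, err in errors.items():
    print(f"gain={gain:5d}  max error away from the jump = {err:.3e}")

# At the jump itself the sigmoid output is 0.5 whatever the gain is.
print("output at x = 0 with gain 1000:", sigmoid(1000 * 0.0))
```

Away from the jump the error shrinks as the gain grows, which is the arbitrary-precision part. At x = 0 the unit always outputs 0.5, so a finite weight never turns the smooth curve into the step itself, which is the no-exact-fit part.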