[comp.ai.neural-nets] Step,approximations,etc

ck@rex.cs.tulane.edu (Cris Koutsougeras) (08/29/89)
Let me make this a collective response to keep the number of
messages smaller ...


>From: kroger@titan.tsd.arlut.utexas.edu (Jim Kroger)
>
>Hi, Chris. I am very interested in understanding what you mean by
>learning a function (continuous or otherwise) in the following....

Well, it is understood that the Rumelhart net and Back Prop
constitute a general tool for curve fitting. Given a certain
configuration of the net and its weights and thresholds/biases, one can
derive (but do not try it if you have many units) a closed-form
expression for the input-output function. That is the function
"learned" by the process of adaptation. Now the process of adaptation
is based on the samples of the training set. These samples can be seen
not only as "input pattern - output class" but more generally as "input
vector value - output vector value". In the case we were discussing,
teaching a function means sampling a given function and using the
samples as the training set. The expectation is to see the net learn
the sampled function while trying to fit a curve to the samples. In
other words, the net's final transfer function should be mathematically
equivalent to (or an approximation of) the sampled one. From there
all the difficult problems start, which are basically due to the degree
of non-linearity needed versus that existing in the net, and so on.
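
To make "sample a function and use the samples as the training set"
concrete, here is a minimal sketch in plain Python. Everything specific
in it is an assumption chosen for illustration and is not taken from the
papers cited in this post: the target function x*x, the 21 sample points,
the four hidden sigmoid units, the linear output unit, the learning rate
and the number of epochs. It uses full-batch gradient descent, which is
one simple variant of back propagation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def target(x):
    # Hypothetical function to be "taught" to the net (an assumption).
    return x * x


def forward(params, x):
    # One hidden layer of sigmoid units feeding a linear output unit.
    # This is the closed-form input-output function of the net:
    #   y(x) = sum_j v[j] * sigmoid(w[j] * x + b[j]) + c
    w, b, v, c = params
    hidden = [sigmoid(w[j] * x + b[j]) for j in range(len(w))]
    y = sum(v[j] * hidden[j] for j in range(len(w))) + c
    return y, hidden


def mse(params, samples):
    return sum((forward(params, x)[0] - t) ** 2 for x, t in samples) / len(samples)


def train(samples, n_hidden=4, lr=0.1, epochs=2000, seed=0):
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    v = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    c = 0.0
    initial_error = mse((w, b, v, c), samples)
    n = len(samples)
    for _ in range(epochs):
        gw = [0.0] * n_hidden
        gb = [0.0] * n_hidden
        gv = [0.0] * n_hidden
        gc = 0.0
        # Accumulate the gradient of the squared error over all samples.
        for x, t in samples:
            y, hidden = forward((w, b, v, c), x)
            err = y - t
            gc += err
            for j in range(n_hidden):
                gv[j] += err * hidden[j]
                # Error propagated back through the sigmoid of unit j.
                delta = err * v[j] * hidden[j] * (1.0 - hidden[j])
                gw[j] += delta * x
                gb[j] += delta
        # One gradient-descent step on the averaged gradient.
        for j in range(n_hidden):
            w[j] -= lr * gw[j] / n
            b[j] -= lr * gb[j] / n
            v[j] -= lr * gv[j] / n
        c -= lr * gc / n
    return (w, b, v, c), initial_error


# The training set is nothing more than samples of the function.
samples = [(i / 20.0, target(i / 20.0)) for i in range(21)]
params, initial_error = train(samples)
final_error = mse(params, samples)
print("mean squared error before:", initial_error, "after:", final_error)
```

The trained net is then itself a function of x (the expression in the
comment of forward), and the question in this thread is how close that
function can get to the sampled one.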


>
>From: Ali Minai <aam9n@uvaee.ee.virginia.edu>
>
>
>This is in response to your posting on the net about neural nets
>learning a step function. If you have any results on this, I would
>be very interested in references, comments etc. Thanks.
>

All those who have sent me personal requests will get my response.
I cannot respond "yesterday", since classes are starting now, but I will.
It would save some effort, though, if you are able to retrieve the following
publications. If you are not, then let me know.


     George, R., B. Geraci, R. Shrikanth and C. Koutsougeras, "A Methodical
     Study of the Rumelhart Model", 5th IASTED Intern. Conf. Expert Systems and
     Neural Networks.  Hawaii, Aug. 1989.
  
The effect of having more or fewer units than required in a (general) network
is discussed in:

     Koutsougeras, C. and C.A. Papachristou, "A Neural Network Model for
     Discrete Mappings", in Proceedings of the IEEE International Conference on
     Languages for Automation (LFA '88), August 1988.

Other references :

     Duda, R. and P. Hart, "Pattern Classification and Scene Analysis",
     John Wiley & Sons, 1973.
      
     Koutsougeras, C. and C.A. Papachristou, "Training of A Neural Network Model
     for Pattern Classification Based on an Entropy Measure", in Proceedings of
     the IEEE International Conference on Neural Networks (ICNN '88), IEEE, July
     1988.



Also Alan Heirich writes :

>This statement makes me uncomfortable.  It has been proven (White) that a
>two-layer feed forward network with a sigmoid activation function (such as
>you find in standard back propagation networks) can approximate any Borel
>measurable function to an arbitrary degree of precision.  So, for all
>intents and purposes, such a network can match a perfect step function to
>any discernible precision.  
>

That is what I was trying to point out. The more non-linearity you introduce,
the better the precision, but there is no way to get an exact fit. The amount
of non-linearity can be increased by introducing more units per layer or more
layers, although additional layers can easily get the net out of hand. In fact
Ken-ichi Funahashi, in "On the approximate realization of continuous mappings
by neural networks" (Neural Networks Vol.2 No. # '89), has proven that
arbitrary approximation is possible with only one hidden layer. That is, one
only needs to add units in the hidden layer. So we agree on the arbitrary
approximation; the point here is that it is an approximation and not an exact
fit for the particular function.
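
The gap between "arbitrarily close" and "exact" can be seen with a single
sigmoid unit. The sketch below is only an illustration under stated
assumptions: it makes one unit steeper by raising a hypothetical gain
parameter instead of adding hidden units as discussed above, and the
test point x0 = 0.1 and the three gain values are arbitrary choices.

```python
import math


def step(x):
    # The ideal step function, with the jump at x = 0.
    return 1.0 if x >= 0.0 else 0.0


def sigmoid_unit(x, gain):
    # A single sigmoid unit; a larger gain gives a steeper transition.
    return 1.0 / (1.0 + math.exp(-gain * x))


x0 = 0.1                      # hypothetical test point close to the jump
gains = [1.0, 10.0, 100.0]    # hypothetical steepness values
errors = [abs(step(x0) - sigmoid_unit(x0, g)) for g in gains]

for g, e in zip(gains, errors):
    print("gain", g, "error at x0:", e)

# At the jump itself the unit outputs one half whatever the gain is.
print("output at x = 0 with gain 100:", sigmoid_unit(0.0, 100.0))
```

The error at x0 shrinks as the gain grows but does not reach zero for
any finite gain, and the output at the jump itself stays at 0.5.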

Cris Koutsougeras
						Computer Science Dpt
						Tulane University