[comp.ai.neural-nets] Learning arbitrary transfer functions

aam9n@uvaee.ee.virginia.EDU (Ali Minai) (11/11/88)

I am looking for any references that might deal with the following
problem:

y = f(x);         f(x) is nonlinear in x

Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}

Can the network now produce ym given xm, even if it has never seen the
pair before?

That is, given a set of input/output pairs for a nonlinear function, can a
multi-layer neural network be trained to induce the transfer function
by being shown the data? What are the network requirements? What
are the limitations, if any? Are there theoretical bounds on
the order, degree or complexity of learnable functions for networks
of a given type?

Note that I am speaking here of *continuous* functions, not discrete-valued
ones, so there is no immediate congruence with classification. Any attempt
to "discretize" or "digitize" the function leads to problems because the
resolution then becomes a factor, leading to misclassification unless
the discretizing scheme was chosen initially with careful knowledge
of the function's characteristics, which defeats the whole purpose. It
seems to me that in order to induce the function correctly, the network
must be shown real values, rather than some binary-coded version (e.g.
in terms of basis vectors). Also, given that neurons have a logistic
transfer function, is there a theoretical limit on what kinds of functions
*can* be induced by collections of such neurons?
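
To make the setting concrete, here is a minimal sketch of the kind of
experiment I have in mind (illustrative only; the function, network size
and training details are arbitrary choices of mine): sample a nonlinear
f at real-valued points, train a one-hidden-layer net with logistic hidden
units and a linear output by gradient descent on squared error, then query
it at x values it was never shown.

    # minimal sketch: fit y = f(x) from samples, then query unseen x
    import numpy as np

    rng = np.random.default_rng(0)
    f = lambda x: np.sin(3 * x) + 0.5 * x         # the "unknown" nonlinear f
    x_train = rng.uniform(-1, 1, (200, 1))
    y_train = f(x_train)

    H = 20                                        # logistic hidden units
    W1 = rng.normal(0, 1, (1, H)); b1 = np.zeros(H)
    W2 = rng.normal(0, 1, (H, 1)); b2 = np.zeros(1)
    sigmoid = lambda z: 1 / (1 + np.exp(-z))

    lr = 0.05
    for step in range(20000):
        h = sigmoid(x_train @ W1 + b1)            # forward pass
        y_hat = h @ W2 + b2
        err = y_hat - y_train                     # gradient of squared error
        dW2 = h.T @ err / len(x_train); db2 = err.mean(0)
        dh = err @ W2.T * h * (1 - h)             # backprop through logistic
        dW1 = x_train.T @ dh / len(x_train); db1 = dh.mean(0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    x_new = np.array([[0.123], [-0.77]])          # pairs never seen in training
    print(sigmoid(x_new @ W1 + b1) @ W2 + b2)     # compare against f(x_new)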

All references, pointers, comments, advice, admonitions are welcome.
Thanks in advance,

                    Ali


Ali Minai
Dept. of Electrical Engg.
Thornton Hall
University of Virginia
Charlottesville, VA 22901

aam9n@uvaee.ee.Virginia.EDU
aam9n@maxwell.acc.Virginia.EDU

aboulang@bbn.com (Albert Boulanger) (11/12/88)

Check out the following report:

"Nonlinear Signal Processing Using Neural Networks:
Prediction and System Modelling"
Alan Lapedes and Robert Farber
Los Alamos Tech report LA-UR-87-2662

There was also a description of this work at the last Denver
conference on Neural Networks. Lapedes has a nice demonstration of
recovering the logistic map given a chaotic time series of the map. He
has also done this with the Mackey-Glass time-delay equation.
It is rumored that techniques like this are being used by companies to
predict the stock market.  (Doyne Farmer as well as James Crutchfield
have non-neural, dynamical-systems techniques for doing the same thing;
cf. "Equations of Motion From a Data Series," James Crutchfield & Bruce
McNamara, Complex Systems, Vol #3, June 1987, 417-452.)
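
For flavour, here is a tiny sketch of the logistic-map recovery (my own
illustration, not code from the report): generate a chaotic logistic-map
series and recover the map purely from the (x_t, x_{t+1}) pairs.  Lapedes
and Farber do this with a feed-forward net; an ordinary quadratic
least-squares fit is used below just to show how completely the series
determines the map.

    import numpy as np

    r = 3.9                                  # chaotic regime of the logistic map
    x = np.empty(500); x[0] = 0.3
    for t in range(499):
        x[t + 1] = r * x[t] * (1 - x[t])     # x_{t+1} = r x_t (1 - x_t)

    coeffs = np.polyfit(x[:-1], x[1:], deg=2)   # fit x_{t+1} ~ a x^2 + b x + c
    print(coeffs)                               # roughly [-3.9, 3.9, 0.0]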

Albert Boulanger
BBN Systems & Technologies Corp.
10 Moulton St.
Cambridge MA, 02138
aboulanger@bbn.com

dhw@itivax.UUCP (David H. West) (11/12/88)

In article <399@uvaee.ee.virginia.EDU> aam9n@uvaee.ee.virginia.EDU (Ali Minai) writes:
>
>I am looking for any references that might deal with the following
>problem:
>
>y = f(x);         f(x) is nonlinear in x
>
>Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
>
>Can the network now produce ym given xm, even if it has never seen the
>pair before?
>
>That is, given a set of input/output pairs for a nonlinear function, can a
>multi-layer neural network be trained to induce the transfer function
                                                 ^^^
An infinite number of transfer functions are compatible with any
finite data set.  If you really prefer some of them to others, this
information needs to be available in computable form to the
algorithm that chooses a function.  If you don't care too much, you
can make an arbitrary choice (and live with the result); you might 
for example use the (unique) Lagrange interpolation polynomial of
order n-1 that passes through your data points, simply because it's
easy to find in reference books, and familiar enough not to surprise
anyone. It happens to be easier to compute without a neural network,
though :-)
If you want ym for a given xm to be relatively independent of
(sufficiently large) n, however, you generally need to know
something about the domain, and (n-1)th-order (i.e. variable-order)
polynomial interpolation is almost certainly not what you want.
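
A quick demonstration of that caveat (my own illustrative code, using
Runge's classic example; any language would do):

    import numpy as np

    def lagrange_eval(xs, ys, x):
        """Evaluate the unique interpolating polynomial through (xs, ys) at x."""
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            li = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    li *= (x - xj) / (xi - xj)
            total += yi * li
        return total

    f = lambda x: 1.0 / (1.0 + 25.0 * x**2)   # smooth, innocent-looking f
    xm = 0.95                                 # a query point not in the data
    for n in (5, 9, 13, 17):                  # n equally spaced samples
        xs = np.linspace(-1, 1, n)
        print(n, lagrange_eval(xs, f(xs), xm), "true:", f(xm))

The predicted ym at the fixed xm swings wildly as n grows (Runge's
phenomenon), which is exactly the dependence on n you don't want.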

-David West            dhw%iti@umix.cc.umich.edu
		       {uunet,rutgers,ames}!umix!itivax!dhw
CDSL, Industrial Technology Institute, PO Box 1485, 
Ann Arbor, MI 48106

efrethei@afit-ab.arpa (Erik J. Fretheim) (11/15/88)

In article <378@itivax.UUCP> dhw@itivax.UUCP (David H. West) writes:
In article <399@uvaee.ee.virginia.EDU> aam9n@uvaee.ee.virginia.EDU (Ali Minai) writes:
>
>I am looking for any references that might deal with the following
>problem:
>
>y = f(x);         f(x) is nonlinear in x
>
>Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
>
>Can the network now produce ym given xm, even if it has never seen the
>pair before?
>
>That is, given a set of input/output pairs for a nonlinear function, can a
>multi-layer neural network be trained to induce the transfer function
>                                                 ^^^
I don't know about nonlinear functions in general, but I did try to train a
net (back prop) to learn to compute sine(X) given X.  I trained it for two
weeks straight (virtually sole user) on an ELXSI.  The result was that,
carrying the solution to 5 significant decimal places, I got a correct
solution 40% of the time.  Although this is somewhat better than random
chance, it is not good enough to be useful.  I will also note that the
solution did not improve dramatically in the last week of training, so I
feel I can safely assume that the error rate would not decrease with
further training.  I also tried the same problem using a two's complement
input/output encoding and got about the same results in about the same
amount of training; the binary representation needed a few more nodes,
though.  I was not able to spot any significant or meaningful pattern in
the errors the net was making, and I do not believe that reducing the
number of significant decimal places would help (even if it were
meaningful), since the errors were not consistently in the last couple of
digits but rather were spread throughout the number (in both the binary
and decimal representations).
Based on these observations, I don't think a net can be expected to learn
any meaningful function of this sort.  Sure, it can do 1 + 1 and other
simple things, but it trips up when it hits something that cannot be
exhaustively (or nearly exhaustively) trained.

Just my opinion, but ...

tomh@proxftl.UUCP (Tom Holroyd) (11/16/88)

Another paper is "Learning with Localized Receptive Fields," by John Moody
and Christian Darken, Yale Computer Science, PO Box 2158, New Haven, CT 06520,
available in the Proceedings of the 1988 Connectionist Models Summer School,
published by Morgan Kaufmann.

They use a population of self-organizing local receptive fields that cover
the input domain, where each receptive field learns the output value for the
region of the input space covered by that field.  K-means clustering is used
to find the receptive field centers, and interpolation is done via a weighted
average of nearby fields.  They report convergence roughly 1000 times faster
than back-prop with conjugate gradient.
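
A rough reconstruction of the scheme as described above (my own sketch,
not the authors' code; a batch least-squares fit stands in here for
whatever incremental update they actually use):

    import numpy as np

    rng = np.random.default_rng(1)
    f = lambda x: np.sin(4 * x)                   # target function
    x_train = rng.uniform(0, 1, 300)
    y_train = f(x_train)

    # crude 1-D k-means to place the receptive-field centres
    K = 12
    centres = rng.choice(x_train, size=K, replace=False)
    for _ in range(20):
        assign = np.argmin(np.abs(x_train[:, None] - centres[None, :]), axis=1)
        for k in range(K):
            if np.any(assign == k):
                centres[k] = x_train[assign == k].mean()

    width = 1.0 / K                               # receptive-field width
    phi = lambda x: np.exp(-((x[:, None] - centres[None, :]) / width) ** 2)

    # each field carries a local output value; prediction is a normalised
    # weighted average of the fields near the query point
    A = phi(x_train); A /= A.sum(axis=1, keepdims=True)
    values = np.linalg.lstsq(A, y_train, rcond=None)[0]

    x_test = np.linspace(0, 1, 5)
    A_test = phi(x_test); A_test /= A_test.sum(axis=1, keepdims=True)
    print(A_test @ values)                        # compare against f(x_test)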

Tom Holroyd
UUCP: {uflorida,uunet}!novavax!proxftl!tomh

The white knight is talking backwards.

hwang@taiwan.Princeton.EDU (Jenq-Neng Hwang) (11/16/88)

 A. Lapedes and R. Farber from Los Alamos National
Lab have a technical report, LA-UR-87-2662, entitled
"Nonlinear Signal Processing Using Neural Networks: Prediction and System
Modelling", which addresses the problem mentioned.
They also have a paper published in
"Proc. IEEE Conf. on Neural Information Processing
Systems -- Natural and Synthetic, Denver, November 1987",
entitled "How Neural Nets Work", pp. 442-456.

hall@nvuxh.UUCP (Michael R Hall) (11/17/88)

In article <399@uvaee.ee.virginia.EDU> aam9n@uvaee.ee.virginia.EDU (Ali Minai) writes:
>I am looking for any references that might deal with the following
>problem:
>
>y = f(x);         f(x) is nonlinear in x
>
>Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
>
>Can the network now produce ym given xm, even if it has never seen the
>pair before?
>
>That is, given a set of input/output pairs for a nonlinear function, can a
>multi-layer neural network be trained to induce the transfer function
>by being shown the data? What are the network requirements? What
>are the limitations, if any? Are there theoretical bounds on
>the order, degree or complexity of learnable functions for networks
>of a given type?
>
>Note that I am speaking here of *continuous* functions, not discrete-valued
>ones, so there is no immediate congruence with classification. 

The problem you raise is not just a neural net problem.  Your
function learning problem has been termed "concept learning" by
some researchers (e.g. Larry Rendell).  A concept is a function. 
There are many nonneural learning algorithms (e.g. PLS1) that are
designed to learn concepts.  My opinion is that concept learning
algorithms generally work better, easier, and faster than neural
nets for learning concepts.  (Anybody willing to pit their neural
net against my implementation of PLS to learn a concept from natural
data?)  Neural nets are more general than concept learning
algorithms, and so it is only natural that they should not learn
concepts as quickly (in terms of exposures) and well (in terms of
accuracy after a given number of exposures).

Valiant and friends have come up with theories of the sort you
desire, but only for boolean concepts (binary y's in your notation)
and learning algorithms in general, not neural nets in particular.
"Graded concepts" are continuous.  To my knowledge, no work has
addressed the theoretical learnability of graded concepts.  Before
trying to come up with theoretical learnability results for neural
networks, one should probably address the graded concept learning
problem in general.  The Valiant approach of a Probably Approximately
Correct (PAC) learning criterion should be applicable to graded
concepts. 
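
For concreteness, the flavour of statement a PAC criterion gives (this is
the standard bound for a finite hypothesis class in the noise-free boolean
case, nothing specific to graded concepts) is: with

    m \ge \frac{1}{\epsilon} \left( \ln|H| + \ln\frac{1}{\delta} \right)

training examples, any hypothesis in H consistent with the data has true
error at most \epsilon with probability at least 1 - \delta.  Presumably a
graded analogue would replace the 0/1 error with something like expected
absolute or squared deviation.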
-- 
Michael R. Hall                               | Bell Communications Research
"I'm just a symptom of the moral decay that's | hall%nvuxh.UUCP@bellcore.COM
gnawing at the heart of the country" -The The | bellcore!nvuxh!hall

rao@enuxha.eas.asu.edu (Arun Rao) (11/18/88)

 In article <399@uvaee.ee.virginia.EDU> aam9n@uvaee.ee.virginia.EDU (Ali Minai) writes:
 >I am looking for any references that might deal with the following
 >problem:
 >
 >y = f(x);         f(x) is nonlinear in x
 >
 >Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
 >
 >Can the network now produce ym given xm, even if it has never seen the
 >pair before?
	Sounds like a standard interpolation problem to me, though a good
	deal of effort has been expended to make neural networks
	interpolate.  Any elementary book on numerical analysis will treat
	this problem, but the author of the above probably knows this.
	I would be interested in other ramifications of the above problem
	which are not readily amenable to classical techniques.

	- Arun Rao

joe@amos.ling.ucsd.edu (Shadow) (11/18/88)

In article <399@uvaee.ee.virginia.EDU>, aam9n@uvaee.ee.virginia.EDU (Ali Minai) writes:

>>I am looking for any references that might deal with the following
>>problem:
>>
>>y = f(x);         f(x) is nonlinear in x
>>
>>Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
>>
>>Can the network now produce ym given xm, even if it has never seen the
>>pair before?
>>
>>That is, given a set of input/output pairs for a nonlinear function, can a
>>multi-layer neural network be trained to induce the transfer function

my response:

1. Neural nets are an attempt to model brain-like learning
   (at least in theory).

   So, how do humans learn nonlinear functions?

      : you learn that x^2, for instance, is x times x.

   And how about x times y?  How do humans learn that?

      : you memorize it for single digits, and
      : for more than a single digit, you multiply streams
        of digits together with a carry routine.

2. So the problem is a little more complicated. You might imagine
   a network which can perfectly learn non-linear functions if
   it has at its disposal various useful sub-networks (e.g., a
   network can learn x^n if it has at its disposal some mechanism
   and architecture suitable for multiplying x & x.)

   (imagine a sub-network behaving as a single unit, receiving
    input and producing output in a predictable mathematical manner)
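
   A toy sketch of that picture (my own illustration): treat a "multiplier"
   as a black-box unit and compose copies of it to get x^n.

       # stands in for a sub-network trained to output a*b
       def multiply_unit(a, b):
           return a * b

       def power_net(x, n):
           out = 1.0
           for _ in range(n):
               out = multiply_unit(out, x)   # feed the running product back in
           return out

       print(power_net(1.7, 3), 1.7 ** 3)    # same value either way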

(promoting thought)


   What is food without the hunger ?
   What is light without the darkness ?
   And what is pleasure without pain ?

   joe@amos.ling.ucsd.edu