neuron-request@HPLABS.HP.COM (Neuron-Digest Moderator Peter Marvit) (11/20/88)
Neuron Digest   Saturday, 19 Nov 1988   Volume 4 : Issue 26

Today's Topics:
          Re: separability and unbalanced data discussion
          Re: ICNN refs for boundary effects in training
              Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions
          Re: Learning arbitrary transfer functions

Send submissions, questions, address maintenance and requests for old issues
to "neuron-request@hplabs.hp.com" or
"{any backbone,uunet}!hplabs!neuron-request"

------------------------------------------------------------

Subject: Re: separability and unbalanced data discussion
From: jose@tractatus.bellcore.com (Stephen J Hanson)
Date: Sun, 13 Nov 88 13:01:41 -0500

See Hanson, S. J. & Burr, D., "Minkowski-r Backpropagation: Learning in
Connectionist Networks with Non-Euclidean Error Metrics," in D. Anderson
(Ed.), Neural Information Processing: Natural and Synthetic, AIP, 1988.
We look at some similar cases.

Steve

------------------------------

Subject: Re: ICNN refs for boundary effects in training
From: CHRISLEY%VAX.OXFORD.AC.UK@CUNYVM.CUNY.EDU
Date: 14 Nov 88 11:20:48 +0000

Rockwell.henr has asked for a reference to a paper at ICNN about boundary
effects in training.  Specifically, he mentioned work suggesting that
including as many boundary cases as possible in the training input improves
performance.

Let me suggest Prof. Kohonen's paper on LVQ2 (I do not have my proceedings
with me, so I can't give the exact reference).  This work showed that by
looking only at the training cases near boundaries, one improves accuracy.
Kohonen's formalism makes it very easy to determine which input patterns are
boundary cases; I can't imagine a formalism in which this could be more
perspicuous.

Let me also suggest that this work is relevant to the "LMS fails to separate"
discussion, especially Lippmann's comment.  The motivation for the comparison
between the LVQ and LMS algorithms undertaken in (Kohonen, Barna & Chrisley)
was that LVQ aims directly at minimizing the misclassification rate.  That is
why we thought we would get better performance on some tasks by using an
algorithm designed to minimize misclassifications explicitly, and not
implicitly through LMS.

Ron Chrisley
New College, Oxford OX1 3BN, England
and Xerox PARC SSL, 3333 Coyote Hill Rd, Palo Alto, CA 94304, USA

------------------------------

Subject: Learning arbitrary transfer functions
From: aam9n@uvaee.ee.virginia.EDU (Ali Minai)
Organization: EE Dept, U of Virginia, Charlottesville
Date: 10 Nov 88 18:54:52 +0000

[[ Editor's Note: This series of entries was culled from the USENET
comp.ai.neural-nets bboard.  I have lightly edited some of the entries for
spelling and duplications.  Readers of this Digest may find it interesting
to see the progression.  Please feel free to add your information or your
opinions for future issues. -PM ]]

I am looking for any references that might deal with the following problem:

    y = f(x);  f(x) is nonlinear in x

    Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}

Can the network now produce ym given xm, even if it has never seen the pair
before?  That is, given a set of input/output pairs for a nonlinear function,
can a multi-layer neural network be trained to induce the transfer function
by being shown the data?  What are the network requirements?  What are the
limitations, if any?

Are there theoretical bounds on the order, degree or complexity of learnable
functions for networks of a given type?  Note that I am speaking here of
*continuous* functions, not discrete-valued ones, so there is no immediate
congruence with classification.  Any attempt to "discretize" or "digitize"
the function leads to problems, because the resolution then becomes a factor,
leading to misclassification unless the discretizing scheme was chosen
initially with careful knowledge of the function's characteristics, which
defeats the whole purpose.  It seems to me that in order to induce the
function correctly, the network must be shown real values, rather than some
binary-coded version (e.g. in terms of basis vectors).  Also, given that
neurons have a logistic transfer function, is there a theoretical limit on
what kinds of functions *can* be induced by collections of such neurons?

All references, pointers, comments, advice, admonitions are welcome.

Thanks in advance,
Ali

Ali Minai
Dept. of Electrical Engg.
Thornton Hall
University of Virginia
Charlottesville, VA 22901
aam9n@uvaee.ee.Virginia.EDU
aam9n@maxwell.acc.Virginia.EDU
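
[[ Editor's Note: For readers who want to experiment with Ali's question,
here is a minimal sketch (mine, not any contributor's, in modern Python/NumPy
notation) of the setup he describes: a one-hidden-layer network with logistic
hidden units and a linear output, trained by gradient descent on (x, y)
samples of a nonlinear function and then queried at points it has never seen.
The choice of function, network size and learning rate are arbitrary
illustrations, not recommendations. -PM ]]

    # Minimal sketch: fit y = f(x) with a one-hidden-layer net (logistic
    # hidden units, linear output) trained by plain gradient descent.
    # All constants here are illustrative, not prescriptive.
    import numpy as np

    rng = np.random.default_rng(0)

    def f(x):                     # the "unknown" nonlinear transfer function
        return np.sin(3.0 * x) + 0.5 * x**2

    x_train = rng.uniform(-1.0, 1.0, size=(200, 1))  # training pairs (xi, yi)
    y_train = f(x_train)
    x_test  = rng.uniform(-1.0, 1.0, size=(20, 1))   # pairs never seen

    H = 20                                           # hidden units
    W1 = rng.normal(0.0, 0.5, (1, H)); b1 = np.zeros(H)
    W2 = rng.normal(0.0, 0.5, (H, 1)); b2 = np.zeros(1)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    lr = 0.05
    for epoch in range(20000):
        h = sigmoid(x_train @ W1 + b1)               # forward pass
        y_hat = h @ W2 + b2
        err = y_hat - y_train                        # derivative of squared error
        dW2 = h.T @ err / len(x_train); db2 = err.mean(0)
        dh = err @ W2.T * h * (1.0 - h)              # back through logistic layer
        dW1 = x_train.T @ dh / len(x_train); db1 = dh.mean(0)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    pred = sigmoid(x_test @ W1 + b1) @ W2 + b2       # interpolation test
    print("mean |error| on unseen points:",
          float(np.abs(pred - f(x_test)).mean()))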

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: aboulang@bbn.com (Albert Boulanger)
Date: 11 Nov 88 16:33:46 +0000

Check out the following report:

  "Nonlinear Signal Processing Using Neural Networks: Prediction and
  System Modeling"
  Alan Lapedes and Robert Farber
  Los Alamos Tech Report LA-UR-87-2662

There was also a description of this work at the last Denver conference on
Neural Networks.  Lapedes has a nice demonstration of recovering the logistic
map given a chaotic time series of the map.  He has also done this with the
Mackey-Glass time-delay equation.  It is rumored that techniques like this
are being used by companies to predict the stock market.  (Doyne Farmer as
well as James Crutchfield have non-neural, dynamical-systems techniques for
doing this; cf. "Equations of Motion From a Data Series," James Crutchfield &
Bruce McNamara, Complex Systems, Vol. 3, June 1987, pp. 417-452.)

Albert Boulanger
BBN Systems & Technologies Corp.
10 Moulton St.
Cambridge MA, 02138
aboulanger@bbn.com

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: dhw@itivax.UUCP (David H. West)
Organization: Industrial Technology Institute
Date: 11 Nov 88 22:30:09 +0000

In a previous article, aam9n@uvaee.ee.virginia.EDU (Ali Minai) writes:
>That is, given a set of input/output pairs for a nonlinear function, can a
>multi-layer neural network be trained to induce the transfer function
                                                  ^^^
An infinite number of transfer functions are compatible with any finite data
set.  If you really prefer some of them to others, this information needs to
be available in computable form to the algorithm that chooses a function.  If
you don't care too much, you can make an arbitrary choice (and live with the
result); you might, for example, use the (unique) Lagrange interpolation
polynomial of order n-1 that passes through your data points, simply because
it's easy to find in reference books and familiar enough not to surprise
anyone.  It happens to be easier to compute without a neural network,
though :-)  If you want ym for a given xm to be relatively independent of
(sufficiently large) n, however, you need in general to know something about
the domain, and (n-1)th (i.e. variable) order polynomial interpolation is
almost certainly not what you want.

- -David West              dhw%iti@umix.cc.umich.edu
                           {uunet,rutgers,ames}!umix!itivax!dhw
CDSL, Industrial Technology Institute, PO Box 1485, Ann Arbor, MI 48106
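
[[ Editor's Note: for concreteness, a tiny sketch (mine) of the Lagrange
interpolating polynomial David mentions -- the unique polynomial of degree
n-1 through n data points.  It needs no network at all, and it also
illustrates his caveat: its behaviour between and beyond the data points
depends entirely on this arbitrary choice of function class. -PM ]]

    # The (unique) Lagrange interpolating polynomial of degree n-1 through
    # n points -- the "arbitrary but easy" choice mentioned above.
    def lagrange_interpolate(xs, ys, x):
        """Evaluate at x the degree n-1 polynomial through (xs[i], ys[i])."""
        total = 0.0
        n = len(xs)
        for i in range(n):
            # basis polynomial L_i(x): equals 1 at xs[i], 0 at every other xs[j]
            term = ys[i]
            for j in range(n):
                if j != i:
                    term *= (x - xs[j]) / (xs[i] - xs[j])
            total += term
        return total

    # Example: four samples of some nonlinear function, queried at an unseen x.
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [0.0, 0.8, 0.9, 0.1]
    print(lagrange_interpolate(xs, ys, 1.5))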

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: efrethei@afit-ab.arpa (Erik J. Fretheim)
Organization: Air Force Institute of Technology; WPAFB, OH
Date: 15 Nov 88 14:14:08 +0000

Regarding the reply by dhw@itivax.UUCP (David H. West):
>>That is, given a set of input/output pairs for a nonlinear function, can a
>>multi-layer neural network be trained to induce the transfer function
>>                                                 ^^^

I don't know about nonlinear functions in general, but I did try to train a
net (back-prop) to compute sine(X) given X.  I trained it for two weeks
straight (as virtually the sole user) on an ELXSI.  The result was that,
carrying the solution to 5 significant decimal places, I got a correct answer
40% of the time.  Although this is somewhat better than random chance, it is
not good enough to be useful.  I will also note that the solution did not
improve dramatically in the last week of training, so I feel I can safely
assume that the error rate would not have decreased with further training.

I also tried the same problem using a two's-complement input/output
representation and got about the same results in about the same amount of
training; the binary representation needed a few more nodes, though.  I was
not able to spot any significant or meaningful pattern in the errors the net
was making, and I do not feel that reducing the number of significant decimal
places would help (even if it were meaningful), as the errors were not
consistently in the last couple of digits but rather were spread throughout
the number (in both binary and decimal representations).

Based on these observations, I don't think a net can be expected to compute
an arbitrary function to any meaningful precision.  Sure, it can do 1 + 1 and
other simple things, but it trips when it hits something that cannot be
exhaustively (or nearly exhaustively) trained.  Just my opinion, but ...

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: tomh@proxftl.UUCP (Tom Holroyd)
Organization: Proximity Technology, Ft. Lauderdale
Date: 15 Nov 88 20:59:46 +0000

Another paper is "Learning with Localized Receptive Fields," by John Moody
and Christian Darken, Yale Computer Science, PO Box 2158, New Haven, CT
06520, available in the Proceedings of the 1988 Connectionist Models Summer
School, published by Morgan Kaufmann.

They use a population of self-organizing local receptive fields that cover
the input domain, where each receptive field learns the output value for the
region of the input space covered by that field.  K-means clustering is used
to find the receptive field centers, and interpolation is done via a weighted
average of nearby fields.  They report convergence about 1000 times faster
than back-prop with conjugate gradient.

Tom Holroyd
UUCP: {uflorida,uunet}!novavax!proxftl!tomh
The white knight is talking backwards.
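
[[ Editor's Note: a rough sketch (mine, in modern Python/NumPy notation) of
the kind of localized-receptive-field scheme Tom describes above: k-means
places the receptive-field centers, each field stores a local output value,
and queries are answered by a distance-weighted average of nearby fields.
The Gaussian weighting, the constants, and the one-dimensional setting are
my own simplifications, not Moody and Darken's exact algorithm. -PM ]]

    # Rough sketch of localized receptive fields: k-means finds the centers,
    # each field keeps the mean output of the region it covers, prediction is
    # a distance-weighted average of field outputs.  Constants are illustrative.
    import numpy as np

    rng = np.random.default_rng(1)

    def kmeans(x, k, iters=50):
        centers = x[rng.choice(len(x), k, replace=False)]
        for _ in range(iters):
            labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = x[labels == j].mean()
        return centers

    def fit_fields(x, y, centers):
        # each receptive field stores the mean output of the points nearest it
        labels = np.abs(x[:, None] - centers[None, :]).argmin(axis=1)
        return np.array([y[labels == j].mean() if np.any(labels == j) else 0.0
                         for j in range(len(centers))])

    def predict(xq, centers, values, width=0.2):
        # weighted average of field outputs, weights falling off with distance
        w = np.exp(-((xq[:, None] - centers[None, :]) / width) ** 2)
        return (w * values).sum(axis=1) / w.sum(axis=1)

    # toy data: samples of y = sin(2*pi*x) on [0, 1], then query unseen points
    x = rng.uniform(0.0, 1.0, 300)
    y = np.sin(2.0 * np.pi * x)
    centers = kmeans(x, k=15)
    values = fit_fields(x, y, centers)
    print(predict(np.array([0.05, 0.33, 0.71]), centers, values))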

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: hwang@taiwan.Princeton.EDU (Jenq-Neng Hwang)
Organization: Princeton University, Princeton NJ
Date: 15 Nov 88 22:08:20 +0000

A. Lapedes and R. Farber of Los Alamos National Lab have a technical report,
LA-UR-87-2662, entitled "Nonlinear Signal Processing Using Neural Networks:
Prediction and System Modeling," which tried to solve the problem mentioned.
They also have a paper entitled "How Neural Nets Work," pp. 442-456, in Proc.
IEEE Conf. on Neural Information Processing Systems -- Natural and Synthetic,
Denver, November 1987.

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: hall@nvuxh.UUCP (Michael R Hall)
Organization: Bell Communications Research
Date: 16 Nov 88 21:04:58 +0000

Regarding the article by aam9n@uvaee.ee.virginia.EDU (Ali Minai):
>I am looking for any references that might deal with the following
>problem:
>
> [etc.]
>
>Note that I am speaking here of *continuous* functions, not discrete-valued
>ones, so there is no immediate congruence with classification.

The problem you raise is not just a neural net problem.  Your function
learning problem has been termed "concept learning" by some researchers
(e.g. Larry Rendell); a concept is a function.  There are many non-neural
learning algorithms (e.g. PLS1) that are designed to learn concepts.  My
opinion is that concept learning algorithms generally learn concepts better,
more easily, and faster than neural nets.  (Anybody willing to pit their
neural net against my implementation of PLS to learn a concept from natural
data?)  Neural nets are more general than concept learning algorithms, so it
is only natural that they should not learn concepts as quickly (in terms of
exposures) or as well (in terms of accuracy after a given number of
exposures).

Valiant and friends have come up with theories of the sort you desire, but
only for Boolean concepts (binary y's in your notation) and for learning
algorithms in general, not neural nets in particular.  "Graded concepts" are
continuous.  To my knowledge, no work has addressed the theoretical
learnability of graded concepts.  Before trying to come up with theoretical
learnability results for neural networks, one should probably address the
graded concept learning problem in general.  The Valiant approach of a
Probably Approximately Correct (PAC) learning criterion should be applicable
to graded concepts.

- --
Michael R. Hall                                | Bell Communications Research
"I'm just a symptom of the moral decay that's  | hall%nvuxh.UUCP@bellcore.COM
gnawing at the heart of the country" -The The  | bellcore!nvuxh!hall

------------------------------

Subject: Re: Learning arbitrary transfer functions
From: rao@enuxha.eas.asu.edu (Arun Rao)
Organization: Arizona State Univ, Tempe
Date: 17 Nov 88 16:00:26 +0000

Regarding the article by aam9n@uvaee.ee.virginia.EDU (Ali Minai):
>I am looking for any references that might deal with the following
>problem:
>
>y = f(x); f(x) is nonlinear in x
>
>Training Data = {(x1, y1), (x2, y2), ...... , (xn, yn)}
>
>Can the network now produce ym given xm, even if it has never seen the
>pair before?

Sounds like a standard interpolation problem to me, though a good deal of
effort has been expended to make neural networks interpolate.  Any elementary
book on numerical analysis will treat this problem, but the author of the
above probably knows this.  I would be interested in other ramifications of
the above problem which are not readily amenable to classical techniques.

- Arun Rao

------------------------------

End of Neurons Digest
*********************