[comp.ai.neural-nets] Neuron Digest V7 #7

neuron-request@HPLMS2.HPL.HP.COM ("Neuron-Digest Moderator Peter Marvit") (02/05/91)

Neuron Digest   Monday,  4 Feb 1991
                Volume 7 : Issue 7

Today's Topics:
                      tech report available by ftp
      TR - Integrating Rules and Connectionism for Robust Reasoning
                 report: optimal NN size for classifiers
            Nips90 Preprint available from neuroprose archive
            TR-EE 90-63: The Hystery Unit - short term memory
                tech report: continuous spatial automata
              Learning algorithms for oscillatory networks
             TR available from neuroprose; feedforward nets
         Abstract - Backpropagation Learning in Expert Networks
     tech rep on overfitting, decision theory, PAC learning, and...
       preprint - Dynamics of Generalization in Linear Perceptrons


Send submissions, questions, address maintenance and requests for old issues to
"neuron-request@hplabs.hp.com" or "{any backbone,uunet}!hplabs!neuron-request"
Use "ftp" to get old issues from hplpm.hpl.hp.com (15.255.176.205).

------------------------------------------------------------

Subject: tech report available by ftp 
From:    honavar@iastate.edu
Date:    Mon, 14 Jan 91 13:01:47 -0600


The following technical report is available in postscript form
by anonymous ftp (courtesy Jordan Pollack of Ohio State Univ). 

- ----------------------------------------------------------------------

Generative Learning Structures and Processes for
Generalized Connectionist Networks

Vasant Honavar                          Leonard Uhr 
Department of Computer Science          Computer Sciences Department 
Iowa State University                   University of Wisconsin-Madison 

                Technical Report #91-02, January 1991 
                Department of Computer Science
                Iowa State University, Ames, IA 50011 

                                Abstract


Massively parallel networks of relatively simple computing elements offer
an attractive and versatile framework for exploring a variety of learning
structures and processes for intelligent systems. This paper briefly
summarizes the popular learning structures and processes used in such
networks. It outlines a range of potentially more powerful alternatives
for pattern-directed inductive learning in such systems.  It motivates
and develops a class of new learning algorithms for massively parallel
networks of simple computing elements.  We call this class of learning
processes _generative_ because they offer a set of mechanisms for the
constructive and adaptive determination of the network architecture - the
number of processing elements and the connectivity among them - as a
function of experience.  Such generative learning algorithms attempt to
overcome some of the limitations of approaches to learning in networks
that rely solely on the modification of _weights_ on the links within an
otherwise fixed network topology, e.g., rather slow learning and the need
for an a priori choice of a network architecture.  Several alternative
designs, extensions and refinements of generative learning algorithms are
examined, along with a range of control structures and processes which
can be used to regulate the form and content of the internal
representations learned by such networks.
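
As a rough illustration only (the specific generative algorithms are
defined in the report), the general recipe of growing a network as a
function of experience can be sketched as follows.  This is a minimal,
self-contained toy in Python/NumPy, assuming random hidden weights and a
least-squares refit of the output weights after each growth step; it is
not taken from TR #91-02.

import numpy as np

# Toy "generative" learning loop: start with one hidden unit and add
# another whenever training error stops improving, instead of fixing
# the architecture in advance.  (Illustrative sketch only.)

rng = np.random.default_rng(0)

def fit_output(H, y):
    # least-squares fit of the output weights for hidden activations H
    v, *_ = np.linalg.lstsq(H, y, rcond=None)
    return v

def grow_network(X, y, max_units=20, tol=1e-4):
    d = X.shape[1]
    W = rng.normal(size=(d, 1))              # one hidden unit to start
    best = np.inf
    while True:
        H = np.tanh(X @ W)                   # hidden-layer activations
        v = fit_output(H, y)                 # refit output weights
        err = np.mean((H @ v - y) ** 2)
        if err < tol or W.shape[1] >= max_units:
            return W, v
        if best - err < tol:                 # learning has stalled:
            W = np.hstack([W, rng.normal(size=(d, 1))])   # add a unit
        best = min(best, err)

X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sin(3.0 * X[:, 0]) * X[:, 1]          # toy target function
W, v = grow_network(X, y)
print("hidden units grown:", W.shape[1])
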
______________________________________________________________________________

You will need a POSTSCRIPT printer to print the file. 
To obtain a copy of the report, use anonymous ftp from 
cheops.cis.ohio-state.edu (here is what the transaction looks like): 

% ftp
ftp> open cheops.cis.ohio-state.edu
Connected to cheops.cis.ohio-state.edu.
220 cheops.cis.ohio-state.edu FTP server (Version blah blah) ready.
Name (cheops.cis.ohio-state.edu:yourname): anonymous
331 Guest login ok, send ident as password.
Password: anything 
230 Guest login ok, access restrictions apply.
ftp> cd pub/neuroprose
250 CWD command successful.
ftp> bin  
200 Type set to I.
ftp> get honavar.generate.ps.Z 
200 PORT command successful.
150 Opening BINARY mode data connection for honavar.generate.ps.Z (55121 bytes).
226 Transfer complete.
local: honavar.generate.ps.Z remote: honavar.generate.ps.Z
55121 bytes received in 1.8 seconds (30 Kbytes/s)
ftp> quit
221 Goodbye.
% uncompress honavar.generate.ps.Z
% lpr honavar.generate.ps 




------------------------------

Subject: TR - Integrating Rules and Connectionism for Robust Reasoning
From:    Ron Sun <rsun@chaos.cs.brandeis.edu>
Date:    Tue, 15 Jan 91 17:12:08 -0500


  Integrating Rules and Connectionism for  Robust Reasoning

                Technical Report TR-CS-90-154

                         Ron Sun
                   Brandeis University
               Computer Science Department
                    rsun@cs.brandeis.edu

                          Abstract
A connectionist model for robust reasoning, CONSYDERR, is proposed to
account for some common reasoning patterns found in commonsense reasoning
and to remedy the brittleness problem.  A dual representation scheme is
devised, which utilizes both localist representation and distributed
representation with features.  We explore the synergy resulting from the
interaction between these two types of representation, which helps to
deal with problems such as partial information, no exact match, property
inheritance, rule interaction, etc.  Because of this, the CONSYDERR
system is capable of accounting for many difficult patterns in
commonsense reasoning.  This work also shows that connectionist models of
reasoning are not just an ``implementation'' of their symbolic
counterparts, but better computational models of commonsense reasoning,
taking into account the approximate, evidential and adaptive nature of
reasoning, and accounting for the spontaneity and parallelism of
reasoning processes.
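
As a rough illustration of why a dual (localist plus feature-based)
representation helps when there is no exact match, consider the toy
sketch below in Python.  It is emphatically not the CONSYDERR
architecture (whose node dynamics and rule encoding are described in the
report); the feature sets, the overlap measure and the single rule are
made up for illustration only.

# Toy dual representation: a rule stored at the localist level can still
# fire, with graded strength, for an input that only partially matches
# its premise at the feature (distributed) level.

FEATURES = {
    "robin":   {"bird", "flies", "sings"},
    "penguin": {"bird", "swims", "flightless"},
}

RULES = {("robin", "builds_nest"): 1.0}      # localist rule

def similarity(a, b):
    # crude stand-in for overlap between distributed feature patterns
    return len(FEATURES[a] & FEATURES[b]) / len(FEATURES[a] | FEATURES[b])

def conclude(observed):
    conclusions = {}
    for (premise, consequence), strength in RULES.items():
        # direct localist activation on an exact match, otherwise
        # similarity-based activation through the feature level
        act = 1.0 if premise == observed else similarity(premise, observed)
        conclusions[consequence] = act * strength
    return conclusions

print(conclude("robin"))     # exact match: full-strength conclusion
print(conclude("penguin"))   # no exact match: graded conclusion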

+++ comments and suggestions are especially welcome +++

 ------------ FTP procedures ---------
ftp cheops.cis.ohio-state.edu
>name: anonymous
>password: neuron

>binary
>cd pub/neuroprose
>get sun.integrate.ps.Z 
>quit

uncompress sun.integrate.ps.Z
lpr sun.integrate.ps 




------------------------------

Subject: report: optimal NN size for classifiers
From:    Manoel F Tenorio <tenorio@ecn.purdue.edu>
Date:    Wed, 16 Jan 91 09:41:37 -0500


This report addresses the analysis of a new criterion for optimal
classifier design. In particular, we study the effects of the sizing of
the hidden layers and the optimal value predicted by this criterion.

Requests should be sent to: jld@ecn.purdue.edu
TR-EE 91-5
There is a fee for requests from outside the USA, Canada and Mexico.
 


        On Optimal Adaptive Classifier Design Criterion -
       How many hidden units are necessary for an optimal
                 neural network classifier?

Wei-Tsih Lee                            Manoel Fernando Tenorio
Parallel Distributed Structures Lab.    Parallel Distributed Structures Lab.
School of Electrical Engineering        School of Electrical Engineering
Purdue University                       Purdue University
West Lafayette, IN  47907               West Lafayette, IN  47907

lwt@ecn.purdue.edu                      tenorio@ecn.purdue.edu


Abstract

        A central problem in classifier design is the estimation of
classification error.  The difficulty in classifier design arises in
situations where the sample distribution is unknown and the number of
training samples available is limited.  In this paper, we present a new
approach for solving this problem.  In our model, there are two types of
classification error: approximation and generalization error.  The former
is due to the imperfect knowledge of the underlying sample distribution,
while the latter is mainly the result of inaccuracies in parameter
estimation, which is a consequence of the small number of training
samples.  We therefore propose a criterion for optimal classifier
selection, called the Generalized Minimum Empirical Criterion (GMEE).
The GMEE criterion consists of two terms, corresponding to estimates of
the two types of error.  The first term is the empirical error, which is
the classification error observed for the training samples.  The second
is an estimate of the generalization error, which is related to the
classifier complexity.  In this paper we consider the Vapnik-Chervonenkis
dimension (VCdim) as a measure of classifier complexity.  Hence, the
classifier which minimizes the criterion is the one with minimal error
probability.  Bayes consistency of the GMEE criterion has been proven.

        As an application, the criterion is used to design the optimal
neural network classifier.  A corollary concerning the Bayes optimality
of neural network-based classifiers has been proven.  Thus, our approach
provides a theoretical foundation for the connectionist approach to
optimal classifier design.  Experimental results are given to validate
the approach,
followed by discussions and suggestions for future research.
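
As a rough sketch of the general recipe (not the GMEE criterion itself,
whose exact terms are given in TR-EE 91-5), one can select the
hidden-layer size by minimizing the observed training error plus a
complexity penalty that grows with an estimate of the VC dimension.  The
penalty form and the VC-dimension proxy below are generic textbook-style
placeholders chosen only for illustration.

import math

def vc_dim_estimate(n_inputs, n_hidden):
    # crude proxy: proportional to the number of weights in the net
    return (n_inputs + 2) * n_hidden

def penalty(vc_dim, n_samples, delta=0.05):
    # generic sqrt((VCdim * log terms + log(1/delta)) / n) style penalty
    return math.sqrt((vc_dim * math.log(2.0 * n_samples / vc_dim + 1.0)
                      + math.log(4.0 / delta)) / n_samples)

def select_hidden_units(empirical_errors, n_inputs, n_samples):
    # empirical_errors: {n_hidden: training error observed at that size}
    scores = {h: err + penalty(vc_dim_estimate(n_inputs, h), n_samples)
              for h, err in empirical_errors.items()}
    return min(scores, key=scores.get)

errors = {2: 0.21, 4: 0.12, 8: 0.07, 16: 0.06, 32: 0.055}
print(select_hidden_units(errors, n_inputs=10, n_samples=5000))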

------------------------------

Subject: Nips90 Preprint available from neuroprose archive
From:    "Terence D. Sanger" <tds@ai.mit.edu>
Date:    Sat, 19 Jan 91 16:33:00 -0500

The following preprint is available, and will appear in the Nips'90
proceedings:

- ---------------------------------------------------------------------------

   Basis-Function Trees as a Generalization of Local Variable Selection
                    Methods for Function Approximation

                           Terence D. Sanger

  Local variable selection has proven to be a powerful technique for
  approximating functions in high-dimensional spaces.  It is used in several
  statistical methods, including CART, ID3, C4, MARS, and others (see the
  bibliography for references to these algorithms).  In this paper I present
  a tree-structured network which is a generalization of these techniques.
  The network provides a framework for understanding the behavior of
  such algorithms and for modifying them to suit particular applications.

- ---------------------------------------------------------------------------

Bibtex entry:

  @INCOLLECTION{sanger91,
        AUTHOR = {Terence D. Sanger},
        TITLE = {Basis-Function Trees as a Generalization of Local 
        Variable Selection Methods for Function Approximation},
        BOOKTITLE = {Advances in Neural Information Processing Systems 3},
        PUBLISHER = {Morgan Kaufmann},
        YEAR = {1991},
        EDITOR = {Richard P. Lippmann and John Moody and David S. Touretzky},
        NOTE = {Proc. NIPS'90, Denver CO}
  }

This paper can be obtained by anonymous ftp from the neuroprose database:

unix> ftp cheops.cis.ohio-state.edu          # (or ftp 128.146.8.62)
Name (cheops.cis.ohio-state.edu:): anonymous
Password (cheops.cis.ohio-state.edu:anonymous): <ret>
ftp> cd pub/neuroprose
ftp> binary
ftp> get sanger.trees.ps.Z
ftp> quit
unix> uncompress sanger.trees.ps
unix> lpr -P(your_local_postscript_printer) sanger.trees.ps
                # in some cases you will need to use the -s switch to lpr.


Terry Sanger
MIT, E25-534
Cambridge, MA 02139
USA

tds@ai.mit.edu

------------------------------

Subject: TR-EE 90-63: The Hystery Unit - short term memory
From:    tenorio@ecn.purdue.edu (Manoel F Tenorio)
Date:    Tue, 22 Jan 91 15:15:41 -0500

The task of performing recognition of patterns on spatio-temporal signals
is not an easy one, primarily due to the time structure of the signal.
Classical methods of handling this problem have proven unsatisfactory;
they range from "projecting out" the time axis to "memorizing" the entire
sequence before a decision can be made. In particular, the latter can be
very difficult if no a priori information about signal length is present,
if the signal can suffer compression and extension, or if the entire
pattern is massively large, as in the case of time-varying imagery.

Neural network models proposed to solve this problem have been based
either on the classical approach or on recursive loops within the
network, which can make learning algorithms numerically unstable.

It has long been conjectured that some kind of short term memory is
needed for the spatio-temporal processing done by biological systems. In
this report, we take a first step toward the design of a spatio-temporal
system that deals naturally with the problems present in this type of
processing. In particular, we investigate replacing the commonly used
sigmoid function with a hysteresis function. Then, with the addition of
an integrator representing the neuron membrane effect, we construct a
simple computational device to perform spatio-temporal pattern
recognition tasks.

The result is that, for bipolar input sequences, this device remaps the
entire sequence into a real number. Knowing the output of the device
suffices for knowing the sequence. For trajectories embedded in noise,
the device shows recognition superior to that of other techniques.
Furthermore, properties of the device allow the designer to determine the
memory length, and to explain sensitization and habituation phenomena
with simple circuits. The report below deals with the device and its
mathematical properties. Forthcoming papers will concentrate on other
aspects of circuits constructed with this device.
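
To fix ideas about how a short term memory can fold a bipolar sequence
into a single real number, here is a minimal toy sketch of the
leaky-integrator part of the story (ordinary Python; it is not the
Hystery unit, whose hysteresis and membrane equations are given in the
report).  With a decay factor of one half the state is essentially a
binary fraction of the input history, so distinct sequences of a given
length map to distinct numbers.

# Toy leaky integrator: s <- a*s + b*x for bipolar inputs x in {-1, +1}.
# The final state encodes the whole input sequence (of a fixed length).

def remap(sequence, a=0.5, b=0.25):
    s = 0.0
    for x in sequence:
        s = a * s + b * x
    return s

print(remap([+1, -1, -1, +1]))   # each sequence gets its own value
print(remap([+1, -1, +1, +1]))   # differs from the sequence above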

 ----------------------------------------------------------------------
Requests from within US, Canada, and Mexico:

    The technical report with figures has been/will soon be placed in the
account kindly provided by Ohio State.  Here is the instruction to get
the files:

        ftp cheops.cis.ohio-state.edu    (or, ftp 128.146.8.62)
        Name: anonymous
        Password: neuron
        ftp> cd pub/neuroprose
        ftp> mget tom.hystery*    (type y and hit return)
        ftp> quit
        unix> uncompress tom.hystery*.Z
        unix> lpr -P(your_postscript_printer) tom.hystery.ps
        unix> lpr -P(your_Mac_laserwriter) tom.hystery_figs.ps

Please contact mdtom@ecn.purdue.edu for technical difficulties.

 ----------------------------------------------------------------------
Requests from outside North America:

    The technical report is available at a cost of US$22.39 per copy,
postage included.  Please make checks payable to Purdue University in US
dollars.  You may send your requests, checks, and full first class mail
address to:

        J. L. Dixon
        School of Electrical Engineering
        Purdue University
        West Lafayette, Indiana 47907
        USA

Please mention the technical report number: TR-EE 90-63.

 ----------------------------------------------------------------------
          The Hystery Unit - A Short Term Memory Model
                    for Computational Neurons

                          M. Daniel Tom
                     Manoel Fernando Tenorio

           Parallel Distributed Structures Laboratory
                School of Electrical Engineering
                        Purdue University
               West Lafayette, Indiana  47907, USA

                         December, 1990

Abstract: In this paper, a model of short term memory is introduced.
This model is inspired by the transient behavior of neurons and by
magnetic storage as a form of memory.  The transient response of a
neuron is hypothesized to be a combination of a pair of sigmoids, and a
relation is drawn to the hysteresis loop found in magnetic materials.  A
model is created as a composition of two coupled families of curves.
Two theorems are derived regarding the asymptotic convergence behavior
of the model.  A further conjecture states that the model retains full
memory of all past unit step inputs.

------------------------------

Subject: tech report: continuous spatial automata
From:    mclennan@cs.utk.edu
Date:    Wed, 23 Jan 91 16:28:54 -0500

The following technical report is now available:

                   Continuous Spatial Automata

                         B. J. MacLennan

                 Department of Computer Science
                     University of Tennessee
                    Knoxville, TN 37996-1301
                      maclennan@cs.utk.edu

                            CS-90-121
                        November 26, 1990

                            ABSTRACT

A _continuous_spatial_automaton_ is analogous to a cellular automaton,
except that the cells form a continuum, as do the possible states of the
cells.  After an informal mathematical description of spatial automata,
we describe in detail a continuous analog of Conway's ``Life,'' and show
how the automaton can be implemented using the basic operations of field
computation.

     Typically a cellular automaton has a finite (sometimes denumerably
infinite) set of cells, often arranged in a one- or two-dimensional
array. Each cell can be in one of a number of states.  In contrast, a
continuous spatial automaton has a one-, two- or higher-dimensional
continuum of _loci_ (corresponding to cells), each of which
has a state drawn from a continuum (typically [0,1]).  The state is
required to vary continuously with the locus.  In a cellular automaton
there is a transition function that determines the state of a cell at the
next time step based on its own state and the states of a finite number of neighbors
at the current time step.  A discrete-time spatial automaton is very
similar: the future state of a locus is a continuous function of the
states of the loci in a (closed or open) bounded neighborhood of the
given locus.
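
As a crude numerical illustration (a discretized toy, not MacLennan's
field-computation formulation and not the transition rule of the
report), one can approximate a continuous spatial automaton on a grid
with states in [0,1] and a smooth, Life-like rule applied to the
neighborhood average.  The grid size, neighborhood and band parameters
below are arbitrary choices for this sketch.

import numpy as np

def step(grid, low=0.25, high=0.45, width=0.06):
    # average of the 8 surrounding sites (toroidal wrap-around)
    neigh = sum(np.roll(np.roll(grid, i, axis=0), j, axis=1)
                for i in (-1, 0, 1) for j in (-1, 0, 1)) - grid
    neigh /= 8.0
    # smooth "alive when the neighborhood average falls in a band" rule
    up = 1.0 / (1.0 + np.exp(-(neigh - low) / width))
    down = 1.0 / (1.0 + np.exp(-(neigh - high) / width))
    return np.clip(up - down, 0.0, 1.0)

grid = np.random.default_rng(0).random((64, 64))
for _ in range(100):
    grid = step(grid)
print(grid.mean())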

The report is available as a compressed postscript file in the
pub/neuroprose subdirectory; it may be obtained with the Getps script:

     Getps maclennan.csa.ps.Z

For HARDCOPY send your address to:  library@cs.utk.edu

For other correspondence:

Bruce MacLennan
Department of Computer Science
107 Ayres Hall
The University of Tennessee
Knoxville, TN 37996-1301

(615)974-0994/5067
maclennan@cs.utk.edu

------------------------------

Subject: Learning algorithms for oscillatory networks
From:    prowat@UCSD.EDU (Peter Rowat)
Date:    Wed, 23 Jan 91 16:19:15 -0800


The following preprint is now available by ftp from neuroprose:

Peter Rowat and Allen Selverston (1990). Learning algorithms for 
        oscillatory networks with gap junctions and membrane currents.
        To appear in: NETWORK: Computation in Neural Systems, Volume 2, 
        Issue 1, February 1991. 
  
Abstract:

One of the most important problems for studying neural network models is
the adjustment of parameters. Here we show how to formulate the problem
as the minimization of the difference between two limit cycles. The
backpropagation method for learning algorithms is described as the
application of gradient descent to an error function that computes this
difference. A mathematical formulation is given that is applicable to any
type of network model, and applied to several models. The standard
connectionist model of a neuron is extended to allow gap junctions
between cells and to include membrane currents. Learning algorithms are
derived for a two cell network with a single gap junction, and for a pair
of mutually inhibitory neurons each having a simplified membrane current.

        For example, when learning in a network in which all cells have a
common, adjustable, bias current, the value of the bias is adjusted at a
rate proportional to the difference between the sum of the target outputs
and the sum of the actual outputs.  When learning in a network of n cells
where a target output is given for every cell, the learning algorithm
splits into n independent learning algorithms, one per cell.  For
networks containing gap junctions, a gap junction is modelled as a
conductance times the potential difference between the two adjacent
cells.  The requirement that a conductance g must be positive is enforced
by replacing g by a function pos(g*) whose value is always positive, for
example exp(0.1 g*), and deriving an algorithm that adjusts the parameter
g* in place of g.  When target output is specified for every cell in a
network with gap junctions, the learning algorithm splits into fewer
independent components, one for each gap-connected subset of the network.
The learning algorithm for a gap-connected set of cells cannot be
parallelized further.  As a final example, a learning algorithm is
derived for a mutually inhibitory two-cell network in which each cell has
a membrane current.
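
The positivity trick just described is easy to state concretely.  The
sketch below (plain Python, with an arbitrary stand-in value for the
gradient of the limit-cycle error function) only illustrates how the
chain rule converts a gradient with respect to the conductance g into
one with respect to the unconstrained parameter g*:

import math

def pos(g_star):
    return math.exp(0.1 * g_star)            # always positive

def update_g_star(g_star, dE_dg, rate=0.01):
    # dE/dg* = dE/dg * d pos/d g* = dE/dg * 0.1 * exp(0.1 * g*)
    dE_dg_star = dE_dg * 0.1 * pos(g_star)
    return g_star - rate * dE_dg_star

g_star = 0.0                                 # so g = pos(0) = 1.0
g_star = update_g_star(g_star, dE_dg=2.5)    # g shrinks, stays positive
print(pos(g_star))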

        This generalized approach to backpropagation allows one to derive
a learning algorithm for almost any model neural network given in terms
of differential equations. It is one solution to the problem of parameter
adjustment in small but complex network models.

 ---------------------------------------------------------------------------
Copies of the postscript file rowat.learn-osc.ps.Z may be obtained from the
pub/neuroprose directory in cheops.cis.ohio-state.edu. Either use the
Getps script or do this:

unix-1> ftp cheops.cis.ohio-state.edu          # (or ftp 128.146.8.62)
Connected to cheops.cis.ohio-state.edu.
Name (cheops.cis.ohio-state.edu:): anonymous
331 Guest login ok, send ident as password.
Password: neuron
230 Guest login ok, access restrictions apply.
ftp> cd pub/neuroprose
ftp> binary
ftp> get rowat.learn-osc.ps.Z
ftp> quit
unix-2> uncompress rowat.learn-osc.ps.Z
unix-3> lpr -P(your_local_postscript_printer) rowat.learn-osc.ps
(The file starts with 7 bitmapped figures which are slow to print.)

------------------------------

Subject: TR available from neuroprose; feedforward nets
From:    Eduardo Sontag <sontag@hilbert.RUTGERS.EDU>
Date:    Thu, 24 Jan 91 16:16:17 -0500

I have deposited in the neuroprose archive the extended version of my
NIPS-90 Proceedings paper.  The title is:

   "FEEDFORWARD NETS FOR INTERPOLATION AND CLASSIFICATION"

and the abstract is:

 "This paper deals with single-hidden-layer feedforward nets, studying various
 measures of classification power and interpolation capability.  Results are
 given showing that direct input to output connections in threshold nets double
 the recognition but not the interpolation power, while using sigmoids rather
 than thresholds allows (at least) doubling both."

(NOTE: This is closely related to report SYCON-90-03, which was put in
the archive last year under the title "sontag.capabilities.ps.Z".  There
is no point in retrieving it unless you found the other paper of
interest.  The current paper basically adds a few results on
interpolation.)

 -eduardo

 -----------------------------------------------------------------------------

To obtain copies of the postscript file, please use Jordan Pollack's service:

Example:
unix> ftp cheops.cis.ohio-state.edu          # (or ftp 128.146.8.62)
Name (cheops.cis.ohio-state.edu:): anonymous
Password (cheops.cis.ohio-state.edu:anonymous): <ret>
ftp> cd pub/neuroprose
ftp> binary
ftp> get
(remote-file) sontag.nips90.ps.Z
(local-file) sontag.nips90.ps.Z
ftp> quit
unix> uncompress sontag.nips90.ps.Z
unix> lpr -P(your_local_postscript_printer) sontag.nips90.ps

 ----------------------------------------------------------------------------
If you have any difficulties with the above, please send e-mail to
sontag@hilbert.rutgers.edu.   DO NOT "reply" to this message, please.

NOTES about FTP'ing, etc:

(1) The last time I posted something, I forgot to include the ".Z" in the
    file name in the above "remote-file" line, and I received many
    messages telling me that FTP didn't find the file.  Sorry for that.
    Please note that most files in the archive are compressed, and people
    may forget to mention the ".Z".

(2) I also received some email (and saw much discussion in a bboard)
    concerning the printer errors with the file.  Please note that
    postscript files sometimes require a fair amount of memory from the
    printer, especially if they contain illustrations, and many smaller
    printers do not have enough memory.  This may result in some pages
    not being printed, or the print job not being done at all.  If you
    experience this problem with papers you retrieve (mine or from
    others), I suggest that you ask the author to email you a source file
    (e.g. LaTeX) or a postscript file sans figures.  Also, some
    postscript files are "nonconforming", and this may cause problems
    with certain printers.  

------------------------------

Subject: Abstract - Backpropagation Learning in Expert Networks
From:    Chris Lacher <lacher@lambda.cs.fsu.edu>
Date:    Thu, 24 Jan 91 16:16:45 -0500




             Backpropagation Learning in Expert Networks

                                 by

        R. C. Lacher, Susan I. Hruska, and David C. Kuncicky
                  Department of Computer Science
                     Florida State University


ABSTRACT.  Expert networks are event-driven, acyclic networks of neural
objects derived from expert systems.  The neural objects process
information through a non-linear combining function that is different
from, and more complex than, typical neural network node processors.  We
develop backpropagation learning for acyclic, event-driven nets in
general and derive a specific algorithm for learning in EMYCIN-derived
expert networks.  The algorithm combines backpropagation learning with
other features of expert nets, including calculation of gradients of the
non-linear combining functions and the hypercube nature of the knowledge
space.  Results of testing the learning algorithm with a medium-scale (97
node) expert network are presented.
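
For orientation, the non-linear combining function of an EMYCIN-style
node is usually quoted as the certainty-factor combination below (the
textbook EMYCIN rule; the paper's exact formulation, and the gradients
used in the learning algorithm, are in the preprint itself):

def combine_cf(cf1, cf2):
    # combine two certainty factors in [-1, 1], EMYCIN style
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

print(combine_cf(0.6, 0.5))    # 0.80: agreeing evidence reinforces
print(combine_cf(0.6, -0.5))   # 0.20: conflicting evidence partly cancels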


For a copy of this preprint send an email request with your (snail)MAIL
ADDRESS and the TITLE of the preprint to: santan@nu.cs.fsu.edu


                                             --- Chris Lacher


------------------------------

Subject: tech rep on overfitting, decision theory, PAC learning, and...
From:    David Haussler <haussler@saturn.ucsc.edu>
Date:    Fri, 25 Jan 91 17:31:37 -0800

                       TECHNICAL REPORT AVAILABLE

           Decision Theoretic Generalizations of the PAC Model
             for Neural Net and Other Learning Applications

                             David Haussler
                             UCSC-CRL-91-02
                             September, 1989
                         Revised: December, 1990
                        haussler@saturn.ucsc.edu
     Baskin Center for Computer Engineering and Information Sciences
             University of California, Santa Cruz, CA 95064

Abstract:

We describe a generalization of the PAC learning model that is based on
statistical decision theory. In this model the learner receives randomly
drawn examples, each example consisting of an instance $x \in X$ and an
outcome $y \in Y$, and tries to find a hypothesis $h : X \rightarrow A$,
where $h \in {\cal H}$, that specifies the appropriate action $a \in A$ to
take for each instance $x$, in order to minimize the expectation of a
loss $L(y,a)$. Here $X$, $Y$, and $A$ are arbitrary sets, $L$ is a
real-valued function, and examples are generated according to an
arbitrary joint distribution on $X \times Y$.  Special cases include the
problem of learning a function from $X$ into $Y$, the problem of learning
the conditional probability distribution on $Y$ given $X$ (regression),
and the problem of learning a distribution on $X$ (density estimation).

We give theorems on the uniform convergence of empirical loss estimates
to true expected loss rates for certain hypothesis spaces ${\cal H}$, and show
how this implies learnability with bounded sample size, disregarding
computational complexity. As an application, we give
distribution-independent upper bounds on the sample size needed for
learning with feedforward neural networks.  Our theorems use a
generalized notion of VC dimension that applies to classes of real-valued
functions, adapted from Pollard's work, and a notion of {\em capacity}
and {\em metric dimension} for classes of functions that map into a
bounded metric space.
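
In the notation of the abstract, the two quantities whose uniform
convergence is at issue are the expected loss and its empirical estimate
on a sample of size $m$ (a standard paraphrase of the setup, not a
quotation from the report):

$$ E(h) = \mathbf{E}\,[\,L(y, h(x))\,], \qquad
   \hat{E}_m(h) = \frac{1}{m} \sum_{i=1}^{m} L(y_i, h(x_i)), $$

and the uniform convergence theorems bound

$$ \Pr\Bigl\{\, \exists\, h \in {\cal H} :\ |\hat{E}_m(h) - E(h)| > \epsilon \,\Bigr\} $$

in terms of $m$, $\epsilon$, and the capacity or generalized VC
dimension of ${\cal H}$.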


The report can be retrieved by anonymous ftp from the UCSC Tech report
library. An example follows:

unix> ftp midgard.ucsc.edu       # (or ftp 128.114.134.15)
Connected ... 
Name (...): anonymous
Password: yourname@cs.anyuniversity.edu  (i.e. your email address) 
        (Please use your email address so we can correspond with you.)
Guest login ok, access restrictions apply.
ftp> cd pub/tr
ftp> binary
ftp> get ucsc-crl-91-02.ps.Z
200 PORT command successful.
150 Opening BINARY mode data connection for ucsc-crl-91-02.ps.Z (576429 bytes).
226 Transfer complete.
local: ucsc-crl-91-02.ps.Z remote: ucsc-crl-91-02.ps.Z
576429 bytes received in 10 seconds (70 Kbytes/s)
ftp> quit
unix> uncompress ucsc-crl-91-02.ps.Z
unix> lpr -P(your_local_postscript_printer) ucsc-crl-91-02.ps

(Note: you will need a printer with a large amount of memory.  Also,
some other UCSC tech reports are available, and more will be added soon;
ftp the file INDEX to see what's there.)

If you have any difficulties with the above, please send e-mail to
jean@cis.ucsc.edu.  DO NOT "reply" to this message, please.

                                                                                 -David


------------------------------

Subject: preprint - Dynamics of Generalization in Linear Perceptrons
From:    hertz@nordita.dk
Date:    Mon, 28 Jan 91 11:04:05 +0000

The following technical report has been placed in the neuroprose archives
at Ohio State University:
                
        Dynamics of Generalization in Linear Perceptrons

                Anders Krogh        John Hertz
            Niels Bohr Institut      Nordita

                            Abstract:

We study the evolution of the generalization ability of a simple linear
perceptron with N inputs which learns to imitate a ``teacher
perceptron''.  The system is trained on p = \alpha N binary example
inputs and the generalization ability measured by testing for agreement
with the teacher on all 2^N possible binary input patterns.  The dynamics
may be solved analytically and exhibits a phase transition from imperfect
to perfect generalization at \alpha = 1.  Except at this point the
generalization ability approaches its asymptotic value exponentially,
with critical slowing down near the transition; the relaxation time is
\propto (1-\sqrt{\alpha})^{-2}.  Right at the critical point, the
approach to perfect generalization follows a power law \propto t^{-1/2}.
In the presence of noise, the generalization ability is degraded by an
amount \propto (\sqrt{\alpha}-1)^{-1} just above \alpha = 1.
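
A quick numerical companion (an illustration only: the paper solves the
dynamics analytically) is to train a linear student on p = \alpha N
random binary examples of a random linear teacher by plain gradient
descent, watching |w - w_teacher|^2 / N, used here as a simple proxy for
the generalization error, relax toward its asymptote:

import numpy as np

rng = np.random.default_rng(1)
N, alpha, eta, steps = 100, 0.5, 0.05, 2001

w_teacher = rng.normal(size=N)
p = int(alpha * N)
X = rng.choice([-1.0, 1.0], size=(p, N))     # binary example inputs
y = X @ w_teacher                            # teacher outputs (linear)

w = np.zeros(N)
for t in range(steps):
    grad = X.T @ (X @ w - y) / p             # squared-error gradient
    w -= eta * grad
    if t % 500 == 0:
        print(t, np.sum((w - w_teacher) ** 2) / N)

In this toy, with \alpha below 1 the proxy levels off at a nonzero
value, while \alpha above 1 lets it relax toward zero, in line with the
transition described above.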

This paper will appear in the NIPS-90 proceedings.  To retrieve it by
anonymous ftp, do the following:

unix> ftp cheops.cis.ohio-state.edu          # (or ftp 128.146.8.62)
Name (cheops.cis.ohio-state.edu:): anonymous
Password (cheops.cis.ohio-state.edu:anonymous): <ret>
ftp> cd pub/neuroprose
ftp> binary
ftp> get krogh.generalization.ps.Z
ftp> quit
unix> uncompress krogh.generalization.ps
unix> lpr -P(your_local_postscript_printer) krogh.generalization.ps


An old-fashioned paper preprint version is also available -- send
requests to
                hertz@nordita.dk
or

John Hertz
Nordita
Blegdamsvej 17
DK-2100 Copenhagen
Denmark

------------------------------

End of Neuron Digest [Volume 7 Issue 7]
***************************************