neuron-request@HPLMS2.HPL.HP.COM ("Neuron-Digest Moderator Peter Marvit") (05/23/91)
Neuron Digest   Wednesday, 22 May 1991   Volume 7 : Issue 29

Today's Topics:
            New FKI-Report - An O(N^3) Learning Algorithm
            Preprint: building sensory-motor hierarchies
            Two new Tech Reports
            TR - Kohonen Feature Maps in Natural Language Processing
            Paper Available: RAAM
            New ICSI TR on incremental learning
            New Bayesian work
            TR - Learning the past tense in a recurrent network

Send submissions, questions, address maintenance and requests for old issues
to "neuron-request@hplabs.hp.com" or "{any backbone,uunet}!hplabs!neuron-request".
Use "ftp" to get old issues from hplpm.hpl.hp.com (15.255.176.205).

------------------------------------------------------------

Subject: New FKI-Report - An O(N^3) Learning Algorithm
From:    Juergen Schmidhuber <schmidhu@informatik.tu-muenchen.dbp.de>
Date:    06 May 91 12:43:22 +0200

Here is another one:

=---------------------------------------------------------------------

      AN O(n^3) LEARNING ALGORITHM FOR FULLY RECURRENT NETWORKS

                        Juergen Schmidhuber
              Technical Report FKI-151-91, May 6, 1991

The fixed-size storage learning algorithm for fully recurrent continually
running networks (e.g. (Robinson + Fallside, 1987), (Williams + Zipser,
1988)) requires O(n^4) computations per time step, where n is the number of
non-input units. We describe a method which computes exactly the same
gradient and requires fixed-size storage of the same order as the previous
algorithm, but whose average time complexity per time step is O(n^3).

=---------------------------------------------------------------------

To obtain a copy, do:

unix> ftp 131.159.8.35
Name: anonymous
Password: your name, please
ftp> binary
ftp> cd pub/fki
ftp> get fki151.ps.Z
ftp> bye
unix> uncompress fki151.ps.Z
unix> lpr fki151.ps

Please do not forget to leave your name (instead of your email address).

NOTE: fki151.ps is designed for European A4 paper format (20.9cm x 29.6cm).

In case of ftp problems, send email to schmidhu@informatik.tu-muenchen.de
or contact:

Juergen Schmidhuber
Institut fuer Informatik, Technische Universitaet Muenchen
Arcisstr. 21
8000 Muenchen 2
GERMANY

------------------------------

Subject: Preprint: building sensory-motor hierarchies
From:    Mark Ring <ring@cs.utexas.edu>
Date:    Wed, 08 May 91 16:16:31 -0500

Recently there's been some interest on this mailing list regarding neural
net hierarchies for sequence "chunking". I've placed a relevant paper in the
Neuroprose Archive for public ftp. This is a (very slightly extended) copy
of a paper to be published in the Proceedings of the Eighth International
Workshop on Machine Learning.

The paper summarizes the results to date of work begun a year and a half ago
to create a system that automatically and incrementally constructs
hierarchies of behaviors in neural nets. The purpose of the system is to
develop continuously through the encapsulation, or "chunking," of learned
behaviors.

=----------------------------------------------------------------------

     INCREMENTAL DEVELOPMENT OF COMPLEX BEHAVIORS THROUGH AUTOMATIC
              CONSTRUCTION OF SENSORY-MOTOR HIERARCHIES

                              Mark Ring
                    University of Texas at Austin

This paper addresses the issue of continual, incremental development of
behaviors in reactive agents. The reactive agents are neural-network based
and use reinforcement learning techniques. A continually developing system
is one that is constantly capable of extending its repertoire of behaviors.
An agent increases its repertoire of behaviors in order to improve its
performance in, and understanding of, its environment.
Continual development requires an unlimited growth potential; that is, it
requires a system that can constantly augment current behaviors with new
behaviors, perhaps using the current ones as a foundation for those that
come next. It also requires a process for organizing behaviors in meaningful
ways and a method for assigning credit properly to sequences of behaviors,
where each behavior may itself be an arbitrarily long sequence. The solution
proposed here is hierarchical and bottom-up. I introduce a new kind of
neuron (termed a ``bion''), whose characteristics permit it to be
automatically constructed into sensory-motor hierarchies as determined by
experience. The bion is being developed to resolve the problems of
incremental growth, temporal history limitation, network organization, and
credit assignment among component behaviors.

A longer, more detailed paper will be announced shortly.

=----------------------------------------------------------------------

Instructions to retrieve the paper by ftp (no hard copies are available at
this time):

% ftp cheops.cis.ohio-state.edu    (or 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get ring.ml91.ps.Z
ftp> bye
% uncompress ring.ml91.ps.Z
% lpr -P(your_postscript_printer) ring.ml91.ps

=----------------------------------------------------------------------

DO NOT "reply" DIRECTLY TO THIS MESSAGE! If you have any questions or
difficulties, please send e-mail to ring@cs.utexas.edu, or send mail to:

Mark Ring
Department of Computer Sciences
Taylor 2.124
University of Texas at Austin
Austin, TX 78712

------------------------------

Subject: Two new Tech Reports
From:    Scott.Fahlman@SEF1.SLISP.CS.CMU.EDU
Date:    Mon, 13 May 91 13:31:04 -0400

The following two tech reports have been placed in the neuroprose database
at Ohio State. Instructions for accessing them via anonymous FTP are
included at the end of this message. (Maybe everyone should copy down these
instructions once and for all so that we can stop repeating them with each
announcement.)

=---------------------------------------------------------------------------

Tech Report CMU-CS-91-100

             The Recurrent Cascade-Correlation Architecture

                           Scott E. Fahlman

Recurrent Cascade-Correlation (RCC) is a recurrent version of the
Cascade-Correlation learning architecture of Fahlman and Lebiere
\cite{fahlman:cascor}. RCC can learn from examples to map a sequence of
inputs into a desired sequence of outputs. New hidden units with recurrent
connections are added to the network one at a time, as they are needed
during training. In effect, the network builds up a finite-state machine
tailored specifically for the current problem. RCC retains the advantages of
Cascade-Correlation: fast learning, good generalization, automatic
construction of a near-minimal multi-layered network, and the ability to
learn complex behaviors through a sequence of simple lessons. The power of
RCC is demonstrated on two tasks: learning a finite-state grammar from
examples of legal strings, and learning to recognize characters in Morse
code.

Note: This TR is essentially the same as the paper of the same name in the
NIPS 3 proceedings (due to appear very soon). The TR version includes some
additional experimental data and a few explanatory diagrams that had to be
cut in the NIPS version.
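For readers who want a concrete feel for the unit-by-unit construction
described above, the following short numpy sketch is a loose illustration
rather than Fahlman's implementation: a single candidate unit with a
self-recurrent weight is trained to maximize the covariance between its
activation and the residual error of the existing network (here reduced to a
bias-only predictor), then frozen and handed to a re-fitted output layer.
The toy task, the covariance score standing in for the full multi-output
correlation score, and all variable names are assumptions made for brevity.

import numpy as np

rng = np.random.default_rng(0)
T = 400
x = rng.integers(0, 2, size=(T, 1)).astype(float)   # binary input sequence

# Toy target with temporal structure: an exponentially decaying trace of
# past inputs (something a single self-recurrent unit can capture).
y = np.zeros(T)
for t in range(T):
    y[t] = 0.5 * (y[t - 1] if t else 0.0) + 0.5 * x[t, 0]

def unit_response(w, r, x):
    """v_t = tanh(w.x_t + r*v_{t-1}), with exact derivatives dv/dw and dv/dr."""
    steps, n = x.shape
    v = np.zeros(steps)
    dv_dw = np.zeros((steps, n))
    dv_dr = np.zeros(steps)
    v_prev, dw_prev, dr_prev = 0.0, np.zeros(n), 0.0
    for t in range(steps):
        v[t] = np.tanh(w @ x[t] + r * v_prev)
        g = 1.0 - v[t] ** 2
        dv_dw[t] = g * (x[t] + r * dw_prev)
        dv_dr[t] = g * (v_prev + r * dr_prev)
        v_prev, dw_prev, dr_prev = v[t], dv_dw[t], dv_dr[t]
    return v, dv_dw, dv_dr

# The "network so far" is reduced to a bias-only output; e is its residual.
e = y - y.mean()
print("baseline MSE (no hidden units):", np.mean(e ** 2))

# Candidate phase: gradient ascent on the covariance between the candidate
# unit's activation and the residual error (a simplified stand-in for the
# Cascade-Correlation score; e already has zero mean, so e @ dv/dw is the
# exact covariance gradient).
w, r, lr = rng.normal(scale=0.1, size=1), 0.0, 0.5
for _ in range(200):
    v, dv_dw, dv_dr = unit_response(w, r, x)
    w = w + lr * (e @ dv_dw) / T
    r = r + lr * (e @ dv_dr) / T

# Install phase: freeze the unit and refit a linear output layer on [1, v].
v, _, _ = unit_response(w, r, x)
A = np.column_stack([np.ones(T), v])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print("MSE with one recurrent unit:", np.mean((A @ coef - y) ** 2))

In the full algorithm a pool of candidate units is trained in parallel, the
score sums the magnitudes of the correlations over all outputs, and each
installed unit receives connections from the inputs and from all previously
installed units; the sketch keeps only the train, freeze, and install cycle.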
=---------------------------------------------------------------------------

Tech Report CMU-CS-91-130

           Learning with Limited Numerical Precision Using the
                      Cascade-Correlation Algorithm

                  Markus Hoehfeld and Scott E. Fahlman

A key question in the design of specialized hardware for simulation of
neural networks is whether fixed-point arithmetic of limited numerical
precision can be used with existing learning algorithms. We present an
empirical study of the effects of limited precision in Cascade-Correlation
networks on three different learning problems. We show that learning can
fail abruptly as the precision of network weights or weight-update
calculations is reduced below 12 bits. We introduce techniques for dynamic
rescaling and probabilistic rounding that allow reliable convergence down to
6 bits of precision, with only a gradual reduction in the quality of the
solutions.

Note: The experiments described here were conducted during a visit by Markus
Hoehfeld to Carnegie Mellon in the fall of 1990. Markus Hoehfeld's permanent
address is Siemens AG, ZFE IS INF 2, Otto-Hahn-Ring 6, W-8000 Munich 83,
Germany.

=---------------------------------------------------------------------------

To access these tech reports in postscript form via anonymous FTP, do the
following:

unix> ftp cheops.cis.ohio-state.edu    (or, ftp 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get <filename.ps.Z>
ftp> quit
unix> uncompress <filename.ps.Z>
unix> lpr <filename.ps>    (use whatever flag your printer needs for Postscript)

The TRs described above are stored as "fahlman.rcc.ps.Z" and
"hoehfeld.precision.ps.Z". Older reports "fahlman.quickprop-tr.ps.Z" and
"fahlman.cascor-tr.ps.Z" may also be of interest.

Your local version of ftp and other unix utilities may be different. Consult
your local system wizards for details.

=---------------------------------------------------------------------------

Hardcopy versions are now being printed and will be available soon, but
because of the high demand and tight budget, our school has (reluctantly)
instituted a charge for mailing out tech reports in hardcopy: $3 per copy
within the U.S. and $5 per copy elsewhere, and the payment must be in U.S.
dollars. To order hardcopies, contact:

Ms. Catherine Copetas
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
U.S.A.

------------------------------

Subject: TR - Kohonen Feature Maps in Natural Language Processing
From:    SCHOLTES@ALF.LET.UVA.NL
Date:    Wed, 15 May 91 13:42:00 +0700

TR Available on Recurrent Self-Organization in NLP:

          Kohonen Feature Maps in Natural Language Processing

                             J.C. Scholtes
                        University of Amsterdam

Main points:
  - showing the possibilities of Kohonen feature maps in symbolic
    applications by pushing self-organization;
  - showing a different technique in Connectionist NLP by using only
    (unsupervised) self-organization.

Although the model is tested in an NLP context, the linguistic aspects of
these experiments are probably less interesting than the connectionist
ones. People requesting a copy should be aware of this.

Abstract

In the 1980s, backpropagation (BP) started the connectionist bandwagon in
Natural Language Processing (NLP). Although initial results were good, some
critical remarks must be made about the blind application of BP. Most such
systems add contextual and semantic features manually by structuring the
input set. Moreover, these models form a small subset of the brain
structures known from the neural sciences.
They do not adapt smoothly to a changing environment and can only learn
input/output pairs. Although these disadvantages of the backpropagation
algorithm are commonly known and accepted, other, more plausible learning
algorithms, such as unsupervised learning techniques, are still rare in the
field of NLP. The main reason is the rapidly increasing complexity of
unsupervised learning methods when they are applied in the already complex
field of NLP. However, recent efforts implementing unsupervised language
learning have been made, resulting in interesting conclusions (Elman and
Ritter).

Building on this earlier work, a recurrent self-organizing model (based on
an extension of the Kohonen feature map), capable of deriving contextual
(and some semantic) information from scratch, is presented in detail. The
model implements a first step towards an overall unsupervised language
learning system. Simple linguistic tasks such as single-word clustering
(representation on the map), syntactical group formation, derivation of
contextual structures, string prediction, grammatical correctness checking,
word sense disambiguation and structure assignment are carried out in a
number of experiments. The performance of the model is at least as good as
that achieved with recurrent backpropagation, and on some points even better
(e.g. unsupervised derivation of word classes and syntactical structures).

Although preliminary, the first results are promising and show possibilities
for other, even more biologically inspired language processing techniques
such as real Hebbian, Genetic or Darwinistic models. Forthcoming research
must overcome limitations still present in the extended Kohonen model, such
as the absence of within-layer learning, restricted recurrence, the lack of
look-ahead functions (absence of distributed or unsupervised buffering
mechanisms) and limited support for a larger number of layers.

A copy can be obtained by sending an email message to
SCHOLTES@ALF.LET.UVA.NL. Please indicate whether you want a hard copy or a
postscript file sent to you.

------------------------------

Subject: Paper Available: RAAM
From:    doug blank <blank@copper.ucs.indiana.edu>
Date:    Wed, 15 May 91 15:11:09 -0500

    Exploring the Symbolic/Subsymbolic Continuum: A Case Study of RAAM

             Douglas S. Blank (blank@iuvax.cs.indiana.edu)
             Lisa A. Meeden (meeden@iuvax.cs.indiana.edu)
             James B. Marshall (marshall@iuvax.cs.indiana.edu)

                          Indiana University
           Computer Science and Cognitive Science Departments

Abstract:

This paper is an in-depth study of the mechanics of recursive
auto-associative memory, or RAAM, an architecture developed by Jordan
Pollack. It is divided into three main sections: an attempt to place the
symbolic and subsymbolic paradigms on a common ground; an analysis of a
simple RAAM; and a description of a set of experiments performed on simple
"tarzan" sentences encoded by a larger RAAM.

We define the symbolic and subsymbolic paradigms as two opposing corners of
an abstract space of paradigms. This space, we propose, has roughly three
dimensions: representation, composition, and functionality. By defining the
differences in these terms, we are able to place actual models in the
paradigm space, and compare these models in somewhat common terms. As an
example of the subsymbolic corner of the space, we examine in detail the
RAAM architecture, representations, compositional mechanisms, and
functionality.
In conjunction with other simple feed-forward networks, we create detectors,
decoders and transformers which act holistically on the composed,
distributed, continuous subsymbolic representations created by a RAAM. These
tasks, although trivial for a symbolic system, are accomplished without the
need to decode a composite structure into its constituent parts, as symbolic
systems must do.

The paper can be found in the neuroprose archive as blank.raam.ps.Z; a
detailed example of how to retrieve the paper follows at the end of this
message. A version of the paper will also appear in your local bookstores as
a chapter in "Closing the Gap: Symbolism vs Connectionism," J. Dinsmore,
editor; LEA, publishers, 1992.

=----------------------------------------------------------------------------

% ftp cheops.cis.ohio-state.edu
Connected to cheops.cis.ohio-state.edu.
220 cheops.cis.ohio-state.edu FTP server (Ver Tue May 9 14:01 EDT 1989) ready.
Name (cheops.cis.ohio-state.edu:): anonymous
331 Guest login ok, send ident as password.
Password: neuron
230 Guest login ok, access restrictions apply.
ftp> binary
200 Type set to I.
ftp> cd pub/neuroprose
250 CWD command successful.
ftp> get blank.raam.ps.Z
200 PORT command successful.
150 Opening BINARY mode data connection for blank.raam.ps.Z (173015 bytes).
226 Transfer complete.
local: blank.raam.ps.Z remote: blank.raam.ps.Z
173015 bytes received in 1.6 seconds (1e+02 Kbytes/s)
ftp> bye
221 Goodbye.
% uncompress blank.raam.ps.Z
% lpr blank.raam.ps

=----------------------------------------------------------------------------

------------------------------

Subject: New ICSI TR on incremental learning
From:    ethem@ICSI.Berkeley.EDU (Ethem Alpaydin)
Date:    Tue, 21 May 91 10:53:29 -0700

The following TR is available in postscript by anonymous net access at
icsi-ftp.berkeley.edu (128.32.201.55). Instructions to ftp and uncompress
follow the text. Hard copies may be requested by writing to either of the
addresses below:

ethem@icsi.berkeley.edu

Ethem Alpaydin
ICSI
1947 Center St. Suite 600
Berkeley CA 94704-1105 USA

=--------------------------------------------------------------------------

   GAL: Networks that grow when they learn and shrink when they forget

                             Ethem Alpaydin
                International Computer Science Institute
                               Berkeley, CA

                                TR 91-032

Abstract

Learning that is limited to the modification of some parameters has limited
scope; the capability to modify the system structure is also needed to widen
the range of what can be learned. In the case of artificial neural networks,
learning by iterative adjustment of synaptic weights can only succeed if the
network designer predefines an appropriate network structure, i.e., the
number of hidden layers and units, and the size and shape of their receptive
and projective fields. This paper advocates the view that the network
structure should not, as is usually done, be determined by trial and error
but should be computed by the learning algorithm. Incremental learning
algorithms can modify the network structure by addition and/or removal of
units and/or links. A survey of the current connectionist literature on this
line of thought is given. ``Grow and Learn'' (GAL) is a new algorithm that
learns an association in one shot because it is incremental and uses a local
representation. During the so-called ``sleep'' phase, units that were
previously stored but are no longer necessary due to recent modifications
are removed to minimize network complexity. The incrementally constructed
network can later be fine-tuned off-line to improve performance.
Another proposed method that greatly increases recognition accuracy is to
train a number of networks and vote over their responses. The algorithm and
its variants are tested on recognition of handwritten numerals and seem
promising, especially in terms of learning speed. This makes the algorithm
attractive for on-line learning tasks, e.g., in robotics. The biological
plausibility of incremental learning is also discussed briefly.

Keywords: Incremental learning, supervised learning, classification,
pruning, destructive methods, growth, constructive methods, nearest
neighbor.

=--------------------------------------------------------------------------

Instructions to ftp the above-mentioned TR (assuming you are under UNIX and
have a postscript printer; messages in parentheses indicate the system's
responses):

ftp 128.32.201.55
  (Connected to 128.32.201.55.
   220 icsi-ftp (icsic) FTP server (Version 5.60 local) ready.
   Name (128.32.201.55:ethem):)
anonymous
  (331 Guest login ok, send ident as password.
   Password:)
(your email address)
  (230 Guest login Ok, access restrictions apply.
   ftp>)
cd pub/techreports
  (250 CWD command successful.
   ftp>)
bin
  (200 Type set to I.
   ftp>)
get tr-91-032.ps.Z
  (200 PORT command successful.
   150 Opening BINARY mode data connection for tr-91-032.ps.Z (153915 bytes).
   226 Transfer complete.
   local: tr-91-032.ps.Z remote: tr-91-032.ps.Z
   153915 bytes received in 0.62 seconds (2.4e+02 Kbytes/s)
   ftp>)
quit
  (221 Goodbye.)

(back to Unix)
uncompress tr-91-032.ps.Z
lpr tr-91-032.ps

Happy reading, I hope you'll enjoy it.

------------------------------

Subject: New Bayesian work
From:    David MacKay <mackay@hope.caltech.edu>
Date:    Tue, 21 May 91 10:40:57 -0700

Two new papers available
=-----------------------

The papers that I presented at Snowbird this year are now available in the
neuroprose archives. The titles:

[1] Bayesian interpolation (14 pages)
[2] A practical Bayesian framework for backprop networks (11 pages)

The first paper describes and demonstrates recent developments in Bayesian
regularisation and model comparison. The second applies this framework to
backprop. The first paper is a prerequisite for understanding the second.

Abstracts and instructions for anonymous ftp follow. If you have problems
obtaining the files by ftp, feel free to contact me.

David MacKay
Office: (818) 397 2805
Fax:    (818) 792 7402
Email:  mackay@hope.caltech.edu
Smail:  Caltech 139-74, Pasadena, CA 91125

Abstracts
=--------

Bayesian interpolation
----------------------

Although Bayesian analysis has been in use since Laplace, the Bayesian
method of {\em model--comparison} has only recently been developed in depth.
In this paper, the Bayesian approach to regularisation and model--comparison
is demonstrated by studying the inference problem of interpolating noisy
data. The concepts and methods described are quite general and can be
applied to many other problems. Regularising constants are set by examining
their posterior probability distribution. Alternative regularisers (priors)
and alternative basis sets are objectively compared by evaluating the {\em
evidence} for them. `Occam's razor' is automatically embodied by this
framework. The way in which Bayes infers the values of regularising
constants and noise levels has an elegant interpretation in terms of the
effective number of parameters determined by the data set. This framework is
due to Gull and Skilling.
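To make the evidence machinery above concrete, here is a small numpy sketch
(a loose illustration, not code from the paper) of the framework applied to
interpolating noisy data with a polynomial basis: the regularising constant
alpha and the noise level beta are re-estimated from the effective number of
well-determined parameters, and alternative basis sets are then ranked by
their log evidence. The toy data, the choice of a polynomial basis, and all
names are assumptions made for the example.

import numpy as np

rng = np.random.default_rng(1)
N = 30
x = np.linspace(-1, 1, N)
y = np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=N)   # noisy data to interpolate

def evidence(degree, x, y, iters=50):
    """Fit a polynomial basis with a Gaussian weight prior (strength alpha)
    and Gaussian noise (precision beta); re-estimate alpha and beta, then
    return the log evidence for this basis set."""
    Phi = np.vander(x, degree + 1, increasing=True)     # basis functions
    N, k = Phi.shape
    alpha, beta = 1.0, 1.0
    for _ in range(iters):
        A = alpha * np.eye(k) + beta * Phi.T @ Phi      # posterior precision
        m = beta * np.linalg.solve(A, Phi.T @ y)        # posterior mean weights
        gamma = k - alpha * np.trace(np.linalg.inv(A))  # well-determined params
        alpha = gamma / (m @ m)                         # re-estimate prior strength
        beta = (N - gamma) / np.sum((y - Phi @ m) ** 2) # re-estimate noise level
    A = alpha * np.eye(k) + beta * Phi.T @ Phi
    m = beta * np.linalg.solve(A, Phi.T @ y)
    return (0.5 * k * np.log(alpha) + 0.5 * N * np.log(beta)
            - 0.5 * beta * np.sum((y - Phi @ m) ** 2) - 0.5 * alpha * (m @ m)
            - 0.5 * np.linalg.slogdet(A)[1] - 0.5 * N * np.log(2 * np.pi))

# Alternative basis sets (polynomial degrees) are compared by their evidence,
# not by their training error.
for d in range(1, 10):
    print(f"degree {d}: log evidence = {evidence(d, x, y):.1f}")

The alpha and beta updates are the standard Gull-style fixed-point
re-estimates; evaluating the evidence for each basis set is what allows the
objective comparison referred to in the abstract.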
A practical Bayesian framework for backprop networks
----------------------------------------------------

A quantitative and practical Bayesian framework is described for learning
mappings in feedforward networks. The framework makes possible: (1)
objective comparisons between solutions using alternative network
architectures; (2) objective stopping rules for deletion of weights; (3)
objective choice of magnitude and type of weight decay terms or additive
regularisers (for penalising large weights, etc.); (4) a measure of the
effective number of well--determined parameters in a model; (5) quantified
estimates of the error bars on network parameters and on network output; (6)
objective comparisons with alternative learning and interpolation models
such as splines and radial basis functions. The Bayesian `evidence'
automatically embodies `Occam's razor,' penalising over--flexible and
over--complex architectures. The Bayesian approach helps detect poor
underlying assumptions in learning models. For learning models well--matched
to a problem, a good correlation between generalisation ability and the
Bayesian evidence is obtained.

Instructions for obtaining copies by ftp from neuroprose:
=---------------------------------------------------------

unix> ftp cheops.cis.ohio-state.edu    # (or ftp 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get mackay.bayes-interpolation.ps.Z
ftp> get mackay.bayes-backprop.ps.Z
ftp> quit
unix> [then `uncompress' the files and lpr them.]

------------------------------

Subject: TR - Learning the past tense in a recurrent network
From:    Gary Cottrell <gary@cs.UCSD.EDU>
Date:    Tue, 21 May 91 18:27:56 -0700

The following paper will appear in the Proceedings of the Thirteenth Annual
Meeting of the Cognitive Science Society. It is now available in the
neuroprose archive as cottrell.cogsci91.ps.Z.

            Learning the past tense in a recurrent network:
            Acquiring the mapping from meaning to sounds

     Garrison W. Cottrell                  Kim Plunkett
     Computer Science Dept.                Inst. of Psychology
     UCSD                                  University of Aarhus
     La Jolla, CA                          Aarhus, Denmark

The performance of a recurrent neural network in mapping a set of plan
vectors, representing verb semantics, to associated sequences of phonemes,
representing the phonological structure of verb morphology, is investigated.
Several semantic representations are explored in an attempt to evaluate the
role of verb synonymy and homophony in determining the patterns of error
observed in the net's output performance. The model's performance offers
several unexplored predictions for developmental profiles of young children
acquiring English verb morphology.

To retrieve this from the neuroprose archive, type the following:

ftp 128.146.8.62
anonymous
<your netname here>
bi
cd pub/neuroprose
get cottrell.cogsci91.ps.Z
quit
uncompress cottrell.cogsci91.ps.Z
lpr cottrell.cogsci91.ps

Thanks again to Jordan Pollack for this great idea for net distribution.

gary cottrell   619-534-6640   Sec'y: 619-534-5288   FAX: 619-534-7029
Computer Science and Engineering C-014
UCSD, La Jolla, Ca. 92093
gary@cs.ucsd.edu (INTERNET)
{ucbvax,decvax,akgua,dcdwest}!sdcsvax!gary (USENET)
gcottrell@ucsd.edu (BITNET)

------------------------------

End of Neuron Digest [Volume 7 Issue 29]
****************************************