neuron-request@HPLMS2.HPL.HP.COM ("Neuron-Digest Moderator Peter Marvit") (05/23/91)
Neuron Digest   Wednesday, 22 May 1991   Volume 7 : Issue 29

Today's Topics:
            New FKI-Report - An O(N^3) Learning Algorithm
            Preprint: building sensory-motor hierarchies
            Two new Tech Reports
            TR - Kohonen Feature Maps in Natural Language Processing
            Paper Available: RAAM
            New ICSI TR on incremental learning
            New Bayesian work
            TR - Learning the past tense in a recurrent network

Send submissions, questions, address maintenance and requests for old issues
to "neuron-request@hplabs.hp.com" or "{any backbone,uunet}!hplabs!neuron-request".
Use "ftp" to get old issues from hplpm.hpl.hp.com (15.255.176.205).

------------------------------------------------------------

Subject: New FKI-Report - An O(N^3) Learning Algorithm
From:    Juergen Schmidhuber <schmidhu@informatik.tu-muenchen.dbp.de>
Date:    06 May 91 12:43:22 +0200

Here is another one:

=---------------------------------------------------------------------

      AN O(n^3) LEARNING ALGORITHM FOR FULLY RECURRENT NETWORKS

                        Juergen Schmidhuber
              Technical Report FKI-151-91, May 6, 1991

The fixed-size storage learning algorithm for fully recurrent continually
running networks (e.g. (Robinson + Fallside, 1987), (Williams + Zipser,
1988)) requires O(n^4) computations per time step, where n is the number of
non-input units. We describe a method which computes exactly the same
gradient and requires fixed-size storage of the same order as the previous
algorithm, but whose average time complexity per time step is O(n^3).

=---------------------------------------------------------------------

To obtain a copy, do:

unix> ftp 131.159.8.35
Name: anonymous
Password: your name, please
ftp> binary
ftp> cd pub/fki
ftp> get fki151.ps.Z
ftp> bye
unix> uncompress fki151.ps.Z
unix> lpr fki151.ps

Please do not forget to leave your name (instead of your email address).

NOTE: fki151.ps is designed for European A4 paper format (20.9cm x 29.6cm).

In case of ftp problems, send email to schmidhu@informatik.tu-muenchen.de
or contact:

Juergen Schmidhuber
Institut fuer Informatik, Technische Universitaet Muenchen
Arcisstr. 21
8000 Muenchen 2
GERMANY

------------------------------

Subject: Preprint: building sensory-motor hierarchies
From:    Mark Ring <ring@cs.utexas.edu>
Date:    Wed, 08 May 91 16:16:31 -0500

Recently there's been some interest on this mailing list regarding neural
net hierarchies for sequence "chunking". I've placed a relevant paper in the
Neuroprose Archive for public ftp. This is a (very slightly extended) copy
of a paper to be published in the Proceedings of the Eighth International
Workshop on Machine Learning.

The paper summarizes the results to date of work begun a year and a half ago
to create a system that automatically and incrementally constructs
hierarchies of behaviors in neural nets. The purpose of the system is to
develop continuously through the encapsulation, or "chunking," of learned
behaviors.

=----------------------------------------------------------------------

     INCREMENTAL DEVELOPMENT OF COMPLEX BEHAVIORS THROUGH AUTOMATIC
              CONSTRUCTION OF SENSORY-MOTOR HIERARCHIES

                              Mark Ring
                    University of Texas at Austin

This paper addresses the issue of continual, incremental development of
behaviors in reactive agents. The reactive agents are neural-network based
and use reinforcement learning techniques. A continually developing system
is one that is constantly capable of extending its repertoire of behaviors.
An agent increases its repertoire of behaviors in order to improve its
performance in, and understanding of, its environment.
Continual development requires an unlimited growth potential; that is, it
requires a system that can constantly augment current behaviors with new
behaviors, perhaps using the current ones as a foundation for those that
come next. It also requires a process for organizing behaviors in meaningful
ways and a method for assigning credit properly to sequences of behaviors,
where each behavior may itself be an arbitrarily long sequence. The solution
proposed here is hierarchical and bottom-up. I introduce a new kind of
neuron (termed a ``bion''), whose characteristics permit it to be
automatically constructed into sensory-motor hierarchies as determined by
experience. The bion is being developed to resolve the problems of
incremental growth, temporal history limitation, network organization, and
credit assignment among component behaviors.

A longer, more detailed paper will be announced shortly.

=----------------------------------------------------------------------

Instructions to retrieve the paper by ftp (no hard copies are available at
this time):

% ftp cheops.cis.ohio-state.edu    (or 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get ring.ml91.ps.Z
ftp> bye
% uncompress ring.ml91.ps.Z
% lpr -P(your_postscript_printer) ring.ml91.ps

=----------------------------------------------------------------------

DO NOT "reply" DIRECTLY TO THIS MESSAGE! If you have any questions or
difficulties, please send e-mail to ring@cs.utexas.edu, or send mail to:

Mark Ring
Department of Computer Sciences
Taylor 2.124
University of Texas at Austin
Austin, TX 78712

------------------------------

Subject: Two new Tech Reports
From:    Scott.Fahlman@SEF1.SLISP.CS.CMU.EDU
Date:    Mon, 13 May 91 13:31:04 -0400

The following two tech reports have been placed in the neuroprose database
at Ohio State. Instructions for accessing them via anonymous FTP are
included at the end of this message. (Maybe everyone should copy down these
instructions once and for all so that we can stop repeating them with each
announcement.)

=---------------------------------------------------------------------------

Tech Report CMU-CS-91-100

             The Recurrent Cascade-Correlation Architecture

                           Scott E. Fahlman

Recurrent Cascade-Correlation (RCC) is a recurrent version of the
Cascade-Correlation learning architecture of Fahlman and Lebiere
\cite{fahlman:cascor}. RCC can learn from examples to map a sequence of
inputs into a desired sequence of outputs. New hidden units with recurrent
connections are added to the network one at a time, as they are needed
during training. In effect, the network builds up a finite-state machine
tailored specifically for the current problem. RCC retains the advantages of
Cascade-Correlation: fast learning, good generalization, automatic
construction of a near-minimal multi-layered network, and the ability to
learn complex behaviors through a sequence of simple lessons. The power of
RCC is demonstrated on two tasks: learning a finite-state grammar from
examples of legal strings, and learning to recognize characters in Morse
code.

Note: This TR is essentially the same as the paper of the same name in the
NIPS 3 proceedings (due to appear very soon). The TR version includes some
additional experimental data and a few explanatory diagrams that had to be
cut in the NIPS version.
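For readers who want a concrete feel for the unit-by-unit construction
described above, the following short numpy sketch is a loose illustration
rather than Fahlman's implementation: a single candidate unit with a
self-recurrent weight is trained to maximize the covariance between its
activation and the residual error of the existing network (here reduced to a
bias-only predictor), then frozen and handed to a re-fitted output layer.
The toy task, the covariance score standing in for the full multi-output
correlation score, and all variable names are assumptions made for brevity.

import numpy as np

rng = np.random.default_rng(0)
T = 400
x = rng.integers(0, 2, size=(T, 1)).astype(float)   # binary input sequence

# Toy target with temporal structure: an exponentially decaying trace of
# past inputs (something a single self-recurrent unit can capture).
y = np.zeros(T)
for t in range(T):
    y[t] = 0.5 * (y[t - 1] if t else 0.0) + 0.5 * x[t, 0]

def unit_response(w, r, x):
    """v_t = tanh(w.x_t + r*v_{t-1}), with exact derivatives dv/dw and dv/dr."""
    steps, n = x.shape
    v = np.zeros(steps)
    dv_dw = np.zeros((steps, n))
    dv_dr = np.zeros(steps)
    v_prev, dw_prev, dr_prev = 0.0, np.zeros(n), 0.0
    for t in range(steps):
        v[t] = np.tanh(w @ x[t] + r * v_prev)
        g = 1.0 - v[t] ** 2
        dv_dw[t] = g * (x[t] + r * dw_prev)
        dv_dr[t] = g * (v_prev + r * dr_prev)
        v_prev, dw_prev, dr_prev = v[t], dv_dw[t], dv_dr[t]
    return v, dv_dw, dv_dr

# The "network so far" is reduced to a bias-only output; e is its residual.
e = y - y.mean()
print("baseline MSE (no hidden units):", np.mean(e ** 2))

# Candidate phase: gradient ascent on the covariance between the candidate
# unit's activation and the residual error (a simplified stand-in for the
# Cascade-Correlation score; e already has zero mean, so e @ dv/dw is the
# exact covariance gradient).
w, r, lr = rng.normal(scale=0.1, size=1), 0.0, 0.5
for _ in range(200):
    v, dv_dw, dv_dr = unit_response(w, r, x)
    w = w + lr * (e @ dv_dw) / T
    r = r + lr * (e @ dv_dr) / T

# Install phase: freeze the unit and refit a linear output layer on [1, v].
v, _, _ = unit_response(w, r, x)
A = np.column_stack([np.ones(T), v])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print("MSE with one recurrent unit:", np.mean((A @ coef - y) ** 2))

In the full algorithm a pool of candidate units is trained in parallel, the
score sums the magnitudes of the correlations over all outputs, and each
installed unit receives connections from the inputs and from all previously
installed units; the sketch keeps only the train, freeze, and install cycle.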
=---------------------------------------------------------------------------

Tech Report CMU-CS-91-130

           Learning with Limited Numerical Precision Using the
                      Cascade-Correlation Algorithm

                  Markus Hoehfeld and Scott E. Fahlman

A key question in the design of specialized hardware for simulation of
neural networks is whether fixed-point arithmetic of limited numerical
precision can be used with existing learning algorithms. We present an
empirical study of the effects of limited precision in Cascade-Correlation
networks on three different learning problems. We show that learning can
fail abruptly as the precision of network weights or weight-update
calculations is reduced below 12 bits. We introduce techniques for dynamic
rescaling and probabilistic rounding that allow reliable convergence down to
6 bits of precision, with only a gradual reduction in the quality of the
solutions.

Note: The experiments described here were conducted during a visit by Markus
Hoehfeld to Carnegie Mellon in the fall of 1990. Markus Hoehfeld's permanent
address is Siemens AG, ZFE IS INF 2, Otto-Hahn-Ring 6, W-8000 Munich 83,
Germany.

=---------------------------------------------------------------------------

To access these tech reports in postscript form via anonymous FTP, do the
following:

unix> ftp cheops.cis.ohio-state.edu    (or, ftp 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get <filename.ps.Z>
ftp> quit
unix> uncompress <filename.ps.Z>
unix> lpr <filename.ps>    (use whatever flag your printer needs for Postscript)

The TRs described above are stored as "fahlman.rcc.ps.Z" and
"hoehfeld.precision.ps.Z". Older reports "fahlman.quickprop-tr.ps.Z" and
"fahlman.cascor-tr.ps.Z" may also be of interest.

Your local version of ftp and other unix utilities may be different. Consult
your local system wizards for details.

=---------------------------------------------------------------------------

Hardcopy versions are now being printed and will be available soon, but
because of the high demand and tight budget, our school has (reluctantly)
instituted a charge for mailing out tech reports in hardcopy: $3 per copy
within the U.S. and $5 per copy elsewhere, and the payment must be in U.S.
dollars. To order hardcopies, contact:

Ms. Catherine Copetas
School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213
U.S.A.

------------------------------

Subject: TR - Kohonen Feature Maps in Natural Language Processing
From:    SCHOLTES@ALF.LET.UVA.NL
Date:    Wed, 15 May 91 13:42:00 +0700

TR Available on Recurrent Self-Organization in NLP:

          Kohonen Feature Maps in Natural Language Processing

                             J.C. Scholtes
                        University of Amsterdam

Main points:
  - showing the possibilities of Kohonen feature maps in symbolic
    applications by pushing self-organization;
  - showing a different technique in Connectionist NLP by using only
    (unsupervised) self-organization.

Although the model is tested in an NLP context, the linguistic aspects of
these experiments are probably less interesting than the connectionist
ones. People requesting a copy should be aware of this.

Abstract

In the 1980s, backpropagation (BP) started the connectionist bandwagon in
Natural Language Processing (NLP). Although initial results were good, some
critical remarks must be made about the blind application of BP. Most such
systems add contextual and semantic features manually by structuring the
input set. Moreover, these models form a small subset of the brain
structures known from the neural sciences.
They do not adapt smoothly to a changing environment and can only learn
input/output pairs. Although these disadvantages of the backpropagation
algorithm are commonly known and accepted, other, more plausible learning
algorithms, such as unsupervised learning techniques, are still rare in the
field of NLP. The main reason is the rapidly increasing complexity of
unsupervised learning methods when they are applied in the already complex
field of NLP. However, recent efforts implementing unsupervised language
learning have been made, resulting in interesting conclusions (Elman and
Ritter).

Building on this earlier work, a recurrent self-organizing model (based on
an extension of the Kohonen feature map), capable of deriving contextual
(and some semantic) information from scratch, is presented in detail. The
model implements a first step towards an overall unsupervised language
learning system. Simple linguistic tasks such as single-word clustering
(representation on the map), syntactical group formation, derivation of
contextual structures, string prediction, grammatical correctness checking,
word sense disambiguation and structure assignment are carried out in a
number of experiments. The performance of the model is at least as good as
that achieved with recurrent backpropagation, and on some points even better
(e.g. unsupervised derivation of word classes and syntactical structures).

Although preliminary, the first results are promising and show possibilities
for other, even more biologically inspired language processing techniques
such as real Hebbian, Genetic or Darwinistic models. Forthcoming research
must overcome limitations still present in the extended Kohonen model, such
as the absence of within-layer learning, restricted recurrence, the lack of
look-ahead functions (absence of distributed or unsupervised buffering
mechanisms) and limited support for a larger number of layers.

A copy can be obtained by sending an email message to
SCHOLTES@ALF.LET.UVA.NL. Please indicate whether you want a hard copy or a
postscript file sent to you.

------------------------------

Subject: Paper Available: RAAM
From:    doug blank <blank@copper.ucs.indiana.edu>
Date:    Wed, 15 May 91 15:11:09 -0500

    Exploring the Symbolic/Subsymbolic Continuum: A Case Study of RAAM

             Douglas S. Blank (blank@iuvax.cs.indiana.edu)
             Lisa A. Meeden (meeden@iuvax.cs.indiana.edu)
             James B. Marshall (marshall@iuvax.cs.indiana.edu)

                          Indiana University
           Computer Science and Cognitive Science Departments

Abstract:

This paper is an in-depth study of the mechanics of recursive
auto-associative memory, or RAAM, an architecture developed by Jordan
Pollack. It is divided into three main sections: an attempt to place the
symbolic and subsymbolic paradigms on a common ground; an analysis of a
simple RAAM; and a description of a set of experiments performed on simple
"tarzan" sentences encoded by a larger RAAM.

We define the symbolic and subsymbolic paradigms as two opposing corners of
an abstract space of paradigms. This space, we propose, has roughly three
dimensions: representation, composition, and functionality. By defining the
differences in these terms, we are able to place actual models in the
paradigm space, and compare these models in somewhat common terms. As an
example of the subsymbolic corner of the space, we examine in detail the
RAAM architecture, representations, compositional mechanisms, and
functionality.
In conjunction with other simple feed-forward networks, we create detectors,
decoders and transformers which act holistically on the composed,
distributed, continuous subsymbolic representations created by a RAAM. These
tasks, although trivial for a symbolic system, are accomplished without the
need to decode a composite structure into its constituent parts, as symbolic
systems must do.

The paper can be found in the neuroprose archive as blank.raam.ps.Z; a
detailed example of how to retrieve the paper follows at the end of this
message. A version of the paper will also appear in your local bookstores as
a chapter in "Closing the Gap: Symbolism vs Connectionism," J. Dinsmore,
editor; LEA, publishers, 1992.

=----------------------------------------------------------------------------

% ftp cheops.cis.ohio-state.edu
Connected to cheops.cis.ohio-state.edu.
220 cheops.cis.ohio-state.edu FTP server (Ver Tue May 9 14:01 EDT 1989) ready.
Name (cheops.cis.ohio-state.edu:): anonymous
331 Guest login ok, send ident as password.
Password: neuron
230 Guest login ok, access restrictions apply.
ftp> binary
200 Type set to I.
ftp> cd pub/neuroprose
250 CWD command successful.
ftp> get blank.raam.ps.Z
200 PORT command successful.
150 Opening BINARY mode data connection for blank.raam.ps.Z (173015 bytes).
226 Transfer complete.
local: blank.raam.ps.Z remote: blank.raam.ps.Z
173015 bytes received in 1.6 seconds (1e+02 Kbytes/s)
ftp> bye
221 Goodbye.
% uncompress blank.raam.ps.Z
% lpr blank.raam.ps

=----------------------------------------------------------------------------

------------------------------

Subject: New ICSI TR on incremental learning
From:    ethem@ICSI.Berkeley.EDU (Ethem Alpaydin)
Date:    Tue, 21 May 91 10:53:29 -0700

The following TR is available in postscript by anonymous net access at
icsi-ftp.berkeley.edu (128.32.201.55). Instructions to ftp and uncompress
follow the text. Hard copies may be requested by writing to either of the
addresses below:

ethem@icsi.berkeley.edu

Ethem Alpaydin
ICSI
1947 Center St. Suite 600
Berkeley CA 94704-1105 USA

=--------------------------------------------------------------------------

   GAL: Networks that grow when they learn and shrink when they forget

                             Ethem Alpaydin
                International Computer Science Institute
                               Berkeley, CA

                                TR 91-032

Abstract

Learning that is limited to the modification of some parameters has limited
scope; the capability to modify the system structure is also needed to widen
the range of what can be learned. In the case of artificial neural networks,
learning by iterative adjustment of synaptic weights can only succeed if the
network designer predefines an appropriate network structure, i.e., the
number of hidden layers and units, and the size and shape of their receptive
and projective fields. This paper advocates the view that the network
structure should not, as is usually done, be determined by trial and error
but should be computed by the learning algorithm. Incremental learning
algorithms can modify the network structure by addition and/or removal of
units and/or links. A survey of the current connectionist literature on this
line of thought is given. ``Grow and Learn'' (GAL) is a new algorithm that
learns an association in one shot because it is incremental and uses a local
representation. During the so-called ``sleep'' phase, units that were
previously stored but are no longer necessary due to recent modifications
are removed to minimize network complexity. The incrementally constructed
network can later be fine-tuned off-line to improve performance.
Another proposed method that greatly increases recognition accuracy is to
train a number of networks and vote over their responses. The algorithm and
its variants are tested on recognition of handwritten numerals and seem
promising, especially in terms of learning speed. This makes the algorithm
attractive for on-line learning tasks, e.g., in robotics. The biological
plausibility of incremental learning is also discussed briefly.

Keywords: Incremental learning, supervised learning, classification,
pruning, destructive methods, growth, constructive methods, nearest
neighbor.

=--------------------------------------------------------------------------

Instructions to ftp the above-mentioned TR (assuming you are under UNIX and
have a postscript printer; messages in parentheses indicate the system's
responses):

ftp 128.32.201.55
  (Connected to 128.32.201.55.
   220 icsi-ftp (icsic) FTP server (Version 5.60 local) ready.
   Name (128.32.201.55:ethem):)
anonymous
  (331 Guest login ok, send ident as password.
   Password:)
(your email address)
  (230 Guest login Ok, access restrictions apply.
   ftp>)
cd pub/techreports
  (250 CWD command successful.
   ftp>)
bin
  (200 Type set to I.
   ftp>)
get tr-91-032.ps.Z
  (200 PORT command successful.
   150 Opening BINARY mode data connection for tr-91-032.ps.Z (153915 bytes).
   226 Transfer complete.
   local: tr-91-032.ps.Z remote: tr-91-032.ps.Z
   153915 bytes received in 0.62 seconds (2.4e+02 Kbytes/s)
   ftp>)
quit
  (221 Goodbye.)

(back to Unix)
uncompress tr-91-032.ps.Z
lpr tr-91-032.ps

Happy reading, I hope you'll enjoy it.

------------------------------

Subject: New Bayesian work
From:    David MacKay <mackay@hope.caltech.edu>
Date:    Tue, 21 May 91 10:40:57 -0700

Two new papers available
=-----------------------

The papers that I presented at Snowbird this year are now available in the
neuroprose archives. The titles:

[1] Bayesian interpolation (14 pages)
[2] A practical Bayesian framework for backprop networks (11 pages)

The first paper describes and demonstrates recent developments in Bayesian
regularisation and model comparison. The second applies this framework to
backprop. The first paper is a prerequisite for understanding the second.

Abstracts and instructions for anonymous ftp follow. If you have problems
obtaining the files by ftp, feel free to contact me.

David MacKay
Office: (818) 397 2805
Fax:    (818) 792 7402
Email:  mackay@hope.caltech.edu
Smail:  Caltech 139-74, Pasadena, CA 91125

Abstracts
=--------

Bayesian interpolation
----------------------

Although Bayesian analysis has been in use since Laplace, the Bayesian
method of {\em model--comparison} has only recently been developed in depth.
In this paper, the Bayesian approach to regularisation and model--comparison
is demonstrated by studying the inference problem of interpolating noisy
data. The concepts and methods described are quite general and can be
applied to many other problems. Regularising constants are set by examining
their posterior probability distribution. Alternative regularisers (priors)
and alternative basis sets are objectively compared by evaluating the {\em
evidence} for them. `Occam's razor' is automatically embodied by this
framework. The way in which Bayes infers the values of regularising
constants and noise levels has an elegant interpretation in terms of the
effective number of parameters determined by the data set. This framework is
due to Gull and Skilling.
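To make the evidence machinery above concrete, here is a small numpy sketch
(a loose illustration, not code from the paper) of the framework applied to
interpolating noisy data with a polynomial basis: the regularising constant
alpha and the noise level beta are re-estimated from the effective number of
well-determined parameters, and alternative basis sets are then ranked by
their log evidence. The toy data, the choice of a polynomial basis, and all
names are assumptions made for the example.

import numpy as np

rng = np.random.default_rng(1)
N = 30
x = np.linspace(-1, 1, N)
y = np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=N)   # noisy data to interpolate

def evidence(degree, x, y, iters=50):
    """Fit a polynomial basis with a Gaussian weight prior (strength alpha)
    and Gaussian noise (precision beta); re-estimate alpha and beta, then
    return the log evidence for this basis set."""
    Phi = np.vander(x, degree + 1, increasing=True)     # basis functions
    N, k = Phi.shape
    alpha, beta = 1.0, 1.0
    for _ in range(iters):
        A = alpha * np.eye(k) + beta * Phi.T @ Phi      # posterior precision
        m = beta * np.linalg.solve(A, Phi.T @ y)        # posterior mean weights
        gamma = k - alpha * np.trace(np.linalg.inv(A))  # well-determined params
        alpha = gamma / (m @ m)                         # re-estimate prior strength
        beta = (N - gamma) / np.sum((y - Phi @ m) ** 2) # re-estimate noise level
    A = alpha * np.eye(k) + beta * Phi.T @ Phi
    m = beta * np.linalg.solve(A, Phi.T @ y)
    return (0.5 * k * np.log(alpha) + 0.5 * N * np.log(beta)
            - 0.5 * beta * np.sum((y - Phi @ m) ** 2) - 0.5 * alpha * (m @ m)
            - 0.5 * np.linalg.slogdet(A)[1] - 0.5 * N * np.log(2 * np.pi))

# Alternative basis sets (polynomial degrees) are compared by their evidence,
# not by their training error.
for d in range(1, 10):
    print(f"degree {d}: log evidence = {evidence(d, x, y):.1f}")

The alpha and beta updates are the standard Gull-style fixed-point
re-estimates; evaluating the evidence for each basis set is what allows the
objective comparison referred to in the abstract.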
A practical Bayesian framework for backprop networks
----------------------------------------------------

A quantitative and practical Bayesian framework is described for learning
mappings in feedforward networks. The framework makes possible: (1)
objective comparisons between solutions using alternative network
architectures; (2) objective stopping rules for deletion of weights; (3)
objective choice of magnitude and type of weight decay terms or additive
regularisers (for penalising large weights, etc.); (4) a measure of the
effective number of well--determined parameters in a model; (5) quantified
estimates of the error bars on network parameters and on network output; (6)
objective comparisons with alternative learning and interpolation models
such as splines and radial basis functions. The Bayesian `evidence'
automatically embodies `Occam's razor,' penalising over--flexible and
over--complex architectures. The Bayesian approach helps detect poor
underlying assumptions in learning models. For learning models well--matched
to a problem, a good correlation between generalisation ability and the
Bayesian evidence is obtained.

Instructions for obtaining copies by ftp from neuroprose:
=---------------------------------------------------------

unix> ftp cheops.cis.ohio-state.edu    # (or ftp 128.146.8.62)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get mackay.bayes-interpolation.ps.Z
ftp> get mackay.bayes-backprop.ps.Z
ftp> quit
unix> [then `uncompress' the files and lpr them.]

------------------------------

Subject: TR - Learning the past tense in a recurrent network
From:    Gary Cottrell <gary@cs.UCSD.EDU>
Date:    Tue, 21 May 91 18:27:56 -0700

The following paper will appear in the Proceedings of the Thirteenth Annual
Meeting of the Cognitive Science Society. It is now available in the
neuroprose archive as cottrell.cogsci91.ps.Z.

            Learning the past tense in a recurrent network:
            Acquiring the mapping from meaning to sounds

     Garrison W. Cottrell                  Kim Plunkett
     Computer Science Dept.                Inst. of Psychology
     UCSD                                  University of Aarhus
     La Jolla, CA                          Aarhus, Denmark

The performance of a recurrent neural network in mapping a set of plan
vectors, representing verb semantics, to associated sequences of phonemes,
representing the phonological structure of verb morphology, is investigated.
Several semantic representations are explored in an attempt to evaluate the
role of verb synonymy and homophony in determining the patterns of error
observed in the net's output performance. The model's performance offers
several unexplored predictions for developmental profiles of young children
acquiring English verb morphology.

To retrieve this from the neuroprose archive, type the following:

ftp 128.146.8.62
anonymous
<your netname here>
bi
cd pub/neuroprose
get cottrell.cogsci91.ps.Z
quit
uncompress cottrell.cogsci91.ps.Z
lpr cottrell.cogsci91.ps

Thanks again to Jordan Pollack for this great idea for net distribution.

gary cottrell   619-534-6640   Sec'y: 619-534-5288   FAX: 619-534-7029
Computer Science and Engineering C-014
UCSD, La Jolla, Ca. 92093
gary@cs.ucsd.edu (INTERNET)
{ucbvax,decvax,akgua,dcdwest}!sdcsvax!gary (USENET)
gcottrell@ucsd.edu (BITNET)

------------------------------

End of Neuron Digest [Volume 7 Issue 29]
****************************************