[comp.ai.neural-nets] Neuron Digest V6 #38

neuron-request@HPLMS2.HPL.HP.COM ("Neuron-Digest Moderator Peter Marvit") (06/05/90)

Neuron Digest	Monday,  4 Jun 1990
		Volume 6 : Issue 38

Today's Topics:
	   Summary: References on "Training issues for Bp...."


Send submissions, questions, address maintenance and requests for old issues to
"neuron-request@hplabs.hp.com" or "{any backbone,uunet}!hplabs!neuron-request"
Use "ftp" to get old issues from hplpm.hpl.hp.com (15.255.176.205).

------------------------------------------------------------

Subject: Summary: References on "Training issues for Bp...."
From:    rcoahk@koel.co.rmit.oz (Alvaro Hui Kau)
Date:    14 May 90 01:38:41 +0000

[[ Editor's Note: You'll find troff and refer style citations below.
Fair warning, but good information... -PM ]]

To all good people out there!  I posted a request two weeks ago asking for
a guide to the available references on the topic "Training issues for
back-propagation algorithms with neural nets", and HERE is all the response
I got!  Thank you, thank you, thank you to all those who responded!  Now
the problem is finding the time to read the references...

Thanks again........
					rcoahk@koel.co.rmit.oz.au.
					akkh@mullian.ee.mu.oz.au.
 ---------------------------------------------------------------------

Hi, there,

I read your message seeking references on Bp networks.  I am not sure
whether you want to write a survey on "Training Issues for Bp ..." or
whether you want to do some innovative research on Bp for your thesis.  So
far as I know, training issues for Bp have been very well studied in the
past five years.  Lots of references can be found in the IEEE first
International Conf. on Neural Networks (1987, 1988) and the Proceedings of
the International Joint Conf. on Neural Networks (1989, 1990).  I think
most engineering libraries have at least some copies of the '87, '88, and
'89 proceedings.  If your library doesn't have them, try interlibrary loan.

Good luck on your thesis work!

xiping wu
3150 Newmark CE Lab, MC-250
205 N Mathews Ave
Urbana, IL 61801
email: xwu@civilgate.ce.uiuc.edu


Suggest that the training issues you might like to consider are:
a) scaling of speed of training with size of network: an exponential
   relationship would be bad news but is probably the case.
b) classification in the presence of noise: additional noise channels degrade
   techniques like k nearest neighbours more than error propagation networks.
c) an analysis of the many speed up hacks which exist.

If this is the sort of thing you are after, some refs follow.

Tony Robinson.


@techreport{TesauroJanssens88,
	author=		"Gerald Tesauro and Robert Janssens",
	title=		"Scaling Relationships in Back-Propagation Learning:
			{D}ependence on Predicate Order",
	year=		1988,
	institution=	"Center for Complex Systems Research, University of
			Illinois at Urbana-Champaign",
	number=		"CCSR-88-1"}

@article{Jacobs88,
	author=		"Robert A. Jacobs",
	title=		"Increased Rates of Convergence Through Learning Rate
			Adaptation",
	journal=	"Neural Networks",
	year=		1988,
	volume=		1,
	pages=		"295-307"}

@article{ChanFallside87-csl,
	author=		"L. W. Chan and F. Fallside",
	year=		1987,
	journal=	"Computer Speech and Language",
	pages=		"205-218",
	title=		"An Adaptive Training Algorithm for Back Propagation
			 Networks",
	volume=		"2",
	number=		"3/4"}


consider these:

%A Y. Le Cun
%D 1985
%J Proceedings of Cognitiva
%T Une Procedure d'Apprentissage pour Reseau a Seuil Assymetrique
%P 599--604
%V 85

%A D. B. Parker
%C Cambridge, MA
%D 1985
%I Massachusetts Institute of Technology, Center for Computational 
Research in Economics and Management Science
%R TR-47
%T Learning-Logic

%A D. E. Rumelhart
%A G. E. Hinton
%A R. J. Williams
%D 1986
%J Nature
%P 533--536
%T Learning Representations by Back-Propagating Errors
%V 323

        three papers which, simultaneously, introduced BP

%D 1988
%I AFCEA International Press
%T {DARPA} Neural Network Study
%X A good overview of the state of the art in neural computing
(in 1988).  It is in fact a large annotated bibliography.  Unfortunately,
because of the intended audience, not all of the applications presented
are evaluated in equal depth.

%A S. E. Fahlman
%A G. E. Hinton
%D January 1987
%J Computer
%P 100--109
%T Connectionist Architectures for Artificial Intelligence
%V 20
%N 1

%A G. E. Hinton
%C Pittsburgh, PA, USA
%D 1987
%I Computer Science Department, Carnegie Mellon University
%R CMU-CS-87-115 (version 2)
%T Connectionist Learning Procedures
%X One of the better neural networks overview papers, although the
distinction between network topology and learning algorithm is not always
very clear.  Could very well be used as an introduction to neural networks.

%A K. Hornik
%A M. Stinchcombe
%A H. White
%T Multilayer Feedforward Networks are Universal Approximators
%J Neural Networks
%V 2
%N 5
%D 1989
%P 359--366

        One of the papers proving that ...

%A R. P. Lippmann
%D April 1987
%J IEEE ASSP Magazine
%V 4
%N 2
%P 4--22
%T An Introduction to Computing with Neural Nets
%X Much acclaimed as an overview of neural networks, but rather inaccurate
on several points.  The categorization into binary and continuous-valued
input neural networks is rather arbitrary, and may be confusing for
the inexperienced reader.  Not all networks discussed are of equal importance.

%A J. L. McClelland
%A D. E. Rumelhart
%D 1986
%I The MIT Press
%K PDP-2
%T Parallel Distributed Processing: Explorations in the 
Microstructure of Cognition (Volume 2)

%A J. L. McClelland
%A D. E. Rumelhart
%D 1988
%I The MIT Press
%T Explorations in Parallel Distributed Processing: Computational 
Models of Cognition and Perception

%A D. E. Rumelhart
%A J. L. McClelland
%D 1986
%I The MIT Press
%K PDP-1
%T Parallel Distributed Processing: Explorations in the 
Microstructure of Cognition (Volume 1)
%X Excellent as a broad-spectrum survey in neural networks, but written
from the viewpoint of cognitive psychology.  The point of focus in most
reports is the cognitive implication of distributed computing.

%A P. D. Wasserman
%D 1989
%I Van Nostrand Reinhold
%T Neural Computing: Theory and Practice

        a text book on nn's

%A B. Widrow
%A M. E. Hoff
%B 1960 IRE WESCON Convention Record
%C New York
%D 1960
%P 96--104
%T Adaptive Switching Circuits

Check out Hecht-Nielsen's "Theory of the Backpropagation Network" in the
1989 Proc. IJCNN (Int'l Joint Conference on Neural Networks).

Also, the poster who complained that backprop is inefficient on
serial machines is correct.

Grobbins.    grobbins@eniac.seas.upenn.edu

Backpropagation is described well in _Parallel_Distributed_Processing_
(MIT PRESS).

I personally feel that it is time for researchers to look beyond
simple backpropagation of feedforward networks.  Conjugate-gradient
methods provide a much faster means of training feedforward networks
(or at least use Fahlman's Quickprop).
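
For anyone who has not seen it, the heart of Quickprop is only a few
lines.  Here is a rough C sketch of the per-weight update as I understand
it - a paraphrase of Fahlman's rule, not his code; the eps and mu values
are just plausible defaults, and you would call this from your own
training loop.

#include <math.h>

/* One Quickprop weight change for a single weight (sketch only).
   dw_prev    - weight change made on the previous epoch
   slope      - dE/dw now
   slope_prev - dE/dw on the previous epoch                       */
double quickprop_step(double dw_prev, double slope, double slope_prev)
{
    const double eps = 0.5;   /* plain gradient step size (assumed)      */
    const double mu  = 1.75;  /* "maximum growth factor" (assumed)       */
    double denom, dw;

    if (dw_prev == 0.0)                 /* no history: ordinary gradient step */
        return -eps * slope;

    denom = slope_prev - slope;
    if (denom == 0.0)                   /* degenerate parabola: cap the step  */
        return mu * dw_prev;

    /* fit a parabola through the two slopes and jump to its minimum */
    dw = dw_prev * slope / denom;
    if (fabs(dw) > mu * fabs(dw_prev))  /* don't let the step explode */
        dw = (dw > 0.0 ? 1.0 : -1.0) * mu * fabs(dw_prev);
    return dw;
}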

Backpropagation methods involving recurrent networks, however, need to
be investigated further.  There are a large number of recurrent network
training algorithms which should be studied in complex applications
involving temporal behavior.  Williams and Zipser published a nice paper
on continuously running recurrent networks, Pineda published an
interesting relaxation algorithm for recurrent nets, and Schmidhuber has
published a truly wonderful group of papers on using two recurrent nets
for interesting temporal behavior (one net is an environment model, the
other uses that model to figure out how to perform tasks in the
environment) that deserves to be used in complex behavioral
applications.

>Backpropagation is described well in _Parallel_Distributed_Processing_
>(MIT PRESS).
 D.E. Rumelhart and J.L. McClelland Eds


>Williams and Zipser published
>a nice paper on continuously running recurrent networks, 

R.J. Williams, D. Zipser.  _A Learning Algorithm for Continually Running
  Fully Recurrent Neural Networks_.  Institute for Cognitive Science,
  Report 8805, October 1988.

>Pineda published
>an interesting relaxation algorithm for recurrent nets, and

Pineda, F.J. (1987).  Generalization of back-propagation to recurrent and
higher order neural networks. _Proceedings of the IEEE Conference on Neural
Information Processing Systems_.

>Schmidhuber has published a truly wonderful group of papers
>concerning using two recurrent nets for interesting temporal behavior

J.H. Schmidhuber (1990).  _Making the world differentiable:  On 
  supervised learning fully recurrent networks for dynamic reinforcement
  learning and planning in non-stationary environments_. FKI-126-90 report
  Technische Universitat Munchen (Institut fur Informatik).

I have some interest in this problem.  There are a number of books - try
PDP Vol 1 (Rumelhart and McClelland): standard, not very exciting, but it
gets you going.  Explorations in PDP, by the same authors, has discs for
IBM computers containing programs - not very good, but they get you going!

I have seen a good book by Khanna - publisher is Addison Wesley - there
is also a book by Pao (?) published by AW.

Igor Aleksander has just published a book - publisher is Chapman & Hall -
I think it should be quite good.

Issues in Back prop -

	1) Topology - net topology makes a difference!
	    Number of hidden units for application?

	   A few problems have been solved - it IS possible to map any
	   suitably continuous function [0,1]^n -> [0,1]^m for suitable n
	   and m well enough (i.e. approximately) - based on a theorem by
	   Kolmogorov, and recently proved for 3-layer nets by Halbert
	   White - see recent issues of Neural Networks.
 
	2) Unit type - most workers use semilinear units, BUT higher order
	   units train much faster - see papers by Sejnowski and Hinton 
	   in "Neural Computation" - first issue - also papers by Giles,
	   Lee etc in ICNN 87 etc.

	3) Convergence - Most workers seem to assume that BP converges - 
		it doesn't!		
		it can get stuck in local minima and does!
		
		To speed up BP you could use momentum (described in PDP 1), or
		the delta-bar-delta method (described in a paper by Jacobs,
		Neural Networks, vol 1, 1988), or Quickprop (described by
		Fahlman in Proc Connectionist Summer School, 1988) - see the
		short sketch after this list.
		
		As a useful alternative to back propagation - see Baba's paper
		in Neural Networks (vol 2 - ???) (recent) on Random
		Optimisation methods.

		One other method is to apply Genetic Algorithms - there is
		some evidence that they can work well (see papers by Whitley
		in 3rd Proc on GAs, 1989)
	
	4) Applications - good applications - see Sejnowski & Gorman's paper
	   in Neural Networks - issue 1 (on sonar) - also Sejnowski & Lehky's
	   short article on shape from shading (Nature, 1988?) Also articles
	   on spirals etc in Procs of 1988 Connectionist Summer School,
	   published by Morgan Kaufmann.

	5) Generalisation - this is a BIG issue - getting a network to classify
	   training vectors is difficult, but no big deal - the problem
	   everyone wants to solve is to get good performance on unseen inputs.

	   This (as I am just realising!) is VERY hard. Lots of workers have
	   claimed generalisation - but the basis is not sound.

	   A paper to read is by Solla, Levin and Tishby, IJCNN89
	   (Washington DC).

	   An earlier paper by Solla is in nEuro88 Procs, Paris

	6) Classification - alternative methods - Ron Chrisley and
           Kohonen have compared PDP with other methods - see the
           proceedings of nEuro88 - (also published elsewhere, but not
           sure of exact details) - compare with Learning Vector
           Quantisation, and Hopfield nets ....

	7) Fault-tolerance - reliability - changing weight values or killing
	   units altogether may make little difference in a network with
           trained redundant units. However, Sejnowski claims that
           additional hidden units are unassigned, whereas Hinton claims
           that the additional HUs perform redundant processing. Hinton
           uses a weight decay process in training - Sejnowski doesn't!
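
To make point 3 a little more concrete, here is a small C sketch of a
weight update which combines momentum with the delta-bar-delta rule.  It
is only an illustration - kappa, phi and theta follow Jacobs' notation,
but the values are guesses, not taken from PDP 1 or from the paper.

/* One weight update combining momentum with delta-bar-delta (sketch).
   For each weight i we keep, besides the weight itself, its previous
   change (for momentum), its own learning rate, and an exponentially
   averaged past gradient delta_bar (for the delta-bar-delta rule). */
void update_weights(int n, double w[], double grad[], double dw_prev[],
                    double lr[], double delta_bar[])
{
    const double momentum = 0.9;               /* assumed values */
    const double kappa = 0.01, phi = 0.5, theta = 0.7;
    int i;

    for (i = 0; i < n; i++) {
        /* delta-bar-delta: grow the rate additively while the gradient
           keeps its sign, shrink it multiplicatively when it oscillates */
        if (grad[i] * delta_bar[i] > 0.0)
            lr[i] += kappa;
        else if (grad[i] * delta_bar[i] < 0.0)
            lr[i] *= phi;
        delta_bar[i] = theta * delta_bar[i] + (1.0 - theta) * grad[i];

        /* gradient step plus momentum */
        dw_prev[i] = momentum * dw_prev[i] - lr[i] * grad[i];
        w[i] += dw_prev[i];
    }
}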


Neural networks (of BP type) require a LOT of processing power, if you
want to study anything even of moderate size.
 
I hope this helps - in my next message I will include the text of a Study
Guide which I prepared for a course I gave earlier this year.

dave martland
dept of cs
Brunel University
UK



.ft H
.fo 'CS434 Guide'- % -'DM October 1989
.nh
.de HI
.ft HB
\\$1
.ft H
..
.de IN
.in +3m 
..
.de EX
.in -3m
..
.de TI
.ti -3m
..
.de BD
.ft HI
\\$1
.ft H
..
.ft HB
.ce 2
NEURAL NETWORK SYSTEMS - STUDY GUIDE  (CS 434)
David Martland, October 1989

.ft H

.HI "Introduction"

This guide is intended to help you get to grips with the study of neural
networks. It gives a brief outline of the background to neural network
research, and then describes some books and papers which you are likely to
find useful.  You will not need to buy all of the books, nor will you
need to photo-copy all of the papers, although it may help you to copy
some. The books which will be most useful to buy are the books by
Rumelhart and McClelland, all three of them. The first two books describe
a significant number of the current network types, whilst the third book,
Explorations in PDP, contains a disc with software which can run on IBM
computers. If you prefer, there is now also a version of the software
which runs on Macintoshes, although it is no more sophisticated.

Later sections of this guide give some ideas as to how you can discover
more about neural networks for yourself, both by reading, and also by
exploring the subject using packages included in "Explorations".

Finally there is a section on assessment by course work.

.HI Background

During the last few years, the study of neural network systems has become
increasingly popular. Partly this has been because of an awareness
that intelligent animals can solve problems which are impossible for even
the most powerful modern computers, and partly because of the desire by
engineers and computer scientists to explore and exploit parallel
hardware systems, and apply them to practical problems. Additionally,
there are a number of problem types which now look appropriate for
solution using various types of artificial neural network, and these may
lead to successful applications being developed in areas such as military
systems, medical diagnosis, and plant control.

The course introduces biological models of neural networks, and then moves
on to deal with aspects of artificial systems of current interest.

.HI Objectives

There are a number of objectives for this course. 
.br

You should become familiar with major network types, including
perceptrons, the adaline, multi-layer perceptrons, Hopfield networks,
Boltzmann machines, Kohonen networks, boolean network models, and
Grossberg's ART model.

You should expect to gain a deeper understanding of some of the networks
and their adaptation and operating processes. Some mathematical
expertise, including knowledge of differential calculus, and matrix and
vector algebra will be desirable.  These will be used in the description
of heuristic optimisation processes, and for the stability proofs for
Hopfield networks.

You will gain practical experience, mostly based on software packages
designed to demonstrate that the presented theory works.

You should also expect to gain an understanding of the types of
application that can be handled by current artificial neural network
technology. Since these applications are typically very costly both in
terms of hardware and execution time, it is unlikely that you will be
able to develop an interesting application of reasonable size. However,
you should be aware of what is possible.

You may wish to formulate other objectives for yourself, such as a study
of architecture of systems which will support artificial neural networks.
These will not be major objectives.

.HI "Course books and papers"

There are many books now available about neural network systems, but few
that are comprehensive enough to support the course. During the year new
books will become available, but of currently available books the
following will be useful.

.IN
.TI 
Rumelhart, D.E. and McClelland, J.L. (Eds), "Parallel Distributed
Processing, Vol 1",MIT Press, Cambridge, MA, 1986

This volume describes feed-forward systems, the interactive activation
model, the harmony machine and the Boltzmann machine. This book is
considered by many to be the best book available as an introduction at
the present time.

.TI 
Rumelhart, D.E. and McClelland, J.L. (Eds), "Parallel Distributed
Processing, Vol 2", ,MIT Press, Cambridge, MA, 1986

This volume continues where Vol 1 leaves off, and concentrates on
applying the techniques described in Vol 1 to several applications of
interest to cognitive psychologists.

.TI 
McClelland, J.L.& Rumelhart, D.E.,"Explorations in parallel distributed
processing", Cambridge, MA, MIT Press, 1988

This book contains a floppy disc containing a number of programs which
illustrate principles described in the other PDP books. The programs,
written in C, are not very sophisticated, but they can help to
demonstrate properties of the systems described if enough effort is put
in by the reader.

.TI
Aleksander, I. (Ed), "Neural Computing Architectures", North Oxford
Academic, 1989

This book concentrates on boolean models of artificial neural networks.

.TI 
Hinton, G.E. & Anderson, J.A. (Eds),"Parallel Models of Associative
Memory", Lawrence Erlbaum Ass., Hillsdale NJ, 1981

This book contains a number of articles by authors who have made
contributions to neural network theory. The article by Sejnowski on
accurate neural modelling is of interest. Also, the article by Willshaw,
based on the earlier paper in Nature, about holographic storage is still
of interest.

.TI
Kanerva, P, "Sparse Distributed Memory",MIT Press, 1988

This book describes an associative memory system based on the use of
neural units, and in which the data representation for any stored pattern
is spread out over several (possibly many) different units.  This could
render the system tolerant of failure, and also lead to very rapid
recall using appropriate parallel hardware.

.TI
Kohonen, T :"Self organisation and Associative Memory", Springer-Verlag, 1984
 
This book is basically about a neural network model developed by Teuvo
Kohonen. It introduces concepts such as lateral inhibition, and also
shows how an unsupervised training procedure can lead to cluster
formation appropriate to classification tasks. An additional feature of
the model is that the adaptive process results in a structured representation
of the input data classes, in which similar classes are likely to be
represented by physically nearby neural units.
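
The flavour of the adaptive process can be given in a few lines of C.
The fragment below is an illustration of a one-dimensional map only, with
invented values expected for the learning rate and neighbourhood width -
the book gives the real treatment.

#include <math.h>

/* One unsupervised update of a 1-D Kohonen map (illustration only).
   w[i*nin + j] is weight j of unit i, x[] is one input vector.
   lr (learning rate) and sigma (neighbourhood width) should shrink
   slowly as training proceeds.                                      */
void kohonen_step(int nunits, int nin, double *w, const double *x,
                  double lr, double sigma)
{
    int i, j, best = 0;
    double d, dbest = 1e30, h;

    /* find the best-matching unit: the weight vector closest to x */
    for (i = 0; i < nunits; i++) {
        d = 0.0;
        for (j = 0; j < nin; j++)
            d += (w[i*nin + j] - x[j]) * (w[i*nin + j] - x[j]);
        if (d < dbest) { dbest = d; best = i; }
    }

    /* pull the winner and its neighbours (neighbours in map position,
       not in weight space) towards x; this is what makes similar
       inputs end up on physically nearby units                       */
    for (i = 0; i < nunits; i++) {
        h = exp(-(double)((i - best) * (i - best)) / (2.0 * sigma * sigma));
        for (j = 0; j < nin; j++)
            w[i*nin + j] += lr * h * (x[j] - w[i*nin + j]);
    }
}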

.TI
Minsky, M and Papert, S ,"Perceptrons: An introduction to computational
geometry", MIT Press, Cambridge, MA ,1969

This is the classic book by Minsky and Papert, in which they showed what
perceptrons cannot do. However, they were perhaps too restrictive in
their specification of a perceptron system, as the more recently adopted
multi-layer perceptron models overcome many of the problems that they
discussed. Despite this, it is still an important book, and the
perceptron learning algorithm is an important process for dichotomising
linearly separable pattern sets.  Additionally, the concepts introduced
within the book, and the (fairly) thorough treatment of the subject using
mathematics, still make very interesting reading today.

.TI
Palm, G., "Neural Assemblies",Springer-Verlag, Berlin, 1982

This book contains a detailed description of an associative memory model
based on the associative memory model of Willshaw et al.

.TI
Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T., "Numerical
Recipes in C", Cambridge University Press, 1988

This very useful book contains programs in C for a large number of
algorithms, and gives brief, readable, descriptions of minimisation
procedures appropriate to neural network study.

.EX

.HI Papers

There are many papers on neural networks, but some are of greater
significance than others.

Important papers include:

.IN
Grossberg, ART ... (decent readable reference to be supplied!)

Hopfield, J.J., "Neural networks and physical systems with emergent
collective computational abilities" Proceedings of the National Academy
of Sciences,USA V 79, 2554-2558, 1982
 
Hopfield, J.J.," Neurons with graded response have collective
computational properties like those of two-state neurons", Proceedings of
the National Academy of Sciences, USA, V 81, 3088-3092, 1984
 
Hopfield, J.J. & Tank, D.W. "'Neural' computation of decisions in
optimization problems", Biological Cybernetics,V 52, 141-152, 1985

Kohonen, T. "Analysis of a simple self organizing process" ,Biological
Cybernetics, V 44 , 135-140, 1982

Little, W.A & Shaw, G.L.,"A statistical theory of short and long term
memory", J. Behav. Biol.  V 14 115, 1974

Little, W.A.,"The existence of persistant states in the brain",
Mathematical Bioscience, V 19, 101-120, 1974

Little, W.A. & Shaw, G.L. "Analytic study of the memory storage capacity
of a neural network", Mathematical Biosciences, V 39, 281-290, 1978

Lehky, S.R. & Sejnowski, T.J., "Network model of shape-from
shading:neural function arises from both receptive and projective
fields", Nature, V 333, 452-454 , 1988

Gorman, R.P. & Sejnowski, T.J. "Analysis of hidden units in a layered
network trained to classify sonar targets", Neural Networks,V 1,7 5-89,
1988

Jacobs, R.A., "Increased rates of convergence through learning rate
adaptation", Neural Networks, V 1, 295-307, 1988

Widrow, IJCNN 89

Willshaw, D.J., Buneman, O.P., Longuet-Higgins, H.C., "Non-holographic
associative memory", Nature, V 222, 960-962, 1969
 
Willshaw, D.J.,"Holography, association and induction", in "Parallel
models of associative memory" Hinton, G.E. and Anderson J.A.(Eds),
Erlbaum ass., Hillsdale,NJ, 1982
 
.EX

.HI "Introductory articles"

There have been a number of useful "special issues", and introductory
articles during the last few years. The following should be noted:

Neural networks, number 1, vol 1, 1988

This contains introductory articles by Grossberg and Kohonen (An
introduction to neural computing, pp 3-16), which are well worth
reading.

IEEE Computer, March 1988.

This issue is particularly worth getting hold of. The articles by Kohonen
and Widrow are very good introductions to Kohonen networks, and adaptive
filtering systems respectively. The article by Linsker is particularly
interesting.

The Byte article by G.E. Hinton on Boltzmann machines (1985) is easy to read.

The paper by Lippmann, R.P.,("An introduction to computing with neural
nets", IEEE Acoustics Speech Signal Processing Magazine, 4, 1987,pp
4-22), is useful, although it contains a number of errors and incorrect
statements.

Read it, but don't believe everything it says!

.HI "Journals and papers"

There are several journals which deal with Neural Networks, and it is
important to keep up to date with papers as they appear.

Fruitful sources of inspiration can be found in:

.IN
Biological Cybernetics

IEEE Trans on Systems, Man and Cybernetics

Neural Networks

.EX

There have also been occasional articles worth reading in
magazines/journals such as Byte, Scientific American, Science, and IEEE
Computer.

Other journals on neural networks are Neural Computation (edited by
Terrence Sejnowski, published by MIT Press), Journal of Neural Network
Computing (edited by Judith Dayhoff), and the International Journal of
Neural Networks (edited by Kamal U. Karna).

.HI "Computer Packages"

Doing a study of neural network models is likely to require access to
computer processing. Some network models can be analysed mathematically,
but often true understanding only comes from running a model on a
computer. Interesting problems are likely to be very costly in terms of
computer power, and to require large running times on powerful machines.
However, simple problems can be tackled to illustrate the principles.

One of the simplest packages available is that included in the
"Explorations" book. This has the merits of being fairly cheap, and it
can run on IBM PCs.  It is also possible to recompile the programs to run
on more powerful machines.  The user interface is quite poor, but at
least you won't have to do much programming.

Another package which we have available is the Rochester simulator
package, which runs on the university's UNIX-based computers. This is
slightly better as a general purpose tool, but is harder to program,
since it requires some knowledge of C.

Other commercially available packages have some advantages, but they cost
more.  Often commercially available packages come with demonstrations of
major network types, so it is worth looking at demonstrations if you can.
Packages such as NeuralWare's NeuralWorks come with many network types
ready configured.

.HI Societies

There is now an International Neural Network Society (INNS). This
publishes a journal called Neural Networks, which comes out several times
a year. Membership of the society for students is moderately cheap, and
also allows you reduced fees if you decide to go to any of the society's
conferences!

.HI "Conferences and workshops"

There are now several conferences on neural networks each year. Ones to
watch out for are:

IEEE/INNS joint conference on neural networks (Now twice a year!).
Locations currently Washington and San Diego. Previous conference
proceedings were ICNN87 (IEEE, San Diego), ICNN88 (IEEE, San Diego),
INNS88 (Boston), and IJCNN89 (Washington).

IEEE Neural Information Processing Systems (NIPS) (usually Denver, in
November)

US summer schools on neural networks - for example Carnegie Mellon, Woods
Hole.

nEuro conference - supposed to be a nEuro90 somewhere in Europe. Previous
conference was nEuro88.

.HI "Other sources of information"

Another very useful source of information about neural networks is
available to users of computers connected to the Usenet system. Using a
news reader program (ours is called
.BD rn
on the local machine Terra), it is possible to find out about a large
number of different subjects, from Society Women to Ozone Depletion. In
the present context, you will want to look out for the section
.BD "comp.ai.neural-nets" 
and perhaps also 
.BD "comp.ai."

You should note that it is possible to waste an enormous amount of time
reading the news, and you should resist the temptation to do so. Using
our news reader, it is possible to scan through a large number of
articles looking for keywords, and knowing that the '=' command will give
a list of all the articles in a newsgroup is also very helpful.

For tracking down papers you should learn to use the CD ROM based system
which is kept in the library. Currently this is very popular, but you
should be able to book it in advance. Unfortunately the loan period is 2
hours, which is too long, since useful work can be done in much less
time, and the long period leads to queues. You could also use the Science
Citation Index and the Abstracts, as well as using the other library
services.

.HI Strategy

The course will introduce biological concepts of neurons, and neural
networks.

.BD Perceptrons 
will be introduced, and the concepts of 
.BD "linear separability"
and the 
.BD "perceptron training algorithm"
and 
.BD "convergence theorem"
and
.BD "cycling theorems"
discussed.
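
For reference, the perceptron training rule itself is short enough to
sketch in C (an illustration only, assuming a single threshold unit with
+1/-1 targets; it is not part of the lecture material):

/* Perceptron training rule for a single threshold unit (illustration).
   x[p*nin + i] is input i of pattern p, t[p] is the desired output
   (+1 or -1).  Weights change only on misclassified patterns; the
   convergence theorem says the loop terminates if the two classes
   are linearly separable.                                           */
void train_perceptron(int npat, int nin, const double *x, const int *t,
                      double *w, double *bias)
{
    const double eta = 1.0;            /* any positive step size will do */
    int p, i, errors, epoch;

    for (i = 0; i < nin; i++) w[i] = 0.0;
    *bias = 0.0;

    for (epoch = 0; epoch < 1000; epoch++) {
        errors = 0;
        for (p = 0; p < npat; p++) {
            double s = *bias;
            for (i = 0; i < nin; i++) s += w[i] * x[p*nin + i];
            if ((s > 0.0 ? 1 : -1) != t[p]) {   /* misclassified */
                errors++;
                for (i = 0; i < nin; i++) w[i] += eta * t[p] * x[p*nin + i];
                *bias += eta * t[p];
            }
        }
        if (errors == 0) break;        /* all patterns correct: stop */
    }
}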

Widrow's adaline model will be mentioned, and a procedure based on the
use of a 
.BD "least mean squares"
minimisation algorithm introduced.

The 
.BD "multi-layer perceptron"
model will be dealt with in several lectures.

The course will then move on to describe networks with feed-back, based
on 
.BD "Hopfield networks,"
both discrete and continuous, and applications of
such networks will be discussed.
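
As a foretaste, the discrete Hopfield model can be stated very compactly.
The C sketch below shows Hebbian storage and asynchronous recall for
bipolar (+1/-1) patterns; it is an illustration only, not course material.

#include <stdlib.h>

/* Store npat bipolar patterns p[k*n + i] in the weight matrix with the
   Hebbian outer-product rule (no self-connections).                   */
void hopfield_store(int npat, int n, const int *p, double *w)
{
    int i, j, k;
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++) {
            w[i*n + j] = 0.0;
            if (i == j) continue;
            for (k = 0; k < npat; k++)
                w[i*n + j] += (double)p[k*n + i] * p[k*n + j];
            w[i*n + j] /= n;
        }
}

/* Asynchronous recall: update randomly chosen units.  The energy
   -1/2 * sum_ij w_ij s_i s_j never increases, so the state settles
   into a stored (or spurious) attractor.                            */
void hopfield_recall(int n, const double *w, int *s, int steps)
{
    int step, i, j;
    double h;
    for (step = 0; step < steps; step++) {
        i = rand() % n;
        h = 0.0;
        for (j = 0; j < n; j++) h += w[i*n + j] * s[j];
        s[i] = (h >= 0.0) ? 1 : -1;
    }
}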

Later work will deal with other network types, including 
.BD "Kohonen's self organising topological feature maps,"
and 
.BD "Grossberg's ART models."

.HI "Helping yourself"

Things you can do to help you study - you do not have to follow the
suggestions in the order stated, and you may find it helpful to read two
or three books or papers during the same period.

A very good introduction is the March 1988 issue of Computer (IEEE). This
gives easy to understand presentations of several network types, and also
deals with hardware implementation.

You should read as much of the book by Minsky and Papert as possible.
This book is moderately mathematical, although it is not really very
difficult, and may seem strange at first. You should note in particular
the Perceptron Convergence Theorem, and the Perceptron Cycling Theorem.
Additionally, you should note the concepts of linear separability, and
diameter-limited, and order-limited predicates. Later, you should examine
the proofs of the theorems, but at first you should simply note the
adaptive procedure.

Next you should find the early paper by Widrow, which describes an
adaptation procedure based on Least Mean Squares.  See how this procedure
differs from the perceptron adaptation procedure.
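
One way to see the difference in a few lines of C: the perceptron rule
corrects only misclassified patterns, whereas the Widrow-Hoff (LMS) rule
adjusts the weights in proportion to the error in the linear output on
every presentation.  The step size eta below is invented for the
illustration.

/* One Widrow-Hoff (LMS) update for a single linear unit (illustration).
   Unlike the perceptron rule, the correction is proportional to the
   error in the linear output, so the weights keep moving even when the
   pattern is already on the correct side of the decision boundary.    */
void lms_step(int nin, double w[], double *bias,
              const double x[], double target)
{
    const double eta = 0.01;      /* invented step size */
    double y = *bias, err;
    int i;

    for (i = 0; i < nin; i++) y += w[i] * x[i];
    err = target - y;
    for (i = 0; i < nin; i++) w[i] += eta * err * x[i];
    *bias += eta * err;
}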

Then you should read the section of the PDP book which deals with back 
propagation - only there it's called the 
.BD "generalized delta rule."
If you find the maths
hard, then look at the examples and have faith that the method works.
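
If the mathematics is heavy going, the following C sketch of one
presentation of the generalized delta rule, for a single hidden layer of
sigmoid units, may help.  Biases are left out and the layer sizes are
fixed small purely to keep the illustration short; it is a sketch, not
the program from the book.

#include <math.h>

#define NIN  2
#define NHID 3
#define NOUT 1

static double sigmoid(double z) { return 1.0 / (1.0 + exp(-z)); }

/* One pattern presentation of the generalized delta rule (illustration).
   w1[j][i] connects input i to hidden unit j, w2[k][j] connects hidden
   unit j to output unit k.  eta is an invented step size.              */
void backprop_step(const double x[NIN], const double target[NOUT],
                   double w1[NHID][NIN], double w2[NOUT][NHID], double eta)
{
    double h[NHID], o[NOUT], dout[NOUT], dhid[NHID];
    int i, j, k;

    /* forward pass */
    for (j = 0; j < NHID; j++) {
        double s = 0.0;
        for (i = 0; i < NIN; i++) s += w1[j][i] * x[i];
        h[j] = sigmoid(s);
    }
    for (k = 0; k < NOUT; k++) {
        double s = 0.0;
        for (j = 0; j < NHID; j++) s += w2[k][j] * h[j];
        o[k] = sigmoid(s);
    }

    /* backward pass: delta = error times the derivative of the sigmoid */
    for (k = 0; k < NOUT; k++)
        dout[k] = (target[k] - o[k]) * o[k] * (1.0 - o[k]);
    for (j = 0; j < NHID; j++) {
        double s = 0.0;
        for (k = 0; k < NOUT; k++) s += w2[k][j] * dout[k];
        dhid[j] = s * h[j] * (1.0 - h[j]);
    }

    /* weight changes proportional to delta times the incoming activity */
    for (k = 0; k < NOUT; k++)
        for (j = 0; j < NHID; j++) w2[k][j] += eta * dout[k] * h[j];
    for (j = 0; j < NHID; j++)
        for (i = 0; i < NIN; i++) w1[j][i] += eta * dhid[j] * x[i];
}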

To give more credence to the back-propagation method, you could also run
the BP program which is supplied with "Explorations". Try to devise your
own networks using the network description language once you have tried
out the example demonstrations. For example, you could make up a network
for solving the symmetry problem, and see if the solutions described in
the book are plausible.

There are few books that deal well with Hopfield networks, so it is
perhaps simplest to examine the original papers. (Hopfield 1982, 1984).

Then you should read the book by Kohonen, which describes another type of
network, which uses an unsupervised learning algorithm.

Grossberg has written many articles and several books, most of them
unreadable.  However, his recent articles have proved to be more
approachable, perhaps because of the influence of Gail Carpenter. Try to
find out what Grossberg's ART models are, but leave this till later on.

.HI "Course work assessment"

You will be expected to do course work during the course, but much of the
work will centre around laboratory exploration, and short presentations.
A significant proportion of the available marks for the coursework will
be for keeping a log book up to date during the course.  Additional marks
will be obtained for tackling specific problems, and for presentations
later on.

Currently the course work accounts for 25% of the marks available for the
course.

Further guidance will be given during the year.

------------------------------

End of Neuron Digest [Volume 6 Issue 38]
****************************************