[comp.ai.neural-nets] Neuron Digest V5 #9

neuron-request@HPLABS.HP.COM ("Neuron-Digest Moderator Peter Marvit") (02/14/89)

Neuron Digest	Monday, 13 Feb 1989
		Volume 5 : Issue 9

Today's Topics:
				Weight Decay
			      Re: Weight Decay
			      Re: Weight Decay
		      Where to order NIPS proceedings?
			       Harmony theory
	wish to find out more about neural nets (Weather Simulation)
		     Presenting Enidata Research Group
			  Request for Information
			       summer-schools
			 Image and Data Compression
		 Markov chains and Multi-layer perceptrons
		  Speech recognition information requested
				 ART and BP


Send submissions, questions, address maintenance and requests for old issues to
"neuron-request@hplabs.hp.com" or "{any backbone,uunet}!hplabs!neuron-request"
ARPANET users can get old issues via ftp from hplpm.hpl.hp.com (15.255.16.205).

------------------------------------------------------------

Subject: Weight Decay
From:    movellan%garnet.Berkeley.EDU@violet.berkeley.edu
Date:    Mon, 23 Jan 89 20:33:34 -0800 

Referring to the compilation about weight decay from John: I cannot see the
analogy between weight decay and ridge regression.
 
 
The weight solutions in a linear network (Ordinary Least Squares) are the
solutions to (I'I) W = I'T where:
 
I is the input matrix (one row per pattern in the epoch, one column per input
unit in the net). T is the teacher matrix (one row per pattern in the epoch,
one column per teacher unit in the net). W is the matrix of weights (the net
is linear, with only one layer!).
 
The weight solutions in ridge regression would be given by (I'I + k<1>) W =
I'T, where k is a "shrinkage" constant and <1> represents the identity
matrix. Notice that k<1> has the same effect as increasing the variances of
the inputs (Diagonal of I'I) without increasing their covariances (rest of
the I'I matrix). The final effect is biasing the W solutions but reducing
the extreme variability to which they are subject when I'I is near singular
(multicollinearity). Obviously collinearity may be a problem in nets with a
large # of hidden units. I am presently studying how and why collinearity in
the hidden layer affects generalization and whether ridge solutions may help
in this situation. I cannot see though how these ridge solutions relate to
weight decay.
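
For what it's worth, a minimal NumPy sketch of the two solutions (the input
matrix I, teacher matrix T and shrinkage constant k below are invented for
illustration; one near-collinear input column stands in for the
multicollinearity case):

    import numpy as np

    rng = np.random.default_rng(0)
    I = rng.normal(size=(100, 5))                    # 100 patterns, 5 input units
    I[:, 4] = I[:, 3] + 0.01 * rng.normal(size=100)  # nearly collinear columns
    T = rng.normal(size=(100, 2))                    # 2 teacher units
    k = 0.1

    # OLS:   (I'I) W = I'T          Ridge: (I'I + k<1>) W = I'T
    W_ols   = np.linalg.solve(I.T @ I, I.T @ T)
    W_ridge = np.linalg.solve(I.T @ I + k * np.eye(5), I.T @ T)
    # When I'I is near singular, W_ols is very sensitive along the collinear
    # direction (large variance); W_ridge stays bounded, at the cost of bias.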
 
 -Javier

------------------------------

Subject: Re: Weight Decay 
From:    kanderso@BBN.COM
Date:    Tue, 24 Jan 89 13:54:04 -0500 

[[ referring to the previous note ]]

Yes, I was confused by this too.  Here is what the connection seems to be.
Say we are trying to minimize an energy function E(W) of the weight vector
for our network.  If we add a constraint that also attempts to minimize the
length of W, we add a term kW'W to the energy function.  Taking your
linear least-squares problem, we would have

	E = (T-IW)'(T-IW) + kW'W

	dE/dW = I'IW - I'T + kW        (dropping an overall factor of 2)

Setting dE/dW = 0 gives

	[I'I + k<1>]W = I'T, i.e. Ridge Regression.

	W = [I'I + k<1>]^-1 I'T
	
The covariance matrix is [I'I + k<1>]^-1, so the effects of increasing k are:

1.  It makes the matrix better conditioned (more easily invertible).

2.  It reduces the covariance, so that new training data will have less effect
on your weights.

3.  You lose some resolution in weight space.
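
As a sanity check, here is a small NumPy sketch (with made-up data) showing
that plain gradient descent on this penalised E ends up at the same ridge
solution:

    import numpy as np

    rng = np.random.default_rng(1)
    I = rng.normal(size=(50, 4))            # 50 patterns, 4 input units
    T = rng.normal(size=(50, 2))            # 2 teacher units
    k, lr = 0.5, 0.001

    W = np.zeros((4, 2))
    for _ in range(20000):
        grad = I.T @ (I @ W - T) + k * W    # dE/dW, dropping the factor of 2
        W -= lr * grad

    W_ridge = np.linalg.solve(I.T @ I + k * np.eye(4), I.T @ T)
    print(np.allclose(W, W_ridge))          # True once the descent has converged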

I agree that collinearity is probably very important, and I'll be glad to
discuss that off-line.

k

------------------------------

Subject: Re: Weight Decay 
From:    Yann le Cun <neural!yann@hplabs.HP.COM>
Date:    Wed, 25 Jan 89 15:13:58 -0500 


Consider a single-layer linear network with N inputs.  When the number of
training patterns is smaller than N, the set of solutions (in weight space)
is a proper linear subspace.  Adding weight decay will select the minimum-norm
solution in this subspace (if the weight decay coefficient is decreased with
time).  The minimum-norm solution happens to be the solution given by the
pseudo-inverse technique (cf. Kohonen), and it is the solution which optimally
cancels out uncorrelated, zero-mean additive noise on the input.
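
A small sketch of that point with invented sizes: NumPy's pinv computes
exactly this minimum-norm solution when there are fewer training patterns
than inputs.

    import numpy as np

    rng = np.random.default_rng(2)
    I = rng.normal(size=(5, 20))            # 5 patterns, N = 20 inputs
    T = rng.normal(size=(5, 1))

    # I W = T is underdetermined; the pseudo-inverse picks the minimum-norm W
    # out of the whole subspace of exact solutions.
    W = np.linalg.pinv(I) @ T
    print(np.allclose(I @ W, T))            # exact fit on the training set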

 - Yann Le Cun


------------------------------

Subject: Where to order NIPS proceedings?
From:    Jose A Ambros-Ingerson (Dept of ICS, UC Irvine) <jose%harlie.ics.uci.edu@PARIS.ICS.UCI.EDU>
Date:    Sat, 21 Jan 89 20:46:58 -0800 

Would someone be so kind as to send me the address where one can order the
proceedings of NIPS 87 and 88 (the IEEE Conference on Neural Information
Processing Systems)?

	Thanks in advance,

Jose A. Ambros-Ingerson				email: jambros@ics.uci.edu
Dept. of Information and Computer Science       Phone:   (714) 856-7310
University of California			         (714) 856-7473
Irvine CA, 92717 

------------------------------

Subject: Harmony theory
From:    andrew@berlioz.NSC.COM (Andrew Palfreyman)
Date:    Tue, 24 Jan 89 19:34:05 -0800 

Having just reviewed the simple model of Smolensky's harmony theory in
"Explorations in PDP" (disk programs), I fell to musing whether hard
problems describable by a set of nonlinear coupled equations might yield to
such a parallelised approach; weather and the gravitational many-body
problem come first to mind.  Note that these have been attacked by
supercomputers and, in the latter case, by a special-purpose machine, the
GF-11, with gigaflop capability.  The muse was: might there not be a cheaper
way?  Has anybody done any serious exploration of problem domains like this
with such paradigms as harmony?
	
	Andrew Palfreyman, MS D3969		PHONE:  408-721-4788 work
	National Semiconductor				408-247-0145 home
	2900 Semiconductor Dr.			there's many a slip
	P.O. Box 58090				'twixt cup and lip
	Santa Clara, CA  95052-8090

	DOMAIN: andrew@logic.sc.nsc.com  
	ARPA:   nsc!logic!andrew@sun.com
	USENET: ...{amdahl,decwrl,hplabs,pyramid,sun}!nsc!logic!andrew


------------------------------

Subject: wish to find out more about neural nets
From:    gerry@toadwar.UCAR.EDU (gerry wiener)
Date:    Tue, 24 Jan 89 23:07:28 -0700 

[[ Editor's Note:  Normally, I'd rather not carry too many "I'm just
beginning" messages, but his application is intriguing since it normally uses
copious amounts of compute time anyway.  I'd like to hear of any references
as well. -PM ]]

I'm interested in finding out more about neural nets.  I work at the
National Center for Atmospheric Research and we're interested in seeing if
neural network ideas can be applied to weather forecasting and prediction.
Any useful information such as a bibliography containing references would be
helpful.

Thank you very much.

	Gerry Wiener
	NCAR
	P.O. Box 3000
	Boulder, Co. 80307-3000


------------------------------

Subject: Presenting Enidata Research Group
From:    mcvax!enidbo.bo.enidata.it!daniele%bo.enidata@uunet.UU.NET (Daniele Montanari)
Date:    Wed, 25 Jan 89 08:34:48 -0800 

[[ Editor's Note: Part of the raison d'etre of this Digest is for
researchers to make themselves and their projects known.  I welcome more of
these biographical entries so others may learn of your work. -PM ]]


This message is a presentation of the group that has been formed at Enidata
and works on neural nets, classifier systems, and in general systems with
complex dynamics.

We are four people coordinated by Roberto Serra, with backgrounds in
physics, mathematics, and electronic engineering.  Complex dynamics was the
original field of interest of the early members of the group.  The first
work involving neural nets concerned modified Hopfield models.  Our major
interests are currently in multilayer nets with back-prop, and classifier
systems.  Machine learning, genetic algorithms, complex dynamics, pattern
recognition, parallel distributed processing, and higher-order neural nets are
some of the areas we are interested in.  Both basic research and applications
are of interest to us.

Some of our work has been organized in the form of papers and/or internal
reports, which we are happy to distribute to anyone interested.

Compiani M., Montanari D., Serra R., and G. Valastro,
``Neural Nets and Classifier Systems'', 
in the Proceedings of the First Italian Workshop on Parallel Architectures 
and Neural Networks (E. Caianiello and R. Tagliaferri Eds.), 
World Scientific Publishers, Singapore (in press).

Serra R., Zanarini G., and F. Fasano,
``Attractors, learning and recognition in generalised Hopfield networks'', 
in the Proceedings of Cognitiva 87, Volume 1, p. 459 (May 1987).

Serra R., Zanarini G., and F. Fasano,
``Cooperative phenomena in Artificial Intelligence'',
J. Molec. Liquids, Volume 39, pp. 207-231, 1988.

Serra R., Zanarini G., and F. Fasano, 
``Generalised Hopfield learning rules'',
in Chaos and Complexity, Livi R. et al. Eds., 
World Scientific, Singapore, (1988).

Serra R., Zanarini G., and F. Fasano,
``A theorem on complementary patterns in Hopfield--like networks'',
Enidata internal report SAP--2--88 SFZ (1988).

Serra R.,
``Dynamical systems and expert systems'',
in the Proceedings of Connectionism in Perspective, R. Pfeifer Ed., 
Elsevier (in press).

Compiani M., Montanari D., Serra R., Simonini P., Valastro G.,
``Dynamics of classifier systems'',
Enidata internal report (1988).


Our e-mail addresses are

Mario Compiani:      mc@enidbo.it.uucp
Daniele Montanari:   daniele@enidbo.it.uucp
Roberto Serra:       rse@enidbo.it.uucp
Gianfranco Valastro: gv@enidbo.it.uucp

( {any backbone, mcvax!}i2unix!enidbo!<username> may also be used).

Ciao

Daniele

------------------------------

Subject: Request for Information
From:    "Walter L. Peterson, Jr." <ucsdhub!calmasd!wlp@SDCSVAX.UCSD.EDU>
Organization: Prime-Calma, San Diego R&D, Object and Data Management Group
Date:    30 Jan 89 17:04:34 +0000 


I have posted this request before, but after I did I realized that it was at
the beginning of semester break. Since many readers of this group are in
academia, I'm posting again now that everyone is back to school.


I am looking for references to recent work in the area of learning rates in
artificial neural networks, particularly back-propagation networks.  This is
for some research that I am doing for my thesis for my MS in Computer
Science. The most recent references that I have are from the "Proceedings of
the 1988 Connectionist Models Summer School" at CMU.

Also, if anyone "out there" is doing or has done work in this field, I would
like to hear from you.

(P.S. Thanks for the couple of responses I did get the last time).


Walter L. Peterson

email: 
   wlp@calmasd.Prime.COM

snail mail:
   Calma - A Division of Prime Computer, Inc.
   9805 Scranton Rd.
   San Diego, CA
                92121

Walt Peterson.  Prime - Calma San Diego R&D (Object and Data Management Group)
"The opinions expressed here are my own and do not necessarily reflect those
Prime, Calma nor anyone else.
...{ucbvax|decvax}!sdcsvax!calmasd!wlp

------------------------------

Subject: summer-schools
From:    andreas herz <BY9%DHDURZ1.BITNET@CUNYVM.CUNY.EDU>
Date:    Thu, 02 Feb 89 00:55:29 +0700 

Does anybody know of interesting summer schools on neural networks in the
U.S.A. or Canada this summer/fall, where the biological "roots" of the field
are treated as well as theoretical approaches?  What about conferences or
smaller meetings?

     Thanks for helping me!  Andreas Herz, University of Heidelberg, FRG

------------------------------

Subject: Image and Data Compression
From:    Jon Ryshpan <jon@nsc.NSC.COM>
Date:    Wed, 01 Feb 89 14:02:01 -0800 

I am interested in collating data on image (or other) data compression
techniques from the Neural Network research and development arena.  Would
you kindly send your contributions (in any mode) to:

	Andrew Palfreyman, MS D3969		PHONE:  408-721-4788 work
	National Semiconductor				408-247-0145 home
	2900 Semiconductor Dr.			there's many a slip
	P.O. Box 58090				'twixt cup and lip
	Santa Clara, CA  95052-8090

	DOMAIN: andrew@logic.sc.nsc.com  
	ARPA:   nsc!logic!andrew@sun.com
	USENET: ...{amdahl,decwrl,hplabs,pyramid,sun}!nsc!logic!andrew

------------------------------

Subject: Markov chains and Multi-layer perceptrons
From:    kruschke@cogsci.berkeley.edu (John Kruschke)
Date:    Wed, 08 Feb 89 18:40:44 -0800 

[[ Re: a request for citation ]]

Herve Bourlard and C.J. Wellekens
``Links between Markov models and multilayer perceptrons''

Tech Report TR-88-008
27 pages, $1.75

write to:  Librarian
           International Computer Science Institute
           1947 Center St., Suite 600
           Berkeley, CA  94704

info:  info@icsi.berkeley.edu
       (415)643-9153

Hope that answers your question (of 21 January, Neuron Digest 5(8) ).

  --John.


------------------------------

Subject: Speech recognition information requested
From:    Christel Kemke <kemke%fb10vax.informatik.uni-saarland.dbp.de@RELAY.CS.NET>
Date:    09 Feb 89 10:09:00 -0100 



	I am interested in speech recognition using neural networks,
	especially in combination with natural language processing
	components (using e.g. syntactic information for disambiguation
	etc.).  I would be grateful for any pointers to literature and
	existing systems.

	If possible, I will write a small study and post it to the 
	newsletter.

	Thanks in advance.

	
	Christel Kemke

	DFKI
	Standort Saarbruecken
	Stuhlsatzenhausweg
	D-6600 Saarbruecken 11
	dfn:  kemke@fb10vax.informatik.uni-saarland.dbp.de

------------------------------

Subject: ART and BP
From:    tony bell <tony%ifi.unizh.ch@RELAY.CS.NET>
Date:    08 Feb 89 15:18:00 +0100 

I'm sure that Andrew Palfreyman's provocative message comparing BP and ART
will attract a lot of feedback.  I'd just like to add my piece about why BP
is more popular:

Looking at the two algorithms in terms of what they compute, ART is nothing
more than a clustering algorithm similar to Rumelhart & Zipser's competitive
learning (though predating it).  Also, the metric for the clustering is not
Euclidean, but dependent on the order of presentation of the patterns.  (See
Barbara Moore's paper in the Proc. Connectionist Summer School 1988 for
details.)
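
(For readers who have not seen it, a bare-bones competitive-learning
clusterer in the Rumelhart & Zipser style looks roughly like the NumPy
sketch below; this is not ART itself, and the data, learning rate and number
of cluster units are invented.)

    import numpy as np

    rng = np.random.default_rng(3)
    patterns = rng.normal(size=(200, 8))
    patterns /= np.linalg.norm(patterns, axis=1, keepdims=True)  # unit inputs

    n_units, lr = 4, 0.05
    W = rng.normal(size=(n_units, 8))
    W /= np.linalg.norm(W, axis=1, keepdims=True)

    for x in patterns:                          # presentation order matters
        winner = np.argmax(W @ x)               # best-matching unit wins
        W[winner] += lr * (x - W[winner])       # move winner toward the pattern
        W[winner] /= np.linalg.norm(W[winner])  # keep the weights normalised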

  But BP, using the richness of error information to optimise, can produce
arbitrary segmentations of the input space, given sufficient computational
units.  Thus BP is, if nothing else, the more powerful categoriser.

  Grossberg's preoccupations with the stability of learning and the
construction of prototypes derive more from psychological than computational
requirements.

  The ART style of learning without a teacher is certainly very interesting,
but it has, in my view, been better analysed from a 'computational' point of
view in works by Linsker and Kohonen.  The most accessible examples of these
people's work appear in IEEE Computer of March 1988.

Tony Bell, University of Zurich.

------------------------------

End of Neurons Digest
*********************