[comp.ai.neural-nets] Neuron Digest V5 #19

neuron-request@HPLABS.HP.COM ("Neuron-Digest Moderator Peter Marvit") (04/21/89)

Neuron Digest   Thursday, 20 Apr 1989
                Volume 5 : Issue 19

Today's Topics:
               1990 Connectionist Summer School announcement
    Andrew Palfreyman's note re: work of Linsker, Sanger, Foldiak, etc.
                     Article on Collective Intelligence
                       BP, local minima, perceptrons
                            Forecasting using NN
                               Re: GD and LSE
                             Re: Re: GD and LSE
                               Re: GD and LSE
                 RE: genetic algorithms vs. backpropagation
                                 NN4Defense
                   NOISE FILTERING USING NEURAL NETWORKS
                 Re: NOISE FILTERING USING NEURAL NETWORKS
                 Re: NOISE FILTERING USING NEURAL NETWORKS
                  Paper: An Architecture with Neural ....
               Posting to neuron-digest - Parker Learning Law
                         Pre-publication to Europe?
         Proceedings of the 1988 Connectionist Models Summer School
    Request for information on compilation of neural nets to combinators


Send submissions, questions, address maintenance and requests for old issues to
"neuron-request@hplabs.hp.com" or "{any backbone,uunet}!hplabs!neuron-request"
ARPANET users can get old issues via ftp from hplpm.hpl.hp.com (15.255.16.205).

------------------------------------------------------------

Subject: 1990 Connectionist Summer School announcement
From:    Jeff Elman <elman%amos@ucsd.edu>
Date:    Tue, 28 Mar 89 21:30:44 -0800 


March 28, 1989                      PRELIMINARY ANNOUNCEMENT

         CONNECTIONIST SUMMER SCHOOL / SUMMER 1990

                            UCSD
                    La Jolla, California


     The next Connectionist Summer School will  be  held  at
the  University of California, San Diego in June 1990.  This
will be the third session in the series; the first two were
held at Carnegie-Mellon in the summers of 1986 and 1988.

     The summer school will offer courses in  a  variety  of
areas  of connectionist modelling, with emphasis on computa-
tional neuroscience, cognitive models, and  hardware  imple-
mentation.   In  addition  to  full courses, there will be a
series of shorter tutorials, colloquia, and public lectures.
Proceedings  of the summer school will be published the fol-
lowing fall.

     As in the past, participation will be limited to gradu-
ate students enrolled in Ph.D. programs (full- or part-time).
Admission will be on a competitive basis.   We hope to  have
sufficient funding to subsidize tuition and housing.

     THIS IS A  PRELIMINARY  ANNOUNCEMENT.   Further  details
will be announced over the next several months.

    Terry Sejnowski         Jeff Elman
    UCSD/Salk               UCSD

    Geoff Hinton            Dave Touretzky
    Toronto                 CMU
    hinton@ai.toronto.edu   touretzky@cs.cmu.edu



------------------------------

Subject: Andrew Palfreyman's note re: work of Linsker, Sanger, Foldiak, etc.
From:    Ralph Linsker <LINSKER@ibm.com>
Date:    27 Mar 89 09:18:23 -0500 


Re: Andrew Palfreyman's comment --

- -- "I find it mysterious that random noise training produces orientation-
selective receptive fields spontaneously; what's the connection between
eigenvectors of an input autocorrelation and straight lines?  Not only are
these fields similar to those found in retinal cells of cat and monkey, but,
as one goes down the list in order of decreasing eigenvalue, resemble
somewhat eigenstates of wave-functions of atoms ..."--

See my series of papers in Proc Natl Acad Sci USA 83: 7508-12, 8390-94,
8779-83 (Oct-Nov 1986) in which I show how uncorrelated noise, passed
through a layer of center-surround cells (linear filters), produces a
Mexican-hat activity covariance matrix which leads to the emergence of
orientation selective cells.  The mathematical reason (see esp. the second
paper in the series) is that the Hebb-type rule I use maximizes Sigma Qij Ci
Cj subject to a constraint on Sigma Ci (= total connection strength).  (Here
Qij is the covariance of the activities at cells i and j of the
center-surround layer; it is a Mexican-hat function of the distance between
i and j.)  If for simplicity one thinks of the synapses as lying on a square
lattice, this maximization corresponds to assigning a given positive or
negative value of Ci to each site of an Ising-type model, and minimizing an
interaction energy = -1/2 Sigma Qij Ci Cj subject to fixed Sigma Ci.  This
minimum is achieved (for Mex-hat Qij) when the C's form parallel stripes of
alternating positive and negative value (in certain cases).  See the 2nd
PNAS paper for more details.
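
For readers who want to experiment, the following toy sketch (my own
illustration, not code from the papers; the lattice size and the
difference-of-Gaussians form of Q are assumptions) carries out that
constrained minimization by greedy +/- swaps, which keep Sigma Ci fixed:

    import numpy as np

    # Minimize E = -1/2 * sum_ij Q_ij C_i C_j over C_i in {+1,-1} on an LxL
    # lattice, holding sum_i C_i fixed by only swapping a (+) site with a
    # (-) site.  Q is an assumed "Mexican hat" (difference of Gaussians).
    rng = np.random.default_rng(0)
    L = 12
    ys, xs = np.mgrid[0:L, 0:L]
    pos = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
    d = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
    Q = np.exp(-(d / 1.5) ** 2) - 0.6 * np.exp(-(d / 3.0) ** 2)

    C = rng.permutation(np.repeat([1.0, -1.0], (L * L) // 2))  # sum C_i = 0

    def energy(C):
        return -0.5 * C @ Q @ C

    for _ in range(20000):                     # greedy pairwise swaps
        i, j = rng.integers(0, L * L, size=2)
        if C[i] == C[j]:
            continue
        Cnew = C.copy()
        Cnew[i], Cnew[j] = C[j], C[i]
        if energy(Cnew) < energy(C):
            C = Cnew

    print(C.reshape(L, L))  # for suitable Q widths, +/- bands tend to appear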

Also, PCA (for N>1 linear output cells) is NOT identical to maximizing the
mutual information (i.e. "maximum preservation of information," "infomax
principle," or maximum Shannon information rate from input to output layer)
in general.  The infomax solution depends upon the noise process.  Without
specifying a noise process the infomax optimization problem (with
continuous-valued activities) is ill-defined, although PCA is perfectly
well-defined.  For further discussion see my articles in: Computer
21(3)105-117 (March 1988); Neural Information Processing Systems (Denver,
Nov. 1987), ed. D. Z. Anderson (AIP, 1988); and Advances in Neural
Information Processing Systems, vol.1 (Denver, Nov.-Dec. 1988), ed. D. S.
Touretzky (Morgan Kaufmann, to appear April 1989).  The last paper shows how
center-surround and temporal-differencing cells emerge as infomax solutions
when the input is locally correlated in space or time, and shows explicitly
the relationships between infomax, choice of noise process, and PCA for
linear nets.
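
As a minimal reminder of why PCA alone is well-defined (an illustrative
sketch of my own with synthetic data, not code from the articles): the
principal directions follow from the input covariance, with no reference to
any noise process.

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.standard_normal((1000, 8)) @ rng.standard_normal((8, 8))  # toy inputs
    Cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(Cov)
    order = np.argsort(evals)[::-1]
    W_pca = evecs[:, order[:3]].T     # rows = 3 leading principal directions
    print(np.round(W_pca @ Cov @ W_pca.T, 3))   # diagonal: captured variances

The infomax weights, by contrast, cannot be written down until one also
specifies how noise enters the output.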

  Ralph Linsker
  (linsker @ ibm.com)

------------------------------

Subject: Article on Collective Intelligence
From:    Daniel J Pirone <cocteau@VAX1.ACS.UDEL.EDU>
Organization: The Lab Rats
Date:    26 Mar 89 02:47:28 +0000 


        "Army Ants: A collective intelligence"
        by Nigel R. Franks
        American Scientist, March-April 1989

Not directly about NNs, but *very* interesting...

- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Thus spoke Zarathustra...                       cocteau@vax1.acs.udel.edu
                                                Daniel Pirone
- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

------------------------------

Subject: BP, local minima, perceptrons
From:    sontag@fermat.rutgers.edu (Eduardo Sontag)
Organization: Rutgers Univ., New Brunswick, N.J.
Date:    Mon, 27 Mar 89 17:48:04 +0000 

There have been a number of postings on this issue lately.  Some time ago a
couple of abstracts were posted here.  These papers show:

(1) Even for the case of NO hidden layers and binary inputs, spurious local
minima can happen.  The smallest example I know of has 4 inputs (5 weights
to be learnt, when including the threshold) and 125 training instances.

(2) Still in the NO hidden layer case, even perceptron-separable data, when
fed to BP, can fail to be classified (Brady et al).  BUT: if one uses
"threshold penalty" cost, i.e. one does not penalize an output higher (if
"1" is desired) or lower (if "0" is desired) than some cutoff point (say,
.75 and .25 respectively) then YES, BP classifies correctly in that case.
(More precisely, the gradient differential equation converges, in finite
time, from each initial condition.)
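
A small sketch of that "threshold penalty" cost (the squared-hinge form
below is my assumption; the description above only specifies that outputs
beyond the cutoffs incur no penalty):

    import numpy as np

    def threshold_penalty_error(y, target, hi=0.75, lo=0.25):
        # No penalty for y > hi when the target is 1, or y < lo when it is 0.
        e = np.where(target == 1, np.maximum(hi - y, 0.0),
                                  np.maximum(y - lo, 0.0))
        return 0.5 * np.sum(e ** 2)

    # Ordinary squared error would still penalize an output of 0.9 for a
    # target of 1; the threshold-penalty cost does not:
    print(threshold_penalty_error(np.array([0.9]), np.array([1])))   # 0.0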

These results, and comparisons to related work, are included in:

Sontag, E.D. and H.J. Sussmann, ``Backpropagation can give rise to spurious
local minima even for networks without hidden layers,'' to appear in
_Complex Systems_, February 1989 issue, I understand.

Sontag, E.D. and H.J. Sussmann, "Backpropagation Separates when Perceptrons
Do", Rutgers Center for Systems and Control Technical Report 88-12, November
1988.  Submitted to ICNN89.

LaTeX copies of these are available by email.

 -eduardo sontag

Eduardo D. Sontag, Professor
Department of Mathematics
Rutgers Center for Systems and Control (SYCON)
Rutgers University
New Brunswick, NJ 08903, USA

(Phone: (201)932-3072; dept.: (201)932-2390)
sontag@fermat.rutgers.edu
...!rutgers!fermat.rutgers.edu!sontag
sontag@pisces.bitnet

------------------------------

Subject: Forecasting using NN
From:    parvis@pyr.gatech.EDU (FULLNAME)
Organization: Georgia Institute of Technology
Date:    Tue, 28 Mar 89 19:05:03 +0000 


I am looking for approaches that use neural nets for forecasting based on
historical data.  Currently I'm using various sequential models similar to
Jordan's network. I am testing my models on a sequence of notes in a melody.
The network should be able to reproduce the whole sequence based on initial
input. Jordan's network seems to have problems with melodies where one note
could have many successor notes. The more different successor values a
specific value in the sequence has, the longer it takes for the network to
learn. What other possibilities should I explore to represent context, time,
in general: sequential dependencies?
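
For reference, here is a bare-bones Jordan-style forward pass (a sketch
with assumed layer sizes and decay constant, not the poster's exact model),
in which the previous output is fed back into state units:

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_hid, n_out = 4, 8, 3          # state units mirror the output units
    W_ih = rng.standard_normal((n_hid, n_in))
    W_sh = rng.standard_normal((n_hid, n_out))
    W_ho = rng.standard_normal((n_out, n_hid))
    mu = 0.5                              # decay of the state units
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    def run(plan, n_steps):
        state = np.zeros(n_out)
        outputs = []
        for _ in range(n_steps):
            h = sigmoid(W_ih @ plan + W_sh @ state)
            y = sigmoid(W_ho @ h)
            state = mu * state + y        # output feedback into the state
            outputs.append(y)
        return outputs

    print(run(np.ones(n_in), 5)[-1])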

                I appreciate any suggestions and comments.

Parvis Avini
parvis@gitpyr.gatech.edu 
 

------------------------------

Subject: Re: GD and LSE
From:    gblee@maui.cs.ucla.edu
Organization: UCLA Computer Science Department
Date:    Fri, 24 Mar 89 18:56:24 +0000 

> There is another paper with a similar claim.
> "Gradient descent fails to separate" is its title.
> By: M. Brady and R. Raghavan
> The paper shows the failure of BP in the case of examples where
> there are no local minima.  They assert (and they could be right as ...
>
>                                         (FIAT LUX)

Can you tell us where and when the paper you mentioned was published?
Maybe somebody else in this newsgroup is also interested ...
I know of another similar paper that attacks GD:

 Sutton, R. Two problems with BP and other steepest-descent learning
 procedures for networks.

 He points out that:
1. steepest descent is particularly poor for error surfaces containing
"ravines";
2. steepest descent results in a high level of interference between learning
of different patterns...

Unfortunately, I forget where this paper was published.
Can anybody out there tell us where it was published?

- --Geunbae Lee
  AI lab, UCLA

------------------------------

Subject: Re: Re: GD and LSE
From:    mbkennel@phoenix.Princeton.EDU (Matthew B. Kennel)
Organization: Princeton University, NJ
Date:    Sat, 25 Mar 89 05:20:35 +0000 

In article <22206@shemp.CS.UCLA.EDU> gblee@CS.UCLA.EDU () writes:
>I know another similar paper which attacked GD....

1) Simply going down the gradient direction has been known for ages to be a
poor heuristic if you're in steep valleys.  It's simply good fortune that it
works in as many cases as it does.  In fact, the "momentum method" is a very
simple way to alleviate this problem by trying to go faster in the
slowly-changing direction.
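
For concreteness, the momentum update in its standard form, applied to a
toy quadratic "ravine" (the step sizes and test function are arbitrary
choices of mine):

    import numpy as np

    def momentum_step(w, v, grad, lr=0.01, mu=0.9):
        # Velocity accumulates along directions whose gradient keeps the
        # same sign, which is what speeds progress along a ravine floor.
        v = mu * v - lr * grad(w)
        return w + v, v

    # Ravine: f(w) = 0.5 * (100*w0^2 + w1^2), gradient below.
    grad = lambda w: np.array([100.0 * w[0], w[1]])
    w, v = np.array([1.0, 1.0]), np.zeros(2)
    for _ in range(100):
        w, v = momentum_step(w, v, grad)
    print(w)        # both coordinates head toward the minimum at the origin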

Conjugate gradient is specifically designed to deal with this kind of
problem.  The disadvantages are that 1) the algorithm is much more
complicated to implement, and 2) it is not always faster.  In terms of the
number of
"iterations", it's certainly faster than fixed-step steepest-descent, but
now each iteration requires a gradient evaluation and several error
evaluations for an approximate line-minimization.  Just try doing that in
silicon...

Often, many fast dumb steps beat a smart but slow method.  In my personal
experience, standard steepest descent w/ momentum is often the best in terms
of bottom-line wall-clock performance compared to fancier algorithms.

2) I believe that when you add more examples to the training set the error
surface becomes more "bumpy" locally, and thus the gradient vector at each
point doesn't necessarily point to the overall global minimum as closely.
I'm not sure about this and am trying to verify it.  Using the momentum
method helps for this too, because you effectively end up averaging the
gradient vector over several steps.

In a network whose error surface is non-linear in the weights (multi-layer
perceptron, e.g.) I'm not at all convinced that there could ever be a
general-purpose learning algorithm guaranteed to give the global minimum
solution.  In practical terms, the appropriate criterion is only "good
enough".

Matt Kennel
mbkennel@phoenix.princeton.edu

------------------------------

Subject: Re: GD and LSE
From:    "Sean D. O'Neil" <vsi1!wyse!mips!prls!philabs!linus!sdo@APPLE.COM>
Organization: The MITRE Corporation, Bedford MA
Date:    27 Mar 89 15:19:32 +0000 

[[ Concerning citation of "Gradient Descent Fails to Separate" ]]

I attended the presentation by Raghavan at ICNN '88 in San Diego, so at
least the paper has been published in those proceedings.  My impression is
that their result is fairly simple to understand.  Essentially, they point
out that minimizing the number of misclassifications is not the same as
minimizing the squared error, and they give several examples in which linear class
separation is possible but the least-squares solution does not in fact
separate the classes.
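
A tiny one-dimensional illustration of the same phenomenon (my own numbers,
not one of the examples from the paper):

    import numpy as np

    X = np.array([0.0] * 10 + [1.0, 10.0])   # ten class-A points, two class-B
    t = np.array([-1.0] * 10 + [1.0, 1.0])   # separable: threshold at x = 0.5

    A = np.column_stack([X, np.ones_like(X)])    # least-squares fit of w*x + b
    (w, b), *_ = np.linalg.lstsq(A, t, rcond=None)

    print(np.round(w * X + b, 2))
    # The point at x = 1.0 has target +1 but a negative fitted value: the
    # minimum-squared-error line fails to separate, although a separating
    # threshold obviously exists.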

Aha, here it is.  The paper is:

Brady, M., R. Raghavan, and J. Slawny, "Gradient Descent Fails to Separate",
in IEEE International Conference on Neural Networks, IEEE, San Diego, CA,
pp. I-649 to I-656, July 24-27, 1988.

Sean

------------------------------

Subject: RE: genetic algorithms vs. backpropagation
From:    offutt@caen.engin.umich.edu (daniel m offutt)
Date:    Tue, 28 Mar 89 13:20:23 -0500 

In a previous edition, Arnfried Ossen wrote that "Our results indicate that
genetic algorithms cannot outperform backpropagation in feedforward networks
in general."  In the last edition, David Montana took issue with this claim,
remarking that he and Davis have developed an algorithm (which is supposedly
a "genetic algorithm") specialized for training neural networks, and which
has given "very good preliminary results".  Perhaps Montana and Davis have a
proof of an analog of Holland's Schema Theorem for this specialized "genetic
algorithm"?  If not, and if their algorithm is substantially different from
a traditional genetic algorithm, then it is possible that the algorithm is
not a genetic algorithm at all, whatever Montana and Davis have named it.
In this case their results do not contradict Mr.  Ossen's remarks.

I have used a *traditional* genetic algorithm (for which Holland's Schema
Theorem is known to hold) to train hundreds of different feed-forward
networks of precisely the sort backprop has been applied to.  My results
generally indicate that back propagation can train these networks very much
faster than a traditional genetic algorithm can.  That would seem to support
Mr. Ossen.

But this is an apples and oranges comparison!  A genetic algorithm applied
to network training is a reinforcement learning algorithm.  Back propagation
is not: It requires access to the target output vectors, whereas a genetic
algorithm requires access only to a network performance (or error) measure.
Reinforcement learning is a more difficult problem.  So the fact that a
genetic algorithm runs much slower than backpropagation is due in part, or
entirely, to the fact that it is solving a totally different and much more
difficult problem.
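
For readers who have not seen one, a minimal genetic algorithm of this kind
looks roughly like the sketch below (toy code with real-valued genes rather
than Holland's bit strings; the 2-2-1 net, the XOR task, and all rates are
my assumptions, not Offutt's setup).  Note that the only thing the GA ever
sees is a scalar performance measure:

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    T = np.array([0.0, 1.0, 1.0, 0.0])
    N_W = 2 * 2 + 2 + 2 + 1                      # 2-2-1 net: weights + biases

    def forward(w, x):
        W1, b1 = w[:4].reshape(2, 2), w[4:6]
        W2, b2 = w[6:8], w[8]
        return np.tanh(W2 @ np.tanh(W1 @ x + b1) + b2)

    def fitness(w):                              # scalar error measure only
        return -sum((forward(w, x) - t) ** 2 for x, t in zip(X, T))

    pop = rng.standard_normal((50, N_W))
    for gen in range(200):
        f = np.array([fitness(w) for w in pop])
        parents = pop[np.argsort(f)[-25:]]       # keep the fitter half
        children = []
        while len(children) < 50:
            a, b = parents[rng.integers(25, size=2)]
            cut = rng.integers(1, N_W)           # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child += 0.1 * rng.standard_normal(N_W)   # mutation
            children.append(child)
        pop = np.array(children)

    best = max(pop, key=fitness)
    print([round(float(forward(best, x)), 2) for x in X])  # ideally near 0,1,1,0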

The proper comparison would be between a genetic algorithm and the various
methods for doing reinforcement learning using backprop.  The
BP/reinforcement methods I am familiar with involve one or another form of
blind random search.  The random search leads me to believe that a genetic
algorithm would perform very much better than these other reinforcement
learning methods.  But I have no evidence that this is the case.

I have discovered that the genetic algorithm can train very deep
feed-forward networks.  In several experiments I was able to train 12-layer
networks (with ten hidden layers of ten fully-interconnected units each) to
compute simple parity functions.  It remains to be seen whether a GA can
train such deep networks to compute more complex functions in a reasonable
amount of time.  But no connectionist has ever trained a network this deep
to do *anything* using back propagation.  Back propagation is totally
incapable of training networks with 12 layers.  So the genetic algorithm is
unequivocally superior to back propagation in this respect, even though it
has only a reinforcement signal to learn from whereas back propagation
requires and is given much more information to learn from: the target output
vectors.

In a smaller and less conclusive set of experiments I found that the genetic
algorithm could train neural networks containing loops.  For example, the GA
trained a network of eight fully interconnected (64 weights) units to be a
rotating shift-register.  That took a great many network evaluations.

In sum, the traditional genetic algorithm outperforms back propagation on
some network/task combinations but is outperformed by back prop on other
network/task combinations.  The GA may be the better learning method when the
correct network output vectors are not known and only a network performance
measure can be defined.  Since deeper networks can compute functions that
shallower networks cannot (holding network width constant), the GA may be
able to solve some learning problems that back prop currently cannot.  The
GA is also less likely to become trapped on local optima, unlike gradient
followers such as back propagation.  And it is not necessary to use smooth
unit activation functions in a network to be trained by a GA.  So there are
many good reasons to look deeper into using genetic algorithms to train
neural networks.

- - Dan Offutt

>>>>>>>>>>>>>>>> offutt@caen.engin.umich.edu >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

------------------------------

Subject: NN4Defense
From:    "Neuron Digest Moderator - Peter Marvit" <neuron-request@hplabs.hp.com>
Date:    Mon, 03 Apr 89 21:58:59 -0700 


[[ Editor's Note: This is posted for the organizers.  Please do not respond
to me. See contacts at end of message. -PM ]]

                          A One-day Conference:

                       ---------------------------
                       NEURAL NETWORKS for DEFENSE
                       ---------------------------

              Saturday, June 17, 1989 (the day before IJCNN)
                              Washington, DC

          Conference Chair: Prof. Bernard Widrow (Stanford Univ.)
          -------------------------------------------------------

          A one-day conference on defense needs, applications, and
     opportunities for computing with neural networks, featuring
     key representatives from government and industry. It will
     take place in Washington, DC, right before the IEEE and INNS's
     International Joint Conference on Neural Networks (IJCNN).

          The morning session will include program managers from lead-
     ing Department of Defense (DoD) agencies funding Neural Network
     research and development, including Barbara Yoon (Defense
     Advanced Research Projects Agency, DARPA), Joel Davis & Thomas
     McKenna (Office of Naval Research, ONR), William Berry & C. Lee
     Giles (Air Force Office of Scientific Research, AFOSR), plus oth-
     ers to be announced later. They will provide information on: 1)
     proposals they have already funded, 2) the types of proposals
     they intend to fund in the future, 3) how their programs differ
     from other DoD programs, 4) details on how to best approach them
     for neural network R&D funding.

          The afternoon session will feature presentations of the
     current status of defense-oriented research, development, and
     applications of neural network technology from both industry and
     academia.  The speakers include representatives from neural-
     network R&D programs at General Dynamics, Ford Aerospace, Mar-
     tingale Research, Booz-Allen, Northrop, SAIC, Hughes Aircraft,
     Hecht-Nielsen Neurocomputer Corp., Rockwell International, Nestor
     Inc., Martin Marietta, plus others to be announced later. They
     will discuss their current, past, and future involvement in
     neural networks and defense technology, as well as the kinds of
     cooperative ventures in which they might be interested.

           An evening dinner banquet will feature Prof. Bernard
     Widrow as the after-dinner speaker. Prof. Widrow directed the
     recent DARPA study evaluating the military and commercial
     potential of neural networks. He is a professor of EE at
     Stanford University, the current president of the INNS,
     co-inventor of the LMS algorithm (Widrow & Hoff, 1960), and 
     the president of Memistor Corp, the oldest neural network
     applications and development company, which Prof. Widrow
     founded in 1962.
     
          Ample time will be allotted during breaks, lunch, and the
     dinner banquet for informal discussions with the speakers and
     other attendees.

     Program Committee: Mark Gluck (Stanford) & Edward Rosenfeld

     ------------------------------------------------------------
     For more information, call Anastasia Mills at (415) 995-2471 
     ------------------------------------------------------------
     or FAX: (415) 543-0256, or write to: Neural Network Seminars,
       Miller-Freeman, 500 Howard St., San Francisco, CA 94105

------------------------------

Subject: NOISE FILTERING USING NEURAL NETWORKS
From:    Ramkumar P Rangana <okstate!ramkuma@RUTGERS.EDU>
Organization: Oklahoma State Univ., Stillwater
Date:    28 Mar 89 17:46:43 +0000 

Does anyone have any reference articles on NOISE FILTERING of EKG SIGNALS (OR
any signals) using NEURAL-NETWORK techniques?  (Any application of Neural
Nets to NOISE FILTERING is of interest.)

Any information in this area is Welcome.

Thanks in advance.

(ramkuma@a.cs.okstate.edu)

------------------------------

Subject: Re: NOISE FILTERING USING NEURAL NETWORKS
From:    mathew@jane.Jpl.Nasa.Gov (Mathew Yeates)
Organization: Image Analysis Systems Grp, JPL
Date:    Wed, 29 Mar 89 17:14:51 +0000 

I've developed a neural implementation of the sequential regression
algorithm found in Widrow's "Adaptive Signal Processing".
(It's a pseudo-Newton algorithm.)  As it turns out, the net can also
do Kalman filtering by allowing a local interdependence between synapses.
Precisely, a new weight is calculated as a function of its old weight
and the weight of a single neighbor.  It is biologically intuitive that
synapse changes should not occur independently, and adding this feature
to my model allows for more powerful processing.
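
For context, the sequential regression algorithm is essentially a
recursive-least-squares update; a textbook RLS sketch (this is not the
neural architecture of the paper, and the forgetting factor is an arbitrary
choice of mine) looks like:

    import numpy as np

    def rls_update(w, P, x, d, lam=0.99):
        Px = P @ x                            # x is a 1-D input vector
        k = Px / (lam + x @ Px)               # gain vector
        e = d - w @ x                         # a-priori error
        w = w + k * e                         # weight update
        P = (P - np.outer(k, Px)) / lam       # inverse-correlation update
        return w, P

    # Usage: identify a 3-tap filter from noisy input/output pairs.
    rng = np.random.default_rng(0)
    true_w = np.array([0.5, -0.3, 0.2])
    w, P = np.zeros(3), 100.0 * np.eye(3)
    for _ in range(500):
        x = rng.standard_normal(3)
        d = true_w @ x + 0.01 * rng.standard_normal()
        w, P = rls_update(w, P, x, d)
    print(np.round(w, 3))                     # close to true_w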

If you want a copy of my "An Architecture With Neural Network
Characteristics for Least Squares Problems" e-mail your request to
mathew@jane.jpl.nasa.gov.

------------------------------

Subject: Re: NOISE FILTERING USING NEURAL NETWORKS
From:    Ramkumar P Rangana <okstate!ramkuma@RUTGERS.EDU>
Organization: Oklahoma State Univ., Stillwater
Date:    31 Mar 89 01:55:48 +0000 


  I am posting the replies I received from Daniel Pirone, Matt Kennel,
    Harry Langenbacher, Chester, and John Platt.  Thanks to all of you.

                                 RAMKUMAR
                            (ramkuma@a.cs.okstate.edu).
 

- ---------------------------------------------------------------------

Just a few simple noise filters with NNs I have seen in passing:

Dr. Dobb's Journal of Software Tools, #147, January 1989, p. 32 (has C code
listings in the back).

There are some commercial packages available; I don't seem to have any ads
around, but there is always at least one in the Neural Networks journal...

                                        Daniel Pirone

- -----------------------------------------------------------------------

Doyne Farmer at Los Alamos (jdf%heretic@lanl.gov) has done some work with
noise reduction in dynamical systems.  He has a very good preprint that
talks about predicting dynamical systems with various algorithms (he doesn't
use a neural net).  I'm doing a thesis on prediction with neural networks,
and (if I have time) will try to apply it to noise reduction.  The basic
idea is to make predictions a few steps in advance (using previous data) so
that for some time you have more than one "guess" to what the point actually
is, and you can "average" them.  It turns out that reducing noise is
_easier_ in a dissipative system than one would think, because noise in one
coordinate can be suppressed as the system evolves in time.

Matt Kennel
mbkennel@phoenix.princeton.edu


- ----------------------------------------------------------------------
Bernard Widrow has been writing on adaptive filtering for years & years.


my Opinions aren't anyone else's, and they probably shouldn't be
- - Harry Langenbacher  harry%neuron1@jpl-mil.jpl.nasa.gov

- -----------------------------------------------------------------------

In the January, 1989, issue of Dr. Dobb's Journal of Software Tools, you
will find an article titled "Neural Nets and Noise Filtering" by Casimir C.
Klimasauskas, president and founder of NeuralWare.  In this article he uses
a back-propagating network to filter EKG data.  The key idea is the network
has N input units (N is odd), N output units, and FEWER THAN N hidden units.
N successive signal values are used both as input pattern and target
pattern, so the network is being trained to reproduce the input, but because
there are fewer than N hidden units, the network is forced to do some smoothing.
Once the network is trained, successive signal values are applied at the N
inputs and the value of the center output unit is taken as the output of the
filter. The signal values are shifted over one unit, the next signal value
applied to the input unit that is now free, and the value of the center
output unit is read again, etc.
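
A compact sketch of that scheme (toy code with assumed sizes, a synthetic
signal, and linear output units for brevity; this is not Klimasauskas'
program):

    import numpy as np

    rng = np.random.default_rng(0)
    N, H = 9, 4                                 # window size (odd), hidden units
    t = np.arange(2000)
    clean = np.sin(2 * np.pi * t / 50)
    noisy = clean + 0.3 * rng.standard_normal(t.size)

    windows = np.lib.stride_tricks.sliding_window_view(noisy, N)  # (M, N)
    W1 = 0.1 * rng.standard_normal((N, H))
    W2 = 0.1 * rng.standard_normal((H, N))
    lr = 0.05

    for epoch in range(2000):                   # plain batch gradient descent
        Hid = np.tanh(windows @ W1)
        Out = Hid @ W2                          # linear output units
        Err = Out - windows                     # target pattern = input pattern
        gW2 = Hid.T @ Err / len(windows)
        gW1 = windows.T @ ((Err @ W2.T) * (1 - Hid ** 2)) / len(windows)
        W1 -= lr * gW1
        W2 -= lr * gW2

    filtered = (np.tanh(windows @ W1) @ W2)[:, N // 2]   # centre output unit
    ref = clean[N // 2 : -(N // 2)]
    print("raw MSE     ", float(np.mean((noisy[N // 2 : -(N // 2)] - ref) ** 2)))
    print("filtered MSE", float(np.mean((filtered - ref) ** 2)))  # typically lower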

The only references cited in the article are Rumelhart and McClelland,
"Parallel Distributed Processing," Vol. 1, and Widrow and Stearns, "Adaptive
Signal Processing," Prentice-Hall, 1985.  Both are books.
                                
                                        chester@louie.udel.edu


- ---------------------------------------------------------------------------

  I wrote a paper that appeared in the 1986 Snowbird Conference on Neural
Networks (American Institute of Physics Conference Proceedings 151),
entitled "Analog Decoding using Neural Networks" by J.C. Platt and J.J.
Hopfield... I hope this helps...

  I also talk about error-correcting codes in my upcoming thesis,
"Constraint Methods for Neural Networks and Computer Graphics"

                                                        John Platt
                                                        platt@csvax.caltech.edu

------------------------------

Subject: Paper: An Architecture with Neural ....
From:    mathew@orbity.Jpl.Nasa.Gov (Mathew Yeates)
Organization: Image Analysis Systems Grp, JPL
Date:    Wed, 29 Mar 89 19:35:19 +0000 

In a recent posting I mentioned an available paper, "An Architecture with
Neural Network Characteristics for Least Squares Problems" which describes a
neural implementation of a sequential regression algorithm. Requests for a
copy should include a mailing address.  I'm not e-mailing the paper.

 -Mathew Yeates
mathew@jane.jpl.nasa.gov

------------------------------

Subject: Posting to neuron-digest - Parker Learning Law
From:    BART@uieea.ece.uiuc.edu
Date:    Wed, 29 Mar 89 16:33:52 -0600 

   I am interested in methods to improve minimization of the error 
function in backpropagation.  Therefore, could you send a request
to the Neuron-Digest for references on the Parker Learning Law and 
other methods of improving backprop?  The information I receive will
be summarized and sent back out on the digest somehow.  My e-mail 
address is bart%uieea@uxc.cso.uiuc.edu.  Thanks.
                                          Bart Conrath

------------------------------

Subject: Pre-publication to Europe?
From:    prlb2!vub.vub.ac.be!tinf2!wplybaer@uunet.UU.NET (Wim P. Lybaert)
Date:    Wed, 29 Mar 89 23:31:23 +0200 

 REF : WL/89/1/103/037 
   
Dear Sir,

Does anybody know of an American book company or service (well qualified in
scientific publications) that ships books to Europe before their official
European release?  Thank you very much for any help you might be able to
provide.

                                                Yours sincerely,

                                                Wim Lybaert
                                                Brussels Free University
                                                Department PROG
                                                Pleinlaan 2
                                                1050 BRUSSELS
                                                BELGIUM EUROPE

                                        email:  <wplybaer@prog1.vub.ac.be>

------------------------------

Subject: Proceedings of the 1988 Connectionist Models Summer School
From:    honavar@goat.cs.wisc.edu (A Buggy AI Program)
Organization: U of Wisconsin CS Dept
Date:    Thu, 23 Mar 89 22:53:25 +0000 

In article djlinse@phoenix.Princeton.EDU (Dennis Linse) writes:
>Y. le Cun, "A Theoretical Framework for Back-Propagation", Proc. of the
>1988 Connectionist Models Summer School, Pittsburgh, 1988.
>
>know where to find a copy.  Could someone help?  Any pointers to the
>author, similar papers for further reference, sources of the
>Proceedings.  

When I last heard, the author, Yann le Cun, was at the University of Toronto
Computer Science Dept.

The Proceedings of the 1988 Connectionist Models Summer School (Eds: David
Touretzky, Geoffrey Hinton, Terrence Sejnowski), 1989, is published by:

Morgan Kaufmann Publishers
PO Box 50490
Palo Alto, CA 94303.

------------------------------

Subject: Request for information on compilation of neural nets to combinators
From:    darren@uk.ac.city.cs (Darren J R Whobrey)
Date:    Tue, 22 Mar 89 15:09:00 +0000 

Does anyone have any information on the direct compilation of neural nets,
specified in a neurolanguage, into functional code such as the
lambda-calculus or combinators? Would I be correct in thinking that any
neural net (discrete, continuous or otherwise) is based on a restricted form
of parametric higher-order functional programming?
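
As a toy illustration of that view (nothing more), a layer can be written
as a higher-order function and a network as a fold over layers; the XOR
weights below are chosen by hand:

    from functools import reduce

    def layer(weights, biases, activation):
        # layer :: weights -> biases -> (num -> num) -> (inputs -> outputs)
        return lambda xs: [activation(b + sum(w * x for w, x in zip(row, xs)))
                           for row, b in zip(weights, biases)]

    def network(layers):
        return lambda xs: reduce(lambda acc, f: f(acc), layers, xs)

    step = lambda z: 1.0 if z > 0 else 0.0
    xor = network([layer([[1, 1], [1, 1]], [-0.5, -1.5], step),   # OR, AND
                   layer([[1, -1]], [-0.5], step)])               # OR and not AND
    print([xor([a, b])[0] for a in (0.0, 1.0) for b in (0.0, 1.0)])
    # prints [0.0, 1.0, 1.0, 0.0]

Whether such definitions translate cleanly into combinators is, of course,
exactly the question being asked.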

Thanks in advance.

Darren Whobrey                      e-mail: Janet: darren@uk.ac.city.cs
Dept. Computer Science
City University                     God was satisfied with his own work,
Northampton Square                  and that is fatal.     Butler, 1912.
London EC1V OHB

------------------------------

End of Neurons Digest
*********************