[comp.ai.neural-nets] Using deltas at the input layer

demers@beowulf.ucsd.edu (David Demers) (07/26/90)

I'm curious to know if anyone has done anything
with computation of deltas for input units.  I'm
aware of the paper by Risto Miikkulainen & Mike Dyer
in 1988 Connectionist Models Summer School proceedings.

It seems silly on the face of it to compute
an "error" for the input - a determination of
what the input SHOULD HAVE BEEN to produce a
smaller error in the mapping.  The above paper
used the deltas to improve the input representation,
with some interesting results.  

Anyone else try anything using a normal backpropagation
delta at the input layer?

Thanks for any pointers,

Dave DeMers
demers@cs.ucsd.edu

tap@ai.toronto.edu (Tony Plate) (07/26/90)

In article <12039@sdcc6.ucsd.edu> demers@beowulf.ucsd.edu (David Demers) writes:
>I'm curious to know if anyone has done anything
>with computation of deltas for input units.  I'm
>aware of the paper by Risto Miikkulainen & Mike Dyer
>in 1988 Connectionist Models Summer School proceedings.
>
>Anyone else try anything using a normal backpropagation
>delta at the input layer?
>
>Dave DeMers

All networks which use a positional (1-of-n, i.e. one unit on) encoding of
the input can be considered as doing what you are talking about.

Just think of the second or hidden layer as being the input,
and the weights from one of the positionally encoded units
as the encoding for that item.
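
A tiny illustrative sketch of this equivalence (made-up sizes, plain
numpy, not taken from any of the papers mentioned): with a 1-of-n
input, the weight row of the unit that is on acts as a learnable
representation of the item, and its gradient is exactly the delta you
would compute for an input layer holding that representation.

import numpy as np

rng = np.random.default_rng(0)
n_items, rep_size = 5, 3
W_rep = rng.normal(scale=0.1, size=(n_items, rep_size))  # item -> representation
W_out = rng.normal(scale=0.1, size=(rep_size, 1))        # representation -> output

item, target = 2, np.array([1.0])

rep = W_rep[item]                            # a 1-of-n input just selects this row
out = 1.0 / (1.0 + np.exp(-(rep @ W_out)))   # single sigmoid output unit

delta_out = (out - target) * out * (1.0 - out)
delta_rep = W_out @ delta_out                # the "input delta" of the net above the encoding

W_out -= 0.5 * np.outer(rep, delta_out)      # usual weight update
W_rep[item] -= 0.5 * delta_rep               # moving the row = learning the item's encoding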

For an earlier example, see Hinton's family tree paper, in proc.
of the 8th Cog Sci conf, 1986.

You could think of NETtalk as a version of this that takes context
into account (i.e., representing the central letter in its context).

Look for positionally encoded inputs and outputs to identify
more examples.  It gets more interesting at the output, where
the distributed representation sometimes has to encode the
possibility that there are several possible answers.

Tony Plate

niranjan@eng.cam.ac.uk (Mahesan Niranjan) (07/28/90)

In article <12039@sdcc6.ucsd.edu> demers@beowulf.ucsd.edu (David Demers) writes:
>I'm curious to know if anyone has done anything
>with computation of deltas for input units.  I'm
>aware of the paper by Risto Miikkulainen & Mike Dyer
>in 1988 Connectionist Models Summer School proceedings.
>
>It seems silly on the face of it to compute
>an "error" for the input - a determination of
>what the input SHOULD HAVE BEEN to produce a

It depends on the application. If it is noise reduction (or some kind of
filtering), then you are actually doing something to the input data.

niranjan

chrisley@csli.Stanford.EDU (Ron Chrisley) (08/01/90)

In <12039@sdcc6.ucsd.edu> demers@beowulf.ucsd.edu (David Demers) writes:

>I'm curious to know if anyone has done anything
>with computation of deltas for input units.  I'm
>aware of the paper by Risto Miikkulainen & Mike Dyer
>in 1988 Connectionist Models Summer School proceedings.

>Anyone else try anything using a normal backpropagation
>delta at the input layer?

>Thanks for any pointers,

>Dave DeMers
>demers@cs.ucsd.edu

I'm also interested in this idea.  I've read the Miikkulainen and Dyer paper,
and was surprised that they didn't reference Williams's paper in the 8th 
Cog Sci conference ('86): "Inverting a connectionist network mapping by back-
propagation of error".

Anybody have any other references?
-- 
Ron Chrisley    chrisley@csli.stanford.edu
Xerox PARC SSL                               New College
Palo Alto, CA 94304                          Oxford OX1 3BN, UK
(415) 494-4728                               (865) 793-484

al@gmdzi.UUCP (Alexander Linden) (08/06/90)

> In <12039@sdcc6.ucsd.edu> demers@beowulf.ucsd.edu (David Demers) writes:
> 
> >Anyone else try anything using a normal backpropagation
> >delta at the input layer?
> >Dave DeMers
> >demers@cs.ucsd.edu

We did several experiments on the idea of backpropagating error
information to input units.

The most illustrative experiments can be found in Linden and Kindermann
(1989) and, in much more detail but in German, in Linden (1990), where we
did experiments on the recognition of handwritten numerals. We showed that
inversion (i.e. gradient descent in input space) can find input patterns
which the neural network classifies as, say, a "7" but which are themselves
as close as possible to a "3". In other words, this procedure can
explicitly detect misclassifications. A very interesting idea now is to
use these misclassifications as counterexamples and extend your training
set with them (e.g. Hwang90EE22 or Linden90diplom).

A general scheme for detecting misclassifications is given by an extended
error function:

E = (O - T)^2 + (I - IT)^2

where O is the output, T the output target (e.g. a "7"), I is the input
(its initial value serves as the starting point for gradient descent in
input space), and IT is an input target, e.g. the input representation of
some "3".

When you calculate this error and propagate it back through an already
trained NN (leaving the weights unchanged), you get error signals for the
input units. Adding these error values onto the input units (not trivial!
see Linden & Kindermann (1989) or, better, Kindermann & Linden (1990))
drives the input I as close as possible to the input target IT while its
induced output O comes as close as possible to the given target T.
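
In a rough numpy sketch, one inversion run looks roughly like this
(illustrative only: made-up network sizes, random weights standing in
for a trained net, and a plain unconstrained gradient step that ignores
the non-trivial constraints the papers above deal with):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(16, 8))    # input -> hidden (frozen)
W2 = rng.normal(scale=0.5, size=(8, 10))    # hidden -> output, 10 classes (frozen)

T  = np.eye(10)[7]                          # output target, e.g. class "7"
IT = rng.uniform(size=16)                   # input target, e.g. an actual "3" pattern
I  = IT.copy()                              # start gradient descent at the "3"

for step in range(200):
    h = sigmoid(I @ W1)                     # forward pass
    O = sigmoid(h @ W2)

    dO = 2.0 * (O - T) * O * (1.0 - O)      # backprop the (O - T)^2 term
    dh = (dO @ W2.T) * h * (1.0 - h)
    dI = dh @ W1.T + 2.0 * (I - IT)         # plus the gradient of the (I - IT)^2 term

    I -= 0.05 * dI                          # move the input; weights untouched

If the final I still lies close to IT (the "3") but the net maps it to
the "7" target, it is a candidate misclassification that can be added
to the training set as a counterexample.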

Thrun and Linden (1990) applied inversion to recurrent neural networks.

In another internal paper (Thrun, Moeller and Linden, 1990) we show that
it is possible to generate actions from forward models via inversion.
Consider a forward model with the current state and some action as input
and the next state as output. An action is generated by performing
gradient descent in action space so that the predicted next state
approximates the goal. Very interesting results can be obtained by
concatenating such forward models; this leads to an n-step-lookahead
planner. But, as is common practice, we have so far done only some very
small experiments.
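
A similarly rough sketch of the action-generation idea (again purely
illustrative: a random net stands in for the trained forward model and
all sizes are made up):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
W = rng.normal(scale=0.5, size=(4 + 2, 4))  # forward model: [state, action] -> next state
state  = rng.uniform(size=4)                # current state
goal   = rng.uniform(size=4)                # desired next state
action = np.zeros(2)                        # start from a neutral action

for step in range(200):
    x = np.concatenate([state, action])
    pred = sigmoid(x @ W)                   # predicted next state

    d_pred = 2.0 * (pred - goal) * pred * (1.0 - pred)
    d_x = d_pred @ W.T                      # backprop to the model's input
    action -= 0.1 * d_x[4:]                 # gradient descent on the action part only

Chaining several copies of the forward model (feeding each predicted
state into the next copy) and backpropagating through the whole chain
gives the n-step-lookahead planner mentioned above.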

If you are interested in any paper of our group, send your request to

Alexander Linden                    | TEL. (49 or 0) 2241/14-2537
Research Group for Adaptive Systems | FAX. (49 or 0) 2241/14-2618 or -2889
GMD                                 | TELEX 889469 gmd d
P. O. BOX 1240                      |            /  al@gmdzi.uucp
D-5205 St. Augustin 1               |     e-mail<   al@zi.gmd.dbp.de
Federal Republic of Germany         |            \  unido!gmdzi!al@uunet.uu.net
-------------------------------------------------------------------------------


References to Inversion of Neural Networks:

@UNPUBLISHED{Hwang90EE22,
  AUTHOR	= {Hwang, J. N. and Choi, J. J. and Oh, S. and Marks, R. J.},
  TITLE	= {Query Learning Based on Boundary Search and Gradient
		   Computation of Trained Multilayer Perceptrons},
  NOTE		= {to appear in Proceedings of IJCNN 90, San Diego, June 17-21,
		   1990},
  YEAR		= {1990},
  KEYWORDS	= {inversion},
  REF		= {EE22}
}

@INPROCEEDINGS{Hwang90EE23,
  AUTHOR	= {Hwang, J. N. and Chan, C. H.},
  TITLE	= {Iterative Constrained Inversion of Neural Networks and its Applications},
  BOOKTITLE	= {24th Conference on Information Systems and Sciences, Princeton,
		   March 1990},
  YEAR		= {1990},
  KEYWORDS	= {inversion},
  REF		= {EE23}
}

@ARTICLE{Kindermann90PC,
AUTHOR		= {Kindermann, J. and Linden, A.},
TITLE		= {Inversion of Neural Nets},
YEAR		= 1990,
JOURNAL	= {Parallel Computing},
NOTE		= {(to appear)},
REF 		= {Map}
}

@INPROCEEDINGS{Linden89Inversion,
AUTHOR		= {Linden, A. and Kindermann, J.},
TITLE		= {Inversion of Multilayer Nets},
BOOKTITLE	= {Proceedings of the First International Joint Conference on Neural Networks, Washington, DC},
PUBLISHER	= {IEEE},
ADDRESS		= {San Diego},
YEAR		= 1989,
REF		= {Inversion}
}


@MASTERSTHESIS{Linden90diplom,
  AUTHOR	= {Linden, A.},
  TITLE	= {{U}ntersuchung von {B}ackpropagation in konnektionistischen
		   {S}ystemen},
  SCHOOL	= {Universit{\"a}t Bonn},
  YEAR		= {1990},
  ADDRESS	= {Bonn},
  REF		= {diplom}
}

@UNPUBLISHED{Suddarth89Z4,
AUTHOR		= {Suddarth, S. C. and Bourrely, J. C.},
TITLE		= {A Back-Propagation Associative Memory for Both Positive and
Negative Learning},
YEAR		= 1989,
NOTE		= {Poster Presentation at IJCNN-89},
KEYWORDS	= {backpropagation | associative memory | negative training},
REF		= {Z4}
}

@INPROCEEDINGS{Thrun90Inversion,
AUTHOR		= {Thrun, S. and Linden, A.},
TITLE		= {Inversion in Time},
BOOKTITLE	= {Proceedings of the EURASIP Workshop on Neural
Networks, Sesimbra, Portugal, February 15-17},
ORGANIZATION 	= {EURASIP} ,
YEAR		= 1990,
REF		= {Inversion}
}

@INPROCEEDINGS{Williams86Z23,
AUTHOR		= {Williams, R. J.},
TITLE		= {Inverting a Connectionist Network Mapping by
Backpropagation of Error},
BOOKTITLE	= {8th Annual Conference of the Cognitive Science Society},
YEAR		= 1986,
PUBLISHER	= {Lawrence Erlbaum},
ADDRESS		= {Hillsdale, NJ},
KEYWORDS	= {backpropagation | inversion | natural language},
REF		= {Z23}
}