[comp.ai.neural-nets] Adding 0.1 to logistic

esrmm@warwick.ac.uk (Denis Anthony) (05/23/91)

I attended a seminar yesterday, at which it was stated that adding 0.1
to the logistic function in back prop speeds up learning by 50% (in one
application anyway).

Is this a known phenomenon, and if so, is there any reason for it?

Denis

smagt@fwi.uva.nl (Patrick van der Smagt) (05/23/91)

esrmm@warwick.ac.uk (Denis Anthony) writes:

>I attended a seminar yesterday, at which it was stated that adding 0.1
>to the logistic function in back prop speeds up learning by 50% (in one
>application anyway).

>Is this a known phenomenon, and if so, is there any reason for it?

Well, maybe this is not so hard to explain.  When the initial weights are
very small, the input to each hidden unit (i.e., the argument of the
logistic function) is situated around 0.  The behaviour of the network
is then almost linear, since around 0 the logistic function is almost
linear.  The network will not be able to solve a non-linear problem
with linear hidden units, and the weights will tend to 0.

Adding a small value to the input of the hidden unit will, of course,
shift its operating point into a less linear region, and thus the initial
phase of training will be faster.
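
To see what I mean, here is a little numerical illustration (my own
sketch; the weight range and unit count are arbitrary choices):

import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
weights = rng.uniform(-0.01, 0.01, size=10)   # very small initial weights
inputs  = rng.uniform(-1.0, 1.0, size=10)

net = np.dot(weights, inputs)                 # net input of one hidden unit
print(net)                                    # close to 0
# around 0, logistic(x) is roughly 0.5 + x/4, i.e. essentially linear
print(logistic(net), 0.5 + net / 4.0)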

							Patrick van der Smagt
								    /\/\
                                                                    \  /
Organisation: Faculty of Mathematics & Computer Science             /  \
              University of Amsterdam, Kruislaan 403,            _  \/\/  _
              NL-1098 SJ  Amsterdam, The Netherlands            | |      | |
Phone:        +31 20  525 7524                                  | | /\/\ | |
Fax:          +31 20  525 7490                                  | | \  / | |
                                                                | | /  \ | |
email:        smagt@fwi.uva.nl                                  | | \/\/ | |
                                                                | \______/ |
                                                                 \________/

								    /\/\
``The opinions expressed herein are the author's only and do        \  /
not necessarily reflect those of the University of Amsterdam.''     /  \
                                                                    \/\/

rr2p+@andrew.cmu.edu (Richard Dale Romero) (05/24/91)

In response to Patrick's statement about pushing the logistic towards a more
non-linear region, I think he was slightly off about what the 0.1 was being
added to.  It would make more sense to add 0.1 to the output of the logistic,
not its input.  The bias parameter takes care of any movements along the
logistic curve that you need to make.
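
Just to make the distinction concrete (my own sketch, not code from any
of the papers mentioned below):

import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def output_shifted(x):
    # adding 0.1 to the *output* of the logistic
    return logistic(x) + 0.1

def input_shifted(x, bias=0.1):
    # adding 0.1 to the *input* -- but a trainable bias weight
    # already provides this kind of shift along the curve
    return logistic(x + bias)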

A possible reason why adding this 0.1 would speed up learning has been
brought up before on this group, I believe, or something along those lines.
The two things I do remember are subtracting 0.5 from the logistic so that it
is centered around 0, and adding 0.1 to the sigmoid-prime function.  Both
are discussed in Fahlman's 'An Empirical Study of Learning Speed in Back-
Propagation Networks', CMU-CS-88-162.  The reason for adding 0.1 to the
sigmoid-prime function is to avoid letting it go to 0 when the input is at an
extreme.  The symmetric sigmoid is discussed in Stornetta and Huberman's
'An Improved Three-Layer Back-Prop Algorithm' in Proceedings of the IEEE
International Conference on Neural Networks, pages 637-644, 1987.
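
Rough sketches of the two tricks (my own paraphrase, not code from
either paper):

import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime_plus(x, offset=0.1):
    # Fahlman: add a small constant to the derivative so it cannot
    # go to 0 when a unit saturates at an extreme
    y = logistic(x)
    return y * (1.0 - y) + offset

def symmetric_sigmoid(x):
    # Stornetta & Huberman: shift the output range from (0, 1) to
    # (-0.5, 0.5) so activations are centered around 0
    return logistic(x) - 0.5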

-rick

len@retina.mqcs.mq.oz.au (Len Hamey) (05/24/91)

In article <1991May23.141446.28619@fwi.uva.nl> smagt@fwi.uva.nl (Patrick van der Smagt) writes:
>esrmm@warwick.ac.uk (Denis Anthony) writes:
>
>>I attended a seminar yesterday, at which it was stated that adding 0.1
>>to the logistic function in back prop speeds up learning by 50% (in one
>>application anyway).
>
>>Is this a known phenomenon, and if so, is there any reason for it?
>
>Well, maybe this is not so hard to explain.  When initial weights are

Adding 0.1 to the logistic function is discussed in Fahlman's
paper:  An Empirical Study of Learning Speed in Back-Propagation Networks.
It is available from neuroprose.

Len Hamey			len@retina.mqcs.mq.oz.au
Lecturer in Computing
Macquarie University