bettingr@acsu.buffalo.edu (Keith E. Bettinger) (06/21/91)
SHORT QUESTION:
==============
In backpropagation networks, is it *inherent* in the equations involved
that the activation range be on the unit interval [0,1] or the symmetric
interval [-1,1]?

BACKGROUND:
==========
I've been trying to relate a set of real-valued inputs to a set of
real-valued outputs using a 3-layer backpropagation neural network,
without any success.  The network maxes out immediately: the hidden
nodes go straight to either minimum or maximum activation, and no
appreciable learning takes place thereafter.

I *was* able to get a working network, though, by scaling each input
and output down to a [0,1] range.  But this procedure has problems of
its own, not the least of which is having to know the entire range of
inputs and outputs before beginning.

LONG QUESTION:
=============
Is a [0,1] range (or a [-1,1] range, which also worked) necessary for
backprop nets?  If so, can the equations be modified to allow a wider,
ideally unlimited, activation range?  If not, are there any special
techniques needed to get such a network going?

Thank you for any help.  If the volume of replies warrants, I will
summarize.

-------------------------------------------------------------------------
Keith E. Bettinger                  "All of us get lost in the darkness
SUNY at Buffalo Computer Science     Dreamers learn to steer by the stars
                                     All of us do time in the gutter
                                     Dreamers turn to look at the cars."
INTERNET: bettingr@cs.buffalo.edu                  - Neil Peart
UUCP: ..{bbncca,decvax,rocksvax,watmath}!sunybcs!bettingr
-------------------------------------------------------------------------
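A minimal sketch of that scaling step, in Python with NumPy (the
function names are illustrative, not from the post):

    import numpy as np

    def minmax_scale(x, lo, hi):
        # Linearly map values from [lo, hi] into [0, 1].
        return (x - lo) / (hi - lo)

    def minmax_unscale(y, lo, hi):
        # Map network outputs in [0, 1] back to the original range.
        return y * (hi - lo) + lo

    # The drawback noted above: lo and hi must cover the entire range
    # of the data, so they have to be known before training begins.
    targets = np.array([3.2, 150.0, -7.5])
    lo, hi = targets.min(), targets.max()

    scaled = minmax_scale(targets, lo, hi)      # all values now in [0, 1]
    restored = minmax_unscale(scaled, lo, hi)   # recovers the originals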
kooijman@duteca.et.tudelft.nl (Richard Kooijman) (06/22/91)
bettingr@acsu.buffalo.edu (Keith E. Bettinger) writes:

>In backpropagation networks, is it *inherent* in the equations involved
>that the activation range be on the unit interval [0,1] or the
>symmetric interval [-1,1]?

You should have no trouble using these ranges even for real-valued input
and output.  I have done several experiments that all worked, except for
networks that didn't have enough hidden units to handle the problem.

I have tested other ranges too, and concluded that if you use values
even slightly greater than 1, you get into trouble.  If you use the
least-squares error function, you can't use any range outside [-1,1],
because the learning rule can no longer correct a value toward 1: the
errors for a desired output of 1 with actual outputs of 0.8 and 1.2
are the same size, so the two cases can't be distinguished.

Before I wrote this reply I couldn't figure out why you can't use
ranges outside [-1,1], but now I get it.  The result is that I wrote
this message without much thought, so I may be mistaken.

I hope I helped you anyway,

Richard.
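A small sketch of the saturation behaviour both posts touch on,
assuming the standard logistic activation with squared error (the
names and numbers below are illustrative, not from the posts).  A
logistic unit can only emit values strictly inside (0, 1), so a
target such as 1.2 is unreachable, and the derivative factor in the
backprop update vanishes as the unit saturates:

    import numpy as np

    def sigmoid(z):
        # Standard logistic activation; outputs lie strictly in (0, 1).
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_deriv(out):
        # d(out)/dz for the logistic function: out * (1 - out).
        return out * (1.0 - out)

    target = 1.2   # outside (0, 1): no weights can reach it

    for z in (-10.0, 0.0, 10.0):
        out = sigmoid(z)
        # For a squared-error output unit the backprop delta is
        # (target - out) * out * (1 - out); near saturation the
        # derivative factor is almost zero, so the weight updates
        # stall.  This matches the "maxed out" hidden nodes in the
        # first post.
        delta = (target - out) * sigmoid_deriv(out)
        print(f"z={z:6.1f}  out={out:.5f}  delta={delta:.5f}")

Note also that the squared error by itself cannot tell which side of
the target the output is on: (1 - 0.8)**2 equals (1 - 1.2)**2, which
is the indistinguishability described above.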