bettingr@acsu.buffalo.edu (Keith E. Bettinger) (06/21/91)
SHORT QUESTION:
==============
In backpropagation networks, is it *inherent* in the equations involved
that the activation range be on the unit interval [0,1] or the symmetric
interval [-1,1]?

BACKGROUND:
==========
I've been trying to relate a set of real-valued inputs to a set of
real-valued outputs using a 3-layer backpropagation neural network,
without any success.  The network maxes out immediately: the hidden
nodes go straight to either minimum or maximum activation, and no
appreciable learning takes place thereafter.

I *was* able to get a working network, though, by scaling each input
and output down to a [0,1] range.  But this procedure has problems of
its own, not the least of which is having to know the entire range of
inputs and outputs before beginning.

LONG QUESTION:
=============
Is a [0,1] range (or a [-1,1] range, which also worked) necessary for
backprop nets?  If so, can the equations be modified to allow a wider,
ideally unlimited, activation range?  If not, are there any special
techniques needed to get such a network going?

Thank you for any help.  If the volume of replies warrants, I will
summarize.

-------------------------------------------------------------------------
Keith E. Bettinger                  "All of us get lost in the darkness
SUNY at Buffalo Computer Science     Dreamers learn to steer by the stars
                                     All of us do time in the gutter
                                     Dreamers turn to look at the cars."
INTERNET: bettingr@cs.buffalo.edu                  - Neil Peart
UUCP: ..{bbncca,decvax,rocksvax,watmath}!sunybcs!bettingr
-------------------------------------------------------------------------
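A minimal sketch of that scaling step, in Python with NumPy (the
function names are illustrative, not from the post):

    import numpy as np

    def minmax_scale(x, lo, hi):
        # Linearly map values from [lo, hi] into [0, 1].
        return (x - lo) / (hi - lo)

    def minmax_unscale(y, lo, hi):
        # Map network outputs in [0, 1] back to the original range.
        return y * (hi - lo) + lo

    # The drawback noted above: lo and hi must cover the entire range
    # of the data, so they have to be known before training begins.
    targets = np.array([3.2, 150.0, -7.5])
    lo, hi = targets.min(), targets.max()

    scaled = minmax_scale(targets, lo, hi)      # all values now in [0, 1]
    restored = minmax_unscale(scaled, lo, hi)   # recovers the originals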
kooijman@duteca.et.tudelft.nl (Richard Kooijman) (06/22/91)
bettingr@acsu.buffalo.edu (Keith E. Bettinger) writes:

>In backpropagation networks, is it *inherent* in the equations involved
>that the activation range be on the unit interval [0,1] or the
>symmetric interval [-1,1]?

You should have no trouble using these ranges even for real-valued input
and output.  I have done several experiments that all worked, except for
networks that didn't have enough hidden units to handle the problem.

I have tested other ranges too, and concluded that if you use values
even slightly greater than 1, you get into trouble.  If you use the
least-squares error function, you can't use any range outside [-1,1],
because the learning rule can no longer correct a value toward 1: the
errors for a desired output of 1 with actual outputs of 0.8 and 1.2
are the same size, so the two cases can't be distinguished.

Before I wrote this reply I couldn't figure out why you can't use
ranges outside [-1,1], but now I get it.  The result is that I wrote
this message without much thought, so I may be mistaken.

I hope I helped you anyway,

Richard.
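A small sketch of the saturation behaviour both posts touch on,
assuming the standard logistic activation with squared error (the
names and numbers below are illustrative, not from the posts).  A
logistic unit can only emit values strictly inside (0, 1), so a
target such as 1.2 is unreachable, and the derivative factor in the
backprop update vanishes as the unit saturates:

    import numpy as np

    def sigmoid(z):
        # Standard logistic activation; outputs lie strictly in (0, 1).
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_deriv(out):
        # d(out)/dz for the logistic function: out * (1 - out).
        return out * (1.0 - out)

    target = 1.2   # outside (0, 1): no weights can reach it

    for z in (-10.0, 0.0, 10.0):
        out = sigmoid(z)
        # For a squared-error output unit the backprop delta is
        # (target - out) * out * (1 - out); near saturation the
        # derivative factor is almost zero, so the weight updates
        # stall.  This matches the "maxed out" hidden nodes in the
        # first post.
        delta = (target - out) * sigmoid_deriv(out)
        print(f"z={z:6.1f}  out={out:.5f}  delta={delta:.5f}")

Note also that the squared error by itself cannot tell which side of
the target the output is on: (1 - 0.8)**2 equals (1 - 1.2)**2, which
is the indistinguishability described above.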