ling@array.UUCP (Ling Guan) (08/11/90)
I am currently using a NN to do recognition of objects in x-ray images. The NN I use is a standard backpropagation net with the cumulative generalized delta training rule. I found that the classification results depend on how the inputs are scaled. For example, scaling the inputs to between -1 and 1 gives better results than scaling them to between 0 and 1. I can't find any explanation for this outcome. Can anybody give me comments or explanations? Thanks in advance.

Ling
al@gmdzi.UUCP (Alexander Linden) (08/13/90)
In article <473@array.UUCP>, ling@array.UUCP (Ling Guan) writes:
> ... scaling the inputs to between -1 and 1 gives better
> results than to between 0 and 1. I can't find any explanation for
> this outcome. Anybody can give me comments or explanations?

I see the main reason for quicker convergence in the fact that the weights leading away from the input units can be updated twice as often. This is because in

$$w_{ji} \leftarrow w_{ji} + \eta \, \delta_j \, a_i$$

the factor $a_i$ has an effect on learning. When you use sparse coding with many zeros, this factor will be zero most of the time, so the corresponding weights receive no update on that pattern. But if you use -1 instead of 0, every weight can learn on each update.

Another point is, of course, that you alter the semantics of the activations: -1 has the opposite effect to +1, while 0 has no effect at all. In many cases this semantics seems more plausible.

Alexander Linden                     | TEL. (49 or 0) 2241/14-2537
Research Group for Adaptive Systems  | FAX. (49 or 0) 2241/14-2618 or -2889
GMD                                  | TELEX 889469 gmd d
P. O. BOX 1240                       |       / al@gmdzi.uucp
D-5205 St. Augustin 1                | e-mail< al@zi.gmd.dbp.de
Federal Republic of Germany          |       \ unido!gmdzi!al@uunet.uu.net
-------------------------------------------------------------------------------
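[A minimal sketch of the point above, for a single sigmoid unit trained with the plain delta rule; the learning rate, the toy pattern, and the helper name delta_rule_step are made up for illustration and are not from the original post. With 0/1 coding, weights attached to zero-valued input components receive no update on that pattern, whereas with -1/+1 coding every weight moves on every presentation.]

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def delta_rule_step(w, a, target, eta=0.5):
        """One update w_i <- w_i + eta * delta * a_i for a single sigmoid unit."""
        y = sigmoid(np.dot(w, a))
        delta = (target - y) * y * (1.0 - y)   # generalized delta for a sigmoid output
        return w + eta * delta * a

    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=3)

    a01 = np.array([1.0, 0.0, 1.0])            # sparse 0/1 coding
    apm = 2.0 * a01 - 1.0                      # same pattern recoded as -1/+1

    print(delta_rule_step(w, a01, 1.0) - w)    # middle weight change is exactly 0
    print(delta_rule_step(w, apm, 1.0) - w)    # all three weights change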
bill@wayback.unm.edu (william horne) (08/14/90)
In article <473@array.UUCP> ling@array.UUCP (Ling Guan) writes:
>I am currently using NN to do recognition of objects in
>xray images. The NN I use is a standard backpropagation net with
>cumulative generalized delta training rule. I found that the
>classification results depend on how the inputs are scaled. For
>example, scaling the inputs to between -1 and 1 gives better
>results than to between 0 and 1. I can't find any explanation for
>this outcome. Anybody can give me comments or explanations?

We have found, by looking at 3-dimensional plots of the error surface, that if the data is not centered about the origin, there is often a steep "wall" that the weight trajectory bounces off of during learning. However, when the data is centered about the origin, the walls become less steep and the surface is easier to search. This may imply that all input data should be normalized to be centered about the origin.

-bill
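[One way to see the same effect numerically; this is an illustrative sketch, not the 3-D plots described above. For a single linear unit with squared error, the Hessian of the error surface is the input correlation matrix, so its condition number measures how much steeper the surface is in one direction than another. Uniform inputs in [0, 1] give a much worse condition number than the same inputs recentered to [-1, 1]; the helper name condition_number is hypothetical.]

    import numpy as np

    rng = np.random.default_rng(1)
    X01 = rng.random((1000, 10))          # inputs uniform in [0, 1]
    Xpm = 2.0 * X01 - 1.0                 # same inputs rescaled to [-1, 1]

    def condition_number(X):
        """Condition number of the Hessian X^T X / n for a linear least-squares unit."""
        H = X.T @ X / len(X)
        eig = np.linalg.eigvalsh(H)       # eigenvalues in ascending order
        return eig[-1] / eig[0]

    print(condition_number(X01))   # large: a long narrow valley with steep walls
    print(condition_number(Xpm))   # near 1: an almost round bowl, easier to search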
kingsley@hpwrce.HP.COM (Kingsley Morse) (08/14/90)
I remember reading a paper years ago on symmetrically scaling activities between -1 and 1 instead of between 0 and 1. Unfortunately, I don't remember for sure where I saw it, but it may have been in the proceedings of the IEEE 1st conference on Neural Networks.
grange@brillig.cs.umd.edu (Granger Sutton) (08/15/90)
I looked at this phenomenon briefly a while ago. In addition to what other people have already said, the lengths of the input vectors (the activations of the input nodes viewed as a vector) seemed to have an impact on learning. Specifically, if all the input vectors are the same length, learning times seem to be faster and more uniform (smaller variance). For binary input vectors, -1/1 coding gives you input vectors that are all the same length: an n-component vector of -1s and 1s always has length sqrt(n), whereas the length of a 0/1 vector depends on how many components are on.

Granger Sutton
grange@brillig.cs.umd.edu
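[A quick numeric check of that observation; this sketch is illustrative and not from the original post. Under -1/+1 coding every 8-bit pattern has Euclidean length sqrt(8), while under 0/1 coding the length varies with the number of active bits.]

    import numpy as np

    n = 8
    patterns01 = np.array([[int(b) for b in format(k, "08b")] for k in range(2 ** n)],
                          dtype=float)
    patternspm = 2.0 * patterns01 - 1.0   # recode 0/1 as -1/+1

    # 0/1 coding: many different lengths (0, 1, sqrt(2), ...)
    print(np.unique(np.round(np.linalg.norm(patterns01, axis=1), 6)))
    # -1/+1 coding: a single length, sqrt(8) ~ 2.828
    print(np.unique(np.round(np.linalg.norm(patternspm, axis=1), 6)))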
al@gtx.com (Alan Filipski) (08/16/90)
In article <3430004@hpwrce.HP.COM> kingsley@hpwrce.HP.COM (Kingsley Morse) writes:
->I remember reading a paper years ago on symmetrically scaling activities between
->-1 and 1 instead of 0 to 1. Unfortunately, I don't remember for sure where I
->saw it, but it may have been in the proceedings of the IEEE 1st conference
->on Neural Networks.
Yes, it is a paper by Stornetta & Huberman in the Proceedings of the
First International Conference on Neural Networks.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
( Alan Filipski, GTX Corp, 8836 N. 23rd Avenue, Phoenix, Arizona 85021, USA )
( {decvax,hplabs,uunet!amdahl,nsc}!sun!sunburn!gtx!al (602)870-1696 )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~