ins_atge@jhunix.HCF.JHU.EDU (Thomas G Edwards) (08/25/90)
Archive-name: opt/24-Aug-90
Original-posting-by: ins_atge@jhunix.HCF.JHU.EDU (Thomas G Edwards)
Original-subject: Re: Why the more input neurons, the faster the convergence..?
Archive-site: cse.ogi.edu [129.95.10.2]
Archive-directory: /pub/nnvowels
Reposted-by: emv@math.lsa.umich.edu (Edward Vielmetti)

In article <5462@minyos.xx.rmit.oz> rcoahk@koel.co.rmit.oz (Alvaro Hui Kau) writes:
>From a recent experiment on Gaussian data classification
>using Bp Algorithm, I found that the higher dimensions
>ones( so need more input neurons) converge much much faster
>than those of lower dimensions.

I have also noticed this.  I believe that networks with few input
dimensions suffer very seriously from local-minima problems (e.g. XOR).
Remember, it has been shown that Bp nets are _very_ sensitive to
initial weight conditions (sorry, my reference isn't here right now),
and different initial weights can change convergence times by orders
of magnitude (at least for small problems).

The solution?  Well, you could use high-dimensional networks, but of
course you then have to spend more time per epoch.  I think the best
idea is to use conjugate-gradient methods (see _Numerical_Recipes_, or
the paper on efficient parallel learning methods in
_Neural_Information_Processing_Systems_I_ [ed. Touretzky]), or at least
steepest descent with line search.  The line searches allow you to
quickly cross vast wastelands of nearly flat error surface, which would
take a very long time with vanilla Bp.  Then you won't need a
supercomputer to do your neural networks (although a Sun would be
nice...).

Try the conjugate-gradient learning program OPT, available via anon
ftp from cse.ogi.edu in the /pub/nnvowels directory.

-Thomas Edwards
 The Johns Hopkins University / U.S. Naval Research Lab
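[Editor's illustration: the idea of replacing vanilla Bp's fixed step
size with a line search can be sketched as below.  This is a minimal
modern Python/NumPy sketch of steepest descent with a backtracking
(Armijo) line search on the XOR problem; the 2-2-1 network size, the
random seed, and all constants are illustrative assumptions, and this
is not the OPT program referred to in the post.]

```python
# Steepest descent with a backtracking line search on XOR.
# Instead of a fixed learning rate, each step starts with a large
# trial step and halves it until the error actually decreases enough,
# which lets the search cross nearly flat regions of the error surface.
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(w):
    # 2-2-1 network: W1 (2x2), b1 (2), W2 (2x1), b2 (scalar) -> 9 params
    return w[:4].reshape(2, 2), w[4:6], w[6:8].reshape(2, 1), w[8]

def error(w):
    W1, b1, W2, b2 = unpack(w)
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    return 0.5 * np.sum((out - y) ** 2)

def gradient(w):
    # Standard backprop deltas for the sum-of-squares error.
    W1, b1, W2, b2 = unpack(w)
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    d2 = (out - y) * out * (1 - out)        # output-layer delta
    d1 = (d2 @ W2.T) * h * (1 - h)          # hidden-layer delta
    return np.concatenate([(X.T @ d1).ravel(), d1.sum(0),
                           (h.T @ d2).ravel(), [d2.sum()]])

def backtracking_step(w, g, alpha=4.0, c=1e-4):
    # Halve the step until the Armijo sufficient-decrease test passes.
    e0, gg = error(w), np.dot(g, g)
    while error(w - alpha * g) > e0 - c * alpha * gg:
        alpha *= 0.5
        if alpha < 1e-12:
            break
    return w - alpha * g

w = rng.uniform(-1, 1, 9)
e_start = error(w)
for _ in range(2000):
    w = backtracking_step(w, gradient(w))

W1, b1, W2, b2 = unpack(w)
preds = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
print("error: %.4f -> %.4f" % (e_start, error(w)))
print("outputs:", np.round(preds.ravel(), 2))
```

A full conjugate-gradient method would additionally mix each new
search direction with the previous one (e.g. the Polak-Ribiere
formula) instead of always following the raw negative gradient, but
the line search above is the ingredient that lets either method take
large steps across flat error plateaus.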