ins_atge@jhunix.HCF.JHU.EDU (Thomas G Edwards) (08/25/90)
Archive-name: opt/24-Aug-90
Original-posting-by: ins_atge@jhunix.HCF.JHU.EDU (Thomas G Edwards)
Original-subject: Re: Why the more input neurons, the faster the convergence..?
Archive-site: cse.ogi.edu [129.95.10.2]
Archive-directory: /pub/nnvowels
Reposted-by: emv@math.lsa.umich.edu (Edward Vielmetti)

In article <5462@minyos.xx.rmit.oz> rcoahk@koel.co.rmit.oz (Alvaro Hui Kau) writes:
>From a recent experiment on Gaussian data classification
>using Bp Algorithm, I found that the higher dimensions
>ones( so need more input neurons) converge much much faster
>than those of lower dimensions.

I have also noticed this.  I believe that networks with few input
dimensions suffer very seriously from local-minima problems (e.g. XOR).
Remember, it has been shown that Bp nets are _very_ sensitive to
initial weight conditions (sorry, my reference isn't here right now),
and different initial weights can change convergence times by orders
of magnitude (at least for small problems).

The solution?  Well, you could use high-dimensional networks, but of
course you then have to spend more time per epoch.  I think the best
idea is to use conjugate-gradient methods (see _Numerical_Recipes_, or
the paper on efficient parallel learning methods in
_Neural_Information_Processing_Systems_I_ [ed. Touretzky]), or at least
steepest descent with line search.  The line searches allow you to
quickly cross vast wastelands of nearly flat error surface, which would
take a very long time with vanilla Bp.  Then you won't need a
supercomputer to do your neural networks (although a Sun would be
nice...).

Try the conjugate-gradient learning program OPT, available via anon
ftp from cse.ogi.edu in the /pub/nnvowels directory.

-Thomas Edwards
 The Johns Hopkins University / U.S. Naval Research Lab
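[Editor's illustration: the idea of replacing vanilla Bp's fixed step
size with a line search can be sketched as below.  This is a minimal
modern Python/NumPy sketch of steepest descent with a backtracking
(Armijo) line search on the XOR problem; the 2-2-1 network size, the
random seed, and all constants are illustrative assumptions, and this
is not the OPT program referred to in the post.]

```python
# Steepest descent with a backtracking line search on XOR.
# Instead of a fixed learning rate, each step starts with a large
# trial step and halves it until the error actually decreases enough,
# which lets the search cross nearly flat regions of the error surface.
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unpack(w):
    # 2-2-1 network: W1 (2x2), b1 (2), W2 (2x1), b2 (scalar) -> 9 params
    return w[:4].reshape(2, 2), w[4:6], w[6:8].reshape(2, 1), w[8]

def error(w):
    W1, b1, W2, b2 = unpack(w)
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    return 0.5 * np.sum((out - y) ** 2)

def gradient(w):
    # Standard backprop deltas for the sum-of-squares error.
    W1, b1, W2, b2 = unpack(w)
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    d2 = (out - y) * out * (1 - out)        # output-layer delta
    d1 = (d2 @ W2.T) * h * (1 - h)          # hidden-layer delta
    return np.concatenate([(X.T @ d1).ravel(), d1.sum(0),
                           (h.T @ d2).ravel(), [d2.sum()]])

def backtracking_step(w, g, alpha=4.0, c=1e-4):
    # Halve the step until the Armijo sufficient-decrease test passes.
    e0, gg = error(w), np.dot(g, g)
    while error(w - alpha * g) > e0 - c * alpha * gg:
        alpha *= 0.5
        if alpha < 1e-12:
            break
    return w - alpha * g

w = rng.uniform(-1, 1, 9)
e_start = error(w)
for _ in range(2000):
    w = backtracking_step(w, gradient(w))

W1, b1, W2, b2 = unpack(w)
preds = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
print("error: %.4f -> %.4f" % (e_start, error(w)))
print("outputs:", np.round(preds.ravel(), 2))
```

A full conjugate-gradient method would additionally mix each new
search direction with the previous one (e.g. the Polak-Ribiere
formula) instead of always following the raw negative gradient, but
the line search above is the ingredient that lets either method take
large steps across flat error plateaus.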