loren@tristan.llnl.gov (Loren Petrich) (03/12/91)
Having reviewed some Conjugate Gradient methods, I find them rather complicated. An alternative, due to Fahlman, is the Quickprop algorithm. It is described in some papers of his that can be found in the /pub/neuroprose directory of cheops.cis.ohio-state.edu, available by anonymous ftp.

Basically, it works by remembering the previous gradient and the step taken from there, and finding the new weight values by fitting a straight line through the previous and current gradient values and stepping to where that line crosses zero. This operation is done on each weight component separately. In effect, the Hessian is approximated as a diagonal matrix, with each diagonal element estimated independently of the others.

There are some fudge factors that have to be added here and there, such as adding a gradient-descent "starter" and keeping the stepsizes from growing too rapidly, but this algorithm is remarkably simple. I have found it to be a stable and fast algorithm for solving gradient-descent problems.

Has anyone else had experience with Quickprop, and how does it compare with Conjugate Gradients and other such methods?

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
Loren Petrich, the Master Blaster: loren@sunlight.llnl.gov
Since this nodename is not widely known, you may have to try:
loren%sunlight.llnl.gov@star.stanford.edu
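For readers who want to see the per-weight update from the post above in concrete form, here is a minimal Python sketch. It is an illustration under stated assumptions, not Fahlman's reference code: the names `quickprop_step` and `minimize`, the starter learning rate of 0.1, the growth limit of 1.75, and the toy quadratic error function are all chosen here for the example. The rule for the case where the gradient has not shrunk (take the largest allowed step in the same direction) is one common way to handle that situation and is likewise an assumption.

```python
def quickprop_step(grad, prev_grad, prev_step, lr=0.1, max_growth=1.75):
    """Return one Quickprop-style step for each weight, treated independently.

    grad, prev_grad: current and previous gradient components.
    prev_step: the step that was taken between those two gradient evaluations.
    lr and max_growth are illustrative "fudge factor" values (assumptions).
    """
    steps = []
    for g, pg, ps in zip(grad, prev_grad, prev_step):
        if ps == 0.0 or g == 0.0:
            # Gradient-descent "starter": there is no previous step to build on.
            step = -lr * g
        elif g * pg > 0 and abs(g) >= abs(pg):
            # Gradient kept its sign and did not shrink, so the fitted line has
            # no useful zero crossing; take the largest allowed step instead.
            step = max_growth * ps
        else:
            # Line through (w_prev, pg) and (w, g); step to its zero crossing.
            step = ps * g / (pg - g)
            # Keep the step from growing too rapidly.
            limit = max_growth * abs(ps)
            step = max(-limit, min(limit, step))
        steps.append(step)
    return steps


def minimize(grad_fn, weights, iterations=100):
    """Repeatedly apply quickprop_step, starting from the given weights."""
    w = list(weights)
    prev_grad = [0.0] * len(w)
    prev_step = [0.0] * len(w)
    for _ in range(iterations):
        grad = grad_fn(w)
        step = quickprop_step(grad, prev_grad, prev_step)
        w = [wi + si for wi, si in zip(w, step)]
        prev_grad, prev_step = grad, step
    return w


def toy_gradient(w):
    """Gradient of the toy error E = 0.5 * (3*(w0 - 1)**2 + 0.5*(w1 + 2)**2)."""
    return [3.0 * (w[0] - 1.0), 0.5 * (w[1] + 2.0)]


if __name__ == "__main__":
    print(minimize(toy_gradient, [0.0, 0.0]))
```

On this toy problem the error is exactly quadratic in each weight, so the fitted line is exact and the only thing slowing convergence is the growth limit. On a real network error surface the line is only a local approximation, which is why the starter and the growth limit matter.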
jdm5548@diamond.tamu.edu (James Darrell McCauley) (03/12/91)
In article <92992@lll-winken.LLNL.GOV>, loren@tristan.llnl.gov (Loren Petrich) writes:
[stuff deleted]
|>
|> Has anyone else had experience with Quickprop, and how does it
|> compare with Conjugate Gradients and other such methods?
|>

I believe that Timothy R. Thomas and Tony L. Brewster at Los Alamos National Laboratory did a comparison involving Quickprop. I just read a paper of theirs called "Experiments in Finding Neural Network Weights". [I have this on microfiche, so I can't readily check my facts. If you would like to find this paper, the fiche is labeled "Office of Scientific and Technical Information, DOE, USA" - Ref # LA-11772-MS, E 1.99, DE90-007696.]

While I'm on the subject, does anyone have an e-mail address for these authors? In their comparison, they used an 18-component real-valued vector representing a spectrum of speech. I'm interested in what those 18 components were.

--
James Darrell McCauley (jdm5548@diamond.tamu.edu, jdm5548@tamagen.bitnet)
Spatial Analysis Lab, Department of Agricultural Engineering,
Texas A&M University, College Station, Texas 77843-2117, USA