dbn@rti.rti.org (Daniel B. Nissman) (08/23/90)
I would like a pointer to the application of the conjugate gradient method to training feedforward networks. A symbolic description of how to actually implement this method in such a network would be greatly appreciated. Also, how does this differ from Fahlman's Quickprop algorithm? Pros and cons of using one method over the other would be desirable as well. A related question: which of these methods works best in scale-up problems, and why?

Many thanks,
Daniel Nissman