pratt@paul.rutgers.edu (Lorien Y. Pratt) (10/12/88)
                             Fall, 1988
                 Neural Networks Colloquium Series
                             at Rutgers

              What Size Net Gives Valid Generalization?
              -----------------------------------------

                            Eric B. Baum
                     Jet Propulsion Laboratory
                 California Institute of Technology
                        Pasadena, CA. 91109

                 Room 705 Hill Center, Busch Campus
                Friday October 28, 1988 at 11:10 am
                Refreshments served before the talk

                              Abstract

We address the question of when a network can be expected to
generalize from m random training examples chosen from some arbitrary
probability distribution, assuming that future test examples are drawn
from the same distribution. Among our results are the following bounds
on appropriate sample vs. network size.

Assume 0 < e <= 1/8. We show that if m >= O( WlogN/e log(1/e))
examples can be loaded on a feedforward network of linear threshold
functions with N nodes and W weights, so that at least a fraction
1 - e/2 of the examples are correctly classified, then one has
confidence approaching certainty that the network will correctly
classify a fraction 1 - e of future test examples drawn from the same
distribution.

Conversely, for fully-connected feedforward nets with one hidden
layer, any learning algorithm using fewer than Omega(W/e) random
training examples will, for some distributions of examples consistent
with an appropriate weight choice, fail at least some fixed fraction
of the time to find a weight choice that will correctly classify more
than a 1 - e fraction of the future test examples.

--
-------------------------------------------------------------------
Lorien Y. Pratt                    Computer Science Department
pratt@paul.rutgers.edu             Rutgers University
                                   Busch Campus
(201) 932-4634                     Piscataway, NJ 08854