[comp.ai.neural-nets] nn software vs. stat. techniques

burow@cernvax.cern.ch (burkhard burow) (01/21/91)

What are the performance and 'elegance' differences between neural net sw and
stat. techniques, e.g. discriminant analysis, when used to assign events to one
of 2 or more populations?  Seen by the user as a black box, the two methods are
identical: an understood training set of events sets up the machinery, and the
unknown events follow.
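For concreteness, here is a sketch of that black-box view (illustrative Python, my own names, not from any particular package): a minimal discriminant rule (nearest class mean) and a one-layer perceptron, both behind the same fit-then-classify interface.

```python
# Illustrative sketch: a minimal discriminant rule (nearest class mean)
# and a one-layer perceptron behind the same black-box interface --
# fit on labelled events, then classify unknown events.

class NearestMean:
    """Assign an event to the class whose mean it is closest to."""
    def fit(self, xs, ys):
        self.means = {}
        for label in set(ys):
            pts = [x for x, y in zip(xs, ys) if y == label]
            self.means[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
        return self

    def classify(self, x):
        def dist2(m):
            return sum((a - b) ** 2 for a, b in zip(x, m))
        return min(self.means, key=lambda label: dist2(self.means[label]))

class Perceptron:
    """One-layer net, labels in {-1, +1}."""
    def fit(self, xs, ys, epochs=20, lr=1.0):
        self.w = [0.0] * len(xs[0])
        self.b = 0.0
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                if y * (sum(wi * xi for wi, xi in zip(self.w, x)) + self.b) <= 0:
                    self.w = [wi + lr * y * xi for wi, xi in zip(self.w, x)]
                    self.b += lr * y
        return self

    def classify(self, x):
        return 1 if sum(wi * xi for wi, xi in zip(self.w, x)) + self.b > 0 else -1

# An understood training set sets up the machinery ...
xs = [(0.0, 2.0), (1.0, 2.5), (2.0, 0.0), (3.0, 0.5)]
ys = [1, 1, -1, -1]
# ... and the unknown events follow.
unknown = (0.5, 2.2)
print(NearestMean().fit(xs, ys).classify(unknown),
      Perceptron().fit(xs, ys).classify(unknown))   # both answer 1 here
```

Seen from outside, nothing distinguishes the two objects; the interesting differences are all inside the box.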

I certainly understand the performance advantage of hardware neural nets, e.g.
the brain, but what's the story when both of the above methods run on 'normal'
computers?

I'm looking for comments, facts, arguments, beliefs, pointers to literature,
postings, etc. 

thanks               INTERNET:  burow%13313.hepnet@csa3.lbl.gov
burkhard

minsky@media-lab.MEDIA.MIT.EDU (Marvin Minsky) (01/22/91)

In article <3885@cernvax.cern.ch> burow@cernvax.cern.ch (burkhard burow) writes:
>What are the performance and 'elegance' differences between neural net sw and
>stat. techniques, e.g. discriminant analysis, when used to assign events to one
>of 2 or more populations?  Seen by the user as a black box, the two methods are
>identical: an understood training set of events sets up the machinery, and the
>unknown events follow.

The NN methods form a wider class of clustering procedures.  Wider for
several reasons:

  1. The result is not constrained to be unique.  The same data can
produce different classifications, because the learning trajectory can
depend on the order in which the data is presented.

  2. NN methods are still largely empirical. A method becomes popular
if enough researchers claim that it gives good results.  In most
cases, very little has been proven about the range and reliability of
the method.  Statistical methods, in contrast, are not considered
"scientific" unless accompanied by theorems about their behavior.

  3. In order to prove a theorem about classification, the theorem's
antecedent must precisely describe a class of classification problems.
In the real world, this is really hard to do -- so mathematical
statistics tends to confine itself to idealized well-defined cases.
The AI and NN researchers don't restrict themselves that way.
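Point 1 can be seen with even the classic perceptron rule (a toy sketch, standing in for "NN methods" generally): the same four training events, presented in two different orders, leave the learning rule at different final weights.

```python
# The same data, two presentation orders, two different classifiers:
# the perceptron's learning trajectory depends on the order of events.

def train_perceptron(samples, epochs=5, lr=1.0):
    """Perceptron updates on (x, y) pairs with y in {-1, +1}."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            s = w[0] * x[0] + w[1] * x[1] + b
            if y * s <= 0:                      # misclassified -> update
                w[0] += lr * y * x[0]
                w[1] += lr * y * x[1]
                b += lr * y
    return w, b

data = [((0.0, 1.0), 1), ((1.0, 0.5), 1), ((1.0, 0.0), -1), ((0.0, -1.0), -1)]
forward = train_perceptron(data)
backward = train_perceptron(list(reversed(data)))
print(forward, backward)   # two different (w, b) from one data set
```

Both results separate the training set correctly; the non-uniqueness is in which separating surface the trajectory happens to reach.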

So the differences are substantial.  In some cases, NN methods turn out
in fact to compute well-known statistical functions.  For example, read
section 12.4 of Perceptrons (Minsky and Papert, MIT Press, 1988).  There
you can see an iterative, NN-like process that computes Bayesian
statistics -- except that, because of the memory decay, the variances do
not converge to zero with increasing sample size, as is often
characteristic of NN's.
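A toy version of that last point (my own code, not the book's): a constant-gain estimator with memory decay rate eta tracks the mean of the incoming samples, but the variance of its estimate stays near eta/(2 - eta) instead of shrinking, while the exact 1/n running mean's variance falls off like 1/n.

```python
# Memory decay vs. exact averaging: the decayed estimate keeps
# fluctuating no matter how many samples arrive, because old evidence
# keeps leaking away and the effective sample size is bounded.

import random

random.seed(0)

def running_means(samples, eta=0.1):
    m_decay = 0.0                    # memory-decay (constant gain) estimate
    m_exact = 0.0                    # exact sample mean (gain 1/n)
    for n, x in enumerate(samples, start=1):
        m_decay += eta * (x - m_decay)
        m_exact += (x - m_exact) / n
    return m_decay, m_exact

def spread(index, runs=200, n=2000):
    """Variance of one estimator across independent runs."""
    vals = [running_means([random.gauss(0.0, 1.0) for _ in range(n)])[index]
            for _ in range(runs)]
    mu = sum(vals) / len(vals)
    return sum((v - mu) ** 2 for v in vals) / len(vals)

var_decay = spread(0)    # stays near 0.1 / 1.9, about 0.05
var_exact = spread(1)    # shrinks with n, here about 1/2000
print(var_decay, var_exact)
```

The decayed estimator's variance is set by eta, not by the amount of data seen -- the non-vanishing variance the passage describes.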