[comp.ai.neural-nets] Distinguishing "Normal" from "Abnormal" Data

loren@tristan.llnl.gov (Loren Petrich) (07/14/90)

	I may have asked about this earlier, and I am asking about
this again. I hope to use Neural Nets to analyze astronomical data,
and for this purpose, it will be vitally important to distinguish
"normal" and "abnormal" phenomena. I mean by "normal" anything that is
very commonplace; "abnormal" anything that is relatively rare. Since
the "abnormal" phenomena are sometimes the most interesting ones, it
will be vital to pick them out. I even think it may be better to risk
misclassifying some "normal" phenomena as "abnormal" than the other
way around.

	Has anyone else faced similar problems?

	What is the most efficient way to solve such problems?

	Is a backprop network a good thing to use, and if so, what
would be the most suitable type of training set? Would one use a
mixture of known "normal" inputs and randomly generated "abnormal"
inputs, with one output being a normal/abnormal indicator?

						        ^    
Loren Petrich, the Master Blaster		     \  ^  /
	loren@sunlight.llnl.gov			      \ ^ /
One may need to route through any of:		       \^/
						<<<<<<<<+>>>>>>>>
	lll-lcc.llnl.gov			       /v\
	lll-crg.llnl.gov			      / v \
	star.stanford.edu			     /  v  \
						        v    
For example, use:
loren%sunlight.llnl.gov@star.stanford.edu

My sister is a Communist for Reagan

jgk@osc.COM (Joe Keane) (07/18/90)

In article <64712@lll-winken.LLNL.GOV> loren@tristan.llnl.gov (Loren Petrich)
writes:
>	I may have asked about this earlier, and I am asking about
>this again. I hope to use Neural Nets to analyze astronomical data,
>and for this purpose, it will be vitally important to distinguish
>"normal" and "abnormal" phenomena. I mean by "normal" anything that is
>very commonplace; "abnormal" anything that is relatively rare. Since
>the "abnormal" phenomena are sometimes the most interesting ones, it
>will be vital to pick them out. I even think it may be better to risk
>misclassifying some "normal" phenomena as "abnormal" than the other
>way around.
>
>	Has anyone else faced similar problems?

Yup.

>	What is the most efficient way to solve such problems?

This may be heresy in comp.ai.neural-nets, but this task seems ideally suited
to standard statistical analysis.  Off the top of my head, it's hard to say
what sort of distribution you want.  A multivariate normal might work
sufficiently well, although you probably want something multimodal.
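
As a rough illustration of the multivariate-normal idea (a minimal sketch, not
something posted in this thread): fit a mean and covariance to data assumed to
be mostly "normal", then flag points whose squared Mahalanobis distance exceeds
a cutoff.  The function names, the NumPy dependency, the synthetic data, and
the 5% false-alarm rate are all assumptions made for the example.  A looser
cutoff flags more "normal" points as "abnormal", which matches the stated
preference for erring in that direction.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate the mean and inverse covariance from the rows of X."""
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Small ridge term keeps the covariance invertible (assumed value).
    cov = cov + 1e-6 * np.eye(X.shape[1])
    return mean, np.linalg.inv(cov)

def mahalanobis_sq(X, mean, cov_inv):
    """Squared Mahalanobis distance of each row of X from the fitted mean."""
    d = X - mean
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)

def fit_threshold(X, mean, cov_inv, false_alarm_rate=0.05):
    """Cutoff chosen so roughly this fraction of training rows is flagged."""
    return np.quantile(mahalanobis_sq(X, mean, cov_inv), 1.0 - false_alarm_rate)

def is_abnormal(X, mean, cov_inv, threshold):
    """True for rows that lie farther from the mean than the cutoff."""
    return mahalanobis_sq(X, mean, cov_inv) > threshold

# Synthetic stand-in for real measurements (hypothetical data).
rng = np.random.default_rng(0)
normal_data = rng.normal(size=(2000, 3))

mean, cov_inv = fit_gaussian(normal_data)
threshold = fit_threshold(normal_data, mean, cov_inv, false_alarm_rate=0.05)

outlier = np.array([[8.0, -8.0, 8.0]])
flags = is_abnormal(np.vstack([normal_data[:5], outlier]), mean, cov_inv, threshold)
print(flags)
```

A single Gaussian only suits data that forms one roughly elliptical cluster.
For multimodal data the same thresholding idea would apply to the density of a
mixture model instead.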

>	Is a backprop network a good thing to use, and if so, what
>would be the most suitable type of training set? Would one use a
>mixture of known "normal" inputs and randomly generated "abnormal"
>inputs, with one output being a normal/abnormal indicator?

Don't get me wrong, i think neural nets are very interesting, and they have
produced good results in some areas.  But i see them being used where more
mundane methods would work quite well, and probably much faster.

It seems like NN is the newest trick, so people want to use it everywhere.
But in the process they don't hear about the old things, which is too bad.  Is
it just me, or are others bothered by this trend?

demers@odin.ucsd.edu (David E Demers) (07/23/90)

In article <3071@osc.COM> jgk@osc.COM (Joe Keane) writes:
>In article <64712@lll-winken.LLNL.GOV> loren@tristan.llnl.gov (Loren Petrich)
>writes:
[about a classification problem...]
>This may be heresy in comp.ai.neural-nets, but this task seems ideally suited
>to standard statistical analysis.  
[...]
>Don't get me wrong, i think neural nets are very interesting, and they have
>produced good results in some areas.  But i see them being used where more
>mundane methods would work quite well, and probably much faster.

>It seems like NN is the newest trick, so people want to use it everywhere.
>But in the process they don't hear about the old things, which is too bad.  Is
>it just me, or are others bothered by this trend?

Not everyone knows all about what has been done.  And two months
in the lab will save two hours in the library...:-)

I agree that there are a lot of people in a lot of fields who
attempt to use a tool that is not appropriate for their problem,
out of ignorance of what the right tool is.  

Neural networks can capture high order statistics about a dataset
that are difficult to get with conventional methods.  However,
many problems don't need to be solved with nonlinear regression
and simpler, well-tested, fast methods may be best.

Yes, I think that there are a lot of papers applying nets to
inappropriate problems, but I am not bothered by it.  Eventually
they will learn about other, perhaps superior, approaches;
possibly from reading the net if not from peer review...

Dave