loren@tristan.llnl.gov (Loren Petrich) (08/10/90)
About the problem of distinguishing normal from abnormal data: I got a response stating that I have not even defined the problem, and that was after I tried to explain what it was. I will try to make myself clearer with an example.

Consider a system for recognizing printed characters -- letters, numbers, etc. -- where the input appears as a set of pixels. One's pattern-recognition system is supposed to pick out anything that looks like a character and reject anything that does not. Here, "normal" == a character and "abnormal" != a character. It is evident that there are many more possible "abnormal" inputs than "normal" ones, and that is the essence of the problem when setting up training sets: it may be very hard to ensure that one's sample of "abnormal" data to train on is representative. Hence my question about what alternatives have been tried.

I've already started work on this problem, using an auto-associative back-propagation NN. I wish to see how small one can make such an NN and still have the system work, since if it is too big, it may well come up with a straight input -> output mapping in all cases.

I think that normal-vs.-abnormal discrimination would be very good for process control and similar applications, where the system should alert its operators only if something unusual is happening. It may also be useful for data collection by satellites and other low-capacity systems; one may want to transmit or store only "unusual" data sets.

Any comments?

Loren Petrich, the Master Blaster
loren@sunlight.llnl.gov
One may need to route through any of:
	lll-lcc.llnl.gov
	lll-crg.llnl.gov
	star.stanford.edu
For example, use: loren%sunlight.llnl.gov@star.stanford.edu

My sister is a Communist for Reagan
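[A minimal sketch of the auto-associative scheme described above, in modern NumPy rather than anything available at the time. The network, a linear 8 -> 1 -> 8 auto-associator trained by gradient descent on reconstruction error, the toy "normal" data family, and the use of reconstruction error as the abnormality score are all illustrative assumptions, not the poster's actual setup:]

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # input dimensionality (hypothetical)

def normal_batch(n):
    # "Normal" data: vectors whose components are all nearly equal --
    # a one-dimensional family embedded in DIM dimensions.
    c = rng.uniform(0.0, 1.0, size=(n, 1))
    return c * np.ones((1, DIM)) + 0.01 * rng.standard_normal((n, DIM))

# Auto-associative net with a narrow bottleneck: DIM -> 1 -> DIM.
# Linear units here, so this is a simplified stand-in for the
# back-propagation net in the post; symmetric init for reproducibility.
W1 = 0.1 * np.ones((DIM, 1))
W2 = 0.1 * np.ones((1, DIM))
lr = 0.05
for step in range(2000):
    x = normal_batch(32)
    h = x @ W1               # bottleneck activation
    y = h @ W2               # attempted reconstruction of the input
    err = y - x
    # gradient descent on mean squared reconstruction error
    W2 -= lr * (h.T @ err) / len(x)
    W1 -= lr * (x.T @ (err @ W2.T)) / len(x)

def score(x):
    """Mean squared reconstruction error; large values flag 'abnormal'."""
    return np.mean((x @ W1 @ W2 - x) ** 2, axis=-1)

# The net reconstructs the "normal" family well; arbitrary inputs
# (here, uniform random vectors) reconstruct poorly, so a threshold
# on score() separates the two without any "abnormal" training data.
normal_err = score(normal_batch(100)).mean()
abnormal_err = score(rng.uniform(0.0, 1.0, (100, DIM))).mean()
print(normal_err, abnormal_err)
```

The point of the small bottleneck is exactly the concern raised above: with too large a hidden layer the net can pass any input straight through, and the reconstruction error no longer distinguishes normal from abnormal.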