loren@tristan.llnl.gov (Loren Petrich) (08/10/90)
About the problem of distinguishing normal from abnormal data: I got a response stating that I have not even defined the problem, and that was after I tried to explain what it was. I will try to make myself clearer with an example.

Consider a system for recognizing printed characters -- letters, numbers, etc. -- where the input appears as a set of pixels. One's pattern-recognition system is supposed to pick out anything that looks like a character and reject anything that does not. Here, "normal" == a character and "abnormal" != a character. It is evident that there are many more possible "abnormal" inputs than "normal" ones, and that is the essence of the problem when setting up training sets: it may be very hard to ensure that one's sample of "abnormal" data to train on is representative. Hence my question about what alternatives have been tried.

I've already started work on this problem, using an auto-associative back-propagation NN. I wish to see how small one can make such an NN and still have the system work, since if it is too big, it may well come up with a straight input -> output mapping in all cases.

I think that normal-vs.-abnormal discrimination would be very good for process control and similar applications, where the system should alert its operators only if something unusual is happening. It may also be useful for data collection by satellites and other low-capacity systems; one may want to transmit or store only "unusual" data sets.

Any comments?

Loren Petrich, the Master Blaster
loren@sunlight.llnl.gov
One may need to route through any of:
	lll-lcc.llnl.gov
	lll-crg.llnl.gov
	star.stanford.edu
For example, use: loren%sunlight.llnl.gov@star.stanford.edu

My sister is a Communist for Reagan
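[A minimal sketch of the auto-associative scheme described above, in modern NumPy rather than anything available at the time. The network, a linear 8 -> 1 -> 8 auto-associator trained by gradient descent on reconstruction error, the toy "normal" data family, and the use of reconstruction error as the abnormality score are all illustrative assumptions, not the poster's actual setup:]

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # input dimensionality (hypothetical)

def normal_batch(n):
    # "Normal" data: vectors whose components are all nearly equal --
    # a one-dimensional family embedded in DIM dimensions.
    c = rng.uniform(0.0, 1.0, size=(n, 1))
    return c * np.ones((1, DIM)) + 0.01 * rng.standard_normal((n, DIM))

# Auto-associative net with a narrow bottleneck: DIM -> 1 -> DIM.
# Linear units here, so this is a simplified stand-in for the
# back-propagation net in the post; symmetric init for reproducibility.
W1 = 0.1 * np.ones((DIM, 1))
W2 = 0.1 * np.ones((1, DIM))
lr = 0.05
for step in range(2000):
    x = normal_batch(32)
    h = x @ W1               # bottleneck activation
    y = h @ W2               # attempted reconstruction of the input
    err = y - x
    # gradient descent on mean squared reconstruction error
    W2 -= lr * (h.T @ err) / len(x)
    W1 -= lr * (x.T @ (err @ W2.T)) / len(x)

def score(x):
    """Mean squared reconstruction error; large values flag 'abnormal'."""
    return np.mean((x @ W1 @ W2 - x) ** 2, axis=-1)

# The net reconstructs the "normal" family well; arbitrary inputs
# (here, uniform random vectors) reconstruct poorly, so a threshold
# on score() separates the two without any "abnormal" training data.
normal_err = score(normal_batch(100)).mean()
abnormal_err = score(rng.uniform(0.0, 1.0, (100, DIM))).mean()
print(normal_err, abnormal_err)
```

The point of the small bottleneck is exactly the concern raised above: with too large a hidden layer the net can pass any input straight through, and the reconstruction error no longer distinguishes normal from abnormal.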