NICK@AI.AI.MIT.EDU (Nick Papadakis) (05/27/88)
Date: Tue, 10 May 88 14:10 EDT
From: Wray Buntine <munnari!nswitgould.oz.au!wray@uunet.UU.NET>
To: ailist@kl.sri.com
cc: wray@uunet.UU.NET
Subject: Re: Re: Re: Exciting work in AI -- (stats vs AI learning)
Resent-Date: Tue, 10 May 88 02:03 EDT
Resent-From: Ken Laws <LAWS@KL.sri.com>
Resent-To: ailist@ai.ai.mit.edu

It is worth clarifying Stuart Crawford's (Advanced Decision Systems, stuart@ads.com, Re: Exciting work in AI) recent comments on my own discussion of Quinlan's work, because doing so brings out a distinction between purely statistical approaches and approaches from the AI area for learning prediction rules from noisy data.

This prediction problem is OF COURSE an applied statistics one. (My original comment never presumed otherwise; the reason I posted it to AIList in the first place was to point this out.) But it is NOT ALWAYS a purely applied statistics problem (hence my comments about Quinlan's "improvements").

1. In knowledge acquisition, we usually don't have a purely statistical problem: we often have a small amount of data and a knowledgeable but only moderately articulate expert. To apply a purely statistical approach to the data alone is clearly to ignore a very good source of information: the "expert". To expect the expert to spout forth relevant information is naive. We have to produce a curious mix of applied statistics and cognitive psych to get good results. With comprehensibility of statistical results, prevalent in learning work labelled as AI, we can draw the expert into giving feedback on statistical results (potential rules). This is a devious but demonstrably successful means of capturing some of his additional information. There are other advantages of comprehensibility in the "knowledge acquisition" context that again arise for non-statistical reasons.

2.
Suffice it to say, trees may sometimes be more comprehensible than rules (when they're small they certainly give a better picture of the overall result), but when they're large they aren't always. Transforming trees to rules is not simply a process of picking a branch and calling it a rule. A set of disjunctive rules can be logically equivalent to a minimal size tree that is LARGER BY AN ORDER OF MAGNITUDE. In a recent application (reported in CAIA-88) the expert flatly refused to go over trees, but on being shown rules he found errors in the data preparation, found errors in the problem formulation, and provided substantial extra information (the rules jogged his memory) ... merely because he could easily comprehend what he was looking at. Need I say, subsequent results were far superior.

In summary, when learning prediction rules from noisy data, AI approaches complement straight statistical ones in knowledge acquisition contexts, for reasons outside the domain of statistics. In our experience, and the experience of many others, this can be necessary to produce results.

Wray Buntine
wray@nswitgould.oz
School of Computing Science
University of Technology, Sydney
PO Box 123, Broadway
Australia, 2007