[comp.ai.digest] [munnari!nswitgould.oz.au!wray@uunet.UU.NET: Re: Re: Re: Exciting work in AI

NICK@AI.AI.MIT.EDU (Nick Papadakis) (05/27/88)
Date: Tue, 10 May 88 14:10 EDT
From: Wray Buntine <munnari!nswitgould.oz.au!wray@uunet.UU.NET>
To: ailist@kl.sri.com
cc: wray@uunet.UU.NET
Subject: Re: Re: Re: Exciting work in AI -- (stats vs AI learning)
Resent-Date: Tue, 10 May 88 02:03 EDT
Resent-From: Ken Laws <LAWS@KL.sri.com>
Resent-To: ailist@ai.ai.mit.edu

It is worth clarifying Stuart Crawford's
(Advanced Decision Systems, stuart@ads.com, Re: Exciting work in AI)
recent comments on my own discussion of Quinlan's work, because doing so
brings out a distinction between purely statistical approaches and
approaches from the AI area for learning prediction rules from noisy data.

This prediction problem is OF COURSE an applied statistics one.
(My original comment never presumed otherwise---the reason I
 posted the comment to AIList in the first place was to point this out.)

But it is NOT ALWAYS a purely applied statistics problem (hence my
comments about Quinlan's "improvements").

1.  In knowledge acquisition, we usually don't have a purely statistical
    problem.  We often have
	a small amount of data,
	a knowledgeable but only moderately articulate expert.
    To apply a purely statistical approach to the data alone is clearly
    to ignore a very good source of information: the "expert".
    To expect the expert to spout forth relevant information unprompted
    is naive.
    We have to produce a curious mix of applied statistics and cognitive
    psych to get good results.  Because the learning work labelled as AI
    commonly produces comprehensible statistical results, we can draw
    the expert into giving feedback on those results (potential rules).
    This is a devious but demonstrably successful means of capturing some of
    his additional information.
    There are other advantages of comprehensibility in the "knowledge
    acquisition" context that again arise for non-statistical reasons.

2.  Suffice it to say, trees may sometimes be more comprehensible than rules
    (when they're small they certainly give a better picture of the overall
    result), but when they're large they aren't always more comprehensible.
    Transforming trees to rules is not simply a process of picking a branch
    and calling it a rule.  The minimal-size tree that is logically
    equivalent to a set of disjunctive rules can be LARGER BY AN ORDER OF
    MAGNITUDE than the rule set itself.
    In a recent application (reported in CAIA-88)
    the expert flatly refused to go over trees, but on being shown rules
    found errors in the data preparation, errors in the problem formulation,
    and provided substantial extra information (the rules jogged his memory),
    merely because he could easily comprehend what he was looking at.
    Need I say, subsequent results were far superior.
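
    The size gap between rules and trees can be illustrated with a small
    sketch.  This is only a toy, not the method used in the application
    above: it assumes rules that are conjunctions of positive Boolean
    conditions, and the names restrict, min_leaves and paired_rules are
    hypothetical helpers invented for the illustration.  It searches
    exhaustively for the smallest decision tree equivalent to a rule set.

```python
from functools import lru_cache


def restrict(rules, var, value):
    """Simplify a rule set (a frozenset of frozensets of condition ids)
    after the condition `var` is known to be true or false."""
    if value:
        # A true condition drops out of every rule that mentions it.
        return frozenset(rule - {var} for rule in rules)
    # A false condition eliminates every rule that mentions it.
    return frozenset(rule for rule in rules if var not in rule)


@lru_cache(maxsize=None)
def min_leaves(rules):
    """Fewest leaves of any decision tree equivalent to `rules`,
    where the rule set fires if at least one rule is satisfied."""
    if not rules:
        return 1  # no rule can fire any more: one "false" leaf
    if frozenset() in rules:
        return 1  # some rule has no conditions left: one "true" leaf
    conditions = frozenset().union(*rules)
    # Try every remaining condition at the root and keep the best split.
    return min(
        min_leaves(restrict(rules, c, False))
        + min_leaves(restrict(rules, c, True))
        for c in conditions
    )


def paired_rules(k):
    """k rules of the form (x0 and x1), (x2 and x3), ... (an assumed toy family)."""
    return frozenset(frozenset({2 * i, 2 * i + 1}) for i in range(k))


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, "rules ->", min_leaves(paired_rules(k)), "leaves in the smallest tree")
```

    For one, two and three such two-condition rules the smallest trees
    have 3, 7 and 15 leaves: the tree roughly doubles with each added
    rule while the rule set grows by one line, because the subtree for
    the remaining rules must be repeated under every failed condition.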

In summary, when learning prediction rules from noisy data, AI approaches
complement straight statistical ones in knowledge acquisition contexts, for
reasons outside the domain of statistics.  In our experience, and the
experience of many others, this can be necessary to produce results.


Wray Buntine
wray@nswitgould.oz
School of Computing Science
University of Technology, Sydney
PO Box 123, Broadway
Australia, 2007