MOSTOW%USC-ISIF@sri-unix.UUCP (07/20/83)
From: Jack Mostow <MOSTOW@USC-ISIF>

             1983 INTERNATIONAL MACHINE LEARNING WORKSHOP:
                          AN INFORMAL REPORT

                             Jack Mostow
                 USC Information Sciences Institute
                         4676 Admiralty Way
                     Marina del Rey, CA  90291

                      Version of July 18, 1983

[NOTE: This is a draft of a report to appear in the October 1983 SIGART. I am circulating it at this time to get comments before sending it in. The report should give the flavor of the work presented at the workshop, but is not intended to be formal, precise, or complete. With this understanding, please send corrections and questions ASAP (before the end of July) to MOSTOW@USC-ISIF. Thanks. - Jack]

The first invitational Machine Learning Workshop was held at C-MU in the summer of 1980; selected papers were eventually published in Machine Learning, edited by the workshop organizers, Ryszard Michalski, Jaime Carbonell, and Tom Mitchell. The same winning team has now brought us the 1983 International Machine Learning Workshop, held June 21-23 at Allerton House, an English manor on a park-like estate donated to the University of Illinois. The Workshop featured 33 papers, two panel discussions, countless bull sessions, very little sleep, and lots of fun.

This totally subjective report tries to convey one participant's impression of the event, together with a few random thoughts it inspired. I have classified the papers rather arbitrarily under the topics of "Analogy," "Knowledge Transformation," and "Induction" (broadly construed), but of course 33 independent research efforts can hardly be expected to fall neatly into any simple classification scheme. The papers are discussed in semi-random order; I have tried to put related papers next to each other.

[The entire document is about 12 pages of printed text. I am abridging it here; interested readers may FTP the file <AILIST>V1N24.TXT from SRI-AI. -- KIL]

1. Analogy
   1.1. Lessons

2. Knowledge Transformation
   2.1. Lessons

3. Induction
   3.1. Inducing Rules
   3.2. Dealing with Noise
   3.3. Logic-based Work
   3.4. Cognitive Modelling
   3.5. Lessons

4. Panel Discussion: Cognitive Modelling -- Why Bother?

5. Panel Discussion: "Machine Learning -- Challenges of the 80's"

6. A Bit of Perspective

No overview would be complete without a picture that tries to put everything in perspective:

         -------------> generalizations ------------
         |                                         |
         |                                         |
     INDUCTION                                COMPILATION
(Knowledge Discovery)              (Knowledge Transformation)
         |                                         |
         |                                         v
      examples ----------- ANALOGY --------> specialized solutions
                    (Knowledge Transfer)

   Figure 6-1:  The Learning Triangle: Induction, Analogy, Compilation

Of course the distinction between these three forms of learning breaks down under close examination. For example, consider LEX2: does it induce heuristics from examples, guided by its definition of "heuristic," or does it compile that definition into special cases, guided by examples?

7. Looking to the Future

The 1983 International Workshop on Machine Learning felt like history in the making. What could be a more exciting endeavor than getting machines to learn? As we gathered for the official workshop photograph, I thought of Pamela McCorduck's Machines Who Think, and wondered if twenty years from now this gathering might not seem as significant as some of those described there. I felt privileged to be part of it. In the meantime, there are lessons to be absorbed, and work to be done....

One lesson of the workshop is the importance of incremental learning methods. As one speaker observed, you can only learn things you already almost know. The most robust learning can be expected from systems that improve their knowledge gradually, building on what they have already learned and using new data to repair deficiencies and improve performance, whether in analogy [Burstein, Carbonell], induction [Amarel, Dietterich & Buchanan, Holland, Lebowitz, Mitchell], or knowledge transformation [Rosenbloom, Anderson, Lenat].
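To make the incremental theme concrete, here is a minimal sketch in Python (my own illustration, not anything presented at the workshop) of a Find-S-style learner in the spirit of Mitchell's version-space work: it keeps a single maximally specific conjunctive hypothesis and generalizes it just enough to cover each new positive example, so each example repairs the current hypothesis rather than forcing relearning from scratch.

    # A toy incremental concept learner (illustrative only).  ANY marks an
    # attribute the hypothesis no longer constrains.
    ANY = "?"

    def generalize(hypothesis, example):
        """Minimally generalize the hypothesis to cover a positive example."""
        if hypothesis is None:          # first positive example: adopt verbatim
            return list(example)
        return [h if h == e else ANY    # keep agreements, relax conflicts
                for h, e in zip(hypothesis, example)]

    def covers(hypothesis, example):
        """Does the current hypothesis classify the example as positive?"""
        return hypothesis is not None and all(
            h == ANY or h == e for h, e in zip(hypothesis, example))

    # Each positive example incrementally repairs the running hypothesis.
    h = None
    for x in [("red", "round", "small"),
              ("red", "round", "large")]:
        h = generalize(h, x)
    print(h)                                    # ['red', 'round', '?']
    print(covers(h, ("red", "round", "tiny")))  # True

A real learner would of course need negative examples and a richer hypothesis language; the sketch is meant only to show the shape of the incremental loop: current hypothesis in, new example in, minimally repaired hypothesis out.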
This theme reflects the related idea that learning and problem solving are inherent parts of each other [Carbonell, Mitchell, Rosenbloom].

Of course, not everyone saw things the way I do. Here's Tom Dietterich again:

``I was surprised that you summarized the workshop in terms of an "incremental" theme. I don't think incremental-ness is particularly important--especially for expert system work. Quinlan gets his noise tolerance by training on a whole batch of examples at once. I would have summarized the workshop by saying that the key theme was the move away from syntax. Hardly anyone talked about "matching" and syntactic generalization. The whole concern was with the semantic justifications for some learned concept: All of the analogy folks were doing this, as were Mitchell, DeJong, and Dietterich and Buchanan. The most interesting point that was made, I thought, was Mitchell's point that we need to look at cases where we can provide only partial justification for the generalizations. DeJong's "causal completeness" is too stringent a requirement.''

Second, the importance of making knowledge and goals explicit is illustrated by the progress that can be made when a learner has access to a description of what it is trying to acquire, whether that is a criterion for the form of an inductive hypothesis [Michalski et al] or a formal characterization of the kind of heuristic to be learned for guiding a search [Mitchell et al].

Third, as Doug Lenat pointed out, continued progress in learning will require integrating multiple methods. In particular, we need ways to combine analytic and empirical techniques so as to escape the limitations each has when used alone.

Finally, I think we can extrapolate from the experience of AI in the '60's and '70's to set a useful direction for machine learning research in the '80's. Briefly, in AI the '60's taught us that certain general methods exist and can produce some results, while the '70's showed that large amounts of domain knowledge are required to achieve powerful performance. The same can be said for learning. I consider a primary goal of AI in the '80's, perhaaps the primary goal, to be the development of general techniques for exploiting domain knowledge. One such technique is the ability to learn, which has itself proved to require large amounts of domain knowledge. Whether we approach this goal by building domain-specific learners (e.g., MetaDendral) and then generalizing their methods (e.g., version-space induction), or by attempting to formulate general methods more directly, we should keep in mind that a general and robust intelligence will require the ability to learn from its experience and to apply its knowledge and methods to problems in a variety of domains.

A well-placed source has informed me that plans are already afoot to produce a successor to the Machine Learning book, using the 1983 workshop papers and discussions as raw material. In the meantime, a small number of extra proceedings can be acquired (until they run out) for $27.88 ($25 + $2.88 postage in the U.S., more elsewhere), with a check payable to the University of Illinois.
Order from:

    June Wingler
    University of Illinois at Urbana-Champaign
    Department of Computer Science
    1304 W. Springfield Avenue
    Urbana, IL  61801

There are tentative plans for a similar workshop next summer at Rutgers.