LOHMAN%ibm-sj.csnet@csnet-relay.arpa (03/28/84)
From: Guy M. Lohman <LOHMAN%ibm-sj.csnet@csnet-relay.arpa>
[Forwarded from the SRI-AI bboard by Laws@SRI-AI.]
IBM San Jose Research Lab
5600 Cottle Road
San Jose, CA 95193
Thurs., April 5 Computer Science Colloquium
Auditorium
MINIMUM DESCRIPTION LENGTH PRINCIPLE IN MODELING (3:00 P.M.)
Traditionally, statistical estimation and modeling
involve, besides certain well-established procedures
such as the celebrated maximum likelihood technique,
a substantial amount of judgment. The latter is
typically needed in deciding upon the right model
complexity. In this talk we present a recently
developed principle for modeling and statistical
inference, which to a considerable extent allows
reduction of the judgment portion in estimation.
This so-called MDL principle is based on a purely
information-theoretic idea. It selects that model in
a parametric class which permits the shortest coding
of the data. The coding, of which we only need the
length in terms of, say, binary digits, must,
however, be self-contained in the sense that the
description of the parameters themselves needed in
the imagined encoding is included. For this reason,
the optimum model cannot possibly be very complex
unless the data sample is very large. A fundamental
theorem gives an asymptotically valid formula for the
shortest possible code length as well as for the
optimum model complexity in a large class of models.
For short samples no simple formula exists, but the
optimum complexity can be estimated numerically and
taken advantage of. Finally, the principle is
generalized so as to allow any measure of a model's
performance, such as its ability to predict.
J. Rissanen, San Jose Research
Host: P. Mantey
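
The model selection idea in the MDL abstract above can be sketched
in a few lines. This is only an illustration under stated
assumptions, not the speaker's procedure: the model class (binary
Markov chains of order 0 to 4), the function names mdl_code_length
and select_order, and the use of the commonly cited asymptotic
parameter cost of (k/2) log2 n bits for k free parameters are all
choices made here for the example. The abstract itself does not
state the formula.

```python
import math
from collections import Counter


def mdl_code_length(bits, order, start):
    """Approximate two-part code length, in bits, of bits[start:]
    under a binary Markov model of the given order.

    Assumption for this sketch: parameters cost (k / 2) * log2(n)
    bits, where k is the number of free parameters and n the number
    of encoded symbols. Requires start >= order.
    """
    n = len(bits) - start
    context_counts = Counter()
    joint_counts = Counter()
    for t in range(start, len(bits)):
        context = bits[t - order:t]  # the previous `order` symbols
        context_counts[context] += 1
        joint_counts[(context, bits[t])] += 1

    # Cost of the data given the model: negative log-likelihood
    # evaluated at the maximum likelihood estimates.
    data_cost = -sum(
        count * math.log2(count / context_counts[context])
        for (context, _symbol), count in joint_counts.items()
    )

    # Cost of describing the model: one free probability per context.
    num_params = 2 ** order
    param_cost = 0.5 * num_params * math.log2(n)
    return data_cost + param_cost


def select_order(bits, max_order=4):
    """Return the Markov order with the shortest total code length.

    Every candidate encodes the same symbols (those after the first
    max_order), so the code lengths are directly comparable.
    """
    return min(
        range(max_order + 1),
        key=lambda order: mdl_code_length(bits, order, start=max_order),
    )


if __name__ == "__main__":
    sample = "0011" * 50
    for order in range(5):
        length = mdl_code_length(sample, order, start=4)
        print(f"order {order}: {length:.1f} bits")
    print("selected order:", select_order(sample))
```

For the repeating pattern 0011, orders 0 and 1 cannot predict the
next symbol, while order 2 predicts it exactly. Higher orders fit
equally well but pay more for their parameters, so the shortest
total description is obtained at order 2.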
Fri., April 6 Computer Science Seminars
Auditorium
KNOWLEDGE AND DATABASES (11:15)
We define a knowledge-based approach to database
problems. Using a classification of applications from
the enterprise to the system level, we can give
examples of the variety of knowledge which can be
used. Most of the examples are drawn from work at
the KBMS Project at Stanford. The objective of the
presentation is to illustrate both the power and the
high payoff of quite straightforward artificial
intelligence applications in databases.
Implementation choices will also be evaluated.
G. Wiederhold, Stanford University
Host: J. Halpern
---------------------------------------------------------------
Visitors, please arrive 15 mins. early. IBM is located on U.S. 101
7 miles south of Interstate 280. Exit at Ford Road and follow the signs
for Cottle Road. The Research Laboratory is IBM Building 028.
For more detailed directions, please phone the Research Lab receptionist
at (408) 256-3028. For further information on individual talks,
please phone the host listed above.
IBM San Jose Research mails out both the complete research calendar
and a computer science subset calendar. Send requests for inclusion
in either mailing list to CALENDAR.IBM-SJ at RAND-RELAY.