[net.ai] AIList Digest V2 #161

LAWS@SRI-AI.ARPA (11/26/84)

From: AIList Moderator Kenneth Laws <AIList-REQUEST@SRI-AI>


AIList Digest            Sunday, 25 Nov 1984      Volume 2 : Issue 161

Today's Topics:
  Benchmarking Reading Comprehension
  Reviews - AI Abstracts & IEEE Computer & High Technology & Learnability,
  Humor - Brain Structure,
  Algorithms - Many-Body Problem & Macro Cacheing & Linear Programming,
  Seminar - Set Theoretic Problem Translation (CMU)
----------------------------------------------------------------------

Date: Sun 25 Nov 84 01:50:44-EST
From: Wayne McGuire <MDC.WAYNE%MIT-OZ@MIT-MC.ARPA>
Subject: Benchmarking Reading Comprehension

     Does anyone know if any objective standards or tests have been
devised for rating or benchmarking the power of natural language
understanding systems?

     In the world of chess there is a standard international system for
rating players which can also be applied to chess-playing programs. I think
it would be useful to devise a similar system for natural language
understanding software. Such a benchmarking scheme would make it
possible to track the rate of progress in the most fundamental branch
of computational linguistics, and to compare the performance of
competing systems. The National Bureau of Standards might be an
appropriate organization to oversee a project of this kind.

     Perhaps such a benchmarking system could be based on the reading
comprehension sections of the SAT or GRE exams. A GRE-style multiple
choice test for natural language understanding would avoid conflating
the capacity to understand--to recognize propositions, reason, and
draw inferences--with a program's ability to answer questions in
well-formed discourse, a skill which is really quite separate from
pure comprehension. It would
be desirable to establish standard tests for every major language in
the world.

     Is there an existing natural language understanding system in the
world that can read even at the level of a third grader? Probably not.

     Perhaps a major cash prize could be offered to the first researcher
or research team that designs (no doubt decades from now) a program
which consistently scores at least 700 on the reading comprehension
sections of standardized tests like the SAT or GRE.

------------------------------

Date: Sat 24 Nov 84 15:27:29-PST
From: Ken Laws <Laws@SRI-AI.ARPA>
Subject: AI Abstracts

Two ads for AI abstracts and indices have recently crossed my desk:

Scientific Datalink is offering a four-volume index to the AI
research reports that they offer in microfiche (one volume of
authors and titles, one of subjects, and two of abstracts).
The price is $375 now, $495 after publication.  Individual
volumes are not offered separately in this ad.  For more
information, write to Ms. Chia Reinhard, Scientific Datalink,
850 Third Avenue, New York, NY 10022.  (212) 838-7200.

ECI/Intelligence is offering a new journal, Artificial Intelligence
Abstracts, at $295 for 1985.  (The mockup of the first issue is dated
October 1984, but the text consists of such gems as "Ut einim ad
minim veniam, quis nostrud exercitation laboris nisi ut aliquip ex
ea commodo consequet.")  The journal offers to keep you up to date on
market research, hardware and software developments, expert systems,
financial planning, and legislative activities.  There is a similar
journal for CAD/CAM.  The AI advisory board includes Harry Barrow,
Michael Brady, Pamela McCorduck, and David Shaw.

ECI/Intelligence also offers a full-text document order service
from their microfiche collection.  For more info, write to them
at 48 West 38 Street, New York, NY 10018.

                                        -- Ken Laws

------------------------------

Date: Sat 24 Nov 84 14:51:17-PST
From: Ken Laws <Laws@SRI-AI.ARPA>
Subject: IEEE Computer Articles

AI is mentioned a few times in the November issue of IEEE Computer.

On p. 114, there are excerpts from the keynote Compcon speech by
Robert C. Miller, a senior vice president of Data General.  He is
touting expert systems, and estimates that overall sales of AI-related
products will increase from $100-150 million this year to $2.5
billion by the end of the decade.

P. 117 has a very short mention of NCAI and the coming IJCAI.

P. 133 has L. Elliot's review of Learning and Teaching with Computers,
by Tim O'Shea and John Self.  The book is evidently about half AI
(Logo, MYCIN, knowledge engineering, and epistemology) and half
computer-assisted learning (teaching styles, learning styles,
tutoring strategies).

The rest of the magazine is mostly about Teradata's database machines,
the PICT graphical programming language, workstations in local area
networks, and some overviews of software engineering at NEC and GTE.

                                        -- Ken Laws

------------------------------

Date: Sat 24 Nov 84 15:02:27-PST
From: Ken Laws <Laws@SRI-AI.ARPA>
Subject: High Technology Articles

The December issue of High Technology has some interesting articles
for computer folk.  On p. 9 there's a short item about Logicware's
(Hungarian-developed) MPROLOG, a "modular" PROLOG for IBM PCs and
mainframes, 68000-based micros, VAXen, and other hosts.  Other
articles review the best of current microcomputer equipment (chiefly
PC-AT and Macintosh), survey the field of visual flight and driving
simulators, and present an excellent introduction to database
structures and machines (esp. relational databases).

                                        -- Ken Laws

------------------------------

Date: Sun 25 Nov 84 15:25:50-PST
From: Ken Laws <Laws@SRI-AI.ARPA>
Subject: CACM Article on Learning

The November issue of CACM includes "A Theory of the Learnable", by
L.G. Valiant of Harvard.  I am not competent to evaluate the article,
which is based on theorems in computational complexity, but I can
characterize it as follows:

The author is considering the class of concepts in propositional logic
that can be learned in a polynomial number of steps from a source of
positive examples (produced as required in accordance with a probability
distribution) and an oracle that can classify an arbitrary Boolean vector
as a positive or negative exemplar.  The classes that are found to be
learnable are  1) conjunctive normal form expressions with a bounded
number of literals in each clause (no oracle required),  2) monotone
disjunctive normal form expressions, and  3) arbitrary expressions
in which each variable occurs just once (no examples required, but
the oracle must be especially capable).  The method of learning used
is such that the learned concept may occasionally reject true exemplars
but will not accept false ones.
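
To give a feel for the first of these results, here is a minimal
Common Lisp sketch (my own rendering, not Valiant's presentation, and
with invented function names) of the elimination idea behind learning
bounded-clause CNF from positive examples: start with every clause of
at most K literals and delete any clause that a positive example
falsifies.  Every clause of the target formula survives, so the
surviving conjunction may reject a true exemplar but can never accept
a false one.

  (defun literals (n)
    "All 2N literals over variables 0..N-1, as (variable . sign) pairs."
    (loop for v below n append (list (cons v t) (cons v nil))))

  (defun clauses-up-to (k lits)
    "All subsets of LITS with at most K elements; each subset is a clause."
    (if (or (zerop k) (null lits))
        '(())
        (append (clauses-up-to k (rest lits))   ; omit the first literal
                (mapcar (lambda (c) (cons (first lits) c))
                        (clauses-up-to (1- k) (rest lits))))))

  (defun clause-satisfied-p (clause example)
    "EXAMPLE is a vector of T/NIL truth values, one per variable."
    (some (lambda (lit) (eq (aref example (car lit)) (cdr lit))) clause))

  (defun learn-k-cnf (k n positive-examples)
    "Keep only the clauses of size <= K consistent with every positive example."
    ;; Start with every clause of size <= K (REMOVE NIL drops the empty clause).
    (let ((hypothesis (remove nil (clauses-up-to k (literals n)))))
      (dolist (ex positive-examples hypothesis)
        (setf hypothesis
              (remove-if-not (lambda (c) (clause-satisfied-p c ex))
                             hypothesis)))))

  (defun accepts-p (hypothesis example)
    "The hypothesis is the conjunction of its surviving clauses."
    (every (lambda (c) (clause-satisfied-p c example)) hypothesis))

Valiant's contribution is the analysis of how many randomly drawn
positive examples suffice before the surviving conjunction errs only
rarely; the sketch shows just the elimination step, not that analysis.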

The closing remarks contain this interesting quote:

  An important aspect of our approach, if cast in its greatest
  generality, is that we require the recognition algorithm of the
  teacher and learner to agree on an overwhelming fraction of only
  the natural inputs.  Their behavior on unnatural inputs is
  irrelevant, and hence descriptions of all possible worlds are not
  necessary.  If followed to its conclusion, this idea has considerable
  philosophical implications:  A learnable concept is nothing more
  than a short program that distinguishes some natural inputs from
  some others.  If such a concept is passed on among a population
  in a distributed manner, substantial variations in meaning may arise.
  More importantly, what consensus there is will only be meaningful
  for natural inputs.  The behavior of an individual's program for
  unnatural inputs has no relevance.  Hence thought experiments and
  logical arguments involving unnatural hypothetical situations
  may be meaningless activities.

                                        -- Ken Laws

------------------------------

Date: Tue 20 Nov 84 17:36:43-PST
From: Richard Treitel <TREITEL@SUMEX-AIM.ARPA>
Subject: quote  {:-)

Attributed to Marvin Minsky (someone tell me if it's wrong)

        "I'll bet the human brain is a kludge."

                                                        - Richard

------------------------------

Date: Sat 24 Nov 84 14:36:33-PST
From: Ken Laws <Laws@SRI-AI.ARPA>
Subject: Many-Body Problem

Need software to run 10,000-body simulations?  A VAX Pascal program
is discussed in the November CACM Programming Pearls column.
Optimization brought the run time down from one year to one day.

                                        -- Ken Laws

------------------------------

Date: 22 Nov 84 17:20 PST
From: JonL.pa@XEROX.ARPA
Subject: Macro cacheing: Interlisp-D Interpreter as "malgorithm"?

Jay Ferguson suggests, in his contribution of 18 Nov 84 13:55:58-PST,
that the explanation for the timing differences between using a FOR loop
and using LDIFF in the original "Interlisp-D malgorithm" is that
"Each time you call a FOR statement interpetively the translation
occurs."  This is not the case -- the Interlisp interpreter (in all
implementations of Interlisp, I believe) caches the results of any macro
or CLISP expansion into a hash array called CLISPARRAY; see section 16.8
of the Interlisp Reference Manual (Oct 1983).  In fact, the figures
supplied by Jay show a speed difference of a factor of 17, which would
be consistent with the basic loop being compiled in LDIFF (a "system"
function) and being interpreted in the FOR.

The question of "cacheing" as noted above is a complicated one, and in
Jay's defense, I can say that it is not at all clearly outlined in the
IRM.  For example, it lays the burden on the "in-core" structure editor
of examining CLISPARRAY on any change, to de-cache the expansions for
any code that is modified; but random modifications (caused by, say,
redefinition of a function upon which the expansion depends) don't
cause de-cacheing, and this is the source of some very obscure bugs.
Furthermore, lots of cruft may stick around forever because the garbage
collector does not reclaim items placed in the cache; for this reason,
it is advisable to occasionally do (CLRHASH CLISPARRAY).

MacLisp provides three options for macro expansions which are
controllable on a macro-by-macro basis (CLISP is, unfortunately, a
kludge dating to pre-macro Interlisp days -- it could and should be
implemented entirely as a set of macros, so I will view it in that light
for the rest of this discussion): (1) do no cacheing, (2) "displace" the
original list cell containing the macro call with a form which contains
both the original and the expanded code [compiler and interpreter use
the expanded code, prettyprinter uses the original], and (3) cache the
expansion in a hash array called MACROMEMO.  While all these options can
be run in any Lisp that implements macros by giving the expansion
function a pointer to "the whole cell", Common Lisp provides the
*macroexpand-hook* facility so that the cacheing code which is common to
all macros can be put in one place, rather than distributed throughout
the many macro bodies.
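
For concreteness, here is a minimal Common Lisp sketch of option (3)
routed through *MACROEXPAND-HOOK*.  The table name *MACRO-CACHE* is
invented here, playing the role of CLISPARRAY or MACROMEMO; keying the
cache on the form alone (and ignoring the environment) is a
simplification, and stale entries are exactly the de-cacheing problem
described earlier.

  ;; The cache is keyed on the cons cell of the macro call itself.
  (defvar *macro-cache* (make-hash-table :test #'eq))

  (defun caching-macroexpand-hook (expander form env)
    ;; MACROEXPAND-1 funnels every expansion through this hook, so the
    ;; memoization lives in one place instead of in each macro body.
    ;; Stale entries are the de-cacheing problem described above; an
    ;; occasional (CLRHASH *MACRO-CACHE*) plays the role of
    ;; (CLRHASH CLISPARRAY).
    (or (gethash form *macro-cache*)
        (setf (gethash form *macro-cache*)
              (funcall expander form env))))

  (setf *macroexpand-hook* #'caching-macroexpand-hook)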

-- JonL --

------------------------------

Date: 22 Nov 84 07:15:10 PST (Thu)
From: Carl Kaun <ckaun@aids-unix>
Subject: Linear Programming Algorithms


The recent discussion of the Karmarkar Linear Programming algorithm on
this newsgroup has stirred me to comment.  I wouldn't consider a cubic-time
linear programming algorithm to be any great feat; indeed, I present one
myself shortly.  An algorithm that solves the INTEGER linear programming
problem, however, is something else again.  My understanding is that the
Khachiyan algorithm solved the integer problem in polynomial time.  If
the Karmarkar algorithm does also, then it is truly worthy.  But it has
not been apparent in the popularized discussion I have been reading that
this is so.  Perhaps those on the net who are more in the know can tell
us whether it is or not.

Ever since I took Luenberger's course in Linear and Nonlinear Programming
(Stanford EES department, 1973) I have wondered why people didn't apply
the powerful nonlinear tools to the linear problem.  I played around with
the idea and came up with an algorithm I call "gradient step linear
programming".  I've never done anything with it because it's so simple
and obvious that it seemed someone had to have thought of it before.
Because the algorithm follows the gradient as best it can subject to
the constraints, from a geometric point of view it travels through the
"interior" of the polytope, much as has been described for the Karmarkar
algorithm.  Optimality is achieved in no more than N steps, each step
requiring O(N^2) numerical operations, where N is the dimension of the
space.

Mathematical notation isn't easy on a terminal.  I adopt the convention
of representing vectors by preceding them with an underline, as "_x".
Subscripts I represent using a colon, _c:j being the j-th vector _c.
The inner product is represented by < * , * >.  I use a functional
form (i.e. f(  )) to represent things like sums. The rest should be
fairly obvious.

A statement of the linear programming problem, convenient for the description
of the algorithm, follows.  This problem statement is readily converted into
other forms of the linear programming problem.  The problem is to maximize
with respect to the N-dimensional vector _x the linear functional:
               <_c , _x >
 subject to the constraints:
               <_a:j , _x > >= b:j   for j = 1, 2, ..., M
The vector '_c' is often called the cost vector when a minimization problem is
being considered.  M >= N, as otherwise the solution is unbounded.

The procedure for finding an initial feasible vector _x(0) is essentially
identical to the procedure for finding an optimum vector.  For now an initial
feasible vector _x(0) in the interior of the polytope defined by the
constraints may simply be assumed.  The initial gradient vector,
maximizing the rate of change subject to the active constraints, is _c(0) =
_c.  At each stage, the idea of the algorithm is to move along the current
gradient vector (subject to the active constraints) as far as possible, until
a previously inactive constraint is encountered.  The direction of change is
modified by the most recent active constraint, until no motion in the
direction of the gradient is possible.  This is both the formal and the
obvious stopping condition for the algorithm.  The details of the algorithm
follow:

Step 1:  Given a current solution point _x(n), determine the step size s
(giving the next solution point _x(n+1) = _x(n) + s*_c(n)), identifying at
the same time the next active constraint.
          D:j = < _x(n) , _a:j > - b:j    ( >= 0 )
          s:j = D:j / ( - < _c(n) , _a:j > )
                for each inactive constraint j with < _c(n) , _a:j > < 0
          s = min { s:j }.  The next active constraint is the one with
index j(n) providing the minimum, and s is the maximum feasible step size.

Step 2:  Apply the Gram-Schmidt procedure (i.e., projection) to remove the
component of the most recent active constraint from the gradient direction, so
that subsequent steps will not result in violation of the active constraint.
It is necessary first to remove all components of previous active constraints
from the newly activated constraint to ensure that the adjusted gradient
direction will not violate any previous active constraint.
          _a(n) = _a:j(n) - sum(k = 0 to (n-1))[
                         _a(k) * <_a:j(n),_a(k)> / <_a(k),_a(k)> ]

          _c(n+1) = _c(n) - _a(n) * <_c(n),_a(n)> / <_a(n),_a(n)>

Steps 1 and 2 are repeated until _c(n+1)=0, at which point _x(n) is the
optimal solution to the linear programming problem.  Additional tests to
detect and recover from degeneracy are easily added.
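
To make the two steps concrete, here is a minimal Common Lisp sketch
of the iteration just described (my own rendering, with invented
names; it assumes an interior starting point is already in hand and
omits the degeneracy tests mentioned above):

  (defun dot (u v) (reduce #'+ (map 'vector #'* u v)))
  (defun v+ (u v) (map 'vector #'+ u v))
  (defun v- (u v) (map 'vector #'- u v))
  (defun v-scale (s u) (map 'vector (lambda (x) (* s x)) u))

  (defun gradient-step-lp (c as bs x &key (eps 1e-9))
    "Maximize <C,X> subject to <A:j,X> >= B:j, from interior point X.
    AS is a list of constraint vectors _a:j, BS the matching bounds b:j."
    (let ((grad (copy-seq c))   ; constrained gradient _c(n)
          (active '()))         ; orthogonalized active constraints _a(k)
      (loop while (> (dot grad grad) eps) do
        ;; Step 1: largest step along GRAD before an inactive constraint
        ;; binds.  Active constraints have <GRAD,A> = 0 and are skipped.
        (let ((s nil) (anew nil))
          (loop for a in as for b in bs
                for slope = (dot grad a)
                when (< slope (- eps))
                  do (let ((sj (/ (- (dot x a) b) (- slope))))
                       (when (or (null s) (< sj s))
                         (setf s sj anew a))))
          (when (null s) (error "Objective is unbounded."))
          (setf x (v+ x (v-scale s grad)))
          ;; Step 2: orthogonalize the new constraint against the active
          ;; set, then project it out of the gradient (Gram-Schmidt).
          (dolist (ak active)
            (setf anew (v- anew (v-scale (/ (dot anew ak) (dot ak ak)) ak))))
          (push anew active)
          (setf grad
                (v- grad (v-scale (/ (dot grad anew) (dot anew anew)) anew)))))
      x))

For example, maximizing x1 + x2 subject to x1 <= 1 and x2 <= 1
(rewritten in the >= form above as <(-1,0),_x> >= -1 and
<(0,-1),_x> >= -1), starting from the interior point (0,0):

  (gradient-step-lp #(1 1) (list #(-1 0) #(0 -1)) (list -1 -1) #(0 0))
  ;; => #(1 1)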

A detailed proof of optimality is straightforward but somewhat tedious.
Intuitively, the algorithm is optimal because steps are always taken along the
direction maximizing the rate of change of the functional, subject to the
active constraints.  At the optimal point, there is no feasible motion
improving the functional.  Stated differently, the original cost vector lies
in the space spanned by the gradients of the constraints, and this is the
formal (Lagrange) optimization condition.  It is only necessary to add
constraints to the set of active constraints because the optimization space is
convex, and therefore changes in the functional improvement direction (and
reduction in the rate of improvement) result only from encountering new
constraints and having to turn to follow them.

Note that the number of iterations is at most the number of dimensions N of the
space, this being also the number of vectors required to span the space.  Each
iteration entails the removal of O(N) vector components from the new
constraint, and the removal of a vector component entails O(N) multiplications
and additions.  Similarly, determining the step size requires the computation
of O(N) inner products (assuming the number of constraints M is comparable to
N), each requiring O(N) multiplications and additions.  Finding the initial
feasible vector requires about the same effort in general.  Thus overall the
algorithm presented for solving the linear programming problem requires
O(N**3) arithmetic operations.

An initial feasible point can be determined starting from an arbitrary point
(say the origin), identifying the unsatisfied constraints, and moving in
directions that satisfy them.  It may be more direct to simply start with a
"superoptimal" point, say K*_c for suitably large K, and iterate using
essentially the previously described algorithm along the negative constrained
gradient directions to feasibility.  By duality, the resulting feasible point
will also be optimal for the original problem.

                                                Carl F. Kaun

                                                ckaun@aids-UNIX
                                                415/941-3912

------------------------------

Date: 21 November 1984 1014-EST
From: Staci Quackenbush@CMU-CS-A
Subject: Seminar - Set Theoretic Problem Translation (CMU)

           [Forwarded from the CMU bboard by Laws@SRI-AI.]

        Name:   Robert Paige
        Date:   November 27, 1984
        Time:   10:00 - 12:00
        Place:  WeH 8220
        Title:  "Mechanical Translation of Set Theoretic Problem
                 Specifications into Efficient RAM Code"


Many computer problems can be expressed accurately and easily in the
following form: 'find the unique set s subset of t satisfying
predicate k(s) and minimizing objective function f(s)'.  Although such
specifications are generally unsolvable, we can provide rather broad
sufficient conditions under which these formal problem statements can
be compiled into efficient procedural implementations with predictable
time and space complexities.  A toy version of such a compiler has been
implemented and used within the RAPTS transformational programming
system.

Our methodology depends on three fundamental program transformations,
two of which resemble classical numerical techniques.  The first
transformation solves roots of set theoretic predicates by iterating
to a fixed point.  It turns an abstract functional program
specification into a lower level imperative form with emerging
strategy.  The second transformation is a generalized finite
differencing technique.  It implements program strategy efficiently by
forming access paths and incremental computations.  The third
transformation is a top down variant of Schwartz's method of data
structure selection by basings.  It replaces sets and maps by
conventional storage structures.

The method will be illustrated using two examples -- graph
reachability and digraph cycle testing.
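
As a rough illustration of the first example, here is a naive
fixed-point sketch in Common Lisp (invented for this digest, and in no
way the code such a compiler would produce): the reachable set is the
smallest set containing the root and closed under the successor
relation, and the obvious implementation iterates until the set stops
growing.

  (defun reachable (root successors)
    "SUCCESSORS maps each node (hash-table key) to a list of its successors.
    Iterate s := s union succ(s) until a fixed point is reached."
    (let ((s (list root)))
      (loop
        (let ((new (remove-duplicates
                    (loop for node in s
                          append (remove-if (lambda (m) (member m s))
                                            (gethash node successors))))))
          (when (null new) (return s))   ; fixed point: nothing new reachable
          (setf s (append s new))))))

The second and third transformations described above would then, for
instance, replace the wholesale recomputation of the successor image
on each pass with an incrementally maintained workset (finite
differencing) and choose concrete storage structures for the sets and
membership tests (data structure selection).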

This is a special 2-hour lecture with a 10-minute break in the middle.

------------------------------

End of AIList Digest
********************