[comp.ai] Some Sequence Prediction Work

eliot@phoenix.Princeton.EDU (Eliot Handelman) (12/03/89)

In article <2850@ucrmath.UCR.EDU> xcax12@ucrmath.UCR.EDU () writes:

;            I am having a problem with pattern recognition.  I am trying to
;write a program that will recognize patterns that are inputted, outputting
;the next couple of logical repetitions.  I'm hoping it will work something
;like this:
;
;(defun pattern_rec (x)
;   ( <<< CODE HERE >>> ))
;pattern_rec
;
;(pattern_rec '(a b a c a d a e))
;(a b a c a d a e a f)
;
;
;I have everything figured out except for the 'CODE HERE' part. :-)
;
;If anyone can give me a bit of help, just even a push in the right direction,
;I would be greatly appreciative.  I will be monitoring this group, so either
;post or send mail.  Thanks.
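
A minimal sketch in Python (for concreteness) of the behavior asked for above, under the strong assumption that the input always alternates one fixed symbol with a run of consecutive letters; the rest of this thread explains why any such assumption is contestable:

```python
# A sketch only: assumes the input alternates one fixed symbol with a
# run of consecutive letters, as in the quoted example.  This is NOT a
# general pattern recognizer.

def pattern_rec(seq):
    fixed = seq[0]                  # the repeated symbol ('a')
    nxt = chr(ord(seq[-1]) + 1)     # the letter after the last varying one
    return list(seq) + [fixed, nxt]

print(pattern_rec(['a', 'b', 'a', 'c', 'a', 'd', 'a', 'e']))
# ['a', 'b', 'a', 'c', 'a', 'd', 'a', 'e', 'a', 'f']
```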

Two years ago I posed a similar question, and got a large number of pointers
to research and the literature. I've appended some of them below. John
McCarthy also wrote to express his skepticism of sequence-predicting approaches
to AI. I've included his whole article, because there are probably many others
interested in this.


Saul Kripke, in "Wittgenstein on Rules and Private Language" makes an attack
on the idea that there is "a" solution to a given sequence as follows:

"... an intelligence tester may suppose that there is only one  possible
continuation to the sequence 2,4,6,8,..., mathematical and philosophical
sophisticates know that an indefinite number of rules (even rules stated
in terms of mathematical functions as conventional as ordinary polynomials)
are compatible with any such finite initial segment. So if the tester urges
me to respond, after 2,4,6,8,..., with *the* unique appropriate next number,
the proper response is that no such unique number exists, nor is there any 
unique (rule determined) infinite sequence that continues the given one."
(page 18)

For example, one could argue that the "rule" determined by Kripke's
sequence is "really":
       
       n(1) = 2;
       n(i + 1) = n(i) + 2 IF i < 4;
       n(i + 1) = 57 IF i >= 4.

Applying this rule, the "real" sequence is 2,4,6,8,57,57,57, ....
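
Kripke's point can be made concrete with two toy rules (hypothetical names; a sketch in Python) that agree on the observed prefix and disagree everywhere after it:

```python
def plus_two(i):
    """The 'natural' rule: the i-th term is 2*i."""
    return 2 * i

def deviant(i):
    """A rule matching the same four observed terms, then jumping to 57."""
    return 2 * i if i <= 4 else 57

# Both rules generate the observed initial segment 2, 4, 6, 8 ...
assert [plus_two(i) for i in range(1, 5)] == [deviant(i) for i in range(1, 5)]

# ... yet they diverge from the fifth term on.
print([plus_two(i) for i in range(1, 7)])  # [2, 4, 6, 8, 10, 12]
print([deviant(i) for i in range(1, 7)])   # [2, 4, 6, 8, 57, 57]
```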

Refuting this is not as simple as it looks. One could say that the solution
is "inelegant," but what's the criterion of elegance? That the rule can be
couched in as few clauses as possible? That's the argument of the
"simplicists"; see, e.g., Elliott Sober, "Simplicity," Oxford, 1975.

My own interest in this has to do with music but there things are even
more complex. I ought to have a paper forthcoming that may explain this 
remark.

Here are some of the replies:

----
Date: Thu, 29 Oct 87 17:43:15 PST
From: Tom Dietterich <cmcl2!hp-pcd!orstcs!tgd>

You might find the article Discovering Patterns in Sequences of Events
by myself and Ryszard Michalski (Artificial Intelligence, Vol 25, No.
2 187-232) to be relevant.  We were looking at sequence prediction,
and we cite all of the other work we could find. (A revised version of
the article can be found in Michalski's edited collection Machine
Learning: An Artificial Intelligence Approach, Vol II).

------
From: robin burke <YALE.ARPA!burke-robin>

You might look at some of Hofstadter's work in analogies. Example:
"Fluid Concepts and Creative Analogies" FARG Document 87-1, University 
of Michigan.

-------


Date: Fri, 30 Oct 87 14:11:04 EST
From: Carl H. Smith <mimsy.umd.edu!smith>

The Russians have done a lot of work on the extrapolation problem.
This problem is one of predicting the next value of a function after seeing
some initial segment of its graph.  You can find a description of results
in the area, along with formal definitions and pointers to the literature in
Inductive inference: theory and methods by Angluin & Smith, Computing Surveys, 1983.

----

Date: Fri, 30 Oct 87 20:53:13 pst
From: peregrine.com!dmi (Dean Inada)

Well, data compression algorithms are essentially pattern predictors.
In particular, Lempel-Ziv compression can make good use of the pattern of
your example.  JACM, Vol. 29, No. 4, pg. 943 gives a good review of this.
The Computer Recreations column of Scientific American Vol. 254, No. 3
March 1986 describes some other approaches to pattern prediction,
based on explicit models of the pattern generator.
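
The connection between compression and prediction can be illustrated with a much simpler cousin of Lempel-Ziv parsing (a sketch, not LZ proper): predict whatever symbol followed the longest earlier occurrence of the current suffix. This handles literal repetition, though not the incrementing letter in the original poster's example:

```python
def predict_next(history):
    """Find the longest suffix of the history that occurred earlier,
    and return the symbol that followed that earlier occurrence."""
    n = len(history)
    for length in range(n - 1, 0, -1):                 # longest suffixes first
        suffix = history[n - length:]
        for start in range(n - length - 1, -1, -1):    # most recent match
            if history[start:start + length] == suffix:
                return history[start + length]
    return None                                        # no repetition found

print(predict_next(list("ababab")))   # a
print(predict_next(list("aabaab")))   # a
```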

-----

Date: Sat, 31 Oct 87 22:01:04 cst
From: uiucdcs!osiris.cso.uiuc.edu!goldfain (Mark Goldfain)

A lot of work was done on this at UIUC a while back.  The main program
I remember that did this sort of work was called SPARC.  I bet you could
get a lot of paper references and perhaps some code from

      Intelligent Systems Group
      Department of Computer Science
      University of Illinois at Urbana-Champaign
      1304 West Springfield
      Urbana, IL.  61801

The best contacts would be Ryszard Michalski or Kaihu Chen.

If I'm not mistaken an email address that should work is:
michalsk@aisund.uiuc.edu    (If you have arpa/internet access).
                                    - Mark Goldfain

----

Date: Mon, 2 Nov 87 08:09:30 est
From: John Grefenstette <nrl-aic.ARPA!gref>

You might like to read:

"Learning to predict by the method of temporal differences",
by Rich Sutton, TR87-509.1, Computer and Intelligent Systems Lab,
GTE Labs, 40 Sylvan Road, Waltham, MA 02254.

----

From: philabs.Philips.Com!raca  (Rich Caruana)
To: mind.UUCP!eliot

Try HaYong Zhou's dissertation from Vanderbilt University, Nashville.


----

Date: Mon, 2 Nov 87 10:21 EDT
From: RELAY.CS.NET!PTW%UTRC%untech.utc.com


A. K. Dewdney has an interesting article in his "Computer Recreations" column
in the November 1985 issue of Scientific American that seems applicable to
your problem.  Dewdney describes using the Genetic Algorithm to evolve a
finite state table that can predict the next input signal based on previous
inputs.  The Genetic Algorithm is a machine learning technique that operates
by simulating evolution through natural selection.  It's not just a
half-baked scheme, though; it has a rigorous theoretical basis thanks to
John Holland of the University of Michigan.  Anyway, Dewdney used the GA to
predict the next expected (binary) input.  The state transition table that
the GA produces to accomplish the prediction might constitute the "pattern
of inference" you are looking for.
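
The representation described above can be sketched as follows (Python; the full GA with crossover and mutation is replaced by crude random search, and all parameters are invented, so this only illustrates the state-table encoding and the fitness function):

```python
import random

def fitness(table, bits):
    """Fraction of next-bit predictions the state table gets right."""
    state, correct = 0, 0
    for i in range(len(bits) - 1):
        state, prediction = table[(state, bits[i])]
        correct += (prediction == bits[i + 1])
    return correct / (len(bits) - 1)

def random_table(n_states):
    """A random candidate: (next state, predicted bit) per (state, bit)."""
    return {(s, b): (random.randrange(n_states), random.randrange(2))
            for s in range(n_states) for b in (0, 1)}

bits = [0, 1] * 16                    # a simple alternating signal
best = max((random_table(4) for _ in range(500)),
           key=lambda t: fitness(t, bits))
print(fitness(best, bits))            # usually near 1.0 on this signal
```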

----


Subject: Re: Prediction-producing Algorithms
Date: Wed, 04 Nov 87 16:53:33 -0800


There are several papers in the field of automatic program synthesis that
ought to be right up your alley.  For instance [BieSmi79] gives a simple
yet interesting algorithm designed to generate an output from a single
input as easily as possible---meaning it takes advantage of patterns
inherent in the input.  A good paper to start your search with.

If you're really interested in studying pattern generation, there are
PhD theses written on systems that generate programs from sample
input-output examples.  These systems both rely on pattern induction,
but involve a lot of reading.  I'll give you the references for those
as well:

[BieSmi79] Alan W Biermann and Douglas R Smith
	   A Production Rule Mechanism for Generating LISP Code
	   IEEE Transactions on Systems, Man, and Cybernetics
	   volume SMC-9 #5 pages 260-276

[Bigger76] Ted J Biggerstaff
	   C2: A "Super Compiler" Approach to Automatic Programming
	   Ph D  Dissertation, Dept of Computer Science
	   University of Washington, Seattle, WA  Jan 76

[Summer75] Phillip D Summers
	   Program Construction from Examples
	   Ph D  Dissertation, Yale University, New Haven, CT  Dec 75

----
Date: 30 Oct 87  0950 PST
From: John McCarthy <SAIL.Stanford.EDU!JMC>

	Eliot Handelman's request for information on prediction has
inspired me to inflict the following considerations on the community.

Roofs and Boxes

	Many people have proposed sequence extrapolation as a prototype AI
problem.  The idea is that a person's life is a sequence of sensory
stimuli, and that science consists of inventing ways of predicting the
future of this sequence.  To this end many sequence extrapolating programs
have been written starting with those that predict sequences of integers
by taking differences and determining the coefficients of a polynomial.
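
The difference-taking method mentioned here is easy to sketch (Python, assuming the sequence really is generated by a polynomial):

```python
def extrapolate(seq):
    """Predict the next term by taking successive differences until they
    are constant, then summing the last entry of each row."""
    rows = [list(seq)]
    while len(set(rows[-1])) > 1:      # keep differencing until constant
        prev = rows[-1]
        rows.append([b - a for a, b in zip(prev, prev[1:])])
    return sum(row[-1] for row in rows)

print(extrapolate([2, 4, 6, 8]))    # 10
print(extrapolate([1, 4, 9, 16]))   # 25 (the squares)
```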

	It has always seemed to me that starting this way distorts the
heuristic character of both common sense and science.  Both of them think
about permanent aspects of the world and use the sequence of sense data
only to design and confirm hypotheses about these permanent aspects.  The
following sequence problem seems to me to typify the break between
hypotheses about the world and sequence extrapolation.

The ball bouncing in the rectilinear world - roofs and boxes

	Suppose there is a rectangular two dimensional room.  In this room
are a number of objects having the form of rectangles.  A ball moves in
the room with constant velocity but bounces with angle of incidence equal
to angle of reflection whenever it hits a wall or an object.  The observer
cannot see the objects or the walls.  All he sees is the x-co-ordinate of
the ball at integer times but only when the ball is visible from the front
of the room.  This provides him with a sequence of numbers which he can
try to extrapolate.  Until the ball bounces off something or goes under
something, linear extrapolation works.
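
A stripped-down version of this world is easy to simulate (a sketch: empty room, no interior boxes, ball always visible; the dimensions and velocity are invented). It shows the observed x-sequence being piecewise linear, with naive linear extrapolation failing exactly at the first bounce:

```python
def x_sequence(x0, y0, vx, vy, width, height, steps):
    """Sample the ball's x-coordinate at integer times, reflecting
    off the four walls of a width-by-height room."""
    xs, x, y = [], x0, y0
    for _ in range(steps):
        xs.append(x)
        x, y = x + vx, y + vy
        if x < 0 or x > width:            # bounce off a side wall
            vx = -vx
            x = -x if x < 0 else 2 * width - x
        if y < 0 or y > height:           # bounce off front or back wall
            vy = -vy
            y = -y if y < 0 else 2 * height - y
    return xs

print(x_sequence(0, 0, 3, 2, 10, 7, 8))
# [0, 3, 6, 9, 8, 5, 2, 1] -- linear extrapolation from 0, 3, 6, 9
# predicts 12; the hidden wall makes it 8.
```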

	Suppose first that the observer knows that he is dealing with this
kind of ball-in-room problem and only doesn't know the locations of the
objects and the walls.  After he has observed the situation for a while he
will have partial information about the objects and their locations.  For
example, he may note that he has never been in a certain part of the room
so there may be unknown objects there.  Also he may have three sides of a
certain rectangle but may not know the fourth side, because he has never
bounced off that side yet.  He may extrapolate that he won't have the
opportunity of bouncing off that side for a long time.

	Alternatively we may suppose that the observer doesn't
initially know about balls bouncing off rectangles but only knows
the sequence and must infer this using a general sequence extrapolation
mechanism.  Our view is that this observer, whether human or machine,
can make progress only by guessing the underlying model.  At first
he may imagine a one dimensional bouncing model, but this will be
refuted the first time the ball doesn't bounce at an x-co-ordinate
where it has previously bounced.  Indeed he has to keep open
the possibility that the room is really 3  or more dimensional or that
more general objects than rectangles exist.

	We can elaborate the problem by supposing that when the ball
bounces off the front wall, the experimenter can put a paddle at an angle
and determine the angle of bounce so as to cause the ball to enter regions
where more information is wanted.

	Assuming that the rectangles have edges parallel to the axes makes
the problem easier in an obvious sense but more difficult in the sense
that there is less interaction between the observable x-co-ordinate and
the unobservable y-co-ordinate.

	It would be interesting to determine the condition on the x-path
that distinguishes 2-dimensional from 3-dimensional worlds, if there is
one.  Unless we assume that the room has some limited size, there need be
no distinction.  Thus we must make the never-fully-verified assumption
that some of the repetitions in sequences of bounces are because the
ball hit the front or back wall and bounced again off the same surfaces
rather than similar surfaces further back.

	A tougher problem arises when the observer doesn't get the
sequence of x-coordinates but only 1 or 0 according to whether the
ball is visible or invisible.

	I am skeptical that an AI program fundamentally based on the idea
of sequence extrapolation is the right idea. 

maner@bgsuvax.UUCP (Walter Maner) (12/04/89)

From article <11883@phoenix.Princeton.EDU>, by eliot@phoenix.Princeton.EDU (Eliot Handelman):
> In article <2850@ucrmath.UCR.EDU> xcax12@ucrmath.UCR.EDU () writes:
> 
> ;            I am having a problem with pattern recognition.  I am trying to
> ;write a program that will recognize patterns that are inputted, outputting

It might be less ambitious to investigate the automatic synthesis of regular
expression patterns from examples.  The problem is partly constrained by the
syntax for regexprs and otherwise alleviated by the fact that discovered
patterns could be useful even if they only fuzzy-matched.  Anyone know of
any research which attacks the sequence problem constrained in this way?
WALT


-- 
CSNet    maner@andy.bgsu.edu           | 419/372-8719
InterNet maner@andy.bgsu.edu 129.1.1.2 | BGSU Comp Sci Dept
UUCP     ... !osu-cis!bgsuvax!maner    | Bowling Green, OH 43403
BITNet   MANER@BGSUOPIE

valdes@b.gp.cs.cmu.edu (Raul Valdes-Perez) (12/04/89)

Newsgroups: comp.ai
Subject: Re: Some Sequence Prediction Work (actually, synthesis of regexprs)
References: <11883@phoenix.Princeton.EDU> <5234@bgsuvax.UUCP>
Organization: Carnegie-Mellon University, CS/RI

Simon & Kotovsky wrote a program in an attempt to reproduce human
ability and shortcomings in sequence-extrapolation tasks.  The 
justification of this research was:  people arrive at answers on
such tasks, despite the logical impossibility of a unique answer;
how can one explain this empirical phenomenon?  Read the paper to
know how well they did.

Later, Klahr & Wallace wrote a program for the same task, without
any intent to model human problem-solving, just to have it "work"
well.

The first reference is "Human acquisition of concepts for sequential 
patterns," Psychological Review, 70, 534-546.  
The second is "The development of serial completion strategies:  An
information-processing analysis," British Journal of Psychology, 61(2), 
243-257.

Raul Valdes-Perez

stede@cs.purdue.EDU (Manfred Stede) (12/04/89)

In article <7200@pt.cs.cmu.edu>, valdes@b.gp.cs.cmu.edu (Raul Valdes-Perez) writes:
> Simon & Kotovsky wrote a program in an attempt to reproduce human
> ability and shortcomings in sequence-extrapolation tasks.  The 
> justification of this research was:  people arrive at answers on
> such tasks, despite the logical impossibility of a unique answer;
> how can one explain this empirical phenomenon?  Read the paper to
> know how well they did.

There was a paper at Edinburgh University that extended the Simon & Kotovsky 
work and gave a program in a Pascal-like language which was able to solve a
considerable number of sequence extrapolation patterns. I read it about four
years ago and remember only the title "Completion of Patterned Letter
Sequences" but not the author. Maybe somebody from EU knows.

Manfred Stede, stede@cs.purdue.edu

smoliar@vaxa.isi.edu (Stephen Smoliar) (12/05/89)

The points introduced in McCarthy's critique of sequence extrapolation are well
taken;  but it is important to recognize that he is basically criticizing the
"standard" formulation of the problem.  However, this comes close to beating
a dead horse (or, at least, a horse which should have died some time ago),
since intelligent agents seldom (if ever) attempt to analyze any given set
of input stimuli in a vacuum.  I think what is important about the problems
raised by Wittgenstein and Kripke is that there are no properties strictly
inherent in any sequence which serve as grounds to predict its continuation.

What, then, is extrapolation REALLY about?  I would argue that it is really
about memory.  When an intelligent agent encounters the sequence 2, 4, 6, 8, ...
he recognizes it as a familiar sequence.  If he has ANY experience in playing
around with integers (and do any of us lack such experience entirely?), the
sequence will be familiar enough for him to recall that 10 is the next element.
Of course, he also knows enough about the real world to know that he only
EXPECTS 10 to be the next element.  If the next element turns out to be 57,
he will probably be quite surprised;  however, he is unlikely to react in panic
that the world around him is falling apart.

The point is that the best you can do by way of prediction is to model the
expectations of some particular agent.  You can, if you wish, assume that
the agent begins with no memories at all;  but you had better then prepare
yourself for a rather substantial training period before you get any
interesting behavior.  I am more curious about what it would mean to
build an agent who already has some memories.  How would such memories
be modeled?  How might I build and compare different agents who have different
bases of experience?
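
One toy way to model the memory-based agent described here (the contents of the memory are invented for illustration) is to predict by recognition rather than by rule induction:

```python
# The agent's remembered sequences; invented for illustration.
MEMORY = {
    "even numbers": [2, 4, 6, 8, 10, 12, 14],
    "pi digits":    [3, 1, 4, 1, 5, 9, 2, 6],
    "powers of 2":  [1, 2, 4, 8, 16, 32, 64],
}

def expect_next(observed):
    """Return (name, expected next element) of the first remembered
    sequence whose prefix matches the observation, else None."""
    for name, seq in MEMORY.items():
        k = len(observed)
        if k < len(seq) and seq[:k] == list(observed):
            return name, seq[k]
    return None

print(expect_next([2, 4, 6, 8]))      # ('even numbers', 10)
print(expect_next([3, 1, 4, 1, 5]))   # ('pi digits', 9)
print(expect_next([2, 4, 6, 57]))     # None: surprising, but not fatal
```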

Many builders of expert systems will probably argue that this is a poor way to
view the world.  "The experience is coded in the rules," they would argue.
"All you have to do is find an expert and do knowledge engineering on him.
Then all of his memories and experiences will be implemented in your rule
base."  Unfortunately, I would view such a response as evasive.  When all
is said and done, such systems are not particularly good when it comes to
augmenting their knowledge on the basis of further experience.  There are,
of course, some promising advances in machine learning;  but I would say
that we are still a far cry from a system which, once endowed with some
set of expert rules for sequence extrapolation, would then "recognize"
3, 1, 4, 1, 5, 9, ... as being "familiar" and "expect" 2 on the basis
of that familiarity.  (If this example is too "universal," consider an
agent who is given six digits which happen to be the first six digits
of his home telephone number.)

I happen to share Eliot's interest in music.  I believe that listening to music
is an activity which is guided strongly by the expectations of the listener.
(Quite often, your average listener will complain about listening to Webern
because what he is hearing does not conform to expectations he has formed on
the basis of listening to, say, Tchaikovsky.)  Those expectations, then, are
founded on past listening experiences;  so in order to figure out how they are
formed, we must first figure out how one accumulates and retrieves memories of
those experiences.  (This point of view is not that different from that
expressed by Schank in DYNAMIC MEMORY as applied to our comprehension of
language.)

=========================================================================

USPS:	Stephen Smoliar
	USC Information Sciences Institute
	4676 Admiralty Way  Suite 1001
	Marina del Rey, California  90292-6695

Internet:  smoliar@vaxa.isi.edu

"For every human problem, there is a neat, plain solution--and it is always
wrong."--H. L. Mencken