[net.ai] Characterizing automata from I/O pairs

Hamilton.ES@PARC-MAXC.ARPA (03/17/84)

From:  Bruce Hamilton <Hamilton.ES@PARC-MAXC.ARPA>

The following recent msgs should be of interest to this list, and
hopefully will stimulate some good discussion.  --Bruce

                              ----------

From: Ron Newman <Newman.es>

The following letter to the editor was published in Softalk of March,
1984:

  I have come into possession recently of a program called Microlisp.  I
understand that it has been around for some time, so maybe someone out
there knows something about it.  I cannot get it to do anything but
print numbers I type in or print the word "nil".  How do I make it do
anything else?  Can you give me an example of something useful that I
might be able to do with it?

                                        [...]

                              ----------

From: Bruce Hamilton <Hamilton.ES>

Actually, the letter implies a serious question, related to trying to
communicate with other forms of intelligent life: is there an approach,
to giving inputs and observing output to an unknown program, which is in
some sense optimal; i.e. leads to a complete characterization of input -
output pairs in the shortest possible time?

--Bruce

                              ----------

From: VanDuyn.Henr

One question is would intelligent life aquire (a.k.a. pirate or steal) a
piece of software without the documentation.

On the serious side, what you suggest reminds me of programs that
attempt to write programs by examining a small set of the input output
pairs.  At first sample pairs are fed to the program then the program
begins generating its own sample pairs to build and validate a
hypothesis.  I read an article about this is the ACM TOPLAS journal
about 3 years ago...

Mitch

                              ----------

From: stolfi.pa

"Is there an approach, to giving inputs and observing output to an
unknown program, which is in some sense optimal; i.e. leads to a
complete characterization of input - output pairs in the shortest
possible time?"

I am interested in that question, too. Do you know of any work in that
area? I have given some thought to it, but made only trivial progress.

To be more definite, consider deterministic finite machine with N
internal states, and {0,1} as its input and output alphabets. The goal
is to determine the structure of the machine (i.e., its transition and
output functions) by feeding it a sequence of zeros and ones, and
observing the bits that come out of it. Nothing is known about the
structure of the machine. In particular, it is not known how to reset
the machine to its initial state, and not even whether it is possible to
do so (i.e., whether the machine is strongly connected). Then

(1) at best, you will be able to know the structure of a single strongly
connected component of the machine, and have only a vague knowledge of
the path that led from the initial state to that component. Moreover,
your answer will be determined only up to automaton equivalence. (In
other words, studing the behavior of something will only tell you how
that thing behaves, not how it is built)

(2) if you have an upper bound on the number N of internal states, I
believe you can always deduce the structure of the machine, subject to
the caveats in (1), after feeding it some finite number f(N) of bits.
However, I have no algorithm for generating the required input and
analyzing the output, and I have no idea on how big f(N) is. O(N) is a
trivial lower bound. Any upper bounds? Can it be more than O(2^N)?.

(3) In any case, note that a finite machine built from n boolean gates
may have an exponential number of states (For example, a counter with n
flip-flops has 2^n states). Therefore, even if you know that a program
has a single 16-bit integer variable and a 16-bit program counter, you
may need to feed it few billion bits to know what it does.

(4) if you do not have an upper bound on N, there is no way you can
deduce it by experiment, or answer categorically any interesting
questions about the structure of the machine. For example, suppose you
have put in 999,999,999 bits, and you got that many zeros out. You still
don't know whether the machine is a trivial one-state, constant-output
gadget, or whether it is a billion-state monster that ignores its inputs
and simply puts out a '1' every billionth transition. Note however, that
you may still give answers of the form "either this machine has more
than X states, or it is equivalent to the following automaton: ..."

In anthropomorphic terms, (4) says that it is impossible to distinguish
a genuinely dumb being from a very intelligent one that is just playing
dumb. Point (3) makes me wonder if the goal and method of psychology --
to understand the human mind by studying how it behaves -- is a sensible
proposition after all..

jorge

[There have, of course, been investigations of pattern recognition
techniques for infering Markov or finite-state grammars.  The
PURR-PUSS system is one that comes to mind.  Applications not
mentioned above include cryptography, data compression, fault
diagnosis, and prediction (e.g., stock market directions).  Martin
Gardner had a fun SciAm column ~13 years ago about building an
automaton for predicting the operator's heads/tails choices.  Gardner
also popularized the game of Elusius, in which players try to
elucidate laws of nature by pulsing the system with test cases.
The Mastermind game is related, although you are given information
about the internal state as part of the output.  Several AI
researchers have used automata theory for investigating hypothesis
formation and learning.  -- KIL]