[comp.lang.functional] Pattern matching and abstraction

pop@cs.umass.edu (06/06/90)

Oh, dear - don't read the news for a few days and look what piles up.

Frank Silbermann writes:

>	What does term-rewriting have to do with any theory
>	of domains and mappings (e.g. programming with functions)?

Intuitively there is an answer, tho' whether the theorists have ground it out
I am unsure. Intuitively, the answer is that term rewriting can be used to
transform one function definition into another that denotes the same function,
following Burstall and Darlington 1977. Consider the example below in which
we use <> for an infix append (Pascal's use of it to mean not-equal seems
bizarre to me, since /= is so much closer to the mathematical precedent).
Suppose we have the following equational forms:

length([])   = 0
length(x::l) = 1 + length(l)

[]      <> l1 = l1
(x::l1) <> l2 = x::(l1<>l2)

lap1(l1,l2) = length(l1<>l2)
lap2(l1,l2) = length(l1)+length(l2)
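
(For readers who want to try these by machine, a transcription into Haskell
might read as below. The names len and app are mine, chosen so as not to clash
with the Prelude's length and ++, and lap1/lap2 are written curried rather than
over tuples; otherwise the correspondence with the equations above is
one-to-one.)

-- length and append, transcribed from the equations above
len :: [a] -> Int
len []    = 0
len (x:l) = 1 + len l

app :: [a] -> [a] -> [a]
app []     l2 = l2
app (x:l1) l2 = x : app l1 l2

-- the two definitions whose equality is argued below
lap1, lap2 :: [a] -> [a] -> Int
lap1 l1 l2 = len (app l1 l2)
lap2 l1 l2 = len l1 + len l2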

We can use these forms to define functions, as syntactic sugar if you will.
But we can also use them, essentially as rewrite rules, to `prove'
that the functions  lap1  and  lap2  are identical. Using Burstall's
structural induction:

lap1([],l2) = length([]<>l2) = length(l2)
lap2([],l2) = length([])+length(l2) = 0 + length(l2) = length(l2)

Suppose that length(l1<>l2) = length(l1) + length(l2)
(the inductive hypothesis)

lap1(x::l1,l2) = length((x::l1) <> l2) = length(x :: (l1<>l2))
               = 1 + length(l1 <> l2)

lap2(x::l1,l2) = length(x::l1) + length(l2) = (1 + length(l1)) + length(l2)
               = 1 + (length(l1) + length(l2)) = 1 + length(l1 <> l2)

The above argument causes me to believe that lap1 and lap2 denote the same
function. Somewhat earlier, related work was incorporated in the best-known
POP-2 program of all, the Boyer-Moore theorem prover.

In logic, I understand that one shows that a proof procedure (e.g. resolution)
is sound by showing that any sentence that it can be used to derive is
universally valid, i.e. it denotes TRUE in any interpretation. I had presumed
that somebody had done the equivalent thing for the kinds of reasoning that I
use above, but have they??? One would of course expect there to be
undecidability results for any such theory (if that is what Frank Silbermann
is alluding to in his earlier remark about things not being computable).

It is only appropriate to use pattern matching to define functions under
appropriate conditions. Leaving aside the question of higher order
definitions, the requirement that the inner functions occurring on the left of
the rules should be algebraically free [or in POP-2/POP-11 terms one has a
constructor function and a destructor function which are exact inverses
considered as unary functions on tuples] is the most obvious way of
guaranteeing that rewrite rules can be transformed into a function definition.
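
As an illustration of the freeness requirement (a sketch only; List, dest, lenP
and lenD are names of mine, not anybody's library): when the constructor and
destructor are exact inverses, a set of pattern-directed equations translates
mechanically into an explicit recursive definition that uses the destructor.

-- a free constructor pair: Nil/Cons build, dest takes apart
data List a = Nil | Cons a (List a)

dest :: List a -> (a, List a)
dest (Cons x l) = (x, l)
dest Nil        = error "dest: empty list"

-- the rewrite rules  lenP Nil = 0 ; lenP (Cons x l) = 1 + lenP l ...
lenP :: List a -> Int
lenP Nil        = 0
lenP (Cons x l) = 1 + lenP l

-- ... become an ordinary definition in terms of the destructor
lenD :: List a -> Int
lenD Nil = 0
lenD l   = let (_, rest) = dest l in 1 + lenD rest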

POP-11 pattern matching is not entirely easy to fit into the functional subset
of POP-11. The list-segment matching would be better seen as taking place over
a semigroup, subject therefore to the associative law and so not free, as I
think Jeff Dalton points out. There is some interest here from the point of
view of the semi-automatic derivation of algorithms. Suppose we
replace the "free" requirement by a Noetherian requirement. I.e. suppose we
have a "constructor" function  c  and a "semi-destructor" function d  for which
c(d(X)) = X, forall X in the domain of d,  [but not conversely, as required by
the POP-2 manual for destructors], and for which there are no infinite
descending chains of application of the selector functions (or applications of
 d  followed by tuple-selection).
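
A sketch of such a pair in Haskell (halve and glue are purely illustrative
names of my own): halve chops a finite list as nearly in two as possible, glue
is just <> on the halves, and glue (halve xs) = xs holds for every finite xs,
although the converse does not. Note that halve does not shorten a singleton
list, so a recursion driven by halve needs a base case for singletons as well
as for the empty list.

-- "semi-destructor": split a finite list as nearly in half as possible
halve :: [a] -> ([a], [a])
halve xs = splitAt (length xs `div` 2) xs

-- "constructor": put the halves back together with append
glue :: ([a], [a]) -> [a]
glue (xs, ys) = xs ++ ys

-- law: glue (halve xs) == xs for all finite xs, i.e. c(d(X)) = X;
-- halve is not injective, so d(c(Y)) = Y need not hold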

For example, the sort algorithm

sort([])   = []
sort(x::l) = merge(x::[], sort(l))

is O(n^2). However, if

sort([])   = []
sort(x<>y) = merge(sort(x),sort(y))

is implemented using a Noetherian semi-destructor for sequences which chops a
finite sequence as nearly in two as possible, then one obtains an O(n*log(n))
algorithm. [Isn't Cordell Green doing things like this???]
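
Using the halve sketched above, and adding a clause for singleton lists (which
the equations leave implicit, but which is needed for the recursion to bottom
out), a Haskell rendering might be:

-- merge two already-sorted lists
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
  | x <= y    = x : merge xs (y:ys)
  | otherwise = y : merge (x:xs) ys

-- sort driven by the Noetherian semi-destructor halve: O(n*log(n))
msort :: Ord a => [a] -> [a]
msort []  = []
msort [x] = [x]                      -- halve does not shorten a singleton
msort xs  = let (l, r) = halve xs
            in merge (msort l) (msort r)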

Likewise the distributivity law

x*(y+z) = x*y + x*z

cannot be used in the ordinary way  (together with x*0 = 0) to define
multiplication, since the inner function on the left is  +  which is not free.
Translating into Peano terms one obtains

x*0 = 0
x*succ(y) = x + x*y

which is a good recursive definition, since succ is free. [By the way, one
piece of syntactic sugar that might be good in functional programming is to
have natural numbers as a type, and treat  x+1  as syntactic sugar for
succ(x), since structural induction then works nicely over the natural
numbers, following Burstall and Darlington]. Nevertheless, we can
derive more efficient algorithms for multiplication from the distributive law:
suppose we have  d1(x) = (x/2,x/2) for x even, d1(x) = ((x-1)/2,(x-1)/2+1) for
x odd. This yields a more efficient algorithm, supposing that we have a
representation for numbers which makes division by two cheap.
I think that the standard algorithm for multiplication using a binary
representation could also be derived in this way.
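
A sketch of the derivation in Haskell (timesPeano and timesHalving are my own
names; y is assumed to be a natural number): the gain comes from noticing that
when y is even the two halves delivered by d1 are equal, so x*y1 need only be
computed once, and when y is odd the second half exceeds the first by one,
costing a single extra addition.

-- multiplication from the Peano-style rules: O(y) additions
timesPeano :: Integer -> Integer -> Integer
timesPeano x 0 = 0
timesPeano x y = x + timesPeano x (y - 1)

-- multiplication via the halving split d1: O(log y) recursive steps
timesHalving :: Integer -> Integer -> Integer
timesHalving x 0 = 0
timesHalving x y
  | even y    = h + h          -- y = y1 + y1, so x*y = x*y1 + x*y1
  | otherwise = h + h + x      -- y = y1 + (y1 + 1), one extra x
  where h = timesHalving x (y `div` 2)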

The correctness of the above sort algorithm over any Noetherian semi-destructor
is probably going to depend on the algebraic properties of <> and [],  namely
that they constitute a semigroup with identity.

Perhaps the main point I would like to draw out of the above discussion is
that in the strongly-typed languages which have evolved in the '70s and '80s,
the convention has been established of associating function symbols that can
occur in the interior of patterns with a particular concrete implementation,
and the same is true for Prolog. The meaning of "constructor" has been
restricted since it was used by Burstall when he wrote the POP-2 manual (the
text is in fact his, although we had frequent discussions to match the text
with the implementation which was mine). And this restriction is inimical to
the combination of abstract data types with pattern-matching. The way out is
to consider the algebraic properties that are actually required for pattern
matching to be a sound way of defining functions. The "free" requirement would
seem to be the basis for practical language implementations NOW. The use of
Noetherian semi-destructors would seem to have some interest for the
development of a somewhat more interactive algorithm development system.
This point answers Richard Caley's criticism of pattern-matching -
constructors and destructors can be abstracted as well as any other functions,
and no doubt their special properties can be expressed in a statically
typed language. In POP the section mechanism could be used to perform
the appropriate hiding of implementation details - I have in fact
experimented with macros which provide pattern-matching based on
pairs of constructor-destructor functions being associated with a data-type.
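
To make the suggestion concrete in a typed setting (a sketch only, in Haskell
rather than POP; Seq, cons, uncons and so on are names of my own, not a
standard library): the representation is hidden behind a module boundary, and
clients "pattern match" only through an abstract destructor, so the concrete
implementation can change without disturbing them.

module Seq (Seq, empty, cons, uncons, size) where

newtype Seq a = Seq [a]        -- the concrete representation, not exported

empty :: Seq a
empty = Seq []

cons :: a -> Seq a -> Seq a
cons x (Seq xs) = Seq (x : xs)

-- abstract destructor: Nothing plays the role of the [] pattern,
-- Just (x, s) the role of the x::l pattern
uncons :: Seq a -> Maybe (a, Seq a)
uncons (Seq [])       = Nothing
uncons (Seq (x : xs)) = Just (x, Seq xs)

-- a client-style definition written against uncons alone
size :: Seq a -> Int
size s = case uncons s of
           Nothing      -> 0
           Just (_, s') -> 1 + size s'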

Turning finally to the issue of the virtues or otherwise of POP-2/POP-11 in
treating closures - originally POP-2 used dynamic scope for implementation
reasons which were good ones, given the target machine. Users could write
correct programs by doing their own lambda-lifting and using partial
application. If this was clumsy, it should be compared with common practice in
programming languages, which to this day do not provide a general treatment of
closures - and lambda-lifting has been shown to be just such a treatment [see
Peyton Jones 1987]. As Chris Dollin says, lexical scoping and a more
conventional and
convenient treatment of closures is now provided in POP-11 - we have always
been happy to acknowledge our debts to LISP. POP-11 still retains the
equivalent of LISP special variables, for similar reasons, and an
irritating requirement to declare lexicals as such, which I hope can
be made to go away.
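
For readers who have not met the manoeuvre, a small illustration (in Haskell
rather than POP-2, with names of my own): the free variable of what would
otherwise be a closure is lifted out as an extra leading parameter, and the
closure is then recovered by partial application.

-- with lexical closures one would write:  scaleAll k xs = map (\x -> k * x) xs,
-- where the lambda closes over k

-- lambda-lifted: the free variable k becomes an explicit first argument
scale :: Int -> Int -> Int
scale k x = k * x

-- the "closure" is recovered by partially applying scale to k
scaleAll :: Int -> [Int] -> [Int]
scaleAll k xs = map (scale k) xs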

POP-11 has always been identified as an implementation language - indeed its
use to implement a somewhat more convenient form of logic programming language
than that provided by Prolog is discussed in [Popplestone 1968], including an
example of the use of append (the <> operator) for parsing and generation of
lists in the bidirectional manner beloved of the O'Keefes of this world
(without, alas, a firm statement of how the equational logic required would be
implemented). I still think that the programme of work advocated in that paper,
namely that of using an intermediate, impurely functional, language is the
better approach to developing useful systems embodying  higher-level languages
than that pursued by the Prolog world. And naturally, I think that POP, which
along with CPL attempted to bridge the _apparent_ gap between the AI world and
the rest of computer science, is a better base language than LISP, primarily
for that reason (I could live with Scheme with a preprocessor to
give me civilised syntax). I believe the main weakness of POP in that role,
namely the absence of static typing, which makes the ultra-efficient form of
POP, POPC, somewhat obscure and unsafe, could be mended by the application of
a little standard grammar theory to characterise the behaviour of its variadic
functions which, in a definable sense, "parse" their arguments off the (open)
stack, and "generate" their results.

Robin Popplestone.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/12/90)

In article <15189@dime.cs.umass.edu>, pop@cs.umass.edu writes:
> [By the way, one
> piece of syntactic sugar that might be good in functional programming is to
> have natural numbers as a type, and treat  x+1  as syntactic sugar for
> succ(x), since structural induction then works nicely over the natural
> numbers, following Burstall and Darlington].

Haskell has it.
-- 
"A 7th class of programs, correct in every way, is believed to exist by a
few computer scientists.  However, no example could be found to include here."