[comp.lang.functional] Pattern matching considered harmful

jeff@aiai.ed.ac.uk (Jeff Dalton) (05/26/90)

Pattern-matching is fine for lists, when it's really lists that
should be used.

More generally, pattern matching would be ok if, for a given structure
type (or Prolog term or whatever), it were used only in a small set of
basic operations.  However, programmers all too often use pattern
matching throughout their programs and then, when they change the
number or order of fields in a structure, have to go around and change
all the patterns.  Moreover, to read a pattern one often has to count
commas, which is error-prone to say the least.

This sort of thing happens all the time in Prolog and probably in
functional languages as well.  Some Prolog programmers have told me
that, of course, one could use a notation that involved named fields
and then expand (unfold) this into patterns.  Few programmers actually
do this, however.

Given these problems, I wonder why pattern-matching seems to be
regarded, rather uncritically, as a Good Thing.

Abstraction is a wonderful thing, when it's used.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (05/26/90)

In article <2584@skye.ed.ac.uk>, jeff@aiai.ed.ac.uk (Jeff Dalton) writes:
> More generally, pattern matching would be ok if, for a given structure
> type (or Prolog term or whatever), it were used only in a small set of
> basic operations.  However, programmers all too often use pattern
> matching throughout their programs and then, when they change the
> number or order of fields in a structure, have to go around and change
> all the patterns.  Moreover, to read a pattern one often has to count
> commas, which is error-prone to say the least.

I am very much in agreement with this.
As an example, suppose you have some data type which represents
a finite function (a dictionary).  For argument's sake, suppose you
start out with the idea that a dictionary is
	empty_dictionary
    or	non_empty_dictionary(Key,Val,Rest)
Suppose you find that part of your program wants to find
dictionary entries satisfying a test p(Key,Val).  You might be
tempted to write

dictionary_p_entry(non_empty_dictionary(Key,Val,_), Key, Val) :-
	p(Key, Val).
dictionary_p_entry(non_empty_dictionary(_,_,Rest), Key, Val) :-
	dictionary_p_entry(Rest, Key, Val).

If you later notice that most of the program is looking up known keys,
and decide to use AVL trees instead, you're stuck with a big change.
But if you had written a "dictionary.pl" file defining the operations
which know about the structure of a dictionary, you might have had a

	dictionary_member(Dictionary, Key, Val)

predicate (which you would have had _anyway_ to look things up!) and
you would just have written

dictionary_p_entry(Dictionary, Key, Val) :-
	dictionary_member(Dictionary, Key, Val),
	p(Key, Val).

Note that specifying the interface predicate dictionary_member/3
_logically_ pays off:  we can use the same predicate to look for known
keys or to enumerate keys, and have just _one_ predicate to remember.
(How that works inside the package is solely the business of the
"dictionary.pl" file.)

It really is a good idea to isolate knowledge of the constructors of
a data type in a single file.  It needn't make your program inefficient;
on the contrary, it is easier to get the design of a predicate like
dictionary_member/3 _right_ and the implementation fast when you have
to do it just once than to do it over and over again when you have
basically the same code scattered all over the place.

> This sort of thing happens all the time in Prolog and probably in
> functional languages as well.

Yes to both in general, no to both in the work of good programmers.

> Some Prolog programmers have told me
> that, of course, one could use a notation that involved named fields
> and then expand (unfold) this into patterns.  Few programmers actually
> do this, however.

Code to do it is in the book...  If you have a Prolog which supports
term_expansion/2 or a close analogue (C Prolog, Quintus Prolog, NU Prolog,
ZYX Prolog, SICStus Prolog, possibly others.  The mechanism in ALS Prolog
is unfortunately placed.  Prolog-2 hasn't got it at the moment, but if I
am reading the manual right the basic machinery is in there _somewhere_
so they could add it) it is not too hard to write a drastically simplified
analogue of Lisp's defstruct.  No new notation is needed.  One just
follows the existing convention for selecting from a data structure:
	predicate(Selector argument(s), Whole structure, Part selected)
For example, if you wanted a 'date' data type, you might use
	date(year,  Date, Year)		% Year = date$year(Date), &c
	date(month, Date, Month)
	date(day,   Date, Day)
In fact, if you are keeping the major operations on your data types in
their own files (where the implementation of the type is legitimately
visible) leaving field accesses as calls like this is unlikely to be a
major factor in the cost of your program.

> Given these problems, I wonder why pattern-matching seems to be
> regarded, rather uncritically, as a Good Thing.

Regard it critically and it is still a WONDERFUL thing.  The problems
are not specific to pattern matching.  *ANY* "leakage" of information
about the implementation of a type is going to cause *exactly* this
kind of problem.  To back this up, suppose we are implementing dates.
Suppose that I decide to represent dates as the number of microseconds
since the midnight before 1-Jan-1970, stored as a floating-point number.
(IEEE, VAX, and /370 double precision floats have enough bits to do
this for a useful range of years, and additions and subtractions will
be exact.)  Then a programmer might exploit his knowledge of this fact
by using
	Date1 < Date2		% Date1 occurs before Date2
If I then try to port the program to a Prolog with 28-bit floats (such
as C Prolog or SB Prolog) I'm going to have to change the representation
to some kind of compound term.  Boom!  < doesn't like it any more.  If I
represent dates as
	timestamp(Year,Month,Day,Hour,Minute,Second,Millisecond,MicroSecond)
I'll still be able to use @< to compare dates, but not < .

> Abstraction is a wonderful thing, when it's used.

Indeed and it is, but pattern matching isn't the only way of breaching it.
I would point out that Hope and ML and some other functional languages have
module systems in which pattern matching can be used quite safely; you can
export a type from a module without exporting the constructors.  It was
always my intention to extend the Mycroft/O'Keefe type checker to enforce
abstract data typing in just that way.  Maybe I'll get around to it...

-- 
"A 7th class of programs, correct in every way, is believed to exist by a
few computer scientists.  However, no example could be found to include here."

aarons@syma.sussex.ac.uk (Aaron Sloman) (06/02/90)

Because I only sample net news at random, I have not read the
preceding discussion, but thought I'd comment on the following, at
the risk of having missed the point:

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:

> Date: 26 May 90 09:00:22 GMT
> Organization: Comp Sci, RMIT, Melbourne, Australia
>
> In article <2584@skye.ed.ac.uk>, jeff@aiai.ed.ac.uk (Jeff Dalton) writes:
    .....
> > ....However, programmers all too often use pattern
> > matching throughout their programs and then, when they change the
> > number or order of fields in a structure, have to go around and change
> > all the patterns.  Moreover, to read a pattern one often has to count
> > commas, which is error-prone to say the least.
>
> I am very much in agreement with this.
> As an example, suppose you have some data type which represents
> a finite function (a dictionary).
    ....

I don't see any difference between this problem and the problem
arising in any programming language where you define procedures or
functions with arguments of certain types and then have to invoke
them by giving the right number of arguments in the right order.

If you change your procedure definition, e.g. because you discover
that an extra argument adds extra generality, then you have to find
all locations in the program where it is invoked and edit them.

The only difference in the prolog case is that if you fail to make
the required changes, instead of a run-time or compile-time error
you will often simply get obscure behaviour.

As far as the advantages of pattern matching are concerned, I have
found that the use of a pattern matcher with "segment" variables
often makes it much easier to express accurately and clearly what
you want to do with a complex structure than other commonly used
mechanisms.

E.g. to check whether something (x) is a list one of whose elements
is a list containing "a" preceded by "b" with an arbitrary number of
elements in between, in Pop-11 you'd write

    if x matches [ == [ == a == b == ] == ] then ....

where "==" matches an arbitrary number of items. In most languages
you'd have to write at least three nested loops to do this and would
probably not get it right first time.

Similarly, you could define a procedure to search for x in a list,
and return the next item, thus:

    define successor(x, list) -> result;
        vars found;
        if list matches [== ^x ?found ==] then
            found -> result
        else
            false -> result
        endif
    enddefine;

(where "^" means use the value of, and "?" means "set the value of")
which I think is clearer than anything you could write in lisp or
Prolog. (Prolog matches "segments" only at the end of the list.)

(Actually, the Pop-11 version could be made more compact in
various ways.)

I'd far rather teach people to USE list processing using a pattern
matcher than have students struggle with hd and tl. Later, if
necessary, they can learn the lower level mechanisms.

Of course, the price of a pattern matcher, may be efficiency, but
often that's not so important as clarity, maintainability, and
reduced development time.

Aaron Sloman,
School of Cognitive and Computing Sciences,
Univ of Sussex, Brighton, BN1 9QH, England
    EMAIL   aarons@cogs.sussex.ac.uk

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/03/90)

In article <2790@syma.sussex.ac.uk>, aarons@syma.sussex.ac.uk (Aaron Sloman) writes:
> Because I only sample net news at random, I have not read the
> preceding discussion, but thought I'd comment on the following, at
> the risk of having missed the point:

> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
> > I am very much in agreement with [not over-using pattern matching].
> > As an example, suppose you have some data type which represents
> > a finite function (a dictionary).
>     ....

> I don't see any difference between this problem and the problem
> arising in any programming language where you define procedures or
> functions with arguments of certain types and then have to invoke
> them by giving the right number of arguments in the right order.

There isn't any difference.  Hence the argument passing conventions
of MESA, Ada, and VAX/VMS Pascal, amongst others.  (Also SML's
record patterns, not to mention AMBER.)

> As far as the advantages of pattern matching are concerned, I have
> found that the use of a pattern matcher with "segment" variables
> often makes it much easier to express accurately and clearly what
> you want to do with a complex structure than other commonly used
> mechanisms.

The trouble is that "segment variable" patterns apply only to lists.
There is no common notation for saying "somewhere within this tree".

> E.g. to check whether something (x) is a list one of whose elements
> is a list containing "a" preceded by "b" with an arbitrary number of
> elements in between, in Pop-11 you'd write

>     if x matches [ == [ == a == b == ] == ] then ....

In Prolog, I'd do it using two DEC-10 Prolog library predicates:
	member(AB_list, X),		% now in library(basics)
	pairfrom(AB_list, A, B, _)	% now in library(sets)
or by using a grammar rule body:
	member(AB_list, X),
	phrase(( lit(_), [a], lit(_), [b], lit(_) ), AB_list)
_This_ kind of pattern matching just doesn't come up often enough to
be worth having a special notation.  One of the first projects I did
was Prolog was to figure out how to translate Interlisp's pattern-
matching notation to Prolog; in the process of doing it I realised
that it was just as simple to write the appropriate Prolog code in
the first place.

In the SML and LML programs that I have seen, there seems to be a fairly
serious move away from lists.  On the whole I think this is a Good Thing.

> Similarly, you could define a procedure to search for x in a list,
> and return the next item, thus:

>     define successor(x, list) -> result;
>         vars found;
>         if list matches [== ^x ?found ==] then
>             found -> result
>         else
>             false -> result
>         endif
>     enddefine;

> (where "^" means use the value of, and "?" means "set the value of")
> which I think is clearer than anything you could write in lisp or
> Prolog. (Prolog matches "segments" only at the end of the list.)

Well, here's how you would write it in Prolog.  What do _you_ think?

	append(_, [X,Found|_], List)

or, if you really want to write out a pattern,

	phrase(( lit(_), [X,Found], lit(_) ), List)

Interlisp, of course, could do it much the same way that Pop-11 does.

-- 
"A 7th class of programs, correct in every way, is believed to exist by a
few computer scientists.  However, no example could be found to include here."

jeff@aiai.ed.ac.uk (Jeff Dalton) (06/03/90)

In article <2790@syma.sussex.ac.uk> aarons@syma.sussex.ac.uk (Aaron Sloman) writes:
>I don't see any difference between this problem and the problem
>arising in any programming language where you define procedures or
>functions with arguments of certain types and then have to invoke
>them by giving the right number of arguments in the right order.

Do you really see no advantage to using functions to extract
parts of structures rather than some positional notation?

A possibly related question is: do you find nothing wrong with
functions/procedures that take, say, 7 positional arguments?

>E.g. to check whether something (x) is a list one of whose elements
>is a list containing "a" preceded by "b" with an arbitrary number of
>elements in between, in Pop-11 you'd write
>
>    if x matches [ == [ == a == b == ] == ] then ....
>
>where "==" matches an arbitrary number of items. In most languages
>you'd have to write at least three nested loops to do this and would
>probably not get it right first time.

In Common Lisp:

  (some #'(lambda (sublist) (member 'b (member 'a sublist))) x)

Not that this proves anything.  Besides, what you'd really do in Lisp
would be to write some procedures or macros that made it easier to
express such things.

By the way, what happens with a pattern like

    [[??a ??b] ??a]

"[??a ??b]" can match in more than one way.  Will more than one
possibility be tried when trying to match the second "??a"?

-- Jeff

rjc@uk.ac.ed.cstr (Richard Caley) (06/04/90)

In article <2790@syma.sussex.ac.uk> aarons@syma.sussex.ac.uk (Aaron Sloman) writes:

   I don't see any difference between this problem and the problem
   arising in any programming language where you define procedures or
   functions with arguments of certain types and then have to invoke
   them by giving the right number of arguments in the right order.

One can change the implementation of a function without changing its
calling sequence, one can not change the structure of a data object
without changing the pattern used to match it.

It would be nice this was possible. declare a dtat type ad declare a
`picture' of it which is what the outside world sees.
 

aarons@syma.sussex.ac.uk (Aaron Sloman) (06/05/90)

jeff@aiai.ed.ac.uk (Jeff Dalton) writes:

>
> In article <2790@syma.sussex.ac.uk> aarons@syma.sussex.ac.uk
> (Aaron Sloman) writes:
> >I don't see any difference between this problem and the problem
> >arising in any programming language where you define procedures or
> >functions with arguments of certain types and then have to invoke
> >them by giving the right number of arguments in the right order.
>
> Do you really see no advantage to using functions to extract
> parts of structures rather than some positional notation?

I was not aware that I was commenting on that. All I was commenting
on was the point you made (I think) about the difficult of keeping
programs consistent in languages that use pattern matching. I was
simply suggesting that this is a general difficulty with positional
notations, used in many programming languages, not just pattern
matching programs.

Yes, I do think there are occasions when it is nicer to use a
function to extract a component of a structure, e.g.

    last(x)

But that doesn't mean pattern matching is inferior in all contexts.

>
> A possibly related question is: do you find nothing wrong with
> functions/procedures that take, say, 7 positional arguments?

I wasn't commenting on that either. I have occasionally had to use
such functions. In general it's probably a good idea to keep ones
functions simpler than that, if possible. I seem to remember Bob
Boyer once telling me that in order to write a lisp pretty printer
he had to have a procedure with some large number of arguments
(about 14?) on most of which it recursed.

>
> >E.g. to check whether something (x) is a list one of whose elements
> >is a list containing "a" preceded by "b" with an arbitrary number of
> >elements in between, in Pop-11 you'd write
> >
> >    if x matches [ == [ == a == b == ] == ] then ....
> >
> >where "==" matches an arbitrary number of items. In most languages
> >you'd have to write at least three nested loops to do this and would
> >probably not get it right first time.
>
> In Common Lisp:
>
>   (some #'(lambda (sublist) (member 'b (member 'a sublist))) x)

I think a picture of what you mean (i.e. a pattern) is much easer to
understand, and to get right first time.

> Not that this proves anything.  Besides, what you'd really do in Lisp
> would be to write some procedures or macros that made it easier to
> express such things.

Why not add a pattern matcher instead? (Lots of Lispers have used
them at one time or another, e.g. microplanner, planner, QA4, etc).

> By the way, what happens with a pattern like
>
>     [[??a ??b] ??a]
>
> "[??a ??b]" can match in more than one way.  Will more than one
> possibility be tried when trying to match the second "??a"?
>
Good point. In fact the standard Poplog Pop-11 matcher cannot find
the way to match this successfully (though the Alphapop matcher
designed by Jon Cunningham can).

Implementing such a matcher requires use of a state-saving
technique, which is liable to be comparatively slow and generate
garbage. But I think it is sufficiently useful that I shall probably
add a Pop-11 library procedure to the next release of Poplog that
solves this problem, and others. It's called fmatches (for "full"
matches), and it allows a "where" qualifier to control a match.

Your example:

[[1 2 3 4 5] 1 2] fmatches [[??a ??b] ??a] =>
** <true>

a,b =>
** [1 2] [3 4 5]

An example where TWO indeterminate matches have to be made mutually
consistent:

[[1 2][2 1]] fmatches [[??x ??y][??y ??x]] =>
** <true>

x,y =>
** [1] [2]

Illustrating the use of "where" to force retries:

[1 3 2 5 2 4] fmatches [== ?x ??z ?y ==] where x == y =>
** <true>

x, y, z=>
** 2 2 [5]

[1 2 3 4 4 3 2 1] fmatches [??x ??y] where x = rev(y) =>
** <true>

x, y =>
** [1 2 3 4] [4 3 2 1]

One can also attach "restrictions" to pattern variables.

This version of the matcher will also, at last, work with sections
and a subset of lexically scoped variables, the "dlvars" of Poplog.

The standard Pop-11 matcher has problems because it uses -valof- on
words, and this can have a different interpretation at run time from
what the user expected at compile time, e.g. if sections (the Pop-11
variant of packages/modules) are used.

Aaron

Chris.Holt@newcastle.ac.uk (Chris Holt) (06/06/90)

In article <RJC.90Jun4134659@brodie.uk.ac.ed.cstr> rjc@uk.ac.ed.cstr (Richard Caley) writes:
>
>One can change the implementation of a function without changing its
>calling sequence, one can not change the structure of a data object
>without changing the pattern used to match it.
>
>It would be nice this was possible. declare a dtat type ad declare a
>`picture' of it which is what the outside world sees.

The picture is surely just those functions that can take the data type
as an argument, and those that can return the data type as a result.
-----------------------------------------------------------------------------
 Chris.Holt@newcastle.ac.uk      Computing Lab, U of Newcastle upon Tyne, UK
-----------------------------------------------------------------------------
 "...for where we are is here, And where here is, must we ever be."

ridoux@irisa.fr (Olivier Ridoux) (06/06/90)

From article <RJC.90Jun4134659@brodie.uk.ac.ed.cstr>, by rjc@uk.ac.ed.cstr (Richard Caley):
> It would be nice this was possible. declare a dtat type ad declare a
> `picture' of it which is what the outside world sees.
>  

Two already implemented extensions of Prolog can be used to support this
style of programming.  They are Lambda-Prolog [Miller & Nadathur in ILPC'86] 
and Login [Ait-Kaci & Nasr in JLP 3(86)].

Lambda-Prolog does the job by the means of higher-order terms and 
beta-reduction which act as well-behavied macros and macro-expansion.  
Login is more in the style of abstract data-types or object-oriented 
programming.

However both have been designed to extend the term domain and to have a
unification that performs tricky computations rather than to support a
software-engineering discipline.  They may lack the syntactic sugar that helps
enforcing the discipline.

Friendly,

Olivier Ridoux

jeff@aiai.ed.ac.uk (Jeff Dalton) (06/06/90)

In article <2798@syma.sussex.ac.uk> aarons@syma.sussex.ac.uk (Aaron Sloman) writes:
]jeff@aiai.ed.ac.uk (Jeff Dalton) writes:
]] In article <2790@syma.sussex.ac.uk> aarons@syma.sussex.ac.uk
]] (Aaron Sloman) writes:
]] ]I don't see any difference between this problem and the problem
]] ]arising in any programming language where you define procedures or
]] ]functions with arguments of certain types and then have to invoke
]] ]them by giving the right number of arguments in the right order.
]]
]] Do you really see no advantage to using functions to extract
]] parts of structures rather than some positional notation?
]
]I was not aware that I was commenting on that. All I was commenting
]on was the point you made (I think) about the difficult of keeping
]programs consistent in languages that use pattern matching. I was
]simply suggesting that this is a general difficulty with positional
]notations, used in many programming languages, not just pattern
]matching programs.

And I was simply suggesting that (1) even though there are similar
problems elsewhere we might still want to do something about this one
rather than not do something; and (2) the problem tends to be worse
for patterns than it is for functions [see below].

Moreover, if I say X is a problem, and you say X is the same as Y,
this sort of suggests that we shouldn't worry any more about X than we
do about Y; and I wanted to say the we ought to worry, perhaps, a
little bit more.  Indeed, the last N years of programming have shown
the value of treating data abstractly, so that it seems, at least to
me, a step backwards to employ very concrete representations as
patterns.

]Yes, I do think there are occasions when it is nicer to use a
]function to extract a component of a structure, e.g.
]
]    last(x)

If that's how far it has to go before you think a non-positional
notation is better than a positional one (and yes I regard a
two-argument function as non-positional with respect to the field it
is accessing), then I guess we'll just have to agree to disagree.

BTW, I don't mean to imply in any of this that there aren't ways
to make pattern-matching more abstract.  There are, and it's a good
idea to employ them.  However, in many languages pattern matching
is not abstract, it is very concrete.

]But that doesn't mean pattern matching is inferior in all contexts.

Which no one has said it was.

]] A possibly related question is: do you find nothing wrong with
]] functions/procedures that take, say, 7 positional arguments?
]
]I wasn't commenting on that either.

See above and note that it is usually more common to see data
structures with, say, 7 fields than to see procedures with 7
arguments.  In Prolog, it's true, the two problems are much more
nearly the same.  However, one of the techniques that can sometimes
help reduce the number of parameters to a predicate, namely combining
some of them into a single structure (when it makes sense to do so),
doesn't help as much as it might if you later use pattern-matching to
take the structure apart.

]I seem to remember Bob Boyer once telling me that in order
]to write a lisp pretty printer he had to have a procedure with some
]large number of arguments (about 14?) on most of which it recursed.

For the record, it is not necessary to have that many parameters
in a Lisp pretty printer.

]] ]In most languages you'd have to write at least three nested loops
]] ]to do this and would probably not get it right first time.
]]
]] In Common Lisp:
]]
]]   (some #'(lambda (sublist) (member 'b (member 'a sublist))) x)
]
]I think a picture of what you mean (i.e. a pattern) is much easer to
]understand, and to get right first time.

I wasn't commenting on that...

]] Not that this proves anything.  Besides, what you'd really do in Lisp
]] would be to write some procedures or macros that made it easier to
]] express such things.
]
]Why not add a pattern matcher instead? 

That's what the procedures and macros might amount to, but I think
there might be other solutions as well.

]] By the way, what happens with a pattern like
]]
]]     [[??a ??b] ??a]
]]
]] "[??a ??b]" can match in more than one way.  Will more than one
]] possibility be tried when trying to match the second "??a"?
]]
]Good point. In fact the standard Poplog Pop-11 matcher cannot find
]the way to match this successfully (though the Alphapop matcher
]designed by Jon Cunningham can).

I thought Poplog might handle such things because it uses downward
success continuations (I think) for the "log" part (ie for Prolog).

-- Jeff

rjc@uk.ac.ed.cstr (Richard Caley) (06/07/90)

In article <1990Jun5.175706.415@newcastle.ac.uk> Chris.Holt@newcastle.ac.uk (Chris Holt) writes:

   In article <RJC.90Jun4134659@brodie.uk.ac.ed.cstr> rjc@uk.ac.ed.cstr (Richard Caley) writes:

   >It would be nice this was possible. declare a dtat type ad declare a
   >`picture' of it which is what the outside world sees.

   The picture is surely just those functions that can take the data type
   as an argument, and those that can return the data type as a result.

These do form an image, but not necesarily the best ( ie most readable
). Consider lists, usually these will be built and accesed as linked
pairs with cons

	1 :: ( 2 :: (3 :: [] ))

( using POP operators, translate into your own favorite religion :-) )

Prolog, and I think some of the functional languages with pattern
matching, will only let you match this way 

	f([]) = 0
	f(H :: T ) 1+ f(T)

On the other hand, people not concerned with implimentation might be
better served in many cases by regarding lists as just that, ordered
sequences and matching based on that

	(1)	f( X, [ ... [ X Y ] ... ] ) = Y

I think the nearest we could come to this using functions would be

	(2)	f( X, A <> ( X :: Y :: [] ) <> B ) = Y

( where '<>' is list append )

I prefer (1). 

I think Jeff is right though, pattern matching in programing languages
is far to concrete. Just imagine implementing some kind of table ADT
and being able to say that programs using the ADT should be able to
match aginst it using (1) ( despite its being a hash table of has
tables or something equally ghastly internally ).

On the other hand, since I've never seen this dream langauage, I can't
say whether it would be a useful feature or just one of those things
that you read about in the documantation, try once and give up on.

aarons@syma.sussex.ac.uk (Aaron Sloman) (06/07/90)

jeff@aiai.ed.ac.uk (Jeff Dalton) writes:

 ...stuff deleted....

In response to your objections to representing data using a
positional notation I guess I missed the point the first time. Yes I
agree that in general it is _not_ a good thing to be committed to
anything so concrete, _except_ where the concrete structure is
closely related to the semantics you intend: e.g. if you are using
ordered components of a structure to represent objects that stand in
some order (e.g. words in a sentence, the sequence of rulers of
Britain, etc.) just as it is convenient to use a 2-D array to
represent an image because the neighbourhood and ordering relations
in the array correspond closely to the neighbourhood and ordering
relations in the image (or in the optic array from which the image
is a sample).

But where the order of items in a list or data-structure has no
significance, i.e. is totally arbitrary, I agree that it is unwise
to build programs around that order, and much better to use
a more abstract representation.

I presume the introduction of keyword arguments in Common Lisp was
in part an attempt to avoid the same sort of limitation (i.e.
restriction to positional notation) in procedure definitions?

On a historical note:
> ]] By the way, what happens with a pattern like
> ]]
> ]]     [[??a ??b] ??a]
> ]]
> ]] "[??a ??b]" can match in more than one way.  Will more than one
> ]] possibility be tried when trying to match the second "??a"?
> ]]
> ]Good point. In fact the standard Poplog Pop-11 matcher cannot find
> ]the way to match this successfully (though the Alphapop matcher
> ]designed by Jon Cunningham can).
>
> I thought Poplog might handle such things because it uses downward
> success continuations (I think) for the "log" part (ie for Prolog).

I guess we could have implemented the Pop-11 pattern matcher using
the Poplog virtual machine facilities built in to support prolog,
but in fact the Pop-11 matcher was first written in PDP-11
assembler (!) by Steve Hardy when Pop-11 was used on the PDP11/40
computer for teaching, then when it got ported to the VAX the
matcher got re-written in Pop-11 (circa 1981?) and has barely
changed since. The continuation handling stuff was put into Poplog
later.

I think Jon Cunningham wrote his more general Pop-11 matcher for the
Alphapop (MAC Pop-11) system by implementing a version in Poplog
Prolog, then looking at the Poplog VM the code it generated and
then re-implementing in Pop-11, and then finally translating it to C
(in which Alphapop is written).

Aaron

brian@comp.vuw.ac.nz (Brian Boutel) (06/07/90)

In article <1990Jun5.175706.415@newcastle.ac.uk>,
Chris.Holt@newcastle.ac.uk (Chris Holt) writes:
> In article <RJC.90Jun4134659@brodie.uk.ac.ed.cstr> rjc@uk.ac.ed.cstr
(Richard Caley) writes:
> >
> >One can change the implementation of a function without changing its
> >calling sequence, one can not change the structure of a data object
> >without changing the pattern used to match it.
> >
> >It would be nice this was possible. declare a dtat type ad declare a
> >`picture' of it which is what the outside world sees.
> 
> The picture is surely just those functions that can take the data type
> as an argument, and those that can return the data type as a result.


Abstract data types, whose constructors are not visible, solve the
problem of changing the implementation without having to change the use,
but cannot be combined with pattern matching, where the constructors
need to be visible.

This does not make pattern matching "harmful". Pattern matching is what
makes data-driven case analysis possible, and this is a good thing.

Phil Wadler proposed the use of "views" to avoid this difficulty. A view
is an alternative definition of a datatype, with a new set of
constructors and with functions for mapping between the view and the
real representation. An abstract data type defines a view whose
constructors can be used for pattern matching, but the real
representation is hidden and can be changed, without invalidating the
user's code, if the mapping functions are also changed. The mapping
functions are not explicitly used by the user, but inserted by the
compiler where necessary.

Simple example of views might be polar representation of complex numbers
if the implementation is cartesian, or conventional (nil, cons) lists
where the real representation uses a tree (this makes append very efficient).

Views were considered for inclusion in Haskell, but dropped when we
found some awkward consequences that we did not want to deal with
without first gaining more experience with their use.

Reference: P Wadler, Views, a way for pattern matching to cohabit with
data abstraction. In Proc 14th ACM  POPL pp307-312 Jan 1987

--brian

Internet: brian@comp.vuw.ac.nz
Postal: Brian Boutel, Computer Science Dept, Victoria University of Wellington,
        PO Box 600, Wellington, New Zealand
Phone: +64 4 721000
Fax:   +64 4 712070

jrk@sys.uea.ac.uk (Richard Kennaway) (06/07/90)

In article <1990Jun07.020319.27609@comp.vuw.ac.nz> brian@comp.vuw.ac.nz (Brian Boutel) writes:
>Views were considered for inclusion in Haskell, but dropped when we
>found some awkward consequences that we did not want to deal with
>without first gaining more experience with their use.

But if no-one implements views, where will such experience come from?

--
Richard Kennaway          SYS, University of East Anglia, Norwich, U.K.
Internet:  jrk@sys.uea.ac.uk		uucp:  ...mcvax!ukc!uea-sys!jrk

Chris.Holt@newcastle.ac.uk (Chris Holt) (06/07/90)

In article <1990Jun5.175706.415@newcastle.ac.uk>, I wrote:
>
> The picture is surely just those functions that can take the data type
> as an argument, and those that can return the data type as a result.

A reply suggested that constructors had to be visible for pattern
matching.  Perhaps I misunderstand, but what I thought goes on is:

Given a set S of values, we want to select a subset that satisfies
given constraints on the possible values of constituents.  So we
use the functions defined in the axioms of the ADT to construct
an element from value ranges and variables, and unify this with
the elements of S, selecting those elements for which the
unification succeeds.  In this way, given the set
        {0,succ(0),succ(succ(0))}
and the pattern
        succ(x)
we select the subset
        {succ(0) where x=0, succ(succ(0)) where x=succ(0)}.

If the argument now is that we want to hide the succ operation
and change the syntactic representation of the values (so succ(0)=1)
then of course we cannot construct patterns from components; but then
we cannot construct *any* values from components (eg you can't make
a tree without a cons).  So this would only seem to apply to
read-only databases, and I don't understand what any of this
has to do with implementation details.

If this is what Phil Wadler's views are about, fine.

Brian Boutel said:
> Views were considered for inclusion in Haskell, but dropped when we
> found some awkward consequences that we did not want to deal with
> without first gaining more experience with their use.

I'm interested.
-----------------------------------------------------------------------------
 Chris.Holt@newcastle.ac.uk      Computing Lab, U of Newcastle upon Tyne, UK
-----------------------------------------------------------------------------
 "...for where we are is here, And where here is, must we ever be."

jeff@aiai.ed.ac.uk (Jeff Dalton) (06/07/90)

In article <1990Jun07.020319.27609@comp.vuw.ac.nz> brian@comp.vuw.ac.nz (Brian Boutel) writes:
>This does not make pattern matching "harmful". Pattern matching is what
>makes data-driven case analysis possible, and this is a good thing.

Are you suggesting that data-driven case analysis would otherwise
be impossible?

aarons@syma.sussex.ac.uk (Aaron Sloman) (06/08/90)

brian@comp.vuw.ac.nz (Brian Boutel) writes:

> Organization: Dept. of Comp. Sci., Victoria Uni. of Wellington, New Zealand.
    .... stuff deleted ....
> Abstract data types, whose constructors are not visible, solve the
> problem of changing the implementation without having to change the use,
> but cannot be combined with pattern matching, where the constructors
> need to be visible.
>
> This does not make pattern matching "harmful". Pattern matching is what
> makes data-driven case analysis possible, and this is a good thing.
>
> Phil Wadler proposed the use of "views" to avoid this difficulty. A view
> is an alternative definition of a datatype, with a new set of
> constructors and with functions for mapping between the view and the
> real representation. An abstract data type defines a view whose
> constructors can be used for pattern matching, but the real
> representation is hidden and can be changed, without invalidating the
> user's code, if the mapping functions are also changed. The mapping
> functions are not explicitly used by the user, but inserted by the
> compiler where necessary.

As it happens this (if I understand it aright) is exactly how I like
to introduce list processing to students. I.e. we give them the
EXTERNAL representation (i.e. a "view"??) in terms of bracketed
sequences of items (words, numbers, strings, lists) without (for
quite a while) telling them ANYTHING about the implementation.

Instead we show them how useful these bracketed orderd structures
are for a collection of problems, and let them use pattern matching
to do tests, access components, etc. As far as they are concerned
the implementation might be as vectors (elements allocated
contiguously in store) list structures (elements linked via pairs)
or something much more complex. It makes no difference to their use,
and so the students don't need to know. Later they (or some of them)
learn how lists are implemented (at a suitable level of
abstraction), learn to use hd and tl, learn how recursing on the hd
of a list enables you to do some things that would be clumsy with
the pattern matcher, and so on.

For many uses where lists and pattern matching are natural the fact
that the underlying implementation is or is not a binary tree, or
that there are functions for getting at components of the lists, is
as irrelevant as the quantum mechanics of computer circuits is to
most computer scientists and software engineers. I.e. the underlying
"pair" abstract data-type is just a low level machine feature.

Unfortunately the Prolog pattern matcher is too limited to support
this "view" of lists because it forces you to treat lists as
asymmetrical. I presume that's mainly because, with bi-directional
unification, segment variables anywhere in a list would make
everything too indeterminate (a problem with the pattern matching
language of Carl Hewitt's Planner system in the early 70s I seem to
recall)?

Aaron Sloman,
School of Cognitive and Computing Sciences,
Univ of Sussex, Brighton, BN1 9QH, England
    EMAIL   aarons@cogs.sussex.ac.uk

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/08/90)

In article <RJC.90Jun6223015@brodie.uk.ac.ed.cstr>, rjc@uk.ac.ed.cstr (Richard Caley) writes:
> I think Jeff is right though, pattern matching in programing languages
> is far too concrete.  Just imagine implementing some kind of table ADT
> and being able to say that programs using the ADT should be able to
> match aginst it using (1) ( despite its being a hash table of has
> tables or something equally ghastly internally ).

There is no reason why pattern matching and abstraction couldn't go
hand in hand.  Pattern matching is just a special kind of constraint
solving.  What's required for Prolog-style matching is that there
be a constructor function
	mk_f : T_1 x ... x T_n -> T
that will take n arguments of the appropriate types and construct
something of type T, a recogniser function
	is_f : T -> bool
which is true of precisely the things constructed by f, and
n projection functions
	f_arg_i : T -> T_i
which can recover the information originally given to f.  Then we
can solve the equation
	X = f(X_1,...,X_n)
thus:  if X is a variable, set X to mk_f(X_1,...,X_n)
       if X is not a variable, and is_f(X) is false, fail
       if X is not a variable, and is_f(X) is true,
	solve X_1 = f_arg_1(X), ..., X_n = f_arg_n(X).
(Handling this in full generality may require coroutining.)

In a language with one-way matching (in the head of an equation, match
against existing terms, on the right hand side, construct new ones)
it can be determined at compile time when to use the constructor and
when to use the recogniser and projections.  Indeed, as long as you
have a recogniser and projections, e.g.
	view f(X:T) => (e_1:T_1,...,E_n:T_n) if test.
you can use pattern-matching *syntax* in one-way pattern matching
without requiring access to a constructor at all.  (Presumably this
is the "view" mechanism that has been mentioned here.)

Prolog doesn't do this because unification is supposed to be a
primitive and determinate in Prolog.  In ML, it would just be more
syntactic sugar.

-- 
"A 7th class of programs, correct in every way, is believed to exist by a
few computer scientists.  However, no example could be found to include here."

kh@cs.glasgow.ac.uk (Kevin Hammond) (06/13/90)

In article <1554@sys.uea.ac.uk> jrk@sys.uea.ac.uk (Richard Kennaway) writes:
>In article <1990Jun07.020319.27609@comp.vuw.ac.nz> brian@comp.vuw.ac.nz (Brian Boutel) writes:
>>Views were considered for inclusion in Haskell, but dropped when we
>>found some awkward consequences that we did not want to deal with
>>without first gaining more experience with their use.
>
>But if no-one implements views, where will such experience come from?

I did implement Haskell views: the implementation showed that these
were too experimental for Haskell1 (and not exactly what Phil intended
in his paper).  We may well have another attempt at views in a trial
version before Haskell2 depending on the effort available (and what
seems interesting).

Kevin



-- 
This Signature Intentionally Left Blank.	E-mail: kh@cs.glasgow.ac.uk