[comp.compilers] Pattern languages

oscar@cui.unige.ch (NIERSTRASZ Oscar) (11/04/88)

Does anyone know of the existence of a "pattern language" that offers
the possibility of defining or extending programming languages through
the definition of patterns?  What I am thinking of is something more
powerful than a macro pre-processor but not as general as yacc and
lex.  The idea is to define a set of syntactic patterns and their
translations, and to then be able to program either using the patterns
alone, or to use the patterns as an extension of an existing language.

The closest examples I can think of are not general tools, but the
possibility to extend languages through overloading etc.  Smalltalk,
Prolog and Lisp all make it relatively easy to add new syntactic
patterns (within certain constraints!).

The two main reasons I would like to have such a tool are:

1. To prototype new programming languages (or language constructs).
   Since the development of a new programming language and its compiler
   is a significant effort, it would be nice to have some tools for
   doing a mock-up that can be easily changed.  A common trick is to
   implement the compiler as a translator to an existing high-level
   language like C, thus piggybacking on an existing compiler.
   A pattern language would be ideally suited to such translation.

2. To provide programmer definable language extensions (for enhancing
   software reusability).  Programming languages suffer from the
   problem that the language design must "freeze" the language,
   pretending to anticipate the needs of all future programmers.
   Programming constructs basically define the ways in which reusable
   software can be packaged.  If I discover I need generic procedures
   or object classes etc., but the language doesn't provide them, then
   I'm stuck.  A pattern language would allow a programming language to
   remain open-ended.

Examples of what you should be able to do include:
- macro substitution
- expression re-writing (e.g., translation to postfix form)
- object classes (declaration, inheritance, ...)
- control abstractions (e.g., loops, transactions, ...)
- generic functions

Since a pattern language should be a relatively simple tool (in the
spirit of, say, awk), it should not be concerned with generation of
addresses, optimization, or most of the nasty things real compilers do.

The kind of functionality I believe a pattern language should provide
would be:
- definition of parameterized patterns and their semantic actions
- actions should include bookkeeping (management of scopes and
  symbol tables) and generation of output (translation)
- a primitive form of type-checking for pattern parameters
  (i.e., parameters will also be matched by patterns)
- recursive patterns

I would like to hear of any tools or languages that offer similar
functionality, and any reactions as to the feasibility or advisability
of such an animal.

Oscar Nierstrasz
C.U.I., University of Geneva, Switzerland

E-mail:	oscar@cgeuge51.bitnet
	oscar@cui.unige.ch
	oscar@cui.uucp
[Fifteen years ago, there was IMP-72 which allowed you to put BNF in the
program that extended the language's syntax.  Since it parsed with Earley's
algorithm, ambiguous syntax was OK.  It worked, and you could indeed add
marvelous new syntax.  It failed for three reasons that I could see:  Each
program was, in effect, written in a new language, making maintenance nearly
impossible.  The underlying compiler had a rather simple idea of what was
going on, making it impossible to add semantically interesting new stuff.
And, finally, Earley is an N^3 algorithm, which made the compiler so slow
that it was painful to use.  -John]
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { decvax | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

steve@hubcap.UUCP ("Steve" Stevenson) (11/08/88)

>
> .... extending languages through patterns ....

There was some work done in the early 70s on extensible languages.  The
one I remember most was Ben Wegbreit's E(xtensible) L(anguage) I.  There
was some play in the literature, but I think that the performance
was poor.  I don't have any personal experiences myself.
-- 
Steve (really "D. E.") Stevenson           steve@hubcap.clemson.edu
Department of Computer Science,            (803)656-5880.mabell
Clemson University, Clemson, SC 29634-1906
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { decvax | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

Gateley@cs.utexas.edu (John Gateley) (11/09/88)

The programming language Scheme, and the macro definition tool Scheme
provide the capabilites you are looking for: they allow easy
implementation of other programming languages within Scheme. This is
true of any Lisp dialect with macros, but Scheme is particularly well
suited for it.
   
The references are:
The Revised^3 report on the algorithmic language Scheme.
MIT, AI memo 848a. (Sept. 86).
   
Syntactic Extensions in the Programming Language Lisp.
Eugene Kohlbecker (Doctoral Dissertation)
Indiana University Tech. Report 199, August 1986.
   
In article <2866@ima.ima.isc.com> you write:
<[Existence of pattern language ...]
   
>The closest examples I can think of are not general tools, but the
>possibility to extend languages through overloading etc.  Smalltalk,
>Prolog and Lisp all make it relatively easy to add new syntactic
>patterns (within certain constraints!).
   
The only constraint on Lisp, is that the patterns and expansions must
be syntactic objects (exrpressions).
   
>Examples of what you should be able to do include:
>- macro substitution
   
It does it.
   
>- expression re-writing (e.g., translation to postfix form)
   
Is this really what you want (Scheme, like all Lisps are prefix)?
If it is, then here is one solution: Write a macro called postfix.
Then (postfix (a + (b + c))) would expand into (a b c + +).
However, I am not sure you would want to change it into postfix, since
Scheme is prefix.
   
>- object classes (declaration, inheritance, ...)
   
This is more than syntactic transformation, but it can be done in Scheme.
It is done with lambda, Scheme's method of generating functions. Basically
an object is a function which is passed messages as arguments. More
complete definitions are doable, just a bit more complex.
   
>- control abstractions (e.g., loops, transactions, ...)
   
Scheme has call/cc, which allows modelling of almost all control constructs
(one exception is BASIC's goto, you have to do a bit more translation on
something like that). This allows you to make your own kind of loops,
backtracking, co-routines, and much more.
   
>- generic functions
   
Generic functions do not have much to do with syntactic forms/expansions.
In fact, the whole point of generic functions is that calls to them are
syntactically the same as calls to normal functions. Defining generic
functions can be done by creating a "lookup" function which uses tables.
Adding methods to the generic function does not change the lookup
function, it modifies the tables. There are other techniques for doing
this too.
   
>Since a pattern language should be a relatively simple tool (in the
>spirit of, say, awk), it should not be concerned with generation of
>addresses, optimization, or most of the nasty things real compilers do.
   
Extend-syntax is a source to source translation, and is also very simple
to use (easier than defmacro, and much easier than older lisp's expansion
function). As an example, here is the fibonacci function:

(extend-syntax (fib 0 1)
  ((fib 0) 1)
  ((fib 1) 1)
  ((fib n) (number? 'n) (with ((n-1 (- 'n 1))
			       (n-2 (- 'n 2)))
	                  (+ (fib n-1) (fib n-2))))
  ((fib n) (fib-fun n))

This causes (fib 0) to expand to 1, (fib 1) to expand to 1, (fib 2) to
expand to (+ 1 1), (fib 3) to expand to (+ (+ 1 1) 1). All this happens
at macro expansion time (compile time). In the other case, fib is being
applied to something that is not known to be a number, so it just
generates a call to the fib function, which will be executed at runtime.
   
>The kind of functionality I believe a pattern language should provide
>would be:
>- definition of parameterized patterns and their semantic actions

See my example above, and more complex pattern matching is provided.
   
>- actions should include bookkeeping (management of scopes and
>  symbol tables) and generation of output (translation)
   
Using the with expressions (see above), you can call arbitrary functions
at compile time, so this is easy.
   
>- a primitive form of type-checking for pattern parameters
>  (i.e., parameters will also be matched by patterns)
   
Extend-syntax provides pattern matching, plus additional checks on the
input (see the (number? 'n) expression above, this insures that the
argument is a number).
   
>- recursive patterns
   
Fib as defined above is recursive.
   
Please contact me if you have more questions.
   
John Gateley
gateley@tilde.csc.ti.com
[Good point, people have been embedding languages in Lisp forever.  -John]
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { decvax | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

chase@orc.olivetti.com (David Chase) (11/10/88)

It's hard to get a sense of what is really being asked, so
I'll provide multiple pointers to stuff that sounds like it might
answer questions.

One interesting paper (in a moderately interesting book) is
"Compiling Pattern Matching" by Lennart Augustsson in
Functional Programming Languages and Computer Architecture 1985
(Springer-Verlag LNCS #201).  Some languages (ML and Miranda, I
believe) contain a selection by pattern match; Augustsson talks
about compiling that into something efficient.  This paper also
contains pointers to other work (by Augustsson and by Cardelli,
I believe).

Michael O'Donnell had been beating on "Equational Programming"
for some time and wrote a book on the subject in 1985 (Equational Logic
as a Programming Language).  His "interpreters" work by successively
pattern-matching and rewriting a "string". I have heard that the
performance of his system is really quite reasonable for certain
applications (I wouldn't use it for ray-tracing).  That book also
contains a number of pointers to other work (Kron, Hoffmann and O'Donnell,
Huet and Levy, e.g.).  The book is worth reading to get a sense of
what can be done (fairly well, as opposed to in theory) by a
rewriting system.

Gyula Mago at UNC-Chapel Hill has been working on an "FFP machine"
(it may have a new name by now) that was supposed to operate by
(massively parallelized) string rewriting.

You could always use operator-precedence parsing for expressions and
introduce new operators as you went.  I played with this once in an
interpreter -- it works, but it has a real air of ad-hockery.  I wouldn't
recommend this for anything real without thinking about it for a
while first.

David
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { decvax | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

liberte@m.cs.uiuc.edu (Daniel LaLiberte) (11/10/88)

I copied an article out of the SIGPLAN Notices, August 1969 which may have
been a conference procedings.  The article "Transformations: The
Extension Facility of Proteus" by James Bell at DEC describes a language
in which you specify the form of an expression and what it is transformed
into using named arguments.  Then the names are used in expressions
which describe what to do at compile time and what to do at run time.

Here is an example of defining complex addition.  First, the pattern
of a complex literal (or type?) is given.

	pattern complex ! complex <- "<REAL:int> + <IMAG:int> I" !

Now complex addition is defined in terms of the complex pattern.

	trans 30 "<X:complex> + <Y:complex>" <- "<Z:complex>" !
	   Z.REAL <- X.REAL + Y.REAL,
	   Z.IMAG <- x.IMAG + Y.IMAG:

The "30" is a priority.  So if the compiler sees "X + Y" where X and
Y are complex, "Z" is substituted where the REAL and IMAG parts of
Z are as specified.


Dan LaLiberte
uiucdcs!liberte
liberte@cs.uiuc.edu
liberte%a.cs.uiuc.edu@uiucvmd.bitnet
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { decvax | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

cordy@QUCIS.BITNET (11/11/88)

An implementation of the exact tool you ask for is described in:

        "TXL: A Rapid Prototyping System for Programming Language Dialects"
        by J.R. Cordy, C.D. Halpern and E. Promislow
        Proc. IEEE International Conference on Computer Languages, Miami, Octobe

TXL is a general purpose programming language extension tool using a general
pattern matching and replacement algorithm isomorphic to Prolog unification.
Dialects are specified as variants of a basis grammar for the base programming
l in two parts: a set of productions for the syntactic forms of the dialect,
descr replacements for the corresponding syntactic forms of the base language,
and and a set of transformation rules for the semantics of these forms, that
recursively reduce dialect syntactic forms to base language constructs
implementing the run model of the dialect. Each of these is described in a
Prolog-like VHLL notation TXL has been used to specify and automatically
implement several dialects of the programming languages Pascal and Turing,
including dialects with parametric polymorphism, object classes and multiple
inheritance.

Jim             James.R.Cordy@QueensU.CA
                Cordy@Qucis.bitnet
                utcsri!qucis!cordy
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { decvax | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request

compilers-sender@ima.ima.isc.com (11/14/88)

In article <2866@ima.ima.isc.com> oscar@cui.unige.ch (NIERSTRASZ Oscar) writes:
>Does anyone know of the existence of a "pattern language" ...

SNOBOL is an earliest example of such a language.  ICON, developed more
recently by Griswald and Co. at University of Arizona is another example
of a language which allows pattern matching, and various other assorted
features: generators, co-expressions etc. 

-- Vinod
[This question has provoked the most interesting set of responses in ages.
I still don't really know what a pattern language is, and seeing the variety
of responses that have arrived, it doesn't look like anyone else does, either.
Keep those cards and letters coming, though.  -John]
[From harvard!ucbvax.Berkeley.EDU!decvax!tl-vaxa!grover (Vinod Grover)]
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.EDU
Plausible paths are { decvax | harvard | yale | bbn}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request