[comp.lang.scheme] distinguishing macros from combinations

doug@snitor.uucp (Doug Moen) (03/08/91)

Aubrey Jaffer has posted 2 articles about readability problems
created by macros, and suggests that the problem is solved either
by abolishing macros, or by syntactically distinguishing special
forms from combinations.

Franklyn Turbak has responded with "Why macros impair readability",
which analyzes in depth several readability problems caused by macros.
He concludes that most of these problems are at least partly
alleviated by making macro calls and combinations syntactically distinct.

I have been interested in this problem for some time.  My interest
was originally aroused by the explanation in early versions of the
Scheme Report of why there was no standard macro facility:

    The main problems with traditional macros are:
    1. They must be defined to the system before any code
       using them is loaded; this is a common source of
       obscure bugs.
    2. They are usually global; macros can be made to follow
       lexical scope rules, but many people find the resulting
       scope rules confusing.
    3. Unless they are written very carefully, macros are
       vulnerable to inadvertent capture of free variables;
       to get around this, macros may have to generate code
       in which procedure values appear as quoted constants.
    There is a similar problem with syntactic keywords if the
    keywords of special forms are not reserved.  If keywords
    are reserved, then either macros introduce new reserved
    words, invalidating old code, or else special forms defined
    by the programmer do not have the same status as special
    forms defined by the system.

The Scheme committee has been working for some time on a hygienic
macro facility that is supposed to solve these problems; it will
be described in R4RS, and the macro facility itself is described
in the January 1991 POPL proceedings.  Although I don't have
access to either of these documents, I do have an early draft of
the macro proposal, as well as some hearsay information.

Based on this information, it would appear that the new hygienic
macro facility has the following properties:
 a. Macro forms are NOT syntactically distinguished from combinations.
 b. There are facilities for defining both global and local macros.
 c. There are no reserved words.  Local macros may shadow program
    variables, and program variables may shadow macros.
 d. If a macro inserts a new local (bound) variable, then that
    variable is automatically renamed to avoid conflicts with
    any other variables or syntactic keywords.
 e. If a macro inserts a new (free) reference to a program variable,
    then any use of that macro will refer to the binding of the
    variable that was visible in the definition of the macro,
    regardless of any local bindings that may surround the use
    of the macro.
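
Properties (d) and (e) can be illustrated with a minimal sketch.  It
assumes the define-syntax / syntax-rules notation found in later Scheme
reports; the post does not show the draft's concrete syntax, and the
names my-or, pair-up, result-d and result-e are hypothetical.

```scheme
;; (d) The macro inserts a new bound variable, tmp.  A hygienic
;; expander renames it, so it cannot collide with the user's tmp.
(define-syntax my-or
  (syntax-rules ()
    ((_ a b) (let ((tmp a)) (if tmp tmp b)))))

(define result-d
  (let ((tmp 5))
    (my-or #f tmp)))        ; the user's tmp is still visible: 5

;; (e) The macro inserts a free reference to the variable list.
;; That reference keeps the binding visible where the macro was
;; defined, even if the use site rebinds list locally.
(define-syntax pair-up
  (syntax-rules ()
    ((_ a b) (list a b))))

(define result-e
  (let ((list vector))      ; local rebinding at the use site
    (pair-up 1 2)))         ; still the global list: (1 2)
```

With a traditional, non-hygienic expander, result-d would instead be #f
and result-e would be a vector.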

Despite all of its marvelous properties, the new macro system does not
make macro and procedure calls syntactically distinct.  As a result,
it does not solve the problems described by Jaffer and Turbak, and it
does not solve the problem of obscure bugs being caused by accidentally
using a macro before it is defined.  This point deserves a further
explanation.  If you use a macro before it is defined, then Scheme
will not realize that the macro call is a special form, and will
compile it as a procedure call.  When you run the resulting code,
the error messages can be very confusing.  If macro calls are
syntactically distinguished from procedure calls, then the compiler
can issue a compile time error message about an undefined macro.
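
A minimal sketch of such a bug, assuming a system that decides what a
form means once, at the moment the enclosing definition is read, and
assuming syntax-rules notation.  The names sort-pair and swap! are
hypothetical, and the exact failure differs between implementations.

```scheme
;; swap! is not yet known to be a macro here, so (swap! a b) is
;; treated as an ordinary combination: look up the variable swap!,
;; evaluate a and b, then apply.
(define (sort-pair p)
  (let ((a (car p)) (b (cdr p)))
    (if (> a b) (swap! a b))
    (cons a b)))

;; The macro definition arrives too late for sort-pair.
(define-syntax swap!
  (syntax-rules ()
    ((_ x y) (let ((tmp x)) (set! x y) (set! y tmp)))))

;; (sort-pair (cons 1 2)) never reaches the bad call and returns
;; (1 . 2).  (sort-pair (cons 2 1)) does reach it, and in such a
;; system fails at run time with a message about swap! as a
;; variable or procedure, not about a missing macro.
```

The failure shows up only for some inputs, and the message points away
from the real cause, which is what makes this kind of bug obscure.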

How can we solve these problems?  My first idea was to make macro
and procedure calls distinct by changing the syntax of a procedure
call to this:
	(funcall function arg ...)
In other words, procedure call becomes just another special form,
introduced by the 'funcall' keyword.  Just as quote, quasiquote,
etc, have abbreviated forms, we provide a special abbreviation for
funcall:
	[function arg ...]

Here is a trivial Scheme program written using this syntax:

(define (flatten x)
	(cond ([null? x] x)
	      ([pair? x] [append [flatten [car x]] [flatten [cdr x]]])
	      (else [list x])))
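
The long form can be tried out in an existing Scheme.  This is only a
sketch, assuming syntax-rules notation: funcall is defined here as a
hypothetical macro, the bracket abbreviation is left out because it
would need a change to the reader, and nothing in the sketch makes the
bare (function arg ...) form illegal.

```scheme
;; funcall as an explicit keyword that introduces a procedure call.
(define-syntax funcall
  (syntax-rules ()
    ((_ f arg ...) (f arg ...))))

;; The same flatten, with every procedure call spelled out.
(define (flatten x)
  (cond ((funcall null? x) x)
        ((funcall pair? x)
         (funcall append
                  (funcall flatten (funcall car x))
                  (funcall flatten (funcall cdr x))))
        (else (funcall list x))))

;; (flatten '(1 (2 3) ((4))))  evaluates to  (1 2 3 4)
```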

There is one flaw in the above proposal.
Although macro names are now in a different namespace than variable names,
syntactic keywords still share the same name space as variables.  For
example, it is not possible to tell from the syntax that 'else' within
a cond form is a keyword, rather than a variable name.
My second proposal addresses this issue.  Rather than distinguish
macro calls from procedure calls, we will distinguish macro names and
keywords from variable names, by requiring keywords to begin with '&',
and forbidding variable names from beginning with '&'.

(&define (flatten x)
	 (&cond ((null? x) x)
		((pair? x) (append (flatten (car x)) (flatten (cdr x))))
		(&else (list x))))

Partitioning variable names and macros like this has an interesting
side effect.  Recall that the hygienic macro system (at least in its
draft form) has no reserved words, and allows local macros to
shadow variables, and variables to shadow macros.  This can lead to
funny results, such as the following:
	(let ((else #f))
	  (cond (#f 3)
		(else 4)
		(#t 5)))	--> 5
Under my proposal, keywords and variable names occupy different name
spaces, and therefore cannot shadow one another.

Making keywords distinct from variables may also simplify the algorithm
that implements hygienic macros.  This supposition is based on a
footnote in my copy of the draft macro proposal:
	Chris Hanson implemented a prototype and wrote a paper
	on his experience, pointing out that an implementation
	based on syntactic closures must determine the syntactic
	roles of some identifiers before macro expansion based
	on textual pattern matching can make those roles apparent.
Does anyone know if this paper is publicly available?
-- 
Doug Moen | doug@snitor.uucp | uunet!snitor!doug | doug.tor@sni.de (Europe)

jaffer@gerber.ai.mit.edu (Aubrey Jaffer) (03/10/91)

doug@snitor.uucp (Doug Moen) brings up some important points:
    The main problems with traditional macros are:
    1. They must be defined to the system before any code
       using them is loaded; this is a common source of
       obscure bugs.
    2. They are usually global; macros can be made to follow
       lexical scope rules, but many people find the resulting
       scope rules confusing.
    3. Unless they are written very carefully, macros are
       vulnerable to inadvertent capture of free variables;
       to get around this, macros may have to generate code
       in which procedure values appear as quoted constants.
    4. There is a similar problem with syntactic keywords if the
       keywords of special forms are not reserved.  If keywords
       are reserved, then either macros introduce new reserved
       words, invalidating old code, or else special forms defined
       by the programmer do not have the same status as special
       forms defined by the system.

Problem 1. could be solved by specifying that macros are lexically
scoped EVEN AT TOP LEVEL.  The order of definitions would then not
matter.  This would require that loaded files be loaded twice; a two
pass lexical system.  Macros typed at top level would then be at the
user's own risk (because code already expanded would not be changed),
but this is already the situation in Lisps.
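
One way to picture the two passes, as a rough sketch only: the forms of
a file are held in a list, and every top-level macro definition is
moved ahead of the code that might use it.  The define-syntax keyword
and the names macro-definition?, keep and two-pass-order are
assumptions made for illustration; a real loader would also have to
expand and evaluate the forms.

```scheme
;; Pass 1 looks only for top-level macro definitions.
(define (macro-definition? form)
  (and (pair? form) (eq? (car form) 'define-syntax)))

;; Keep the forms that satisfy pred, preserving their order.
(define (keep pred forms)
  (cond ((null? forms) '())
        ((pred (car forms))
         (cons (car forms) (keep pred (cdr forms))))
        (else (keep pred (cdr forms)))))

;; Pass 2 sees all macro definitions before any other form.
(define (two-pass-order forms)
  (append (keep macro-definition? forms)
          (keep (lambda (f) (not (macro-definition? f))) forms)))

;; A file that uses my-if before defining it.
(define example-file
  '((define (f x) (my-if x 1 2))
    (define-syntax my-if
      (syntax-rules () ((_ c a b) (if c a b))))))

;; (two-pass-order example-file) puts the define-syntax form first
;; and the definition of f second.
```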

I think that 3. is solved by hygienic macro expansion.

Problem 4. can be alleviated by not allowing reserved symbols to be
bound in macros (just as they cannot be bound by lambdas).  The number
of reserved symbols is small enough that this should not present a
hardship.

Using a prefix character (like `&') to prefix all macro names has the
additional benefit that, even if the Scheme standard committees do not
put this in the specs, it can be used as an element of style, much as
most C programmers use uppercase names for #defines.

If a programmer wants to have ALL special forms similarly marked all
he has to do is define the macros &IF, &DEFINE, &COND, etc to expand
to IF, DEFINE, COND, etc.
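
A minimal sketch of such definitions, assuming syntax-rules notation.
The &cond shown here is a simplified, hypothetical version that handles
only plain (test expr ...) clauses and a final &else clause.

```scheme
;; Each marked form simply expands to its unmarked counterpart.
(define-syntax &if
  (syntax-rules ()
    ((_ test then) (if test then))
    ((_ test then otherwise) (if test then otherwise))))

(define-syntax &define
  (syntax-rules ()
    ((_ form ...) (define form ...))))

;; &else is listed as a literal, so it is matched as a keyword.
(define-syntax &cond
  (syntax-rules (&else)
    ((_ (&else e1 e2 ...)) (begin e1 e2 ...))
    ((_ (test e1 e2 ...)) (if test (begin e1 e2 ...)))
    ((_ (test e1 e2 ...) clause1 clause2 ...)
     (if test (begin e1 e2 ...) (&cond clause1 clause2 ...)))))

;; The flatten example written in the marked style.
(&define (flatten x)
  (&cond ((null? x) x)
         ((pair? x) (append (flatten (car x)) (flatten (cdr x))))
         (&else (list x))))

;; (flatten '(a (b) c))  evaluates to  (a b c)
```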

jeff@aiai.ed.ac.uk (Jeff Dalton) (03/11/91)

In article <13849@life.ai.mit.edu> jaffer@gerber.ai.mit.edu (Aubrey Jaffer) writes:
>Using a prefix character (like `&') to prefix all macro names has the
>additional benefit that, even if the Scheme standard committees do not
>put this in the specs, it can be used as an element of style, much as
>most C programmers use uppercase names for #defines.

I am happy for you to use this style (even though I wouldn't find
it all that helpful myself), but I definitely do not want such a
rule to be part of the definition of Scheme.

One of the main reasons I'd object is that I find the "&" ugly
and intrusive.  Other conventions such as using upper case or
different brackets ([] or {}, say) also suffer from aesthetic
and practical objections.  However, if there were a program that
produced Algol-style listings, I wouldn't mind if it put macro
names in bold face along with if, define, etc.

BTW, does anyone out there have such a program?