doug@snitor.uucp (Doug Moen) (03/08/91)
Aubrey Jaffer has posted two articles about readability problems created by macros, and suggests that the problem is solved either by abolishing macros or by syntactically distinguishing special forms from combinations.

Franklyn Turbak has responded with "Why macros impair readability", which analyzes in depth several readability problems caused by macros. He concludes that most of these problems are at least partly alleviated by making macro calls and combinations syntactically distinct.

I have been interested in this problem for some time. My interest was originally aroused by the explanation in early versions of the Scheme Report of why there was no standard macro facility:

    The main problems with traditional macros are:

    1. They must be defined to the system before any code using them
       is loaded; this is a common source of obscure bugs.
    2. They are usually global; macros can be made to follow lexical
       scope rules, but many people find the resulting scope rules
       confusing.
    3. Unless they are written very carefully, macros are vulnerable
       to inadvertent capture of free variables; to get around this,
       macros may have to generate code in which procedure values
       appear as quoted constants. There is a similar problem with
       syntactic keywords if the keywords of special forms are not
       reserved. If keywords are reserved, then either macros
       introduce new reserved words, invalidating old code, or else
       special forms defined by the programmer do not have the same
       status as special forms defined by the system.

The Scheme committee has been working for some time on a hygienic macro facility that is supposed to solve these problems; it will be described in R4RS, and the macro facility itself is described in the January 1991 POPL proceedings. Although I don't have access to either of these documents, I do have an early draft of the macro proposal, as well as some hearsay information.
Based on this information, it would appear that the new hygienic macro facility has the following properties:

a. Macro forms are NOT syntactically distinguished from combinations.
b. There are facilities for defining both global and local macros.
c. There are no reserved words. Local macros may shadow program variables, and program variables may shadow macros.
d. If a macro inserts a new local (bound) variable, then that variable is automatically renamed to avoid conflicts with any other variables or syntactic keywords.
e. If a macro inserts a new (free) reference to a program variable, then any use of that macro will refer to the binding of the variable that was visible in the definition of the macro, regardless of any local bindings that may surround the use of the macro.

Despite all of its marvelous properties, the new macro system does not make macro and procedure calls syntactically distinct. As a result, it does not solve the problems described by Jaffer and Turbak, and it does not solve the problem of obscure bugs caused by accidentally using a macro before it is defined.

This point deserves further explanation. If you use a macro before it is defined, then Scheme will not realize that the macro call is a special form, and will compile it as a procedure call. When you run the resulting code, the error messages can be very confusing. If macro calls are syntactically distinguished from procedure calls, then the compiler can issue a compile-time error message about an undefined macro.

How can we solve these problems? My first idea was to make macro and procedure calls distinct by changing the syntax of a procedure call to this:

    (funcall function arg ...)

In other words, procedure call becomes just another special form, introduced by the 'funcall' keyword. Just as quote, quasiquote, etc., have abbreviated forms, we provide a special abbreviation for funcall:

    [function arg ...]
Here is a trivial Scheme program written using this syntax:

    (define (flatten x)
      (cond ([null? x] x)
            ([pair? x] [append [flatten [car x]] [flatten [cdr x]]])
            (else [list x])))

There is one flaw in the above proposal. Although macro names are now in a different name space than variable names, syntactic keywords still share the same name space as variables. For example, it is not possible to tell from the syntax that 'else' within a cond form is a keyword, rather than a variable name.

My second proposal addresses this issue. Rather than distinguish macro calls from procedure calls, we will distinguish macro names and keywords from variable names, by requiring keywords to begin with '&', and forbidding variable names from beginning with '&'.

    (&define (flatten x)
      (&cond ((null? x) x)
             ((pair? x) (append (flatten (car x)) (flatten (cdr x))))
             (&else (list x))))

Partitioning variable names and macros like this has an interesting side effect. Recall that the hygienic macro system (at least in its draft form) has no reserved words, and allows local macros to shadow variables, and variables to shadow macros. This can lead to funny results, such as the following:

    (let ((else #f))
      (cond (#f 3)
            (else 4)
            (#t 5)))
    --> 5

Under my proposal, keywords and variable names occupy different name spaces, and therefore cannot shadow one another.

Making keywords distinct from variables may also simplify the algorithm that implements hygienic macros. This supposition is based on a footnote in my copy of the draft macro proposal:

    Chris Hanson implemented a prototype and wrote a paper on his
    experience, pointing out that an implementation based on
    syntactic closures must determine the syntactic roles of some
    identifiers before macro expansion based on textual pattern
    matching can make those roles apparent.

Does anyone know if this paper is publicly available?

-- Doug Moen | doug@snitor.uucp | uunet!snitor!doug | doug.tor@sni.de (Europe)
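Properties d and e in the list above can be illustrated with a small sketch. It is written in the `syntax-rules` notation of later Scheme reports, which is an assumption here and is not taken from the draft proposal discussed in the post. The names `my-or`, `tmp`, `demo-d`, and `demo-e` are hypothetical and chosen only for illustration.

```scheme
;; Illustrative only: my-or, tmp, demo-d and demo-e are hypothetical names.
(define-syntax my-or
  (syntax-rules ()
    ((_ a b)
     (let ((tmp a))        ; tmp is a binding inserted by the macro (property d)
       (if tmp tmp b)))))  ; let and if are free references of the macro (property e)

;; Property d: the macro's own tmp is renamed during expansion,
;; so the caller's variable tmp is not captured.
(define (demo-d)
  (let ((tmp 5))
    (my-or #f tmp)))

;; Property e: a local variable named if at the use site does not
;; change the meaning of the if that the macro inserts.
(define (demo-e)
  (let ((if list))
    (my-or #f 7)))
```

With a hygienic expander, `(demo-d)` should evaluate to 5 and `(demo-e)` to 7. A non-hygienic, text-substitution macro would instead return `#f` from `demo-d`, because the caller's `tmp` would be captured by the inserted binding.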
jaffer@gerber.ai.mit.edu (Aubrey Jaffer) (03/10/91)
doug@snitor.uucp (Doug Moen) brings up some important points:

    The main problems with traditional macros are:

    1. They must be defined to the system before any code using them
       is loaded; this is a common source of obscure bugs.
    2. They are usually global; macros can be made to follow lexical
       scope rules, but many people find the resulting scope rules
       confusing.
    3. Unless they are written very carefully, macros are vulnerable
       to inadvertent capture of free variables; to get around this,
       macros may have to generate code in which procedure values
       appear as quoted constants.
    4. There is a similar problem with syntactic keywords if the
       keywords of special forms are not reserved. If keywords are
       reserved, then either macros introduce new reserved words,
       invalidating old code, or else special forms defined by the
       programmer do not have the same status as special forms
       defined by the system.

Problem 1 could be solved by specifying that macros are lexically scoped EVEN AT TOP LEVEL. The order of definitions would then not matter. This would require that files be loaded twice: a two-pass lexical system. Macros typed at top level would then be at the user's own risk (because code already expanded would not be changed), but this is already the situation in Lisps.

I think that problem 3 is solved by hygienic macro expansion.

Problem 4 can be alleviated by not allowing reserved symbols to be bound in macros (just as they cannot be bound by lambdas). The number of reserved symbols is small enough that this should not present a hardship.

Using a prefix character (like `&') to prefix all macro names has the additional benefit that, even if the Scheme standard committees do not put this in the specs, it can be used as an element of style, much as most C programmers use uppercase names for #defines. If a programmer wants to have ALL special forms similarly marked, all he has to do is define the macros &IF, &DEFINE, &COND, etc. to expand to IF, DEFINE, COND, etc.
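The style convention in the last paragraph can be sketched as follows. This is a minimal illustration that assumes the `syntax-rules` notation of later Scheme reports, which the post itself does not mention. The aliases `&if`, `&define`, and `&cond` are written in lower case, and `else` is left unmarked to keep the sketch short; both choices are assumptions made here, not part of the proposal.

```scheme
;; Style-only aliases: each &name simply expands to the unmarked form.
;; Assumes syntax-rules; the alias definitions are illustrative.
(define-syntax &if
  (syntax-rules ()
    ((_ test consequent) (if test consequent))
    ((_ test consequent alternate) (if test consequent alternate))))

(define-syntax &define
  (syntax-rules ()
    ((_ target body ...) (define target body ...))))

(define-syntax &cond
  (syntax-rules ()
    ((_ clause ...) (cond clause ...))))

;; The flatten example from earlier in the thread, using the aliases.
(&define (flatten x)
  (&cond ((null? x) x)
         ((pair? x) (append (flatten (car x)) (flatten (cdr x))))
         (else (list x))))
```

Because the aliases are ordinary macros, nothing stops a programmer from still writing the unmarked `if` or `cond`; the prefix is a convention that a reader can rely on only if the code base follows it consistently.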
jeff@aiai.ed.ac.uk (Jeff Dalton) (03/11/91)
In article <13849@life.ai.mit.edu> jaffer@gerber.ai.mit.edu (Aubrey Jaffer) writes:
>Using a prefix character (like `&') to prefix all macro names has the
>additional benefit that, even if the Scheme standard committees do not
>put this in the specs, it can be used as an element of style, much as
>most C programmers use uppercase names for #defines.

I am happy for you to use this style (even though I wouldn't find it all that helpful myself), but I definitely do not want such a rule to be part of the definition of Scheme.

One of the main reasons I'd object is that I find the "&" ugly and intrusive. Other conventions, such as using upper case or different brackets ([] or {}, say), also suffer from aesthetic and practical objections.

However, if there were a program that produced Algol-style listings, I wouldn't mind if it put macro names in bold face along with if, define, etc. BTW, does anyone out there have such a program?