[comp.lang.scheme] Macros again

wilson@cs.utexas.edu (Paul Wilson) (06/03/91)

It appears that I should clarify a lot of what my previous postings said.

First, when I referred to lexically scoped macros, I did indeed refer to
(roughly, at least) hygienic macro expansion.  (More on this below.)

Second, when I said that standardized macros are the easiest-to-fix barrier
to portable Scheme code, I meant that at two different levels:

   1. It looks like hygienic macro expansion is getting to be mature and
      well-understood enough that at least the most basic functionality
      could be standardized with a standard syntax, now.  Fancier
      functionality could be standardized later.

   2. Even if that's not the case, it is past time to standardize some
      kind of basic macro functionality.  If necessary, leave it undefined
      whether they're hygienic or not.  For a lot of simple but unimportant
      macro uses, it's okay (more on this below, too).

Now for a little more detail...

  The reason that I refer to hygienic macro expansion as lexically scoped
is that, as I understand it -- correct me if I'm wrong -- there is a very
close analogy between the scoping of syntax bindings and the scoping of
variable bindings.  Not only that, but given Scheme's unified naming
principle, it's the *same* scoping principle in a deep sense.  The
"painting" or "renaming" of variables is really just a mechanism for
implementing multiple syntactic scopes.  The more straightforward
way of doing this would be to "close" syntax bindings in syntactic
environments, using an actual data structure.  Moreover, those
syntactic environments are most naturally implemented as normal
variable-binding environments with a couple of added features.
Instead of using raw symbols as names for syntactic constructs, you'd use
a symbol closed in a syntactic environment, in very much the same
way that you use a closure instead of a raw procedure.
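
To make that concrete, here is a made-up sketch of the kind of
representation I have in mind (the names are invented; this isn't any
particular system's code): a syntactic environment is just a chain of
frames, and a "painted" name is a symbol packaged up with the environment
it was introduced in, much as a closure packages code with an environment.

   (define (extend-env frame env) (cons frame env))     ; frame = alist

   (define (env-lookup env name)
     (cond ((null? env) #f)
           ((assq name (car env)) => cdr)
           (else (env-lookup (cdr env) name))))

   ;; A "closed name": a symbol together with the environment in which
   ;; it was introduced.
   (define (close-name symbol env) (vector 'closed-name symbol env))

   (define (closed-name? x)
     (and (vector? x) (eq? (vector-ref x 0) 'closed-name)))

   ;; A closed name is resolved in the environment it carries, not in
   ;; the environment of the use site -- just like calling a closure.
   (define (resolve-name x use-env)
     (if (closed-name? x)
         (env-lookup (vector-ref x 2) (vector-ref x 1))
         (env-lookup use-env x)))

   ;; For instance, a FOO introduced where FOO was 7 resolves to 7 no
   ;; matter what FOO means at the point of use:
   (define defn-env (extend-env (list (cons 'foo 7)) '()))
   (define use-env  (extend-env (list (cons 'foo 99)) '()))
   (resolve-name (close-name 'foo defn-env) use-env)   ; => 7
   (resolve-name 'foo use-env)                         ; => 99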

   (This differs from my previous posting in which I talked about
compile-time environments, roughly a la the ones in Abelson and Sussman's
Scheme-in-Scheme compiler.  Forget that for now -- it's even simpler if you
just have an interpreter.)

   An example may make this a whole lot easier to understand.  Suppose
we have a form like this:

   (let ((foo 1))
     (let-syntax ((bar <some-macro>))
        (let ((baz 2))
            <body>)))

The straightforward way to interpret this is to create a binding contour
(extend the binding environment) when the first let is entered.  That
extends the prior environment with a binding for foo.

Similarly, when the let-syntax is entered, a new binding contour is
created, binding bar.  But since this contour is syntactic, it has
a couple of special properties -- for one, the value is immutable,
and for another, the value is used specially.  But still, it's a
binding environment pretty much like any other.

And when we enter the inner let, another contour is created for baz.

If we implement binding environments in the obvious way, using a
scoping chain, then by the time we've gotten to the <body>, our
binding environment looks like this:

[baz:2] -> [bar:<some-macro-thingamajig>] -> [foo:1] -> <outer envt>

When hygienic macro expansion "paints" or "renames" a symbol, that's
really just making a unique name that specifies a symbol _bound_
_in_a_particular_environment_.  So if I rename the variable "bar"
as "bar-2" or whatever, what that's really saying is _this_particular_
_bar_in_this_particular_environment.

The use of renaming is just a technique for specifying bindings that
happens to be convenient when you're implementing macro-expansion
as a prepass.

But the more straightforward, Scheme-like way is to integrate it into the
interpreter and use an actual reference to the binding environment
to "paint" symbols introduced in that environment.

I'm not necessarily suggesting that syntactic binding environments should
be implemented this way, rather than using renaming, but it's conceptually
simpler and clearer.  And it avoids needing a prepass that does all of
the front-end work an extra time -- it's all one scoping mechanism,
so it may as well all be done in one pass, right? 
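
To show what I mean by doing it all in one pass, here's the sketch carried
one step further (again invented; only constants, variables, LET-SYNTAX,
macro uses, and applications of host procedures are handled): LET-SYNTAX
just extends the environment with a frame whose values are tagged as
syntax, and the evaluator checks the head of every combination against
that same environment.

   (define (extend-env frame env) (cons frame env))     ; as above
   (define (env-lookup env name)
     (cond ((null? env) #f)
           ((assq name (car env)) => cdr)
           (else (env-lookup (cdr env) name))))

   (define (syntax-binding transformer) (cons 'syntax transformer))
   (define (syntax-binding? v) (and (pair? v) (eq? (car v) 'syntax)))

   (define (sketch-eval exp env)
     (cond ((symbol? exp) (env-lookup env exp))         ; ordinary variable
           ((not (pair? exp)) exp)                      ; self-evaluating
           ((eq? (car exp) 'let-syntax)
            ;; (let-syntax ((name <transformer>) ...) <body>)
            (let ((frame (map (lambda (b)
                                (cons (car b)
                                      (syntax-binding
                                       (sketch-eval (cadr b) env))))
                              (cadr exp))))
              (sketch-eval (caddr exp) (extend-env frame env))))
           (else
            (let ((head (and (symbol? (car exp))
                             (env-lookup env (car exp)))))
              (if (syntax-binding? head)
                  ;; Macro use: expand in place and keep going -- no prepass.
                  (sketch-eval ((cdr head) exp) env)
                  (let ((vals (map (lambda (e) (sketch-eval e env)) exp)))
                    (apply (car vals) (cdr vals))))))))

   ;; Example, with transformers as plain form->form procedures:
   (define env0 (extend-env (list (cons '+ +) (cons 'x 21)) '()))
   (sketch-eval `(let-syntax ((twice ,(lambda (form)
                                        (list '+ (cadr form) (cadr form)))))
                   (twice x))
                env0)
   ;; => 42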


When it comes to compilers (instead of interpreters), these syntactic
binding contours would be present in the compile-time environments,
but not in the actual runtime environments.

(I extended the Scheme-48 compile-time environments in an analogous way
to handle loop labels; you just ignore the syntax contours when computing
lexical addresses, since they won't be reflected in the runtime scoping
chain.  Other than that, they play pretty much the same role in name
resolution as normal contours -- one namespace, one scope chain.
This experience is what prompted my comment about any decent Scheme
compiler being hairy enough that supporting syntax contours is no
biggie.  Just have three kinds of links in the compile-time scope
chain, rather than two.)
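
Concretely, "ignore the syntax contours" amounts to something like this
toy version of lexical-address computation (made-up names, nothing to do
with the real Scheme-48 code), over a compile-time environment that
contains both kinds of contours:

   ;; Compile-time contours: some bind variables, some bind syntax.
   (define (make-contour kind names parent) (vector kind names parent))
   (define (contour-kind c)   (vector-ref c 0))    ; 'variable or 'syntax
   (define (contour-names c)  (vector-ref c 1))
   (define (contour-parent c) (vector-ref c 2))

   ;; Lexical addresses count only the variable contours, because only
   ;; those are reflected in the runtime scoping chain.
   (define (lexical-address name contour)
     (let loop ((c contour) (depth 0))
       (cond ((not c) #f)                           ; free variable
             ((eq? (contour-kind c) 'syntax)
              (loop (contour-parent c) depth))      ; skip, don't count
             ((memq name (contour-names c))
              => (lambda (tail)
                   (cons depth (- (length (contour-names c))
                                  (length tail)))))
             (else (loop (contour-parent c) (+ depth 1))))))

   ;; The LET/LET-SYNTAX/LET example from before:
   (define ct-env
     (make-contour 'variable '(baz)
       (make-contour 'syntax '(bar)
         (make-contour 'variable '(foo) #f))))

   (lexical-address 'baz ct-env)   ; => (0 . 0)
   (lexical-address 'foo ct-env)   ; => (1 . 0) -- the syntax contour is invisible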

My comment about "reifying" syntactic environments was a bit careless.
All I really meant was that you should have some kind of hook into
the scoping mechanism, if you want to make it easy for people to
implement macro mechanisms in the language.  Maybe pattern-matching
(or whatever) can be optional, but you should be able to "get at"
the scope mechanism somehow or other.

(Using the approach above, where symbols are renamed by pairing with
environment objects, the problem of generating unique identifiers
is solved in the most straightforward way -- syntactic binding
environments are first class objects with identities in the compiler code.
Whether they should literally be first class in the language is something
else.)

------

About Scheme as Lisp Lite vs. Lisp Right:  it seems to me that getting
macros right is not going to make a Scheme system all that much bigger,
and if you're worried about that increment in size, you're probably
majoring in the minors.  A bigger worry is garbage collection --
unless you have a generational gc, your locality is going to be the
absolute pits.  Especially if you don't have a stack cache, and you
really heap-allocate all of your continuations and environments.

If you use a simple gc, you're going to use a lot of memory,
or page a lot, or incur a lot of gc cost.  If that's not true in
your system, it's probably because your system is really, really
slow, and that hides the gc cost.  But maybe really, really slow
is okay in an extension language interpreter.


Also, the Lisp Lite vs. Lisp Right controversy might be considered to
be a conflict between the Lisp 1.5 heritage and the Algol 60 heritage.
Lisp Right (hygienic) macros fit in with the Algolic idea of scoping and
abstraction, whereas Lisp Lite (traditional) macros retain the idea of
S-expressions as being interpretable data structures.  Hygienic macros
get away from viewing programs as data structures that you can arbitrarily
hack around with and execute.  It's a move towards a program representation
that has more sophisticated (and restrictive) structure, but is better
suited for writing structured programs.  

It seems that the committee is generally more enamored of the Algol 60
design philosophy than the Lisp 1.5 philosophy.  On the other hand,
if syntactic environments were really first-class, you could probably
unify the two philosophies;  maybe you could even write the pattern-
matching facilities for macros cleanly in the language.  (?)

------

About least-common-denominator macros:  the thing that spurred my original
posting was that I was fixing up Chris Hanson's balanced binary tree
code to run efficiently in a fairly simple Scheme system.  Chris' code takes
advantage of a couple of MIT Scheme features, in particular define-integrable.

A common use of define-integrable is in defining access procedures for
data abstraction.  I didn't want to replace those uses with simple defines
for efficiency reasons -- I wanted to come up with a portable version
of Chris' code that was still very efficient, if not *quite* as
pretty.

The tried and true Lisp way of doing this is to use a macro.  So all
of those access procedures could be changed to macros with no efficiency
cost.  And the macros would expand into things that a small Scheme
compiler like Scheme-48 is smart enough to inline.  So my code would
only be a shade uglier than Chris', but portable and fast.

This is what motivates me to say that some least-common-denominator
macro could be defined now, and enhanced later.  If I can write
a portable macro equivalent to

  (define-integrable (bbt-node-parent node) (vector-ref node 3))

I'll be a lot happier.  Just standardizing the *syntax* for this kind
of trivial macro would be helpful, even if it's not specified how
free variables are treated.  (If my code is the only code that
uses bbt-node-parent -- because it's not an interface to the
ADT -- I can make sure that it's safe because none of the code
does anything weird with the variables referenced in the body.)
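
For instance, with a syntax-rules-style pattern-matching facility (or Kent
Dybvig's extend-syntax, in systems that have it), the definition above
could be rendered portably as something like:

   (define-syntax bbt-node-parent
     (syntax-rules ()
       ((bbt-node-parent node) (vector-ref node 3))))

Since no new variables are introduced or captured, even an old-fashioned
non-hygienic macro facility would do for a form this trivial.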

This is enough to let me write code that's efficient, portable, and
pretty _enough_ that nobody would ever have to go back and clean
it up or upgrade it -- it would be suitable for a library routine.

That's why macros are necessary -- to patch over some of Scheme's other
deficiencies in terms of data abstraction, etc.  Sure, it's a quick-and-
dirty way of solving problems, but it can be done reasonably cleanly
if you're careful.  We need to be able to write decent, efficient,
reusable code soon, rather than waiting until Scheme is complete and perfect.

(And the example above is particularly relevant to a Lisp Lite 
Scheme system -- when you can't count on a fancy compiler, you need
macros for speed-critical code.  For a Lisp Right system, the
mechanism for lexical scope of macros will also support things
like define-integrable.)

   -- Paul

P.S.  I don't claim to be an expert on any of this, especially on
      the technical details of macro expansion;  I'm just a user who
      wants his macros.  Feel free to correct any misconceptions, etc.
-- 
| Paul R. Wilson, Interactive Computing Envts. Lab. lab ph.: (312) 996-9216   |
| U. of Illinois at Chicago EECS Dept. (M/C 154)    wilson@bert.eecs.uic.edu* |
| P.O. Box 4348   Chicago,IL 60680                  fax ph.: (312) 413-0024   |
| *Next yr, Asst Prof, UT Austin CS Dept-- after 8/1 use wilson@cs.utexas.edu |

jaffer@zurich.ai.mit.edu (Aubrey Jaffer) (06/04/91)

As I look towards implementing macros in scm, a few questions occur to
me:

I was planning to replace every invocation of a macro in the code with
the expansion of the macro.  Is there any problem with this?  I
consider macros to be new derived expression types.  Since, according
to the definition, derived expression types can be replaced with the
expansions, this seems the natural thing to do with macros.

Local (non-top level) macros seem like a useless gold plated feature
that will introduce problems with use of symbols in programs.  Scheme
defined keywords are not allowed to be used as variable names.  It
seems a reasonable expectation that macro names also not be allowed as
variable names.  But local macros would seem to require that the local
macro names could be used elsewhere as variables.  All the examples I
have seen so far of local macros are contorted and hard to read.  Can
anyone come up with a really useful example of a local macro?

It seems to me that the quickest way to test and disseminate a new
macro proposal is to write code which reads in a Scheme program
containing macros and writes out a new program with the macros
expanded.  This would be compatible with every Scheme implementation.
Has this been done?  Can it be done?
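
Something along these lines is what I have in mind -- a deliberately naive
sketch that knows nothing about hygiene, QUOTE, or binding forms; it just
rewrites any list whose head has a registered expander and then recurses:

   (define *expanders* '())   ; alist: keyword -> (lambda (form) new-form)

   (define (define-expander! name proc)
     (set! *expanders* (cons (cons name proc) *expanders*)))

   (define (expand-form form)
     (cond ((not (pair? form)) form)
           ((and (symbol? (car form)) (assq (car form) *expanders*))
            => (lambda (entry) (expand-form ((cdr entry) form))))
           (else (map expand-form form))))

   (define (expand-file in-file out-file)
     (call-with-input-file in-file
       (lambda (in)
         (call-with-output-file out-file
           (lambda (out)
             (let loop ((form (read in)))
               (if (not (eof-object? form))
                   (begin (write (expand-form form) out)
                          (newline out)
                          (loop (read in))))))))))

   ;; e.g. (define-expander! 'swap!
   ;;        (lambda (form)
   ;;          (let ((a (cadr form)) (b (caddr form)))
   ;;            `(let ((tmp ,a)) (set! ,a ,b) (set! ,b tmp)))))
   ;;      (expand-file "prog.scm" "prog-expanded.scm")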

jinx@zurich.ai.mit.edu (Guillermo J. Rozas) (06/04/91)

In article <JAFFER.91Jun3170845@aarau.ai.mit.edu> jaffer@zurich.ai.mit.edu (Aubrey Jaffer) writes:

   As I look towards implementing macros in scm, a few questions occur to
   me:

   I was planning to replace every invocation of a macro in the code with
   the expansion of the macro.  Is there any problem with this?  I
   consider macros to be new derived expression types.  Since, according
   to the definition, derived expression types can be replaced with the
   expansions, this seems the natural thing to do with macros.

That is the way that all implementations that I know of handle macros.
It is the only way they can effectively be handled if you don't
execute the S-expressions directly but instead translate them into
different code (native code for a particular real or virtual machine).

   Local (non-top level) macros seem like a useless gold plated feature
   that will introduce problems with use of symbols in programs.  Scheme
   defined keywords are not allowed to be used as variable names.  It
   seems a reasonable expectation that macro names also not be allowed as
   variable names.  But local macros would seem to require that the local
   macro names could be used elsewhere as variables.  All the examples I
   have seen so far of local macros are contorted and hard to read.  Can
   anyone come up with a really useful example of a local macro?

Three independent issues here:

- Whether the keywords of core special forms should be disallowed as
variables.

I believe that the current restriction in the reports, etc. is a
consequence of the lack of macros, and that when they are finally
adopted (if ever), the restriction will be lifted.

- Whether the keywords of macros should be disallowed as variables.

This is a very bad idea.  For example, if your system happens to
provide a macro called TIME for timing/benchmarking expressions, then
the name TIME is no longer available as a variable in programs.
Similarly for DELAY, LOOP, and many other common names.  Furthermore, it
means that programs to be used together must avoid, even for local
variables, the names that either uses as keywords!  This makes combining
code from different sources very painful.
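
To make the point concrete: if the mere existence of a system macro named
TIME outlawed the name as a variable, perfectly ordinary code like this
would be rejected:

   (define (elapsed-seconds start end)
     (let ((time (- end start)))    ; TIME used as a plain local variable
       time))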

- Whether local macros are useful.

I feel the opposite of what you feel.  I think that only local macros
are useful and global macros are horrible (although convenient)
because of all the problems they present with respect to modularity,
separate compilation, etc.

When I use macros, I often wrap a LET-SYNTAX (MIT Scheme) around the
whole file and all the macros are local.  That way I can combine the
program with any other and I know that it won't be affected by a macro
vs. procedure interaction.
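
Schematically, such a file looks something like this (the transformer
notation here is only illustrative, not necessarily what MIT Scheme
accepts, and I'm glossing over how the system scopes the internal
definitions):

   (let-syntax ((fetch (syntax-rules ()
                         ((fetch table key) (cdr (assq 'key table))))))
     (define (lookup-colour thing) (fetch thing colour))
     (define (lookup-size thing)   (fetch thing size))
     ;; ... the rest of the file ...
     (lookup-colour '((colour . blue) (size . 3))))
   ;; => blue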

It all depends on what size programs you expect to put together.
Collisions in the name space become more common the larger the
programs become, and at that point you start using lexical scoping
(even for macros) more heavily in order to avoid potential problems.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (06/04/91)

In article <1510@yoakum.cs.utexas.edu>, wilson@cs.utexas.edu (Paul Wilson) writes:
> (I extended the Scheme-48 compile-time environments in an analogous way ...
> And the macros would expand into things that a small Scheme
> compiler like Scheme-48 is smart enough to inline.

What's Scheme-48?

-- 
Should you ever intend to dull the wits of a young man and to
incapacitate his brains for any kind of thought whatever, then
you cannot do better than give him Hegel to read.  -- Schopenhauer.

oz@nexus.yorku.ca (Ozan Yigit) (06/04/91)

Paul Wilson (wilson@cs.utexas.edu) writes:

>That's why macros are necessary -- to patch over some of Scheme's other
>deficiencies in terms of data abstraction, etc.  Sure, it's a quick-and-
>dirty way of solving problems, but it can be done reasonably cleanly
>if you're careful.  We need to be able to write decent, efficient,
>reusable code soon, rather than waiting until Scheme is complete and perfect.

Having examined, over the past three or four years, just about every serious
Scheme implementation mentioned in Jonathan's implementations list, I am
rather puzzled about this never-ending flap over macros.  All of
those implementations I have looked at (chez, c-scheme, scheme84, scheme88,
macscheme, fools, elk, xscheme, skim, umb-scheme, vincennes scheme, ...)
had either the old-fashioned or the extend-syntax type macro facilities, and
the former types were almost trivially made to support extend-syntax, thanks
to Kent's portable package.

People who need practical, low-level macros have been using them.  People who
need portable syntax extensions anybody can read and understand have been
using Kent's extend-syntax layer.  People who can pull it off have been
working on a proposal for macros-done-right (minus the choice of "***" over
"..." ;-)).

I think Scheme has been practical for quite some time, and there seems to be
no reason to wait for anything.

oz
---
This is a public service announcement: The Scheme Repository can be found on
nexus.yorku.ca [130.63.9.66] under pub/scheme. Latest contributions include
a nice collection of material from Stephen J. Bevan.
---
In seeking the unattainable, simplicity  |  Internet: oz@nexus.yorku.ca
only gets in the way. -- Alan J. Perlis  |  Uucp: utai/utzoo!yunexus!oz

alms@cambridge.apple.com (Andrew L. M. Shalit) (06/05/91)

   jinx@zurich.ai.mit.edu (Guillermo J. Rozas) writes:

      >jaffer@zurich.ai.mit.edu (Aubrey Jaffer) writes:
      >
      >Local (non-top level) macros seem like a useless gold plated feature...

   I feel the opposite of what you feel.  I think that only local macros
   are useful and global macros are horrible (although convenient)
   because of all the problems they present with respect to modularity,
   separate compilation, etc.

   When I use macros, I often wrap a LET-SYNTAX (MIT Scheme) around the
   whole file and all the macros are local.  That way I can combine the
   program with any other and I know that it won't be affected by a macro
   vs. procedure interaction.

   It all depends on what size programs you expect to put together.
   Collisions in the name space become more common the larger the
   programs become, and at that point you start using lexical scoping
   (even for macros) more heavily in order to avoid potential problems.

I agree with both Jinx and Aubrey.

The need for local macros depends a lot on whether you have a module
system which is distinct from LET-style binding.  If you have a
separate module facility, then macros can be scoped at the module
level and you don't need LET-SYNTAX.  However, if LET and friends are
your only way to chunk code, then you probably need something like
LET-SYNTAX.
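
For example (borrowing Paul's bbt-node-parent accessor from earlier in the
thread; the DEFINE-MODULE form below is purely hypothetical, while the
second version uses only LET-SYNTAX):

   ;; (a) With a separate module facility, the macro is scoped to the
   ;;     module and LET-SYNTAX isn't needed:
   ;;
   ;;   (define-module (bbt)
   ;;     (define-syntax bbt-node-parent
   ;;       (syntax-rules ()
   ;;         ((bbt-node-parent node) (vector-ref node 3))))
   ;;     (define (parent-of node) (bbt-node-parent node)))

   ;; (b) With LET and friends as the only way to chunk code, LET-SYNTAX
   ;;     plays the same role:
   (let-syntax ((bbt-node-parent
                 (syntax-rules ()
                   ((bbt-node-parent node) (vector-ref node 3)))))
     (define (parent-of node) (bbt-node-parent node))
     (parent-of (vector 'left 'right 'value 'the-parent)))
   ;; => the-parent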