[net.ai] AIList Digest V3 #14

LAWS@SRI-AI.ARPA (02/06/85)
From: AIList Moderator Kenneth Laws <AIList-REQUEST@SRI-AI.ARPA>


AIList Digest            Tuesday, 5 Feb 1985       Volume 3 : Issue 14

Today's Topics:
  AI Tools - Common Lisp
----------------------------------------------------------------------

Date: 4 Feb 85 13:43:34 EST
From: Charles Hedrick <HEDRICK@RUTGERS.ARPA>
Subject: Common Lisp and Lexical Bindings

I saw a comment on AIList about Common Lisp that should probably be
answered.  The claim is that the Gold-Hill CL interpreter is faster than
the VAX interpreter because the VAX interpreter does lexical bindings.
This is almost certainly false. During the early design stages of Elisp
(the extended-addressing TOPS-20 UCI Lisp), I tried several different
binding strategies. I found that in the interpreter there was almost no
difference in speed caused by a full A-list binding strategy.  This
should be about the same as implementing lexical bindings in the obvious
way.  The main problem with lexical binding in the interpreter is that
it uses CONS cells, since most implementations use some sort of binding
list to keep local bindings. This causes more GC's.  In the DEC-20
CL, I construct these lists on the stack, since bindings are not needed
once you exit from the routine in which they are made.  If someone
constructs a lexical closure, I copy all bindings from the stack to the
heap.  But this happens fairly seldom.  The mechanisms needed to do the
copying from stack to heap are somewhat delicate, but it can be made to
work.  There are also implementations using indirect pointers, if that
turns out to be reasonable on your machine.  A CL interpreter will
probably be slower than a Maclisp interpreter, because of a number of
things:
  - lexical closures
  - the & binding options
  - multiple values
Each of these things can be implemented without adding serious overhead,
but the effect of all of them together is noticeable.  However I think a
properly tuned interpreter should be able to get within a factor of 1.5
of Maclisp.  The current DEC-20 Common Lisp is entirely interpreted.
Even system functions are supplied in interpreted form.  Although more
speed would be desirable (Our compiler will be out Real Soon Now), one
can certainly do real work in our system.  I can see that someone might
want to produce a stripped-down pseudo-CL for some sort of real-time
work.  In that case, a lot of thought would have to go into what to
leave out.  I do not think it makes sense to leave out only lexical
binding.  That alone does not cause serious performance problems.  The
real problem is the size of the language, and the number of options,
particularly in the sequence functions.  This means that good performance
can be obtained only by careful special-case optimizing.  Unfortunately
the language is so large that a compiler that does appropriate optimizations
will take a while to develop.
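
The stack-allocation scheme described above can be modelled with a toy
interpreter.  The following Python sketch is only an illustration, not
the DEC-20 implementation: the names Frame, Interpreter, make_closure
and heap_promotions are all hypothetical, and the "stack" and "heap"
are simulated with a flag, since Python itself allocates everything on
the heap.  It shows binding frames being discarded on function exit,
and being promoted only when a closure captures them.

```python
class Frame:
    """One set of local bindings; conceptually starts out on the stack."""

    def __init__(self, names, values, parent=None):
        self.vars = dict(zip(names, values))
        self.parent = parent      # lexically enclosing frame, if any
        self.on_heap = False      # set when a closure captures this frame

    def lookup(self, name):
        # Walk outward through the lexically enclosing frames.
        frame = self
        while frame is not None:
            if name in frame.vars:
                return frame.vars[name]
            frame = frame.parent
        raise NameError(name)


class Interpreter:
    def __init__(self):
        self.stack = []            # frames of calls currently in progress
        self.heap_promotions = 0   # how many frames had to be "copied"

    def call(self, names, values, body, parent=None):
        frame = Frame(names, values, parent)
        self.stack.append(frame)
        try:
            return body(frame)
        finally:
            # On exit the stack slot is released; only promoted frames
            # remain reachable (through the closures that captured them).
            self.stack.pop()

    def make_closure(self, frame, names, body):
        # Promote the captured frame and any unpromoted ancestors.
        f = frame
        while f is not None and not f.on_heap:
            f.on_heap = True
            self.heap_promotions += 1
            f = f.parent

        def closure(*args):
            return self.call(names, args, body, parent=frame)

        return closure


interp = Interpreter()

# An ordinary call: its bindings never leave the simulated stack.
product = interp.call(["a", "b"], [2, 3],
                      lambda f: f.lookup("a") * f.lookup("b"))

# A call that returns a closure: its frame must be promoted.
add5 = interp.call(
    ["n"], [5],
    lambda f: interp.make_closure(
        f, ["x"], lambda g: g.lookup("x") + g.lookup("n")))

total = add5(10)
```

In this model only the call that creates a closure pays for a
promotion; the ordinary call and the later invocation of the closure
do not, which matches the observation that copying "happens fairly
seldom".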

------------------------------

Date: 22 Sep 84 05:11:10 EDT
From: Charles Hedrick <HEDRICK@RUTGERS.ARPA>
Subject: report of meeting about Common Lisp

[Forwarded from the Rutgers bboard, with permission from the author.
I just recently discovered this September message.  -- KIL]

This is a report on the Common Lisp Workshop, held at the Naval
Postgraduate School, in Monterey, CA, 18 and 19 Sept 84.  The meeting
was called by ARPA, to examine the present state of CL, and to make
suggestions on where it should go next.  The attendees were mostly
associated with organizations that were implementing CL or thinking of
doing so, though there were also some user organizations.  There was a
mix of Universities, commercial vendors, etc.

The main thrust of the meeting seemed to be how CL would make the
transition between a nice idea dreamed up by a few language designers to
a language that is being supported by a large number of vendors and
required by ARPA.  My discussion will follow the overall organization of
the sessions.  (However this order is not chronological.  It is
organized so that some of my users won't have to wade through technical
details to see what is likely to be of the most interest to them.)

   ARPA policy
   Subsets
   Organizational issues
   Proposed extensions
   Workstation/server architecture
   Multi-processing facilities

I. ARPA Policy

The folks who were here from ARPA were Ron Ohlander and Steve Squires.
Apparently Ron will be leaving ARPA (when? I didn't get the time - I
think within a year), and Squires will end up carrying the ball for CL.
Ron did most of the talking in this meeting.

One of the factors that is going to change the nature of CL is the fact
that ARPA is planning to push it strongly.  No final decisions have been made,
but it looks like ARPA is going to require and/or strongly suggest that
CL be used for its research contracts.  This will be particularly the
case for the Strategic Computing project, since they intend for all
contractors involved in that project to be able to share code.  They
made all the normal qualifications about doing this in a way that will
not stifle innovation, and allowing exceptions as appropriate.  But the
evidence is that they will exert very strong pressure towards CL.  (They
mentioned in passing that they also plan to specify that systems they
pay for should use TCP/IP for networking.  Incidentally, they also said
that if you want the RFP for the Strategic Computing initiative in new
architectures, you should ask for
   N0039-84-R-0605(Q)
from the Naval Electronics Systems Command, Code 2013.  The following
phone number is not for Code 2013, but they can refer you:
202-692-6085.)

One of the amusing results of ARPA's policies is that there is now a bit
of a battle over how much of CL you have to implement in order to
qualify.  Certain vendors seem to be interested in providing some degree
of CL compatibility within their existing Lisps, and on that basis want
to be qualified to participate in cases where CL is specified.  It is
not clear to what extent they have sound technical reasons for not
wanting to do full CL, and to what extent they feel that they don't have
time to do so soon enough.

ARPA seems to be willing to put some money into helping CL get off
the ground, and also to supply some clerical support, and possibly
legal and organizational advice.

II. Subsets

One of the questions which was posed is whether the CL community should
specify one or more official subsets.  There are a number of reasons
why subsets might be desirable:
  - several people have proposed a subset for teaching purposes.  The
        language is so big that many courses would probably prefer not
        to deal with the whole thing.  It might be helpful if different
        texts would use the same subset.  This could promote a
        competitive marketplace.  It might also be nice if our AI
        textbooks and our Lisp programming intro assumed the same
        subset! This might not mean a special implementation, since it
        would be easy enough to hide the names of the functions that are
        not in the subset.  Instructional applications also tend to be
        on small machines, and so might also fall afoul of the second
        requirement:
  - it might be nice to implement CL on small machines.  Existing CL
        implementations seem to take between 1 and 2.5 Mbytes of ram
        (more if you count editors, etc.).  It would be nice to be able
        to have CL for the Macintosh and other smaller machines.
  - full CL has features that may make it hard to do efficient
        implementations.  This could be important for slower micros. But
        it also affects people interested in doing "embedded systems",
        i.e. things that have to go inside missiles, or that have to do
        process control, etc. Examples of such features are lexical
        binding, multiple values, and sequence functions.  There was
        considerable disagreement over how significant this issue is.
        Some felt that with enough work all of these problems could be
        overcome, but it does seem clear that the first implementations
        of full CL are going to be noticeably slower than simpler Lisps
        such as PSL.
  - some vendors may not find it practical to implement full CL
        immediately.  They would like to be able to start with a subset,
        and have that be enough to qualify them to participate in
        projects for which ARPA wants CL to be used.

There was no consensus on this issue.  Discussion of subsets got off to
a slow start, but it kept coming up, and tempers started getting hotter
as time went on.  Here is my reading of the general reactions:
  - people seemed to agree that an educational subset might be useful
        and was harmless.  No one seemed to feel that the CL designers
        or this meeting were ready to specify such a subset.  So it
        seems that textbook writers will be left on their own, at least
        until we see how a few of these subsets turn out.
  - little was said about the small-machine problem.
  - there was a lot of discussion about the last two types of subsets.
        (They are hard to separate.)  There were strong feelings on both
        sides.  Some people feel that only a full CL should be called
        CL, and that we should not encourage subsets.  Even the most
        extreme holders of this view did feel sympathetic to purveyors
        of existing Lisp implementations, and did agree that they should
        be encouraged to provide whatever degree of CL compatibility
        they felt they could manage.  This issue is obviously going to
        come up again at the next meeting, and will be discussed hotly
        with ARPA in the meantime.  ARPA will have a strong effect on
        this.  If they plan to require use of CL, then they will have
        the final say on what they mean by CL.  There will surely be
        subsets of this kind.  I would be willing to bet that ARPA will
        allow it for embedded systems and process control, where there
        are clear technical reasons.  I have no idea what they will do
        in other cases.  (Maybe the Ada approach, where subsets are
        allowed as long as there is a clear plan to move to a full
        implementation.)  There is also some indication that ARPA may
        find it acceptable to do work in another Lisp as long as there
        is a program to translate the results into CL.  Clearly a subset
        would qualify here.

Note that some of these "subsets" may not be real subsets.  It is likely
that they will have to add a few features.  E.g. those implementors who
do not want to bear the overhead of generic sequence functions may add a
few type-specific functions, such as STRING-CONCAT.  It is quite likely
that people who do this will want these functions added to the full
language, so that their implementations will be true subsets.
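
The trade-off between a generic sequence function and a type-specific
one can be sketched as follows.  This is a Python analogy rather than
Common Lisp, and both function names are illustrative assumptions (the
second merely echoes the STRING-CONCAT example above).  The generic
version must handle any input sequence type and build whatever result
type is requested; the specific version can skip all of that.

```python
def concatenate(result_type, *sequences):
    """Generic: accepts any mix of sequences, builds the requested type."""
    items = []
    for seq in sequences:
        items.extend(seq)          # element-by-element, type unknown
    if result_type is str:
        return "".join(items)      # strings need their own constructor path
    return result_type(items)


def string_concat(*strings):
    """Type-specific: assumes every argument is already a string."""
    return "".join(strings)


generic_result = concatenate(str, "ab", ["c", "d"])
specific_result = string_concat("ab", "cd")
mixed_result = concatenate(list, "ab", (1, 2))
```

Both calls on strings give the same answer, but only the generic one
has to dispatch on the result type and walk arbitrary sequences, which
is the kind of overhead a compiler would need special-case
optimization to remove.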

III.   Organizational issues

It is interesting to see how much difference it makes that this language
is going to be supported by vendors.  They want to make sure
  - that the language is well-defined.  This means that there is some
        authoritative way to answer questions, and that a validation
        procedure is developed (including a validation suite).
  - that it is possible to make changes where implementation experience
        shows that it is desirable, or as the CL community comes up with
        important new ideas
  - that changes to the language do not happen too quickly
  - that their interests are represented in whatever group is authorized
        to change the language.

It is clear that these requirements imply a person or persons who
control the development of the language.  Initially the language was
designed primarily by a group of 5 people (the so-called "gang of 5"),
with participation by many others over the Arpanet.  The vendors that I
heard from would like those original designers to continue to have a
strong influence over the language.  (Indeed the Gang of 5 is probably
the most enthusiastic to turn things over to a formal organization.) Most
people see that we are going to start some organization analogous to a
standards committee.  However most people do not want to be involved in
ANSI, ISO, etc.  The feeling seems to be that there is too much
bureaucracy, and that CL still needs enough clarification and additions
that we could not tolerate the delays involved in conventional standards
organizations.  Clearly some vendors would like to see an ANSI standard
eventually, but everyone seems to agree that we are not ready yet. Here
is a partial list of things that the people responsible for the language
have to handle:
  - some way to process proposals for changes to the language. Everyone
        envisions that some sort of vote of a large CL community will be
        required to approve changes.  (This has been true all along,
        except for last-minute details.)  So we are looking for a person
        or persons to receive suggestions, distribute them for comment,
        and conduct votes if appropriate.  I suspect that this group
        might also solicit suggestions and possibly make some themselves.
  - some way to give authoritative answers to questions that call for an
        interpretation of the language specification
  - distribution of any decisions that result from these two processes
        to all interested parties
  - an archive of all decisions, and possibly of all discussion
  - a "delta document".  This would represent all changes that will show
        up in the next edition of the CL manual.  I.e. it is with
        respect to the most recently published edition of the CL manual.
  - new editions of the CL manual.  Initially this may happen as often
        as once a year
  - maintenance of online documentation.  This would be used by builtin
        help facilities, etc.  This will require some negotiation with
        Digital Press, as they currently hold a copyright for the
        manual.
  - licensing special editions of the manual.  Vendors may want to
        intersperse details of their implementation in the text, so that
        the user has a single, integrated manual for Vendor X's CL.
        Most people seem to feel that this is OK as long as the manual
        contains the unadulterated text of the official CL manual, with
        all additions being set off visibly (e.g. printed in a
        contrasting color).  They may also allow subsets to cut parts of
        the CL manual, but this will require that there is a clear
        disclaimer that this is not CL.  Anyway, somebody is going to
        have to set reprint policies and monitor what is going on.  This
        will also have to be done in conjunction with Digital Press.
  - a test/validation suite.
  - implementation notes
  - a library of public-domain CL code (the "yellow pages" library)
  - a group to vote on changes and matters of policy.  Generally some
        way of providing "legitimacy" to the whole process.
  - trademarking of the language.  We are not sure whether it is too
        late to trademark CL.  One proposal is to trademark CL-84,
        CL-85, etc.  The date would be associated with a test suite (and
        probably also an edition of the manual - these would be issued
        at the same time).  There is no clear consensus that
        trademarking is needed, but it should at least be looked into.
  - budget for clerical support, mailings, and other expenses associated
        with the above.

We have an interim arrangement to handle all of this for the next 6
months.  A committee will make a proposal for a permanent organization
to take effect at the end of 6 months.  Probably there will be another
CL workshop at that time.  The CL mailing list will continue to be used
to take votes on major issues, and generally to represent the CL
community as a whole.  This list may be split, as there seem to be some
people who are just random users, and do not want to (or should not)
participate in the design decisions.  The gang of 5 will moderate the
mailing list, and will also continue to take somewhat of a leadership
role in technical matters, i.e. answering questions, coordinating
proposed changes, etc.  This coordination includes maintaining archives,
the delta document, etc.  They will investigate some of the other
issues, such as licensing the manual for online use and special
editions, trademarks, and preparation of an initial budget. (This budget
will probably be covered by ARPA.)  They will try to do something about
the Yellow Pages library.  (There is actually already one at CMU.  Maybe
this will just continue for the interim period.)  Committees were
appointed to propose extensions to the language in several important
areas (see below).  The results will be discussed on the CL mailing
list.  There is also a committee to propose a permanent organization.
It is likely that some funding will be needed for the 6 month interim,
if only to handle clerical support.  There was a broad hint that the
Gang of 5 might find ARPA receptive to a proposal that ARPA fund this.

IV.   Proposed extensions

No one was crazy enough to propose that we should come up with
extensions to CL on the spot during the meeting.  Instead we tried to
agree on what areas are the most important to look into. Committees
volunteered to look into each of these areas.  We hope that they will
propose extensions.  I think most of us agree that the actual language
design is going to be done by individuals or very small groups in each
case.  The committees are thus the people who want to be in on initial
discussions, and also people who are considering writing proposals or
parts of proposals.  Here was the initial set of extensions proposed:
  object-oriented programming
  window support
  error handling
  multiprocessing support
  graphics
  iteration (e.g. some sort of macros for writing loops)
  facilities to monitor the internal state of Lisp
  interface to the surrounding system
  networking
  interface to database facilities
  configuration and version management tools
  pattern matching
  calling foreign (non-Lisp) functions
  destructuring
  international character sets
  program manipulation facilities
  coercion among numerical types
  storage management
We took a vote, giving each person 6 votes.  Most of the above
received 0 to 4 votes.  The only significant votes were for the
following items.
  almost 100% -  object-oriented programming
  almost 100% -  error handling
  about 50%   -  display support (windows, etc.)
  about 50%   -  calling foreign (non-Lisp) functions
  about 33%   -  iteration
  about 33%   -  graphics
These are the areas for which committees were set up.

V.   Workstation/server architecture

A number of people expect that we will continue to have systems of
differing power.  That is, in your office will be something for around
$15K.  It will be able to handle CL.  You will do a lot of your
development work on it.  But when you want to process a lot of
real-world data, you will want a more powerful machine.  These machines
would likely be $100K or more.  The folks from ARPA seemed to feel that
configurations like this would be important for their Strategic
Computing projects.

The issue put before us was what sort of language facilities are needed
to support this.  No one seemed to feel that we knew enough about this sort
of thing to be ready to add such facilities to the language.  Of
course individual researchers would make extensions in the course of
their research.  But we would like to see several such projects before
adopting a particular design permanently.

VI.   Multi-processing facilities

This discussion was somewhat similar to the previous one.  We envision
multiprocessing as becoming more important as time goes on.  Again, the
Strategic Computing project is likely to use this.  The question is what
language facilities will be needed to support multiprocessing. No one
feels we know enough about this area to say at the moment.  Brief
mention was made of several pieces of work in this area:
  Gabriel's work at Stanford
  Halstead's at MIT
Both of these are shared-memory.
  the remote sensing project at CMU.  Multiple PERQ's with CL.
        Uses the facilities of the PERQ OS.
This is more like conventional networking.
  BBN's butterfly project.  Many 68000 systems in parallel, with a Lisp
        Machine as a front end to supply the user interface.  Both
        the 68000's and the LM will use CL.

It is clear that there are not only many different approaches, but even
more than one basic model (shared memory and networking are sort of the
opposite ends of the spectrum).  In the end we may need language
facilities to support both styles.  But nobody is ready to say much at
the moment.

Anyone interested in working on multiprocessing support is invited to
send mail to RPG@SU-AI to be added to a mailing list.

------------------------------

End of AIList Digest
********************