[net.lang.lisp] A BNF for Lisp

chris@aquila.UUCP (chris) (11/07/85)

A moment of reflection yields the following BNF syntax for
Lisp.

Define a lisp program as a Sexpression, where Sexpression is
given as:

	Sexpression	->	ATOM
			|	( )
			|	( Sexpression . Sexpression )
			|	( Sexpression-List )

	Sexpression-List ->	Sexpression
			|	Sexpression-List Sexpression

An ATOM in turn is a sequence of characters excluding
whitespace and the characters "()."


		Chris Retterath, Consensys Corp.
		( dciem!aquila!chris )

segre@uicsl.UUCP (11/11/85)

/* Written 12:10 pm  Nov  7, 1985 by chris@aquila.UUCP in uicsl:net.lang.lisp */
/* ---------- "A BNF for Lisp" ---------- */
A moment of reflection yields the following BNF syntax for
Lisp.

Define a lisp program as a Sexpression, where Sexpression is
given as:

	Sexpression	->	ATOM
			|	( )
			|	( Sexpression . Sexpression )
			|	( Sexpression-List )

	Sexpression-List ->	Sexpression
			|	Sexpression-List Sexpression

		Chris Retterath, Consensys Corp.
		( dciem!aquila!chris )

/* End of text from uicsl:net.lang.lisp */

=============================

Actually, the following is unnecessary:

			|	( Sexpression-List )

since an Sexpression can be ( ) or (Sexpression . Sexpression) you can get
a list of Sexpressions by "CONSing" as follows:

(Sexpression . (Sexpression . (Sexpression ...  . Sexpression ]

where the last Sexpression is ( ).

So it is sufficient to give:

	Sexpression	->	ATOM
			|	( )
			|	( Sexpression . Sexpression )

Of course the syntax given before is also correct, but has some redundant
parts. This "simplified" syntax seems to be more pleasing, since it reflects
the fact that a LISP expression can be an atom, NIL, or a structure of CONSes.


=====

Alberto Segre
Artificial Intelligence Research Group
Department of Electrical and Computer Engineering
Coordinated Science Laboratory
University of Illinois at Urbana-Champaign


ARPA or CSNET:	uicsl!segre@uiuc
UUCP:		...!{ihnp4, convex, pur-ee}!uiucdcs!uicsl!segre

ags@pucc-h (Dave Seaman) (11/13/85)

In article <60@aquila.UUCP> chris@aquila.UUCP (chris) writes:
>A moment of reflection yields the following BNF syntax for
>Lisp.
>
>Define a lisp program as a Sexpression, where Sexpression is
>given as:
>
>	Sexpression	->	ATOM
>			|	( )
>			|	( Sexpression . Sexpression )
>			|	( Sexpression-List )
>
>	Sexpression-List ->	Sexpression
>			|	Sexpression-List Sexpression
>
>An ATOM in turn is a sequence of characters excluding
>whitespace and the characters "()."

Actually the production "Sexpression -> ( Sexpression . Sexpression )"
should be omitted and the definition of Sexpression-List should be amended to

	Sexpression-List ->	Sexpression
			|	Sexpression-List Sexpression
			|	Sexpression-List . Sexpression

This allows S-expressions such as ( A B . C ), which could not be obtained
otherwise.  It also allows ( A . B . C ), which is perfectly legal on input,
even though it would never be produced by "print".
-- 
Dave Seaman			 ..!pur-ee!pucc-h!ags

ags@pucc-h (Dave Seaman) (11/13/85)

In article <60@aquila.UUCP> chris@aquila.UUCP (chris) writes:
>A moment of reflection yields the following BNF syntax for
>Lisp.
>
>Define a lisp program as a Sexpression, where Sexpression is
>given as:
>
>	Sexpression	->	ATOM
>			|	( )
>			|	( Sexpression . Sexpression )
>			|	( Sexpression-List )
>
>	Sexpression-List ->	Sexpression
>			|	Sexpression-List Sexpression
>
>An ATOM in turn is a sequence of characters excluding
>whitespace and the characters "()."

OK, except you can't get ( A B . C ) from this syntax.

The production

	Sexpression -> ( Sexpression . Sexpression )

should be changed to

	Sexpression -> ( Sexpression-List . Sexpression )
-- 
Dave Seaman			 ..!pur-ee!pucc-h!ags

choy@whuts.UUCP (CHOY) (11/14/85)

> 
> This allows S-expressions such as ( A B . C ), which could not be obtained
> otherwise.  It also allows ( A . B . C ), which is perfectly legal on input,
> even though it would never be produced by "print".
> -- 
> Dave Seaman			 ..!pur-ee!pucc-h!ags

( A . B . C ) is not legal on most of the Lisp system I know of.

	Chi Choy
	whuts!choy

hopp@nbs-amrf.UUCP (Ted Hopp) (11/14/85)

> A moment of reflection yields the following BNF syntax for
> Lisp.
> 
> Define a lisp program as a Sexpression, where Sexpression is
> given as:
> 
> 	Sexpression	->	ATOM
> 			|	( )
> 			|	( Sexpression . Sexpression )
> 			|	( Sexpression-List )
> 
> 	Sexpression-List ->	Sexpression
> 			|	Sexpression-List Sexpression
> 
> An ATOM in turn is a sequence of characters excluding
> whitespace and the characters "()."
> 
> 
> 		Chris Retterath, Consensys Corp.
> 		( dciem!aquila!chris )

That's good.  I take it you don't want to distinguish between atoms that
evaluate to themselves (like integers) and atoms that start out unbound
(like ATOM).  Also, what's wrong with |this (is an) atom| as an atom (at
least in Franz lisp)?  You don't want to distinguish between strings and
symbols, either?

A major difficulty in defining a lisp syntax in mosts lisp dialects is the
presence of read macros.  Thus 'A usually gets translated to (quote A) by
the reader (and remember, it is the reader that defines the syntax!)  I
don't see how 'A fits the above bnf.

-- 

Ted Hopp	{seismo,umcp-cs}!nbs-amrf!hopp

dimitrov@csd2.UUCP (Isaac Dimitrovsky) (11/16/85)

[]
Ted Hopp writes:
> A major difficulty in defining a lisp syntax in mosts lisp dialects is the
> presence of read macros.  Thus 'A usually gets translated to (quote A) by
> the reader (and remember, it is the reader that defines the syntax!)  I
> don't see how 'A fits the above bnf.

Lisp macros have the computational power of Turing machines;
thus, it's impossible to define a bnf for lisp that would
take read macros into account.

Isaac Dimitrovsky
allegra!cmcl2!csd2!dimitrov   (l in cmcl2 is letter l not number 1)
251 Mercer Street, New York NY 10012     (212) 674-8652

... Hernandez steps in to face ... Orl ... HERchiiiser ... and it's a liiine
driive, deeeeep to the gap in left center ...	- Bob Murphy, Voice of the Mets

chris@aquila.UUCP (chris) (11/18/85)

I never though a BNF for Lisp would generate so much Net traffic!
As some people have commented, the grammar I posted cannot handle
S-expressions of the form ( a . b . c ) -- changes were posted to do this.

Remember that what I posted was a BNF description of a grammar! A BNF
deals with tokens and productions; it is not concerned with lexical analysis,
or even Lisp read macros! An ATOM is any non-list Lisp object; it can
be an integer, a string, even an escaped symbol name (which may contain
suitably quoted parantheses, like |this (is an) atom|. Non-lisp programmers
would certainly find this last example confusing (imagine a C variable
named 'if (x > 0) exit(1)'), but the intent is clear to Lisp programmers.
hopp@nbs-amrf.UUCP (Ted Hopp) has been chasing a red herring when he
expects a grammar description to deal with issues of scanning as well as
parsing; he further confuses himself with premature concern for evaluation
of the read form. The BNF is not concerned with EVALUATION -- if you
don't understand this, then go away and read a compiler book.

Read macros can safely be ignored by the BNF because they ARE macros;
the Lisp reader still gets (after macro processing) a valid sequence of
tokens that is handled by the BNF. That is, 'A ==> (quote A). (Q.E.D.)
(And don't tell me about back-quote; `(a b ,c ,@d ...) is an EVALUATION
mechanism. The Reader produces something like (back-quote (a b (comma a)
(comma-at d) ...)), which is a valid form for later EVALUATION).

Note that just because Lisp places the read macro
handling in the reader does not mean it has to be there; as C programmers
know, macro scanning can be done in another program entirely so long
as valid code is produced. Of course, the Lisp macros are more powerful
because they ARE in the language; however, this greater power is rarely used.
An example would be a read macro that generates different code depending
on the value of some other Lisp symbol; such a macro would be confusing
to debug and dangerous to use in most cases. You can write one if you want
to try convince me of its utility.

		Chris Retterath ( ..!dciem!aquila!chris )
		Consensys Corporation.

ags@pucc-h (Dave Seaman) (11/19/85)

In article <63@aquila.UUCP> chris@aquila.UUCP () writes:
>I never though a BNF for Lisp would generate so much Net traffic!
>As some people have commented, the grammar I posted cannot handle
>S-expressions of the form ( a . b . c ) -- changes were posted to do this.

I was the one who started the ( a . b . c ) discussion.  I cancelled that
article shortly after I sent it, but several people have seen it and
responded to it.

No version of Lisp that I know accepts ( a . b . c ), but ( a b . c ) is
legal Lisp and should be allowed by the grammar.  I posted a revised
BNF to reflect this.
-- 
Dave Seaman			 ..!pur-ee!pucc-h!ags

bhayes@Glacier.ARPA (Barry Hayes) (11/21/85)

In article <63@aquila.UUCP> chris@aquila.UUCP () writes:
>I never though a BNF for Lisp would generate so much Net traffic!
>As some people have commented, the grammar I posted cannot handle
>S-expressions of the form ( a . b . c ) -- changes were posted to do this.
...
>		Chris Retterath ( ..!dciem!aquila!chris )
>		Consensys Corporation.

Okay, boys and girls.  I noticed this before, and at risk of getting
my ankle firmly between my molars...
What the hell does "( a . b . c )" mean?  I quote from "Common Lisp",
(c) 1984 by Digital Press, by that Steele guy:
[Quote begins, page 26, section 2.4]
A dotted list is one whose last cons does not have a nil for its
cdr, rather some other data object (which is not a cons, or the
first-mentioned cons would not be the last cons of the list.)
Such a list is called "dotted" because of the special notation
used for it: the elements of the list are written between
parentheses as before, but after the last element and before
the right parenthesis are written a dot (surrounded by blank space)
and then the cdr of the last cons.
[Back to you, Robin.]

Alrighty, then, wizzbangers, just what DOES "(a . b . c)" mean?
 -Barry "Insert toes, jump forward" Hayes

tim@k.cs.cmu.edu (Tim Maroney) (11/22/85)

I think there are some communication problems here.  What is the BNF
intended to be used for?  Reader macros are anything but a red herring if
you are actually building a Lisp system, including not only interpreters,
but pretty-printers of source code and structured editors.  On the other
hand, if this is merely an academic exercise, then nasty bits of reality
such as the context-sensitive lexical analysis of most Lisps can be safely
ignored.
-=-
Tim Maroney, CMU Center for Art and Technology
tim@k.cs.cmu.edu	uucp: {seismo,decwrl,etc.}!k.cs.cmu.edu!tim
CompuServe: 74176,1360	Once it ruled the Earth; now it delivers pizza.

barmar@mit-eddie.UUCP (Barry Margolin) (11/24/85)

In article <63@aquila.UUCP> chris@aquila.UUCP () writes:
>Read macros can safely be ignored by the BNF because they ARE macros;

Unfortunately, this is not really true.  Read macros can do
arbitrary computations and may have side-effects; '#.form' is defined to
evalute 'form' at read time.  A user-written read macro may be even more
complex.  Also, splicing read macros may be used to have a read
macro expand into any number of objects, while the macros that chris
describes always turn into one object.  Finally, a grammar that doesn't
know about read macros doesn't know how many of the original tokens
translate into a single lisp object.

>the Lisp reader still gets (after macro processing) a valid sequence of
>tokens that is handled by the BNF. That is, 'A ==> (quote A). (Q.E.D.)
>(And don't tell me about back-quote; `(a b ,c ,@d ...) is an EVALUATION
>mechanism. The Reader produces something like (back-quote (a b (comma a)
>(comma-at d) ...)), which is a valid form for later EVALUATION).

This is true of some implementations of backquote, but not the Maclisp
one.  In Maclisp, the backquoted expression might translate into (append
(list 'a 'b c) d ...).  However, '#+LISPM (send window :init)' turns
into absolutely nothing (like a comment) at read time on anything other
than a Lisp Machine.  But my main point is that a macro-ignorant parser
would not realize that the above expression is one s-expression; a
simple-minded scanner would treat that as two or three s-expressions.

>Note that just because Lisp places the read macro
>handling in the reader does not mean it has to be there; as C programmers
>know, macro scanning can be done in another program entirely so long
>as valid code is produced. Of course, the Lisp macros are more powerful
>because they ARE in the language; however, this greater power is rarely used.
>An example would be a read macro that generates different code depending
>on the value of some other Lisp symbol; such a macro would be confusing
>to debug and dangerous to use in most cases. You can write one if you want
>to try convince me of its utility.

As I mentioned above, one of the standard macros in Maclisp, Zetalisp,
and Common Lisp is #+, which is designed to permit read-time
conditionalization based on the environment.  The intent is to allow a
program designed to be run in various environments with different
syntaxes for to work.  For instance, one might write
	(setq variable #+LISPM 123s4 #-LISPM 123e4)
("123s4" is Zetalisp syntax for a small-float with the value 123x10^4,
but other Lisps may not have small-floats and they may choke on the
syntax).
-- 
    Barry Margolin
    ARPA: barmar@MIT-Multics
    UUCP: ..!genrad!mit-eddie!barmar