[comp.std.c++] YACCable C++ grammar

jar@HQ.Ileaf.COM (Jim Roskind x5570) (02/10/91)

							2/9/91

Since more and more people have been asking about the C++ grammar that
I have supplied, I decided to post a set of answers to the most
commonly asked questions.


---------------------------------------------------------------------
Q 1: How can I get a copy of the grammar?

When I release a grammar I post it to comp.lang.c++, and cross post
notification of the posting to comp.lang.c, comp.compilers, and
comp.std.c++.  In addition I have been fortunate enough to arrange to
have it archived.  The most recent release can be obtained as follows:

Doug Lea and Doug Schmidt have graciously offered to provide anonymous
ftp sites for the 6 files, as well as the Berkeley YACC source (if you
need it). 

ics.uci.edu (128.195.1.1) in the ftp/pub directory as
c++-grammar.tar.Z and byacc.tar.Z

mach1.npac.syr.edu in the ftp/pub/C++ directory as cgrammar1.1.tar.Z
and byacc.tar.Z


---------------------------------------------------------------------
Q 2: ... but I don't have ftp access.  Is there any other way to get it?

Write me a convincing email note and I will usually be able to send
you a uuencoded, compressed, tar file.  If you don't have UNIX access,
and are willing, sometimes I will send a uuencoded ZIP archive.
Finally, when life is *really* rough, you can arrange to mail me a
diskette with return postage...  Obviously I would prefer that you use
ftp, and I most resist the floppy mailer route.  After all, you get
the fastest service via ftp, and clearly the slowest service via
postal floppy.

---------------------------------------------------------------------
Q 3: When are you going to release a C++ 2.1 updated grammar?

Basically this question translates into: When will the grammar be made
to support templates, exception handling, and nested types?  To my
knowledge, aside from possible bugs, the current grammar supports all
other features of the language.  In actuality, C++ 2.1 is a slightly
vague concept, because cfront 2.1 does not support all that is written
in the ARM, and the ARM does not describe cfront 2.1.

It should be noted that my grammar cannot be in constant agreement
with such implementations as cfront because a) my grammar is
internally consistent (mostly courtesy of its formal nature and YACC
verification), and b) YACC generated parsers don't dump core. (I will
probably take a lot of flak for that last snipe, but.... every time I
have had difficulty figuring what was meant syntactically by some
construct that the ARM was vague about, and I fed it to cfront, cfront
dumped core.)

Getting back to my translation of the questions: when will my grammar
support templates, exception handling, and nested types?

Templates: The syntax *suggested* by the ARM introduces numerous
ambiguities to the language.  This includes syntactic ambiguities as
well as lexical ambiguities (e.g., the lexical use of angle brackets
for grouping: this can cause nested templates to have pairs of adjacent
angle brackets, which look a heck of a lot like shift operators ">>"
:-o ).  Some syntactically ambiguous examples (with viable semantic
interpretations for both options) were presented at the last ANSI C++
meeting, and there is expected to be a follow-up presentation and
discussion at the upcoming ANSI C++ meeting.  I would rather not
provide syntax for language constructs that are in flux.  I also
consider it a crying shame to see *more* ambiguities introduced into
the language.  It is one thing to complain about the problems arising
from a C heritage; it is another thing to gratuitously add
ambiguities.  I can at least hope that some acceptable and
non-ambiguous syntax will emerge. (Please feel free to write your
congressman, I mean your ANSI C++ representative, about this issue).
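
To make the lexical problem concrete, here is a minimal sketch in C++
of a toy tokenizer that applies the usual longest-match rule.  The
function name tokenize and its tiny token set are hypothetical,
invented for this illustration; they are not taken from the released
grammar or flex input file.  Fed a nested template written without a
space, it reports a single ">>" token where the template syntax needs
two closing brackets:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy tokenizer (illustration only, not the posted lexer).
// It follows the usual longest-match rule, so two adjacent '>'
// characters are always returned as the single token ">>".
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;  // skip whitespace between tokens
        } else if (std::isalnum(c) || c == '_') {
            // identifier, keyword, or number: consume the whole word
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
        } else if (c == '>' && i + 1 < src.size() && src[i + 1] == '>') {
            tokens.push_back(">>");  // longest match wins over two '>'
            i += 2;
        } else {
            tokens.push_back(std::string(1, src[i]));  // one-char token
            ++i;
        }
    }
    return tokens;
}
```

With this sketch, tokenize("A<B<int>> x;") produces ">>" as one token,
while tokenize("A<B<int> > x;") produces two separate ">" tokens, so
only the spaced spelling hands a parser the two closers it expects.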

Exception handling: After many go-arounds, a non-ambiguous syntax
emerged that the language designers felt comfortable with.
Unfortunately, as this construct made its way into the ANSI C++
working document, an "enhancement" changed "throw" into an operator
instead of a statement keyword like "return".  This change and
incomplete documentation left the meaning of "throw (a++), (b++);"
unclear  (i.e., is it "(throw (a++)), (b++);", or "throw ((a++),
(b++));" ).  Since this interpretation plays into a politically
disputed area of resumptive vs non-resumptive exception handling, I
have no interest in encouraging either interpretation.  Bottom line:
when the ANSI C++ committee decides, then the grammatical addition to
my grammar will be straightforward.  There will be at least one paper
at the upcoming ANSI C++ meeting on this topic, and so it is likely
that this will soon be resolved.
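
The two candidate readings can be compared directly by writing the
parentheses out.  The following is a minimal sketch in C++; the two
function names are hypothetical and exist only for this illustration.
It takes no position on which reading the unparenthesized form should
receive; it only shows that the two readings behave differently:

```cpp
// Reading 1: "(throw (a++)), (b++);"
// throw binds tighter than the comma.  The old value of a is thrown
// and the b++ to the right of the comma is never reached.
int throw_binds_tightly(int& a, int& b) {
    try {
        (throw (a++)), (b++);
    } catch (int thrown) {
        return thrown;
    }
    return -1;  // not reached: the try block always throws
}

// Reading 2: "throw ((a++), (b++));"
// The whole comma expression is the operand of throw.  Both increments
// run, and the value of the comma expression (the old b) is thrown.
int throw_takes_comma_operand(int& a, int& b) {
    try {
        throw ((a++), (b++));
    } catch (int thrown) {
        return thrown;
    }
    return -1;  // not reached: the try block always throws
}
```

Starting from a == 1 and b == 10, the first function returns 1 and
leaves b untouched, while the second returns 10 and increments both
variables.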

Nested types: My big reason for not taking care of this was laziness.
There are some significant anticipated problems here, in that the
feedback between the parser and the lexer must be increased, but I
doubt that it is insurmountable.  I have been hoping that I could take
care of the other two topics listed above at the same time as I
provided this enhancement.  Each release is a time consuming effort on
my part as I update the associated papers on the grammar.
Fortunately, I don't know that anyone has *really* implemented nested
types as spec'd, so this is not generally the item that folks are
after me to add.
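
As a rough illustration of what increased parser-to-lexer feedback
could look like, here is a minimal sketch in C++ of a scope stack that
a parser updates and a lexer consults when deciding whether a name is
a type name or a plain identifier.  Everything in it (ScopedTypeNames,
TokenKind, and the member functions) is hypothetical and is not part
of the released grammar; it also ignores qualified names and the
hiding of type names by non-type declarations:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical token classes a lexer might hand to the parser.
enum class TokenKind { Identifier, TypeName };

// Hypothetical scope stack shared by parser and lexer (illustration
// only).  The parser records type declarations and scope changes; the
// lexer asks classify() before returning a token for a name.
class ScopedTypeNames {
public:
    ScopedTypeNames() : scopes_(1) {}  // start with one global scope

    void enter_scope() { scopes_.emplace_back(); }

    void leave_scope() {
        if (scopes_.size() > 1) scopes_.pop_back();  // keep global scope
    }

    void declare_type(const std::string& name) {
        scopes_.back().insert(name);  // visible in the innermost scope
    }

    // Search from the innermost scope outward for a type declaration.
    TokenKind classify(const std::string& name) const {
        for (std::size_t i = scopes_.size(); i-- > 0;) {
            if (scopes_[i].count(name) != 0) return TokenKind::TypeName;
        }
        return TokenKind::Identifier;
    }

private:
    std::vector<std::set<std::string> > scopes_;
};
```

In this sketch a name declared as a type inside a class scope is
classified as TypeName only until leave_scope() is called, which is
the kind of scope-sensitive answer that nested types would require
from the lexer.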


---------------------------------------------------------------------
Q 4: Do you have a symbol table routine? OR When are you going to
release the "rest" of your compiler/translator? OR Why doesn't your
grammar handle #includes?

I started out just distributing a C++ grammar, an ANSI C grammar, and
a paper on the ambiguities in the C++ language (1/90 and 3/90).  In
June of '90 I added a flex input file, and a patch for Berkeley YACC
that draws graphical parse trees automatically.  I have no plans to
release *source* for any additional areas of a compiler/translator
(although I do "market" a shareware ANSI C preprocessor that runs on
DOS, OS/2, Sun3, Sun4, and IBM-RS6000 for $40).

Since I did not even originally intend to release a lexical analyzer,
but I later changed my mind, I may eventually put out some other items
as well.  I was motivated to release my flex input file by the
presence of a less adequate posting to the net.  Basically, if I can
be convinced that releasing software will encourage the use of my
grammar (and hence implicitly support for the standard that I am
pushing), then I will probably help out.


---------------------------------------------------------------------
Q 5: Why don't you hook your grammar up to g++?

I would like to see this done, as I would like to see major vendors
supporting my standard.  Unfortunately, I have not been able to reach
an agreement with the FSF whereby my grammar would remain as freely
distributable as it currently is.  I am fearful that a copyleft would
prevent commercial software vendors from adopting the standard.  If
and when I can be assured that the use of my grammar by commercial
vendors will not be impeded, and the FSF be assured that my
contribution (with its copyright entanglements) to their effort will
not be harmful to their overall goals, then the marriage will likely
take place.  

I actually have a coworker who has offered to "perform the marriage",
but until it becomes clear that the result would become "mainstream
g++", he is not willing to expend the effort.  If someone out there
would like to perform the marriage, and hope that the results can be
accepted into the main GNU camp after the fact, I would like to hear
how it goes.

---------------------------------------------------------------------
Q 6: Why do you care about a C++ standard, and why are you so psyched
about "your" standard?

I would like to build some tools for C++ and make some money selling
and servicing those tools.  If C++ is well defined, then it is
possible for the "little guy" to do this.  If C++ is nothing more than
a shifting-sands standard controlled by a large company (that
humorously enough, doesn't respond well to the request "put it in
writing") then I have no chance.  As it currently stands, the
traditional question of "conformance" boils down to "what does cfront
do?"  I personally cannot afford to track the endless series of
changes and fixes that dozens of programmers can install into cfront.
A clear syntax specification of the language is the most basic first
step.  I am trying to get enough "little guys" together that we can
nail down the standard and get on with business.  As it turned out, a
large number of large companies appear (if I can believe my email)
quite interested in using my grammar, and it is hopefully only a
matter of time before notable products enter the market supporting my
standard.

*IF* there had been a formal description of the syntax of C++, there
is *NO* *WAY* I would have wasted my time developing and distributing
my grammar.  When I began, it was hypothesized that C++ was so mangled
to start with, that LR parsing technology could not handle the task.
Fortunately, I was not aware of this belief at the time.  I emphasize
"MY" grammar only in order to help to give a focus for this uprising
toward standardization.  Even today, I know of no other publicly
available grammar that is anywhere near as detailed and specific as my
grammar (...and yes, I am very aware of the contents of the ARM and
the Draft ANSI C++ Working Documents).


---------------------------------------------------------------------

Jim Roskind-   Author of a YACCable C++ grammar
Independent Consultant
(407)729-4348 or (617)290-0710 x5570
jar@hq.ileaf.com or ...!uunet!leafusa!jar
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.