[comp.lang.c++] C++ Grammar

liao@isi.edu (Yingsha Liao) (02/07/91)

Does anyone know where I could find the most recent version of a C++ grammar?
I would appreciate it if someone could give me a copy or tell me how I could
get one.

yingsha

rfg@NCD.COM (Ron Guilmette) (02/08/91)

In article <16676@venera.isi.edu> liao@venera.isi.edu (Yingsha Liao) writes:
>Does anyone know where I could find the most recent version of a C++ grammar?
>I would appreciate it if someone could give me a copy or tell me how I could
>get one.
>
>yingsha

There is no "official" C++ grammar (yet :-) but there is one that is
freely available and (in my opinion) quite good.

For further information, send E-mail to Jim Roskind.  His E-mail address
is <uunet!florida.eng.ileaf.com!jar>.

-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

jncs@uno.edu (02/09/91)

>There is no "official" C++ grammar (yet :-) but there is one that is
>freely available and (in my opinion) quite good.
>
>For further information, send E-mail to Jim Roskind.  His E-mail address
>is <uunet!florida.eng.ileaf.com!jar>.
>
Could he post it to the net for the rest of us, instead of having each of
us ask him for it?

Thanks

jimad@microsoft.UUCP (Jim ADCOCK) (02/12/91)

In article <1991Feb08.130548.6878@iti.com> todd@iti.com (maroC ddoT) writes:
|In article <3786@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
|>There is no "official" C++ grammar (yet :-) but there is one that is
|>freely available and (in my opinion) quite good.
|>
|>For further information, send E-mail to Jim Roskind.  His E-mail address
|>is <uunet!florida.eng.ileaf.com!jar>.
|>
|A C++ grammar 'summary' (not an exact statement of the language) can be
|found in _The Annotated C++ Reference Manual_ by Ellis and Stroustrup.
|The book is based on C++ as of Feb 1990.

I think when people start to dig into it, they'll discover the grammar
"summary" in the ARM is an awfully loose "upper bound" on the actual
C++ grammar.  So I'd recommend trying Roskind's statement of the grammar
first if you're making any kind of tool.  [I don't have experience
with Roskind's grammar, but I have played with the one in the ARM.]

|BTW, has anyone reviewed this book yet? Or offered any comments?

It is widely accepted that serious C++ programmers need and use
the "ARM" -- it's the only thing you'll find that comes close
to defining the complete language.  It's the second or third C++
book everyone should buy.

shap@shasta.Stanford.EDU (shap) (02/21/91)

The problem with Roskind's grammar (at least the last time I looked - it
may have been updated in the past few months) is that it requires the
lexer to resolve "typedef" v/s "identifier", which is probably the most
difficult problem in the language as far as parsing goes.  If anyone has
built a tool that overcomes this limitation, including the associated
symtab support, I would be interested to see it.

Jonathan

jbuck@galileo.berkeley.edu (Joe Buck) (02/23/91)

In article <112@shasta.Stanford.EDU>, shap@shasta.Stanford.EDU (shap) writes:
|> The problem with Roskind's grammar (at least the last time I looked - it
|> may have been updated in the past few months) is that it requires the
|> lexer to resolve "typedef" v/s "identifier", which is probably the most
|> difficult problem in the language as far as parsing goes.  If anyone has
|> built a tool that overcomes this limitation, including the associated
|> symtab support, I would be interested to see it.

I've seen this point raised for years and years and years, and I've never
understood it (mainly about typedefs in C, but it applies to C++).  Why
do people object so strenuously to the parser asking the lexer for help
when distinguishing identifiers from types?  The implementation is obvious,
it's clean, it's easy to do.  Once a token has been declared as a type,
mark it as such in the symbol table, so the next time the token appears,
the lexer returns the token indicating that it's a type, not an identifier.
It's simple, it's clean, and it works.  But yet there is a faction that
screams about the "impurity" of it.  Why?

Actually, I doubt if C++ can be parsed without using this trick, with
anyone's grammar.
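
A minimal sketch of the feedback scheme described above, under the assumption
of a single flat table with no scoping. The names TypeNameTable, classify, and
feedback_demo are made up for illustration and do not come from Roskind's
grammar or any posted tool. The parser records a name when it reduces a
typedef declaration, and the lexer consults the table for every
identifier-shaped token.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// The two token kinds the lexer must choose between for a plain name.
enum class TokenKind { Identifier, TypeName };

// Hypothetical flat table of names that have been declared as types.
// Scopes are deliberately ignored in this sketch.
class TypeNameTable {
public:
    // Called by the parser once it has seen e.g. "typedef int T;".
    void declare_type(const std::string& name) { type_names_.insert(name); }

    bool is_type(const std::string& name) const {
        return type_names_.count(name) != 0;
    }

private:
    std::unordered_set<std::string> type_names_;
};

// Called by the lexer for every identifier-shaped token it scans.
TokenKind classify(const TypeNameTable& table, const std::string& spelling) {
    return table.is_type(spelling) ? TokenKind::TypeName
                                   : TokenKind::Identifier;
}

// Usage sketch: the same spelling changes token kind after the declaration.
void feedback_demo() {
    TypeNameTable table;
    assert(classify(table, "T") == TokenKind::Identifier);
    table.declare_type("T");  // parser feedback after "typedef int T;"
    assert(classify(table, "T") == TokenKind::TypeName);
}
```

The flat table is the simplest form of the trick; it says nothing about what
happens when a name is redeclared in an inner scope.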

--
Joe Buck
jbuck@galileo.berkeley.edu	 {uunet,ucbvax}!galileo.berkeley.edu!jbuck	

hansen@pegasus.att.com (Tony L. Hansen) (02/24/91)

<< From: shap@shasta.Stanford.EDU (shap; Jonathan)
<< In article <112@shasta.Stanford.EDU>, shap@shasta.Stanford.EDU (shap) writes:
<< The problem with Roskind's grammar (at least the last time I looked -
<< it may have been updated in the past few months) is that it requires
<< the lexer to resolve "typedef" v/s "identifier", which is probably the
<< most difficult problem in the language as far as parsing goes.  If
<< anyone has built a tool that overcomes this limitation, including the
<< associated symtab support, I would be interested to see it.

< From: jbuck@galileo.berkeley.edu (Joe Buck)
< I've seen this point raised for years and years and years, and I've
< never understood it (mainly about typedefs in C, but it applies to
< C++).  Why do people object so strenuously to the parser asking the
< lexer for help when distinguishing identifiers from types?  The
< implementation is obvious, it's clean, it's easy to do.  Once a token
< has been declared as a type, mark it as such in the symbol table, so
< the next time the token appears, the lexer returns the token indicating
< that it's a type, not an identifier.  It's simple, it's clean, and it
< works.  But yet there is a faction that screams about the "impurity" of
< it.  Why?
<
< Actually, I doubt if C++ can be parsed without using this trick, with
< anyone's grammar.

I think what Jonathan is complaining about is NOT the difficulty of parsing
C++ or the fact that a smart lexer and symbol table are necessary, but that no
one has posted one that will work with the grammar. In addition, the job is
NOT as simple as you imply, because you have to worry about various scopes
coming into play. The name X may be a type name or it may be a variable name,
depending entirely on context. Once you decide that X is a type name, you
can't just always tag it as a type name; you have to know the context in
which X is being used before deciding which it is. The job is even messier
now because of nested classes. You essentially need multiple symbol tables
that are linked together, and you start at the innermost symbol table and
work your way out. Even that description is not complete;
even the ANSI C++ committee is having difficulty coming up with a set of
rules that accurately describes how to search for a given symbol.
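
The innermost-to-outermost search described above can be sketched roughly as
follows. This is only an illustration for plain block scopes: the names
ScopedSymbols and NameKind are hypothetical, and class scopes, nested classes,
and the lookup rules the committee is still working out are not modeled.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// What a name currently means, as far as the lexer cares.
enum class NameKind { Unknown, Variable, TypeName };

// Hypothetical stack of symbol tables, one per open scope.
class ScopedSymbols {
public:
    ScopedSymbols() { enter_scope(); }  // index 0 is the global scope

    void enter_scope() { scopes_.emplace_back(); }

    void leave_scope() {
        assert(scopes_.size() > 1 && "cannot leave the global scope");
        scopes_.pop_back();
    }

    // Declare (or redeclare) a name in the innermost open scope.
    void declare(const std::string& name, NameKind kind) {
        scopes_.back()[name] = kind;
    }

    // Search from the innermost scope outward; the first hit wins.
    NameKind lookup(const std::string& name) const {
        for (auto scope = scopes_.rbegin(); scope != scopes_.rend(); ++scope) {
            auto found = scope->find(name);
            if (found != scope->end()) return found->second;
        }
        return NameKind::Unknown;
    }

private:
    std::vector<std::unordered_map<std::string, NameKind>> scopes_;
};
```

With this shape, declaring X as a type in the outer scope and as a variable in
an inner one makes lookup("X") report a variable until the inner scope is
left, which is the context dependence the flat "tag it once" approach misses.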

					Tony Hansen
				att!pegasus!hansen, attmail!tony
				    hansen@pegasus.att.com
				      tony@attmail.com

jimad@microsoft.UUCP (Jim ADCOCK) (02/26/91)

In article <112@shasta.Stanford.EDU> shap@shasta.Stanford.EDU (shap) writes:

|The problem with Roskind's grammar (at least the last time I looked - it
|may have been updated in the past few months) is that it requires the
|lexer to resolve "typedef" v/s "identifier", which is probably the most
|difficult problem in the language as far as parsing goes.  If anyone has
|built a tool that overcomes this limitation, including the associated
|symtab support, I would be interested to see it.

Huh?  As far as I can see, the need for the lexer feedback hack is
central to any C/C++ parser using a lex/yacc-like approach.  The need
for the hack is implicit in the C/C++ languages allowing typedef to
define new type names, which can then be used without a "struct",
"class", "union", or "type" keyword of the kind that, in other
languages, lets a parser tell a typedef name from an identifier.
I don't see this as a restriction in Roskind's approach.  I see it as a
central requirement of C/C++ parsing because of typedefs.  All lex/yacc-like
approaches that I have seen use this hack.

How else would you do it?
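
As a small illustration of why the parser needs to know whether a name is a
type, the sketch below uses the identical token sequence "(T) * p" in two
functions. The function names are invented for the example. In the first, T is
a typedef name and the expression is a cast of "*p"; in the second, a local
variable named T hides the typedef and the expression is a multiplication.

```cpp
#include <cassert>

typedef int T;  // at file scope, T names a type

// T is the typedef name here, so "(T) * p" casts the value of "*p" to int.
int cast_reading() {
    int value = 7;
    int* p = &value;
    return (T) * p;
}

// A local variable named T hides the typedef, so the same tokens
// "(T) * p" now multiply two ints.
int multiply_reading() {
    int T = 6;
    int p = 7;
    return (T) * p;
}

// Usage sketch: identical tokens, two different parses and results.
void check_both_readings() {
    assert(cast_reading() == 7);
    assert(multiply_reading() == 42);
}
```

A grammar alone cannot choose between the two parses; something has to report
what T currently names.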