[comp.lang.c] 'C', is it's grammar context sensit

kenny@m.cs.uiuc.edu (08/29/90)

>	Hello again ! This question is directed towards the 'C' and
>compiler gurus out there. I was studying the grammar for the 'C'
>language and I couldn't help but notice that for declarations the
>grammar is context sensitive.
[...]
>Since the 'typedef-name' is an identifier, it is impossible to determine that
>it is a type definition without looking at the context. I guess that one could
>do a pre-scan of the source code and build typedef trees but I thought that
>'C' had a context-free grammar.

You've hit on a common problem in parsing C -- the typedef vs.
identifier issue.
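
To make the issue concrete, here is a small sketch (the names T, p,
as_declaration and as_expression are mine, chosen only for
illustration).  The same three tokens, `T * p', form a declaration in
one function and a multiplication in the other, and nothing in the
token sequence itself says which:

```c
typedef int T;              /* at file scope, T names a type */

int as_declaration(void)
{
    int value = 5;
    T * p = &value;         /* T is a typedef name: declares p as pointer to int */
    return *p;
}

int as_expression(void)
{
    int T = 6, p = 7;       /* inner scope: T is now an ordinary variable */
    return T * p;           /* same tokens, but this is a multiplication */
}
```

A parser can only pick the right production for `T * p' if it knows
what T currently names.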

Indeed, C is `context sensitive' in that a typedef name plays a very
different syntactic role from that of an ordinary identifier.
Nevertheless, parser generators that handle only context-free grammars
do very well with C.  How do they resolve this contradiction?  They
discard a certain sort of purity and introduce an informal feedback
path from the semantic phase back to the lexer.

In processing the program, a compiler will be maintaining a symbol
table, and keeping typedef names in it.  It is a simple matter to make
the lexer inspect the symbol table when processing an identifier, and
to return a different token type for an identifier that has been
declared by a typedef.  In this way, the grammar sees different token
types for identifiers and typedef names, and the context sensitivity
goes away.
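
A minimal sketch of that feedback, under some loud assumptions: the
names token_kind, declare_typedef and classify_identifier are
hypothetical, the `symbol table' is just a flat array of typedef
names, and block scoping is ignored entirely.  A real compiler would
consult its full scoped symbol table instead.

```c
#include <string.h>

/* Hypothetical token kinds; a real lexer would have many more. */
enum token_kind { TOK_IDENTIFIER, TOK_TYPEDEF_NAME };

#define MAX_TYPEDEFS 64

/* Toy symbol table holding only the names introduced by typedef.
   It stores the caller's pointers, so the strings must outlive it. */
static const char *typedef_names[MAX_TYPEDEFS];
static int typedef_count = 0;

/* Called by the semantic phase once a typedef declaration is complete.
   Returns 0 on success, -1 if the toy table is full. */
int declare_typedef(const char *name)
{
    if (typedef_count >= MAX_TYPEDEFS)
        return -1;
    typedef_names[typedef_count++] = name;
    return 0;
}

/* Called by the lexer after it has scanned an identifier: the token
   kind handed to the parser depends on what the table says. */
enum token_kind classify_identifier(const char *name)
{
    for (int i = 0; i < typedef_count; i++) {
        if (strcmp(typedef_names[i], name) == 0)
            return TOK_TYPEDEF_NAME;
    }
    return TOK_IDENTIFIER;
}
```

The grammar then refers to TOK_TYPEDEF_NAME wherever a type is
expected and to TOK_IDENTIFIER elsewhere, so the parser itself never
has to look at context.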

Every C compiler that I've studied has this feedback mechanism between
the semantic phase and the lexer.  It's a familiar solution to the
problem.

Kevin, KE9TV
until 8/31: kenny@cs.uiuc.edu
after 9/17: ke9tv@nrtc.northrop.com