kenny@m.cs.uiuc.edu (08/29/90)
> Hello again! This question is directed towards the 'C' and
>compiler gurus out there. I was studying the grammar for the 'C'
>language and I couldn't help but notice that for declarations the
>grammar is context sensitive.
[...]
>Since the 'typedef-name' is an identifier it is impossible to determine that
>it is a type definition without looking at the context. I guess that one could
>do a pre-scan of the source code and build typedef trees but I thought that
>'C' was a context free grammar.

You've hit on a common problem in parsing C -- the typedef vs.
identifier issue. Indeed, C is `context sensitive' in that a typedef
name has a very different syntactic role from that of an ordinary
identifier. Nevertheless, parser generators that handle only
context-free grammars do very well with C.

How do they resolve this contradiction? They discard a certain sort
of purity and introduce an informal feedback path. In processing the
program, a compiler will be maintaining a symbol table, and keeping
typedef names in it. It is a simple matter to make the lexer inspect
the symbol table when processing an identifier, and to return a
different token type for an identifier that has been declared in a
typedef. In this way, the grammar sees different token types for
identifiers and typedef names, and the context sensitivity goes away.

Every C compiler that I've studied has this feedback mechanism between
the semantic phase and the lexer. It's a familiar solution to the
problem.

Kevin, KE9TV
until 8/31: kenny@cs.uiuc.edu
after 9/17: ke9tv@nrtc.northrop.com
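
For illustration, the lexer-to-symbol-table feedback described in this post might look roughly like the sketch below. It is not taken from any particular compiler: the names `declare_typedef`, `classify_identifier`, `TOK_IDENTIFIER` and `TOK_TYPEDEF_NAME` are hypothetical, and the flat array stands in for a real symbol table, which would also have to handle block scope and inner declarations that hide a typedef name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Token kinds the lexer can hand to the parser (hypothetical names). */
enum token_kind { TOK_IDENTIFIER, TOK_TYPEDEF_NAME };

/* Toy symbol table: a flat list of names introduced by typedef.
   Assumption: one global scope only; a real compiler tracks scopes. */
#define MAX_TYPEDEFS 64
static const char *typedef_names[MAX_TYPEDEFS];
static size_t typedef_count = 0;

/* Called by the semantic phase once a typedef declaration is complete.
   Only the pointer is stored, so the caller must keep the string alive. */
void declare_typedef(const char *name)
{
    assert(typedef_count < MAX_TYPEDEFS);
    typedef_names[typedef_count++] = name;
}

/* Called by the lexer for every identifier it scans: the symbol table
   lookup decides which token kind the parser will see. */
enum token_kind classify_identifier(const char *name)
{
    size_t i;

    for (i = 0; i < typedef_count; i++) {
        if (strcmp(typedef_names[i], name) == 0)
            return TOK_TYPEDEF_NAME;
    }
    return TOK_IDENTIFIER;
}
```

Under these assumptions, once `typedef int foo;` has been processed and `declare_typedef("foo")` called, the input `foo * x;` reaches the parser as a typedef name followed by `*` and an identifier, so it can be parsed as a declaration. Before that call the same characters would arrive as two ordinary identifiers and parse as a multiplication.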