carroll@snail.CS.UIUC.EDU (03/24/88)
/* Written 11:30 pm Mar 19, 1988 by chris@mimsy.UUCP in snail:comp.lang.c */ Several people suggested replacing lex description such as if return (IF); then return (THEN); else return (ELSE); {id} { symenter(yytext); return (ID); } with {id} { /* first try it as a keyword */ k = keyword_index(yytext); if (k >= 0) return (k); /* not a keyword, must be an identifier */ symenter(yytext); return (ID); } This may (probably does) help in lex, but it is clearly the wrong way around in almost all cases. The lexer has to traverse a DFA to decide that something is an ID. While it is at it it could easily tell whether it is a keyword instead. Obviously this makes the DFA table larger, but the result should run faster. ( ... ) /* End of text from snail:comp.lang.c */ I disagree. For a compiler class, we had to write a lex file, and the official way was to use the IF {return T_IF } etc. thing. It was big and slow. I changed mine to use the single rule for identifier, and look it up in a table, and I got a speed up of roughly 4 in lex/compile time, and 40% reduction in compiled object size. I think that the point is that the DFA that decides if something is a token or not is so much simpler than one that has to check for all of the keywords, that the gain is positive from removing the keyword cases. In addition, I would think that for most cases, you have to find non-builtin "keywords" (e.g, variable names, etc.) as you go, and so have to check for something that is not a keyword but is still an identifier anyway. Alan M. Carroll amc@woodshop.cs.uiuc.edu carroll@s.cs.uiuc.edu ...{ihnp4,convex}!uiucdcs!woodshop!amc Quote of the day : "Touch my soul, catch the very light Hide the moment, from my eager eyes"