johnl@ima.UUCP (08/12/87)
I'm building a compiler for Algol 68, which presents some interesting
tokenizing and parsing problems.  Right now, I'm using a p.d. Lex, but I've
heard bad things said about Lex in general, usually that it's slow.  Does
anyone out there know of a (semi-)p.d. Lex-type program that is better?  Or,
more generally, is there a truly better way to tokenize?

As far as Yacc goes, it seems to me that the power of LALR vs. LL parsing,
and the fact that it is table-driven, are big wins, over and above the
development advantages.  (Table-driven gives you: smaller parsers for large
languages, accessibility of the entire parse state for error diagnostics,
and the ability to build other tools that use the same tables, e.g., for
debugging the grammar.)  People like to claim that Yacc is slow, but has
anyone really investigated this?
--
Dale Worley    Cullinet Software
ARPA: cullvax!drw@eddie.mit.edu
UUCP: ...!seismo!harvard!mit-eddie!cullvax!drw

[Most people I know write lexers by hand, because it's so easy.  Lex does
indeed generate big slow lexers -- it's too powerful in the wrong way for
lexing most computer languages.  I've also heard that yacc is slow, but have
never been persuaded that it makes any difference.  What I'd really like to
hear is how you deal with Algol-68's two-level grammar without expanding it
to a context free grammar the size of a small planet.  I've heard of no work
on parsing such grammars directly.  -John]
--
Send compilers articles to ima!compilers or, in a pinch, to Levine@YALE.ARPA
Plausible paths are { ihnp4 | decvax | cbosgd | harvard | yale | cca}!ima
Please send responses to the originator of the message -- I cannot forward
mail accidentally sent back to compilers.  Meta-mail to ima!compilers-request
johnl@ima.UUCP (08/17/87)
At the Winter 1987 Usenix, Van Jacobson of LBL presented a paper describing
a much improved version of Lex.  He got the main processing down to a single
table lookup in memory!  (The rumor was that it was just marginally slower
than 'cat'.)  I don't know what the current status of the project is; I would
very much like either a copy of his paper or the program itself.  Anyone know
more than I?

Randy Smith @ NCI Supercomputer Facility
c/o PRI, Inc.
PO Box B, Bldng. 430        Phone: (301) 698-5660
Frederick, MD 21701         Uucp: ...!uunet!mimsy!elsie!ncifcrf!randy
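For readers wondering what "a single table lookup" per input character can
look like, here is a minimal sketch of a table-driven scanner loop in C.  It
is not Van Jacobson's code, nor anything from Lex; the state names, the
table layout, and the functions build_table and count_tokens are all made up
for illustration.  The toy automaton only distinguishes identifiers from
numbers and treats everything else as a separator.

```c
/* Hypothetical states for a toy scanner; the names are illustrative only. */
enum { S_START, S_IDENT, S_NUMBER, N_STATES };

/* next_state[s][c] is the state reached from state s on input byte c. */
static unsigned char next_state[N_STATES][256];

/* Fill the transition table once, before any scanning is done. */
void build_table(void)
{
    int s, c;

    for (s = 0; s < N_STATES; s++) {
        for (c = 0; c < 256; c++) {
            int letter = (c >= 'a' && c <= 'z') ||
                         (c >= 'A' && c <= 'Z') || c == '_';
            int digit = (c >= '0' && c <= '9');

            if (!letter && !digit)
                next_state[s][c] = S_START;      /* separator ends a token */
            else if (s == S_START)
                next_state[s][c] = letter ? S_IDENT : S_NUMBER;
            else
                next_state[s][c] = (unsigned char)s; /* stay in the token */
        }
    }
}

/*
 * Count identifiers and numbers in a NUL-terminated string.  The only work
 * done to classify each character is one indexed load from next_state;
 * the comparisons that follow just tally token starts.
 */
void count_tokens(const char *text, unsigned long *idents,
                  unsigned long *numbers)
{
    int state = S_START;

    *idents = 0;
    *numbers = 0;
    for (; *text != '\0'; text++) {
        int prev = state;

        state = next_state[state][(unsigned char)*text];
        if (prev == S_START && state == S_IDENT)
            ++*idents;
        else if (prev == S_START && state == S_NUMBER)
            ++*numbers;
    }
}
```

A caller would run build_table() once and then call count_tokens() on each
buffer.  A generated scanner would have many more states and would record
token boundaries rather than just counting, but the inner loop has the same
shape: the per-character cost is dominated by one array index, which is why
such a scanner can get within sight of a plain copy loop.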
vern%lbl-helios@lbl-rtsg.arpa (Vern Paxson) (08/20/87)
> At the Winter 1987 Usenix Van Jacobson of LBL labs presented a
> paper describing a much improved version of Lex...
> processing down to a single table lookup in memory!  (The rumor was
> that it was just marginally slower than 'cat').  I don't know what the
> current status of the project is; I would very much like either a copy
> of his paper or the program itself.  Anyone know more than I?

A student I'm supervising is adding Van's fast algorithm to my lex re-write
("flex").  He's finished with the basics of the implementation, but there's
still a lot of tuning and clean-up before it'll be ready for a beta-test and
subsequent release.  (Details on distribution terms are still being worked
out, but it looks like it'll have a copyright that says "freely redistribute,
but don't make a significant enhancement without contacting us first, and be
willing to give UC rights to the enhancement"; possibly it'll carry a more
generous, GNU-like copyright.)

While there's still tuning to do, the preliminary results, done for a C
tokenizer, are:

(1) fast as cat?  No, not quite (I'll be going over the implementation with
    Van to see where tuning might be needed);

(2) fast as a hand-coded scanner?  Well, as things stand now, it is about
    15% faster than PCC's tokenizer, which seems to have been done with
    some care.

Vern Paxson                      vern@lbl-csam.arpa
Real Time Systems                ucbvax!lbl-csam.arpa!vern
Lawrence Berkeley Laboratory     (415) 486-6411