[comp.compilers] Flex speed,

markz@ssc.UUCP (Mark Zenier) (11/18/90)

> [Lex is slow, but flex uses newer techniques so that it
> is usually as fast as a hand-coded lexer.  ... -John]

What kind of speedup can be gained by using flex instead of lex?

How much does the include file / new buffer stuff in flex 2.3 cost
(in terms of speed)?

Mark Zenier
markz@ssc.uucp
-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.

vern@lau.ee.lbl.gov (Vern Paxson) (11/22/90)

In article <539@ssc.UUCP> markz@ssc.UUCP (Mark Zenier) writes:
> What kind of speedup can be gained by using flex instead of lex?

Here are some old timings (made in April, 1988).  They compare various
flex compression options with AT&T lex.  These aren't quite guaranteed-
not-to-exceed numbers (flex now takes advantage of rule sets that don't
require backtracking to squeeze out a few more cycles) but fairly close.
The scanner being timed tokenizes C, including recognizing keywords (which
maximizes flex's advantages).

> How much does the include file / new buffer stuff in flex 2.3 cost
> (in terms of speed)?

Virtually nothing.  The buffer structures are only consulted when the
end of the buffer is reached (and sometimes when doing an input() or an
unput()).  Since the default buffer size is 16K bytes, the overhead is
negligble.

		Vern


flex vs. lex timings for a C tokenizer which includes keywords:

Generation times:

	lex	    83.0 secs
	flex	     3.9	# fully compressed scanner
	flex -cfe    7.1	# uncompressed table, equivalence classes
	flex -cf    15.0	# uncompressed table, no equivalence classes

Scanner object file sizes:

	lex	   41.0K bytes
	flex	    9.4K
	flex -cfe  49.6K
	flex -cf  126.5K

Running times on a 28,088 line input (685K characters):

	lex	   29.8 secs
	flex	   19.3
	flex -cfe   9.0
	flex -cf    7.8

The timings were made on a Sun 3/60.  All times are user + system CPU time,
and no hashing of identifiers was done.

Summary:

    For about the same sized scanner, you get a factor of 3 in performance.
    For a 30% faster scanner, you get a scanner 1/4th the size, and it's
	generated in 1/20th the time.
    For a scanner that's 3 times larger, you get a factor of 3.8 in
	performance.

-- 
Send compilers articles to compilers@iecc.cambridge.ma.us or
{ima | spdcc | world}!iecc!compilers.  Meta-mail to compilers-request.