[net.bugs.4bsd] Lex bugs

macrakis@harvard.ARPA (Stavros Macrakis) (09/03/85)

The trivial Lex sources "%%\n" and "%%\n%%\n" appear never to work
correctly.  I am not being perverse in noting this (although I am a
firm believer in the importance of the trivial cases working
correctly); rather, these are the simplest inputs I found that
reproduce the problems I was having.

Call the source file a.l.

A)  lex a.l; cc lex.yy.c -ll; a.out
	-> no-op (returns immediately)
B)  lex a.l; cc lex.yy.c; ld lex.yy.o -ll -lc; a.out
	-> illegal instruction
C)  lex a.l; cc lex.yy.c; ld lex.yy.o -ll
	-> undefined symbols (not surprising)

Note that the Lesk and Schmidt paper in the Unix Programmer's Manual
(3/84) states that "[t]he absolute minimum Lex program is thus `%%',
which translates into a program which copies the input to the output
unchanged."  Case A, which is the only case that executes without
error, does not do this, but rather immediately returns, ignoring the
input.
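
For reference, the documented behavior amounts to something like the
following.  This is only a sketch of what the paper describes, not code
taken from lex or from lex.yy.c; the name copy_unmatched is made up for
the illustration.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Sketch of the behaviour the Lesk and Schmidt paper documents for a
 * scanner with no rules: no character matches a rule, so each one is
 * echoed to the output unchanged.  copy_unmatched is a hypothetical
 * name, not part of lex or of the generated lex.yy.c.
 *
 * Returns the number of characters copied, or -1 on a write error.
 */
long copy_unmatched(FILE *in, FILE *out)
{
    long copied = 0;
    int c;

    assert(in != NULL && out != NULL);
    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            return -1;          /* write error */
        copied++;
    }
    return copied;
}
```

Calling copy_unmatched(stdin, stdout) gives the copy-through that case A
ought to produce instead of returning at once.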

Case B is a different bug.  Case C is there to show that the `-lc' is
necessary.  Apparently, main/_main is not getting defined properly.
dbx also has severe problems in dealing with lex-generated programs.
The problems seem to go away if you link in a program which
explicitly defines main and calls exit.  This does not appear to be
documented anywhere, and even if it were, there should still be some
sort of link error if main is not defined.
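
A minimal sketch of such a driver, under some assumptions: run_scanner
and yylex_standin are hypothetical names, and the stand-in exists only
so that the fragment compiles without lex.yy.c.  The entry point itself
is shown in a comment because it needs the real yylex.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the generated yylex(), so this sketch
 * compiles without lex.yy.c.  Like a generated scanner, it returns 0
 * to report end of input.
 */
int yylex_standin(void)
{
    return 0;
}

/*
 * Body of an explicit driver: call the scanner until it reports end
 * of input, flush stdio, and hand back an exit status.
 */
int run_scanner(int (*scanner)(void))
{
    assert(scanner != NULL);
    while (scanner() != 0)
        ;   /* a nonzero return is a token; a real driver would use it */
    return fflush(stdout) == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}

/*
 * In a file linked with lex.yy.o, the entry point would then be:
 *
 *     int main(void) { exit(run_scanner(yylex)); }
 */
```

The explicit call to exit is the part that matters here: the program
ends through the C library rather than by returning from main.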

Lex version Berkeley 4.1 8/11/83

sdyer@bbncc5.UUCP (Steve Dyer) (09/04/85)

Without commenting on the bugs you are seeing in lex, it should be
emphasized that handing "ld" a C-language object file alone is
strictly verboten unless you know exactly what you are doing, since
the run-time startup prefix needs to be included.  In a sense,
you are 'working beneath' the abstraction provided you by "cc".
You could have avoided this by saying

cc lex.yy.o -ll

which turns into

ld /lib/crt0.o lex.yy.o -ll -lc
-- 
/Steve Dyer
{harvard,seismo}!bbnccv!bbncc5!sdyer
sdyer@bbncc5.ARPA