[net.unix-wizards] egrep bug

trb@floyd.UUCP (05/27/83)

In 4.1bsd, when egrep comes upon a file (to be searched) which it can't
open it immediately exits (2) even though there are more files to be
searched.  This is inconsistent with the behavior of grep and fgrep and
it's also inconvenient.  If I'm searching through *.c and one of the
files happens to be unreadable, I don't want egrep to stop.  The fix
for that is easy: just change the exit(2) in the error code after the
open (in function execute()) to a return.  The problem is that *grep
exits with one of the following codes:

	0 - found a match
	1 - found no match
	2 - syntax errors or inaccessible files.

Notice that it's quite possible to find a match and also have
inaccessible files.  I suppose you can return a 2 in that case.
I don't know whose job it is to care for the grep family, or whether
they would want to put a flag in egrep, because they previously
considered inaccessible files fatal.  I think that's ridiculous, but
I don't want to break someone's shell script.

Right now, egrep exits all over the place with hard coded values,
never based upon a previous state, except for the "found a match"
flag, which looks like this:

	exit(nsucc == 0);

which I consider pretty baroque.  

I hacked my egrep so that if it can't open a file it sets a new
flag, cantopen = 2, and returns instead of exiting.  Then, right before
the aforementioned exit(nsucc == 0); (at the end of main()), I added:

	if(cantopen == 2) exit(2);

This seems right to me, but it will change the behavior of egrep in a
semi-documented way.
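
For anyone who wants to see the shape of that change, here is a minimal
sketch in C.  It is not the egrep.y source: final_status() and
open_or_note() are hypothetical names made up for illustration, and only
nsucc and cantopen come from the description above.  The idea is simply
to record the open failure and decide the exit code once, at the end.

```c
#include <stdio.h>

/* Exit codes documented for the grep family. */
enum { ST_MATCH = 0, ST_NOMATCH = 1, ST_TROUBLE = 2 };

/*
 * Hypothetical helper, not from egrep.y: choose the final exit status
 * from the two flags accumulated while scanning.  An unreadable file
 * takes precedence over match/no-match, as in the hack described above.
 */
int final_status(long nsucc, int cantopen)
{
	if (cantopen)
		return ST_TROUBLE;
	return nsucc == 0;	/* same 0/1 result as exit(nsucc == 0) */
}

/*
 * Hypothetical stand-in for the open in execute(): report the failure,
 * record it in *cantopen, and return so the caller can go on to the
 * next file instead of exiting.
 */
FILE *open_or_note(const char *fname, int *cantopen)
{
	FILE *fp = fopen(fname, "r");

	if (fp == NULL) {
		fprintf(stderr, "egrep: can't open %s\n", fname);
		*cantopen = 2;
	}
	return fp;
}
```

With this arrangement main() would end with exit(final_status(nsucc,
cantopen)), so a run that finds matches but also hits an unreadable file
exits with 2.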

	Andy Tannenbaum   Bell Labs  Whippany, NJ   (201) 386-6491

jwp@sdchema.UUCP (05/29/83)

There's another problem in 4.1BSD egrep that I discovered quite a while
back, but I don't remember whether I told anyone (though if you haven't
discovered it by now you probably don't have any applications that care).
It has to do with using the '-f' flag to get the patterns and then trying
to read the data from a pipe, e.g.:

	program | egrep -f patternfile ...

The inner 'if' at the 'out' label in 'main()' reads:

	if(freopen(fname = *argv, "r", stdin) == NULL) {
		fprintf(stderr, "egrep: can't open %s\n", fname);
		exit(2);
	}

The inner 'if' in 'nextch()' reads:

	if((c = getc(stdin)) == EOF) return(0);

This bit of cleverness causes a slight problem: egrep exits after reading
all of the patterns in, since stdin is really pointing at "patternfile"
and nextch() has already run it to EOF.  I have a fix for anyone who cares
and doesn't want to work it out themselves (it's relatively
straightforward: simply use another file pointer for the pattern file).
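
The approach described above can be sketched roughly as follows.  This is
an illustration rather than the actual patch: patfile, open_patterns() and
next_pattern_char() are hypothetical names, and the real egrep.y keeps
this logic in main() and nextch().  The point is that the pattern file
gets its own FILE pointer, so stdin still refers to the pipe when it is
time to read data.

```c
#include <stdio.h>

/* Hypothetical: a dedicated stream for the -f pattern file. */
static FILE *patfile;

/*
 * Open the pattern file with fopen() instead of freopen(..., stdin),
 * leaving stdin untouched.  Returns 0 on success, -1 on failure.
 */
int open_patterns(const char *fname)
{
	patfile = fopen(fname, "r");
	return patfile == NULL ? -1 : 0;
}

/*
 * Pattern reader in the style of nextch(): return the next pattern
 * character, or 0 once the pattern file is exhausted (or not open).
 */
int next_pattern_char(void)
{
	int c;

	if (patfile == NULL)
		return 0;
	c = getc(patfile);
	return c == EOF ? 0 : c;
}
```

On failure the caller would print the same "can't open" message and
exit(2) as before; the only change is which stream gets consumed.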

			John Pierce, Chemistry, UC San Diego
			{ucbvax, philabs}!sdcsvax!sdchema!jwp

guy@rlgvax.UUCP (06/02/83)

It's a V7 bug; a "diff" between V7 and 4.1BSD "egrep.y" reveals that this
part of its behavior hasn't changed.  On the other hand, a quick check of
the System III "egrep.y" reveals that it does NOT exit if a file does
not exist; in fact, it does exactly what you wanted it to (i.e., it eventually
exits with an exit code of 2).  So don't feel (too) guilty about changing it.

Any chance UCB will get an academic USG license ($800 for the first CPU, and
I think free for all subsequent ones) any time in the future?  Then they
will be able to send out more new Bell stuff with their releases, and won't
have to reinvent things in order to get around the license.  It'll help
make a "best of all possible worlds" system (after all, believe it or not,
there are some things that USG UNIX does right and Research UNIX - which
4.?BSD is ultimately derived from - doesn't), as well as the other way around -
ever tried reading in a "tar" tape from a foreign site on a system (like USG)
which lets you (for which read "tar") give away files?  Feh!

		Guy Harris
		RLG Corporation
		{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy