[comp.bugs.sys5] perl finds bug in fgrep

bob@dhw68k.cts.com (Bob Best) (08/15/88)

The enclosed shar file contains a set of files that provide evidence that
the SYSV fgrep(1) is broken.  I discovered this while doing some benchmarks
between /bin/fgrep and a perl version of fgrep.
----------------cut here-------------------
#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  To overwrite existing
# files, type "sh file -c".  You can also feed this as standard input via
# unshar, or by typing "sh <file", e.g..  If this archive is complete, you
# will see the following message at the end:
#		"End of shell archive."
# Contents:  README bad good text
# Wrapped by bob@dallnix on Sun Aug 14 11:40:00 1988
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f 'README' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'README'\"
else
echo shar: Extracting \"'README'\" \(949 characters\)
sed "s/^X//" >'README' <<'END_OF_FILE'
XWhile doing some benchmarking between a perl version of fgrep and the
XSYSV binary version of fgrep, I discovered that the perl version was
Xextracting more lines containing the pattern than the SYSV version.
XI asked myself, "is it possible that users have been relying for years
Xon a grep tool that has only now been found faulty?"  Several more
Xtests using randomly extracted words from /usr/dict/words convinced me
Xthat, indeed, the perl version had found a bug in the SYSV version.
X
XIn an effort to narrow down the bug as much as possible, I trimmed text until
XI finally arrived at the enclosed files that manifest the bug.
XTo verify the result, I tested this using fgrep on two different systems
Xrunning SYSV Unix.
X
XTo test this on your system, do the following:
X
Xfgrep -f bad text
Xfgrep -f good text
X
XIf your system has the bug, the test using 'bad' will fail to extract the
Xline that contains the string 'stall'.
X
XBob Best (bob@dhw68k.cts.com)
END_OF_FILE
if test 949 -ne `wc -c <'README'`; then
    echo shar: \"'README'\" unpacked with wrong size!
fi
# end of 'README'
fi
if test -f 'bad' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'bad'\"
else
echo shar: Extracting \"'bad'\" \(26 characters\)
sed "s/^X//" >'bad' <<'END_OF_FILE'
Xinsouciant
Xnichrome
Xstall
END_OF_FILE
if test 26 -ne `wc -c <'bad'`; then
    echo shar: \"'bad'\" unpacked with wrong size!
fi
# end of 'bad'
fi
if test -f 'good' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'good'\"
else
echo shar: Extracting \"'good'\" \(26 characters\)
sed "s/^X//" >'good' <<'END_OF_FILE'
Xincreasing
Xnichrome
Xstall
END_OF_FILE
if test 26 -ne `wc -c <'good'`; then
    echo shar: \"'good'\" unpacked with wrong size!
fi
# end of 'good'
fi
if test -f 'text' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'text'\"
else
echo shar: Extracting \"'text'\" \(21 characters\)
sed "s/^X//" >'text' <<'END_OF_FILE'
XI am installing Unix
END_OF_FILE
if test 21 -ne `wc -c <'text'`; then
    echo shar: \"'text'\" unpacked with wrong size!
fi
# end of 'text'
fi
echo shar: End of shell archive.
exit 0
-- 
Bob Best
uucp: ...{trwrb,hplabs}!felix!dhw68k!bob	InterNet: bob@dhw68k.cts.com
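
The cross-check described in the README can be approximated with a small
reference matcher. This is a minimal sketch in Python, not the perl script
from the post (which was not included); the name fgrep_reference is made up
for illustration. It assumes that "fgrep -f patterns text" should print every
line that contains at least one pattern as a literal substring.

```python
def fgrep_reference(patterns, lines):
    """Return the lines containing at least one pattern as a literal substring.

    Brute force on purpose: it is slow, but simple enough to trust as an
    oracle when checking the output of a faster tool.
    """
    return [line for line in lines if any(p in line for p in patterns)]


# Contents of the 'bad', 'good' and 'text' files from the shar archive.
bad = ["insouciant", "nichrome", "stall"]
good = ["increasing", "nichrome", "stall"]
text = ["I am installing Unix"]

print(fgrep_reference(bad, text))   # ['I am installing Unix']
print(fgrep_reference(good, text))  # ['I am installing Unix']
```

Both pattern files should select the line, because "installing" contains
"stall". An fgrep that prints the line for 'good' but not for 'bad' disagrees
with this oracle.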

bob@dhw68k.cts.com (Bob Best) (08/15/88)

As a follow-up to my earlier posting regarding the bug in fgrep(1),
I have narrowed it down to the following 'minimal' set of strings.

In the 'strings' file:
abcd
cef
bg

In the 'text' input file:
abcef

Then, enter the command:

fgrep -f strings text

If you have the bug, fgrep will fail to find the string 'cef' in the text line.
-- 
Bob Best
uucp: ...{trwrb,hplabs}!felix!dhw68k!bob	InterNet: bob@dhw68k.cts.com

bob@dhw68k.cts.com (Bob Best) (08/15/88)

Actually, the following reduced set of strings is sufficient:

in 'strings':
abcd
ce
bf

in 'text':
abce

then:
fgrep -f strings text

If you have the bug, fgrep prints nothing, even though 'abce' contains 'ce'.

-- 
Bob Best
uucp: ...{trwrb,hplabs}!felix!dhw68k!bob	InterNet: bob@dhw68k.cts.com
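
The thread does not say what the actual defect in fgrep was. The sketch below
shows one hypothetical way an Aho-Corasick style matcher (the technique fgrep
is traditionally built on) could produce exactly these symptoms: if the
failure links are built by looking only one step back instead of following
the whole failure chain, the link from "abc" to "c" is lost and "ce" is never
found. The names build and matched, and the buggy flag, are assumptions made
for illustration and are not taken from any fgrep source.

```python
from collections import deque


def build(patterns, buggy=False):
    """Build goto, failure and output tables for a set of literal patterns."""
    goto = [{}]    # goto[state][char] -> next state; state 0 is the root
    out = [set()]  # patterns recognised on entering each state
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)

    fail = [0] * len(goto)           # depth-1 states fail to the root
    queue = deque(goto[0].values())  # breadth-first over the trie
    while queue:
        r = queue.popleft()
        for ch, s in goto[r].items():
            queue.append(s)
            f = fail[r]
            if not buggy:
                # Correct construction: walk back along the failure chain
                # until some state has a transition on ch (or we hit root).
                while f and ch not in goto[f]:
                    f = fail[f]
            # Hypothetical defect when buggy=True: only fail[r] is tried.
            fail[s] = goto[f].get(ch, 0)
            out[s] |= out[fail[s]]
    return goto, fail, out


def matched(patterns, line, buggy=False):
    """Return the set of patterns found anywhere in line."""
    goto, fail, out = build(patterns, buggy)
    s, found = 0, set()
    for ch in line:
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        found |= out[s]
    return found


print(matched(["abcd", "ce", "bf"], "abce"))              # {'ce'}
print(matched(["abcd", "ce", "bf"], "abce", buggy=True))  # set()
```

Under this assumption the third string matters: "bf" makes the failure link
of "ab" point at "b", and "b" has no transition on "c", so a one-step
construction gives up there instead of falling back to the root and reaching
"c". The same model misses 'stall' with the 'bad' file and finds it with the
'good' file from the original posting.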

gwyn@smoke.ARPA (Doug Gwyn) (08/16/88)

In article <10535@dhw68k.cts.com> bob@dhw68k.cts.com (Bob Best) writes:
>The enclosed shar file contains a set of files that provide evidence that
>the SYSV fgrep(1) is broken.

Geoff Whale published fixes for this bug, and Gould CSD found yet another
problem, some time before 1-Nov-1986 when I installed the fixes in the
BRL System V emulation (along with case folding bug fixes and speed-ups
from Guy Harris).  Get them from there...

andrew@alice.UUCP (08/16/88)

if it's any consolation, my new grep (which right now is probably
the grep for system v, rel 4 if all goes well) does not exhibit
this bug. in fact, fairly early on in my testing i discovered
that fgrep was less reliable than my grep.

treval@tauros.UUCP (Trevor Luker) (08/23/88)

g'day all,

	try the following (On NCR Tower SysV.2):-

	$ ps -ef | grep xxxxxx
Gives->	treval  8911  8910  4 13:12:01 X00t  0:00 grep xxxxxx
	$ ps -ef | fgrep xxxxxx
Gives->	$


	Is this a related problem to the grep(1) bug? 
	Here fgrep(1) seems to be the one that is broken. 
	Is this because fgrep forks and the arguments no longer appear in the 
	argument space? If not, why not?

	Cheers, treval

	PS: (To NCR...)

	When will SysV.3.x be available for the Tower? 
	And why are the Tower machines so expensive? (All they are is an
	Atari ST in a big box after all ;->)
-- 
<==============================================================================>
 Trevor Luker		        		CCE Luxembourg
 treval@tauros.UUCP				Batiment Jean Monnet, C2/60
 ...!mcvax!prlb2!tauros!treval			L-2920 Luxembourg

jbayer@ispi.UUCP (id for use with uunet/usenet) (08/25/88)

In article <458@tauros.UUCP>, treval@tauros.UUCP (Trevor Luker) writes:
] g'day all,
] 
] 	try the following (On NCR Tower SysV.2):-
] 
] 	$ ps -ef | grep xxxxxx
] Gives->	treval  8911  8910  4 13:12:01 X00t  0:00 grep xxxxxx
] 	$ ps -ef | fgrep xxxxxx
] Gives->	$
] 
] 

I tried the same thing on Xenix 2.2.3.  It worked fine.

Jonathan Bayer			(I don't know, I only work here)

vause@cs-col.Columbia.NCR.COM (Sam Vause) (08/26/88)

In article <458@tauros.UUCP> treval@tauros.UUCP (Trevor Luker) writes:
>g'day all,
>
>	try the following (On NCR Tower SysV.2):-
>	$ ps -ef | grep xxxxxx
>Gives->	treval  8911  8910  4 13:12:01 X00t  0:00 grep xxxxxx
>	$ ps -ef | fgrep xxxxxx
>Gives->	$
>
>	When will SysV.3.x be available for the Tower? 
>	And why are the Tower machines so expensive? (All they are is an
>	Atari ST in a big box after all ;->)

The problem shown above has been corrected by the TOWER 32/4xx-6xx Operating
System Release 2.00.00 software.

UNIX(tm) V.3.x will be released on the TOWER 32/4xx-6xx platform with the
3.00.00 Operating System Release.  Consult with your local sales & support
office for definitive product content and release dates.

+------------------------------------------------------------------+
|Sam Vause, NCR Corporation, Customer Services - TOWER Support	   |
|3325 Platt Springs Road, West Columbia, SC 29169 (803) 791-6953   |
|                                vause@cs-col.Columbia.NCR.COM     |
|		...!ucbvax!sdcsvax!ncr-sd!ncrcae!cs-col!vause      |
+------------------------------------------------------------------+

rcodi@yabbie.rmit.oz (Ian Donaldson) (08/26/88)

From article <458@tauros.UUCP>, by treval@tauros.UUCP (Trevor Luker):
> 	try the following (On NCR Tower SysV.2):-
> 
> 	$ ps -ef | grep xxxxxx
> Gives->	treval  8911  8910  4 13:12:01 X00t  0:00 grep xxxxxx
> 	$ ps -ef | fgrep xxxxxx
> Gives->	$
> 
> 
> 	Is this a related problem to the grep(1) bug? 

No, no bugs here at all.

It's a race condition: fgrep doesn't get started by the shell until
ps does, so ps sometimes completes before fgrep is started.

If you repeat your second example a few times you will eventually see
some output.

Ian D

levy@ttrdc.UUCP (Daniel R. Levy) (08/27/88)

In article <163@ispi.UUCP>, jbayer@ispi.UUCP (id for use with uunet/usenet) writes:
> In article <458@tauros.UUCP>, treval@tauros.UUCP (Trevor Luker) writes:
> ] 	try the following (On NCR Tower SysV.2):-
> ] 	$ ps -ef | grep xxxxxx
> ] Gives->	treval  8911  8910  4 13:12:01 X00t  0:00 grep xxxxxx
> ] 	$ ps -ef | fgrep xxxxxx
> ] Gives->	$
> I tried the same thing on Xenix 2.2.3.  It worked fine.

Ditto on AT&T 3B20 running SVR2.

Are you sure it isn't fgrep munging its argument somehow, or the ps finishing
(with buffered output waiting to come through the pipe) before the fgrep has
been exec'd?  Try these:

$ ps -ef | grep ps
$ ps -ef | fgrep ps

$ sleep 30 xxxxxx &	#sleep ignores extra arguments
$ ps -ef | fgrep xxxxxx	#do you see at least the sleep process?
-- 
|------------Dan Levy------------|  THE OPINIONS EXPRESSED HEREIN ARE MINE ONLY
| Bell Labs Area 61 (R.I.P., TTY)|  AND ARE NOT TO BE IMPUTED TO AT&T.
|        Skokie, Illinois        | 
|-----Path:  att!ttbcad!levy-----|

rjd@occrsh.ATT.COM (Randy_Davis) (08/27/88)

In article <458@tauros.UUCP>, treval@tauros.UUCP (Trevor Luker) writes:
] g'day all,
] 
] 	try the following (On NCR Tower SysV.2):-
] 
] 	$ ps -ef | grep xxxxxx
] Gives->	treval  8911  8910  4 13:12:01 X00t  0:00 grep xxxxxx
] 	$ ps -ef | fgrep xxxxxx
] Gives->	$

Uh, really???  Ever thought of trying it with a different command than ps?  If
you recall, ps(1) only gives you a "snapshot" of what is going on.  I think it
is possible that the ps command is not showing grep because it was not
in the process table at that time.  Try "ps -ef | grep grep" a few times in a
row, and you will probably notice it not showing up every now and then,
depending on your processor.

Randy

rogerc@ncrcae.Columbia.NCR.COM (Roger Collins) (08/29/88)

In article <842@yabbie.rmit.oz> rcodi@yabbie.rmit.oz (Ian Donaldson) writes:
>
> Its a race condition: fgrep doesn't get started by the shell until
> ps does, so ps sometimes completes before fgrep is started.
> 
> If you repeat your second example a few times you will eventually see
> some output.

Not necessarily, Ian.  In all releases of the TOWER 32 4xx/6xx until the
current 2.00.00, the -f option of ps(1) does not print the arguments of 
a process if argc is invalid.  A lot of commands invalidate argc by
decrementing it during command-line parsing.  My guess is fgrep decrements
argc, grep doesn't.  That simple.

--
Roger Collins
NCR - E&M Columbia
TOWER Uniprocessing Systems
rogerc@ncrcae.Columbia.NCR.COM