[net.unix-wizards] *grep

ggs (03/16/83)

With reference to the comments on 'grep' and friends by Andy Tannenbaum:

The figures you report are in the same direction as I have seen, but
I get a much more dramatic improvement using 'egrep'.  I prepared
a 1,000,000-byte file containing 65-byte records (we do a lot of data
analysis around here).  My times were the following:

grep 12345 onemeg		28.2u 4.7s 1:03 elapsed (machine busy)

egrep 12345 onemeg		12.6u 5.7s 1:02 elapsed (machine busy)

fgrep 12345 onemeg		20.9u 4.7s 1:20 elapsed (machine busy)

When using a regular expression, the difference is even more dramatic:

grep [A-Z]$ onemeg		89.9u 5.8s 1:56 elapsed (almost idle)

egrep [A-Z]$ onemeg		12.1u 5.0s 0:19 elapsed (almost idle)

Note the dramatic jump in user time when 'grep' is given a pattern containing
metacharacters.  In all cases, the 5 seconds of system time are the normal cost of disk
I/O.  The proposed BSD 4.2 disk organization should reduce this to 1 or 2
seconds.
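
For anyone who wants to try this on their own machine, something along the
following lines should do.  The file name 'onemeg', the awk record generator,
the redirection to /dev/null, and the use of the shell's 'time' are my own
guesses at a setup, not the one actually measured above:

	# build roughly 1,000,000 bytes of 65-byte records
	# (15385 records of 64 characters plus a newline)
	awk 'BEGIN { for (i = 0; i < 15385; i++) printf "%-64d\n", i }' > onemeg

	# fixed-string search, all three programs
	time grep  12345 onemeg > /dev/null
	time egrep 12345 onemeg > /dev/null
	time fgrep 12345 onemeg > /dev/null

	# regular-expression search; quote the pattern so the shell
	# does not try to expand the brackets itself
	time grep  '[A-Z]$' onemeg > /dev/null
	time egrep '[A-Z]$' onemeg > /dev/null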

		Griffith G. Smith, Jr.
Phone:		(201) 582-7736
Internet:	ggs@mhb5b.uucp
UUCP:		mhb5b!ggs
USPS:		600 Mountain Avenue Room 5F-119
		Murray Hill, NJ 07974

G:tut (03/21/83)

Let's get rid of fgrep.  If you want to look for multiple
patterns, you can always do

	egrep "word1|word2|word3|word4" 

rather than having a pattern file.  There are too many
programmers who actually believe the manual page and use
fgrep, erroneously thinking it's faster.
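
For reference, the pattern-file form being argued against looks roughly like
this; the names 'words' and 'file' are only illustrative:

	# put the fixed strings in a file, one per line
	echo word1  > words
	echo word2 >> words
	echo word3 >> words
	echo word4 >> words
	fgrep -f words file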

I'm counting on the USG to delete fgrep before System 6
comes out!

Bill Tuthill
ucbvax!g:tut

thomas (03/23/83)

Ah, but fgrep will handle MUCH larger patterns than egrep will.  I've
had egrep say 'pattern too large' (or whatever the message is) too many
times to advocate getting rid of fgrep.  What needs to change is the
documentation!
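
A rough illustration of the failure mode; the word list, the 500-line cutoff,
and the names 'biglist' and 'somefile' are only assumptions for the sketch:

	# a few hundred fixed strings, one per line
	sed 500q /usr/dict/words > biglist

	# fgrep takes the pattern file in stride
	fgrep -f biglist somefile

	# stringing the same 500 words into one "w1|w2|...|w500"
	# alternation is what tends to draw egrep's size complaint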

=Spencer