edhall%rand-unix@sri-unix.UUCP (11/11/83)
There are definitely still enough differences to require 3 greps. For example, try comparing:

    egrep '....................................................................'

with:

    grep '....................................................................'

The grep will be much faster than the egrep. This seems to be true for any regular expression with a lot of wild card characters (i.e. `.''s).

Also, when you are looking for matches with a long list of words, you'll find:

    fgrep -f list

much faster than:

    egrep -f list

(Alas, in both cases the maximum number of strings in `list' has a fairly small limit--about 400 for fgrep, and substantially less for egrep.)

So, though I use egrep in about 85% of cases, I still find some use for fgrep and plain ol' grep.

-Ed Hall
edhall@rand-unix (ARPA)
decvax!randvax!edhall (UUCP)
thomas@utah-gr.UUCP (Spencer W. Thomas) (11/15/83)
The other day, I tried

    egrep '^.......?.?.?.?.?$'

after a couple of minutes (!) it told me "regular expression too big"!?!? Anybody know why this is? I finally did

    egrep '......' | egrep -v '............'

=Spencer
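[Editor's sketch] Read literally, the pattern in the post above asks for lines of 6 to 11 characters, and the two-stage pipeline expresses the same length window: keep lines with at least 6 characters, then drop those with at least 12. A minimal sketch of that equivalence, assuming a current POSIX grep and awk. The interval form {6,11} is a present-day POSIX feature and is not claimed to exist in the egrep discussed in this thread, and the function names are made up for illustration:

```sh
# Three filters that should each pass only lines of 6..11 characters.
# All read standard input; the names are hypothetical.

# The workaround from the post: at least 6 chars, but not 12 or more.
len_pipeline() {
    grep -E '......' | grep -E -v '............'
}

# Same window written as one anchored pattern with an interval.
len_interval() {
    grep -E '^.{6,11}$'
}

# Same window without regular expressions at all.
len_awk() {
    awk 'length($0) >= 6 && length($0) <= 11'
}
```

For example, `len_pipeline < /usr/dict/words` (the word-list path is an assumption) would list the words in that length range.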
dan%bbncd@sri-unix.UUCP (11/16/83)
From: Dan Franklin <dan@bbncd>

Each time the 3 greps are discussed, and people point out that they use different algorithms, each best for different kinds of regular expressions, I am puzzled by the leap to the conclusion that they must therefore be different programs.

Some UNIX C compilers have several different algorithms for the 'switch' statement, choosing either an indexed table, a hashed table with linear rehash, or the obvious if/then/else structure for the output, depending on the properties of the input. These compilers do not provide 'switch1', 'switch2', and 'switch3' statements; the compiler examines the properties of the case list and chooses the best representation.

If the only difference between the three greps were the space-time performance of each algorithm, the sensible thing to do would be to have one 'grep' which chose the most efficient algorithm for the regular expression--with, perhaps, a switch so the user could override grep's choice on special occasions (no heuristic can be perfect).

So why doesn't somebody do just that? Consider how much new-user puzzlement (and excess unix-wizards mail) would be eliminated!

There is a reason: the three greps interpret three different forms of regular expression. You can't take an arbitrary shell script which uses, say, 'grep' and substitute 'egrep' everywhere without first scrutinizing each regular expression to make sure it doesn't have parentheses, vertical bars, etc. So even if 'egrep' could use a variant of the 'grep' algorithm in the right circumstances, you couldn't throw away 'grep'. (Each command also accepts a different subset of options, but that problem could be solved.)

Too bad.

Dan Franklin
JTW@MIT-XX.ARPA (11/22/83)
From: John T. Wroclawski <JTW@MIT-XX.ARPA>

    So why doesn't somebody do just that?

    There is a reason: the three greps interpret three different forms of regular expression. You can't take an arbitrary shell script which uses, say, 'grep' and substitute 'egrep' everywhere without first scrutinizing each regular expression to make sure it doesn't have parentheses, vertical bars, etc.

Now wait a minute. What's to prevent the (one) unified grep from basing its choice of algorithm partly on whether that algorithm can handle the particular regular expr grep was given as an argument?
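[Editor's sketch] The suggestion above, choosing the algorithm from what the pattern actually contains, can be sketched as a small dispatcher. This is a rough heuristic and not how any existing grep decides. It assumes the unified tool has already settled on a single pattern syntax (the egrep one here), which is exactly the compatibility question raised earlier in the thread. The function name pick_grep is hypothetical, and the three names it prints stand for the three matching strategies:

```sh
# pick_grep PATTERN
# Print the name of the simplest classic matcher that can handle PATTERN.
# Heuristic only: it looks for metacharacters and does not parse the
# pattern, so escapes and bracket contents are not examined.
pick_grep() {
    case $1 in
        *'('* | *')'* | *'|'* | *'+'* | *'?'*)
            echo egrep ;;   # extended operators present
        *'.'* | *'*'* | *'['* | *'^'* | *'$'* | *'\'*)
            echo grep ;;    # only basic metacharacters present
        *)
            echo fgrep ;;   # plain string, no metacharacters
    esac
}
```

For example, `pick_grep 'edhall'` selects the fixed-string matcher, `pick_grep '^From:.*'` the basic one, and `pick_grep 'grep|egrep'` the extended one. A user-visible override switch, as proposed earlier in the thread, would sit on top of this choice.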