[net.news.b] grepping for "Subject" in /usr/spool/news

henry@utzoo.UUCP (Henry Spencer) (10/04/84)

> ....  Folks may recall articles
> that went around a while back asking why "grep" wasn't smart enough to use
> the "fast fgrep" algorithm when appropriate.  The above results show that
> grep may be smarter than was thought.

It's been well-known for quite a while that fgrep is the slowest of the
three greps for simple cases.  It really shines only when the list of
words it is looking for is substantial, in which case it is much faster
than the others (assuming you can even convince the others to work under
those conditions).
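
Why a long word list favors fgrep can be sketched with the multi-pattern
automaton of Aho and Corasick, which is the technique fgrep is commonly
described as using. This is a minimal illustration in Python, not the
actual fgrep source; the names build_automaton and find_words are
hypothetical. The text is scanned once, however many words are in the list.

```python
from collections import deque

def build_automaton(words):
    """Build an Aho-Corasick automaton for a list of fixed strings."""
    # goto[s]: char -> next state; fail[s]: fallback state on mismatch;
    # out[s]: words that end when state s is reached.
    goto, fail, out = [{}], [0], [[]]
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first pass fills in failure links, shallowest states first.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit words that end at the fallback state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def find_words(text, automaton):
    """Return (start_index, word) for every occurrence, in one pass over text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits
```

Building the automaton costs time proportional to the total length of the
words, so it only pays off when the list is long; for a single simple
pattern that setup is pure overhead.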

The Bell folks are rumored to have developed/discovered a new matching
algorithm that invalidates the "we don't know a single algorithm that
spans a wide enough range of space-time tradeoffs" comment in the grep
manual page; anybody know anything about this?  References?
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

mark@cbosgd.UUCP (Mark Horton) (10/05/84)

In article <elsie.4003> ado@elsie.UUCP writes:
>The second point is what prompts this posting.  Folks may recall articles
>that went around a while back asking why "grep" wasn't smart enough to use
>the "fast fgrep" algorithm when appropriate.  The above results show that
>grep may be smarter than was thought.

It's well known in UNIX circles that egrep is the fastest of the greps,
followed by grep, then fgrep.  This is strange but true.

Which reminds me of a story, which I believe I first heard from Mike Lesk.

It seems a woman ran into a meeting at Bell Labs, late, all out of
breath.  She said "I'm sorry I'm late, I was grepping my apartment
for my keys."

To which the reply came: "You should have used egrep, it's faster."

tim@cithep.UucP (Tim Smith) (10/07/84)

I solved this problem by hacking together a version of grep with
a new flag, the -o flag, which means only match one line per file.
It's a simple change to grep, and it doesn't require teaching grep
about usenet news headers, so it may be used for other purposes.
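
The behavior described, stopping at the first matching line in each file,
can be sketched as follows. This is an illustration in Python under stated
assumptions, not the actual patch to grep; the function names first_match
and grep_first are hypothetical.

```python
import re

def first_match(pattern, lines):
    """Return the first line matching pattern (newline stripped), or None.

    Reading stops at the first hit, so the rest of the input is never scanned.
    """
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            return line.rstrip("\n")
    return None

def grep_first(pattern, paths):
    """Yield (path, line) for the first matching line of each file."""
    for path in paths:
        with open(path, errors="replace") as f:
            hit = first_match(pattern, f)
        if hit is not None:
            yield path, hit
```

For news articles, a call such as grep_first("^Subject", paths) would read
only as far as the header line in each article instead of the whole body.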
-- 
Tim Smith		ihnp4!cithep!tim  or  ihnp4!wlbr!callan!tim