[comp.unix.wizards] grep "sorted file" inappropriate

andrew@frip.gwd.tek.com (Andrew Klossner) (07/02/88)

[]

	"How about an option (only applicable when patterns start with
	"^") to tell grep that the file is lexically sorted (?what
	order?) so it can terminate if the input line is lexically
	greater than the pattern (s).  This is useful for searching
	through sorted lists (files, dictionaries, etc...)"

If your pattern is a string (no meta-characters), what you really want
is a binary search, as implemented by the Berkeley look(1) command.
This is a different sort of operation than grep's sequential search,
and offering it as a separate command makes sense.

If your pattern contains meta-characters, its position in a sorted list
isn't necessarily defined.  For example, when do you stop searching
when the pattern is "^.rwx"?  You could define rules that specify when
to stop searching for each of several different forms of pattern, based
on the existing pattern algebra, but this starts to be more complexity
than the problem merits.

  -=- Andrew Klossner   (decvax!tektronix!tekecs!andrew)       [UUCP]
                        (andrew%tekecs.tek.com@relay.cs.net)   [ARPA]

dik@cwi.nl (Dik T. Winter) (07/03/88)

In article <10132@tekecs.TEK.COM> andrew@frip.gwd.tek.com (Andrew Klossner) writes:
 > []
 > 
 > 	"How about an option (only applicable when patterns start with
 > 	"^") to tell grep that the file is lexically sorted (?what
 > 	order?) so it can terminate if the input line is lexically
 > 	greater than the pattern (s).  This is useful for searching
 > 	through sorted lists (files, dictionaries, etc...)"
 > 
 > If your pattern is a string (no meta-characters), what you really want
 > is a binary search, as implemented by the Berkeley look(1) command.
 > This is a different sort of operation than grep's sequential search,
 > and offering it as a separate command makes sense.
 > 
 > If your pattern contains meta-characters, its position in a sorted list
 > isn't necessarily defined.  For example, when do you stop searching
 > when the pattern is "^.rwx"?  You could define rules that specify when
 > to stop searching for each of several different forms of pattern, based
 > on the existing pattern algebra, but this starts to be more complexity
 > than the problem merits.
 > 
And also, ^ at the start of a pattern has already a meaning.
-- 
dik t. winter, cwi, amsterdam, nederland
INTERNET   : dik@cwi.nl
BITNET/EARN: dik@mcvax