udi@cs.arizona.edu (Udi Manber) (06/28/91)
We are proud to announce the release of version 1.0 of agrep - a new tool for text searching with errors. agrep is similar to egrep (or grep or fgrep), but it is much more general. It is also usually faster than egrep (on a SUN SparcStation II it is about twice as fast for typical queries even with errors). It is based on an entirely different algorithm.

The two most significant features of agrep that are not supported by the grep family are

1) the ability to search for approximate patterns; for example, "agrep -2 homogenos foo" will find homogeneous as well as any other word that can be obtained from homogenos with at most 2 substitutions, insertions, or deletions.

2) agrep is record oriented rather than just line oriented; a record is by default a line, but it can be user defined; for example, "agrep -d '^From ' 'breakdown; (inter|arpa|bit)net' mbox" outputs all mail messages that contain breakdown and one of either internet, arpanet, or bitnet. Another example: "agrep -d '$$' pattern foo" will output all paragraphs (separated by an empty line) that contain pattern.

Other features include searching for regular expressions (with or without errors), unlimited wild cards, AND and OR operations, limiting the errors to only insertions or only substitutions or any combination, allowing each deletion, for example, to be counted as, say, 2 substitutions or 3 insertions, restricting parts of the query to be exact and parts to be approximate, and many more.

agrep is available by anonymous ftp from cs.arizona.edu (IP 192.12.69.5) as pub/agrep/agrep.tar.Z (or in uncompressed form as pub/agrep/agrep.tar). The tar file contains the source code (in C), man pages (agrep.1), and a postscript file (agrep.ps) of a technical report (TR #91-11) describing the design and implementation of agrep.

This is the first version of agrep. There may be some bugs, especially with complicated patterns and a combination of options.
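To make the "at most 2 substitutions, insertions, or deletions" criterion from the approximate-pattern example concrete, here is a minimal Python sketch. It is not agrep's algorithm (the technical report in the tar file describes that); it uses the textbook dynamic-programming method for approximate substring matching instead, and the names min_errors and approx_grep are hypothetical, chosen only for this illustration.

```python
from typing import Iterable, List


def min_errors(pattern: str, text: str) -> int:
    """Fewest substitutions, insertions, or deletions needed to turn
    `pattern` into some substring of `text` (illustrative helper)."""
    m = len(pattern)
    # col[i] = fewest edits matching pattern[:i] against a substring of
    # text that ends at the current text position.
    col = list(range(m + 1))
    best = col[m]
    for ch in text:
        diag = col[0]   # previous column's value one row up
        col[0] = 0      # a match may start at any text position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new = min(diag + cost,      # match or substitution
                      col[i] + 1,       # extra character in the text
                      col[i - 1] + 1)   # pattern character missing from text
            diag = col[i]
            col[i] = new
        best = min(best, col[m])
    return best


def approx_grep(pattern: str, lines: Iterable[str], k: int) -> List[str]:
    """Return the lines containing `pattern` with at most k errors."""
    return [line for line in lines if min_errors(pattern, line) <= k]


if __name__ == "__main__":
    sample = ["a homogeneous mixture", "nothing relevant here"]
    for line in approx_grep("homogenos", sample, 2):
        print(line)
```

This sketch takes time proportional to the pattern length times the text length and treats every error as cost 1, so it does not model the per-operation costs or the speed described above.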
Please mail bug reports (or any other comments) to sw@cs.arizona.edu or to udi@cs.arizona.edu. We would appreciate it if users notified us (at the addresses above) of any extensions, improvements, or interesting uses of this software.

Prof. Udi Manber (udi@cs.arizona.edu)
Dept. of Computer Science
University of Arizona
Tucson, AZ 85721

June 10, 1991