[net.wanted] Need unix command file HELP!

sutton@aero.ARPA (Stew Sutton) (01/30/86)

          


We are looking for a utility that, when given an arbitrary string,
can locate all occurrences of that string anywhere on the system. Our
local Un*x gurus can't figure this out, so we are appealing to those out
in Netland to help us out.

We are looking for the command to work like this:

findstring this-is-the-string

The utility would return all the files (and their pathnames from the
root) to the screen. Of course if the protections on the file indicate
that the file cannot be read, the program should ignore that file and keep
on going. We think it can be done with a command file using the 'ls'
and 'awk' commands, but we just can't get it right.

Please send source code (or ideas on writing this code) to us and we
will post a summary of working code to the net.

Thanks in advance.

sutton@aerospace.ARPA
{ihnp4!sdcrdcf,randvax,trwrb} ! aero ! sutton
sutton%aerospace.ARPA@WISCVM.BITNET

earle@smeagol.UUCP (Greg Earle) (02/01/86)

> We are looking for a utility that, when given an arbitrary string,
> can locate all occurrences of that string anywhere on the system. Our
> local Un*x gurus can't figure this out, so we are appealing to those out
> in Netland to help us out.

Some Gurus you got there ...

> We are looking for the command to work like this:
> 
> findstring this-is-the-string

find / -exec fgrep this-is-the-string '{}' \; (UGGGHHH!)

Warning! Only execute during hours when no one else is in building!!
Guaranteed to tie up CPU for indefinite periods! :@)

If you only want the file names, this *might* work, I'm not sure ...

find / -exec fgrep this-is-the-string '{}' /dev/null \; | awk -F: '{print $1}'
(DOUBLE UGGGHHH)
-- 

	Greg Earle
	JPL Spacecraft Data Systems group
	sdcrdcf!smeagol!earle (UUCP)
	ia-sun2!smeagol!earle@csvax.caltech.edu (ARPA)

hfavr@mtuxo.UUCP (a.reed) (02/04/86)

> We are looking for a utility that, when given an arbitrary string,
> can locate all occurrences of that string anywhere on the system. Our
> local Un*x gurus can't figure this out, so we are appealing to those out
> in Netland to help us out.
> We are looking for the command to work like this:
> 
> findstring this-is-the-string
> 
> The utility would return all the files (and their pathnames from the
> root) to the screen. Of course if the protections on the file indicate
> that the file cannot be read, the program should ignore that file and keep
> on going. We think it can be done with a command file using the 'ls'
> and 'awk' commands, but we just can't get it right.
> Please send source code (or ideas on writing this code) to us and we
> will post to net a summary of working code.
> Thanks in advance.
> 
> sutton@aerospace.ARPA
> {ihnp4!sdcrdcf,randvax,trwrb} ! aero ! sutton
> sutton%aerospace.ARPA@WISCVM.BITNET

# In ksh or sh this is a one-liner:
2>/dev/null find / -exec fgrep -l "$1" {} \;
# Please do not post ELEMENTARY shell questions to net.unix-wizards!
# 			Adam Reed (ihnp4!npois!adam)

wescott@sauron.UUCP (Michael Wescott) (02/04/86)

In article <587@smeagol.UUCP> earle@smeagol.UUCP (Greg Earle) writes:
>> We are looking for a utility that, when given an arbitrary string,
>> can locate all occurrences of that string anywhere on the system.
>
>find / -exec fgrep this-is-the-string '{}' \; (UGGGHHH!)
>
>Warning! Only execute during hours when no one else is in building!!
>Guaranteed to tie up CPU for indefinite periods! :@)
>
>If you only want the file names, this *might* work, I'm not sure ...
>
>find / -exec fgrep this-is-the-string '{}' /dev/null \; | awk -F: '{print $1}'
>(DOUBLE UGGGHHH)

I agree, UGGHHH.  I know it's not on most BSD systems, but it has its
uses.  I'm talking about `xargs'.  For much less cpu usage try:

find / -type f -print | grep -v outfile | xargs grep 'pattern'  > outfile

or some reasonable variation with your favorite grep (or bm).
Xargs accumulates arguments from stdin and execs the command and args
given for a reasonable number of collected arguments.  Hence grep
gets executed once per ten or twenty files rather than once per file.

I think it's reasonable to expect to search only regular files, hence
"-type f", and you had better exclude "outfile" unless you use "grep -l".
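A minimal sketch of the batching idea, assuming a POSIX find, xargs, and
grep (grep -F stands in for fgrep here). The function name findstring_xargs
and the optional root argument are hypothetical, added only for
illustration. File names containing blanks or newlines will confuse plain
xargs, so treat this as a sketch and not a finished tool.

```shell
# findstring_xargs STRING [ROOT]   (hypothetical helper; ROOT defaults to /)
# find lists regular files, xargs hands them to grep in batches, and
# grep -l prints only the names of files that contain STRING.
findstring_xargs() {
    find "${2:-/}" -type f -print 2>/dev/null |
        xargs grep -F -l -- "$1" 2>/dev/null
}
```

Because "grep -l" prints only file names, the output can be redirected to a
file without the output file matching itself, unless its own path happens
to contain the string.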

	-Mike Wescott
	ncrcae!wescott

jso@edison.UUCP (John Owens) (02/04/86)

> We are looking for the command to work like this:
> 
> findstring this-is-the-string
> 
> The utility would return all the files (and their pathnames from the
> root) to the screen. Of course if the protections on the file indicate
> that the file cannot be read, the program should ignore that file and keep
> on going. We think it can be done with a command file using the 'ls'
> and 'awk' commands, but we just can't get it right.
> 
> sutton@aerospace.ARPA
> {ihnp4!sdcrdcf,randvax,trwrb} ! aero ! sutton

find / -type f -exec fgrep -l "$1" {} \; 2>/dev/null
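
Wrapped up as the "findstring" command the original poster asked for, this
might look like the sketch below. It assumes a POSIX shell and grep (grep -F
in place of fgrep); the usage check and the optional second argument naming
a starting directory are additions for illustration, not part of the
one-liner.

```shell
# findstring STRING [ROOT]: print the names of regular files containing STRING.
# ROOT is a hypothetical extra argument; it defaults to / as in the one-liner.
findstring() {
    if [ $# -lt 1 ]; then
        echo "usage: findstring string [root]" >&2
        return 2
    fi
    # Errors from unreadable files and directories go to /dev/null,
    # so the search ignores them and keeps going.
    find "${2:-/}" -type f -exec grep -F -l -- "$1" {} \; 2>/dev/null
}
```

Calling it as "findstring this-is-the-string" searches from the root;
"findstring this-is-the-string /usr/src" limits the search to one tree.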

-- 

			   John Owens
General Electric Company		Phone:	(804) 978-5726
Factory Automated Products Division	Compuserve: 76317,2354
	       houxm!burl!icase!uvacs
...!{	       decvax!mcnc!ncsu!uvacs	}!edison!jso
		 gatech!allegra!uvacs

smithson@calma.UUCP (Brian Smithson) (02/04/86)

In article <587@smeagol.UUCP> earle@smeagol.UUCP (Greg Earle) writes:
>> We are looking for a utility that, when given an arbitrary string,
>> can locate all occurrences of that string anywhere on the system. Our
>> local Un*x gurus can't figure this out, so we are appealing to those out
>> in Netland to help us out.
>
>Some Gurus you got there ...
>
>> We are looking for the command to work like this:
>> 
>> findstring this-is-the-string
>
>find / -exec fgrep this-is-the-string '{}' \; (UGGGHHH!)
>
>Warning! Only execute during hours when no one else is in building!!
>Guaranteed to tie up CPU for indefinite periods! :@)
>[...]
> 
How about:  nice -20 find / -exec fgrep this-is-the-string {} \;  ?
Better pack a lunch, though... :-)

jsdy@hadron.UUCP (Joseph S. D. Yao) (02/11/86)

All the folk who are responding with ways to get the names of
files containing a particular string are kind of forgetting that
the grep family does  n o t  automatically print out file names.
This:

>find / -exec fgrep this-is-the-string '{}' \;

will give a file full of lines containing this-is-the-string.  Try:

find / -exec grep this-is-the-string '{}' /dev/null \;

**OR** (quicker) :

find / -type d -a -exec ksh findstr "this-is-the-string" {} \;

findstr:
#!/bin/ksh
# or /bin/sh

str="$1"
dir="$2"
file=""
text=""

if [ ! -d "$dir" ]; then exit 1; fi

cd "$dir"

for file in *; do
	if [ ! -f "$file" ]; then continue; fi
	text=`file "$file" | grep text`
	if [ "" = "$text" ]; then continue; fi
	# if you want the complete text:
	# grep "$str" "$dir/$file" /dev/null
	# otherwise
	text=`grep "$str" "$file" | line`
	if [ "" != "$text" ]; then
		echo "$dir/$file"
	fi
done
exit 0
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}

mo@wgivax.UUCP (02/15/86)

>From jsdy@hadron.UUCP (Joseph S. D. Yao) Sun Feb  6 01:28:16 206
>Summary: grep won't always print file name!
>
>All the folk who are responding with ways to get the names of
>files containing a particular string are kind of forgetting that
>the grep family does  n o t  automatically print out file names.
>This:
>
>>find / -exec fgrep this-is-the-string '{}' \;
>
>will give a file full of lines containing this-is-the-string.  Try:
>
>find / -exec grep this-is-the-string '{}' /dev/null \;

WRONG!  fgrep -l WILL print the file name, and WILL NOT print the string;
        it will look for ONLY the first occurrence in a file, speeding
        things up, AND fgrep is faster than grep.

>**OR** (quicker) :
>
>find / -type d -a -exec ksh findstr "this-is-the-string" {} \;

(-: GREAT,  NOW HOW DO I FIND THE KORN SHELL ? :-)

>findstr:
>#!/bin/ksh
># or /bin/sh
>
>str="$1"
>dir="$2"
>file=""
>text=""
>
>if [ ! -d "$dir" ]; then exit 1; fi
>
>cd "$dir"
>
>for file in *; do
>	if [ ! -f "$file" ]; then continue; fi
>	text=`file "$file" | grep text`
>	if [ "" = "$text" ]; then continue; fi
>	# if you want the complete text:
>	# grep "$str" "$dir/$file" /dev/null
>	# otherwise
>	text=`grep "$str" "$file" | line`
>	if [ "" != "$text" ]; then
>		echo "$dir/$file"
>	fi
>done
>exit 0

this is admittedly "safer", since it skips non-text files, but look at all
those sub-processes you're starting up for every used inode on the system!

haven't we heard enough about this, YET?

gkloker@utai.UUCP (Geoff Loker) (02/16/86)

In article <259@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>All the folk who are responding with ways to get the names of
>files containing a particular string are kind of forgetting that
>the grep family does  n o t  automatically print out file names.
>This:
>
>>find / -exec fgrep this-is-the-string '{}' \;
>
>will give a file full of lines containing this-is-the-string.  Try:
>

I don't know if this is any quicker than your script file suggestion
for finding file names with the string (we don't have ksh), but the
grep family does have an option to print out only the name(s) of
file(s) that contain the match-string.  Try:

find / -exec fgrep -l this-is-the-string '{}' \;
-- 
Geoff Loker
Department of Computer Science
University of Toronto
Toronto, ON
M5S 1A4

USENET:	{ihnp4 decwrl utzoo uw-beaver}!utcsri!utai!gkloker
CSNET:		gkloker@toronto
ARPANET:	gkloker.toronto@csnet-relay

lew@gsg.UUCP (Paul Lew) (02/17/86)

>All the folk who are responding with ways to get the names of
>files containing a particular string are kind of forgetting that
>the grep family does  n o t  automatically print out file names.
>This:
>
>find / -exec fgrep this-is-the-string '{}' \;
>
>will give a file full of lines containing this-is-the-string.  Try:
>

	Notice that if you run grep on more than one file, file
	names will be displayed.  A simple solution to the problem
	is to use:

	 find / -exec fgrep this-is-the-string /dev/null '{}' \;

	and YOU DO NOT HAVE TO WRITE ANY SCRIPT to do so.
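
The effect of the extra /dev/null argument can be seen on a single file.
The sketch below assumes a POSIX grep (grep -F in place of fgrep) and
mktemp; the scratch directory and the file "notes" are made up for the
demonstration.

```shell
# Hypothetical scratch file containing the string we search for.
demo=$(mktemp -d)
printf 'this-is-the-string appears here\n' > "$demo/notes"

# One file argument: only the matching line is printed.
one_arg=$(grep -F this-is-the-string "$demo/notes")

# Two file arguments: each match is prefixed with "filename:".
two_args=$(grep -F this-is-the-string /dev/null "$demo/notes")

echo "$one_arg"
echo "$two_args"
rm -rf "$demo"
```

The first echo shows the bare line; the second shows the same line with the
path of "notes" and a colon in front of it.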
-- 
----------------------------------------------------------------------
Paul S. Lew				decvax!gsg!lew		(UUCP)

General Systems Group
51 Main Street, Salem, NH  03079	(603) 893-1000
----------------------------------------------------------------------

mjs@sfsup.UUCP (M.J.Shannon) (02/18/86)

> >This:
> >
> >>find / -exec fgrep this-is-the-string '{}' \;
> >
> >will give a file full of lines containing this-is-the-string.  Try:
> >
> 
> find / -exec fgrep -l this-is-the-string '{}' \;
> -- 
> Geoff Loker

If your grep/egrep/fgrep doesn't support -l, then try the following:

	find / -exec fgrep string '{}' /dev/null ';' |
		sed -e 's/:.*//' |
		sort -t/ -u

Note that this fails miserably if you have files whose names include a ':'.
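
A small illustration of both the pipeline and the ':' caveat, run on
made-up grep output instead of a real search. The sample lines and paths
are hypothetical; only sed and sort are assumed.

```shell
# Hypothetical "file:line" output, as grep prints it when given
# two or more file arguments.  The last file is really "/tmp/odd:name".
sample='/etc/motd:welcome string
/etc/motd:another string
/tmp/odd:name:a string here'

# Keep everything before the first colon, then drop duplicate names.
names=$(printf '%s\n' "$sample" | sed -e 's/:.*//' | sort -u)
echo "$names"
```

The two /etc/motd lines collapse to one name, but the file whose name
contains a colon is reported as /tmp/odd, which is the failure described
above.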
-- 
	Marty Shannon
UUCP:	ihnp4!attunix!mjs
Phone:	+1 (201) 522 6063

Disclaimer: I speak for no one.

"If I never loved, I never would have cried." -- Simon & Garfunkel

stevesu@copper.UUCP (Steve Summit) (02/18/86)

In article <245@aero.ARPA>, sutton@aero.ARPA (Stew Sutton) writes:
> We are looking for a utility that, when given an arbitrary string,
> can locate all occurrences of that string anywhere on the system. Our
> local Un*x gurus can't figure this out, so we are appealing to those out
> in Netland to help us out.

Stew's question has basically been answered, but I've got two
cents to add:

	1. Since such a command is probably going to generate
	   voluminous output, it is tempting to redirect it to a
	   file for later perusal.  If you do so, be extremely
	   careful: if your program is searching the entire
	   filesystem, it is likely to find your output file,
	   each of whose lines contains the string you're looking
	   for, and which will therefore get re-appended to the
	   file, ad infinitum...

	   I make this mistake every few years, filling up a disk
	   every time.  If you don't need to search the entire
	   filesystem, just make sure you put the output file
	   somewhere where it won't get found, like /tmp.  The
	   general solution would be an exclusion option on the
	   find command, which would be generally useful.
	   (Another trick would be to make the output file
	   unreadable.)

	2. Joe Yao pointed out the problem of the grep family not
	   printing the filename if given a single argument.  My
	   solution, which is a bit wasteful, but probably more
	   efficient than Joe's shell script, goes like this:

		find / -exec grep 'little dog' {} /dev/null \;

	   grep notices two arguments, so cheerfully prints the
	   filename if it finds the string, although it's
	   virtually guaranteed never to occur in the second one
	   (unless /dev/null accidentally got replaced with a
	   real file, but that's another story).
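
	A rough sketch of the exclusion idea from point 1, assuming a
	find that supports the POSIX "-path" primary (older versions
	may not have it) and a POSIX grep. The function name
	findstring_excluding is hypothetical. OUTFILE has to be spelled
	exactly the way find will print it (for example, an absolute
	path when ROOT is absolute), and "-path" treats it as a
	pattern, so names with wildcard characters need care.

```shell
# findstring_excluding STRING ROOT OUTFILE   (hypothetical helper)
# Writes "file:line" matches to OUTFILE while telling find to skip
# OUTFILE itself, so the output never feeds back into the search.
findstring_excluding() {
    pattern=$1 root=$2 outfile=$3
    find "$root" -type f ! -path "$outfile" \
        -exec grep -F -- "$pattern" {} /dev/null \; > "$outfile" 2>/dev/null
}
```

	Putting the output file outside the searched tree, as suggested
	above, remains the simpler choice whenever that is possible.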

                                         Steve Summit
                                         tektronix!copper!stevesu

stevesu@copper.UUCP (Steve Summit) (02/18/86)

A thousand apologies.  Joe Yao suggested the exact same /dev/null
trick I did; I missed it and got distracted by the complicated-
looking Korn shell script at the bottom of his article.

                                         Steve Summit
                                         tektronix!copper!stevesu

dv@well.UUCP (David W. Vezie) (02/24/86)

In article <144@wgivax.UUCP> mo@wgivax.UUCP writes:
>WRONG!  fgrep -l WILL print the file name, and WILL NOT print the string;
>        it will look for ONLY the first occurrence in a file, speeding
>        things up, AND fgrep is faster than grep.
>

Ummm...  I don't know about your machine, but I just did an informal
benchmark comparing {e,,f}grep for speed, and found out that of the
three, egrep is fastest, followed by grep, and slowest was fgrep.
(this is on 4.2BSD)
--- 
David W. Vezie
	    {dual|hplabs}!well!dv - Whole Earth 'Lectronics Link, Sausalito, CA

mo@wgivax.UUCP (02/27/86)

>Reply-To: dv@well.UUCP (David W. Vezie)

>In article <144@wgivax.UUCP> mo@wgivax.UUCP writes:
>>WRONG!  fgrep -l WILL print the file name, and WILL NOT print the string;
>>        it will look for ONLY the first occurrence in a file, speeding
>>        things up, AND fgrep is faster than grep.
>>

>Ummm...  I don't know about your machine, but I just did an informal
>benchmark comparing {e,,f}grep for speed, and found out that of the
>three, egrep is fastest, followed by grep, and slowest was fgrep.
>(this is on 4.2BSD)

I haven't used egrep very much, but having worked with UNIX for 5 years
on Vax 11/780's, Sun's, Masscomp's, and various other 68k machines (mostly
4.[12]), I have always observed (-: yes, this is a subjective observation :-)
that fgrep runs faster than grep when searching for a specific string.

Anyway, the point is that the grep family of commands has an option which
will print out the file name, and not the lines in which the pattern occurs.

Let's avoid a holy war.  The point in this entire back and forth discussion
is RTFM!!!!  There have been many mistakes in the postings responding to
the original article.  It's great to want to help, but be sure that you have
all the facts before giving advice.