sutton@aero.ARPA (Stew Sutton) (01/30/86)
We are looking for a utility that, when given an arbitrary string, can
locate all occurrences of that string anywhere on the system. Our local
Un*x gurus can't figure this out, so we are appealing to those out in
Netland to help us out.

We are looking for the command to work like this:

    findstring this-is-the-string

The utility would return all the files (and their pathnames from the
root) to the screen. Of course if the protections on the file indicate
that the file cannot be read, the program should ignore that file and keep
on going. We think it can be done using a command file using the 'ls'
and 'awk' commands but we just can't get it right.

Please send source code (or ideas on writing this code) to us and we
will post to net a summary of working code.

Thanks in advance.

    sutton@aerospace.ARPA
    {ihnp4!sdcrdcf,randvax,trwrb} ! aero ! sutton
    sutton%aerospace.ARPA@WISCVM.BITNET
earle@smeagol.UUCP (Greg Earle) (02/01/86)
> We are looking for a utility that can, when given a arbitrary string,
> can locate all occurences of that string anywhere on the system. Our
> local Un*x gurus can't figure this out, so we are appealing to those out
> in Netland to help us out.

Some Gurus you got there ...

> We are looking for the command to work like this:
>
> findstring this-is-the-string

    find / -exec fgrep this-is-the-string '{}' \;        (UGGGHHH!)

Warning! Only execute during hours when no one else is in building!!
Guaranteed to tie up CPU for indefinite periods! :@)

If you only want the file names, this *might* work, I'm not sure ...

    find / -exec "fgrep this-is-the-string '{}' | awk -F: '{print $1}'" \;
    (DOUBLE UGGGHHH)

--
Greg Earle
JPL Spacecraft Data Systems group
sdcrdcf!smeagol!earle (UUCP)
ia-sun2!smeagol!earle@csvax.caltech.edu (ARPA)
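A note on the second command above: find's -exec runs the named program directly, with no shell in between, so a quoted pipeline is looked up as one (nonexistent) program name. A minimal sketch of one way around that, assuming a POSIX sh with mktemp available; the scratch directory and file names are made up for illustration, and grep -F is used as the current spelling of fgrep:

```shell
#!/bin/sh
# Scratch tree so the example does not have to walk the real filesystem.
demo=$(mktemp -d)
printf 'this-is-the-string\n' > "$demo/hit.txt"
printf 'nothing here\n' > "$demo/miss.txt"

# -exec does not interpret '|', so hand the pipeline to sh -c explicitly.
# The file name reaches the inner shell as $1 (the word "sh" fills $0).
# /dev/null is a second operand, so grep prefixes each match with the name.
found=$(find "$demo" -type f -exec sh -c \
    'grep -F this-is-the-string "$1" /dev/null | awk -F: "{print \$1}"' sh {} \;)

echo "$found"
rm -rf "$demo"
```

This still starts a shell, a grep, and an awk for every file, so it is no cheaper than the original.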
hfavr@mtuxo.UUCP (a.reed) (02/04/86)
> We are looking for a utility that can, when given a arbitrary string,
> can locate all occurences of that string anywhere on the system. Our
> local Un*x gurus can't figure this out, so we are appealing to those out
> in Netland to help us out.
> We are looking for the command to work like this:
>
> findstring this-is-the-string
>
> The utility would return all the files (and their pathnames from the
> root) to the screen. Of course if the protections on the file indicate
> that the file cannot be read, the program should ignore that file and keep
> on going. We think it can be done using a command file using the 'ls'
> and 'awk' commands but we just can get it right.
> Please send source code (or ideas on writing this code) to us and we
> will post to net a summary of working code.
> Thanks in advance.
>
> sutton@aerospace.ARPA
> {ihnp4!sdcrdcf,randvax,trwrb} ! aero ! sutton
> sutton%aerospace.ARPA@WISCVM.BITNET

# In ksh or sh this is a one-liner:
2>/dev/null find / -exec fgrep -l $1 {} \;
# Please do not post ELEMENTARY shell questions to net.unix-wizards!
# Adam Reed (ihnp4!npois!adam)
wescott@sauron.UUCP (Michael Wescott) (02/04/86)
In article <587@smeagol.UUCP> earle@smeagol.UUCP (Greg Earle) writes:
>> We are looking for a utility that can, when given a arbitrary string,
>> can locate all occurences of that string anywhere on the system.
>
>find / -exec fgrep this-is-the-string '{}' \; (UGGGHHH!)
>
>Warning! Only execute during hours when no one else is in building!!
>Guaranteed to tie up CPU for indefinite periods! :@)
>
>If you only want the file names, this *might* work, I'm not sure ...
>
>find / -exec "fgrep this-is-the-string '{}' | awk -F: '{print $1}'" \;
>(DOUBLE UGGGHHH)

I agree, UGGHHH.

I know it's not on most BSD systems, but it has its uses. I'm talking about
`xargs'. For much less cpu usage try:

    find / -type f -print | grep -v outfile | xargs grep 'pattern' > outfile

or some reasonable variation with your favorite grep (or bm).

Xargs accumulates arguments from stdin and execs the command and args given
for a reasonable number of collected arguments. Hence grep gets executed
once per ten or twenty files rather than once per file.

I think it's reasonable to expect to search only regular files, hence
"-type f", and you'd better exclude "outfile" unless you use "grep -l".

    -Mike Wescott
    ncrcae!wescott
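The batching described above can be tried safely on a scratch directory. This is only a sketch, assuming a POSIX sh with mktemp and xargs; the directory and file names are invented for the demonstration:

```shell
#!/bin/sh
demo=$(mktemp -d)
printf 'pattern here\n' > "$demo/a.txt"
printf 'pattern there\n' > "$demo/b.txt"
printf 'no match\n' > "$demo/c.txt"

# find prints one name per line; xargs packs as many names as fit onto each
# grep command line, so grep starts once per batch, not once per file.
# -l prints only the names of the files that match.
matches=$(find "$demo" -type f -print | xargs grep -l 'pattern' | sort)

echo "$matches"
rm -rf "$demo"
```

One caveat: plain xargs splits its input on whitespace, so file names containing spaces or newlines get mangled on the way to grep.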
hamilton@uiucuxc.CSO.UIUC.EDU (02/04/86)
our mailer couldn't find "sutton@aero".

>We are looking for a utility that can, when given a arbitrary string,
>can locate all occurences of that string anywhere on the system. Our
>local Un*x gurus can't figure this out, so we are appealing to those out
>in Netland to help us out.

try something like:

    find / -type f -a -exec fgrep -s this-is-the-string \{\} \; -a -print

if possible, add extra terms to limit the search. for example, if you
want to search only for C source files, add:

    -name \*.c -a

before the "-exec". you could certainly write something more efficient,
but do you really need it that often?

    wayne hamilton
    U of Il and US Army Corps of Engineers CERL
UUCP:   {ihnp4,pur-ee,convex}!uiucdcs!uiucuxc!hamilton
ARPA:   hamilton%uiucuxc@a.cs.uiuc.edu
USMail: Box 476, Urbana, IL 61801
CSNET:  hamilton%uiucuxc@uiuc.csnet
Phone:  (217)333-8703
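The idea above is to use -exec as a yes/no test and let find's own -print report the name. On greps of that era -s generally meant "silent"; in POSIX grep -s only suppresses error messages and -q is the quiet flag, so a present-day sketch would look something like this (scratch directory and file names are made up):

```shell
#!/bin/sh
demo=$(mktemp -d)
printf 'int this_is_the_string;\n' > "$demo/hit.c"
printf 'int other;\n' > "$demo/miss.c"
printf 'this_is_the_string\n' > "$demo/notes.txt"

# -name '*.c' narrows the search before any grep is started.
# grep -q prints nothing; its exit status decides whether -print runs.
found=$(find "$demo" -type f -name '*.c' \
    -exec grep -F -q this_is_the_string {} \; -print)

echo "$found"
rm -rf "$demo"
```

notes.txt contains the string too, but the -name term filters it out before grep ever sees it.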
jso@edison.UUCP (John Owens) (02/04/86)
> We are looking for the command to work like this:
>
> findstring this-is-the-string
>
> The utility would return all the files (and their pathnames from the
> root) to the screen. Of course if the protections on the file indicate
> that the file cannot be read, the program should ignore that file and keep
> on going. We think it can be done using a command file using the 'ls'
> and 'awk' commands but we just can get it right.
>
> sutton@aerospace.ARPA
> {ihnp4!sdcrdcf,randvax,trwrb} ! aero ! sutton

    find / -type f -exec fgrep -l $1 {} \; 2>/dev/null

--
John Owens
General Electric Company                 Phone: (804) 978-5726
Factory Automated Products Division      Compuserve: 76317,2354
       houxm!burl!icase!uvacs
...!{  decvax!mcnc!ncsu!uvacs   }!edison!jso
       gatech!allegra!uvacs
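For anyone wanting this as the requested `findstring' command, here is a hedged sketch of the one-liner wrapped in a shell function. It assumes a POSIX find and grep; FINDSTRING_ROOT is a made-up variable added only so the function can be tried on a small tree instead of /:

```shell
#!/bin/sh
# findstring STRING: print names of readable regular files containing STRING.
# FINDSTRING_ROOT is hypothetical, for this sketch only; it defaults to /.
findstring() {
    if [ $# -ne 1 ]; then
        echo 'usage: findstring string' >&2
        return 2
    fi
    # "$1" is quoted so strings with spaces or wildcards survive intact;
    # -- stops option parsing in case the string starts with a dash;
    # {} + passes many file names to each grep, much as xargs would;
    # 2>/dev/null hides the complaints about unreadable files.
    find "${FINDSTRING_ROOT:-/}" -type f \
        -exec grep -F -l -- "$1" {} + 2>/dev/null
}

# Try it on a scratch tree rather than the whole system.
demo=$(mktemp -d)
printf 'this is the string\n' > "$demo/hit"
printf 'nothing\n' > "$demo/miss"
result=$(FINDSTRING_ROOT=$demo findstring 'this is the string')
echo "$result"
rm -rf "$demo"
```

Quoting "$1" is the main change from the one-liner: unquoted, a search string containing a space would be split into a pattern and a bogus file name.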
smithson@calma.UUCP (Brian Smithson) (02/04/86)
In article <587@smeagol.UUCP> earle@smeagol.UUCP (Greg Earle) writes:
>> We are looking for a utility that can, when given a arbitrary string,
>> can locate all occurences of that string anywhere on the system. Our
>> local Un*x gurus can't figure this out, so we are appealing to those out
>> in Netland to help us out.
>
>Some Gurus you got there ...
>
>> We are looking for the command to work like this:
>>
>> findstring this-is-the-string
>
>find / -exec fgrep this-is-the-string '{}' \; (UGGGHHH!)
>
>Warning! Only execute during hours when no one else is in building!!
>Guaranteed to tie up CPU for indefinite periods! :@)
>[...]
>

How about:

    nice -20 "find / -exec fgrep this-is-the-string {} \;"

? Better pack a lunch, though... :-)
jsdy@hadron.UUCP (Joseph S. D. Yao) (02/11/86)
All the folk who are responding with ways to get the names of files
containing a particular string are kind of forgetting that
the grep family does n o t automatically print out file names.
This:
>find / -exec fgrep this-is-the-string '{}' \;
will give a file full of lines containing this-is-the-string. Try:
find / -exec grep this-is-the-string '{}' /dev/null \;
**OR** (quicker) :
find / -type d -a -exec ksh findstr "this-is-the-string" {} \;
findstr:
#!/bin/ksh
# or /bin/sh

str="$1"
dir="$2"
file=""
text=""

if [ ! -d "$dir" ]; then exit 1; fi

cd "$dir"

for file in *; do
    if [ ! -f "$file" ]; then continue; fi
    text=`file "$file" | grep text`
    if [ "" = "$text" ]; then continue; fi
    # if you want the complete text:
    # grep "$str" "$dir/$file" /dev/null
    # otherwise
    text=`grep "$str" "$file" | line`
    if [ "" != "$text" ]; then
        echo "$dir/$file"
    fi
done
exit 0
--
Joe Yao hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
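The /dev/null trick in the post above is easy to see on a single scratch file. A small sketch, assuming a POSIX sh with mktemp; the file name and contents are made up:

```shell
#!/bin/sh
demo=$(mktemp -d)
printf 'little dog\n' > "$demo/story"

# One file operand: grep prints only the matching line.
one=$(grep 'little dog' "$demo/story")

# Two file operands: grep prefixes each match with the file name.
# /dev/null is always empty, so it never contributes a match of its own.
two=$(grep 'little dog' "$demo/story" /dev/null)

echo "$one"
echo "$two"
rm -rf "$demo"
```

The first echo shows the bare line; the second shows the same line with the path and a colon in front of it.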
mo@wgivax.UUCP (02/15/86)
>From jsdy@hadron.UUCP (Joseph S. D. Yao) Sun Feb 6 01:28:16 206
>Summary: grep won't always print file name!
>
>All the folk who are responding that the way to get the file names
>of files containing a particular string are kind of forgetting that
>the grep family does n o t automatically print out file names.
>This:
>
>>find / -exec fgrep this-is-the-string '{}' \;
>
>will give a file full of lines containing this-is-the-string. Try:
>
>find / -exec grep this-is-the-string '{}' /dev/null \;

WRONG! fgrep -l WILL print the file name, and WILL NOT print the string
       it will look for ONLY the first occurrence in a file, speeding
       things up, AND fgrep is faster than grep

>**OR** (quicker) :
>
>find / -type d -a -exec ksh findstr "this-is-the-string" {} \;

(-: GREAT, NOW HOW DO I FIND THE KORN SHELL ? :-)

>findstr:
>#!/bin/ksh
># or /bin/sh
>
>str="$1"
>dir="$2"
>file=""
>text=""
>
>if [ ! -d "$dir" ]; then exit 1; fi
>
>cd "$dir"
>
>for file in *; do
> if [ ! -f "$file" ]; then continue; fi
> text=`file "$file" | grep text`
> if [ "" = "$text" ]; then continue; fi
> # if you want the complete text:
> # grep "$str" "$dir/$file" /dev/null
> # otherwise
> text=`grep "str" "$file" | line`
> if [ "" != "$text" ]; then
> echo "$dir/$file"
> fi
>done
>exit 0

this is admittedly "safer", since it skips non-text files, but look at
all those sub-processes you're starting up for every used inode on the
system!

haven't we heard enough about this, YET?
gkloker@utai.UUCP (Geoff Loker) (02/16/86)
In article <259@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>All the folk who are responding that the way to get the file names
>of files containing a particular string are kind of forgetting that
>the grep family does n o t automatically print out file names.
>This:
>
>>find / -exec fgrep this-is-the-string '{}' \;
>
>will give a file full of lines containing this-is-the-string. Try:
>

I don't know if this is any quicker than your script file suggestion
for finding file names with the string (we don't have ksh), but the
grep family does have an option to print out only the name(s) of
file(s) that contain the match-string. Try:

    find / -exec fgrep -l this-is-the-string '{}' \;

--
Geoff Loker
Department of Computer Science
University of Toronto
Toronto, ON M5S 1A4
USENET:  {ihnp4 decwrl utzoo uw-beaver}!utcsri!utai!gkloker
CSNET:   gkloker@toronto
ARPANET: gkloker.toronto@csnet-relay
geoff@desint.UUCP (Geoff Kuenning) (02/17/86)
In article <259@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
> text=`file "$file" | grep text`

Actually, there's still a gotcha here: it will pick up files named
'text.o', 'context.o', etc. (I know this from painful experience!).
A better way to detect text files is:

    text=`file "$file" | grep ':.* text'`

or even

    text=`file "$file" | grep 'text$'`

but I'd use the last only after checking the source of /bin/file to make
sure it always put 'text' last on the line.

Anyway, here is a TESTED (what a radical idea) shell script that:

    (1) Searches only text files
    (2) Accepts the -l and -n switches of 'grep'
    (3) remembers to put in /dev/null if -l is not specified, and
    (4) uses xargs if it is available in /bin or /usr/bin.

Both the BSD and the System V variants have been tested (it adapts
dynamically). It is an improved version of a script that I sent to the
original question-asker more than a month ago.

Now, can we *please* move on to a new subject?

    Geoff Kuenning
    {hplabs,ihnp4}!trwrb!desint!geoff
------------------------cut here--------------------------
: Use /bin/sh
#!/bin/sh
#
# Locate a string in any (text) file, anywhere in the system.
#
# Usage:
#
#    findstring [-l] [-n] [-g grep-program] [root-directory] search-string
#
# If the search-string contains semicolons, they should be backslashed,
# thus:
#
#    findstring "break\;"
#
# The -l and -n switches are passed to the 'grep' program.
#
# The -g switch selects a different grep program; the default is 'grep'.
#
# WARNING: this command is very slow, and loads down the system
# quite heavily. The System V version opens every file in the
# system; the BSD version does that and also spawns at least one
# process for every file in the system and another for every
# text file. You can reduce the system load by a little bit by
# initiating the command from the root directory.
#
# Note: this is written to be portable to System V and BSD. It
# has only been tested on system V, though the BSD code was also
# tested there.
#
PATH=/bin:/usr/bin
ROOTDIR=/
grepargs=
nullfile=/dev/null
grep=grep
while :
do
    case "X$1" in
        X-l)
            grepargs="$grepargs -l"
            nullfile=
            shift
            ;;
        X-n)
            grepargs="$grepargs -n"
            shift
            ;;
        X-g)
            grep=$2
            shift; shift
            ;;
        X-*)
            set illegal arguments - this will cause a message
            break
            ;;
        *)
            break
            ;;
    esac
done
if [ $# -gt 1 ]
then
    ROOTDIR=$1
    shift
fi
if [ $# -ne 1 ]
then
    echo 'Usage: findstring [-l] [-n] [-g grep-program]' \
        '[root-directory] search-string' 1>&2
    exit 2
fi
#
# If you have UniSoft System V, test xargs to see if it has a bug by
# typing:
#
#    echo a b c | xargs echo
#
# If you get nothing back, you have the bug. If you get "a b c",
# you don't.
#
# If you have the bug, you will have to disable the xargs variant
# below and make it run the non-xargs version. This is unfortunately
# *much* slower.
#
if [ -x /bin/xargs -o -x /usr/bin/xargs ]
then
    #
    # The system has xargs; use it
    #
    find $ROOTDIR -type f -print \
      | xargs file \
      | sed -n '/: .* text/s/: .*$//p' \
      | xargs $grep $grepargs "$1" $nullfile
else
    #
    # Too bad, there's no xargs. We'll have to do it the hard way.
    #
    find $ROOTDIR -type f -exec file {} \; \
      | sed -n '/: .* text/{
            s/: .*$//
            s;^;'"$grep $grepargs '$1' $nullfile"' ;p
        }' \
      | sh
fi
#
# Unfortunately, the grep's will return a nonzero status if they find
# nothing, so there isn't much point in returning their status.
#
exit 0
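The 'context.o' gotcha and the stricter patterns can be checked without touching the filesystem. The sketch below feeds grep and sed a few hand-written lines shaped like file(1) output; the sample lines are invented, and real file(1) wording differs from system to system:

```shell
#!/bin/sh
# Simulated file(1) output: two object files whose NAMES contain "text",
# and one file that really is text.
sample=$(printf '%s\n' 'context.o: data' 'text.o: data' 'notes: ascii text')

# Loose pattern: matches "text" anywhere, including inside the file name.
loose=$(printf '%s\n' "$sample" | grep -c 'text')

# Stricter pattern: "text" must appear after the colon, in the description.
strict=$(printf '%s\n' "$sample" | grep -c ':.* text')

# The script's sed filter keeps text files and strips the description.
names=$(printf '%s\n' "$sample" | sed -n '/: .* text/s/: .*$//p')

echo "loose=$loose strict=$strict names=$names"
```

The loose pattern counts all three lines, while the stricter ones keep only `notes'.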
lew@gsg.UUCP (Paul Lew) (02/17/86)
>All the folk who are responding that the way to get the file names
>of files containing a particular string are kind of forgetting that
>the grep family does n o t automatically print out file names.
>This:
>
>find / -exec fgrep this-is-the-string '{}' \;
>
>will give a file full of lines containing this-is-the-string. Try:
>

Notice that if you run grep on more than one file, the file names will
be displayed. A simple solution to the problem is to use:

    find / -exec fgrep this-is-the-string /dev/null '{}' \;

and YOU DO NOT HAVE TO WRITE ANY SCRIPT to do so.

--
----------------------------------------------------------------------
Paul S. Lew                     decvax!gsg!lew (UUCP)
General Systems Group
51 Main Street, Salem, NH 03079
(603) 893-1000
----------------------------------------------------------------------
mjs@sfsup.UUCP (M.J.Shannon) (02/18/86)
> >This:
> >
> >>find / -exec fgrep this-is-the-string '{}' \;
> >
> >will give a file full of lines containing this-is-the-string. Try:
> >
>
> find / -exec fgrep -l this-is-the-string '{}' \;
> --
> Geoff Loker

If your grep/egrep/fgrep doesn't support -l, then try the following:

    find / -exec fgrep string '{}' /dev/null ';' | sed -e 's/:.*//' | sort -t/ -u

Note that this fails miserably if you have files whose names include a ':'.

--
Marty Shannon
UUCP: ihnp4!attunix!mjs
Phone: +1 (201) 522 6063
Disclaimer: I speak for no one.
"If I never loved, I never would have cried." -- Simon & Garfunkel
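The ':' failure mentioned above is easy to reproduce on a scratch directory. A sketch, assuming a POSIX sh whose mktemp path contains no colon; the file name a:b is made up, and plain sort -u stands in for the sort -t/ -u of the post:

```shell
#!/bin/sh
demo=$(mktemp -d)
printf 'string\n' > "$demo/a:b"

# sed cuts at the FIRST colon, which here is inside the file name,
# so the reported name is truncated to ".../a".
via_sed=$(find "$demo" -type f -exec grep -F string {} /dev/null \; \
    | sed -e 's/:.*//' | sort -u)

# grep -l prints the name itself, so nothing has to be parsed back out.
via_l=$(find "$demo" -type f -exec grep -F -l string {} \;)

echo "$via_sed"
echo "$via_l"
rm -rf "$demo"
```

Where -l is available it sidesteps the problem entirely, which is presumably why it is the preferred form.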
stevesu@copper.UUCP (Steve Summit) (02/18/86)
In article <245@aero.ARPA>, sutton@aero.ARPA (Stew Sutton) writes:
> We are looking for a utility that can, when given a arbitrary string,
> can locate all occurences of that string anywhere on the system. Our
> local Un*x gurus can't figure this out, so we are appealing to those out
> in Netland to help us out.

Stew's question has basically been answered, but I've got two cents
to add:

1. Since such a command is probably going to generate voluminous
   output, it is tempting to redirect it to a file for later
   perusal. If you do so, be extremely careful: if your program
   is searching the entire filesystem, it is likely to find your
   output file, each of whose lines contains the string you're
   looking for, and which will therefore get re-appended to the
   file, ad infinitum... I make this mistake every few years,
   filling up a disk every time. If you don't need to search the
   entire filesystem, just make sure you put the output file
   somewhere where it won't get found, like /tmp. The general
   solution would be an exclusion option on the find command,
   which would be generally useful. (Another trick would be to
   make the output file unreadable.)

2. Joe Yao pointed out the problem of the grep family not
   printing the filename if given a single argument. My
   solution, which is a bit wasteful, but probably more efficient
   than Joe's shell script, goes like this:

       find / -exec grep 'little dog' {} /dev/null \;

   grep notices two arguments, so cheerfully prints the filename
   if it finds the string, although it's virtually guaranteed
   never to occur in the second one (unless /dev/null
   accidentally got replaced with a real file, but that's another
   story).

    Steve Summit
    tektronix!copper!stevesu
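On the first point, find's ordinary tests can already serve as the exclusion: negating -name keeps the output file out of the search. A minimal sketch on a scratch directory, assuming a POSIX find; the name `outfile' is arbitrary, and this excludes every file of that name, not just the one being written:

```shell
#!/bin/sh
demo=$(mktemp -d)
printf 'little dog\n' > "$demo/story"
out="$demo/outfile"

# The shell creates outfile before find starts, inside the tree being
# searched. "! -name outfile" stops find from handing it to grep, so the
# results are never fed back into themselves.
find "$demo" -type f ! -name outfile \
    -exec grep 'little dog' {} /dev/null \; > "$out"

lines=$(wc -l < "$out" | tr -d ' ')
echo "$lines line(s) in outfile"
```

Only the match from `story' lands in the output file.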
stevesu@copper.UUCP (Steve Summit) (02/18/86)
A thousand apologies. Joe Yao suggested the exact same /dev/null
trick I did; I missed it and got distracted by the complicated-looking
Korn shell script at the bottom of his article.

    Steve Summit
    tektronix!copper!stevesu
dv@well.UUCP (David W. Vezie) (02/24/86)
In article <144@wgivax.UUCP> mo@wgivax.UUCP writes:
>WRONG! fgrep -l WILL print the file name, and WILL NOT print the string
>       it will look for ONLY the first occurrence in a file, speeding
>       things up, AND fgrep is faster than grep
>

Ummm... I don't know about your machine, but I just did an informal
benchmark comparing {e,,f}grep for speed, and found out that of the
three, egrep is fastest, followed by grep, and slowest was fgrep.
(this is on 4.2BSD)

---
David W. Vezie
{dual|hplabs}!well!dv - Whole Earth 'Lectronics Link, Sausalito, CA
(4 lines, 113 chars)
mo@wgivax.UUCP (02/27/86)
>Reply-To: dv@well.UUCP (David W. Vezie)
>In article <144@wgivax.UUCP> mo@wgivax.UUCP writes:
>>WRONG! fgrep -l WILL print the file name, and WILL NOT print the string
>>       it will look for ONLY the first occurrence in a file, speeding
>>       things up, AND fgrep is faster than grep
>>
>Ummm... I don't know about your machine, but I just did an informal
>benchmark comparing {e,,f}grep for speed, and found out that of the
>three, egrep is fastest, followed by grep, and slowest was fgrep.
>(this is on 4.2BSD)

I haven't used egrep very much, but having worked with UNIX for 5 years on
Vax 11/780's, Sun's, Masscomp's, and various other 68k machines (mostly
4.[12]), I have always observed (-: yes, this is a subjective observation :-)
that fgrep runs faster than grep when searching for a specific string.

Anyway, the point is that the grep family of commands has an option which
will print out the file name, and not the lines in which the pattern occurs.
Let's avoid a holy war.

The point in this entire back and forth discussion is RTFM!!!! There have
been many mistakes in the postings responding to the original article. It's
great to want to help, but be sure that you have all the facts before giving
advice.