[comp.unix.wizards] recursive grep

williamt@athena1.Sun.COM (William A. Turnbow) (08/23/89)

Here is a short quicky (I hope).  I am trying to do the following:

find . -type d -exec grep string {}/* \;


However, find apparently does not expand the braces unless they are
separated by spaces.  I've tried a variety of quotes and backslashes,
but no go.

I've always ended up writing a tiny shell script in the past to do this,
but there's got to be some easy way I'm missing.

email is fine...

Thanks

-wat-

steve@polyslo.CalPoly.EDU (Steve DeJarnett) (08/23/89)

In article <122979@sun.Eng.Sun.COM> williamt@sun.UUCP (William A. Turnbow) writes:
>Here is a short quicky (I hope).  I am trying to do the following:
>
>find . -type d -exec grep string {}/* \;

	If you're trying to grep for a string in every file in or below the
current directory, why not do this:

	find . -type f -exec grep string {} \;

>However, find apparently does not expand the braces unless they are
>separated by spaces.  I've tried a variety of quotes and backslashes,
>but no go.

	I suspect that find "sees" the /* after the braces, and presumes
that you mustn't really want it to expand the filename there.  I've never
known find to need a space between the braces, but, then, that certainly
doesn't mean that it never would expect that.  :-)

-------------------------------------------------------------------------------
| Steve DeJarnett            | Smart Mailers -> steve@polyslo.CalPoly.EDU     |
| Computer Systems Lab       | Dumb Mailers  -> ..!ucbvax!voder!polyslo!steve |
| Cal Poly State Univ.       |------------------------------------------------|
| San Luis Obispo, CA  93407 | BITNET = Because Idiots Type NETwork           |
-------------------------------------------------------------------------------

dce@Solbourne.COM (David Elliott) (08/23/89)

In article <13710@polyslo.CalPoly.EDU> steve@polyslo.CalPoly.EDU (Steve DeJarnett) writes:
>In article <122979@sun.Eng.Sun.COM> williamt@sun.UUCP (William A. Turnbow) writes:
>>Here is a short quicky (I hope).  I am trying to do the following:
>>
>>find . -type d -exec grep string {}/* \;
>
>	If you're trying to grep for a string in every file in or below the
>current directory, why not do this:
>
>	find . -type f -exec grep string {} \;

A closer solution is

	find . -type f -exec grep string {} /dev/null \;

This will force grep to print the filename.  Even better is

	find . -type f -print | xargs grep string /dev/null

if you have xargs.  xargs will run

	grep string /dev/null {filenames}

for sets of file names read from stdin.  The result is far fewer
forks and execs of grep, so you get the results much quicker.

>	I suspect that find "sees" the /* after the braces, and presumes
>that you mustn't really want it to expand the filename there.  I've never
>known find to need a space between the braces, but, then, that certainly
>doesn't mean that it never would expect that.  :-)

Actually, find just looks specifically for {} as an argument.  I've
always thought find should expand {} in any part of an argument, but
that's probably because I wanted it to do that when I first started
using it.

-- 
David Elliott		dce@Solbourne.COM
			...!{uunet,boulder,nbires,sun}!stan!dce

"I had a dream that my kids had been reparented." - Tom LaStrange

ekrell@hector.UUCP (Eduardo Krell) (08/23/89)

In article <13710@polyslo.CalPoly.EDU> steve@polyslo.CalPoly.EDU (Steve DeJarnett) writes:

>	If you're trying to grep for a string in every file in or below the
>current directory, why not do this:
>
>	find . -type f -exec grep string {} \;

Because exec'ing one grep for each file is slower than exec'ing one
grep with multiple file arguments for each directory in the hierarchy.
    
Eduardo Krell                   AT&T Bell Laboratories, Murray Hill, NJ

UUCP: {att,decvax,ucbvax}!ulysses!ekrell  Internet: ekrell@ulysses.att.com

ams@cbnewsl.ATT.COM (andrew.m.shaw) (08/24/89)

>>Here is a short quicky (I hope).  I am trying to do the following:
>>
>>find . -type d -exec grep string {}/* \;

>...why not do this:
>
>	find . -type f -exec grep string {} \;
>

No, find does not expand {} unless isolated.  Why not use the much
ignored xargs and save yourself n execs of grep?  Thus:

	find . -type f -print | xargs grep string

dg@lakart.UUCP (David Goodenough) (08/24/89)

steve@polyslo.CalPoly.EDU (Steve DeJarnett) sez:
> williamt@sun.UUCP (William A. Turnbow) writes:
>>Here is a short quicky (I hope).  I am trying to do the following:
>>
>>find . -type d -exec grep string {}/* \;
> 
> 	If you're trying to grep for a string in every file in or below the
> current directory, why not do this:
> 
> 	find . -type f -exec grep string {} \;

Simple. The first does one exec per directory, the second does one exec per
file. I agree with Mr. Turnbow that it is extremely obnoxious behaviour
on the part of find. The only way I can see to do it is to do some real
funky work with awk, maybe:

	find . -type d -print | awk '{ print "grep string " $0 "/*" }' | sh

But then I use awk for most everything, no matter how ugly :-)
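[A variant of the same pipeline that single-quotes each directory name, so
that directories with spaces survive the trip through sh.  A sketch only: it
still breaks on names containing a single quote or a newline.  `%c` with 39
emits a literal `'` portably in awk.]

```shell
# Same awk | sh trick, but wrap each directory name in single quotes
# so the generated grep command tolerates spaces in the name.
find . -type d -print |
awk '{ printf "grep string %c%s%c/*\n", 39, $0, 39 }' | sh
```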
-- 
	dg@lakart.UUCP - David Goodenough		+---+
						IHS	| +-+-+
	....... !harvard!xait!lakart!dg			+-+-+ |
AKA:	dg%lakart.uucp@xait.xerox.com			  +---+

sja@sirius.hut.fi (Sakari Jalovaara) (08/25/89)

> find . -type f -print | xargs grep string

xargs pops up every couple of months in comp.unix.* but I haven't seen
this mentioned yet:

	Script started on Fri Aug 25 12:24:58 1989
	tmp (sirius) 11> touch 'file names can have spaces\
	and even newlines in them\!'
	tmp (sirius) 12> find . -type f -print | xargs grep helloWorld
	grep: ./file: No such file or directory
	grep: names: No such file or directory
	grep: can: No such file or directory
	grep: have: No such file or directory
	grep: spaces: No such file or directory
	grep: and: No such file or directory
	grep: even: No such file or directory
	grep: newlines: No such file or directory
	grep: in: No such file or directory
	grep: them!: No such file or directory
	tmp (sirius) 13> exit
	exit

	script done on Fri Aug 25 12:26:11 1989

You probably don't want to run "xargs rm" and such as root.
									++sja

ams@cbnewsl.ATT.COM (andrew.m.shaw) (08/27/89)

 In article <666@lakart.UUCP> dg@lakart.UUCP (David Goodenough) writes:
 >steve@polyslo.CalPoly.EDU (Steve DeJarnett) sez:
 >> williamt@sun.UUCP (William A. Turnbow) writes:
 >>>Here is a short quicky (I hope).  I am trying to do the following:
 >>>
 >>>find . -type d -exec grep string {}/* \;
 >> 
 >> 	If you're trying to grep for a string in every file in or below the
 >> current directory, why not do this:
 >> 
 >> 	find . -type f -exec grep string {} \;
 >
 >Simple. The first does one exec per directory, the second does one exec per
 >file. I agree with Mr. Turnbow that it is extremely obnoxious behaviour
 >on the part of find. The only way I can see to do it is to do some real
 >funky work with awk, maybe:
 >
 >	find . -type d -print | awk '{ print "grep string " $0 "/*" }' | sh
 >
 >But then I use awk for most everything, no matter how ugly :-)

Since my previous posting may have gotten lost, I resend that I recommend
the following:

	find . -type f -print | xargs fgrep string

Neat and clean.

	Andrew Shaw

jeff@cdp.UUCP (08/27/89)

I wouldn't complain about xargs not being capable of handling
filenames with spaces and newlines.  There are a lot of other
programs that will break under the same circumstances.

	Jeff Dean
	uunet!pyramid!cdp!jeff

guy@auspex.auspex.com (Guy Harris) (08/29/89)

 >I wouldn't complain about xargs not being capable of handling
 >filenames with spaces and newlines.  There are a lot of other
 >programs that will break under the same circumstances.

In which case I'd not only continue to complain about "xargs", but
complain about those other programs as well....

ado@elsie.UUCP (Arthur David Olson) (08/29/89)

> . . .I recommend the following:
> 	find . -type f -print | xargs fgrep string

Using
	find . -type f -print | xargs fgrep string /dev/null
will help ensure that all files are treated consistently.  If xargs bunched
all but one of your files into its first exec of fgrep, then passed the last
file to fgrep, you'd get output such as
	firstfile: This is a string.
	secondfile: This is also a string.
	This is the last string.
with the first command above; you'd get output such as
	firstfile: This is a string.
	secondfile: This is also a string.
	thelastfile: This is the last string.
with the second command above.
-- 
Gettysburg Address: 266 words.  Spencer article bodies, 8/12-18/89: 14439 words.
	Arthur David Olson    ado@alw.nih.gov    ADO is a trademark of Ampex.

grr@cbmvax.UUCP (George Robbins) (08/29/89)

In article <1641@cbnewsl.ATT.COM> ams@cbnewsl.ATT.COM (andrew.m.shaw,580,) writes:
> 
>  In article <666@lakart.UUCP> dg@lakart.UUCP (David Goodenough) writes:
>  >steve@polyslo.CalPoly.EDU (Steve DeJarnett) sez:
>  >> williamt@sun.UUCP (William A. Turnbow) writes:
>  >
>  >	find . -type d -print | awk '{ print "grep string " $0 "/*" }' | sh
>  >
>  >But then I use awk for most everything, no matter how ugly :-)
> 
> Since my previous posting may have gotten lost, I resend that I recommend
> the following:
> 
> 	find . -type f -print | xargs fgrep string
> 
> Neat and clean.

Iff your system happens to have xargs - many Berkeley derived systems don't,
in which case the "find | filter | sh" can still handle the problem.  It may
not be as quick, since it does one command per filename, but it's certainly
more flexible.

-- 
George Robbins - now working for,	uucp: {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing	arpa: cbmvax!grr@uunet.uu.net
Commodore, Engineering Department	fone: 215-431-9255 (only by moonlite)

les@chinet.chi.il.us (Leslie Mikesell) (08/30/89)

In article <2390@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:

> >I wouldn't complain about xargs not being capable of handling
> >filenames with spaces and newlines.  There are a lot of other
> >programs that will break under the same circumstances.

>In which case I'd not only continue to complain about "xargs", but
>complain about those other programs as well....

Well, how would you go about parsing filenames out of a list if you
can't use spaces or newlines as the delimiters?

Personally, I think it is a mistake to allow control characters or
shell metacharacters to be in filenames.  Actually, I'd say that
it's a mistake to use any characters that could be in filenames
as shell metacharacters, but given the selection available I guess
the shell is not really at fault.

We've been through this before and I doubt that anyone has changed
their mind, but I'll bet no one wants to have a file named ";rm *"
in their directories waiting for a shell script to eval it or a
program to insert it into a system() call.

Les Mikesell

oz@yunexus.UUCP (Ozan Yigit) (08/30/89)

In article <7774@cbmvax.UUCP> grr@cbmvax.UUCP (George Robbins) writes:

>Iff your system happens to have xargs - many Berkeley derived systems don't,
>in which case the "find | filter | sh" can still handle the problem.

Well, everyone has a zippy solution, but it seems, backquotes are either
out of fashion, too simple, or because of broken shells, ("Arguments too
long" ?? Nooo... really ??) nobody suggested something like

	egrep ptui `find whatever -print`

Hmm. I thought I had it all this time. :-)

oz
-- 
The king: If there's no meaning	   	    Usenet:    oz@nexus.yorku.ca
in it, that saves a world of trouble        ......!uunet!utai!yunexus!oz
you know, as we needn't try to find any.    Bitnet: oz@[yulibra|yuyetti]
Lewis Carroll (Alice in Wonderland)         Phonet: +1 416 736-5257x3976

cudcv@warwick.ac.uk (Rob McMahon) (08/31/89)

Re: `find ... -print | xargs' and filenames with spaces/newlines.

In article <9408@chinet.chi.il.us> les@chinet.chi.il.us (Leslie Mikesell) writes:
>Well, how would you go about parsing filenames out of a list if you can't use
>spaces or newlines as the delimiters?

Spaces should be no problem.  Find prints its filenames separated by
newlines, and xargs should have an option which means take each line of input
literally as a single argument, *not* ignoring spaces anywhere (in fact I
think this should be the default; how often do you use xargs with more than
one argument per line?).  It would be useful if find quoted each newline in a
filename with a backslash.

Rob
-- 
UUCP:   ...!mcvax!ukc!warwick!cudcv	PHONE:  +44 203 523037
JANET:  cudcv@uk.ac.warwick             ARPA:   cudcv@warwick.ac.uk
Rob McMahon, Computing Services, Warwick University, Coventry CV4 7AL, England

perry@ccssrv.UUCP (Perry Hutchison) (08/31/89)

In article <7774@cbmvax.UUCP> grr@cbmvax.UUCP (George Robbins) writes:

>in which case the "find | filter | sh" can still handle the problem.  It may
>not be as quick, since it does one command per filename, but it's certainly

This discussion has now come full circle.  The whole point of the original
posting was how to run the command once per *directory* instead of once per
*file*, so as to reduce overhead.  It has been previously noted that once
per file is readily accomplished without pipes or filters by

    find <dir> -type f -exec <command> \;

It seems that xargs, if available, reduces overhead satisfactorily although
it will not produce "one execution per directory" as such.  If xargs is not
available or if exactly one execution per directory is needed, one may need
a nearly-trivial script (original poster's solution) since "find" will not
substitute a {} which appears *within* an argument.  This limitation is
arguably a (longstanding) flaw in find.

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/31/89)

In article <3478@yunexus.UUCP> oz@yunexus.UUCP (Ozan Yigit) writes:
>out of fashion, too simple, or because of broken shells, ("Arguments too
>long" ?? Nooo... really ??) nobody suggested something like
>	egrep ptui `find whatever -print`

Nobody suggested it because, as you indicated, it doesn't work if the
find output is too large.  It's not "broken shells"; it's an operating
system restriction.
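[The restriction Doug mentions is the kernel's ARG_MAX limit: the combined
size of the argument list and environment handed to exec().  On systems with
a POSIX.2 getconf (an assumption; stock 1989 systems lack it) you can query
the limit directly:]

```shell
# ARG_MAX bounds argv + environ passed to exec(); a backquote
# expansion that exceeds it makes exec fail with "arg list too long".
getconf ARG_MAX
```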

dg@lakart.UUCP (David Goodenough) (08/31/89)

ams@cbnewsl.ATT.COM (andrew.m.shaw,580,) sez:
> 
>  In article <666@lakart.UUCP> dg@lakart.UUCP (David Goodenough) writes:
>  >steve@polyslo.CalPoly.EDU (Steve DeJarnett) sez:
>  >> williamt@sun.UUCP (William A. Turnbow) writes:
>  >
>  >	find . -type d -print | awk '{ print "grep string " $0 "/*" }' | sh
>  >
>  >But then I use awk for most everything, no matter how ugly :-)
> 
> Since my previous posting may have gotten lost, I resend that I recommend
> the following:
> 
> 	find . -type f -print | xargs fgrep string
> 
> Neat and clean.

Script started on Thu Aug 31 10:22:20 1989
lakart!dg(~)[61]-> find xargs
lakart!dg(~)[62]-> ^D
script done on Thu Aug 31 10:22:34 1989

Great. Now what do all us Berkeley folks do when we don't have xargs?
How's about we pirate a copy from a local SysV site that happens to
have a source licence. Naahhhh - that might get us in trouble. :-)
-- 
	dg@lakart.UUCP - David Goodenough		+---+
						IHS	| +-+-+
	....... !harvard!xait!lakart!dg			+-+-+ |
AKA:	dg%lakart.uucp@xait.xerox.com			  +---+

les@chinet.chi.il.us (Leslie Mikesell) (09/01/89)

In article <191@titania.warwick.ac.uk> cudcv@warwick.ac.uk (Rob McMahon) writes:
>Re: `find ... -print | xargs' and filenames with spaces/newlines.

>>Well, how would you go about parsing filenames out of a list if you can't use
>>spaces or newlines as the delimiters?

>Spaces should be no problem.  Find prints its filenames separated by
>newlines, and xargs should have an option which means take each line of input
>literally as a single argument, *not* ignoring spaces anywhere.

It's a can of worms any way you turn it.  You could intersperse a filter
between find and xargs to quote the filenames in the ways acceptable
to xargs.  However, this is non-trivial since the names might contain
instances of any or all of the quote characters and it is still imperfect
since you can't tell newlines in the filenames from those added by find.

>It would be useful if find quoted each newline in a filename with a backslash.

But then what do you do about literal backslashes (which can already be a
problem if you let xargs or the shell parse them)?

I'd like my commands to do the same thing whether typed to the shell,
exec'ed, xarg'ed or even handled by shell scripts that do things like
"eval set $*".   That means I don't want any *?[\"'space/newline(etc.)
metacharacters in my filenames, or even any leading "-"'s. 

How about a hook in the file system switch to allow arbitrary character
translations in filenames?  Perhaps it could be above the RFS/NFS links
so if you mount someone else's filesystem you can still make it look
the way you prefer.

Les Mikesell

gaggy@jolnet.ORPK.IL.US (Gregory Gulik) (09/01/89)

In article <3478@yunexus.UUCP> oz@yunexus.UUCP (Ozan Yigit) writes:
>In article <7774@cbmvax.UUCP> grr@cbmvax.UUCP (George Robbins) writes:
>
>>Iff your system happens to have xargs - many Berkeley derived systems don't,
>>in which case the "find | filter | sh" can still handle the problem.
>
>Well, everyone has a zippy solution, but it seems, backquotes are either
>out of fashion, too simple, or because of broken shells, ("Arguments too
>long" ?? Nooo... really ??) nobody suggested something like
>
>	egrep ptui `find whatever -print`
>
>Hmm. I thought I had it all this time. :-)
>
>oz

Well, it seems it SHOULD work, but not quite.  If the directory
you are looking through is rather large, you'll get an error from
the shell.  I tried a command similar to the above and ksh gave
me this:

ksh: /usr/bin/egrep: arg list too long

I assume other shells have similar limitations.

That's why there is an xargs command.  I don't have access to a BSD
machine so I can't speak about csh.  Maybe csh can handle it better
so it doesn't need xargs???

-greg


-- 
Gregory A. Gulik	Phone:	(312) 825-2435
	E-Mail: ...!jolnet!gagme!greg || ...!chinet!gag
		|| gulik@depaul.edu || variations thereof.

erict@flatline.UUCP (J. Eric Townsend) (09/03/89)

About a year ago, I wanted to be able to traverse the entire
/usr/spool/news tree and search for keywords in articles.  (Hey,
I was bored. :-)

It's a kludge, it's not pretty, and it probably uses more than its
share of system time.  If you can suggest a way to make
it cleaner/faster, please let me know.  Flames>/dev/null.

However, it's basic /bin/sh, and I've yet to find a place that it
*doesn't* work.

#!/bin/sh
#^^^^^^^^ Wish I had csh to see if this really works. :-)
#try -- recursive search directory structure
#usage: try directory string
#LIVES= it's home directory
LIVES=/u/erict/bin
for i in $1/*
do
if test -d $i
then
   $LIVES/try $i $2 2>/dev/null
#error into devnull handles the "can't open foo" error messages! :-P
else
   grep "$2" $i
fi
done
-- 
"Watch has a clock on it" -- ficc!peter's 3.7 year old son, on seeing
                              an analog watch.
J. Eric Townsend unet!sugar!flatline!erict com6@uhnix1.uh.edu
EastEnders Mailing list: eastender@flatline.UUCP
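[An editorial sketch of the same recursive walk, not the original posting:
double-quoting the variable references lets names with spaces survive, and
recursing via "$0" avoids hard-coding the script's home directory (an
assumption that the script is invoked by path, e.g. ./try).]

```shell
#!/bin/sh
# try -- recursively grep a directory tree (quoted-variable sketch)
# usage: try directory string
for i in "$1"/*
do
    if test -d "$i"
    then
        "$0" "$i" "$2" 2>/dev/null   # recurse; stderr silences "can't open" noise
    else
        grep "$2" "$i"
    fi
done
```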

kent@ssbell.UUCP (Kent Landfield) (09/03/89)

In article <677@lakart.UUCP> dg@lakart.UUCP (David Goodenough) writes:
>
>Great. Now what do all us Berkeley folks do when we don't have xargs?
>How's about we pirate a copy from a local SysV site that happens to
>have a source licence. Naahhhh - that might get us in trouble. :-)
>-- 
>	dg@lakart.UUCP - David Goodenough		+---+

Just look in your neighborhood comp.sources.unix archives. Volume 3
contains xargs. :-) No licensing violations needed. :-) It was posted
to mod.sources before the great renaming... I have not tried it but 
here is the header from the posting for all that are interested...

---------------- Start Header - c.s.u/volume3/xargs ---------------

Newsgroups: mod.sources
Subject: xargs - execute a command with many arguments
Message-ID: <1380@panda.UUCP>
Date: 6 Feb 86 14:43:16 GMT

Mod.sources:  Volume 3, Issue 106
Submitted by: seismo!amdahl!gam (Gordon A. Moffett)

Here is a reimplementation of the System V utility xargs.  I haven't
heard any complaints about it, though [1] There is room for improvement
regarding the command buffer size (tho' it is better than the System V
area in that particular regard) [2] It does not have all the features
of the System V version (as the man page points out).

                               Gordon A. Moffett
                               {ihnp4,seismo,hplabs}!amdahl!gam

---------------- End Header - c.s.u/volume3/xargs ---------------

			-Kent+
---
Kent Landfield               UUCP:     kent@ssbell
Sterling Software FSG/IMD    INTERNET: kent@ssbell.uu.net
1404 Ft. Crook Rd. South     Phone:    (402) 291-8300 
Bellevue, NE. 68005-2969     FAX:      (402) 291-4362

sahayman@iuvax.cs.indiana.edu (Steve Hayman) (09/05/89)

> Much discussion of problems with filenames containing embedded
> newlines or spaces.


How about an option to "find" or maybe a new predicate so that it
would separate filenames in its output not by newline but by
a '\0' character, and then a corresponding option for "xargs"
to tell it that its input files are separated by '\0' instead
of '\n'.   (since '\0' and '/' are the only characters that
can't appear in a file name component, and find already outputs slashes.)

If you wanted a really secure find | xargs combo, perhaps this
would help.
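[This is exactly what later GNU versions of find and xargs provide as the
-print0 and -0 options (an assumption that a GNU toolchain is available; the
stock 1989 tools have no such flags):]

```shell
# NUL-separated pipeline: spaces and newlines in file names pass
# through untouched, since '\0' cannot occur inside a pathname.
# /dev/null forces grep to prefix each match with its filename.
find . -type f -print0 | xargs -0 grep string /dev/null
```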

alfie@cs.warwick.ac.uk (Nick Holloway) (09/05/89)

In article <25420@iuvax.cs.indiana.edu> sahayman@iuvax.cs.indiana.edu (Steve Hayman) writes:
> > Much discussion of problems with filenames containing embedded
> > newlines or spaces.
> [ Steve suggests modifying find and xargs to separate filenames by '\0's ]
> 
> If you wanted a really secure find | xargs combo, perhaps this
> would help.

Perhaps, while we are adding new predicates to find, we could just cut
out the middle man, and implement a new -xargs flag.

e.g: 	find / -name core -xargs rm -f \;

There would (should?) be no problems with spaces/newline et al., since
`find' would build the args to be passed to exec*() directly [leave the
shell out of it], and even the command for xargs could have spaces in,
by doing

	find ... -xargs "o o" \;

I admit that you do not have the control you do with the standard
xargs, but I believe that this would be satisfactory for the majority
of uses.

Hopefully this would not have the misfeature (bug) that the xargs here
has.  Even if no files are given, the command will be executed. (This
also applies to the scripts posted).

	echo "/"      | xargs ls	# performs 'ls /'
	cat /dev/null | xargs ls	# performs 'ls' - wrong!

I used to wonder why my tidyup script kept producing errors from rm,
until I discovered this. Now I use the more inefficient "-exec \;". 
If I were given a -xargs flag, I would use it. (Also if the internal
-ls printed set[ug]id etc correctly - I would use that instead of
the "-exec ls {} \;")
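[GNU xargs later grew precisely this guard as the -r (--no-run-if-empty)
flag, and some BSD-derived implementations skip the run on empty input by
default.  A sketch, assuming a GNU xargs:]

```shell
# With GNU xargs -r, an empty input runs the command zero times
# instead of once with no arguments.
cat /dev/null | xargs -r ls    # no 'ls' is executed at all
```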

--
JANET       : alfie@uk.ac.warwick.cs               |  `O O'  | Nick Holloway
BITNET/EARN : alfie%uk.ac.warwick.cs@ukacrl        | // ^ \\ | Comp Sci Dept
INTERNET    : alfie%cs.warwick.ac.uk@nsfnet-relay.ac.uk      | Uni of Warwick
UUCP        : ..!mcvax!ukc!warwick!alfie, alfie@warwick.UUCP | Coventry, UK.

bob@wyse.wyse.com (Bob McGowen Wyse Technology Training) (09/07/89)

In article <2390@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>
> >I wouldn't complain about xargs not being capable of handling
> >filenames with spaces and newlines.  There are a lot of other
> >programs that will break under the same circumstances.
>
>In which case I'd not only continue to complain about "xargs", but
>complain about those other programs as well....

The shell (as well as many other utilities) is line oriented, which
means that it uses whitespace to delimit input lines.  Even though
you can quote these characters (as well as the wild cards) and get
them into file names it is not something that is recommended for
obvious reasons.

Since you can create them with the shell quoting mechanisms you
can access them from the shell with the same methods, but these
are not available to most other utilities.  So the question
becomes one of how you would separate one name from another if you
were not using whitespace?

If you want to have spaces or newlines in filenames (which is
perfectly OK, the kernel doesn't care) you would have to consider
writing a new shell, a la the macintosh interface.  Otherwise we
have to live with what we've got (which I am perfectly willing and
happy to do).

I guess this could be considered a flame though I do not intend it to be.

Bob McGowan  (standard disclaimer, these are my own ...)
Customer Education, Wyse Technology, San Jose, CA
..!uunet!wyse!bob
bob@wyse.com

guy@auspex.auspex.com (Guy Harris) (09/10/89)

>> >I wouldn't complain about xargs not being capable of handling
>> >filenames with spaces and newlines.  There are a lot of other
>> >programs that will break under the same circumstances.
>>
>>In which case I'd not only continue to complain about "xargs", but
>>complain about those other programs as well....
>
>The shell (as well as many other utilities) is line oriented, which
>means that it uses whitespace to delimit input lines.

No, that means that it uses *newlines* to delimit lines.  It uses
whitespace to delimit tokens, but you can use quotes to get around that.

>Since you can create them with the shell quoting mechanisms you
>can access them from the shell with the same methods, but these
>are not available to most other utilities.

"Most" other utilities, at least in UNIX as distributed, take their file
names from the command line, and thus the shell quoting mechanisms *are*
available to them.

>So the question becomes one of how you would separate one name from
>another if you were not using whitespace?

'\0', as has already been suggested.  It doesn't have to be the default
separator, but as an option it might be useful - e.g., so that "find |
cpio", which some sites use for backup, won't get confused by files
containing newlines.  (It won't be confused by files containing other
sorts of white space, e.g. spaces.)

>If you want to have spaces or newlines in filenames (which is
>perfectly OK, the kernel doesn't care) you would have to consider
>writing a new shell, a la the macintosh interface.  Otherwise we
>have to live with what we've got (which I am perfectly willing and
>happy to do).

Some people already have interfaces like that; back when I was working
at CCI, the Office Power system had an interface that did allow spaces
in file names, and if it made sense to put a space in a file name, I'd
put it there.  Other systems may do so as well, so at least *some*
programs had better be prepared for spaces in file names, at a minimum -
which includes programs like "xargs", since somebody might do a "find
-print | xargs" that hits the directory of a user of such a system.

In other words, the fact that it's inconvenient or impossible to use
file names containing certain characters in *some* programs cannot be
used as an excuse for not fixing at least some other programs -
including, as noted, "xargs" - so that they can handle them.  Or, to
quote the Robustness Principle cited in at least one RFC: "Be
conservative in what you send; be liberal in what you accept from
others;" the latter part means "be liberal enough to accept file names
containing funny characters, since some funny character may decide to
create a file with such a name."

jmm@ecijmm.UUCP (John Macdonald) (09/11/89)

In article <2434@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
| [...]
|
|In other words, the fact that it's inconvenient or impossible to use
|file names containing certain characters in *some* programs cannot be
|used as an excuse for not fixing at least some other programs -
|including, as noted, "xargs" - so that they can handle them.  Or, to
|quote the Robustness Principle cited in at least one RFC: "Be
|conservative in what you send; be liberal in what you accept from
|others;" the latter part means "be liberal enough to accept file names
|containing funny characters, since some funny character may decide to
|create a file with such a name."

It is easy to get that definition of liberal while the current discussion
about "white and control characters in filenames" is going on.  However,
in a different context, one might consider it to be liberal to allow
free form input (e.g. accepting multi-column ls output, user-typed
input with multiple files per line and unnoticed trailing whitespace, etc.).

Of course, this is the sort of situation that leads to feeping creaturism
- (xargs -n for null-delimited names; xargs -l for line-delimited names;
xargs -f for free-format whitespace-delimited names; watch this space for the
next exciting option appearing soon on a command line near you).
-- 
John Macdonald

bet@orion.mc.duke.edu (Bennett Todd) (09/12/89)

The original poster asked for a straightforward construct to let him run
a "find . -type d ..." invoking "grep {}/*" for each directory found.
Other worthwhile suggestions have been offered which probably work
better for what he probably had in mind; however, I am surprised no one
mentioned the straightforward

	find . -type d -print | while read dir;do
		grep string $dir/*
	done

I don't know any way to do that sort of thing under C-shell, but Bourne
shell and successors handle it fine. There is this incredible feeling of
power in being able to type

	oifs=IFS
	IFS=":"
	while read login passwd uid gid tail;do
		# some processing for all the logins on the system
	done </etc/passwd
	IFS=oifs

whenever I feel like it.

-Bennett
bet@orion.mc.duke.edu

leo@philmds.UUCP (Leo de Wit) (09/12/89)

In article <15560@duke.cs.duke.edu> bet@orion.mc.duke.edu (Bennett Todd) writes:
|The original poster asked for a straightforward construct to let him run
|a "find . -type d ..." invoking "grep {}/*" for each directory found.
|Other worthwhile suggestions have been offered which probably work
|better for what he probably had in mind; however, I am surprised no one
|mentioned the straightforward
|
|	find . -type d -print | while read dir;do
|		grep string $dir/*
|	done
|
   []

I don't know whether the original poster (who?) posted his question too
to comp.unix.questions, 'cause quoting myself:

|From leo Fri Sep  8 12:56:11 MET DST 1989
|Article 14005 of comp.unix.questions:
|Path: philmds!leo
|>From: leo@philmds.UUCP (Leo de Wit)
|Newsgroups: comp.unix.questions
|Subject: Re: find cannot find all files in a directory
|Keywords: find files directory
|Message-ID: <1082@philmds.UUCP>
|Date: 7 Sep 89 05:53:20 GMT
|References: <874@wubios.wustl.edu>
|Reply-To: leo@philmds.UUCP (Leo de Wit)
|Organization: Philips I&E DTS Eindhoven
|Lines: 13
|
|In article <874@wubios.wustl.edu> david@wubios.wustl.edu (David J. Camp) writes:
||I want find to return all the files in each directory to exec.  That is,
||I want to do something like:
||
||     find /path -type d -exec command {}/\* \; -print
||
||so that command will be run on each file, one directory at a time.
|
|If you don't have xargs, how about:
|
|     find /path -type d -print|while read dir; do command $dir/*; done
|
|   Leo.
|

So there _was_ prior art, Bennett, though you'll probably have missed it.

   Leo.

jc@minya.UUCP (John Chambers) (09/26/89)

> There is this incredible feeling of power in being able to type:
> 	oifs=IFS
> 	IFS=":"
> 	while read login passwd uid gid tail;do
> 		# some processing for all the logins on the system
> 	done </etc/passwd
> 	IFS=oifs
> whenever I feel like it.

Well, I tried this, adding the two missing dollars, of course; in
particular I tried:

|	oifs=$IFS
|	IFS=":"
|	while read login passwd uid gid descr dir shell
|	do	echo dir=$dir
|	done </etc/passwd
|	IFS=$oifs

It worked for some lines, but for any line that had a null field (e.g.,
several had no password, and some of the system accounts had null descr
fields), the "while read" line ignored all the null fields, causing the
wrong value to be assigned to all the subsequent fields.

And it seemed like such a good idea.  It'd be real useful to have such
an elegant way to extract all the home directories.  I've tried using
awk for the same job, with similar results.

At least sed can handle it.

Sigh.
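[For the record, later POSIX shells fixed this: each occurrence of a
non-whitespace IFS character such as ':' delimits a field, so adjacent colons
produce empty fields rather than being collapsed.  A sketch under that
assumption:]

```shell
# In a POSIX sh, empty fields between colons are preserved by read,
# so every passwd field lands in the right variable.
printf 'root::0:0::/root:/bin/sh\n' |
while IFS=: read login passwd uid gid descr dir shell
do
    echo "dir=$dir"
done
```

which prints `dir=/root` even though the passwd and descr fields are empty.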

-- 
#echo 'Opinions Copyright 1989 by John Chambers; for licensing information contact:'
echo '	John Chambers <{adelie,ima,mit-eddie}!minya!{jc,root}> (617/484-6393)'
echo ''
saying

schwartz@psuvax1.cs.psu.edu (Scott Schwartz) (10/01/89)

In article <9408@chinet.chi.il.us> Leslie Mikesell writes:

| Well, how would you go about parsing filenames out of a list if you
| can't use spaces or newlines as the delimiters?

Good point.  Wouldn't it be nice if programs that spit out filenames
also (optionally?) spit out the terminating \0?  Then you'd have the
correct delimiter at your disposal.  

| Personally, I think it is a mistake to allow control characters or
| shell metacharacters to be in filenames. 

Define control character, shell, and metacharacter.  :-)

| We've been through this before and I doubt that anyone has changed
| their mind, but I'll bet no one wants to have a file named ";rm *"
| in their directories waiting for a shell script to eval it or a
| program to insert it into a system() call.

Maybe noninteractive shells should turn off globbing, as a safety
feature?  

System() can certainly be tricky to get right.  In a philosophical
kind of way it has the same problem as gets() :-)



--
Scott Schwartz		<schwartz@shire.cs.psu.edu>
Now back to our regularly scheduled programming....