[comp.unix.wizards] folding arguments

lm@arizona.edu (Larry McVoy) (02/12/88)

I frequently do stuff like

$ command `find $DIR -print`

which bombs out when I find too many files.  I've written a little program
that lets me do

$ find $DIR -print | fa | while read x; do command $x; done

I'm willing to post it since it might be useful to others but first I'd like
to know if you've got a better solution.  

The problem is splitting a find stream into lines that will fit into
argv.  (Fa will take options as to what is reasonable; defaults are 5000
chars and 1000 args max.)  I'd rather not have to drag this around but I
don't want some gross shell script solution either (i.e. I want to be
able to get it all in on one command line == ~80 chars).  I'd use xargs
but that's a s5-ism.  Is there a BSD equiv?

Comments?
-- 
Larry McVoy	lm@arizona.edu or ...!{uwvax,sun}!arizona.edu!lm
		Use the force - read the source.

lew@gsg.UUCP (Paul Lew) (02/12/88)

>    $ command `find $DIR -print`
>    
>    which bombs out when I find too many files.  I've written a little program
>    that lets me do
>    
>    $ find $DIR -print | fa | while read x; do command $x; done

I used the following frequently:

     $ find $DIR -print | awk '{print "command",$0}' | sh	(flexible)
or:
     $ find $DIR -print | sed 's/^/command /' | sh		(fast)

Both are shorter than the 2nd command and neither needs the program 'fa'.
They also work from either csh or Bourne shell, i.e., you can do this on all
types of Unix.  You can also put shell options at end, e.g., sh -x, to echo
every command as it executes, or sh -vn to test it first before you delete
the wrong files ;-(
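
A minimal sketch of the generate-then-pipe-to-sh idea described above.
The helper name gen_cmds is hypothetical, echo stands in for the real
command, and the command name is assumed to contain no slashes.  Note
that sh re-parses every generated line, so file names containing blanks
or shell metacharacters will not survive this approach intact.

```shell
# gen_cmds: hypothetical helper that prefixes every input line with a
# command name, producing one shell command per file name.
gen_cmds() {
    sed "s/^/$1 /"
}

# Inspect the generated commands without running them.
printf '%s\n' a.txt b.txt | gen_cmds echo

# Syntax-check only: sh -n reads the commands but executes nothing.
printf '%s\n' a.txt b.txt | gen_cmds echo | sh -n

# Actually run them; the command is executed once per file name.
printf '%s\n' a.txt b.txt | gen_cmds echo | sh
```

Checking the generated text first is cheap insurance when the command
is something destructive such as rm.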

Does anyone have an even shorter solution?
-- 
Paul Lew			{oliveb,harvard,decvax}!gsg!lew	(UUCP)
General Systems Group, 5 Manor Parkway, Salem, NH 03079	(603) 893-1000

lew@gsg.UUCP (Paul Lew) (02/12/88)

I forgot to mention that you may use the standard method:

   $ find $DIR -exec command '{}' \;		instead of:
   $ find $DIR -print | sed 's/^/command /' | sh

However, the 2nd method allows the find command to be any command that
generates a list of filenames and is more generic. e.g.,

   $ tar t | sed 's/^/command /' | sh
-- 
Paul Lew			{oliveb,harvard,decvax}!gsg!lew	(UUCP)
General Systems Group, 5 Manor Parkway, Salem, NH 03079	(603) 893-1000

dhesi@bsu-cs.UUCP (Rahul Dhesi) (02/12/88)

In article <120@gsg.UUCP> lew@gsg.UUCP (Paul Lew) writes:
>     $ find $DIR -print | awk '{print "command",$0}' | sh	(flexible)
>or:
>     $ find $DIR -print | sed 's/^/command /' | sh		(fast)

Me, I prefer:

     $ find $DIR -exec command {} \;

However, the original poster wanted to give the command as many
arguments as possible, so it would be invoked less often, presumably
because efficiency was important to him.  Something like the original
fa, which folds argument lists into manageable size, is essential
in that case.  In a pinch I would try something like this:

     $ find $DIR -print | nroff | sed -e 's/  */ /g' | ...

Since nroff right-justifies by default, we use sed to squeeze multiple
blanks to a single blank so "while read" in sh will work.  (Not
tested.)
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

jgy@hropus.UUCP (John Young) (02/12/88)

> I frequently do stuff like
> 
> $ command `find $DIR -print`
> 
> which bombs out when I find too many files.  I've written a little program
> that lets me do
> 
> $ find $DIR -print | fa | while read x; do command $x; done
> 
> I'm willing to post it since it might be useful to others but first I'd like
> to know if you've got a better solution.  
> 

Yes, on System V you can say:

$ find $DIR -print | xargs command

xargs will read its standard input and by default execute 'command'
with as many arguments as can fit.  As usual there is an array of
other options.
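
To make the batching visible, here is a small sketch that uses the -n
option to cap the number of arguments per invocation.  echo stands in
for 'command', and the five input names are made up for illustration.

```shell
# Each echo invocation receives at most two arguments, so five names
# produce three invocations.
printf '%s\n' a b c d e | xargs -n 2 echo
```

Without -n, xargs packs as many arguments as fit into each invocation,
which is the behaviour the original poster asked for.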

lm@arizona.edu (Larry McVoy) (02/13/88)

In article <120@gsg.UUCP> lew@gsg.UUCP (Paul Lew) writes:
>I used the following frequently:
>
>     $ find $DIR -print | awk '{print "command",$0}' | sh	(flexible)
>or:
>     $ find $DIR -print | sed 's/^/command /' | sh		(fast)
>
>Both are shorter than the 2nd command and neither needs the program 'fa'.

But I think this has the following problem: "command" gets exec-ed
once per argument.   That's exactly what I want to avoid.  I really
want infinite space for args to exec but failing that I want something
that bunches up args into suitable form for an exec.  So fa will fit as
many as it can in, say 5000 bytes (configurable at runtime), so I don't
have to run the same command so many times.
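
The source for fa was not posted in this thread, so the following is
only a sketch of the folding behaviour described above, written as an
awk filter inside a shell function.  The name fold_args and its two
positional parameters are hypothetical; the defaults mirror the 5000
chars and 1000 args mentioned earlier.

```shell
# fold_args [max_chars [max_args]]
# Hypothetical stand-in for fa, not the author's program.  Reads one
# name per line and writes lines that hold at most max_args names and
# at most max_chars characters.  A single name longer than max_chars
# still gets a line of its own.
fold_args() {
    awk -v max_chars="${1:-5000}" -v max_args="${2:-1000}" '
    {
        # Flush the pending line if adding this name would break a limit.
        if (n > 0 && (n + 1 > max_args ||
                      length(line) + 1 + length($0) > max_chars)) {
            print line
            line = ""
            n = 0
        }
        if (n == 0)
            line = $0
        else
            line = line " " $0
        n++
    }
    END { if (n > 0) print line }'
}

# Example: fold seven names into lines of at most three names each.
printf '%s\n' a b c d e f g | fold_args 5000 3
```

It would slot into the pipeline from the first post in place of fa:
find $DIR -print | fold_args | while read x; do command $x; done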
-- 
Larry McVoy	lm@arizona.edu or ...!{uwvax,sun}!arizona.edu!lm
		Use the force - read the source.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (02/13/88)

In article <3839@megaron.arizona.edu> lm@megaron.arizona.edu.UUCP (Larry McVoy) writes:
>I really want infinite space for args to exec but failing that I want
>something that bunches up args into suitable form for an exec.

This exists on most systems under the name "xargs" or "apply"
(different designs, but both will do what you want).  I agree
it is a useful facility and support its inclusion in IEEE 1003.2.

dmcanzi@watdcsu.waterloo.edu (David Canzi) (02/14/88)

In article <2091@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>
>     $ find $DIR -print | nroff | sed -e 's/  */ /g' | ...
>
>Since nroff right-justifies by default, we use sed to squeeze multiple
>blanks to a single blank so "while read" in sh will work.  (Not
>tested.)

Instead of nroff, I used the fmt command, like so:

find $DIR -print | fmt -1021 | ...

The fmt command may be a Berkeleyism.  1021 is the maximum width that
fmt allows me to specify on the two systems (BSD4.3 and Sun 3.2) where
I tried this.
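
A small sketch of the same idea with a narrow width so the folding is
easy to see.  It assumes an fmt that accepts the -w spelling of the
width option, which may not match the fmt on every system; the six
names are made up for illustration.

```shell
# Pack one-per-line names into lines of at most 20 columns.  fmt only
# breaks at blanks, so no name is split across lines.
printf '%s\n' alpha beta gamma delta epsilon zeta | fmt -w 20
```

Each output line can then be handed to the command as one batch of
arguments, as in the pipelines above.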

-- 
David Canzi

rbj@icst-cmr.arpa (Root Boy Jim) (02/17/88)

   From: Rahul Dhesi <dhesi@bsu-cs.uucp>
   Date: 12 Feb 88 15:14:46 GMT

   In article <120@gsg.UUCP> lew@gsg.UUCP (Paul Lew) writes:
   >     $ find $DIR -print | awk '{print "command",$0}' | sh	(flexible)
   >or:
   >     $ find $DIR -print | sed 's/^/command /' | sh		(fast)

The original author mentioned that xargs (the obvious and best)
solution is a System V solution. So get those tapes out and port it,
as well as cpio and maybe a few others. You are already licensed for
it if you are running BSD.  In any case, a PD version has been posted
to the source groups.

   Me, I prefer:

	$ find $DIR -exec command {} \;

   However, the original poster wanted to give the command as many
   arguments as possible, so it would be invoked less often, presumably
   because efficiency was important to him.  Something like the original
   fa, which folds argument lists into manageable size, is essential
   in that case.  In a pinch I would try something like this:

	$ find $DIR -print | nroff | sed -e 's/  */ /g' | ...

   Since nroff right-justifies by default, we use sed to squeeze multiple
   blanks to a single blank so "while read" in sh will work.  (Not
   tested.)

Now we're getting somewhere. Indeed, such an `fa' type program exists;
it is called `fmt'. So one might try:

	% find $DIR -print | fmt | sed 's/^/cmd /' | sh

It is a pity that fmt is hardwired with a line size of 72 columns. Fold,
which does believe in differing line sizes, chops things up without
regard for word boundaries. These two programs could easily be merged.

   Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

	(Root Boy) Jim Cottrell	<rbj@icst-cmr.arpa>
	National Bureau of Standards
	Flamer's Hotline: (301) 975-5688
OKAY!!  Turn on the sound ONLY for TRYNEL CARPETING,
 FULLY-EQUIPPED R.V.'S and FLOATATION SYSTEMS!!

lew@gsg.UUCP (Paul Lew) (02/17/88)

>	% find $DIR -print | fmt | sed 's/^/cmd /' | sh
>
> It is a pity that fmt is hardwired with a line size of 72 columns.

I was very surprised when dmcanzi@watdcsu.waterloo.edu (David Canzi) pointed
out you can specify a width to fmt:

	% find $DIR -print | fmt -1000 | sed 's/^/cmd /' | sh
				 ^^^^^
I checked the man pages for fmt over and over; guess what?  Another
undocumented feature!  I think you can only learn things like this by
reading network news or by looking into the source code.
-- 
Paul Lew			{oliveb,harvard,decvax}!gsg!lew	(UUCP)
General Systems Group, 5 Manor Parkway, Salem, NH 03079	(603) 893-1000

allbery@ncoast.UUCP (Brandon Allbery) (02/18/88)

As quoted from <2091@bsu-cs.UUCP> by dhesi@bsu-cs.UUCP (Rahul Dhesi):
+---------------
| In article <120@gsg.UUCP> lew@gsg.UUCP (Paul Lew) writes:
| >     $ find $DIR -print | awk '{print "command",$0}' | sh	(flexible)
| >or:
| >     $ find $DIR -print | sed 's/^/command /' | sh		(fast)
| 
| Me, I prefer:
| 
|      $ find $DIR -exec command {} \;
+---------------

Hmmm...   $ find $DIR -print | xargs command

+---------------
| in that case.  In a pinch I would try something like this:
| 
|      $ find $DIR -print | nroff | sed -e 's/  */ /g' | ...
| 
| Since nroff right-justifies by default, we use sed to squeeze multiple
| blanks to a single blank so "while read" in sh will work.  (Not
| tested.)
+---------------

The shell (sh, at least) isn't fazed by multiple spaces... at least, not the
AT&T version.  But if you're after efficiency, nroff is the LAST thing you
should use!  ;-)
-- 
	      Brandon S. Allbery, moderator of comp.sources.misc
       {well!hoptoad,uunet!hnsurg3,cbosgd,sun!mandrill}!ncoast!allbery
KABOOM!!! Worf: "I think I'm sick." LaForge: "I'm sure half the ship knows it."