[net.unix] Wildcard-specs

VERACSD@usc-isi.arpa (09/25/86)

Does UNIX support recursive directory-level matching?  i. e. Does it allow one
to wildcard-specify all files in a dir as well as all files in the dir's
subdirectories?  cf. the use of "**" in Symbolics' pathnames.  

If it does, how does one specify this? (If not, why not?)

(This question arose because I'm (obviously) learning UNIX and want to grep
all files in a dir's hierarchy.)

Thanks.

-- ck

lacasse@RAND-UNIX.arpa (09/26/86)

To grep all files in the directories immediately below you, use
    grep strings */*
For another level, use:
    grep strings */*/*

If the expanded file name list runs over the system limit (about 8K
characters on 4.2BSD, I think), you will have to break it up.  (You get the
message "argument list too long".)  The ** that Symbolics uses won't work here.

Some commands (e.g. rm, chgrp, chown, chmod [4.3 only]) have a -r flag
that does a recursive descent.

If you have all day, you can use:
    find . -exec grep string {} \;
which will do a recursive descent from where you are, but starts a lot
of new processes along the way.
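
A middle ground between the two approaches above is to let find produce the
names and have xargs hand them to grep in batches, so that only a few grep
processes are started and no single command line runs over the limit. What
follows is a minimal sketch, not a definitive recipe: the function name
rgrep_tree is hypothetical, and it assumes that xargs is installed and that no
file name contains whitespace or quote characters.

```shell
# Hypothetical helper: grep every regular file below a directory,
# batching the file names with xargs instead of running one grep per file.
rgrep_tree() {
    pattern=$1
    dir=${2:-.}
    # The extra /dev/null operand makes grep print the file name even
    # when xargs happens to pass it only a single file.
    # Assumption: file names contain no whitespace or quotes, which
    # plain xargs would split or misread.
    find "$dir" -type f -print | xargs grep -- "$pattern" /dev/null
}
```

Called as "rgrep_tree string ." it searches the current directory's hierarchy.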

      Mark LaCasse                  qantel!hplabs!sdcrdcf!randvax!lacasse
      c/o The Rand Corporation       cbosgd!ihnp4!sdcrdcf!randvax!lacasse
      1700 Main Street              lacasse@Rand-Unix
      Santa Monica, CA 90406
	213/393-0411  ext. 7420

hmc@hwee.UUCP (Hugh Conner) (10/08/86)

In article <4142@brl-smoke.ARPA> VERACSD@usc-isi.arpa writes:
>Does UNIX support recursive directory-level matching?  i. e. Does it allow one
>to wildcard-specify all files in a dir as well as all files in the dir's
>subdirectories?  cf. the use of "**" in Symbolics' pathnames.

We've had discussions about this at meetings of our Local user group. It is
one feature which we all agree is lacking. Unfortunately we cannot agree on a
neat, workable, solution. Suggestions include using ** as mentioned above,
making * match subdirectories as well (it does in cpio), among others. I'd
be interested to know if anyone has other solutions, and if anyone has actually
implemented it.
-- 
*+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*
+              "Who are all these people in my office anyway?"                +
+                                                                             +
+     Hugh M. Conner                                  hmc@ee.hw.ac.uk         +
*+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*

bob@its63b.ed.ac.uk (ERCF08 Bob Gray) (10/09/86)

In article <4162@brl-smoke.ARPA> lacasse@RAND-UNIX.arpa writes:
>
>If you have all day, you can use:
>    find . -exec grep string {} \;

You would be faster with
	grep string `find . -print`
provided that there weren't too many filenames. There is a
limit built into your system on the number of parameters you
can pass in an exec().

It has always seemed odd to me that a system with a
hierarchical file system didn't have some pattern which
recursively matched to sub directories and the files in
them. It would be very nice to do something like

	vi **.mk

to edit all the makefiles in a particular set of directories.

In most cases a pattern such as
	ls -l **
has the same meaning as
	ls -l `find . -print`

The problems begin when you attempt to make the "**"
wildcard as general as the other metacharacters. The pattern
"**x**" means "find any file or directory which has the
letter x in its name, and match either the filename or
ALL the files in the sub-tree if it is a directory."
For example,
	**xyz**
should match to
	abcxyz.x
	abc/abcxyz.y
	abc/def/xxyzz/a.c
	abc/def/xxyzz/b.c
	abc/def/xxyzz/d.c
	abc/def/xxyzz/e.c
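
The match list above can be approximated with find alone, assuming a find that
supports the -path primary. This is only a sketch of one reading of "**xyz**":
the function name starstar_contains is hypothetical, directories themselves
are left out of the output (only the files in them are listed), and the text
argument is assumed to contain no glob metacharacters.

```shell
# Hypothetical helper: list every regular file whose own name, or the
# name of any directory above it, contains the given text.
starstar_contains() {
    text=$1
    dir=${2:-.}
    # -path tests the whole path, and its "*" also matches "/", so files
    # below a matching directory are included. find visits each path
    # once, so there are no duplicates to sort out afterwards.
    # Assumption: $dir itself does not contain the text, otherwise
    # every file below it would match.
    find "$dir" -type f -path "*$text*" -print
}
```

Run as "starstar_contains xyz ." in the directory holding the tree above, it
prints each of the listed files once, with a leading "./".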

I started to implement "**" pattern matching in the Bourne
shell a few years ago. Getting it to match the first example
was fairly trivial. The more general "**xyz**" case caused
an explosion in the number of filename matches before
sorting and elimination of duplicates. It was very easy to
run out of space on the machine I was working on in those days.

If I dig the code out of the archives, what should happen in
a command like
	cp **.c x

where x is a directory? Should the shell quietly make new
directories, or should it issue separate cp commands for each
target directory and let cp complain if they don't exist?
	Bob Gray.
	ERCC.

meissner@dg_rtp.UUCP (Michael Meissner) (10/15/86)

In article <191@hwee.UUCP> hmc@hwee.uucp (Hugh Conner) writes:
> In article <4142@brl-smoke.ARPA> VERACSD@usc-isi.arpa writes:
>>Does UNIX support recursive directory-level matching?  i. e. Does it allow one
>>to wildcard-specify all files in a dir as well as all files in the dir's
>>subdirectories?  cf. the use of "**" in Symbolics' pathnames.
>
> We've had discussions about this at meetings of our Local user group. It is
> one feature which we all agree is lacking. Unfortunately we cannot agree on a
> neat, workable, solution. Suggestions include using ** as mentioned above,
> making * match subdirectories as well (it does in cpio), among others. I'd
> be interested to know if anyone has other solutions, and if anyone has actually
> implemented it.

   I use Data General's AOS/VS operating system, as well as our UNIX(es), and
this is an area where each system has some concepts that the other could
pick up.  In this case, AOS/VS has the following wildcard characters:

	+	Match any string except a directory separator (* in UNIX).
	-	Match any string that contains neither a directory separator
		nor a period.
	*	Match any single character except a directory separator or
		a period (somewhat like ? in UNIX).
	#	Match all files in subordinate directories.
	\	Omit a given pattern from being matched.
	:	Directory separator (/ in UNIX).

Thus, if I wanted to match all of the C files, except those that begin with
z, I would specify the pattern:

	cmd #:+.c\z+

Obviously if such a scheme were to be put into UNIX, the characters used
would have to be changed (\ in particular).  However, the # and the \ wild-
cards are VERY powerful.

    To get the same effect in UNIX as the wildcard above, I would have to do
something like:

	find . -type f -name '*.c' -print | egrep -v '/z[^/]*$' | xargs cmd
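
The exclusion can also be expressed inside find itself, which avoids the
separate filtering stage. This is a minimal sketch: the function name
c_files_except_z is hypothetical, and "cmd" stands for whatever command is to
receive the names.

```shell
# Hypothetical helper: print all *.c files below a directory except
# those whose file name starts with "z".
c_files_except_z() {
    dir=${1:-.}
    # -name tests only the last path component and takes a shell glob,
    # not a regular expression; "!" negates the test that follows it.
    find "$dir" -type f -name '*.c' ! -name 'z*' -print
}
```

The result can be passed along as before, e.g. "c_files_except_z . | xargs cmd".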

	Michael Meissner, Data General
	...{ decvax, ucbvax, inhp4 }!mcnc!rti-sel!dg_rtp!meissner

jso@edison.UUCP (John Owens) (10/16/86)

In article <77@its63b.ed.ac.uk>, bob@its63b.ed.ac.uk (ERCF08 Bob Gray) writes:
> If I dig the code out of the archives, what should happen in
> a command like
> 	cp **.c x
> 
> where x is a directory. Should the shell quietly make new
> directories or should it issue separate cp commands for each
> target directory, then let cp complain if they don't exist.

The pattern matching can't depend on the command; besides not fitting
the UNIX philosophy at all, it would be ridiculously difficult in the
code.  The above pattern match would have to expand to:
	cp a.c s1/stop.c s2/s3/foobar.c x
and cp would do what it does today, namely create x/a.c x/stop.c and
x/foobar.c .
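
The flattening described above is easy to check in a scratch directory. The
sketch below is only an illustration: the function name flatten_demo is
hypothetical, and the file names are the ones from the example command line.

```shell
# Hypothetical demo: build the three source files inside the given
# scratch directory, copy them into x, and list what x then contains.
# The body runs in a subshell so the caller's working directory is kept.
flatten_demo() (
    cd "$1" || exit 1
    mkdir -p s1 s2/s3 x
    echo 'a' > a.c
    echo 'stop' > s1/stop.c
    echo 'foobar' > s2/s3/foobar.c

    # This is what the shell would hand to cp after expanding the pattern.
    cp a.c s1/stop.c s2/s3/foobar.c x

    # s1 and s2/s3 are not recreated under x; the copies sit side by side.
    ls x
)
```

One consequence of this behaviour is that two matched files with the same
name in different directories would overwrite one another in x.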

John Owens		General Electric Company - Charlottesville, VA
jso@edison.GE.COM	old arpa: jso%edison.GE.COM@seismo.CSS.GOV
+1 804 978 5726		old uucp: {seismo,decuac,houxm,calma}!edison!jso