[comp.unix.questions] Descending directory hierarchy?

johnb@lakesys.lakesys.com (John C. Burant) (12/01/89)

I've been thinking about the find command on SysV's... the one that makes you
type find / -name [filename] -print to find files, and only finds files with
the exact filename specified... and I've been thinking about writing a program
that will act like the BSD find... it lists all files with the phrase you look
for in the filename... (like if I looked for t, I'd get a lot of files listed)

Which brings me to the question: Is there a way to descend the directory
hierarchy into every directory, or do I have to write a routine to do that?

Thanks.
-John



-- 
John C. Burant | johnb@lakesys.lakesys.com      | "Now don't you wish you
Glendale, WI   | johnb@lakesys.UUCP             | had someone with perfect
[.signature]   | ... uunet!marque!lakesys!johnb | pitch in YOUR band?"
--------------------------------------------------------------------------

coleman@cam.nist.gov (5672) (12/02/89)

In article <1372@lakesys.lakesys.com>, johnb@lakesys.lakesys.com (John C. Burant) writes:
> I've been thinking about the find command on SysV's... the one that makes you
> type find / -name [filename] -print to find files, and only finds files with
> the exact filename specified... and I've been thinking about writing a program
> that will act like the BSD find... it lists all files with the phrase you look
> for in the filename... (like if I looked for t, I'd get a lot of files listed)
> 
> Which brings me to the question: Is there a way to descend the directory
> hierarchy into every directory, or do I have to write a routine to do that?

I suggest the library function ftw(3). There is one bug that I
encountered on a Sun running 4.3BSD: ftw(3) won't handle symbolic links,
so I had to prepare a version of ftw() that does. You may not have any
problems with it if you don't have symbolic links.
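
For readers on a system that has the later POSIX nftw() interface (an
assumption; it is not the ftw() described above, and it is not the
modified version mentioned in this post), one way to cope with symbolic
links is to ask the walker to report them instead of following them.
The sketch below does that with the FTW_PHYS flag. The names
count_visit() and walk_no_follow() and the three counters are made up
for illustration.

```c
#define _XOPEN_SOURCE 700   /* must come before any #include to expose nftw() */
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* Counters filled in by the callback.  They live at file scope because
 * nftw() has no user-data argument to pass them through. */
static long n_files, n_dirs, n_symlinks;

/* Called by nftw() once for every object it finds. */
static int count_visit(const char *path, const struct stat *sb,
                       int type, struct FTW *info)
{
    (void)sb;
    (void)info;
    switch (type) {
    case FTW_F:                 /* regular file (or other non-directory) */
        n_files++;
        break;
    case FTW_D:                 /* directory */
        n_dirs++;
        break;
    case FTW_SL:                /* symbolic link: reported, not followed */
        n_symlinks++;
        printf("symlink: %s\n", path);
        break;
    default:                    /* unreadable directory, failed stat(), ... */
        break;
    }
    return 0;                   /* zero tells nftw() to keep walking */
}

/* Walk the tree under root without following symbolic links.
 * Returns 0 on success, -1 if nftw() reported an error. */
int walk_no_follow(const char *root)
{
    n_files = n_dirs = n_symlinks = 0;
    return nftw(root, count_visit, 16, FTW_PHYS) == 0 ? 0 : -1;
}
```

Calling walk_no_follow("/usr/local") and then printing the three
counters gives a quick census of a tree. Because links are never
followed, a link that points back up the tree cannot send the walk
around in circles.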

Sean Coleman
NIST
coleman@bldr.nist.gov

rae98@wash08.uucp (Robert A. Earl) (12/02/89)

In article <1372@lakesys.lakesys.com> johnb@lakesys.lakesys.com (John C. Burant) writes:
>I've been thinking about the find command on SysV's... the one that makes you
>type find / -name [filename] -print to find files, and only finds files with
>the exact filename specified... and I've been thinking about writing a program
>that will act like the BSD find... it lists all files with the phrase you look
>for in the filename... (like if I looked for t, I'd get a lot of files listed)
>
>Thanks.
>-John

At least on my SVR2 (NCR Tower 32/[468]X0), you can do this with:
find /usr01/rae98 -name "*unix*" -print
which gives:
/usr01/rae98/News/comp/unix
/usr01/rae98/News/comp/sources/unix

Which probably isn't the best way to show what I meant, but you get the idea...

-- 
===========================================================
Sorry....no signatures available today...please try later.

cpcahil@virtech.uucp (Conor P. Cahill) (12/02/89)

In article <1372@lakesys.lakesys.com>, johnb@lakesys.lakesys.com (John C. Burant) writes:
> I've been thinking about the find command on SysV's... the one that makes you
> type find / -name [filename] -print to find files, and only finds files with
> the exact filename specified... and I've been thinking about writing a program
> that will act like the BSD find... it lists all files with the phrase you look
> for in the filename... (like if I looked for t, I'd get a lot of files listed)
> 
> Which brings me to the question: Is there a way to descend the directory
> hierarchy into every directory, or do I have to write a routine to do that?

First of all, the -name parameter to find acts the same way in both System V
and BSD.  In either system, if you did a "find / -name t -print" you
would only find files that are named 't', not files that have a 't' in their
names.

To search for a file with a 't' in the name, you need to do the following:

	find / -name "*t*" -print

Again, this is the same for both System V and BSD.
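
To make the difference between "t" and "*t*" concrete, here is a small
sketch of the matching rule, assuming a system that provides the POSIX
fnmatch(3) routine. The helper name_matches() is invented for this
example and is not part of find. It compares a shell pattern against
only the last component of a pathname, which is what -name does.

```c
#include <fnmatch.h>
#include <string.h>

/* Return 1 if the last component of path matches the shell pattern,
 * 0 otherwise.  Only the part after the final '/' is examined. */
int name_matches(const char *pattern, const char *path)
{
    const char *base = strrchr(path, '/');

    base = (base != NULL) ? base + 1 : path;
    return fnmatch(pattern, base, 0) == 0;
}
```

With this helper, name_matches("t", "/tmp/test") is 0 because the whole
name has to match the pattern, while name_matches("*t*", "/tmp/test")
is 1.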


Second, System V has an ftw(3) "File Tree Walk" routine that can be used 
to descend the entire directory hierarchy.
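
A minimal sketch of how ftw() might be used for the original question
follows. It assumes the usual three-argument callback form; the names
visit_name() and find_substring() are invented here, and the depth
value of 16 is an arbitrary choice. It prints every pathname whose last
component contains a given substring.

```c
#include <ftw.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* ftw() has no user-data argument, so the search string and the
 * running count are kept at file scope. */
static const char *needle;
static long n_matches;

/* Called by ftw() for each object; prints paths whose final
 * component contains the search string. */
static int visit_name(const char *path, const struct stat *sb, int type)
{
    const char *base = strrchr(path, '/');

    (void)sb;
    (void)type;                 /* match files and directories alike */
    base = (base != NULL) ? base + 1 : path;
    if (strstr(base, needle) != NULL) {
        puts(path);
        n_matches++;
    }
    return 0;                   /* zero tells ftw() to keep walking */
}

/* Walk the tree under root and return the number of matching names,
 * or -1 if ftw() failed. */
long find_substring(const char *root, const char *substring)
{
    needle = substring;
    n_matches = 0;
    if (ftw(root, visit_name, 16) != 0)
        return -1;
    return n_matches;
}
```

A call such as find_substring("/usr", "unix") behaves roughly like
find /usr -name "*unix*" -print, with the same cost of visiting every
directory under the starting point.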




-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      	703-430-9247	|
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+

chris@mimsy.umd.edu (Chris Torek) (12/03/89)

>In article <1372@lakesys.lakesys.com> johnb@lakesys.lakesys.com
>(John C. Burant) writes:
>>... I've been thinking about writing a program [for SysV] that will act
>>like the BSD find... it lists all files with the phrase you look for in
>>the filename...

In article <1989Dec2.135436.22689@virtech.uucp> cpcahil@virtech.uucp
(Conor P. Cahill) writes:
>First of all the -name parameter to find acts the same way in both system V
>and in BSD.

John Burant was probably referring to the Ames Fast File Finder, invoked
on Tahoe-or-later BSD systems as

	find foo

which is much like

	find / -name '*foo*' -print

but is *MUCH* faster (I mean really stupendously faster; you have to pipe
this through `more' to have a chance of reading it).  It works by reading
a database (rebuilt at the system administrator's discretion, typically
once each Sunday morning), so it can only get those files listed in the
database (typically those whose name can be found by the user `nobody').

To quote the find manual:

     The second form rapidly searches a database for all path-
     names which match pattern.  Usually the database is recom-
     puted weekly and contains the pathnames of all files which
     are publicly accessible.  If escaped, normal shell "glob-
     bing" characters (`*', `?', `[', and ']') may be used in
     pattern, but the matching differs in that no characters
     (e.g. `/') have to be matched explicitly.	As a special
     case, a simple pattern containing no globbing characters is
     matched as though it were *pattern*; if any globbing charac-
     ter appears there are no implicit globbing characters.
	.
	.
	.
BUGS
     The first form's syntax is painful, and the second form's
     exact semantics is confusing and can vary from site to site.

[`semantics is'??]
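
The matching rule in the manual excerpt above can be sketched in a few
lines, assuming POSIX fnmatch(3) is available. This is only an
illustration of the quoted semantics, not the Ames code: it assumes the
database is a plain text file with one pathname per line, and the name
search_list() is made up.

```c
#include <fnmatch.h>
#include <stdio.h>
#include <string.h>

/* Search a list of pathnames (one per line) for those matching pattern
 * and write them to out.  Returns the number of matches, or -1 if the
 * pattern is too long. */
long search_list(FILE *list, const char *pattern, FILE *out)
{
    char line[4096], wrapped[1024];
    long hits = 0;

    /* A simple pattern with no globbing characters is matched as
     * though it were *pattern*. */
    if (strpbrk(pattern, "*?[]") == NULL) {
        int n = snprintf(wrapped, sizeof wrapped, "*%s*", pattern);

        if (n < 0 || (size_t)n >= sizeof wrapped)
            return -1;
        pattern = wrapped;
    }
    while (fgets(line, sizeof line, list) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* strip the newline */
        /* Flags of 0 (no FNM_PATHNAME) let '*' match '/', so no
         * character has to be matched explicitly. */
        if (fnmatch(pattern, line, 0) == 0) {
            fprintf(out, "%s\n", line);
            hits++;
        }
    }
    return hits;
}
```

Since the whole pathname is matched rather than the last component,
search_list(db, "unix", stdout) reports any path with "unix" anywhere
in it. No directories are read at search time, which is where the speed
comes from, and also why the answers are only as fresh as the list.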
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

guy@auspex.auspex.com (Guy Harris) (12/07/89)

>I've been thinking about the find command on SysV's... the one that makes you
>type find / -name [filename] -print to find files, and only finds files with
>the exact filename specified...

No, it finds files whose names match the *pattern* specified, e.g.

	find / -name '*.c' -print

will show all the files with names that end in ".c".

>and I've been thinking about writing a program that will act like the
>BSD find... it lists all files with the phrase you look for in the
>filename... (like if I looked for t, I'd get a lot of files listed)

The BSD "find" is really two programs jammed into one executable file:

	1) the standard UNIX "find", which exists on BSD, S5, etc.,
	   etc., and recursively walks a directory tree;

	2) the "fast find", which searches a database built, generally,
	   by a "cron" job that runs at night.

You're thinking of the second program.  Unfortunately, since they *were*
jammed into one executable (and source) file, and since the first
program is derived from AT&T-licensed code, you can't get the source to
the second program without an AT&T source license, unless they've either
1) replaced the first program with an "AT&T-free" version or 2) split
them into two files.  I really wish they'd at least do the latter....

>Which brings me to the question: Is there a way to descend the directory 
>hierarchy into every directory, or do I have to write a routine to do that?

The "fast find" doesn't descend the directory hierarchy; it just zips
through a database built from a script that runs the conventional "find"
(the first program).  If you wrote a version that worked like the "fast
find" but that descended the directory hierarchy, it would act just like

	find / -name <pattern> -print

which is 1) a lot easier to type than the source code to a program that
duplicates this functionality and 2) a lot slower than "fast find".

(To be precise, quoting the comment at the front of the 4.3BSD "find.c":

 *              The second form searches a pre-computed filelist
 *              (constructed nightly by /usr/lib/crontab) which is
 *              compressed by updatedb (v.i.z.)  The effect of
 *                      find <name>
 *              is similar to
 *                      find / +0 -name "*<name>*" -print
 *              but much faster.
       
Note that the "fast find" doesn't handle files with names containing
characters with the 8th bit set, according to a comment I remember its
author, James A. Woods, making.  Yes, such files exist; at one point I
had a symbolic link to "/vmunix" named "/UNIX(R)", where "(R)" was the
ISO Latin #1 "registered trademark" symbol.  I could easily believe that
users in countries whose native language isn't English - especially
non-computer-weenie users - would give files names in their native
language, including accented letters from ISO Latin #1.)