[comp.unix.wizards] Problem with find

roston@robotics.jpl.nasa.gov (Gerry Roston) (09/23/88)

I am having a problem with find on various Suns (3/260, 3/60) running
Sun OS (3.5, 4.0).  The problem is as follows:  

One really nice feature of find is to say "find foo".  This does
a search on the find database to find that file.  This is blazingly fast
compared to doing a normal find.  To keep this find database up to date,
the script /usr/lib/find/updatedb is run once a week.  We have just
started running news, and I do not want to include the news articles
in the database (who would ever want to find 123?).

Ideally, I want to do the following:
    find / -name news/spool -prune -o -print ...
however, this does not work.  My current solution is to do
    find / -name news -prune -o -print ...
which has the effect of skipping ALL directories named news, and
all of their subdirectories.

Does anyone have any ideas how I can simply skip news/spool?

Thanx in advance.


gerry roston, robotic systems research group
jet propulsion laboratory, 4800 oak grove drive, m/s 23
pasadena, california, 91109
(818) 354-9124  (818) 354-6508

chris@mimsy.UUCP (Chris Torek) (09/23/88)

In article <108@forsight.Jpl.Nasa.Gov> roston@robotics.jpl.nasa.gov
(Gerry Roston) writes:
>Ideally, I want to do the following:
>    find / -name news/spool -prune -o -print ...
>however, this does not work.

`find' only looks at one component of the path name at a time, so
there is no way to exclude a particular sub-path directly.  You could
use

	-exec expr {} : '.*/spool/news$' \;

or some variant.  Alas, this requires one fork()/exec() per file name
traversed.  You could try to reduce the cost by running this only on
`likely' candidates (-type d -name news, for instance).
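
A rough sketch of that restriction, run against a scratch tree rather
than /.  It assumes a find with -prune (as the original poster's has);
the directory layout and variable names are invented for illustration,
and expr is wrapped in sh -c only so that its printed match length can
be thrown away and just its exit status used.

```shell
# Throwaway tree with two directories named "news"; only the one
# under spool/ should be pruned.
top=$(mktemp -d)
mkdir -p "$top/usr/spool/news/comp" "$top/home/news"
touch "$top/usr/spool/news/comp/123" "$top/home/news/notes"

# expr runs only for directories named "news".  It exits true when the
# whole path ends in /spool/news, and only then does -prune fire.
# Everything else falls through to -print.
listing=$(find "$top" -type d -name news \
    -exec sh -c 'expr "$1" : ".*/spool/news\$" >/dev/null' sh {} \; \
    -prune -o -print)

echo "$listing"
rm -rf "$top"
```

The fork()/exec() cost is now paid once per directory named news
instead of once per file.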
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

dhesi@bsu-cs.UUCP (Rahul Dhesi) (09/23/88)

In article <108@forsight.Jpl.Nasa.Gov> roston@robotics.jpl.nasa.gov (Gerry
Roston) writes:
[ on using find ]

>Does anyone have any ideas how I can simply skip news/spool?

Do I understand correctly that you want "find" to list all files except
those inside news/spool?

If so, just use a pipeline like this:

    find / -print | grep -v '^/usr/spool/news/spool.*'

My "stuff" utility for MS-DOS, which implements a tiny subset of
"find", looks for a match of the entire current pathname if the
pattern supplied after -name contains any slashes.  I find this much
more useful than only matching the filename part and recommend to those
helping UNIX evolve that future versions of "find" do the same.  (But
please be upward compatible.)
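
For what it is worth, whole-path matching of this kind does exist in
later finds: POSIX and GNU find have a -path primary that matches its
pattern against the entire path name.  A minimal sketch on a scratch
tree, assuming a find with both -path and -prune (the SunOS find
discussed here may well lack -path); the layout is invented:

```shell
top=$(mktemp -d)
mkdir -p "$top/usr/spool/news/comp" "$top/usr/lib/news"
touch "$top/usr/spool/news/comp/123" "$top/usr/lib/news/active"

# -path sees the whole path name, slashes included, so only the
# spool/news directory is pruned; usr/lib/news is still listed.
kept=$(find "$top" -path '*/usr/spool/news' -prune -o -print)

echo "$kept"
rm -rf "$top"
```

Unlike a grep -v filter, this never descends into the excluded tree.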
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

leo@philmds.UUCP (Leo de Wit) (09/23/88)

In article <108@forsight.Jpl.Nasa.Gov> roston@robotics.jpl.nasa.gov (Gerry Roston) writes:
    [ ]...
>Ideally, I want to do the following:
>    find / -name news/spool -prune -o -print ...
>however, this does not work.  My current solution is to do
>    find / -name news -prune -o -print ...
>which has the effect of skipping ALL directories named news, and
>all of their subdirectories.
>
>Does anyone have any ideas how I can simply skip news/spool?

Use the inode number of the directory instead of the name. This should be
unique. Assuming you use the Bourne shell:

    set `ls -di /news/spool`
    find / -inum $1 -prune -o -print ...

The find(1) here has no -prune option, but judging from your
description of it this should work.
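
A small sketch of the same idea on a scratch tree, assuming a find that
has both -inum and -prune.  The paths are made up, and everything lives
in one temporary directory (so on one filesystem), where an inode
number does identify a single directory.

```shell
top=$(mktemp -d)
mkdir -p "$top/news/spool/comp" "$top/other/news"
touch "$top/news/spool/comp/123" "$top/other/news/keep"

# ls -di prints "<inode> <path>"; set leaves the inode number in $1.
set -- $(ls -di "$top/news/spool")
inum=$1

# Prune whatever has that inode number; print everything else.
kept=$(find "$top" -inum "$inum" -prune -o -print)

echo "$kept"
rm -rf "$top"
```

The other directory named news is left alone, since the test is on the
inode and not on the name.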

        Leo.

nick@ccicpg.UUCP (Nick Crossley) (09/24/88)

In article <108@forsight.Jpl.Nasa.Gov> roston@robotics.jpl.nasa.gov (Gerry Roston) writes:
>Ideally, I want to do the following:
>    find / -name news/spool -prune -o -print ...
>however, this does not work.  My current solution is to do
>    find / -name news -prune -o -print ...
>which has the effect of skipping ALL directories named news, and
>all of their subdirectories.
>
>Does anyone have any ideas how I can simply skip news/spool?

Whenever I want to do something like this, I just chmod 000 the directories
to be ignored.  This has the effect of making find skip those directories
with no error messages (at least on the systems I have used).

Of course, this might not be quite so easy if you need to allow access to
those directories from other processes running at the same time, as might well
be the case with news!

Also, don't forget to change the mode back afterwards ... :-)

-- 

<<< standard disclaimers >>>
Nick Crossley, CCI, 9801 Muirlands, Irvine, CA 92718-2521, USA
Tel. (714) 458-7282,  uucp: ...!uunet!ccicpg!nick

cbp@foster.avid.OZ (Cameron Paine) (09/26/88)

In article <108@forsight.Jpl.Nasa.Gov>, roston@robotics.jpl.nasa.gov (Gerry Roston) writes:
> I am having a problem with find on various Suns (3/260, 3/60) running
> Sun OS (3.5, 4.0).  The problem is as follows:  

Maybe I'm missing something cos I (sob) have never used Suns...

> Does anyone have any ideas how I can simply skip news/spool?

Have you considered:

	find / -print | egrep -v '^/usr/spool/news/.*$'

Of course, the regexp can be changed to suit the situation. You can even
'|' expressions together to exclude more than one tree in a single pass.
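
As an illustration of the '|' form, here is a sketch that filters a
canned list instead of a real "find / -print".  The path names are
invented, and grep -E is used as the POSIX spelling of egrep.

```shell
# Stand-in for the output of "find / -print".
paths='/usr/spool/news/comp/unix/123
/usr/spool/newsletter
/usr/tmp/scratch
/usr/lib/find/updatedb'

# One pass, two excluded trees.  The slash after each directory name
# keeps /usr/spool/newsletter from being dropped by accident.
kept=$(printf '%s\n' "$paths" | grep -E -v '^/usr/spool/news/|^/usr/tmp/')

echo "$kept"
```

With these patterns the directories /usr/spool/news and /usr/tmp
themselves would still be listed; only their contents are removed.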

If this won't do what is wanted, would somebody please enlighten an
ignorant soul (one at a time, please :-).

cbp

ps: this works a treat when cpio-ing from root into a mounted
subdirectory.
-- 

cbp@foster.avid.oz - {ACS,CS}net
cbp%foster.oz.au@uunet.uu.net - ARPAnet
...!{hplabs,mcvax,nttlab,ukc,uunet}!munnari!foster.oz.au!cbp - UUCP

lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) (09/27/88)

In article <815@philmds.UUCP> leo@philmds.UUCP (Leo de Wit) writes:
: Use the inode number of the directory instead of the name. This should be
: unique. Assuming you use the Bourne shell:
: 
:     set `ls -di /news/spool`
:     find / -inum $1 -prune -o -print ...

We will remind the listeners that an inode number by itself is not necessarily
unique.  A file on a different device could easily have the same inode number.
If that happens to be a directory...

Since find doesn't have a -dnum option (that I know of), you could decrease
the odds of a false positive by checking both the inode and the name.

     find / \( -name 'news' -inum $1 -prune \) -o -print ...

Larry Wall
lwall@jpl-devvax.jpl.nasa.gov

leo@philmds.UUCP (Leo de Wit) (09/30/88)

In article <2935@jpl-devvax.JPL.NASA.GOV> lwall@jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>In article <815@philmds.UUCP> leo@philmds.UUCP (Leo de Wit) writes:
>: Use the inode number of the directory instead of the name. This should be
>: unique. Assuming you use the Bourne shell:
>: 
>:     set `ls -di /news/spool`
>:     find / -inum $1 -prune -o -print ...
>
>We will remind the listeners that an inode number by itself is not necessarily
>unique.  A file on a different device could easily have the same inode number.
>If that happens to be a directory...

Yes, my mistake. This can perhaps be circumvented by pruning the
directories that are in fact filesystems, doing a find in each
filesystem and only doing the -inum $1 -prune in the filesystem
containing the spool directory, so we have

    find $root -inum 2 -prune -o -print

if spool is not contained in the tree (isn't inum == 2 <==> filesystem ?),
and

    find $root -inum 2 -prune -o -inum $1 -prune -o -print

if spool IS contained in the tree. A new problem is that this already
fails when $root is a filesystem (for $root is already pruned then), so
we have to start the finds in each subdirectory of the filesystems. The
filesystem that contains the spool dir can be found by looking into
/etc/fstab (by hand or automatically).
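
Where find can be told to stay on one filesystem, the inum-2 pruning
is not needed.  POSIX calls that option -xdev; whether a given find has
it is an assumption here, and the sketch below runs on an invented
scratch tree, so it only shows the shape of the command for the
filesystem that contains the spool directory.

```shell
top=$(mktemp -d)
mkdir -p "$top/spool/news/comp" "$top/lib"
touch "$top/spool/news/comp/123" "$top/lib/keep"

set -- $(ls -di "$top/spool/news")
inum=$1

# -xdev keeps find on the filesystem of its starting point, so the
# inode number cannot collide with a directory on another device.
kept=$(find "$top" -xdev -inum "$inum" -prune -o -print)

echo "$kept"
rm -rf "$top"
```

The remaining filesystems would each get a plain
"find <mount point> -xdev -print" with no -inum test at all.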

Larry Wall also mentioned the absence of a -dnum option. It puzzles me why
this option is left out, because nearly all members of the stat struct can
be specified with find. Why not this one?

Someone else offered a solution that removes the spool directories by
piping to egrep -v. I think this is a portable solution; however, it
does not take advantage of -prune for those who have it, which will
speed up find a lot if the directory pruned is the top of a large
tree.

        Leo.


P.S. I leave the solution of the original problem using -prune and
-inum in the way outlined above as a puzzle, that is: find a nice
implementation. Any volunteers?

    find /idea -type solution -user yourself -ok cat {} \; |
    Pnews -h 'Re: Problem with find(1)'

domo@riddle.UUCP (Dominic Dunlop) (10/10/88)

In article <108@forsight.Jpl.Nasa.Gov> roston@robotics.jpl.nasa.gov (Gerry Roston) writes:
>I am having a problem with find on various Suns (3/260, 3/60) running
>Sun OS (3.5, 4.0).  The problem is as follows:  
> [Stuff about ways of speeding up find]
>Does anyone have any ideas how I can simply skip news/spool?

Well, if you had System V, release 3, you could use the -mount option to
stop find from crossing file-system mount points in its search.  But -mount
isn't even in the SVID.  Maybe Sun'll implement it in System V release 4...
It's certainly not recognised by the Sun 3 we have here.

Ain't compatibility great?
-- 
Dominic Dunlop
domo@sphinx.co.uk  domo@riddle.uucp

paul@ppgbms (Paul Matz) (10/13/88)

In article <922@riddle.UUCP>, domo@riddle.UUCP (Dominic Dunlop) writes:
> In article <108@forsight.Jpl.Nasa.Gov> roston@robotics.jpl.nasa.gov (Gerry Roston) writes:
> >I am having a problem with find on various Suns (3/260, 3/60) running
> >Sun OS (3.5, 4.0).  The problem is as follows:  
> > [Stuff about ways of speeding up find]
> >Does anyone have any ideas how I can simply skip news/spool?
> 
One way to skip certain subdirectories is to specify where find is to
begin its search in the directory tree.  Doing a:

	find / ...

will start at the root, and look at everything.  Giving a more exact
position in the tree to start will cause it to look starting at that
level and below, and may speed it up a bit.  Also, wildcard file 
specifications don't work the way you might expect with the "-name"
option.  Generally, you need to use double quotes around the wild-card spec.
Finally, the "-print" option must be last, or at least after other options 
such as "-newer", "-ctime", etc.
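
To make the quoting point concrete, a small sketch on a scratch
directory (the file names are invented):

```shell
top=$(mktemp -d)
mkdir -p "$top/src"
touch "$top/a.c" "$top/src/b.c" "$top/src/b.h"
cd "$top" || exit 1

# Quoted: the pattern reaches find intact and is matched against each
# file name during the walk.  -print comes last, after the tests.
found=$(find . -name "*.c" -print | sort)

echo "$found"
cd / && rm -rf "$top"
```

Without the quotes the shell would expand *.c against the current
directory first, so find would be handed "-name a.c" and would never
report src/b.c.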

The other thing that came to mind was the use of "whereis".  That
command searches for source, binary, or manual sections, and it is
pretty fast.

Hope that helps.

Paul Matz  PPG Biomedical Systems  ppgbms!paul