flee@guardian.cs.psu.edu (Felix Lee) (10/10/90)
>I squirreled away a little program called 'descend' that does the >moral equivalent of a 'find . -print', except rather fast. "descend" is fast because it recognizes leaf directories and avoids stat()ing the files in that directory. This is usually a big win, since most files tend to be in leaf directories. "find" can't do this in general, since most of its predicates require stat()ing each file, but it wouldn't be too hard to add lazy stat()ing to find. And it may even be worth it. -- Felix Lee flee@cs.psu.edu
chris@mimsy.umd.edu (Chris Torek) (10/10/90)
In article <F6.rhxm2@cs.psu.edu> flee@guardian.cs.psu.edu (Felix Lee) writes: >"descend" is fast because it recognizes leaf directories and avoids >stat()ing the files in that directory. This is usually a big win, >since most files tend to be in leaf directories. > >"find" can't do this in general, since most of its predicates require >stat()ing each file, but it wouldn't be too hard to add lazy stat()ing >to find. And it may even be worth it. Quite likely. 4.3BSD-reno's `find' reimplementation (a redistributable version) uses the C library `fts' routines (from POSIX) which take flags indicating whether `stat's are desired. If you say `no', it stats only when this is required to search the directory. The new find sets the `I need stat information' only when at least one predicate requires it. The result: % cd /usr/src/local/games % time find . name obj ./umoria/obj 0.3u 0.8s 0:01 91% 35+56k 0+0io 2pf+0w % time find . type l ./umoria/obj 0.5u 5.2s 0:09 63% 42+66k 94+1io 3pf+0w % It makes a pretty big difference. If find said `no stat' and did the stats only when some predicate actually required it, that would help make things like find . \( name foo or name bar \) and type l run much faster. Uncommon? Not really: find / \( fstype local or prune \) \( \ \( name '[#,]*' atime +1 \) \ or \( \( name '*.bak' or name '.*.bak' \) atime +3 \) \ or \( name '.emacs_[0-9]*' atime +7 \) \ or \( name core \) \ \) print exec /bin/rm -f {} \; Incidentally, yes, find still accepts the old syntax (find . -name ...). -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750) Domain: chris@cs.umd.edu Path: uunet!mimsy!chris
pt@geovision.uucp (Paul Tomblin) (10/10/90)
flee@guardian.cs.psu.edu (Felix Lee) writes: >"descend" is fast because it recognizes leaf directories and avoids >stat()ing the files in that directory. This is usually a big win, >since most files tend to be in leaf directories. >"find" can't do this in general, since most of its predicates require >stat()ing each file, but it wouldn't be too hard to add lazy stat()ing >to find. And it may even be worth it. What happened to "ftw", that was discussed at Usenix a few years ago? Is it getting into unix distributions? Are programs like "find" and "ls -R" being re-implemented using it, like they're supposed to? It sure sounded like it was a cure for what ails find when it was described. -- Paul Tomblin, Department of Redundancy Department. ! My employer probably I'm not fat..... I'm metabolically challenged. ! does not agree with my I'm not underpaid... I'm financially challenged. ! opinions.... nrcaer!cognos!geovision!pt or uunet!geovision!pt ! Me neither.
bhoughto@cmdnfs.intel.com (Blair P. Houghton) (10/11/90)
In article <1240@geovision.UUCP> pt@geovision.uucp (Paul Tomblin) writes: > >What happened to "ftw", that was discussed at Usenix a few years ago? Someone mentioned it several days ago, but you know the net... It exists under Ultrix 3.1, so it's been around for a while. --Blair "Gnu?"
flee@dictionopolis.cs.psu.edu (Felix Lee) (10/12/90)
>What happened to "ftw", that was discussed at Usenix a few years ago?
ftw() as spec'd in System V.3 and SunOS 4.1 doesn't special-case leaf
directories, so you can't implement lazy stat()ing.
--
Felix Lee flee@cs.psu.edu
mercer@npdiss1.StPaul.NCR.COM (Dan Mercer) (10/16/90)
In article <26919@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes: :In article <F6.rhxm2@cs.psu.edu> flee@guardian.cs.psu.edu (Felix Lee) writes: :>"descend" is fast because it recognizes leaf directories and avoids :>stat()ing the files in that directory. This is usually a big win, :>since most files tend to be in leaf directories. :> :>"find" can't do this in general, since most of its predicates require :>stat()ing each file, but it wouldn't be too hard to add lazy stat()ing :>to find. And it may even be worth it. : :Quite likely. : :4.3BSD-reno's `find' reimplementation (a redistributable version) uses :the C library `fts' routines (from POSIX) which take flags indicating :whether `stat's are desired. If you say `no', it stats only when this :is required to search the directory. The new find sets the `I need :stat information' only when at least one predicate requires it. The :result: Maybe I'm dense, but how does find know whether a file is a directory (and should be descended) or another type of file without stat()ing the file? -- Dan Mercer Reply-To: mercer@npdiss1.StPaul.NCR.COM (Dan Mercer) "MAN - the only one word oxymoron in the English Language"
mercer@npdiss1.StPaul.NCR.COM (Dan Mercer) (10/16/90)
In article <F&gya5o2@cs.psu.edu> flee@dictionopolis.cs.psu.edu (Felix Lee) writes:
:>What happened to "ftw", that was discussed at Usenix a few years ago?
:
:ftw() as spec'd in System V.3 and SunOS 4.1 doesn't special-case leaf
:directories, so you can't implement lazy stat()ing.
:--
:Felix Lee flee@cs.psu.edu
Sorry, missed the beginning of this thread - what is lazy stat()ing?
--
Dan Mercer
Reply-To: mercer@npdiss1.StPaul.NCR.COM (Dan Mercer)
"MAN - the only one word oxymoron in the English Language"
dattier@ddsw1.MCS.COM (David W. Tamkin) (10/19/90)
mercer@npdiss1.StPaul.NCR.COM (Dan Mercer) wrote in <638@npdiss1.StPaul.NCR.COM>: | Maybe I'm dense, but how does find know whether a file is a directory | (and should be descended) or another type of file without stat()ing the | file? I think the idea is that after find does stat() the file and sees that it is a directory with only two links, it shouldn't bother stat()ting the entries in that directory because none of them will be subdirectories. There will be only one more level to descend. David Tamkin Box 7002 Des Plaines IL 60018-7002 708 518 6769 312 693 0591 MCI Mail: 426-1818 GEnie: D.W.TAMKIN CIS: 73720,1570 dattier@ddsw1.mcs.com
kimcm@diku.dk (Kim Christian Madsen) (10/22/90)
dattier@ddsw1.MCS.COM (David W. Tamkin) writes:
=>I think the idea is that after find does stat() the file and sees that it is
=>a directory with only two links, it shouldn't bother stat()ting the entries
=>in that directory because none of them will be subdirectories. There will
=>be only one more level to descend.
If some nasty user hasn't removed one of the "." and/or ".." entry and
created directories to replace them.
Kim Chr. Madsen