[comp.sys.amiga] treewalk beta available on ucbvax

mwm@mica.berkeley.edu (Mike (I'll think of something yet) Meyer) (08/02/89)
[please excuse the crossposting, but this is two announcements in one,
and I think one belongs on each...]

I've just put twbeta.zoo on ucbvax.berkeley.edu, in pub/amiga. This
contains the beta test version of both the treewalk subroutine and the
treewalk command. Since the subroutine is a general purpose routine
that the command drags out into the cli, I'll describe the command
first.

The only short description of the treewalk command is that it's a
second-generation find. Or if you don't speak unix, but do have
lattice, it's a second-generation files. You point it at a directory,
and then for every file in the subtree rooted at that directory, it
evaluates a user-provided C-like expression using elements from that
file's FileInfoBlock structure (among other things); if the expression
evaluates to true, it passes the file's full name to the second phase.
The expression is optional, with a default value of "true". The second
phase is to either print the file's name, or invoke a user-provided
command with the file names as arguments. Unlike find, treewalk tries
to shove as many arguments onto the command line as will fit (which
can, of course, be disabled for brain-damaged commands).
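
To make the argument-packing idea concrete, here is a minimal sketch in
portable C. It is not the code in twbeta.zoo: the function name
pack_args, its signature, and the buffer size used in the usage note
are all assumptions made up for illustration. It does no quoting, so
names containing spaces would need extra handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from treewalk itself.
 * Writes `cmd` followed by as many of names[start..count) as fit into
 * `out` (capacity `cap`, including the terminating NUL), separated by
 * single spaces. Returns the index of the first name NOT packed, so
 * the caller can run the command and call again for the next batch.
 * If the return value equals `start`, not even one name fit and the
 * caller has to treat that as an error.
 */
size_t pack_args(const char *cmd, const char *const names[], size_t count,
                 size_t start, char *out, size_t cap)
{
    size_t len = strlen(cmd);
    size_t i = start;

    assert(len < cap);                  /* the command itself must fit */
    memcpy(out, cmd, len + 1);

    while (i < count) {
        size_t need = strlen(names[i]) + 1;   /* one space plus the name */

        if (len + need >= cap)                /* leave room for the NUL */
            break;
        out[len] = ' ';
        memcpy(out + len + 1, names[i], need); /* copies the name and its NUL */
        len += need;
        i++;
    }
    return i;
}
```

With a 13-byte buffer, the command "echo" and the names a.c, b.c and
c.c, the first call packs "echo a.c b.c" and returns 2; a second call
starting at 2 packs "echo c.c". Disabling the packing amounts to
running the command once per name.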

Timings indicate that treewalk is roughly the same speed as "find"
from fish disk 197, taking between 50% and 200% of the time of find
depending on the task. If you're using files, you're in for a treat.
Worst case is that treewalk is twice as fast as files, and it's nearly
15 times faster at best. It can delete a tree in about 150% of the
time "delete all quiet" takes to delete the same tree. The .zoo file
includes all this timing data.

Treewalk also has an advantage that, unlike find (I don't know about
files), it doesn't use _any_ recursive procedures. It keeps its
directory stack in the heap, so you don't have to worry about running
out of stack on truly large file systems.
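
For anyone who hasn't seen the trick, a rough sketch of a
recursion-free walk with a heap-allocated directory stack follows. It
runs over an in-memory tree rather than AmigaDOS locks so that it stays
portable; struct node and count_entries are stand-ins invented for this
example, not anything from the treewalk sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a directory entry (assumption: the real code reads
 * FileInfoBlocks from AmigaDOS instead). An empty directory looks
 * like a plain file here, which is harmless for counting. */
struct node {
    const char *name;
    struct node *child;     /* first entry if a directory, else NULL */
    struct node *sibling;   /* next entry in the same directory */
};

/* Count every entry below `root` without recursion. The pending
 * directories live in a malloc'ed array that grows on demand, so the
 * depth of the tree never touches the program stack.
 * Returns -1 if memory runs out. */
long count_entries(const struct node *root)
{
    size_t cap = 8, top = 0;
    const struct node **stack, **grown;
    const struct node *dir, *e;
    long n = 0;

    assert(root != NULL);
    stack = malloc(cap * sizeof *stack);
    if (stack == NULL)
        return -1;
    stack[top++] = root;

    while (top > 0) {
        dir = stack[--top];
        for (e = dir->child; e != NULL; e = e->sibling) {
            n++;
            if (e->child == NULL)
                continue;               /* plain file: nothing to push */
            if (top == cap) {           /* grow the heap stack */
                grown = realloc(stack, 2 * cap * sizeof *stack);
                if (grown == NULL) {
                    free(stack);
                    return -1;
                }
                stack = grown;
                cap *= 2;
            }
            stack[top++] = e;           /* visit this directory later */
        }
    }
    free(stack);
    return n;
}
```

The only per-level cost is one pointer in the array, so a deep tree
costs heap rather than stack.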

Finally, treewalk is a lot more flexible than either files or find,
allowing (for instance) a selection of "day==`tuesday`" to get all
files created last Tuesday. Oh yeah, I threw ARexx support in just to
see if someone could think of something truly spiffy to do with it.

Now, we get to the justification for the .tech crosspost. The treewalk
routine.

The heart of the above command is a single routine. It's not very
long, and not very complicated. It's just one of those things that's
hard to do right, and sometimes very useful. You call it with a lock
and a pointer to a function, and it proceeds to call that function
with the fib of every file in the subtree, and a lock on the directory
that file is in. It allows for both preorder and postorder traversals,
and even lets you do both at once. As stated above, it doesn't use up
stack, doesn't change the lock you pass it, and tries very hard to
free all memory it allocates during processing. It allows the called
function to terminate the scan at every invocation.
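
A sketch of what such an interface might look like, under heavy
assumptions: the names walk, visit_fn, WALK_PRE, WALK_POST, struct node
and the record callback are all hypothetical, the tree is an in-memory
stand-in for locks and FileInfoBlocks, and the real routine's signature
may differ. The points it tries to show are the ones listed above:
a callback per entry along with its parent directory, preorder and
postorder visits selectable together, no recursion, a root that is left
untouched, and a callback that can stop the scan by returning nonzero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { WALK_PRE = 1, WALK_POST = 2 };   /* hypothetical flag names */

struct node {                   /* stand-in for a directory entry */
    const char *name;
    struct node *child;         /* first entry if a directory, else NULL */
    struct node *sibling;       /* next entry in the same directory */
};

/* `when` is 0 for a plain file, WALK_PRE or WALK_POST for a directory.
 * Return 0 to continue, a positive value to stop the walk. */
typedef int (*visit_fn)(const struct node *entry, const struct node *parent,
                        int when, void *user);

/* Returns 0 when finished, the callback's value if it stopped the
 * walk, or -1 if memory ran out. The root itself is never visited. */
int walk(const struct node *root, int order, visit_fn fn, void *user)
{
    struct frame { const struct node *dir, *next; } *stack, *grown;
    size_t cap = 8, top = 0;
    int rc = 0;

    assert(root != NULL && fn != NULL);
    stack = malloc(cap * sizeof *stack);
    if (stack == NULL)
        return -1;
    stack[top].dir = root;
    stack[top++].next = root->child;

    while (rc == 0 && top > 0) {
        struct frame *f = &stack[top - 1];
        const struct node *e = f->next;

        if (e == NULL) {                    /* directory exhausted */
            const struct node *done = f->dir;
            top--;
            if (top > 0 && (order & WALK_POST))
                rc = fn(done, stack[top - 1].dir, WALK_POST, user);
        } else if (e->child == NULL) {      /* plain file */
            f->next = e->sibling;
            rc = fn(e, f->dir, 0, user);
        } else {                            /* subdirectory */
            f->next = e->sibling;
            if (order & WALK_PRE)
                rc = fn(e, f->dir, WALK_PRE, user);
            if (rc == 0) {
                if (top == cap) {           /* grow the heap stack */
                    grown = realloc(stack, 2 * cap * sizeof *stack);
                    if (grown == NULL) { rc = -1; break; }
                    stack = grown;
                    cap *= 2;
                }
                stack[top].dir = e;
                stack[top++].next = e->child;
            }
        }
    }
    free(stack);                            /* freed on every exit path */
    return rc;
}

/* Example callback: logs "+dir" before a directory, "-dir" after it,
 * and bare names for files; stops when it sees `stop_at`. */
struct trace { char buf[64]; const char *stop_at; };

int record(const struct node *e, const struct node *parent, int when,
           void *user)
{
    struct trace *t = user;

    (void)parent;
    if (strlen(t->buf) + strlen(e->name) + 3 <= sizeof t->buf) {
        if (when == WALK_PRE)  strcat(t->buf, "+");
        if (when == WALK_POST) strcat(t->buf, "-");
        strcat(t->buf, e->name);
        strcat(t->buf, " ");
    }
    return t->stop_at != NULL && strcmp(e->name, t->stop_at) == 0;
}
```

Something like "delete all" would want the postorder visit, since a
directory can only be removed after its contents; something like
"dir all" would want the preorder one to print the directory name
first.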

If I had source to things like delete & dir, I'd have tried recoding
those using treewalk to see how things work. If you're interested in
creating things like "dir all" or "delete all", this might be of some
use.

The .zoo file includes all the sources (but if you've got Manx, you're
not going to be happy), documentation (not done very well, I'm
afraid), and timings on the treewalk command, including commentary on
what it all means.

And don't ask me to send you a copy of the thing. I'm going to load
the thing onto a local BBS (AAA BBS, +1 415 222 9416), and it may
appear on the Lattice BBS. That's going to be it for the beta. The
final version will go out on all the usual channels.

	<mike
--
Come all you rolling minstrels,				Mike Meyer
And together we will try,				mwm@berkeley.edu
To rouse the spirit of the air,				ucbvax!mwm
And move the rolling sky.				mwm@ucbjade.BITNET