mwm@mica.berkeley.edu (Mike (I'll think of something yet) Meyer) (08/02/89)
[please excuse the crossposting, but this is two announcements in one, and I think one belongs on each...]

I've just put twbeta.zoo on ucbvax.berkeley.edu, in pub/amiga. This contains the beta test version of both the treewalk subroutine and the treewalk command. Since the subroutine is a general purpose routine that the command drags out into the cli, I'll describe the command first.

The only short description of the treewalk command is that it's a second-generation find. Or if you don't speak unix, but do have lattice, it's a second-generation files. You point it at a directory, and then for every file in the subtree rooted at that directory, it evaluates a user-provided C-like expression using elements from that file's FileInfoBlock structure (among other things); if the expression evaluates to true, it passes the file's full name to the second phase. The expression is optional, with a default value of "true". The second phase either prints the file's name, or invokes a user-provided command with the file names as arguments. Unlike find, treewalk tries to shove as many arguments onto the command line as will fit (which can, of course, be disabled for brain-damaged commands).

Timings indicate that treewalk is roughly the same speed as "find" from fish disk 197, taking between 50% and 200% of the time of find depending on the task. If you're using files, you're in for a treat. Worst case is that treewalk is twice as fast as files, and it's nearly 15 times faster at best. It can delete a tree in about 150% of the time "delete all quiet" takes to delete the same tree. The .zoo file includes all this timing data.

Treewalk also has an advantage that, unlike find (I don't know about files), it doesn't use _any_ recursive procedures. It keeps its directory stack in the heap, so you don't have to worry about running out of stack on truly large file systems.
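For readers who want to see the shape of a walk that keeps its directory stack in the heap, here is a minimal sketch. It is not the code in twbeta.zoo: it assumes POSIX calls (opendir, readdir, lstat) rather than AmigaDOS locks and FileInfoBlocks, and the names walk_tree, visit_fn and count_files are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical callback type: return nonzero to stop the scan. */
typedef int (*visit_fn)(const char *path, const struct stat *st, void *ctx);

/* Walk the subtree under root without recursion. Directories still to be
 * scanned sit on a heap-allocated stack of path strings.
 * Returns 0 after a full scan, 1 if the callback stopped it, -1 on error. */
int walk_tree(const char *root, visit_fn visit, void *ctx)
{
    size_t cap = 16, top = 0;
    char **stack;
    int result = 0;

    assert(root != NULL && visit != NULL);
    stack = malloc(cap * sizeof *stack);
    if (stack == NULL)
        return -1;
    stack[top] = strdup(root);
    if (stack[top] == NULL) { free(stack); return -1; }
    top++;

    while (top > 0 && result == 0) {
        char *dir = stack[--top];          /* pop the next directory */
        DIR *d = opendir(dir);             /* unreadable ones are skipped */
        struct dirent *e;

        while (d != NULL && result == 0 && (e = readdir(d)) != NULL) {
            struct stat st;
            size_t len;
            char *path;

            if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
                continue;
            len = strlen(dir) + strlen(e->d_name) + 2;
            path = malloc(len);
            if (path == NULL) { result = -1; break; }
            snprintf(path, len, "%s/%s", dir, e->d_name);
            if (lstat(path, &st) != 0) { free(path); continue; }
            if (visit(path, &st, ctx) != 0) { result = 1; free(path); break; }
            if (S_ISDIR(st.st_mode)) {
                if (top == cap) {          /* grow the stack on the heap */
                    char **bigger = realloc(stack, 2 * cap * sizeof *stack);
                    if (bigger == NULL) { result = -1; free(path); break; }
                    stack = bigger;
                    cap *= 2;
                }
                stack[top++] = path;       /* the stack now owns path */
            } else {
                free(path);
            }
        }
        if (d != NULL)
            closedir(d);
        free(dir);
    }
    while (top > 0)                        /* release anything left over */
        free(stack[--top]);
    free(stack);
    return result;
}

/* Example callback: count regular files; ctx points to a long. */
int count_files(const char *path, const struct stat *st, void *ctx)
{
    (void)path;
    if (S_ISREG(st->st_mode))
        ++*(long *)ctx;
    return 0;
}
```

Calling walk_tree(dir, count_files, &n) with a long n set to zero leaves the number of regular files under dir in n. In this sketch a directory is handed to the callback before its contents are, and the program stack stays the same depth no matter how deep the tree goes; only the heap-allocated list of pending directories grows.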
Finally, treewalk is a lot more flexible than either files or find, allowing (for instance) a selection of "day==`tuesday`" to get all files created last Tuesday. Oh yeah, I threw ARexx support in just to see if someone could think of something truly spiffy to do with it.

Now, we get to the justification for the .tech crosspost: the treewalk routine.

The heart of the above command is a single routine. It's not very long, and not very complicated. It's just one of those things that's hard to do right, and sometimes very useful. You call it with a lock and a pointer to a function, and it proceeds to call that function with the fib of every file in the subtree, and a lock on the directory that file is in. It allows for both preorder and postorder traversals, and even lets you do both at once. As stated above, it doesn't use up stack, doesn't change the lock you pass it, and tries very hard to free all memory it allocates during processing. It allows the called function to terminate the scan at every invocation.

If I had source to things like delete & dir, I'd have tried recoding those using treewalk to see how things work. If you're interested in creating things like "dir all" or "delete all", this might be of some use.

The .zoo file includes all the sources (but if you've got Manx, you're not going to be happy), documentation (not done very well, I'm afraid), and timings on the treewalk command, including commentary on what it all means.

And don't ask me to send you a copy of the thing. I'm going to load the thing onto a local BBS (AAA BBS, +1 415 222 9416), and it may appear on the Lattice BBS. That's going to be it for the beta. The final version will go out on all the usual channels.

	<mike
--
Come all you rolling minstrels,		Mike Meyer
And together we will try,		mwm@berkeley.edu
To rouse the spirit of the air,		ucbvax!mwm
And move the rolling sky.		mwm@ucbjade.BITNET