mwm@mica.berkeley.edu (Mike (I'll think of something yet) Meyer) (07/24/89)
Now that we've all had fun with finding out our names, I can ask the next "this is harder than it looks" question. I've got a "generalized" file tree walking routine. I think the interface is reasonable (at least, it'll do everything I want it to do), but I'd like to hear from others who might find this usefull. Better yet, I'd like for some such routine to appear as part of 1.4, so I don't have to fix mine to deal with links... Anyway, the interface is like so: treewalk(rootlock, userfunc, flags) ; BPTR rootlock ; int (*userfunc)(dirlock, fib) ; int flags ; BPT dirlock ; struct FileInfoBlock *fib ; treewalk returns TRUE if everything went fine, and FALSE if there was an error of some kind. rootlock is a lock on the directory that is the root of the tree. That not being a valid lock, or not being a directory, is considered an error. flags is currently only used for two things: specifying whether the directory tree (note: directory tree, not the file tree) is to be walked in preorder, postorder, or both. The flags are TREE_PRE, TREE_POST and TREE_BOTH (== TREE_PRE | TREE_POST). If TREE_BOTH is specified every directory is visited twice, leave directories one with one visit right after the other. Unfortunatly, asking for postorder means directories get scanned twice. userfunc is called to let the user decide what to do with directories & files. When a directory is visited, userfunc is first called with fib == NULL and dirlock being a lock on that directory. After that, it's called with the same lock, and fib being the FileInfoBlock for each file in that directory. Userfunc returns either TREE_STOP, to stop the tree walk at that point, or TREE_CONT, to continue it. The design goals were 1) robustness: treewalk's stack use isn't dependent on how deep the tree is; and if allowed to exit, it frees all resources it uses. Ditto for errors. 2) Speed: initial tests show it about as fast as a PD find, even though the find is optimized, and treewalk included all the debugging code. 3) Generality: I designed it so that every tree-walking application I could think of could use treewalk to do the work. 4) Cost in the user code: Many applications don't need full paths, so I decided not to track them. Passing the lock to userfunc allows it to get it if it needs it. Likewise, all files are passed to userfunc, instead of testing to see if it wants them inside of treewalk. Alternatives considered and rejected: 1) specify whether usrfunc should return the directory (fib == 0) or files or both (or neither) I decided that this was of low utility, and those routines that wanted to could verify that condition and return immediately. 2) Making files, as well as the tree, come out postorder. That seemed to be of marginal utility, and it wasn't clear how to do it. 3) Indicating whether a directory was being visitied pre or post order. Same comments. That's the function interface. I'd like to know of any appcliation that can't be recoded to fit. Please describe the application, as you can do quite a bit with statics in userfunc. The obvious example (a recurslive list) is: Preorder traversal, with the following userfunc: rlist(BPTR dirlock, struct FileInfoBlock *fib) { static char pathname[FMSIZE] ; if (fib == NULL) { if (getpath(dirlock, pathname)) return TREE_STOP ; strcat(pathname, strchr(pathname, ':') == NULL ? ":" : "/") ; puts(pathname) ; return TREE_CONT ; } printf("%s%s\n", pathname, fib->fib_FileName) ; return TREE_CONT ; } Last thing: I'm trying to build a better-than-find tool to put this at the CLI level. Base plan to date is: treewalk <flags> [ { filter-commands } ] [shell command] flags are: -+ - postorder traversal (pre is the default; post requires directories to be scanned twice). -- - preorder, only useful as "-+-" to get pre as well as post order. -f "shell command" only takes one argument -d <dir> start at directory <dir>. { filter-commands } (the "{}" are included) consists of a find-like expression (only cleaner) to decide if this file is going to be included or not. If I can figure out how, I'm going to allow Rexx scripts to be used as filters - which will also allow commands to be run on the files via "address command". Finally, any files that get through the filter will be fed as arguments to the "shell command". Filenames will be tacked on until there is no more room on the command line, unless the "-f" flag is given, in which case there will be one invocation of the command per file that passes the filter. If no shell command is given, the file names will be listed to stdout, one file per line, using the absolute path. Comments on this would also be appreciated. Thanx, <mike -- I know the world is flat. Mike Meyer Don't try tell me that it's round. mwm@berkeley.edu I know the world stands still. ucbvax!mwm Don't try to make it turn around. mwm@ucbjade.BITNET