dennis@rlgvax.UUCP (Dennis.Bednar) (02/22/88)
Comp.sources.misc: Volume 2, Issue 60 Submitted-By: "Dennis.Bednar" <dennis@rlgvax.UUCP> Archive-Name: duonly SHORT blurb from the duonly.help file, just to summarize what this is program is for: Prints disk usage block sizes only for directories, but NOT including sizes contributed by sub-directories. This tool was developed because when viewing the output of a regular du, it was difficult to see which directories were the real culprits for occupying the most number of disk blocks. #--------------- CUT HERE --------------- #! /bin/sh # This is a shell archive, meaning: # 1. Remove everything above the #! /bin/sh line. # 2. Save the resulting text in a file. # 3. Execute the file with /bin/sh (not csh) to create the files: # duonly.c # duonly.help # This archive created: Wed Feb 17 20:31:04 EST 1988 # if test -f duonly.c then echo shar: will not over-write existing file 'duonly.c' else echo x - duonly.c # ............ F I L E B E G .......... duonly.c cat << '\SHAR_EOF' > duonly.c /* * duonly.c * dennis bednar Feb 17, 1988 * * Reads stdin of du output, and outputs sizes based ONLY on the * space used within a directory only. The size will be the * number of blocks used by the directory itself, plus any * files contained within that directory. * * NOTE: * Much of the source code in dugraph.c (contributed to comp.unix.misc) * I borrowed "as is". What functions I added are at the end. * I also added the "father" link. All of the brothers of the * first son, including the first son, have the same father node. * That is, if we have /a/b, and /a/c, then "b" and "c" are brothers, * and "b"'s and "c"'s father are both "a". I also added a new * variable "me_size" which is the size of the directory, not including * the sizes of the sons immediately below it. * * NOTE: If you do "du /dir1/dir2/dir3| duonly", then the size of * /dir1, and /dir2 will be 0, since they were NOT included in * the original output of du!!! * * NOTE: if you do "du /dir1/dir2/dir3| duonly", then the root * placeholder points to a cell for "", and NOT to "dir1" as * you might expect. This is because of how read_input() parses * its input. It is not a problem per se, except that the node * containing a blank name should not be printed in this case. * So, the solution is to add an extra check and avoid the print * when the name is "". By the way the root for "" only occurs * because the pathnames are absolute (begin with a /). * * NOTE: although output of directory names are sorted there is * one minor anomoly that occurs when you have input as follows * (sizes ommited, since irrelevant): * ../dir * ...dir/RCS * ...dir.ext * The important point is that we have both a "/" and a "." after the * same name "dir". * A sort would place the "." before the "/" (ie line 3 before line 2), * but the sort performed by du is only within each directory level, so * that duonly would output line 2 before line 3 (since ../dir and * ../dir.ext are sorted brothers, and ../dir/RCS is a sorted son of * ../dir) !! * */ #if 0 Path: rlgvax!sundc!seismo!uunet!husc6!think!ames!necntc!ncoast!allbery From: drw@culdev1.UUCP (Dale Worley) Newsgroups: comp.sources.misc Subject: Prettyprint a du listing Message-ID: <6803@ncoast.UUCP> Date: 17 Dec 87 02:33:54 GMT Sender: allbery@ncoast.UUCP Organization: Cullinet Software, Westwood, MA, USA Lines: 447 Approved: allbery@ncoast.UUCP X-Archive: comp.sources.misc/8712/8 I've always wanted to get a du listing that shows how the space is being used graphically. I finally wrote a program to digest a du listing and print out a tree, where each directory occupies lines proportionally to how much space the files in it consume. To run it, type "du | dugraph" or some such. The listing for each directory starts with a blank space showing space occupied by files directly in that directory, then the subtrees for each subdirectory (in descending order of size). If the subdirectories at the bottom get so small that they occupy less than 1 line each, they are all merged into an entry "(etc.)". The entire listing always occupies 60 lines (the value of 'length'). This program has tab-width = 5. --------------------------dugraph.c--------------------------------- #endif /* program to make a pretty graph out of a du report */ #include <stdio.h> #include <string.h> /* number of lines the listing should occupy */ int length = 60; /* message for suppressed directories */ #define SUPPRESSED "(etc.)" /* format of a tree node */ struct node { struct node *lson; /* left son */ struct node *rbrother;/* right brother */ struct node *father; /* parent node, up ptr */ unsigned long size; /* size of directory in kbytes */ unsigned long me_size; /* size of directory only */ int loc; /* location we will print it at */ int print_col;/* column to print name in */ int print_limit; /* location we can't print on or * after */ int last; /* are we last son of our father? */ char name[1]; /* name */ }; /* root of the tree */ struct node *root = NULL; /* total size of things listed */ unsigned long total_size; /* current line number we are on (0-origin) */ int current_line = 0; /* list of where to put bars */ int bar_list[50]; /* number of bars in the list */ int bar_count = 0; /* declare functions */ void read_input(); struct node *insert_in_tree(); void dfs(); void dfs1(); void missing_sizes(); void sort_size(); /* sort tree by size */ void sort_name(); /* sort tree alphabetically */ void calc_loc(); void blank(); void mark_last(); void calc_pc(); void output(); void position(); void show_node(); void my_space(); main() { struct node *t; /* scratch */ /* read the input and form a tree */ read_input(); root->size = 0; /* put sizes on entries that have none */ dfs(NULL, missing_sizes); /* sort each directory */ dfs(sort_name, NULL); /* for each directory, subtract the space used up by * all children one level below me, so that we * can tell how much space is occupied by this * directory only. * IMPORTANT: We need to compute in pre-order, or we * won't get the right results. */ dfs(my_space, NULL); /* print the tree */ dfs(show_node, NULL); exit(0); /* unused left-over code I might need later. */ /* calculate the total size */ total_size = 0; for (t = root->lson; t != NULL; t = t->rbrother) total_size += t->size; /* calculate the location of each directory */ /* blank out subdirectories that get scrunched together at the bottom */ root->print_limit = length; dfs(calc_loc, blank); /* print out the tree */ for (t = root->lson; t != NULL; t = t->rbrother) { /* mark the last son of each directory */ /* figure out the print columns */ t->print_col = 0; dfs1(calc_pc, mark_last, t); dfs1(output, NULL, t); } /* put blank space at end */ position(length); } /* read input and form a tree */ void read_input() { unsigned long size; /* size read from input */ char name[100]; /* directory name read from input */ /* make the dummy node at the top of the tree */ root = (struct node *)malloc(sizeof (struct node)); root->name[0] = '\0'; root->lson = NULL; root->father = NULL; /* if walking up the ladder, don't fall off ! */ /* read the next line of input */ while (fscanf(stdin, "%lu %s\n", &size, name) != EOF) { /* insert (or find) the directory in the tree and save its size */ insert_in_tree(name)->size = size; } } /* insert (or find) a directory in the tree */ struct node *insert_in_tree(name) char *name; /* name of the directory */ { struct node *t; /* pointer for searching down through tree */ char *np; /* points to next part of directory name to be * examined */ struct node *t1; /* scratch pointer */ char *np1;/* scratch pointer */ /* read through the name, one directory-part at a time, and hunt * down the tree, constructing nodes as needed */ for (t = root, np = name; np != NULL; np = np1) { /* extract the next directory-part */ if ((np1 = strchr(np, '/')) != NULL) { /* we found a slash, replace it with a null, and position * np1 to point to the remainder of the name */ /* can store node.name=="" when name begins with / */ *np1++ = '\0'; } /* else */ /* we found no shash, so we are at the end of the name * np1 has been set to NULL for us by strchr */ /* search the sons of this node for a node with the proper name */ for (t1 = t->lson; t1 != NULL && strcmp(t1->name, np) != 0; t1 = t1->rbrother) ; /* did we find one? */ if (t1 != NULL) /* yes, go to it */ t = t1; else { /* no, make one */ t1 = (struct node *)malloc(sizeof(struct node) + strlen(np)); strcpy(t1->name, np); t1->lson = NULL; t1->rbrother = NULL; t1->father = t; t1->size = 0; /* insert it in tree */ t1->rbrother = t->lson; t->lson = t1; t = t1; } } return t; } /* depth-first-search routine */ void dfs(pre_routine, post_routine) void (*pre_routine)(); /* routine to execute before scanning * descendants */ void (*post_routine)(); /* routine to execute after scanning * descendants */ { dfs1(pre_routine, post_routine, root); } /* depth-first-search service routine */ void dfs1(pre_routine, post_routine, t) void (*pre_routine)(); /* routine to execute before scanning * descendants */ void (*post_routine)(); /* routine to execute after scanning * descendants */ struct node *t; /* node to operate on */ { struct node *t1; /* scratch pointer */ /* if it exists, execute the pre-routine */ if (pre_routine != NULL) pre_routine(t); /* call self on sons of this node */ for (t1 = t->lson; t1 != NULL; t1 = t1->rbrother) dfs1(pre_routine, post_routine, t1); /* if it exists, execute the post-routine */ if (post_routine != NULL) post_routine(t); } /* add missing sizes */ void missing_sizes(t) struct node *t; { struct node *t1; /* scratch pointer */ unsigned long s; /* scratch */ if (t->size == 0) { /* size is missing, we have to calcuate it */ s = 0; for (t1 = t->lson; t1 != NULL; t1 = t1->rbrother) s += t1->size; t->size = s; } } /* sort the directories under a directory by size */ void sort_size(t) struct node *t; { struct node *p1, *p2, *p3, *pp; /* scratch pointers */ int nodes, n; /* scratch */ /* count the number of nodes */ nodes = 0; for (p1 = t->lson; p1 != NULL; p1 = p1->rbrother) nodes++; /* just a simple and inefficient bubble sort */ for (n = 1; n < nodes; n++) for (p1 = NULL, p2 = t->lson, p3 = p2->rbrother; p3 != NULL; p1 = p2, p2 = p3, p3 = p3->rbrother) { if (p2->size < p3->size) { /* exchange the nodes p2 and p3 */ pp = p3->rbrother; p3->rbrother = p2; p2->rbrother = pp; if (p1 != NULL) p1->rbrother = p3; else t->lson = p3; /* exchange the values of p2 and p3 */ pp = p2; p2 = p3; p3 = pp; } } } /* calculate the print location */ void calc_loc(t) struct node *t; { unsigned long cs; /* scratch */ struct node *t1, *t2; /* scratch pointers */ int print_limit; /* location next directory after t will * be printed */ if (t == root) cs = 0; else { /* figure out how much is in the directory itself */ for (t1 = t->lson, cs = 0; t1 != NULL; t1 = t1->rbrother) { cs += t1->size; } /* cs is the size accounted for by subdirectories */ cs = t->size - cs; } /* cs is the size of the files in the directory itself */ /* convert cs to lines */ cs = cs*length/total_size + t->loc; /* calculate where next directory after t will be */ print_limit = t->print_limit; /* assign locations */ for (t1 = t->lson, t2 = NULL; t1 != NULL; t2 = t1, t1 = t1->rbrother) { /* make sure we don't run into next directory */ if (cs >= print_limit) { cs = print_limit-1; } t1->loc = cs; if (t2 != NULL) t2->print_limit = cs; cs += t1->size*length/total_size; } if (t2 != NULL) t2->print_limit = print_limit; } /* figure out which directories to blank out */ void blank(t) struct node *t; { struct node *t1, *t2, *t3; /* loop pointers */ /* return if there aren't at least two sons */ if (t->lson == NULL || t->lson->rbrother == NULL) return; for (t1 = NULL, t2 = t->lson, t3 = t2->rbrother; t3 != NULL; t1 = t2, t2 = t3, t3 = t3->rbrother) if (t2->loc == t3->loc) { /* replace t1 and succeeding nodes with "(etc.)" */ t3 = (struct node *)malloc(sizeof (struct node) + sizeof (SUPPRESSED) - 1); strcpy(t3->name, SUPPRESSED); t3->lson = t3->rbrother = NULL; t3->loc = t2->loc; if (t1 == NULL) t->lson = t3; else t1->rbrother = t3; } } /* mark the last son of each directory */ void mark_last(t) struct node *t; { struct node *t1, *t2; /* scratch pointers */ t->last = 0; for (t1 = t->lson, t2 = NULL; t1 != NULL; t2 = t1, t1 = t1->rbrother) ; if (t2 != NULL) t2->last = 1; } /* calculate the print columns */ void calc_pc(t) struct node *t; { struct node *t1; /* scratch pointer */ int c; /* column suns will be printed in */ c = t->print_col + strlen(t->name) + 5; for (t1 = t->lson; t1 != NULL; t1 = t1->rbrother) t1->print_col = c; } /* write the output */ void output(t) struct node *t; { position(t->loc); printf("--%s%s", t->name, (t->lson != NULL ? "--+" : "")); /* remove the bar for our father if we are the last son */ if (t->last) bar_count--; /* add the location of the bar to the bar list if we have a son */ if (t->lson != NULL) { bar_list[bar_count] = t->print_col + strlen(t->name) + 5 - 1; bar_count++; } } /* position to a specific line */ void position(line) int line; /* line number */ { int i; /* counts through the bar list */ int j; /* current column number */ /* for every line we need to go down */ for (; current_line < line; current_line++) { putchar('\n'); /* print the bars for this line */ j = 0; for (i = 0; i < bar_count; i++) { for (; j < bar_list[i]; j++) putchar(' '); if (current_line == line-1 && i == bar_count-1) putchar('+'); else putchar('|'); j++; } } } #if 0 -----------------------------------example----------------------------------- --.--+ | | | | | | | +--scpp--+--ftps--+ | | +--scpp--+--temp | +--error | | | +--shar--+ | | | +--temp +--uemacs--+--uemacs3.9 | | | | | | +--patch--+--dist | | | | | +--build | | | +--sccs--+--all | | | | | | | +--(etc.) +--yacctest | | | +--yacc | | | +--rnsend--+--dist1 | +--dist3 | +--dist2 +--bin--+ | +--source +--sources | +--kwfrob +--rsts.tape +--rnews +--ftp-server +--(etc.) ------------------------------end of example------------------------------ -- Dale Worley Cullinet Software ARPA: culdev1!drw@eddie.mit.edu UUCP: ...!seismo!harvard!mit-eddie!culdev1!drw Nothing shocks me -- I'm a scientist. #endif /* begin of new code added by dennis bednar */ /* * Print the full name associated with the path to this component. * Since each node contains only one component of the full path name, * in order to print out the entire name, we have to print the * "father part of the component", followed by the component. * This is why I invented the "father" up pointer. * This is done recursively, I might add. * * the root is a dummy holder, it contains no real name. * The top node which contains the first real name will have * a father pointer to root. * Rather, root -> lson contains the first real root of the tree. * This is why root->rbrother is NULL. * * So do NOT print the root node's name. * Also, when printing root->lson's father, it will be "". */ void show_node( p ) struct node *p; { if (p == root) /* just a placeholder, its lson is the real root */ return; /* don't print jibberish */ /* avoid printing root's lson node whose name is "". * This occurs when first line of stdin contains a * directory name beginning with a leading /. * PS, since root node's name is also "", I probably could remove * the "if" check above, and let this if stmt do the work of * both, but I didn't feel like it. */ if (p -> name[0] == '\0') return; /* avoid printing /a and /b whose size is zero, just * because we did a "du /a/b/c | duonly". */ if (p -> me_size == 0L) return; /* printf("'%s' and my father is <%s>\n", p -> name, p -> father -> name); */ printf("%ld\t", p -> me_size); path_print(p); putchar( '\n' ); /* terminate line */ } /* * recursively print the full path by printing the "part before me", * then printing my component name. */ path_print( p ) struct node *p; { /* this should not ever happen, because of how path_print() is written */ if (p == NULL) return; /* paranenoid */ /* this code is structured so that a leading slash will NOT * be printed before the first component, but rather ONLY * between all components, and NOT after the last component. */ if (p -> father == root) { printf("%s", p ->name ); return; } else { /* recursion */ path_print( p -> father ); /* print components before me */ printf( "/%s", p -> name ); /* print my component */ } } /* * compute size of this directory only, ie don't include * size of children directories. This leaves space only * occupied by this directory remaining in the "me_size". */ void my_space( p ) struct node *p; { long total = 0; struct node *s; /* each son of the parent p */ total = p -> size; /* blocks used by this directory */ /* subtract blocks used by immediate children directories. * WILL NOT WORK CORRECTLY if a child is a file !!!!! * this will leave total containing only the number of blocks * in this directory. */ for ( s = p -> lson; s; s = s -> rbrother) total -= s -> size; /* store my size only for this parent */ p -> me_size = total; } /* * sort the directories under a directory by name. * * this is just the old "sort()" routine [sort() renamed to * sort_size() for clarity] with a minor change, where * we now compare names, not size. That is, we sort * the brothers or peers at each level. * SEE ALSO the anomoly concerning "." and "/" with regards * to sorting, described at the top of this file. */ void sort_name(t) struct node *t; { struct node *p1, *p2, *p3, *pp; /* scratch pointers */ int nodes, n; /* scratch */ int cmp; /* count the number of nodes */ nodes = 0; for (p1 = t->lson; p1 != NULL; p1 = p1->rbrother) nodes++; /* just a simple and inefficient bubble sort */ for (n = 1; n < nodes; n++) for (p1 = NULL, p2 = t->lson, p3 = p2->rbrother; p3 != NULL; p1 = p2, p2 = p3, p3 = p3->rbrother) { cmp = strcmp( p2 -> name, p3 -> name ); if (cmp > 0) /* not alphabetized, p3 1st, p2 2nd */ { /* exchange the nodes p2 and p3 */ pp = p3->rbrother; p3->rbrother = p2; p2->rbrother = pp; if (p1 != NULL) p1->rbrother = p3; else t->lson = p3; /* exchange the values of p2 and p3 */ pp = p2; p2 = p3; p3 = pp; } } } \SHAR_EOF # ............ F I L E E N D .......... duonly.c fi # end of overwriting check if test -f duonly.help then echo shar: will not over-write existing file 'duonly.help' else echo x - duonly.help # ............ F I L E B E G .......... duonly.help cat << '\SHAR_EOF' > duonly.help duonly Prints disk usage block sizes only for directories, but NOT including sizes contributed by sub-directories. This tool was developed because when viewing the output of a regular du, it was difficult to see which directories were the real culprits for occupying the most number of disk blocks. For example, if you did a "du /usr", the last line would be something like: 18112 /usr even though /usr might only contain sub-directories, but no files. Hence, it might be that a sub-directory of /usr was the real culprit for being a disk hog, and not /usr (or a sub-sub-directory, etc., of /usr might be the culprit). Duonly is designed to read standard input generated by "du", and print information in the same style as du: size directory_name However, the size reported by duonly is the size in blocks contributed "ONLY by the directory_name itself, including the sum of the sizes of all regular files directly beneath directory_name". It does NOT include the sizes inherited by ANY of the children directories. Thus the size reported by duonly might be thought of as ONLY for that directory. SAMPLE USAGE du | duonly du /usr | duonly NOTES The directory names are sorted alphabetically by each level. There is one minor anomoly about the sorted order of directory names. Duonly will output this order: 10 dir 5 dir/subdir 3 dir.2 but a true "du | duonly | sort +1", with sorting applied to the second column would really output: 10 dir 3 dir.2 5 dir/subdir since "." lexically preceeds "/". With the exception noted, the rest of the output generated by du is in the correct order. \SHAR_EOF # ............ F I L E E N D .......... duonly.help fi # end of overwriting check # end of shell archive exit 0 -- FullName: Dennis Bednar UUCP: {uunet|sundc}!rlgvax!dennis USMail: CCI; 11490 Commerce Park Dr.; Reston VA 22091 Telephone: +1 703 648 3300