conor@goose.STANFORD.EDU (Conor Rafferty) (09/21/87)
Title:
dusort: size sort output of "du"
Description:
This shell script is useful for finding where most of the disk
space in your account is tied up. It takes the output of du
and sorts the directories by size. No big trick, except that
you typically want the subdirectories of a directory to move
with their common parent.
Compare:
$ du emacs                  | $ du emacs | sort -n
105     emacs/src/port      | 17      emacs/shortnames
2821    emacs/src           | 28      emacs/lisp/term
28      emacs/lisp/term     | 68      emacs/lisp/.backup
68      emacs/lisp/.backup  | 105     emacs/src/port
1606    emacs/lisp          | 229     emacs/man
855     emacs/etc           | 635     emacs/info
635     emacs/info          | 855     emacs/etc
229     emacs/man           | 1606    emacs/lisp
17      emacs/shortnames    | 2821    emacs/src
6193    emacs               | 6193    emacs
with:
$ du emacs | dusort         | $ du emacs | dusort -t
6193    emacs               | /emacs(6193)
2821         /src           |      /src(2821)
105              /port      |          /port(105)
1606         /lisp          |      /lisp(1606)
68                /.backup  |           /.backup(68)
28                /term     |           /term(28)
855          /etc           |      /etc(855)
635          /info          |      /info(635)
229          /man           |      /man(229)
17           /shortnames    |      /shortnames(17)
The -t option is for Gnumacs' selective-display mode.
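
The grouping above comes from sorting on a compound key: each path
is tagged with the sizes of all its ancestors, outermost first, and
then with its own size once more. A minimal sketch of that idea
follows, assuming a POSIX awk and sort. The name group_du is made
up for illustration, the sketch remembers input order instead of
looping over the array, and it sorts only five key fields, which is
enough for paths up to four components deep.

```shell
#!/bin/sh
# Sketch only, not the posted script. group_du is a hypothetical name.
# Reads "size path" lines (du output) on stdin.
group_du() {
    awk '{ size[$2] = $1; order[NR] = $2 }
    END {
        for (r = 1; r <= NR; r++) {
            path = order[r]
            n = split(path, part, "/")
            prefix = ""; key = ""
            # one key field per ancestor, outermost first, then the path itself
            for (k = 1; k <= n; k++) {
                prefix = (k == 1) ? part[1] : prefix "/" part[k]
                key = key " " (size[prefix] + 0)
            }
            # own size repeated so a parent compares ahead of its children
            print path key, size[path] + 0
        }
    }' |
    sort -k2,2nr -k3,3nr -k4,4nr -k5,5nr -k6,6nr |
    awk '{ printf "%d\t%s\n", $NF, $1 }'
}
```

Typical use would be "du emacs | group_du". Because the key holds
only sizes, two sibling directories of exactly equal size can end
up with their children interleaved.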
BUGS:
Filenames are assumed to contain only characters greater than space
(no blanks or control characters).
"du a/b/c/d |dusort" prints bogus lines for a,b and c.
Uses an awk-sort-awk sandwich followed by an awk formatter.
Sorting 300 directories takes about 25 seconds on a Sun, of which
about 3 seconds is spent setting up the pipeline.
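
A likely cause of the bogus lines for a, b and c (an assumption;
it is not stated above) is that in awk a plain reference to a
missing array element creates it, so looking up the size of a
parent that du never listed adds that parent to the array being
looped over. The sketch below only demonstrates that awk behavior.
The helper name count_entries and its mode argument are
hypothetical.

```shell
#!/bin/sh
# Sketch only. count_entries is a hypothetical helper: it reads du
# output on stdin and prints how many entries the size array holds
# after the parents a, a/b and a/b/c have been looked up.
count_entries() {
    awk -v mode="$1" '{ size[$2] = $1 }
    END {
        n = split("a a/b a/b/c", parent, " ")
        for (k = 1; k <= n; k++) {
            if (mode == "lookup")
                x = size[parent[k]]          # plain reference creates the entry
            else
                x = (parent[k] in size)      # membership test does not
        }
        c = 0
        for (p in size) c++
        print c
    }'
}
```

Fed the single line "10 a/b/c/d", the lookup mode counts four
entries and any other mode counts one. A guarded lookup such as
(name in size) ? size[name] : 0 in the key builder would presumably
avoid the extra lines, though that has not been tried against the
script as posted.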
--------------------CUT-HERE--------------------
#!/bin/sh
#
# sort a "du" listing by directory size
# usage: du | dusort
FILES=
TFORM=0
while test $# -ge 1; do
    case $1 in
    -t) TFORM=1; ;;
    *)  FILES="$FILES $1"; ;;
    esac
    shift
done
#build complex keys so that subdirectories move with parent
awk '{ size[$2] = $1 }
END {
    for (i in size) {
        printf "%s ", i;
        oj = 1; l = length(i);
        #build up an aggregate key from all its parents
        for (j = 1; j <= l; ) {
            for (; j <= l; j++) if (substr(i,j,1) == "/") break;
            name = substr(i, oj, j-oj);
            j++;
            printf "%d ", size[name];
        }
        #print itself once more to compare ahead of its children
        printf "%d\n", size[i];
    }
}' $FILES |
#sort numerically
sort -r -n -k 2,2 -k 3,3 -k 4,4 -k 5,5 -k 6,6 -k 7,7 -k 8,8 -k 9,9 |
#just print the path and its size. In two popular flavors.
awk '{if('$TFORM') printf "%s(%d)\n", $1, $NF; else
printf "%d\t%s\n", $NF, $1}' |
#indent directories
# This awk could be combined with the previous one
# but it really performs a separate function.
# Cut it off and put it in a separate file called 'ind' if you like it.
#
# ind: indent output from du or find
#
awk '
BEGIN {blank="                                                                                "}
{
    for (s=length; s > 0 && substr($0, s, 1) > " " ; s--)
        ;
    for (e=length; substr($0, e, 1) != "/" && e > s+1; e--)
        ;
    print substr($0, 1, s) substr(blank, 1, e-s-1) substr($0, e);
}
'
--------------------CUT-HERE--------------------
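
The comments in the script suggest cutting the last stage out into
a separate 'ind' filter. A minimal sketch of that is given below as
a shell function, assuming a POSIX awk. It is not the posted code:
the padding is built in a loop rather than cut from a fixed string
of blanks, so the indent is not limited by the length of that
string.

```shell
#!/bin/sh
# Sketch only: ind as a standalone filter.
# Replaces everything before the last "/" of the final word on each
# line with the same number of blanks.
ind() {
    awk '{
        # s: position of the last blank or tab (0 if none)
        for (s = length($0); s > 0 && substr($0, s, 1) > " "; s--)
            ;
        # e: position of the last "/" after s, or s+1 if there is none
        for (e = length($0); e > s + 1 && substr($0, e, 1) != "/"; e--)
            ;
        pad = ""
        for (k = 1; k < e - s; k++) pad = pad " "
        print substr($0, 1, s) pad substr($0, e)
    }'
}
```

It could then be used on other path listings, for example
"find emacs -type d -print | ind".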
conor rafferty                  The command
conor@sierra.stanford.edu           1,$s/^\([^,]*\), *\(.*\)/\2 \1/
decwrl!shasta!conor@sierra      although hard to read, does the job.
                                --- Brian W. Kernighan "Advanced Editing on Unix"