[ca.unix] File size distribution survey

jrg@Apple.COM (John R. Galloway Jr.) (08/07/90)

Below you will find a short shell archive containing 4 scripts which, when
run, will produce a histogram of the file size distribution (in 4K chunks)
on your system (or some subtree thereof).  I am interested in getting this
info from big users (e.g. 2 or more GBytes of disk) and especially from
really big users (10 or 50 or ? GBytes of disk).  The restriction of course
is that you must be using all that space for unix files, not a database
or some other application that reads/writes raw partitions, since these
scripts won't count them (you can attach their size in a note if you like).
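
To make "4K chunks" concrete: each file is charged the number of 4096-byte
blocks needed to hold it, rounded up, and the histogram counts how many files
fall on each block count.  The following is only a minimal sketch of that
arithmetic, not the code from the archive.  The function name size_histogram
is made up for illustration, and the input is assumed to be one byte size per
line.

```shell
# Hypothetical helper (not part of the archive): read byte sizes, one per
# line, and print "<4K blocks> <number of files>" pairs in ascending order.
size_histogram() {
    awk '
        # Round each size up to whole 4096-byte blocks and tally it.
        { blocks = int(($1 + 4095) / 4096); count[blocks]++ }
        END { for (b in count) printf "%d %d\n", b, count[b] }
    ' | sort -n
}

# Five sample sizes: 100 and 4096 need 1 block, 4097 needs 2,
# 9000 and 12288 need 3.
printf '%s\n' 100 4096 4097 9000 12288 | size_histogram
# prints:
# 1 2
# 2 1
# 3 2
```

Note that a file of exactly 4096 bytes lands in the 1-block row, while one
byte more pushes it to 2 blocks.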

If you are willing to run it for me, please send me the resulting output along
with a brief statement concerning the use of the file storage (e.g. general sw
development on vax, supercomputer simulation on cray xmp, graphics, CAD, etc.).
It takes 7 minutes on my (tiny) Mac IIx A/UX system with 120MB in 10,000 files.
Hopefully your mileage will be better, but still you might want to do this in
an off-peak period. THANKS!!

	please use jrg@galloway.sj.ca.us for the return file.
	or ..fernwood!galloway!jrg

	-jrg
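
A portability note on the archive that follows: fsize.awk takes the size from
field 5 of the ls -l listing, which assumes a listing that includes the group
column.  Some BSD-style ls versions omit the group unless given -g, which
moves the size to field 4.  If your listing differs, one way around it is to
collect sizes without parsing ls at all.  The sketch below is an assumption
about how that could be done, not something from the archive; file_sizes is a
made-up name, and it will be slower than ls -lR because it runs wc once per
file.

```shell
# Hypothetical helper (not part of the archive): print the size in bytes of
# every regular file under a directory, one per line, without parsing ls -l.
file_sizes() {
    find "${1:-.}" -type f -exec sh -c 'wc -c < "$1"' sh {} \; |
        awk '{ print $1 + 0 }'    # strip any padding wc adds around the number
}

# Small demonstration on a scratch directory holding one 3-byte file.
demo=$(mktemp -d)
printf 'abc' > "$demo/small"
file_sizes "$demo"    # prints 3
rm -rf "$demo"
```

Unlike ls -lR, this counts only regular files, so directories and symbolic
links do not show up in the totals.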


#! /bin/sh
##  This is a shell archive.  Remove anything before this line, then unpack
##  it by saving it into a file and typing "sh file".  To overwrite existing
##  files, type "sh file -c".  You can also feed this as standard input via
##  unshar, or by typing "sh <file".  If this archive is complete, you will
##  see the following message at the end:
#		"End of shell archive."
# Contents:  consolidate.awk count.awk fsize.awk read.me sizes.sh
# Wrapped by jrg@apple.com on Mon Aug  6 13:35:17 1990
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
if test -f consolidate.awk -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"consolidate.awk\"
else
echo shar: Extracting \"consolidate.awk\" \(269 characters\)
sed "s/^X//" >consolidate.awk <<'END_OF_consolidate.awk'
XBEGIN {bs=4096;size=bs;number=0;printf("# of files   of 4K blocks\n")}
X{if ( $1*4096 <= size ) number=number+$2;
Xelse
X{ printf("%8d        %d\n", number, size/4096);
Xwhile ($1 * 4096 > size) size=size+bs; number=$2 }}
XEND {printf("%8d        %d\n", number, size/4096)}
END_OF_consolidate.awk
if test 269 -ne `wc -c <consolidate.awk`; then
    echo shar: \"consolidate.awk\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f count.awk -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"count.awk\"
else
echo shar: Extracting \"count.awk\" \(118 characters\)
sed "s/^X//" >count.awk <<'END_OF_count.awk'
XBEGIN {value=1;count=0}
X{if ($1 == value) count++; else {print value,count;count=1;value=$1}}
XEND {print value,count}
END_OF_count.awk
if test 118 -ne `wc -c <count.awk`; then
    echo shar: \"count.awk\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f fsize.awk -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"fsize.awk\"
else
echo shar: Extracting \"fsize.awk\" \(51 characters\)
sed "s/^X//" >fsize.awk <<'END_OF_fsize.awk'
X{if (NF > 2) printf "%d\n", int(($5+4095) / 4096)}
END_OF_fsize.awk
if test 51 -ne `wc -c <fsize.awk`; then
    echo shar: \"fsize.awk\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f read.me -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"read.me\"
else
echo shar: Extracting \"read.me\" \(916 characters\)
sed "s/^X//" >read.me <<'END_OF_read.me'
XThese scripts produce a histogram of the file sizes on your system.
XThe sizes.sh script takes an arg as the dir to start with (the top of
Xthe tree).  The current dir is used if no arg is given.
X
XIdeally as root, being in the dir containing the scripts, you would say:
X
X# sizes.sh / >size.out
Xor if you are in the root just
X# sizes.sh >sizes.out
X
Xsizes.sh 	a 1 line shell script that pipes an ls -lR / into the other
X		scripts, result is written to std out so you need to provide
X		a bucket for output.  Should work with csh, ksh, or sh.
X		If you would send the resulting output file
X		to jrg@galloway.sj.ca.us (or fernwood!galloway!jrg) I would
X		greatly appreciate it.
Xfsize.awk	an awk script that strips out the file size parameter from the
X		ls -l listing
Xcount.awk	an awk script that counts the like entries in a sorted list
Xconsolidate.awk	an awk script that groups the result of the above into 4 KB
X		chunks.
X
END_OF_read.me
if test 916 -ne `wc -c <read.me`; then
    echo shar: \"read.me\" unpacked with wrong size!
fi
# end of overwriting check
fi
if test -f sizes.sh -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"sizes.sh\"
else
echo shar: Extracting \"sizes.sh\" \(90 characters\)
sed "s/^X//" >sizes.sh <<'END_OF_sizes.sh'
Xls -lR $1 | awk -f ./fsize.awk | sort -n | awk -f ./count.awk | awk -f ./consolidate.awk 
END_OF_sizes.sh
if test 90 -ne `wc -c <sizes.sh`; then
    echo shar: \"sizes.sh\" unpacked with wrong size!
fi
chmod +x sizes.sh
# end of overwriting check
fi
echo shar: End of shell archive.
exit 0
-- 
internet   jrg@galloway.sj.ca.us  John R. Galloway, Jr.
applelink  d3413                  CEO..receptionist         795 Beaver Creek Way
human     (408) 259-2490          Galloway Research         San Jose, CA  95133