[mod.sources] newspace - determine newsgroup disk usage

sources-request@panda.UUCP (02/06/86)

Mod.sources:  Volume 3, Issue 107
Submitted by: talcott!seismo!amdahl!gam (G A Moffett)


: This is a shar archive.  Extract with sh, not csh.
echo x - README
sed -e 's/^X//' > README << '!Funky!Stuff!'
XDESCRIPTION
X	Newspace(1) reports the disk space used by Usenet newsgroups.
X	It is more intelligent than a du(1) of the directories of the
X	newsgroups; see the man page for details.
X
XCONFIGURATION EXPECTATIONS
X	The shell script is set up for the default news configuration;
X	however, if yours is different, inspect the shell script's use
X	of the shell variables "newsdir" and "active", which
X	give the directory under which the newsgroups are and the
X	location of the active file.
X
XSYSTEM EXPECTATIONS
X	The shell script by default uses the System V cut(1) and xargs(1)
X	commands; if you don't have cut(1), some commented-out code is
X	provided to imitate its function.  Edit accordingly.
X
X	A public domain xargs(1) has been distributed separately
X	for non-System V Unixes.  As with cut(1), some
X	commented-out code in the shell script is provided to imitate
X	xargs if you don't have it (although use of xargs is encouraged!).
X
XCOMMENT
X	One notable feature of this shell script is that it contains the
X	longest pipeline-command I think I have ever seen; all the work
X	is done, in fact, by a single pipeline.
X
XHISTORY
X	The original newspace was written by me and distributed to
X	the San Francisco Bay Area sites via ba.news in January, 1986,
X	partly for testing purposes.  David St. Pierre (ptsfa!dsp)
X	suggested some improvements and enhancements (thanks, David),
X	which are included here.  Before its posting to mod.sources,
X	John P. Nelson (panda!jpn) provided an alternative for the
X	xargs usage (thanks, John).
X
XBUGS
X	I have no bugs to report; however I would like to hear of any
X	bugs or enhancements for this shell script, and any comments
X	and improvements for the manual page.  (One obvious enhancement,
X	perhaps, is for newspace to report the size of particular
X	newsgroups, instead of all of them at once.)
X
X
X				Gordon A. Moffett
X				{ihnp4,seismo,hplabs}!amdahl!gam
X
!Funky!Stuff!
echo x - newspace.1
sed -e 's/^X//' > newspace.1 << '!Funky!Stuff!'
X.\" @(#)newspace.1	1.2 1/21/86 21:36:09 - Amdahl/UTS
X.TH NEWSPACE 1 "Local"
X.SH NAME
Xnewspace \- show how much space each newsgroup uses
X.SH SYNOPSIS
X.B /usr/lib/news/newspace
X.SH DESCRIPTION
X.I Newspace
Xis a shell script which
Xdisplays the total disk space used by the articles in
Xeach Usenet newsgroup.
XWhile
X.IR du (1)
Xis used to measure the disk space, the report is
X.I not
Xa
X.I du
Xof all the newsgroup directories, but the disk space used
Xby that newsgroup alone;
X.IR e.g. ,
Xthe total for net.music does not include the total for
Xnet.music.gdead.
X.P
XThe output is sorted so that the largest newsgroups are
Xdisplayed first.
X.P
X.SH FILES
X.TP 25
X/usr/lib/news/active
Xlist of newsgroups to measure
X.TP 25
X/usr/spool/news/\(**
Xwhere all newsgroups are stored
X.SH SEE ALSO
Xdu(1)
X.SH DIAGNOSTICS
XSelf-explanatory.  In particular, the list of newsgroups to
Xmeasure is taken from /usr/lib/news/active; and the
Xnewsgroups are expected to be found under /usr/spool/news.
X.SH BUGS
!Funky!Stuff!
echo x - newspace.sh
sed -e 's/^X//' > newspace.sh << '!Funky!Stuff!'
X
X# print filespace used by each newsgroup, largest-first
X
Xactive=/usr/lib/news/active
Xnewsdir=/usr/spool/news
X
X# make sure needed files/directories are available
X
Xif [ ! -f $active ]
Xthen
X	echo "$0: can't find active file" 1>&2
X	exit 1
Xfi
X
Xif [ ! -r $active ]
Xthen
X	echo "$0: can't open active file" 1>&2
X	exit 1
Xfi
X
Xif [ ! -d $newsdir ]
Xthen
X	echo "$0: no $newsdir directory!" 1>&2
X	exit 1
Xfi
X
Xcd $newsdir
X
X# the first section of the pipeline measures the filespace under each
X#       directory which corresponds to a newsgroup name in the
X#       active file
X
X# the following cat-while loop is supplied in case you don't have cut(1);
X#       its output should be piped into xargs
X	
X#cat $active | while read ng junk	# fetch newsgroups
X#do
X#	echo $ng
X#done |
X
X# another choice: John Nelson (panda!jpn) supplied the following
X#       alternative for the cut-tr-xargs-du pipeline; its output
X#       should be piped into awk for the large awkscript below
X
X#du -s `awk '{ print $1 }' $active | tr '.' '/'` |
X
X# OK, here's the vanilla System V version ...
X
Xcut -d" " -f1 -s $active |              # substitutions for this supplied above
X	tr '.' '/' |			# cvt newsgroup names to dirs
X	xargs du -s |			# measure them all
X
X# this awk program takes these measurements of the
X# directories and redistributes the weights of specific newsgroups;
X# that is, the size of net/music/gdead is subtracted from
X# the size of net/music, and so on.
X
Xawk '
X	{ newsgroups[$2] = $1 }
XEND {
X	for (ng in newsgroups) {
X		n = split(ng, l, "/")
X		tng = l[1]		# topmost ancestor of ng

X		# look at each ancestor of this newsgroup
X
X		for (i = 2; i <= n; i++) {
X
X			# subtract total of this newsgroup from total
X			#	of parent newsgroups
X
X			if (newsgroups[tng] > 0)
X				newsgroups[tng] -= newsgroups[ng]
X			tng = tng "/" l[i]
X		}
X	}
X
X	# print them out (only non-empty ones, tab-separated)
X
X	for (ng in newsgroups) {
X		if (newsgroups[ng] > 0)
X			print newsgroups[ng], "\t", ng
X	}
X}' |
X	tr '/' '.'  |		# restore standard newsgroup names
X	sort -nr		# largest newsgroups first
!Funky!Stuff!
exit
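
The following is not part of the archive above.  It is a minimal
sketch of the redistribution step that the awk program in newspace.sh
performs, written as a standalone shell function so it can be tried on
sample data.  The names exclusive_sizes, total and own are made up for
illustration.  It assumes an awk that supports the "(key in array)"
test, and it deliberately differs from the posted script: each group's
original du total is subtracted only from its nearest listed ancestor.
As far as I can tell, the posted loop subtracts the current (possibly
already reduced) value from every ancestor, so for groups nested three
or more levels deep its result can depend on the unspecified order in
which awk visits the array.

```shell
# exclusive_sizes: read "size path" pairs (the format du -s prints) on
# stdin and print, largest first, the space each path uses by itself,
# that is, its total minus the totals of listed directories beneath it.
exclusive_sizes() {
	awk '
	{ total[$2] = $1; own[$2] = $1 }
	END {
		for (ng in total) {
			# strip trailing path components until a listed
			# ancestor is found (or no "/" is left)
			parent = ng
			while (sub("/[^/]*$", "", parent)) {
				if (parent in total) {
					# charge the child to its nearest
					# listed ancestor only
					own[parent] -= total[ng]
					break
				}
			}
		}
		# print only non-empty entries, tab-separated
		for (ng in own)
			if (own[ng] > 0)
				print own[ng] "\t" ng
	}' | sort -nr
}
```

With the function defined, it would take the place of the awk and sort
stages of the pipeline, for example:

	cut -d" " -f1 -s $active | tr '.' '/' | xargs du -s |
		exclusive_sizes | tr '/' '.'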