[comp.archives] [news.software.b] reap

dt@yenta.alb.nm.us (David B. Thomas) (12/19/90)

Archive-name: news/expire/reap/1990-12-16
Archive: wuarchive.wustl.edu:/usenet/alt.sources/2159 [128.252.135.4]
Original-posting-by: dt@yenta.alb.nm.us (David B. Thomas)
Original-subject: reap(8) (was Steady state news.)
Reposted-by: emv@ox.com (Edward Vielmetti)

You might be interested in my reap(8) utility.  It expires according to
a scheme that you define until there is enough disk space (you specify
the amount considered "enough"), then stops.

That way, if there's little traffic for a while, the old news hangs around
because there's room, but if you're inundated with tons of new stuff, the
old stuff is expired more aggressively, to keep the disk space adequate.

I posted it to alt.sources a while back.  I'm currently adding some new
features, like also checking # of inodes, and making the specification
of newsgroups more powerful and flexible, but the present version is
stable and I'll give it to anyone who wants it.

Here's the readme file:

A Problem:
	News volume fluctuates wildly...sites with small disks must either
expire aggressively, often deleting more than necessary, or worry that an
unexpected huge dose of news might overfill the disk.


A Solution:
	Implement a tool that expires according to a user-defined scheme
until sufficient freespace is reclaimed, then stops, leaving as much
juicy news online as is feasible.  Reap does this.
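
The "enough freespace" test can be pictured with a few lines of Python.
This is only a sketch, not reap's own code: the names free_kbytes and
enough_space are hypothetical, and measuring in kilobytes is an assumption.

```python
import shutil


def free_kbytes(path: str) -> int:
    """Free space, in kilobytes, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free // 1024


def enough_space(path: str, wanted_kbytes: int) -> bool:
    """True once the filesystem has at least `wanted_kbytes` free."""
    return free_kbytes(path) >= wanted_kbytes
```

A caller would pass the spool directory and the amount it considers
"enough", and stop expiring as soon as enough_space() returns True.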


Expire Does Two Jobs:
	The expire programs in both Bnews and Cnews really do two jobs:
		1. trim history files
		2. delete outdated articles
Thanks to some inspired jootsing (acronym for "jumping out of the system")
by Mike Murphy (mrm@sceard.com) and others, it is more than possible to
separate those two functions.  This is, of course, in keeping with the
unix philosophy of one tool doing one job well!


What Reap Does:
	Reap only takes care of the second job: deleting old articles.
It works by checking freespace, and processing one line at a time from a
list of expire functions, until the desired freespace is attained.
Each expire function consists of an age limit in days (decimals okay)
and a list of newsgroups to process or not process, sys file style.  Ex:

	.5	alt.sex.pictures,talk,!talk.bizarre,junk
	1	rec,!rec.games,!rec.humor

This example would check freespace, and if more space is needed, expire
everything older than .5 days in alt.sex.pictures, talk (except for
talk.bizarre), and junk.  Then it would stop and check freespace again.
If still more space is needed, it would expire everything older than
1 day in rec, except rec.games.* and rec.humor.*.  It's that simple.
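
The loop described above might look roughly like this in Python.  It is a
sketch under assumptions, not reap's actual source: parse_rule, covers,
selected and reap are hypothetical names, free_space and expire_group are
callables supplied by the caller, and the matching rule (a group is chosen
when some plain pattern covers it and no "!" pattern does) is one plausible
reading of "sys file style".

```python
from typing import Callable, Iterable, List, Tuple

Rule = Tuple[float, List[str]]


def parse_rule(line: str) -> Rule:
    """Split 'AGE  pattern,pattern,...' into (age_in_days, patterns)."""
    age, patterns = line.split(None, 1)
    return float(age), [p.strip() for p in patterns.split(",") if p.strip()]


def covers(pattern: str, group: str) -> bool:
    """True if `pattern` names the group itself or a hierarchy above it."""
    return group == pattern or group.startswith(pattern + ".")


def selected(group: str, patterns: Iterable[str]) -> bool:
    """Assumed semantics: a plain pattern covers the group, no '!' one does."""
    patterns = list(patterns)
    wanted = any(covers(p, group) for p in patterns if not p.startswith("!"))
    barred = any(covers(p[1:], group) for p in patterns if p.startswith("!"))
    return wanted and not barred


def reap(rules: Iterable[Rule], groups: Iterable[str],
         free_space: Callable[[], int], wanted: int,
         expire_group: Callable[[str, float], None]) -> int:
    """Apply rules in order until free_space() >= wanted; return rules used."""
    groups = list(groups)
    applied = 0
    for age, patterns in rules:
        if free_space() >= wanted:
            break  # enough room already: leave the remaining news alone
        for group in groups:
            if selected(group, patterns):
                expire_group(group, age)
        applied += 1
    return applied
```

Because freespace is checked before every rule, the later (usually more
precious) rules are only reached when the earlier ones did not free enough.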


Okay...So How Do I Arrange To Trim The History Files, Since Reap Can't:
	Included in this distribution is a shell script (mostly written
by Mike Murphy) to handle Cnews history files.  It shouldn't be too
difficult to do something similar for Bnews, or you can give in and use
the original expire utility with options that tell it to expire the
history files only...but I think this will be slow.  It just comes down
to removing lines from ordinary text files, based on their contents.
Murphy and I used awk.
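
For illustration, here is the same "remove lines based on their contents"
idea in Python rather than awk.  It is not the script shipped with reap.
The layout assumed here (tab-separated fields, the second one starting with
the arrival time in seconds, followed by "~") is an assumption about the
Cnews history file, and arrival_time and trim_history are hypothetical names.

```python
from typing import Iterable, Iterator


def arrival_time(line: str) -> int:
    """Arrival time taken from the second tab-separated field (assumed)."""
    fields = line.rstrip("\n").split("\t")
    return int(fields[1].split("~")[0])


def trim_history(lines: Iterable[str], cutoff: int) -> Iterator[str]:
    """Yield only the history lines that arrived at or after `cutoff`."""
    for line in lines:
        try:
            keep = arrival_time(line) >= cutoff
        except (IndexError, ValueError):
            keep = True  # be conservative: never drop a line we can't parse
        if keep:
            yield line
```

The output would be written to a new file and swapped in for the old one;
any index files built from the history file would presumably need to be
rebuilt afterwards, which this sketch does not attempt.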


But Is It Fast:
	Yes, largely because it doesn't have to do much.  Even "find | rm"
is slower because find is repeatedly exec-ing rm.  "rm -rf" has me beat,
though, I'll bet! :-)
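
	The point about exec-ing can be illustrated by deleting old files
from inside a single process, so that no separate rm is started for each
article.  This Python sketch is not how reap itself is written (the
readme does not say); expire_directory is a hypothetical name, and using
the file modification time as the article's age is an assumption.

```python
import os
import time
from typing import Optional


def expire_directory(directory: str, max_age_days: float,
                     now: Optional[float] = None) -> int:
    """Unlink regular files older than max_age_days; return how many went."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue  # skip subdirectories and anything unusual
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                os.unlink(entry.path)  # one system call, no new process
                removed += 1
    return removed
```

Fractional ages work as in the rule file: expire_directory(path, .5)
removes files more than twelve hours old.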

	Since the functions of deleting articles and trimming history
are separate, I now run reap every six hours, but trim the history list
just once a day.  That effectively keeps my disk space up to snuff, but
only thrashes at the history file in the middle of the night.


Credits:
	I owe a lot to Mike Murphy for inspiring me with his "trasher"
system.  I also owe a lot to all of you netters who will flood me with
suggestions and improvements in the coming weeks (hint, hint!).

						little david
						dt@yenta.alb.nm.us
-- 
Twice five syllables...
Plus seven can't say much but
That's Haiku for you.