dexpire@ftp.ee.lbl.gov (Craig Leres) (01/12/91)
I finally got fed up with having to babysit my spool partition and
wrote a dynamic expire. Interested parties are invited to participate
in an alpha test. Following successful completion, I plan to post
dexpire to alt.sources and also make it available via anonymous ftp.

I think my code is pretty solid but it's not completely inconceivable
that it could trash your spool partition. This means that you shouldn't
ask to be in on the alpha unless you're pretty sure you can deal with
any problems that develop. We run cnews so I need some bnews people to
help me figure out the details of using dexpire with bnews.

There are two obvious quantities this program could consider: time and
disk space. That is, decide which articles to delete based on their
relative ages or based on the relative disk consumption of their
newsgroups. I chose to base dexpire's decisions on time since it seems
more intuitive to me. After all, if a bunch of huge articles show up,
it makes more sense to me that the length of time all articles are kept
goes down a bit instead of massively gouging one or two newsgroups.
Also, if you base your decisions on disk space, you either have to
stat() all the articles in the spool partition or do ugly things like
try to cache disk usage information.

Dexpire. It's fast. It's rational.

Manual entry and README are appended.

        Craig
------
#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (not csh) to create the files:
#    dexpire.lpr
#    README
# This archive created: Fri Jan 11 23:08:50 1991
export PATH; PATH=/bin:$PATH
echo shar: extracting "'dexpire.lpr'" '(6218 characters)'
if test -f 'dexpire.lpr'
then
    echo shar: will not over-write existing file "'dexpire.lpr'"
else
sed 's/^X//' << \SHAR_EOF > 'dexpire.lpr'
X
XNAME
X     dexpire - dynamic expire for netnews
X
XSYNOPSIS
X     dexpire [ -dnv ] [ -a active ] [ -c dexplist ]
X         [ -s spool_dir ] [ -f Kbytes ]
X
XDESCRIPTION
X     Dexpire deletes old news articles. Collections of newsgroups
X     (called "classes") are assigned priorities. These priorities
X     are used to dynamically determine how long articles in each
X     class may be kept so that a specified amount of disk space
X     is made available.
X
X     Unlike expire(8), dexpire does not rebuild the history file.
X     The administrator must arrange to do this some other way
X     (see below).
X
XOPTIONS
X     The -d flag causes internal data structures to be dumped.
X     This can be useful when debugging a new control file.
X
X     The -n flag prevents dexpire from actually removing any
X     articles. It can be informative to use this flag in conjunc-
X     tion with -v to see what dexpire would do if turned loose.
X
X     The -v flag causes verbose information to be displayed to
X     stdout. This flag may be used more than once to get more and
X     more detailed information.
X
X     The -a flag specifies an alternate active file.
X
X     The -c flag specifies an alternate control file.
X
X     The -s flag specifies an alternate spool directory.
X
X     The -f flag is used to specify the desired number of free
X     Kbytes upon exit. The default is 4000 Kbytes. A trailing
X     'M' specifies Mbytes, e.g. "-f 4M" means 4096 Kbytes.
X
XFILE FORMATS
X     The control file, dexplist, configures dexpire and has a
X     format similar to the explist file used by expire(8). Com-
X     ment lines begin with '#'. The first field specifies one or
X     more newsgroups and/or newsgroup trees (multiples should be
X     separated by commas). The special keyword all matches any
X     group and usually appears as the last rule in the control
X     file.
X
X     The second field is a single letter that specifies that the
X     line applies only to moderated (m), unmoderated (u), or to
X     either kind of newsgroup (x).
X
X     The third field specifies the priority of the group; high
X     priority groups are kept longer than low priority groups.
X     The priorities are relative to each other which means that
X     the ratio of priorities determines how long articles are
X     kept. For example if one group has half the priority of
X     another group, articles in it are only kept half as long.
X     Groups with priority zero (0) are never expired.
X
X     The optional fourth field specifies a minimum number of days
X     to keep articles. It's most commonly used with low priority
X     groups. However, overzealous use of this feature leads to
X     the kind of problems dexpire was written to avoid.
X
X     The first line of the control file that applies to a given
X     newsgroup is used to determine its class.
X
XIMPLEMENTATION
X     Here's what dexpire does when it runs. First, it checks to
X     see how much space is free. (If there is nothing to do, it
X     exits.) Next, it reads the control rules from dexplist.
X     Using these rules, it reads the active file and places each
X     newsgroup into its appropriate class. It also keeps track of
X     the first and last articles in each newsgroup. Next, the age
X     of the oldest article in each newsgroup and class is deter-
X     mined. The class ages are used to calculate the number of
X     days to keep the highest (or "standard") class. (This is
X     also known as the "standard" number of days.) No article is
X     kept for more than the standard number of days; articles in
X     lower priority classes are kept for less time. Finally,
X     passes are made over the classes, starting with the lowest
X     priority and working up. If a pass completes without freeing
X     enough disk space, the number of standard days is lowered by
X     a small amount and a new pass is started. The process is
X     repeated as many times as necessary to free the required
X     amount of disk space.
X
X     Note that dexpire keeps track of how much space it has freed
X     by adding up the block counts from each article deleted.
X     This prevents it from getting confused by other activity in
X     the spool partition.
X
XHISTORY REBUILD
X     Currently there's no good way to rebuild the history file.
X     We run regular expire(8) once a week with a control file
X     that specifies to keep history entries for at least 30 days
X     and to unconditionally delete articles that are more than
X     120 days old. But since expire(8) insists on reading each
X     and every article, this is more expensive than it needs to
X     be (for our purposes, anyway). A better solution would be to
X     write a utility to rebuild the history file (or add an
X     option to expire(8)).
X
X     It's usually a good idea to run updatemin afterwards to
X     update the "minimum" fields in the active file. Otherwise,
X     later dexpire runs may waste cpu time and some newsreaders
X     (e.g. rn) may get confused.
X
XFILES
X     /usr/new/lib/news/active - newsgroup article information
X     /usr/new/lib/news/dexplist - control file
X     /usr/spool/news - news spool partition
X
XSEE ALSO
X     expire(8)
X
XAUTHOR
X     Craig Leres - leres@ee.lbl.gov
X
XBUGS
X     A large number of batched articles (or other files in the
X     spool partition) can cause dexpire to free up more space
X     than is necessary. Perhaps it should be smart enough to see
X     how much space is used in the in.coming directory.
X
X     Since dexpire uses stat(2) to determine the age of articles
X     (instead of reading headers) it's possible to fool it by
X     modifying articles in the spool partition. However, since
X     it's interested in the oldest article in each class, this
X     shouldn't cause problems unless the oldest article in every
X     group of a particular class has the wrong timestamp.
X
X     Explicit Expires headers are completely ignored.
X
X     There should be an option to consider inodes rather than
X     Kbytes since inodes are sometimes the critical resource.
X
X     Currently, dexpire only knows how to use statfs(2) to get
X     filesystem statistics.
X
SHAR_EOF
if test 6218 -ne "`wc -c < 'dexpire.lpr'`"
then
    echo shar: error transmitting "'dexpire.lpr'" '(should have been 6218 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'README'" '(4084 characters)'
if test -f 'README'
then
    echo shar: will not over-write existing file "'README'"
else
sed 's/^X//' << \SHAR_EOF > 'README'
X@(#) $Header: article,v 1.1 91/01/11 23:52:42 leres Exp $ (LBL)
X
X                        README for dexpire
X
XHere is the dexpire distribution. Hopefully, you should find the
Xfollowing files:
X
X    Makefile        - compilation rules
X    README          - this file
X    dexpire.8       - manual entry
X    dexpire.c       - main program
X    disk.c          - disk usage routines
X    file.c          - active and dexplist parsers
X    util.c          - random utility routines
X    version.c       - release version number and date
X    dexpire.h       - configuration
X    disk.h          - forward declarations
X    file.h          - forward declarations
X    util.h          - forward declarations
X    patchlevel.h    - patchlevel (just in case there are bugs)
X    dexplist        - sample control file
X    dodexpire       - sample dexpire script
X    blocktest.c     - block size test program
X
X                     Installation Instructions
X
XWe are a cnews site and so these instructions are biased towards
Xcnews. This package is known to compile and nominally run under SunOS
X3.5 and SunOS 4.1 on Sun 3's and Sun 4's (and under Ultrix, thanks to
XStan Barber). Dexpire uses statfs() to determine disk usage. If you
Xdon't have statfs(), you'll have to write your own version of disk.c.
XSince disk_usage() is only invoked once, it would be acceptable to
Xfork() and parse the output of /bin/df. If you're trying to build on a
XSequent, you need to add "-lseq" to "LIBS" in the Makefile to link in
Xgetopt(3).
X
XFirst test to make sure that dexpire's assumptions about the filesystem
Xblock size are correct:
X
X    make blocktest
X
XThis program checks the units of the st_blocks field of the stat
Xstructure. Dexpire assumes that the units are 512 byte blocks
X(perhaps rounded up to the next even block because the filesystem
Xfragment size is 1024 bytes). If blocktest doesn't successfully build
Xand report "success," running dexpire might be dangerous. (I'm not
Xpositive there are any Unix systems this test will fail on but it helps
Xme sleep better).
X
XNext, configure dexpire.h. Although the location of the spool
Xdirectory, active and dexplist files can be changed with flags, it's
Xusually more convenient to have the builtins correct. It shouldn't be
Xnecessary to change the DTIME or TOGO limits. MAX_FREE is a safety and
Xmight need to be increased if you have a really, really small spool
Xpartition.
X
XNow configure the Makefile. If you don't have gcc, comment out the CC
Xline. It might also be necessary to change the target in the install
Xrule.
X
XNow configure a dexplist file. If you currently use the cnews expire,
Xit's pretty easy to convert an explist file to the dexplist format;
Xbasically, edit it down so that you only have the first 3 fields.
XOtherwise, make a copy of the sample dexplist and start hacking.
X
XNow compile and test. It's strongly recommended that you use -n until
Xyou're sure everything works ok. Try something like:
X
X    dexpire -vn -f 10000
X
XPut the output in a file so you can examine it at your leisure. Make
Xsure the reported disk statistics match what /bin/df says. The first
Xpass should always find at least one article that could be deleted. The
Xend of the run should report a reasonable number of articles and the
Xcorrect number of bytes to be deleted.
X
XThe -d flag can be useful in debugging a new dexplist file. It's a good
Xidea to check the output of:
X
X    dexpire -vdn -f 10000
X
Xto make sure newsgroups end up in the classes you expect.
X
XIf your new dexpire policy differs from your old expire setup, it isn't
Xunusual for there to be hundreds of consecutive unproductive passes.
XBut since dexpire caches article timestamps, these extra passes only
Xuse up a little extra cpu time; and after the new policy catches up,
Xthings will settle down to 10 to 20 passes.
X
XOur news node is a Sun 3/50 with a CDC Wren V running SunOS 3.5. Our
Xspool partition is about 350 Mbytes. It usually takes about 12 minutes
Xto do a daily dexpire (including updatemin). Run time depends pretty
Xmuch on the number of articles deleted.
X
XPlease send comments, suggestions, bug reports, etc to:
X
X    Craig Leres
X    leres@ee.lbl.gov (ucbvax!leres for uucp weenies)
X    Lawrence Berkeley Laboratory
X    One Cyclotron Road
X    Mail Stop 46A-1123
X    Berkeley, California 94720
SHAR_EOF
if test 4084 -ne "`wc -c < 'README'`"
then
    echo shar: error transmitting "'README'" '(should have been 4084 characters)'
fi
fi # end of overwriting check
#    End of shell archive
exit 0
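
The arithmetic described in the manual entry can be sketched in C. This
is only an illustration under stated assumptions, not code from the
dexpire distribution. The function names (parse_free_kbytes,
class_keep_days, blocks_to_kbytes) are hypothetical. The linear scaling
of keep time by priority is an assumption read from the "half the
priority, half as long" example, and the negative return value used to
mean "never expire" is a convention invented for this sketch.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

#define DEFAULT_FREE_KBYTES 4000L   /* default stated in the manual entry */

/*
 * Parse a -f style argument: a plain number is Kbytes, a trailing 'M'
 * means Mbytes. Returns the default when no argument is given and -1
 * on malformed or out-of-range input. (Hypothetical helper.)
 */
long parse_free_kbytes(const char *arg)
{
    char *end;
    long n;

    if (arg == NULL || *arg == '\0')
        return DEFAULT_FREE_KBYTES;
    n = strtol(arg, &end, 10);
    if (end == arg || n < 0)
        return -1;
    if (*end == 'M' && end[1] == '\0') {
        if (n > LONG_MAX / 1024L)
            return -1;              /* would overflow */
        return n * 1024L;           /* "4M" -> 4096 Kbytes */
    }
    if (*end != '\0')
        return -1;
    return n;
}

/*
 * Days to keep articles in a class, given the "standard" number of days
 * for the highest priority class. Assumes keep time scales linearly
 * with the priority ratio. A negative result means "never expire"
 * (priority zero); min_days models the optional fourth dexplist field.
 */
double class_keep_days(double standard_days, int priority,
                       int top_priority, double min_days)
{
    double days;

    if (priority == 0 || top_priority <= 0)
        return -1.0;
    days = standard_days * (double)priority / (double)top_priority;
    if (days < min_days)
        days = min_days;
    return days;
}

/*
 * Convert an st_blocks count to Kbytes, assuming 512 byte units (the
 * assumption blocktest is meant to verify). Odd counts round up.
 */
long blocks_to_kbytes(long st_blocks)
{
    return (st_blocks + 1) / 2;
}
```

For example, with the standard class kept for 20 days, a class at half
its priority would be kept for 10 days under this model, and "-f 4M"
asks for 4096 free Kbytes. A pass loop in the spirit of the
IMPLEMENTATION section would call class_keep_days() for each class,
lowest priority first, and lower standard_days a little whenever a pass
ends without freeing enough space.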