forrie@morwyn.UUCP (Forrie Aldrich) (06/06/91)
I'm flabbergasted. For some reason my EXPIRE command isn't expiring
a lot of different articles... in particular I have noticed that some
of the articles that are crossposted into groups that I don't get here
don't expire... I have to manually delete them. This can't be right, and
I would appreciate some advice here...
The version of news I have is: Bnews 2.11 patchlevel 19 ... which is
the latest and greatest if I am not mistaken...
Please respond via email to:
... uunet!virgin!unhtel!morwyn!forrie
as I don't regularly get the news.*.* groups on my node.
Thanks in advance...
Forrie
--
--------------------=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--------------------
Forrest Aldrich, Jr.| (a reliable path here someday) |forrie@morwyn.UUCP
| <email paths> |
CREATIVE CONNECTIONS| uunet!virgin!unhtel!morwyn!forrie |Graphic Illustration
------------------\-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=/------------------
\___ PO Box 1541 - Dover, NH 03820 ___/
rusty@anasaz.UUCP (Rusty Carruth) (06/20/91)
Some time back someone remarked about the glut of "expire" replacements,
and how they felt that people were going astray by not using the
"supplied" expire program. (massively interpreted/filtered observation
there - please note that this was/is NOT intended as a flame, nor am I
intending to cast aspersions on ANYBODY's work! I am VERY thankful for
all the software others have supplied which supports news!)

I've finally come to the point that I am able to make a semi-rational
statement as to why I believe that this happens.

When I run expire, I am usually wanting to free up some given amount
of disk space (for incoming news to land in, for example). I rarely
am thinking of limiting the length of time articles hang around in
a newsgroup - I simply want to get some free disk space without angering
my users too much.

So, when I "write" an expire table, what do I enter? Retention times.
And, if I don't get enough space from the expire, I either have to make
the times shorter (and remember to go back and change it after expire has
finished), manually remove some stuff, or ...

I currently administer 2 news machines. One is in the process of being
converted to cnews, the other is still on Bnews (the one I'm posting from,
as it happens). I've been using "reap", with some mods, here on "this
machine" (anasaz) for some time with pretty good results, except that
I've got a VERY complex reap list, which makes adding new news groups
a pain.

The algorithm I'm hoping to implement runs something like this:

    Set a goal for the amount of free space you want now.
    lump newsgroups into one of 3 categories: junk, good, archive (archive
        is not currently being done)
    set "high" and "low" limits for each newsgroup (see below for their use)
    set "rate" values for each newsgroup (also see below)

    for each junk group, expire anything older than 1 day
    and see if you have freed up the space desired
    if so stop, if not, continue to:

    for each good newsgroup:
        expire any articles older than the "high" limit for this newsgroup
        if avail space > desired space, stop
    end for

    if we still need more space:
    for each newsgroup (note - junk groups included)
        if oldest article is older than "low limit" then
            expire "rate" days of articles (i.e. if rate = 1, expire
            1 days worth)
        if avail space > desired space, stop
    end for

Another person here has an idea based upon priorities and such, but it seemed
even harder to implement than my hare-brained idea :-)

Reading the doc for expire, it looks like I could add another field to the
middle field of the history line which contains the SIZE of the file (thus
saving me from having to scan the entire directory structure to calculate
file sizes). Would there be any massive problems with doing this?

(Note that my intention is to run the above algorithm from top to bottom,
THEN actually remove the files, thus allowing me to traverse the tree only
once)

Also, I take it that a '-' in the second subfield means that the
article has been expired?

Anybody crazy enough to help me on this insane project?

Would the "powers-that-be" be interested in including my version of
"param_expire" (or whatever in the world it turns out to be called)
in future Cnews's (as an optional method for expiration)? (Assuming that
I get it finished this century...)

Is this even a good idea, in other folks' minds? (PLEASE, if you reply to
this question, notice and address the issue of <why it is we run expire
in the first place> (see paragraph 3 above)).
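
As a rough illustration of the algorithm outlined above (a sketch, not the
poster's implementation), the three phases can be written as a planning
function that decides what to remove before any file is touched. The names
Article, Policy and plan_expiry are hypothetical; ages are in days and sizes
in bytes. Two points the description leaves open are filled in by assumption:
free space is checked after each group, and the last phase never cuts a group
below its "low" limit. The "archive" category is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    group: str
    age_days: float
    size: int          # bytes


@dataclass(frozen=True)
class Policy:
    category: str      # "junk" or "good"
    high: float        # days; phase 2 cutoff
    low: float         # days; phase 3 floor
    rate: float        # days of articles removed per phase 3 step


def plan_expiry(articles, policies, free_now, free_goal):
    """Return the articles to remove, in the order they were selected."""
    remaining = list(articles)
    doomed = []
    free = free_now

    def expire_older_than(group, cutoff_days):
        """Select one group's articles older than the cutoff; True if goal met."""
        nonlocal remaining, free
        keep = []
        for art in remaining:
            if art.group == group and art.age_days > cutoff_days:
                doomed.append(art)
                free += art.size
            else:
                keep.append(art)
        remaining = keep
        return free >= free_goal

    if free >= free_goal:
        return doomed

    # Phase 1: junk groups lose everything older than one day.
    for group, pol in policies.items():
        if pol.category == "junk" and expire_older_than(group, 1):
            return doomed

    # Phase 2: good groups lose everything older than their "high" limit.
    for group, pol in policies.items():
        if pol.category == "good" and expire_older_than(group, pol.high):
            return doomed

    # Phase 3: every group gives up "rate" days of its oldest articles,
    # but (by assumption) nothing younger than its "low" limit.
    for group, pol in policies.items():
        ages = [a.age_days for a in remaining if a.group == group]
        if ages and max(ages) > pol.low:
            cutoff = max(max(ages) - pol.rate, pol.low)
            if expire_older_than(group, cutoff):
                return doomed
    return doomed
```

Because the function only returns a list, the actual removal can happen in a
single step afterwards, in line with the intention of running the algorithm
from top to bottom first.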
"Raving wildly, Rusty hits the "s" key in rn" :-) Rusty {ames!ncar!noao!asuvax,mcdphx}!anasaz!rusty anasaz!rusty\ 73 de Rusty Carruth, N7IKQ (602) 870-3330 anasaz.UUCP!rusty>@asuvax P.O. Box 27001, Tempe, AZ 85285 rusty%anasaz.UUCP/ \.eas.asu.edu
adeboer@gjetor.geac.COM (Anthony DeBoer) (06/21/91)
In article <4313@anasaz.UUCP> rusty@anasaz.UUCP (Rusty Carruth) writes:
> [ proposed spec for freespace-based expire program ]
>
>(Note that my intention is to run the above algorithm from top to bottom,
>THEN actually remove the files, thus allowing me to traverse the tree only
>once)

Actually, C news expire never traverses the /usr/spool/news tree. It reads
through the history file, decides what to do with each line, and reaches
into the spool directories only to unlink() (or archive) articles. It will
normally also (i.e. unless you use the -r option) rewrite the history file
to reflect the deletions.

If I was sitting down to write your program, I'd use expire to do the dirty
work (it's already written, it's fast, and it works), feeding more-or-less
severe explist files to it and then checking freespace to see if the next
pass should be taken or not. You could either write a series of explist
files manually, calling them explist.1, explist.2, and so on, each
reflecting one pass of your algorithm, or write a program (which could be
an awk script) to generate the n-th version from a master file containing
additional parameters.

A shell script based on the existing "doexpire" script could handle taking
the appropriate number of passes every time cron invokes it (and you could
have it start up periodically during the day and check if you're really
tight on space and do a pass or so, and feed it a different parameter on
the cron command line at night to do a proper cleanup). If you want to get
fancy, have it save the "severity level" it ran at the last time, and use
this when you start off. If there's a lot of freespace, back off a level or
two, then start with an expire pass at the appropriate level.

(BTW, if you want to do a run to delete only "junk" groups, you could feed
expire an explist that tells it that all groups except the ones listed
stick around for 999 days, for example.)

Just as a disclaimer, even though my gut feeling was originally that I
needed something like this on my system, it's turned out that a
pretty-near-vanilla C News is working quite happily here, so I've never sat
down to actually implement such a thing. The only real problem I've had
with news is that newsrun, spacefor, and relaynews were conspiring to use
up all my inodes, which I've patched, and Henry tells me they're looking at
doing a proper fix in the next major release.
--
Anthony DeBoer NAUI#Z8800               adeboer@gjetor.geac.com
Geac Canada Ltd., Toronto               uunet!geac!gjetor!adeboer
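
The progressive scheme suggested in the post above might be sketched as
follows. This is only an outline under assumptions: render_explist and
progressive_expire are hypothetical names, the master table maps each group
pattern to one retention time per severity level, and the four-column explist
layout used here (groups, moderation flag, days, archive directory) should be
checked against the C News documentation. Running expire and measuring free
space are passed in as callables, so no particular command line is assumed.

```python
def render_explist(master, level):
    """Build explist text for one severity level.

    master is a list of (group_pattern, [days_at_level_0, days_at_level_1, ...]).
    Levels past the end of a list reuse its last (most severe) entry.
    """
    lines = []
    for groups, days_by_level in master:
        days = days_by_level[min(level, len(days_by_level) - 1)]
        # Assumed layout: groups, flag ("x" for any group), days, "-" for
        # no archiving. Verify against the expire documentation before use.
        lines.append(f"{groups}\tx\t{days}\t-")
    return "\n".join(lines) + "\n"


def progressive_expire(master, levels, goal_bytes, free_bytes, run_expire):
    """Run up to `levels` passes, stopping once free space reaches the goal.

    free_bytes() returns the current free space in bytes.
    run_expire(text) is expected to run expire with the given explist text.
    Returns the number of passes actually run.
    """
    passes = 0
    for level in range(levels):
        if free_bytes() >= goal_bytes:
            break
        run_expire(render_explist(master, level))
        passes += 1
    return passes
```

In practice free_bytes could wrap shutil.disk_usage(spool_dir).free, and
run_expire would write the text to a temporary file and hand it to expire.
The returned pass count is the sort of "severity level" that could be saved
for the next run.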
flee@cs.psu.edu (Felix Lee) (06/22/91)
>[...], feeding more-or-less severe explist files to it and then
>checking freespace to see if the next pass should be taken or not.

This is called "progressive expire". Several people have implemented
various forms of this, posted to alt.sources and such.

I've been sporadically working on implementing pure space-based expiry.
You set a target amount of space free or space used in whatever
newsgroups you like, and in a single pass enough articles are removed to
satisfy the constraints.

The advantage of this is that expiry can be a continuous process. As you
receive news you can remove a corresponding amount of old news so your
disk space usage remains at a steady state. This should let you run
smoothly with tight space, especially with some cooperation from
"spacefor".

The disadvantage of this is that it's probably going to be a little more
expensive than simple date-based expiry.
--
Felix Lee       flee@cs.psu.edu
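
One simple reading of space-based expiry for a single group, sketched under
assumptions: each article is known as an (arrival time, size in bytes, path)
tuple, the constraint is a space-used target, and the oldest articles go
first. trim_to_quota is a hypothetical name, and this is not the
implementation mentioned in the post above.

```python
def trim_to_quota(articles, quota_bytes):
    """Return paths to remove, oldest first, until usage fits the quota.

    articles is an iterable of (arrival_time, size_bytes, path) tuples.
    """
    articles = list(articles)
    used = sum(size for _, size, _ in articles)
    doomed = []
    # Sorting the tuples orders them by arrival time, oldest first.
    for _, size, path in sorted(articles):
        if used <= quota_bytes:
            break
        doomed.append(path)
        used -= size
    return doomed
```

Calling something like this after each batch of incoming news would give the
steady-state behaviour described above, at the cost of needing article sizes
on hand.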
henry@zoo.toronto.edu (Henry Spencer) (06/25/91)
In article <4313@anasaz.UUCP> rusty@anasaz.UUCP (Rusty Carruth) writes:
>When I run expire, I am usually wanting to free up some given amount
>of disk space (for incoming news to land in, for example). I rarely
>am thinking of limiting the length of time articles hang around in
>a newsgroup ...

We actually have two different user communities here, with the distinction
a function of how tight your disk space is. Those of us with reasonably
ample resources (for the moment!) do tend to think about hang-around time.

> lump newsgroups into one of 3 categories: junk, good, archive (archive
> is not currently being done)
> set "high" and "low" limits for each newsgroup (see below for their use)
> set "rate" values for each newsgroup (also see below)
> ...
>Another person here has an idea based upon priorities and such, but it seemed
>even harder to implement than my hare-brained idea :-)

The main reason why we didn't attempt something like a space-based expire
in C News was the problem of defining what the policy should be. The more
I tried to write a description, the more complex it got, and the less
obvious it was that people could understand it and that it would meet
their needs. What you've defined is a plausible approach if your groups
can be split into those three categories easily.

>Reading the doc for expire, it looks like I could add another field to the
>middle field of the history line which contains the SIZE of the file (thus
>saving me from having to scan the entire directory structure to calculate
>file sizes). Would there be any massive problems with doing this?

I thought very seriously about doing exactly this, in fact, and the only
reason it wasn't done was that in the end I didn't have a use for it. I
think nothing should mind; nothing in C News depends on having exactly
two subfields, although it's possible that other stuff (NNTP?) does.
Putting a size in as a third subfield is a reasonable idea, although
please do it in bytes -- the concept of "block" is not portable.

>Also, I take it that a '-' in the second subfield means that the
>article has been expired?

No, no. Please read the documentation! It means that no explicit expiry
date was supplied.

>Would the "powers-that-be" be interested in including my version of
>"param_expire" (or whatever in the world it turns out to be called)
>in future Cnews's (as an optional method for expiration)? ...

It's not out of the question, but I'd like to see more attention to a
sophisticated policy mechanism. As you've specified it so far, it could
be done without too much trouble using iterative running of the existing
expire with tighter (possibly mechanically generated) explists. Not as
quick as a single pass, but probably acceptably fast for most sites, and
it would be much simpler to set up.
--
"We're thinking about upgrading from   | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 to SunOS 3.5."             |  henry@zoo.toronto.edu  utzoo!henry
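
To make the subfield discussion concrete, here is a hedged sketch of reading
a history line whose middle field carries an optional third subfield for the
size in bytes. It assumes tab-separated fields, "~" between subfields, and an
arrival time in decimal seconds, all of which should be verified against the
C News documentation. parse_history_line and the sample line are made up for
illustration.

```python
def parse_history_line(line):
    """Split one history line into its parts.

    Assumed layout: message-ID <TAB> arrival~expiry[~size] <TAB> article names
    """
    fields = line.rstrip("\n").split("\t")
    msgid = fields[0]
    sub = fields[1].split("~")
    arrival = int(sub[0])
    # "-" means no explicit expiry date was supplied, not "already expired".
    expiry = sub[1] if len(sub) > 1 and sub[1] != "-" else None
    # Third subfield is the proposed extension: article size in bytes.
    size = int(sub[2]) if len(sub) > 2 else None
    # The last field may be missing entirely; treat that as no article names.
    names = fields[2].split() if len(fields) > 2 else []
    return {"msgid": msgid, "arrival": arrival, "expiry": expiry,
            "size": size, "names": names}


# Hypothetical sample line, not taken from a real history file.
sample = "<1@example.UUCP>\t677000000~-~2048\tnews.test/12 misc.test/7\n"
record = parse_history_line(sample)
```

A reader written this way tolerates lines with only two subfields, so old and
new history entries could coexist while sizes are being added.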