reid@Cascade.ARPA (11/02/84)
We run 2.10.2 news, essentially as distributed in net.sources. Lacking any
particular documentation on expire except the out-of-date man page that came
with 4.2BSD, and reading through the source for expire enough to see that it
is not obvious, I have been doing my expiring in the following way:

    % grep expire /usr/adm/daily.sh
    /usr/lib/news/expire -n arpa.unix-wizards -e30 -a
    /usr/lib/news/expire -n net.sources -e15 -a
    /usr/lib/news/expire -e30 -n all
    /usr/lib/news/expire -n net.singles -n net.flame -n net.politics -n net.religion -e10

This seems to more or less work, though it has left some very strange things
in my history files from time to time. My complaint is that it takes 3 hours
of wall clock time and 73 minutes of CPU time on an idle 750 to run these 4
expire commands:

    % lastcomm expire | head -8
    expire  S  news  __   762 secs  Fri Oct 26 07:40
    expire  S  news  __  2258 secs  Fri Oct 26 06:38
    expire  S  news  __   681 secs  Fri Oct 26 05:54
    expire  S  news  __   720 secs  Fri Oct 26 04:55
    expire  S  news  __   536 secs  Thu Oct 25 06:58
    expire  S  news  __  2454 secs  Thu Oct 25 05:29
    expire  S  news  __   253 secs  Thu Oct 25 05:13
    expire  S  news  __   294 secs  Thu Oct 25 04:54

Does everybody's expire take this long? If not, what am I doing wrong?
If so, does anybody but me think this is too much?
chuqui@nsc.UUCP (Zonker T. Chuqui) (11/05/84)
In article <1069@Cascade.ARPA> reid@Cascade.ARPA writes:
>We run 2.10.2 news, essentially as distributed in net.sources.
>
>% grep expire /usr/adm/daily.sh
>/usr/lib/news/expire -n arpa.unix-wizards -e30 -a
>/usr/lib/news/expire -n net.sources -e15 -a
>/usr/lib/news/expire -e30 -n all
>/usr/lib/news/expire -n net.singles -n net.flame -n net.politics -n net.religion -e10

A more efficient way of doing this for 2.10.2 would be:

    /usr/lib/news/expire -a arpa.unix-wizards -e30
    /usr/lib/news/expire -a net.sources -e15
    /usr/lib/news/expire -n net.singles net.flame net.politics net.religion -e10

One change to expire is that the -a flag now accepts arguments, so the first
expire will do the work of both the original first and third. It will expire
everything AND archive only arpa.unix-wizards. You have to be rather familiar
with the expire source to figure this out; the code for it isn't obvious.
In previous versions of expire, the -a flag was an all-or-nothing situation.

>This seems to more or less work, though it has left some very strange things
>in my history files from time to time. My complaint is that it takes 3 hours
>of wall clock time and 73 minutes of CPU time on an idle 750 to run these 4
>expire commands:
>
>Does everybody's expire take this long? If not, what am I doing wrong?
>If so, does anybody but me think this is too much?

Expire is, to put it nicely, a hog. Your figures aren't out of line with what
you are asking it to do. Cutting out that fourth expire will help, and if you
can keep net.singles et al. for 15 days instead of 10, this MIGHT (untested!
untested!) work:

    /usr/lib/news/expire -e15 -a net.sources -n net.singles net.flame net.politics net.religion

chuq
--
From the Department of Bistromatics: Chuq Von Rospach
{cbosgd,decwrl,fortune,hplabs,ihnp4,seismo}!nsc!chuqui nsc!chuqui@decwrl.ARPA
I'd know those eyes from a million years away....
geoff@desint.UUCP (Geoff Kuenning) (11/10/84)
From chuqui@nsc.UUCP:
>Expire is, to put it nicely, a hog.
The current expire opens up every article file to look for an "Expires:" line
in the header. To find out how much this costs (approximately), I did:
    cd /usr/spool/news
    time find . -type f -print | xargs cat >/dev/null
(In retrospect head -5 would have been more accurate, but it's not off by too
much). It ran for 45 minutes before I had to abort it, and produced exactly
the same seeking pattern as expire. My normal expires run somewhere from an
hour to 1:15 when there is no other disk activity, and eat essentially 100%
of the seek time on the drive.
The obvious solution is to put the expiration date in the history file. This
is a bit beyond my current free-time level. So I was wondering about doing
a shell script something like this:
    1. Break up /usr/lib/news/history into article pathnames.
    2. Sort the list, and 'comm' it against yesterday's list to get a list
       of newly-arrived articles.
    3. Append their pathnames and expiration dates to a file called
       /usr/lib/news/expdates.
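
The 'comm' step in this outline amounts to a set difference between two
pathname lists. A minimal sketch in Python, assuming the history file has
already been broken up into one pathname per entry; the function name
new_arrivals and the sample pathnames are illustrative only and are not part
of the news software:

```python
def new_arrivals(yesterday, today):
    """Return pathnames present in today's list but not in yesterday's.

    This mirrors `comm -13 yesterday today` run on two sorted files.
    """
    seen = set(yesterday)
    return sorted(p for p in set(today) if p not in seen)


# Hypothetical article pathnames, relative to /usr/spool/news.
yesterday = ["net/flame/101", "net/sources/40"]
today = ["net/flame/101", "net/flame/102", "net/sources/40", "net/sources/41"]
fresh = new_arrivals(yesterday, today)
```

Only the entries in fresh would need their headers read, once, when they are
appended to the expdates file.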
From here, it is fairly easy to see how to expire without opening lots of
files. Only, when I plot it out a bit more, it becomes obvious that you need
to write at least one program that calls getdate.y to crack the dates and
has the smarts to expire based on newsgroup, Expires line, and Date-Received
and such lines. So that doesn't seem like much of an approach, either.
Can anybody out there come up with a quick hack to cut down on these multiple
opens? Or does somebody maybe have the time to do it right?
--
Geoff Kuenning
First Systems Corporation
...!ihnp4!trwrb!desint!geoff
henry@utzoo.UUCP (Henry Spencer) (11/13/84)
We've got inews modified to put the expiry date (if any) in the history file,
and expire modified to look at it. But this inews/expire pair is based on
2.10 (not even 2.10.1), and I haven't had a chance to compare it to 2.10.2 yet
to decide whether the changes are compatible.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
chuqui@nsc.UUCP (Cheshire Chuqui) (11/17/84)
In article <206@desint.UUCP> geoff@desint.UUCP (Geoff Kuenning) writes:
>
>The current expire opens up every article file to look for an "Expires:" line
>in the header.
>
>My normal expires run somewhere from an
>hour to 1:15 when there is no other disk activity, and eat essentially 100%
>of the seek time on the drive.
>
>The obvious solution is to put the expiration date in the history file.

Unfortunately, it isn't quite so obvious. Expire has the '-e' flag that
changes the time to expire, plus the '-i' and '-I' flags that cause it to
ignore 'Expires:' header lines. Use of any of these flags makes generating
expiration dates for the history file at reception time impossible. If you
are willing to use only the default cases of expire, it would help, but I
have yet to see a news site that does that. For example, I set DFLTEXP (the
defs.h value for when to get rid of things) rather high, usually 30 days or
so, and then use a series of expires with the '-e' option to keep the data
base to a specific size depending on how much disk space I've got allocated
to it.

What might be useful would be to add code so that inews records articles with
'Expires:' lines in some file, say in the form
'<article_id> <expiration_date>'. If we do that, then expire can use the
existing date in the history file as the basis for the expiration and
reference the expiration date from this other file if necessary. It might
also be possible to simply flag articles that carry an 'Expires:' line in the
history file and get expire to only look at them, saving us another file at
the expense of a change to the history file format.

If we DO change the history file format, what does this break? Anyone out
there have any ideas?

chuq
--
From the Department of Bistromatics: Chuq Von Rospach
{cbosgd,decwrl,fortune,hplabs,ihnp4,seismo}!nsc!chuqui nsc!chuqui@decwrl.ARPA
This plane is equipped with 4 emergency exits, at the front and back of the
plane and two above the wings. Please note that the plane will be travelling
at an average altitude of 31,000 feet, so any use of these exits in an
emergency situation will most likely be futile.
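
The side-file idea in the post above can be sketched as a simple lookup
table. This is only an illustration in Python: the post gives the line form
'<article_id> <expiration_date>' but no date format, so epoch seconds are an
assumption here, and the function names and sample article IDs are made up.

```python
def load_expdates(lines):
    """Parse '<article_id> <expiration_date>' lines into a dict.

    ASSUMPTION: the date is stored as seconds since the epoch; the post
    does not specify a date format.
    """
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        article_id, when = line.split(None, 1)
        table[article_id] = int(when)
    return table


def explicit_expiry(table, article_id):
    """Return the poster-specified expiry time, or None if there was none."""
    return table.get(article_id)


# Hypothetical side-file contents.
side_file = ["<123@nsc.UUCP> 470000000", "", "<456@desint.UUCP> 471000000"]
expdates = load_expdates(side_file)
```

An article that is absent from the table has no 'Expires:' line, so expire
could fall back on the history file date without opening the article.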
lepreau@utah-cs.UUCP (Jay Lepreau) (11/19/84)
Chuq claims that the use of "expire -e N" negates the value of putting
the expiration date in the history file. Huh?? I'm no expert on the
news software, but that makes no sense to me. The object is to avoid
opening every news article to find the poster-specified expiration
date. Putting that date in the history file doesn't change one bit the
algorithm for determining the actual expiration date.

Jay Lepreau
henry@utzoo.UUCP (Henry Spencer) (11/20/84)
> >The obvious solution is to put the expiration date in the history file.
>
> Unfortunately, it isn't quite so obvious. Expire has the '-e' flag that
> changes the time to expire, plus the '-i' and '-I' flags that cause it to
> ignore 'Expires:' header lines. Use of any of these flags make generating
> expiration dates for the history file at reception time impossible. ...

It's still dead simple. If the file came in with no explicit expiry date,
you simply record it as such in the history file. (The way we do it is to
give the expiry date as "-" in this case.) The history file is already
recording the arrival date, which (around here, at least) is the other date
that expire needs to look at. Expire can make its decisions exactly the same
way it has in the past, but rather more quickly.

> If we DO change the history file format, what does this break? Anyone out
> there have any ideas?

It broke practically nothing here; I had to make very small adjustments in
one or two places.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
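
To make the "-" convention concrete, here is a rough Python sketch of reading
one such history line. Only the "-" marker comes from the post. The field
order, the tab separator, the epoch-second times and the name
parse_history_line are all assumptions made for illustration, not the actual
utzoo format.

```python
def parse_history_line(line):
    """Split one history line into (message_id, arrival, explicit_expiry).

    ASSUMPTION: fields are tab-separated in the order message-id, arrival
    time, expiry time, with times as epoch seconds. Only the "-" marker for
    "no explicit expiry" is taken from the post.
    """
    message_id, arrival, expiry = line.rstrip("\n").split("\t")[:3]
    explicit = None if expiry == "-" else int(expiry)
    return message_id, int(arrival), explicit
```

For example, a line whose third field is "-" yields None for the explicit
expiry, and expire would then fall back on the arrival date alone.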
chuqui@nsc.UUCP (Cheshire Chuqui) (11/20/84)
In article <3114@utah-cs.UUCP> lepreau@utah-cs.UUCP (Jay Lepreau) writes:
>Chuq claims that the use of "expire -e N" negates the value of putting
>the expiration date in the history file. Huh?? I'm no expert on the
>news software, but that makes no sense to me. The object is to avoid
>opening every news article to find the poster-specified expiration
>date. Putting that date in the history file doesn't change one bit the
>algorithm for determining the actual expiration date.
>

Here is the situation. Assume we put the expiration date in the history
file. When a message comes in, we add the DFLTEXP value (say 14 days) to
the time received to get the expiration time (November 1 becomes November
15 for expiration). If you then run expire with the -e option to expire
something at, say, 10 days, that article should really expire November 11.
The stored expiration time becomes useless, because the system would still
have to read the file and add the -e expiration time to it to find out if it
should be expired.

A better alternative is to make sure the date in the history file is the
date received (in an easily usable format), add a flag if there is an
Expires: header line (or better yet, do away with it completely), and base
expiration on the date in the history file + DFLTEXP or the -e value, opening
up an article only when it has to fetch an Expires: line.

Come to think of it, that shouldn't be too much trouble to implement. Me and
my copious free time.....

chuq
--
From the Department of Bistromatics: Chuq Von Rospach
{cbosgd,decwrl,fortune,hplabs,ihnp4,seismo}!nsc!chuqui nsc!chuqui@decwrl.ARPA
This plane is equipped with 4 emergency exits, at the front and back of the
plane and two above the wings. Please note that the plane will be travelling
at an average altitude of 31,000 feet, so any use of these exits in an
emergency situation will most likely be futile.
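
The date arithmetic in this post can be checked with a few lines of Python.
This is a worked example only: the year 1984 is taken from the posting date,
14 days is the example value the post uses for DFLTEXP, and is_expired is a
made-up helper that assumes an article expires on the day its retention
period is reached.

```python
from datetime import date, timedelta

DFLTEXP_DAYS = 14  # example value from the post, not necessarily the shipped default

received = date(1984, 11, 1)
# What would be stored if the expiry date were precomputed at reception.
precomputed = received + timedelta(days=DFLTEXP_DAYS)
# What a later `expire -e10` run actually wants.
with_e10 = received + timedelta(days=10)


def is_expired(received, retention_days, today):
    """True once the article has been held for retention_days (assumed rule)."""
    return today >= received + timedelta(days=retention_days)
```

Storing the received date and applying the retention period at expire time
gives the right answer for either value; storing only the precomputed date
does not.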
phil@amdcad.UUCP (Phil Ngai) (11/20/84)
I bet Jay thought the date put in the history file would be derived from the
Expires: line or the date of receipt if there was no Expires: line. If you
want even more flexibility, you could also store the date the article was
received.

> In article <3114@utah-cs.UUCP> lepreau@utah-cs.UUCP (Jay Lepreau) writes:
> >Chuq claims that the use of "expire -e N" negates the value of putting
> >the expiration date in the history file. Huh?? I'm no expert on the
> >news software, but that makes no sense to me.
>
> here is the situation. Assume we put the expiration date in the history
> file. When a message comes in, we add the DFLTEXP value (say 14 days) to
> the time received to get the expiration time (november 1 becomes november
> 15 for expiration).
--
When you've seen one tree, you've seen them all. -Bonzo Reagan
Phil Ngai (408) 749-5790
UUCP: {ucbvax,decwrl,ihnp4,allegra,intelca}!amd!phil
ARPA: amdcad!phil@decwrl.ARPA
gnu@sun.uucp (John Gilmore) (11/22/84)
The problem here is that there are two algorithms for expiration date:

    (1) messages without Expires: take receipt date, add "-e" value or
        DFLTEXP if none was used.
    (2) messages with Expires: use the specified date

So, it sounds to me like the info you need in the history file is:

    (A) Was Expires: specified?
    (B) if A==no, the receipt date
        if A==yes, the specified expiration date

Then you'll never need to look in the message to figure out either case.
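
The two cases above map onto a two-field record. A minimal Python sketch,
assuming times are held as epoch seconds; HistoryDate and expiry_time are
illustrative names, not anything in the news source:

```python
from typing import NamedTuple


class HistoryDate(NamedTuple):
    has_expires: bool  # (A) was an Expires: header specified?
    when: int          # (B) expiry time if has_expires, else receipt time


def expiry_time(entry, retention_secs):
    """Case (2): use the specified date. Case (1): receipt time + retention.

    retention_secs stands for the "-e" value, or DFLTEXP if "-e" was not given.
    """
    if entry.has_expires:
        return entry.when
    return entry.when + retention_secs


DAY = 24 * 60 * 60  # seconds per day
```

Because the retention period is applied only when expire runs, the same
record works for any "-e" value.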
rick@seismo.UUCP (Rick Adams) (11/24/84)
The following cheap fix will greatly speed up the case where several expires
are run consecutively. It will only do any good if you are using the -n
option. A much faster expire (roughly 3 times as fast) will be part of 2.10.3
in about a month or so.

---rick

*** expire.c	Fri Nov 23 17:12:54 1984
--- expire.c.new	Fri Nov 23 17:18:10 1984
***************
*** 515,520
  		if (sscanf(afline,"%s %ld %ld %c",nbuf,&maxart, &minart,
  		    &cansub) < 4)
  			xerror("Active file corrupt");
  
  		minart = maxart > 0 ? maxart : 1L;
  		/* Change a group name from a.b.c to a/b/c */
--- 515,525 -----
  		if (sscanf(afline,"%s %ld %ld %c",nbuf,&maxart, &minart,
  		    &cansub) < 4)
  			xerror("Active file corrupt");
  
+ 		if (!ngmatch(nbuf,ngpat)) {
+ 			fputs(afline, nhfd);
+ 			continue;
+ 		}
+ 
  		minart = maxart > 0 ? maxart : 1L;
  		/* Change a group name from a.b.c to a/b/c */
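
The effect of the added lines is that active-file entries for groups outside
the -n pattern are copied through untouched and skipped. A rough Python
sketch of that control flow follows. The real matching is done by ngmatch,
whose rules are not shown in the patch; prefix_match below is a hypothetical,
much simplified stand-in, and the sample active-file lines are made up to fit
the "%s %ld %ld %c" layout the sscanf call expects.

```python
def split_active(lines, wanted):
    """Partition active-file lines into (to_process, passed_through).

    `wanted` is a predicate on the group name; lines for groups it rejects
    are passed through unchanged, as the patch does with fputs/continue.
    """
    to_process, passed_through = [], []
    for line in lines:
        group = line.split()[0]
        (to_process if wanted(group) else passed_through).append(line)
    return to_process, passed_through


def prefix_match(prefixes):
    """HYPOTHETICAL stand-in for ngmatch: exact name or dotted-prefix match."""
    def wanted(group):
        return any(group == p or group.startswith(p + ".") for p in prefixes)
    return wanted


# Hypothetical active-file lines: name, max article, min article, flag.
active = [
    "net.flame 00120 00100 y",
    "net.sources 00041 00030 y",
    "arpa.unix-wizards 00900 00850 y",
]
todo, kept = split_active(active, prefix_match(["net.flame"]))
```

With several consecutive expire runs, each restricted by -n, most groups land
in the passed-through list on every run, which is where the time is saved.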