jv@mh.nl (Johan Vromans) (08/17/90)
I have noted that in the active file, the lower article number always
remains 1. Is this normal?

E.g.:
    comp.os.vms 0000002238 0000000001 y
                           ^^^^^^^^^^
Lowest article number (currently) is 1746, not 1.

Johan
--
Johan Vromans                      jv@mh.nl via internet backbones
Multihouse Automatisering bv       uucp: ..!{uunet,hp4nl}!mh.nl!jv
Doesburgweg 7, 2803 PL Gouda, The Netherlands  phone/fax: +31 1820 62911/62500
------------------------ "Arms are made for hugging" -------------------------
henry@zoo.toronto.edu (Henry Spencer) (08/17/90)
In article <1990Aug16.185023.26200@squirrel.mh.nl> Johan Vromans <jv@mh.nl> writes:
>I have noted that in the active file, the lower article number always
>remains 1. Is this normal?

This is normal unless you have a crontab entry that occasionally runs
upact (more portable) or updatemin (rather faster). This is currently
an ill-documented option; it will probably become standard when I get
around to changing it. We currently run updatemin weekly, for the benefit
of some stupid reader software. (The lower number is basically an
inadequate kludge that smarter software should never look at, but there
is a lot of dumb software in the world, sigh...)
--
It is not possible to both understand  | Henry Spencer at U of Toronto Zoology
and appreciate Intel CPUs. -D.Wolfskill| henry@zoo.toronto.edu   utzoo!henry
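
As a rough illustration of what an updatemin-style pass has to do, here is a Python sketch. It is not the C News code: the function names `lowest_article` and `update_min_field`, the spool layout (one directory per group, dots mapped to slashes), and the "min = max + 1" convention for an empty group are all assumptions made for the example.

```python
import os

def lowest_article(group_dir):
    """Return the smallest all-digit filename in group_dir, or None if there is none."""
    try:
        names = os.listdir(group_dir)
    except OSError:
        return None
    numbers = [int(n) for n in names if n.isascii() and n.isdigit()]
    return min(numbers) if numbers else None

def update_min_field(active_line, spool_root):
    """Rewrite the min field of one active-file line of the form 'group max min flag'."""
    group, high, low, flag = active_line.split()
    # Assumed layout: comp.os.vms lives in <spool_root>/comp/os/vms.
    group_dir = os.path.join(spool_root, *group.split("."))
    found = lowest_article(group_dir)
    if found is None:
        # Assumed convention for an empty or missing group: min is one past max.
        found = int(high) + 1
    # Keep the zero-padded width of the original min field.
    return "%s %s %0*d %s" % (group, high, len(low), found, flag)
```

Run from cron after expire, a loop over the active file calling `update_min_field` on each line would produce the corrected file; locking and atomic replacement of the active file are left out of the sketch.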
del@thrush.mlb.semi.harris.com (Don Lewis) (08/17/90)
In article <1990Aug17.034849.17801@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <1990Aug16.185023.26200@squirrel.mh.nl> Johan Vromans <jv@mh.nl> writes:
>>I have noted that in the active file, the lower article number always
>>remains 1. Is this normal?
>
>This is normal unless you have a crontab entry that occasionally runs
>upact (more portable) or updatemin (rather faster). This is currently
>an ill-documented option; it will probably become standard when I get
>around to changing it. We currently run updatemin weekly, for the benefit
>of some stupid reader software. (The lower number is basically an
>inadequate kludge that smarter software should never look at, but there
>is a lot of dumb software in the world, sigh...)

Updatemin is fast enough that we run it daily, right after expire.
--
Don "Truck" Lewis                      Harris Semiconductor
Internet:  del@mlb.semi.harris.com     PO Box 883   MS 62A-028
Phone:     (407) 729-5205              Melbourne, FL  32901
tale@turing.cs.rpi.edu (David C Lawrence) (08/17/90)
In article <1990Aug17.034849.17801@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:

   This is normal unless you have a crontab entry that occasionally runs
   upact (more portable) or updatemin (rather faster).

Or optionally put it in doexpire so it is executed as soon as expire is.

   (The lower number is basically an inadequate kludge that smarter
   software should never look at, but there is a lot of dumb software
   in the world, sigh...)

I agree; the lowest article scheme fails here due to Expires:
sometimes making large holes in groups. I think I will review the
latest NNTP protocol for a way to include "this group has n articles in
it" information, computed after a readdir and check for S_IFREG files.
This of course isn't terribly efficient, but is accurate right at the
time of doing it. It's a real loser when trying to present a summary
of groups and relative article volumes.

The nice thing about the min field is that it gives me a usually pretty
close estimate of how many articles are in the group, and it takes the same
amount of time to figure out no matter how huge the group is. Since I would
rather have this information slightly wrong sometimes than not at all,
I run updatemin. Unless I am missing something obvious, which could
well be at the moment, I don't see a wonderful way which smart
newsreaders could come up with that information without doing the
costly operation above each time they wanted. Of course, some things
like trn and nn keep their own databases (boy, do I love all this
space used on my disk) and run their own daemons, so they can keep an
up-to-date cache of this information somewhere. Right now, the mthreads
daemon (for trn) will only expire things from its database once a day.
--
(setq mail '("tale@cs.rpi.edu" "tale@ai.mit.edu" "tale@rpitsmts.bitnet"))
brad@looking.on.ca (Brad Templeton) (08/17/90)
In article <1990Aug17.034849.17801@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>of some stupid reader software. (The lower number is basically an
>inadequate kludge that smarter software should never look at, but there
>is a lot of dumb software in the world, sigh...)

Want to explain this Henry?

Programs do need the minimum -- for creating reasonable sized bitmaps, for
example. They can either figure out the minimum (by doing opendir on
the spool directory) or they can get it from the active file, which they
already read.

So you can calculate it 300 times per day in every reading session, or
once, in an upact type program.

So why is this dumb?
--
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473
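
To make the bitmap point concrete, here is a small Python sketch of a "read articles" bitmap whose size depends on the min and max fields. The class name `ReadBitmap` and the choice to treat anything below min as already read are assumptions for illustration, not taken from any particular newsreader.

```python
class ReadBitmap:
    """One bit per article number in [low, high]; a set bit means 'read'."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        size = max(high - low + 1, 0)
        self.bits = bytearray((size + 7) // 8)

    def mark_read(self, number):
        # Numbers outside [low, high] are silently ignored.
        if self.low <= number <= self.high:
            offset = number - self.low
            self.bits[offset >> 3] |= 1 << (offset & 7)

    def is_read(self, number):
        if number < self.low:
            return True    # assumed: below min means expired, so treat as read
        if number > self.high:
            return False
        offset = number - self.low
        return bool(self.bits[offset >> 3] & (1 << (offset & 7)))
```

With the comp.os.vms numbers from the start of the thread, `ReadBitmap(1746, 2238)` needs 62 bytes, while `ReadBitmap(1, 2238)` (the stale min of 1) needs 280.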
henry@zoo.toronto.edu (Henry Spencer) (08/17/90)
In article <1990Aug17.071243.16518@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>>of some stupid reader software. (The lower number is basically an
>>inadequate kludge that smarter software should never look at, but there
>>is a lot of dumb software in the world, sigh...)
>
>Programs do need the minimum -- for creating reasonable sized bitmaps, for
>example.

Programs should do a directory sweep to find out what articles are *actually
present* rather than making the -- unwise and often wrong -- assumption that
there is a nearly-contiguous sequence between min and max. The code to do
this has to be present anyway, since no reader in its right mind finds the
next available article by a straight linear search. Directory reading is
cheap and quick. (There is admittedly a problem with doing this over NNTP,
which is a serious flaw in NNTP but is no excuse when NNTP is not involved.)

>So you can calculate it 300 times per day in every reading session, or
>once, in an upact type program.

The right way to do it is indeed to do it once, but to record useful summary
information rather than just a single number. Some of the new fancy
newsreaders are starting to do that.
--
It is not possible to both understand  | Henry Spencer at U of Toronto Zoology
and appreciate Intel CPUs. -D.Wolfskill| henry@zoo.toronto.edu   utzoo!henry
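
The directory-sweep approach described above might look roughly like this in Python. This is a sketch only; `articles_present` and `next_article` are names invented for the example, and a local spool directory containing one file per article is assumed.

```python
import bisect
import os

def articles_present(group_dir):
    """Sorted article numbers actually present in group_dir (one directory read)."""
    return sorted(int(n) for n in os.listdir(group_dir)
                  if n.isascii() and n.isdigit())

def next_article(present, current):
    """Smallest article number greater than current, or None at the end of the group."""
    i = bisect.bisect_right(present, current)
    return present[i] if i < len(present) else None
```

Once the sorted list exists, stepping over a hole left by expire is a binary search rather than a series of failed opens, and `present[0]` gives the true minimum regardless of what the active file says.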
henry@zoo.toronto.edu (Henry Spencer) (08/17/90)
In article <SS^%M2&@rpi.edu> tale@turing.cs.rpi.edu (David C Lawrence) writes:
>... I think I will review the
>latest NNTP protocol for a way to put in "this group has n articles in
>it" information in it after a readdir and check for S_IFREG files.

You don't need to bother with the (relatively expensive) S_IFREG check,
actually, if you report an estimate rather than a guaranteed-accurate
number, and filter out non-numeric names.
--
It is not possible to both understand  | Henry Spencer at U of Toronto Zoology
and appreciate Intel CPUs. -D.Wolfskill| henry@zoo.toronto.edu   utzoo!henry
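
A minimal Python sketch of that estimate, assuming a local spool directory; the function name `estimate_article_count` is made up for the example. It looks only at entry names and never calls stat() on them.

```python
import os

def estimate_article_count(group_dir):
    """Count all-digit directory entries without stat()ing any of them."""
    with os.scandir(group_dir) as entries:
        return sum(1 for e in entries
                   if e.name.isascii() and e.name.isdigit())
```

Because nothing is stat()ed, a subdirectory with an all-digit name (a subgroup, say) is counted as an article, so the result can run slightly high. That is the trade being described: an estimate rather than a guaranteed-accurate number.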
brian@ucsd.Edu (Brian Kantor) (08/18/90)
In article <SS^%M2&@rpi.edu> tale@turing.cs.rpi.edu (David C Lawrence) writes:
>... I think I will review the
>latest NNTP protocol for a way to put in "this group has n articles in
>it" information in it after a readdir and check for S_IFREG files.

Uh, that's already there, dude. Chapter and verse:

RFC 977                                                    February 1986
Network News Transfer Protocol

3.2.  The GROUP command

3.2.1.  GROUP

   GROUP ggg

   The required parameter ggg is the name of the newsgroup to be
   selected (e.g. "net.news"). A list of valid newsgroups may be
   obtained from the LIST command.

   The successful selection response will return the article numbers of
   the first and last articles in the group, and an estimate of the
   number of articles on file in the group. It is not necessary that
   the estimate be correct, although that is helpful; it must only be
   equal to or larger than the actual number of articles on file. (Some
   implementations will actually count the number of articles on file.
   Others will just subtract first article number from last to get an
   estimate.)

   When a valid group is selected by means of this command, the
   internally maintained "current article pointer" is set to the first
   article in the group. If an invalid group is specified, the
   previously selected group and article remain selected. If an empty
   newsgroup is selected, the "current article pointer" is in an
   indeterminate state and should not be used.

   Note that the name of the newsgroup is not case-dependent. It must
   otherwise match a newsgroup obtained from the LIST command or an
   error will result.

3.2.2.  Responses

   211 n f l s group selected
           (n = estimated number of articles in group,
           f = first article number in the group,
           l = last article number in the group,
           s = name of the group.)
   411 no such news group
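
For a client reading that response, a parser can be sketched in a few lines of Python. The function name `parse_group_response` is an assumption; only the two response codes quoted above are handled, and anything else is treated as an error.

```python
def parse_group_response(line):
    """Parse an RFC 977 GROUP reply.

    Returns (estimated_count, first, last, group_name) for a 211 reply,
    or None for a 411 "no such news group" reply.
    """
    parts = line.strip().split()
    code = parts[0] if parts else ""
    if code == "411":
        return None
    if code != "211" or len(parts) < 5:
        raise ValueError("unexpected GROUP response: %r" % line)
    count, first, last = (int(p) for p in parts[1:4])
    return count, first, last, parts[4]
```

Note that the count is only guaranteed to be an upper bound: a server that subtracts first from last will overstate it whenever expire has left holes in the group.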
brad@looking.on.ca (Brad Templeton) (08/18/90)
In article <1990Aug17.163437.2013@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>there is a nearly-contiguous sequence between min and max. The code to do
>this has to be present anyway, since no reader in its right mind finds the
>next available article by a straight linear search. Directory reading is

Actually, many readers do exactly that. By and large, many sites refuse
to accept long expiry dates on most groups (thanks in part to the help of
C news), so this is not that big a loss, particularly with caches.

I'll tell you why I don't do it. Because opendir isn't fully standard
yet, and every variant feature you use is another porting headache.

This may be an irrational fear -- opendir or a standard 16 byte record
directory format can be found almost everywhere nowadays. But one just
grows to fear such moves, when a loop of opens is sure to work and generally
isn't far off, either.
--
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473
dave@galaxia.Newport.RI.US (News Administrator) (08/20/90)
In article <1990Aug17.071243.16518@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>In article <1990Aug17.034849.17801@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>>of some stupid reader software. (The lower number is basically an
>>inadequate kludge that smarter software should never look at, but there
>>is a lot of dumb software in the world, sigh...)
>
>Programs do need the minimum -- for creating reasonable sized bitmaps, for
>example.

Why do you assume that you can create a reasonable sized bitmap based on
the min and max article numbers? If I have a high volume group that
happens to contain a few articles with long expiration dates I can still
get what would look like a huge group based on max-min but in fact it
might currently contain significantly less than that.

At one point back in the days of 2.10.1 (i.e. before the min field was
introduced), I was concerned about overflowing the bitmap array so I wrote
a set of functions that replaced all of the bitmap related macros and used
a dynamically created linked list as the data structure instead of using a
statically created array. Obviously, calling a function and doing a linked
list lookup is not as fast as having a macro that does a few shifts and an
array lookup, but I challenge anybody to tell the difference between the
two when they are reading news with vnews/rn/trn/etc. Maybe a really high
performance machine doing some kind of weird news processing in a tight
loop could tell the difference, but not a user who is generating a single
bitmap access for each article that gets displayed on their screen.

The linked list approach has the really nice advantage of being very
difficult to overflow. I am still using the linked list approach in some
programs I have that analyze .newsrc files and they work quite nicely.
Since these programs are not actually reading news articles, just analyzing
.newsrc files, they are primarily doing "bitmap" manipulations and I do not
feel that they are suffering any serious performance degradation from using
the linked list functions.

If anybody would like a copy of my code let me know and I will send it out.
--
David H. Brierley
Home: dave@galaxia.Newport.RI.US   Work: dhb@quahog.ssd.ray.com
Be excellent to each other.
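
The idea of replacing a fixed bitmap with a structure that grows with the data can be sketched in Python as follows. This is not David's code: a sorted Python list of (first, last) ranges stands in for his linked list, the names `parse_newsrc_ranges` and `is_read` are invented, and the input is assumed to be the article-number part of a .newsrc line such as `1-1745,1750,1760-1800`.

```python
import bisect

def parse_newsrc_ranges(spec):
    """Turn '1-1745,1750,1760-1800' into a sorted, merged list of (first, last)."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        first, _, last = part.partition("-")
        lo = int(first)
        hi = int(last) if last else lo
        ranges.append((lo, hi))
    ranges.sort()
    merged = []
    for lo, hi in ranges:
        # Merge ranges that overlap or touch, e.g. 1-5 and 6-10.
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def is_read(ranges, number):
    """True if number falls inside one of the sorted (first, last) ranges."""
    # Find the last range whose start is <= number.
    i = bisect.bisect_right(ranges, (number, float("inf"))) - 1
    return i >= 0 and ranges[i][0] <= number <= ranges[i][1]
```

Storage here depends on the number of holes rather than on max-min, so a group with a few long-lived old articles costs a handful of tuples instead of a large, mostly empty array.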