[news.software.b] C news: expire, and batcher

blarson@dianne.usc.edu (bob larson) (08/27/90)

(I have sent a much more detailed version of this message directly to
c-news@utstat.toronto.edu)

New C news administrators should be warned that the first run of
expire will radically change the size of history.pag.  (The same is
true any time history.pag is recreated.)  I spent a fair amount of
time tracking down what turned out to be a non-bug.

I've modified batcher not to include Xref: headers.  Actually quite
simple.  If there is demand, I'll post an unofficial patch here, but
would prefer that it just gets incorporated into the next official
patch.

-- 
Bob Larson (blars)	blarson@usc.edu			usc!blarson
	Hiding differences does not make them go away.  Accepting
	differences makes them unimportant.

henry@zoo.toronto.edu (Henry Spencer) (08/28/90)

In article <26817@usc.edu> blarson@dianne.usc.edu (bob larson) writes:
>I've modified batcher not to include Xref: headers.  Actually quite
>simple.  If there is demand, I'll post an unofficial patch here, but
>would prefer that it just gets incorporated into the next official
>patch.

Unless you've found a really clever way of doing it, it's not likely,
I'm afraid.  You can't just use egrep, because whatever you do has to
affect headers *only*.  As mentioned in notebook/rfcerrata, we consider
the don't-send-Xref requirement to be erroneous, since no news system
we're aware of -- including B2.xx, which the RFC is supposed to be
describing -- implements it.  We'd be willing to comply with it but
only if it was essentially free, and imposing significant processing
in the batching isn't.
-- 
Committees do harm merely by existing. | Henry Spencer at U of Toronto Zoology
                       -Freeman Dyson  |  henry@zoo.toronto.edu   utzoo!henry

flee@dictionopolis.cs.psu.edu (Felix Lee) (08/28/90)

Actually, it's pretty cheap to get rid of Xref: in "batcher", if you
assume that Xref: lines are always first.  It's somewhat less than the
cost of a strncmp plus a strchr per article.  Doesn't really feel like
it's worth it though, given that it's just as easy to ignore Xref: on
the other side.
--
Felix Lee	flee@cs.psu.edu

blarson@dianne.usc.edu (bob larson) (08/28/90)

In article <1990Aug27.172424.18636@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <26817@usc.edu> blarson@dianne.usc.edu (bob larson) writes:
>>I've modified batcher not to include Xref: headers.  Actually quite
>>simple.  If there is demand, I'll post an unofficial patch here, but
>>would prefer that it just gets incorporated into the next official
>>patch.
>
>Unless you've found a really clever way of doing it, it's not likely,
>I'm afraid.  You can't just use egrep, because whatever you do has to
>affect headers *only*.

It's a change to batcher.c, and the only clever part about it is noting
that C news puts the Xref: line first.  If the first 6 characters of
the file are "Xref: " the portion before the first newline is not put
in the batch.  (The size on the "#! rnews" line changes to match.)
A fair portion of the code is paranoia about Xref lines that exceed
the 8192-byte buffer.  (It would be simpler to implement a policy
decision to not send such articles on, but the batcher is not the
place to put policy decisions.)
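
A minimal sketch of the approach described above, not the actual
patch.  It assumes the whole article is already in memory, which
sidesteps the 8192-byte buffer handling mentioned above.  The names
xref_skip and put_article are hypothetical and do not come from
batcher.c.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the number of leading bytes to skip so that a first-line
 * "Xref: " header is left out.  Returns 0 if the article does not
 * start with "Xref: " or if no newline ends that line within buf.
 */
static size_t xref_skip(const char *buf, size_t len)
{
    static const char tag[] = "Xref: ";
    const size_t taglen = sizeof tag - 1;   /* 6 characters */
    const char *nl;

    if (len < taglen || strncmp(buf, tag, taglen) != 0)
        return 0;
    nl = memchr(buf, '\n', len);
    if (nl == NULL)
        return 0;       /* line runs past the data: leave it alone */
    return (size_t)(nl - buf) + 1;
}

/*
 * Write one article to the batch: a "#! rnews <size>" line whose
 * size counts only the bytes actually written, then those bytes.
 * Returns 0 on success, -1 on a write error.
 */
static int put_article(FILE *out, const char *buf, size_t len)
{
    size_t skip = xref_skip(buf, len);
    size_t n = len - skip;

    if (fprintf(out, "#! rnews %lu\n", (unsigned long)n) < 0)
        return -1;
    if (fwrite(buf + skip, 1, n, out) != n)
        return -1;
    return 0;
}
```

An article whose first line is not "Xref: " passes through unchanged,
so the cost per article is one short compare plus, when it matches,
one scan for the first newline.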


-- 
Bob Larson (blars)	blarson@usc.edu			usc!blarson
	Hiding differences does not make them go away.  Accepting
	differences makes them unimportant.

henry@zoo.toronto.edu (Henry Spencer) (08/29/90)

In article <26827@usc.edu> blarson@dianne.usc.edu (bob larson) writes:
>It's a change to batcher.c, and the only clever part about it is noting
>that C news puts the Xref: line first.  If the first 6 characters of
>the file are "Xref: " the portion before the first newline is not put
>in the batch...

Hmm.  That's a reasonable approach, indeed about the only reasonable one.
I am still reluctant to incorporate even a modest lump of code to deal
with a requirement that (a) every other news system ignores, and (b) we
think should be dealt with by fixing the specs.  Are there really many
people troubled by this?
-- 
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday|  henry@zoo.toronto.edu   utzoo!henry

geoff@athena.mit.edu (Geoff Collyer) (08/29/90)

bob larson:
> ... noting that C news puts the Xref: line first.

This is *not* guaranteed, by code or documentation.  It is quite
possible for relaynews to emit Xref: as other than the first header,
so writing code that relies upon Xref: being the first header is
unwise.  (Consider an article with an enormous header which triggers
Plan B of article filing: since the newsgroups aren't known for
certain when the first headers are written to disk, an Xref: cannot be
emitted before writing the start of the headers to disk.)
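
For comparison, a sketch of a filter that does not depend on Xref:
being the first header.  It walks the header lines only, stops at the
first empty line, and copies the body untouched.  This is an
illustration under stated assumptions, not C news code: the name
strip_xref is hypothetical, the article is assumed to be wholly in
memory, and the header name is matched exactly as "Xref:".

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy an article from src to dst, leaving out any "Xref:" header
 * (and its continuation lines) wherever it appears in the headers.
 * Everything from the first empty line on is copied verbatim.
 * dst must have room for len bytes.  Returns the bytes stored.
 */
static size_t strip_xref(char *dst, const char *src, size_t len)
{
    size_t in = 0, out = 0;
    int dropping = 0;   /* nonzero while inside a dropped Xref: header */

    while (in < len) {
        const char *line = src + in;
        const char *nl = memchr(line, '\n', len - in);
        size_t linelen = nl ? (size_t)(nl - line) + 1 : len - in;

        if (line[0] == '\n')
            break;      /* empty line: end of headers */

        /* A line starting with white space continues the previous
         * header and shares its fate; any other line starts a new
         * header, so decide afresh whether to drop it. */
        if (line[0] != ' ' && line[0] != '\t')
            dropping = linelen >= 5 && memcmp(line, "Xref:", 5) == 0;

        if (!dropping) {
            memcpy(dst + out, line, linelen);
            out += linelen;
        }
        in += linelen;
    }

    /* Copy the separator line and the body unchanged. */
    memcpy(dst + out, src + in, len - in);
    return out + (len - in);
}
```

Unlike a first-line check, this scans every header line of every
article, which is the kind of per-article processing cost raised
earlier in the thread.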

Geoff Collyer, wishing that the Cornell-Toronto IP connection would
hurry up and speed up.