[news.software.b] article "header" contains non-header line

brendan@cs.widener.edu (Brendan Kehoe) (03/26/91)

   The newly patched version of cnews is humming along nicely. (With
  four large patches like that, nothing but the README & Copyright
  files failed. Whew.)

   Anyway, my log's getting intermittent entries of the type in the
  Subject (non-header line). I'm assuming this is the new strict
  header policy in action. I do have one question .. is there any way
  to tweak it to *accept* those articles not strictly conforming to
  rfc1036 until more sites make them click? (Admittedly, it's only
  dropped 31 out of ~1600 cuz of this rule, but it'd be nice.)
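
For readers who have not seen the check in question, here is a minimal
sketch of a strict header test. It is not the C News code. It assumes the
RFC 1036 form (keyword, colon, blank, value), treats lines that start with
white space as continuations, and uses the made-up name
first_bad_header_line.

```python
import re

# Assumed strict form: keyword, colon, then exactly the required blank.
# A header with nothing at all after the colon is treated as bad here,
# which is an assumption of this sketch, not a statement about C News.
HEADER_RE = re.compile(r"[^\s:]+: ")


def first_bad_header_line(article: str):
    """Return the first malformed header line, or None if the headers pass."""
    seen_header = False
    for line in article.split("\n"):
        if line == "":
            # A blank line ends the header block; the body is not checked.
            return None
        if line[0] in " \t":
            # Continuation lines are only legal after a real header line.
            if not seen_header:
                return line
            continue
        if not HEADER_RE.match(line):
            return line
        seen_header = True
    return None
```

An article whose Subject line reads "Subject:hi" would be reported by this
sketch, while a colon anywhere in the body is ignored.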

   Just curious.

Brendan


-- 
     Brendan Kehoe - Widener Sun Network Manager - brendan@cs.widener.edu
  Widener University in Chester, PA                A Bloody Sun-Dec War Zone
 Now that we know he has ID, we could give him an account. finger bush@cs....

henry@zoo.toronto.edu (Henry Spencer) (03/26/91)

In article <3*M#TB_@cs.widener.edu> brendan@cs.widener.edu (Brendan Kehoe) writes:
>  ... I do have one question .. is there any way
>  to tweak it to *accept* those articles not strictly conforming to
>  rfc1036 until more sites make them click? ...

Not configurable at present, I'm afraid.  You will get a report on who's
sending you those articles in the regular daily report from newsdaily,
but that's basically just a digest of the log entries.
-- 
"[Some people] positively *wish* to     | Henry Spencer @ U of Toronto Zoology
believe ill of the modern world."-R.Peto|  henry@zoo.toronto.edu  utzoo!henry

a3@rivm.nl (Adri Verhoef) (03/28/91)

Brendan:
>>  ... I do have one question .. is there any way
>>  to tweak it to *accept* those articles not strictly conforming to
>>  rfc1036 until more sites make them click? ...

Henry:
>Not configurable at present, I'm afraid.  You will get a report on who's
>sending you those articles in the regular daily report from newsdaily,
>but that's basically just a digest of the log entries.

But wouldn't it be nice (maybe too nice) to put them in 'junk',
so that you at least can have a look at what's wrong with them?
...Better not mail the offending article (header maybe?) to the
News Admin...

--
Adri, a3@rivm.nl

barrett@Daisy.EE.UND.AC.ZA (Alan P. Barrett) (03/28/91)

I don't really like this drop-on-the-floor policy for bad news articles,
because it increases the difficulty of debugging posting and transport
software.  How about mail-to-administrator or store-in-junk-newsgroup or
store-somewhere-and-mail-pointer-to-administrator?

--apb
Alan Barrett, Dept. of Electronic Eng., Univ. of Natal, Durban, South Africa
Internet: barrett@ee.und.ac.za           UUCP: m2xenix!quagga!undeed!barrett

henry@zoo.toronto.edu (Henry Spencer) (03/29/91)

In article <1991Mar28.080325.7729@Daisy.EE.UND.AC.ZA> barrett@Daisy.EE.UND.AC.ZA (Alan P. Barrett) writes:
>I don't really like this drop-on-the-floor policy for bad news articles,
>because it increases the difficulty of debugging posting and transport
>software.  How about mail-to-administrator or store-in-junk-newsgroup or
>store-somewhere-and-mail-pointer-to-administrator?

We need to do a major rethink of error handling policy at some point.
At the moment it's awkward to do much about this.

It's not a trivial problem, unfortunately.  The above suggestions work
fine for one or two articles, but less well for ten thousand "obviously
bad" articles.
-- 
"The stories one hears about putting up | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 are all true."  -D. Harrison|  henry@zoo.toronto.edu  utzoo!henry

ske@pkmab.se (Kristoffer Eriksson) (04/13/91)

In article <1991Mar28.165240.13757@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>> How about mail-to-administrator or store-in-junk-newsgroup or
>>store-somewhere-and-mail-pointer-to-administrator?
>
> The above suggestions work fine for one or two articles, but less well for
>ten thousand "obviously bad" articles.

Where would you get ten thousand bad articles from? And why would it not work
to store them somewhere (for instance in junk), if it would have worked to
store them together with all other news had they just contained a few more
spaces in strategic places?


And in article <1991Mar30.224201.20534@zoo.toronto.edu> he writes:
>There were various possibilities for how to go about this, but it always
>seemed to boil down to the hard, cold fact that people seldom fix their
>news software until they are forced to.  Alerts just don't help much.

Since silently dropping bad articles won't force anyone to do anything, what
you are saying is that this problem may very well never be fixed. But in
that case it is bad policy to drop these articles, which are only formally
bad but not unparsable, because you won't make the problem go away, but
you will be punishing all the people who unknowingly post their articles
with bad posting software, and all other people who would have enjoyed
reading the dropped articles. If you think the problem will not go away,
the only reasonable thing to do is to give in and make the best of the
situation as it is, as others have already pointed out: be liberal in what
you receive and conservative in what you send. Headers missing a space are
very far from irrecoverable.
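
A minimal sketch of the kind of repair meant here, assuming the only
defect is a missing blank after the colon. The name repair_headers is
made up for illustration, and nothing like this exists in C News as far
as this thread says.

```python
import re

# Keyword, colon, then a non-blank character where the blank should be.
MISSING_SPACE_RE = re.compile(r"^([^\s:]+):(?=\S)")


def repair_headers(article: str) -> str:
    """Insert the missing blank after the colon in header lines only."""
    # Split once at the first empty line so the body is never touched.
    head, sep, body = article.partition("\n\n")
    fixed = [MISSING_SPACE_RE.sub(r"\1: ", line) for line in head.split("\n")]
    return "\n".join(fixed) + sep + body
```

Note that the sketch rewrites any header-block line of the form
"word:text", whether or not it was meant to be a header, so it is only as
good as the assumption that the missing blank is the sole problem.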


And in article <1991Apr5.081416.6660@zoo.toronto.edu> he writes:
>The alternative is to inundate him with, potentially, hundreds of complaints
>*per posting*.

If you return the complaint and don't forward the article, the number of
complaints will depend on the (electronic) distance between the sender and
the first C news site on any separate branch of the sender's connectivity
tree. Some will get only one or a few complaints, and some will get
complaints from all over the world.

You might look at the article's path and only send complaints if the article
hasn't passed an unreasonable number of sites.
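
As a rough sketch of that test: count the sites in the Path header and
complain only below some limit. The names hop_count and should_complain
and the limit of 3 are made up for illustration, and the sketch assumes
the last component of the path is the poster's login rather than a site.

```python
def hop_count(path_header: str) -> int:
    """Number of sites in a bang-separated Path value."""
    parts = [p for p in path_header.strip().split("!") if p]
    # Assumption: the final component is the sender's login, not a site.
    return max(len(parts) - 1, 0)


def should_complain(path_header: str, max_hops: int = 3) -> bool:
    """True if the article has passed few enough sites to warrant a complaint."""
    return hop_count(path_header) <= max_hops
```

This only measures how far the article has travelled along one route. It
says nothing about how many other sites received the same article from
one of the sites on that route.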

But apart from that, why would a few hundred complaints be such a bad thing?
When it has become known at the sending site that sending messages doesn't
work well, they have no reason to be sending any more messages until the
problem is fixed, so at least it doesn't have to be a continuing nuisance.
And, talking about forcing people to fix their software, isn't a full
mailbox of complaints more of a force than plain silence?

However, automatically correcting missing spaces would remove this question
altogether...

-- 
Kristoffer Eriksson, Peridot Konsult AB, Hagagatan 6, S-703 40 Oerebro, Sweden
Phone: +46 19-13 03 60  !  e-mail: ske@pkmab.se
Fax:   +46 19-11 51 03  !  or ...!{uunet,mcsun}!sunic.sunet.se!kullmar!pkmab!ske

scs@iti.org (Steve Simmons) (04/15/91)

ske@pkmab.se (Kristoffer Eriksson) writes:

>henry@zoo.toronto.edu (Henry Spencer) writes:
>>There were various possibilities for how to go about this, but it always
>>seemed to boil down to the hard, cold fact that people seldom fix their
>>news software until they are forced to.  Alerts just don't help much.

>Since silently dropping bad articles won't force anyone to do anything . . .

The articles are not dropped silently.  If you run the supplied
daily cleanup script, it tells you what sites are feeding you
articles with bad headers.  This permits your (presumably) friendly
neighbors to tell you about problems, and you to do the same for them.
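
To illustrate the kind of digest involved, here is a small sketch that
counts offending sites from log lines. The log layout is hypothetical
(site name first, then free text containing the reason); the real C News
log format and the real newsdaily script are not reproduced here, and
top_offenders is a made-up name.

```python
from collections import Counter


def top_offenders(log_lines, reason="non-header line", n=5):
    """Return the n sites with the most log entries mentioning `reason`."""
    counts = Counter()
    for line in log_lines:
        # Hypothetical layout: "<site> <free text including the reason>".
        site, _, rest = line.partition(" ")
        if reason in rest:
            counts[site] += 1
    return counts.most_common(n)
```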
-- 
"Our informal mission is to improve the love life of operators worldwide."
  Peter Behrendt, president of Exabyte.  Quoted in Digital Review, Feb 4, 1991.

henry@zoo.toronto.edu (Henry Spencer) (04/17/91)

In article <5299@pkmab.se> ske@pkmab.se (Kristoffer Eriksson) writes:
>> The above suggestions work fine for one or two articles, but less well for
>>ten thousand "obviously bad" articles.
>
>Where would you get ten thousand bad articles from?

It's not prohibitively difficult to arrange it, if one of your news feeds
has been badly constipated -- or had to be restored from an old backup --
and suddenly dumps several megabytes of very stale news on you.  Systematic
mangling likewise is not hard to arrange, especially if you're a neighbor
of a gateway machine running badly-written software.

>And why would it not work
>to store them somewhere (for instance in junk), if it would have worked to
>store them together with all other news had they just contained a few more
>spaces in strategic places?

Maybe they would have been rejected for other reasons (e.g. duplication)
if those headers had been legal.  Putting them in junk, at the moment,
also means passing them on to other sites, which is definitely not wanted.
(Although Geoff has talked about changing this.)

>>There were various possibilities for how to go about this, but it always
>>seemed to boil down to the hard, cold fact that people seldom fix their
>>news software until they are forced to.  Alerts just don't help much.
>
>Since silently dropping bad articles won't force anyone to do anything...

It's not (quite) silent; the dropping is logged and newsdaily reports on it.

>... If you think the problem will not go away,

We think the problem will go away, albeit slowly.  Word will percolate
upstream that postings are getting dropped.  We'd prefer a more direct
method of notification, but there really isn't any way to get it to the
people at fault -- the software authors -- without inconveniencing a lot
of innocent people and possibly making them pay substantial phone bills
for the privilege.

>the only reasonable thing to do is to give in and make the best of the
>situation as it is, as others have already pointed out: be liberal in what
>you receive and conservative in what you send. Headers missing a space are
>very far from irrecoverable.

As I've said before, we are *still* more liberal in what we receive than
most other news packages, notably B News.

Headers missing a space are recoverable, *if* you're sure that's the only
problem, and *if* you're willing to rewrite headers in the conviction that
you know what the problem is.  We see no way the software can be confident
of either of these things.  We've seen too many examples of software that
is *sure* it understands the situation, and proceeds to make it worse by
well-meaning attempts at repair.

>You might look at the article's path and only send complaints if the article
>hasn't passed an unreasonable number of sites.

That's workable unless one of those reasonably-few sites is, say, uunet.
What you really want to control for is fanout, which is only very loosely
correlated to path length.

It's easy to concoct methods that will send a small number of complaints
to the author in the typical case.  The problem is making them work in
the extreme cases.  Believe me, we *have* thought about this.  At length.
We see no solution.

>But apart from that, why would a few hundred complaints be such a bad thing?

You've obviously never had to pay Usenet phone bills yourself, or justify
them to a skeptical management.

>And, talking about forcing people to fix their software, isn't a full
>mailbox of complaints more of a force than plain silence?

Sure, if there were some way to send them to the software author, rather
than his innocent victims.  Contrary to popular misconception, not everyone
on Usenet has the expertise to fix the software when something goes wrong.
-- 
And the bean-counter replied,           | Henry Spencer @ U of Toronto Zoology
"beans are more important".             |  henry@zoo.toronto.edu  utzoo!henry

scs@iti.org (Steve Simmons) (04/17/91)

henry@zoo.toronto.edu (Henry Spencer) writes:

>In article <5299@pkmab.se> ske@pkmab.se (Kristoffer Eriksson) writes:
>>> The above suggestions work fine for one or two articles, but less well for
>>>ten thousand "obviously bad" articles.

>>Where would you get ten thousand bad articles from?

>It's not prohibitively difficult to arrange it, if one of your news feeds
>has been badly constipated -- or had to be restored from an old backup --
>and suddenly dumps several megabytes of very stale news on you.  Systematic
>mangling likewise is not hard to arrange, especially if you're a neighbor
>of a gateway machine running badly-written software.

You already have a solution, Henry.  In the present daily reporting of
news activity you list only "top 5" sites for various bad things.  Why
not keep the first 20 bad articles (or the first 20K) and throw the rest
away?  It's a good safe compromise between nothing and overflow.
-- 
"Our informal mission is to improve the love life of operators worldwide."
  Peter Behrendt, president of Exabyte.  Quoted in Digital Review, Feb 4, 1991.

henry@zoo.toronto.edu (Henry Spencer) (04/18/91)

In article <scs.671836683@wotan.iti.org> scs@iti.org (Steve Simmons) writes:
>>>> The above suggestions work fine for one or two articles, but less well for
>>>>ten thousand "obviously bad" articles.
>
>You already have a solution, Henry.  In the present daily reporting of
>news activity you list only "top 5" sites for various bad things.  Why
>not keep the first 20 bad articles (or the first 20K) and throw the rest
>away?  It's a good safe compromise between nothing and overflow.

A reasonable approach, but unfortunately it's awkward to pin down and to
implement.  It can't be the first 20 out of each batch!  Unfortunately,
communication between batches is awkward, because each is processed by
a separate relaynews.  Newsdaily has it easy; it does the "top 5" out of
the whole log file.
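
One possible way around the cross-batch problem, sketched here purely as
an illustration and not as anything C News does: let every process claim
one of a fixed number of slot files in a shared directory, using
exclusive creation so that two processes can never claim the same slot.
The directory, the file names, and the functions claim_slot and
keep_or_drop are all hypothetical, and something (say, the daily cleanup)
would have to empty the directory to start a new count.

```python
import os


def claim_slot(quarantine_dir: str, limit: int = 20):
    """Reserve one of `limit` numbered slots; return its path, or None if full."""
    os.makedirs(quarantine_dir, exist_ok=True)
    for n in range(limit):
        path = os.path.join(quarantine_dir, f"bad.{n:02d}")
        try:
            # O_EXCL makes creation fail if another process got there first.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            continue
        os.close(fd)
        return path
    return None


def keep_or_drop(article: str, quarantine_dir: str, limit: int = 20) -> bool:
    """Store the article if a slot is free; return False if it was dropped."""
    path = claim_slot(quarantine_dir, limit)
    if path is None:
        return False
    with open(path, "w") as f:
        f.write(article)
    return True
```

The cost is up to `limit` failed create calls per bad article once the
directory is full, which matters if ten thousand bad articles arrive.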

We might be able to work out something along these lines, but I'm not
making any promises just yet.
-- 
And the bean-counter replied,           | Henry Spencer @ U of Toronto Zoology
"beans are more important".             |  henry@zoo.toronto.edu  utzoo!henry

ggw@wolves.uucp (Gregory G. Woodbury) (04/21/91)

Well, we have a "junk" newsgroup in which to place arriving articles
that belong to no locally known group.  Perhaps we could have a "trash"
or "bad" newsgroup into which the bad header articles are linked, with
all but the latest 20 articles deleted from it each day.
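
A minimal sketch of the daily pruning step, assuming the bad articles sit
as plain files in one directory and that "latest" means most recently
modified. The function name prune_keep_latest is made up for
illustration.

```python
import os


def prune_keep_latest(directory: str, keep: int = 20) -> list:
    """Delete all but the `keep` most recently modified files; return removed paths."""
    entries = [os.path.join(directory, name) for name in os.listdir(directory)]
    files = [p for p in entries if os.path.isfile(p)]
    # Newest first, so everything past index `keep` is surplus.
    files.sort(key=os.path.getmtime, reverse=True)
    removed = []
    for path in files[keep:]:
        os.remove(path)
        removed.append(path)
    return removed
```

Between prunings the directory can still grow without bound, so this
limits what is kept overnight rather than what a single flood can write.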
-- 
Gregory G. Woodbury @ The Wolves Den UNIX, Durham NC
UUCP: ...dukcds!wolves!ggw   ...mcnc!wolves!ggw           [use the maps!]
Domain: ggw@cds.duke.edu     ggw%wolves@mcnc.mcnc.org
[The line eater is a boojum snark! ]           <standard disclaimers apply>