[net.news.b] News bug -- a major black hole found

chuqui@nsc.UUCP (Chuq Von Rospach) (08/02/85)

I think I've found a major problem with news, one that is causing some of
our black holes out there. It is a very bizarre combination of events. The
basic components are control messages, compress (cunbatch) and dropping
error messages on the floor. I don't have a fix (yet) but at least I know
what I'm looking for now.

What happens is this: Someone sends a control message, such as a cancel,
without a text body. A lot of sites out there have a known bug that causes
these messages to go out with a '^?' as the message body, because the
software ends up writing the EOF char before realizing it is EOF. This
message is then stored somewhere by the F command in the sys file. Later on,
all of these wonderful things are batched together, sent through compress
(I have 3.0, but I don't think it is limited to this version) and shipped
across.

When it gets to the other side, it is de-compressed and shipped to rnews
for de-batching. Unfortunately, you end up with the following sequence of
lines:

    #! rnews 999
    [control message header]
    ^?#!rnews 9999
    [next message header]

Unfortunately, the rnews unbatcher doesn't seem to be able to handle this.
One of two things happens:

    o the well-known 'Inbound news is garbled' message, since the
    '#! rnews' line isn't there as expected. You lose the rest of the
    batch -- permanently -- unless you get lucky. At least you know you
    lost something, and can ask your feed to reship it, if they save the
    batching files for you.

    o The not so well-known problem I just found. Instead of a garbled
    message, rnews recognizes the problem and starts issuing
    'Out of sync -- skipping: [lost data]' lines. This actually seems to
    be more likely to happen than the 'inbound news is garbled' message,
    and for some reason these messages don't get logged or mailed to
    anyone -- uucp seems to throw them away. The end result,
    unfortunately, is a number of messages that simply get thrown away
    with no error recorded in any log. I don't know how much data is
    being lost, but I don't like seeing data lost silently.

I found this quite accidentally. I happened to be mucking with news while my
feed was coming in, and saw the garbled message in the log. I grabbed the
compressed file from uucp, decompressed it by hand, and ran it through
rnews to find the garbled spot. It was the spot immediately after the
control message. I edited out the part of the file before the problem and
shoved the rest back into rnews. All of a sudden, rnews was complaining on
stdout (perhaps stderr) about being out of sync. Checking, it was
immediately after ANOTHER control message. The rnews unbatcher does NOT
log these errors, meaning the data is lost completely.

What to do? I don't know yet, but I'm going to explore the following:

    o make sure that when you store a control message, it ends with a
    newline. This should be done for locally posted stuff AND anything
    that passes through. This doesn't protect you from upstream sites
    with this problem, but it will keep you from screwing your downstream
    sites. Since this bug is degenerative (each site can pass on a
    control message, which comes through fine, and have a new set of
    messages get eaten every time news gets shipped) this is a really
    NASTY problem -- if you thought the line-eater was bad, realize it
    only mucked up a single article, and only the one the poster messed
    up. This bug is a virus, and eats random articles at random sites.

    o modify the rnews unbatcher to do two things: LOG these error messages
    and store ANY data that it can't deal with somewhere to be unpacked by
    hand later. This will cause more work for an SA, but at least messages
    won't get lost. 

Any suggestions are more than welcome. I found this on a fluke, and
frankly, I don't know if we can ever quantify how much data is getting
eaten by this thing. If I see it properly, it can even attack (silently)
a site that isn't batching with sendbatch, so stopping the compression
doesn't seem to be a solution. With the success we've had in eradicating
the line eater, I'm really scared about what this does to the net.

chuq
-- 
:From the carousel of the autumn carnival:        Chuq Von Rospach
{cbosgd,fortune,hplabs,ihnp4,seismo}!nsc!chuqui   nsc!chuqui@decwrl.ARPA

Your fifteen minutes are up. Please step aside!

howard@cyb-eng.UUCP (Howard Johnson) (08/13/85)

> I think I've found a major problem with news, one that is causing some of
> our black holes out there. It is a very bizarre combination of events.

Bizarre, yes.  (But I won't cross-post to net.bizarre just yet. :-))

> What happens is this: Someone sends a control message, such as a cancel,
> without a text body. [...] ends up writing the EOF char before realizing
> it is EOF. [...]
> 
>     #! rnews 999
>     [control message header]
>     ^?#!rnews 9999
>     [next message header]

Fortunately, this doesn't seem to happen on the most widely-distributed
version of 2.10.2 (9/18/84 version, which I have).

> unfortunately, the rnews batcher doesn't seem to be able to handle this.
>     o make sure that when you store a control message, it ends with a
>     newline. [...]
> 
>     o modify the rnews unbatcher to do two things: LOG these error messages
>     and store ANY data that it can't deal with somewhere to be unpacked by
>     hand later. This will cause more work for an SA, but at least messages
>     won't get lost. 

The version of news I have does the first of these, but not the second.