[news.admin] automatically mailing warnings about dropped

tp@mccall.com (Terry Poot) (06/17/91)

In article <1991Jun14.224045.26157@zorch.SF-Bay.ORG>,
xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
> tp@mccall.com (Terry Poot) writes:
>> herrickd@iccgcc.decnet.ab.com writes:
>>> chip@tct.com (Chip Salzenberg) writes:
>
>>>> I hereby retract my 10^hopcount proposal in
>>>> favor of an error report broadcast using a
>>>> systematically changed message ID. I'm not sure
>>>> whether that solution is the best one possible,
>>>> but it's the best I've read so far, and I am now
>>>> persuaded that its cost would be acceptable.
>
>Too much overhead; you are targeting a message at
>_all_ Usenet sites to inform _one_ site.  I really
>can't see the attraction of that at all.

Do you have the same problem with Cancel control messages, which exhibit
the same behavior? B news attempted to limit the cancel to only the sites
that needed to see it. It didn't work well, so C news and ANU news at the
very least always propagate cancels to all feed sites. And in this case,
there isn't a real problem, it's just a way for someone to try to get his
foot out of his mouth before it gets wedged too tight. The error reports
are much more useful to the net as a whole than cancels, and after the
initial rash of these when the facility is first introduced, there should
be fewer of them than cancels flowing around the net. (And as for the
initial rash of errors, there are proposals out there that reduce this
problem to a reasonable amount as well.)
....
>> Regular messages in a specific group seem much
>> more reliable.
>
>Worse and worse; we are right back to the problem of
>the original posting warning that a change was coming;
>if you put information where it won't be seen, you
>might as well not go to the trouble.
>
>There are _two_ problems to be solved here:
>
>1) A technical problem: how to get the information
>back to the author or the site admin at the site
>where failing messages are being created, without

Solved by the error report proposal, if you consider the WHOLE proposal.

>breaking anything further (Cnews has already gone
>through a phase of reliably recycling old news, and
>now one of reliably discarding without notice
>articles with what used to be working article
>headers; no sense trying for three disasters in a
>row), 

Are you trying to tell us that news is too unreliable a transport to carry
notices of its own problems? Do you have a serial line from each of your
ethernet controllers tying back into your cpu for error reports? :-)

>such as drowning the net in excess messages or

I think you over-estimate. Most of the net runs good software, or I doubt
even the C news crowd would have seen the changes that sparked all this as
a doable thing.

>the site of origin in excess email. Only the second
>possibility has yet been effectively addressed so
>far; all the methods of using news to do the job
>send far too many copies to sites that don't care at
>all about the information, wasting time, money, and
>patience.

Each site will store one copy of each report. He might get one from each
feed, but he probably won't, any more than he gets any other article from
every feed he has. If he does have 2 fully redundant feeds, he probably
wants it that way.

>2) A human factors problem: how to put the
>information returned in a place where it is hard to
>ignore, not lost in a mass of similar, but
>inapplicable, warnings that condition the intended
>recipient out of looking for the warnings at all.

Remember, a key facet of the proposal is that there will be a small number
of sites that pick up all the error reports and mail them to the site with
the problem. Thus, you won't have to read the group to find out you have
a problem, someone will send you mail telling you. Of course, newer news
software could have a feature whereby it scans the error reports looking
for errors referring to the current site, so that notification would be
swifter, and would occur even if mail couldn't get through, but it will
still work quite well for a site not running such software, because he'll
get notification by mail. 

>It's the old "crying wolf" problem. I have yet to
>see a proposal here that beats sending a small
>number of _pertinent_ notices to the email box of
>the author/site admin, a location that a) is
>regularly perused by the recipient, and b) waits
>"forever" to be read (no expire == loss of
>information). 

Unfortunately, it is all too likely that the small number in question is 
precisely zero. The various configurations of net connectivity are such
that I've yet to see a proposal for a probabilistic method that hasn't been
shot down by a counter example of a configuration that would get too many
or too few messages.

>Nothing posted to a busy newsgroup
>satisfies this need at all. Most "inform by news"
>proposals would require changes at the site of
>origin to pull out only the pertinent notices and
>present them to the responsible party, and one of
>the givens in this situation is that changes at the
>site of origin won't be made until _after_ notice of
>a problem is seen, making "inform by news" methods
>worthless.

Read the whole proposal. The mail out notification was in Neil Rickert's
original posting about this method of reporting errors.
--
Terry Poot <tp@mccall.com>                   The McCall Pattern Company
(uucp: ...!rutgers!ksuvax1!deimos!mccall!tp) 615 McCall Road
(800)255-2762, in KS (913)776-4041           Manhattan, KS 66502, USA

mathew@mantis.co.uk (Giving C News a *HUG*) (06/17/91)

xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
> > chip@tct.com (Chip Salzenberg) writes:
> > I hereby retract my 10^hopcount proposal in
> > favor of an error report broadcast using a
> > systematically changed message ID. I'm not sure
> > whether that solution is the best one possible,
> > but it's the best I've read so far, and I am now
> > persuaded that its cost would be acceptable.
> 
> Too much overhead; you are targeting a message at
> _all_ Usenet sites to inform _one_ site.  I really
> can't see the attraction of that at all.

It's the safest solution.  As far as I know, nobody has come up with a good
objection to it.

>         all the methods of using news to do the job
> send far too many copies to sites that don't care at
> all about the information, wasting time, money, and
> patience.

They send *one* article to each site for each *one* header-invalid article
that that site would otherwise have received.  I think that's pretty
acceptable.  Remember, stale articles are still silently dropped.

> 2) A human factors problem: how to put the
> information returned in a place where it is hard to
> ignore, not lost in a mass of similar, but
> inapplicable, warnings that condition the intended
> recipient out of looking for the warnings at all.

The news-based solution has two answers to this:

1. Responsible sysadmins can set up their software to look for error reports
   that mention their own site's name.

2. A central machine (say, at UUNET) can send a single mail-based warning to
   each site appearing in the error reports for a given week.
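
The second answer above could be sketched roughly as follows. This is only
an illustration: the report format (a tuple of originating site, message ID,
and reason) and the function name `weekly_warnings` are assumptions, not part
of any posted proposal, and the actual mailing step is left out.

```python
from collections import defaultdict


def weekly_warnings(reports):
    """Group a week's error reports by originating site.

    `reports` is an iterable of (origin_site, message_id, reason) tuples;
    this tuple layout is hypothetical. Returns a dict mapping each site to
    the text of the single warning that would be mailed to its admin.
    """
    by_site = defaultdict(list)
    for site, message_id, reason in reports:
        # Site names are compared case-insensitively.
        by_site[site.lower()].append((message_id, reason))

    warnings = {}
    for site, items in sorted(by_site.items()):
        lines = [f"{len(items)} article(s) from {site} were dropped this week:"]
        lines += [f"  {mid}: {reason}" for mid, reason in items]
        warnings[site] = "\n".join(lines)
    return warnings
```

However many reports name a site, it gets one message per week, which is the
"small number of pertinent notices" property asked for.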

> It's the old "crying wolf" problem. I have yet to
> see a proposal here that beats sending a small
> number of _pertinent_ notices to the email box of
> the author/site admin, a location that a) is
> regularly perused by the recipient, and b) waits
> "forever" to be read (no expire == loss of
> information).

Right. But part of the news-based solution includes this. All we need is one
or two volunteer sites to monitor the errors newsgroup.

mathew

tp@mccall.com (Terry Poot) (06/20/91)

This message isn't worth reading unless you are Kent or agree with him.

In article <1991Jun18.113758.16382@zorch.SF-Bay.ORG>,
xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
>tp@mccall.com (Terry Poot) writes:
>> xanthian@zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
>
>[about broadcasting notice of dropped articles:]
>
>>> Too much overhead; you are targeting a message at
>>> _all_ Usenet sites to inform _one_ site. I really
>>> can't see the attraction of that at all.
>
>> Do you have the same problem with Cancel control
>> messages, which exhibit the same behavior?
>
>Huh?  A cancel goes to all sites because it _must_
>go to all sites to be effective.  Notice of a dropped
>article needs to get to only one site, the site of
>origin, to be effective.  Not at all comparable
>situations.

And when you figure out a way of getting an error report to the site that
generated the article reliably and without requiring updated news software
to be installed net-wide, I'll happily support it over this proposal. In
this case, the error report must also be propagated by news to be reliably
delivered to the originating site AND the central email site. Without
changing the news software, that means it has to go net-wide. If we could
change all the software, we wouldn't have this particular problem, would
we?

>> The error reports are much more useful to the net
>> as a whole than cancels, and after the initial
>> rash of these when the facility is first
>> introduced, there should be fewer of them than
>> cancels flowing around the net. (And as for the
>> initial rash of errors, there are proposals out
>> there that reduce this problem to a reasonable
>> amount as well.)
>
>Not at all true if a month of news gets barfed back
>onto the net by a mechanism that looks like a bad
>header problem and evokes this mechanism you defend;
>at that point, if the Bnews site connectivity from
>the failing site is enough to get the barf out to
>the "whole net", the deluge of messages this
>mechanism you defend would dump on the net would
>disable the whole net for a significant period of
>time, like weeks, while sysadmins cleared up the
>junk while trying to preserve the news and mail. The
>cost in phone charges alone would be horrendous.
>I've yet to see a modification to the broadcast
>notification proposal that addresses this _fatal_
>flaw. News barfs with mangled headers happen several
>times a year; a mechanism that could take the whole
>net down several times a year makes the internet
>worm look like a love pat; it only nailed 6000
>machines, and that by accident; the broadcast
>proposal risks the whole ball of wax, and
>deliberately.

Hogwash. A mechanism has been proposed to solve this. Read my dialog with
Chris Lewis. And even if it hadn't, you vastly overestimate the problem.
The worst-case news flood would be exactly double the size of the original
deluge, and no such deluge has destroyed the net yet. Even that doubling
would occur only at non-C news sites that had connectivity via non-C news
sites to the originator of the flood, since C news sites would drop the
original messages and generate error reports. C news sites, and those
separated by a C news "barrier" from the source of the flood, would get
only the error reports, which are very probably shorter on average than
the original articles. Thus the situation would be better than it was a
few months ago, and the net has yet to die.

HOWEVER, I proposed a scheme that would drastically reduce this load, while
not at all impairing the working of the error mechanism itself. In a 
nutshell, each site only generates one error report per day per site
appearing in the message id of a bad message. Most munged batches in which
every article would generate an error are those where the flooding site
has stomped on the headers, so they probably all come from him. Note that
if the news is simply old and being regurgitated, no error reports are
generated, because there is nothing wrong with the article. B news and
current C news sites will simply drop the article as too old, without
generating an error report.
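
A minimal sketch of the limiting rule just described, assuming the
originating site is taken to be the host part after the "@" in the message
ID. The class and function names are hypothetical, and a real implementation
would need to persist its state across runs.

```python
from datetime import date


def site_from_message_id(message_id: str) -> str:
    """Return the host part of a message ID such as <123@site.example>."""
    inner = message_id.strip().lstrip("<").rstrip(">")
    _, _, host = inner.rpartition("@")
    return host.lower()


class ErrorReportLimiter:
    """Allow at most one error report per day per originating site."""

    def __init__(self):
        # Maps site name -> date on which we last reported an error for it.
        self._last_report = {}

    def should_report(self, message_id: str, today: date) -> bool:
        site = site_from_message_id(message_id)
        if self._last_report.get(site) == today:
            return False  # already reported this site today; stay quiet
        self._last_report[site] = today
        return True
```

Under this rule a munged batch of a thousand articles from one site produces
a single report from each detecting site that day, rather than a thousand.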

Please post a realistic scenario (something that has actually happened)
that this scheme doesn't address, and we can work on refinements if needed.
For full details of the limiting scheme, see one of my relatively recent
articles.

>>>> Regular messages in a specific group seem much
>>>> more reliable.
>
>>> Worse and worse; we are right back to the problem
>>> of the original posting warning that a change was
>>> coming; if you put information where it won't be
>>> seen, you might as well not go to the trouble.
>
>>> There are _two_ problems to be solved here:
>
>>> 1) A technical problem: how to get the
>>> information back to the author or the site admin
>>> at the site where failing messages are being
>>> created, without
>
>> Solved by the error report proposal, if you
>> consider the WHOLE proposal.
>
>Nope, I've considered the whole proposal; a proposal
>that breaks the net doesn't solve problems.

The proposal doesn't break the net, and it does solve problems.

>>> breaking anything further (Cnews has already gone
>>> through a phase of reliably recycling old news,
>>> and now one of reliably discarding without notice
>>> articles with what used to be working article
>>> headers; no sense trying for three disasters in a
>>> row),
>
>> Are you trying to tell us that news is too
>> unreliable a transport to carry notices of its own
>> problems?
>
>First, it isn't much of a software engineering win
>to depend on a failing mechanism to carry notice of
>its own failure.

The failure in question is a bad message. The fact that someone else
on the net generates a bad message doesn't in any way indicate that the
error report message will be bad.

>Second, designing a mechanism that breaks it worse in
>the process of "fixing" things is no win at all.

Invalid premise.

>>> such as drowning the net in excess messages or
>
>> I think you over-estimate. Most of the net runs
>> good software, or I doubt even the C news crowd
>> would have seen the changes that sparked all this
>> as a doable thing.
>
>Most of the net runs _abysmal_ software [a lot of
>it hides under the name of "email", which has a
>blind 30% first reply failure rate for naive users];
>news failures are relatively public, so squeaking
>wheels get attention. Still, the UUCP transport
>suite has a crash and burn failure mode in uucico,
>news spools overflow regularly all over the net,
>sites can't agree on the legality of posting to
>groups not carried at the site, header lines needed
>for some news reading software aren't maintained by
>other news posting software, subject lines get
>truncated, the list goes on. I sure wouldn't brag on
>the workability of news anywhere I had a reputation
>to maintain.  Ignoring problems this bad in software
>doesn't say much for one's professionalism.

Mostly irrelevant and outside the scope of this discussion. We are
discussing news. Specifically we are discussing what to do to report
the fact that a badly formatted news article has been dropped. Stick
to the issue. (You said you had a 10 minute attention span; you must be
a slow reader.)

>>> the site of origin in excess email. Only the
>>> second possibility has yet been effectively
>>> addressed so far; all the methods of using news
>>> to do the job send far too many copies to sites
>>> that don't care at all about the information,
>>> wasting time, money, and patience.
>
>> Each site will store one copy of each report. He
>> might get one from each feed, but he probably
>> won't, any more than he gets any other article
>> from every feed he has. If he does have 2 fully
>> redundant feeds, he probably wants it that way.
>
>Don't know about where you are, but here the dropped
>news and other problems require redundant feeds and
>still we miss parts of source group multi-part
>postings and such. Using a threaded newsreader like
>trn gives you a new appreciation of just how much
>news _never_ gets to your site; I'd guess the raw
>failure rate is between 5% and 15% of all news goes
>missing with a single feed, 2%-5% with two fairly
>independent feeds.

I've never been so unfortunate as to have such a lousy feed. I don't dispute
that they exist, but I do dispute that they are the norm. Such a thing
could be researched; why don't you look into it?

>>> 2) A human factors problem: how to put the
>>> information returned in a place where it is hard
>>> to ignore, not lost in a mass of similar, but
>>> inapplicable, warnings that condition the
>>> intended recipient out of looking for the
>>> warnings at all.
>
>> Remember a key facet of the proposal is that there
>> will be a small number of sites that pick up all
>> the error reports and mail them to the site with
>> the problem.
>
>Hardly a "key facet"; it was included as an optional
>gawdawful nuisance that would be included if the
>yelling were sufficiently loud, but preferably not.

Whose version of the proposal did you read? As I see it, Neil Rickert
(with whom I've been corresponding for about a month), myself, and Mathew
(dare I mention him :-)) are the people that have been pushing this idea
from the beginning, with others signing on along the way. I can't recall
any of those people having that attitude (except Chris Lewis at first as
he was just being won over, but I think he's come around).

>Also, it won't work. The further you get from the
>path from failing article posting site to failing
>article detecting site, the less chance you have of
>getting email through. In reality, the sites you are
>trying to reach are mostly leaf sites at the end of
>long cul-de-sacs, running little known or long
>superseded software, exactly the sites least likely
>to have good mail map entries and the longer you
>make the mail path, the less likely it is to
>function.

If mail worked as badly as you indicate, the net would not function as well
as it presently does, and your proposal has even less validity. In addition
to the possibility of no error reports being generated, you have the
possibility of those that are generated being lost. The one thing that we
know the site can do is propagate news to the site that saw the error. Thus
we have a reasonable chance that he will get a news posting from that site
(yes, I know that isn't always true, but then there isn't a perfect
solution, is there?). So the worse email works, the better a news-based
system looks. But then it isn't nearly as bad as you indicate. (Unless the
high failure rates you specify for news and mail are all happening at the
link from you to other sites? Have you checked to be sure that it isn't
your system dropping everything?)

>> Thus, you won't have to read the group to find out
>> you have a problem, someone will send you mail
>> telling you.
>
>Which you are unlikely to receive, the mailing site
>now being many more hops away.

I get mail from Japan and New Zealand just fine. I even get mail from
you (or did it take you lots of tries?). I'd say you have a fairly good
chance of receiving mail in the average case. And yes, there are certainly
sites and subnets that are total basket cases, in which case your proposal
falls flat on its face again, whereas he still might see the notification
in news. If he doesn't, he certainly can't complain that every effort wasn't
made to notify him.

>> Of course, newer news software could have a
>> feature whereby it scans the error reports looking
>> for errors referring to the current site, so that
>> notification would be swifter,
>
>Sure, but the sites running up to date software are
>not the ones we need to reach, so don't even include
>them in your planning.

I'm not; that was an aside, in response to something you said. In no way
does the proposal depend on the site originating bad messages running
up-to-date software.

>> and would occur even if mail couldn't get through,
>> but it will still work quite well for a site not
>> running such software, because he'll get
>> notification by mail.
>
>Probably not.

Probably.

>>> It's the old "crying wolf" problem. I have yet to
>>> see a proposal here that beats sending a small
>>> number of _pertinent_ notices to the email box of
>>> the author/site admin, a location that a) is
>>> regularly perused by the recipient, and b) waits
>>> "forever" to be read (no expire == loss of
>>> information).
>
>> Unfortunately, it is all too likely that the small
>> number in question is precisely zero. The various
>> configurations of net connectivity are such that
>> I've yet to see a proposal for a probabilistic
>> method that hasn't been shot down by a counter
>> example of a configuration that would get too many
>> or too few messages.
>
>Sigh. Between the idiots who want this whole
>discussion to be decided on a popularity contest
>between Mathew and Henry rather than the rather
>profound technical and software engineering and
>programming ethics issues, and the people who, as
>Mathew says, keep moving the goalposts, it's
>profoundly difficult to make progress here.
>
>Given that no one has bothered to make a definitive
>statement of what constitutes "too many" or "too
>few" messages, of course the non-contributors have
>no trouble setting up their own strawmen, and then
>promptly knocking them down. If I can state that at
>least one article in sixteen posted with a bad
>header gets a response, and at most sixteen
>responses are returned per failing article, then I
>suspect a sufficient two parameter probabilistic
>email notification system can be designed, though
>I'm not going to be the one to do it.

I won't repeat the counter arguments. Suffice it to say that nobody has
designed such a system. Whether you are correct or not as to whether it
could be done, the potential gains us nothing. Only a design for a system
that will work can be implemented. If you aren't going to do it, then what
are you arguing for? You are waiting for someone else to prove you right.
Unless such a person steps forward, your assertions are unproven. More
important at present, they cannot be implemented, having not been
designed.

>
>>> Nothing posted to a busy newsgroup satisfies this
>>> need at all. Most "inform by news" proposals
>>> would require changes at the site of origin to
>>> pull out only the pertinent notices and present
>>> them to the responsible party, and one of the
>>> givens in this situation is that changes at the
>>> site of origin won't be made until _after_ notice
>>> of a problem is seen, making "inform by news"
>>> methods worthless.
>
>> Read the whole proposal. The mail out notification
>> was in Neil Rickert's original posting about this
>> method of reporting errors.
>
>Yes it was, with the attitude described above, and,
>as noted above, it simply will fail in too large a
>proportion of cases.

Conjecture false, or certainly very subjective. Premise false. Statement
meaningless.

>The design goals for a notification method should
>_not_ require that each site get the _same_ level of
>notification or have the _same_ limits on notices
>received, just that each site have _some_ useful
>level of notification, and that each site have
>_some_ guarantee of a reasonable limit on the number
>of notices received.

Your system offers no guarantees at all. Since a random number varies from
0 to unity, your system only guarantees that a site will receive some number
of notifications ranging from 0 to the number of sites that got his 
message. 0 is unacceptable in the long term. Probabilistic systems come
with no guarantees, only expectations.
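
The point can be made concrete with a small calculation. This is a sketch
under a stated assumption, not a description of any posted design: suppose
each site that detects a bad article independently mails the originator with
some fixed probability p (both `p` and the function names below are
hypothetical).

```python
def chance_of_no_notice(p: float, n_sites: int) -> float:
    """Probability that none of n_sites detecting sites sends mail,
    when each one mails independently with probability p."""
    return (1.0 - p) ** n_sites


def expected_notices(p: float, n_sites: int) -> float:
    """Expected number of mails the originating site receives."""
    return p * n_sites
```

With p = 1/16, an article that reaches only 4 detecting sites has a chance of
(15/16)^4, roughly 77%, of producing no mail at all, while one that reaches
1000 detecting sites produces 62.5 mails on average. No single fixed p
serves both configurations, and neither figure is a guarantee.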

It is not a design goal that the whole net receive notification, it is a 
side effect of the best proposal so far whose design goal is that the
originating site receive notification. I agree it would be nice to remove
that side effect. I'll support the first proposal that does so and also
provides a sufficiently robust chance of the originating site of a bad 
message finding out about it.

>
>It should also be a design goal that only Cnews
>sites require software changes or behavioral changes
>to make the notification methods work. Lots of sites
>have "absentee sysadmins" that essentially _never_
>read news, and will only know about any of this if a
>message shows up in their email.

The news based proposal does this better than yours. Your system requires
an estimate of the number of sites running the software, which is hard
at best to arrive at. In fact, it requires the number of such sites to which
any bad message will propagate to be reasonably constant as seen from any
site that might potentially issue a bad message (i.e. all of them), which
is the point on which your proposal has been so effectively attacked. 
Examples of net configurations (real ones) at wide extremes have been
presented, and no counter has been offered as far as I have seen.

>More important, it is probably really worthwhile
>putting up and agreeing upon the design goals for
>notification, before haranguing about whether one or
>another proposed design meets the goals; a moving
>goalposts problem again.

We agree on the problem. The design goal is a solution to the problem.
Given that no solution is perfect, setting design goals in advance simply
biases the choice. No, this isn't standard design and analysis technique.
Such technique rarely takes into account that a problem cannot be
completely solved. Rather, it defines a solution that can be achieved and
then designs a method of achieving it. Thus you ask us to decide what we
can and cannot accomplish before working out ways to do so. Given our
limited options, comparing the pros and cons of different designs will be
a much more effective way of finding the solution that appears to work best.
--
Terry Poot <tp@mccall.com>                   The McCall Pattern Company
(uucp: ...!rutgers!ksuvax1!deimos!mccall!tp) 615 McCall Road
(800)255-2762, in KS (913)776-4041           Manhattan, KS 66502, USA