[news.admin] Cnews "dropped article" Notifications - another proposal

timk@wynnds.xenitec.on.ca (Tim Kuehn) (06/04/91)

I made a previous proposal about a possible way to notify site admins 
that articles are being dropped via some kind of post to news.lists or 
other relevant newsgroup, and got a reasonable response. There were some
shortcomings in that proposal, which, on further thought, I now think I've
overcome and present below. 

Comments, thoughts, and criticisms welcome. Please keep flames to a 
low-boil :) Praise always welcomed :)

Overview:
---------
The big part of the hoopla about the current Cnews method of handling 
badly formatted articles is the fact that Cnews only logs the error, but 
does not report it. Current proposals on the floor suggest mailing a 
notification to the offending site admin or posting a list of bad articles 
to a newsgroup. This falls down when you consider that some sites with a 
wide fan-out before hitting a current patchlevel Cnews site(s) could 
theoretically find themselves bombarded with email or the newsgroup flooded
with notifications. This is not good. 

Problem with current proposals:
-------------------------------
What seems to be the problem with these kinds of proposals is that they're 
attempting to notify the offender of their errors in a manner that is 
*too detailed* with article numbers, error types, etc. If too many sites 
notify one site or set of sites this could lead to a swamping of whatever 
transmission media (and possibly store-and-forward spool space) may be 
between the offending site(s) and their upstream feeds, not to mention 
the hapless sysadmin who finds himself swamped with mail.

Alternative Proposal:
---------------------
What is *really* needed here is a simple "Your site is posting bad
articles" message getting back to the offender, as well as notification
of the site name where an error was found. Do *not* include the error 
type or any article numbers, let the offending site's sysadmin *mail* 
the sysadmin of one of the sites that caught the error asking for further
clarification of what the error was. Assuming co-operative sysadmins (is
that too much to expect?) the offender site's sysadmin can fix the problem 
causing the error and the problem'll be solved.

If co-operative sysadmins *are* too much to expect :(, then make the 
implementation of this module optional via Cnews's "build" script. If 
the sysadmin doesn't want to help out, he doesn't have to then.

Advantages:
-----------
- 100% reporting that errors have been found and articles are being 
  dropped. 
- would not swamp the net in a net.terrorist fashion.

Disadvantages:
--------------
- Initially this article could be somewhat large until sites get their 
  posting SW RFC compliant and would use up corresponding bandwidth.
- would be repeatedly propagated between machines
- does not give specific error messages detailing precise problems
  or message IDs

The News Article Design: 
------------------------
The news article would be circulated in appropriate newsgroup with 
the following format:

offender_site1: [site_name1:date1, site_name2:date4, site_name3:date7]
offender_site2: [site_name1:date2, site_name2:date5, site_name3:date8]
offender_site3: [site_name1:date3, site_name2:date6, site_name3:date9]

Where: 
------
offender_site - the machine name of the site who's posting non-acceptable 
		news articles. 
site_name     -	the machine name of the site who caught the badly-formatted
		news articles and dropped them on the floor
date          - the last date this site found a badly-formatted article
		from offender_site.
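
A minimal sketch of reading and writing one line of the article format
above. The function names parse_line and format_line are hypothetical,
and the sketch assumes that site names and dates contain no commas or
colons (for example dates written as 1991-06-04), which the proposal
does not specify.

```python
def parse_line(line):
    """Parse 'offender: [site:date, site:date]' into (offender, {site: date})."""
    offender, _, rest = line.partition(":")
    rest = rest.strip()
    if not (rest.startswith("[") and rest.endswith("]")):
        raise ValueError("malformed entry: %r" % line)
    reports = {}
    for item in rest[1:-1].split(","):
        item = item.strip()
        if not item:
            continue  # tolerate an empty list or a trailing comma
        site, _, date = item.partition(":")
        reports[site.strip()] = date.strip()
    return offender.strip(), reports


def format_line(offender, reports):
    """Render one offender entry, with reporting sites in sorted order."""
    body = ", ".join(
        "%s:%s" % (site, date) for site, date in sorted(reports.items())
    )
    return "%s: [%s]" % (offender, body)
```

Sorting the reporting sites on output keeps the rendered line stable, so
an article that was parsed and written back unchanged compares equal to
the original.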

Local "Offender Site" File:
---------------------------
offender_site1	error_date1
offender_site2	error_date2
offender_site3	error_date3
offender_site4	error_date4


This would be a local copy of the list of offenders kept by 
the machine, and would be used to update the news articles
when they come in. 

Procedure:
----------
1) An article of the above format is received by the site and stored 
in the system somewhere. 

2) This article would be compared against the local "offender_site" list. 
Any differing or omitted entries in the news article compared to the local 
offender-site list would be updated in the news article to reflect the 
current offender-site name and date information. 

(Note: the news article would only be scanned for entries made by the 
*local* site. Other site information would not be touched but left to 
the remote sites to update as required.)

3) If the news article was changed by the local machine, it would be 
reposted to the net and shipped out with the next poll. If it was
not changed, then it would not be reposted.
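
Steps (2) and (3) could be sketched roughly as follows. This assumes the
article has already been parsed into a dictionary of the form
{offender: {reporting_site: date}} and the local file into
{offender: date}; update_article is a hypothetical name. Only the local
site's own entries are touched. Entries that have expired from the local
file are left in the article, since the procedure does not say what
should happen to them.

```python
import copy


def update_article(article, local_offenders, local_site):
    """Return (updated_article, changed) after merging the local list.

    article:         {offender: {reporting_site: date}}
    local_offenders: {offender: date} as kept in the local file
    local_site:      this machine's site name
    """
    updated = copy.deepcopy(article)  # never modify the caller's copy
    changed = False
    for offender, date in local_offenders.items():
        reports = updated.setdefault(offender, {})
        # Add a missing entry or correct a differing one; entries made
        # by other sites are left alone.
        if reports.get(local_site) != date:
            reports[local_site] = date
            changed = True
    return updated, changed
```

The changed flag is what step (3) needs: the article is reposted only
when it is true.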

Crontasks:
----------
At periodic intervals, a crontask would scan the log file for offending
sites whose articles had been dropped. It would then update the local 
offender-sites file to reflect the new last-error-found-date, and would 
also scan for expired last-error-found in the offender-site local file 
entries and delete them if they were too old.
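
A rough sketch of what such a crontask might do. It does not parse the
real Cnews log; it assumes some earlier step has already pulled
(offender_site, date) pairs out of the log. The name refresh_offenders
and the 14-day expiry are assumptions made for illustration.

```python
from datetime import date, timedelta


def refresh_offenders(local, dropped, today, max_age_days=14):
    """Update last-error dates from the log, then expire stale entries.

    local:   {offender_site: date} from the local offender-site file
    dropped: iterable of (offender_site, date) pairs taken from the log
    today:   the current date, passed in so the result is reproducible
    """
    result = dict(local)
    for site, seen in dropped:
        # Keep only the most recent error date for each offender.
        if site not in result or seen > result[site]:
            result[site] = seen
    cutoff = today - timedelta(days=max_age_days)
    # Drop offenders whose last error is older than the cutoff.
    return {site: seen for site, seen in result.items() if seen >= cutoff}
```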

Question/Caveat:
----------------
Under (3) above, if multiple copies of the article came in, provision 
would have to be made to ensure that the number of copies kept on the 
local spool area doesn't grow like wildfire, and that only one file is kept 
on the system at a time.

------------------------------------------------------------------------ 
Tim Kuehn			 TDK Consulting Services  (519)-888-0766
timk@wynnds.xenitec.on.ca  -or-  !{watmath|lsuc}!xenitec!wynnds!timk
Valpo EE turned loose on unsuspecting world! News at 11!
"You take it seriously when someone from a ballistics research lab calls you."
Heard at a Unix user's meeting discussing connectivity issues.

stealth@engin.umich.edu (Mike Pelletier) (06/05/91)

In article <1991Jun4.161440.4161@wynnds.xenitec.on.ca>
	 timk@wynnds.xenitec.on.ca (Tim Kuehn) writes:

	[See the references line]

All in all, looks like a very workable idea, needing minimal to zero
modifications to C-news code.  I think I'll go practice perl a bit by
(making an attempt at) writing a script that does this sort of thing...

-- 
Mike Pelletier             | "Wind & waves are breakdowns in the commitment of
The University of Michigan |  getting from here to there, but they are the con-
  College of Engineering   |  ditions for sailing.  Not something to eliminate,
Student/Systems Admin      |  but something to dance with."

weigl@sibelius.inria.fr (Konrad Weigl) (06/07/91)

In article <1991Jun4.161440.4161@wynnds.xenitec.on.ca>, timk@wynnds.xenitec.on.ca (Tim Kuehn) writes:
>posting a list of bad articles 
> to a newsgroup. This falls down when you consider that some sites with a 
> wide fan-out before hitting a current patchlevel Cnews site(s) could 
> theoretically find themselves bombarded with email or the newsgroup flooded
> with notifications. This is not good. 

I am no *pro* in this field, but if a site identifying a bad article would first
read the newsgroup before anything else is done, there would be no problem:
If the article is posted already, no further reaction, else:

Any site finding an unreported bad article informs source & posts simultaneously.

Am I missing something?

Instead of a newsgroup with a wide fan-out & lots of traffic a centralized location 
plus backup-site might be more efficient as keeper of the "bad-article found&reported-file":

You'd only have traffic for specific mail inquiries ("Is article Nr XXX in file?") plus
return, instead of undifferentiated newsgroup dissemination as soon as a new badly-formatted
article crops up.

Konrad Weigl               Tel. (France) 93 65 78 63
Projet Pastis              Fax  (France) 93 65 78 58
INRIA-Sophia Antipolis     email weigl@mirsa.inria.fr
2004 Route des Lucioles    
B.P. 109
06561 Valbonne Cedex
France

ske@pkmab.se (Kristoffer Eriksson) (06/08/91)

Here's yet another proposal for what to do about articles that Cnews doesn't
want to propagate further, that I haven't seen anyone touch on yet.

The problem with having every Cnews site that encounters the objectionable
article returning a mail message to the author is that there is no upper
bound on how many messages the author may be sent from all over the net.

So one way to recast this problem is as how to select which site or sites
should return a message, if having all sites do that is out. My proposal is
to make it possible for every site to request this service of any other site,
and return messages only to those sites that have requested that. Each Cnews
site would have a list of sitenames that have requested notification of
errors, which it would check every time it encounters an objectionable
article. To make this work smoothly, one would need some automated means
of adding and removing one's own sitename to or from this list on any other
site of your choosing. A new type of control message would do. A mail server
would probably be the absolutely best solution, so requests could go directly
to the target site instead of being broadcast through the news system, but
that would bring us somewhat outside what can be contained in the Cnews system
alone. A friendly news admin could add sitenames manually to this list, too,
but a desirable goal is to remove the dependency on friendly news admins, so
that wouldn't be quite enough. Thus I propose a control message as the best
solution.

Each site that worries about the welfare of their articles would study their
net connections and determine one or several strategically placed sites
that they want to receive error messages from, for instance the nearest
Cnews site(s) or some regionally important big site. But ultimately it
would probably be wise for most sites to set up error-returning partners
like this, just as a routine matter, worries or not.

As a default, we could also let Cnews always return bad messages that
originated at an immediate neighbour, even without any explicit request.
That shouldn't be too dangerous.
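
A small sketch of the bookkeeping this would need on each Cnews site.
The class name NotifyList and the "add"/"remove" actions are
hypothetical; no such control message exists in Cnews, and how the
originating site of a bad article is determined is left out.

```python
class NotifyList:
    """Sites that asked this site for notification of bad articles."""

    def __init__(self, neighbours=()):
        self.subscribers = set()
        # Immediate neighbours are notified by default, without a request.
        self.neighbours = set(neighbours)

    def handle_request(self, site, action):
        """Apply a (hypothetical) control-message request from a site."""
        if action == "add":
            self.subscribers.add(site)
        elif action == "remove":
            self.subscribers.discard(site)
        else:
            raise ValueError("unknown action: %r" % action)

    def should_notify(self, origin_site):
        """True if a bad article from origin_site warrants a message."""
        return origin_site in self.subscribers or origin_site in self.neighbours
```

Because a site only ever hears from the sites it subscribed to, plus its
immediate neighbours, the number of messages it can receive per bad
article is bounded by choices it made itself.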

Apart from all that, I still think it is overly pedantic to drop articles
simply because of a missing space after a colon. I don't see that the space
carries any significance, any meaning of its own, right after that colon in
the header, especially not in non-essential headers. To me the space appears
only to be an aesthetic enhancement, comparable to white space between the
syntax elements of a programming language. In most languages you can change
the number of spaces without affecting the meaning. That the RFCs specify
exactly one space can be viewed simply as the "default" rendering
that should be used when the header leaves your program, without saying
much about what your program is allowed to accept when it arrives at your
program. I think such an interpretation is in the spirit of "be liberal in
what you expect, and strict about what you emit".
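
As an illustration of that interpretation, here is a sketch that accepts
any amount of white space (including none) after the colon on input and
always writes exactly one space on output. The function names are
hypothetical, and this is not how Cnews itself handles headers.

```python
def parse_header(line):
    """Split 'Name:value' leniently; white space after the colon is optional."""
    name, sep, value = line.partition(":")
    if not sep or not name or any(c.isspace() for c in name):
        raise ValueError("not a header line: %r" % line)
    return name, value.strip()


def emit_header(name, value):
    """Write a header strictly, with exactly one space after the colon."""
    return "%s: %s" % (name, value)


def normalize_header(line):
    """Liberal on input, strict on output."""
    return emit_header(*parse_header(line))
```

Splitting on the first colon only means values that contain colons
themselves, such as the time in a Date header, pass through intact.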

-- 
Kristoffer Eriksson, Peridot Konsult AB, Hagagatan 6, S-703 40 Oerebro, Sweden
Phone: +46 19-13 03 60  !  e-mail: ske@pkmab.se
Fax:   +46 19-11 51 03  !  or ...!{uunet,mcsun}!sunic.sunet.se!kullmar!pkmab!ske