[news.software.b] CNews - now a pedantic software! :

heiko@methan.chemie.fu-berlin.de (Heiko Schlichting) (03/30/91)

Hello,

we installed the new CNews-patches a few hours ago and dropped a lot
of articles because of the new RFC-validation.

The main reason is "article header contains non-header line":

---- extracted from bad articles: ----

From: Roger.Sheppard@bbs.actrix.gen.nz
Message-ID: <1991Mar29.142208.17661@actrix.gen.nz>
Distribution:world
            ^^
From: Roger.Sheppard@bbs.actrix.gen.nz
Message-ID: <1991Mar29.143645.17980@actrix.gen.nz>
Distribution:world
            ^^

From: hideg@spsd4360a.erim.org (Steve Hideg (Mr. Fabulous) )
Message-ID: <1991Mar28.215634.14594@math.lsa.umich.edu>
References:<3078@beguine.UUCP> <CTAN.91Mar26215420@world.std.com>
          ^^

From: mvb@eagle.mit.edu (Mary V. Burke)
Message-ID: <1991Mar29.155634.22364@athena.mit.edu>
References:<1991Mar27.234135.3299@i88.isc.com>
          ^^

From: sallyb@sequent.UUCP (Sally A. Bowley)
Message-ID: <56373@sequent.UUCP>
Expires:
       ^^
--------------------------------
and *MUCH* more in a few hours... :-(

We don't think it was a good idea to drop a lot of articles due to
a missing space. A lot of information is lost... :-(

The new CNews is a pedantic software now...

Bye, Vera and Heiko (news@fu-berlin.de)
-- 
 |~|    Heiko Schlichting                   | Freie Universitaet Berlin 
 / \    heiko@fub.uucp                      | Institut fuer Organische Chemie
/FUB\   heiko@methan.chemie.fu-berlin.de    | Takustrasse 3
`---'   phone +49 30 838-2677; fax ...-5163 | D-1000 Berlin 33  Germany

brad@looking.on.ca (Brad Templeton) (03/31/91)

It's a tough decision, but one I would have to stand by.  If something
is a valid part of the standard, and there's no debate that it's a mistake,
and no claim that the variation is a proposed extension to the standard,
then the article should be bounced.

USENET has stagnated for years in what should be one of the thriving areas
of computer development.   You can't make any changes because of all the
old and broken software.   Those putting out bad articles have to be
thrashed.

I would go further.  If there is something that's commonly agreed upon as
bad, I would have ONE detector set up on the net that mails back to the
poster about the problem.

Now, that said, I think the standard could stand to be modified to permit:
	^Expires:$
and not insist on:
	^Expires: $

However:
	^Newsgroups:news.software.b$
Probably should be punted.

This is based on a general rule that a CR counts as whitespace, and that
space should not itself be a special character.
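
A minimal sketch of the colon-space check under discussion, in Python
rather than the C that C News is written in. The two patterns and the
name header_ok are illustrative assumptions, not code from C News or B
News, and continuation lines are ignored for brevity. The strict pattern
wants "Name: value"; the relaxed one also allows a bare "Name:" as
proposed above, while still refusing "Name:value".

```python
import re

# Strict form as the thread describes RFC 1036: name, colon, space, body.
STRICT = re.compile(r"^[A-Za-z][A-Za-z0-9-]*: .*$")

# Relaxed form (assumption, following the proposal above): a bare "Name:"
# is a valid null header, but "Name:value" without the space is not.
RELAXED = re.compile(r"^[A-Za-z][A-Za-z0-9-]*:( .*)?$")


def header_ok(line: str, relaxed: bool = False) -> bool:
    """Return True if a single header line has an acceptable shape."""
    pattern = RELAXED if relaxed else STRICT
    return pattern.match(line) is not None


if __name__ == "__main__":
    for sample in ("Expires:", "Expires: ", "Newsgroups:news.software.b"):
        print(repr(sample), header_ok(sample), header_ok(sample, relaxed=True))
```

Under these patterns "Expires:" passes only in relaxed mode, and
"Newsgroups:news.software.b" fails in both.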
-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

chip@tct.com (Chip Salzenberg) (04/01/91)

According to admin@methan.chemie.fu-berlin.de:
>We don't think it was a good idea to drop a lot of articles due to
>a missing space. A lot of information is lost... :-(

Yeah.  Ain't it great?

Seriously, the net is better off with C News sites enforcing the
standards.  I for one am very glad to see this change.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.com>, <uunet!pdn!tct!chip>
   "All this is conjecture of course, since I *only* post in the nude.
    Nothing comes between me and my t.b.  Nothing."   -- Bill Coderre

rli@buster.stafford.tx.us (Buster Irby) (04/01/91)

brad@looking.on.ca (Brad Templeton) writes:

>It's a tough decision, but one I would have to stand by.  If something
>is a valid part of the standard, and there's no debate that it's a mistake,
>and no claim that the variation is a proposed extension to the standard,
>then the article should be bounced.

I disagree.  Who is going to learn a lesson by bouncing an
article?  Certainly not the author, since his news system allowed
him to post it.  If he receives no feedback from his news system
he will believe that there is nothing wrong with the article.  In
fact, what this is going to cause is an increase in the number of
articles posted saying "This is a test to see if my articles are
getting out".

I believe that during a transitional period (say 6-12 months or
so), the software should be willing to accept anything that is
reasonable from outside sources but should not be willing to post
local articles which do not conform.  This would give immediate
(local) feedback to the offenders by not posting their articles,
while not depriving the net readers of articles which are
slightly out of spec.
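
The transitional policy proposed here can be sketched roughly as
follows. This is only an illustration in Python of the proposal, not
what C News does; the function names and the three result strings are
made up for the example.

```python
import re

# Assumed shape of a well-formed header line: name, colon, space.
HEADER = re.compile(r"^[A-Za-z][A-Za-z0-9-]*: ")


def malformed(headers):
    """Return the header lines that lack the colon-space form."""
    return [line for line in headers if not HEADER.match(line)]


def decide(headers, local):
    """Strict for locally posted articles, lenient for incoming ones."""
    if not malformed(headers):
        return "accept"
    # Local posters get immediate feedback; remote articles are kept
    # but logged so the problem can still be debugged.
    return "reject" if local else "accept-and-log"
```

With headers such as ["From: a@b", "Distribution:world"], a local
posting is refused while the same article arriving from a neighbour is
accepted and logged.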

Finally, I would like to support a statement made by Peter da
Silva the other day:

>"Be liberal about what you receive, conservative about what you generate"

kyle@uunet.UU.NET (Kyle Jones) (04/02/91)

In article <27F63B37.CFD@tct.com> chip@tct.com (Chip Salzenberg) writes:
 > According to admin@methan.chemie.fu-berlin.de:
 > >We don't think it was a good idea to drop a lot of articles due to
 > >a missing space. A lot of information is lost... :-(
 > 
 > Yeah.  Ain't it great?

No, it ain't great.  This change punishes the wrong parties.
It's the people who are running the "correct" software that are
being deprived of information.  I can understand refusing to
parse all the strange date formats out there, but not this
niggling over a space.  C-News knows what's wrong with the
article, but refuses to correct it.  This is quite different from
knowing something is wrong but having no idea how to correct it,
or knowing what is wrong and having no way to correct it.

rli@buster.stafford.tx.us (Buster Irby) (04/03/91)

chip@tct.com (Chip Salzenberg) writes:

>Seriously, the net is better off with C News sites enforcing the
>standards.  I for one am very glad to see this change.

You would be singing a different tune if your article had been
bounced because it failed to comply with the new date standard.
Check your headers, you will find that the articles you are posting
contain an invalid date stamp.  Since Henry decided to bend the rules
a little for the date checking and allow the old ctime(3) format,
don't you believe that he could bend them a little to accommodate a
missing space?

Clean up your own act before you start tossing stones, Chip.

henry@zoo.toronto.edu (Henry Spencer) (04/03/91)

In article <1991Apr02.183535.13886@buster.stafford.tx.us> rli@buster.stafford.tx.us writes:
>Check your headers, you will find that the articles you are posting
>contain an invalid date stamp. 

Sure about that?  Sure they didn't have a valid one when posted, but got
it rewritten into an invalid one by a B News site en route?

>Since Henry decided to bend the rules
>a little for the date checking and allow the old ctime(3) format...

Geoff is the guilty party on this stuff, although I agree with most of
his decisions.  And RFC1036 itself strongly recommends bending the rules
on date checking (and only on date checking).
-- 
"The stories one hears about putting up | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 are all true."  -D. Harrison|  henry@zoo.toronto.edu  utzoo!henry

rli@buster.stafford.tx.us (Buster Irby) (04/03/91)

rli@buster.stafford.tx.us (Buster Irby) writes:

>chip@tct.com (Chip Salzenberg) writes:

>>Seriously, the net is better off with C News sites enforcing the
>>standards.  I for one am very glad to see this change.

>You would be singing a different tune if your article had been
>bounced because it failed to comply with the new date standard.
>Check your headers, you will find that the articles you are posting
>contain an invalid date stamp.  Since Henry decided to bend the rules
>a little for the date checking and allow the old ctime(3) format,
>don't you believe that he could bend them a little to accommodate a
>missing space?

>Clean up your own act before you start tossing stones, Chip.

Well, it has been pointed out to me that the date headers in
Chip's articles were valid when they left his site, but had
apparently been munged by some other site before reaching buster.

I sincerely apologize for bashing him wrongly.

However, this just makes my point even stronger.  Where is the
justice in dropping an article on the floor that was munged by
another site downstream of the author's site?  The new practice
of just dropping all noncompliant articles will aggravate this
problem because we will be discarding the very articles we need
to debug the problem.

henry@zoo.toronto.edu (Henry Spencer) (04/04/91)

In article <1991Apr03.042907.16938@buster.stafford.tx.us> rli@buster.stafford.tx.us writes:
>However, this just makes my point even stronger.  Where is the
>justice in dropping an article on the floor that was munged by
>another site downstream of the author's site...

It would make it stronger if this actually happened.  It doesn't.
The date munging that B News does gives dates that C News accepts.
For all the problems B News's header mangling causes, the resulting
headers generally *are* legal RFC822/1036 headers, so this is an
imaginary problem.
-- 
"The stories one hears about putting up | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 are all true."  -D. Harrison|  henry@zoo.toronto.edu  utzoo!henry

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (04/04/91)

In article <1991Apr3.001125.2057@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:

| Geoff is the guilty party on this stuff, although I agree with most of
| his decisions.  And RFC1036 itself strongly recommends bending the rules
| on date checking (and only on date checking).

  I'm going to make a suggestion here, and let whoever agrees with my
vision of the future run with it. In a few years we will have a date
rollover. While a date containing, for instance, 91 and 04, is obviously
parsable into day of the month and year, that won't be true shortly.

  Consider a date spec as follows:
	month	three letters, capitalised
	day	one or two digit number 1..31
	year	four digit number
	time	three groups of two digit numbers, colon separated
	TZ	three uppercase characters or a one or two digit
		number starting with + or -.

  This is almost like the fields in RFC1036 (as I remember, I have to
get a new copy), except the year. It has the advantage that every field
may be differentiated from every other, and therefore order is no longer
important. This seems like a wonderful idea, given that we're talking
about a missing space.

  Wouldn't it be nice to allow any order, one or more blanks or tabs as
delimiters, and worry about important things?
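
  A rough sketch of how such order-independent parsing might look, in
Python. It is an illustration of the spec above, not code from any news
system, and it rests on one assumption the spec leaves open: a month is
taken to be one capital followed by two lowercase letters ("Apr"), so
that it cannot be confused with a three-letter uppercase zone ("EST").

```python
import re

# Each field has a distinct shape, so tokens can arrive in any order.
SHAPES = [
    ("month", re.compile(r"^[A-Z][a-z]{2}$")),        # assumed: "Apr"
    ("year", re.compile(r"^\d{4}$")),
    ("day", re.compile(r"^\d{1,2}$")),
    ("time", re.compile(r"^\d{2}:\d{2}:\d{2}$")),
    ("tz", re.compile(r"^(?:[A-Z]{3}|[+-]\d{1,2})$")),
]


def classify(text):
    """Map whitespace-separated tokens to fields by shape alone."""
    fields = {}
    for token in text.split():          # any run of blanks or tabs
        for name, shape in SHAPES:
            if shape.match(token):
                if name in fields:
                    raise ValueError(f"duplicate {name}: {token!r}")
                if name == "day" and not 1 <= int(token) <= 31:
                    raise ValueError(f"day out of range: {token!r}")
                fields[name] = token
                break
        else:
            raise ValueError(f"unrecognised token: {token!r}")
    missing = {name for name, _ in SHAPES} - fields.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return fields
```

  With this, "Apr 4 1991 12:30:00 EST" and "1991 EST 12:30:00 4 Apr"
classify to the same set of fields.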

  If people agree with me perhaps something could be done. If you think
it's a good idea, but too early to worry, you must be unfamiliar with
the time for a good idea to become practice on usenet. And if nothing is
done the need (or lack of it) will be obvious after it's too late ;-)
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
        "Most of the VAX instructions are in microcode,
         but halt and no-op are in hardware for efficiency"

henry@zoo.toronto.edu (Henry Spencer) (04/04/91)

In article <3313@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>... While a date containing, for instance, 91 and 04, is obviously
>parsable into day of the month and year, that won't be true shortly.

Note that RFC822/1036 has never legitimized such dates, and C News now
refuses to accept them.  Dates like "04/05/91" have *never* been parsable
unambiguously.

>  This is almost like the fields in RFC1036 (as I remember, I have to
>get a new copy), except the year...

RFC1123 amended RFC822, and by proxy RFC1036, to allow four-digit years
and strongly encourage them.

>... and therefore order is no longer
>important. This seems like a wonderful idea...

If you look closely at the current C News distribution, you'll find that
such routines are in fact already provided.  The "Date:" header parsing
uses a more rigorous (and faster) version that wants to see an RFC822/1036
date, though, because of our observation that essentially all "Date:"
headers already contained standard-format dates.  A wonderful idea it
may be, but it's pretty much unnecessary in this particular context.

One reason for including the more liberal parser, by the way, is that
we want to get rid of getdate() entirely before too long, and we need
something to parse "Expires:" headers.  *Those* are much more variable,
because they are manually typed rather than automatically generated.
The plan is to accept any unambiguous (note that word: unambiguous,
which means none of this "04/05/91" garbage) date there.
-- 
"The stories one hears about putting up | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 are all true."  -D. Harrison|  henry@zoo.toronto.edu  utzoo!henry

kucharsk@Solbourne.COM (William Kucharski) (04/04/91)

In article <1991Apr03.042907.16938@buster.stafford.tx.us> rli@buster.stafford.tx.us writes:
 >However, this just makes my point even stronger.  Where is the
 >justice in dropping an article on the floor that was munged by
 >another site downstream of the author's site?  The new practice
 >of just dropping all noncompliant articles will aggravate this
 >problem because we will be discarding the very articles we need
 >to debug the problem.

However, C News isn't dropping articles because of the date; it's rather
lenient in that regard.  However, if you have a site that's rewriting, say,
"Newsgroups: " lines without the space, you definitely have broken news
software somewhere and the article SHOULD be dropped...
-- 
| William Kucharski, Solbourne Computer, Inc.     | Opinions expressed above
| Internet:   kucharsk@Solbourne.COM	          | are MINE alone, not those
| uucp:	...!{boulder,sun,uunet}!stan!kucharsk     | of Solbourne...
| Snail Mail: 1900 Pike Road, Longmont, CO  80501 | "It's Night 9 With D2 Dave!"

fields-doug@cs.yale.edu (Doug Fields) (04/04/91)

In article <1991Apr01.132449.29232@buster.stafford.tx.us>, rli@buster.stafford.tx.us (Buster Irby) writes:
|> I believe that during a transitional period (say 6-12 months or
|> so),

Well hang on a second. First of all, the standard mandates certain
things which should be adhered to no matter what. You don't see people
out there crying because they have C programs that nobody can compile
because they're using their own personal version of C now do you?

And that 6-12 months is RIDICULOUS. How fast does news propagate through
the network (from Internet down to slow UUCP)? Fast. Figure on the outside
it'll take a week for an article to get to every system receiving it. So,
perhaps a transitional period of AT MOST a month if you want to give leeway.
The standard does NOT mandate leeway, but people do...

Finally, I agree with Brad's assertion that CR is whitespace and should be
treated as such.

|> >"Be liberal about what you receive, conservative about what you generate"

One thing about this... If you're liberal about what you receive, there is no
need to be conservative about what you generate, now is there?

Doug
-- 
Doug Fields -POB 1789 Yale Station, New Haven, CT 06520- (FAX) +1 203 661-2996
Internet: fields-doug@cs.yale.edu <-- Best to reach me. Voice: +1 203 436-0184
uucp: ...uunet!sir-alan!admiral!doug --------------------- Thank you Sir-Alan!
BBS: (T2500) +1 203 661-2873, (HST/V.32) -1279, (V.32) -0450, (v29/MNP6) -2967

brad@looking.on.ca (Brad Templeton) (04/04/91)

There are a variety of things that can go wrong, and how you bounce
depends on them.

For things that are clearly the fault of the site, you bounce back to
usenet@sender-site.   For things that are likely the fault of the user,
you bounce back to the user, with a note that if it's the software's
fault, to talk to the net admin.

However, I do encourage that the standard be changed so that:

	^Header:$

is a valid null header.   Requiring ^Header: $ is wasteful and doesn't
make a lot of sense.  Is this an 822ism or a 1036ism?

Yes, at first, bouncing would result in a bad message drawing scores
of bounce messages.   This would change *real* fast.
-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

henry@zoo.toronto.edu (Henry Spencer) (04/04/91)

In article <1991Apr03.230016.8369@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>However, I do encourage that the standard be changed so that:
>
>	^Header:$
>
>is a valid null header.   Requiring ^Header: $ is wasteful and doesn't
>make a lot of sense.  Is this an 822ism or a 1036ism?

It's a 1036ism, and might possibly have been an accident.  Actually, my
own feeling is that null headers simply ought to be outlawed, as a silly
waste of bytes, but I don't suppose people will go for that...  The current
inews does eliminate them.

Unfortunately, the people with broken software who are yelling hardest
about colon-space are the ones with non-empty headers without the space.

>For things that are clearly the fault of the site, you bounce back to
>usenet@sender-site...

I am inclined to be sympathetic to the poor naive sysadmins who don't
have the time or expertise to fix their broken software and don't want
to be bombarded with a hundred messages an hour -- especially over
expensive long-distance phone calls -- telling them to fix it.  If
there were some way of pointing the messages at the authors of the
software, now *that* would be different.
-- 
"The stories one hears about putting up | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 are all true."  -D. Harrison|  henry@zoo.toronto.edu  utzoo!henry

cdr@hobbes.amd.com (Carl Rigney) (04/05/91)

In article <29844@cs.yale.edu> fields-doug@cs.yale.edu (Doug Fields) writes:
>Finally, I agree with Brad's assertion that CR is whitespace and should be
>treated as such.

	Oh, then you must agree that this:

To: user@amd.com
Subject: 

Stuff
Cc: others@amd.com

	Is equivalent to

To: user@amd.com
Subject:  Stuff
Cc: others@amd.com

	After all, both have two whitespace characters following the Subject:,
	by your definition.

--
Carl Rigney
cdr@amd.com

duncan@comp.vuw.ac.nz (Duncan McEwan) (04/05/91)

In article <1991Apr03.042907.16938@buster.stafford.tx.us> rli@buster.stafford.tx.us writes:

>Well, it has been pointed out to me that the date headers in
>Chip's articles were valid when they left his site, but had
>apparently been munged by some other site before reaching buster.

The header may not have been munged at all -- some news readers (trn, for
instance) convert the date header to local time and print it in ctime
format for display only (I only noticed this after following this thread for
a while and wondering why *everyone's* "Date:" headers seemed to be in illegal
ctime format :-)

However, as a previous poster pointed out, a basic philosophy in implementing
rfc's has always been (paraphrased) "be liberal in what you accept,
conservative in what you generate".  Unless there is a real technical reason
why you shouldn't accept an illegal header, why do you want to reject it?  In
this case, you could justify rejecting illegal date formats because they make
it harder for software to do things like converting dates to local time, or
attempting to order articles by posting date.  But a missing space after a
":" shouldn't cause anyone major problems (If someone can point out to me a
reason why it does cause problems, then I will most likely change my mind :-)

Duncan

wisner@ims.alaska.edu (Bill Wisner) (04/05/91)

>	Oh, then you must agree that this:
>
>To: user@amd.com
>Subject: 
>
>Stuff
>Cc: others@amd.com

>	Is equivalent to

>To: user@amd.com
>Subject:  Stuff
>Cc: others@amd.com

Don't be stupid.  The specs specifically say that the end of the header is
indicated by a blank line.
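
A small sketch of that rule, in Python: the header ends at the first
blank line, so the two messages quoted above do not parse the same way.
The function name split_article is made up for the example, and
continuation lines are not handled.

```python
def split_article(text):
    """Split an article into header lines and body at the first blank line."""
    head, _, body = text.partition("\n\n")
    return head.splitlines(), body


first = "To: user@amd.com\nSubject: \n\nStuff\nCc: others@amd.com\n"
second = "To: user@amd.com\nSubject:  Stuff\nCc: others@amd.com\n"

if __name__ == "__main__":
    print(split_article(first))
    print(split_article(second))
```

In the first message "Stuff" and the "Cc:" line land in the body; in
the second, all three lines are headers and the body is empty.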

Bill Wisner <wisner@ims.alaska.edu> Gryphon Gang Fairbanks AK 99775
"If a person offend you, and you are in doubt as to whether it was
intentional or not, do not resort to extreme measures; simply watch
your chance and hit him with a brick." -- Mark Twain

lmb@sat.com (Larry Blair) (04/05/91)

In article <1991Apr4.063315.20892@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:

=I am inclined to be sympathetic to the poor naive sysadmins who don't
=have the time or expertise to fix their broken software and don't want
=to be bombarded with a hundred messages an hour -- especially over
=expensive long-distance phone calls -- telling them to fix it.  If
=there were some way of pointing the messages at the authors of the
=software, now *that* would be different.

If you were sympathetic to the sysadmins you wouldn't just throw away all of
their site's postings without any indication.  Even the most naive admin would
rather be bombarded with mail than have everything posted from their site
disappear into a black hole.
-- 
Larry Blair   lmb@sat.com   {apple,decwrl}!sat!lmb

geoff@world.std.com (Geoff Collyer) (04/05/91)

Duncan McEwan:
>Unless there is a real technical reason why you shouldn't accept an
>illegal header, why do you want to reject it?
...
>But a missing space after a ":" shouldn't cause anyone major problems
>(If someone can point out to me a reason why it does cause problems, then
>I will most likely change my mind :-)

It shouldn't cause problems, but it does.  B News (in conformance with
RFC 1036) insists upon the space after the colon; if it's absent, B News
will complain "Inbound news is garbled!" (as I recall) and drop the
article (and the batch? it's been a long time).  So articles lacking
spaces after colons have been getting dropped by B News sites anyway.

One compelling argument for making C News drop such articles too was that
large B News sites (e.g. uunet) were getting the same illegal articles
over and over from their C News neighbours, and these articles were
causing at least a lot of complaints to the news administrator from the
news system and possibly loss of other news in the same batch too (I
don't remember the details; perhaps someone from an afflicted site can
comment).  Since the articles are illegal to start with, it seems
harmless to drop them immediately rather than letting them go on and then
be dropped by the B News sites.

Our usual disclaimer: relaynews just passes the bytes; it doesn't attempt
to repair broken articles.  If the article is broken enough to cause
trouble to the rest of Usenet, relaynews just logs it and drops it.
-- 
Geoff Collyer		world.std.com!geoff, uunet.uu.net!geoff

geoff@world.std.com (Geoff Collyer) (04/05/91)

Larry Blair writes:
>If you were sympathetic to the sysadmins you wouldn't just throw away all of
>their site's postings without any indication.

There are log file entries describing the problems with the illegal
articles on the machine that throws them away.  Is that really no
indication?

Arranging to keep the offending articles is not as straightforward as one
might think, but something may be arranged.  Stay tuned.

>Even the most naive admin would rather be bombarded with mail than have
>everything posted from their site disappear into a black hole.

Not if they are neighbours (or administrators) of UUNET (and if UUNET
wouldn't drop the article).  Getting stoned by hundreds of messages in
response to each posting would not thrill most administrators.
-- 
Geoff Collyer		world.std.com!geoff, uunet.uu.net!geoff

peter@taronga.hackercorp.com (Peter da Silva) (04/05/91)

fields-doug@cs.yale.edu (Doug Fields) writes:
> |> >"Be liberal about what you receive, conservative about what you generate"

> One thing about this... If you're liberal about what you receive, there is no
> need to be conservative about what you generate, now is there?

Sure... the next guy down the line might be positively reactionary.
-- 
               (peter@taronga.uucp.ferranti.com)
   `-_-'
    'U`

jerry@olivey.ATC.Olivetti.Com (Jerry Aguirre) (04/05/91)

In article <1991Apr4.231602.17306@world.std.com> geoff@world.std.com (Geoff Collyer) writes:
>It shouldn't cause problems, but it does.  B News (in conformance with
>RFC 1036) insists upon the space after the colon; if it's absent, B News
>will complain "Inbound news is garbled!" (as I recall) and drop the
>article (and the batch? it's been a long time).  So articles lacking
>spaces after colons have been getting dropped by B News sites anyway.

Darn if he isn't right.  I just verified this by hand editing an article
to get rid of a few of the spaces and passing it to rnews (B 2.11.19).
Sure enough it gave the message "inews: Inbound news is garbled".  The
article was nowhere to be found.  No mail to the poster.  Zip!

If Henry and Geoff had made it clear that they were not only making C
news follow the spec. but were making it more backward compatible then
they would have gotten a lot less flak.

--
"Batching is good.  Batching is great.  All hail batching!
It is impossible to process individual articles as efficiently as
batches, and even if you could it would be an evil thing to do."

henry@zoo.toronto.edu (Henry Spencer) (04/05/91)

In article <1991Apr04.223504.20615@sat.com> lmb@sat.com (Larry Blair) writes:
>If you were sympathetic to the sysadmins you wouldn't just throw away all of
>their site's postings without any indication.

The alternative is to inundate him with, potentially, hundreds of complaints
*per posting*.  This is exceedingly antisocial, especially for
sysadmins who pay phone bills for all that mail.

Agreed that *one* warning now and then would be a good thing.  We could
not think of any way to do it.
-- 
"The stories one hears about putting up | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 are all true."  -D. Harrison|  henry@zoo.toronto.edu  utzoo!henry

nreadwin@micrognosis.co.uk (Neil Readwin) (04/05/91)

In article <1991Apr01.132449.29232@buster.stafford.tx.us>,
rli@buster.stafford.tx.us (Buster Irby) writes:
|> I would like to support a statement made by Peter da Silva the other day:
|> 
|> >"Be liberal about what you receive, conservative about what you generate"

 OK, so maybe C news should do what it can to accept the article on the
 local system but refuse to pass it on to downstream sites :-\
 
 Phone: +44 71 528 8282  E-mail: nreadwin@micrognosis.co.uk
 You will know me by my river, you will know me by the Clyde ...

lear@turbo.bio.net (Eliot) (04/06/91)

lmb@sat.com (Larry Blair) writes:

>If you were sympathetic to the sysadmins you wouldn't just throw away all of
>their site's postings without any indication.  Even the most naive admin would
>rather be bombarded with mail than have everything posted from their site
>disappear into a black hole.

Some thoughts:

#1:	Any processing done on all machines (``the NET'') MAY have
	effects on all machines.  Thus one must consider the effects
	fairly carefully before introducing such systemic solutions.

#2:	The right place to catch format errors is on input into the
	system.  This is the least expensive (and presumably the way
	CNews works), and has the maximum impact, because 99% of the
	time the system can give an interactive error message.

As an example of #1, I am reminded of the person who forged 5 sendsys
control messages from webber@rutgers.edu, when no such address
existed.  So that cost 10*N where N was the number of machines who
responded (now about 10K or so?).

Both of these rules affirm the robustness principle, ``Be liberal with
what you accept and conservative with what you send.'' (See RFC 793
2.10, and other places).  Neither of these rules will solve all
problems.

A much better solution would be enough logging information that
various system administrators can run perl scripts to determine when
problems exist, and who to contact.  While this isn't an automatic
solution, the only automatic ones I would consider are somewhat longer
term.

One possibility is an inband control message that only gets turned
into mail or log information when it hits the offending site.  This
method presumes that the error is not so significant that the system
in question will be unable to process it.  This method would set the
overhead of the error message to no more than one control message.  In
some cases this would cause more processing, in some cases less.
-- 
Eliot Lear
[lear@turbo.bio.net]

mcr@Sandelman.OCUnix.on.ca (Michael Richardson) (04/08/91)

In article <50613@olivea.atc.olivetti.com> jerry@olivey.ATC.Olivetti.Com (Jerry Aguirre) writes:
>Darn if he isn't right.  I just verified this by hand editing an article
>to get rid of a few of the spaces and passing it to rnews (B 2.11.19).
>Sure enough it gave the message "inews: Inbound news is garbled".  The
>article was nowhere to be found.  No mail to the poster.  Zip!

  Hmm. This might explain the problems that a downstream B News site
was having a while ago. (And continues to have now and then)
  The difference is that the site tends to fill /usr/lib/news with a
never-ending log file...

>If Henry and Geoff had made it clear that they were not only making C
>news follow the spec. but were making it more backward compatible then
>they would have gotten a lot less flak.

  Agreed.


-- 
   :!mcr!:            |  The postmaster never | So much mail, 
   Michael Richardson |    resolves twice.    |  so little time.
HOME: mcr@sandelman.ocunix.on.ca 	Bell: (613) 237-5629
    Small Ottawa nodes contact me about joining ocunix.on.ca!

sob@tmc.edu (Stan Barber) (04/08/91)

In article <3313@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>  I'm going to make a suggestion here, and let whoever agrees with my
>vision of the future run with it. In a few years we will have a date
>rollover. While a date containing, for instance, 91 and 04, is obviously
>parsable into day of the month and year, that won't be true shortly.
>
>  Consider a date spec as follows:
>	month	three letters, capitalised
>	day	one or two digit number 1..31
>	year	four digit number
>	time	three groups of two digit numbers, colon separated
>	TZ	three uppercase characters or a one or two digit
>		number starting with + or -.
>
>  This is almost like the fields in RFC1036 (as I remember, I have to
>get a new copy), except the year. It has the advantage that every field
>may be differentiated from every other, and therefore order is no longer
>important. This seems like a wonderful idea, given that we're talking
>about a missing space.

Actually, RFC1123 has already modified the rules to deal with the coming
year 2000. Since RFC1123 modified RFC822 and since RFC1036 sez that 
dates are supposed to be formatted like RFC822 sez, it follows that dates
should look like this in news (and mail). Here is the section on the new
date format from RFC1123.

      5.2.14  RFC-822 Date and Time Specification: RFC-822 Section 5

         The syntax for the date is hereby changed to:

            date = 1*2DIGIT month 2*4DIGIT

         All mail software SHOULD use 4-digit years in dates, to ease
         the transition to the next century.

         There is a strong trend towards the use of numeric timezone
         indicators, and implementations SHOULD use numeric timezones
         instead of timezone names.  However, all implementations MUST
         accept either notation.  If timezone names are used, they MUST
         be exactly as defined in RFC-822.

         The military time zones are specified incorrectly in RFC-822:
         they count the wrong way from UT (the signs are reversed).  As
         a result, military time zones in RFC-822 headers carry no
         information.

         Finally, note that there is a typo in the definition of "zone"
         in the syntax summary of appendix D; the correct definition
         occurs in Section 3 of RFC-822.
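
The amended "date" production quoted above can be matched with a short
pattern. This is a sketch in Python, not the B News arpadate code; it
assumes single spaces between the three parts (RFC 822 allows more
general whitespace) and leaves the year as written, since the excerpt
does not say how a two-digit year should be expanded.

```python
import re

MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# date = 1*2DIGIT month 2*4DIGIT
DATE = re.compile(rf"^(\d{{1,2}}) ({MONTHS}) (\d{{2,4}})$")


def parse_date(text):
    """Return (day, month, year-as-written) or raise ValueError."""
    m = DATE.match(text)
    if m is None:
        raise ValueError(f"not an RFC 1123 style date: {text!r}")
    day, month, year = m.groups()
    return int(day), month, year
```

Both "4 Apr 1991" and "04 Apr 91" are accepted by this pattern, while
an all-numeric form such as "04/05/91" is refused.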

The mods to Bnews to support this are simple. Look in funcs2.c for the arpadate
function.




-- 
Stan           internet: sob@bcm.tmc.edu         Director, Networking 
Olan           uucp: rutgers!bcm!sob             and Systems Support
Barber         Opinions expressed are only mine. Baylor College of Medicine

brad@looking.on.ca (Brad Templeton) (04/09/91)

To be honest, I think the most efficient thing to do would be to add
"an integer specifying the number of seconds since the epoch" as a valid date,
and the preferred date.

It's easy to parse :-) and very difficult to interpret incorrectly, and it
takes no extra CPU.   Newsreaders would interpret it and display it as
desired locally.

Of course, this suggestion should have been done at the start, since you
now have to support the old multi-format date and this requires the use of
complex libraries.   An optional comment could go after this figure with
the date in human form.
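
A minimal sketch of what parsing such a date might look like, in Python.
The syntax (an integer, then an optional parenthesised comment), the choice
of the Unix epoch, and the name parse_numeric_date are all assumptions made
for illustration; nothing here is part of any standard.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed epoch: 1970-01-01 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical syntax: an integer, optionally followed by "(comment)".
NUMERIC_DATE = re.compile(r"^\s*(-?\d+)\s*(?:\((.*)\))?\s*$")

def parse_numeric_date(value):
    """Return (UTC datetime, comment or None) for a numeric date value."""
    match = NUMERIC_DATE.match(value)
    if match is None:
        raise ValueError("not a numeric date: %r" % value)
    when = EPOCH + timedelta(seconds=int(match.group(1)))
    return when, match.group(2)

when, comment = parse_numeric_date("671213084 (Tue, 9 Apr 91 16:04:44 GMT)")
print(when.isoformat(), "|", comment)
```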
-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

rickert@mp.cs.niu.edu (Neil Rickert) (04/10/91)

In article <1991Apr09.160444.25262@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>To be honest, I think the most efficient thing to do would be to add
>"an integer specifying the number of seconds since the epoch" as a valid date,
>and the preferred date.

 Which epoch?  The unix epoch, the MSDOS epoch, or Jan 1st 1900 which is used
as an epoch in a number of systems?
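
To make the ambiguity concrete, the sketch below (Python) converts a second
count between the three epochs mentioned. The dates assumed are 1 Jan 1970
for Unix, 1 Jan 1980 for MSDOS, and 1 Jan 1900, all taken as midnight UTC;
the names EPOCHS and convert are made up for this example.

```python
from datetime import datetime, timezone

# Assumed starting points, all at 00:00:00 UTC.
EPOCHS = {
    "unix": datetime(1970, 1, 1, tzinfo=timezone.utc),
    "msdos": datetime(1980, 1, 1, tzinfo=timezone.utc),
    "1900": datetime(1900, 1, 1, tzinfo=timezone.utc),
}

def convert(count, from_epoch, to_epoch):
    """Re-express a second count relative to a different epoch."""
    shift = EPOCHS[from_epoch] - EPOCHS[to_epoch]
    return count + int(shift.total_seconds())

# The same instant, three different integers.
print(convert(0, "unix", "1900"), convert(0, "unix", "unix"), convert(0, "unix", "msdos"))
```

A bare integer in a header says nothing about which of these was meant.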

-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115                                   +1-815-753-6940

amanda@visix.com (Amanda Walker) (04/10/91)

rickert@mp.cs.niu.edu (Neil Rickert) writes:

   Which epoch?  The unix epoch, the MSDOS epoch, or Jan 1st 1900 which is used
   as an epoch in a number of systems?

Indeed.  Remember, Brad, the world doesn't always revolve around UNIX :)...

--
Amanda Walker						      amanda@visix.com
Visix Software Inc.					...!uunet!visix!amanda
-- 
"It's very early--I may have to hurt you."	--"Good Morning, Viet Nam"

henry@zoo.toronto.edu (Henry Spencer) (04/10/91)

In article <1991Apr9.221012.19572@visix.com> amanda@visix.com (Amanda Walker) writes:
>   Which epoch?  The unix epoch, the MSDOS epoch, or Jan 1st 1900 which is used
>   as an epoch in a number of systems?
>
>Indeed.  Remember, Brad, the world doesn't always revolve around UNIX :)...

It'll be the Usenet epoch... the date of my first posting... :-) :-) :-)
-- 
"The stories one hears about putting up | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 are all true."  -D. Harrison|  henry@zoo.toronto.edu  utzoo!henry

wb8foz@mthvax.cs.miami.edu (David Lesher) (04/11/91)

rickert@mp.cs.niu.edu (Neil Rickert) writes:


> Which epoch?  The unix epoch, the MSDOS epoch, or Jan 1st 1900 which is used
>as an epoch in a number of systems?

Let's get ahead of the ball this time...

1 Jan 2001 0000:0000

and just use negative numbers until then....

-- 
A host is a host from coast to coast.....wb8foz@mthvax.cs.miami.edu 
& no one will talk to a host that's close............(305) 255-RTFM
Unless the host (that isn't close)......................pob 570-335
is busy, hung or dead....................................33257-0335

brad@looking.on.ca (Brad Templeton) (04/11/91)

This has little to do with unix.  Pick any epoch you like, as long as everybody
agrees with it.   The Unix epoch is a convenient one because you don't have
to make any changes on Unix software, as if adding a constant is a big change.

I must admit I do like the choice of Jan 1, 2001 and negative numbers.  I
might also suggest a minute count instead of a second count, since there is
no great need for the seconds on this one.    Avoids the problem of the
date wrap in 2^31 seconds.
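
For a sense of scale, this Python sketch works out where a signed 32-bit
count runs out when it counts seconds versus minutes. The Unix epoch is
assumed here only because it is easy to check; with a 2001 epoch both
results simply move 31 years later.

```python
from datetime import datetime, timedelta, timezone

LIMIT = 2 ** 31  # first value that no longer fits in a signed 32-bit integer
unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Counting seconds: the wrap arrives in January 2038.
wrap_seconds = unix_epoch + timedelta(seconds=LIMIT)

# Counting minutes: the wrap is roughly four thousand years out.
wrap_minutes = unix_epoch + timedelta(minutes=LIMIT)

print(wrap_seconds.isoformat())
print(wrap_minutes.year)
```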

-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

billd@fps.com (Bill Davidson) (04/11/91)

In article <1991Apr9.221012.19572@visix.com> amanda@visix.com (Amanda Walker) writes:
>rickert@mp.cs.niu.edu (Neil Rickert) writes:
>   Which epoch?  The unix epoch, the MSDOS epoch, or Jan 1st 1900 which is used
>   as an epoch in a number of systems?
>Indeed.  Remember, Brad, the world doesn't always revolve around UNIX :)...

How about this idea?

The year followed by seconds since 12am Jan 1 of that year.

as in:

1991 8633384

It's still very easy to parse and leaves no questions about epochs
(well, other than that of our calendar ;-).
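
A minimal Python sketch of encoding and decoding this form. It assumes the
count is taken in UTC, which the proposal does not specify; encode and
decode are names invented for the example.

```python
from datetime import datetime, timedelta, timezone

def encode(when):
    """Format an aware datetime as 'YEAR SECONDS-SINCE-JAN-1' (UTC assumed)."""
    when = when.astimezone(timezone.utc)
    start = datetime(when.year, 1, 1, tzinfo=timezone.utc)
    return "%d %d" % (when.year, int((when - start).total_seconds()))

def decode(text):
    """Parse 'YEAR SECONDS' back into an aware UTC datetime."""
    year_text, seconds_text = text.split()
    start = datetime(int(year_text), 1, 1, tzinfo=timezone.utc)
    return start + timedelta(seconds=int(seconds_text))

print(decode("1991 8633384").isoformat())
```

Under that UTC assumption the example value above lands on the evening of
10 April 1991.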

--Bill Davidson
-- 
Interviewer: Can you say something in defense against your critics?
Andy Warhol: I'm sorry but I can't because they're right.

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (04/11/91)

In article <1991Apr11.044234.23491@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:

| I must admit I do like the choice of Jan 1, 2001 and negative numbers.  I
| might also suggest a minute count instead of a second count, since there is
| no great need for the seconds on this one.    Avoids the problem of the
| date wrap in 2^31 seconds.

  Sure nice to have seconds, though, when you're trying to figure out
the order of messages.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
        "Most of the VAX instructions are in microcode,
         but halt and no-op are in hardware for efficiency"

wb8foz@mthvax.cs.miami.edu (David Lesher) (04/12/91)

>| I must admit I do like the choice of Jan 1, 2001 and negative numbers. 

Err,
I just thought of a problem. What with Lotus patenting 1 2 and
3, IBEAM grabbing /2 and PS, DEC now marketing vacuum cleaners,
and so forth.....

	Will Cnews users need a site license from Arthur C. Clarke?

;-}
-- 
A host is a host from coast to coast.....wb8foz@mthvax.cs.miami.edu 
& no one will talk to a host that's close............(305) 255-RTFM
Unless the host (that isn't close)......................pob 570-335
is busy, hung or dead....................................33257-0335

jbrown@locus.com (Jordan Brown) (04/12/91)

How about using as the epoch for this new date-representation standard
the moment the standard is agreed on?  Right now is some negative number,
next year is some positive number; the only question is where 0 is.
How about midnight on the date of the RFC announcing it?

98% :-) ... but it's as good a time as any.

barrett@Daisy.EE.UND.AC.ZA (Alan P. Barrett) (04/13/91)

In article <1991Apr09.160444.25262@looking.on.ca>,
brad@looking.on.ca (Brad Templeton) writes:
> To be honest, I think the most efficient thing to do would be to add
> "an integer specifying the number of seconds since the epoch" as a valid date,
> and the preferred date.

Humans sometimes read (and even write) news article headers.  Let's not make
that too difficult.

--apb
Alan Barrett, Dept. of Electronic Eng., Univ. of Natal, Durban, South Africa
Internet: barrett@ee.und.ac.za           UUCP: m2xenix!quagga!undeed!barrett

jrc@brainiac.mn.org (Jeffrey Comstock) (04/13/91)

In article <1991Apr9.185748.11721@mp.cs.niu.edu> rickert@mp.cs.niu.edu (Neil Rickert) writes:
>In article <1991Apr09.160444.25262@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>>To be honest, I think the most efficient thing to do would be to add
>>"an integer specifying the number of seconds since the epoch" as a valid date,
>>and the preferred date.
>
> Which epoch?  The unix epoch, the MSDOS epoch, or Jan 1st 1900 which is used
>as an epoch in a number of systems?

VMS has a really strange one.  I think it starts on Feb 14, 1758.

jrc@brainiac.mn.org (Jeffrey Comstock) (04/13/91)

In article <1991Apr11.044234.23491@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>This has little to do with unix.  Pick any epoch you like, as long as everybody
>agrees with it.   The Unix epoch is a convenient one because you don't have
>to make any changes on Unix software, as if adding a constant is a big change.
>
>I must admit I do like the choice of Jan 1, 2001 and negative numbers.  I
>might also suggest a minute count instead of a second count, since there is
>no great need for the seconds on this one.    Avoids the problem of the
>date wrap in 2^31 seconds.
>
>-- 
>Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

How about forgetting about seconds and using days?  Then start the epoch at
1 A.D.  It's good for 11767033 years.  By that time we won't care about
news anymore :-).
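
The 11767033 figure appears to come from an unsigned 32-bit day count
divided by 365-day years; that reading is an assumption, checked in the
Python sketch below. Python's date.toordinal() happens to use the same kind
of day count, with 1 January of year 1 numbered as day 1 in the proleptic
Gregorian calendar.

```python
from datetime import date

# 2^32 days at 365 days per year (leap days ignored).
years_of_range = 2 ** 32 // 365
print(years_of_range)

# Day number for an example date, counted from 1 Jan of year 1 as day 1.
first_day = date(1, 1, 1).toordinal()
posting_day = date(1991, 4, 13).toordinal()

# The count converts straight back to a calendar date.
recovered = date.fromordinal(posting_day)
print(posting_day, recovered.isoformat())
```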

ske@pkmab.se (Kristoffer Eriksson) (04/13/91)

In article <1991Apr4.231602.17306@world.std.com> geoff@world.std.com (Geoff Collyer) writes:
>It shouldn't cause problems, but it does.  B News (in conformance with
>RFC 1036) insists upon the space after the colon; if it's absent, B News
>will complain "Inbound news is garbled!" (as I recall) and drop the
>article

This is only part of the truth (as far as I can determine by reading the
B news source). B news indeed requires a space to recognize what header
line it is looking at (function type() in header.c), but only the colon
is required for it to consider the line a header line (plus, the header name
must start with a letter and not contain spaces).

If the space is left out, the line will be treated as a non-standard header
line, instead of the header the sender thought they supplied, but still a
header nonetheless.

The message about garbled news is produced if reading of the header part
of the article fails as a whole (function hread() in header.c). Leaving out
the space on some headers will not in general make hread() return failure.
However, there are some other requirements that the header part has to meet,
in order not to fail. In particular, the From:, Date:, and Message-ID:
headers must be supplied (function frmread()). If the space is left out of
any of these three header lines, B news will perform the way you claim and
drop the article.

All examples of real-life left-out spaces that have been quoted so far
by other posters here involved only other, non-essential headers, though.
The examples I've seen quoted involved the headers Distribution:, Sender:,
Expires:, Summary:, and References:. Inews will not complain about missing
spaces in these. These headers just will be ineffective, nothing else.

I suppose that missing spaces in the above-mentioned three essential
header lines would already have been discovered and corrected long ago,
since they _would_ cause the articles to be dropped.
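
The behaviour described above can be sketched roughly as follows (Python).
This is an illustration of the description, not a translation of header.c:
the function names are invented, and exact-case matching of header names is
an assumption.

```python
# Headers that must be recognized for the article to be accepted.
REQUIRED = ("From", "Date", "Message-ID")

def classify_line(line):
    """Return (kind, name); kind is 'recognized', 'unrecognized' or 'non-header'."""
    name, colon, rest = line.partition(":")
    # A header line needs a colon and a name starting with a letter, no spaces.
    if not colon or not name or not name[0].isalpha() or " " in name:
        return "non-header", None
    # Without the space after the colon the name is not identified,
    # but the line still counts as a (non-standard) header.
    if rest.startswith(" "):
        return "recognized", name
    return "unrecognized", name

def article_accepted(header_lines):
    """True if every required header was recognized."""
    seen = {name for kind, name in map(classify_line, header_lines)
            if kind == "recognized"}
    return all(required in seen for required in REQUIRED)

good = ["From: a@b", "Date: 13 Apr 1991", "Message-ID: <1@b>", "Distribution:world"]
bad = ["From:a@b", "Date: 13 Apr 1991", "Message-ID: <1@b>"]
print(article_accepted(good), article_accepted(bad))
```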

-- 
Kristoffer Eriksson, Peridot Konsult AB, Hagagatan 6, S-703 40 Oerebro, Sweden
Phone: +46 19-13 03 60  !  e-mail: ske@pkmab.se
Fax:   +46 19-11 51 03  !  or ...!{uunet,mcsun}!sunic.sunet.se!kullmar!pkmab!ske

rmk@rmkhome.UUCP (Rick Kelly) (04/13/91)

In article <1991Apr9.221012.19572@visix.com> amanda@visix.com (Amanda Walker) writes:
>rickert@mp.cs.niu.edu (Neil Rickert) writes:
>
>   Which epoch?  The unix epoch, the MSDOS epoch, or Jan 1st 1900 which is used
>   as an epoch in a number of systems?
>
>Indeed.  Remember, Brad, the world doesn't always revolve around UNIX :)...

And you can see the unfortunate state of the world today. :-)


Rick Kelly	rmk@rmkhome.UUCP	frog!rmkhome!rmk	rmk@frog.UUCP

huntting@csn.org (Brad Huntting) (04/13/91)

In article <50613@olivea.atc.olivetti.com> jerry@olivey.ATC.Olivetti.Com (Jerry Aguirre) writes:
>In article <1991Apr4.231602.17306@world.std.com> geoff@world.std.com (Geoff Collyer) writes:
>>It shouldn't cause problems, but it does.  B News (in conformance with
>>RFC 1036) insists upon the space after the colon; if it's absent, B News
>>will complain "Inbound news is garbled!" (as I recall) and drop the
>>article (and the batch? it's been a long time).  So articles lacking
>>spaces after colons have been getting dropped by B News sites anyway.

> [...]

>If Henry and Geoff had made it clear that they were not only making C
>news follow the spec. but were making it more backward compatible then
>they would have gotten a lot less flak.

Catch me if I'm wrong but didn't we hash out this Bnews<->Cnews
`garbled'-y-gook several weeks ago?  And wasn't it in light of this very
problem that these measures were taken?  And didn't Henry mention that
these patches were in the works at the time?

Of course, I could be wrong... :)

-- 


brad

brad@looking.on.ca (Brad Templeton) (04/14/91)

Actually, in retrospect, the best plan is probably to use a floating point
number for the date.   In spite of the fact that I see no problem with
a low resolution figure like minutes, some people seem keen on fine resolution
for this number which is used only to sort articles by posting time, measure
when to expire or reject them, and give the user a rough idea of the time of
posting.   For the first, minutes is good enough.  For the latter, hours
would do.

However, perhaps a floating point number of days is the most logical date
interchange format.   This is clean, although FP minutes has the advantage
that people who want to just use integer math and don't care about seconds
have an easy out.  Just about all news software would be fine with long integer
minutes, and those who really want accurate time could use the fractional part
if they desired.
-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

nreadwin@micrognosis.co.uk (Neil Readwin) (04/14/91)

In article <1991Apr13.003126.2977@brainiac.mn.org>, jrc@brainiac.mn.org
(Jeffrey Comstock) writes:
|> VMS has a really strange one [epoch].  I think it starts on Feb 14, 1758.

It starts on November 17, 1858, which is the base of the Modified Julian Day
system adopted by the Smithsonian Astrophysical Observatory (SAO) in 1957 for
satellite tracking. The year 1858 preceded the oldest star catalog in use at 
SAO, which also avoided having to use negative time in any of the satellite 
tracking calculations.
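
A short Python sketch of a day count from that base date. Whole days only;
the function name is made up for the example.

```python
from datetime import date

# Base of the Modified Julian Day count, as given above.
MJD_EPOCH = date(1858, 11, 17)

def modified_julian_day(day):
    """Whole days elapsed since 17 November 1858."""
    return day.toordinal() - MJD_EPOCH.toordinal()

print(modified_julian_day(date(1991, 4, 14)))
```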

 Phone: +44 71 528 8282  E-mail: nreadwin@micrognosis.co.uk
 Quote: The 64 bit signed value has a range of approximately 29,000 years; 
 VMS development has not developed a plan for what to do when it overflows.

richard@locus.com (Richard M. Mathews) (04/16/91)

brad@looking.on.ca (Brad Templeton) writes:

>However, perhaps a floating point number of days is the most logical date
>interchange format.   This is clean, although FP minutes has the advantage
>that people who want to just use integer math and don't care about seconds
>have an easy out.  Just about all news software would be fine with long integer
>minutes, and those who really want accurate time could use the fractional part
>if they desired.

I am not taking a position for or against such a proposal.  I would like
to point out, however, that such a standard already exists.  Astronomers
and others have long had a need for dates with which calculations are
easy.  They use fractional days to whatever precision is appropriate,
and they use a standard epoch.  If you want to pick an epoch, why not
one which has already gained acceptance somewhere?

The system used by astronomers is the Julian Date.  In
	Meeus, Jean, _Astronomical_Formulae_for_Calculators_,
	(Willman-Bell, 1985).
it says
	The Julian Day begins at Greenwich *noon*, that is at 12h
	Universal Time (or 12h Ephemeris Time, and in that case the
	expression Julian Ephemeris Day is generally used).  For
	example, 1977 April 26.4 = JD 2443259.9.

The epoch turns out to be JD 0 == January 1.5, 4713 B.C. (assuming I
did the calculation right, but that date sounds about right).  I have
no idea why this number was chosen (except for a vague memory that it
was someone's idea of the time of the creation of the universe).

Note that modern dates are up around 2 million.  A common convention is
to subtract off 2.4 million from these dates.  Note that JD 2400000.0
is November 16.5, 1858.  The convention of subtracting 2.4 million is
the origin of the Modified Julian Day mentioned in another posting:

From: nreadwin@micrognosis.co.uk (Neil Readwin)
>|> VMS has a really strange one [epoch].  I think it starts on Feb 14, 1758.
>It starts on November 17, 1858, which is the base of the Modified Julian Day
>system adopted by the Smithsonian Astrophysical Observatory (SAO) in 1957 for
>satellite tracking.

Meeus's book provides algorithms for conversion from Gregorian Date to
Julian Date and back.  I've modified what is there slightly to make this
more like code than the verbal recipe there.  In the following, INT(x) means
the integer part (round towards negative infinity: INT(-1.5) == -2).  This is
not intended to be in any particular programming language though it looks a
lot like C.  This is untested, so I could easily have translated something
wrong from the book.

    To convert Gregorian (with year -1 == 2 B.C.) to JD
	Let "y" be the year, "m" be the month (1 to 12), and "d" be the day
	(including any fraction).  Calculate the Julian Day, "jd".

	if (m <= 2) {
		y = y - 1;
		m = m + 12;
	}
	if ((y > 1582)
	 OR (y == 1582 AND m > 10)
	 OR (y == 1582 AND m == 10 AND d >= 15)) {
		a = INT(y/100);
		b = 2 - a + INT(a/4);
	}
	else
		b = 0;

	jd = INT(365.25 * y) + INT(30.6001 * (m + 1)) + d + 1720994.5 + b;

    To convert positive JD to Gregorian (with year -1 == 2 B.C.)
	Let "jd" be the Julian Day (including fraction).  Calculate
	"day", "month", and "year".

	z = INT(jd + .5);

	if (z < 2299161)
		a = z;
	else {
		a0 = INT((z - 1867216.25) / 36524.25);
		a = z + 1 + a0 - INT(a0/4);
	}
	b = a + 1524;
	c = INT((b - 122.1) / 365.25);
	d = INT(365.25 * c);
	e = INT((b - d) / 30.6001);

	day = b - d - INT(30.6001 * e) + jd + .5 - z;
	if (e <= 13)
		month = e - 1;
	else
		month = e - 13;
	if (month <= 2)
		year = c - 4715;
	else
		year = c - 4716;
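
For anyone who wants to try the two recipes above, here is a line-for-line
Python transcription, with INT() taken as math.floor. The function names
are invented for the example and integer years and months are assumed; it
is a sketch of the recipe as written above, not a checked copy of Meeus.

```python
import math

def gregorian_to_jd(year, month, day):
    """Calendar date (day may carry a fraction) to Julian Date."""
    y, m = year, month
    if m <= 2:
        y -= 1
        m += 12
    # From 1582 October 15 onward the Gregorian correction applies.
    if (y, m, day) >= (1582, 10, 15):
        a = y // 100
        b = 2 - a + a // 4
    else:
        b = 0
    return (math.floor(365.25 * y) + math.floor(30.6001 * (m + 1))
            + day + 1720994.5 + b)

def jd_to_gregorian(jd):
    """Positive Julian Date to (year, month, day-with-fraction)."""
    z = math.floor(jd + 0.5)
    if z < 2299161:
        a = z
    else:
        a0 = math.floor((z - 1867216.25) / 36524.25)
        a = z + 1 + a0 - a0 // 4
    b = a + 1524
    c = math.floor((b - 122.1) / 365.25)
    d = math.floor(365.25 * c)
    e = math.floor((b - d) / 30.6001)
    day = b - d - math.floor(30.6001 * e) + (jd + 0.5 - z)
    month = e - 1 if e <= 13 else e - 13
    year = c - 4715 if month <= 2 else c - 4716
    return year, month, day

print(gregorian_to_jd(1977, 4, 26.4))
print(jd_to_gregorian(2400000.0))
```

The two printed values can be compared against the figures quoted earlier:
the Meeus example for 1977 April 26.4, and November 16.5, 1858 for
JD 2400000.0.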

Richard M. Mathews			Lietuva laisva = Free Lithuania
richard@locus.com			Brivu Latviju  = Free Latvia
lcc!richard@seas.ucla.edu		Eesti vabaks   = Free Estonia
...!{uunet|ucla-se|turnkey}!lcc!richard