[news.software.b] Changes to CNEWS date parsing - what is now legal?

jms@carat.arizona.edu (Joel M. Snyder) (06/22/91)

I hesitate to jump into this fray, but... I'm the maintainer of a 
news reader (Vnews for VAX/VMS), and I can see the smoke of this
issue from a mile away.  However, the fire seems to have escaped
me.  Can someone summarize the date/time formats which are now considered
legal by C-News?  

The format of the date normally supplied by VNEWS would be, for this
message:

	21 JUN 91 23:01:23 

I interpret this to be incorrect, since RFC-822 seems to imply that
JUN needs to be Jun and one MUST have a time zone.  

So: how strictly is C-News going to hold me to RFC-822?  More
generally, "what are the new rules?"

Thanks for any help,

jms

Joel M Snyder, 627 E Speedway, 85705  Phone: 602.626.8680 FAX: 602.795.0900
The Mosaic Group, Dep't of MIS, the University of Arizona, Tucson
BITNET: jms@sovset  Internet: jms@carat.arizona.edu  SPAN: 47541::carat::jms   
"Never contend with a man who has nothing to lose." - Gracian

eggert@twinsun.com (Paul Eggert) (06/23/91)

jms@carat.arizona.edu (Joel M. Snyder) writes:

>	21 JUN 91 23:01:23 

>I interpret this to be incorrect, since RFC-822 seems to imply that
>JUN needs to be Jun and one MUST have a time zone.  

The JUN doesn't have to be Jun; see ``Case Independence'' in RFC 822.
Unfortunately, if you spell it JUN you'll tickle bugs in old versions of
some widely used news software (NN springs to mind).

You're right, the time zone is required.  It's best to use GMT.

Apparently the quoted time is local time; if you must use local time because
that's all VMS gives you, then it's best to use a numeric time zone.
Also, RFC 1123 recommends using a 4-digit year.  E.g. the date should
be something like ``21 Jun 1991 23:01:23 -0700''.
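
A minimal sketch of producing a date in that shape, in Python (chosen only
for illustration; it is not what VNEWS or C News uses). The names
`news_date` and `news_date_gmt` are hypothetical, and the UTC offset is
assumed to be supplied by the caller in minutes, since VMS may not provide
it.

```python
from datetime import datetime, timedelta

# Month abbreviations as spelled in RFC 822.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def _stamp(t: datetime) -> str:
    """Day, month, 4-digit year (per RFC 1123) and hh:mm:ss."""
    return "%d %s %04d %02d:%02d:%02d" % (
        t.day, MONTHS[t.month - 1], t.year, t.hour, t.minute, t.second)


def news_date(local: datetime, utc_offset_minutes: int) -> str:
    """Local time followed by a numeric zone such as -0700."""
    sign = "-" if utc_offset_minutes < 0 else "+"
    hours, minutes = divmod(abs(utc_offset_minutes), 60)
    return "%s %s%02d%02d" % (_stamp(local), sign, hours, minutes)


def news_date_gmt(local: datetime, utc_offset_minutes: int) -> str:
    """The same instant converted to GMT."""
    return _stamp(local - timedelta(minutes=utc_offset_minutes)) + " GMT"
```

For the date quoted above, `news_date(datetime(1991, 6, 21, 23, 1, 23), -420)`
gives ``21 Jun 1991 23:01:23 -0700'', and `news_date_gmt` with the same
arguments gives ``22 Jun 1991 06:01:23 GMT''.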

By the way, your article's Message-ID is simply the current time in
seconds; does this mean that your news software mishandles two articles
posted during the same second at the same host?
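
One way to avoid such collisions, sketched in Python under stated
assumptions: the name `make_message_id` is hypothetical, and uniqueness on
a host relies on process IDs not being reused within the same second. This
is an illustration only, not a description of what any existing news
software does.

```python
import itertools
import os
import time

# Per-process counter, so two articles posted in the same second by the
# same process still get different IDs.
_counter = itertools.count()


def make_message_id(host: str) -> str:
    """Combine time in seconds, process ID and a counter with the host."""
    return "<%d.%d.%d@%s>" % (
        int(time.time()), os.getpid(), next(_counter), host)
```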


>So: how strictly is C-News going to hold me to RFC-822?  More
>generally, "what are the new rules?"

The _rules_ are the same as before: RFCs 1036 and 822 as amended by 1123.
C News is still more permissive than the RFCs,
but obviously it's unwise to rely on this.

henry@zoo.toronto.edu (Henry Spencer) (06/25/91)

In article <1991Jun22.213542.2256@twinsun.com> eggert@twinsun.com (Paul Eggert) writes:
>>So: how strictly is C-News going to hold me to RFC-822?  More
>>generally, "what are the new rules?"
>
>The _rules_ are the same as before: RFCs 1036 and 822 as amended by 1123.
>C News is still more permissive than the RFCs,
>but obviously it's unwise to rely on this.

Exactly.  Any permissiveness beyond 822+1123 should be assumed to be a
historical accident which is subject to being fixed at any time.  I think
about the only looseness we currently allow is that our table of timezones
is bigger than 822's, and also a *single* unrecognized word in the timezone
slot is read as "GMT" on the assumption that it is the local timezone
abbreviation in Swahili or something.  About the only change that is likely
is that if -- as is reportedly likely -- the next revision of 1123 comes
out with a SHOULD almost-mandating use of numeric timezone plus a
parenthesized comment with local abbreviation, we will probably amend the
code to accept the comment (currently rejected because 1036 does not allow
822 comments in headers).
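
The timezone-slot behaviour described above can be sketched as follows, in
Python. This is one reading of the description, not C News source code: the
function name `zone_offset_minutes` is hypothetical, and the table holds
only the non-military names from RFC 822, since the contents of the larger
C News table are not given here.

```python
import re

# Non-military zone names from RFC 822, as offsets from GMT in minutes.
RFC822_ZONES = {
    "UT": 0, "GMT": 0,
    "EST": -300, "EDT": -240,
    "CST": -360, "CDT": -300,
    "MST": -420, "MDT": -360,
    "PST": -480, "PDT": -420,
}


def zone_offset_minutes(zone_text: str) -> int:
    """Return the offset from GMT in minutes for the timezone slot."""
    words = zone_text.split()
    if len(words) != 1:
        # A missing zone, or a zone plus anything else (such as a
        # parenthesized comment), is rejected.
        raise ValueError("expected exactly one timezone word")
    word = words[0]
    numeric = re.fullmatch(r"([+-])(\d\d)(\d\d)", word)
    if numeric:
        minutes = int(numeric.group(2)) * 60 + int(numeric.group(3))
        return -minutes if numeric.group(1) == "-" else minutes
    if word.upper() in RFC822_ZONES:
        return RFC822_ZONES[word.upper()]
    if word.isalpha():
        # A single unrecognized word is read as GMT.
        return 0
    raise ValueError("unparseable timezone: %r" % word)
```

With this sketch, ``-0700'' and ``PDT'' both give -420, an unknown word
such as ``SAST'' gives 0, and ``-0700 (PDT)'' raises an error.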
-- 
"We're thinking about upgrading from    | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 to SunOS 3.5."              |  henry@zoo.toronto.edu  utzoo!henry

barrett@Daisy.EE.UND.AC.ZA (Alan P Barrett) (06/28/91)

In article <1991Jun24.202026.16593@zoo.toronto.edu>,
henry@zoo.toronto.edu (Henry Spencer) writes:
> About the only change that is likely is that if -- as is reportedly
> likely -- the next revision of 1123 comes out with a SHOULD
> almost-mandating use of numeric timezone plus a parenthesized comment
> with local abbreviation, we will probably amend the code to accept the
> comment (currently rejected because 1036 does not allow 822 comments
> in headers).

It is not clear to me that 1036 forbids comments in Date lines.

RFC 1036 says:
"   RFC-822 specifies that all text in parentheses is to be interpreted
"   as a comment.  It is common in Internet mail to place the full name
"   of the user in a comment at the end of the "From" line.  This
"   standard specifies a more rigid syntax.  The full name is not
"   considered a comment, but an optional part of the header line.
"   Either the full name is omitted, or it appears in parentheses after
"   the electronic address of the person posting the message, or it
"   appears before an electronic address which is enclosed in angle
"   brackets.  Thus, the three permissible forms are:
"
"             From: mark@cbosgd.ATT.COM
"             From: mark@cbosgd.ATT.COM (Mark Horton)
"             From: Mark Horton <mark@cbosgd.ATT.COM>

The ``more rigid syntax'' in which parentheses are not treated as
comments appears to me to apply only to From: (and Sender:) lines.

"   The "Date" line (formerly "Posted") is the date that the message was
"   originally posted to the network.  Its format must be acceptable
"   both in RFC-822 and to the getdate(3) routine that is provided with
"   the Usenet software.

Clear as mud.  What does ``must be acceptable to the getdate(3)
routine'' really mean?

It really is high time 1036 was rewritten.

--apb
Alan Barrett, Dept. of Electronic Eng., Univ. of Natal, Durban, South Africa
RFC822: barrett@ee.und.ac.za             Bang: m2xenix!quagga!undeed!barrett

henry@zoo.toronto.edu (Henry Spencer) (06/29/91)

In article <1991Jun28.154101.6873@Daisy.EE.UND.AC.ZA> barrett@Daisy.EE.UND.AC.ZA (Alan P Barrett) writes:
>It is not clear to me that 1036 forbids comments in Date lines.

Notice that the restricted subset of 822 syntax defined in section 2 of
1036 makes *no* provision for comments, except for 822-comment-like things
in From: and Sender: for which explicit provision is made.  Nor will any
(as far as I know) of the existing news packages tolerate 822 comments in
random places.  As an implementation accident due to the use of getdate(),
B News will put up with them in Date: lines, but nowhere else.
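
For software that does want to tolerate 822 comments in a Date: line, a
minimal sketch in Python of stripping them before parsing. The name
`strip_822_comments` is hypothetical, and this is not how getdate() does
it. It handles nested parentheses and backslash-quoted characters, and
assumes the value contains no quoted strings, which a Date: line should
not.

```python
def strip_822_comments(value: str) -> str:
    """Remove RFC 822 parenthesized comments and collapse whitespace."""
    kept = []
    depth = 0  # current comment nesting level
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # A backslash quotes the next character; keep the pair only
            # when it is outside any comment.
            if depth == 0:
                kept.append(value[i:i + 2])
            i += 2
            continue
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
        i += 1
    return " ".join("".join(kept).split())
```

Applied to ``21 Jun 1991 23:01:23 -0700 (PDT)'', this leaves
``21 Jun 1991 23:01:23 -0700''.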

>"   The "Date" line (formerly "Posted") is the date that the message was
>"   originally posted to the network.  Its format must be acceptable
>"   both in RFC-822 and to the getdate(3) routine that is provided with
>"   the Usenet software.
>
>Clear as mud.  What does ``must be acceptable to the getdate(3)
>routine'' really mean?

It means "mumble". :-)  If you believe that 1036 is supposed to be a
restricted subset of 822, then only an 822 date is acceptable.  Alas,
1036 repeatedly contradicts itself on its relationship to 822.

>It really is high time 1036 was rewritten.

Amen.  Eliot is talking about that being the next job after the current
revisions to the NNTP RFC, and that sounds reasonable to me.  We don't
actually have to do much to 1036's information content -- it's not very
broken and doesn't need much fixing, mostly just a critical inspection
in regard to the current 822 revision work -- but the actual presentation
needs heavy revision.
-- 
Lightweight protocols?  TCP/IP *is*     | Henry Spencer @ U of Toronto Zoology
lightweight already; just look at OSI.  |  henry@zoo.toronto.edu  utzoo!henry