[news.software.b] Our friend, the GMT date.

flee@shire.cs.psu.edu (Felix Lee) (12/04/89)

Jef Poskanzer <jef@well.sf.ca.us> wrote:
> Interesting.  When the (second) test article left helios, it said:
>     Date: 3 Dec 89 08:58:48 -0800
> But I'm getting reports that when it arrived on other sites it said:
>     Date: 3 Dec 89 16:58:48 GMT

B news (2.11.19 and earlier) will parse the Date: header and rewrite
it as GMT.  If it can't parse the date, it will rewrite it as GMT
anyway, which is where "31 Dec 69 23:59:59 GMT" comes from.

If you don't like rewriting the date, it's a simple fix to header.c;
but even if *you* don't rewrite it, your neighbors probably will.

TMNN does rewrite the date, but won't put "31 Dec 69".  Instead, it
will put the current date, and add an "X-Unparsable-Date" field.
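
The two fallback behaviours described above can be sketched roughly as follows. This is only an illustration in Python, not the B news or TMNN source. The function names are hypothetical, the parser accepts only the one date layout quoted in this thread, and the use of a time value of -1 for an unparsable date is inferred from the fact that "31 Dec 69 23:59:59 GMT" is one second before the Unix epoch. What TMNN actually stores in X-Unparsable-Date is not stated here; the sketch assumes it is the original text.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_date(value):
    """Parse dates like '3 Dec 89 08:58:48 -0800'; return None on failure."""
    text = value.strip()
    if text.endswith(" GMT"):
        text = text[:-4] + " +0000"  # treat the GMT suffix as a zero offset
    try:
        return datetime.strptime(text, "%d %b %y %H:%M:%S %z")
    except ValueError:
        return None


def format_gmt(moment):
    """Render a datetime as 'D Mon YY HH:MM:SS GMT'."""
    utc = moment.astimezone(timezone.utc)
    return f"{utc.day} {utc:%b %y %H:%M:%S} GMT"


def rewrite_b_news_style(value):
    """Rewrite to GMT; an unparsable date is treated as time value -1."""
    parsed = parse_date(value)
    if parsed is None:
        parsed = EPOCH + timedelta(seconds=-1)
    return {"Date": format_gmt(parsed)}


def rewrite_tmnn_style(value, now):
    """Rewrite to GMT; an unparsable date becomes 'now' plus an extra field."""
    parsed = parse_date(value)
    if parsed is None:
        # Assumption: the extra field carries the original, unparsed text.
        return {"Date": format_gmt(now), "X-Unparsable-Date": value}
    return {"Date": format_gmt(parsed)}


print(rewrite_b_news_style("3 Dec 89 08:58:48 -0800"))
print(rewrite_b_news_style("sometime last week"))
```

Note that the rewritten value names the same instant as the original; only the representation changes, which is the distinction the RFC discussion below turns on.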

RFC-1036 says that the Date field shouldn't change, but doesn't say
whether you can change the representation while keeping the value.  It
doesn't say what to do with unparsable dates either; it is legal to
reject such articles out of hand.

TMNN will also rewrite the Expires date, which is probably the wrong
thing to do with relative dates ("Expires: 3 days").

Relative dates aren't blessed by the RFC, but are understood by
getdate().  The RFC defines a valid date as something 822-ish that
getdate will accept, an operational definition not terribly useful to
un-Unix news implementers.
--
Felix Lee	flee@shire.cs.psu.edu	*!psuvax1!flee

brad@looking.on.ca (Brad Templeton) (12/04/89)

Almost makes me wonder if we/they shouldn't have started with a machine
only date field, integer seconds since some epoch.  (Could use the Unix
epoch, or even the USENET epoch, I guess.)

This would have taken out the complex getdate routine from both news
database programs and readers that try to understand the date, as well
as eliminating these problems.

The readers would need a very simple date printing routine like ctime,
that works from a seconds-since-epoch value.  This is found just about
everywhere, and certainly any Unix.

But it's too late to change now.  The only way a change could be done
would be to support both for many years, and that's hardly a solution!
(Based on news upgrades out in the field, at least 10 years!)

(To please those keen on the local time, a date plus time zone value, in
plus/minus minutes, would have done the trick)
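
A minimal sketch of what such a field might look like on the reading side, assuming the Unix epoch and a signed offset in minutes. The function name and the output layout are made up for illustration; nothing like this is defined by RFC 1036.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_machine_date(seconds, tz_minutes=0):
    """Display a (seconds-since-epoch, offset-in-minutes) pair, ctime-like."""
    zone = timezone(timedelta(minutes=tz_minutes))
    moment = (EPOCH + timedelta(seconds=seconds)).astimezone(zone)
    sign = "-" if tz_minutes < 0 else "+"
    hours, minutes = divmod(abs(tz_minutes), 60)
    stamp = f"{moment:%a %b} {moment.day} {moment:%H:%M:%S %Y}"
    return f"{stamp} {sign}{hours:02d}{minutes:02d}"


# 628707528 is 3 Dec 89 16:58:48 GMT; -480 minutes is the poster's -0800.
print(format_machine_date(628707528, -480))
```

The software only ever compares or sorts the integer; the offset exists purely so a reader can show the poster's local time.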

I think RFC 822, which we followed, is also silly to express the time
in a human format.  There are a zillion ways of writing the date out
there, and you can never satisfy everybody.  Sigh.

Of course, it could be arranged to output the old date only when feeding
old sites, and to convert incomings, but that's still too messy.

To those who say, "I want to write the date my way, how dare you rewrite it?"
I say, "why?"   Why do you want to write the date a different way?  All
that really matters is how it is displayed when it comes time to read it,
and that it be universally understood by software.
-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

henry@utzoo.uucp (Henry Spencer) (12/05/89)

In article <1989Dec3.210413.27043@psuvax1.cs.psu.edu> flee@shire.cs.psu.edu (Felix Lee) writes:
>B news (2.11.19 and earlier) will parse the Date: header and rewrite
>it as GMT....
>TMNN does rewrite the date, but won't put "31 Dec 69"...

For the sake of completeness, C News's behavior should be mentioned.
We have religious objections to gratuitous header rewriting, and don't
do it.  The current relaynews doesn't even notice the Date header, since
it tries to notice as few headers as possible.

>...The RFC defines a valid date as something 822-ish that
>getdate will accept, an operational definition not terribly useful to
>un-Unix news implementers.

C'mon now, be fair.  The RFC does say that, but it also supplies an example
of a format that definitely is acceptable to both, and warns against a few
that aren't.
-- 
Mars can wait:  we've barely   |     Henry Spencer at U of Toronto Zoology
started exploring the Moon.    | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

kjones@talos.uucp (Kyle Jones) (12/05/89)

Brad Templeton writes:
 > Almost makes me wonder if we/they shouldn't have started with a machine
 > only date field, integer seconds since some epoch.  (Could use the Unix
 > epoch, or even the USENET epoch, I guess.)
 > [...]
 > But it's too late to change now.  The only way a change could be done
 > would be to support both for many years, and that's hardly a solution!
 > (Based on news upgrades out in the field, at least 10 years!)

There'd have to be some overlap, yes, but not until *everyone* upgrades
to the new software.  That'll never happen.  At some point we have to
shrug and say "to hell with them" or we'll never move forward.

Again, if we always humor the laggards who don't want to actually
administer and maintain their news systems, then we're letting the
people who care the least about USENET have the most influence over it.

amanda@mermaid.intercon.com (Amanda Walker) (12/06/89)

In article <1989Dec5.142032.7706@talos.uucp>, kjones@talos.uucp (Kyle Jones)
writes:
> Again, if we always humor the laggards who don't want to actually
> administer and maintain their news systems, then we're letting the
> people who care the least about USENET have the most influence over it.

Not so, because USENET is not made up solely of news admins.  The people
who care most about USENET are the people who use it.  Oftentimes they
simply don't have any effective influence over the people who are "maintaining"
news on their systems.

That being said, I've started thinking more and more seriously about the
idea of starting to build "tomorrow's USENET".  There are a lot of
constraints imposed by universal compatibility with existing sites that
would no longer exist if one were to write news from scratch today, such
as:

 - 7-bit printable ASCII only
 - 32K maximum effective message size
 - broken cross-referencing (hi, Brad :-))
 - using the UNIX file system as a database index

and so on.  Lots of people have started making noises about multi-media
mail, news, and RFCs.  Others have started or talked about things like
gateways to and from other services (ClariNet, BIX, Compu$erve, MCI Mail,
FAX, ...).  Still other people have complained about the fact that USENET
has moved from being a network testbed to an operational network.

Well, maybe it's time to start talking about the next one.  It seems
impractical to try and overhaul USENET itself, but nothing's stopping us
from doing something else that makes it obsolete...  If there's enough
energy out there to do things like TMNN, CNews, NN, and so on, it seems
to me that there'd be enough to do something new.

Amanda Walker
InterCon Systems Corporation
Purveyor of fine Macintosh networking software worldwide [:-)]
--

Makey@LOGICON.ARPA (Jeff Makey) (12/06/89)

In article <1989Dec4.173315.22862@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>We have religious objections to gratuitous header rewriting, and don't
>do it.  The current relaynews doesn't even notice the Date header, since
>it tries to notice as few headers as possible.

I guess it should be pointed out that the B news behavior is perfectly
compatible with the religious doctrine, "be liberal in what you
accept; be conservative in what you send."  In this context, rewriting
the Date: header is anything but gratuitous.  Aren't holy wars fun?

                           :: Jeff Makey

Department of Tautological Pleonasms and Superfluous Redundancies Department
    Disclaimer: Logicon doesn't even know we're running news.
    Internet: Makey@LOGICON.ARPA    UUCP: {nosc,ucsd}!logicon.arpa!Makey

henry@utzoo.uucp (Henry Spencer) (12/06/89)

In article <1601@intercon.com> amanda@mermaid.intercon.com (Amanda Walker) writes:
>...constraints imposed by universal compatibility with existing sites that
>would no longer exist if one were to write news from scratch today, such
>as:
> ...
> - using the UNIX file system as a database index

Be careful not to throw the baby out with the bathwater.  In the early days
of C News, Geoff and I spent a lot of time mulling over alternate ways of
organizing the database.  We finally concluded that there was no alternative
we could think of that was worth the trouble.  (At that time, newsreaders
hadn't proliferated to the same extent -- C News has had a *very* long
gestation period -- so compatibility wasn't a big part of this.)  The
present scheme has some disadvantages, but it also has advantages.  The
Unix file system is not a bad way to organize a database.
-- 
1233 EST, Dec 7, 1972:         |     Henry Spencer at U of Toronto Zoology
last ship sails for the Moon.  | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

brad@looking.on.ca (Brad Templeton) (12/07/89)

In article <1989Dec6.031329.13569@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>Be careful not to throw the baby out with the bathwater.  In the early days
>of C News, Geoff and I spent a lot of time mulling over alternate ways of
>organizing the database.  We finally concluded that there was no alternative
>we could think of that was worth the trouble.

Henry's probably right, although it's a close call.  The first reason you
might think of for switching is saving disk space.  Turns out you waste
only about 16% of your disk space on 1K blocking, so that isn't enough
reason to switch.  (Of course on a 4K block system, it's different; the
wastage approaches 40%, I would guess.)

On non-unix systems without links, the problem is worse and you have to
find another route.  For DOS, with no links and 4K blocking, the current
method is right out.

Another improvement would be speed.  A system that allowed you to
index into long files need not even unpack batches.  'unbatching' could
be simply a matter of reading the pointers from the header of the file
and storing them in a table.  Process a meg of news in 5 seconds.
(Have to handle Path: specially, though.)
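
The indexing idea can be sketched as below. One swap to be aware of: the paragraph above imagines pointers stored in the header of the file, whereas this sketch builds the table by scanning the usual "#! rnews <bytes>" separator lines once and seeking over each article body without reading it. The function names are hypothetical, and the Path: rewriting mentioned above is not handled.

```python
import io


def index_batch(stream):
    """Return (offset, length) pairs for each article in an rnews-style batch."""
    index = []
    while True:
        header = stream.readline()
        if not header:
            break  # end of batch
        if not header.startswith(b"#! rnews "):
            raise ValueError("unexpected batch separator: %r" % header)
        length = int(header.split()[2])
        index.append((stream.tell(), length))
        stream.seek(length, io.SEEK_CUR)  # skip the article body unread
    return index


def read_article(stream, entry):
    """Fetch one article from the batch using its index entry."""
    offset, length = entry
    stream.seek(offset)
    return stream.read(length)


first = b"Subject: one\n\nfirst body\n"
second = b"Subject: two\n\nsecond body\n"
batch = b"".join(b"#! rnews %d\n" % len(a) + a for a in (first, second))

stream = io.BytesIO(batch)
table = index_batch(stream)
print(table)
print(read_article(stream, table[1]))
```

A reader would keep the table, not the unpacked articles, and open only the handful of batch files that arrived since the last session.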

While this complicates the readers, it makes them faster.  If a person
reads news once a day, almost all the news will be in a small set of
files -- only a few dozen files to open and seek around in.

Expire is zippy, too.  Just drop the batches in the order they came in.


But on the whole, when you add the reader breaking issue, it isn't
worth it.

But you do have to *add* to the structure, as NN has done.  Of course,
NN's addition of a special program is a kludge, and the maintenance of
a database of subjects etc. is something the inews program should do.

Support for a database (more general than NN's) has to go into NNTP soon,
or the future generation of readers won't work with it.

Such a database also applies to any signature searching.
-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

amanda@mermaid.intercon.com (Amanda Walker) (12/07/89)

In article <57592@looking.on.ca>, brad@looking.on.ca (Brad Templeton) writes:
> Another improvement would be speed.  [stuff about not unpacking batches]

Exactly.

> But on the whole, when you add the reader breaking issue, it isn't
> worth it.

This is part of my motivation for thinking about doing something new.
There are a *lot* of things that end up not being worth it as changes
to USENET, but are definitely worth thinking about for a new design.

Sorry if I didn't make this clear.

I'm not even thinking of things running "in parallel" to news, such as
ClariNet or whatever.  It's probably worth it to support UUCP, because of
the Telebit Trailblazer if nothing else, but my thoughts have been focused
on retooling news from the ground up, as it were.

--Amanda
--

bill@twwells.com (T. William Wells) (12/07/89)

In article <57592@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
: Henry's probably right, although it's a close call.  The first reason you
: might think of for switching is saving disk space.  Turns out you waste
: only about 16% of your disk space on 1K blocking, so that isn't enough
: reason to switch.  (Of course on a 4K block system, it's different; the
: wastage approaches 40%, I would guess.)

Some numbers from my system:

	blocksz % over actual space used
	512     12%
	1K      22%
	2K      44%
	4K      115%

Average article size: 2246 bytes.
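
For anyone wanting to repeat the measurement, a rough sketch of the arithmetic, assuming each article occupies a whole number of blocks and ignoring inodes, directories, and indirect blocks. The function name and the sample sizes are invented for illustration; they are not the data behind the table above, and a single average size cannot reproduce it because the result depends on the whole size distribution.

```python
def overhead_percent(article_sizes, block_size):
    """Percent of extra space allocated over the actual bytes stored."""
    actual = sum(article_sizes)
    # Round each article up to a whole number of blocks.
    allocated = sum(-(-size // block_size) * block_size for size in article_sizes)
    return 100.0 * (allocated - actual) / actual


sample_sizes = [700, 1500, 2246, 3100, 5200]  # hypothetical article sizes, bytes
for block_size in (512, 1024, 2048, 4096):
    print(block_size, round(overhead_percent(sample_sizes, block_size), 1))
```

Feeding it the real sizes from a spool directory (one number per article file) gives figures directly comparable to the table.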

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

frank@ladc.bull.com (Frank Mayhar) (12/08/89)

In article <56239@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>Almost makes me wonder if we/they shouldn't have started with a machine
>only date field, integer seconds since some epoch.  (Could use the Unix
>epoch, or even the USENET epoch, I guess.)

I dislike this idea intensely.  The problem you run into with this scheme is,
how do you represent it?  In my experience, it has been as a single machine
word.  This has all sorts of nasty side effects, as has been noted recently
in other groups; side effects such as the software breaking when the sign bit
turns on (my organization had to solve this for our OS two years ago), or
when the count exceeds the size of the machine word.  Certainly good design
should take care of this, but how many software systems nowadays are designed
that well?  Answer: very few, if any.  And no matter how well you design it,
SOMEBODY will exceed the design limits.  So the best idea is to use some
scheme that isn't limited, or at least has no effective limits.  (How many
people are going to be using Unix in the year 21561?  Not many, is my guess.)
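
The word-size limits in question are easy to work out. A small sketch, assuming the Unix epoch and a count of seconds held in a single machine word; the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_representable(bits, signed):
    """Latest instant a seconds counter of the given width can hold."""
    limit = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    return EPOCH + timedelta(seconds=limit)


def wrap_signed(value, bits):
    """Interpret value as a two's-complement integer of the given width."""
    modulus = 2 ** bits
    value %= modulus
    return value - modulus if value >= modulus // 2 else value


print(last_representable(32, signed=True))   # when the sign bit turns on
print(last_representable(32, signed=False))  # when the word itself overflows

# One second past the signed limit wraps to a large negative count.
wrapped = wrap_signed(2 ** 31, 32)
print(wrapped, EPOCH + timedelta(seconds=wrapped))
```

The two failure modes described above correspond to the two printed limits: the signed case fails first, and the unsigned case only postpones the problem.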
-- 
Frank Mayhar  frank@ladc.bull.com (..!{uunet,hacgate,rdahp}!ladcgw!frank)
              Bull HN Information Systems Inc.  Los Angeles Development Center
              5250 W. Century Blvd., LA, CA  90045    Phone:  (213) 216-6241