flee@shire.cs.psu.edu (Felix Lee) (12/04/89)
Jef Poskanzer <jef@well.sf.ca.us> wrote:
> Interesting. When the (second) test article left helios, it said:
>     Date: 3 Dec 89 08:58:48 -0800
> But I'm getting reports that when it arrived on other sites it said:
>     Date: 3 Dec 89 16:58:48 GMT

B news (2.11.19 and earlier) will parse the Date: header and rewrite it as GMT. If it can't parse the date, it will rewrite it as GMT anyway, which is where "31 Dec 69 23:59:59 GMT" comes from. If you don't like rewriting the date, it's a simple fix to header.c; but even if *you* don't rewrite it, your neighbors probably will.

TMNN does rewrite the date, but won't put "31 Dec 69". Instead, it will put the current date, and add an "X-Unparsable-Date" field.

RFC-1036 says that the Date field shouldn't change, but doesn't say whether you can change the representation while keeping the value. It doesn't say what to do with unparsable dates either; it is legal to reject such articles out of hand.

TMNN will also rewrite the Expires date, which is probably the wrong thing to do with relative dates ("Expires: 3 days"). Relative dates aren't blessed by the RFC, but are understood by getdate(). The RFC defines a valid date as something 822-ish that getdate will accept, an operational definition not terribly useful to un-Unix news implementers.
--
Felix Lee  flee@shire.cs.psu.edu  *!psuvax1!flee
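The rewriting behavior described above can be illustrated with a rough sketch. This is not the B news code from header.c; it is a Python stand-in that assumes the parser signals failure with -1 (one second before the Unix epoch), which is enough to reproduce the "31 Dec 69 23:59:59 GMT" symptom. The function names `parse_date` and `as_gmt` are made up for the illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import mktime_tz, parsedate_tz

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_date(value):
    """Return seconds since the Unix epoch, or -1 if the date is unparsable.

    The -1 sentinel is an assumption made for this sketch.
    """
    fields = parsedate_tz(value)
    if fields is None:
        return -1
    return mktime_tz(fields)


def as_gmt(seconds):
    """Format an epoch value the way the rewritten headers look."""
    d = EPOCH + timedelta(seconds=seconds)
    return f"{d.day} {MONTHS[d.month - 1]} {d.year % 100:02d} {d:%H:%M:%S} GMT"


# A parsable date keeps its value but changes representation.
print(as_gmt(parse_date("3 Dec 89 08:58:48 -0800")))
# An unparsable date is formatted from the failure sentinel.
print(as_gmt(parse_date("sometime last week")))
```

Formatting the sentinel without checking it is what turns a parse failure into a plausible-looking but wrong date; checking for it first is where a TMNN-style "X-Unparsable-Date" fallback would hook in.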
brad@looking.on.ca (Brad Templeton) (12/04/89)
Almost makes me wonder if we/they shouldn't have started with a machine-only date field, integer seconds since some epoch. (Could use the Unix epoch, or even the USENET epoch, I guess.)

This would have taken out the complex getdate routine from both news database programs and readers that try to understand the date, as well as eliminating these problems. The readers would need a very simple date printing routine like ctime, that works from a seconds-since-epoch value. This is found just about everywhere, and certainly on any Unix.

But it's too late to change now. The only way a change could be done would be to support both for many years, and that's hardly a solution! (Based on news upgrades out in the field, at least 10 years!)

(To please those keen on the local time, a date plus time zone value, in plus/minus minutes, would have done the trick.)

I think RFC 822, which we followed, is also silly to express the time in a human format. There are a zillion ways of writing the date out there, and you can never satisfy everybody.

Sigh. Of course, it could be arranged to output the old date only when feeding old sites, and to convert incomings, but that's still too messy.

To those who say, "I want to write the date my way, how dare you rewrite it?" I say, "why?" Why do you want to write the date a different way? All that really matters is how it is displayed when it comes time to read it, and that it be universally understood by software.
--
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473
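A minimal sketch of the machine-only date idea, in Python. The field layout ("seconds, then an optional offset in minutes") and the names `parse_machine_date` and `display` are assumptions made for illustration; no such header exists in RFC 1036.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_machine_date(value):
    """Parse '<seconds> [<offset-minutes>]' into (seconds, offset_minutes)."""
    parts = value.split()
    seconds = int(parts[0])
    offset_minutes = int(parts[1]) if len(parts) > 1 else 0
    return seconds, offset_minutes


def display(seconds, offset_minutes=0):
    """Render the instant in the poster's zone; the reader decides the style."""
    zone = timezone(timedelta(minutes=offset_minutes))
    moment = EPOCH + timedelta(seconds=seconds)
    return moment.astimezone(zone).strftime("%Y-%m-%d %H:%M:%S %z")


# Hypothetical field value for the test article discussed in this thread.
seconds, offset = parse_machine_date("628707528 -480")
print(display(seconds, offset))  # poster's local time
print(display(seconds))          # the same instant in GMT
```

Parsing reduces to two integer conversions, which is the point of the proposal: all the formatting choices move to the reader.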
henry@utzoo.uucp (Henry Spencer) (12/05/89)
In article <1989Dec3.210413.27043@psuvax1.cs.psu.edu> flee@shire.cs.psu.edu (Felix Lee) writes:
>B news (2.11.19 and earlier) will parse the Date: header and rewrite
>it as GMT....
>TMNN does rewrite the date, but won't put "31 Dec 69"...

For the sake of completeness, C News's behavior should be mentioned. We have religious objections to gratuitous header rewriting, and don't do it. The current relaynews doesn't even notice the Date header, since it tries to notice as few headers as possible.

>...The RFC defines a valid date as something 822-ish that
>getdate will accept, an operational definition not terribly useful to
>un-Unix news implementers.

C'mon now, be fair. The RFC does say that, but it also supplies an example of a format that definitely is acceptable to both, and warns against a few that aren't.
--
Mars can wait: we've barely | Henry Spencer at U of Toronto Zoology
started exploring the Moon. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
kjones@talos.uucp (Kyle Jones) (12/05/89)
Brad Templeton writes:
> Almost makes me wonder if we/they shouldn't have started with a machine
> only date field, integer seconds since some epoch. (Could use the Unix
> epoch, or even the USENET epoch, I guess.)
> [...]
> But it's too late to change now. The only way a change could be done
> would be to support both for many years, and that's hardly a solution!
> (Based on news upgrades out in the field, at least 10 years!)

There'd have to be some overlap, yes, but not until *everyone* upgrades to the new software. That'll never happen. At some point we have to shrug and say "to hell with them" or we'll never move forward.

Again, if we always humor the laggards who don't want to actually administer and maintain their news systems, then we're letting the people who care the least about USENET have the most influence over it.
amanda@mermaid.intercon.com (Amanda Walker) (12/06/89)
In article <1989Dec5.142032.7706@talos.uucp>, kjones@talos.uucp (Kyle Jones) writes:
> Again, if we always humor the laggards who don't want to actually
> administer and maintain their news systems, then we're letting the
> people who care the least about USENET have the most influence over it.

Not so, because USENET is not made up solely of news admins. The people who care most about USENET are the people who use it. Oftentimes they simply don't have any effective influence over the people who are "maintaining" news on their systems.

That being said, I've started thinking more and more seriously about the idea of starting to build "tomorrow's USENET". There are a lot of constraints imposed by universal compatibility with existing sites that would no longer exist if one were to write news from scratch today, such as:

	- 7-bit printable ASCII only
	- 32K maximum effective message size
	- broken cross-referencing (hi, Brad :-))
	- using the UNIX file system as a database index

and so on. Lots of people have started making noises about multi-media mail, news, and RFCs. Others have started or talked about things like gateways to and from other services (ClariNet, BIX, Compu$erve, MCI Mail, FAX, ...). Still other people have complained about the fact that USENET has moved from being a network testbed to an operational network.

Well, maybe it's time to start talking about the next one. It seems impractical to try and overhaul USENET itself, but nothing's stopping us from doing something else that makes it obsolete...

If there's enough energy out there to do things like TMNN, CNews, NN, and so on, it seems to me that there'd be enough to do something new.

Amanda Walker
InterCon Systems Corporation
Purveyor of fine Macintosh networking software worldwide [:-)]
--
Makey@LOGICON.ARPA (Jeff Makey) (12/06/89)
In article <1989Dec4.173315.22862@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>We have religious objections to gratuitous header rewriting, and don't
>do it. The current relaynews doesn't even notice the Date header, since
>it tries to notice as few headers as possible.

I guess it should be pointed out that the B news behavior is perfectly compatible with the religious doctrine, "be liberal in what you accept; be conservative in what you send." In this context, rewriting the Date: header is anything but gratuitous.

Aren't holy wars fun?

:: Jeff Makey

Department of Tautological Pleonasms and Superfluous Redundancies Department
Disclaimer: Logicon doesn't even know we're running news.
Internet: Makey@LOGICON.ARPA
UUCP: {nosc,ucsd}!logicon.arpa!Makey
henry@utzoo.uucp (Henry Spencer) (12/06/89)
In article <1601@intercon.com> amanda@mermaid.intercon.com (Amanda Walker) writes:
>...constraints imposed by universal compatibility with existing sites that
>would no longer exist if one were to write news from scratch today, such
>as:
> ...
>	- using the UNIX file system as a database index

Be careful not to throw the baby out with the bathwater. In the early days of C News, Geoff and I spent a lot of time mulling over alternate ways of organizing the database. We finally concluded that there was no alternative we could think of that was worth the trouble. (At that time, newsreaders hadn't proliferated to the same extent -- C News has had a *very* long gestation period -- so compatibility wasn't a big part of this.) The present scheme has some disadvantages, but it also has advantages. The Unix file system is not a bad way to organize a database.
--
1233 EST, Dec 7, 1972:           | Henry Spencer at U of Toronto Zoology
last ship sails for the Moon.    | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
brad@looking.on.ca (Brad Templeton) (12/07/89)
In article <1989Dec6.031329.13569@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>Be careful not to throw the baby out with the bathwater. In the early days
>of C News, Geoff and I spent a lot of time mulling over alternate ways of
>organizing the database. We finally concluded that there was no alternative
>we could think of that was worth the trouble.

Henry's probably right, although it's a close call. The first reason you might think of for switching is saving disk space. Turns out you waste only about 16% of your disk space on 1K blocking, so that isn't enough reason to switch. (Of course on a 4K block system, it's different, the wastage approaches 40%, I would guess.)

On non-unix systems without links, the problem is worse and you have to find another route. For DOS, with no links and 4K blocking, the current method is right out.

Another improvement would be speed. A system that allowed you to index into long files need not even unpack batches. 'unbatching' could be simply a matter of reading the pointers from the header of the file and storing them in a table. Process a meg of news in 5 seconds. (Have to handle Path: specially, though.)

While this complicates the readers, it makes them faster. If a person reads news once a day, almost all the news will be in a small set of files -- only a few dozen files to open and seek around in.

Expire is zippy, too. Just drop the batches in the order they came in.

But on the whole, when you add the reader breaking issue, it isn't worth it. But you do have to *add* to the structure, as NN has done. Of course, NN's addition of a special program is a kludge, and the maintenance of a database of subjects etc. is something the inews program should do.

Support for a database (more general than NN's) has to go into NNTP soon, or the future generation of readers won't work with it. Such a database also applies to any signature searching.
--
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473
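The "index instead of unpack" idea can be sketched roughly as follows. The post imagines a pointer table stored at the head of the batch file; this Python sketch instead derives the same table from the per-article "#! rnews <bytecount>" separator lines of the ordinary batch format, seeking past each article rather than copying it out. The names `index_batch` and `read_article` are hypothetical, and the Path: rewriting caveat is not addressed.

```python
import io

SEPARATOR = b"#! rnews "


def index_batch(stream):
    """Return a list of (offset, length) pairs, one per article in the batch."""
    table = []
    while True:
        header = stream.readline()
        if not header:
            break  # end of batch
        if not header.startswith(SEPARATOR):
            raise ValueError("expected a '#! rnews <bytecount>' line")
        length = int(header.split()[2].decode("ascii"))
        start = stream.tell()
        table.append((start, length))
        stream.seek(start + length)  # skip the article body without reading it
    return table


def read_article(stream, entry):
    """Fetch one article on demand using its table entry."""
    offset, length = entry
    stream.seek(offset)
    return stream.read(length)


# Build a tiny two-article batch in memory to exercise the index.
first = b"Subject: one\n\nfirst body\n"
second = b"Subject: two\n\nsecond body\n"
batch = io.BytesIO(
    SEPARATOR + b"%d\n" % len(first) + first
    + SEPARATOR + b"%d\n" % len(second) + second
)
table = index_batch(batch)
print(table)
print(read_article(batch, table[1]))
```

With this layout, expiring a batch amounts to deleting one file and dropping its table entries, which matches the "drop the batches in the order they came in" remark.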
amanda@mermaid.intercon.com (Amanda Walker) (12/07/89)
In article <57592@looking.on.ca>, brad@looking.on.ca (Brad Templeton) writes:
> Another improvement would be speed. [stuff about not unpacking batches]

Exactly.

> But on the whole, when you add the reader breaking issue, it isn't
> worth it.

This is part of my motivation for thinking about doing something new. There are a *lot* of things that end up not being worth it as changes to USENET, but are definitely worth thinking about for a new design. Sorry if I didn't make this clear. I'm not even thinking of things running "in parallel" to news, such as ClariNet or whatever. It's probably worth it to support UUCP, because of the Telebit Trailblazer if nothing else, but my thoughts have been focused on retooling news from the ground up, as it were.

--Amanda
--
bill@twwells.com (T. William Wells) (12/07/89)
In article <57592@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
: Henry's probably right, although it's a close call. The first reason you
: might think of for switching is saving disk space. Turns out you waste
: only about 16% of your disk space on 1K blocking, so that isn't enough
: reason to switch. (Of course on a 4K block system, it's different, the
: wastage approaches 40%, I would guess.)

Some numbers from my system:

	blocksz    % over actual space used
	512         12%
	1K          22%
	2K          44%
	4K         115%

Average article size: 2246 bytes.
---
Bill { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com
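For anyone who wants to repeat the measurement in the table above on their own spool, the arithmetic can be sketched like this. It is a simplified model, assuming every article occupies a whole number of blocks and ignoring inodes, directories, and indirect blocks; the function name `overhead` is made up here.

```python
def overhead(sizes, block_size):
    """Fraction of extra disk space used beyond the articles' actual bytes."""
    used = sum(sizes)
    # Round each article up to a whole number of blocks.
    allocated = sum(-(-size // block_size) * block_size for size in sizes)
    return allocated / used - 1.0


# Example with a single article of the average size quoted above.
for block_size in (512, 1024, 2048, 4096):
    print(block_size, f"{overhead([2246], block_size):.0%}")
```

The result depends on the whole distribution of article sizes, not only the mean, so a real run needs the actual list of file sizes from the spool.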
frank@ladc.bull.com (Frank Mayhar) (12/08/89)
In article <56239@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>Almost makes me wonder if we/they shouldn't have started with a machine
>only date field, integer seconds since some epoch. (Could use the Unix
>epoch, or even the USENET epoch, I guess.)

I dislike this idea intensely. The problem you run into with this scheme is, how do you represent it? In my experience, it has been as a single machine word. This has all sorts of nasty side effects, as has been noted recently in other groups; side effects such as the software breaking when the sign bit turns on (my organization had to solve this for our OS two years ago), or when the count exceeds the size of the machine word.

Certainly good design should take care of this, but how many software systems nowadays are designed that well? Answer: very few, if any. And no matter how well you design it, SOMEBODY will exceed the design limits. So the best idea is to use some scheme that isn't limited, or at least has no effective limits. (How many people are going to be using Unix in the year 21561? Not many, is my guess.)
--
Frank Mayhar  frank@ladc.bull.com  (..!{uunet,hacgate,rdahp}!ladcgw!frank)
Bull HN Information Systems Inc.  Los Angeles Development Center
5250 W. Century Blvd., LA, CA 90045  Phone: (213) 216-6241
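The sign-bit problem is easy to demonstrate for the common case of a signed 32-bit count of seconds since the Unix epoch. The sketch below is in Python, which has unbounded integers, so the 32-bit wraparound is simulated with `struct`; the helper name `wrap_int32` is made up for the illustration.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def wrap_int32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


# Largest count a signed 32-bit word can hold, and the moment it represents.
last_good = 2**31 - 1
print(EPOCH + timedelta(seconds=last_good))

# One second later the sign bit turns on and the count goes negative.
wrapped = wrap_int32(last_good + 1)
print(wrapped, EPOCH + timedelta(seconds=wrapped))
```

A text field holding the count in decimal digits would not share this limit by itself; the limit comes from the word size a given implementation chooses to store it in.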