gordoni@berlioz.ua.oz (Gordon Irlam) (06/22/91)
From article <859@spam.ua.oz>, by ross@spam.ua.oz.au (Ross Williams)
[A meta standard for data compression.]

> Date string - A date string is a standard string of length 11 having
> the format "dd-mmm-yyyy" where dd is in the range "01".."31", mmm
> is in the range "Jan","Feb",.."Dec" (Case dependent), and yyyy is
> in the range "1900" and "9999".

Hence,

   20-Jun-1991

But this creates yet another incompatible date/time format.  It would
be better to adopt a standard date/time format.

A fairly common date/time format on the Internet is that used in
RFC 822.  It looks something like this:

   20 Jun 91 15:48:18 GMT

Unfortunately this date/time format has several disadvantages:

 - It contains white-space characters.

 - Even without the white space, the mapping from literal strings to
   date/times is many-to-one.

 - The representation of the month only makes sense in
   English-speaking countries.

 - It doesn't include the century.  (What will happen to Usenet on
   the 1st of January 2000?)

It would probably be better to choose one of the following:

   1991-06-20
   1991-06-20T15:48:18Z

These are the "extended format complete representation of a calendar
date" and the "extended format complete representation of a moment of
Coordinated Universal Time" as specified by ISO 8601.

Advantages of the yyyy-mm-dd format over the widely used dd-mm-yyyy
format include:

 - "The avoidance of confusion in comparison with existing national
   conventions using different systems of ascending order."  (The
   U.S. national format is easily confused with that of most other
   countries.)

 - "The ease with which the whole date may be treated as a single
   number for the purposes of filing and classification."  (i.e.
   sorting)

 - "The possibility of continuing the order by adding digits for
   hour-minute-second."

Note that ISO 8601 includes representations for many other date/time
quantities that are not relevant here.  These include yyyy-ddd and
yyyy-Www-d formats, basic formats, reduced precision, truncated
representations, fractional hours, minutes, and seconds, periods of
time, and time zone differences.

It would be a serious mistake to allow any ISO 8601 date/time format,
since writing a program to parse an arbitrary ISO date/time
representation would be a big challenge.  Instead, adopt just one
possible representation.  (I would suggest the second of the two
formats presented above.)

                                            Gordon Irlam.
                                (gordoni@cs.adelaide.edu.au)
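For concreteness, a minimal C sketch (illustrative only, not part of
the original post) of emitting the two proposed representations with
the standard library's strftime():

    /* Print the two ISO 8601 forms proposed above, rendered in UTC. */
    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        char date[11];      /* "1991-06-20" + NUL */
        char moment[21];    /* "1991-06-20T15:48:18Z" + NUL */
        time_t now = time(NULL);
        struct tm *utc = gmtime(&now);

        strftime(date, sizeof date, "%Y-%m-%d", utc);
        strftime(moment, sizeof moment, "%Y-%m-%dT%H:%M:%SZ", utc);
        printf("%s\n%s\n", date, moment);
        return 0;
    }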
enag@ifi.uio.no (Erik Naggum) (06/24/91)
Gordon Irlam <gordoni@berlioz.ua.oz> writes:
|
|   A fairly common date/time format on the Internet is that used in
|   RFC 822.  It looks something like this:
|
|      20 Jun 91 15:48:18 GMT
|
|   Unfortunately this date/time format has several disadvantages:
|
|    - It doesn't include the century.  (What will happen to Usenet on
|      the 1st of January 2000?)

The IETF Host Requirements working group sensibly recommended
four-digit year specification in RFC 1123:

   5.2.14  RFC-822 Date and Time Specification: RFC-822 Section 5

      The syntax for the date is hereby changed to:

         date = 1*2DIGIT month 2*4DIGIT

      All mail software SHOULD use 4-digit years in dates, to ease
      the transition to the next century.

|   It would probably be better to choose one of the following:
|
|      1991-06-20

This is probably the most readable choice.

|      1991-06-20T15:48:18Z

This seems needlessly cluttered.

|   It would be a serious mistake to allow any ISO 8601 date/time
|   format, since writing a program to parse an arbitrary ISO
|   date/time representation would be a big challenge.  Instead,
|   adopt just one possible representation.  (I would suggest the
|   second of the two formats presented above.)

I'm sure writing an ISO 8601 parser which returns the UNIX standard
time representation (seconds since 1970-01-01 00:00:00 +0000, or
whatever ;-) is a challenge, but it would be very, very useful.  Is
there anybody out there who would like to work with me on this?

</Erik>
--
Erik Naggum             Professional Programmer            +47-2-836-863
Naggum Software                 Electronic Text         <erik@naggum.no>
0118 OSLO, NORWAY       Computer Communications        <enag@ifi.uio.no>
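A starting point for such a parser might look like the C sketch
below.  It is illustrative only: it accepts just the single
restricted form proposed above, assumes Gregorian leap-year rules,
ignores leap seconds, and the function name iso8601_to_unix is
invented for the example.

    /* Parse "1991-06-20T15:48:18Z" into seconds since
     * 1970-01-01 00:00:00 UTC.  Returns -1 on a syntax error.
     * (sscanf does not verify the trailing 'Z'; a real parser
     * would.) */
    #include <stdio.h>

    long iso8601_to_unix(const char *s)
    {
        int y, mo, d, h, mi, sec, i;
        long days;
        /* days before each month in a non-leap year */
        static const int cum[12] =
            { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };

        if (sscanf(s, "%4d-%2d-%2dT%2d:%2d:%2dZ",
                   &y, &mo, &d, &h, &mi, &sec) != 6)
            return -1;
        if (y < 1970 || mo < 1 || mo > 12 || d < 1 || d > 31 ||
            h < 0 || h > 23 || mi < 0 || mi > 59 || sec < 0 || sec > 59)
            return -1;

        days = 0;
        for (i = 1970; i < y; i++)   /* whole years since the epoch */
            days += 365 + (i % 4 == 0 && (i % 100 != 0 || i % 400 == 0));
        days += cum[mo - 1] + (d - 1);
        if (mo > 2 && (y % 4 == 0 && (y % 100 != 0 || y % 400 == 0)))
            days += 1;               /* leap day already passed this year */

        return ((days * 24 + h) * 60 + mi) * 60 + sec;
    }

    int main(void)
    {
        /* prints 677432898 */
        printf("%ld\n", iso8601_to_unix("1991-06-20T15:48:18Z"));
        return 0;
    }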
shores@fergvax.unl.edu (Shores) (06/24/91)
In <3761@sirius.ucs.adelaide.edu.au> gordoni@berlioz.ua.oz (Gordon
Irlam) writes:

>It would probably be better to choose one of the following:
>
>   1991-06-20
>   1991-06-20T15:48:18Z
>
>These are the "extended format complete representation of a calendar
>date" and the "extended format complete representation of a moment of
>Coordinated Universal Time" as specified by ISO 8601.

I have a better idea.  Instead of storing a date STRING, why not just
store a number?  The Macintosh stores dates as a 4-byte number
representing the seconds elapsed since Jan 1, 1904.  Unix has a
similar convention, only from Jan 1, 1970 (better, IMHO).  Then it
should be up to the user program to represent the date.  The Mac has
IUDateString, Unix and others have ctime(), etc.

--tom shores

PS: considering the nature of this group, shouldn't it be called
"comp.ression"  :-)

Tom... Tommy... Thomas... the Tom-ster, the Tom-boy, the Tomminator...
... Tom Shores, Department of Mathematics, University of Nebraska.
... shores@fergvax.unl.edu
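A minimal C sketch of this store-a-number approach (illustrative,
using only the standard time(), ctime(), gmtime(), and asctime()
calls):

    /* Keep a single time_t (seconds since the Unix epoch) and let
     * the display layer choose the rendering. */
    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        time_t stamp = time(NULL);   /* this one number is what's stored */

        printf("raw:   %ld\n", (long)stamp);
        printf("local: %s", ctime(&stamp));            /* local-time text */
        printf("UTC:   %s", asctime(gmtime(&stamp)));  /* UTC text */
        return 0;
    }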
msp33327@uxa.cso.uiuc.edu (Michael S. Pereckas) (06/25/91)
In <shores.677707660@fergvax> shores@fergvax.unl.edu (Shores) writes:

>I have a better idea.  Instead of storing a date STRING, why not just
>store a number?  The Macintosh stores dates as a 4-byte number
>representing the seconds elapsed since Jan 1, 1904.  Unix has a
>similar convention, only from Jan 1, 1970 (better, IMHO).  Then it
>should be up to the user program to represent the date.  The Mac has
>IUDateString, Unix and others have ctime(), etc.

That is totally non-human-readable, however.  That may not matter for
the application that started this thread, whatever that was, but it
can be useful.  Further, everyone will probably use a different
scheme, and one 32-bit number looks the same as the next.  Also, if
you decide that you need tenth-second accuracy, the string can be
extended without breaking the logic of the scheme, even if it does

--
< Michael Pereckas  <>  m-pereckas@uiuc.edu  <>  Just another student... >
"This desoldering braid doesn't work.  What's this cheap stuff made
of, anyway?"  "I don't know, looks like solder to me."
campbell@redsox.bsw.com (Larry Campbell) (06/25/91)
I believe there is an ISO standard (sorry, don't know the number).
It's very simple.  Dates and times are represented as a string of 12
to 16 decimal digits:

   YYYYMMDDHHMMSSHH

where the last two digits represent hundredths of a second; I believe
the seconds, and hundredths of seconds, are optional.  You could, of
course, add as many trailing digits as you like, if you need to
achieve nanosecond precision, without ambiguity.

This format is completely unambiguous, is easily understood by both
humans and computers, sorts easily, is not Anglocentric, and is
compact.  Of course, the time represented is assumed to be UTC, so no
time zone decorations are required.  Your user interface software
should know how to display this in the local time zone, in the local
language.
--
Larry Campbell             The Boston Software Works, Inc., 120 Fulton Street
campbell@redsox.bsw.com    Boston, Massachusetts 02109 (USA)
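For illustration, a short C sketch of producing such an all-digit
stamp with strftime(); note that plain strcmp() then orders
fixed-width stamps chronologically:

    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    int main(void)
    {
        char buf[15];               /* 14 digits + NUL */
        time_t now = time(NULL);

        strftime(buf, sizeof buf, "%Y%m%d%H%M%S", gmtime(&now));
        printf("%s\n", buf);        /* e.g. "19910625135127" */

        /* lexicographic order == chronological order
         * for fixed-width digit strings; prints 1 */
        printf("%d\n", strcmp("19910625135127", "19910626000000") < 0);
        return 0;
    }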
enag@ifi.uio.no (Erik Naggum) (06/25/91)
Lessee, it's June 25th, 1991, 3:51pm local time, or 19910625135127.
Alternatively it's 1991-06-25 15:51:27 +02:00.  From both a machine-
and human-readable point of view, delimiters can be very helpful in
disambiguating syntaxes.  A string of digits is not much different
from a string of bits, as I see it.  Especially as the string gets
longer, it's hard for humans to sort out what's what.  Pick out the
day of the month from 19910625135127 and 1991-06-25 15:51:27, as a
simple exercise.

Further, I don't want to see heuristics added to the parsing
algorithm in order to find out what time was _really_ intended.
E.g. is 91-06-25 a date 1900 years ago or just a sloppy syntax?  Is
91062515512744 now, with centisecond precision; 1900 years ago, with
centisecond precision; or the 15th day of the 25th month of the year
9106 at 51:27:44?  Ok, so it isn't the last, because it's absurd, but
there are cases where it would be hard to figure out.

</Erik>
--
Erik Naggum             Professional Programmer            +47-2-836-863
Naggum Software                 Electronic Text         <erik@naggum.no>
0118 OSLO, NORWAY       Computer Communications        <enag@ifi.uio.no>
hpa@casbah.acns.nwu.edu (H. Peter Anvin) (06/26/91)
In article <ENAG.91Jun25161013@gyda.ifi.uio.no> of comp.std.internat,
enag@ifi.uio.no (Erik Naggum) writes:

> Lessee, it's June 25th, 1991, 3:51pm local time, or 19910625135127.
> Alternatively it's 1991-06-25 15:51:27 +02:00.  From both a machine-
> and human-readable point of view, delimiters can be very helpful in
> disambiguating syntaxes.  [...]
>
> Further, I don't want to see heuristics added to the parsing
> algorithm in order to find out what time was _really_ intended.
> E.g. is 91-06-25 a date 1900 years ago or just a sloppy syntax?

A standardized, non-computer-related way of codifying dates in
numeric form is the Julian day number.  It is defined to be zero at
noon UTC on 1 Jan 4713 B.C. (Julian calendar), if I remember it
right.  It increments by one every 24 hours; an arbitrary number of
decimals (or binals) can be added to the number for arbitrary
precision.

Presume you need millisecond precision.  There are 86,400,000 ms in a
day, so you need either 8 decimals or 27 binals.  There are roughly
2,500,000 days since the zero point, so in order to describe that
range and cover the same time span into the future (to sometime in
the 68th century) you need 7 digits or 23 bits.  Total: 15 digits or
50 bits.  If you can cut down to centisecond precision, only 14
digits or 47 bits.

Consult an astronomical book or table for the exact Julian day number
epoch; much of the astronomical literature lists the Julian day
number for the first day of the year.

Personally, I think the JDN would fit very well in a 48-bit,
2's-complement format with subcentisecond precision: 24 binals, 24
integer bits.  It could cover almost 46,000 years, ranging from
27,000 B.C. to 18,000 A.D.

/Peter
--
MAIL: hpa@casbah.acns.nwu.edu  (hpa@nwu.edu after this summer)
"finger" the address above for more information.
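For illustration, the Julian day number of a Gregorian calendar date
can be computed with the well-known integer algorithm of Fliegel and
Van Flandern (CACM, 1968).  The C sketch below (function name
invented for the example) reproduces JDN 2448433 for 1991-06-25:

    /* Julian day number from a Gregorian date.  All divisions are
     * truncating integer divisions, which is what makes the
     * formula work. */
    #include <stdio.h>

    long gregorian_to_jdn(int y, int m, int d)
    {
        long a  = (14 - m) / 12;   /* 1 for Jan/Feb, else 0 */
        long yy = y + 4800 - a;    /* years since the epoch, March-based */
        long mm = m + 12 * a - 3;  /* months since March */

        return d + (153 * mm + 2) / 5 + 365 * yy
                 + yy / 4 - yy / 100 + yy / 400 - 32045;
    }

    int main(void)
    {
        /* noon on 1991-06-25 falls on JDN 2448433 */
        printf("%ld\n", gregorian_to_jdn(1991, 6, 25));
        return 0;
    }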
yfcw14@castle.ed.ac.uk (K P Donnelly) (06/27/91)
If I received a message with a time stamp of

   1991-06-25 15:51:27 +02:00

I would assume, if I didn't know better, that it had been sent at
17:51 Universal Time, since 15+2=17.  In fact, of course, it was sent
at

   1991-06-25 13:51:27

Anyone agree with me that the sign convention for time zones is
unfortunate?

   Kevin Donnelly
hpa@casbah.acns.nwu.edu (H. Peter Anvin) (06/28/91)
In article <11329@castle.ed.ac.uk>, yfcw14@castle.ed.ac.uk (K P
Donnelly) writes:

|> If I received a message with a time stamp of
|>    1991-06-25 15:51:27 +02:00
|> I would assume, if I didn't know better, that it had been sent at
|> 17:51 Universal Time, since 15+2=17.

The sign convention is NOT unfortunate; it is only the way it has
been made to look on USENET.  "+02:00" is more properly written as
"GMT+02:00" or "UTC+2".  The problem is the juxtaposition of a time
with its time zone code, the latter stripped of "GMT" or "UTC".

/Peter
--
INTERNET: hpa@casbah.acns.nwu.edu  (hpa@nwu.edu after this summer)
BITNET: HPA@NUACC  HAM RADIO: N9ITP, SM4TKN  FIDONET: 1:115/989.4
"finger" the Internet address above for more information.
enag@ifi.uio.no (Erik Naggum) (06/29/91)
K P Donnelly <yfcw14@castle.ed.ac.uk> writes:
|
|   If I received a message with a time stamp of
|      1991-06-25 15:51:27 +02:00
|   I would assume, if I didn't know better, that it had been sent at
|   17:51 Universal Time, since 15+2=17.  In fact, of course, it was
|   sent at 1991-06-25 13:51:27

Well, it's actually short-hand for GMT+02:00.  The time listed is two
hours more than UT.  I think it's intuitive, because I view it from
UT, not from local time to UT.  I think this makes a lot of sense.

In addition, if we change, we will get massive confusion, and time
zone indications as useless as the military time zones in RFC 822
(which were listed with the wrong signs, and are consequently
ambiguous, hence meaningless).

</Erik>
--
Erik Naggum             Professional Programmer            +47-2-836-863
Naggum Software                 Electronic Text         <erik@naggum.no>
0118 OSLO, NORWAY       Computer Communications        <enag@ifi.uio.no>
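To make the semantics concrete, a tiny illustrative C sketch: a stamp
carrying "+02:00" denotes local time two hours ahead of UT, so UT is
recovered by subtracting the offset, not adding it:

    #include <stdio.h>

    int main(void)
    {
        int h = 15, m = 51, s = 27;   /* local wall-clock time */
        int off_h = 2, off_m = 0;     /* "+02:00": two hours ahead of UT */

        /* minutes past UT midnight = local minutes minus the offset */
        int total = (h * 60 + m) - (off_h * 60 + off_m);
        total = (total + 24 * 60) % (24 * 60);   /* wrap around midnight */

        /* prints "13:51:27 UT" for the 15:51:27 +02:00 example */
        printf("%02d:%02d:%02d UT\n", total / 60, total % 60, s);
        return 0;
    }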