[comp.compression] Data compression standard

gordoni@berlioz.ua.oz (Gordon Irlam) (06/22/91)

From article <859@spam.ua.oz>, by ross@spam.ua.oz.au (Ross Williams)

  [A meta standard for data compression.]

> Date string - A  date string is a standard string  of length 11 having
>     the format "dd-mmm-yyyy" where dd  is in the range "01".."31", mmm
>     is in the range "Jan","Feb",.."Dec"  (Case dependent), and yyyy is
>     in the range "1900" and "9999".

Hence,

    20-Jun-1991

But this creates yet another incompatible date/time format.  It would
be better to adopt an existing standard date/time format.

A fairly common date/time format on the Internet is that used in
RFC 822.  It looks something like this:

    20 Jun 91 15:48:18 GMT

Unfortunately this date/time format has several disadvantages:

    - It contains "white space" characters.

    - Even without the white space the mapping from literal strings to
      date/times is many-to-one.

    - The representation of the month only makes sense in
      English-speaking countries.

    - It doesn't include the century.  (What will happen to Usenet on
      the 1st of January 2000?)
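
To illustrate the many-to-one point, here is a rough sketch using
Python's standard email.utils parser.  That parser targets the later
RFC 2822/5322 grammar, which still accepts the older RFC 822 forms, so
treat it as a stand-in rather than a strict RFC 822 implementation.
The variable names are mine.

```python
from email.utils import parsedate_to_datetime

# Four different literal strings, all meant to denote the same instant.
variants = [
    "20 Jun 91 15:48:18 GMT",
    "Thu, 20 Jun 91 15:48:18 GMT",   # optional day of week
    "20 Jun 91 15:48:18 UT",         # alternative name for the same zone
    "20 Jun 91 10:48:18 EST",        # same instant, different zone
]

# The library expands the two-digit year using its own pivot rule.
moments = [parsedate_to_datetime(s) for s in variants]

# Timezone-aware datetimes compare by instant, so duplicates collapse.
distinct = set(moments)
print(len(variants), "strings,", len(distinct), "distinct moment(s)")
```

A receiver that wants to compare or deduplicate dates therefore has to
parse them first; comparing the raw strings is not enough.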

It would probably be better to choose one of the following.

    1991-06-20

    1991-06-20T15:48:18Z

These are the "extended format complete representation of a calendar
date", and the "extended format complete representation of a moment of
Coordinated Universal Time" as specified by ISO 8601.
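
As a small sketch, both representations can be produced with the
Python standard library.  The variable names are illustrative, and the
literal "Z" is only correct because the value is already in UTC.

```python
from datetime import datetime, timezone

# The moment used in the examples above, expressed in UTC.
moment = datetime(1991, 6, 20, 15, 48, 18, tzinfo=timezone.utc)

# Extended format, complete representation of a calendar date.
calendar_date = moment.strftime("%Y-%m-%d")

# Extended format, complete representation of a moment of UTC.
utc_moment = moment.strftime("%Y-%m-%dT%H:%M:%SZ")

print(calendar_date)
print(utc_moment)
```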

Advantages of the yyyy-mm-dd format over the widely used dd-mm-yyyy
format include:

    - "The avoidance of confusion in comparison with existing national
       conventions using different systems of ascending order."

      (the U.S. national format is easily confused with that of most
       other countries)

    - "The ease with which the whole date may be treated as a single
       number for the purposes of filing and classification."

      (i.e. sorting)

    - "The possibility of continuing the order by adding digits for
       hour-minute-second."
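
The sorting advantage can be checked with a short sketch (the sample
dates are made up for illustration): a plain string sort of yyyy-mm-dd
values is already chronological, while the same sort applied to
dd-mm-yyyy values is not.

```python
from datetime import date

# Hypothetical sample dates, deliberately out of order.
dates = [date(1991, 6, 20), date(1990, 12, 31), date(1991, 1, 5)]

iso = [d.strftime("%Y-%m-%d") for d in dates]
dmy = [d.strftime("%d-%m-%Y") for d in dates]

# Plain lexicographic string sorts, with no date parsing involved.
iso_sorted = sorted(iso)
dmy_sorted = sorted(dmy)

# Reference order obtained by sorting the real date values.
chronological = sorted(dates)

print(iso_sorted)
print(dmy_sorted)
```

Here the dd-mm-yyyy sort places 31-12-1990 last, even though it is the
earliest of the three dates.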

Note that ISO 8601 includes representations for many other date/time
quantities that are not relevant here.  These include yyyy-ddd and
yyyy-Www-d formats, basic formats, reduced precision, truncated
representations, fractional hours, minutes, and seconds, periods of
time, and time zone differences.

It would be a serious mistake to allow arbitrary ISO 8601 date/time
formats, since writing a program to parse every ISO date/time
representation would be a big challenge.  Instead, adopt just one
representation.  (I would suggest the second of the two formats
presented above.)
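
By contrast, a parser for that single representation is short.  The
following is only a sketch under my own assumptions: the name
parse_stamp and the choice to reject everything else with ValueError
are not part of any proposal.

```python
import re
from datetime import datetime, timezone

# Exactly yyyy-mm-ddThh:mm:ssZ, ASCII digits only, nothing optional.
_STAMP = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z")

def parse_stamp(text):
    """Parse 'yyyy-mm-ddThh:mm:ssZ'; raise ValueError for anything else."""
    if not _STAMP.fullmatch(text):
        raise ValueError("not in yyyy-mm-ddThh:mm:ssZ form: %r" % (text,))
    # strptime rejects impossible field values such as month 13.
    naive = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return naive.replace(tzinfo=timezone.utc)

print(parse_stamp("1991-06-20T15:48:18Z"))
```

Other ISO 8601 forms, such as a bare date or a numeric time zone
difference, are deliberately rejected rather than guessed at.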

                                       Gordon Irlam.
                                       (gordoni@cs.adelaide.edu.au)