[comp.bugs.4bsd] Addition to Arthur Olson/4.3bsd table-driven ctime

stevesu@copper.UUCP (03/30/87)

I'd like to express my appreciation for the work that has gone
into the reworked set of ctime routines that has recently been
released (the one written by Arthur Olson).  A table-driven
approach is clearly the only way to go, and this implementation
works admirably well.  I foresee one problem with its use,
however, which I'll describe, along with a proposed solution.

Suppose I'm a software vendor (which I am, or at least I work for
one), and I ship binary executables to my customers (which I do),
executables which reference ctime and friends (which they do).
What, if anything, should I do to ctime on my development system?
I cannot be sure that my customers will install the necessary 
timezone support files.  Therefore, any copy of ctime linked into
my programs must work "correctly" when transported to a machine
without /etc/zoneinfo in place.

The version posted to net.sources defaulted to GMT in the absence
of /etc/zoneinfo.  The "Official 4.3BSD" version posted last week
arranges that local timezone correction be applied in this
situation, based on the kernel's notion of the timezone.  I
insist that both timezone *and* DST corrections be applied, even in
the absence of /etc/zoneinfo, which means that ctime must carry
some DST information along in its data segment, so it can perform
at least as well as it used to.
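
The fallback order described above can be sketched roughly as
follows.  This is only an illustration, not the code from the
posting: choose_tz_source() and the enum are hypothetical names,
and the real tzset logic does considerably more than test whether
the directory is readable.

```c
#include <unistd.h>

/* Hypothetical labels for where the timezone rules come from. */
enum tz_source {
	TZ_FROM_ZONEINFO,	/* up-to-date tables read from disk */
	TZ_FROM_KERNEL		/* kernel timezone plus compiled-in DST tables */
};

/*
 * Hypothetical helper: prefer the installed tables when they are
 * readable, otherwise fall back to what the binary carries with it.
 */
static enum tz_source
choose_tz_source(const char *zoneinfo_path)
{
	if (access(zoneinfo_path, R_OK) == 0)
		return TZ_FROM_ZONEINFO;
	return TZ_FROM_KERNEL;
}
```

Called as choose_tz_source("/etc/zoneinfo"), this returns
TZ_FROM_KERNEL on a machine where the directory was never
installed, which is the case the compiled-in tables exist to cover.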

The next question is, should any DST information hard-compiled
into ctime be fixed to reflect recent changes?  The surprising
answer is "no."  For sites which have installed /etc/zoneinfo,
the up-to-date tables found there will take precedence.  At sites
without /etc/zoneinfo, the safest assumption is that they have
not really dealt with the ctime problem at all, but are tweaking
the system clock (in the case of the upcoming change in the US,
setting the clock ahead one hour on April 5, and setting it back
an hour on April 26, when the old ctime thinks DST kicks in).
Such a strategy is reasonable, and works fine as long as the DST
correction applied by date(1) when the internal GMT is set
matches the correction applied by each and every program when
GMT is converted back to local time.  An executable with a
"fixed" version of ctime would in fact behave incorrectly at such
a site.
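
The arithmetic behind that claim can be checked with a small
model.  The sketch below is an illustration under simplifying
assumptions, not code from ctime: it treats DST as starting at
midnight on a given day of 1987, ignores the autumn change, and
all of the names in it are invented for the example.

```c
#define SECS_PER_HOUR	3600L

/* 0-origin day numbers within 1987 on which each rule set starts US DST. */
#define OLD_RULE_START	115	/* April 26, last Sunday in April */
#define NEW_RULE_START	94	/* April 5, first Sunday in April */

/* DST correction a given rule set applies on a given day. */
static long
dst_secs(int day, int rule_start)
{
	return day >= rule_start ? SECS_PER_HOUR : 0;
}

/*
 * The clock an administrator ends up with after nudging it so that
 * programs using the old rules show the right wall-clock time.
 */
static long
tweaked_clock(long true_gmt, int day)
{
	return true_gmt + dst_secs(day, NEW_RULE_START)
			- dst_secs(day, OLD_RULE_START);
}

/* Local time a program displays, given the rules linked into it. */
static long
displayed(long clock, long gmtoff, int day, int rule_start)
{
	return clock + gmtoff + dst_secs(day, rule_start);
}
```

On April 10 (day 99) the tweaked clock runs an hour ahead of true
GMT.  displayed() with OLD_RULE_START then gives the correct local
time, while displayed() with NEW_RULE_START comes out an hour
fast.  From day 115 on, the two rule sets agree and the tweak
drops back to zero.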

Therefore, the DST information hard-coded into any improved
version of ctime must be exactly as broken as the old, "standard"
one.  In that way, it will work equally well on systems that have
done no more to deal with the DST change than changing their
clocks for three weeks, and on systems that have adopted the
proposed fix.

(The only flaw in this theory is the possibility that people on
some systems have prolonged their agony by going to the trouble
of relinking everything, but with a "fixed" ctime that still
relies only on compiled-in tables.  This approach is in fact the
one suggested by Keith Bostic's "ARTICLE 13."  If anybody is
considering this, take my advice and don't.  If you're in a
position to relink at all, it's really no more trouble to use the
table-driven ctime, and that way you won't have to worry the next
time the DST rules change.)

Herewith are context diffs, against the 4.3 version posted by
Keith Bostic last week, of a version of the Arthur Olson ctime
that works in a backwards-compatible way in the absence of
/etc/zoneinfo.  (I am also posting the complete, modified version
to net.sources.)  The code actually uses the same DST tables as
the old 4.1/4.2 ctime, to guarantee compatibility.  The old-style
tables are automatically converted into the internal state lists
needed by the new ctime's algorithm.
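
The core of that conversion is backing up from a table entry to a
Sunday and turning the result into a GMT transition time.  The
following is a standalone sketch of that step, not an excerpt from
the diffs: the function names are invented here, it handles only
years from 1970 on, and it relies on January 1, 1970 having been a
Thursday.

```c
#define SECS_PER_MIN	60L
#define SECS_PER_HOUR	3600L
#define SECS_PER_DAY	86400L

static int
is_leap(int y)
{
	return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* Days from January 1, 1970 to January 1 of year y (y >= 1970). */
static long
days_to_year(int y)
{
	long d = 0;
	int i;

	for (i = 1970; i < y; i++)
		d += is_leap(i) ? 366 : 365;
	return d;
}

/*
 * Given a table entry (0-origin day of a non-leap year), return the
 * day number of the Sunday on or before it in year y.
 */
static int
sunday_on_or_before(int y, int day)
{
	int wday;

	if (is_leap(y) && day >= 58)	/* table assumes a 365-day year */
		day++;
	/* January 1, 1970 was a Thursday: weekday 4 with 0 == Sunday. */
	wday = (int)((days_to_year(y) + day + 4) % 7);
	return day - wday;
}

/* GMT instant at which local standard time reads 02:00 on that day. */
static long
transition_time(int y, int day, int minuteswest)
{
	return (days_to_year(y) + day) * SECS_PER_DAY
		+ minuteswest * SECS_PER_MIN + 2 * SECS_PER_HOUR;
}
```

For the catch-all US entry (119, 303), sunday_on_or_before(1987, 119)
yields 115, which is April 26, 1987, the date on which the old
ctime thinks DST begins.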

(These diffs still reflect some debugging code I added, to verify
that my new code was building correct internal data structures.
I did regression tests against real state lists created by zic
from a kludged-up description file reflecting the old transition
dates.)

I should point out that I have not tested the new code in the
southern hemisphere case, and I suspect that it will get things
wrong there for the first four months of 1970.  (This code only
attempts to perform DST correction for 1970 and later, anyway.  In that
respect, it may be incompatible with the old ctime, which applied
DST in years prior to 1970 if you passed it a negative time.
The type of the internal variables has been bouncing back and
forth between `long' and `unsigned long' as people try to decide
how it should work.)
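
The `long' versus `unsigned long' question matters because a
negative time only means "before 1970" while the value stays
signed.  A small sketch, using C99 fixed-width types (which the
posted code does not use) to stand in for a 32-bit time value; the
function names are made up for illustration.

```c
#include <stdint.h>

#define SECS_PER_DAY	86400

/* Whole days since the epoch, rounding toward minus infinity. */
static int32_t
days_signed(int32_t t)
{
	int32_t d = t / SECS_PER_DAY;

	if (t % SECS_PER_DAY < 0)	/* C99 division truncates toward zero */
		d--;
	return d;
}

/* The same bit pattern read as an unsigned count of seconds. */
static uint32_t
days_unsigned(int32_t t)
{
	return (uint32_t)t / SECS_PER_DAY;
}
```

days_signed(-86400) is -1, the day before the epoch, whereas
days_unsigned(-86400) is 49709, more than 136 years after it.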

While working over ctime, I came up with a couple of questions:

	In asctime, shouldn't the year really be printed with %4d
	(or maybe %-4d) so that the returned string is guaranteed
	to have its advertised 26-character length?  (I realize
	that the code is lifted directly from the X3J11 draft
	standard, and that %d will only get it wrong for dates in
	the middle ages that a 32-bit time_t can't begin to reach.
	On the other hand, asctime gets handed a broken-down tm
	struct, so early years are quite possible.)

	Shouldn't the offtime() routine be declared static?
	It's not a publicized interface.
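
The effect of the field width in the first question is easy to see
in isolation.  This sketch hard-codes the day and time and varies
only the year conversion; it uses snprintf, which postdates the
code under discussion, and the two function names are invented for
the example.

```c
#include <stdio.h>

/* Year printed with plain %d, as in the draft-standard asctime. */
static int
format_year_plain(char *buf, size_t n, int year)
{
	return snprintf(buf, n, "Sun Jan  1 00:00:00 %d\n", year);
}

/* Year printed with a minimum field width of four. */
static int
format_year_padded(char *buf, size_t n, int year)
{
	return snprintf(buf, n, "Sun Jan  1 00:00:00 %4d\n", year);
}
```

For 1987 both return 25 characters (26 bytes with the terminating
NUL).  For the year 987, format_year_plain returns 24 while
format_year_padded still returns 25.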

                                           Steve Summit
                                           stevesu@copper.tek.com

*** ctime.orig.c	Sun Mar 29 16:11:46 1987
--- ctime.c	Mon Mar 30 11:27:22 1987
***************
*** 215,220
  #endif /* USG_COMPAT */ 
  		}
  	}
  	return 0;
  }
  

--- 215,223 -----
  #endif /* USG_COMPAT */ 
  		}
  	}
+ #ifdef DEBUG
+ 	printstate();
+ #endif
  	return 0;
  }
  
***************
*** 218,223
  	return 0;
  }
  
  static
  tzsetkernel()
  {

--- 221,244 -----
  	return 0;
  }
  
+ #ifdef DEBUG
+ 
+ printstate()
+ {
+ 	int i;
+ 
+ 	printf("TS/DST info state:\n");
+ 	for(i = 0; i < s.timecnt; i++)
+ 		printf("time %d: %ld %d\n", i, s.ats[i], s.types[i]);
+ 	for(i = 0; i < s.typecnt; i++)
+ 		printf("type %d: %ld %d %d (%s)\n", i,
+ 			s.ttis[i].tt_gmtoff, s.ttis[i].tt_isdst,
+ 				s.ttis[i].tt_abbrind,
+ 					s.chars + s.ttis[i].tt_abbrind);
+ }
+ 
+ #endif
+ 
  static
  tzsetkernel()
  {
***************
*** 224,229
  	struct timeval	tv;
  	struct timezone	tz;
  	char	*tztab();
  
  	if (gettimeofday(&tv, &tz))
  		return -1;

--- 245,251 -----
  	struct timeval	tv;
  	struct timezone	tz;
  	char	*tztab();
+ 	static dstsetkernel();
  
  	if (gettimeofday(&tv, &tz))
  		return -1;
***************
*** 227,233
  
  	if (gettimeofday(&tv, &tz))
  		return -1;
! 	s.timecnt = 0;		/* UNIX counts *west* of Greenwich */
  	s.ttis[0].tt_gmtoff = tz.tz_minuteswest * -SECS_PER_MIN;
  	s.ttis[0].tt_abbrind = 0;
  	(void)strcpy(s.chars, tztab(tz.tz_minuteswest, 0));

--- 249,257 -----
  
  	if (gettimeofday(&tv, &tz))
  		return -1;
! 	s.timecnt = 0;
! 	s.typecnt = 1;
! 			/* UNIX counts *west* of Greenwich */
  	s.ttis[0].tt_gmtoff = tz.tz_minuteswest * -SECS_PER_MIN;
  	s.ttis[0].tt_isdst = 0;
  	s.ttis[0].tt_abbrind = 0;
***************
*** 229,234
  		return -1;
  	s.timecnt = 0;		/* UNIX counts *west* of Greenwich */
  	s.ttis[0].tt_gmtoff = tz.tz_minuteswest * -SECS_PER_MIN;
  	s.ttis[0].tt_abbrind = 0;
  	(void)strcpy(s.chars, tztab(tz.tz_minuteswest, 0));
  	tzname[0] = tzname[1] = s.chars;

--- 253,259 -----
  	s.typecnt = 1;
  			/* UNIX counts *west* of Greenwich */
  	s.ttis[0].tt_gmtoff = tz.tz_minuteswest * -SECS_PER_MIN;
+ 	s.ttis[0].tt_isdst = 0;
  	s.ttis[0].tt_abbrind = 0;
  	(void)strcpy(s.chars, tztab(tz.tz_minuteswest, 0));
  	tzname[0] = tzname[1] = s.chars;
***************
*** 236,241
  	timezone = tz.tz_minuteswest * 60;
  	daylight = tz.tz_dsttime;
  #endif /* USG_COMPAT */
  	return 0;
  }
  

--- 261,274 -----
  	timezone = tz.tz_minuteswest * 60;
  	daylight = tz.tz_dsttime;
  #endif /* USG_COMPAT */
+ 
+ 	if(tz.tz_dsttime)
+ 		dstsetkernel(&tz);
+ 
+ #ifdef DEBUG
+ 	printstate();
+ #endif
+ 
  	return 0;
  }
  
***************
*** 386,389
  	tmp->tm_zone = "";
  	tmp->tm_gmtoff = offset;
  	return tmp;
  }

--- 419,615 -----
  	tmp->tm_zone = "";
  	tmp->tm_gmtoff = offset;
  	return tmp;
+ }
+ 
+ /*
+  *  Backwards-compatible DST information tables.
+  *
+  *  The tables give the day number of the first day after the
+  *  Sunday of the change.
+  *
+  *  DO NOT FIX THESE TABLES.
+  *  Yes, they're wrong in several ways, including 1987 and beyond
+  *  in the United States, but they happen to match the old ctime.c
+  *  that is compiled into virtually all programs under 4.2bsd-
+  *  derived systems.  This is important if programs compiled with
+  *  this version of ctime are to work correctly when shipped (in
+  *  binary form) to systems which have not upgraded to the
+  *  /etc/zoneinfo scheme.
+  *
+  *  These hardwired tables are only used when /etc/zoneinfo cannot
+  *  be accessed.  A system which does not have /etc/zoneinfo is
+  *  probably handling DST fluctuations by changing the system clock.
+  *  Therefore, all programs on such a system (whether linked with
+  *  old or new versions of ctime) must use the same DST rules.
+  *  On a system with old-fashioned versions of ctime, handling
+  *  DST fluctuations by changing the system clock, a "correct"
+  *  version of ctime would in fact display incorrect results.
+  */
+ 
+ struct dstab {
+ 	int	dayyr;
+ 	int	daylb;
+ 	int	dayle;
+ };
+ 
+ static struct dstab usdaytab[] = {
+ 	1974,	5,	333,	/* 1974: Jan 6 - last Sun. in Nov */
+ 	1975,	58,	303,	/* 1975: Last Sun. in Feb - last Sun in Oct */
+ 	0,	119,	303,	/* all other years: end Apr - end Oct */
+ };
+ 
+ static struct dstab ausdaytab[] = {
+ 	1970,	400,	0,	/* 1970: no daylight saving at all */
+ 	1971,	303,	0,	/* 1971: daylight saving from Oct 31 */
+ 	1972,	303,	58,	/* 1972: Jan 1 -> Feb 27 & Oct 31 -> dec 31 */
+ 	0,	303,	65,	/* others: -> Mar 7, Oct 31 -> */
+ };
+ 
+ /*
+  * The European tables ... based on hearsay
+  * Believed correct for:
+  *	WE:	Great Britain, Ireland, Portugal
+  *	ME:	Belgium, Luxembourg, Netherlands, Denmark, Norway,
+  *		Austria, Poland, Czechoslovakia, Sweden, Switzerland,
+  *		DDR, DBR, France, Spain, Hungary, Italy, Jugoslavia
+  * Eastern European dst is unknown, we'll make it ME until someone speaks up.
+  *	EE:	Bulgaria, Finland, Greece, Rumania, Turkey, Western Russia
+  */
+ 
+ static struct dstab wedaytab[] = {
+ 	1983,	86,	303,	/* 1983: end March - end Oct */
+ 	1984,	86,	303,	/* 1984: end March - end Oct */
+ 	1985,	86,	303,	/* 1985: end March - end Oct */
+ 	0,	400,	0,	/* others: no daylight saving at all ??? */
+ };
+ 
+ static struct dstab medaytab[] = {
+ 	1983,	86,	272,	/* 1983: end March - end Sep */
+ 	1984,	86,	272,	/* 1984: end March - end Sep */
+ 	1985,	86,	272,	/* 1985: end March - end Sep */
+ 	0,	400,	0,	/* others: no daylight saving at all ??? */
+ };
+ 
+ static struct dayrules {
+ 	int		dst_type;	/* number obtained from system */
+ 	int		dst_hrs;	/* hours to add when dst on */
+ 	struct	dstab *	dst_rules;	/* one of the above */
+ 	enum {STH,NTH}	dst_hemi;	/* southern, northern hemisphere */
+ } dayrules [] = {
+ 	DST_USA,	1,	usdaytab,	NTH,
+ 	DST_AUST,	1,	ausdaytab,	STH,
+ 	DST_WET,	1,	wedaytab,	NTH,
+ 	DST_MET,	1,	medaytab,	NTH,
+ 	DST_EET,	1,	medaytab,	NTH,	/* XXX */
+ 	-1,
+ };
+ 
+ static
+ dstsetkernel(tzp)
+ struct timezone *tzp;
+ {
+ 	struct dayrules *drp;
+ 	int tabsize;
+ 	int timei;
+ 	int y;
+ 	int yleap;
+ 	int i;
+ 	int d, di;
+ 	time_t t;
+ 	int daylb, dayle;
+ 	char *p;
+ 
+ 	for(drp = dayrules; drp->dst_type >= 0; drp++)
+ 		if(drp->dst_type == tzp->tz_dsttime)
+ 			break;
+ 
+ 	if(drp->dst_type < 0)
+ 		return;
+ 
+ 	/* this ends up computing tabsize - 1, but that's what we want */
+ 
+ 	for(tabsize = 0; drp->dst_rules[tabsize].dayyr > 0; tabsize++)
+ 		;
+ 
+ 	/* 2038 is the year that signed 32 bit time_t's give out */
+ 
+ 	for(y = 1970, d = 0, t = 0, timei = 0; y < 2038; y++) {
+ 		daylb = drp->dst_rules[tabsize].daylb;
+ 		dayle = drp->dst_rules[tabsize].dayle;
+ 
+ 		for(i = 0; i < tabsize; i++)
+ 			if(y == drp->dst_rules[i].dayyr) {
+ 				daylb = drp->dst_rules[i].daylb;
+ 				dayle = drp->dst_rules[i].dayle;
+ 				break;
+ 			}
+ 
+ 		yleap = isleap(y);
+ 
+ 		if(yleap) {
+ 			if(daylb >= 58)
+ 				daylb++;
+ 
+ 			if(dayle >= 58)
+ 				dayle++;
+ 		}
+ 
+ 		/*
+ 		 *  January 1, 1970 was a Thursday (day 4, 0 == Sunday).
+ 		 *  d is the difference between January 1 of the loop
+ 		 *  year (y) and January 1, 1970, in days.
+ 		 *  daylb and dayle are (0-origin) day offsets with
+ 		 *  respect to January 1.
+ 		 *  So (d + dayl[be] - 3) % 7 is the day (0 == Sunday)
+ 		 *  of daylb or dayle (-3 is congruent to 4, mod 7).
+ 		 *  That's also the number to subtract from daylb or
+ 		 *  dayle to get the day number (since January 1 of
+ 		 *  the loop year) of the preceding Sunday.
+ 		 */
+ 
+ 		daylb -= (d + daylb - 3) % 7;
+ 		dayle -= (d + dayle - 3) % 7;
+ 
+ 		s.ats[timei] = t + SECS_PER_DAY * daylb
+ 				+ tzp->tz_minuteswest * SECS_PER_MIN
+ 							+ 2 * SECS_PER_HOUR;
+ 
+ 		s.ats[timei + 1] = t + SECS_PER_DAY * dayle
+ 				+ tzp->tz_minuteswest * SECS_PER_MIN
+ 					+ (drp->dst_hemi == NTH ? 1 : 2)
+ 							* SECS_PER_HOUR;
+ 
+ 		if(drp->dst_hemi == NTH) {
+ 			s.types[timei] = 1;	/* index into s.ttis */
+ 			s.types[timei + 1] = 0;
+ 		} else {
+ 			s.types[timei] = 0;
+ 			s.types[timei + 1] = 1;	/* index into s.ttis */
+ 		}
+ 
+ 		timei += 2;
+ 
+ 		di = year_lengths[yleap];
+ 
+ 		d += di;
+ 		t += di * SECS_PER_DAY;
+ 	}
+ 
+ 	s.timecnt = timei;
+ 
+ 	s.ttis[1].tt_gmtoff = tzp->tz_minuteswest * -SECS_PER_MIN
+ 						+ drp->dst_hrs * SECS_PER_HOUR;
+ 
+ 	s.ttis[1].tt_isdst = tzp->tz_dsttime;
+ 
+ 	for(p = s.chars; *p != '\0'; p++)
+ 		;
+ 
+ 	(void)strcpy(++p, tztab(tzp->tz_minuteswest, tzp->tz_dsttime));
+ 
+ 	s.ttis[1].tt_abbrind = p - s.chars;
+ 
+ 	tzname[1] = p;
+ 
+ 	s.typecnt = 2;
  }

guy@gorodish.UUCP (03/31/87)

> Suppose I'm a software vendor (which I am, or at least I work for
> one), and I ship binary executables to my customers (which I do),
> executables which reference ctime and friends (which they do).
> What, if anything, should I do to ctime on my development system?
> I cannot be sure that my customers will install the necessary 
> timezone support files.

If you are a hardware vendor as well, you should just go ahead and
drop in the new timezone stuff and not worry about older systems,
unless you plan to build executables on a new whizzy system and have
them run on an old dingy system.  If you don't claim that software
built on version 5.0 of your OS will work on version 4.0 (and there
are very often excellent reasons *not* to claim this), no problem.
Presumably, installing version 5.0 of your OS also installs the
timezone support files (if it doesn't, somebody screwed up).

If you're just a software vendor, leave it alone.  Presumably, you'll
be building software on a BUBco Boring UNIX Box 1601, to sell to
people with BUBco Boring UNIX Boxes in the 1600 series.  If you build
it under version 5.0 under BUBIX, and only claim that it runs under
5.0 or later versions, you can assume that the time zone support
files are already there.  If you build it under 4.0, and BUBco claims
that programs built under 4.0 will still run under 5.0, as long as
you trust BUBco you can assume that there's no problem.

The only problem occurs if you build things under 5.0 expecting them
to work under 4.0.  If BUBco claims that this is the case, then
either they're lying or they've already set "ctime" up properly.  If
they don't claim that this is the case, then you shouldn't be
assuming that it *is* the case.

(If you're monkeying with BUBco's "libc.a", then you'd better *know*
what you're doing, and be prepared for unpleasant surprises anyway.
Think carefully before you overwrite your vendor's "ctime" code, if
you're planning to use that machine to develop software to be
installed on other machines that may not have installed the new
"ctime" stuff.)

*BUB and Boring UNIX Box are Trademarks of Russell Sandberg Enterprises.

stevesu@copper.UUCP (04/06/87)

In article <15918@sun.uucp>, guy%gorodish@Sun.COM (Guy Harris) writes:
> If you're just a software vendor, leave it alone.  Presumably, you'll
> be building software on a BUBco Boring UNIX Box 1601, to sell to
> people with BUBco Boring UNIX Boxes in the 1600 series...
> The only problem occurs if you build things under 5.0 expecting them
> to work under 4.0.

What I'm worried about is that, for the moment at least, we ship
identical binaries to both Ultrix and 4.[23]bsd sites.  Some of
these binaries are compiled and linked on an Ultrix machine,
others on a 4.2 machine.  In the past, for our particular
applications, this hasn't seemed to make a difference.

(Please, no speeches proving to me why we shouldn't be doing
this.  I, too, can contrive situations in which this policy
could fail.  We'll abandon it when we have to, but for now, it
works and it's convenient.)

In the case of the new & improved ctime, I'm worried because, to
the extent that it's been posted in comp.bugs.4bsd.ucb-fixes, the
new ctime is now in some sense "official" 4.3bsd, but it may or
may not be adopted by any given 4.3 site, and where it is adopted
it is incompatible with Ultrix.

                                           Steve Summit
                                           stevesu@copper.tek.com