[comp.sources.bugs] Patches to

pcg@aber-cs.UUCP (Piercarlo Grandi) (01/06/90)

This is a patch to elm2.2 PL14 for improved sorting functionality, speed,
etc... Here it is:

	A description of some changes to elm 2.2 PL14

I use elm to maintain a large number of folders of archived news
articles, and of electronic mail.

Each folder may well contain several hundred messages and be well
over a megabyte long, and contain many different threads.  To me
elm should be as efficient a cataloguing system as possible.

I have modified elm to handle very large folders with greater
efficiency, and to present their contents in sorted order.

The major change effected has been to sorting; the standard elm keeps
two dates for each message, sent and received.  The sent date is
kept as a set of strings, and the received one as a set of binary
numbers.  I have inverted this, as most often one wants to
manipulate the message by its sent date, which is essential to its
character, rather than the received date, which depends only on
transport.  I have also added code that adds the sent date as the
secondary sorting criterion if the primary sort criterion is
another.  I have also improved the subject sort code.
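
As an illustration of the secondary criterion, here is a minimal
sketch, not the code from the patch.  The structs sketch_date and
sketch_msg are cut-down, hypothetical stand-ins for elm's date_rec and
header_rec, and the comparison is written as a qsort() callback over
an array of structs, which is an assumption about layout made only for
the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, cut-down stand-ins for elm's date_rec and header_rec. */
struct sketch_date {
    int year, month, day, hour, minute, second;
};

struct sketch_msg {
    char subject[64];
    struct sketch_date sent;
};

/* Compare two binary dates field by field, most significant first. */
static int compare_dates(const struct sketch_date *a,
                         const struct sketch_date *b)
{
    if (a->year != b->year)     return a->year - b->year;
    if (a->month != b->month)   return a->month - b->month;
    if (a->day != b->day)       return a->day - b->day;
    if (a->hour != b->hour)     return a->hour - b->hour;
    if (a->minute != b->minute) return a->minute - b->minute;
    return a->second - b->second;
}

/* qsort() callback: subject is the primary key, sent date breaks ties. */
static int by_subject_then_sent(const void *pa, const void *pb)
{
    const struct sketch_msg *a = pa;
    const struct sketch_msg *b = pb;
    int diff;

    assert(a != NULL && b != NULL);
    diff = strcmp(a->subject, b->subject);
    return diff != 0 ? diff : compare_dates(&a->sent, &b->sent);
}
```

Because the date is held as integers, the tie-break costs a handful of
integer comparisons and no string parsing.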

Since I keep my news folders sorted by subject, these changes mean
that within each subject articles will be efficiently sorted by
date written, thus automatically in the right order to follow the
development of the thread.

I keep my mail folders sorted by date sent, because their "urgency"
seems to me more dependent on when they were written than on when
they were delivered.  Occasionally I sort my mail folders by
sender, and again it is useful to have messages from the same
sender sorted by date written.

Other changes are improved parsing of the envelope (the delivered
date), and enabling RFC822 date parsing for the date written.

Here is a comment on the various files modified:

hdrs/defs.h

	added field 'second' to struct 'date_rec', and swapped
	the storage method of the received date and the sent
	date.

utils/from.c

	corrected a bug evident when frm'ing from several
	folders: the folder's FILE was not closed, so frm
	eventually ran out of file descriptors.
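
The shape of the fix can be sketched as follows.  This is an
illustration only: count_messages() and count_all() are hypothetical
names, not functions from utils/from.c, and counting "From " lines
stands in for whatever per-folder work frm really does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count envelope lines ("From " at the start of a line) in one open
 * folder.  Lines longer than the buffer are not handled specially. */
static int count_messages(FILE *fp)
{
    char line[1024];
    int count = 0;

    assert(fp != NULL);
    while (fgets(line, sizeof line, fp) != NULL)
        if (strncmp(line, "From ", 5) == 0)
            count++;
    return count;
}

/* Scan every folder in turn, closing each one before opening the next. */
static int count_all(const char **paths, int n)
{
    int i, total = 0;

    for (i = 0; i < n; i++) {
        FILE *fp = fopen(paths[i], "r");

        if (fp == NULL)
            continue;               /* unreadable folder: skip it */
        total += count_messages(fp);
        fclose(fp);                 /* without this, one descriptor leaks per folder */
    }
    return total;
}
```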

src/addr_util.c

	Changed the parsing of the envelope to cope with the
	two main different formats for the envelope date, and to
	store the envelope date as strings in the message
	descriptor.
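
A minimal sketch of telling the two envelope layouts apart by looking
at the first character of the third field.  The struct
sketch_envelope and the functions copy_field() and parse_envelope()
are hypothetical; the field sizes are assumptions chosen for the
example, and the real code in the diff fills elm's header_rec instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the envelope date, kept as strings. */
struct sketch_envelope {
    char dayname[8], month[10], day[3], year[5], time[16];
};

/* Bounded copy that always terminates the destination. */
static void copy_field(char *dst, size_t size, const char *src)
{
    size_t len = strlen(src);

    if (len >= size)
        len = size - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Parse "From <user> ..." in either layout; 1 on success, 0 on failure. */
static int parse_envelope(const char *line, struct sketch_envelope *e)
{
    char third[16], fifth[16], seventh[16];
    int n;

    assert(line != NULL && e != NULL);
    memset(e, 0, sizeof *e);
    n = sscanf(line, "From %*s %15s %9s %15s %15s %15s",
               third, e->month, fifth, e->time, seventh);
    if (n < 4 || strlen(e->time) < 3)
        return 0;                       /* no usable time field */

    if (isdigit((unsigned char)third[0])) {
        /* <day> <month> <year> <time> [<timezone>] */
        copy_field(e->day, sizeof e->day, third);
        copy_field(e->year, sizeof e->year, fifth);
    } else {
        /* <dayname> <month> <day> <time> <year> */
        if (n < 5)
            return 0;
        copy_field(e->dayname, sizeof e->dayname, third);
        copy_field(e->day, sizeof e->day, fifth);
        copy_field(e->year, sizeof e->year, seventh);
    }
    return e->year[0] != '\0';
}
```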

	Made the RFC822 parse_arpa_date the default and only
	one, and made it store the parsed date into the binary
	'sent' substruct of the message descriptor.
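
To show what storing the date in binary form involves, here is a small
sketch of two steps: decoding a numeric "+HHMM" zone and shifting a
minutes-of-day value to UTC.  numeric_zone() and to_utc() are
hypothetical helpers written for this note, not routines from the
patch, and they assume the offset is smaller than one day.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Parse "+HHMM" or "-HHMM" into minutes east of GMT.
 * Returns 1 on success, 0 on bad syntax. */
static int numeric_zone(const char *s, int *minutes_east)
{
    int i, value;

    assert(s != NULL && minutes_east != NULL);
    if (*s != '+' && *s != '-')
        return 0;
    for (i = 1; i <= 4; i++)
        if (!isdigit((unsigned char)s[i]))
            return 0;               /* also stops at a premature NUL */
    value = ((s[1] - '0') * 10 + (s[2] - '0')) * 60
          + (s[3] - '0') * 10 + (s[4] - '0');
    *minutes_east = (*s == '-') ? -value : value;
    return 1;
}

/* Shift local minutes-of-day to UTC in place.  Returns the day
 * adjustment the caller must apply: -1, 0 or +1. */
static int to_utc(int *minutes, int minutes_east)
{
    assert(minutes != NULL);
    *minutes -= minutes_east;
    if (*minutes < 0) {
        *minutes += 24 * 60;
        return -1;                  /* crossed back into the previous day */
    }
    if (*minutes >= 24 * 60) {
        *minutes -= 24 * 60;
        return 1;                   /* crossed forward into the next day */
    }
    return 0;
}
```

For example, 00:30 at +0100 becomes 23:30 on the previous day, so two
messages written minutes apart in different zones still sort together.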

src/aliasdb.c
src/conn_to.c
src/domains.c
src/init.c
src/opt_utils.c
src/strings.c

	Now compiles certain code only when really needed.
	Saves a little space in the executable.

src/elm.c

	Changes made necessary because now the received date is
	stored as strings, and the sent date as binary numbers.

src/leavembox.c

	Nullified the code that on exit resorts the folder as
	it was when we opened it. We want to have the mailbox
	physically sorted as we have it when we close it, so
	that typically examining it with other tools (e.g. frm)
	gives us messages in the same order as when we look at
	it with elm.

src/newmbox.c

	Changed the logic used to allocate the array of message
	headers.  We initially allocate it 3 klicks long, grow
	it by 1 klick every time it needs expanding, and when
	we change mailbox we free the headers in excess.
	Another small correction also is because we now parse
	the written date with parse_arpa_date into a binary
	record.
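
The allocation policy can be sketched as below.  This is a standalone
illustration: struct sketch_list, ensure_room(), free_excess() and the
value of SKETCH_KLICK are all hypothetical, and elm's own KLICK and
header array are declared elsewhere in its sources.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_KLICK 10   /* hypothetical growth unit; elm's KLICK may differ */

struct sketch_list {
    void **items;   /* array of message header pointers */
    int    max;     /* slots currently allocated */
};

/* Make sure slot number `needed` exists.
 * Returns 0 on success, -1 if memory runs out. */
static int ensure_room(struct sketch_list *list, int needed)
{
    assert(list != NULL);
    while (needed >= list->max) {
        int new_max = (list->max == 0) ? 3 * SKETCH_KLICK
                                       : list->max + SKETCH_KLICK;
        void **grown = realloc(list->items, new_max * sizeof *grown);
        int i;

        if (grown == NULL)
            return -1;              /* the old array is still valid */
        for (i = list->max; i < new_max; i++)
            grown[i] = NULL;        /* new slots start out empty */
        list->items = grown;
        list->max = new_max;
    }
    return 0;
}

/* Free the headers at index `count` and above, keeping the pointer
 * array itself for reuse by the next folder. */
static void free_excess(struct sketch_list *list, int count)
{
    int i;

    assert(list != NULL);
    for (i = count; i < list->max; i++) {
        free(list->items[i]);       /* free(NULL) is a no-op */
        list->items[i] = NULL;
    }
}
```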

src/read_rc.c

	The default weedlist, which *cannot be overridden* by
	the user, has been made more sensible, both for saved
	news articles and regular mail. We weed out all fields
	that are either already present in elm's message index
	or message pager (e.g. From, To, ...), or are only
	relevant to transport or other mail handling programs
	and not the user (e.g. Message-Id, Received, ...).
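
Entries such as "X-" and "Message-I" in the new list only make sense
if each one is matched as a prefix of the header line.  A minimal
sketch of that idea follows; sketch_weedlist and is_weeded() are
hypothetical, the list is a short excerpt, and a case-sensitive match
is an assumption rather than a statement about what elm does.

```c
#include <assert.h>
#include <string.h>

/* A short excerpt of weed-list prefixes, NULL-terminated. */
static const char *sketch_weedlist[] = {
    "Received:", "Message-I", "X-", "Path:", NULL
};

/* Return 1 if the header line begins with any weed-list entry. */
static int is_weeded(const char *line)
{
    int i;

    assert(line != NULL);
    for (i = 0; sketch_weedlist[i] != NULL; i++)
        if (strncmp(line, sketch_weedlist[i],
                    strlen(sketch_weedlist[i])) == 0)
            return 1;
    return 0;
}
```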

src/screen.c

	A couple of changes because now the sent date is binary
	and no longer symbolic.

src/sort.c

	A couple of changes. First of all now we have SENT_DATE
	as a default, secondary sort criterion. This is
	virtually always appropriate, and very inexpensive,
	given that now it is the sent date that is stored in
	binary form.

	Second, the code that strips any leading "Re: " type
	prefix from the subject has been completely rewritten
	(it was quite bad), and made more efficient and
	general, and a few pitfalls avoided.
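
For illustration, one way to skip reply prefixes before comparing
subjects.  This is a sketch under my own assumptions, not the
rewritten routine from src/sort.c: skip_re() is a hypothetical name,
and it recognises only "Re:" in any letter case, repeated any number
of times.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Skip leading whitespace and any number of "Re:" prefixes.
 * Returns a pointer into the original string; nothing is copied. */
static const char *skip_re(const char *subject)
{
    const char *p = subject;

    assert(subject != NULL);
    for (;;) {
        while (isspace((unsigned char)*p))
            p++;
        if (tolower((unsigned char)p[0]) == 'r' &&
            tolower((unsigned char)p[1]) == 'e' &&
            p[2] == ':')
            p += 3;                 /* drop one prefix and look again */
        else
            return p;
    }
}
```

Comparing skip_re(a) with skip_re(b) puts a reply next to the article
it answers, and a subject that merely starts with "Re", such as
"Regarding", is left alone because the colon is required.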

===================================================================
RCS file: hdrs/defs.h,v
retrieving revision 1.1
diff -c -r1.1 hdrs/defs.h
*** /tmp/,RCSt1a22140	Sun Dec 24 16:43:51 1989
--- hdrs/defs.h	Sun Dec 24 16:41:22 1989
***************
*** 271,276 ****
--- 268,274 ----
  	int  year;		/**     time...		 **/
  	int  hour;
  	int  minute;
+ 	int  second;
         };
  
  struct header_rec {
***************
*** 281,287 ****
  	int  exit_disposition;	/** whether to keep, store, delete **/
  	int  status_chgd;	/** whether became read or old, etc. **/
  	long offset;		/** offset in bytes of message **/
! 	struct date_rec received;/** when elm received here    **/
  	char from[STRING];	/** who sent the message?      **/
  	char to[STRING];	/** who it was sent to	       **/
  	char messageid[STRING];	/** the Message-ID: value      **/
--- 279,285 ----
  	int  exit_disposition;	/** whether to keep, store, delete **/
  	int  status_chgd;	/** whether became read or old, etc. **/
  	long offset;		/** offset in bytes of message **/
! 	struct date_rec sent;	/** when the msg was sent    **/
  	char from[STRING];	/** who sent the message?      **/
  	char to[STRING];	/** who it was sent to	       **/
  	char messageid[STRING];	/** the Message-ID: value      **/
***************
*** 288,294 ****
  	char dayname[8];	/**  when the                  **/
  	char month[10];		/**        message             **/
  	char day[3];		/**          was 	       **/
! 	char year[5];		/**            sent            **/
  	char time[NLEN];	/**              to you!       **/
  	char subject[STRING];   /** The subject of the mail    **/
  	char mailx_status[WLEN];/** mailx status flags (RO...) **/
--- 286,292 ----
  	char dayname[8];	/**  when the                  **/
  	char month[10];		/**        message             **/
  	char day[3];		/**          was 	       **/
! 	char year[5];		/**            delivered       **/
  	char time[NLEN];	/**              to you!       **/
  	char subject[STRING];   /** The subject of the mail    **/
  	char mailx_status[WLEN];/** mailx status flags (RO...) **/
===================================================================
RCS file: utils/from.c,v
retrieving revision 1.1
diff -c -r1.1 utils/from.c
*** /tmp/,RCSt1a22140	Sun Dec 24 16:43:52 1989
--- utils/from.c	Sun Dec 24 14:48:08 1989
***************
*** 117,127 ****
--- 114,127 ----
  	    }
  	  }
  	  else
+ 	  {
  	    if (read_headers(optind+1 == argc)==0)
  	      if (optind+1 == argc)
  	        printf("No mail\n");
  	      else
  	        printf("No messages in that folder!\n");
+ 	    fclose(mailfile);
+ 	  }
  
  	  optind++;
  	}
===================================================================
RCS file: src/addr_util.c,v
retrieving revision 1.1
diff -c -r1.1 addr_util.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:04 1989
--- src/addr_util.c	Sat Dec 23 22:38:26 1989
***************
*** 434,444 ****
  	timebuff[0] = '\0';
  	junk[0] = '\0';
  
! 	/* From <user> <day> <month> <day> <hr:min:sec> <year> */
  
! 	sscanf(buffer, "%*s %*s %*s %*s %*s %s %*s %s", timebuff, junk);
  
! 	if (strlen(timebuff) < 3) {
  	  dprint(3,(debugfile, 
  		"Real_from returns FAIL [no time field] on\n-> %s\n", 
  		buffer));
--- 433,458 ----
  	timebuff[0] = '\0';
  	junk[0] = '\0';
  
! 
! 
! 	/*
! 	    From
! 	    <user>
! 	    <dayname> | <day>
! 	    <month>
! 	    <day> | <year>
! 	    <time>
! 	    <year> | <timezone>
! 	*/
! 
! 	sscanf(buffer, "%s %s %s %s %*s %s %*s",
! 	    junk, holding_from,
! 	    timebuff, rec_ptr->month, rec_ptr->time);
  
! 	strncpy(rec_ptr->from, holding_from, STRING-1);
! 	rec_ptr->from[STRING-1] = '\0';
  
! 	if (strlen(rec_ptr->time) < 3) {
  	  dprint(3,(debugfile, 
  		"Real_from returns FAIL [no time field] on\n-> %s\n", 
  		buffer));
***************
*** 445,482 ****
  	  return(FALSE);
  	}
  
! 	if (timebuff[1] != ':' && timebuff[2] != ':') { 
  	  dprint(3,(debugfile, 
  		"Real_from returns FAIL [bad time field] on\n-> %s\n", 
  		buffer));
  	  return(FALSE);
  	}
- 	if (junk[0] != '\0') {	/* try for 8 field entry */
- 	  junk[0] = '\0';
- 	  sscanf(buffer, "%*s %*s %*s %*s %*s %s %*s %*s %s", timebuff, junk);
- 	  if (junk[0] != '\0') {
- 	    dprint(3, (debugfile, 
- 		  "Real_from returns FAIL [too many fields] on\n-> %s\n", 
- 		  buffer));
- 	    return(FALSE);
- 	  }
- 	  eight_fields++;
- 	}
- 
- 	/** now get the info out of the record! **/
  
! 	if (eight_fields) 
! 	  sscanf(buffer, "%s %s %s %s %s %s %*s %s",
! 	            junk, holding_from, rec_ptr->dayname, rec_ptr->month, 
!                     rec_ptr->day, rec_ptr->time, rec_ptr->year);
  	else
! 	  sscanf(buffer, "%s %s %s %s %s %s %s",
! 	            junk, holding_from, rec_ptr->dayname, rec_ptr->month, 
!                     rec_ptr->day, rec_ptr->time, rec_ptr->year);
! 	
! 	strncpy(rec_ptr->from, holding_from, STRING-1);
! 	rec_ptr->from[STRING-1] = '\0';
! 	resolve_received(rec_ptr);
  	return(rec_ptr->year[0] != '\0');
  }
  
--- 459,481 ----
  	  return(FALSE);
  	}
  
! 	if (rec_ptr->time[1] != ':' && rec_ptr->time[2] != ':') { 
  	  dprint(3,(debugfile, 
  		"Real_from returns FAIL [bad time field] on\n-> %s\n", 
  		buffer));
  	  return(FALSE);
  	}
  
! 	if (isdigit(timebuff[0]))
! 	{
! 	    sscanf(buffer, "%*s %*s %s %*s %s %*s %*s",
! 		rec_ptr->day, rec_ptr->year);
! 	    rec_ptr->dayname[0] = '\0';
! 	}
  	else
! 	    sscanf(buffer, "%*s %*s %s %*s %s %*s %s",
! 		rec_ptr->dayname, rec_ptr->day, rec_ptr->year);
! 
  	return(rec_ptr->year[0] != '\0');
  }
  
***************
*** 625,741 ****
  	}
  }
  
! parse_arpa_date(string, entry)
  char *string;
! struct header_rec *entry;
  {
! 	/** Parse and figure out the given date format... return
! 	    the entry fields changed iff it turns out we have a
! 	    valid parse of the date!  **/
  
! 	char word[15][WLEN], buffer[SLEN], *bufptr;
! 	char *aword;
! 	int  words = 0;
! 
! 	strcpy(buffer, string);
! 	bufptr = (char *) buffer;
! 
! 	/** break the line down into words... **/
! 
! 	while ((aword = strtok(bufptr," \t '\"-/(),.")) != NULL) {
! 	  strcpy(word[words++], aword);
! 	  bufptr = NULL;
  	}
  
! 	if (words < 6) {	/* strange format.  We're outta here! */
! 	  dprint(3,(debugfile, 
! 		"Parse_arpa_date failed [less than six fields] on\n-> %s\n",
! 		string));
! 	  return;
  	}
  
! 	/* There are now five possible combinations that we could have:
! 	 
! 	    Date: day_number month_name year_number time timezone
! 	    Date: day_name day_number month_name year_number ...
! 	    Date: day_name month_name day_number time year_number
! 	    Date: day_name month_name day_number year_number time
! 	    Date: day_number month_name year_number time timezone day_name
  
! 	   Note that they are distinguishable by checking the first
! 	   character of the second, third and fourth words... 
! 	*/
  
! 	if (isdigit(word[1][0])) {			/*** type one! ***/
! 	  if (! valid_date(word[1], word[2], word[3])) {
! 	    dprint(3,(debugfile, 
! 		  "parse_arpa_date failed [bad date: %s/%s/%s] on\n-> %s\n",
! 		  word[1], word[2], word[3], string));
! 	    return;		/* strange date! */
! 	  }
! 	  strncpy(entry->day, word[1], 2);
! 	  entry->day[2] = '\0';
! 	  strncpy(entry->month, word[2], 3);
! 	  entry->month[3] = '\0';
! 	  strncpy(entry->year,  word[3], 4);
! 	  entry->year[4] = '\0';
! 	  strncpy(entry->time,  word[4], 10);
! 	  entry->time[10] = '\0';
! 	}
! 	else if (isdigit(word[2][0])) {		        /*** type two! ***/
! 	  if (! valid_date(word[2], word[3], word[4])) {
! 	    dprint(3,(debugfile,
! 		  "parse_arpa_date failed [bad date: %s/%s/%s] on\n-> %s\n",
! 		  word[2], word[3], word[4], string));
! 	    return;		/* strange date! */
! 	  }
! 	  strncpy(entry->day, word[2], 2);
! 	  entry->day[2] = '\0';
! 	  strncpy(entry->month, word[3], 3);
! 	  entry->month[3] = '\0';
! 	  strncpy(entry->year,  word[4], 4);
! 	  entry->year[4] = '\0';
! 	  strncpy(entry->time,  word[5], 10);
! 	  entry->time[10] = '\0';
! 	}
! 	else if (isdigit(word[3][0])) {		
! 	  if (word[4][1] == ':' || 
!               word[4][2] == ':') {	               /*** type three! ***/
! 	    if (! valid_date(word[3], word[2], word[5])) {
! 	     dprint(3, (debugfile,
! 		"parse_arpa_date failed [bad date: %s/%s/%s] on\n-> %s\n",
! 		    word[3], word[2], word[5], string));
! 	      return;		/* strange date! */
! 	    }
! 	    strncpy(entry->year,  word[5], 4);
! 	    entry->year[4] = '\0';
! 	    strncpy(entry->time,  word[4], 10);
! 	    entry->time[10] = '\0';
! 	  }
! 	  else {				       /*** type four!  ***/ 
! 	    if (! valid_date(word[3], word[2], word[4])) {
! 	     dprint(3,(debugfile,
! 		    "parse_arpa_date failed [bad date: %s/%s/%s] on\n-> %s\n",
! 		    word[3], word[2], word[4], string));
! 	      return;		/* strange date! */
! 	    }
! 	    strncpy(entry->year,  word[4], 4);
! 	    entry->year[4] = '\0';
! 	    strncpy(entry->time, word[5], 10);
! 	    entry->time[10] = '\0';
! 	  }
! 	  strncpy(entry->day, word[3], 2);
! 	  entry->day[2] = '\0';
! 	  strncpy(entry->month, word[2], 3);
! 	  entry->month[3] = '\0';
! 	}
! 
! 	/** finally, let's just normalize the monthname to be a three
! 	    letter abbreviation, with the first capitalized and the
! 	    second and third in lowercase... **/
  
! 	strcpy(entry->month, shift_lower(entry->month));
! 	entry->month[0] = toupper(entry->month[0]);
  }
  
  fix_arpa_address(address)
--- 624,1027 ----
  	}
  }
  
! /* Revised version of parse_arpa_date() by
!  * Marvin Solomon <solomon@cs.wisc.edu>, June 1989.
!  * The original version ignored the time zone, making sorting by date sent
!  * (for example) much less useful.  When I went to fix that, I found that
!  * the syntax accepted didn't seem to correspond to much of anything--it
!  * certainly wasn't even a subset of what rfc822 specifies.  I wrote the
!  * following ad hoc parsing routines, which accept all of 822 plus some
!  * of the more common violations that I have seen in my incoming mail.
!  *
!  * It would make much more sense to simply translate dates into "Unix time"
!  * (seconds past midnight Jan 1, 1970), but the rest of this program wants
!  * everything in symbolic form, and I'm not about to change that.
!  */
! /*
! Quoting from RFC 822:
!      5.  DATE AND TIME SPECIFICATION
! 
!      5.1.  SYNTAX
! 
!      date-time   =  [ day "," ] date time        ; dd mm yy
! 						 ;  hh:mm:ss zzz
! 
!      day         =  "Mon"  / "Tue" /  "Wed"  / "Thu"
! 		 /  "Fri"  / "Sat" /  "Sun"
! 
!      date        =  1*2DIGIT month 2DIGIT        ; day month year
! 						 ;  e.g. 20 Jun 82
! 
!      month       =  "Jan"  /  "Feb" /  "Mar"  /  "Apr"
! 		 /  "May"  /  "Jun" /  "Jul"  /  "Aug"
! 		 /  "Sep"  /  "Oct" /  "Nov"  /  "Dec"
! 
!      time        =  hour zone                    ; ANSI and Military
! 
!      hour        =  2DIGIT ":" 2DIGIT [":" 2DIGIT]
! 						 ; 00:00:00 - 23:59:59
! 
!      zone        =  "UT"  / "GMT"                ; Universal Time
! 						 ; North American : UT
! 		 /  "EST" / "EDT"                ;  Eastern:  - 5/ - 4
! 		 /  "CST" / "CDT"                ;  Central:  - 6/ - 5
! 		 /  "MST" / "MDT"                ;  Mountain: - 7/ - 6
! 		 /  "PST" / "PDT"                ;  Pacific:  - 8/ - 7
! 		 /  1ALPHA                       ; Military: Z = UT;
! 						 ;  A:-1; (J not used)
! 						 ;  M:-12; N:+1; Y:+12
! 		 / ( ("+" / "-") 4DIGIT )        ; Local differential
! 						 ;  hours+min. (HHMM)
! */
! 
! #define SKIP_WS(p) while (isspace(*p)) p++
! #define SKIP_ALPHA(p) while (isalpha(*p)) p++
! #define SKIP_DIGITS(p) while (isdigit(*p)) p++
! 
! static char *day_name[8] = {
!     "sun", "mon", "tue", "wed", "thu", "fri", "sat", 0
! };
! 
! static char *month_name[13] = {
!     "jan", "feb", "mar", "apr",
!     "may", "jun", "jul", "aug",
!     "sep", "oct", "nov", "dec", 0
! };
! 
! static int month_len[12] = {
!     31, 28, 31, 30, 31, 30, 31,
!     31, 30, 31, 30, 31 };
! 
! /* The following time zones are taken from a variety of sources.  They
!  * are by no means exhaustive, but seem to include most of those
!  * in common usage.  A comprehensive list is impossible, since the same
!  * abbreviation is sometimes used to mean different things in different
!  * parts of the world.
!  */
! static struct tzone {
!     char *str;
!     int offset; /* offset, in minutes, EAST of GMT */
! } tzone_info[] = {
!     /* the following are from rfc822 */
!     "ut", 0, "gmt", 0,
!     "est", -5*60, "edt", -4*60,
!     "cst", -6*60, "cdt", -5*60,
!     "mst", -7*60, "mdt", -6*60,
!     "pst", -8*60, "pdt", -7*60,
!     "z", 0, /* zulu time (the rest of the military codes are bogus) */
! 
!     /* these are also popular in Europe */
!     "wet", 0*60, "wet dst", 1*60, /* western european */
!     "met", 1*60, "met dst", 2*60, /* middle european */
!     "eet", 2*60, "eet dst", 3*60, /* eastern european */
!     "bst", 1*60, /* ??? british summer time (=+0100) */
! 
!     /* ... and Canada */
!     "ast", -4*60, "adt", -3*60, /* atlantic */
!     "nst", -3*60-30, "ndt", -2*60-30, /* newfoundland */
!     "yst", -9*60, "ydt", -8*60, /* yukon */
!     "hst", -10*60, /* hawaii (not really canada) */
! 
!     /* ... and Asia */
!     "jst", 9*60, /* japan */
!     "sst", 8*60, /* singapore */
! 
!     /* ... and the South Pacific */
!     "nzst", 12*60, "nzdt", 13*60, /* new zealand */
!     "wst", 8*60, "wdt", 9*60, /* western australia */
!     /* there's also central and eastern australia, but they insist on using
!      * cst, est, etc., which would be indistinguishable for the us zones */
!      (char *)0, 0
! };
! 
! /* Translate a symbolic timezone name (e.g. EDT or NZST) to a number of
!  * minutes *east* of gmt (if the local time is t, the gmt equivalent is
!  * t - tz_lookup(zone)).
!  * Return 0 if the timezone is not recognized.
!  */
! static int tz_lookup(str)
! char *str;
! {
!     struct tzone *p; 
! 
!     for (p = tzone_info; p->str; p++) {
! 	if (strcmp(p->str,str)==0) return p->offset;
!     }
!     dprint(5,(debugfile,"unknown time zone %s\n",str));
!     return 0;
! }
! 
! /* Return smallest i such that table[i] is a prefix of str.  Return -1 if not
!  * found.
!  */
! static int prefix(table, str)
! char **table;
! char *str;
! {
!     int i;
! 
!     for (i=0;table[i];i++)
! 	if (strncmp(table[i],str,strlen(*table))==0)
! 	    return i;
!     return -1;
! }
! 
! /* The following routines, get_XXX(p,...), expect p to point to a string
!  * of the appropriate syntax.  They return decoded values in result parameters,
!  * and return p updated to point past the parsed substring (also stripping
!  * trailing whitespace).
!  * Return 0 on syntax errors.
!  */
! 
! /* Parse a year: ['1' '9'] digit digit WS
!  */
! static char *
! get_year(p, result)
! char *p;
! int *result;
! {
!     int year;
! 
!     if (!isdigit(*p)) {
! 	dprint(5,(debugfile,"missing year: %s\n",p));
! 	return 0;
!     }
!     year = atoi(p);
!     /* be nice and allow 19xx, although that's not really kosher */
!     if (year>=1900 && year <=1999) year -= 1900;
!     if (year<0 || year>99) {
! 	dprint(5,(debugfile,"ridiculous year %d\n",year));
! 	return 0;
!     }
!     SKIP_DIGITS(p);
!     SKIP_WS(p);
!     *result = year;
!     return p;
! }
! 
! /* Parse a time: hours ':' minutes [ ':' seconds ] WS
!  * Check that 0<=hours<24, 0<=minutes,seconds<60.
!  * Also allow the syntax "digit digit digit digit" with implied ':' in the
!  * middle.
!  * Convert to minutes and seconds, with results in (*m,*s).
!  */
! static char *
! get_time(p,m,s)
! char *p;
! int *m, *s;
! {
!     int hours, minutes, seconds;
! 
!     /* hour */
!     if (!isdigit(*p)) {
! 	dprint(5,(debugfile,"missing time: %s\n",p));
! 	return 0;
!     }
!     hours = atoi(p);
!     SKIP_DIGITS(p);
!     if (*p++ != ':') {
! 	/* perhaps they just wrote hhmm instead of hh:mm */
! 	minutes = hours % 60;
! 	hours /= 60;
!     }
!     else {
! 	if (hours<0 || hours>23) {
! 	    dprint(5,(debugfile,"ridiculous hour: %d\n",hours));
! 	    return 0;
! 	}
! 	minutes = atoi(p);
! 	if (minutes<0 || minutes>59) {
! 	    dprint(5,(debugfile,"ridiculous minutes: %d\n",minutes));
! 	    return 0;
! 	}
!     }
!     SKIP_DIGITS(p);
!     if (*p == ':') {
! 	p++;
! 	seconds = atoi(p);
! 	if (seconds<0 || seconds>59) {
! 	    dprint(5,(debugfile,"ridiculous seconds: %d\n",seconds));
! 	    return 0;
! 	}
! 	SKIP_DIGITS(p);
!     }
!     else seconds = 0;
!     minutes += hours*60;
!     SKIP_WS(p);
!     *m = minutes;
!     *s = seconds;
!     return p;
! }
! 
! /* Parse a Unix date from which the leading week-day has been stripped.
!  * The syntax is "Jun 21 06:45:44 CDT 1989" with timezone optional.
!  * i.e., month day time [ zone ] year
!  * where day::=digit*, year and time are as defined above,
!  * and month and zone are alpha strings starting with a known 3-char prefix.
!  * The month has already been processed by the caller, so we just skip over
!  * a leading alpha* WS.
!  *
!  * Unlike the preceding routines, the result is not an updated pointer, but
!  * simply 1 for success and 0 for failure.
!  */
! static int
! get_unix_date(p,y,d,m,s,t)
! char *p;
! int *y, *d, *m, *s, *t;
! {
! 
!     SKIP_ALPHA(p);
!     SKIP_WS(p);
!     if (!isdigit(*p)) return 0;
!     *d = atoi(p);  /* check the value for sanity after we know the month */
!     SKIP_DIGITS(p);
!     SKIP_WS(p);
!     p = get_time(p,m,s);
!     if (!p) return 0;
!     if (isalpha(*p)) {
! 	*t = tz_lookup(p);
! 	SKIP_ALPHA(p);
! 	SKIP_WS(p);
!     }
!     else *t = 0;
!     p = get_year(p,y);
!     if (!p) return 0;
!     return 1;
! }
! 
! 
! /* Parse an rfc822 (with extensions) date.  Return 1 on success, 0 on failure.
!  */
! parse_arpa_date(string, date)
  char *string;
! struct date_rec *date;
  {
!     char buffer[BUFSIZ], *p, *q;
!     int mday, month, year, minutes, seconds, tz;
! 
!     /* first get everything into lower case */
!     for (p=buffer, q=buffer+sizeof buffer; *string && p<q; p++, string++) {
! 	*p = isupper(*string) ? tolower(*string) : *string;
!     }
!     *p = 0;
!     p = buffer;
!     SKIP_WS(p);
  
!     if (prefix(day_name,p)>=0) {
! 	SKIP_ALPHA(p);
! 	SKIP_WS(p);
! 
! 	if (*p==',') {
! 	    p++;
! 	    SKIP_WS(p);
  	}
+ 	/* A comma is required here, but we'll be nice guys and look the other
+ 	 * way if it's missing.
+ 	 */
+     }
  
!     /* date */
! 
!     /* day of the month */
!     if (!isdigit(*p)) {
! 	/* Missing day.  Maybe this is a Unix date?
! 	 */
! 	month = prefix(month_name,p);
! 	if (month >= 0 &&
! 	    get_unix_date(p, &year, &mday, &minutes, &seconds, &tz)) {
! 		goto got_date;
  	}
+ 	dprint(5,(debugfile,"missing day: %s\n",p));
+ 	return 0;
+     }
+     mday = atoi(p);  /* check the value for sanity after we know the month */
+     SKIP_DIGITS(p);
+     SKIP_WS(p);
  
!     /* month name */
!     month = prefix(month_name,p);
!     if (month < 0) {
! 	dprint(5,(debugfile,"missing month: %s\n",p));
! 	return 0;
!     }
!     SKIP_ALPHA(p);
!     SKIP_WS(p);
  
!     /* year */
!     if (!(p = get_year(p,&year))) return 0;
! 
!     /* time */
!     if (!(p = get_time(p,&minutes,&seconds))) return 0;
! 
!     /* zone */
!     for (q=p; *q && !isspace(*q); q++) continue;
!     *q = 0;
!     if (*p=='-' || *p=='+') {
! 	char sign = *p++;
! 
! 	if (isdigit(*p)) {
! 	    int i;
! 
! 	    for (i=0; i<4; i++) {
! 		if (!isdigit(p[i])) {
! 		    dprint(5,(debugfile,"ridiculous numeric timezone: %s\n",p));
! 		    return 0;
! 		}
! 		p[i] -= '0';
! 	    }
! 	    tz = (p[0]*10 + p[1])*60 + p[2]*10 + p[3];
! 	    if (sign=='-') tz = -tz;
! 	}
! 	else {
! 	    /* some brain-damaged dates use a '-' before a symbolic time zone */
! 	    SKIP_WS(p);
! 	    tz = tz_lookup(p);
! 	}
!     }
!     else tz = tz_lookup(p);
! 
! got_date:
!     month_len[1] = (year%4) ? 28 : 29;
!     if (mday<0 || mday>month_len[month]) {
! 	dprint(5,(debugfile,"ridiculous day %d of month %d\n",mday,month));
! 	return 0;
!     }
  
!     /* shift everything to UTC (aka GMT) */
!     minutes -= tz;
!     if (tz > 0) { /* east of Greenwich */
! 	if (minutes < 0) {
! 	    if (--mday < 0) {
! 		if (--month < 0) {
! 		    year--; /* don't worry about 1900! */
! 		    month = 11;
! 		}
! 		mday = month_len[month] - 1;
! 	    }
! 	    minutes += 24*60;
! 	}
!     }
!     if (tz < 0) { /* west of Greenwich */
! 	if (minutes >= 24*60) {
! 	    if (++mday >= month_len[month]) {
! 		if (++month >= 12) {
! 		    year++; /* don't worry about 1999! */
! 		    month = 0;
! 		}
! 		mday = 0;
! 	    }
! 	    minutes -= 24*60;
! 	}
!     }
  
!     date->month = month;
!     date->year = year;
!     if (date->year > 100) date->year -= 1900;
!     date->day = mday;
!     date->hour = minutes/60;
!     date->minute = minutes%60;
!     date->second = seconds;
!     return 1;
  }
  
  fix_arpa_address(address)
===================================================================
RCS file: src/aliasdb.c,v
retrieving revision 1.1
diff -c -r1.1 aliasdb.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:06 1989
--- src/aliasdb.c	Fri May 12 23:17:43 1989
***************
*** 101,106 ****
--- 98,105 ----
  	return;
  }
  
+ #if defined(OPTIMIZE_RETURN) || !defined(DONT_TOUCH_ADDRESSES)
+ 
  int
  expand_site(cryptic, expanded)
  char *cryptic, *expanded;
***************
*** 387,389 ****
--- 386,390 ----
  
  	return(NULL);			            /* failed if it's here! */
  }
+ 
+ #endif
===================================================================
RCS file: src/conn_to.c,v
retrieving revision 1.1
diff -c -r1.1 conn_to.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:06 1989
--- src/conn_to.c	Fri May 12 23:17:46 1989
***************
*** 36,41 ****
--- 33,40 ----
  
  #include "headers.h"
  
+ #ifndef DONT_TOUCH_ADDRESSES
+ 
  char *strcpy();
  
  get_connections()
***************
*** 180,182 ****
--- 179,183 ----
  
  	return(0);				/* it all went okay... */
  }
+ 
+ #endif
===================================================================
RCS file: src/domains.c,v
retrieving revision 1.1
diff -c -r1.1 domains.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:07 1989
--- src/domains.c	Fri May 12 23:17:48 1989
***************
*** 37,42 ****
--- 34,41 ----
  
  #include "headers.h"
  
+ #if defined(OPTIMIZE_RETURN) || !defined(DONT_TOUCH_ADDRESSES)
+ 
  #ifdef BSD
  # undef toupper
  # undef tolower
***************
*** 307,309 ****
--- 306,310 ----
  
  	return( (char *) address);
  }
+ 
+ #endif
===================================================================
RCS file: src/elm.c,v
retrieving revision 1.1
diff -c -r1.1 elm.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:09 1989
--- src/elm.c	Sat Dec 23 20:08:34 1989
***************
*** 829,843 ****
  
  	Write_to_screen(buffer, 0);
  
! 	sprintf(buffer, "\n\rReceived on: %d/%d/%d at %d:%02d\n\r",
! 	        current_header->received.month+1,
! 	        current_header->received.day,
! 	        current_header->received.year,
! 	        current_header->received.hour,
! 	        current_header->received.minute);
  	Write_to_screen(buffer, 0);
  
! 	sprintf(buffer, "Message sent on: %s, %s %s, %s at %s\n\r",
  	        current_header->dayname,
  	        current_header->month,
  	        current_header->day,
--- 826,840 ----
  
  	Write_to_screen(buffer, 0);
  
! 	sprintf(buffer, "\n\rSent on: %d/%d/%d at %d:%02d\n\r",
! 	        current_header->sent.month+1,
! 	        current_header->sent.day,
! 	        current_header->sent.year,
! 	        current_header->sent.hour,
! 	        current_header->sent.minute);
  	Write_to_screen(buffer, 0);
  
! 	sprintf(buffer, "Message received on: %s, %s %s, %s at %s\n\r",
  	        current_header->dayname,
  	        current_header->month,
  	        current_header->day,
===================================================================
RCS file: src/init.c,v
retrieving revision 1.1
diff -c -r1.1 init.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:11 1989
--- src/init.c	Sun Dec 24 16:23:00 1989
***************
*** 234,240 ****
--- 253,261 ----
  #else
  	gethostname(hostname, sizeof(hostname));
  #endif
+ #if defined(OPTIMIZE_RETURN) || !defined(DONT_TOUCH_ADDRESSES) || !defined(INTERNET)
  	gethostdomain(hostdomain, sizeof(hostdomain));
+ #endif
  
  	/* Determine the default mail file name.
  	 * 
===================================================================
RCS file: src/leavembox.c,v
retrieving revision 1.1
diff -c -r1.1 leavembox.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:12 1989
--- src/leavembox.c	Thu Dec 14 23:47:07 1989
***************
*** 292,297 ****
--- 303,309 ----
  	  return(0);
  	}
  
+ #ifdef undef
  	/** we have to check to see what the sorting order was...so that
  	    the order in which we write messages is the same as the order
  	    of the messages originally.
***************
*** 304,309 ****
--- 316,322 ----
  	  sort_mailbox(message_count, FALSE);
  	  sortby = last_sortby;
  	}
+ #endif
  
  	/* Formulate message as to number of keeps, stores, and deletes.
  	 * This is only complex so that the message is good English.
===================================================================
RCS file: src/newmbox.c,v
retrieving revision 1.1
diff -c -r1.1 newmbox.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:15 1989
--- src/newmbox.c	Sat Dec 23 21:31:38 1989
***************
*** 384,395 ****
  	      struct header_rec **new_headers;
  	      int new_max;
  
- 	      new_max = max_headers + KLICK;
  	      if (max_headers == 0) {
  		new_headers = (struct header_rec **)
  		  malloc(new_max * sizeof(struct header_rec *));
  	      }
  	      else {
  		new_headers = (struct header_rec **)
  		  realloc(headers, new_max * sizeof(struct header_rec *));
  	      }
--- 403,415 ----
  	      struct header_rec **new_headers;
  	      int new_max;
  
  	      if (max_headers == 0) {
+ 		new_max = 3 * KLICK;
  		new_headers = (struct header_rec **)
  		  malloc(new_max * sizeof(struct header_rec *));
  	      }
  	      else {
+ 	        new_max = max_headers + KLICK;
  		new_headers = (struct header_rec **)
  		  realloc(headers, new_max * sizeof(struct header_rec *));
  	      }
***************
*** 504,510 ****
  	    /** when it was sent... **/
  
  	    else if (first_word(buffer, "Date:")) 
! 	      parse_arpa_date(buffer, current_header);
  
  	    /** some status things about the message... **/
  
--- 525,531 ----
  	    /** when it was sent... **/
  
  	    else if (first_word(buffer, "Date:")) 
! 	      parse_arpa_date(buffer + 5, &current_header->sent);
  
  	    /** some status things about the message... **/
  
***************
*** 598,603 ****
--- 619,635 ----
  	}
  	else 
            rewind(mailfile);
+ 
+ 	{
+ 	  register unsigned excess;
+ 
+ 	  for (excess = count; excess < max_headers; excess++)
+ 	    if (headers[excess] != NULL)
+ 	      {
+ 		(void) free(headers[excess]);
+ 		headers[excess] = NULL;
+ 	      }
+ 	}
  
  	/* Sort folder *before* we establish the current message, so that
  	 * the current message is based on the post-sort order.
===================================================================
RCS file: src/opt_utils.c,v
retrieving revision 1.1
diff -c -r1.1 opt_utils.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:16 1989
--- src/opt_utils.c	Tue Nov 14 23:40:42 1989
***************
*** 107,112 ****
--- 108,114 ----
  
  #endif  /* GETHOSTNAME */
  
+ #if !defined(DONT_ADD_FROM) && defined(OPTIMIZE_RETURN)
  
  gethostdomain(hostdom, size)    /* get domain of current host */
  char *hostdom;
***************
*** 137,142 ****
--- 139,145 ----
  	return 0;
  }
  
+ #endif
  
  #ifdef NEED_CUSERID
  
===================================================================
RCS file: src/read_rc.c,v
retrieving revision 1.1
diff -c -r1.1 read_rc.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:17 1989
--- src/read_rc.c	Sat Dec 23 22:43:10 1989
***************
*** 683,694 ****
  	    allocation!
  	**/
  
! 	static char *default_list[] = { ">From", "In-Reply-To:",
! 		       "References:", "Newsgroups:", "Received:",
! 		       "Apparently-To:", "Message-Id:", "Content-Type:",
! 		       "From", "X-Mailer:", "Status:",
! 		       "*end-of-defaults*", NULL
! 		     };
  
  	for (weedcount = 0; default_list[weedcount] != (char *) 0;weedcount++){
  	  if ((weedlist[weedcount] = 
--- 683,695 ----
  	    allocation!
  	**/
  
! 	static char *default_list[] = {
! 	    "From", ">From:", "Message-I", "Sender:", "X-",
! 	    "Received:", "Status:", "To:", "Reply-To:",
! 	    "Posted", "Path:", "Xref:", "Mmdf", "Nf-", "Approved:",
! 	    "Distribution:", "References:", "In-",
! 	    "*end-of-defaults*", NULL
! 	};
  
  	for (weedcount = 0; default_list[weedcount] != (char *) 0;weedcount++){
  	  if ((weedlist[weedcount] = 
===================================================================
RCS file: src/screen.c,v
retrieving revision 1.1
diff -c -r1.1 screen.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:19 1989
--- src/screen.c	Sat Dec 23 21:50:41 1989
***************
*** 284,289 ****
--- 284,290 ----
  char *from;
  {
  char *strchr();
+ extern char *arpa_monname[];
  
  	/** Build in buffer the message header ... entry is the current
  	    message entry, 'from' is a modified (displayable) from line, 
***************
*** 326,333 ****
  		show_status(entry->status),
  		(entry->status & TAGGED?  '+' : ' '),
  	        message_number,
! 	        entry->month, 
! 		atoi(entry->day));
  
  	/* show "To " in a way that it can never be truncated. */
  	if (really_to) {
--- 327,334 ----
  		show_status(entry->status),
  		(entry->status & TAGGED?  '+' : ' '),
  	        message_number,
! 	        arpa_monname[entry->sent.month], 
! 		entry->sent.day);
  
  	/* show "To " in a way that it can never be truncated. */
  	if (really_to) {
===================================================================
RCS file: src/sort.c,v
retrieving revision 1.1
diff -c -r1.1 sort.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:19 1989
--- src/sort.c	Sat Dec 23 19:57:24 1989
***************
*** 74,106 ****
  	    Received date is keyed on the file offsets (think about it)
  	    Sender uses the truncated from line, same as "build headers",
  	    and size and subject are trivially obvious!!
! 	    (actually, subject has been modified to ignore any leading
! 	    patterns [rR][eE]*:[ \t] so that replies to messages are
! 	    sorted with the message (though a reply will always sort to
! 	    be 'greater' than the basenote)
  	 **/
  
! 	char from1[SLEN], from2[SLEN];	/* sorting buffers... */
! 	struct header_rec *first, *second;
! 	int ret;
  	
  	first = *p1;
  	second = *p2;
  
! 	switch (abs(sortby)) {
! 	case SENT_DATE:
! 		ret = compare_dates(first, second);
! 		break;
! 
  	case RECEIVED_DATE:
! 		ret = compare_parsed_dates(first->received, second->received);
  		break;
  
  	case SENDER:
  		tail_of(first->from, from1, first->to);
  		tail_of(second->from, from2, second->to);
  		ret = strcmp(from1, from2);
  		break;
  
  	case SIZE:
  		ret = (first->lines - second->lines);
--- 71,103 ----
  	    Received date is keyed on the file offsets (think about it)
  	    Sender uses the truncated from line, same as "build headers",
  	    and size and subject are trivially obvious!!
! 	    Subjects are compared only on their body, ignoring any leading
! 	    "Re: " type prefix. Since we always compare by sent date
! 	    as well, articles that start a thread with the same subject
! 	    will come first naturally, which is good, because not all
! 	    postnews versions put in a "Re: " type prefix on followups.
  	 **/
  
! 	register struct header_rec *first, *second;
! 	int ret = 0;
! 	int how = abs(sortby);
  	
  	first = *p1;
  	second = *p2;
  
! 	if (how != SENT_DATE) switch (how) {
  	case RECEIVED_DATE:
! 		ret = compare_dates(first, second);
  		break;
  
  	case SENDER:
+ 	{
+ 		char from1[SLEN], from2[SLEN];	/* sorting buffers... */
  		tail_of(first->from, from1, first->to);
  		tail_of(second->from, from2, second->to);
  		ret = strcmp(from1, from2);
  		break;
+ 	}
  
  	case SIZE:
  		ret = (first->lines - second->lines);
***************
*** 111,119 ****
  		break;
  
  	case SUBJECT:
! 		/* need some extra work 'cause of STATIC buffers */
! 		strcpy(from1, skip_re(shift_lower(first->subject)));
! 		ret = strcmp(from1, skip_re(shift_lower(second->subject)));
  		break;
  
  	case STATUS:
--- 108,114 ----
  		break;
  
  	case SUBJECT:
! 		ret = strcmp(skip_re(first->subject),skip_re(second->subject));
  		break;
  
  	case STATUS:
***************
*** 126,135 ****
  		break;
  	}
  
! 	if (sortby < 0)
! 	  ret = -ret;
  
! 	return ret;
  }
  
  char *sort_name(type)
--- 121,130 ----
  		break;
  	}
  
! 	if (ret == 0)
! 	    ret = compare_parsed_dates(first->sent, second->sent);
  
! 	return (sortby < 0) ? -ret : ret;
  }
  
  char *sort_name(type)
***************
*** 236,283 ****
  }
  
  char *skip_re(string)
! char *string;
  {
  	/** this routine returns the given string minus any sort of
  	    "re:" prefix.  specifically, it looks for, and will
  	    remove, any of the pattern:
  
! 		( [Rr][Ee][^:]:[ ] ) *
  
  	    If it doesn't find a ':' in the line it will return it
  	    intact, just in case!
  	**/
! 
! 	static char buffer[SLEN];
! 	register int i=0;
  
! 	while (whitespace(string[i])) i++;
  
! 	do {
! 	  if (string[i] == '\0') return( (char *) string);	/* forget it */
! 
! 	  if (string[i] != 'r' || string[i+1] != 'e') 
! 	    return( (char *) string);				/*   ditto   */
! 
! 	  i += 2;	/* skip the "re" */
! 
! 	  while (string[i] != ':') 
! 	    if (string[i] == '\0')
! 	      return( (char *) string);		      /* no colon in string! */
! 	    else
! 	      i++;
! 
! 	  /* now we've gotten to the colon, skip to the next non-whitespace  */
! 
! 	  i++;	/* past the colon */
! 
! 	  while (whitespace(string[i])) i++;
! 
! 	} while (string[i] == 'r' && string[i+1] == 'e');
!  
! 	/* and now copy it into the buffer and sent it along... */
  
! 	strcpy(buffer, (char *) string + i);
  
! 	return( (char *) buffer);
  }
--- 231,282 ----
  }
  
  char *skip_re(string)
! register char *string;
  {
  	/** this routine returns the given string minus any sort of
  	    "re:" prefix.  specifically, it looks for, and will
  	    remove, any of the pattern:
  
! 		"[ \t]*([Rr][Ee][^:]*:[ \t]*)+"
  
  	    If it doesn't find a ':' in the line it will return it
  	    intact, just in case!
  	**/
! 	char *subject;
  
! 	while (*string != '\0' && whitespace(*string)) string++;
! 	/* [ \t] recognized */
  
! 	/* We return the subject minus the leading space, as we also
! 	remove any leading white space after the colon later on. */
! 	subject = string;
! 
! 	if (*string == '\0') return subject;
! 
! 	if (tolower(*string) != 'r' || tolower(*(string+1)) != 'e') 
! 	  return subject;
! 
! 	do {
! 
! 	  string++; string++;
! 	  /* [Rr][Ee] skipped */
! 
! 	  while (*string != '\0' && *string != ':') string++;
! 	  /* [^:]* skipped */
! 
! 	  if (*string == '\0') return subject;
! 
! 	  string++;
! 	  /* : skipped */
! 
! 	  while (*string != '\0' && whitespace(*string)) string++;
! 	  /* [ \t]* skipped */
! 
! 	  subject = string;
! 
! 	  if (*string == '\0') return subject;
  
! 	} while (tolower(*string) == 'r' && tolower(*(string+1)) == 'e');
  
! 	return subject;
  }
===================================================================
RCS file: src/strings.c,v
retrieving revision 1.1
diff -c -r1.1 strings.c
*** /tmp/,RCSt1a22077	Sun Dec 24 16:28:20 1989
--- src/strings.c	Fri May 12 23:17:59 1989
***************
*** 105,110 ****
--- 102,108 ----
  	register int loc, i = 0, cnt = 0, using_to = 0;
  	
  #ifndef INTERNET
+ # ifdef USE_DOMAIN
  	
  	/** let's see if we have an address appropriate for hacking: 
  	    what this actually does is remove the spuriously added
***************
*** 116,121 ****
--- 114,120 ----
  	if (chloc(from,'!') != -1 && in_string(from, buffer))
  	   from[strlen(from)-strlen(buffer)] = '\0';
  
+ # endif
  #endif
  
  	for (loc = strlen(from)-1; loc >= 0 && cnt < 2; loc--) {
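
For anyone who wants to try the subject sort outside elm, here is a minimal
standalone sketch of the idea behind the sort.c changes: strip leading
white space and any run of "Re...:" prefixes, compare what is left, and
fall back on the sent date when the subjects tie.  It is not the patched
code itself.  The names struct msg, is_blank, skip_re_sketch and
compare_by_subject are made up for the illustration, and the sent date is
assumed to be a plain number of seconds rather than elm's date_rec.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for elm's header_rec: only what the sketch needs. */
struct msg {
    const char *subject;
    long sent;              /* sent date in seconds (an assumption) */
};

static int is_blank(int c) { return c == ' ' || c == '\t'; }

/* Return a pointer past leading blanks and any run of "Re...:" prefixes.
   If a "Re" is not followed by a ':' the text reached so far is returned
   as is, so "Recipe book" comes back untouched. */
static const char *skip_re_sketch(const char *s)
{
    const char *subject;

    while (is_blank((unsigned char)*s)) s++;
    subject = s;

    while (tolower((unsigned char)s[0]) == 'r' &&
           tolower((unsigned char)s[1]) == 'e') {
        s += 2;                                 /* [Rr][Ee] */
        while (*s != '\0' && *s != ':') s++;    /* [^:]*    */
        if (*s == '\0') return subject;         /* no colon, give up */
        s++;                                    /* ':'      */
        while (is_blank((unsigned char)*s)) s++;
        subject = s;
    }
    return subject;
}

/* Primary key: subject without prefixes.  Secondary key: sent date. */
static int compare_by_subject(const struct msg *a, const struct msg *b)
{
    int ret = strcmp(skip_re_sketch(a->subject), skip_re_sketch(b->subject));

    if (ret == 0)
        ret = (a->sent > b->sent) - (a->sent < b->sent);
    return ret;
}
```

With this, "Re: elm patch" sent later sorts after "elm patch" sent earlier,
whether or not the followup carries a prefix.  Note that the comparison is
case sensitive, as in the patched SUBJECT case above.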

-- 
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk