[news.software.b] C expire + B news

fletcher@cs.utexas.edu (Fletcher Mattox) (08/14/89)

Here's what I did to shoehorn C expire into 2.11 news.
My motivation was not just speed, but the desire to keep
expire from locking up news for hours while our 20 NNTP
neighbors filled up /usr/spool/news/.rnews with duplicate
articles.

File locking is provided only for UNIXes with flock(2), though it
shouldn't be difficult to accommodate other locking primitives.

Fletcher


# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by cs.utexas.edu!fletcher on Mon Aug 14 09:11:47 CDT 1989
# Contents:  README Patch.bnews Patch.cnews flock.c
 
echo x - README
sed 's/^@//' > "README" <<'@//E*O*F README//'
It was easy to get C news expire to work with B news.
Here's what I did.

Apply Patch.bnews and Patch.cnews.

Arrange for CEXPIRE to be defined in B news and for
BNEWS to be defined in expire.c.  

I chose to keep the B news date format in the history file
(I have local scripts which parse this field), so that a typical
history line looks like:

<644@uakari.primate.wisc.edu>	08/04/89 15:59~-	comp.terminals/1661 

You can make expire faster (and perhaps make history more aesthetic,
depending on your point of view) by converting the date to seconds.

Also, I gratuitously changed expire to ignore malformed history
lines rather than to quit when it found one.

Regarding message-id case:  B news converts ids to lower case before
writing the dbm file.  C news does not.  I chose to go with B news'
convention.  If you ever have to rebuild the history file from
scratch, with C news' mkhistory, you'll need to add a strlower()
to mkdbm.c.

Rhetorical questions: Why does B news downcase message ids?  Why
doesn't C news?  (Actually, I think I know the answer to the second
question.)

Regarding locking:  Rather than drag all the B news #ifdefs into
C news, I just brought over the 4BSD flocking code--that's all I needed.
If you can't use that, you'll have to write your own newslock() and
newsunlock().  C expire only needs to lock the news system during its
last read(2) of the history file--less than one second.  Where you
*really* need locking is during upact, which can take many minutes.  I
used flock.c for that, i.e. my doexpire looks something like:

	cd $NEWSLIB
	./expire -v explist
	./flock active upact
	rnews -U

Fletcher
@//E*O*F README//
chmod u=rw,g=rw,o=r README
 
echo x - Patch.bnews
sed 's/^@//' > "Patch.bnews" <<'@//E*O*F Patch.bnews//'
*** inews.c.OLD	Thu Aug  3 16:22:31 1989
--- inews.c	Fri Jul 28 17:46:31 1989
***************
*** 949,954
  	int is_invalid = FALSE;
  	int exitcode = 0;
  	long now;
  #ifdef DOXREFS
  	register char *nextref = header.xref;
  #endif /* DOXREFS */

--- 949,955 -----
  	int is_invalid = FALSE;
  	int exitcode = 0;
  	long now;
+ 	char ebuf[DATELEN+2];
  #ifdef DOXREFS
  	register char *nextref = header.xref;
  #endif /* DOXREFS */
***************
*** 964,969
  
  	(void) time(&now);
  	tm = gmtime(&now);
  	if (header.expdate[0])
  		addhist(" ");
  #ifdef USG

--- 965,974 -----
  
  	(void) time(&now);
  	tm = gmtime(&now);
+ #ifdef CEXPIRE
+ 	sprintf(ebuf, "~%s", header.expdate[0] ? header.expdate : "-");
+ #else
+ 	ebuf[0] = '\0';
  	if (header.expdate[0])
  		addhist(" ");
  #endif /* CEXPIRE */
***************
*** 966,971
  	tm = gmtime(&now);
  	if (header.expdate[0])
  		addhist(" ");
  #ifdef USG
  	sprintf(bfr,"%2.2d/%2.2d/%d %2.2d:%2.2d\t",
  #else /* !USG */

--- 971,977 -----
  	ebuf[0] = '\0';
  	if (header.expdate[0])
  		addhist(" ");
+ #endif /* CEXPIRE */
  #ifdef USG
  	sprintf(bfr,"%2.2d/%2.2d/%d %2.2d:%2.2d%s\t",
  #else /* !USG */
***************
*** 967,973
  	if (header.expdate[0])
  		addhist(" ");
  #ifdef USG
! 	sprintf(bfr,"%2.2d/%2.2d/%d %2.2d:%2.2d\t",
  #else /* !USG */
  	sprintf(bfr,"%02d/%02d/%d %02d:%02d\t",
  #endif /* !USG */

--- 973,979 -----
  		addhist(" ");
  #endif /* CEXPIRE */
  #ifdef USG
! 	sprintf(bfr,"%2.2d/%2.2d/%d %2.2d:%2.2d%s\t",
  #else /* !USG */
  	sprintf(bfr,"%02d/%02d/%d %02d:%02d%s\t",
  #endif /* !USG */
***************
*** 969,975
  #ifdef USG
  	sprintf(bfr,"%2.2d/%2.2d/%d %2.2d:%2.2d\t",
  #else /* !USG */
! 	sprintf(bfr,"%02d/%02d/%d %02d:%02d\t",
  #endif /* !USG */
  		tm->tm_mon+1, tm->tm_mday, tm->tm_year,tm->tm_hour, tm->tm_min);
  	addhist(bfr);

--- 975,981 -----
  #ifdef USG
  	sprintf(bfr,"%2.2d/%2.2d/%d %2.2d:%2.2d%s\t",
  #else /* !USG */
! 	sprintf(bfr,"%02d/%02d/%d %02d:%02d%s\t",
  #endif /* !USG */
  		tm->tm_mon+1, tm->tm_mday, tm->tm_year, tm->tm_hour, 
  		tm->tm_min, ebuf);
***************
*** 971,977
  #else /* !USG */
  	sprintf(bfr,"%02d/%02d/%d %02d:%02d\t",
  #endif /* !USG */
! 		tm->tm_mon+1, tm->tm_mday, tm->tm_year,tm->tm_hour, tm->tm_min);
  	addhist(bfr);
  	log("%s %s ng %s subj '%s' from %s", spool_news != DONT_SPOOL
  		? "queued" : (mode==PROC ? "received" : "posted"),

--- 977,984 -----
  #else /* !USG */
  	sprintf(bfr,"%02d/%02d/%d %02d:%02d%s\t",
  #endif /* !USG */
! 		tm->tm_mon+1, tm->tm_mday, tm->tm_year, tm->tm_hour, 
! 		tm->tm_min, ebuf);
  	addhist(bfr);
  	log("%s %s ng %s subj '%s' from %s", spool_news != DONT_SPOOL
  		? "queued" : (mode==PROC ? "received" : "posted"),
*** control.c.OLD	Thu Aug  3 16:25:39 1989
--- control.c	Fri Jul 28 17:44:25 1989
***************
*** 626,631
  	char whatsisname[BUFLEN], nfilename[BUFLEN];
  	time_t t;
  	int su = 0;
  #ifndef u370
  	struct hbuf htmp;
  #endif /* !u370 */

--- 626,632 -----
  	char whatsisname[BUFLEN], nfilename[BUFLEN];
  	time_t t;
  	int su = 0;
+ 	char ebuf[3];
  #ifndef u370
  	struct hbuf htmp;
  #endif /* !u370 */
***************
*** 640,645
  		log("Can't cancel %s:  non-existent", argv[1]);
  		(void) time(&t);
  		tm = localtime(&t);
  #ifdef USG
  		sprintf(bfr,"%s\t%2.2d/%2.2d/%d %2.2d:%2.2d\tcancelled",
  #else /* !USG */

--- 641,651 -----
  		log("Can't cancel %s:  non-existent", argv[1]);
  		(void) time(&t);
  		tm = localtime(&t);
+ #ifdef CEXPIRE
+ 	strcpy(ebuf, "~-");
+ #else
+ 	ebuf[0] = '\0';
+ #endif /* CEXPIRE */
  #ifdef USG
  		sprintf(bfr,"%s\t%2.2d/%2.2d/%d %2.2d:%2.2d%s\tcancelled",
  #else /* !USG */
***************
*** 641,647
  		(void) time(&t);
  		tm = localtime(&t);
  #ifdef USG
! 		sprintf(bfr,"%s\t%2.2d/%2.2d/%d %2.2d:%2.2d\tcancelled",
  #else /* !USG */
  		sprintf(bfr,"%s\t%02d/%02d/%d %02d:%02d\tcancelled",
  #endif /* !USG */

--- 647,653 -----
  	ebuf[0] = '\0';
  #endif /* CEXPIRE */
  #ifdef USG
! 		sprintf(bfr,"%s\t%2.2d/%2.2d/%d %2.2d:%2.2d%s\tcancelled",
  #else /* !USG */
  		sprintf(bfr,"%s\t%02d/%02d/%d %02d:%02d%s\tcancelled",
  #endif /* !USG */
***************
*** 643,649
  #ifdef USG
  		sprintf(bfr,"%s\t%2.2d/%2.2d/%d %2.2d:%2.2d\tcancelled",
  #else /* !USG */
! 		sprintf(bfr,"%s\t%02d/%02d/%d %02d:%02d\tcancelled",
  #endif /* !USG */
  		   argv[1], tm->tm_mon+1, tm->tm_mday, tm->tm_year, tm->tm_hour,
  		   tm->tm_min);

--- 649,655 -----
  #ifdef USG
  		sprintf(bfr,"%s\t%2.2d/%2.2d/%d %2.2d:%2.2d%s\tcancelled",
  #else /* !USG */
! 		sprintf(bfr,"%s\t%02d/%02d/%d %02d:%02d%s\tcancelled",
  #endif /* !USG */
  		   argv[1], tm->tm_mon+1, tm->tm_mday, tm->tm_year, tm->tm_hour,
  		   tm->tm_min, ebuf);
***************
*** 646,652
  		sprintf(bfr,"%s\t%02d/%02d/%d %02d:%02d\tcancelled",
  #endif /* !USG */
  		   argv[1], tm->tm_mon+1, tm->tm_mday, tm->tm_year, tm->tm_hour,
! 		   tm->tm_min);
  		savehist(bfr);
  		return -1;
  	}

--- 652,658 -----
  		sprintf(bfr,"%s\t%02d/%02d/%d %02d:%02d%s\tcancelled",
  #endif /* !USG */
  		   argv[1], tm->tm_mon+1, tm->tm_mday, tm->tm_year, tm->tm_hour,
! 		   tm->tm_min, ebuf);
  		savehist(bfr);
  		return -1;
  	}
@//E*O*F Patch.bnews//
chmod u=rw,g=rw,o=r Patch.bnews
 
echo x - Patch.cnews
sed 's/^@//' > "Patch.cnews" <<'@//E*O*F Patch.cnews//'
*** expire.c.OLD	Thu Aug  3 18:51:46 1989
--- expire.c	Thu Aug  3 18:41:52 1989
***************
*** 425,430
  	datum lhs;
  	datum rhs;
  	register int ret;
  
  	cd(ctlfile((char *)NULL));
  	old = open("history", 0);

--- 425,431 -----
  	datum lhs;
  	datum rhs;
  	register int ret;
+ 	char name[512];		/* should be plenty of room for a msg-id */
  
  	cd(ctlfile((char *)NULL));
  	old = open("history", 0);
***************
*** 449,455
  			nameend = strchr(line, '\t');
  			if (nameend == NULL) {
  				errno = 0;
! 				fail("bad return from doline(): `%.75s'", line);
  			}
  
  			/* make the DBM entry */

--- 450,457 -----
  			nameend = strchr(line, '\t');
  			if (nameend == NULL) {
  				errno = 0;
! 				warning("bad return from doline(): `%.75s'", line);
! 				continue;
  			}
  
  			/* make the DBM entry */
***************
*** 454,460
  
  			/* make the DBM entry */
  			*nameend = '\0';
- 			lhs.dptr = line;
  			lhs.dsize = strlen(line)+1;
  			here = ftell(new);
  			rhs.dptr = (char *)&here;

--- 456,461 -----
  
  			/* make the DBM entry */
  			*nameend = '\0';
  			lhs.dsize = strlen(line)+1;
  #ifdef BNEWS
  			if (lhs.dsize > sizeof name) {
***************
*** 456,461
  			*nameend = '\0';
  			lhs.dptr = line;
  			lhs.dsize = strlen(line)+1;
  			here = ftell(new);
  			rhs.dptr = (char *)&here;
  			rhs.dsize = sizeof(here);

--- 457,473 -----
  			/* make the DBM entry */
  			*nameend = '\0';
  			lhs.dsize = strlen(line)+1;
+ #ifdef BNEWS
+ 			if (lhs.dsize > sizeof name) {
+ 				warning("message id too long: `%.75s'", line);
+ 				continue;
+ 			}
+ 			strcpy(name,line);
+ 			strlower(name);
+ 			lhs.dptr = name;
+ #else
+ 			lhs.dptr = line;
+ # endif
  			here = ftell(new);
  			rhs.dptr = (char *)&here;
  			rhs.dsize = sizeof(here);
***************
*** 1274,1276
  
  	return(result);
  }

--- 1286,1325 -----
  
  	return(result);
  }
+ 
+ #ifdef BNEWS
+ 
+ /*
+  * Lock the system, B news style.
+  */
+ 
+ #include <sys/syslog.h>
+ #include <sys/file.h>
+ static int fd;
+ 
+ static
+ void
+ newslock()
+ {
+ 	char *active = ctlfile("active");
+ 
+ 	openlog(progname, 0, LOG_NEWS);
+ 	if ((fd = open(active, 2)) < 0) {
+ 		perror(active);
+ 		exit(1);
+ 	}
+ 	if (flock(fd, LOCK_EX) < 0) {
+ 		perror("expire can't flock the active file");
+ 		exit(1);
+ 	}
+ 	syslog(LOG_INFO, "%s: locked", active);
+ }
+ 
+ static
+ void
+ newsunlock()
+ {
+ 	(void) close(fd);
+ 	syslog(LOG_INFO, "active file unlocked");
+ }
+ #endif		/* BNEWS*/
@//E*O*F Patch.cnews//
chmod u=rw,g=rw,o=r Patch.cnews
 
echo x - flock.c
sed 's/^@//' > "flock.c" <<'@//E*O*F flock.c//'
#include <stdio.h>
#include <sys/file.h>

main(argc, argv)
char **argv;
{
	char *file;
	int i;
	int fd;

	if (argc < 3) {
		fprintf(stderr, "Usage: %s file_to_lock command [args]\n", argv[0]);
		exit(1);
	}
	file = argv[1];
	if ((fd = open(file, 2)) < 0) {
		perror(file);
		exit(1);
	}
	if (flock(fd, LOCK_EX) < 0) {
		perror("flock");
		exit(1);
	}
	argv += 2;
#ifdef debug
	for (i = 0; argv[i] && argv[i][0]; i++)
		printf("argv[%d]= %s\n", i, argv[i]);
#endif /* debug */
	execvp(argv[0], argv);
	perror(argv[0]);
	exit(1);
}
@//E*O*F flock.c//
chmod u=rw,g=rw,o=r flock.c
 
exit 0
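
The README above notes that expire can be made faster by storing the
history date as seconds rather than in the B news MM/DD/YY HH:MM form.
A minimal sketch of that conversion follows. It assumes the field is in
GMT (the inews patch formats it from gmtime()) and that the year is
tm_year, i.e. years since 1900. The names hist_date_to_secs() and
days_from_civil() are hypothetical helpers, not part of B news or C news.

```c
#include <stdio.h>

/* Days since 1970-01-01 for a Gregorian date; y is the full year. */
static long days_from_civil(int y, int m, int d)
{
	long era, yoe, doy, doe;

	y -= m <= 2;				/* count Jan/Feb with the previous year */
	era = (y >= 0 ? y : y - 399) / 400;
	yoe = y - era * 400;			/* year within the 400-year era */
	doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
	doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
	return era * 146097 + doe - 719468;
}

/*
 * Convert a history date field such as "08/04/89 15:59~-" to seconds
 * since the epoch.  Anything after the minutes is ignored.
 * Returns -1 if the field is malformed.
 */
long hist_date_to_secs(const char *field)
{
	int mon, day, yr, hr, min;

	if (sscanf(field, "%d/%d/%d %d:%d", &mon, &day, &yr, &hr, &min) != 5)
		return -1;
	if (mon < 1 || mon > 12 || day < 1 || day > 31 ||
	    hr < 0 || hr > 23 || min < 0 || min > 59)
		return -1;
	return days_from_civil(1900 + yr, mon, day) * 86400L
		+ hr * 3600L + min * 60L;
}
```

Returning -1 instead of aborting matches the README's choice to skip
malformed history lines rather than quit on them.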

rick@uunet.UU.NET (Rick Adams) (08/14/89)

> Regarding message-id case:  B news converts ids to lower case before
> writing the dbm file.  C news does not.  I chose to go with B news'
> convention.  If you ever have to rebuild the history file from
> scratch, with C news' mkhistory, you'll need to add a strlower()
> to mkdbm.c.

B news does it because message-ids are supposed to be case insensitive
when compared. The simplest way to do that with dbm is to map
everything to lower case.

How does cnews keep the case independence without a similar hack?
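
The lower-case mapping described here can be sketched as a small helper
that builds the dbm key. This is only an illustration: msgid_key() is a
hypothetical name, not a B news or C news function, and the key size
includes the trailing NUL only because the expire.c patch above uses
strlen(line)+1.

```c
#include <ctype.h>
#include <string.h>

/*
 * Copy a message-id into buf, folded to lower case, for use as a dbm key.
 * Returns the key size including the trailing NUL, or 0 if the id does
 * not fit in buf (the caller can then warn and skip the line).
 */
size_t msgid_key(const char *msgid, char *buf, size_t bufsize)
{
	size_t len = strlen(msgid);
	size_t i;

	if (len + 1 > bufsize)
		return 0;
	for (i = 0; i < len; i++)
		buf[i] = (char)tolower((unsigned char)msgid[i]);
	buf[len] = '\0';
	return len + 1;
}
```

With keys built this way, two ids that differ only in case map to the
same dbm entry, which is the behaviour being asked about.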

henry@utzoo.uucp (Henry Spencer) (08/19/89)

In article <63816@uunet.UU.NET> rick@uunet.UU.NET (Rick Adams) writes:
>B news does it because message-ids are supposed to be case insensitive
>when compared. The simplest way to do that with dbm is to map
>everything to lower case.
>
>How does cnews keep the case independence without a similar hack?

C News treats message-ids as case-sensitive.  The issue is tricky; Geoff,
who is our RFCologist, reports that the case of message-ids is not addressed
in RFC1036, so RFC822 dominates.  And RFC822 does *NOT* say that message-ids
are case-insensitive, Rick's comments notwithstanding.

The reason the issue is tricky is that RFC822 doesn't say that they are
case-sensitive either.  It's worse.  A message-id is <stuff@domain>.  The
"domain" part is case-INSENSITIVE.  The "stuff" part is case-SENSITIVE,
except that all variations of "postmaster", e.g. "PoSTmAsTeR", compare
equal.  Lordy.		
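
For concreteness, the comparison rules just described might look
something like the sketch below. It is not code from B news or C news;
msgid_equal() and ci_equal_n() are hypothetical names, and the sketch
assumes ids of the form <stuff@domain> with the last '@' separating the
two parts.

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive comparison of the first n bytes; 1 if equal. */
static int ci_equal_n(const char *a, const char *b, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
			return 0;
	return 1;
}

/* Compare two message-ids under the rules above; 1 if equal, else 0. */
int msgid_equal(const char *a, const char *b)
{
	const char *at_a = strrchr(a, '@');
	const char *at_b = strrchr(b, '@');
	size_t la, lb;

	if (at_a == NULL || at_b == NULL)
		return strcmp(a, b) == 0;	/* no domain: exact match only */

	/* "@domain>" is compared without regard to case */
	if (strlen(at_a) != strlen(at_b) ||
	    !ci_equal_n(at_a, at_b, strlen(at_a)))
		return 0;

	la = (size_t)(at_a - a);
	lb = (size_t)(at_b - b);
	if (la != lb)
		return 0;

	/* every spelling of "<postmaster" compares equal */
	if (la == 11 && ci_equal_n(a, "<postmaster", 11) &&
	    ci_equal_n(b, "<postmaster", 11))
		return 1;

	/* otherwise the "<stuff" part is case-sensitive */
	return memcmp(a, b, la) == 0;
}
```

Note that a comparison like this cannot be reduced to a single key
transformation as simply as the all-lower-case mapping, since only part
of the id is folded.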

So B News is just as wrong as C News on this.  B2.10.1 and before treated
message-ids as case-sensitive, like C News.  B2.11 treats them as case-
insensitive.  Neither is right.

Perhaps we should implement the 822 rules, and be the first news system
to actually be correct.  We're a bit reluctant to do so, though.
We don't see that anything is gained by it.  News transmission, by any
route we know of, is not going to alter the case of message-ids.  (We
would be interested to hear any counterexamples.)  It seems to be an
unnecessary complication.
-- 
V7 /bin/mail source: 554 lines.|     Henry Spencer at U of Toronto Zoology
1989 X.400 specs: 2200+ pages. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu