chris@wugate.wustl.edu (Chris Myers) (08/31/89)
Here is a set of patches that I made to NNTP 1.5 and BNews 2.11 that helps
reduce the CPU time and I/O for transmitting news via NNTP.  I have been using
these patches on wuarchive.wustl.edu for several days now without any
difficulties.

If you use CNews, discard the patch named 'news-2.11-patch' and whenever I
mention a 'Z' flag below substitute 'n' instead.

The original nntpxmit program would open each message, scan the headers and
extract the message ID before sending the "IHAVE" command.  This actually uses
a fair bit of CPU time and causes a LOT of I/O.  Since rnews already knows the
message ID, it's much more efficient to have rnews write the message ID (and
the message path) into the batch file and just have NNTP read it back out
later.  This patch won't do a lot for lightly loaded NNTP servers, but will
help a well-connected site a BUNCH.

What this set of patches does is:

a) Modify rnews/inews to add a 'Z' flag in the sys file (used instead of the
   'F' flag).  The 'Z' option causes rnews to create batch file entries of the
   form: "<path> <messageID>" rather than just "<path>".

b) Modify nntpxmit to check for a batch file which has both the message path
   and message ID in the batch file.  If it finds both, it will not scan the
   message for the ID.  If it does NOT find both (i.e. the old batch file
   format) it will go ahead and scan for the message ID.  Thus the new
   nntpxmit is compatible with the old one and can operate with either type
   of batch file (or a batch file with BOTH types of entries for that matter).

Here is a shar file containing two patches.  The first, 'news-2.11-patch',
should be applied to ifuncs.c in ./news-2.11/src and the second,
'nntp-1.5-patch', is applied to nntpxmit.c in ./nntp-1.5/xmit.

Chris Myers
Washington University in St. Louis

-------------------------------- CUT HERE -----------------------------------
#!/bin/sh
# to extract, remove the header and type "sh filename"
if `test ! -s ./news-2.11-patch`
then
echo "writing ./news-2.11-patch"
cat > ./news-2.11-patch << '\End\Of\Shar\'
*** ifuncs.c	Tue Aug 29 19:13:27 1989
--- ../ifuncs.c	Tue Aug 29 19:32:37 1989
***************
*** 281,287 ****
  		int useexist = (index(sp->s_flags, 'U') != NULL);
  		/* I: append messageid to file. implies F flag */
  		int appmsgid = maynotify && (index(sp->s_flags, 'I') != NULL);
! 
  		/* allow specification based on size */
  		if ((size_ptr = strpbrk(sp->s_flags, "<>")) != NULL) {
  			struct stat stbuf;
--- 281,288 ----
  		int useexist = (index(sp->s_flags, 'U') != NULL);
  		/* I: append messageid to file. implies F flag */
  		int appmsgid = maynotify && (index(sp->s_flags, 'I') != NULL);
! 		/* Z: append name AND messageid to file. implies F flag */
! 		int appboth = (index(sp->s_flags, 'Z') != NULL);
  		/* allow specification based on size */
  		if ((size_ptr = strpbrk(sp->s_flags, "<>")) != NULL) {
  			struct stat stbuf;
***************
*** 362,368 ****
  					sp->s_name, oldid, hh.ident);
  		}
  
! 		if (appfile || appmsgid) {
  			if (firstbufname[0] == '\0') {
  				extern char histline[];
  				localize("junk");
--- 363,369 ----
  					sp->s_name, oldid, hh.ident);
  		}
  
! 		if (appfile || appmsgid || appboth) {
  			if (firstbufname[0] == '\0') {
  				extern char histline[];
  				localize("junk");
***************
*** 392,399 ****
  				ofp = fopen(sp->s_xmit, "a");
  				if (ofp == NULL)
  					xerror("Cannot append to %s", sp->s_xmit);
! 				fprintf(ofp, "%s", appmsgid ? hh.ident : firstbufname);
  
  #ifdef MULTICAST
  				while (--mc >= 0)
  					fprintf(ofp, " %s", *sysnames++);
--- 393,402 ----
  				ofp = fopen(sp->s_xmit, "a");
  				if (ofp == NULL)
  					xerror("Cannot append to %s", sp->s_xmit);
! 				if (!appboth) fprintf(ofp, "%s", appmsgid ? hh.ident : firstbufname);
+ 				else fprintf(ofp, "%s %s", firstbufname, hh.ident);
+ 
  
  #ifdef MULTICAST
  				while (--mc >= 0)
  					fprintf(ofp, " %s", *sysnames++);
\End\Of\Shar\
else
  echo "will not over write ./news-2.11-patch"
fi
if [ `wc -c ./news-2.11-patch | awk '{printf $1}'` -ne 1837 ]
then
echo `wc -c ./news-2.11-patch | awk '{print "Got " $1 ", Expected " 1837}'`
fi
if `test ! -s ./nntp-1.5-patch`
then
echo "writing ./nntp-1.5-patch"
cat > ./nntp-1.5-patch << '\End\Of\Shar\'
*** nntpxmit.c.old	Tue Aug 29 19:56:36 1989
--- nntpxmit.c	Tue Aug 29 19:56:40 1989
***************
*** 368,373 ****
--- 368,374 ----
  #else
  	char	*mode = "r";
  #endif	FTRUNCATE
+ 	char	mesgid[255];
  
  	if ((Qfp = fopen(file, mode)) == (FILE *)NULL) {
  		char	buf[BUFSIZ];
***************
*** 408,415 ****
  	*/
  	catchsig(interrupted);
  
! 	while((fp = getfp(Qfp, Article, sizeof(Article))) != (FILE *)NULL) {
! 		if (!sendarticle(host, fp)) {
  			(void) fclose(fp);
  			requeue(Article);
  			Article[0] = '\0';
--- 409,416 ----
  	*/
  	catchsig(interrupted);
  
! 	while((fp = getfp(Qfp, Article, sizeof(Article), mesgid)) != (FILE *)NULL) {
! 		if (!sendarticle(host, fp, mesgid)) {
  			(void) fclose(fp);
  			requeue(Article);
  			Article[0] = '\0';
***************
*** 449,463 ****
  ** Watch all network I/O for errors, return FALSE if
  ** the connection fails and we have to cleanup.
  */
! sendarticle(host, fp)
  char	*host;
  FILE	*fp;
  {
  	register int	code;
  	char	buf[BUFSIZ];
  	char	*e_xfer = "%s xfer: %s";
  
! 	switch(code = ihave(fp)) {
  	case CONT_XFER:
  		/*
  		** They want it. Give it to 'em.
--- 450,465 ----
  ** Watch all network I/O for errors, return FALSE if
  ** the connection fails and we have to cleanup.
  */
! sendarticle(host, fp, mesgid)
  char	*host;
  FILE	*fp;
+ char	*mesgid;
  {
  	register int	code;
  	char	buf[BUFSIZ];
  	char	*e_xfer = "%s xfer: %s";
  
! 	switch(code = ihave(fp, mesgid)) {
  	case CONT_XFER:
  		/*
  		** They want it. Give it to 'em.
***************
*** 753,766 ****
  ** Read the header of a netnews article, snatch the message-id therefrom,
  ** and ask the remote if they have that one already.
  */
! ihave(fp)
  FILE	*fp;
  {
  	register int	code;
  	register char	*id;
  	char	buf[BUFSIZ];
  
! 	if ((id = getmsgid(fp)) == (char *)NULL || *id == '\0') {
  		/*
  		** something botched locally with the article
  		** so we don't send it, but we don't break off
--- 755,770 ----
  ** Read the header of a netnews article, snatch the message-id therefrom,
  ** and ask the remote if they have that one already.
  */
! ihave(fp, mesgid)
  FILE	*fp;
+ char	*mesgid;
+ 
  {
  	register int	code;
  	register char	*id;
  	char	buf[BUFSIZ];
  
! 	if ((strlen(mesgid) == 0) && ((id = getmsgid(fp)) == (char *)NULL || *id == '\0')) {
  		/*
  		** something botched locally with the article
  		** so we don't send it, but we don't break off
***************
*** 771,776 ****
--- 775,782 ----
  		return(ERR_GOTIT);
  	}
  
+ 	if (strlen(mesgid) > 0) id = mesgid;
+ 
  	if (!msgid_ok(id)) {
  		sprintf(buf, "%s: message-id syntax error: %s", Article, id);
  		log(L_DEBUG, buf);
***************
*** 801,814 ****
  ** Returns a valid FILE pointer or NULL if end of file.
  */
  FILE *
! getfp(fp, filename, fnlen)
  register FILE	*fp;
  char	*filename;
  register int	fnlen;
  {
  	register FILE	*newfp = (FILE *)NULL;
  	register char	*cp;
  	char	*mode = "r";
  
  	while(newfp == (FILE *)NULL) {
  		if (fgets(filename, fnlen, fp) == (char *)NULL)
--- 807,822 ----
  ** Returns a valid FILE pointer or NULL if end of file.
  */
  FILE *
! getfp(fp, filename, fnlen, mesgid)
  register FILE	*fp;
  char	*filename;
  register int	fnlen;
+ char	*mesgid;
  {
  	register FILE	*newfp = (FILE *)NULL;
  	register char	*cp;
  	char	*mode = "r";
+ 	char	buffer[255];
  
  	while(newfp == (FILE *)NULL) {
  		if (fgets(filename, fnlen, fp) == (char *)NULL)
***************
*** 822,827 ****
--- 830,840 ----
  
  		if (filename[0] == '\0')
  			continue;
+ 
+ 		if (index(filename, ' ') != NULL) {
+ 			sscanf(filename, "%s %s", buffer, mesgid);
+ 			strcpy(filename, buffer);
+ 		} else strcpy(mesgid, "");
  
  		if ((newfp = fopen(filename, mode)) == (FILE *)NULL) {
  			/*
\End\Of\Shar\
else
  echo "will not over write ./nntp-1.5-patch"
fi
if [ `wc -c ./nntp-1.5-patch | awk '{printf $1}'` -ne 3781 ]
then
echo `wc -c ./nntp-1.5-patch | awk '{print "Got " $1 ", Expected " 3781}'`
fi
echo "Finished archive 1 of 1"
exit
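
For readers who want to see the batch-line handling in isolation, here is a minimal sketch in C of splitting an entry of the form "<path> <messageID>" or plain "<path>". It is not the code from the patch: the function name `split_batch_line` and the explicit buffer-length parameters are assumptions made for illustration, and unlike the `sscanf` call in the patch it refuses fields that do not fit the caller's buffers.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of nntpxmit): split one batch-file line
 * into a path and an optional message ID, separated by a single space.
 * Returns 1 if a message ID was found, 0 if the line held only a path
 * (the old batch format), and -1 if a field did not fit its buffer.
 */
int split_batch_line(const char *line, char *path, size_t pathlen,
                     char *msgid, size_t idlen)
{
    const char *sp = strchr(line, ' ');
    /* The path runs up to the first space, or to end of line if none. */
    size_t n = sp ? (size_t)(sp - line) : strcspn(line, "\n");

    if (n >= pathlen)
        return -1;
    memcpy(path, line, n);
    path[n] = '\0';
    msgid[0] = '\0';
    if (sp == NULL)
        return 0;               /* old format: path only */

    sp++;                       /* skip the separating space */
    n = strcspn(sp, " \n");     /* the ID ends at a space or newline */
    if (n >= idlen)
        return -1;
    memcpy(msgid, sp, n);
    msgid[n] = '\0';
    return n > 0;
}
```

A caller would use the return value the way the patched `ihave()` uses `strlen(mesgid)`: when it is 0, fall back to scanning the article headers for the ID.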
jerry@olivey.olivetti.com (Jerry Aguirre) (09/01/89)
In article <237@wugate.wustl.edu> chris@wugate.wustl.edu (Chris Myers) writes:
>a) Modify rnews/inews to add a 'Z' flag in the sys file (used instead of the
>'F' flag).  The 'Z' option causes rnews to create batch file entries of the
>form: "<path> <messageID>" rather than just "<path>".

I have also made this change locally and have found it useful.  I used
the "Q" (Queue) flag instead of "Z" but that is, of course, not important.

A more important difference is in how the file was organized.  I write
it as:

	<messageID> pathname

instead of the reverse.  This makes it much simpler to check for the new
format.  It is only necessary to check if the first character is a "<".
I was a little leery of the other format because there is no guarantee
that a "<" won't be embedded in a path name.  For example, if one did an
index for "<" on a pathname that looked like:

	/usr/spool/newsfile/<890831FE03@foo.bar>

one could confuse the pathname with an ID.  The above is a perfectly
legal path name in Unix and I can imagine a news system that would use
something like that.

I have discussed this with others and they have convinced me that it
would be desirable to accept multiple pathnames and even no pathnames at
all.  (This is primarily for compatibility with CNEWS though others
might benefit.)  The optional formats would then be:

	pathname pathname pathname ...
	<messageID>
	<messageID> pathname
	<messageID> pathname pathname ...

Handling a message ID by itself shouldn't be too difficult as there is
already code to handle a request for an article ID and find the pathname
from the history file.  I happen to feel this is a high overhead way of
doing it but as it is not difficult to code it should probably be
supported.

The multiple pathnames handle the case where the article might be
expired early in one group but still exist in another.

I suggest that we decide on a standard for NNTP that the different
versions of news can code for.  Making the nntpxmit code flexible now
will prevent future hacking on the code to make it work for "DNews".

				Jerry
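
The four entry formats Jerry lists can be recognized with a single test on the first character. The following is a rough sketch in C under stated assumptions, not code from any posted patch: `struct batch_entry`, `parse_entry` and the `MAXPATHS` limit are names invented here, and fields are assumed to be separated by whitespace.

```c
#include <string.h>

#define MAXPATHS 8      /* arbitrary limit chosen for this sketch */

/* Hypothetical record for one batch line in the ID-first format. */
struct batch_entry {
    char *msgid;            /* NULL if the line carries no message ID */
    char *paths[MAXPATHS];  /* zero or more article pathnames */
    int   npaths;
};

/*
 * Parse "pathname ...", "<messageID>", or "<messageID> pathname ..."
 * in place.  The line buffer is modified (strtok writes NULs into it)
 * and the pointers stored in *e point into that buffer.
 */
void parse_entry(char *line, struct batch_entry *e)
{
    char *tok;

    e->msgid = NULL;
    e->npaths = 0;

    tok = strtok(line, " \t\n");
    /* Only the first field is tested, so a '<' embedded later in a
       pathname is never mistaken for the start of an ID. */
    if (tok != NULL && tok[0] == '<') {
        e->msgid = tok;
        tok = strtok(NULL, " \t\n");
    }
    while (tok != NULL && e->npaths < MAXPATHS) {
        e->paths[e->npaths++] = tok;
        tok = strtok(NULL, " \t\n");
    }
}
```

With an entry parsed this way, a sender could try each pathname in turn and, if `npaths` is 0 or none of the files can be opened, fall back to looking the ID up in the history file as described above.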
coolidge@brutus.cs.uiuc.edu (John Coolidge) (09/01/89)
jerry@olivey.olivetti.com (Jerry Aguirre) writes:
>A more important difference is in how the file was organized.  I write
>it as:
>	<messageID> pathname
>instead of the reverse.  This makes it much simpler to check for the new
>format.  It is only necessary to check if the first character is a "<".
>I was a little leery of the other format because there is no guarantee
>that a "<" won't be embedded in a path name.  For example, if one did an
>index for "<" on a pathname that looked like:
>	/usr/spool/newsfile/<890831FE03@foo.bar>
>one could confuse the pathname with an ID.  The above is a perfectly
>legal path name in Unix and I can imagine a news system that would use
>something like that.

You can't confuse a message ID with a pathname in the original format

	pathname <messageID>

because the whitespace separating the pathname and messageID is enough
to make the distinction between the two obvious.  Spaces are illegal in both
pathnames and messageIDs, so the space is a unique separator.

Another (and, at least in my case, a more immediately important) reason
I like the original system better is that C News comes pre-equipped with
the ability to generate files of this form.  Just replace the 'F' with an
'n' in the sys file entry, install the nntp side of the patch, and you're
saving I/O bandwidth immediately.

--John

--------------------------------------------------------------------------
John L. Coolidge     Internet:coolidge@cs.uiuc.edu   UUCP:uiucdcs!coolidge
Of course I don't speak for the U of I (or anyone else except myself)
Copyright 1989 John L. Coolidge. Copying allowed if (and only if) attributed.
coolidge@brutus.cs.uiuc.edu (John Coolidge) (09/01/89)
I write:
>Spaces are illegal in both
>pathnames and messageID's, so the space is a unique separator.

Of course this isn't true, as was pointed out to me via e-mail.  Pathnames
can indeed have spaces.  MessageIDs, however, cannot have either spaces
or left angle brackets (other than the required starting bracket), and must
close with a right angle bracket, so as long as the filename is not of the
form

	/news/path/text <stuff>

(where the < is part of the filename, not a shell command), then the
proposed system implemented by the patch is safe.

As long as a news filename never includes the sequence "space
left-angle-bracket" the system should still be safe, since space
left-angle-bracket would then be unambiguously the start of a messageID.
I doubt that restriction is likely to bother implementors too much :-)

--John

--------------------------------------------------------------------------
John L. Coolidge     Internet:coolidge@cs.uiuc.edu   UUCP:uiucdcs!coolidge
Of course I don't speak for the U of I (or anyone else except myself)
Copyright 1989 John L. Coolidge. Copying allowed if (and only if) attributed.
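
John's rule (the ID begins at the last "space left-angle-bracket", contains no further space or "<", and ends with ">") can be checked by scanning backwards from the end of the line. What follows is only a sketch of that idea in C; `split_at_msgid` is a made-up name, and nothing here is taken from nntpxmit or C News.

```c
#include <string.h>

/*
 * Hypothetical helper: given a batch line "pathname <messageID>" whose
 * pathname may itself contain spaces, cut the line in two in place.
 * Returns a pointer to the message ID, or NULL if the line does not end
 * in something shaped like " <...>" (it is then treated as a bare path).
 * A trailing newline is stripped in either case.
 */
char *split_at_msgid(char *line)
{
    size_t len = strlen(line);
    char *p;

    if (len > 0 && line[len - 1] == '\n')
        line[--len] = '\0';
    if (len < 4 || line[len - 1] != '>')
        return NULL;            /* an ID must close with '>' */

    /* Walk back from just before the closing '>' looking for " <". */
    for (p = line + len - 2; p > line; p--) {
        if (p[0] == '<' && p[-1] == ' ')
            break;              /* found the start of the ID */
        if (*p == ' ' || *p == '<')
            return NULL;        /* not allowed inside an ID */
    }
    if (p == line || p[1] == '>')
        return NULL;            /* no separator found, or empty ID */

    p[-1] = '\0';               /* terminate the pathname */
    return p;
}
```

Under this rule a pathname such as /usr/spool/newsfile/<890831FE03@foo.bar> is left alone because its "<" is not preceded by a space, while a pathname that really does end in " <stuff>" remains ambiguous, which is exactly the restriction described in the post.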
sob@watson.bcm.tmc.edu (Stan Barber) (09/16/89)
I agree that we need to make nntp smarter about various versions of news.
This is why I added the CNEWS stuff to the last patch.

I think nntp and news will remain somewhat intertwined for some time (at
least until a major rewrite of nntp comes along).

So, I'd say that if you change your news software, you may need to recompile
NNTP to adapt it to what you have done.  I don't think adding adaptive
mechanisms to NNTP is necessary.

STAN
--
Stan           internet: sob@bcm.tmc.edu         Manager, Networking
Olan           uucp: {rutgers,mailrus}!bcm!sob   Information Technology
Barber         Opinions expressed are only mine. Baylor College of Medicine