david@ms.uky.edu (David Herron -- One of the vertebrae) (07/28/88)
It's amazing what you'll do when you're bored. One afternoon I moved the
SPOOLNEWS code from inews.c into nntpd to see what sort of effect it would
have on the system load, it "seems" to help quite a bit. [It's rather hard
to test, you see.]

What we had noticed here was that we'd often have 3 or 4 nntpd's running
at a time sending us news. When nntpd receives an article it starts up an
rnews to handle the reception. While that's nice and clean and elegant,
that rnews in reality executes very little code, which raises the overhead
percentage. That is, it always costs "x" amount of resources to start up a
process and the whole process execution costs "y", and since very little is
done (when you have SPOOLNEWS defined) to insert the article into the
system, "y" is very close to "x".

Moving the SPOOLNEWS stuff into nntpd avoids "x" -- or rather, avoids
having many "x"'s going on all at once. Eventually, when the "rnews -U" is
run, all those "x"'s will happen anyway, but now they'll be happening
serially rather than in parallel. (Parallel would be ok if this were
running over on our Sequent, but it's not over there yet ... instead we're
on a uVaxII with 13 megs of memory.)

The patch includes a context diff for server/spawn.c and also a line to
add to common/conf.h, and is derived from unadulterated nntp v1.5 sources.

As an added bonus you get to see one of the silliest programs I've ever
written. I'm almost embarrassed to post this, especially since I'm going
to be looking for a job later this year :-). Anyway. The program is
probably the MOST stupid way of unbatching a batch in existence. It looks
for "#! rnews" lines and assumes that each one is the beginning of a new
article. It also strips blanks out of otherwise empty lines, strips away
empty lines before headers and otherwise cleans up the batch file. All
this so that I can have a news feed from an IBM mainframe at Penn State.
The reason that it's being posted is that it also has had the SPOOLNEWS
code put into it.

#! /bin/sh
: This is a shell archive, meaning:
: 1. Remove everything above the '#! /bin/sh' line.
: 2. Save the resulting text in a file.
: 3. Execute the file with /bin/sh '(not csh)' to create the files:
:    'conf.DIFF'
:    'spawn.DIFF'
:    'split.batch.c'
: This archive created: 'Wed Jul 27 16:03:27 1988'
: By: 'David Herron -- One of the vertebrae ()'
export PATH; PATH=/bin:$PATH
echo shar: extracting "'conf.DIFF'" '(61 characters)'
if test -f 'conf.DIFF'
then
    echo shar: will not over-write existing file "'conf.DIFF'"
else
sed 's/^X//' >'conf.DIFF' <<'SHAR_EOF'
X#define SPOOLNEWS /* Emulate the SPOOLNEWS code from news */
SHAR_EOF
if test 61 -ne "`wc -c < 'conf.DIFF'`"
then
    echo shar: error transmitting "'conf.DIFF'" '(should have been 61 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'spawn.DIFF'" '(3343 characters)'
if test -f 'spawn.DIFF'
then
    echo shar: will not over-write existing file "'spawn.DIFF'"
else
sed 's/^X//' >'spawn.DIFF' <<'SHAR_EOF'
X*** spawn.c.orif Tue May 31 10:39:10 1988
X--- spawn.c Thu Jul 14 22:02:37 1988
X***************
X*** 3,9 ****
X  #endif
X
X  #include "common.h"
X!
X  #include <signal.h>
X
X  #ifdef XFER_TIMEOUT
X--- 3,10 ----
X  #endif
X
X  #include "common.h"
X! #include <stdio.h>
X! #include <time.h>
X  #include <signal.h>
X
X  #ifdef XFER_TIMEOUT
X***************
X*** 11,16 ****
X--- 12,33 ----
X  static int old_xfer_lines;
X  #endif
X
X+
X+ char *
X+ errmsg(code)
X+ int code;
X+ {
X+     extern int sys_nerr;
X+     extern char *sys_errlist[];
X+     static char ebuf[6+5+1];
X+
X+     if (code > sys_nerr) {
X+         (void) sprintf(ebuf, "Error %d", code);
X+         return ebuf;
X+     } else
X+         return sys_errlist[code];
X+ }
X+
X  static char tempfile[256];
X
X  /*
X***************
X*** 59,66 ****
X--- 76,88 ----
X  #endif not USG
X      register FILE *fp;
X
X+ #ifdef SPOOLNEWS
X+     (void) sprintf(tempfile, "%s/.spXXXXXX", spooldir);
X+     (void) mktemp(tempfile);
X+ #else
X      (void) strcpy(tempfile, "/tmp/rpostXXXXXX");
X      (void) mktemp(tempfile);
X+ #endif /* SPOOLNEWS */
X
X      fp = fopen(tempfile, "w");
X      if (fp == NULL) {
X***************
X*** 122,127 ****
X--- 144,209 ----
X      (void) chown(tempfile, uid_poster, gid_poster);
X  #endif
X
X+     /*
X+      * Ok, now we have the article in "tempfile". We
X+      * should be able to fork off, close fd's 0 to 31 (or
X+      * whatever), open "tempfile" for input, thus making
X+      * it stdin, and then execl the inews. We think.
X+      */
X+
X+ #ifdef SPOOLNEWS
X+
X+     {
X+         register struct tm *tp;
X+         time_t t;
X+ #define BUFLEN 512
X+         char buf[BUFLEN];
X+         extern struct tm *gmtime();
X+         int randno = getpid();
X+
X+         (void) time(&t);
X+         tp = gmtime(&t);
X+ retry:
X+         /* This file name "has to" be unique (right?) */
X+         (void) sprintf(buf, "%s/.rnews/%02d%02d%02d%02d%02d%x",
X+             spooldir,
X+             tp->tm_year, tp->tm_mon+1, tp->tm_mday,
X+             tp->tm_hour, tp->tm_min, randno);
X+
X+         if (link(tempfile, buf) < 0) {
X+             char dbuf[BUFLEN];
X+             if (errno == EEXIST) {
X+                 randno++;
X+                 goto retry;
X+             }
X+             sprintf(dbuf, "%s/.rnews", spooldir);
X+ #define N_UMASK 022 /* from localize.ukma */
X+             if (mkdir(dbuf, 0777&~N_UMASK) < 0) {
X+                 sprintf(errbuf, "%s dospool: Cannot mkdir %s: %s",
X+                     hostname, dbuf, errmsg(errno));
X+ # ifdef LOG
X+                 syslog(LOG_ERR, "%s", errbuf);
X+                 /* xerror("Cannot mkdir %s: %s", dbuf, errmsg(errno)); */
X+ #endif
X+                 return(-1);
X+             }
X+             if (link(tempfile, buf) < 0) {
X+                 sprintf(errbuf, "%s dospool: Cannot link(%s,%s): %s",
X+                     hostname, tempfile, buf, errmsg(errno));
X+ # ifdef LOG
X+                 syslog(LOG_ERR, "%s", errbuf);
X+                 /* xerror("Cannot link(%s,%s): %s", tempfile, buf,
X+                  * errmsg(errno)); */
X+ #endif
X+                 return(-1);
X+             }
X+         }
X+         (void) unlink(tempfile);
X+         return(1);
X+     }
X+
X+ #else /* SPOOLNEWS */
X+
X      /* Set up a pipe so we can see errors from rnews */
X
X      if (pipe(fds) < 0) {
X***************
X*** 132,144 ****
X          return (-1);
X      }
X
X-     /*
X-      * Ok, now we have the article in "tempfile". We
X-      * should be able to fork off, close fd's 0 to 31 (or
X-      * whatever), open "tempfile" for input, thus making
X-      * it stdin, and then execl the inews. We think.
X-      */
X-
X      pid = vfork();
X      if (pid == 0) { /* We're in child */
X  #ifdef POSTER
X--- 214,219 ----
X***************
X*** 225,230 ****
X--- 300,306 ----
X
X          return (exit_status ? -1 : 1);
X      }
X+ #endif /* SPOOLNEWS */
X  }
X
X  #ifdef XFER_TIMEOUT
SHAR_EOF
if test 3343 -ne "`wc -c < 'spawn.DIFF'`"
then
    echo shar: error transmitting "'spawn.DIFF'" '(should have been 3343 characters)'
fi
fi # end of overwriting check
echo shar: extracting "'split.batch.c'" '(3105 characters)'
if test -f 'split.batch.c'
then
    echo shar: will not over-write existing file "'split.batch.c'"
else
sed 's/^X//' >'split.batch.c' <<'SHAR_EOF'
X
X#include <stdio.h>
X#include <time.h>
X#include <sys/types.h>
X
X#define SPOOLNEWS
X
X#ifdef SPOOLNEWS
X
X#define spooldir "/net/spool.news"
X
Xextern struct tm *gmtime();
Xstruct tm *curtime2;
Xtime_t curtime1;
X
Xextern char *mktemp();
Xchar tempfile[256], outfile[256];
Xint tfd;
X
X#endif
X
Xmain(argc, argv)
Xint argc;
Xchar **argv;
X{
X    FILE *outf, *fopen();
X    register int bol, opn, c, boa;
X    int spccount;
X    char tmp[100], *command;
X    int len, randno, getpid();
X
X    /* puts("beginning");
X     * fflush(stdout);
X     * puts(argv[0]);
X     * fflush(stdout);
X     * puts(argv[1]);
X     * fflush(stdout); */
X    if (argc != 2) {
X        fprintf(stderr, "Usage: split.batch command <file\n");
X        exit(1);
X    }
X    command = argv[1];
X    randno = getpid();
X    bol = (1==1);
X    boa = (1==1);
X    opn = (1==0);
X    while((c=getchar()) != EOF) {
X        if (bol && c == ' ') {
X            spccount = 1;
X            while ((c=getchar()) == ' ')
X                spccount++;
X            if (c == EOF) {
X                for (; spccount >= 1; spccount--)
X                    fputc(' ', outf);
X                goto out;
X            }
X            if (c == '\n' && boa) {
X                bol = (1==1);
X                continue;
X            }
X            if (c != '\n')
X                for (; spccount >= 1; spccount--)
X                    fputc(' ', outf);
X            /*
X             * This falls into the part at the end of the while
X             * loop.
X             */
X        }
X        else if (boa && bol && c == '\n')
X            continue;
X        else if (bol && c == '#') {
X            bol = (1==0);
X            c = getchar();
X            if (c != '!')
X                fprintf(outf, "#%c", c);
X            else if ((c=getchar()) != ' ')
X                fprintf(outf, "#!%c", c);
X            else if ((c=getchar()) != 'r')
X                fprintf(outf, "#! %c", c);
X            else if ((c=getchar()) != 'n')
X                fprintf(outf, "#! r%c", c);
X            else if ((c=getchar()) != 'e')
X                fprintf(outf, "#! rn%c", c);
X            else if ((c=getchar()) != 'w')
X                fprintf(outf, "#! rne%c", c);
X            else if ((c=getchar()) != 's')
X                fprintf(outf, "#! rnew%c", c);
X            else if ((c=getchar()) != ' ')
X                fprintf(outf, "#! rnews%c", c);
X            else {
X                gets(tmp);
X                /* puts(tmp); */
X                if (opn) {
X#ifndef SPOOLNEWS
X                    pclose(outf);
X#else
X                    fclose(outf);
X#endif
X                    opn = (1==0);
X                }
X                /*
X                 * len = atoi(tmp);
X                 * sprintf(tmp, "%s.%d.%d", base, pid++, len);
X                 * printf("%s\n", tmp);
X                 */
X#ifndef SPOOLNEWS
X                outf = popen(command, "w");
X                if (outf == NULL) {
X                    fprintf(stderr,
X                        "Gosh darn! Can't open |%s!\n", command);
X                    exit(1);
X                }
X#else
X                do {
X                    (void) sprintf(tempfile, "%s/.spXXXXXX", spooldir);
X                    (void) mktemp(tempfile);
X                } while ((tfd = creat(tempfile, 0644) < -1));
X                (void) close(tfd);
X                (void) time(&curtime1);
X                curtime2 = gmtime(&curtime1);
Xretry:
X                (void) sprintf(outfile, "%s/.rnews/%02d%02d%02d%02d%02d%x",
X                    spooldir,
X                    curtime2->tm_year, curtime2->tm_mon+1,
X                    curtime2->tm_mday, curtime2->tm_hour,
X                    curtime2->tm_min, randno);
X                if (link(tempfile, outfile) < 0) {
X                    randno++;
X                    goto retry;
X                }
X                unlink(tempfile);
X                outf = fopen(outfile, "w");
X#endif
X                opn = (1==1);
X                boa = (1==1);
X                bol = (1==1);
X                continue;
X            }
X        }
X        boa = (1==0);
X        if (opn)
X            fputc(c, outf);
X        /* putchar(c); */
X        if (c == '\n')
X            bol = (1==1);
X        else
X            bol = (1==0);
X    }
Xout:
X    if (opn) {
X        fflush(outf);
X        fclose(outf);
X    }
X    exit(0);
X}
SHAR_EOF
if test 3105 -ne "`wc -c < 'split.batch.c'`"
then
    echo shar: error transmitting "'split.batch.c'" '(should have been 3105 characters)'
fi
fi # end of overwriting check
: End of shell archive
exit 0
--
<---- David Herron -- The E-Mail guy <david@ms.uky.edu>
<---- ska: David le casse\*' {rutgers,uunet}!ukma!david, david@UKMA.BITNET
<----
<---- Looking forward to a particularly blatant, talkative and period bikini ...
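The spooling step in the spawn.c diff above claims a unique name under
SPOOLDIR/.rnews by link()ing the temp file to a timestamp-plus-number name
and bumping the number whenever the name is already taken. Below is a
minimal sketch of that idea in present-day C, not the patch itself.
claim_spool_name() is a hypothetical helper, and the "% 100" on tm_year is
an assumption added here to keep the year at two digits. Like the spawn.c
version (and unlike the retry loop in split.batch.c), it retries only on
EEXIST, so any other link() failure is reported rather than looped on.

```c
#include <errno.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Link the finished temp file `tmp` to a unique name inside `dir`.
 * Returns 0 and fills `out` on success, -1 on error (errno is set).
 * Hypothetical helper for illustration; not part of nntp or news.
 */
int claim_spool_name(const char *tmp, const char *dir, char *out, size_t outlen)
{
    time_t now = time(NULL);
    struct tm *tp = gmtime(&now);
    unsigned int suffix = (unsigned int) getpid();
    int n;

    if (tp == NULL)
        return -1;
    for (;;) {
        n = snprintf(out, outlen, "%s/%02d%02d%02d%02d%02d%x", dir,
                     tp->tm_year % 100, tp->tm_mon + 1, tp->tm_mday,
                     tp->tm_hour, tp->tm_min, suffix);
        if (n < 0 || (size_t) n >= outlen) {
            errno = ENAMETOOLONG;
            return -1;
        }
        if (link(tmp, out) == 0)
            break;          /* link() is atomic: the name is ours */
        if (errno != EEXIST)
            return -1;      /* real failure: report it, do not spin */
        suffix++;           /* name already taken: try the next one */
    }
    return unlink(tmp);     /* drop the temporary name */
}
```

Because link() fails with EEXIST instead of overwriting, two writers that
pick the same name in the same minute cannot clobber each other.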
matt@oddjob.UChicago.EDU (Maxwell House Daemon) (07/29/88)
David Herron sez:
> It's amazing what you'll do when you're bored. One afternoon I moved the
> SPOOLNEWS code from inews.c into nntpd to see what sort of effect it would
> have on the system load, it "seems" to help quite a bit.

This may be a big loser, David. Until you process the article it
won't be in the history file. Hence nntpd will continue to accept
more copies of that article until you run rnews -U. This increases
the network load, which defeats one of the goals of NNTP. Since you
say you often have multiple incoming NNTP sessions, I think you will
get multiple copies quite often. Check your log. I know that oddjob
sometimes gets offered the same article from two sources (separated by
over 1000 miles) mere seconds apart.
________________________________________________________
Matt Crawford matt@oddjob.uchicago.edu
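The objection above can be seen in a toy model: an offer is refused only
when the message ID is already in the history, and spooled articles reach
the history only when the spool is processed. The sketch below is purely
illustrative. struct newsdb, offer() and process_spool() are made-up names
and in-memory arrays stand in for the real history file and spool
directory. It shows only the ordering problem, not how nntpd or rnews are
actually written.

```c
#include <string.h>

#define MAX_IDS 64

/* Toy stand-ins: `history` holds processed message IDs, `spool`
 * holds IDs that were accepted but not yet processed. */
struct newsdb {
    const char *history[MAX_IDS];
    int nhistory;
    const char *spool[MAX_IDS];
    int nspool;
};

static int in_history(const struct newsdb *db, const char *id)
{
    int i;

    for (i = 0; i < db->nhistory; i++)
        if (strcmp(db->history[i], id) == 0)
            return 1;
    return 0;
}

/* A peer offers an article.  Returns 1 if it is accepted (spooled),
 * 0 if it is refused because the history already lists it. */
int offer(struct newsdb *db, const char *id)
{
    if (in_history(db, id) || db->nspool >= MAX_IDS)
        return 0;
    db->spool[db->nspool++] = id;
    return 1;
}

/* The periodic "rnews -U" pass in this model: file each spooled
 * article, discarding duplicates.  Returns the number discarded. */
int process_spool(struct newsdb *db)
{
    int i, dups = 0;

    for (i = 0; i < db->nspool; i++) {
        if (in_history(db, db->spool[i]))
            dups++;
        else if (db->nhistory < MAX_IDS)
            db->history[db->nhistory++] = db->spool[i];
    }
    db->nspool = 0;
    return dups;
}
```

In this model two offers of the same ID before a processing pass are both
accepted, and the duplicate is only discovered, after it has already
crossed the network, when the spool is processed.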
flee@blitz (Felix Lee) (07/29/88)
In <10080@e.ms.uky.edu> David Herron writes:
> It's amazing what you'll do when you're bored. One afternoon I moved the
> SPOOLNEWS code from inews.c into nntpd to see what sort of effect it would
> have on the system load, it "seems" to help quite a bit.

A better way to spend your time is to fix nntpd so that it doesn't
spawn a new inews every time it receives an article.

But B2.11 inews is going to fork off a new inews for every article
anyway. This would still be a slight win, especially if your fork()
uses copy-on-write. Even better would be C news, which doesn't fork
at all. (What about 3.0?)

But nntpd would have to get the size of the article for #!rnews
batches. You could
1) save the article (either in memory or on disk);
2) have the sender tell you the size;
3) create a batch that uses delimiters instead of byte counts.
I like the last option best.
--
Felix Lee *!psuvax1!flee
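The size problem comes from the batch format itself: each article in an
rnews batch is preceded by a "#! rnews <bytecount>" line, so the count has
to be known before the first byte of the article is written. A minimal
sketch of option 1 (save the article first, then emit the record) follows.
write_rnews_record() is a hypothetical helper and the article is assumed
to be already buffered in memory.

```c
#include <stdio.h>

/*
 * Append one article to an rnews-style batch: a "#! rnews <bytes>"
 * header line, then exactly <bytes> bytes of article text.
 * Returns 0 on success, -1 on a write error.
 * Hypothetical helper for illustration only.
 */
int write_rnews_record(FILE *batch, const char *article, size_t len)
{
    if (fprintf(batch, "#! rnews %lu\n", (unsigned long) len) < 0)
        return -1;
    if (fwrite(article, 1, len, batch) != len)
        return -1;
    return 0;
}
```

A delimiter-based batch (option 3) would not need `len` up front, at the
price of escaping any article line that looks like the delimiter.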
david@ms.uky.edu (David Herron -- One of the vertebrae) (08/03/88)
In article <14956@oddjob.UChicago.EDU> matt@oddjob.UChicago.EDU (Maxwell House Daemon) writes:
>David Herron sez:
>> It's amazing what you'll do when you're bored. One afternoon I moved the
>> SPOOLNEWS code from inews.c into nntpd to see what sort of effect it would
>> have on the system load, it "seems" to help quite a bit.
>This may be a big loser, David.

I know. I knew that when I posted it.

>Until you process the article it
>won't be in the history file.

I know.

I've been running with SPOOLNEWS all along. That's because I wanted to
be able to control how many "streams" of processes were unbatching news.
The cost is two fork()/exec() executions per article to process the
article. The first is to save away the standard input into
SPOOLDIR/.rnews, and the other is to examine it to determine whether it
is already present in HISTFILE and, if it is not, to insert it into the
right place in the news hierarchy.

On a normal system, however, you can easily get multiple streams of rnews
processes running if you have SPOOLNEWS undefined. That will happen
whenever you have >1 UUCP neighbor shipping you news at once. Or if you
have multiple NNTP neighbors. With multiple streams of these processes
running our uVaxII gets veery sloooowwww. And I have a vested interest in
this machine not getting bogged down since it's the one where I 'live' :-).

>Hence nntpd will continue to accept
>more copies of that article until you run rnews -U. This increases
>the network load, which defeats one of the goals of NNTP.

Yes, I understand that. I've switched from running my news scripts every
15 minutes to every 10 minutes, plus I've staggered the news transmission
scripts with the news reception scripts so that they will often sidestep
each other. There is also flock stuff going on so that not more than one
"newsdaemon" script is running on a machine at any one time. If 10
minutes isn't a fast enough turnaround time I could decrease the
granularity to 5 minutes or some such. It's merely a matter of editing
crontab ...

There is a trade-off between network load and host load. I have more
host load than I can handle, but with a reshuffling of what happens when,
I can handle it.

As I see it the better fix is one which was discussed on nntp-managers
some time ago (but not yet implemented to my knowledge). That is to put
something into the nntp_access file which will put limits on the number
of accesses from certain subsets of the network. This could be on a
per-host, per-net or anybody-else basis.

If I could use that to force only (say) 2 connections from the outside
world, then this machine could handle the load. I would be able to turn
SPOOLNEWS off in nntp, and possibly in news in general.

This is yet another version of the local policy decisions versus global
policy decisions debate. In my case I have a colleague who was very
adamant that I do something about nntp and the load it causes. This was
the easiest fix.

My patch does provide more flexibility in nntp administration ...
It also saves a fork()/exec() pair if you have SPOOLNEWS defined in the
underlying news system (the one which writes the article into
SPOOLDIR/.rnews).

>Since you
>say you often have multiple incoming NNTP sessions, I think you will
>get multiple copies quite often. Check your log. I know that oddjob
>sometimes gets offered the same article from two sources (separated by
>over 1000 miles) mere seconds apart.

Unfortunately something broke the syslog stuff in news here a couple of
months ago and I haven't had time to fix it... sigh.

I've answered at length -- probably greater length than most of us
need -- in order to let Matt know that I know what I'm talking about.
Also, if there *is* a flaw in my reasoning then someone can point it out
and let me correct my ways.
--
<---- David Herron -- The E-Mail guy <david@ms.uky.edu>
<---- ska: David le casse\*' {rutgers,uunet}!ukma!david, david@UKMA.BITNET
<----
<---- Looking forward to a particularly blatant, talkative and period bikini ...
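The "flock stuff" mentioned in the reply above, which keeps a second
"newsdaemon" run from starting while one is still going, can be sketched
with BSD flock(2). This is a guess at the general shape, not the actual
scripts: acquire_daemon_lock() and the lock-file path are hypothetical
names introduced for the example.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Try to become the only running "newsdaemon" by taking an exclusive,
 * non-blocking lock on `path`.  Returns a descriptor that holds the
 * lock, or -1 if someone else holds it or the file cannot be opened.
 * The lock is released when the descriptor is closed or the process
 * exits.  Hypothetical helper for illustration only.
 */
int acquire_daemon_lock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);

    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
        close(fd);          /* another run is active: back off */
        return -1;
    }
    return fd;
}
```

A cron-driven script would call this first and simply exit when it gets
-1, so shortening the interval from 10 to 5 minutes cannot pile up
overlapping runs.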