berry (01/09/83)
#N:zinfandel:8600005:000:3774
zinfandel!berry    Jan  6 14:25:00 1983

Many of you noticed that a great many duplicate articles came spewing out of zehntel and zinfandel over Christmas. We shut down the news, notes and uucp systems, and when I came back from holiday I started to track down what had happened.

I quickly isolated the newsoutput program, which sends articles from the notesfile system to news. Somehow the sequencer file kept getting trashed, causing all articles to be sent again and again. The cause was a combination of two bugs and a "feature".

What happened was this: for each article that newsoutput decides to send to news, it does a popen() to link to rnews, and writes the article down the pipe thus created. When done it did an fclose() (BUG 1). Since fclose() does not wait for the child, the rnews process went off into the background to finish up its task. The problem with this is that if there are a lot of articles to go out, and the system is a bit loaded, then you can run out of processes. This is exactly what happened to us.

When the process table gets full, the fork() in popen() will fail, and popen() will return NULL. BUT it does not close the open file descriptors it got from the pipe() call (BUG 2). The routine newsnote() or newsresp() will detect that the popen() failed, and exit. Meanwhile the open file table fills up. If the popen() fails several times, the file table becomes completely full.

Then, when newsoutput attempts to update the sequencer file, the open() of the file fails, and newsoutput tries to creat() the file, thinking that it doesn't exist. The creat() fails too, but not before the file has been truncated (the FEATURE). Newsoutput aborts here, and next time it is invoked, since the sequencer file is empty, it tries to send everything. The system quickly becomes loaded with rnews's running in background, and it happens again.

HOW TO FIX IT
_____________

To fix BUG 1, change the calls to fclose() in notesfile source file 'newsdump.c' to pclose().
This will cause newsoutput to wait for rnews to finish before continuing. Sure it will be slower, but this should run in the middle of the night anyway, right? The lines affected are around 77 and 143; I added some #ifdef DEBUG stuff to mine, so the line numbers may not match yours.

To fix BUG 2, edit ../src/libc/stdio/popen.c according to the following diff -c:

*** /usr/src/libc/stdio/opopen.c	Thu Jan  6 11:29:51 1983
--- /usr/src/libc/stdio/popen.c	Thu Jan  6 11:31:23 1983
***************
*** 1,4
  /* @(#)popen.c	4.1 (Berkeley) 12/21/80 */
  #include <stdio.h>
  #include <signal.h>
  #define	tst(a,b)	(*mode == 'r'? (b) : (a))
--- 1,9 -----
  /* @(#)popen.c	4.1 (Berkeley) 12/21/80 */
+ /*
+  * 6 Jan 1983 Berry Kercheval: when fork in popen() fails, close the file
+  * descriptors before returning.
+  */
+ 
  #include <stdio.h>
  #include <signal.h>
  #define	tst(a,b)	(*mode == 'r'? (b) : (a))
***************
*** 26,32
  		execl("/bin/sh", "sh", "-c", cmd, 0);
  		_exit(1);
  	}
! 	if(pid == -1)
  		return NULL;
  	popen_pid[myside] = pid;
  	close(hisside);
--- 31,39 -----
  		execl("/bin/sh", "sh", "-c", cmd, 0);
  		_exit(1);
  	}
! 	if(pid == -1) {
! 		close (myside);	/* close them descriptors!! */
! 		close (hisside);
  		return NULL;
  	}
  	popen_pid[myside] = pid;
***************
*** 28,33
  	}
  	if(pid == -1)
  		return NULL;
  	popen_pid[myside] = pid;
  	close(hisside);
  	return(fdopen(myside, mode));
--- 35,41 -----
  		close (myside);	/* close them descriptors!! */
  		close (hisside);
  		return NULL;
+ 	}
  	popen_pid[myside] = pid;
  	close(hisside);
  	return(fdopen(myside, mode));
******end of diff*********

That's what it takes. Sorry for the inconvenience to all.

--Berry Kercheval
Zehntel Inc.
2625 Shadelands Drive
Walnut Creek, CA 94598
(415)932-6900
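
For readers without the notesfile source at hand, here is a minimal sketch of the fclose()-to-pclose() change taken in isolation. It is not the code from newsdump.c: the function name send_article() and the use of "cat > /dev/null" as a stand-in for rnews are assumptions made purely for illustration. The point it shows is the one made in the post: pclose() waits for the child that popen() started and hands back its status, whereas fclose() would only close the stream and leave the child running.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper (not from the notesfile source): pipe one article
 * to `cmd` and wait for the command to finish.
 * Returns the command's exit status, or -1 on any failure.
 */
int send_article(const char *cmd, const char *text)
{
    FILE *fp;
    int write_failed;
    int status;

    fp = popen(cmd, "w");
    if (fp == NULL)
        return -1;              /* pipe() or fork() failed; nothing to close */

    write_failed = (fputs(text, fp) == EOF);

    /*
     * pclose(), not fclose(): this blocks until the child exits, so at
     * most one child per caller is alive at a time and the process
     * table cannot fill up with leftover children.
     */
    status = pclose(fp);

    if (write_failed || status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A caller would invoke it once per article, for example send_article("cat > /dev/null", text), and treat any nonzero return as a reason to stop before touching the sequencer file.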