mf@ircam.fr (Michel Fingerhut) (10/22/90)
Machine: DECsystem 5820 (RISC)
OS: Ultrix 4.0 (Rev. 179)

Every once in a while (every 3-4 days), the name daemon starts eating CPU time, goes to the top of the queue, and fills the syslog error message table with messages of the form

    Oct 22 10:23:37 localhost: 93 named: accept: Too many open files

(one a second, approximately) until it is killed and/or chokes /usr/spool. Upon restart, it works fine. There is no apparent flood of requests prior to that.

Does anyone have a suggestion on how to approach the problem?

Thanks,
Michael Fingerhut
tinguely@plains.NoDak.edu (Mark Tinguely) (10/24/90)
In article <1990Oct22.105209.28006@ircam.ircam.fr> mf@ircam.fr (Michel Fingerhut) writes:
>Machine: DECsystem 5820 (RISC)
>OS: Ultrix 4.0 (Rev. 179)
>Every once in a while (every 3-4 days), the name daemon starts eating CPU
>time, goes to the top of the queue, and fills the syslog error message
>table with messages of the form
> Oct 22 10:23:37 localhost: 93 named: accept: Too many open files

Do you have machines that query the name server by TCP rather than UDP? This can be found by using `netstat'. We had the same problem with an IBM 3090 querying our BIND 4.8.1 (and earlier releases) nameserver. I am sure the Ultrix server is based upon BIND 4.8.

About 7 months ago I posted the fix to this problem, and (though I did not check) I think a similar fix went into BIND 4.8.2. There are two problems, but both are based on the fact that TCP queries are queued. With the original BIND code, it is possible that these queries are not properly released as they sit waiting on a time queue. UDP resolutions are just discarded if they cannot be resolved right away, and do not cause this problem.

If you do not want to update your nameserver to BIND (boy did I find out this week how many people think I am a radical for running public-domain software [that works correctly]), then ask DEC to update the server.

Last week I removed my "diff" files for the BIND error (assuming these were picked up in BIND 4.8.3, located at ucbarpa.berkeley.edu in the 4.3 directory). I just quickly scanned the areas that I modified in the BIND 4.8.3 files and did not see the removal of queued TCP entries, but since I don't follow the BIND mailing list, they may have implemented the solution in a different fashion than I did (or did not pick up the changes at all).

If there is a need for the TCP BIND fixes, I can restore them to our anonymous ftp partition.
-- Mark Tinguely North Dakota State University, Fargo, ND 58105 UUCP: ...!uunet!plains!tinguely BITNET: tinguely@plains.bitnet INTERNET: tinguely@plains.NoDak.edu
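
The failure mode described above (queued TCP queries that are never released, so their descriptors pile up until accept fails with "Too many open files") can be illustrated with a small sketch. This is not Mark Tinguely's diff and not BIND's code; it is a generic idle-timeout table written in Python, and the class name TcpQueryTable, its methods, and the 120-second default are all hypothetical names and values chosen for illustration. The idea is only that every accepted TCP connection is recorded with a timestamp, and anything idle for too long is closed so its descriptor is returned to the process.

```python
import socket
import time


class TcpQueryTable:
    """Hypothetical table of accepted TCP query connections.

    Each connection is stored with the time of its last activity so that
    stale entries can be closed instead of holding a descriptor forever.
    """

    def __init__(self, idle_timeout=120.0):
        # idle_timeout is an assumed value, not taken from BIND.
        self.idle_timeout = idle_timeout
        self._last_seen = {}  # socket object -> time of last activity

    def add(self, conn, now=None):
        """Record a newly accepted connection."""
        self._last_seen[conn] = time.monotonic() if now is None else now

    def touch(self, conn, now=None):
        """Refresh the timestamp when a tracked connection shows activity."""
        if conn in self._last_seen:
            self._last_seen[conn] = time.monotonic() if now is None else now

    def release(self, conn):
        """Forget a connection and close it, freeing its descriptor."""
        if conn in self._last_seen:
            del self._last_seen[conn]
            conn.close()

    def reap(self, now=None):
        """Close every connection idle longer than idle_timeout.

        Returns the number of connections that were closed.
        """
        now = time.monotonic() if now is None else now
        stale = [conn for conn, seen in self._last_seen.items()
                 if now - seen > self.idle_timeout]
        for conn in stale:
            self.release(conn)
        return len(stale)

    def __len__(self):
        return len(self._last_seen)
```

A server loop built this way would call reap() on each pass through its select loop, so a client that opens a TCP connection and then goes quiet cannot hold a descriptor indefinitely. Whether BIND 4.8.2 or 4.8.3 took a comparable approach is not established in this thread.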