[comp.unix.ultrix] Run away named "accept: Too many open files" Ultrix 4.1A

farhad@Sunburn.Stanford.EDU (Farhad Shakeri) (05/23/91)

Hi,

We are having a problem with named under Ultrix 4.1A that is
causing a lot of headaches.

For some reason, after named has been running for a while (I think), the
named process suddenly starts grabbing about 65% or more of the CPU and
prints millions of:

"May 22 09:34:31 LOCALHOST: 18415 named: accept: Too many open files"

(about 60 per second) to the syslog file, which causes the /var filesystem
to fill up.

I also tried the Ultrix 4.2 named and got the same result.  My next step
is to go back to the Ultrix 3.1C named.

The config files have been used before on many different machines
and on older versions of Ultrix too.

Any help would be appreciated.


-- 
       +----------------------------------------------------+
      /   Farhad  Shakeri         E-Mail:                  /
     /  Stanford   University     farhad@CS.Stanford.EDU  /
    / Computer Science Department                        /

gengenba@forwiss.uni-passau.de (Michael Gengenbach) (05/25/91)

farhad@Sunburn.Stanford.EDU (Farhad Shakeri) writes:

>We are having a problem with named under Ultrix 4.1A that is
>causing a lot of headaches.
>[...]
>"May 22 09:34:31 LOCALHOST: 18415 named: accept: Too many open files"

We have the same problem here in Passau on our DECsystem 5810. We use
the following script to kill and restart named in case of trouble.
We call it ctrlnamed and run it every 10 minutes from cron.

BTW, the DEC service refuses to look into the problem. They say that
our installation is not supported because the named files are located
in /etc/namedb and not in /var/dss/namedb and we didn't use bindsetup
for installation.

If there is a real solution, let me know.

Michael

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#!/bin/csh -f

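# log file that receives named's syslog messages
set logfile=/var/adm/infolog
# where this script records each restart
set killlog=/etc/namedb/killlog

# look at the last $linenumber lines of the log; restart named if more
# than $count of them are "Too many open files" errors from its pid
set linenumber=10
set count=5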

set current_pid=`cat /etc/named.pid`

set matchinglines = `tail -$linenumber $logfile | egrep  -e "$current_pid named.*Too many open files" | wc -l`


if ($matchinglines > $count) then
# something seems to have gone wrong: kill the runaway named and restart it
	kill `cat /etc/named.pid`
	/usr/etc/named
	# sleep, so named gets a chance to write its pid
	sleep 5
	set new_pid=`cat /etc/named.pid`
	echo "`date` restart named (old pid: $current_pid, new pid: $new_pid)" >>$killlog
	sleep 60
	endif

set current_pid=`cat /etc/named.pid`

if ("`ps augx | grep $current_pid | grep -v grep`" == "") then
# there is no named running, so start one
	/usr/etc/named
	# sleep, so named gets a chance to write its pid
	sleep 5
	set new_pid=`cat /etc/named.pid`
	echo "`date` *start* named (old pid: $current_pid, new pid: $new_pid)" >>$killlog
	endif
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- 
Michael Gengenbach, FORWISS Passau, gengenbach@forwiss.uni-passau.de
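
[The two checks that ctrlnamed makes can also be sketched in Bourne shell
instead of csh. This is only an illustration, not part of the posted
script: the function names count_recent_errors and named_is_running are
made up for this sketch, and it assumes a tail that accepts -n and a kill
that accepts signal 0, neither of which has been checked on Ultrix. It
only detects the condition; it does not kill or restart anything.]

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not part of the original ctrlnamed.

# count_recent_errors LOGFILE PID NLINES
# Print how many of the last NLINES lines of LOGFILE are
# "Too many open files" messages logged by named with exactly this PID.
# The leading space in the pattern keeps pid 18415 from matching 118415.
count_recent_errors() {
    # grep -c exits non-zero when the count is 0, so mask that status
    tail -n "$3" "$1" | grep -c " $2 named.*Too many open files" || true
}

# named_is_running PID
# Succeed if a process with exactly this PID exists. Signal 0 delivers
# nothing; it only checks that the process can be signalled, so this
# assumes the caller is root or owns the process (as with a root cron job).
named_is_running() {
    kill -0 "$1" 2>/dev/null
}

# Example use, with the paths and thresholds from the csh script above:
#   pid=`cat /etc/named.pid`
#   if [ "`count_recent_errors /var/adm/infolog "$pid" 10`" -gt 5 ]; then
#       echo "named $pid looks stuck"
#   fi
```

[Checking the pid with kill -0 avoids grepping the ps output, where the
pid could also match part of another process's line.]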

treese@crl.dec.com (Win Treese) (05/29/91)

[As has been pointed out, this was inadvertently posted as a followup
to the wrong message...]

From what I understand, this is a known bug in the current ULTRIX BIND.

For an unsupported, at-your-own-risk new version, you can ftp from

	{crl,gatekeeper}.dec.com:/pub/DEC/cra-bind.tar.Z

This is BIND 4.8.3 for RISC/ULTRIX.

It has no Kerberos support, and the Hesiod parts haven't been tested on ULTRIX.
But you can use it if you want.

If you find a bug, don't call DEC Service, but do send mail to
cra-bind@crl.dec.com.

Win Treese						Cambridge Research Lab
treese@crl.dec.com					Digital Equipment Corp.

jeff@henson.cc.wwu.edu (Jeff Wandling) (05/31/91)

gengenba@forwiss.uni-passau.de (Michael Gengenbach) writes:

>farhad@Sunburn.Stanford.EDU (Farhad Shakeri) writes:

>>We are having a problem with named under Ultrix 4.1A that is
>>causing a lot of headaches.
>>[...]
>>"May 22 09:34:31 LOCALHOST: 18415 named: accept: Too many open files"

Same here.

[... stuff deleted ...]

>BTW, the DEC service refuses to look into the problem. They say that
>our installation is not supported because the named files are located
>in /etc/namedb and not in /var/dss/namedb and we didn't use bindsetup
>for installation.

We have a 5500 server running Ultrix 4.1 and the situation described is
exactly what we are experiencing here.

The difference is that we *did* use bindsetup for installation and our
named files are located in /var/dss/namedb. Are we entitled to support
from DEC?

>If there is a real solution, let me know.

We would like to know if other sites that used the "standard" procedure for
setting up bind are getting any help from DEC.

Is there an e-mail address at DEC for reporting these problems?


-- 
Jeff Wandling
Western Washington University Computer Center
uucp: uw-beaver!henson.cc.wwu.edu!jeff
Internet:  jeff@henson.cc.wwu.edu

juh@qt.IPA.FhG.de (Juergen Henke) (06/01/91)

In article <1991May30.225158.12891@henson.cc.wwu.edu>
	   jeff@henson.cc.wwu.edu (Jeff Wandling) writes:
>
>>If there is a real solution, let me know.
>
>We would like to know if other sites that used the "standard" procedure for
>setting up bind are getting any help from DEC.
>
>Is there an e-mail address at DEC for reporting these problems?
>
>
>-- 
>Jeff Wandling
>Western Washington University Computer Center
>uucp: uw-beaver!henson.cc.wwu.edu!jeff
>Internet:  jeff@henson.cc.wwu.edu

I also ran into problems with the DEC bind (DECstation 5000, Ultrix V4.1) and
got the following advice from someone on the bind mailing list:

Throw away this bind and use the one from gatekeeper.dec.com.

That one is a modified 4.8.3 bind, which has worked perfectly for about 4 months now.

Juergen


--
_________________________________________________________________________
Juergen Henke, e-mail juh@qt.IPA.FhG.de, PSI-mail PSI%4505016002::JUH_IPA
Fraunhofer-Institut f. Produktionstechnik u. Automatisierung
Nobelstrasse 12, D-7000 Stuttgart 80