[comp.bugs.4bsd] inetd looping detection too short & dummy environ set wrong

asp@uunet.UU.NET (Andrew Partan) (12/09/89)

Subject: inetd looping detection too short & dummy environ set wrong
Index:	etc/inetd/inetd.c

Description:
	1) inetd has some loop detection code to see if some service is
	being invoked 'too often', and, if so, stops servicing requests
	for the service for a while.  It detects a loop by keeping
	track of the number of servers it starts for each service.  It
	does not start more than TOOMANY (40) servers in CNTY_INTVL
	(60) seconds.  If it detects a loop, it logs "server failing
	(looping), service terminated".  The problem is that machines
	are now fast enough that you may want to run more than 40
	servers a minute (we did, anyhow).

	2) inetd sets a dummy environment variable 'inetd_dummy' so
	that daemons have some space to overwrite their environment for
	ps.  It tries to initialiaze this variable to all x's, but
	calculates the length of the variable wrong, so that only the
	1st few bytes get set to x's.

Repeat-By:
	1) Try to run more than 40 services a second (like "rsh
	compress" of news batches to a fast machine).

	2) Inspection of the source.

Fix:
	1) Increase the loop count from 40 to 400 (should hold us for a
	while - really need better loop detection code here).

	2) Use the proper size of the dummy environment variable when
	initializing it.

*** inetd.c.ORIG	Fri Jan 27 12:33:18 1989
--- inetd.c	Fri Dec  8 13:00:20 1989
***************
*** 79,87 ****
  #include <pwd.h>
  #include <stdio.h>
  #include <strings.h>
  
! #define	TOOMANY		40		/* don't start more than TOOMANY */
  #define	CNT_INTVL	60		/* servers in CNT_INTVL sec. */
  #define	RETRYTIME	(60*10)		/* retry after bind or server fail */
  
  #define	SIGBLOCK	(sigmask(SIGCHLD)|sigmask(SIGHUP)|sigmask(SIGALRM))
--- 79,87 ----
  #include <pwd.h>
  #include <stdio.h>
  #include <strings.h>
  
! #define	TOOMANY		400		/* don't start more than TOOMANY */
  #define	CNT_INTVL	60		/* servers in CNT_INTVL sec. */
  #define	RETRYTIME	(60*10)		/* retry after bind or server fail */
  
  #define	SIGBLOCK	(sigmask(SIGCHLD)|sigmask(SIGHUP)|sigmask(SIGALRM))
***************
*** 224,232 ****
  		/* space for daemons to overwrite environment for ps */
  #define	DUMMYSIZE	100
  		char dummy[DUMMYSIZE];
  
! 		(void)memset(dummy, 'x', sizeof(DUMMYSIZE) - 1);
  		dummy[DUMMYSIZE - 1] = '\0';
  		(void)setenv("inetd_dummy", dummy, 1);
  	}
  
--- 224,232 ----
  		/* space for daemons to overwrite environment for ps */
  #define	DUMMYSIZE	100
  		char dummy[DUMMYSIZE];
  
! 		(void)memset(dummy, 'x', DUMMYSIZE - 1);
  		dummy[DUMMYSIZE - 1] = '\0';
  		(void)setenv("inetd_dummy", dummy, 1);
  	}
  

	--asp@uunet.uu.net (Andrew Partan)
	ASN.1 Object Identifier: "{joint-iso-ccitt mhs(6) group(6) 157}"

casey@gauss.llnl.gov (Casey Leedom) (12/09/89)

| From: asp@uunet.UU.NET (Andrew Partan)
| 
| Fix:
| 	1) Increase the loop count from 40 to 400 (should hold us for a
| 	while - really need better loop detection code here).

  But this fix will leave some slower machines out in the cold.  I have to
agree with the second part of your parenthetical comment.  We really do
need a better mechanism.  Merely patching the current mechanism not only
doesn't fix the current problems, but also may aggravate them.

  A much better idea might be to look at the exit codes of the child
server processes and if more than N of the exit codes in a row were
non-zero, then fail the server.

  However this wouldn't catch a looping client which was submitting
endless legitimate requests.  Unfortunately this is an impossible
situation to detect correctly.  Legitimate (but notably low performance)
programs may perform their jobs by sponsoring hundreds of server requests.

Casey

guy@auspex.UUCP (Guy Harris) (12/16/89)

>  However this wouldn't catch a looping client which was submitting
>endless legitimate requests.  Unfortunately this is an impossible
>situation to detect correctly.

And, if it's truly legitimate, it's not clear what should be done about
it.  It can cause denial of service, but whether this merits having the
service cut off or not is, I guess, a policy question that should be set
by a particular site.  The loop count should, perhaps, be settable from
the command line of "inetd", which at least lets the site choose how
much is too much....

One of the reasons for what may or may not be correctly called "loop
detection" here is to deal with a UDP server that dies before picking
up the packet left for it, causing "inetd" to fire up another copy of
that server, which dies before picking it up, etc..  (This can be
considered a loop, but it's not a case of some program looping
infinitely....)

Checking the exit code (both for a non-zero "normal" exit, and an
exit-by-signal) could help here; that check does require that the author
of the server violate long-standing UNIX programmer habits :-) and
actually take some care to have the exit status reflect the success or
failure of the server, rather than, say, the number of characters
produced by the last "printf" in the program. 

It also would mean that the exit status should *not* reflect whether it
could completely successfully respond to the request - if it failed due
to some error in the request, the failure indication should be sent back
to the client, since the error is then in some sense the client's
"fault" - but should reflect failure only if it's caused by, say, a
necessary file being missing, or corrupted, or something that probably
means *all* requests are doomed to failure (and therefore something that
needs to be fixed on the server machine).