lindberg@cs.chalmers.se (Gunnar Lindberg) (07/21/89)
At times our mail gateway, chalmers.se [129.16.1.1], a VAX 11/750,
gets very heavily loaded by people on other (faster, :-) machines
sending large volumes of data via:

    foreach f ($FILES)
        mail foo%bar.se@chalmers.se
    end

Now, that's completely legal; still, I would like to make chalmers.se
say "Wait, I'm too busy" at such times. Well, I looked into the
"sendmail" code and basically it does:

    if (load > OX)      /* getla() > RefuseLA */
        deny_all_connections_for_5_seconds();

I don't want to offend anyone, but that code definitely isn't very
"polite", :-). Besides, "getla()" didn't work on a VAX with a frozen
configuration. We corrected that and changed the code into:

    accept();           /* never deny connection */
    if ( ! fork())
        if (load > OX)
            exit(message("421", "I'm too busy"));

Using this code on a Sequent we found that it counts in milli-jobs
(possibly that's why it never gets overloaded, :-) so getla() had to
do some extra tricks (#ifdef sequent). All the diffs follow below.

    Gunnar Lindberg

sendmail 5.60:
===================================================================
RCS file: conf.c,v
retrieving revision 1.2
diff -c -r1.2 conf.c
*** /tmp/,RCSt1000790 Thu Jul 13 16:06:11 1989
--- conf.c Thu Jul 13 16:03:45 1989
***************
*** 429,434 ****
--- 429,445 ----
          if (Nl[0].n_type == 0)
              return (-1);
      }
+     /*
+      *  Gunnar Lindberg, lindberg@cs.chalmers.se:
+      *  When using ".fc" ("-bz") files all sorts of funny things may
+      *  happen, e.g. you may find "kmem > 0", but no longer a valid
+      *  file descriptor for "/dev/kmem". If you close it you will block
+      *  access to its current file (whatever *that* may be). The only
+      *  reasonable(?) thing to do is just to re-open "/dev/kmem".
+      */
+     if (lseek(kmem, (off_t) Nl[X_AVENRUN].n_value, 0) == -1)
+         kmem = open("/dev/kmem", 0, 0);
+
      if (lseek(kmem, (off_t) Nl[X_AVENRUN].n_value, 0) == -1 ||
          read(kmem, (char *) avenrun, sizeof(avenrun)) < sizeof(avenrun))
      {
===================================================================
RCS file: conf.c,v
retrieving revision 1.3
diff -c -r1.3 conf.c
*** /tmp/,RCSt1a01977 Fri Jul 21 09:03:01 1989
--- conf.c Thu Jul 13 17:55:52 1989
***************
*** 412,418
  getla()
  {
      static int kmem = -1;
! # ifdef sun
      long avenrun[3];
  # else
      double avenrun[3];
--- 412,418 -----
  getla()
  {
      static int kmem = -1;
! # if defined(sun) || defined sequent
      long avenrun[3];
  # else
      double avenrun[3];
***************
*** 425,430
          if (kmem < 0)
              return (-1);
          (void) ioctl(kmem, (int) FIOCLEX, (char *) 0);
          nlist("/vmunix", Nl);
          if (Nl[0].n_type == 0)
              return (-1);
--- 425,433 -----
          if (kmem < 0)
              return (-1);
          (void) ioctl(kmem, (int) FIOCLEX, (char *) 0);
+ # ifdef sequent
+         nlist("/dynix", Nl);
+ # else sequent
          nlist("/vmunix", Nl);
  # endif sequent
          if (Nl[0].n_type == 0)
***************
*** 426,431
              return (-1);
          (void) ioctl(kmem, (int) FIOCLEX, (char *) 0);
          nlist("/vmunix", Nl);
          if (Nl[0].n_type == 0)
              return (-1);
      }
--- 429,435 -----
          nlist("/dynix", Nl);
  # else sequent
          nlist("/vmunix", Nl);
+ # endif sequent
          if (Nl[0].n_type == 0)
              return (-1);
      }
***************
*** 449,454
  # ifdef sun
      return ((int) (avenrun[0] + FSCALE/2) >> FSHIFT);
  # else
      return ((int) (avenrun[0] + 0.5));
  # endif
  }
--- 453,461 -----
  # ifdef sun
      return ((int) (avenrun[0] + FSCALE/2) >> FSHIFT);
  # else
+ # ifdef sequent
+     return ((int) ((avenrun[0] + 500)/1000));
+ # else sequent
      return ((int) (avenrun[0] + 0.5));
  # endif sequent
  # endif
***************
*** 450,455
      return ((int) (avenrun[0] + FSCALE/2) >> FSHIFT);
  # else
      return ((int) (avenrun[0] + 0.5));
  # endif
  }
--- 457,463 -----
      return ((int) ((avenrun[0] + 500)/1000));
  # else sequent
      return ((int) (avenrun[0] + 0.5));
+ # endif sequent
  # endif
  }
===================================================================
RCS file: srvrsmtp.c,v
retrieving revision 1.3
diff -c -r1.3 srvrsmtp.c
*** /tmp/,RCSt1000790 Thu Jul 13 16:06:18 1989
--- srvrsmtp.c Thu Jul 13 16:02:41 1989
***************
*** 144,149 ****
--- 144,152 ----
          /* this must be us!! */
          CurHostName = MyHostName;
      }
+
+     hostbusy();     /* Accept connection? Non-returning if not. */
+
      expand("\001e", inp, &inp[sizeof inp], CurEnv);
      message("220", inp);
      SmtpPhase = "startup";
***************
*** 532,537 ****
--- 535,558 ----
              break;
          }
      }
+ }
+
+ /*
+  *  Gunnar Lindberg, lindberg@cs.chalmers.se:
+  *  Check that we are not too busy to accept the connection (OXnn).
+  *  If we are, we just say "421 I'm too busy" and close. Now, that's
+  *  not very polite, but we have to show him we mean it.
+  */
+ static
+ hostbusy()
+ {
+     if (getla() > RefuseLA)
+     {
+         message("421", "%s too busy, please try later", MyHostName);
+         if (InChild)
+             ExitStat = EX_QUIT;
+         finis();
+     }
  }
  /*
  **  SKIPWORD -- skip a fixed word.
===================================================================
RCS file: daemon.c,v
retrieving revision 1.4
diff -c -r1.4 daemon.c
*** /tmp/,RCSt1000790 Thu Jul 13 16:06:25 1989
--- daemon.c Thu Jul 13 16:06:04 1989
***************
*** 179,187 ****
--- 179,195 ----
          struct sockaddr_in otherend;
          extern int RefuseLA;

+         /*
+          *  Gunnar Lindberg, lindberg@cs.chalmers.se:
+          *  Since we now test load average in the child and reply
+          *  "421 I'm too busy" if we are, we don't have to reject
+          *  connections here any more.
+          */
+ #ifdef notdef
          /* see if we are rejecting connections */
          while (getla() > RefuseLA)
              sleep(5);
+ #endif notdef

          /* wait for a connection */
          do
===================================================================
paul@uxc.cso.uiuc.edu (07/25/89)
Re: modifying sendmail to return 421 too busy

This is not a good idea. On machines where fork() takes significant
resources, having the child return the 421 means that the process image
has already been duplicated just to return an error message.

It then gets worse. When the sending sendmail gets an open time-out
from a loaded remote machine with vanilla sendmail, it skips the
remaining messages to the same site during that queue run. Returning a
421 error causes the sending sendmail to skip to the next message
instead. Thus the receiver will fork(), issue 421, and exit() for each
message in the sender's queue. This can bring a loaded uni-processor
VAX to its knees.

Paul Pomes
Univ of Illinois, CSO
lindberg@cs.chalmers.se (Gunnar Lindberg) (07/25/89)
To summarize: There are a number of good reasons *not* to implement my
"brilliant" idea of sendmail replying "421 I'm too busy":

1)  Paul Pomes <paul@uxc.cso.uiuc.edu>:
    Thus the receiver will fork(), issue 421, and exit() for each
    message in the sender's queue. This can bring a loaded
    uni-processor VAX to its knees.

2)  Brian Kantor <brian@ucsd.edu>:
    If you have multiple mail servers (i.e., more than one MX host),
    refusing connections on one of them will cause incoming mail to be
    redirected to another of them. If instead you accept the connection
    and "421" it, the mail gets requeued on the sender and the other MX
    hosts are not tried.

3)  Both:
    Current sendmails note that a host is refusing connections and on
    the current queue run will avoid trying to make additional
    connections to it.

Anyway, forget about my changes to "srvrsmtp.c" and "daemon.c" - the
original code does a much better job than I did!

However, I do think the changes to "conf.c", to make "getla()" work on
hosts with a frozen configuration and on Sequent, are still valid.

    Gunnar Lindberg