Kevin_Crowston@XV.MIT.EDU.UUCP (02/27/87)
Message type: Message
Topic: SMTP
Text: Re: Message of 25 Feb 87 05:01 from MRC%PANDA@SUMEX-AIM.Stanford.EDU

> The server should NOT make the client wait while a message is
> being delivered...

I faced this issue when implementing our mail relay. I decided that the
client SMTP would have to wait while the relay delivered the message.
Otherwise, the relay could acknowledge the message and then crash
or discover that the destination mail server was unable to take the message.
Either way, the mail goes on the floor, hardly desirable. Acknowledgement
should mean that the message is really okay.

On the other hand, I also get multiple copies of a lot of whole and
partial messages; it seems that some hosts are less patient than others...

Kevin Crowston
MIT Sloan School of Management
MRC%PANDA@SUMEX-AIM.STANFORD.EDU.UUCP (02/27/87)
Kevin Crowston -

Your relay should queue the message on its local disk and acknowledge
it once it is safely written. That protects against the system crash
problem. If the message cannot be accepted by the other end, then the
message should be returned to the sender via the return-path address.

An SMTP server should NEVER block a client waiting for delivery. It is
STUPID and WRONG-HEADED to keep an SMTP connection open for ANY period
of time longer than is necessary to get the bits across and
acknowledged. The world isn't necessarily the Internet with no charges
per packet or time charges for a virtual circuit. When time charges are
a reality, mail servers that block clients COST REAL MONEY.

Sorry for flaming, but this really is an important concept.

-- Mark --
-------
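The queue-then-acknowledge rule described above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name `queue_message`, the spool directory layout, and the `.msg` suffix are hypothetical and do not come from any particular mailer. The reply codes 250 and 451 are the ones SMTP defines for success and for a local processing error.

```python
import os
import tempfile

def queue_message(queue_dir: str, msg_id: str, data: bytes) -> str:
    """Store data durably in queue_dir, then return the SMTP reply to send."""
    os.makedirs(queue_dir, exist_ok=True)
    final_path = os.path.join(queue_dir, msg_id + ".msg")
    # Write to a temporary file in the same directory so the final
    # rename is atomic and a crash never leaves a half-written queue file.
    fd, tmp_path = tempfile.mkstemp(dir=queue_dir, prefix="tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force the bytes to disk before acknowledging
        os.replace(tmp_path, final_path)
    except OSError:
        # The message is not safely stored, so do not acknowledge it.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        return "451 Requested action aborted: local error in processing"
    # Only now is it safe to tell the client the message was accepted.
    return "250 OK"
```

The acknowledgement depends only on a local disk write, so the client is released quickly, and delivery (or a bounce to the return path) can happen later from the queue file.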
geof@decwrl.DEC.COM@imagen.UUCP (02/27/87)
> > The server should NOT make the client wait while a message is
> > being delivered...
>
> I faced this issue when implementing our mail relay. I decided that the
> client SMTP would have to wait while the relay delivered the message.
> Otherwise, the relay could acknowledge the message and then crash
> or discover that the destination mail server was unable to take the message.
> Either way, the mail goes on the floor, hardly desirable. Acknowledgement
> should mean that the message is really okay.

I agree whole-heartedly. The problem is with SMTP itself. TCP mandates
that it is the client's responsibility to ensure that the remote client
is up. In other words, TCP won't probe an idle connection (the old
"keep-alive" discussion), so the higher level protocol must do so if it
cares. This behavior on TCP's part is necessary to cope with potentially
expensive network paths (e.g., a PTT network that bills by the packet),
so that quiescent TCP connections do not run up big bills. If you're out
of the office for lunch, you don't want your telnet connection to send
packets around uselessly for an hour or more. As in most cases, it
doesn't matter much when you're on an Ethernet, but it does in the more
general case.

In the case of SMTP, when a message is terminated with a ".CRLF", no
SMTP data may flow except the server's success/fail response. Since the
TCP connection is quiescent during this interval, TCP cannot detect a
remote crash. The only reasonable thing to do is to have SMTP set its
own death timer when it sends ".CRLF" and hope the message can be
delivered during that time interval.

The trouble is that there is no way to judge how long the SMTP death
timer should be. Some machines deliver mail fast, others not so fast
(mine is just plain slow). No matter what value you set for the death
timer, you lose some of the time. And the way you lose is that mail to
one type of host is always lousy.
The ultimate answer would be to fix SMTP, so that the server could still
respond with "OK, I'm still here" messages while it was delivering the
mail. Given all the SMTP hosts out there, this is probably not going to
happen. Ad hoc solutions include:

1. Have the server respond before the message is sent (bad, since
   messages can get dropped on the floor).

2. Adjust the timeouts to try and accommodate every host you would
   reasonably connect to => every TCP implementation. This is what we
   do now, and it doesn't work all the time.

3. Find some random data for the message sender to periodically queue.
   This would have the effect of taking the TCP connection out of its
   quiescent state, so that the TCP layer can detect a machine crash
   for you. This works unless the problem is that the remote SMTP
   server is in a tight loop, with the remote TCP still healthy (that's
   a "software bug"-type situation that can be detected and fixed).

I favor [3]. Try this:

    When you send ".CRLF":
        set timer for how long you expect this to take (T)
        set timer for how long you are willing to hang (D >> T)
        set noops=0
        wait for input from server

    On TIMER T:
        send NOOP<CRLF> command to server
        noops = noops + 1
        set timer to T
        go back to waiting for input from server

    On INPUT:
        process success/fail message from SMTP SEND command
        while noops > 0 do
            read & discard response from server
            noops = noops - 1
        end

    On TIMER D:
        assume failure of message.

The idea is that by sending NOOP commands, the TCP layer will probe the
underlying connection for you. Thus, the ultimate timer, D, can be VERY
long, since it detects bugs in the remote SMTP, not random events. The
annoyance is that you have to ignore enough responses to match each noop
you sent (I guess the other annoyance is that it is a miserable hack
that should be shot at sunrise...).

An obvious enhancement is to query the local TCP before sending a
NOOP -- it is not necessary to send anything unless the local TCP is
quiescent.
This is extremely useful in the situation where the SMTP connection is
dribbling along at 1200 baud somewhere and the REAL problem is that the
message hasn't been TRANSMITTED yet.

The timer T should be long enough to give the other machine a running
shot at delivering the message in that time (say 1-5 minutes).

- Geof
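The NOOP-probing scheme outlined in this message could be sketched in Python along the following lines. It is a rough illustration, not a tested mail client: the function name `await_final_reply` and its parameters are hypothetical, replies are assumed to be single lines ending in CRLF, and the "query the local TCP first" enhancement is left out.

```python
import socket
import time

def await_final_reply(sock: socket.socket, probe_every: float,
                      give_up_after: float) -> str:
    """Call right after sending the terminating "." line of a message.

    probe_every plays the role of timer T and give_up_after of timer D.
    Returns the server's reply to the message itself.
    """
    deadline = time.monotonic() + give_up_after  # timer D
    noops = 0    # NOOPs sent, each of which will get its own reply
    lines = []   # complete reply lines received so far
    buf = b""
    # We need one reply for the message plus one per NOOP we sent.
    while len(lines) < 1 + noops:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("no reply within D; assume the message failed")
        # Probe only while the real reply is still outstanding (timer T).
        sock.settimeout(remaining if lines else min(probe_every, remaining))
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            if lines:
                raise
            # Sending data makes TCP exercise the path, so a dead peer
            # eventually shows up as an error on this connection.
            sock.sendall(b"NOOP\r\n")
            noops += 1
            continue
        if not chunk:
            raise ConnectionError("server closed the connection")
        buf += chunk
        while b"\r\n" in buf:
            line, _, buf = buf.partition(b"\r\n")
            lines.append(line)
    # lines[0] answers the message; the rest answer our NOOPs and are dropped.
    return lines[0].decode("ascii", "replace")
```

A call might look like `await_final_reply(sock, probe_every=120, give_up_after=3600)`; both values are placeholders chosen to match the "1-5 minutes" suggested for T and a much longer D.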
jordan@UCBARPA.BERKELEY.EDU.UUCP (02/27/87)
Kevin Crowston writes:
> I decided that the client SMTP would have to wait while the
> relay delivered the message. Otherwise, the relay could
> acknowledge the message and then crash or discover that the
> destination mail server was unable to take the message.
Sendmail seems to handle this correctly, since "delivered" to that part
of the code means "placed in the queue" (i.e., wrote it to disk ... if
the machine then crashes, the daemon will pick up where it left off
since the queue file is still there) -- you can't acknowledge the
message as being sent before you have firm control of it. That's what
lock-step is all about. Once you have done that, if you find later
that you can't deliver it, it's up to the recipient SMTP process to
send it back to where it came from. This can be handled
asynchronously.
/jordan
sy.Ken@CU20B.COLUMBIA.EDU.UUCP (02/27/87)
> > The server should NOT make the client wait while a message is
> > being delivered...
>
> I faced this issue when implementing our mail relay. I decided that
> the client SMTP would have to wait while the relay delivered the
> message...

I think the furthest the acknowledgement process should go is
essentially "message received by this host and queued for delivery
locally". In so many cases, there's often too much processing involved
in delivering to the final destination mailbox that the sending system
should NOT have to wait for all of this to go on.

I see the cases of local mailbox delivery and mail forwarding as the
same. For example, host A wants to send to host C, not on host A's
network. It must therefore forward through host B. Should host A have to
wait while host B tries to forward the mail through all the way to host
C? This case is clearly unreasonable. The local delivery process can
often be just as unreasonable for a variety of reasons, and thus, the
mail should be stuffed into some local delivery queue (which would
presumably be a fast process), and actual local delivery can then happen
asynchronously with the SMTP dialog.

If there is some fatal case where the mail cannot actually be delivered
after being queued on the target system for local delivery, then the
entire message can be returned to the sender by the mailer. This is how
the TOPS-20 mailer works, and it seems like a fairly airtight procedure
in practice as well as in theory.

/Ken
-------
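The asynchronous half of this arrangement, where queued mail is delivered later and a fatal failure sends the whole message back to the sender, might look roughly like the Python sketch below. All names here (`QueuedMessage`, `PermanentFailure`, `run_queue_once`, and the `deliver` and `bounce` callbacks) are hypothetical and are not taken from the TOPS-20 mailer or any other system mentioned in the thread.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class QueuedMessage:
    return_path: str  # where a failure notice should go
    recipient: str
    body: str

class PermanentFailure(Exception):
    """Hypothetical signal that retrying delivery cannot succeed."""

def run_queue_once(queue: List[QueuedMessage],
                   deliver: Callable[[QueuedMessage], None],
                   bounce: Callable[[str, str, str], None]) -> List[QueuedMessage]:
    """Make one pass over the queue after the SMTP dialog has ended.

    Returns the messages that should stay queued for a later pass.
    """
    still_queued = []
    for msg in queue:
        try:
            deliver(msg)
        except PermanentFailure as err:
            # Fatal case: return the entire message to the sender.
            reason = f"could not deliver to {msg.recipient}: {err}"
            bounce(msg.return_path, reason, msg.body)
        except OSError:
            # Assumed transient (e.g. next hop unreachable): retry later.
            still_queued.append(msg)
    return still_queued
```

The sending host never sees any of this; it was released as soon as the message was queued.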
mrose@nrtc-gremlin.arpa.UUCP (02/27/87)
Hack. Hack. Hack.

Two things:

1. As Jordan pointed out: as soon as the SMTP server queues the message
   for delivery (not actually delivers it), the server should send the
   success acknowledgement to the client. Even if your host is
   single-threaded, the server can always deliver the mail *after* the
   SMTP connection is closed.

2. Why hack SMTP? I can find similar faults with interactions in FTP.
   And in just about any command/response application that you can run
   on top of TCP. The correct solution is to add an *option* to TCP
   saying to use keep-alives. Things like SMTP could use it, things
   like telnet (where a failure is obvious to any interactive user)
   don't have to use it. With this solution, you only have to make a
   very small change to the way an application opens the network,
   instead of complicating the peer-to-peer protocol used by the
   application.

Keep it simple guys!

/mtr
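In the Berkeley sockets API, a per-connection option of this kind is available as `SO_KEEPALIVE`. The Python sketch below shows how small the application-side change can be. The helper name `enable_keepalive` is hypothetical, and how often probes are sent is left to the operating system's defaults, which this sketch does not try to tune.

```python
import socket

def enable_keepalive(sock: socket.socket) -> socket.socket:
    """Ask the local TCP to probe this connection while it is idle."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock

# Possible use in a mail client (host name is a placeholder):
#   sock = enable_keepalive(socket.create_connection(("mail.example", 25)))
# An interactive program such as telnet would simply not call it.
```

The application protocol is untouched; only the code that opens the connection changes.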
MRC%PANDA@SUMEX-AIM.Stanford.EDU.UUCP (02/27/87)
A much better approach is for the SMTP server to queue the message on its local disk and acknowledge immediately. The delivery can be done by an asynchronous process. Unless your system is in real bad shape, it shouldn't take any time at all to write a file on the disk. It is much better to cure the disease (SMTP servers taking an indeterminate amount of time to respond) than it is to mask the symptoms. -------
dms@HERMES.AI.MIT.EDU.UUCP (02/27/87)
Date: Thu, 26 Feb 87 16:47:02 PST
From: jordan@ucbarpa.berkeley.edu (Jordan Hayes)
Organization: Experimental Computer Facility (XCF), UC Berkeley
Kevin Crowston writes:
> I decided that the client SMTP would have to wait while the
> relay delivered the message. Otherwise, the relay could
> acknowledge the message and then crash or discover that the
> destination mail server was unable to take the message.
Sendmail seems to handle this correctly, since "delivered" to that part
of the code means "placed in the queue" (i.e., wrote it to disk ... if
the machine then crashes, the daemon will pick up where it left off
since the queue file is still there) -- you can't acknowledge the
message as being sent before you have firm control of it. That's what
lock-step is all about. Once you have done that, if you find later
that you can't deliver it, it's up to the recipient SMTP process to
send it back to where it came from. This can be handled
asynchronously.
/jordan
Actually, sendmail doesn't handle this completely correctly. Before
sendmail queues up a message and gives the acknowledgment back to
the sender, it attempts to expand every address in a mailing list.
This expansion can take a long time, since it means a call to the
resolver to qualify host names. So, messages sent to large mailing
lists take a long time to get queued up. What sendmail should be doing
is writing out a very simple queue file with the un-expanded
recipients. The background delivery process should do the expansion
the first time it comes across an un-expanded address.
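
The deferred expansion proposed here could be sketched in Python as follows. This is an illustration only and does not describe sendmail's actual queue file format: the record layout, the `expanded` flag, and the names `make_queue_record` and `expand_pending` are all assumptions, and `expand` stands in for whatever slow lookup (mailing list expansion, resolver calls) the background process would perform.

```python
from typing import Callable, Dict, List

def make_queue_record(sender: str, recipients: List[str], body: str) -> Dict:
    """Build the queue entry at SMTP time: no lookups, so it is fast."""
    return {
        "sender": sender,
        "body": body,
        "recipients": [{"addr": r, "expanded": False} for r in recipients],
    }

def expand_pending(record: Dict, expand: Callable[[str], List[str]]) -> Dict:
    """Run by the background delivery process on each pass over the queue.

    Expands only addresses not yet marked as expanded, so the slow
    lookup happens once, after the sender has been acknowledged.
    """
    result = []
    for entry in record["recipients"]:
        if entry["expanded"]:
            result.append(entry)
        else:
            for addr in expand(entry["addr"]):
                result.append({"addr": addr, "expanded": True})
    record["recipients"] = result
    return record
```

The acknowledgement then depends only on writing the unexpanded record, however large the mailing list is.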