[comp.mail.sendmail] A general question of mailers

phil@ux1.cso.uiuc.edu (05/26/90)

I know this is not the correct newsgroup for this question, since it is
generic rather than specific to sendmail.  But I cannot find a newsgroup
that deals with how mailers deliver mail, the misc group seems too light,
and I suspect many of those in the know are reading here anyway.

My question relates to the apparently conventional (standard?) practice
of validating a MAIL FROM: address before proceeding to accept the RCPT TO:
address.  I don't understand why it is done this way.  It seems to me that
mail delivery would work fine without validating the address at that point.

One problem with this practice is that long delays often occur while the
receiving mailer tries to validate the MAIL FROM: address and the domain
servers are unreachable or unresponsive.  Many mailers enforce
single-threaded delivery to any one host, so the delay holds up all the
mail waiting behind.  (I'd usually rather not open 30 connections to the
same host anyway when my mailer suddenly gets a batch of 30 messages from
the mailing list server.)

As to reliable return of mail, this is easily accomplished by recording the
HELO hostname with the mail control information until the MAIL FROM: address
can be verified.  If it cannot be verified, then rewrite the MAIL FROM:
address at that time naming the host it was received from, since I would
assume that host established reliable return by either verifying the
MAIL FROM: or rewriting to deliver back where it got it.
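
A minimal sketch of the deferred scheme described above, in Python.  The
Envelope record, the can_verify callback, and the choice of an RFC 821
source route ("@relay:user@host") as the rewritten form are all
assumptions made for illustration; the post does not say which rewrite
form is intended.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Envelope:
    """Control information kept with a queued message (hypothetical)."""
    helo_host: str   # hostname the sending side gave in HELO
    mail_from: str   # reverse path from MAIL FROM:, without angle brackets


def return_address(env: Envelope, can_verify: Callable[[str], bool]) -> str:
    """Choose where a failure notice goes, decided only when one is needed.

    can_verify stands in for whatever check the mailer uses (for example
    a DNS lookup); it is called here, not during the SMTP session.
    """
    if can_verify(env.mail_from):
        return env.mail_from
    # Could not verify: send the notice back through the host that handed
    # us the message, by prepending it as a source route.
    if env.mail_from.startswith("@"):
        # An existing route is extended with a comma, per RFC 821 syntax.
        return f"@{env.helo_host},{env.mail_from}"
    return f"@{env.helo_host}:{env.mail_from}"
```

With helo_host "relay.example.org" and an unverifiable sender
"joe@unreachable.example", the notice would be addressed to
"@relay.example.org:joe@unreachable.example".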

Can someone give me some explanations about this?  Is there a better place
that this kind of question should be posted?

--Phil Howard, KA9WGN-- | Individual CHOICE is fundamental to a free society
<phil@ux1.cso.uiuc.edu> | no matter what the particular issue is all about.

rayan@cs.toronto.edu (Rayan Zachariassen) (05/27/90)

Phil asks why bother validating the MAIL FROM: address synchronously.

The answer depends on religion and economics:

Synchronous address validation gains you:

	- economy of data transfer, i.e. you only move the message
	  over the link if the receiving mailer isn't going to immediately
	  return it due to a problem in the SMTP envelope.

	  This is a matter of $$s to people who pay volume charges for
	  their IP links (e.g. IP/X.25).

	- real-time feedback from the SMTP server about the acceptability
	  of the mail.

	  This is the religion part... some people insist on having this
	  warm fuzzy feeling.

... but costs you:

	- latency, which relates to general robustness (the longer the
	  SMTP connection is established the higher the chance of something
	  going wrong), and affects the queue sizes on the SMTP client
	  machine.

	  On busy mail relays/gateways, this can be a significant effect.

	  If you pay time charges for your mail link (e.g., IP/X.25 on
	  some PDNs), this can also be a $$ concern.


On fixed-cost high-bandwidth links, I'm a firm believer in Asynchronous
address validation.  I call it the HOT ROCK model of message transfer
(let somebody else closer to the destination worry about the message,
ASAP).  It certainly keeps the queues down on busy mail machines.
I think that's pretty important to the quality of life on those machines.

rayan

lear@turbo.bio.net (Eliot) (05/27/90)

Actually, if you look at RFC 821 (page 36), the following replies are
allowed for the MAIL command:

            MAIL
               S: 250
               F: 552, 451, 452
               E: 500, 501, 421

S is the success code, F is for failure, and E is for a protocol
error.

         250 Requested mail action okay, completed
         421 <domain> Service not available,
             closing transmission channel
         451 Requested action aborted: local error in processing
         452 Requested action not taken: insufficient system storage

Strictly speaking, RFC 821 doesn't call for address validation, and in
fact doesn't allow for it.  There was some discussion about this topic
years ago (closer to when RFC 821 was written) when some TOPS-20
machines did return path validation.  The argument then was that such
validation should not occur, due to imperfections in the host naming
system.  The argument could now be made that with DNS, return paths
should be validated.  However, there is no point in doing so.  After
all, who are you going to return to if the return path is bad?  In
addition, it is quite possible that a message that would otherwise be
delivered would get sent to the bit bucket, with the error message
right behind.  This also follows the implementor's golden rule.
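
The reply table quoted above can be written as a small lookup, which
makes the point concrete: a code such as 550 is simply not among the
replies listed for MAIL.  This is only a sketch in Python; the names
MAIL_REPLIES and classify_mail_reply are made up for the example.

```python
# Replies listed for the MAIL command in the table quoted from RFC 821.
MAIL_REPLIES = {
    "success": {250},
    "failure": {552, 451, 452},
    "error": {500, 501, 421},
}


def classify_mail_reply(code: int) -> str:
    """Return 'success', 'failure' or 'error' for a listed reply code,
    or 'unexpected' for a code the table does not list for MAIL."""
    for kind, codes in MAIL_REPLIES.items():
        if code in codes:
            return kind
    return "unexpected"
```

Here classify_mail_reply(250) gives "success" and
classify_mail_reply(550) gives "unexpected".
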
-- 
Eliot Lear
[lear@turbo.bio.net]

brian@ucsd.Edu (Brian Kantor) (05/27/90)

I disagree that the SMTP 'Mail From' shouldn't be parsed - but I think
most sites are doing it the wrong way.

What I think is proper is to check it for syntax, and if the syntax of
the MAIL command is incorrect, abort the transfer with an appropriate
protocol error message.  Otherwise accept it, and go on.

Where most mailers go wrong in doing the above is that they run the
hostname in the MAIL FROM command through a canonical hostname lookup,
and that causes big delays on some SMTP transactions.

It's silly to do the canonical hostname lookup at that point.  If the
mail is deliverable, you don't need to do it at all, and if the mail
isn't deliverable, you can do it when you return the mail.
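
As a rough illustration of "check the syntax, otherwise accept it", here
is a Python sketch that looks only at the shape of the MAIL command and
never touches the name server.  The regular expression is deliberately
looser than the full RFC 821 grammar, and the function name is invented
for the example.

```python
import re

# Optional source route ("@a,@b:"), then local-part@domain, inside <>.
# An empty path "<>" is also accepted.  No hostname is looked up here.
_MAIL_RE = re.compile(
    r"^MAIL FROM:<"
    r"(?:(?:@[A-Za-z0-9.-]+(?:,@[A-Za-z0-9.-]+)*:)?"
    r"[^\s<>@]+@[A-Za-z0-9.-]+)?"
    r">$",
    re.IGNORECASE,
)


def check_mail_command(line: str) -> int:
    """Return an SMTP reply code for a MAIL command from syntax alone:
    250 if the command is well formed, 501 (syntax error in parameters)
    if it is not."""
    if _MAIL_RE.match(line.strip()):
        return 250
    return 501
```

Any canonical hostname lookup would then happen later, and only if the
message has to be returned.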

The only reason for checking the address supplied in the FROM command at
all is to warn the sender of problems with his mailer - since that
moment is presumably the only time you have any way of getting in touch
with him - after all, the address is invalid, so you can't mail him a
failure notice.

I suspect a lot of sendmail.cf files do hostname canonicalization (is
that a word?) in places (such as ruleset 3) that the MAIL FROM address
gets run through.  Seems to me that canonical hostname substitution need
only be done later on, when preparing the message for delivery.

I'm pretty sure we do the same stupid thing in our sendmail.cf file.
I'll have to look into fixing that.
	- Brian