[comp.mail.sendmail] SunOS4.1.1 sendmail.mx's actual ruleset sequence

ylee@csl.dl.nec.com (Ying-Da Lee) (05/09/91)

After getting some completely unexpected results from sendmail
(sendmail.mx distributed with SunOS4.1.1), I got a little
suspicious and put in a rule at the beginning of each ruleset
to leave an identifying mark in the address each time that
ruleset is applied.  I then sent a message from the machine
with the altered sendmail.cf to another machine in our lab and
checked the log on the receiving machine.  This is what the
log shows:

May  8 16:24:04 florida sendmail[29565]: AA29565: message-id=<9105082123.AA23760@texas.csl.dl.nec.com>
May  8 16:24:04 florida sendmail[29565]: AA29565: from=<4.12.1.3.4.1.3.ylee>, size=296, class=0
May  8 16:24:06 florida sendmail[29567]: AA29565: to=<4.22.2.3.4.22.2.0.3.ylee@csl.dl.nec.com>, delay=00:00:02, stat=User unknown

This would indicate that the sequences of rulesets applied
at the sending machine are

3,1,4,3,1,12,4 for the sender, and
3,0,2,22,4,3,2,22,4 for the recipient,

where 12 and 22 are the mailer-specific sender and recipient
rulesets respectively.
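
The decoding above can be reproduced mechanically. The following is a
minimal sketch, assuming each marker rule prepends "<ruleset number>." to
the address and that the original local part ("ylee") contains no dot of
its own. The function name ruleset_sequence is made up for illustration.

```python
def ruleset_sequence(marked_address):
    """Recover the order in which rulesets ran from a marked address.

    Assumption: every ruleset prepends "<number>." to the address, so the
    most recently applied ruleset is leftmost and the original local part
    is the last dot-separated field before any '@'.
    """
    local_part = marked_address.split("@", 1)[0]
    fields = local_part.split(".")
    markers = [int(field) for field in fields[:-1]]
    # The leftmost marker was added last, so reverse to get application order.
    return list(reversed(markers)), fields[-1]


sender_order, sender_user = ruleset_sequence("4.12.1.3.4.1.3.ylee")
recipient_order, recipient_user = ruleset_sequence(
    "4.22.2.3.4.22.2.0.3.ylee@csl.dl.nec.com"
)
print(sender_order)
print(recipient_order)
```

Reading the markers right to left is all the function does; it knows
nothing about sendmail itself.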

The sendmail.cf is my own, but that shouldn't matter, as
we are only concerned with the predefined flow of well-known
rulesets; none of them is invoked indirectly through another
in the example cited above.

I must say I am very much surprised by this finding.  Can anybody
confirm/dispute/explain this?  Do other brands of sendmail behave
the same way?


	Ying-Da Lee			(214)518-3490
	C&C Software Development Lab
	NEC America			(214)518-3990 (FAX)
	ylee@csl.dl.nec.com
	uunet!necbsd!ylee

rickert@mp.cs.niu.edu (Neil Rickert) (05/09/91)

In article <1991May8.231130.25587@csl.dl.nec.com> ylee@csl.dl.nec.com (Ying-Da Lee) writes:
>After getting some completely unexpected results from sendmail
>(sendmail.mx distributed with SunOS4.1.1), I got a little
>suspicious and put in a rule at the beginning of each ruleset
>
>This would indicate that the sequences of rulesets applied
>at the sending machine are
>
>3,1,4,3,1,12,4 for the sender, and
>3,0,2,22,4,3,2,22,4 for the recipient,
>
>where 12 and 22 are the mailer-specific sender and recipient
>rulesets respectively.

  I posted a fairly complete rundown on the sequence of rules used approx.
one week ago (message-id <1991Apr29.144019.22206@mp.cs.niu.edu>).  Your
example amply confirms my description.  I did perhaps touch too lightly on
the reprocessing of the recipient address which I believe happens only for
SMTP mailers, and in my opinion is logically wrong.  (The IDA versions which
I use do not reprocess the recipient address this way).

  Now why don't you create an alias for that final recipient address so it
won't bounce, and run your test again to see if you can confirm my
description of the header rewriting sequence (the 'To:' and 'From:' headers).


-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115                                   +1-815-753-6940

ylee@csl.dl.nec.com (Ying-Da Lee) (05/10/91)

In article <1991May9.005329.17471@mp.cs.niu.edu> rickert@mp.cs.niu.edu (Neil Rickert) writes:
>  I posted a fairly complete rundown on the sequence of rules used approx.
>one week ago (message-id <1991Apr29.144019.22206@mp.cs.niu.edu>).  Your
>example amply confirms my description.  I did perhaps touch too lightly on
>the reprocessing of the recipient address which I believe happens only for
>SMTP mailers, and in my opinion is logically wrong.  (The IDA versions which
>I use do not reprocess the recipient address this way).

That was a fine article, Neil, but I am afraid the part discussing
address reprocessing must have skated right by me.  Would you mind
writing another article concentrating on that issue?

>  Now why don't you create an alias for that final recipient address so it
>won't bounce, and run your test again to see if you can confirm my
>description of the header rewriting sequence (the 'To:' and 'From:' headers).

Well, I created an account for the peculiar address (I had to;
aliasing would have caused the log to show the result of alias
substitution rather than the envelope recipient), and
this is what the message looks like at the receiving end:

	From 4.12.1.3.4.1.3.ylee Thu May  9 15:50:42 1991
	Received: by texas.csl.dl.nec.com (4.1/YDL1.6-910507.10)
		id AA12684(texas.csl.dl.nec.com); Thu, 9 May 91 15:50:41 CDT
	Received: by florida.csl.dl.nec.com (4.1/YDL2.0-TEST)
		id AA01372(florida.csl.dl.nec.com); Thu, 9 May 91 15:50:39 CDT
	Date: Thu, 9 May 91 15:50:39 CDT
	From: 4.12.1.3.4.12.1.3.4.1.3.ylee
	Message-Id: <9105092050.AA01372@florida.csl.dl.nec.com>
	To: 4.22.2.3.ylee@csl.dl.nec.com
	Subject: testing
	Cc: 4.22.2.3.ylee@csl.dl.nec.com
	
	to ylee@texas.csl.dl.nec.com
	cc to ylee@csl.dl.nec.com

The corresponding log entries are:

May  9 15:50:41 texas sendmail[12684]: AA12684: message-id=<9105092050.AA01372@florida.csl.dl.nec.com>
May  9 15:50:41 texas sendmail[12684]: AA12684: from=<4.12.1.3.4.1.3.ylee>, size=373, class=0
May  9 15:50:42 texas sendmail[12687]: AA12684: to=<4.22.2.3.4.22.2.0.3.ylee@csl.dl.nec.com>, delay=00:00:01, stat=Sent

The To: and Cc: fields run through 3,2,22,4 as expected, but everything
else seems very peculiar.  Why is it that the 'From ' line, the envelope
sender, and the envelope recipient go through the loop twice, and the
From: field three times?  This is all very puzzling.

	Ying-Da Lee			(214)518-3490
	C&C Software Development Lab
	NEC America			(214)518-3990 (FAX)
	ylee@csl.dl.nec.com
	uunet!necbsd!ylee
	
 

rickert@mp.cs.niu.edu (Neil Rickert) (05/12/91)

In article <1991May9.221508.13637@csl.dl.nec.com> ylee@csl.dl.nec.com (Ying-Da Lee) writes:
>In article <1991May9.005329.17471@mp.cs.niu.edu> rickert@mp.cs.niu.edu (Neil Rickert) writes:
>>  I posted a fairly complete rundown on the sequence of rules used approx.
>
>That was a fine article, Neil, but I am afraid the part discussing
>address reprocessing must have skated right by me.  Would you mind
>writing another article concentrating on that issue?

  That was also a long article.  I don't intend to rewrite it all.  I believe
a number of ftp sites may have made a copy available.  There is a file
called something like 'sendmail.rules' with the appropriate date at
uxc.cso.uiuc.edu.

>>  Now why don't you create an alias for that final recipient address so it
>>won't bounce, and run your test again to see if you can confirm my
>>description of the header rewriting sequence (the 'To:' and 'From:' headers).
>
>Well, I created an account for the peculiar address (I had to,
>
>	From 4.12.1.3.4.1.3.ylee Thu May  9 15:50:42 1991
>...
>	From: 4.12.1.3.4.12.1.3.4.1.3.ylee
>	To: 4.22.2.3.ylee@csl.dl.nec.com
>
>The corresponding log entries are:
>
>May  9 15:50:42 texas sendmail[12687]: AA12684: to=<4.22.2.3.4.22.2.0.3.ylee@csl.dl.nec.com>, delay=00:00:01, stat=Sent

  A few more comments.

  Firstly, this corresponds exactly with my lengthy article on the
subject.

  The envelope sender (on 'From '):  There is an initial processing of
the incoming sender by rules 3,1,4.  The result is saved as $f.  This is
done in setsender() in envelope.c.  The comments there do not fully
explain the purpose, but one use seems to be to convert local names of
the form 'user@your.domain' to just a canonical 'user' to store in
$f.

 When a mailer is selected, the address in $f is processed to put it in
the form for the outgoing mailer.  This is done with a call to
remotename(), and involves using rulesets 3,1,mailer-specific,4.  The
result is assigned to $g.

 The 'From:' header is usually built from the value of $g.

 A typical 'sendmail.cf' may define $q with something like:

Dq$x <$g>

(Note that the above is deliberately simplified from what you probably have.)
Then the 'From:' header is defined in terms of $q:

H?F?From: $q

 All of this means that the sender address, which has already been
processed twice, is taken from $g to store in the 'From:' header,
and then header processing will reprocess it once more.  On mail with
an incoming 'From:' header this will not happen unless the contents of
the incoming 'From:' header are completely identical (except for white space)
to the incoming envelope sender.  Otherwise only a single processing of the
existing header is performed.
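
 The three passes described above can be replayed with a toy model.  This
is only a sketch: it assumes every ruleset merely prepends its number (as
in the marked test configuration), that 12 is the mailer's sender ruleset,
and that header rewriting of the sender uses the same 3,1,12,4 sequence,
which is what the logged 'From:' header suggests.  The names mark and
sender_stages are made up; they are not sendmail functions.

```python
SENDER_MAILER_RULESET = 12  # assumption: the mailer's sender ruleset in the test setup


def mark(address, rulesets):
    """Prepend "<number>." for each ruleset, mimicking the marker rules."""
    for number in rulesets:
        address = f"{number}.{address}"
    return address


def sender_stages(incoming_sender):
    # Pass 1: initial processing of the sender (setsender()); result is $f.
    f_value = mark(incoming_sender, [3, 1, 4])
    # Pass 2: rewriting for the chosen mailer (remotename()); result is $g.
    g_value = mark(f_value, [3, 1, SENDER_MAILER_RULESET, 4])
    # Pass 3: the 'From:' header is built from $g and then rewritten again.
    from_header = mark(g_value, [3, 1, SENDER_MAILER_RULESET, 4])
    return f_value, g_value, from_header


f_value, g_value, from_header = sender_stages("ylee")
print(f_value)
print(g_value)
print(from_header)
```

 With "ylee" as input, the second value matches the logged envelope sender
and the third matches the 'From:' header seen at the receiving end.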

 In the current IDA rulesets, we are defining $q in terms of $f (rather
than $g) to eliminate one of these steps.  Note that this is not for
efficiency, but because the IDA versions are capable of processing
header addresses differently from envelope addresses.  This capability
is used in the IDA rulesets.

 The reprocessing of the recipient envelope address only happens for
SMTP mailers.  The IDA versions do not do this.  The Berkeley versions
do, as pointed out a few weeks ago by kyle@uunet.uu.net.  It is done
in usersmtp.c with a call to remotename() just before sending the
RCPT TO: command.  In my opinion that additional processing is wrong, and
some of the potential problems were mentioned by kyle.
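
 The effect of that extra pass can be illustrated with the same kind of
toy model.  This is a sketch under stated assumptions: rulesets only
prepend their number, 22 is the mailer's recipient ruleset, and the first
pass is split as 3,0 followed by 2,22,4 purely for readability.  The
function names and the reprocess_for_smtp switch are hypothetical, and the
result without reprocessing was not observed in the tests above.

```python
RECIPIENT_MAILER_RULESET = 22  # assumption: the mailer's recipient ruleset in the test setup


def mark(address, rulesets):
    """Prepend "<number>." for each ruleset, mimicking the marker rules."""
    for number in rulesets:
        address = f"{number}.{address}"
    return address


def envelope_recipient(address, reprocess_for_smtp):
    # First pass: 3 and 0 pick the mailer, then 2, mailer-specific, 4.
    rewritten = mark(address, [3, 0])
    rewritten = mark(rewritten, [2, RECIPIENT_MAILER_RULESET, 4])
    if reprocess_for_smtp:
        # Second pass: the extra rewriting done just before the RCPT command.
        rewritten = mark(rewritten, [3, 2, RECIPIENT_MAILER_RULESET, 4])
    return rewritten


with_reprocessing = envelope_recipient("ylee@csl.dl.nec.com", True)
without_reprocessing = envelope_recipient("ylee@csl.dl.nec.com", False)
print(with_reprocessing)
print(without_reprocessing)
```

 The first result reproduces the logged to= address; the second only shows
what a single pass would leave behind in this model.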

 What your tests did not show is the additional step that occurs for
some messages when the value of $f contains an '@' but the header
address does not.  In that case, if there is no '@' by the end of
ruleset 3, and if the 'C' flag is set in the 'receiving mailer',
the '@domain' from $f is appended to the address and it is processed
once more by ruleset 3.  (See my earlier article for more details.)
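
 The condition just described can be written out as a small decision
function.  This is a sketch of the description only, not of the sendmail
source: the function name, its parameters, and the choice to take the
domain after the last '@' in $f are all assumptions.

```python
def append_sender_domain(header_address, f_value, mailer_has_c_flag):
    """Return (address, rerun_ruleset_3) following the description above.

    header_address is the header address as it stands at the end of
    ruleset 3; f_value is the contents of $f.
    """
    if "@" in header_address:
        return header_address, False   # already has a domain, nothing to do
    if not mailer_has_c_flag or "@" not in f_value:
        return header_address, False   # no 'C' flag, or $f has no domain to borrow
    # Assumption: the domain is whatever follows the last '@' in $f.
    domain = f_value.rpartition("@")[2]
    # The caller would pass the new address through ruleset 3 once more.
    return f"{header_address}@{domain}", True


print(append_sender_domain("jones", "smith@some.domain", True))
print(append_sender_domain("jones", "smith@some.domain", False))
print(append_sender_domain("jones@other.domain", "smith@some.domain", True))
```

 The addresses in the calls are invented examples; only the first call
meets all three conditions and gets the '@domain' appended.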

-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115                                   +1-815-753-6940