[net.unix-wizards] Sendmail Question

bhoward@funvax.UUCP (Bruce Howard) (08/10/86)

	-------------------------------------------
If you have seen this message twice, I apologize.  After posting
this the first time, I noticed some strangeness in our /usr/lib/news/sys
file and feared that the posting was lost.
	-------------------------------------------

Would someone please explain the order in which sendmail.cf
rulesets are called?  I'm quite aware what the documentation
says, but it is not clear to me whether the order is hardcoded
into sendmail, or whether this is just the order in which the
Berkeley sample sendmail.cf's happen to call the various 
rulesets.  I've looked at the config files for several sites
and most of them seem to call ruleset 3 from ruleset 0 before
resolving addresses to a specific mailer.

I seek enlightenment, not flames.

				Bruce

-- 

		 ---------------------------------
...decvax!ittatc!funvax!bhoward  Bruce Howard @ Fairfield University
...ihnp4!itivax!funvax!bhoward   Located in Scenic Fairfield, CT 06430
		 ---------------------------------

jsdy@hadron.UUCP (Joseph S. D. Yao) (08/17/86)

In article <111@funvax.UUCP> bhoward@funvax.UUCP (Bruce Howard) writes:
>Would someone please explain the order in which sendmail.cf
>rulesets are called?
>	...  I've looked at the config files for several sites
>and most of them seem to call ruleset 3 from ruleset 0 before
>resolving addresses to a specific mailer.
>...decvax!ittatc!funvax!bhoward  Bruce Howard @ Fairfield University
>...ihnp4!itivax!funvax!bhoward   Located in Scenic Fairfield, CT 06430

Having recently wrestled with the demon Sendmail on Ultrix 1.1
(essentially unmodified 4.2), I may be able to help.

As the documentation says or hints (for those not as lucky as
Bruce), there are essentially three major paths through the
rule sets.  They are:
	3 -> 0 -> 4		[the doc forgets 4]
	3 -> D -> 1 -> S -> 4
	3 -> D -> 2 -> R -> 4
S and R are different for each mailer, and are specified as
part of the M mailer specification.  I have not been able to
figure out what D is: nothing like that gets called by the
sendmail on which I was working.  1 and 2 each consist of a
commented-out rule.  This leaves us with, really:
	3 -> 0 -> 4
	3 -> S -> 4
	3 -> R -> 4

Now, who gets called where and when, and whence come S and R?

The mail consists of: a header, a body, and external information.
The body is the message per se.  The header contains (among other
things) From-type lines and To-type lines, containing addresses.
The external information contains such goodies as who really sent
the message, to whom it is really being sent, etc.  This is NOT taken
from the header, (a) to save re-scanning; and (b) to help prevent
cycles of inter-machine mail ping-pong, such as has happened in
the past.  (Perhaps someone at UCB or the NIC wants to tell about
this; I don't know enough.)

When sendmail is called in mailer mode (did you know how many
programs are inside its skin?), it calls 3-0-4 and then 3-1-4
on the sender address.  I am not entirely certain why.  It
then calls 3-0-4 on each to-address in the external information.
This generates, for each address, two or three things:
	(1) a mailer name for that address;
	(2) a name to pass that mailer to make it work; and
	(3) for non-local mailers only, a host address.

Each mailer may require To- and From-type lines to be edited
slightly differently.  For instance, if I send mail via UUCP,
I will want mail to say From: hadron!jsdy; but if I send it
via ARPA net (in theory only! we are not on the DDN!) I will
want the mail to say From: jsdy@hadron.COM.

DIGRESSION:  If Unix mail goes entirely domainist, the latter
form will be universal.  Strictly, a name of the form
name-or-whatever@host[.DOMAIN]* shouldn't be messed with; but
reality intrudes.  (I.e., the world will NOT go domainist all
at once; especially not the Unix world!)

Anyway, IF we want to edit these lines, we find that a different
copy of the message is being made for each mailer.  (Not per host,
if you ever wanted to try that -- and I did.)  If your mailer
specification says:

Mmailer ... S=n1 R=n2

then editing is done to the From-type lines via 3-n1-4, and
editing is done to the To-type lines via 3-n2-4.  E.g.:

Mlocal,	P=/bin/mail, F=rlsDFMmn, S=10, R=20, A=mail -d $u
Mprog,	P=/bin/sh,   F=lsDFMe,   S=10, R=20, A=sh -c $u
Muucp,	P=/usr/bin/uux, F=sDFMhuU, S=13, R=23, M=100000, A=uux - $h!rmail ($u)
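
To make the S=/R= mapping concrete, here is a minimal Python sketch
(an illustration only, not anything sendmail itself does this way).
It pulls the S= and R= fields out of a mailer line like the ones
above and lists the rule set path for each kind of header line.  The
names parse_mailer and header_paths are made up for this example,
and the comma splitting assumes no commas inside field values.

```python
def parse_mailer(line):
    """Split an 'Mname, K=value, ...' line into (name, {key: value})."""
    if not line.startswith("M"):
        raise ValueError("not a mailer definition")
    # Simplification: assumes field values never contain commas.
    parts = [p.strip() for p in line[1:].split(",")]
    name = parts[0]
    fields = {}
    for part in parts[1:]:
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip()
    return name, fields


def header_paths(fields):
    """Rule set sequences for From-type and To-type header lines."""
    s, r = int(fields["S"]), int(fields["R"])
    return {"from": [3, s, 4], "to": [3, r, 4]}


name, fields = parse_mailer(
    "Muucp,\tP=/usr/bin/uux, F=sDFMhuU, S=13, R=23, M=100000, "
    "A=uux - $h!rmail ($u)"
)
paths = header_paths(fields)
print(name, paths)
```

For the uucp mailer this gives 3-13-4 for From-type lines and
3-23-4 for To-type lines.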

Well, what are all those other rule sets?  Whenever you have
a rule of the form:
R<pattern>		$>N<pattern>
then, if the left pattern matches, rule set N is called
with the right pattern as its initial pattern.  A rule set
returns in one of two ways: either with the pattern that results
after each of its rules has been applied, in turn, as many times
as possible; or as soon as a matching rule has a right-hand side
preceded by $@ -- at which point it returns the pattern generated
by that right-hand side.
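
The control flow just described can be sketched in Python.  This is
a toy model under loud assumptions: real sendmail matches tokenized
addresses against $*, $+ and friends, whereas this stand-in uses
Python regular expressions, and the three "modes" are my own labels
for a plain RHS, a $: RHS, and a $@ RHS.  The rules themselves are
invented for the demonstration.

```python
import re


def run_ruleset(rules, workspace, max_passes=100):
    """Apply rules in order; each rule is (pattern, rewrite, mode).

    mode "loop"   : plain RHS, reapplied while the LHS still matches
    mode "once"   : RHS prefixed with $:, applied at most once
    mode "return" : RHS prefixed with $@, rewrite and leave the rule set
    """
    for pattern, rewrite, mode in rules:
        for _ in range(max_passes):  # guard against runaway rules
            match = re.fullmatch(pattern, workspace)
            if match is None:
                break
            workspace = rewrite(match)
            if mode == "return":
                return workspace
            if mode == "once":
                break
    return workspace


# Hypothetical rules, loosely modeled on the examples in this article.
rules = [
    # strip one pair of angle brackets per pass, until none remain
    (r"(.*)<(.*)>(.*)", lambda m: m.group(1) + m.group(2) + m.group(3), "loop"),
    # matches any non-empty string, so it must run once only
    (r"(.+)", lambda m: m.group(1).lower(), "once"),
    # user@host -> host!user, then leave the rule set immediately
    (r"(.+)@(.+)", lambda m: m.group(2) + "!" + m.group(1), "return"),
    # only reached when the previous rule did not return
    (r"(.+)", lambda m: m.group(1) + ".LOCAL", "once"),
]

result = run_ruleset(rules, "<<JSDY@Hadron>>")
print(result)
```

The first rule fires twice (one bracket pair per pass), the second
fires once, and the third returns before the last rule is tried.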

[Bit of trivia -- the LHS and RHS  m u s t  be separated only
by one or more TABs: any spaces will be interpreted as part
of the rule to which they're attached!  If you have in the
middle a <TAB><SP><TAB>, then your RHS will be interpreted
to be just that <SP>, and you will never figure out what your
problem is!]
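
A small Python sketch of that pitfall, assuming only what is said
above (fields are separated by runs of TABs and nothing else).  The
function split_rule is hypothetical and is not sendmail's actual
parser.

```python
import re


def split_rule(line):
    """Split an 'R' line into (lhs, rhs, comment) on runs of TABs only."""
    if not line.startswith("R"):
        raise ValueError("not a rewriting rule")
    fields = re.split(r"\t+", line[1:])
    fields += [""] * (3 - len(fields))  # pad a missing rhs or comment
    return fields[0], fields[1], "\t".join(fields[2:])


good = split_rule("R$+\t\t\t$:$>3$1\t\t\t\tmake canonical")
bad = split_rule("R$+\t \t$:$>3$1\tmake canonical")  # stray <SP> between TABs
print(good)
print(bad)
```

In the second call the RHS comes back as a single space, and the
intended RHS has slid over into the comment field.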

We are now back to the question:
>	...  I've looked at the config files for several sites
>and most of them seem to call ruleset 3 from ruleset 0 before
>resolving addresses to a specific mailer.

S0
# first make canonical
R$*<$*>$*		$1$2$3				defocus
R$+			$:$>3$1				make canonical

[The $: on the RHS says do this once only.  This is needed because
the LHS matches anything except the null string, so without it the
rule would match its own output and be reapplied over and over.  The
RHS takes that string and calls ruleset 3 on it.]

This seemed senseless to me, at first.  After all, 3 just got
called, didn't it?  Why call it again, and waste the gobs of
time (really!) that that takes?  Well, it turns out that later
on in ruleset 0 we find rules:

R<@>:$*			$@$>0$1				retry after route strip
R$*<@>			$@$>0$1				strip null trash & retry

What are these doing?  Right, they are returning the value of
ruleset 0 called on the first wildcard pattern ($* matches all
patterns including the null string).  BUT, before we call 0, we
want to call 3!  So, we call 3 in 0!  Brilliant, no?  I didn't
think so.  I got rid of the $>3 rule, replaced the latter two
RHS's with:
			$@$>29$1
(only 30 sets are allowed: 0-29), and constructed rule set 29:

S29
R$+			$:$>3$1
R$+			$:$>0$1

This, as you can now tell, just calls 3 and 0 once each, and
returns whatever pattern results from that.  This was entered
in file zerobase.m4, just before S0, so that any use of S0
perforce includes S29.
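
The resulting call pattern can be traced with a toy Python model.
Everything here is a stand-in: the three functions only imitate the
shape of rule sets 3, 0 and 29, the "@:" prefix is a made-up
substitute for the route-strip patterns, and the "local" result is
invented.

```python
calls = []  # records which rule set ran, in order


def ruleset3(addr):
    """Stand-in for canonicalization: here it only trims whitespace."""
    calls.append(3)
    return addr.strip()


def ruleset0(addr):
    """Stand-in for resolution: strip one leading '@:' and retry via 29."""
    calls.append(0)
    if addr.startswith("@:"):
        return ruleset29(addr[2:])  # plays the role of  $@$>29$1
    return ("local", addr)


def ruleset29(addr):
    """Mirror of S29: run 3 once, then 0 once, and return the result."""
    return ruleset0(ruleset3(addr))


# The normal entry path is 3 then 0; only the retry pays for a second 3.
result = ruleset0(ruleset3("@: jsdy "))
print(result, calls)
```

An address that needs no retry runs 3 and 0 exactly once each, which
is the saving over calling 3 unconditionally at the top of 0.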

I hope this helps.  Any comments, flames, or additions should be
sent by e-mail as well as posted to news, as I don't always have time
(between bouts with various demons) to get through this newsgroup.
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
			jsdy@hadron.COM (not yet domainised)