[net.mail] sendmail configurations

cball@inmet.UUCP (09/23/86)

I have a couple of comments/questions about sendmail configurations.  I
have solved them for our site with local modifications.  However, the
problems seem to be generic ones.  The second one in particular must
be handled by every gateway host, so I'd like to find out how other sites
handle this stuff.

1) Every configuration which I've seen has the following:
>
>#####		RULESET ZERO PREAMBLE
>#####
>#####	The beginning of ruleset zero is constant through all
>#####	configurations.
>#####
>############################################################
>############################################################
>
>S0
>
># first make canonical
>R$*<$*>$*		$1$2$3				defocus
>R$+			$:$>3$1				make canonical

Am I missing something?  My understanding is that sendmail invokes
ruleset 3 automatically before invoking ruleset 0.  The first line
undoes everything that the automatic invocation accomplishes, and then
reruns the address through ruleset 3!  Since the distributed sendmail
configurations do a significant amount of work in ruleset 3, this
seems like a waste.  In fact, I disabled these lines in our configuration
about a year ago with no known ill effects.  I have tested pretty
thoroughly, and have always gotten the same results (only faster).
Actually, sendmail seems to run significantly faster once berknet,
old internet (RFC733), and some unneeded "boiler-plate" constructs
are removed.  I am posting this now to find out if this breaks something
I don't have the wit to figure out.

2) The internet domain-style naming scheme inherently removes the
notion of a host's network from its name.  There are many positive
aspects to this scheme.  However, it means that sendmail on a gateway
host or even a host with a simple, small lan in addition to uucp
connections needs a mechanism to determine which network gets a given
piece of mail.  Up to now, our site has had atomic hostnames and has
been able to use a sendmail class (pretty much) as follows:
	/* The class 'S' includes all primary host names from host table*/
CS/etc/hosts "%*s %s"
	/* rewriting rule */
R$*<@$*>$*		$:$1<@$[$2$]>$3		# map aliases to primary names
R$*<@$=S$*>$*		$#tcp$@$2$:$1<@$2$3>$4	# deliver mail to primary hosts

We have a couple of sun clusters and a milnet host (which can't act as a
gateway), so our host tables are reasonably active and cannot reasonably
be included directly in the configuration file.  The above rules have allowed
us to identify lan hosts correctly whenever the host table is accurate; no
separately maintained list of hosts is necessary.  The problem is that
host names are no longer atomic since period ('.') is both a sendmail
operator and part of every domain-style hostname.  How can I correctly
identify tcp (or usenet) hosts in sendmail in order to select the correct
network?
	I have solved the problem locally by modifying the sendmail
routine maphostname() to append ".tcp" when the calls gethostbyname(3)
or gethostbyaddr(3) indicate the host is reachable via TCP.  The modified
routine is invoked on <address> when the rhs (right-hand side) construct
$[<address>$] is executed.  This seems to be an appropriate solution;
in fact, I'm tempted to extend it with a new "network" option:
ON<net> [<net>]
<net> ::= "IPC" # add .tcp as described above
	| <executable fname> #Append ".<netdesignator>" as appropriate

This would allow sendmail to *know* that a network could deliver a piece
of mail before invoking its mailer, support better forwarding site
selection, and provide for network precedence.  However, I'd like to find
out what other sites do, particularly those which are on the internet as
well as usenet.  If there's an easier/better way to do this I'd like
to know about it.
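
For anyone who wants the idea without the C source, the maphostname()
change described above can be sketched roughly in Python.  This is an
illustration only, not the actual patch: map_hostname and canonical_name
are made-up names, and the lookup is passed in as a parameter so the
logic can be tried without a live host table.

```python
import socket


def canonical_name(host):
    """Return the primary name for host; raises OSError if it is unknown."""
    # gethostbyname_ex() returns (primary name, alias list, address list).
    return socket.gethostbyname_ex(host)[0]


def map_hostname(host, lookup=canonical_name):
    """Map host to its primary name and tag it ".tcp" if the lookup succeeds.

    A host the lookup does not know is returned unchanged, so later
    rules are free to try another network for it.
    """
    try:
        primary = lookup(host)
    except OSError:
        return host
    return primary + ".tcp"
```

A rule can then key on the trailing ".tcp" token instead of testing
the host name against a class.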

	Thanks,
	Charles Ball
	Intermetrics, Inc.
	...{ihnp4,ima,mirror}!inmet!cball

jsdy@hadron.UUCP (Joseph S. D. Yao) (09/27/86)

In article <5600009@inmet> cball@inmet.UUCP writes:
>1) Every configuration which I've seen has the following:
>>S0
>># first make canonical
>>R$*<$*>$*		$1$2$3				defocus
>>R$+			$:$>3$1				make canonical

The following is slightly modified from an earlier posting.

As the documentation says or hints, there are essentially three
major paths through the rule sets.  They are:
	3 -> 0 -> 4		[the doc forgets 4]
	3 -> D -> 1 -> S -> 4
	3 -> D -> 2 -> R -> 4
S and R are different for each mailer, and are specified as
part of the M mailer specification.  I have not been able to
figure out what D is: nothing like that gets called by the
sendmail on which I was working.  1 and 2 each consist of a
commented-out rule.  This leaves us with, really:
	3 -> 0 -> 4
	3 -> S -> 4
	3 -> R -> 4
...

>S0
># first make canonical
>R$*<$*>$*		$1$2$3				defocus
>R$+			$:$>3$1				make canonical
This seemed senseless to me, at first.  After all, 3 just got
called, didn't it?  Why call it again, and waste the gobs of
time (really!) that that takes?  Well, it turns out that later
on in ruleset 0 we find rules:

R<@>:$*			$@$>0$1				retry after route strip
R$*<@>			$@$>0$1				strip null trash & retry

What are these doing?  Right, they are returning the value of
ruleset 0 called on the first wildcard pattern ($* matches all
patterns including the null string).  BUT, before we call 0, we
want to call 3!  So, we call 3 in 0!  Brilliant, no?  I didn't
think so.  I got rid of the $>3 rule, replaced the latter two
RHS's with:
			$@$>29$1
(only 30 sets are allowed: 0-29), and constructed rule set 29:

S29
R$+			$:$>3$1
R$+			$:$>0$1

This, as you can now tell, just calls 3 and 0 once each, and
returns whatever pattern results from that.  This was entered
in file zerobase.m4, just before S0, so that any use of S0
perforce includes S29.
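
To see what the change buys, here is a toy Python model of the two
arrangements.  It is not sendmail code: canonicalize() is a crude
stand-in for ruleset 3 that only focuses on the last "@", and only
the "strip null trash & retry" rule of ruleset 0 is modelled.  All
function names are invented for the illustration.  It simply counts
how often the stand-in for ruleset 3 runs.

```python
def canonicalize(addr, counter):
    """Stand-in for ruleset 3: defocus, then refocus on the last '@'."""
    counter["ruleset3"] += 1
    plain = addr.replace("<", "").replace(">", "")
    user, at, host = plain.rpartition("@")
    if not at:
        return plain
    return user + "<@" + host + ">"


def ruleset0_original(addr, counter):
    """Distributed layout: ruleset 0 always reruns ruleset 3 first."""
    addr = canonicalize(addr, counter)
    if addr.endswith("<@>"):            # strip null trash & retry
        return ruleset0_original(addr[:-3], counter)
    return addr


def ruleset29(addr, counter):
    """Helper ruleset: call 3 and then 0, once each."""
    addr = canonicalize(addr, counter)
    return ruleset0_modified(addr, counter)


def ruleset0_modified(addr, counter):
    """Modified layout: only the retry path goes back through ruleset 3."""
    if addr.endswith("<@>"):
        return ruleset29(addr[:-3], counter)
    return addr


def resolve(addr, ruleset0):
    """Run ruleset 3 once (as sendmail does itself), then the given ruleset 0."""
    counter = {"ruleset3": 0}
    addr = canonicalize(addr, counter)
    return ruleset0(addr, counter), counter["ruleset3"]
```

In this model resolve("user@host", ruleset0_original) runs the
stand-in twice and resolve("user@host", ruleset0_modified) runs it
once; both return the same address.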

>2) The internet domain-style naming scheme inherently removes the
>notion of a host's network from its name.  ...
>	 ... needs a mechanism to determine which network gets a given
>piece of mail.  Up to now, our site has had atomic hostnames and has
>been able to use a sendmail class (pretty much) as follows:
>	/* The class 'S' includes all primary host names from host table*/
>CS/etc/hosts "%*s %s"

I trust you mean FS/etc/hosts ...

>R$*<@$*>$*		$:$1<@$[$2$]>$3		# map aliases to primary names
>R$*<@$=S$*>$*		$#tcp$@$2$:$1<@$2$3>$4	# deliver mail to primary hosts
>	 ... our host tables are reasonably active and cannot reasonably
>be included directly in the configuration file.  The problem is that
>host names are no longer atomic since period ('.') is both a sendmail
>operator and part of every domain-style hostname.  How can I correctly
>identify tcp (or usenet) hosts in sendmail in order to select the correct
>network?

This is definitely a real problem.  Many good authors don't realise
that $=S matches exactly one token, and therefore cannot match any
host.domain-style name (three or more tokens).
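
To make the token count concrete, here is a rough Python sketch (not
sendmail code) of splitting a name on operator characters and then
testing it against a class.  The operator set is an assumption
modelled on the usual "Do" line in the distributed configurations;
tokenize and matches_class are illustrative names only.

```python
OPERATORS = set(".:%@!^=/[]")   # assumed operator characters


def tokenize(address):
    """Split address so that each operator character is its own token."""
    tokens, word = [], ""
    for ch in address:
        if ch in OPERATORS:
            if word:
                tokens.append(word)
                word = ""
            tokens.append(ch)
        else:
            word += ch
    if word:
        tokens.append(word)
    return tokens


def matches_class(address, members):
    """Mimic a one-token class match: succeed only on a single token."""
    tokens = tokenize(address)
    return len(tokens) == 1 and tokens[0] in members
```

With this sketch, "hadron" matches a class containing "hadron", but
"hadron.COM" never matches, whatever the class contains, because it
is the three tokens "hadron", ".", "COM".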

But first:

>	I have solved the problem locally by modifying the sendmail
>routine maphostname() to append ".tcp" when the calls gethostbyname(3)
>or gethostbyaddr(3) indicate the host is reachable via TCP.

This is rather clever, but involves (gasp) modifications to code.

What I've been forced to resort to is twofold.  First, at least
internally, you should try to enforce a standard domain on your
machines, which is only reasonable if they ARE all yours.  More
important for what you're trying to do, you'll have to have more
than one list: L=hosts.lan, T=hosts.tcp, or whatever.  Then you
can test for <@$=L.$=D> or whatever, and appropriately invoke lan.
I suggest that tcp be a default, since /etc/hosts with tcp is
rather large ...
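
The selection logic I have in mind can be sketched in Python as
below.  This is only a model of a test like <@$=L.$=D> with tcp as
the fallback, not a configuration fragment; select_mailer and the
host and domain names in the usage note are hypothetical.

```python
def select_mailer(host, lan_hosts, local_domains):
    """Pick "lan" for host.domain names found in both lists, else "tcp".

    Mirrors a pattern of one class token, a dot, and one class token:
    the host must be in the lan list and the single-token domain must
    be one of our own domains.  Anything else takes the tcp default.
    """
    parts = host.split(".")
    if len(parts) == 2 and parts[0] in lan_hosts and parts[1] in local_domains:
        return "lan"
    return "tcp"
```

With lan_hosts = {"sun1"} and local_domains = {"INMET"}, the name
"sun1.INMET" selects lan, while "seismo.CSS.GOV" and "sun1.ARPA"
both fall through to tcp.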
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
			jsdy@hadron.COM (not yet domainised)