[comp.protocols.tcp-ip] Food for thought

bzs@ENCORE.COM (Barry Shein) (11/09/88)

Is it just possible that we have tolerated a mail system which has
grown so complicated that a program like sendmail with its generalized
regular expressions pattern matchers and massively complicated rules
(and several other complex "support" programs to figure out paths,
like pathalias) is about the simplest program which can do an adequate
job acting as a gateway?

Maybe it's time to start asserting some authority and saying that by
DD/MM/YY only host@legal.domain.name will be accepted and to hell with
all this punctuative creativity. Let creative people find ways to
conform to a standard.

I'll let sendmail speak for itself...(there are over 300 such lines
in a typical config file):

# more miscellaneous cleanup
R$+			$:$>8$1				host dependent cleanup
R$+:$*;@$+		$@$1:$2;@$3			list syntax
R$+@$+			$:$1<@$2>			focus on domain
R$+<$+@$+>		$1$2<@$3>			move gaze right
R$+<@$+>		$@$>6$1<@$2>			already canonical

Complexity breeds error.

(note: This is *not* a bash at sendmail, I honestly have never been
able to think of a much different way to effectively handle the
current miasma of addressing schemes.)

	-Barry Shein, ||Encore||

david@ms.uky.edu (David Herron -- One of the vertebrae) (11/21/88)

In article <8811090326.AA26953@multimax.encore.com> bzs@ENCORE.COM (Barry Shein) writes:
>
>Is it just possible that we have tolerated a mail system which has
>grown so complicated that a program like sendmail with its generalized
>regular expressions pattern matchers and massively complicated rules
>(and several other complex "support" programs to figure out paths,
>like pathalias) is about the simplest program which can do an adequate
>job acting as a gateway?

AHEM!  If you'll look over at CSnet-Relay you'll see, not sendmail,
but MMDF instead.  MMDF is proof that you don't need a nearly impossible
to use language to do the things you just mentioned.  With MMDF it's
a mixture of C and certain details in configuration files.

>Complexity breeds error.

YES  I agree completely.

>(note: This is *not* a bash at sendmail, I honestly have never been
>able to think of a much different way to effectively handle the
>current miasma of addressing schemes.)
>
>	-Barry Shein, ||Encore||

So I'll describe how MMDF handles it and see what you think.  I apologize
if this isn't very well written, but it's off the cuff.

The model is that you have "channels" through which all mail passes,
either going in or out.  In times where there are two different
addressing methods on each end of the channel, the channel program
converts one to the other as appropriate.  It uses pure RFC822
internally, except that it supports the %-hack as well.  That is, a
message coming in from uucp land has !-addresses converted to
@-addresses but unfortunately in a mixed mode:

	known.site!unknown.site!user
to	unknown.site!user@known.site

Which we all know to be a bad thing.  But I can live with it ...
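
To make the mixed-mode rewrite above concrete, here is a minimal Python
sketch. It is not MMDF code: the function name, the `known_hosts` set,
and the rule of peeling off only the first hop are assumptions made for
illustration.

```python
def bang_to_mixed(address: str, known_hosts: set) -> str:
    """Rewrite 'known!rest' as 'rest@known'; leave anything else untouched.

    Assumption: only the first hop is examined, and only when it is a
    host the local system knows about.
    """
    first_hop, sep, rest = address.partition("!")
    if not sep or not rest or first_hop not in known_hosts:
        return address  # no leading known hop, so nothing to convert
    return f"{rest}@{first_hop}"


if __name__ == "__main__":
    print(bang_to_mixed("known.site!unknown.site!user", {"known.site"}))
    # prints: unknown.site!user@known.site
```

The result still carries a bang path in its local part, which is exactly
the mixed form described above.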

Once the message is within the system the various addresses
are looked up in its database of the known world.  There are
two sorts of tables, domain tables and channel tables.  Domain
tables specify which domains exist and what hosts are in the
domain.  Channel tables specify which channel to use to reach
particular hosts.

To configure the gateway into the system it's simple enough: put
the following into a domain table:

	berkeley.edu: ucbvax.berkeley.edu, g.ms.uky.edu

Now, the left hand side (LHS) specifies the domain for this record.
The RHS (in domain tables) doesn't have to be specified fully;
each domain table has an upper-level portion which is
attached to the name.  That is,

	g: g.ms.uky.edu

is in the "ms.uky.edu" domain table (it's also in the "uky.csnet"
table, which is another flexibility).

Now, the line with berkeley.edu.  On the LHS, if the domain we're
considering matches it then we use the information on the RHS to
generate a route to reach it.  That form with commas and such
ends up generating an RFC822 route-addr like so:

	<@g.ms.uky.edu,@ucbvax.berkeley.edu:user@bleah.berkeley.edu>

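The domain-table lookup and route generation described above can be
sketched roughly as follows. This is a Python illustration, not MMDF's
implementation: the dictionary layout, the function names, and the
longest-suffix matching rule are assumptions. The hop order is inferred
from the example, where the route lists the RHS hosts in reverse.

```python
# Hypothetical in-memory form of the domain table entry quoted above.
DOMAIN_TABLE = {
    "berkeley.edu": ["ucbvax.berkeley.edu", "g.ms.uky.edu"],
}


def find_entry(host, table):
    """Return the RHS of the longest LHS that equals host or is a parent of it."""
    labels = host.lower().split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in table:
            return table[candidate]
    return None


def route_addr(address, table):
    """Build an RFC822 route-addr for 'user@host' from the table entry."""
    local, sep, host = address.rpartition("@")
    if not sep:
        return address  # not an @-address; leave it alone
    hops = find_entry(host, table)
    if not hops:
        return f"<{address}>"
    # Assumption taken from the example: the route is the RHS reversed.
    route = ",".join("@" + hop for hop in reversed(hops))
    return f"<{route}:{local}@{host}>"


if __name__ == "__main__":
    print(route_addr("user@bleah.berkeley.edu", DOMAIN_TABLE))
    # prints: <@g.ms.uky.edu,@ucbvax.berkeley.edu:user@bleah.berkeley.edu>
```
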
Now, MMDF could support a different addressing scheme internally
and still use that same syntax in domain files to specify routes.
Right?  (In fact, I understand that Steve Kille is working on
a version of MMDF which does X.400 internally, but I don't know
any details).

Now, if once it reached "g" the message could get to ucbvax only
by UUCP, the UUCP channel will translate to:

<route-to-ucbvax>!ucbvax!ucbvax.berkeley.edu!bleah.berkeley.edu!user

The "<route-to-ucbvax>" stuff is stored in the channel table.  The LHS
in a channel table has the domain-name for this entry, and the RHS has
routing information.  In the SMTP channel, for example, it has the IP address of
the host.  In the UUCP channel it has pathalias output.
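
A rough Python sketch of that UUCP translation follows. It assumes the
channel table stores pathalias-style templates with a `%s` placeholder;
the path `hopa!hopb!ucbvax!%s` is a made-up stand-in for
"<route-to-ucbvax>", and the function names are hypothetical.

```python
# Hypothetical UUCP channel table: LHS is a domain name, RHS is a
# pathalias-style path template. 'hopa!hopb' stands in for the real route.
UUCP_CHANNEL_TABLE = {
    "ucbvax.berkeley.edu": "hopa!hopb!ucbvax!%s",
}


def parse_route_addr(addr):
    """Split '<@a,@b:user@host>' into (['a', 'b'], 'user', 'host')."""
    body = addr.strip()
    if body.startswith("<") and body.endswith(">"):
        body = body[1:-1]
    hops = []
    if body.startswith("@"):
        route, _, body = body.partition(":")
        hops = [hop.strip().lstrip("@") for hop in route.split(",")]
    user, _, host = body.rpartition("@")
    return hops, user, host


def to_uucp(addr, table):
    """Translate a route-addr into a bang path via the channel table."""
    hops, user, host = parse_route_addr(addr)
    next_hop = hops[0] if hops else host
    template = table[next_hop]  # raises KeyError if the host is unknown
    remainder = "!".join(hops + [host, user])
    return template.replace("%s", remainder, 1)


if __name__ == "__main__":
    print(to_uucp("<@ucbvax.berkeley.edu:user@bleah.berkeley.edu>",
                  UUCP_CHANNEL_TABLE))
    # prints: hopa!hopb!ucbvax!ucbvax.berkeley.edu!bleah.berkeley.edu!user
```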

The only thing I know of which we cannot do here is to generate routing
which only works by using the %-hack.  The example is our IBM
mainframe, one of them anyway.  They have TCP/IP stuff on one of them
only.  If I were to configure things such that bitnet mail went through
their SMTP server (so that it would avoid the BITNET line between here
and there) ... well, the SMTP server on the mainframe chokes on
"user@host.bitnet" addresses.  But if I instead were able to generate
"user%host.bitnet@ukcc.uky.edu" it would work fine.  So I end up
routing all that stuff by way of BSMTP over bitnet lines, but the
Crosswell mailer they use doesn't understand RFC822 routes so I have to
munge the routing to make "@ukcc.uky.edu:user@host.bitnet" ->
"user@host.bitnet".  Sigh.

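The two rewrites mentioned in that paragraph are simple string
operations, sketched here in Python. The function names are made up for
illustration; the first mirrors the munging described for the BSMTP
path, and the second shows what the %-hack form would look like, which
is the form the post says MMDF cannot generate.

```python
def strip_route(addr: str) -> str:
    """Drop a leading RFC822 route: '@relay:user@host' -> 'user@host'."""
    if addr.startswith("@") and ":" in addr:
        return addr.partition(":")[2]
    return addr


def percent_hack(addr: str, relay: str) -> str:
    """Wrap 'user@host' for delivery via relay: 'user%host@relay'."""
    user, sep, host = addr.rpartition("@")
    if not sep:
        return addr  # nothing to wrap
    return f"{user}%{host}@{relay}"


if __name__ == "__main__":
    print(strip_route("@ukcc.uky.edu:user@host.bitnet"))
    # prints: user@host.bitnet
    print(percent_hack("user@host.bitnet", "ukcc.uky.edu"))
    # prints: user%host.bitnet@ukcc.uky.edu
```
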


I haven't yet read the documents for ZMAIL, at least not closely,
but what I read of it looks good.
-- 
<-- David Herron; an MMDF guy                              <david@ms.uky.edu>
<-- ska: David le casse\*'      {rutgers,uunet}!ukma!david, david@UKMA.BITNET
<--
<-- Controlled anarchy -- the essence of the net.

dcrocker@TWG.COM (Dave Crocker) (11/22/88)

This is to elaborate on David Herron's reply, about MMDF.

The philosophical differences between sendmail and MMDF have always been
quite basic.  sendmail puts essentially full address parsing and mapping
decisions into the hands of the system administrator.  MMDF builds the
rules into code and gives the administrator access to certain parameterized
choices.  While the latter would seem to be simpler for administrators, the
presence of the domain system, as a logical indirection to the routing
information, makes the actual practice more painful than one would like.

In any event, when a message comes in, from ANYWHERE, address specifications
are mapped to a single canonical representation.  This works because,
ultimately, addresses reduce to a routing sequence.  That is, most of the
brouhaha about addresses has to do with syntax.  The range of semantic
choices seems to be rather small.

When the message is being sent, the channel doing the sending knows how to
map from canonical to channel-specific format.  This is built into the code.
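
The idea that addresses reduce to a routing sequence can be illustrated
with a small Python sketch. The `Route` type and the parser and renderer
names are assumptions chosen for this example; MMDF's actual canonical
form is RFC822-based and is not reproduced here.

```python
from typing import NamedTuple, Tuple


class Route(NamedTuple):
    hops: Tuple[str, ...]  # hosts to traverse in order; the last is the final host
    user: str


def from_bang(addr: str) -> Route:
    """'relay!host!user' -> hops (relay, host), user."""
    *hops, user = addr.split("!")
    return Route(tuple(hops), user)


def from_route_addr(addr: str) -> Route:
    """'<@relay:user@host>' -> hops (relay, host), user."""
    body = addr.strip().strip("<>")
    relays = []
    if body.startswith("@"):
        route, _, body = body.partition(":")
        relays = [hop.strip().lstrip("@") for hop in route.split(",")]
    user, _, host = body.rpartition("@")
    return Route(tuple(relays + [host]), user)


def from_percent(addr: str) -> Route:
    """'user%host@relay' -> hops (relay, host), user."""
    local, _, relay = addr.rpartition("@")
    user, *inner = local.split("%")
    # In a%b%c@d the message visits d first, then c, then b.
    return Route((relay, *reversed(inner)), user)


def to_bang(route: Route) -> str:
    return "!".join((*route.hops, route.user))


def to_route_addr(route: Route) -> str:
    *relays, host = route.hops
    if not relays:
        return f"<{route.user}@{host}>"
    prefix = ",".join("@" + hop for hop in relays)
    return f"<{prefix}:{route.user}@{host}>"


if __name__ == "__main__":
    canonical = Route(("relay", "host"), "user")
    assert from_bang("relay!host!user") == canonical
    assert from_route_addr("<@relay:user@host>") == canonical
    assert from_percent("user%host@relay") == canonical
    print(to_bang(canonical), to_route_addr(canonical))
```

Three different syntaxes parse to the same sequence, and each outgoing
format is just a different rendering of it.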

To anticipate a concern:  You might fear that this makes MMDF too rigid in
its address handling.  To some extent, this is correct.  And intentional.

In reality, it is not very often that a new mapping algorithm needs to be
invented.  On the other hand, configuration files need to be built quite
frequently.  The world is quite sensitive to incorrect mappings and it is
not always easy to specify the mapping -- Eric Allman's efforts with
developing a language for it, in sendmail, were quite impressive -- and,
sure enough, random folk, like system administrators, get it wrong frequently.

Hence the decision to bury as much of this thought into rigid code.

Dave

lubich@ethz.UUCP (Hannes Lubich) (11/28/88)

Dave is absolutely right. I've been running a Swiss-wide mail gateway to
BBN for 2 years now using mmdf and UBC's EAN, since our academic mail is
almost completely based on EAN (i.e. X.400 mail).
The mmdf philosophy of compiling most of the mapping rules and still giving 
you as much flexibility in table configuration as possible really helps in 
setting up and managing such a gateway in a fast and reliable way.
Besides, when EAN changed from version 1 to version 2, we needed a different 
interface between mmdf and EAN. The channel philosophy of mmdf again helped
us in coding this new channel in a very short time, compared with other mail 
gateways developed here.
Cheers
	--HaL

-- 
~ UUCP/Usenet 	       : {known world}!mcvax!cernvax!ethz!lubich
~ or                   : lubich@ethz.uucp
~ CSNET/ARPA/BITNET    : lubich@ifi.ethz.ch / lubich%ifi.ethz.ch@relay.cs.net
~ The usual disclaimer : No, it wasn't me, somebody must have used my account.