[net.mail.headers] "blaming Unix SendMail"

MRC@SU-SCORE.ARPA (Mark Crispin) (04/17/84)

     I will probably regret this, but I couldn't help myself.  It seems
that any program which requires configuration files of such complexity
that just about every site gets them wrong is sadly lacking in the most
basic principles of good software design.  Unix SendMail seems to be
such a program.

     Tell me, why does SendMail need such complex configuration files?
Wouldn't a preferable scheme be to look at one's environment at runtime
and do the right thing by default?

     I guess I'm wedged.  After all, I program on dinosaurs using the
obsolete thinking that software should do the right thing in its default
unmodified state without requiring such elaborate configuration procedures.
-------

honey@down.UUCP (code 101) (04/18/84)

why does sendmail require such complex configuration tables, you ask?
read the documentation (sendmail/doc/op.me):

	There is one point that should be made clear immediately:  the
	syntax of the configuration file is designed to be reasonably
	easy to parse, since this is done every time sendmail starts
	up, rather than easy for a human to read or write.

let's just ignore the fact that parsing was made trivial by geniuses of
decades past, and go blithely along, prematurely optimizing.  progress?
who needs it?  feh.
	peter honeyman

mark@cbosgd.UUCP (Mark Horton) (04/18/84)

	     Tell me, why does SendMail need such complex configuration files?
	Wouldn't a preferable scheme be to look at one's environment at runtime
	and do the right thing by default?

This is a little like claiming your computer should "do the right thing
by default" when presented with a null program.  "The right thing" is
highly subjective, and probably depends on reading someone's mind.  For
example, how should "a!b@c" be interpreted?  "(a!b)@c" as required by
the ARPANET?  "a!(b@c)" as required by existing UUCP software?  Someone
has to make a decision, and this decision must be implemented in the
sendmail config file.  In fact, "looking in one's environment" is done on
UNIX largely by reading a file.  Wouldn't such a file be called a config file?
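
A minimal sketch of the two readings of "a!b@c", in Python rather than
anything sendmail-specific.  The function names are hypothetical and are
not taken from any mailer; the point is only that both parses are
mechanical and that something outside the address has to choose one.

```python
# Illustrative only: two ways to split the mixed address "a!b@c".
# Neither function comes from sendmail; both names are hypothetical.

def parse_at_first(addr):
    """ARPANET-style reading, (a!b)@c: '@' binds loosest, so the
    message goes to host c, which is handed the local part 'a!b'."""
    local, _, host = addr.rpartition("@")
    return {"next_hop": host, "remainder": local}


def parse_bang_first(addr):
    """UUCP-style reading, a!(b@c): '!' binds loosest, so the
    message goes to host a, which is handed 'b@c'."""
    host, _, rest = addr.partition("!")
    return {"next_hop": host, "remainder": rest}


if __name__ == "__main__":
    print(parse_at_first("a!b@c"))
    print(parse_bang_first("a!b@c"))
```

The same string yields a different next hop under each rule, which is the
decision that has to live in a configuration file (or in code).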

Sure, sendmail configuration files are complex.  So is machine language.
It doesn't mean that you shouldn't have any machine language on your
machine.  It means you write a compiler from a high level language.

Sendmail's config files are very complex because they do so much.  What
should be done is for someone to write a simple compiler or interactive
front end that asks a few questions and generates the appropriate file.

guy@rlgvax.UUCP (Guy Harris) (04/18/84)

>why does sendmail require such complex configuration tables, you ask?
>read the documentation (sendmail/doc/op.me):

>	There is one point that should be made clear immediately:  the
>	syntax of the configuration file is designed to be reasonably
>	easy to parse, since this is done every time sendmail starts
>	up, rather than easy for a human to read or write.

>let's just ignore the fact that parsing was made trivial by geniuses of
>decades past, and go blithely along, prematurely optimizing.  progress?
>who needs it?  feh.

Not entirely fair; the comment in the documentation refers to the *syntax*
of a configuration file (which is, admittedly, baroque) but the person
who complained about "sendmail" was complaining about the baroque *semantics*
of the configuration file.  Unfortunately, if you want a program which is
all things to all people (like "sendmail" is) you aren't going to get
something which is straightforward.  Frankly, I bless "sendmail"s ability
to be made to do lots of mail handling tasks without having to dive into
the source code (which is a poor idea because 1) it's complicated and you
may break it, 2) it means you've got private versions of "sendmail" running
around all over the place - Allman's paper on "sendmail" points out that
the "sendmail" configuration file was intended to make it possible to run
the same binary everywhere, and 3) lots of people out there don't have
source anyway).  I wish it were more straightforward, but I don't know
whether that's possible.

	Guy Harris
	{seismo,ihnp4,allegra}!rlgvax!guy

cak@Purdue.ARPA (Christopher A Kent) (04/19/84)

Header-People seems hardly to be the place to launch into TOPS-20 vs.
Unix flamage. Please desist.

Cheers,
chris
----------

lew@t4test.UUCP (Lew Mullen) (04/19/84)

So that's why it doesn't work ... what configuration files ?

				t4test!lew
				lew mullen

smoot@ut-sally.UUCP (Smoot Carl-Mitchell) (04/19/84)

As the local "sendmail hacker" here at UT, I do find the configuration
file syntax to be difficult to understand at times.  Fortunately,
Eric Allman did provide some pretty good template files to get users
started and I greatly appreciated that.  Most of the configuration
files I have modified are based upon one or more of Eric's files.
Without those files I would have spent a great deal more time
writing the configuration files we need.

In defense of sendmail, it was designed to operate in a heterogeneous
network mail environment and serve as a bridge between user
communities using different mail address syntax.  As a first stab
at trying to address the problem, I think Eric did a fine job.
It can be improved.  A more "user friendly" syntax would certainly
be useful and while not an easy job it is not impossible.
-- 
Smoot Carl-Mitchell, CS Dept. University of Texas at Austin
{seismo, ctvax, ihnp4}!ut-sally!smoot, smoot@ut-sally.{ARPA, UUCP}

wcwells%ucbopal.CC@Berkeley.ARPA (04/19/84)

I think we can agree that "machine readable" files are not necessarily
"human readable".  Yes, the sendmail configuration file is susceptible
to human error because it is complex, terse, and has to be manually
edited for installation.  I think the BSD Unix development group could
make installation easier by supplying more examples of sendmail
configuration files (e.g., one for a site in one mail domain, one for a
site in a subdomain, one for a site in a gateway domain [multiple local
domains], etc.).  A program that asks the installer appropriate
questions and generates a configuration file for the more common types
of installations might reduce the number of errors being made.

Bill Wells
wcwells@Berkeley.ARPA
ucbvax!wcwells

VAF@CMU-CS-C.ARPA (Vince Fuller) (04/19/84)

One of the things that has continually amazed me about Unix is the steadfast
refusal to implement anything as something other than a text file. This whole
argument is silly. Why does sendmail (or the termcap library, or anything else
in Unixland for that matter) have to parse a rarely-changing text file every
time it is started? Why not have the file preparsed into a binary file that
imposes structure on the options for ease of access? This kind of simple
optimization (which is the way the TOPS-20 mailsystem handles address lists)
satisfies both requirements: the text file "source" can be easy for humans
to read - it doesn't have to be parsed often - so what if parsing takes a
little while longer, and the binary can be optimized for speed (probably would
do a lot better than the current approach). I'm sorry, but I won't buy the
"it has to be parsed quickly" argument.

--Vince

P.S. My apologies to those offended by the flaming tone of the message. I am
     just tired of seeing so much braindamage like this in Unix - it really
     isn't that hard to sit down and THINK about how to implement something
     before you go out and do it.
-------

solomon@wisc-crys.arpa (Marvin Solomon) (04/20/84)

The change you suggested is already in sendmail (the version distributed
with 4.2 UNIX).  I believe I was the one who suggested the particular
"trick" now used.  I thought of it years ago while trying to make Adventure
start up more quickly.  Take ANY program that has a complicated and
time-consuming startup and add two options:  When the program is called
with the first option, it goes through the initialization, then dumps
everything willy-nilly into a file.  (In C, that can usually be done
by simply dumping bss: everything from _edata to _end; in FORTRAN it
can be done by dumping COMMON).  When called with the second option, the
program skips the initialization and simply loads the file.
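
The two-option idea can be sketched in Python as follows.  This is an
analogy, not the bss-dumping trick itself: it assumes the initialized
state is an ordinary dictionary and uses the standard pickle module to
dump and reload it.  The toy "key = value" config format and the names
freeze() and thaw() are made up for the illustration.

```python
import os
import pickle
import tempfile


def slow_initialize(config_text):
    """Stand-in for an expensive startup: parse 'key = value' lines."""
    state = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        state[key.strip()] = value.strip()
    return state


def freeze(config_text, freeze_path):
    """First option: run the full initialization, then dump the result."""
    state = slow_initialize(config_text)
    with open(freeze_path, "wb") as f:
        pickle.dump(state, f)
    return state


def thaw(freeze_path):
    """Second option: skip initialization and load the dumped state."""
    with open(freeze_path, "rb") as f:
        return pickle.load(f)


if __name__ == "__main__":
    text = "# toy config\nhost = example\nrelay = gateway\n"
    path = os.path.join(tempfile.mkdtemp(), "toy.fc")
    assert freeze(text, path) == thaw(path)
```

One caveat with any scheme of this kind: the dumped file goes stale when
the text source changes, so it has to be regenerated after every edit.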

Now the bad news:  The syntax for sendmail configuration files was designed
before the "freeze-file" idea was incorporated, but even though the
justification is now gone, nobody is very anxious to change it.
I think the best course is simply to improve the instructions for creating
configuration files, but that's much more boring than creating more software
:-)

Sendmail was designed and implemented by Eric Allman.  If you disagree
with some aspect of its design, argue with him.  I believe he's already
stated publicly that he's sorry he made the configuration file so cryptic,
and if he had it to do over, he'd do it differently.  

Comments like "I'm just tired of seeing so much braindamage like this in
Unix" are really off the mark.  I'm tired of seeing so much sectarian strife.

MRC@SU-SCORE.ARPA (Mark Crispin) (04/20/84)

     To answer smoot@UT-SALLY's defense of Sendmail, I should point
out that the TOPS-20 mailsystem (in particular, its MMailr module)
also operates in a heterogeneous network mail environment.  MMailr
doesn't require gothic configuration files to do the right thing.

     I will confess that MMailr doesn't try to bridge between different
mail syntaxes.  user@host is a perfectly reasonable syntax to standardize
on, and I see no particular reason to encourage relative addressing
unless it is absolutely necessary.  What does
	foo!bar!rag!zowie
mean when host bar knows of TWO DIFFERENT machines called rag?  Can't
happen, you say?  Nonsense; it is a real problem between Internet and
several University LAN's right now.  In other words, relative addressing
is only safe to use in the cases where it is unnecessary!
-------

pallas@Pescadero.ARPA (Joseph I. Pallas) (04/21/84)

Sorry, MRC, but relative addressing is still necessary even when it is
safe to use.  In your example,

		foo!bar!rag!zowie

it's still possible that foo doesn't know how to talk to ANY machine
named rag, let alone two of them.  That's why SMTP supports source
routing.  Of course, domains will change all that Real Soon Now.

joe
	
	

MRC@SU-SCORE.ARPA (Mark Crispin) (04/21/84)

SMTP source routing is completely useless, due to restrictions on
what may be in the SMTP source route.  It might as well not exist.

My point was that relative addressing is not at all a desirable
thing.  It should only be looked at as a last resort.  Within a
single naming registry, it should be possible to use absolute
addressing.  Between naming registries, it should be possible to
give "an absolute address within an absolute registry" -- this
was the original concept behind domains although now domains refer
to administrative rather than technical entities (naming registries
traditionally have followed the latter rather than the former).
-------

chuqui@nsc.UUCP (Chuq Von Rospach) (04/21/84)

What I've wondered is why someone hasn't simply written a front end that
creates sendmail files. Isn't that what computers are for? (Hmm, with the
complexity of sendmail files, we are probably talking about one whale of a
parser/compiler.... Hmmm... )
-- 
From under the bar at Callahan's:		Chuq Von Rospach
{amd70,fortune,hplabs,menlo70}!nsc!chuqui	(408) 733-2600 x242

Never give your heart to a stranger, unless you are sure that you are dead.

chris@umcp-cs.UUCP (04/22/84)

Time to toss in another couple of pennies' worth of thought here:

MMDF (the Multi-Channel Memo Distribution Facility) has an interesting
way of "dealing with" UUCP syntax.  The basic idea behind MMDF is
actually quite similar to the ideas now in use in networking,
compiler design, computer design, microprocessor architecture, ...
(i.e., a lot of stuff).  That is:  do things modularly.

MMDF has a central input program (called submit).  This is not
(repeat NOT) a good user interface; all it's good for (and it's
pretty good for it) is taking a mail message (with a sender and a
list of addressees) and putting it into a queue.

It also has a central delivery program (called deliver).  This is
not an SMTP mailer, nor a UUCP mailer, nor anything else; all it's
good for is looking at queued messages and asking another program
to deliver them (or to tell it why it can't deliver them).

It then has a bunch of (mostly tiny) programs to do delivery for
various "channel"s.  There is one for local mail (delivery to
user's mailboxes).  There is another for CSNet PhoneNet mail [which
seems to be by far the buggiest, by the way].  I suppose that by
now there is one for X.25net mail.

And of course, there is one for UUCP mail.  It takes a message and
a list of addressees, breaks up the message (one for each address
as required by some older versions of /bin/rmail), and hands it to
the "uux" program, after editing message headers to match UUCP
requirements.

On the input side, the "/bin/rmail" program is completely rewritten.
It takes in a UUCP mail message and figures out the sender's name,
and the destination address.  It hands to submit a suitably edited
message for delivery to the destination.

-----------------
The interesting point about this whole system is that UUCP-style
addresses are meaningless to the MMDF system and never appear inside
it.  This really has quite a bit of flexibility, because with small
changes to the UUCP-in and UUCP-out programs, you can change the
way the UUCP-world sees you.  You never have to touch the queueing
system.  It also means that one can get rid of UUCP syntax in small,
relatively painless bits at a time.
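
The modular shape described above (one input program, one delivery
program, and a small program per channel) can be sketched like this.
It is a toy in Python, not MMDF code: the in-memory queue, the channel
functions, and the rule in pick_channel() are all assumptions made for
the illustration.

```python
from collections import deque

queue = deque()  # stands in for the on-disk mail queue


def submit(sender, recipients, body):
    """Central input: accept a message and queue it, nothing more."""
    queue.append({"sender": sender,
                  "recipients": list(recipients),
                  "body": body})


def local_channel(msg, rcpt):
    """Toy local channel: report delivery to a user's mailbox."""
    return f"delivered to mailbox {rcpt}"


def uucp_channel(msg, rcpt):
    """Toy UUCP channel.  A real one would edit the headers and invoke
    uux; this one only reports what it would do."""
    return f"handed to uux for {rcpt}"


CHANNELS = {"local": local_channel, "uucp": uucp_channel}


def pick_channel(rcpt):
    """Hypothetical rule: addresses ending in '.UUCP' use the UUCP channel."""
    return "uucp" if rcpt.upper().endswith(".UUCP") else "local"


def deliver():
    """Central delivery: walk the queue and ask the right channel to do
    the work, collecting what each channel reports."""
    results = []
    while queue:
        msg = queue.popleft()
        for rcpt in msg["recipients"]:
            channel = CHANNELS[pick_channel(rcpt)]
            results.append((rcpt, channel(msg, rcpt)))
    return results
```

Changing how one network sees the site means editing one channel
function; submit(), deliver(), and the queue are untouched.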

-----------------
This is not to say that MMDF is the best system there is.  There
were a few bugs in the original distribution, some serious, some
minor, and the code is incredibly inefficient in some respects.
To mail to a mere hundred people takes it many CPU-minutes, inspecting
your five hundred line alias file a hundred times.  A hundred and
fifty messages in the queue, and it takes ten minutes for deliver
to even start delivering.  And to look up a host name alone in the
four or five alias tables can take seconds.  If you put this together
on a Vax 11/780 whose afternoon load average is over 35, you can
have lots of fun trying to send mail.

But on the other hand, there's a new version that uses better
database schemes, multiple queues, etc., which gets around most of
that.  (And I've done some work myself; things are much better now
here.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci (301) 454-7690
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris.umcp-cs@CSNet-Relay

jsq@ut-sally.UUCP (John Quarterman) (04/22/84)

There is no advantage in UUCP style a!b!c relative addressing:  it
exists for historical reasons and those of us who deal with the UUCP
network as it stands are forced to deal with it because many (most?)
UUCP sites do not have any way of dealing with Internet-style
addresses.  One hopes the UUCP domain project will allow the phasing
out of the old-style relative addressing, but until then hosts that
deal with both UUCP and the Internet (and CSNET and several local
networks in ut-sally's case) must be able to map between addressing
syntaxes.  We at ut-sally encourage users to enter addresses in
Internet domain syntax and let sendmail call a program to look up a
UUCP route and convert to relative form.  We do not encourage the use
of the relative form; we tolerate it because we still have to.
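
A rough sketch of what such a lookup program might do, in Python.  The
route table, the host names in it, and the function name are all
hypothetical; this is not the program used at ut-sally, only the general
idea of mapping a domain-style address onto a stored bang path.

```python
# Hypothetical route table: destination host -> bang path from this site.
# "%s" marks where the user name goes.  All host names are made up.
ROUTES = {
    "siteb": "sitea!siteb!%s",
    "sitec": "sitea!siteb!sitec!%s",
}


def to_bang_path(address, routes=ROUTES):
    """Convert user@host.UUCP into a relative UUCP path via table lookup."""
    user, _, domain = address.rpartition("@")
    host, _, top = domain.rpartition(".")
    if not user or top.upper() != "UUCP":
        raise ValueError(f"not a .UUCP domain address: {address!r}")
    try:
        return routes[host.lower()] % user
    except KeyError:
        raise LookupError(f"no known route to {host!r}") from None


if __name__ == "__main__":
    print(to_bang_path("zowie@sitec.UUCP"))
```

The user types only the destination; the relative path is an
implementation detail that can change when the table does.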

The problem of bang!decwrl!rhea!bang!user having bang in there
twice was quite common for a while.  DEC's ENET was hooked up to
UUCP in a fashion that pretended all DEC's machines were actually
UUCP sites.  Yet there were at least 20 name duplications:  vortex,
for instance, existed on both sides of the gateway.  The DEC domain
has lately taken on some solidity and addresses now tend to appear
more like bang!decwrl!user%bang.DEC, which removes the problem.
Until the UUCP domain exists, there will always be the possibility
of duplicate sites within the UUCP network proper (seems like there
used to be two machines named "turtlevax").

Religious arguments about "TOPS-20 does it better" are beside the
point:  I've never run across anybody who likes sendmail's configuration
syntax, and I wish somebody *would* write a compiler to convert from
some more reasonable language, but sendmail does get the job done.
(At least when there's somebody as patient as Smoot to make it do it.)
Part of the job *is* converting among diverse addressing syntaxes.
-- 
John Quarterman, CS Dept., University of Texas, Austin, Texas 78712 USA
jsq@ut-sally.ARPA, jsq@ut-sally.UUCP, {ihnp4,seismo,ctvax}!ut-sally!jsq
					moskvax!kgbvax!mcc!ut-sally!jsq

MRC@SU-SCORE.ARPA (Mark Crispin) (04/25/84)

Perfectly fair.  It doesn't matter whether one refers to syntax or
semantics of the configuration files.  The fact is that it is very
hard to respect a mailsystem which engages in idiotic behavior because
the case of one of the single-letter options in the configuration file was
lower case instead of upper.  This has happened repeatedly in systems
across the ARPANET, and in-house at Stanford a number of times.

The developers and users of Unix and its software loudly proclaim how
advanced their software is and how it is the "clean non-kludgy" way
into the future (compared with systems such as DEC-20's which are
presumably "unclean and kludgy").  It seems quite fair to call SendMail
on this sort of thing.  If Unix wants to be more than a toy system it
has got to recognize that ergonomics is part of software engineering.
-------

robert@erix.UUCP (Robert Virding) (04/25/84)

Of course the best method is to have a human readable form which is then
parsed into a binary file which sendmail can read. The advantages are
obvious.

Is there any valid reason why this hasn't been done?  "We don't do things
that way on UNIX" is not a valid reason!

				Robert Virding

mark@cbosgd.UUCP (Mark Horton) (04/26/84)

	Why does sendmail (or the termcap library, or anything else
	in Unixland for that matter) have to parse a rarely-changing
	text file every time it is started?

Check out the following
	/usr/lib/sendmail -bz
This creates a binary file for quick startup.

Termcap is being replaced by terminfo, which also uses binary files.

So of course it doesn't have to.  But it's great for prototyping when
the first stab is in a text format.

	Mark

smoot@ut-sally.UUCP (Smoot Carl-Mitchell) (04/27/84)

> Of course the best method is to have a human readable form which is then
> parsed into a binary file which sendmail can read. The advantages are
> obvious.
> 
> Is there any valid reason why this hasn't been done?  "We don't do things
> that way on UNIX" is not a valid reason!
> 
> 				Robert Virding

I think this is a good idea in general.  I have some thoughts on such
an endeavor, but do not have the time to devote to it.  I also
want to see how domain based addressing evolves before tackling such
a task.  I think it is comparatively easy to at least make the
syntax of a configuration file human readable.  I too get tired
of all the "$*$-$+" stuff.
-- 
Smoot Carl-Mitchell, CS Dept. University of Texas at Austin
{seismo, ctvax, ihnp4}!ut-sally!smoot, smoot@ut-sally.{ARPA, UUCP}

wcwells@ucbopal.CC (William C. Wells) (07/10/84)

I would like to add to Mark Horton's reply:

	... For example, how should "a!b@c" be interpreted?
	"(a!b)@c" as required by the ARPANET?  "a!(b@c)" as required by
	existing UUCP software?

UUCP and Internet (ARPANET) mail address formats are not compatible.
Note that the "@" is the primary delimiter in the
Internet mail world and that "!" is the primary delimiter in
the UUCP mail world.

UUCP Mail/Internet Mail gateways should be doing the address conversion
as follows: UUCP to LOCAL, then LOCAL to DOD Internet; and DOD Internet
to LOCAL, then LOCAL to UUCP (where LOCAL is a gateway domain
addressing scheme that provides for identification of different
external mail addresses).

If you are using sendmail, I suggest that you adopt the Internet
mail address format as your LOCAL format. Then in the
local mail domain you can use <uucp-address@uucp-neighbor.UUCP>
to identify UUCP addresses locally.

Here are some basic rules that a sendmail mail transport agent
acting as a gateway can follow to handle different address formats.

In the following examples "ucbvax" (@Berkeley) is a UUCP/Internet mail
gateway. And "b@c" is a valid Internet mail address. Note that
mail addresses are modified both at entering the local (gateway) mail
domain and when leaving the local mail domain.


Incoming from 		In the LOCAL		Outgoing to
UUCP mail		gateway domain		DOD Internet mail

From: x!y!z		From: y!z@x.UUCP	From: x!y!z@Berkeley.ARPA
To:   ucbvax!b@c	To:   b@c		To: b@c


Incoming from 		In the LOCAL		Outgoing to
DOD Internet mail 	gateway domain		UUCP mail

From: b@c		From: b@c		From: ucbvax!b@c
To: x!y!z@Berkeley.ARPA	To:   y!z@x.UUCP	To: x!y!z


Note that there are two steps in the address conversion. First
the local mail agent information "ucbvax!" (from UUCP) or
"@Berkeley.ARPA" (from Internet) is stripped off, then the local
address remaining is interpreted.

For mail moving across the gateway, "@c" must be a full Internet mail
domain name.  That is, a UUCP/Internet gateway should be strict about
only accepting complete addresses from non-local mail transport agents.
If "@c" is not a full domain name, then the gateway can only assume
that "@c" is in the local domain of the mail gateway.
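
The two tables above can be expressed as a small Python sketch.  It is
not sendmail rewriting rules and covers only the four rows shown; the
function names are hypothetical, and the gateway names are the ones used
in the examples ("ucbvax" on the UUCP side, "Berkeley.ARPA" on the
Internet side).

```python
GATEWAY_UUCP = "ucbvax"           # gateway's UUCP name, from the examples
GATEWAY_DOMAIN = "Berkeley.ARPA"  # gateway's Internet domain, from the examples


def _bang_to_local(path):
    """x!y!z -> y!z@x.UUCP: the first hop becomes the .UUCP neighbor."""
    neighbor, bang, rest = path.partition("!")
    return f"{rest}@{neighbor}.UUCP" if bang else path


def uucp_to_local(addr):
    """Entering from UUCP: strip a leading 'ucbvax!' if present,
    otherwise rewrite the bang path into LOCAL form."""
    prefix = GATEWAY_UUCP + "!"
    if addr.startswith(prefix):
        return addr[len(prefix):]
    return _bang_to_local(addr)


def internet_to_local(addr):
    """Entering from the Internet: strip '@Berkeley.ARPA' if present,
    then interpret what remains as a bang path."""
    suffix = "@" + GATEWAY_DOMAIN
    if addr.lower().endswith(suffix.lower()):
        return _bang_to_local(addr[:-len(suffix)])
    return addr


def local_to_internet(addr):
    """Leaving for the Internet: y!z@x.UUCP -> x!y!z@Berkeley.ARPA;
    ordinary Internet addresses pass through unchanged."""
    local, at, domain = addr.rpartition("@")
    if at and domain.upper().endswith(".UUCP"):
        return f"{domain[:-len('.UUCP')]}!{local}@{GATEWAY_DOMAIN}"
    return addr


def local_to_uucp(addr):
    """Leaving for UUCP: y!z@x.UUCP -> x!y!z; anything else is routed
    back through the gateway as ucbvax!addr."""
    local, at, domain = addr.rpartition("@")
    if at and domain.upper().endswith(".UUCP"):
        return f"{domain[:-len('.UUCP')]}!{local}"
    return f"{GATEWAY_UUCP}!{addr}"
```

For instance, local_to_internet(uucp_to_local("x!y!z")) gives
"x!y!z@Berkeley.ARPA", and local_to_uucp(internet_to_local("b@c")) gives
"ucbvax!b@c", matching the From: rows of the two tables.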

Bill Wells
wcwells@Berkeley.ARPA
WCWELLS@UCBJADE.BITNET
ucbvax!wcwells