[net.general] new mail syntax standard published

mark (08/20/82)

Three new RFC's have been published on the ARPANET.  They deal with
the new mail standards.  Their numbers are
	RFC 819		Internet Syntax (18 pages)
	RFC 821		SMTP (Simple Mail Transfer Protocol) (68 pages)
	RFC 822		Header Format (replaces RFC 733) (47 pages)
I am posting the short one, RFC 819, to net.sources.  If there is
sufficient interest, I will post the others (although 821 probably
does not apply to our environment - it would apply to a local net
or a long-haul full duplex reliable net such as the arpanet) or
send them to interested people if there is a small number of such people.

There is one error in 819 you should be aware of.  Several examples
use names like "alpha!beta!gamma!john.UUCP" which do not contain an
"@" sign.   Since all internet addresses are of the form "user@host",
the dot should be changed to an at.  I'm not making this change since
it's not in the document as published, but readers should be aware
of the error.
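
For anyone patching the examples in a local copy, the correction is
mechanical. A rough sketch in Python; the name fix_819_example is made
up for this note and appears in no RFC, and it assumes the dot to be
changed is the last one in the name:

```python
def fix_819_example(name):
    """Change the final '.' of a name that has no '@' into an '@'.

    Hypothetical helper: it encodes the correction described above,
    not any rule stated in RFC 819 itself.
    """
    if "@" in name:
        return name                      # already of the form user@host
    head, dot, tail = name.rpartition(".")
    if not dot:
        return name                      # no dot to change; leave as is
    return head + "@" + tail


print(fix_819_example("alpha!beta!gamma!john.UUCP"))
# prints alpha!beta!gamma!john@UUCP
```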

I hope to see a timetable for conversion shortly.  My understanding
is that it is to be phased in gradually - first sites are supposed
to understand but not generate the new syntax, then sites are supposed
to understand both and generate the new, and eventually only care about
the new syntax.  An estimate is two months at each phase.

Software to support the new syntax exists.  If you are in Bell Labs or
otherwise licensed for UNIX 5.0 (presumably this means Bell System only)
I have a version of the 5.0 /bin/mail command that understands it and
generates new RFC 822 headers.  (It's the same one that went out for
testing a month or so ago - no bugs were found.)  If you are running
Berkeley UNIX, the Berkeley sendmail program supports this syntax.
(Sendmail isn't available to the general public yet but will be before
anyone seriously urges conversion.)  Both pieces of software are, of
course, free.  If you are running something else, some minor conversion
will probably be necessary, but no serious problems are expected unless
you have a home-grown mail system.  I understand that the authors of MMDF
and MH plan to support the new syntax, but I have nothing to do with that.

At this point I urge all UUCP sites to understand the new syntax and
to plan for conversion, but not yet to undertake actual conversion.
The intent is that UUCP will use the simplification "user@host.uucp",
at least initially.
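
To make the simplified form concrete, here is a minimal sketch in
Python of taking "user@host.uucp" apart. The function name and the
sample address are illustrative assumptions, as is the choice to
compare the domain part without regard to case:

```python
def split_address(addr):
    """Split 'user@host.domain' into (user, host, domain).

    Illustrative only. The domain comes back lower-cased (an assumption
    made here for easy comparison) and is '' when no domain is given.
    """
    user, at, rest = addr.rpartition("@")
    if not at or not user or not rest:
        raise ValueError("not of the form user@host: %r" % (addr,))
    host, _, domain = rest.partition(".")
    return user, host, domain.lower()


print(split_address("mark@cbosgd.uucp"))
# prints ('mark', 'cbosgd', 'uucp')
```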

	Mark Horton

gill (08/21/82)

#R:cbosgd:-254400:physics:16800001:000:8936
physics!gill    Aug 21 00:44:00 1982

The report "Domain Naming Convention for Internet User Applications"
(RFC 819), besides being the most poorly written piece of computer
documentation I have ever seen (why is it that committees always want to give
their document that "mechanically generated" tone?), is deeply flawed.

It's time to open discussion. As far as I'm concerned, nothing has been
"decided." While DOD sites may be under the committee's thumb, I
don't think the rest of the world is going to accept a "standard"
which not only fails to address any of the hard problems, but actually
prevents their eventual solution.

What is a mail destination address (MDA from now on)?

	An MDA is a way of uniquely specifying the destination of a message
	to a human being.

	In the context of computer mail, it is a way of uniquely specifying
	the destination of a message to a computer user.

What makes one format of MDA better than another? I believe the goals
are few and simple:

	a) Universality of name. No matter where the sender is,
	   my address is the same. I shouldn't have to sign my letters
	   "alice!gill OR gill@mc." Local abbreviations should only be a
	   means to aid human authors in generating the full MDA.

		Why is this important? I can think of two reasons:

		i) I can tell someone how to reach me without knowing
		   what system or network the sender is on.

		ii) No matter how my message is routed, the information
		    of how to reach me is never corrupted (819's
		    scheme of modifying from/to fields at network
		    boundaries belongs in a Freshman's C- computer
		    science essay). Thus, when my message is
		    mis-routed, the MDA remains robust enough to
		    get the message to me anyway.

		People who oppose universal addresses on historical grounds
		should be wary of confusing human design decisions
		with purely technical ones. The reason we don't always
		dial long phone numbers is that it's hard on the
		fingers and mind, and also that the older technologies
		limited the routing of a "local" call. Using a computer
		mailer, it is not unreasonable to demand all addresses be
		full length. The translation of "joe" to his address on
		earth is a job to which our machines are quite well suited.

	b) The MDA should contain only as much information as needed to
	   identify the recipient. This leaves the door open for future
	   enhancements such as dynamic routing. The .NETWORKNAME in RFC 819
	   closes this door by adding non-pertinent information, thus
	   straitjacketing the operation of all future mail programs.

		It would seem the committee members haven't paid very
		much attention to the large number of multiply connected
		systems. At MIT, most of the UNIX systems are accessible via
		both our local CHAOSNET and UUCP. The idea that
		I need two separate addresses on a single computer just
		doesn't make sense. Multiple parentage is NOT a subject
		to be put off for "future investigation." It is a pressing
		problem which is currently being handled in ad-hoc ways.
		RFC 819 would force us to abandon all hope of freeing
		the author from the requirement to know the network
		topology.

		The conceptual error in RFC 819 is very fundamental. It
		claims that UUCP, ARPA, etc. are environments in
		which computer systems operate. They are instead nothing but
		transport mechanisms, not at all indicative of
		the various administrative contours which the RFC strives
		to represent. Instead of substituting administrative
		topology for physical topology, they have substituted
		transport topology.

		When a system upgrades to a higher-performance network, it
		should not be necessary to send out address corrections
		for other systems to take advantage of the new capability.

	c) The MDA should in some way be isomorphic to the ways people
	   locate each other in the non-computer world.

		The whereabouts of a colleague are independent of which
		interstate I must traverse to visit his house. They are
		also independent of whether I take a car or a hot air
		balloon. Similarly, I should be required to know neither
		the path taken by my message NOR to what type of computer
		network he is connected.

		Instead of making the MDA format be determined by the
		hierarchy of computers, let's make it dependent on the
		hierarchy of people.

What is the topology of computer users?

	Though time has seen a vast increase in the number of computer
users, what has resulted is an increase in the width, not the depth, of
the hierarchy. At the fringes of this tree are people. The
next level up is the group. The definition of a "group" is easy:

	a collection of computer users with unique names and a
	capability of avoiding future name collisions.

A group may use several computers, but the users either know each other
well enough, or are somehow centrally administered, such that no names
are ever duplicated. Corporate departments or school projects are
examples of groups.

	Next comes the organization:

	Organizations are to groups as groups are to users; there are
sufficiently few groups in an organization such that the group names
can be unique.

Examples of organizations would be MIT, UCB, TI, BBN.

Though I presently see no need for more levels, they may become necessary.
Perhaps when two companies have the same name, we'll classify them by
their field of interest (i.e. what they make).

What of diversified corporations, or consultants associated with many firms?
Multiple parentage is staring us in the face. Let's deal with it now.

Since our definition of an MDA does not dictate network type or routing,
we can have multiple names for a particular human without fear of
jamming any physical path. Corporations which work in several fields will
have entries in each of those fields. People who are members of several groups
can have entries in each of those groups.
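
As a rough illustration of why multiple names are harmless once the
address says nothing about routing, here is a sketch in Python. Every
name in it (smith, acme, widgetco, the mailbox label) is invented, and
the directory is just an in-memory table, not a proposal for how a real
one would be stored:

```python
# Each key is (user, group, organization). Two keys may name one human,
# so both point at the same mailbox. All entries are made up.
DIRECTORY = {
    ("smith", "research", "acme"): "smith-mailbox",
    ("smith", "consulting", "widgetco"): "smith-mailbox",
}


def names_for(mailbox):
    """Return every name under which a mailbox is listed, sorted."""
    return sorted(name for name, box in DIRECTORY.items() if box == mailbox)


for name in names_for("smith-mailbox"):
    print(name, "->", DIRECTORY[name])
```

Neither entry mentions a network, so adding the second name cannot
disturb delivery to the first.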

With "future investigation" behind us, let us describe a possible syntax.

How about:

	user . group . organization

When the decision is made to add another level, the entire network will
have to append another "domain." This is inevitable if we want to maintain the
"universality" of addresses. Since it may someday be better to route a message
around the world to get next door, and since mistakes will inevitably happen,
it is important to attach a complete address to that message. As
messages may hop from deep inside one part of the administrative hierarchy
to another, incrementally building the address as the message passes
by will not work. This is the fundamental mistake of RFC 819. Network
topology is independent of the administrative topology. One cannot
depend in any way on the former to specify locations in the latter.
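
A minimal sketch of reading the proposed syntax, in Python. The names
MDA, parse_mda and format_mda are assumptions made for this example,
as is the sample address; the only thing taken from the text is the
three-level "user . group . organization" form:

```python
from typing import NamedTuple


class MDA(NamedTuple):
    """A complete address: nothing in it names a network or a route."""
    user: str
    group: str
    organization: str


def parse_mda(text):
    """Parse 'user . group . organization', ignoring spaces around dots."""
    parts = [p.strip() for p in text.split(".")]
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected user.group.organization: %r" % (text,))
    return MDA(*parts)


def format_mda(mda):
    """Write an address back out in compact form."""
    return ".".join(mda)


print(format_mda(parse_mda("gill . physics . mit")))
# prints gill.physics.mit
```

If a fourth level were ever added, the tuple and the length check are
the only places this sketch would need to change.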

Adding levels isn't as horrible as it sounds. I presently don't see a need
for more than 3 levels. Perhaps when there is a PC in every pot
we may need 4. Trees grow rather exponentially, you know.

The user interface, of course, will not require the author to specify
(or know) full blown addresses. In a mechanism similar to the UCB
per-user alias file, a mailer will be able to interpret the per author
context of a name at each level. What is important, however, is that
when the message enters the hostile network environment, it have a
dog license around its neck so that anyone who comes across it knows
where it should go.
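
One way the expansion might look, sketched in Python. This is not the
UCB alias mechanism; the function name, the alias table and the
default group and organization are all hypothetical, standing in for
whatever per-author context a real mailer would keep:

```python
def expand(name, aliases, default_group, default_org):
    """Expand what the author typed into a full user.group.organization.

    aliases maps a nickname to a full address. Otherwise any missing
    trailing levels are filled in from the author's own context.
    """
    if name in aliases:
        return aliases[name]
    parts = [p.strip() for p in name.split(".")]
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError("cannot expand %r" % (name,))
    defaults = [default_group, default_org]
    missing = 3 - len(parts)
    parts += defaults[len(defaults) - missing:]   # adds nothing if missing == 0
    return ".".join(parts)


aliases = {"pace": "pace.physics.mit"}            # invented alias entry
for typed in ("pace", "joe", "joe.ai", "joe.research.bbn"):
    print(typed, "->", expand(typed, aliases, "physics", "mit"))
```

The expansion happens before the message leaves the author's machine,
so what travels is always the full address.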

Routing:

	Routing will require a distributed dynamic map. It is not necessary
for all systems to know of each other, but instead merely in which
direction to send unresolved addresses. Instead of taking the easy
way out and adding .UUCP to an address, one can easily imagine
a system whereby frequent addresses and routes are cached on local systems with
"misses" being referred to more knowledgeable machines. To petrify
the route in the address is asking the human to be the cache instead of the
computer.
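
The cache-and-refer idea can be sketched in a few lines of Python. The
class name Resolver, the host names and the sample address are
assumptions for illustration; a real map would also need entries to
expire so that it stays dynamic, which this sketch leaves out:

```python
class Resolver:
    """Maps a full address to the host holding the mailbox.

    A machine answers from what it knows itself, then from its cache,
    and otherwise refers the question to a more knowledgeable machine.
    """

    def __init__(self, known=None, upstream=None):
        self.known = dict(known or {})   # addresses this machine is sure of
        self.cache = {}                  # answers learned from upstream
        self.upstream = upstream         # where to send misses, if anywhere
        self.referrals = 0               # how often we had to ask upstream

    def resolve(self, address):
        if address in self.known:
            return self.known[address]
        if address in self.cache:
            return self.cache[address]
        if self.upstream is None:
            raise LookupError("no machine knows " + address)
        self.referrals += 1
        host = self.upstream.resolve(address)
        self.cache[address] = host       # remember it for next time
        return host


regional = Resolver(known={"gill.physics.mit": "physics"})
local = Resolver(upstream=regional)
local.resolve("gill.physics.mit")        # miss: referred to regional
local.resolve("gill.physics.mit")        # hit: answered from the cache
print(local.referrals)
# prints 1
```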

	A mechanism should be provided to explicitly specify the route
of a message (for testing and bootstrapping new systems). This should
go in a totally different place than the address.
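
A sketch of what "a totally different place" could mean, in Python.
The Envelope type and the forward function are hypothetical, and the
hop names are only examples; the point shown is that consuming a hop
of the explicit route never touches the address:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Envelope:
    address: str                                      # who: never rewritten
    route: List[str] = field(default_factory=list)    # how: optional hops


def forward(envelope):
    """Return (next host or None, envelope to hand on).

    With an explicit route, the first hop is consumed. Without one, None
    means "use the normal routing map". The address is left alone.
    """
    if not envelope.route:
        return None, envelope
    return envelope.route[0], Envelope(envelope.address, envelope.route[1:])


env = Envelope("gill.physics.mit", ["mhtsa", "physics"])
hop, rest = forward(env)
print(hop, rest.route, rest.address)
# prints mhtsa ['physics'] gill.physics.mit
```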

To summarize, there are five basic flaws with RFC 819:

	i) It retains the coupling of administrative and network topologies.

	ii) It prevents intelligent routing to systems reachable by
	    several paths.

	iii) It is necessary to specify several addresses for a
	     single user on a system connected to several networks.

	iv) The optimal address of a destinee is dependent on where the sender
	    is, and thus not something the destinee can communicate
	    independently of network topology.

	v) Messages are extremely vulnerable to loss at inter-network
	   interfaces.

	I do not plan to implement or support RFC 819 addresses on
any of the machines I administer. The present address formats are much more
clumsy, but at least they don't have the gall to dictate the future.

	Gill Pratt

	...alice!gill OR gill@mc


	Constructive Criticism and Judicious Editing by

	Pace Willisson

	...mhtsa!physics!pace OR pace@mc
