[net.mail] parsing uusite!local@domain-spec - a partial solution

gregc@ucsfcgl.UUCP (Greg Couch%CGL) (01/09/85)

	How to generate uucp return addresses from the RFC 822 world

The basic problem:

	People want to use the RFC 822 domain addressing standard for all
mail addressing.  Most unix sites support uucp mail and many now support
the RFC scheme as well.  Return addresses given to uucp sites are often
generated assuming that the uucp site doesn't know about RFC 822, which
leads to an ambiguity when the uucp site does try to follow RFC 822.

	The return address is often generated as uusite!local@domain-spec.
It can be parsed in two ways (parentheses show grouping):

	uusite!(local@domain-spec) by uucp-only sites (wanted behavior)
	(uusite!local)@domain-spec by RFC 822 sites - RFC 822 overrides
					(i.e. uusite!local is the local part)

Return addresses need to be generated without assuming that uucp sites
know or don't know about RFC 822.
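
The two readings above can be illustrated with a minimal Python sketch. It is
not part of the original post; the function names are hypothetical, and it
assumes the simplest possible rule for each kind of site (split at the first
"!" versus split at the last "@").

```python
def parse_uucp_first(addr):
    """uucp-only reading: the text before the first '!' is the next site,
    and everything after it is handed to that site untouched."""
    site, sep, rest = addr.partition("!")
    return (site, rest) if sep else (None, addr)


def parse_rfc822_first(addr):
    """RFC 822 reading: the text after the last '@' is the domain,
    and everything before it is the local part."""
    local, sep, domain = addr.rpartition("@")
    return (local, domain) if sep else (addr, None)


addr = "uusite!local@domain-spec"
print(parse_uucp_first(addr))    # uusite!(local@domain-spec)
print(parse_rfc822_first(addr))  # (uusite!local)@domain-spec
```

The same string yields a different first hop under each rule, which is the
ambiguity being described.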

	The only solution which is compatible with uucp and RFC 822 is not
to give uucp sites addresses with @'s in them.

	I am proposing the following syntax for uucp return addresses
from RFC conforming machines:

		uusite!domain-spec!local

		e.g. ucsfcgl!Berkeley.Arpa!gregc
		or ucsfcgl!ucsfmis.ucsf!gregc		(local top level domain)
		or ucsfcgl!gregc			(no domain-spec)
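
As a rough illustration of the proposed syntax, the following Python sketch
builds a uucp return address from a local@domain-spec address. It is not the
sendmail.cf implementation mentioned later in the post; the function name and
the choice to split at the last "@" are assumptions made for the example.

```python
def uucp_return_address(uusite, address):
    """Rewrite 'local@domain-spec' as 'uusite!domain-spec!local'.

    An address with no '@' has no domain-spec and becomes 'uusite!local'.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        # No domain-spec present: plain uucp form.
        return f"{uusite}!{address}"
    return f"{uusite}!{domain}!{local}"


print(uucp_return_address("ucsfcgl", "gregc@Berkeley.Arpa"))
print(uucp_return_address("ucsfcgl", "gregc@ucsfmis.ucsf"))
print(uucp_return_address("ucsfcgl", "gregc"))
```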

Why this particular syntax?

	It is a normal uucp address up to the first !.
	It's an RFC 822 conforming address (all local).
	It preserves the locality of the original local part.
	! is already special, why use another character?
	It avoids any problems a different special character would create
		since it may mean something special to some other site.
	It's easy to reconstruct the domain address.
	It doesn't break uucp mail.
	It appears to be what the usenet uucp-mail project has decided upon
		(we will know soon, I hope).
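
To show why reconstructing the domain address is easy, here is a small Python
sketch of what an RFC conforming site might do with the part of the path left
after its own name has been stripped. The post does not say how a site tells
a domain-spec from a uucp site name; the test used here (a domain-spec
contains a ".") is purely an assumption for illustration, as is the function
name.

```python
def reconstruct_domain_address(path):
    """Turn 'domain-spec!local' back into 'local@domain-spec'.

    Assumption: a leading component containing '.' is a domain-spec;
    anything else is left alone as an ordinary uucp path.
    """
    head, sep, local = path.partition("!")
    if sep and "." in head:
        return f"{local}@{head}"
    return path


print(reconstruct_domain_address("Berkeley.Arpa!gregc"))
print(reconstruct_domain_address("gregc"))
```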

What this fixes:

	Uucp return addresses are legal for both uucp-only sites and
	RFC conforming sites.

What this breaks:

	It may break sites that try to shorten chains of uucp addresses.
	Sites that use . as a special character with higher precedence
		than !.
	Any other program that parses uucp mail addresses as site!site!....

What this doesn't fix:

	The RFC 822 routing syntax isn't handled correctly, since
		routes can have more than one @.  The hack solution is
		to strip routing information until only one @ is in the
		address and hope that is good enough (should be, but...).
	Uucp routing isn't solved.
	Mail that passes through sites that don't munge the From: and To:
		fields into RFC conforming form will still be impossible
		to reply to.
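
The "hack solution" for routes can be sketched in Python as follows. This is
only an illustration: it assumes the explicit route has the RFC 822 form
"@host1,@host2:local@domain" and simply discards everything up to the colon.
The function name is hypothetical.

```python
def strip_route(addr):
    """Drop an RFC 822 explicit route, leaving only 'local@domain'.

    Addresses that do not start with a route are returned unchanged.
    """
    if addr.startswith("@") and ":" in addr:
        return addr.split(":", 1)[1]
    return addr


print(strip_route("@foo,@bar:user@mumble"))
```

The routing information is lost, so the reply depends on the final domain
being reachable without it.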

In short:

	This proposal is only for generating return addresses for consumption
by uucp, it is not a general solution of mail addressing problems.  It is
designed for backward compatibility with uucp-only sites, but at the same
time, for compatibility with sites that are switching to the RFC notation.
I have a generic uucp/ethernet sendmail.cf file with this implemented and
tested; if anyone would like a copy, just send me mail.

Comments to:
			Greg Couch
			ucbvax!ucsfcgl!gregc
			ucsfcgl!gregc@Berkeley.Arpa
			gregc@ucsfcgl.Arpa (as soon as we're published)

mark@cbosgd.UUCP (Mark Horton) (01/10/85)

I heartily endorse Greg's proposal, since it's exactly the same as
what the UUCP project decided nearly a year ago to standardize on.
A document describing this standard in sufficient detail for an
implementation will be posted to this newsgroup in a few days - it's
nearly ready to go.  (It's still a draft and comments from the net
community will be invited when it's posted.  Changes are still possible.)

By the way, the domain!user syntax can indeed handle almost all cases
of legal 822 addresses with multiple @'s, because such multiple @'s can
only appear with explicit routes.
	@foo,@bar:user@mumble
becomes
	foo!bar!mumble!user
Addresses such as
	user@foo@bar
are not legal 822.
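
The conversion above can be sketched in Python. This is an illustration only,
not the UUCP project's draft standard; the function name is hypothetical, and
the sketch assumes the route, if present, is a comma-separated list of @hosts
ending in a colon.

```python
def route_to_bang_path(addr):
    """Convert '@foo,@bar:user@mumble' to 'foo!bar!mumble!user'.

    Raises ValueError for forms such as 'user@foo@bar', which are not
    legal RFC 822 addresses.
    """
    route, sep, mailbox = addr.rpartition(":")
    hops = []
    if sep:
        # Each route element looks like '@host'; drop the '@'.
        hops = [h.strip().lstrip("@") for h in route.split(",") if h.strip()]
    if mailbox.count("@") != 1:
        raise ValueError(f"expected exactly one '@' in mailbox: {mailbox!r}")
    local, _, domain = mailbox.partition("@")
    return "!".join(hops + [domain, local])


print(route_to_bang_path("@foo,@bar:user@mumble"))
```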

	Mark

hokey@gang.UUCP (Hokey) (01/12/85)

A hybrid route (a route isn't necessarily the same as an address) *should*
be unambiguous in an RFC822 header.  It isn't because most sendmail sites
treat the *address* specified in the From: and To: lines as a *route*, and
lop their name off the To: line and stick it back on the From: line.

This is totally malicious behavior.  It was thought necessary in the past
because people used "smart" user interfaces for replying to mail, and
since nobody had the implicit routing mechanisms in place to *route* the
mail to the *addresses*, they used the (munged) From: and To: *routes*
taken by the original mail to explicitly *route* the mail back to the new
recipient(s).  Great fun when somebody Far Away sends mail to you and a
local neighbor; replies from you to your neighbor and the sender both go
Far Away.  Subsequent replies bounce back and forth across an ever-increasing
*route*.

This routing problem has been solved by truncating the path to the farthest
"known" site.  This doesn't always work, however, because sites don't
always have unique names.  Edge-databases are swell for helping solve
this problem.  But I digress.

Quoting a hybrid route/address *in the header* is another acceptable
solution.  This works because the localpart has been distinguished
from the preceding bang paths.  This may not be "good" 822 behavior,
but it *will* do the job.  Specifically, a!b!"c!d"@e *must* mean that
once the mail arrives at "e", it then goes to c!d.  "b" *can not* assume
that the "c" it knows is the "c" known to "e", unless it has the connectivity
information for "e".  This is the edge database stuff again.
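
The reading of a quoted hybrid described above can be sketched in Python. The
regular expression and function name are assumptions made for the example and
cover only the simple form bang-path!"local"@domain; the sketch does not
attempt full RFC 822 quoting rules.

```python
import re

# Leading bang path, then a quoted local part, then '@domain'.
_HYBRID = re.compile(
    r'^(?P<path>(?:[^!"@]+!)*)"(?P<local>[^"]*)"@(?P<domain>[^@"]+)$'
)


def parse_quoted_hybrid(addr):
    """Split a!b!"c!d"@e into (['a', 'b'], 'e', 'c!d').

    The quoted part is returned untouched: only the domain ('e') is
    expected to interpret it.
    """
    m = _HYBRID.match(addr)
    if m is None:
        raise ValueError(f"not a quoted hybrid address: {addr!r}")
    hops = [h for h in m.group("path").split("!") if h]
    return hops, m.group("domain"), m.group("local")


print(parse_quoted_hybrid('a!b!"c!d"@e'))
```

Here "a" and "b" are hops on the way to "e", and "c!d" is kept opaque until
the mail reaches "e".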

Note that unless the quotes are there, it is WRONG to mess up a "hybrid" in
the From: or To: lines by adding or deleting "mysite!" from the list,
because you may be changing the way somebody (rightly) expects to *route*
the message.

In the transport layer (uux, for example), quoting won't work because the
shell will strip them off, so there is no way to disambiguate the hybrid.
We have had several thrilling discussions about what to do with hybrids in
the transport layer.  I would like to reject them.  Others disagree.  We
all agree that we won't propagate the hybrid; the issue is whether or not
to accept them (from an arbitrary site.  The situation changes a bit if you
*know* the sender is an 822 site or a bang site, but that is another ball
of worms that also changes with Time and the Stars.).

Anyway, the place these monstrosities are most likely to emerge is when
replying to mail.  The "best" place to fix them is at the user interface.
The problem is also solved if people use "smart" mailers and/or tools
like pathalias or pathparse.  It would be of TREMENDOUS help if all the
sendmail sites left addresses in headers ALONE.

As one might suspect, this represents my perspective on the problem, and
there are people out there who disagree with me.
-- 
Hokey           ..ihnp4!plus5!hokey
		  314-725-9492