[comp.mail.headers] mixed addresses

vixie@palo-alto.DEC.COM (Paul Vixie) (07/19/88)

(Andrew Macpherson writes:)
# You have brought up a nasty.  In fact you are highlighting two
# of them.  Firstly '%' as an address character.  IT IS NOT LEGAL RFC822.
# It remains around for historical compatibility (RFC7?? --- read the
# first page of 822 if you really want to know which).

(Rick Adams writes:)
# The fact that you are mucking with the LOCAL-PART, which you are supposed
# to leave totally alone, is the cause of the problem.

(which I agree with, but then Rick says:)
# Anyone who gives % precedence over ! should fix their mailer.
# % is NOT a synonym for @. It is a valid part of the local-part
# of the address and should not be interpreted by any site save
# the destination machine.

I agree that % is not a defined pseudonym for @ and that anyone who tries
to say it is can be ignored.  It is, indeed, part of the "local part", and
there is no written standard that says it has to be processed one way or
another.

But, um, has anybody got users on their systems with "%" as part of the
user identifier?  Probably a few, but vastly: no.

"%" _is_ interpreted by most large sites, once all standard address characters
have been processed and stripped off and the message is still unresolved.
The usual processing of "%" is:

	locate the rightmost %
	change it to an @
	go back and do what @ requires
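
The three steps above can be sketched in a few lines of Python.  This is
only an illustration of the hack as described here, not anybody's actual
mailer code; the function name percent_hack is made up for the example.

```python
def percent_hack(local_part: str) -> str:
    """Rewrite the rightmost '%' of an unresolved local-part to '@'.

    Hypothetical helper: it assumes every '@' has already been processed
    and stripped, so the argument is a bare local-part.
    """
    head, sep, tail = local_part.rpartition("%")
    if not sep:
        # No '%' present: nothing to rewrite, the address is left alone.
        return local_part
    # Only the rightmost '%' is promoted; any others stay for later hops.
    return f"{head}@{tail}"


print(percent_hack("bar%xyzzy"))      # bar@xyzzy
print(percent_hack("bar%foo%xyzzy"))  # bar%foo@xyzzy
```

The caller would then "go back and do what @ requires" with the result.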

This is a _HACK_.  But it's widely implemented.  It has the same level of
"pseudo standardization" that "!" has, though the two characters evolved on
different planets.

So when Rick said:
# Anyone who gives % precedence over ! should fix their mailer.

I say: uh-uh, nothing's wrong with my mailer.  If I'm to interpret % at all,
I'll be doing it mostly by the seat of my pants -- since there's no standard
for it, it's in the local-part after all.  It has a common meaning, which is
very similar to the meaning of "@" -- the thing to the right of it is a host
or domain or something, and the thing to the left of it is an address or a
route or some such that the thing to the right of it can be halfway depended
upon to understand.

I'll say again: % is just another character if there's an @ anywhere in an
address.  @ is spoken of in RFC822, % is not.  % should be treated like "a"
or "b" or "c" if there's an @ anywhere in the address (really route-addr
but you know what I mean).

The UUCP "!" is in the same state -- it's just another character if there's
an @ anywhere to be found.

So, shall I fix my mailer to send back mail that gets here looking like
"xyzzy!bar" or "bar%xyzzy" after I've stripped off the "@dec.com"?  I'd
rather not just bounce things, since there is something of a standard for
what these LOCAL-PART characters mean.

I give ! precedence over %, because I've already got a character that does
what % does -- that is, @.  % begins to have great value as a low-precedence
@ that will be treated as an @ once all @'s and !'s have been stripped out.

(Andrew Macpherson continues:)
# Having got that off my chest, here is the associated nasty:  mcvax
# uses it as a 'local-part' operator, and hands on addresses of the form
# a!b!u%l, which any Internetted (and probably any JANET) host will treat
# as send to 'l' for uucp forwarding.  

Bleah.

Seriously.  If I want it to go to "l" first, I can use "@l".  If I say "%l",
it probably means that I want to do something that @ can't do -- namely, not
be evaluated until it reaches "b".
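
To make the disagreement over a!b!u%l concrete, here is a rough Python
sketch of the two precedence orders.  The name next_hop and the bang_first
flag are invented for the example, and the sketch ignores route-addr syntax
and quoted local-parts entirely.

```python
from typing import Optional, Tuple


def next_hop(addr: str, bang_first: bool = True) -> Tuple[Optional[str], str]:
    """Return (next host, remaining address); host is None for local delivery.

    Hypothetical helper.  bang_first=True tries '!' before '%' once no '@'
    is left; bang_first=False tries '%' before '!'.
    """
    # '@' always wins: it is the only one of the three that RFC822 speaks of.
    if "@" in addr:
        local, _, domain = addr.rpartition("@")
        return domain, local
    # Order in which the two local-part conventions are tried.
    order = "!%" if bang_first else "%!"
    for ch in order:
        if ch not in addr:
            continue
        if ch == "!":
            # UUCP convention: the leftmost host is the next hop.
            host, _, rest = addr.partition("!")
        else:
            # %-hack: the host after the rightmost '%' is the next hop.
            rest, _, host = addr.rpartition("%")
        return host, rest
    return None, addr


print(next_hop("a!b!u%l"))                    # ('a', 'b!u%l')
print(next_hop("a!b!u%l", bang_first=False))  # ('l', 'a!b!u')
```

With ! first, the %l survives untouched until the bang path has been used
up, which is the "not evaluated until it reaches b" behaviour.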

# This is not usually a problem but occasionally we receive US mail which
# has hopped to the arpanet and been strangely dealt with...

"Strangely"?  Our conventions are different, that's true.  John Diamant @HP
once sent me an unfinished RFC that dealt with this issue, but like him, I
could never figure out quite how to resolve everything into one neat little
package.  But I do think that after the one, Crocker-given symbol has been
processed and we are down to our nitty-gritties trying to hand off a piece
of mail based on the local-part, that ! usefully precedes % in what little
decoding is possible.

# Now the other point... Mixed addresses.  If you live in the uucp world only
# you have no trouble.  If in 822 land likewise.  I understand the JNETters
# allow % as a source-routing so they also have a consistent rule-set.  All 3
# have a route specification method, therefore there is neither need nor
# justification for mixed-mode addressing. [...] The only safe and reasonable
# course to take is to provide the destination address in the format required
# by the network you are using.

I agree completely.  The rules I use for local-part precedence are worst-case,
and properly generated mail messages don't get that far into the bowels of my
mailers.  If something comes to me over UUCP, the envelope recipients can
easily be coded, each and every one, in pure !-notation.  If I want to submit
a message into a UUCP transport system, I can bloody well code up all the
envelope recipients in pure !-notation.  Likewise, if something comes in over
SMTP, the envelope recips can and should be in straight route-addr notation
(i.e., @a,@b,@c:u@d, and gosh that sure is ugly, Dave), and I can certainly
be expected to submit things in that form.

As Diamant (am I spelling that right, John?) points out, though, RFC822 and
its friends imply or demand that all domains named in a route-addr be 
registered with the NIC.  This is silly and inconvenient and everybody
ignores it.  But it does mean that if something comes to you over UUCP with
an envelope recip of a!b!c!d and you decide to reach "a" via SMTP and you
want to rewrite the envelope recip into route-addr and you rewrite it to be
@a,@b:d@c and either "b" or "c" is not registered with the NIC, you've just
broken another silly regulation.
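
The a!b!c!d rewrite described above can be sketched as follows.  This is a
minimal illustration under stated assumptions: the function name
bang_to_route_addr is hypothetical, the output follows the bare notation
used in this posting (no angle brackets), and nothing here checks whether
any host is registered anywhere.

```python
def bang_to_route_addr(path: str) -> str:
    """Rewrite a pure UUCP bang path as a route-addr style string.

    Hypothetical helper: the last component is taken as the user, the one
    before it as the final host, and everything earlier as relay hosts.
    """
    hops = path.split("!")
    if len(hops) < 2 or not all(hops):
        raise ValueError(f"not a usable bang path: {path!r}")
    *relays, host, user = hops
    mailbox = f"{user}@{host}"
    if not relays:
        # host!user has no relays, so no route prefix is needed.
        return mailbox
    route = ",".join("@" + relay for relay in relays)
    return f"{route}:{mailbox}"


print(bang_to_route_addr("a!b!c!d"))  # @a,@b:d@c
```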
-- 
Paul Vixie
Digital Equipment Corporation	Work:  vixie@dec.com	Play:  paul@vixie.UUCP
Western Research Laboratory	 uunet!decwrl!vixie	   uunet!vixie!paul
Palo Alto, California, USA	  +1 415 853 6600	   +1 415 864 7013

rsalz@bbn.com (Rich Salz) (07/19/88)

>[ Treating % as a delayed @ if there are no other meta-characters in the
>address ] has the same level of "pseudo standardization" that "!" has,
>though the two characters evolved on different planets.

Unh, no.  The "!" character for UUCP routing is documented in RFC976.

I'd also claim that UUCP and ARPA aren't two different planets, but were
at worst two different continents, shared by a few common passageways,
navigable only by some very hardy souls.  In particular, now-civilized
natives of far-off UUCP backwaters fondly remember the great trappers
and trackers that would hang around Old Camp Seismo.

There's definitely been a shift in the plate tectonics, and these
two lands are now merged into one (un)common market.

With apologies to Dave Mills for the second paragraph,
	/rich $alz
-- 
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.

david@ms.uky.edu (David Herron -- One of the vertebrae) (07/21/88)

Paul,

neither rfc822 nor rfc976 specifies that % does what the %-hack does.  In fact,
both specify other methods to do routing.  That means, both specs expect
the local-part to only be paid attention to by the 'local' machine, not
by machines along the route.

In:

	user%a@some.dom.ain
and	some!long!path!with!a%percent

the local part *shouldn't* be evaluated until it reaches the destination.
In the '@' case, until it reaches some.dom.ain, and the '!' case until
it reaches 'with'.  Whatever either of those places wishes to do with
a local-part like "a%b" is up to them BECAUSE IT IS NOT SPECIFIED BY
EITHER SPEC!

Now, I didn't read your posting very closely and it's possible that you
said exactly that in your posting.  But I think I caught a few statements
to the contrary.

Oh, I suppose the next question is what to do with 

	some!long!path!a%b@some.dom.ain

RFC976 is the applicable spec.  (according to rfc822, the "some!long!path!a%b"
is *all* local part and gets evaluated according to some.dom.ain's rules
for evaluating such things).  The rfc recommends first that mailers not
use mixed addresses internally, instead preferring something like

	some.dom.ain!some!long!path!a%b
or	some!long!path!some.dom.ain!a%b

but if you must, to treat such an address as in rfc822 -- the local part
is the local part and isn't evaluated until it reaches some.dom.ain
(i.e. the "some.dom.ain!some!long!path!a%b" interpretation -- effectively)

It's right there in the specs in black&white (or purple&brown with the
right toner/paper combination ...)
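
As a small sketch of that interpretation (an illustration only; the name
mixed_to_bang is made up, and quoted local-parts and route-addrs are not
handled), the '@' is honoured first and the whole local part is carried
along unevaluated:

```python
def mixed_to_bang(addr: str) -> str:
    """Rewrite local@domain as domain!local without touching the local part.

    Hypothetical helper: the rightmost '@' is honoured first, and whatever
    '!' or '%' the local part contains is left for the domain to interpret.
    """
    local, at, domain = addr.rpartition("@")
    if not at:
        # Already free of '@': return the address unchanged.
        return addr
    return f"{domain}!{local}"


print(mixed_to_bang("some!long!path!a%b@some.dom.ain"))
# some.dom.ain!some!long!path!a%b
```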
-- 
<---- David Herron -- The E-Mail guy                         <david@ms.uky.edu>
<---- ska: David le casse\*'      {rutgers,uunet}!ukma!david, david@UKMA.BITNET
<----                   A misplaced Kansan trapped in the heart of Kentucky,
<---- the state where it is now illegal to water your lawn on the wrong day.

scott@stl.stc.co.uk (Mike Scott) (07/26/88)

In article <3503@palo-alto.DEC.COM> vixie@palo-alto.DEC.COM (Paul Vixie) writes:

>But, um, has anybody got users on their systems with "%" as part of the
>user identifier?  Probably a few, but vastly: no.

But it's not just user identifiers... consider a system where mail
is being gatewayed on a VMS machine between the un*x world and the
DEC foreign mail interface which uses a % character in the address to
define the interface - for example, mail from PSS has the address prefix
"PSI%", which can wreak some havoc when replied to by the unwary!
-- 
Regards. Mike Scott (scott@stl.stc.co.uk <or> ...uunet!mcvax!ukc!stl!scott)
phone +44-279-29531 xtn 3133.