[comp.mail.sendmail] Mail addresses and RFC1123

rickert@CS.NIU.EDU (Neil Rickert) (07/11/90)

 I don't normally see this news group, but a friend who has been forwarding me
excerpts suggested I add my $.02 worth.

 I have been particularly interested in the comments on RFC1123.

 Now, let's see if I have it straight.

 If I understand RFC1123 correctly, I may change the address (which I extracted
from a recent syslog entry):
    <@UICVM.uic.edu:Y03CLS1@NIU>
into the address:
    Y03CLS1@NIU
and never mind the fact that mail to the final address may be undeliverable.
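
The transformation described above can be sketched in a few lines. This is only an illustration, assuming a simple RFC 822 route-addr with no quoted strings or comments; the function name `strip_source_route` is hypothetical and not taken from any mailer.

```python
def strip_source_route(route_addr: str) -> str:
    """Return the final addr-spec of a route-addr, dropping any source route."""
    addr = route_addr.strip()
    # Remove the enclosing angle brackets, if present.
    if addr.startswith("<") and addr.endswith(">"):
        addr = addr[1:-1]
    # A source route is a list of @domain entries terminated by ':'.
    if addr.startswith("@") and ":" in addr:
        _route, addr = addr.split(":", 1)
    return addr


print(strip_source_route("<@UICVM.uic.edu:Y03CLS1@NIU>"))  # Y03CLS1@NIU
```

Whether the stripped address is actually deliverable from the stripping host is a separate question, which is the point of the complaint above.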

 If I understand it correctly, I am not supposed to use routing, but if I
must, then instead of using source routes with an awkward syntax which
discourages use, but a well defined semantics, I am instead supposed to use
the %-hack with a much more user friendly syntax that encourages use of routes
even though they are officially discouraged, but with no defined
semantics -- only a suggested interpretation.

 As I interpret RFC1123, a host which does not understand the '!' character,
but which needs a route for delivery, should not use the address
  @a:c!u@b
with its clear interpretation of delivery -> a -> b -> c -> u  but the host
should instead format the address as
  c!u%b@a
with a suggested interpretation of -> a -> c -> b -> u.  (Note that since
the host does not understand !, it treats 'c!u' as merely a single mailbox
name).

 According to my understanding of RFC1123, if my mailer sees the address
   user%mcdchg%clout@gargoyle.uchicago.edu
it may not touch the local part, and must forward the mail to gargoyle, even
though experience tells me the mail will bounce.  My mailer must not under
any circumstances reformat the address as
   clout!mcdchg!user@gargoyle.uchicago.edu
although with this address the mail would probably be correctly delivered.

	----------------------------

Boy, I am sure glad to have read RFC1123.  Before reading it I was greatly
confused.  I used to think that the whole purpose was to ensure correct and
efficient delivery of the mail.  But RFC1123 has cleared that up.  I now
realize that this is like a chess game, where the pieces of the address are
moved around according to a strict set of rules, and final delivery of the
mail, if it ever occurs, is merely an incidental side effect.

=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Sci Dept, Northern Illinois U., DeKalb IL 60115
  InterNet, unix: rickert@cs.niu.edu              Bitnet, VM: T90NWR1@NIUCS
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=

towfiq@interlan.Interlan.COM (Mark Towfigh) (07/13/90)

In article <9007111604.AA29811@cs.niu.edu> rickert@CS.NIU.EDU (Neil
Rickert) writes:

    Now, let's see if I have it straight.

OK.....

   As I interpret RFC1123, a host which does not understand the '!' character,
   but which needs a route for delivery, should not use the address
			     @a:c!u@b
   with its clear interpretation of delivery -> a -> b -> c -> u  but the host
   should instead format the address as
			     c!u%b@a
   with a suggested interpretation of -> a -> c -> b -> u.  (Note that since
   the host does not understand !, it treats 'c!u' as merely a single mailbox
   name).

Where do you get this interpretation?  It seems to me that the two
forms are identical, and should both result in a->b->c->u.  To quote
from the RFC in question:

	It is suggested that "%" have lower precedence
	than any other routing operator (e.g., "!") hidden in the
	local-part; for example, "a!b%c" would be interpreted as
	"(a!b)%c".
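
Read as a statement about grouping, the suggested precedence amounts to splitting the local-part at its rightmost "%" and leaving everything to the left alone. A minimal sketch, assuming that reading; `group_local_part` is a hypothetical name used only for illustration:

```python
def group_local_part(local_part: str):
    """Split a local-part at its rightmost '%'.

    Returns (remaining_local_part, next_domain), or (local_part, None)
    when the local-part contains no '%'.
    """
    left, sep, right = local_part.rpartition("%")
    if not sep:
        return local_part, None
    # Everything left of the rightmost '%' is kept as one unit,
    # including any '!' it may contain.
    return left, right


print(group_local_part("a!b%c"))  # ('a!b', 'c')
```

Under this reading "a!b" is never examined by the host doing the split; it is simply handed on to "c".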

    According to my understanding of RFC1123, if my mailer sees the address
      user%mcdchg%clout@gargoyle.uchicago.edu
   it may not touch the local part,

True.

   and must forward the mail to gargoyle, even though experience tells
   me the mail will bounce.  My mailer must not under any circumstances
   reformat the address as
      clout!mcdchg!user@gargoyle.uchicago.edu

True again.

   although with this address the mail would probably be correctly delivered.

WHY?  What is the difference between the way these two addresses are
handled?  Why would one work, and not the other?  If they really do
yield different results, then gargoyle is broken, and you should not
rewrite to support their brain-deadedness -- bounced mail is a good
incentive to fix broken mailers.

   Boy, I am sure glad to have read RFC1123.  Before reading it I was greatly
   confused.  I used to think that the whole purpose was to ensure correct and
   efficient delivery of the mail.

It is -- I think you may have misinterpreted some of the sections.
--
Mark Towfigh, Racal InterLan, Inc.                 towfiq@interlan.Interlan.COM
W: (508) 263-9929 H: (617) 488-2818                       uunet!interlan!towfiq

  "The Earth is but One Country, and Mankind its Citizens" -- Baha'u'llah

paul@uxc.cso.uiuc.edu (Paul Pomes - UofIllinois CSO) (07/13/90)

towfiq@interlan.Interlan.COM (Mark Towfigh) writes:

>>In article <9007111604.AA29811@cs.niu.edu> rickert@CS.NIU.EDU (Neil
>>Rickert) writes:
>>
>>    Now, let's see if I have it straight.
>
>OK.....

 (some stuff deleted)
>   and must forward the mail to gargoyle, even though experience tells
>   me the mail will bounce.  My mailer must not under any circumstances
>   reformat the address as
>      clout!mcdchg!user@gargoyle.uchicago.edu

>True again.

>   although with this address the mail would probably be correctly delivered.

>WHY?  What is the difference between the way these two addresses are
>handled?  Why would one work, and not the other?  If they really do
>yield different results, then gargoyle is broken, and you should not
>rewrite to support their brain-deadedness -- bounced mail is a good
>incentive to fix broken mailers.

Please come tell my director that.  Like it or not, some folks were hired to
make email work and not to make excuses (they pay a lot more for the former).
For the foreseeable future I will reformat addresses to working forms AND
let the worst violators know that their mail systems are broken.  We are
not yet at the stage where we can insist on 'purity' and still get our jobs
done (and the director's mail delivered).

/pbp
--
         Paul Pomes

UUCP: {att,iuvax,uunet}!uiucuxc!paul   Internet, BITNET: paul@uxc.cso.uiuc.edu
US Mail:  UofIllinois, CSO, 1304 W Springfield Ave, Urbana, IL  61801-2987

rickert@CS.NIU.EDU (Neil Rickert) (07/25/90)

 In response to my earlier comment:
>
>   As I interpret RFC1123, a host which does not understand the '!' character,
>   but which needs a route for delivery, should not use the address
>			     @a:c!u@b
>   with its clear interpretation of delivery -> a -> b -> c -> u  but the host
>   should instead format the address as
>			     c!u%b@a
>   with a suggested interpretation of -> a -> c -> b -> u.  (Note that since
>   the host does not understand !, it treats 'c!u' as merely a single mailbox
>   name).
>

towfiq@interlan.Interlan.COM (Mark Towfigh) writes:

>Where do you get this interpretation?  It seems to me that the two
>forms are identical, and should both result in a->b->c->u.  To quote
>from the RFC in question:
>
>	It is suggested that "%" have lower precedence
>	than any other routing operator (e.g., "!") hidden in the
>	local-part; for example, "a!b%c" would be interpreted as
>	"(a!b)%c".
>

 This unfortunate language is just one of the problems with RFC1123.  To say
that '%' has lower precedence than '!' should mean that the processing of '!'
must PRECEDE the processing of '%'.  This was the interpretation I used.  The
algebraic notation (a!b) only confuses the issue.  In algebraic terminology,
the use of parentheses in (a*b)+c implies both that 'a*b' is to be processed
first before applying the '+', and that '(a*b)' must be treated as a unit.

As I read it, "(a!b)%c" has the following interpretation:
  Send the mail to host 'a' for delivery to mailbox 'b'.  Take the result
  of that mailing, whatever it is (undoubtedly an error message accompanying
  bounced mail, since mailbox 'b' is really on host 'c'), and use that result
  as the name of a mailbox on host 'c'.

 Actually the correct way to parenthesize 'a!b%c' is '(a!)b(%c)'.  This
completely specifies the binding of operators to operands, which is what
parentheses are good for.  Unfortunately it does not say which operator
should be processed first - algebraic notation was not designed for that.

 Until I hear a definitive statement to the contrary, I will continue with
my interpretation.  Incidentally most software I have seen only looks for
a '%' after it has failed to find any other addressing operators.  This means
that software which recognizes '!' as an addressing operator typically gives
higher precedence to '!' than to '%', while software which does not recognize
'!' necessarily gives higher precedence to '%'.  The result is hopeless
ambiguity.
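
The ambiguity described here can be made concrete with two toy parsers. This is a sketch under stated assumptions: both functions are hypothetical and model only the order in which a host looks for operators in a local-part, not any real mailer.

```python
def next_hop_bang_first(local_part: str):
    """A '!'-aware host: look for '!' first, fall back to '%'."""
    if "!" in local_part:
        host, rest = local_part.split("!", 1)
        return host, rest
    if "%" in local_part:
        rest, host = local_part.rsplit("%", 1)
        return host, rest
    return None, local_part


def next_hop_percent_only(local_part: str):
    """A host that does not treat '!' as an operator at all."""
    if "%" in local_part:
        rest, host = local_part.rsplit("%", 1)
        return host, rest
    return None, local_part


for parse in (next_hop_bang_first, next_hop_percent_only):
    # Each result is (next host, remaining local-part).
    print(parse.__name__, parse("c!u%b"))
```

For the local-part 'c!u%b' the first parser picks 'c' as the next host and the second picks 'b', so the same address takes different routes depending on which kind of host handles it.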

 When software, such as used at a gateway host, provides a route back through
itself for return addresses, IT MUST USE AN OPERATOR OF HIGHEST PRECEDENCE.
Anything else requires modification of the local part.  In particular any
addressing operator in the local part which the gateway does not interpret
as an addressing operator (the '!' at bitnet gateways, for example), will
of necessity be treated as having lower precedence than the routing
operator.  Thus if ambiguity is to be avoided, any routing operator chosen
must be universally recognized as having the highest precedence.  The '%'
operator does not meet these requirements.

 As part of its justification to use '%' routes in place of source routes,
RFC1123 says:

              source routing is discouraged".  Many hosts implemented
              RFC-822 source routes incorrectly, so the syntax cannot be
              used unambiguously in practice.  Many users feel the
              syntax is ugly.  Explicit source routes are not needed in

 But consider the following rewrite rule from 'sendmail.main.cf' distributed
with Sun systems:

R$+%$+			$@$>3$1@$2			user%host

 This will convert 'user%relay2%relay1' to 'user@relay1%relay2' with
not much hope of correct delivery.  Given that there are more than two or
three Suns out there, one might conclude that 'many hosts implement %-routes
incorrectly, so the syntax cannot be used unambiguously in practice.'

enag@ifi.uio.no (Erik Naggum) (07/25/90)

I was there when RFC1123 was written.  I had my own misgivings about
the language and use of parentheses with respect to the "routing
operators".  However, Bob Braden (the editor) told me what he had in
mind, and I saw that it was good.

In article <9007242101.AA32438@cs.niu.edu> rickert@CS.NIU.EDU (Neil Rickert) writes:

    This unfortunate language is just one of the problems with
   RFC1123.  To say that '%' has lower precedence than '!' should mean
   that the processing of '!'  must PRECEDE the processing of '%'.
   This was the interpretation I used.  The algebraic notation (a!b)
   only confuses the issue.  In algebraic terminology, the use of
   parentheses in (a*b)+c implies both that 'a*b' is to be processed
   first before applying the '+', and that '(a*b)' must be treated as
   a unit.

Your confusion comes from assuming that the text says anything about
what the operators mean, and that they are akin to algebraic
operators.  Instead, the precedence question relates to grouping.  In the
same vein, a%b@c is treated as (a%b)@c.  The "operation" involved is
treating things as a local-part, not as destination domain.  What's
left over, so to speak, is the destination domain, as in (a!b)%c,
where c is the destination.

   As I read it, "(a!b)%c" has the following interpretation:

Your interpretation is intuitive with respect to algebraic notation
qua algebra, but not qua notation.  Please do not think that these
operators "operate" on the text!  They _only_ group and separate.

    Until I hear a definitive statement to the contrary, I will
   continue with my interpretation.

You're too stubborn.  That's not good even when you're right.  And
you're wrong.

   Incidentally most software I have
   seen only looks for a '%' after it has failed to find any other
   addressing operators.

Look at some more software.  If you're entrenched in the UUCP world
with sendmail braindamage, your complaints refer to your being
entrenched, not the rest of the world.  There have been people before
you who have been confused by the Internet folks' insistence on
talking about the Internet, but you make some record on remaining
confused for a long time.

   This means that software which recognizes
   '!' as an addressing operator typically gives higher precedence to
   '!' than to '%', while software which does not recognize '!'
   necessarily gives higher precedence to '%'.  The result is hopeless
   ambiguity.

Are these on the Internet, or are they UUCP bang-path only systems?
This makes all the difference in the world.  The Internet is not, as I
have had to point out to several people, a conglomeration of networks
and standards of all sorts, it's one particular technology with one
particular standard protocol suite.  SMTP and RFC822 in this case.  It
has been a tremendous success, and other people try to mimic the
Internet as best they can.  This does not make them "the Internet".

    When software, such as used at a gateway host, provides a route
   back through itself for return addresses, IT MUST USE AN OPERATOR
   OF HIGHEST PRECEDENCE.

This is a semantic error on your part.  You insist on thinking that
we're talking about text operations.  We're not talking about text
operations.  We're talking about parsing.  Look, you've even made me
repeat myself.

    But consider the following rewrite rule from 'sendmail.main.cf' distributed
   with Sun systems:

When did Sun do anything but lose with respect to Internet mail?
I've had to throw out Sun's sendmail.cf at the systems I fiddle with
because it was designed so lamebrainedly it's a crime.

Instead of complaining, and showing us all kinds of broken systems,
why don't you try to make things according to standard?  That means,
for instance, that you sit down and try to grok the intentions behind
the specs, instead of comparing the standard to your broken system
which just happens to work because others follow the robustness
principle:

	"Be liberal in what you accept, and
	 conservative in what you send."

You seem to want it exactly the other way around.

If you would like, I can send you parts of the discussion relating to
source routing and the emerging consensus on the "%-hack".
--
[Erik Naggum]		Gaustadalleen 21	+47-256-7822
<erik@naggum.uu.no>	N-0371 OSLO; NORWAY	+47-260-4427 (fax)

towfiq@interlan.Interlan.COM (Mark Towfigh) (07/26/90)

I wrote (quoting RFC1123):

   It is suggested that "%" have lower precedence than any other routing
   operator (e.g., "!") hidden in the local-part; for example, "a!b%c"
   would be interpreted as "(a!b)%c".

In article <9007242101.AA32438@cs.niu.edu> rickert@CS.NIU.EDU (Neil
Rickert) writes:

   This unfortunate language is just one of the problems with RFC1123.
   To say that '%' has lower precedence than '!' should mean that the
   processing of '!' must PRECEDE the processing of '%'.

What you say is right, but I think you have a different notion of what
"processing" means.  It does not mean sending the mail, it means
parsing the address.  "!" is not the action of sending something via
UUCP -- this is where it differs from the algebraic operator "*".

   The algebraic notation (a!b) only confuses the issue.  In algebraic
   terminology, the use of parentheses in (a*b)+c implies both that 'a*b'
   is to be processed first before applying the '+', and that '(a*b)'
   must be treated as a unit.

And similarly, (a!b) must be treated as a unit, so you send a message
for a!b to c.

   As I read it, "(a!b)%c" has the following interpretation: Send the
   mail to host 'a' for delivery to mailbox 'b'.  Take the result of that
   mailing, whatever it is (undoubtedly an error message accompanying
   bounced mail, since mailbox 'b' is really on host 'c'), and use that
   result as the name of a mailbox on host 'c'.  Until I hear a
   definitive statement to the contrary, I will continue with my
   interpretation.

Please do not write any mailer software which behaves like this!

   Incidentally most software I have seen only looks for a '%' after it
   has failed to find any other addressing operators.  This means that
   software which recognizes '!' as an addressing operator typically
   gives higher precedence to '!' than to '%', while software which does
   not recognize '!' necessarily gives higher precedence to '%'.  The
   result is hopeless ambiguity.

Software which does not conform to protocol always results in such
problems.

fitz@wang.com (Tom Fitzgerald) (07/28/90)

enag@ifi.uio.no (Erik Naggum) writes:
> I was there when RFC1123 was written.  I had my own misgivings about
> the language and use of parentheses with respect to the "routing
> operators".  However, Bob Braden (the editor) told me what he had in
> mind, and I saw that it was good.

May I ask why?  It causes problems for UUCP sites, and basically forces
us to rewrite the local parts of nonlocal addresses (turning a%b into b!a)
in order to get stuff delivered.  (We don't do the rewriting here, so
we suffer with a fair amount of undeliverable mail).

> The Internet is not, as I
> have had to point out to several people, a conglomeration of networks
> and standards of all sorts, it's one particular technology with one
> particular standard protocol suite.  SMTP and RFC822 in this case.  It
> has been a tremendous success, and other people try to mimic the
> Internet as best they can.

One of the reasons it has been so enormously successful is its ability
to encompass other networks into it.  When two networks merge, the users
adopt the conventions of the more flexible net, and the less flexible
net becomes a special case.  So far, the Internet (especially the DNS)
has blown everyone away with its distributed naming authority and open-
ended mail addressing.

This kind of thing, which makes the Internet more rigid, and less able to
engulf the conventions of other nets, damages some of the things that made
the Internet (especially RFC 822) so successful in the first place.

> If you would like, I can send you parts of the discussion relating to
> source routing and the emerging consensus on the "%-hack".

I don't care about source routing, but I'd be interested to see the
discussion on the % hack.  Could you post it?

---
Tom Fitzgerald   Wang Labs        fitz@wang.com
1-508-967-5278   Lowell MA, USA   ...!uunet!wang!fitz

enag@ifi.uio.no (Erik Naggum) (07/31/90)

Hope I'm not late with this, it's been sitting in its own emacs buffer
after I accidentally renamed it.  Such is life with powerful tools.

In article <ENAG.90Jul25163050@slembe.ifi.uio.no> I wrote:
>> I was there when RFC1123 was written.  I had my own misgivings about
>> the language and use of parentheses with respect to the "routing
>> operators".  However, Bob Braden (the editor) told me what he had in
>> mind, and I saw that it was good.

In article <aqgu2j.72u@wang.com> fitz@wang.com (Tom Fitzgerald) replied:
> May I ask why?  It causes problems for UUCP sites, and basically forces
> us to rewrite the local parts of nonlocal addresses (turning a%b into b!a)
> in order to get stuff delivered.  (We don't do the rewriting here, so
> we suffer with a fair amount of undeliverable mail).

Taken as a whole, RFC 1123 improves the chances for mail to get from A
to B.  Several important items: It clearly states that you must not
touch or interpret a local-part unless you're [responsible for] the
receiving domain.  Second, it clearly states that source routes are Bad.
It also specifies one way to do what I call "subaddress" mail with the
"%-hack".  (The distinction between source route and subaddressing is
implicit in the HRRFC, but I've found a need to make the distinction
explicit, and chose this name for it.)

Note:	This is only one way to do it, a suggested way, but you don't
	have to abide by this, since we took great pains to ensure that
	only the target host interpret the local-part.  I would
	recommend following it for interoperability reasons on the
	Internet proper, though.
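
The rule that only the receiving domain interprets the local-part can be sketched as a small decision function. This is an illustration only: the `handle` function and the `LOCAL_DOMAINS` set are hypothetical, and treating gargoyle.uchicago.edu as the local host is an assumption made purely for the example.

```python
# Hypothetical: the domains this host is responsible for.
LOCAL_DOMAINS = {"gargoyle.uchicago.edu"}


def handle(address: str, local_domains=LOCAL_DOMAINS):
    """Decide what a host does with `address`.

    Returns ("forward", address) unchanged when the domain is not ours,
    ("relay", new_address) when we own the domain and the local-part
    carries a '%' sub-address, and ("deliver", local_part) otherwise.
    """
    local_part, _, domain = address.rpartition("@")
    if domain.lower() not in local_domains:
        # Not the receiving domain: the local-part is opaque to us.
        return "forward", address
    rest, sep, next_domain = local_part.rpartition("%")
    if sep:
        # We are the receiving domain, so we may apply our own
        # convention: promote the rightmost '%' to '@'.
        return "relay", f"{rest}@{next_domain}"
    return "deliver", local_part


print(handle("user%mcdchg%clout@gargoyle.uchicago.edu"))
print(handle("user%mcdchg%clout@gargoyle.uchicago.edu", {"cs.niu.edu"}))
```

The first call runs as the receiving domain and relays to 'user%mcdchg@clout'; the second runs as an unrelated host and forwards the address untouched.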

RFC 822 and the HRRFC have very little to say on local-parts.  We don't
say that you should only use the %-hack.  The 5.2.16 discussion
paragraphs even include an example with a bang-path local-part.

>> The Internet is not, as I have had to point out to several people,
>> a conglomeration of networks and standards of all sorts, it's one
>> particular technology with one particular standard protocol suite.
>> SMTP and RFC822 in this case.  It has been a tremendous success,
>> and other people try to mimic the Internet as best they can.

> One of the reasons it has been so enormously successful is its ability
> to encompass other networks into it.  When two networks merge, the
> users adopt the conventions of the more flexible net, and the less
> flexible net becomes a special case.  So far, the Internet (especially
> the DNS) has blown everyone away with its distributed naming authority
> and open-ended mail addressing.

I agree with this, and I don't think the HRRFC did anything to destroy
the open-ended mail addressing in the Internet.  Au contraire!

> This kind of thing, which makes the Internet more rigid, and less able to
> engulf the conventions of other nets, damages some of the things that made
> the Internet (especially RFC 822) so successful in the first place.

Read the discussions involved, please.  It has not been made more
"rigid" in any sense.  We can still engulf absolutely anything in the
local-part, but it would help if certain things are more predictable
than others.  Always having to check with the receiving domain what it
does won't work the way we want, either.  Therefore, it was felt that
it would be useful to suggest this particular syntax.  I think that was
very wise.  For most applications, it won't matter, as they don't use
more than one "sub-routing" character, anyhow.  For those who need to
source-route for testing purposes, it's very convenient to know what
will work.  For the sub-addressing question, it's clearly a local
problem, but again uniformity is a bonus.

I think we should try to refer to source routing as an intra-net (be it
the Internet or other) phenomenon, while subaddressing is an inter-net
(between two nets, one of them perhaps being the Internet) phenomenon.
This could help us get a clearer picture of what we're talking about.

> I don't care about source routing, but I'd be interested to see the
> discussion on the % hack.  Could you post it?

I'm sorry, but I have a few problems locating it.  Give me a few more
hours by the tape drive.  As I remember, it was a large number of
messages devoted to the problem.