[comp.mail.misc] Mail guesses fixed

rwhite@nusdhub.UUCP (Robert C. White Jr.) (12/17/87)

Ladies and gentlemen,
	I previously posted this message, but it contained some
horrible mistakes in the form of typos.  The entire precedence map at
the bottom, for instance, had the left<>right binding reversed.
This message has been preceded by a cancel message for the other text.
Sorry about the screw-up.

Rob.


Hello All,

	Having been unable to get a decent reference on what headers
are what, I would like to know if my perceptions about address lines
are correct.  I am going to give a few examples of possible mailing
addresses, and then what I think will happen.  I would like to have
responses as to the correctness of the examples [and possibly the
appropriateness of the usage].  NONE of this is guaranteed correct, but
is instead intended as a group of perceptions which I would like
validated or invalidated.
	I will of course summarize any mailed responses.

To:	personA,personB,personC@machine1.smalldomain.largedomain

	The local machine will first attempt to locate "machine1"
because it is the most specific reference.  If the local machine fails
to find "machine1" it will then look for a machine which is a member
of "smalldomain" because this is the most limited set of machines of
which "machine1" is allegedly a member.  If "smalldomain" is not found
then "largedomain" will be sought, and so on with domain sizes
increasing as the list progresses to the right, hence:
	machine.building.campus.school.educational-community-at-large.

	If the above address is attached to a message which is received
at a machine which is a member of "largedomain", that machine will
look for "machine1" and if not found "smalldomain".  If neither is
found then the message will be forwarded to the top of "largedomain".
If this is the top of "largedomain", the message will bounce [fail].
This domain-resolution will continue through each successive domain
until the machine is found or the smallest domain is encountered and
fails to resolve the machine reference.

	Upon reaching "machine1" the message will be delivered to
three separate accounts: "personA", "personB" and "personC".  If any of
these accounts do not exist then those messages will bounce without
interfering with delivery to the accounts which do exist.

To:	personA@sdomain.mdomain.ldomain

	The top of "sdomain" will be sought [as above].  If it can be
found, it will search its accounts, aliases, mailing-lists, and
group-lists [as exist] for an account or reference to "personA" and
send the message to that [those] account[s].  A list of accounts [as
above] may exist in place of "personA" [e.g.
"group1,group2,list1,personB,personC"].

To:	personA@machine1%machine2%machine3

	The sending machine will send to "machine1" and remove
"machine1%" from the address [netting personA@machine2%machine3].
Machine1 will similarly forward to "machine2", removing "machine2%"
[netting personA@machine3].  Machine2 will forward to "machine3",
leaving the header unchanged.  Machine3, seeing "personA@machine3",
will deliver to that/those account/list/group.

To:	personA@machine1%machine2@machine3%machine4

	This one I am not sure about.  My best guess would be:
machine3->machine4->machine1->machine2.  It is clear that the "%"
[percent] symbols can be used like a bang path for explicit routing.
It is also clear that the "@" [at] sign has higher precedence, and
that the "@" groups right to left, leading me to read this as "personA
(at machine1 then machine2 (at machine3 then machine4))".  Of course
"personA" may again be a list [as above].
	IS THIS CORRECT??

To:	machine3!personA@machine1%machine2

	This again is a guess: machine1->machine2->machine3->personA
	IS THIS CORRECT?

To:	machine2!personA,personB@machine1

	Another Guess: machine1->personB
				>machine2->personA
	IS THIS CORRECT?  IS IT LEGAL?

	The summary of these two is: 1) any bang paths must be to the
left of the first "@" [at] sign. 2) These bang paths will only be
processed _after_ everything to the right of the "@" [at] sign.  3) The
message will be replicated for each member of the list [the left side
of the "@" [at] sign] and forwarded as necessary.

To:	personB@machine2,personA@machine1

	Another Guess: machine1->personA
				>machine2->personB
	IS THIS CORRECT OR LEGAL?

	This would seem to follow from the listing rule above.  It
also implies that "," [comma] is of a higher precedence than an "@"
[at] when it is to the right of the "@" [at], while maintaining a
lower precedence to the left of the "@" [at].  Or it would indicate
that "," [comma] and "@" [at] are of equal precedence, but that they
_both_ group from right to left.

To:	(text)

	The material within the "(" [lparen] and ")" [rparen] is a
comment, and generates NO delivery address or forwarding at all.
	IS A COMMENT ALONE ON A LINE LEGAL?


Also, any of the above formats may occur on the same line as any other
as long as there is at least one unit of whitespace separating each
form of address.  The order of such combinations, or of multiple
occurrences of the "To:", "Cc:" and "Bcc:" header lines, is immaterial
to the delivery, and they should be processed on a first come first
served basis.

Precedence map:

	"(" ")" 	Grouping L->R	Matched pairs required.
	"," "@"		Grouping R->L
	"%"		Grouping L->R
	"!"		Grouping L->R

Finally, a question:  I have seen ";" [semicolon] referenced in some of
what I have read on mail addressing.  What is the ";" for and have I
missed any other symbols?


ANY comments on the correctness of any of this will be greatly
appreciated.

Rob.

p.s. I'll bet you'll never guess why my .signature does _not_ contain
	a mailing address ;-)

barmar@think.COM (Barry Margolin) (12/19/87)

In article <569@nusdhub.UUCP> rwhite@nusdhub.UUCP (Robert C. White Jr.) writes:
>To:	personA,personB,personC@machine1.smalldomain.largedomain
>
>	If the above address is attached to a message which is received
>at a machine which is a member of "largedomain", that machine will
>look for "machine1" and if not found "smalldomain".  If neither is
>found then the message will be forwarded to the top of "largedomain".
>If this is the top of "largedomain", the message will bounce [fail].
>This domain-resolution will continue through each successive domain
>until the machine is found or the smallest domain is encountered and
>fails to resolve the machine reference.

Domains are name registries; a machine is not a member of a domain,
but its name can be.  I'll presume this is what you mean when you say
"a machine which is a member of 'largedomain'".  Now that I've cleared
that up, I'll tell you that it doesn't matter what domain the
receiving machine's name is a member of.  Binding a domain name to a
machine is independent of the local machine's domain.

What happens is that when a machine wants to send mail to an address
whose domain part (the part after the @) is of the form
name1.name2.name3 it tries to contact the nameserver for the
name2.name3 domain, and asks it for the information on how to send
mail to name1.name2.name3.  If it doesn't know the address of the
name2.name3 nameserver (possibly because there isn't one) it makes the
query of the nameserver for name3, and so on up the domain hierarchy.
When it finally gets to a point where it knows the appropriate
nameserver (the process must end because everyone is supposed to know
the addresses of the root nameservers, although it is conceivable that
all of them could be down simultaneously), the nameserver will either
give it the information or tell it to ask the server for a subdomain
(and hopefully it will include the address of the server).  This could
be repeated when the machine queries the specified nameserver, etc.,
but eventually it will find out how to send mail to name1.name2.name3.

The message would never actually be transmitted to name2.name3 or
name3 unless the mail information for name1.name2.name3 specifically
says that one of these hosts should be used for mail addressed to it.
But it is just as possible for the mail information for
name1.name2.name3 to specify that its mail should be delivered to
other1.other2.other3.  But if no nameserver has ever heard of
name1.name2.name3 the message will be returned.
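
The order in which nameservers are tried, as described above, can be
sketched roughly in Python.  This is only an illustration under
assumptions: the function names and the known_servers table are
hypothetical, the root is written as ".", and no real network queries
are made.

```python
def candidate_zones(domain):
    """List the zones to ask about `domain`, most specific first.

    For name1.name2.name3 this gives name2.name3, then name3, then
    the root (written "." here, an assumption of this sketch).
    """
    labels = domain.strip(".").split(".")
    zones = [".".join(labels[i:]) for i in range(1, len(labels))]
    zones.append(".")
    return zones


def first_known_zone(domain, known_servers):
    """Return the first zone whose nameserver address we already know.

    `known_servers` is a hypothetical dict mapping zone -> server address.
    Everyone is supposed to know the root servers, so "." should be in it.
    """
    for zone in candidate_zones(domain):
        if zone in known_servers:
            return zone
    return None  # not even the root is known; nothing can be asked
```

With known_servers = {"name3": "10.0.0.1", ".": "10.0.0.2"} (made-up
addresses), first_known_zone("name1.name2.name3", known_servers)
picks "name3", and the query for the mail information would start there.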

>	Upon reaching "machine1" the message will be delivered to
>three separate accounts: "personA", "personB" and "personC".  If any of
>these accounts do not exist then those messages will bounce without
>interfering with delivery to the accounts which do exist.

This is wrong.  First of all, the above header is invalid, because all
addresses must be of the form <user>@<domain>.  There is no automatic
grouping.  If your mail-sending program allows the above To: line and
interprets all the addresses as being on
machine1.smalldomain.largedomain, then it must translate them to the
standard format before transmitting the message over the network.
When the SMTP protocol is being used to transfer mail between
computers the header isn't actually used, though; the recipients are
specified as part of the SMTP protocol, and the header is just part of
the data portion.

>To:	personA@sdomain.mdomain.ldomain
>
>	The top of "sdomain" will be sought [as above].  If it can be
>found, it will search its accounts, aliases, mailing-lists, and
>group-lists [as exist] for an account or reference to "personA" and
>send the message to that [those] account[s].  A list of accounts [as
>above] may exist in place of "personA" [e.g.
>"group1,group2,list1,personB,personC"].

How a message is delivered once it is received by the destination
mailer is completely up to that host.  Your description above is
typical of most systems, though.

>To:	personA@machine1%machine2%machine3
>
>	The sending machine will send to "machine1" and remove
>"machine1%" from the address [netting personA@machine2%machine3].
>Machine1 will similarly forward to "machine2", removing "machine2%"
>[netting personA@machine3].  Machine2 will forward to "machine3",
>leaving the header unchanged.  Machine3, seeing "personA@machine3",
>will deliver to that/those account/list/group.

No.  Everything to the right of the @ MUST be a domain name.  There
are some machines that recognize %, but it must be to the LEFT of the
@.  The syntax of an address is <local-part>@<domain>.  The syntax of
<local-part> is arbitrary, and this is what permits hosts to treat
some characters within it as routing information.  The use of % is
completely unsanctioned by standards, but it has become extremely
common.  The way the above would usually be written is

To: personA%machine3%machine2@machine1

The important point is that the host sending to machine1 doesn't have
to recognize % for this to work; as far as it is concerned, it is
sending to a user whose name contains a bunch of % characters (and
there's no reason why some computer system couldn't have users with
such names).  When machine1 receives it, it presumably replaces the
last % in the <local-part> with an @ and drops its own name, resulting
in "personA%machine3@machine2" and reprocesses this normally, by
sending it to machine2.
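
The rewriting step just described (promote the last % to an @ and drop
the local host name) can be sketched in Python.  This is a minimal
sketch of the common convention only; since % is not standardized, real
mailers may behave differently, and the function name percent_hack_step
is made up for this illustration.

```python
def percent_hack_step(address, my_host):
    """Apply one forwarding step of the unofficial % convention.

    `address` must be of the form <local-part>@<domain> with <domain>
    naming this host.  Returns the address to forward to, or the bare
    local-part if no % remains (meaning: deliver locally).
    """
    local_part, at, domain = address.rpartition("@")
    if not at or domain.lower() != my_host.lower():
        raise ValueError("address is not for this host: %r" % address)
    # Replace the LAST % in the local-part with @, dropping our own name.
    head, percent, next_host = local_part.rpartition("%")
    if not percent:
        return local_part  # plain user name, nothing left to route
    return head + "@" + next_host
```

For example, on machine1 the address "personA%machine3%machine2@machine1"
becomes "personA%machine3@machine2"; on machine2 that in turn becomes
"personA@machine3".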


>To:	personA@machine1%machine2@machine3%machine4
>
>	This one I am not sure about.  My best guess would be:
>machine3->machine4->machine1->machine2.  It is clear that the "%"
>[percent] symbols can be used like a bang path for explicit routing.
>It is also clear that the "@" [at] sign has higher precedence, and
>that the "@" groups right to left, leading me to read this as "personA
>(at machine1 then machine2 (at machine3 then machine4))".  Of course
>"personA" may again be a list [as above].

Multiple unquoted "@" are not permitted in an address.  However, an
extra "@" can be included in the <local-part> by using quoting, e.g.

To: "personA@machine1"@machine2

The quoting removes the special interpretation of the first @, so this
becomes an address whose <local-part> is personA@machine1 and whose
<domain> is machine2, so the message would be delivered to machine2.
If machine2 happens to treat an @ embedded in the <local-part> as
routing information (like % above) then this is a way to force
routing; however, again, the interpretation of <local-part> is
completely unspecified.
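
To illustrate how quoting hides an @ from the parser, here is a small
Python sketch that splits an address at its single unquoted @.  It is a
simplification made for this example: it ignores backslash escapes and
parenthesized comments, and the name split_address is hypothetical.

```python
def split_address(address):
    """Split <local-part>@<domain> at the one @ outside double quotes.

    Raises ValueError if the quotes are unbalanced or if the number of
    unquoted @ characters is not exactly one.
    """
    in_quotes = False
    at_positions = []
    for i, ch in enumerate(address):
        if ch == '"':
            in_quotes = not in_quotes  # toggle on every double quote
        elif ch == "@" and not in_quotes:
            at_positions.append(i)
    if in_quotes:
        raise ValueError("unbalanced double quotes")
    if len(at_positions) != 1:
        raise ValueError("exactly one unquoted @ is required")
    i = at_positions[0]
    return address[:i], address[i + 1:]
```

Here split_address('"personA@machine1"@machine2') yields the local-part
'"personA@machine1"' and the domain 'machine2', while the unquoted
personA@machine1@machine2 is rejected.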

>	IS THIS CORRECT??
>
>To:	machine3!personA@machine1%machine2
>
>	This again is a guess: machine1->machine2->machine3->personA
>	IS THIS CORRECT?

First, I'll treat the above as "machine3!personA%machine2@machine1",
correcting your earlier error.

The interpretation of an address that contains both ! (the UUCP
routing character) and @ (the SMTP routing character) depends on the
way the host processing the address is configured.  A host that only
knows about UUCP will most likely not recognize the @ at all, so it
will treat the above as the user "personA%machine2@machine1" on host
"machine3", so the message will probably take the path
machine3->machine1->machine2->personA.  A host that doesn't know
about UUCP will ignore the !, so it will treat it as user
"machine3!personA%machine2" on host "machine1", and the actual path
will depend on how machine1 deals with ! and %.  If a host recognizes
both then one of them will have to have a higher precedence than the
other, to resolve the ambiguity, and quoting could be used to force
grouping.  Some systems adjust the precedence based on the protocol
used to receive the message, so that if a message is received using
UUCP protocol the UUCP addressing style will be favored.
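
The two readings can be shown side by side with a short Python sketch.
The prefer_uucp flag stands in for whatever configuration a real host
uses to choose a precedence; it and the function name next_hop are
assumptions of this example, not part of any standard.

```python
def next_hop(address, prefer_uucp):
    """Return (next_host, remaining_address) for a mixed !/@ address.

    With prefer_uucp=True the leftmost ! wins (UUCP style); otherwise
    the rightmost @ wins (RFC-822 style).  If the preferred separator
    is absent, the other one is used.  Returns (None, address) when
    neither is present, meaning a local user.
    """
    bang = address.find("!")
    at = address.rfind("@")
    if bang != -1 and (prefer_uucp or at == -1):
        return address[:bang], address[bang + 1:]
    if at != -1:
        return address[at + 1:], address[:at]
    return None, address
```

For "machine3!personA%machine2@machine1", a UUCP-minded host gets next
hop machine3 with "personA%machine2@machine1" left over, while an
@-minded host gets next hop machine1 with "machine3!personA%machine2"
left over.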

>To:	machine2!personA,personB@machine1
>
>	Another Guess: machine1->personB
>				>machine2->personA
>	IS THIS CORRECT?  IS IT LEGAL?

Well, I would treat this as machine2->personA and machine1->personB,
because none of the mail sending programs I use automatically group
comma-separated addresses as being for the same host.  In other words,

To: personA,personB@machine1

is equivalent to

To: personA
To: personB@machine1

and most systems interpret the first line as meaning personA on the
local system.
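
A rough Python sketch of this "no grouping" rule follows.  It splits
naively on commas, so it ignores quoted strings and comments, and it
assumes that a bare name is qualified with the local host name;
split_recipients and local_host are names invented for the example.

```python
def split_recipients(field, local_host):
    """Split an address field on commas; every item stands on its own.

    An item with neither @ nor ! is taken to be a user on the local
    system and is qualified with `local_host`.  Nothing is inherited
    from neighbouring items.
    """
    recipients = []
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty entries such as a trailing comma
        if "@" not in item and "!" not in item:
            item = item + "@" + local_host
        recipients.append(item)
    return recipients
```

So "personA,personB@machine1" processed on a host called "here" gives
personA@here and personB@machine1, and "machine2!personA,personB@machine1"
gives machine2!personA and personB@machine1.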

>To:	(text)
>
>	The material within the "(" [lparen] and ")" [rparen] is a
>comment, and generates NO delivery address or forwarding at all.
>	IS A COMMENT ALONE ON A LINE LEGAL?

The only address field that may contain no addresses is the BCC:
field.  All the rest require at least one address.

>Also, any of the above formats may occur on the same line as any other
>as long as there is at least one unit of whitespace separating each
>form of address.  The order of such combinations, or of multiple
>occurrences of the "To:", "Cc:" and "Bcc:" header lines, is immaterial
>to the delivery, and they should be processed on a first come first
>served

Well, since the header isn't used to determine delivery information,
the order is obviously immaterial.  A message may be delivered to
people not listed in the header at all.  This can often happen when
using blind-carbon-copy, since it is permitted to leave the BCC: field
empty.  The SMTP protocol looks like:

MAIL FROM:<user1@domain1>
RCPT TO:<user2@domain2>
RCPT TO:<user3@domain2>
...
DATA
<header>
<blank line>
<body>
.

The message is delivered to user2 and user3, regardless of the names
specified in the header.  I am aware that there are other mail
delivery protocols that do actually use the addresses in the header.
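
The separation between the envelope recipients and the header can be
made concrete with a Python sketch that only builds the client's side
of the dialogue shown above.  It is an illustration under assumptions:
server replies and the opening greeting are left out, nothing is sent
over a network, and smtp_dialogue is a made-up name.

```python
def smtp_dialogue(sender, recipients, header_lines, body_lines):
    """Build the client-side lines of an SMTP transaction.

    Delivery is driven only by `recipients` (the RCPT TO commands);
    `header_lines` travel as opaque data after the DATA command.
    """
    lines = ["MAIL FROM:<%s>" % sender]
    lines += ["RCPT TO:<%s>" % rcpt for rcpt in recipients]
    lines.append("DATA")
    lines += header_lines
    lines.append("")  # blank line separates header from body
    for line in body_lines:
        # A body line starting with "." gets an extra "." so that it
        # is not mistaken for the end-of-data marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")  # a lone "." ends the data
    return lines
```

Calling it with recipients user2@domain2 and user3@domain2 and a header
of "To: someone-else@domain3" still produces RCPT TO lines for user2
and user3 only; the To: line is just part of the data.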

>Precedence map:
>
>	"(" ")" 	Grouping L->R	Matched pairs required.

Since parentheses are not binary operators, I don't know what you mean
by "grouping L->R".

>	"," "@"		Grouping R->L
>	"%"		Grouping L->R
>	"!"		Grouping L->R

The actual precedence is more like:

	double-quotes	No grouping	Matched Pairs required
	( )		No grouping	Matched Pairs required
	,		No grouping
	@ !		! groups L->R
			only one @ permitted
	%		groups R->L
	

>Finally, a question:  I have seen ";" [semicolon] referenced in some of
>what I have read on mail addressing.  What is the ";" for and have I
>missed any other symbols?

There is a special address syntax allowed instead of
<local-part>@<domain>, called named groups, whose syntax is:

	<name>: <users> ;

where <users> is 0 or more <local-part>@<domain> addresses (if <users>
is empty then this is just a placeholder for an unenumerated mailing
list).  This syntax only exists in headers, it isn't allowed in the
SMTP delivery protocol.
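
As an illustration of the named-group syntax, here is a small Python
sketch that pulls the name and the member addresses out of such a
string.  It assumes the members are separated by commas (as RFC-822
lists are) and does not handle quoting or comments; parse_group is a
hypothetical name.

```python
def parse_group(text):
    """Parse '<name>: <users> ;' into (name, [addresses]).

    An empty member list is allowed and yields an empty list.
    """
    name, colon, rest = text.partition(":")
    rest = rest.strip()
    if not colon or not rest.endswith(";"):
        raise ValueError("not a named group: %r" % text)
    members = [m.strip() for m in rest[:-1].split(",") if m.strip()]
    return name.strip(), members
```

For example, parse_group("staff: a@x, b@y;") gives the name "staff"
with members a@x and b@y, and parse_group("list-members:;") gives the
name with an empty member list, i.e. just a placeholder.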

The complete specification of the format of message headers is in
RFC-822.  SMTP is described in RFC-821.  If you have access to the
Internet these can be copied from SRI-NIC.ARPA, from the files
RFC:RFCnnn.TXT.  You can get hardcopy from the DDN Network Information
Center; for information on this, call (800) 235-3155 between 7am and
4pm Pacific time.  Of course, neither of the above standards will
mention % or !, since they aren't officially part of the protocol.

---
Barry Margolin
Thinking Machines Corp.

barmar@think.com
seismo!think!barmar