[net.mail] quotes in addresses

gnu@sun.uucp (John Gilmore) (05/16/84)

One of the mistakes made by RFC822 was re-suggested recently by Greg
Skinner -- non-nestable quotes in addresses:

   ...!seismo!"hao!...!sri-unix!enduser@endarpahost.arpa"

This works fine in simple cases but the lossage is that the quotes do
not nest.  Suppose your host received the above address and wanted to
pass it on to another host with different conventions.  The obvious
thing would be:

   mysite!"...!seismo!"hao!...!sri-unix!enduser@endarpahost.arpa""

but you can't tell whether the first two quotes match, or the first one
and the last one.  This is part of why we have problems embedding !, @,
and % addresses inside each other -- if quotes worked, you could hide
the special characters.

If you really want to implement nestable quotes, use [] or {}.  There
is already an assigned meaning for () and <> in addresses.  There is
also an assigned meaning for [] in a host name (numeric internet
address), so for simplicity {} should be used.

rpw3@fortune.UUCP (05/19/84)

fortune!rpw3    May 18 21:05:00 1984

[A long response... I hope it's worth it...]

+---------------
| ***** fortune:net.mail / sun!gnu /  2:31 am  May 16, 1984
| One of the mistakes made by RFC822 was re-suggested recently by Greg
| Skinner -- non-nestable quotes in addresses:
| 
| ...!seismo!"hao!...!sri-unix!enduser@endarpahost.arpa"
| 
| This works fine in simple cases but the lossage is that the quotes do
| not nest.
+---------------

Go read RFC822 again.

Naked quotes don't nest, true, but on pages 10 & 11 [sections 3.3 & 3.4]
of my copy of 822 I read that a "quoted-string" may have a "quoted-pair"
within it, and it sure looks like '\"' (backslash, double-quote) is a
permitted "quoted-pair".

+---------------
| Suppose your host received the above address and wanted to
| pass it on to another host with different conventions.  The obvious
| thing would be:
| 
| mysite!"...!seismo!"hao!...!sri-unix!enduser@endarpahost.arpa""
| 
| but you can't tell whether the first two quotes match, or the first one
| and the last one.
+---------------

Actually, since it's already at your host ("mysite"?), I'm not quite sure
what you mean by this, but assuming that we are still talking about the
case where "seismo" is both a uucp host and an ARPA host and we want to
force it to treat the message as uucp, the solution is simply the one given:

	mysite!...!seismo!hao!...!sri-unix!"enduser@endarpahost.arpa"

The single set of quotes hides the "@" from "seismo" and any other ARPA
site before "sri-unix". "sri-unix" sees an address which is ENTIRELY
quoted, and strips one layer of quotes, including turning all "quoted-pairs"
into "CHARs" before further interpretation. (NOTE: This doesn't currently
work, since "sri-unix" barfs on the quotes.)
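
The "strip one layer" step can be sketched in a few lines of Python. This is
only an illustration of the reading of RFC822 given above, not code from any
real mailer; the name strip_one_layer is made up for the example.

```python
def strip_one_layer(s):
    """Remove the delimiting quotes of an ENTIRELY quoted address and
    turn each quoted-pair (backslash + char) into the plain char."""
    if len(s) < 2 or s[0] != '"' or s[-1] != '"':
        raise ValueError("address is not entirely quoted")
    body = s[1:-1]
    out = []
    i = 0
    while i < len(body):
        c = body[i]
        if c == "\\" and i + 1 < len(body):
            i += 1              # drop the backslash ...
            c = body[i]         # ... and keep the character it protected
        elif c == '"':
            # A bare quote in the middle means this was not ONE quoted-string.
            raise ValueError("unescaped quote inside the string")
        out.append(c)
        i += 1
    return "".join(out)


print(strip_one_layer('"enduser@endarpahost.arpa"'))
```

An address that is only partly quoted is rejected rather than guessed at,
which is the conservative choice for a sketch like this.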

The more complicated case (and the one I believe you are really trying
to address) is sending via a uucp chain to a known ARPA host a message
that will be sent out over some other net (such as UUCP) at the known ARPA
host, while getting any unknown ARPA hosts in the middle to not meddle.
For example, if I want to send a message to myself the long way around
the San Francisco Bay, I should be able to mail to:

	hpda!hplabs!sri-unix!"\"dual!fortune!rpw3\"@Berkeley.ARPA"

without either "hpda" or "hplabs" trying to send on the ARPAnet (assuming
they even could). The outgoing router at each site sees:

	hpda.uucp:	hplabs!sri-unix!"\"dual!fortune!rpw3\"@Berkeley.ARPA"
	hplabs.uucp:	sri-unix!"\"dual!fortune!rpw3\"@Berkeley.ARPA"
	sri-unix.arpa:	"dual!fortune!rpw3"@Berkeley.ARPA
	ucbvax.uucp:	dual!fortune!rpw3
	dual.uucp:	fortune!rpw3
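
Why the intermediate sites leave the "@" alone can be shown with a rough
Python sketch that lists the special characters a router would see OUTSIDE
quotes. The function name visible_specials and the choice of "!@%" as the
special set are assumptions made for the example, and it only honors
backslashes inside quotes.

```python
def visible_specials(addr, specials="!@%"):
    """Return the special characters in addr that are not hidden by
    double quotes, in the order a left-to-right scan meets them."""
    found = []
    in_quotes = False
    i = 0
    while i < len(addr):
        c = addr[i]
        if c == "\\" and in_quotes:
            i += 2              # skip a quoted-pair: backslash plus next char
            continue
        if c == '"':
            in_quotes = not in_quotes
        elif not in_quotes and c in specials:
            found.append(c)
        i += 1
    return found


# What hplabs's router sees: only "!" is visible, so uucp is used.
print(visible_specials(r'sri-unix!"\"dual!fortune!rpw3\"@Berkeley.ARPA"'))
# What sri-unix sees after stripping one layer: only "@" is visible.
print(visible_specials('"dual!fortune!rpw3"@Berkeley.ARPA'))
```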

The analogous loop starting in ARPA-land (borrowing Eric Fair's name since I'm
not an ARPAnaut) is:

	To: "fortune!dual!ucbvax!\"fair@Berkeley.ARPA\""@sri-unix.ARPA

The outer set of quotes protects the inner "@" and "!" from Berkeley's gaze,
and the inner set protects the "@" from "sri-unix", forcing uucp to be used.
('Tis a shame, but this also doesn't really work! Yet.)

+---------------
| This is part of why we have problems embedding !, @,
| and % addresses inside each other -- if quotes worked, you could hide
| the special characters.
+---------------

If all systems really obeyed RFC822, quotes should work, as I read it.

+---------------
| If you really want to implement nestable quotes, use [] or {}.  There
| is already an assigned meaning for () and <> in addresses.  There is
| also an assigned meaning for [] in a host name (numeric internet
| address), so for simplicity {} should be used.
+---------------

No, use the "quoted-pair" backslash-doublequote. It's already in the standard.
All routers SHOULD understand it, even if they don't already. (Many of these
problems are not bugs in RFC822, but bugs in supposed implementations!)
Generations of UNIX Shell and Troff programmers understand it (even as they
detest it). It is already "nestable", in the forms "\\\"", "\\\\\\\"", etc.
We really don't need more than two levels of quoting except once in a long
long while, and for two levels the "quoted-pair" is clean enough.
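
The growth in backslashes is easy to check mechanically. The Python sketch
below (the names escape and quote_at_depth are invented for the example)
re-escapes a single double quote once per enclosing level; each level
doubles the existing backslashes and adds one, giving 1, 3, 7, ... of them.

```python
def escape(s):
    """Escape backslashes first, then double quotes, as quoted-pairs."""
    return s.replace("\\", "\\\\").replace('"', '\\"')


def quote_at_depth(depth):
    """How a double quote is written when buried `depth` quoting
    levels deep (depth 0 is the bare character)."""
    s = '"'
    for _ in range(depth):
        s = escape(s)
    return s


for depth in range(4):
    text = quote_at_depth(depth)
    print(depth, text, text.count("\\"))
```

For depth n the count is 2**n - 1, which is why two levels are tolerable and
four are not.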

The standard is already quite explicit about when they should be stripped
(essentially, when a "local-part" is about to be processed). See page 13,
section 3.4.4:

	"Quotation marks that delimit a quoted string and backslashes
	that quote the following character should NOT accompany the
	quoted-string when the string is passed to processes that do
	not interpret data according to this specification (e.g., mail
	protocol servers)."

	[Emphasis in the original]

I agree that there may be a bit of confusion here, but my interpretation
is that such a process might look at the now-unquoted local-part and decide
that the (formerly quoted-) string is now "data to be interpreted according
to the specification"! This is a back-handed way to define a "relay"!

That is, I claim that from any ARPAnet site, the following two addresses
should be handled virtually identically:

To: Foo Bar <@seismo.ARPA: @su-score.ARPA: @sri-unix.ARPA: foobar@Berkeley.ARPA>

To: "\"\\\"foobar@Berkeley.ARPA\\\"@sri-unix.ARPA\"@su-score.ARPA"@seismo.ARPA
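
The second form can be generated from the list of relays rather than typed
by hand. A minimal Python sketch, assuming the reading of RFC822 argued
above (the helper names quote and nest_route are hypothetical):

```python
def quote(s):
    """Make s a quoted-string: escape backslash and double quote as
    quoted-pairs, then wrap the result in double quotes."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def nest_route(relays, mailbox):
    """Wrap mailbox so that each relay, outermost first, strips one
    layer of quoting and finds the next hop's address inside."""
    addr = mailbox
    for host in reversed(relays):   # build from the innermost hop outward
        addr = quote(addr) + "@" + host
    return addr


print(nest_route(["seismo.ARPA", "su-score.ARPA", "sri-unix.ARPA"],
                 "foobar@Berkeley.ARPA"))
```

Run on the three relays of the first address, this prints the quoted
address shown in the second.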

Since we will seldom get that deep, I have no problem with the ugliness.
I would prefer to see people try to work with the mechanism that's already
defined. Now if the RFC822 syntax indeed has a bug, somebody please comment,
but so far all I have seen is (a) people who haven't read the spec quite
closely enough, and (b) implementations that don't obey the spec completely
[probably because of (a)].
	
Rob Warnock

UUCP:	{ihnp4,ucbvax!amd70,hpda,harpo,sri-unix,allegra}!fortune!rpw3
DDD:	(415)595-8444
USPS:	Fortune Systems Corp, 101 Twin Dolphin Drive, Redwood City, CA 94065

ka@hou3c.UUCP (Kenneth Almquist) (05/20/84)

RFC822 allows for nested quoting.  You can include a quote inside a
quoted string by escaping it with a backslash.
				Kenneth Almquist