[comp.protocols.iso.x400.gateway] RFC-1148/2: what about underlines

Christian.Huitema@mirsa.inria.fr (Christian Huitema) (02/28/91)

The current version of RFC-1148 contained a bug. By mistake, it was
recommended to apply the following transformations between RFC-822 strings and
X.400 Printable strings:

map space(RFC)		to space(X.400)
map underline(RFC)	to escape sequence (u)

This has proven to be very inadequate for the following reasons:

* underline is very frequently used in RFC "local parts".

* space is almost never used in native RFC local parts.

* the usage of the escape sequence "(u)" in an X.400 surname is awkward, and
puzzles regular X.400 users.

* the usage of spaces in RFC local parts implies the quoting of the local
part. These quoted addresses cannot be supported by some networks (e.g. UUCP).
They tend to confuse some widely deployed user agents like the standard unix
mailer "/bin/mail" and the standard Berkeley 4.[23] mailer.
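
The quoting requirement in the last point follows from the RFC 822 grammar:
an unquoted local part is a dot-separated sequence of atoms, and an atom may
not contain a space, a control character, or one of the "specials". A minimal
Python sketch of that check (the function names are illustrative and not
taken from any mailer):

```python
# RFC 822 "specials": characters that cannot appear in an unquoted atom.
SPECIALS = set('()<>@,;:\\".[]')

def is_atom(word):
    """True if word is a non-empty run of printable ASCII with no space or special."""
    return bool(word) and all(
        33 <= ord(ch) <= 126 and ch not in SPECIALS for ch in word
    )

def needs_quoting(local_part):
    """True if the local part cannot be written as atoms separated by dots."""
    return not all(is_atom(word) for word in local_part.split("."))
```

Under this check "Bertrand_Buclin" can be sent unquoted, while
"Bertrand Buclin" cannot.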

In order to cope with this problem, a number of local solutions have been
deployed, e.g. development of UUCP-specific mappings or local decisions to
deviate from the RFC. These partial solutions are clearly undesirable, as they
tend to jeopardize global connectivity and introduce non-reversible
mappings.

In order to enhance the interconnectivity, I suggest to adopt the following
bug correction:

map underline(RFC)	to space(X.400)
map space(RFC)		to escape sequence (u)
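
To make the difference concrete, here is a rough Python sketch of the two
tables. Only the space and underline entries come from this message; the
helper names are invented for illustration, and the other escape sequences
defined by RFC 1148 are deliberately left out.

```python
# RFC-822 character -> X.400 printable string, as currently specified.
CURRENT_1148 = {" ": " ", "_": "(u)"}
# The swapped table suggested above.
PROPOSED = {"_": " ", " ": "(u)"}

def rfc_to_x400(local_part, table):
    """Map each character through the table; unlisted characters pass through."""
    return "".join(table.get(ch, ch) for ch in local_part)

def x400_to_rfc(value, table):
    """Invert the table, treating "(u)" as a single three-character token."""
    reverse = {x400: rfc for rfc, x400 in table.items()}
    out = []
    i = 0
    while i < len(value):
        if value.startswith("(u)", i):
            out.append(reverse["(u)"])
            i += 3
        else:
            out.append(reverse.get(value[i], value[i]))
            i += 1
    return "".join(out)
```

With the current table, "Christian_Huitema" becomes "Christian(u)Huitema";
with the proposed one it becomes "Christian Huitema", and either table maps
the result back to the original local part.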

In order to protect existing software, we could perhaps establish a caveat
so that "when the local part is quoted, no mapping of characters should be
performed". But this should clearly come with a strong incentive to not use
the "quoted" expression when this can be avoided.

Christian Huitema

buclin@sic.epfl.ch (Bertrand Buclin) (03/06/91)

>In order to enhance the interconnectivity, I suggest to adopt the following
>bug correction:
>
>map underline(RFC)     to space(X.400)
>map space(RFC)         to escape sequence (u)

Why not map space to "(s)", it's more regular than "(u)" ?

>In order to protect existing software, we could perhaps establish a caveat
>so that "when the local part is quoted, no mapping of characters should be
>performed". But this should clearly come with a strong incentive to not use
>the "quoted" expression when this can be avoided.

This could just add some more confusion. If a quoted address is mapped
without character mapping, how will you detect once it is in X.400 world
that it was initially quoted ?

Example: "Bertrand Buclin"@Sic.Epfl.ch will, according to your suggestion,
map to: S=Bertrand Buclin;OU=Sic;...
When the message arrives from X.400 at an RFC-xxxx gateway, you will map
the address back to: Bertrand_Buclin@Sic.Epfl.CH, which is clearly not
the same address!

The defined mapping MUST be reversible and unambiguous!
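
The collision can be reproduced with a small Python sketch. It assumes the
swapped table (underline to space, space to "(u)") plus the "no mapping when
quoted" caveat, and it assumes the return gateway has no record of whether
the original was quoted. The function names are hypothetical.

```python
def to_x400_surname(local_part):
    """RFC-822 local part -> X.400 surname under the suggested rules."""
    quoted = (
        len(local_part) >= 2
        and local_part[0] == '"'
        and local_part[-1] == '"'
    )
    if quoted:
        # Caveat: a quoted local part is passed through without mapping.
        return local_part[1:-1]
    return local_part.replace(" ", "(u)").replace("_", " ")

def to_rfc_local_part(surname):
    """X.400 surname -> RFC-822 local part; the gateway cannot see past quoting."""
    return surname.replace(" ", "_").replace("(u)", " ")
```

Both '"Bertrand Buclin"' and 'Bertrand_Buclin' yield the surname
"Bertrand Buclin", and the way back always produces "Bertrand_Buclin", so
the quoted form is lost.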

>In order to enhance the interconnectivity, I suggest ...
That RFC-1148bis be also a revision of RFC-987 (officially). Except RFC1026
there has been (officially) no revision of RFC-987. I agree that part of
RFC-1148 could also apply to X.400(84), but it needs to be stated.

--
Bertrand Buclin                              Internet: Buclin@Sic.Epfl.CH
Service Informatique Central                 X.400:    C=CH;A=ARCOM;P=SWITCH;
Ecole Polytechnique Federale de Lausanne               O=EPFL;OU=SIC;S=Buclin;
CH-1015 Lausanne                             Tel:      +41 21 693 22 11
(Switzerland)                                Fax:      +41 21 693 22 20

S.Kille@cs.ucl.ac.UK (Steve Kille) (03/13/91)

Christian,

I've not had a flame at you for ages!

 >From:  Christian Huitema <Christian.Huitema@fr.inria.mirsa>
 >To:    Steve Kille <S.Kille@uk.ac.ucl.cs>
 >Subject: RFC-1148/2: what about underlines
 >Date:  Thu, 28 Feb 1991 12:13:47 +0100

 >The current version of RFC-1148 contained a bug. By mistake, it was
 >recommended to apply the following transformations between RFC-822 strings and
 >X.400 Printable strings:

This is not a bug and you know it.  This issue was discussed at the 1148
editing meeting at UCL.  You said that the space/_ mapping was needed, but
the meeting's consensus was not to do it.  Clearly, old discussions can be
reopened, but to pretend that they are new issues/surprises is not good
behaviour.

RFC 987 fudged the issue, by placing this mapping (and a set of others)
in an appendix and not clearly stating when the appendix should be used.
With 1148, we factored out this mapping into a separate RFC (1137), to be
used when traversing between full 822 and restricted 822 networks.  It
covers a number of cases more pathological than space/_.
(Incidentally, 1137 should probably be updated to deal with this new problem
of UUCP not accepting "/").

The real issue is: "Does RFC 822 support quoted string mappings, and
in particular quoting of space".

If this is the case, then anything claiming to follow 822 should support
this.  Any network which cannot support it should use 1137 as a gateway
mechanism.  The approach in 1148 of mapping space in X.400 to space in RFC
822 is clearly the most desirable.  For an 822 user, this means that X.400
addresses can be inserted with no character transpositions.   An X.400 user
has to know about several character encodings to input RFC 822 addresses -
and in particular (a).  All the encodings have the same style, and so (u) for
underscore is certainly a better choice than space.  You would get into a
horrible mess in dealing with an 822 address which contained space (these do
exist).

If this is not the case - for example if it is a recognised fact that this
usage causes problems and should be deprecated, then this should be noted.
There is a mechanism to do this, which is the host requirements.
If you really believe that quoted space is a problem on the Internet (I do
not), then this document should be changed.  Otherwise, it should be used as
a stick to get any broken implementations mended.


 >Christian Huitema


Steve

S.Kille@cs.ucl.ac.UK (Steve Kille) (03/13/91)

 >From:  Bertrand Buclin <buclin@ch.epfl.sic>
 >To:    ifip-gtwy@ch.epfl.sic
 >Subject: Re: RFC-1148/2: what about underlines
 >Date:  6 Mar 91 08:30:24 GMT


 >>In order to enhance the interconnectivity, I suggest ...
 >That RFC-1148bis be also a revision of RFC-987 (officially). Except RFC1026
 >there has been (officially) no revision of RFC-987. I agree that part of
 >RFC-1148 could also apply to X.400(84), but it needs to be stated.

I have come around to this opinion, and if I'm being honest, my major reason
for not doing it in the first place was laziness.  RFC 1148 has just fixed
too many bugs in RFC 987.  I will be adding an annexe to 1148bis which
specifies its application for 1984 nets.


Steve

buclin@sic.epfl.ch (Bertrand Buclin) (03/13/91)

>I have come around to this opinion, and if I'm being honest, my major reason
>for not doing it in the first place was laziness.  RFC 1148 has just fixed
>too many bugs in RFC 987.  I will be adding an annexe to 1148bis which
>specifies its application for 1984 nets.

Fine. It will make things much more easy. We will at last speak the same
language...

--
Bertrand Buclin                              Internet: Buclin@Sic.Epfl.CH
Service Informatique Central                 X.400:    C=CH;A=ARCOM;P=SWITCH;
Ecole Polytechnique Federale de Lausanne               O=EPFL;OU=SIC;S=Buclin;
CH-1015 Lausanne                             Tel:      +41 21 693 22 11
(Switzerland)                                Fax:      +41 21 693 22 20

Christian.Huitema@mirsa.inria.fr (Christian Huitema) (03/14/91)

>If you really believe that quoted space is a problem on the Internet (I do
>not), then this document should be changed.  Otherwise, it should be used as
>a stick to get any broken implementations mended.

I guess that this is exactly where we differ. It is not a matter of belief to
state that quoted parts do pose a problem on the Internet -- it is a fact.
This is ill supported by many user agents on the TCP-IP internet, and is not
supported at all by several non TCP-IP mail networks. And I have a profound
dislike for the idea of having two "RFC-822" formats -- one on the "real"
Internet, and another one on USENET; for one thing, there is no clear boundary
between the two nets.

Your approach is, in my opinion, moralistic and dictatorial (-). Insisting
that users get a new mail software before they could use our wonderfully
specified X.400 to RFC-822 gateway is contradictory with the general rule of
"be tolerant with what you accept, conservative with what you send". By
specifying a conservative mapping between spaces and underlines, you decouple
the problem of gateways and user software.

To put it briefly, I don't think that it is the mission of a gateway
specification to provide "a stick to get any broken implementations mended".
On the contrary, I think that this approach will result in a lesser
acceptance of the specification, and in chaotic coordination of the
gateways.

Christian Huitema

S.Kille@cs.ucl.ac.UK (Steve Kille) (03/14/91)

I should have been a bit clearer.  The document I was suggesting using as a
stick on "broken" implementations was the host requirements document, and
not my gateway spec.  Host requirements has placed a number of caveats on
RFC 822.

If we believe that 822 in reality does not handle quoted string, modify
the host requirements to make support for this feature optional.

If we believe that it does handle this feature, we should make efforts to
bring implementations up to scratch.

I would be more sympathetic if the space thing were the ONLY problem of this
nature in this mapping.   It is the most common, but there are other
analogous problems which do hit - although less frequently (The characters
"(" ")" "," ":" which are valid in X.400 addresses, and "/" which is
generated by 1148 and breaks some UUCP mailers.).   I dislike 90%
solutions.


Steve

Christian.Huitema@mirsa.inria.fr (Christian Huitema) (03/14/91)

In order to evaluate the real impact of blanks, underlines, quotes and
other amenities, I conducted a little experiment. I used the February
accounting files produced by our X.400 gateway, "kwai.inria.fr": it routed
6929 messages during that month. Then, I extracted all the addresses
which were mentioned in the envelopes, which totals 2759 addresses,
and I conducted a little analysis.

Out of the 6929 addresses, 332 used source routing -- which forces a DD
notation. Out of these 332 source routed addresses, 317 originated in the UK.
Most of the source routing was bogus, with examples like:

	computer-science.manchester.ac.uk:romao@bchm1.aclcb.purdue.edu

which would redirect messages bound for the US through Manchester.

In the 2759 addresses which I analysed, quotes were used five times,
for the following addresses:
	"CCF::EAMES"@hermes.mod.uk
	"CCF::GMD%hermes.mod.uk"@relay.mod.uk
	"CCF::MILNER"@hermes.mod.uk
	"grover%grover:atrc"@cs.ualberta.ca
	"ELEC!ROETHIG     "@enst.fr
The last address is probably the result of a user mistake. The first three
probably relate to an RFC to DECNET gateway, run for the UK ministry of
defense. The need for quoting results from the usage of the special character
":"; this could be easily alleviated if the gateway were a little more
clever, e.g. wrote its addresses as "EAMES@CCF.hermes.mod.uk". The fourth
address is probably buggy.

Out of the remaining 2754 addresses, 81 contained "special" characters, i.e.
characters that were neither alphabetical, nor numeric, nor a hyphen or a
dot. The special characters included:
    6: !
    1: #
   34: %
    1: +
    4: /
    2: =
    1: ?
   42: _
The question mark was found in the address:
	?!@#063#.uucp
which is indeed buggy. The bangs and percents result from attempts to assert
gateway addresses. The "/" resulted in one case from the RFC-822 notation of
an address generated at HP with the DD attribute "x4gate" (with 2 "=" signs),
and in another from the bogus address (invalid domain):

	postmaster/c@liuc.ucaen.inria.mcvax.fr

The "+" character was generated by Andrew:

	atkbb+Bad-Addresses@andrew.cmu.edu

The "#" character results from a creative gateway:

	kulander#kenneth%nersc.mfenet@ESNMRG.NERSC.gov

The "_" characters were found in bona fide user names.

All in all, this shows that the usage of quoting is extremely rare.
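
A tally of this kind can be approximated in a few lines of Python. This is a
sketch under stated assumptions, not the script that was actually run: it
treats "@" as non-special along with letters, digits, hyphen and dot, and it
counts each character once per address, which the figures above do not
confirm. The function names are illustrative.

```python
from collections import Counter

def special_chars(address):
    """Characters that are not alphanumeric, hyphen, dot, or the "@" separator."""
    return {ch for ch in address if not (ch.isalnum() or ch in "-.@")}

def tally(addresses):
    """Count, for each special character, how many addresses contain it."""
    counts = Counter()
    for address in addresses:
        counts.update(special_chars(address))
    return counts

# A few of the addresses quoted above, used here only as sample input.
sample = [
    "atkbb+Bad-Addresses@andrew.cmu.edu",
    "kulander#kenneth%nersc.mfenet@ESNMRG.NERSC.gov",
    "postmaster/c@liuc.ucaen.inria.mcvax.fr",
]
counts = tally(sample)
```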

Christian Huitema

S.Kille@cs.ucl.ac.UK (Steve Kille) (03/15/91)

A check on our system here leads to similar results.  The feature is used,
although not extensively (DECNet addressing, quoting spaces in 1148, and
various oddments).

You conclude from this that it is not used much, and that the feature is
avoidable, and so it does not matter.

My conclusions are very different.  First, the feature is used for some real
amount of valid traffic. I'm sure that the originators and recipients of
these messages would not like to hear your dismissal of the validity of
their traffic.   It clearly does not give undue problems as:
   - the traffic exists and flows
   - we have not had lots of complaints from X.400 users who cannot
   	access parts of the 822 world


Second, because it is used, we should make sure that the feature is
widely and sensibly supported.  The host requirements document is the means
to achieve this.

As a consequence, I should feel free to use the feature to make 1148 work in
the best possible manner.  By this, I mean the best service for those sites
(the majority) that correctly support the host requirements in this area.


Steve

Christian.Huitema@mirsa.inria.fr (Christian Huitema) (03/15/91)

Steve,

I cannot accept this conclusion. Apart from the "space in ORName" artefact,
the feature is only used by questionably engineered gateways -- gateways which
are in no way representing standard RFC-822 or X.400 users. By removing this
requirement, we will guarantee that all X.400 users for which a mapping of
domain exists can exchange mail with all native Internet and Usenet users,
regardless of the host requirement (which, technically, only applies to TCP-IP
connected hosts). In short, problems would only hit the deliberate problem
makers -- the designers of these dubious gateways.

I suppose that we should follow the advice of the various network
administrators, rather than a purely theoretical approach.

Christian Huitema

S.Kille@cs.ucl.ac.UK (Steve Kille) (03/15/91)

Christian,

Looks like we just plain disagree.   Abuse about "questionably engineered gateways"
does not seem to be progressing things.

Steve