[comp.mail.misc] is uunet breaking your headers?

david@quad1.quad.com (David A. Fox) (05/25/90)

In article <1258@chinacat.Unicom.COM> chip@chinacat.Unicom.COM (Chip Rosenthal) writes:
>In an attempt to cater to broken mailers, uunet is munging From: lines.
>Even in the case of a legal and valid FQDN it is converting them to bang
>addresses.  I think this is evil and rude.  Am I alone in this?
>

Nope. And neither is uunet -- My forwarder (uxc.cso.uiuc.edu) also does 
this. They didn't originally (i.e., years ago), but they started sometime 
in the past couple years. It seems to be the "standard" technique for 
MX forwarders, as far as I can tell. 

In article <KARL.90May23204613@giza.cis.ohio-state.edu> karl_kleinpaste@cis.ohio-state.edu writes:
>peter@ficc.ferranti.com writes:
>   (I was expecting to get user@ficc.ferranti.com, and didn't have any check
>   in there for ficc.ferranti.com!user)
>
>If you use sendmail, in S3 do:
>R$*.$*!$*		$3@$1.$2		invert to @-format
>but only after verifying that there are no @'s in the address already.
>(That is, my use of the above rule is as the last stage of my Domain
>Absolutist Rabid Rerouter ruleset trio.)
>

I suppose. I thought that S3 wasn't quite the place to put it; in fact,
I was afraid to put it into S1 as well, since that would affect mail
"just passing through" my site. Furthermore, I suspect that the above
rule matches more addresses than I would care to see attacked anyway -
Wouldn't this turn "some.place!some.other.place!user" into
"user@some.place.some!other.place"?  Or do I misunderstand the pattern
matching rules? 

I've recently seen some of these malformed addresses (user@site!site), 
BTW - I won't say where from... :-) 

I thought it prudent to apply this "mx-hack" rewriting only under the 
following conditions:

1. When the mail is destined for a host *in* my domain. 
2. When the return address looks (exactly) like: 

	"mx-servant!some.domain.thing!user"

I wound up putting my fix into my mailer specific sender rewriting 
rules for my local (intra-domain) mailers. 

Was this too paranoid, or what? 

-- 
--
David  A. Fox					Quadratron Systems Inc.	
Inet: david@quad.com, Postmaster@quad.com
UUCP: david@quad1.uucp 
      uunet!psivax!quad1!david

"Man, woman, child... All is up against the wall - of science." 

karl_kleinpaste@cis.ohio-state.edu (05/25/90)

me:
   > If you use sendmail, in S3 do:
   > R$*.$*!$*		$3@$1.$2		invert to @-format
   > but only after verifying that there are no @'s in the address already.
   > (That is, my use of the above rule is as the last stage of my Domain
   > Absolutist Rabid Rerouter ruleset trio.)

david@quad1.quad.com writes:
   I suppose. I thought that S3 wasn't quite the place to put it; In fact,
   I was afraid to put it into S1 as well, since that would affect mail
   "just passing through" my site.

I consider S3 to be the right place to do it because S3 is called
first for the purpose of canonicalization into the most correct form.
Or so I view it, anyway.  And yes, it affects mail "just passing
through."  I consider this a feature, not a bug.

   Furthermore, I suspect that the above
   rule matches more addresses than I would care to see attacked anyway -
   Wouldn't this turn "some.place!some.other.place!user" into
   "user@some.place.some!other.place"?  Or do I misunderstand the pattern
   matching rules?

The rule does the right thing, given that it's the last of the
previously mentioned 3.  The complete set is:

R$*.$*!$*@$*		$1.$2!$3		lose @-portion
R$*!$*.$*!$*		$2.$3!$4		strip excess left-hand
R$*.$*!$*		$3@$1.$2		invert to @-format

Note that the 2nd one is responsible for getting rid of all the
left-hand dotted-domain specifications except the last one.  Then the
3rd one turns it around.  (I suppose I should have posted the whole
set in the previous article, to avoid this confusion.  Ohwell.)
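
For anyone who wants to trace the three rules by hand, here is a rough
Python approximation. It assumes each $* can be modelled as a non-greedy
regex group and that a rule is retried until it stops matching. Real
sendmail matches token by token, so treat this as a sketch, not a
faithful emulator. The names RULES and rewrite are made up for
illustration.

```python
import re

# Assumption: "$*" behaves like a non-greedy "(.*?)" group. Real sendmail
# works on tokens rather than characters, so this is only an approximation.
RULES = [
    (re.compile(r"^(.*?)\.(.*?)!(.*?)@(.*)$"), r"\1.\2!\3"),  # lose @-portion
    (re.compile(r"^(.*?)!(.*?)\.(.*?)!(.*)$"), r"\2.\3!\4"),  # strip excess left-hand
    (re.compile(r"^(.*?)\.(.*?)!(.*)$"), r"\3@\1.\2"),        # invert to @-format
]


def rewrite(addr, max_passes=20):
    """Apply each rule in order, retrying a rule until it no longer matches."""
    for pattern, template in RULES:
        for _ in range(max_passes):  # guard against a rule that never settles
            match = pattern.match(addr)
            if match is None:
                break
            addr = match.expand(template)
    return addr


if __name__ == "__main__":
    for sample in ("some.place!some.other.place!user",
                   "f.q.d.n!owhn!user",
                   "uunet!host.domain!user@relay.site"):
        print(sample, "->", rewrite(sample))
```

Under these assumptions, "some.place!some.other.place!user" comes out
as "user@some.other.place": the 2nd rule drops "some.place!" before the
3rd rule inverts what is left.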

Before anyone flames that I'm violating numerous standards in the 1st
rule by deleting the RHS even if it's not "me," please first show me
_existing_, _real_world_ failure cases where this heuristic doesn't
work.  Yes, it's formally a violation of the RFCs; no, it doesn't hurt
anything on a practical basis, and since the practical basis is the
goal (i.e., get mail delivered), I have a clear conscience on that.

And while we're at it, please avoid the flames over Rabid Rerouting of
!-paths, either General RR or my much more restricted Domain
Absolutist version; again, it works, and we've gone over that ground
at least twice in the last year or so.  The only known failure case
was fidonet.org, and that has been resolved.

--karl

Makey@Logicon.COM (Jeff Makey) (05/26/90)

In article <KARL.90May25090924@giza.cis.ohio-state.edu> karl_kleinpaste@cis.ohio-state.edu writes:
>The complete set is:
>
>R$*.$*!$*@$*		$1.$2!$3		lose @-portion
>R$*!$*.$*!$*		$2.$3!$4		strip excess left-hand
>R$*.$*!$*		$3@$1.$2		invert to @-format

If you insist on doing Rabid Rerouting (a controversial decision I see
no point in discussing here), then this looks like a good way to do
it, except that I would replace all of the "$*" elements with "$+"
to keep from rewriting such obvious garbage as "!.!@".
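
The difference can be seen in a small Python sketch. It assumes "$*"
and "$+" can be approximated by the non-greedy regex groups "(.*?)" and
"(.+?)", which is a simplification of sendmail's token-based matching.
STAR, PLUS and matches are illustrative names only.

```python
import re

# Approximations of the first rule's left-hand side (an assumption, since
# sendmail matches tokens and this matches characters).
STAR = re.compile(r"^(.*?)\.(.*?)!(.*?)@(.*)$")  # R$*.$*!$*@$*
PLUS = re.compile(r"^(.+?)\.(.+?)!(.+?)@(.+)$")  # R$+.$+!$+@$+


def matches(pattern, addr):
    """Return True when the whole address fits the pattern."""
    return pattern.match(addr) is not None


if __name__ == "__main__":
    for sample in ("!.!@", "host.domain!user@relay"):
        print(repr(sample), "star:", matches(STAR, sample),
              "plus:", matches(PLUS, sample))
```

With "$*" every wildcard may match nothing, so the bare punctuation
"!.!@" is accepted; with "$+" each wildcard needs at least one
character and the garbage is left alone.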

I believe the original question was simply how to change
"host.domain!user" into "user@host.domain", with no suggestion that
rerouting was desired.  Here is one way to do it in sendmail:

R$+.$+!$+		$@$3<@$1.$2>		resolve mixed UUCP/domain

It belongs near the end of ruleset 3, *after* this rule:

R$*<@$+>		$@$1<@$2>		already canonical

Note that if either of these rules is applied it will result in the
current ruleset being exited.  Your usage may vary.

                           :: Jeff Makey

Department of Tautological Pleonasms and Superfluous Redundancies Department
    Disclaimer: All opinions are strictly those of the author.
    Internet: Makey@Logicon.COM    UUCP: {nosc,ucsd}!logicon.com!Makey

brian@ucsd.Edu (Brian Kantor) (05/26/90)

I don't know why I'm putting my foot into this one, but....


An internetwork mail router must, by definition, transform addresses from
one of its networks to addresses that are acceptable to ALL hosts on
its destination network.

The uucp network does not define addresses. Instead, it defines paths
of the form [site!]site!user.  Only the next site in a path need be
known to a uucp sender; as the message traverses the uucp network, each
site discards its hostname from the front of the path and resends the
file to the next site in the path.  There is no address; there is only a
path.

The UUCP Mapping project has attempted (and succeeded, in large part) to
define uucp addresses of the form @host.uucp.  This functions in the
uucp network by replacing the address with a path (usually a path
calculated by picking a least-cost route from the uucp map data).
But the '@'-address is eliminated and replaced by the path, since uucp
does not assign any significance at all to the '@'.

Note that the @ address may remain in the mail headers, but that isn't
significant to the uucp mail network; the headers are NOT used for mail
delivery, although many uucp mail systems will correctly prefix the Unix
header From_ (that's "From ", not the RFC822 "From: ") as the mail is
relayed through them.

Some uucp hosts, particularly those running sendmail, WILL update the
RFC822 "From: " line.  Others don't, which means that the From: line
is of questionable integrity if the mail has ever passed through a uucp
link.

Strictly speaking, an internet-compliant (i.e., it pays attention to
the From: line) mailer receiving mail from uucp should replace the From:
line with the From_ line (since the From: is unreliable), unless the From:
address is a valid domain-style address.  Few sites do this yet.

So, because uucp requires PATHS of the form site!site!user, a
properly-operating internet-to-uucp gateway will transform a From:
address of the form user@domain to domain!user, preserving the domain
unless the domain is ".uucp",  then prefix that with the gateway's uucp
address.  Anything else would require the final destination of the mail
to be able to interpolate or translate addresses, and you simply can't
depend on that capability.

Thus if the From: line is of the form
	user@host.domain
(where 'domain' is NOT "uucp"), then transform the address to
	gateway!host.domain!user
And that's what uunet (and ucsd, and decwrl, and lots of other gateways)
are doing.  It's clearly the right thing to do by default.
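
As a minimal Python sketch of that default transformation (the function
name is invented here, and dropping the ".uucp" suffix is my reading of
"preserving the domain unless the domain is .uucp"):

```python
def internet_to_uucp_from(addr, gateway):
    """Turn user@host.domain into gateway!host.domain!user."""
    user, sep, domain = addr.rpartition("@")
    if not sep:
        # No '@' at all: not a domain-style address, so leave it untouched.
        return addr
    if domain.lower().endswith(".uucp"):
        # Assumption: host.uucp is reduced to the plain UUCP name "host".
        domain = domain[: -len(".uucp")]
    return f"{gateway}!{domain}!{user}"


if __name__ == "__main__":
    print(internet_to_uucp_from("chip@chinacat.Unicom.COM", "uunet"))
    print(internet_to_uucp_from("david@quad1.uucp", "uunet"))
```

The result is a path that a site knowing only its neighbors can follow,
at the cost of hiding the original @-form from smart recipients.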

Gateway sites which want to go to the trouble of also supporting uucp
"smart" hosts could have a way to leave the From lines alone, on the
declaration that the destination can handle it.  But they shouldn't
DEFAULT that way.
		- Brian

chip@chinacat.Unicom.COM (Chip Rosenthal) (05/27/90)

In article <13952@ucsd.Edu> brian@ucsd.Edu (Brian Kantor) writes:
>I don't know why I'm putting my foot into this one, but....

I'm glad you did.

One of the things I'm assuming is that most dumb (uucp) sites are going
to use From_ to route a return message while a smart site is going to use
From:.  My thinking is that if sites would leave From: alone and update
From_, then things would work.  Dumb sites would have a From_ with a full
path back to the sender, and smart sites would have an untarnished From:
to work with.

Just as you have sites saying "@ doesn't work in uucp, I'll make it !",
there are sites which say "! isn't RFC822, I'll make it @".  Then you
have the added dimension of complexity of what sites do with '!' in From:,
some add their name, others don't.  Everybody has their own idea of what
to do with From:, each arguable in its own right, but all ending up in a
conglomeration of gibberish.

>Gateway sites which want to go to the trouble of also supporting uucp
>"smart" hosts could have a way to leave the From [From:? -chip] lines
>alone, on the declaration that the destination can handle it.  But they
>shouldn't DEFAULT that way.

That's what I disagree with, but I might be confused in thinking that
dumb uucp mailers are using From_ instead of From: to route replies.  If
so, please do correct me.

In any case, '@' addressing is a good thing, as evidenced by the success
of the UUCP Project.  We should strive to support it, not hobble it.
Even if munging is the correct default, it should not be done as secretively
as uunet does.  It was a surprise to find a "smart hosts" list exists,
and even *after* requesting to be put on the list I'm still getting my
incoming From:'s munged.  That doesn't cause a problem, since the result
is invariably a usable "uunet!host.domain!user", but with munging discretion
so difficult to achieve, I'm worried about what's happening to my outgoing
stuff.  Some of us uucp sites do take pains to try to maintain sanity in
our local mail system, and this should be encouraged.

-- 
Chip Rosenthal                            |  You aren't some icon carved out
chip@chinacat.Unicom.COM                  |  of soap, sent down here to clean
Unicom Systems Development, 512-482-8260  |  up my reputation.  -John Hiatt

keld@diku.dk (Keld Jørn Simonsen) (05/27/90)

Another way to do the domain based headers/envelope is to 
have it done in the mailer specific rewriting rules of sendmail.
Sendmail 5.61 with IDA enhancements even has specific rewriting rules
for sender and recipient and for the header and the envelope,
that is in toto 4 separate rewriting rules available.

We use this at the Danish Internet/UUCP backbone to customise
our uucp connections, of which we have a little over 100.
Each of these have a mailer specification, which tells
if they need the header in domain-style, the envelope in
bang style etc. We have even added support for 8-bit mail
in this way on a site dependent basis, handling different kinds of
7- and 8-bit character sets and conversion between them.

Keld Simonsen, DKnet
keld@dkuug.dk

fitz@wang.com (Tom Fitzgerald) (05/30/90)

chip@chinacat.Unicom.COM (Chip Rosenthal) writes:
> One of the things I'm assuming is that most dumb (uucp) sites are going
> to use From_ to route a return message while a smart site is going to use
> From:.  My thinking is that if sites would leave From: alone and update
> From_, then things would work.

This sounds like an excellent distinction.  But if the From: line contains
a bang address, it should still be updated for consistency.  If the message
is going to another UUCP site, the From: address should get prefixed with
the name of the current site; and if it's going onto the Internet, it
should get @current.site appended to it.

Actually, it can be optimized a little more, so that if the From: line
begins with a "fqdn!", you use that in place of the current.site.

> ... I might be confused in thinking that
> dumb uucp mailers are using From_ instead of From: to route replies.  If
> so, please do correct me.

Personally, I think you're right about that.  There's another thing I
worry about - what happens to the From_ line when a message enters and
leaves the Internet?  Do SMTP-->UUCP gateways use the contents of the
From: line to generate the From_ line, or do they use the MAIL FROM:<>
address from the SMTP handshake, or what?  If the From_ line gets
generated from scratch (">From daemon <timestamp> remote from <myname>"),
that would screw up the old mailers.  Or does the From_ line actually
go over SMTP?

Actually, there's another potential problem.  Are there any dumb UUCP
mailers that stick "myname!" onto the front of From: addresses that
already have an "@" in them?

If none of these things are problems, here's the way it should work:

UUCP->Internet gateway

	If the From: line is ...@f.q.d.n, leave it alone
	If the From: line is f.q.d.n!... change it to ...@f.q.d.n
	Otherwise append "@current.host" to it.

Internet->UUCP gateway

	Leave the From: line, user@f.q.d.n, alone
	(If necessary) build the From_ line from the From: line
		">From f.q.d.n!user <timestamp> remote from <myname>"

UUCP->UUCP transfers:

	Always do the right thing to the From_ line
	If the From: line has an @ in it, leave it alone
	Otherwise put <myname>! at the front of it

There are some uglinesses here that could be made to go away if you know
something about the receiving site.  All this assumes ignorance on the
part of the sending system.
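
The three cases above can be sketched in Python roughly as follows.
This is only an illustration: the function names are invented, and
"anything containing a dot is an f.q.d.n" is a simplifying assumption,
not a real test for a registered domain.

```python
def is_fqdn(name):
    # Assumption: a dot anywhere in the name marks it as fully qualified.
    return "." in name


def uucp_to_internet(from_addr, current_host):
    """From: handling at a UUCP->Internet gateway."""
    _, at, domain = from_addr.rpartition("@")
    if at and is_fqdn(domain):
        return from_addr                      # ...@f.q.d.n: leave it alone
    first, bang, rest = from_addr.partition("!")
    if bang and is_fqdn(first):
        return f"{rest}@{first}"              # f.q.d.n!... -> ...@f.q.d.n
    return f"{from_addr}@{current_host}"      # otherwise append @current.host


def internet_to_uucp(from_addr, myname, timestamp):
    """Leave From: alone and build a From_ line from it."""
    user, _, domain = from_addr.rpartition("@")
    from_line = f">From {domain}!{user} {timestamp} remote from {myname}"
    return from_addr, from_line


def uucp_to_uucp(from_addr, myname):
    """From: handling on a UUCP->UUCP hop (From_ is handled elsewhere)."""
    if "@" in from_addr:
        return from_addr
    return f"{myname}!{from_addr}"


if __name__ == "__main__":
    print(uucp_to_internet("wang.com!fitz", "current.host"))
    print(internet_to_uucp("fitz@wang.com", "uunet", "Wed May 30 12:00:00 1990"))
    print(uucp_to_uucp("wang!fitz", "uunet"))
```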

> Dumb sites would have a From_ with a full
> path back to the sender, and smart sites would have an untarnished From:
> to work with.

Just so.

[Re: UUNET's munging]
> and even *after* requesting to be put on the list I'm still getting my
> incoming From:'s munged.

When we got ourselves moved onto the "smart" list, the change didn't kick
in until the next time UUNET restarted its sendmail, which was several
days later.

---
Tom Fitzgerald      Wang Labs           fitz@wang.com
1-508-967-5278      Lowell MA, USA      ...!uunet!wang!fitz

tp@mccall.com (05/30/90)

In article <13952@ucsd.Edu>, brian@ucsd.Edu (Brian Kantor) writes:
> I don't know why I'm putting my foot into this one, but....

Me either... BTW, this isn't directed to Brian specifically. This is
something that has bothered me for a long time, and I apologize in advance
if I get a little hot about it sometimes, but it causes me major headaches.

> Some uucp hosts, particularly those running sendmail, WILL update the
> RFC822 "From: " line.  Others don't, which means that the From: line
> is of questionable integrity if the mail has ever passed through a uucp
> link.

The sendmail sites (and anyone foolish enough to emulate them) are wrong.
Headers shouldn't be modified! This is according to RFC822. If you are
using headers, they are probably rfc822 headers, and thus you should follow
the rules that go with them. If you aren't using headers (i.e. you are
pretending to be a dumb uucp site because that's what you think you are
talking to), then you shouldn't touch them because they are part of the
message. The only header used by "dumb" uucp sites is the "From " header,
and that is the only one that should be modified.

Since sendmail sites do muck with the header, I have to have extensive
rewrite rules to clean up addresses so they are useable. I get unuseable
From: lines because my neighboring site is a sendmail site.

> Strictly speaking, an internet-compliant (i.e., it pays attention to
> the From: line) mailer receiving mail from uucp should replace the From:
> line with the From_ line (since the From: is unreliable), unless the From:
> address is a valid domain-style address.  

You should never remove a From: header! You can add Received: lines, but
you shouldn't touch the existing headers. If by internet-compliant you also
mean rfc822 compliant, this is one of the rules of the game. If everybody
would leave the From: line alone, it would be reliable (i.e. it would
reliably be what the sender set it to). It is only next to useless because
sendmail sites mung it into garbage.

> Few sites do this yet.

Thank God! I'm not sure how I'd EVER decode the result to a useable from
address (perhaps this explains why one person here has had a great deal of
trouble getting mail to or through the ucsd.edu domain).

> So, because uucp requires PATHS of the form site!site!user, a
> properly-operating internet-to-uucp gateway will transform a From:
> address of the form user@domain to domain!user, preserving the domain
> unless the domain is ".uucp",  then prefix that with the gateway's uucp
> address.  Anything else would require the final destination of the mail
> to be able to interpolate or translate addresses, and you simply can't
> depend on that capability.

NO! uucp does require paths. uucp also doesn't have the vaguest idea what a
header line is. Even the "From " line is only so mailers can attempt to
generate reply addresses. The path is the ENVELOPE of the message. Header
info is NOT used for delivery. 

The only reason to mung a From: path is because you think it will somehow
be easier for the recipient to reply to. If the from address is a bang
path, it is probably inconvenient to use for any mailer that actually knows
what a From: line is, since this would be an rfc822 mailer and would thus
know about domains. 

A "dumb" mailer will use the "From " line to generate a return path. A
smart mailer wants to see the domain name of the sender. Because sendmail
screws this up, every smart mailer around has been modified to compensate
for this by accepting bang paths, or rewriting them (usually by a set of
rules that have to be hand-crafted for each site, and are never totally
reliable). They are compensating for what is clearly a bug in sendmail,
since by rfc822, the From: headers shouldn't be modified, ever. 

> Thus if the From: line is of the form
> 	user@host.domain
> (where 'domain' is NOT "uucp"), then transform the address to
> 	gateway!host.domain!user
> And that's what uunet (and ucsd, and decwrl, and lots of other gateways)
> are doing.  It's clearly the right thing to do by default.
                  
I disagree. My neighboring site is a sendmail site, and it took me a long
time to get a set of rewrite rules that cleans up this trash about 95% of
the time. I still get some of them wrong. 

Some examples. I have to deal with things like
"user%site.something@host.domain" getting munged to
"gateway!host.domain!site.something!user" or worse, 
"gateway!host.domain!user%site.something". The first of these might work as
a bang path, but probably won't. It requires host.domain to support
bang-paths, which is unlikely for any site that needs "%" to set up an
address. If I try to get it back to a domain name, I get
user@site.something which is wrong. The second is messier, but my rewrite
rules can handle it (had to put in a whole bunch of things to deal with "%"
to handle these). 

To make matters worse, there are usually a few more hosts at the front of
the bang-path, depending on who first munged the header. Many uucp hops of
a message are typically sendmail sites, and they all do this kind of
munging, leading to real confusion. Some will even see the % sign in that
last example above, treat it like an "@" (since there isn't one already),
and generate "gate2!site.something!gateway!host.domain!user". I then try to
reverse engineer this and get user@host.domain, which is wrong. This isn't
useable as a bang-path either, obviously.

Do you really think any of this is more useable than the original from
line?! 

The basic problem here is there is not a clear model of what an
internet-uucp gateway should be. Much of what Brian said would be true if
you think of it as a gateway between an rfc822-compliant network and a
non-rfc822-compliant network with certain characteristics, namely, that it
supports rfc822 plus the "From " line, and that From: headers should
contain bang-paths. I don't think this is the correct model.

Rfc822 headers should be treated as rfc822 headers (for rfc822 mailers), or
as message text (for dumb mailers that don't use headers). In either case,
they shouldn't be modified, except for the possible addition of a Received:
line. The "From " line should be there, and should be properly updated. A
dumb mailer will use it. A smart mailer will ignore it if there are rfc822
headers (since it knows about domains, it can use an unmodified From:
line), or use it if that's all there is. 

> Gateway sites which want to go to the trouble of also supporting uucp
> "smart" hosts could have a way to leave the From lines alone, on the
> declaration that the destination can handle it.  But they shouldn't
> DEFAULT that way.

You imply by this that uucp smart hosts are the minority. Is this actually
true? My mail software (DECUS uucp for VMS) acts as I have described. smail
2.whatever works as I have described (I used to run a unix machine). In
fact, the docs for smail describe installing it "underneath" smail as the
only known fix to the sendmail BUG that causes this behavior (it fixes it
by keeping the message AWAY from sendmail). Note that NOT munging the From:
line shouldn't ever cause a problem, since non-rfc822 mailers don't need
it.
-- 
Terry Poot <tp@mccall.com>                The McCall Pattern Company
(uucp: ...!rutgers!ksuvax1!mccall!tp)     615 McCall Road
(800)255-2762, in KS (913)776-4041        Manhattan, KS 66502, USA

karl_kleinpaste@cis.ohio-state.edu (05/31/90)

tp@mccall.com writes:
   The sendmail sites (and anyone foolish enough to emulate them) are wrong.
   Headers shouldn't be modified! This is according to RFC822. If you are
   using headers, they are probably rfc822 headers, and thus you should follow
   the rules that go with them.

My sendmail.cf is perfectly happy to leave RFC822-compliant headers
alone.  It's all those damnable broken headers that I feel compelled
to slice, dice, and julienne-fry until they come out looking at least
vaguely reminiscent of something that a TOPS-20 MAISER won't barf all
over.  (And before you ask, Yes, I have to deal with MAISER all the
time, handing it addresses that don't make it complain.)

Take a couple of cases in point.

[1] Start with my favorite example of broken headers, BITNET-
originated mail from a site which doesn't seem to care much for
staying up to date.  (That is, I recognize that, as a purely
ontological statement, "there exist" up-to-date, standards-compliant
mailers for most flavors of systems on the BITNET; but, based on my
mail logs, I assert that one heck of a lot of BITNET sites have admins
who do not care to update their software to use such mailers.  They
prefer old, broken ones.  I _hate_ the BITNET.)

On the BITNET, lacking the DNS, addresses are apparently of the form
UserName@OneWordHostName (hereafter called OWHN).  This is fine, so
long as the mail stays within the BITNET -- but it doesn't, of course,
and it is positively routine for such mail to be aimed at someone
around here without any update to such a header (or Mail From:<>
envelope) to give it a domain.

This is a violation of RFC822 section 6.2.2 page 29 last paragraph[*].
It is a _requirement_ that all addresses show fully-qualified domain
names (FQDNs) any time that a message crosses a domain boundary.  The
reasons for the requirement are trivially easy to understand: Given an
incomplete (or nonexistent) domain specification, it is impossible for
my mailer to figure out how to route a reply back to the originator.

Therefore, in my best Domain Absolutist fashion, my sendmail.cf
notices anything of the form Anything@OWHN and rewrites it as
Anything@cis.ohio-state.edu, and to hell with the BITNET.

(The tendency to rewrite OWHNs as cis.ohio-state.edu actually has
other, perfectly legitimate, reasonable and understandable purposes,
having to do with our use of a shared /usr/spool/mail and the desire
for this entire department to generate mail headers which appear to be
all one big, happy family.)

[2] I see that rewrites of domainisms into !-paths for the benefit of
UUCP sites are protested.  I disagree with the protest.  I count on
the following transform always working, for which I have yet to find a
failure case:

	anything@proper.domain.name <=> proper.domain.name!anything

In my sendmail.cf, I prefer to canonicalize in S3 to stuff@domain.name
syntax, believing that this syntax is the superior method in all
cases.  And when I gateway UUCP-originated mail showing
"OWHN!username" via SMTP to some site, I mostly leave it alone, except
to _add_ "@cis.ohio-state.edu" to it, again because RFC822 requires an
FQDN in all such cases.

However, if I receive mail which shows OWHN!user@F.Q.D.N and I am
about to route it out of here as UUCP mail, that syntax cannot be
assured of working, because "dumb" UUCP sites cannot be counted on to
obey RFC822's requirement of @-over-! precedence.  So I rewrite
"OWHN!user@F.Q.D.N" as "F.Q.D.N!OWHN!user" in the From: line, and the
From_ line (for SMTP Mail From:<>) is modified (by a combination of
sendmail and smail-2.5) to be "osu-cis!F.Q.D.N!OWHN!user," thus giving
a syntax which even a dumb UUCP site can cope with.

And lastly, when I receive mail of the form "F.Q.D.N!OWHN!user" and I
am about to deliver locally or to forward via SMTP, I rewrite it to
"OWHN!user@F.Q.D.N," which works for RFC-compliant sites.  (MAISER is
kept happy -- and that's important to me. :-)

   Since sendmail sites do muck with the header, I have to have extensive
   rewrite rules to clean up addresses so they are useable. I get unuseable
   From: lines because my neighboring site is a sendmail site.

The latter does not necessarily follow from the former: The fact that
sendmail mucks with the headers does not, in and of itself, cause you
to get unusable From: lines.  The problem is that one can almost
always expect that vendor-supplied sendmail.cf files are broken and
unusable, and that most sysadmins are not willing to invest the time
required to understand what it is that sendmail is doing, much less
how to update a sendmail.cf to do something other than what it already
does.  Understanding sendmail.cf is admittedly a daunting proposition.

   If by internet-compliant you also
   mean rfc822 compliant, this is one of the rules of the game. If everybody
   would leave the From: line alone, it would be reliable.

Feh.  An entirely unsupportable assertion.  If everybody left the
From: line alone, it would be set to whatever the first broken mailer
set it to, which is frequently incorrect right from the outset.  Mail
gets here pretty routinely from UCBerkeley with OWHNs in the UNIX
From_ line, rewritten to "cis.ohio-state.edu."  (No, not from ucbvax
or ucbarpa, but yes, from other UCB sites.)

   The basic problem here is there is not a clear model of what an
   internet-uucp gateway should be.

Hm.  Well, maybe and maybe not.  I tend to think that I've got a
pretty clear model of what it is that I'm doing.  This is how I look
at the task, simplified to table format:

Incoming Address	Departing via	Rewrite
-------- -------	--------- ---	-------
owhn!user		local or UUCP	don't touch it
			SMTP		owhn!user@cis.ohio-state.edu

user@f.q.d.n		local or SMTP	don't touch it
			UUCP		"From:" -- don't touch it
					"From_" -- osu-cis!f.q.d.n!user

owhn!user@f.q.d.n	local or SMTP	don't touch it
			UUCP		"From:" -- f.q.d.n!owhn!user
					"From_" -- osu-cis!f.q.d.n!owhn!user

f.q.d.n!owhn!user	local or SMTP	owhn!user@f.q.d.n
			UUCP		as for previous case

It seems to be a pretty clear model to me.  I do minimal-but-necessary
changes, I don't support %ification wraps except for a couple of very
small special cases, and I don't deal with `:' or `.' at all as
address punctuation.
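
Expressed as a Python sketch, the table looks roughly like this. The
function and constant names are invented for illustration, "contains a
dot" stands in for a real f.q.d.n test, and None is returned for From_
wherever the table says nothing about it.

```python
MY_DOMAIN = "cis.ohio-state.edu"
MY_UUCP = "osu-cis"


def classify(addr):
    """Split addr into (fqdn, rest); fqdn is None for a bare owhn!user."""
    local, at, domain = addr.rpartition("@")
    if at and "." in domain:
        return domain, local          # user@f.q.d.n or owhn!user@f.q.d.n
    first, bang, rest = addr.partition("!")
    if bang and "." in first:
        return first, rest            # f.q.d.n!owhn!user
    return None, addr                 # owhn!user


def rewrite_per_table(addr, departing_via):
    """Return (From: address, From_ path or None) for 'local', 'smtp' or 'uucp'."""
    fqdn, rest = classify(addr)
    if fqdn is None:
        # owhn!user: only SMTP departures get a domain appended.
        header = f"{addr}@{MY_DOMAIN}" if departing_via == "smtp" else addr
        return header, None
    if departing_via in ("local", "smtp"):
        return f"{rest}@{fqdn}", None
    # Departing via UUCP: keep a plain user@f.q.d.n, invert anything with a '!'.
    header = f"{fqdn}!{rest}" if "!" in rest else f"{rest}@{fqdn}"
    return header, f"{MY_UUCP}!{fqdn}!{rest}"


if __name__ == "__main__":
    for addr in ("owhn!user", "user@f.q.d.n",
                 "owhn!user@f.q.d.n", "f.q.d.n!owhn!user"):
        for via in ("local", "smtp", "uucp"):
            print(addr, via, rewrite_per_table(addr, via))
```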

--karl

[*] RFC 822 6.2.2 p.29 last paragraph:
    "When a message crosses a domain boundary, all  addresses must
    be  specified  in  the  full format, ending with the top-level
    name-domain in the right-most field.  It is the responsibility
    of  mail  forwarding services to ensure that addresses conform
    with this requirement."

lear@turbo.bio.net (Eliot) (06/01/90)

One can quote RFC 822 up and down, but it isn't 100% appropriate for
messages transiting off of the Internet.  Rick is supporting the
lowest common denominator of his customers.  It was said that if you
tell UUNET that you have a smart mailer, they'll do the right thing.
What more can you ask for?
-- 
Eliot Lear
[lear@turbo.bio.net]

fitz@wang.com (Tom Fitzgerald) (06/01/90)

> In article <13952@ucsd.Edu>, brian@ucsd.Edu (Brian Kantor) writes:
>> Some uucp hosts, particularly those running sendmail, WILL update the
>> RFC822 "From: " line.  Others don't, which means that the From: line
>> is of questionable integrity if the mail has ever passed through a uucp
>> link.

tp@mccall.com writes:
> The sendmail sites (and anyone foolish enough to emulate them) are wrong.
> Headers shouldn't be modified! This is according to RFC822.

Headers shouldn't be modified IF they are already compliant with RFC822.
The headers that aren't (especially From: lines with no @ in them) aren't
constrained at all.  There's a good justification for putting "myname!" at
the beginning of such lines, or tacking "@my.domain" onto the end.  There's
nothing in RFC822 that specifies what to do to messages that already
violate it.

> I get unuseable
> From: lines because my neighboring site is a sendmail site.

That's a little overgeneralized; there are sendmail sites that don't rewrite.

> uucp does require paths. uucp also doesn't have the vaguest idea what a
> header line is. Even the "From " line is only so mailers can attempt to
> generate reply addresses. The path is the ENVELOPE of the message. Header
> info is NOT used for delivery. 

This is only true for dumb UUCP sites.  There are also smart UUCP sites
around, even smart UUCP sites that don't have a registered domain.  We
have to take these people into account too.  They can't use
"From: user@domain" because they don't have a domain (not everyone can
handle the .UUCP domain), and if they don't put in a From: line at all,
a recipient on a smart site won't be able to reply.

> Many uucp hops of
> a message are typically sendmail sites, and they all do this kind of
> munging, leading to real confusion.

Absolutely.  I've seen some really incredible specimens pass through here
(I wish I'd saved them for this discussion).

> smail
> 2.whatever works as I have described (I used to run a unix machine).

Not quite, smail 2.5 will put "myname!" at the beginning of From: lines
that don't already have a @ in them.  I think this is the right thing
to do for something that's going out over UUCP anyway, since the final
destination may be a smart UUCP site that wants to see a usable From:
line.

I think your arguments are generally right except that non-RFC822
headers sometimes have to be munged anyway.  If nothing else, they
have to be made RFC822-compliant if the message is being gated onto
the Internet, and they have to be kept usable for receiving UUCP
sites.

---
Tom Fitzgerald      Wang Labs           fitz@wang.com
1-508-967-5278      Lowell MA, USA      ...!uunet!wang!fitz

fitz@wang.com (Tom Fitzgerald) (06/01/90)

karl_kleinpaste@cis.ohio-state.edu writes:
> [2] I see that rewrites of domainisms into !-paths for the benefit of
> UUCP sites are protested.  I disagree with the protest.  I count on
> the following transform always working, for which I have yet to find a
> failure case:

> 	anything@proper.domain.name <=> proper.domain.name!anything

Well, I get shitloads of failure cases here.  If everyone on the Internet
had your sendmail.cf, you're right, nothing would go wrong.  But when you
send out a message with:

	From: proper.domain.name!anything

as it goes through sites with different opinions about sendmail.cf, it will
slowly turn into:

	From: proper.domain.name!anything@relay.site
	From: anything%proper.domain.name@relay.site
	From: relay.site!anything%proper.domain.name
	From: relay.site!anything%proper.domain.name@next.relay

and will become useless.  Even in the cases where it's still usable,
it has become seriously nonoptimal.  (I don't know any other sites besides
yours that are domain-absolutist rerouters, so even good addresses
will pass through several domained sites before being delivered).
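
The drift listed above can be reproduced with a small sketch.  It assumes
three relay policies: tacking @host onto an address that has no @, turning
host!user@relay into user%host@relay, and turning local@host into
host!local.  The function names are made up for illustration, and no real
sendmail.cf is being modeled exactly.

```python
def append_at(addr, host):
    """Assumed relay policy: tack @host onto any address that lacks an @."""
    return addr if "@" in addr else f"{addr}@{host}"


def bang_to_percent(addr):
    """Assumed relay policy: turn host!user@relay into user%host@relay."""
    local, at, domain = addr.rpartition("@")
    if not at or "!" not in local:
        return addr
    host, _, user = local.partition("!")
    return f"{user}%{host}@{domain}"


def at_to_bang(addr):
    """Assumed relay policy: turn local@host into host!local."""
    local, at, domain = addr.rpartition("@")
    return f"{domain}!{local}" if at else addr


def degrade(addr):
    """Pass addr through four relays, each applying its own policy."""
    steps = [addr]
    steps.append(append_at(steps[-1], "relay.site"))
    steps.append(bang_to_percent(steps[-1]))
    steps.append(at_to_bang(steps[-1]))
    steps.append(append_at(steps[-1], "next.relay"))
    return steps


if __name__ == "__main__":
    for step in degrade("proper.domain.name!anything"):
        print("From:", step)
```

Each policy is defensible on its own; the address only becomes useless
once they are chained.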

> My sendmail.cf is perfectly happy to leave RFC822-compliant headers
> alone.

> However, if I receive mail which shows OWHN!user@F.Q.D.N and I am
> about to route it out of here as UUCP mail, that syntax cannot be
> assured of working, because "dumb" UUCP sites cannot be counted on to
> obey RFC822's requirement of @-over-! precedence.  So I rewrite
> "OWHN!user@F.Q.D.N" as "F.Q.D.N!OWHN!user" in the From: line...

Well, you've contradicted yourself since you've just changed an
RFC822-compliant header into a noncompliant one.

And really, this rewriting does no good, and real harm.  Dumb UUCP sites
don't reply to the From: address, they reply to the From_ address (which
you've set up exactly right).  [I say this after a quick look at the
original SysV.2 mailer.  Does anyone know any dumb mailers that use
From:?]

I don't see why you're trying to protect the dumb sites from seeing
owhn!user@f.q.d.n when you're now giving them an address beginning with
a domain, which they can't handle anyway.

> If everybody left the
> From: line alone, it would be set to whatever the first broken mailer
> set it to, which is frequently incorrect right from the outset.

And if everyone screws with it, it will eventually get broken even if
it was correct from the outset.  I don't see anything wrong with changing
addresses that are already broken (like your user@owhn example), but I
hate seeing people rewrite perfectly good ones.

> Incoming Address	Departing via	Rewrite
> -------- -------	--------- ---	-------
> owhn!user		local or UUCP	don't touch it

Interesting.   You don't even change it to osu-cis!owhn!user?

> owhn!user@f.q.d.n	local or SMTP	don't touch it
> 			UUCP		"From:" -- f.q.d.n!owhn!user

In some ways this is the worst of several worlds.  Dumb UUCP sites (even if
they do use From:) can't use it because it begins with something that isn't
a neighboring UUCP site.  And smart sites, seeing that it isn't RFC822,
will try to "fix" it.

---
Tom Fitzgerald      Wang Labs           fitz@wang.com
1-508-967-5278      Lowell MA, USA      ...!uunet!wang!fitz

amanda@mermaid.intercon.com (Amanda Walker) (06/01/90)

In article <May.31.11.25.58.1990.23299@turbo.bio.net>, lear@turbo.bio.net
(Eliot) writes:
> It was said that if you
> tell UUNET that you have a smart mailer, they'll do the right thing.
> What more can you ask for?

I'll also come in on UUNET's side on this one (even though it took me a
while to realize that I could tell them I could handle domain-style
addresses).

A lot of uunet's customers are probably running little UNIX boxes with
vanilla UUCP and a stupid mailer.  The way uunet sets up a customer by
default, all they have to do is set their UUCP node name and mail flows.
This means minimal frustration all around, and it's a lot less annoying
than getting some vendor's random version of sendmail (with an even more
random sendmail.cf) working properly.

If you say you want real addresses, though, they seem happy to oblige.
The host 'intercon.com' is an example... It's even using a mild variation
on Karl's sendmail.cf, which is the most sensible one I've ever run across.

The fact that I used to work two cubicles away from him and overheard his
cursing at <expletive deleted>ed mail relays has nothing to do with it :-)...

--
Amanda Walker, InterCon Systems Corporation
--
"Go not to the elves for counsel, for they will say both no and yes."
	--J.R.R. Tolkien, The Lord of the Rings

karl_kleinpaste@cis.ohio-state.edu (06/01/90)

fitz@wang.com writes:
   Well, I get sh*tloads of failure cases here.

What I meant by a "failure case" was "a case where the transform
_in_and_of_itself_ doesn't work."  It _does_ work:  I _can_ exchange
back and forth between the two forms without error.  And given that
the transform is !-path-centric, and that the most common intelligent
UUCP router is smail 2.5, and that smail 2.5 knows how to deal with
proper.domain.name!anything, then it can be considered to work in
general.

   Well, you've contradicted yourself since you've just changed an
   RFC822-compliant header into a noncompliant one.

No.  By the time I'm routing out via UUCP, RFC822 no longer applies.
I've argued this more times than I can count, in such areas as my
choice to be a Domain Absolutist Rabid Rerouter.  I've transformed the
header into something that a(n assumed) dumb[*] UUCP neighbor can cope
with, but in a form that is readily transformable back into an RFC
header if the receiving site wants it that way.  From what my UUCP
neighbors tell me, they do, and they know how.

   > Incoming Address	Departing via	Rewrite
   > -------- -------	--------- ---	-------
   > owhn!user		local or UUCP	don't touch it

   Interesting.   You don't even change it to osu-cis!owhn!user?

No.  That would be redundant.  This _is_ osu-cis.

   > owhn!user@f.q.d.n	local or SMTP	don't touch it
   > 			UUCP		"From:" -- f.q.d.n!owhn!user

   In some ways this is the worst of several worlds.  Dumb UUCP sites (even if
   they do use From:) can't use it because it begins with something that isn't
   a neighboring UUCP site.  And smart sites, seeing that it isn't RFC822,
   will try to "fix" it.

Truly dumb sites will use the From_ address, which has "osu-cis!"
prepended.  Smart sites (even only to the level of intelligence of
smail 2.5, and certainly with sendmail) can deal with the From:
adequately.
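
As a rough sketch of the two table rows quoted above, and of how a smart
site might turn the result back into @-form: the function names are
hypothetical, only the rows quoted in this thread are covered, and the
"first component contains a dot" test is an assumed heuristic, not the
actual sendmail.cf.

```python
LOCAL_HOST = "osu-cis"  # host name as given in the thread


def rewrite_from_header(addr, transport):
    """From: header per the quoted rows: only @-forms leaving via UUCP change."""
    if transport == "uucp" and "@" in addr:
        local, _, domain = addr.rpartition("@")
        return f"{domain}!{local}"
    return addr  # local or SMTP delivery, or already a pure !-path


def envelope_from(addr, transport):
    """From_ line for UUCP: the header form with LOCAL_HOST! prepended."""
    if transport != "uucp":
        return addr
    return f"{LOCAL_HOST}!{rewrite_from_header(addr, transport)}"


def restore_header(addr):
    """Assumed smart-site inverse: domain!rest -> rest@domain.

    Treats the first component as a domain only if it contains a dot,
    and leaves anything that already has an @ alone.
    """
    first, bang, rest = addr.partition("!")
    if bang and "." in first and "@" not in addr:
        return f"{rest}@{first}"
    return addr


if __name__ == "__main__":
    print(rewrite_from_header("owhn!user@f.q.d.n", "uucp"))
    print(envelope_from("owhn!user@f.q.d.n", "uucp"))
    print(restore_header("f.q.d.n!owhn!user"))
```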

It works.  I constitute my own existence proof of that fact.  My main
mail-routing host here shoveled 12Gbytes of mail in the month of May.
I haven't heard even one complaint about routing errors or abuse.

--karl

[*] "Dumb UUCP site" == /bin/{r,}mail and /usr/bin/uux only.  Anything
    beyond that (e.g., smail) constitutes a smart site.

peter@ficc.ferranti.com (Peter da Silva) (06/01/90)

In article <anjl48.dwd@wang.com> fitz@wang.com (Tom Fitzgerald) writes:
> > So I rewrite
> > "OWHN!user@F.Q.D.N" as "F.Q.D.N!OWHN!user" in the From: line...

What you should do is rewrite it as user%OWHN@F.Q.D.N, if you want to
rewrite it at all. Then it will only break on micnet sites that still
use "machine%user".
-- 
`-_-' Peter da Silva. +1 713 274 5180.  <peter@ficc.ferranti.com>
 'U`  Have you hugged your wolf today?  <peter@sugar.hackercorp.com>
@FIN  Dirty words: Zhghnyyl erphefvir vayvar shapgvbaf.

karl_kleinpaste@cis.ohio-state.edu (06/02/90)

peter@ficc.ferranti.com writes:
   > "OWHN!user@F.Q.D.N" as "F.Q.D.N!OWHN!user" in the From: line...

   What you should do is rewrite it as user%OWHN@F.Q.D.N, if you want to
   rewrite it at all. Then it will only break on micnet sites that still
   use "machine%user".

Offhand, I don't know what micnet is.  "!%@:" doesn't mention it.

As for %ification wraps, that particular piece of punctuation has been
subjected to more abuse than any other in the history of email, as far
as I can see.  I avoid it like the plague, acknowledge it only when
forced.

--karl

les@chinet.chi.il.us (Leslie Mikesell) (06/02/90)

In article <KARL.90May30213527@giza.cis.ohio-state.edu> karl_kleinpaste@cis.ohio-state.edu writes:

>However, if I receive mail which shows OWHN!user@F.Q.D.N and I am
>about to route it out of here as UUCP mail, that syntax cannot be
>assured of working, because "dumb" UUCP sites cannot be counted on to
>obey RFC822's requirement of @-over-! precedence.  So I rewrite
>"OWHN!user@F.Q.D.N" as "F.Q.D.N!OWHN!user" in the From: line, and the
>From_ line (for SMTP Mail From:<>) is modified (by a combination of
>sendmail and smail-2.5) to be "osu-cis!F.Q.D.N!OWHN!user," thus giving
>a syntax which even a dumb UUCP site can cope with.

If the OWHN!user@F.Q.D.N happens to have come from a uucp site, there's
a pretty good chance that it is already screwed up, especially if OWHN
happens to be the same name as one of the uucp sites in the path.
Anyway, dumb uucp mailers normally reply by reversing the From_ line
and never look at From:, so that is rarely a problem.

One thing that *is* a problem is trying to do a reply to all the
recipients of the message from a dumb uucp site.  Mailx uses
'r', Elm uses 'g', Mush uses replyall, AT&T PMX-mailers use <F5>(Ans)
<F2>(All) to try to interpret the To: and Cc: lines, and they
all produce fairly bizarre results.  The most likely thing to work
(short of tossing the uucp path concept entirely) is to reverse
the From_ path and stick it in front of the other addresses, then
collapse any obvious duplications (A!A, A!B!A).  Unfortunately, this
doesn't work well if user@host or domain!user notation is used.
For example if you mail from machine A to B!C!user1 with a
Cc: to user2@C.uucp (assume routing software on A), user1's group
reply to user2 on the same machine might be addressed as B!A!user2@C.uucp.
If all the machines do ! paths uucp style, this can be delivered, but
it may be sent across the country to get back to the same machine.
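
A minimal sketch of that group-reply heuristic, assuming the From_ line
already reads as a return route (B!A!sender) and that duplicates are only
collapsed in the !-separated part of an address.  The names group_reply
and collapse are invented for this example, not taken from any mailer.

```python
def collapse(path):
    """Drop obvious loops in a !-path: A!A!u -> A!u, A!B!A!u -> A!u."""
    *hosts, user = path.split("!")
    kept = []
    for host in hosts:
        if host in kept:
            # Seen this host already: cut the path back to its first visit.
            del kept[kept.index(host) + 1:]
        else:
            kept.append(host)
    return "!".join(kept + [user])


def group_reply(from_path, others):
    """Prefix the return route onto every other recipient, then collapse."""
    route, _, _sender = from_path.rpartition("!")
    targets = [from_path]
    for addr in others:
        targets.append(f"{route}!{addr}" if route else addr)
    return [collapse(t) for t in targets]


if __name__ == "__main__":
    # user1 on C replies to mail that came from A via B, Cc: user2@C.uucp
    print(group_reply("B!A!sender", ["user2@C.uucp"]))
```

The C in user2@C.uucp is hidden inside the last component, so collapse
never sees it and the reply keeps its detour through B and A.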

Les Mikesell
  les@chinet.chi.il.us

peter@ficc.ferranti.com (Peter da Silva) (06/03/90)

In article <KARL.90Jun1152243@giza.cis.ohio-state.edu> karl_kleinpaste@cis.ohio-state.edu writes:
> peter@ficc.ferranti.com writes:
>    > "OWHN!user@F.Q.D.N" as "F.Q.D.N!OWHN!user" in the From: line...

>    What you should do is rewrite it as user%OWHN@F.Q.D.N, if you want to
>    rewrite it at all. Then it will only break on micnet sites that still
>    use "machine%user".

> Offhand, I don't know what micnet is.  "!%@:" doesn't mention it.

It's an old (Microsoft?) net-mail system used at some sites that have old
pre-SCO Xenix systems. Like Ferranti.

Oh well. Do what you have to do.
-- 
`-_-' Peter da Silva. +1 713 274 5180.  <peter@ficc.ferranti.com>
 'U`  Have you hugged your wolf today?  <peter@sugar.hackercorp.com>
@FIN  Dirty words: Zhghnyyl erphefvir vayvar shapgvbaf.

fitz@wang.com (Tom Fitzgerald) (06/05/90)

> fitz@wang.com writes:
>    Well, I get sh*tloads of failure cases here.

karl_kleinpaste@cis.ohio-state.edu writes:
> What I meant by a "failure case" was "a case where the transform
> _in_and_of_itself_ doesn't work."  It _does_ work:  I _can_ exchange
> back and forth between the two forms without error.

This is true in the sense that mail has to get at least 2 hops away from
you before it's broken beyond recovery; but your rewriting has made that
breakage possible, and it wouldn't happen if you didn't rewrite it.  Things
may look great to you because you don't see the errors; but I do.  I get
mail with incredible collections of !s, @s and %s and when I reply, the
replies bounce.  Here's one I just saw:

	From: somebody%utdssa.dnet%utadnx@utspan.span.nasa.gov

If it were run through your rewriting rules (as I understand them) you'd
turn it into:

	From: utspan.span.nasa.gov!somebody%utdssa.dnet%utadnx

which would no longer be usable because ! binds tighter than %.  If
you changed the %s to !s, it would still be unusable because when a reply
came to your machine you'd try to DARR it to "utdssa.dnet" and wouldn't
be able to make contact with it (or do you only DARR to domains beneath
known root subdomains?).
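
The precedence problem can be made concrete with a sketch.  It assumes a
mailer picks its next hop by looking for routing operators in a fixed
order, which is a simplification; next_hop and its precedence argument
are hypothetical and stand in for whatever a given site's rules do.

```python
def next_hop(addr, precedence):
    """Split addr into (next host, remaining address).

    precedence lists the routing operators in the order this assumed
    mailer looks for them; the first one found decides the next hop.
    """
    for op in precedence:
        if op not in addr:
            continue
        if op == "!":
            host, _, rest = addr.partition("!")   # host!rest: leftmost bang
        else:
            rest, _, host = addr.rpartition(op)   # rest@host, rest%host: rightmost
        return host, rest
    return None, addr                             # nothing left to route


if __name__ == "__main__":
    original = "somebody%utdssa.dnet%utadnx@utspan.span.nasa.gov"
    rewritten = "utspan.span.nasa.gov!somebody%utdssa.dnet%utadnx"
    print(next_hop(original, "@%!"))    # @ first: goes to utspan.span.nasa.gov
    print(next_hop(rewritten, "@%!"))   # % before !: tries utadnx directly
    print(next_hop(rewritten, "!@%"))   # ! first: goes to utspan.span.nasa.gov
```

A site that resolves % before ! tries to reach utadnx directly, which is
the unusable case; a site that resolves ! first still gets to
utspan.span.nasa.gov.  Whether the rewritten form works therefore depends
on conventions the sender cannot see.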

You're also right in the sense that, if everyone rewrote and routed mail
the same way you do, there would be no problems.  But everyone has their
own idea of how mail should be rewritten - !s get turned into %'s, @my.node
gets tacked onto the end, etc.  RELAY.CS.NET, the previous incarnation of
decwrl and some BITNET gateways have philosophies that don't mix well
with @->! rewriting.  In the aggregate, addresses get trashed.

> And given that
> the transform is !-path-centric, and that the most common intelligent
> UUCP router is smail 2.5, and that smail 2.5 knows how to deal with
> proper.domain.name!anything, then it can be considered to work in
> general.

smail can also understand "anything@proper.domain.name", so there's nothing
about !-addresses that makes them better.  And speaking as a smail user, I'd
much rather see an @-address than a !-address because it's more robust.
The thing about !-addresses is that people keep rewriting them over and over.

> No.  By the time I'm routing out via UUCP, RFC822 no longer applies.

Your original comment in <KARL.90May30213527@giza.cis.ohio-state.edu> was:

>>> My sendmail.cf is perfectly happy to leave RFC822-compliant headers
>>> alone.

RFC822 does only claim relevance to SMTP; but it defines a header format
that is pretty much independent of transport.  Even in the UUCP world,
the closer something is to RFC822 the better things work.

> I've transformed the
> header into something that a(n assumed) dumb[*] UUCP neighbor can cope
> with

But dumb UUCP neighbors don't pay attention to From: lines, so why change
them in a way that hurts the smart sites several hops away?

> It works.  I constitute my own existence proof of that fact.

Meaning that you don't see any problems.  I do.

> My main
> mail-routing host here shoveled 12Gbytes of mail in the month of May.
> I haven't heard even one complaint about routing errors or abuse.

I can't complain about OSU, because little mail from here goes through
you.  But as a philosophy, I hereby complain: sites that rewrite valid
addresses cause more mail to be lost than sites that don't rewrite, and
I have seen this happen often.  So your set of complaints is now non-null.
Anyone else want to add to it?

Honestly, us UUCP people are fairly shy about complaining to our Internet
neighbors because we depend on them(you) for so much.  Given the choice
between complaining and just trying to quietly compensate, I'll keep quiet.
But this is something that genuinely causes mail to be lost, and there's
nothing I can do to fix it.  I've just recently started complaining to the
sites around here that do rewriting.  You may have more silently suffering
neighbors than you think; or more likely, sites that put "dead {osu-cis}"
in the local additions to their maps without ever bothering to tell you
about it.

---
Tom Fitzgerald      Wang Labs           fitz@wang.com
1-508-967-5278      Lowell MA, USA      ...!uunet!wang!fitz

karl_kleinpaste@cis.ohio-state.edu (06/06/90)

greyham@hades.OZ writes:
   If everyone left From: alone completely, the sites that would be
   affected would be the sites that generate a bad From: line, and it
   would give them some encouragement to fix the thing.

Just for the sake of an extra giggle (if you knew what my day had been
like so far today, you'd appreciate that need in me), two points:

<1> Your host doesn't exist:

| [137] [3:35pm] giza:/n/giza/0/karl> host -a hades.OZ.
| Host not found.

Oh, you meant hades.oz.au?  OH!  That's _different_...too bad your
From: line is bogus...but I suppose no one should ever rewrite it to a
canonical form.

<2> _Trust_me_, postmasters get Really Well Encouraged to fix their
mailers when one of my users gets mail from such a place, complains to
me about why they can't reply, and I find out who the
originating/offending site is, especially when I cite chapter and
verse of 822 at the offending postmaster.

--karl

david@twg.com (David S. Herron) (06/07/90)

In article <2752.26638923@mccall.com> tp@mccall.com writes:
>In article <13952@ucsd.Edu>, brian@ucsd.Edu (Brian Kantor) writes:
>> I don't know why I'm putting my foot into this one, but....
..
>> Some uucp hosts, particularly those running sendmail, WILL update the
>> RFC822 "From: " line. 
..
>The sendmail sites (and anyone foolish enough to emulate them) are wrong.
>Headers shouldn't be modified! This is according to RFC822.  ...



*Unless* you're gatewaying 'tween the two environments..



Remember -- out in UUCP land RFC-822 doesn't exist everywhere
-- 
<- David Herron, an MMDF weenie, <david@twg.com>
<- Formerly: David Herron -- NonResident E-Mail Hack <david@ms.uky.edu>
<-
<- Sign me up for one "I survived Jaka's Story" T-shirt!