[comp.mail.misc] UCB Mail tries to be too smart

msir@uhura.cc.rochester.edu (Mark Sirota) (02/18/89)

I just found a misfeature with UCB Mail.  If remote mail comes in (i.e.
mail from a remote site), with a From: line of the form

	remote-user@remote-host

and your sendmail.cf strips the hostname off of local addresses, so that the
To: reads just

	local-user

instead of
	local-user@local-host

and the local user does a replyall in UCB Mail, then the resulting To:
line will be

	local-user@remote-host remote-user@remote-host

I find this unacceptable.  It would seem that UCB mail is trying to be
intelligent; that is, it's assuming that remote-host may have been dumb
and not fully qualified its addresses, so it tries to recreate it for you.
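
The behaviour described above can be sketched roughly as follows.  This is
a minimal Python illustration, not the actual Mail source (which is C); the
function name reply_all_recipients is hypothetical, and the only rule it
models is the one reported here: a recipient with no @host part inherits
the host from the From: line.

```python
def reply_all_recipients(from_addr, to_addrs):
    """Model the reported reply-all rule (hypothetical helper, not Mail's code).

    Any recipient without an "@host" part is qualified with the host taken
    from the sender's address; already-qualified addresses are left alone.
    """
    # Host of the original sender; empty string if the sender is unqualified.
    _, _, sender_host = from_addr.partition("@")

    recipients = []
    for addr in list(to_addrs) + [from_addr]:
        if "@" not in addr and sender_host:
            # Bare local name: assume it lives on the sender's host.
            addr = addr + "@" + sender_host
        recipients.append(addr)
    return recipients


# The case from this post: the To: header was stripped to a bare local name.
print(" ".join(reply_all_recipients("remote-user@remote-host", ["local-user"])))
```

With the stripped To: header from the example, this prints the same To:
line shown above.  Had the header read local-user@local-host instead, the
sketch would have passed it through unchanged.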

Well, I don't want it to do that.  I deliberately strip the local machine
name off of all addresses before local delivery, so the headers will never
contain the local host name.  And now I find that UCB Mail is trying to
protect me from myself.

That pisses me off.

BTW, this is under SunOS 3.5.  Haven't tested it under 4.3.
-- 
Mark Sirota - University of Rochester, Rochester, NY
 Internet: msir@cc.rochester.edu
 Bitnet:   msir_ss@uordbv.bitnet
 UUCP:     ...!rochester!ur-cc!msir

msir@uhura.cc.rochester.edu (Mark Sirota) (02/18/89)

In article <885@ur-cc.UUCP> msir@cc.rochester.edu (Mark Sirota) writes:
> I just found a misfeature with UCB Mail.  If remote mail comes in (i.e.
> mail from a remote site), with a From: line of the form
> 	remote-user@remote-host
> and your sendmail.cf strips the hostname off of local addresses, so that
> the To: reads just
> 	local-user
> instead of
> 	local-user@local-host
> and the local user does a replyall in UCB Mail, then the resulting To:
> line will be
> 	local-user@remote-host remote-user@remote-host
> 
> I find this unacceptable.  It would seem that UCB mail is trying to be
> intelligent; that is, it's assuming that remote-host may have been dumb
> and not fully qualified its addresses, so it tries to recreate it for you.
> 
> Well, I don't want it to do that.  I deliberately strip the local machine
> name off of all addresses before local delivery, so the headers will never
> contain the local host name.  And now I find that UCB Mail is trying to
> protect me from myself.
> 
> That pisses me off.

I've just been looking through the code for Mail.  The offending procedure
appears to be netmap() in optim.c, which is called from mapf() in names.c,
which is called from _respond() in cmd3.c.

In my opinion, the world would be a better place if this sequence of
functions were never called.  Does anyone see a good reason not to do it?

The problem, as I see it, is that these functions are ensuring that the
user doesn't get trapped by a stupid sendmail.cf elsewhere in the world
which didn't fully qualify all its outgoing addresses.  Now, I'm perfectly
willing to believe that there are plenty of such sendmail.cf's out there.
Mine, personally, isn't one of them.  I don't want my software trying to
protect me from myself, and I see this as basically against one of UNIX's
philosophical foundations.  As I said before, it pisses me off.  Not to
mention that I can't do what I consider the Right Thing in my sendmail.cf
without having UCB Mail break on me.

So will anything break if I go and change it?
Anyone wanna help me lobby to have it changed for good?

Mark
-- 
Mark Sirota - University of Rochester, Rochester, NY
 Internet: msir@cc.rochester.edu
 Bitnet:   msir_ss@uordbv.bitnet
 UUCP:     ...!rochester!ur-cc!msir

avolio@decuac.dec.com (Frederick M. Avolio) (02/20/89)

I strongly suggest that you -- anyone who wants control of how the Mail User
Agent deals with things -- get ahold of a user agent with source code such
as ELM or GNU Emacs or something.  (I am not endorsing any of these... I use
MH... Oh!  Or MH. :-))

Then you can control what it does.  /usr/ucb/mail tries to be smart about
unknown hosts as well as tries to handle BerkNet formats in addresses, which
is also a mistake nowadays.  I think.

Fred

rsalz@bbn.com (Rich Salz) (02/21/89)

In <2609@decuac.DEC.COM> avolio@decuac.dec.com (Frederick M. Avolio) writes:
>I strongly suggest that you -- anyone who wants control of how the Mail User
>Agent deals with things -- get ahold of a user agent with source code
...
>Then you can control what it does.

The source to the Berkeley Mail program (UCBMail, the basis for mailx, etc)
is free of ATT code and is freely redistributable, as of the 4.3BSD-Tahoe
release.

You can get it from the FSF folks, or UUNET or -- if enough folks ask --
via comp.sources.unix.
	/rich $alz
-- 
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.

ka@june.cs.washington.edu (Kenneth Almquist) (02/21/89)

msir@uhura.cc.rochester.edu (Mark Sirota) writes:
> I just found a misfeature with UCB Mail.  If remote mail comes in (i.e.
> mail from a remote site), with a From: line of the form
>
> 	remote-user@remote-host
>
> and your sendmail.cf strips the hostname off of local addresses, so that the
> To: reads just
> 
> 	local-user
>
> instead of
> 	local-user@local-host
>
> and the local user does a replyall in UCB Mail, then the resulting To:
> line will be
>
> 	local-user@remote-host remote-user@remote-host

This is the correct thing to do with traditional UUCP mail.  There is no
correct way to handle this with RFC822 format mail because RFC822 prohibits
addresses without host names.

> I find this unacceptable.  It would seem that UCB mail is trying to be
> intelligent; that is, it's assuming that remote-host may have been dumb
> and not fully qualified its addresses, so it tries to recreate it for you.

The Berkeley mail code attempts to support something approximating a union
of RFC822 and traditional UN*X mail.  Mail that violates RFC822 is generally
assumed to be in traditional UUCP format.

> Well, I don't want it to do that.  I deliberately strip the local machine
> name off of all addresses before local delivery, so the headers will never
> contain the local host name.  And now I find that UCB Mail is trying to
> protect me from myself.
>
> That pisses me off.

I don't see that you have any particular cause to be angry.  You have an
incoming piece of mail in valid RFC822 format, which UCB Mail would handle
fine.  You then mangle the header of the incoming mail.  Not surprisingly,
this causes problems when you later try to run the UCB Mail program.  Why
blame the UCB Mail program for this?  The UCB Mail program is certainly
not "trying to protect you from yourself."  You are changing the addresses
on incoming mail, and UCB Mail is believing them without question.

RFC822 is an interchange standard, so you are free to convert incoming
mail to some other format for internal use (and in fact sendmail does
replace CR-LF sequences with newline characters).  Thus if you want to
use an internal mail format that omits the local host name from all
addresses, you can do so without violating RFC822.  There are, however,
several reasons for not doing this:

1.  If you change the internal mail representation, you have to change
    any program that accesses mail.  This means, for example, that you
    have to modify UCB Mail as well as sendmail.  Why is this a surprise?

2.  Mail doesn't always work perfectly.  Even if your software runs
    perfectly, mail software on other hosts will contain bugs.  Performing
    unnecessary transformations on mail headers makes these problems
    harder to track down and recover from.

3.  The RFC822 mail format is widely known.  Why demand that your users
    learn about local modifications to this format?  It's one thing for
    you to live with a different internal mail format, but because of
    points 1 and 2 in particular, some of your users are likely to be
    affected by the change as well, and since they didn't implement it
    they are more likely to be confused by it.  Why go out of your way
    to make things harder for your users?

In short, don't change UCB Mail.  Just get sendmail to stop mangling
addresses.
					Kenneth Almquist

msir@uhura.cc.rochester.edu (Mark Sirota) (02/22/89)

In article <885@ur-cc.UUCP> I write:
> I just found a misfeature with UCB Mail.  If remote mail comes in (i.e.
> mail from a remote site), with a From: line of the form
> 	remote-user@remote-host
> and your sendmail.cf strips the hostname off of local addresses, so that
> the To: reads just
> 	local-user
> instead of
> 	local-user@local-host
> and the local user does a replyall in UCB Mail, then the resulting To:
> line will be
> 	local-user@remote-host remote-user@remote-host
> 
> I find this unacceptable.  It would seem that UCB mail is trying to be
> intelligent; that is, it's assuming that remote-host may have been dumb
> and not fully qualified its addresses, so it tries to recreate it for you.
> 
> Well, I don't want it to do that.  I deliberately strip the local machine
> name off of all addresses before local delivery, so the headers will never
> contain the local host name.

Well, I've gotten quite a large number of responses to this.
Unfortunately, there seems to be no consensus.  About half the people
agree with me that UCB Mail is brain-damaged, and the other half think I'm
brain-damaged for stripping local hostnames in the first place.

Given that, I'm willing to give up the campaign to change UCB Mail
(although I still think it's brain-damaged).

Only one person was able to give me a persuasive reason not to strip local
hostnames (hope you don't mind, Craig):

In message <IY0LJDy00VsL86fHFj@andrew.cmu.edu> "Craig F. Everhart" <cfe+@andrew.cmu.edu> writes:
| What you're doing seems reasonable at first, but there are oddball cases
| that I wonder if you're handling correctly.  I can think of re-sending and
| forwarding (packaging one message inside the body of another) as example
| user actions that often aren't handled in a correct manner when local host
| names are stripped (or abbreviated), and I imagine that there are others.
| Just to be clear, the problems I cite occur when a piece of mail is sent
| locally, then that received piece of mail is re-sent or forwarded
| elsewhere; the addresses in the headers aren't usually updated correctly.
| What happens when some external recipient of such mail tries to use the
| addresses in the headers of that mail?

So, my inclination is to put the local hostname on *all* local addresses,
even those that are completely local.  For instance, if I mail joeuser on
my-machine, the From: header will be "From: msir@my-machine", and the To:
header will be "To: joeuser@my-machine".
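
As a rough sketch of that policy (Python, purely illustrative; in practice
this would be done by rewriting rules in sendmail.cf, which are not shown
here), every address without a host part gets the local host name
appended.  The helper names qualify_local and local_headers are
hypothetical; the user and host names are the ones from the example above.

```python
def qualify_local(addr, local_host):
    """Append @local_host to an address that has no host part."""
    if "@" in addr:
        return addr  # already qualified, leave it alone
    return addr + "@" + local_host


def local_headers(sender, recipients, local_host):
    """Build From:/To: header lines with every address fully qualified."""
    from_line = "From: " + qualify_local(sender, local_host)
    to_line = "To: " + ", ".join(
        qualify_local(r, local_host) for r in recipients
    )
    return [from_line, to_line]


# Purely local mail from msir to joeuser on my-machine.
for line in local_headers("msir", ["joeuser"], "my-machine"):
    print(line)
```

Because no bare local names survive in the headers, a later reply,
re-send, or forward has nothing left to guess about.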

| These complications, and others I'm sure, are why RFC822 says that
| exchanged mail needs to have fully-qualified host names.  While I imagine
| that you could fix up all these and other problems, most Internet sites
| use RFC822 as their internal mail representation also, so that the act of
| mail crossing between inside and outside isn't as complicated as relaying
| mail between dissimilar mail domains.

I ask the net: Is this true?  Do "most Internet sites use RFC822 as their
*internal* mail representation also"?

Thanks for any and all opinions.

Mark
-- 
Mark Sirota - University of Rochester, Rochester, NY
 Internet: msir@cc.rochester.edu
 Bitnet:   msir_ss@uordbv.bitnet
 UUCP:     ...!rochester!ur-cc!msir

kgdykes@watmath.waterloo.edu (Ken Dykes) (02/22/89)

In article <918@ur-cc.UUCP> msir@cc.rochester.edu (Mark Sirota) writes:
>I ask the net: Is this true?  Do "most Internet sites use RFC822 as their
>*internal* mail representation also"?

  I would HOPE so.  A mail package I am responsible for in the Bull-HN world
(Honeywell Bull) is basically rfc822 (and I am steering it to conformity)
and we don't currently have anything to do with Internet!
But there is the realization that sooner rather than later this product will
be gatewayed to Internet, USENET and just about any other net people get
their hands on.

 The way I see it, to survive the modern age you have only two choices
in mail representation/handling.
 1) rfc822
 2) x.400
(and there is rfc987 for rational mappings between 822 and x400)

Unnecessarily doing translations on headers every time you hit a gateway
or yet-another-host-format is just screaming for silly little irritating
problems like those mentioned (doing REPLY to address received via "forwards").
In general the fewer mappings inflicted on a message, the more likely it
will contain rational information at the end-point.
 -Ken
-- 
          - Ken Dykes,   Software Development Group, U.of.Waterloo
                         Waterloo, Ontario, Canada  N2L 3G1
kgdykes@watmath.uucp     kgdykes@water.bitnet     kgdykes@waterloo.csnet
kgdykes@watmath.uwaterloo.ca   kgdykes@watmath.edu  {backbone}!watmath!kgdykes

kurt@ibmarc.uucp (Kurt Shoens) (02/22/89)

In article <885@ur-cc.UUCP> msir@cc.rochester.edu (Mark Sirota) writes
about how UCB Mail rewrites addresses when doing reply.  In particular,
unqualified addresses (those that do not include "@host") get the host of
the From: field tacked on today.  Mark indicates that this behavior is
causing him problems and isn't necessary since all addresses in messages
from other sites are supposed to be qualified with the hostname.

Sounds like Mark is right.  At the time UCB Mail was written, there WAS no
sendmail to rewrite outgoing addresses, so it was felt necessary to do
the editing on reply that is now causing the troubles.  I doubt that
anything will break if Mark disables the address rewriting, as long as
all mail from outside the local host has qualified, internet style addresses.
In terms of the Right Thing thing, it feels wrong to me to have both
sendmail and Mail rewriting addresses; UCB Mail should stop doing so.
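
The condition above (every address arriving from outside is qualified) is
easy to check mechanically.  A minimal Python sketch, under loud
assumptions: the helper name unqualified_addresses is hypothetical,
addresses are taken to be separated by commas or whitespace, and
"qualified" is taken to mean simply "contains an @".  Real RFC822 address
syntax (comments, quoted strings, angle brackets) is deliberately ignored.

```python
import re


def unqualified_addresses(header_value):
    """Return the addresses in a header value that lack an "@host" part.

    Simplified on purpose: splits on commas/whitespace and does not
    attempt full RFC822 address parsing.
    """
    addrs = [a for a in re.split(r"[,\s]+", header_value.strip()) if a]
    return [a for a in addrs if "@" not in a]


# An empty result means reply-time rewriting would have nothing to do.
print(unqualified_addresses("local-user, remote-user@remote-host"))
print(unqualified_addresses("local-user@local-host, remote-user@remote-host"))
```

If such a check never reports anything for mail from outside the local
host, disabling the rewriting on reply should change nothing for that
mail.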

Kurt Shoens, IBM Almaden Research Center, ...!uunet!ibmarc!kurt

msir@uhura.cc.rochester.edu (Mark Sirota) (02/24/89)

In article <918@ur-cc.UUCP> I write:
> I ask the net: Is this true?  Do "most Internet sites use RFC822 as their
> *internal* mail representation also"?

Well, from a truly impressive number of responses both posted and mailed,
it would look as though the majority recommends using RFC822 format
internally.  I will be changing my sendmail.cf's to do this.

Thanks to everyone who participated.

I would like to restate the following responses, just because they say it
so well.  First, Craig Everhart makes it clear why you should use RFC822
internally, and secondly, Kurt Shoens gives a persuasive reason why UCB
Mail (or any other MUA) should not rewrite addresses.

In message <IY0LJDy00VsL86fHFj@andrew.cmu.edu> "Craig F. Everhart" <cfe+@andrew.cmu.edu> writes:
| I can think of re-sending and forwarding (packaging one message inside the
| body of another) as example user actions that often aren't handled in a
| correct manner when local host names are stripped (or abbreviated), and I
| imagine that there are others.  Just to be clear, the problems I cite
| occur when a piece of mail is sent locally, then that received piece of
| mail is re-sent or forwarded elsewhere; the addresses in the headers
| aren't usually updated correctly.  What happens when some external
| recipient of such mail tries to use the addresses in the headers of that
| mail?

In article <701@ks.UUCP> kurt@ibmarc.UUCP (Kurt Shoens) writes:
> Sounds like Mark is right.  At the time UCB Mail was written, there WAS no
> sendmail to rewrite outgoing addresses, so it was felt necessary to do the
> editing on reply that is now causing the troubles.  In terms of the Right
> Thing thing, it feels wrong to me to have both sendmail and Mail rewriting
> addresses; UCB Mail should stop doing so.
-- 
Mark Sirota - University of Rochester, Rochester, NY
 Internet: msir@cc.rochester.edu
 Bitnet:   msir_ss@uordbv.bitnet
 UUCP:     ...!rochester!ur-cc!msir