[comp.mail.headers] Wanted: Messed-Up Mail Headers

beh@bass.bu.edu (Bruce E. Howells) (09/07/90)

A few students here are trying to write a nice, pretty mail-user-agent
under a custom OS, talking both to BITNet and Internet-style addresses.

After severely taxing our creativity, we're making a request to the
Network:

Send us your tired, weak, confused or downright perverted mail headers...

Since sooner or later we're going to get a weird one, we'd appreciate it if
you could look through your back "strange-headers" file and send anything
particularly strange to me at the following addresses so we can code for it
now rather than debug for it later.

Thanks in advance for your strangeness donations!


Bruce Howells,  beh@bu-pub.bu.edu  |  engnbsc@buacca (BITNet)
  - not an official document of Boston University

avolio@decuac.dec.com (Frederick M. Avolio) (09/07/90)

I believe coding for weird mail headers is wrong.  I suggest you follow the
rules in the RFCs and leave it at that.  Why?  Because most screwed up
mail headers -- for example, something like this:

	From:  host!host2!host3::other!thing@something!else

are too screwy to be usable anyway.  One of the things wrong with the
Berkeley Mail program (in 4.2) was that it tried to be smart about
the headers (it had built-in smarts about BerkNet, etc... kind of like
the finger program expecting people to be in one of 3 Halls at UCB).

I don't think you want to handle any screwy headers.  Your software will
probably guess wrong anyway.  Mail handlers along the way have the
responsibility for maintaining usable reply addresses, etc.

Fred

beh@bass.bu.edu (Bruce E. Howells) (09/08/90)

I'm not intending to parse From:, Sender:, Reply-to:, or
Maybe-If-I'm-Lucky-It'll-Get-To: - if the address is broken enough that our
professionally maintained mailer can't cope with it, I doubt that some code
I'll leave here when I graduate can...

Rather, I'm worried about things coming in that will break the code reading
the header - stuff like a full name and no address in From: - which I've
seen coming from PSUVM occasionally.  Or things like 100-address To: lines
and extremely long Subject: lines that people have sent.
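
[Editorial sketch, not from the original post.]  One way to keep a From:
field with a full name and no address from breaking the reader is to treat
"no address" as an ordinary result rather than an error.  The following
Python is a minimal sketch under that assumption; the helper name
split_from, the max_len display limit, and the simple patterns are all
hypothetical, and it deliberately does not attempt full RFC 822 address
parsing.

```python
import re

# Hypothetical helper for illustration; not taken from any mailer in this thread.
ANGLE_ADDR = re.compile(r"<([^<>\s]+)>")
COMMENT = re.compile(r"\(([^()]*)\)")

def split_from(value, max_len=200):
    """Return (display_name, address_or_None) for a From: value; never raises."""
    text = " ".join(value.split())            # collapse folding and odd whitespace
    match = ANGLE_ADDR.search(text)
    if match:
        # Form: Full Name <user@host>
        name = (text[:match.start()] + text[match.end():]).strip(' "')
        return name[:max_len], match.group(1)
    bare = COMMENT.sub("", text).strip()
    if "@" in bare or "!" in bare:
        # Form: user@host (Full Name), or a bang path
        comment = COMMENT.search(text)
        return (comment.group(1) if comment else "")[:max_len], bare
    # Full name and nothing else: keep it for display, report no address.
    return text[:max_len], None
```

The display name is clipped to max_len only for presentation; the raw
header would still be kept elsewhere so nothing the sender wrote is lost.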

Basically, my concern is that this thing not read in a piece of mail, try
to parse it, choke, die, and lose the user's mail...  The
mail-transport-agent level stuff I'm leaving to mailer - let it worry!

Bruce Howells,   beh@bu-pub.bu.edu  |   engnbsc@buacca (BITNet)
  - Not an official document of Boston University Information Technology

les@chinet.chi.il.us (Leslie Mikesell) (09/09/90)

In article <63938@bu.edu.bu.edu> beh@bass.bu.edu (Bruce E. Howells) writes:

>Basically, my concern is that this thing not read in a piece of mail, try
>to parse it, choke, die, and lose the user's mail...  The
>mail-transport-agent level stuff I'm leaving to mailer - let it worry!

There is a relatively simple way to provide this kind of robustness.  Just
read the mail into a spool file before trying to parse anything.  Then the
worst that can happen is that the spool file will be left in place so that
a periodic scan can find it.  Smail 3 works this way, and takes advantage
of the queuing operation to offer a choice of foreground, background or
daemon delivery.  It turns out to be moderately complicated to keep track
of deliveries to multiple recipients to allow restarting correctly, but it
looks like Smail 3 should get it right.  It does, however, choke on certain
combinations of things that really shouldn't be in addresses (I think
multiple ::'s or something like "Thanks! path!to!user" can do it - and
yes, I've seen the latter).  When this happens, the administrator can still
attempt to fix things by hand.
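
[Editorial sketch, not from the original post.]  The spool-first idea
above can be illustrated in a few lines of Python.  This is only a sketch
of the general technique, not how Smail 3 is implemented; the function
names, the "tmp." and "msg." file prefixes, and the process callback are
assumptions made for the example.

```python
import os
import tempfile

def spool_message(stream, spool_dir):
    """Copy an incoming message to disk, byte for byte, before any parsing."""
    fd, tmp_path = tempfile.mkstemp(prefix="tmp.", dir=spool_dir)
    with os.fdopen(fd, "wb") as out:
        while True:
            chunk = stream.read(65536)
            if not chunk:
                break
            out.write(chunk)
        out.flush()
        os.fsync(out.fileno())               # make sure the bytes reach the disk
    # Rename only after the copy is complete, so msg.* files are always whole.
    final_name = "msg." + os.path.basename(tmp_path)[len("tmp."):]
    final_path = os.path.join(spool_dir, final_name)
    os.rename(tmp_path, final_path)
    return final_path

def deliver(stream, spool_dir, process):
    """Spool first, then process; return the spool path if processing failed."""
    path = spool_message(stream, spool_dir)
    try:
        with open(path, "rb") as f:
            process(f.read())                # parsing and delivery happen here
    except Exception:
        return path                          # left in place for a periodic scan
    os.remove(path)                          # remove only after success
    return None
```

Because the spool file is removed only after process returns normally, a
crash or exception in the parser leaves the message on disk where a later
scan, or the administrator, can find it.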

Les Mikesell
  les@chinet.chi.il.us

Makey@Logicon.COM (Jeff Makey) (09/11/90)

In article <63938@bu.edu.bu.edu> beh@bass.bu.edu (Bruce E. Howells) writes:
>Basically, my concern is that this thing not read in a piece of mail, try
>to parse it, choke, die, and lose the user's mail...

Ordinary good coding practices should take care of these types of
problems.  If you assume that lines may be infinitely long, and may
contain characters of *any* of the 256 possible values (including
ASCII NUL), then you should be protected from memory faults and
surprise truncations.  After that, insist that all errors detected
during parsing be handled in some sane manner.

Unfortunately, this is a lot easier to say than to do.
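
[Editorial sketch, not from the original post.]  As a rough illustration
of those assumptions, the Python below reads header fields as raw bytes,
with no fixed-size line buffer and no restriction on byte values, and
keeps malformed lines instead of failing on them.  The function name
read_header_fields and the convention of filing unparseable lines under
an empty name are assumptions of this sketch.

```python
import io

def read_header_fields(stream):
    """Read raw header fields from a binary stream up to the first blank line.

    Returns a list of (name, value) byte strings.  Lines that fit no pattern
    are kept under the empty name b"" rather than dropped or raised on.
    """
    fields = []
    for line in stream:                      # bytes lines, however long they are
        line = line.rstrip(b"\r\n")
        if not line:
            break                            # a blank line ends the header
        if line[:1] in (b" ", b"\t") and fields:
            name, value = fields[-1]         # continuation of the previous field
            fields[-1] = (name, value + b" " + line.strip())
            continue
        name, sep, value = line.partition(b":")
        # A field name must be non-empty printable ASCII with no spaces.
        if sep and name and not any(c <= 32 or c >= 127 for c in name):
            fields.append((name, value.strip()))
        else:
            fields.append((b"", line))       # malformed: keep it and flag it
    return fields
```

For example, read_header_fields(io.BytesIO(data)) can be applied to a
message already held in memory; decoding the values for display is left
to a later step that can choose how to show NUL and 8-bit bytes.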

                           :: Jeff Makey

Department of Tautological Pleonasms and Superfluous Redundancies Department
    Disclaimer: Logicon doesn't even know we're running news.
    Internet: Makey@Logicon.COM    UUCP: {nosc,ucsd}!logicon.com!Makey

leres@ace.ee.lbl.gov (Craig Leres) (09/22/90)

Frederick M. Avolio writes:
> I believe coding for weird mail headers is wrong.

I sure hate to contradict you but you're wrong. The golden rule of smtp
is, "Be liberal in what you accept and conservative in what you send."

		Craig

dylan@ibmpcug.co.uk (Matthew Farwell) (09/23/90)

In article <7154@dog.ee.lbl.gov> leres@helios.ee.lbl.gov (ucbvax!leres for uucp weenies) writes:
>I sure hate to contradict you but you're wrong. The golden rule of smtp
>is, "Be liberal in what you accept and conservative in what you send."

Actually, that's the golden rule of all TCP applications, as stated in
RFC 1123. It applies generally to most things like that, though.

Dylan.
-- 
Matthew J Farwell                 | Email: dylan@ibmpcug.co.uk
The IBM PC User Group, PO Box 360,|        ...!uunet!ukc!ibmpcug!dylan
Harrow HA1 4LQ England            | CONNECT - Usenet Access in the UK!!
Phone: +44 81-863-1191            | Sun? Don't they make coffee machines?

ken@argus.UUCP (Kenneth Ng) (09/28/90)

In article <1990Sep22.174312.29336@ibmpcug.co.uk>, dylan@ibmpcug.co.uk (Matthew Farwell) writes:
: In article <7154@dog.ee.lbl.gov> leres@helios.ee.lbl.gov (ucbvax!leres for uucp weenies) writes:
: >I sure hate to contradict you but you're wrong. The golden rule of smtp
: >is, "Be liberal in what you accept and conservative in what you send."
: Actually, that's the golden rule of all TCP applications, as stated in
: RFC 1123. It applies generally to most things like that, though.

Unfortunately I've seen this phrase abused by both the sender and the
receiver implementers.  The sender says the receiver should be more 
liberal in what they accept.  And the receiver says the sender should
be more conservative in what they send.  Either that or they say
"It's not a bug, it's a FEE_TURE (as in fee_cees)."

-- 
Kenneth Ng: Post office: NJIT - CCCC, Newark New Jersey  07102
uucp !andromeda!galaxy!argus!ken *** NOT ken@bellcore.uucp ***
bitnet(preferred) ken@orion.bitnet  or ken@orion.njit.edu

jbuck@galileo.berkeley.edu (Joe Buck) (09/28/90)

In article <2016@argus.UUCP>, ken@argus.UUCP (Kenneth Ng) writes:
[ "Be liberal in what you accept and conservative in what you send."]

> Unfortunately I've seen this phrase abused by both the sender and the
> receiver implementers.  The sender says the receiver should be more 
> liberal in what they accept.  And the receiver says the sender should
> be more conservative in what they send.  Either that or they say
> "It's not a bug, it's a FEE_TURE (as in fee_cees)."

Kenneth, this is an easy one to solve.  You should be liberal because
people sometimes screw up, but in a case like the above, either the
address is standard (so the receiver is in error) or the address is
non-standard (so the sender is in error).

But people who have to use the system don't care who's guilty; they
just want it to work.  This is why the receiver shouldn't be pedantic
about rejecting things that commonly appear for historical reasons
(i.e. they were allowed in the previous (maybe ad-hoc) standard).

--
Joe Buck
jbuck@galileo.berkeley.edu	 {uunet,ucbvax}!galileo.berkeley.edu!jbuck	

eric@sunic.sunet.se (Eric Thomas SUNET) (09/28/90)

In article <28292@pasteur.Berkeley.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes:
>Kenneth, this is an easy one to solve.  You should be liberal because
>people sometimes screw up, but in a case like the above, either the
>address is standard (so the receiver is in error) or the address is
>non-standard (so the sender is in error).

Or maybe the meaning is undefined; RFC822 is unfortunately not binary. I have a
server which receives requests via mail. I have some sites sending me mail
with two 'From:' fields, as in 'From: Joe@ABC' followed by 'From: Joe@XYZ'
where ABC and XYZ "look" like variations on the same host name to a human
being. My server likes to know who sent the command, so it refuses to process
such messages. The senders complain that their fields are syntactically correct
(which is true, each of the 2 fields is ok), that RFC822 doesn't say you
can't have more than one (which is also true, although it doesn't define the
meaning of that), and that the 2 addresses should point to the same mailbox, so
I should just stop being fussy and accept the command, sending the reply to both
addresses. I refuse to do that on the basis that it's not my problem but
theirs, and that if RFC822 intended to allow multiple originating addresses
they would not have disallowed multiple mailbox specifications in the 'From:'
field (when no 'Sender:' is present).

It's no big deal, I just wanted to show that the case is not always as clear
as you said it is...
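
[Editorial sketch, not from the original post.]  The policy described
above, refusing to act when the originator is ambiguous, might look like
the following in Python.  This is not the server's actual code; the
function name originator and the list-of-pairs input format are
assumptions made for the example.

```python
def originator(fields):
    """Return the single From: value, or None if it is missing or ambiguous.

    `fields` is a list of (name, value) string pairs in message order.
    """
    froms = [value.strip() for name, value in fields
             if name.strip().lower() == "from"]
    if len(froms) != 1:
        return None        # zero or several From: fields: refuse to guess
    return froms[0]

# The case from the post: two From: fields that merely look alike to a human.
message = [("From", "Joe@ABC"), ("From", "Joe@XYZ"), ("Subject", "command")]
sender = originator(message)   # None, so the request would be refused
```

Returning None leaves the decision (reject, bounce, or ask a human) to the
caller, which keeps the ambiguity visible instead of silently picking one
of the two addresses.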

  Eric