[comp.mail.misc] Replying to mail... is there a general theory?

joe@hanauma.stanford.edu (Joe Dellinger) (07/11/89)

	I would like to set up a system whereby mailing to a special
address ("anisotropy@hanauma.stanford.edu") automatically sends back
a reply giving some standard information (a list of e-mail addresses
of people working in anisotropy).
	My question is: what's the best way to get a return address
out of a message? It seems that I should use either the address on
the "From" or "From:" lines, but which is correct? Neither works all
the time. Here are some samples:

1)
From ames!uucp@apple.com Fri Jul  7 08:21:53 1989
From: ames!cs.utexas.edu!nluug.nl!ruugeof!douma@apple.com (Jan Douma)

2)
From stan@erebus.STANFORD.EDU Thu Jul  6 01:06:29 1989
From: Stan Ruppert <stan@erebus.STANFORD.EDU>

3)
From killer!jimfig@ames.arc.nasa.gov Tue Jun 20 15:04:51 1989
From: jimfig@killer.Dallas.TX.US (Jim Fiegenschue)


	In #1, the "From:" line would work but the "From" line would not.
	In #2, the "From" line would work but the "From:" line would not,
	unless you knew to look in the <>'s.
	In #3 either would work.

	From looking at other samples collected in my mbox, it seems the
best algorithm would be:

Look at the "From:" line. Does it have a "<...>" field in it? If yes, use
whatever is in the <>'s as the return address; if no, use the first field
after the "From:" as the return address.
\    /\    /\    /\/\/\/\/\/\/\.-.-.-.-.......___________
 \  /  \  /  \  /Dept of Geophysics, Stanford University \/\/\.-.-....___
  \/    \/    \/Joe Dellinger joe@hanauma.stanford.edu  apple!hanauma!joe\/\.-._

aem@ibiza.cs.miami.edu (a.e.mossberg) (07/11/89)

joe@hanauma.stanford.edu (Joe Dellinger) writes:
>	My question is: what's the best way to get a return address
>out of a message? It seems that I should use either the address on
>the "From" or "From:" lines, but which is correct? Neither works all
>the time. Here are some samples:

Don't even consider using the 'From' line. Use, in order of preference,
the 'Reply-To:' line followed by the 'From:' line.

>	From looking at other samples collected in my mbox, it seems the
>best algorithm would be:

>Look at the "From:" line. Does it have a "<...>" field in it? If yes, use
>whatever is in the <>'s as the return address; if no, use the first field
>after the "From:" as the return address.

Look at the appropriate RFC. Look at the parsing code used in other programs.
Your suggestion is a bit too simplistic to catch all varieties.  Try it out
anyway. See where it fails. Go back to the drawing board. Try again.


aem

a.e.mossberg - aem@mthvax.cs.miami.edu/aem@umiami.BITNET - Pahayokee Bioregion
If you crumple your money into little balls, it will never stick together.
							- David Byrne

mike@unmvax.cs.unm.edu (Michael I. Bushnell) (07/12/89)

In article <3483@portia.Stanford.EDU> joe@hanauma.stanford.edu (Joe Dellinger) writes:
>
>	I would like to set up a system whereby mailing to a special
>address ("anisotropy@hanauma.stanford.edu") automatically sends back
>a reply giving some standard information (a list of e-mail addresses
>of people working in anisotropy).
>	My question is: what's the best way to get a return address
>out of a message? It seems that I should use either the address on
>the "From" or "From:" lines, but which is correct? Neither works all
>the time. Here are some samples:

Actually, you should use, in order of preference, the following:
  Reply-To:
  From:
  Sender:

For mail to arrive without a From: at all is an error, but I've seen it...
it's wise to use Sender: if there isn't a From:.

You should NEVER use the From_ line unless something is really wrong and
none of the above three works.

Once you've selected the correct field, do the following:
  Drop everything in parentheses (they nest)
  If there are <> in what's left, take what's inside the <>
  Else, take whatever's left.

That will work.  Honest.  You NEVER need the From_ line.  My MUA just
dumps them on the floor.
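
The procedure above can be sketched in Python roughly as follows.  This is
only an illustration, not anyone's actual MUA code: the function names are
made up, the headers are assumed to arrive as a dict with lower-cased field
names, and quoted strings and backslash escapes are not handled.

```python
def strip_comments(value):
    """Drop everything in parentheses; nested comments are handled by depth."""
    out = []
    depth = 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return "".join(out)


def extract_address(value):
    """If <> remain after dropping comments, take what is inside them;
    otherwise take whatever is left."""
    rest = strip_comments(value)
    start = rest.find("<")
    end = rest.find(">", start + 1)
    if start != -1 and end != -1:
        return rest[start + 1:end].strip()
    return rest.strip()


def reply_address(headers):
    """Pick the field in the order given in this post (hypothetical helper).

    `headers` is assumed to map lower-cased field names to raw values.
    """
    for name in ("reply-to", "from", "sender"):
        value = headers.get(name)
        if value and value.strip():
            return extract_address(value)
    return None
```

Applied to the three sample From: lines from the original question, this
yields the bang-path address for #1, the address inside the <> for #2, and
the address before the parenthesized name for #3.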

   Michael I. Bushnell       \     This above all; to thine own self be true
     Silence == Death         \    And it must follow, as the night the day,
  mike@unmvax.cs.unm.edu      /\   Thou canst not be false to any man.
Telephone: +1 505 292 0001   /  \  Farewell:  my blessing season this in thee!

dce@Solbourne.COM (David Elliott) (07/12/89)

In article <207@unmvax.unm.edu> mike@unmvax.cs.unm.edu (Michael I. Bushnell) writes:
>Actually, you should use, in order of preference, the following:
>  Reply-To:
>  From:
>  Sender:

What about Return-Path:?  I modified my MH configuration files
to prefer that because it seems to be right more of the time
than any of the others.

For example, when stuff gets sent through sites like pyramid
and nbires, the From: line gets left alone, but Return-Path:
is modified.  The problem is that other sites may modify From:,
so the path might appear to be uunet!mips.com!mdove, when the
actual path taken would have been mips!pyramid!uunet!nbires.

-- 
David Elliott		dce@Solbourne.COM
			...!{boulder,nbires,sun}!stan!dce

jos@idca.tds.PHILIPS.nl (Jos Vos) (07/12/89)

In article <1581@marvin.Solbourne.COM> dce@Solbourne.com (David Elliott) writes:

>In article <207@unmvax.unm.edu> mike@unmvax.cs.unm.edu (Michael I. Bushnell) writes:
>>Actually, you should use, in order of preference, the following:
>>  Reply-To:
>>  From:
>>  Sender:

>What about Return-Path:?  I modified my MH configuration files
>to prefer that because it seems to be right more of the time
>than any of the others.

The From: field *should* contain an absolute domain address
(in the ideal situation) for its whole life (i.e. starting
on the sender's system).
In that case nobody needs to change this field (and nobody *may*).

-- 
-- ######   Jos Vos   ######   Internet   jos@idca.tds.philips.nl   ######
-- ######             ######   UUCP         ...!mcvax!philapd!jos   ######

schaefer@ogccse.ogc.edu (Barton E. Schaefer) (07/13/89)

In article <207@unmvax.unm.edu> mike@unmvax.cs.unm.edu (Michael I. Bushnell) writes:
} In article <3483@portia.Stanford.EDU> joe@hanauma.stanford.edu (Joe Dellinger) writes:
} >
} >	My question is: what's the best way to get a return address
} >out of a message? It seems that I should use either the address on
} >the "From" or "From:" lines, but which is correct? Neither works all
} >the time. Here are some samples:
} 
} Actually, you should use, in order of preference, the following:
}   Reply-To:
}   From:
}   Sender:

Actually, this is incorrect.  In section 4.4 of RFC822, "Automatic use
of From / Sender / Reply-To", the statement is explicitly made that
"The `Sender' field mailbox should NEVER be used automatically, in a
recipient's reply message."  (emphasis theirs)

In the same section, it is stated that the Sender: field should be used
for sending any notices of transport or delivery problems.

Joe is correct that Reply-To: should be preferred to From: if both are
present.

} Once you've selected the correct field, do the following:
}   Drop everything in parentheses (they nest)
}   If there are <> in what's left, take what's inside the <>

Wrong, sort of.  You should take the <> and everything inside (that is,
don't discard the <>).  Some MTAs may require that the <> be dropped, but
those MTAs are broken.  The <> are required for parsing of some legal
addressing forms, and if you remove them, somebody along the way probably
won't like it.

}   Else, take whatever's left.
} 
} That will work.  Honest.  You NEVER need the From_ line.  My MUA just
} dumps them on the floor.

This ought to be correct, but unfortunately it varies depending on how
the mail got to you.  MTAs that may have handled the message during
its trip are sometimes misconfigured and will scramble the From: field.
I've never seen a Reply-To: get scrambled, so you're probably safe if
you find that one--provided that your MTA knows how to reconstruct any
missing path (this is a problem only for UUCP connections, it shouldn't
be necessary if all parties are on the Internet).

However, it usually isn't possible for an automated system to tell
whether the From: line has been scrambled.  You have to look at it
in context--for example, the sequence of Received: headers may show
that intermediate MTAs handled the message in a different order than
the one that normal parsing of the From: address would produce.  So
you're best off using Reply-To: / From:, and be prepared to handle
bounced messages.
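
A rough Python sketch of the two corrections made in this post (never fall
back on Sender:, and keep the <> around the address).  The function names
are invented for illustration; the header parsing uses the standard
library's email.message_from_string, and quoted strings are not handled.

```python
import email
import re

# Matches a comment that contains no further parentheses (the innermost one).
_INNERMOST_COMMENT = re.compile(r"\([^()]*\)")
# Matches an address in angle brackets, brackets included.
_ANGLE_ADDR = re.compile(r"<[^<>]*>")


def reply_field(raw_message):
    """Return the raw Reply-To: value, else From:, else None.

    Sender: is deliberately absent from the list.
    """
    msg = email.message_from_string(raw_message)
    for name in ("Reply-To", "From"):
        value = msg.get(name)
        if value and value.strip():
            return value
    return None


def address_with_brackets(value):
    """Drop (possibly nested) comments, then keep the <...> form intact."""
    previous = None
    while previous != value:
        # Removing innermost comments repeatedly peels nested ones away.
        previous = value
        value = _INNERMOST_COMMENT.sub("", value)
    match = _ANGLE_ADDR.search(value)
    return match.group(0) if match else value.strip()
```

When neither Reply-To: nor From: is present the sketch returns None instead
of guessing, so the caller has to decide what to do, and it still has to be
prepared for bounces.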

In article <1581@marvin.Solbourne.COM> dce@Solbourne.com (David Elliott) writes:
} In article <207@unmvax.unm.edu> mike@unmvax.cs.unm.edu (Michael I. Bushnell) writes:
} 
} What about Return-Path:?  I modified my MH configuration files
} to prefer that because it seems to be right more of the time
} than any of the others.
} 
} For example, when stuff gets sent through sites like pyramid
} and nbires, the From: line gets left alone, but Return-Path:
} is modified.

This is correct behavior.  Return-Path: is the only field that an MTA has
license to modify.  (It can add Received: lines, and the From_ line at
the top is a binmail / sendmailism, so those programs can munge it as
they like.)  In fact, the MTA is supposed to add Return-Path: at the time
of final delivery.  However, the Reply-To: field is still to be preferred
for directing replies, because it is specified by the message originator.
He may want the reply directed to a different mailbox than the one in
either the From: or Return-Path: lines.

Return-Path: should be at least as reliable as From_, probably more so.
-- 
Bart Schaefer           "And if you believe that, you'll believe anything."
                                                            -- DangerMouse
CSNET / Internet                schaefer@cse.ogc.edu
UUCP                            ...{sequent,tektronix,verdix}!ogccse!schaefer

mike@unmvax.cs.unm.edu (Michael I. Bushnell) (07/13/89)

In article <1581@marvin.Solbourne.COM> dce@Solbourne.com (David Elliott) writes:
>In article <207@unmvax.unm.edu> mike@unmvax.cs.unm.edu (Michael I. Bushnell) writes:
>>Actually, you should use, in order of preference, the following:
>>  Reply-To:
>>  From:
>>  Sender:

>What about Return-Path:?  I modified my MH configuration files
>to prefer that because it seems to be right more of the time
>than any of the others.

>For example, when stuff gets sent through sites like pyramid
>and nbires, the From: line gets left alone, but Return-Path:
>is modified.  The problem is that other sites may modify From:,
>so the path might appear to be uunet!mips.com!mdove, when the
>actual path taken would have been mips!pyramid!uunet!nbires.

The standard for UUCP mail is for From: to be modified when passing
through each machine.  This arranges for the return address to be
correct.  The standard on the Internet is for From: to be left
absolutely alone.  The best thing to do in gatewaying is as follows:

UUCP -> Internet gatewaying:

From: foo!bar!baz    maps to      From: foo!bar!baz@my.internet.site.name


Internet -> UUCP gatewaying:

From: user@his.host.name     maps to   From: myuucpname!user%his.host.name


This all works quite well and integrates nicely with the two
standards.  Note that Reply-To: is modified roughly as above.
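
The two mappings above might be sketched in Python like this.  It is a
minimal illustration that assumes a bare address (no comments, no <>); the
function and parameter names are hypothetical.

```python
def uucp_to_internet(from_value, gateway_domain):
    """UUCP -> Internet: append @<gateway's Internet name> to the ! path."""
    return f"{from_value}@{gateway_domain}"


def internet_to_uucp(from_value, gateway_uucp_name):
    """Internet -> UUCP: prefix <gateway's UUCP name>! and turn the
    last @ into % so UUCP sites further along leave it alone."""
    local, sep, host = from_value.rpartition("@")
    if not sep:
        # No @ at all: nothing to hide, just prefix the gateway name.
        return f"{gateway_uucp_name}!{from_value}"
    return f"{gateway_uucp_name}!{local}%{host}"
```

With the example names used above, uucp_to_internet("foo!bar!baz",
"my.internet.site.name") and internet_to_uucp("user@his.host.name",
"myuucpname") reproduce the two right-hand sides.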

Unfortunately, there is no standard for how Return-Path: gets modified
when gatewaying.  Alas.

From RFC 822:

The "Reply-To" field is added by the originator and serves to direct
replies, whereas the "Return-Path" field is used to identify a path
back to the originator.

You see, Return-Path: is not a "path to use for returns/replies".
It's just a path (hopefully) back to the sender.  Internet sites form
it with route-addrs and UUCP sites form it with ! paths.  There isn't
an easy way to map between the two and still guarantee that UUCP sites
always get ! and Internet sites never do: not everyone on the
Internet understands !, and not everyone in UUCPland understands @ or
route-addrs.

I have found Return-Path: to be wrong in virtually every case that
involved gatewaying between the two networks.  I have *never* seen the
From: line to be incorrect.  Sometimes inefficient, but never
incorrect. 

   Michael I. Bushnell       \     This above all; to thine own self be true
     Silence == Death         \    And it must follow, as the night the day,
  mike@unmvax.cs.unm.edu      /\   Thou canst not be false to any man.
Telephone: +1 505 292 0001   /  \  Farewell:  my blessing season this in thee!

cfe+@andrew.cmu.edu (Craig F. Everhart) (07/13/89)

There are two classes of reply: error bounces and human-like replies. 
There are different rules for each class.

For human-like replies, follow RFC822 and use Reply-to: if it exists, or
From:.  Do NOT fall back on Sender: or Return-path: or From_.  Remember,
what's happening here is that a human is supposed to really be doing the
addressing, and a user-agent program is simply offering an assist.  If
the user-agent program can't find the proper header to suggest as the
default answer, then it shouldn't be guessing on substandard
information.  The user can do that guessing once the UA program has
announced that the correct headers aren't there.  (Variation: you can
also use the Resent-Reply-to: (falling back on Resent-From:)
information, possibly for a different class of reply.  Still other
classes of replies might also include the To:/CC:/Resent-To:/Resent-CC:
addresses.)

For error bounces, there's an independent collection of hair.  In a pure
RFC822 world, you'd use the Sender: address, falling back on From:
should Sender: not exist.  As implemented in the RFC821 world (using
SMTP), you'd use the ``envelope From'' information, which comes across
as the argument to SMTP's ``MAIL FROM:'' command.  This information is
supposed to be recorded in the Return-path: field when a message is
finally delivered; depending where the rejection happens, it may be
before final delivery occurs.  I believe that the comparable information
is recorded in From_ lines in UUCP-land, but I'm a UUCP-mail novice.
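
As a rough Python sketch of the two classes: the function names are made
up, headers are assumed to be a dict with lower-cased field names, and
putting Return-path: ahead of Sender: for bounces is an assumption of this
sketch rather than something stated above.

```python
def _first_present(headers, names):
    """Return the first non-empty raw header value among `names`, else None."""
    for name in names:
        value = headers.get(name)
        if value and value.strip():
            return value.strip()
    return None


def human_reply_address(headers):
    """Human-like reply: Reply-to:, else From:, else give up (return None)
    so the user can do the guessing instead of the program."""
    return _first_present(headers, ("reply-to", "from"))


def bounce_address(headers, envelope_from=None):
    """Error bounce: prefer the SMTP envelope sender when it is known.

    Otherwise fall back on Return-path: (where the envelope sender is
    supposed to be recorded at final delivery), then Sender:, then From:.
    """
    if envelope_from:
        return envelope_from
    return _first_present(headers, ("return-path", "sender", "from"))
```

The envelope_from argument stands for the argument of ``MAIL FROM:''; a
program that runs after final delivery would normally only see it by way of
the Return-path: field.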

		Craig Everhart