[net.mail] Uucp mail headers

hokey@plus5.UUCP (Hokey) (04/15/85)

What should be used in the headers of mail between uucp sites?

As it stands, mail passed along from site to site (without passing through
a site which uses sendmail, or possibly delivermail) will generate this:

	From uucp <date>
	>From uucp <date> remote from site1
	>From uucp <date> remote from site2
	...
	>From user <date> remote from sendersite

where "remote from" can be "forwarded by" under some circumstances.  If site1
is running sendmail, the header will (probably) be:

	From uucp <date>
	>From site2!...!sendersite!user remote from site1
	From: [anybody's guess]
	Received: by site1; <date>	(or something close)
	To: somesite!user

Basically, non-RFC822 mailers stuff most of the information an '822 mailer
puts in the Received: line into the >From line.  The obvious information
is the timestamp.  The timestamp is useful if you want to see how long it
took the message to go from one site to the next.

This information in the >From lines is lost whenever the mail passes through
a sendmail site, because sendmail doesn't recognize >From lines.  It is
possible to write an rmail which will build Received: lines from >From lines.
Does anybody care?
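
A minimal sketch of such a conversion, in Python, might look like the
following.  It assumes each hop line has the shape
"[>]From user <date> remote from site" and copies the date through
unchanged (no conversion to RFC822 format).  The function name, the "field"
parameter, and the choice to attribute each timestamp to the "remote from"
site are illustrative assumptions, not taken from any existing rmail.

```python
import re

# Assumed line shape: "[>]From user <date> remote from site".  The
# "forwarded by" variant mentioned above is accepted as well.
FROM_LINE = re.compile(
    r"^>?From (?P<user>\S+)\s+(?P<date>.*?)\s+"
    r"(?:remote from|forwarded by)\s+(?P<site>\S+)\s*$"
)


def derived_lines(header_lines, field="Received"):
    """Build one 'field: by site; date' line per timestamped hop line."""
    out = []
    for line in header_lines:
        m = FROM_LINE.match(line)
        if m is None:
            continue  # not a hop line, or no timestamp to preserve
        out.append("%s: by %s; %s" % (field, m.group("site"), m.group("date")))
    return out
```

Lines without a site or without a timestamp are skipped, since they carry
nothing that would otherwise be lost.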

The next point is domain names.  Right now, mail from (and, perhaps, through)
cbosgd looks like "cbosgd!cbosgd.ATT.UUCP!...".  If the message is being sent
to a class 3 mail system from cbosgd, it is not necessary for the initial
"cbosgd!" to be prepended.  There are two ways cbosgd's site name can be
reported:

	From uucp <date> remote from cbosgd
	>From uucp <date> remote from cbosgd.ATT.UUCP
or
	From uucp <date> remote from cbosgd!cbosgd.ATT.UUCP

The second form works fine on the couple of mail user interfaces I have seen.
Does anybody know of a mail program which does not like the second form?

Why is there a distinction between class 1 and class 2 sites?  Class 1 sites
are assumed to be "traditional" mail sites.  The only difference (that I
know of) between class 1 and class 2 is that class 2 sites will accept mail
in which the next machine in the route/address is the current machine; i.e.,
from upstream, the following is executed:

	uux site!rmail  site!blah	(where blah is a user or a route)
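
As a rough illustration of that class 2 behavior, the sketch below (Python)
strips leading hops that name the local machine before the address is
handled further.  The function name and the decision to strip repeated
leading occurrences are assumptions made for the example; what a class 1
site does with such an address is not modeled.

```python
def accept_for_delivery(address, local_site):
    """Drop leading hops that name the local site (assumed class 2 behavior)."""
    head, sep, rest = address.partition("!")
    # Only strip when something follows the "!"; a bare name is a user.
    while sep and head == local_site:
        address = rest
        head, sep, rest = address.partition("!")
    return address
```

With local_site set to "site", the address "site!blah" from the uux line
above comes out as "blah".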

That's it for now.  Other remaining issues include the format of information
on Sender and Recipient lines, >From and From: with respect to replies,
mail to Postmaster, better integration of the mail system with netnews,
and some pie-in-the-sky issues about passing messages between neighbors.

Oh, and I almost forgot a biggie: gateways, and how addresses should be
converted as they pass from one mail network into another.
-- 
Hokey           ..ihnp4!plus5!hokey
		  314-725-9492

lauren@vortex.UUCP (Lauren Weinstein) (04/17/85)

I personally think it is not a good idea to put !'s to the right
of the "remote from" on the "From " line.  I've seen some systems
that clearly expect the "remote from" to always be a single uucp sitename,
not a combined name.  In general, I'd recommend that the combined
! addresses always be on the LEFT side of the "remote from", and that
the right side be used only for "simple" names.

--Lauren--

gregbo@houxm.UUCP (Greg Skinner) (04/18/85)

I'd like to add a question to Hokey's.

Are we assuming in the uucp mail project that all participating machines will upgrade
to sendmail, or at least modify rmail to pass on mail generated or handled by sendmail?
(I'm not trying to enforce use of sendmail, btw.)  

If not, and I know for the time being all my machines won't have sendmail, the

From uucp ...
>From uucp remote from site1
...

headers will be used.  I would recommend that headers of this style (hereafter referred
to as Unix mail headers) be the default that is looked for by whatever mail delivery
programs are written, since we can't assume that the majority of Unix mailers on UUCP
will have sendmail.  Also, it is guaranteed that you can build a reply address from the
>From lines, but not necessarily from the From: lines.

I would also caution against rewriting UUCP mail headers by converting >From lines to
Received: lines unless, for the sake of the sites that generated the >From lines, an
inverse transformation program can be applied to replies, changing Received: lines back
to >From lines.  Also, how would mail forwarding be handled?

... sigh ...
I remember a year and a half ago when I said that domains would probably take care of
all the mess.  Seems like they've just complicated things.
-- 
			... hey, we've gotta get out of this place,
    			    there's got to be something better than this ...

Greg Skinner (gregbo)
{allegra,cbosgd,ihnp4}!houxm!gregbo
gregbo%houxm.uucp@harvard.arpa

hokey@plus5.UUCP (Hokey) (04/22/85)

Nobody is assuming that sites will upgrade to sendmail (perhaps "convert"
would be a better word).  I am aware of no changes which need to be made
to rmail.  Routes or addresses sent by uucp will be understandable by
neighbors.  I have recently received a copy of the "UUCP Mail Transmission
Format Standard" from Mark Horton.  If people are interested, it will be
posted.

The "condensing" of multiple "remote from" lines down to a single line will
result in the loss of timestamp information.  If this information is retained
in a Received: line, care must be taken to avoid introducing duplicate
Received: lines for the same site.  Mark Horton suggested calling these
"derived" lines "Sent-By:" lines instead of "Received:" lines.  He also
pointed out that by doing so, the date would not have to be converted to
Arpa format.

The issue of site-stripping during condensation is a different matter.  The
single >From line could contain all the sites through which the mail passed.
This will cause long paths for replies to messages with multiple recipients.
If the condensing rmail is intelligent about site-stripping, we can both
reduce the size of the route, and potentially optimize the followup mail's
routing.

This is one place that knowing the "class" of the mailer used by news sites
is really useful.  A lot of really extraneous sites crop up in the return
path when people reply to news articles.  If class 3 mail sites used their
domain name in the Path: line when transmitting news, the reply function
(on sites not doing INTERNET style replies) could strip out the entire
path between the first and last domain sites in the Path.

For example, if I was at a site with a "dumb" mailer, and I wanted to
reply to the author of an article which had the following Path:

	Path: a!b!c.uucp!d!e!f.uucp!g!...!z.uucp!site!user
	(imagine your favorite 15 sites at the ... )

it would be safe for the "reply" program on *my* machine to use:

	To: a!b!c.uucp!z.uucp!site!user

as the route to the author of the article.  If my site *is* class 3, then:

	To: z.uucp!site!user

would be all that I need to get the mail back to the author.
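
The stripping rule in this example can be sketched in Python as follows.
The sketch assumes a path component is a domain name exactly when it
contains a "." (as in c.uucp above); that test, and the names
strip_path and local_is_class3, are made up for the illustration.

```python
def is_domain(site):
    # Assumption: any component containing "." is a domain-style name.
    return "." in site


def strip_path(path, local_is_class3=False):
    """Drop the sites between the first and last domain names in a Path."""
    hops = path.split("!")
    sites, user = hops[:-1], hops[-1]
    idx = [i for i, s in enumerate(sites) if is_domain(s)]
    if not idx:
        return path  # no domain sites: nothing can safely be removed
    first, last = idx[0], idx[-1]
    if local_is_class3:
        kept = sites[last:]  # a class 3 site can reach the last domain directly
    elif last > first:
        kept = sites[:first + 1] + sites[last:]
    else:
        kept = sites  # only one domain site: leave the route alone
    return "!".join(kept + [user])
```

Applied to a Path shaped like the one above, the dumb-mailer case yields
the a!b!c.uucp!z.uucp!site!user form, and the class 3 case yields
z.uucp!site!user.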

This same site-strip could be safely done by rmail when condensing multiple
>From lines.  (Now where's that infinitive?)
-- 
Hokey           ..ihnp4!plus5!hokey
		  314-725-9492

pete@ecrcvax.UUCP (Pete Delaney) (04/23/85)

	I suggest you talk to mcvax!piet about this.  It seems to me that
Europe may be becoming dependent on sendmail.  He writes that almost all
news sites are defining INTERNET in news and forwarding via domain
addressing.

-- 
Pete Delaney - Rockey Mnt UNIX Consultant 	Phone: (49) 89 9269-164
European Computer (Industry) Reseach Center 	UUCP: mcvax!unido!ecrcvax!pete
ArabellaStrasse 17 				CSNET:ecrcvax!pete@Germany.CSNET
D-8000 Muenchen 81, West Germany 		UUCP Domain:  pete@ECRCvax.UUCP
						X25: (262)-45890040262

lauren@vortex.UUCP (Lauren Weinstein) (04/23/85)

While I'm not that concerned what happens with news, I don't think
calling "Received" lines "Sent-By" is a good idea at all.  First of
all, it is much less specific.  "Received" tells when the message
was received.  "Sent-By" seems much more vague.  But much more
importantly, many mail systems already have options to strip the
"Received" (and in some cases "Via") lines from mail so as to not
hassle the reader with all those lines unless they really want
to see them.  Not everyone is in the position to add another line
to that list, nor is it reasonable to introduce another line into
everyone's forced list of things to read in the header.  "Received"
is fine -- and 822 dates are trivial to produce if anyone is THAT
concerned about producing them.

I also don't think that site stripping in the From or >From lines
is a good idea at all.  For mail, those lines represent the only
reasonable way of figuring how the message arrived, and more often
than not I've found that the so-called "smart mailers" that try
to invisibly optimize paths just make things worse.  If people want
optimized paths, let them use the domain routing--where the "best"
route is what is expected or at least desired.  When explicit routes
are specified, there often is a good reason--like intermediate sites
that are mucking things up that someone is trying to route around.

My own solution (in my own class 3 implementation) is to generate
normal From and >From lines, and include a domain type
From: line as well.  My reply programs give the user the option
('a' or 'A') of replying to the reconstructed From line address or
the Internet From: (or Reply-To:, etc.) line.  In complex environments
(particularly where gateways are involved) it rapidly becomes impractical
to simply assume which way to handle all replies on an automatic basis.

So far, the people Beta testing my code (all MSDOS users) are very
happy with the results, and they are merrily routing mail to .UUCP,
.ARPA, .BITNET, and also dealing with all sorts of ordinary uucp
addresses (some with embedded @'s) as well.

--Lauren--

hokey@plus5.UUCP (Hokey) (04/25/85)

I don't really care about what the line containing the timestamp which
only exists in a >From line is called.  Mark believes putting this non-
local addition in a Received: line is misleading, as the info wasn't
added at the indicated site.  For all I care, the >From lines could be
kept around, except that sendmail won't keep the >From lines around.

As for site-stripping, the addition of Received/Sent-by lines will
provide the same path information that the full >From line does.

This is a chicken/egg problem.  People who use domain or smart mailers
won't need the optimization.  Sites that don't/can't change their mail
software don't use domain routing.  These are the sites that cause all
the horrible paths.

Therefore, the solution to this problem must come from somewhere other
than the mail software.  The only logical place is from within news.

If the news reply software automatically stripped everything between
the first and last domain name in the Path: line, then mailers would
not have to optimize paths, people who use domain mail addresses would
not suffer, people who don't or can't use domain mail addresses would
see how to get mail to other nets, and the number of gross paths would
decline.  We might also begin to see an end to the "How do I send mail
to my friends on FishNet?" postings (dream on).
-- 
Hokey           ..ihnp4!plus5!hokey
		  314-725-9492

hokey@plus5.UUCP (Hokey) (04/25/85)

Somewhere along the way, people seem to have gotten the idea that site-
stripping has something to do with Received/Sent-by lines.  I didn't
catch this before I posted my last article.

Site-stripping would be done when somebody does a "reply" to a news
article, but their reply software can't make use of the From: or Reply-to:
fields.  These replies must therefore be sent out along the Path: line,
which is gross.  Therefore, to minimize the path, the reply software
would strip out all sites in between the first and last domain addresses
in the Path: line.  This implies that only sites with smart (class 3 or
similar) mailers would identify themselves with their domain name in the
Path: line.

The Received/Sent-by stuff has to do with the information which must be lost
whenever mail goes through a site which runs sendmail.  Sendmail can't keep
the ">From site <timestamp> remote from nextdoor" lines which are added
once per uucp hop.  A site which uses sendmail must be fed the incoming mail
with all these multiple >From lines crushed into a single line.  This means
the timestamp information left by each site is lost.  I suggested converting
these individual >From lines into Received: lines just to save the timestamp.
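
To make the loss concrete, here is a rough Python sketch of the crushing
step.  It assumes the hop lines have the shape
"[>]From user <date> remote from site" and that the condensed route is the
"remote from" sites in order followed by the user on the last line; the
function name and return shape are invented for the example, and this is
not the code of any particular rmail.

```python
import re

# Assumed hop-line shape: "[>]From user <date> remote from site".
HOP = re.compile(
    r"^>?From (?P<user>\S+)\s+(?P<date>.*?)\s+remote from\s+(?P<site>\S+)\s*$"
)


def condense(from_lines):
    """Return (route, timestamps): the single bang path, plus the per-hop
    dates that a one-line form has no room to carry."""
    sites, timestamps, user = [], [], None
    for line in from_lines:
        m = HOP.match(line)
        if m is None:
            continue
        sites.append(m.group("site"))
        timestamps.append(m.group("date"))
        user = m.group("user")  # the last hop line names the original sender
    if user is None:
        return None, []
    return "!".join(sites + [user]), timestamps
```

The second value returned is exactly the information that would have to go
into Received: (or Sent-By:) lines if it is to survive the sendmail site.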
-- 
Hokey           ..ihnp4!plus5!hokey
		  314-725-9492