[net.mail] "Binary"/international text in mail

gnu@hoptoad.UUCP (10/21/86)

[Further discussion should move to net.mail.]

In article <755@mtune.UUCP>, jhc@mtune.UUCP (Jonathan Clark) writes:
> >How can you send binaries in mail?
> You could if /bin/mail supported a logical separation between a letter
> and its envelope....                            do any other mail
> subsystems support this separation so that binaries could be mailed?

Yes, the Arpanet mail standard (Simple Mail Transfer Protocol, or SMTP,
described in RFC [Request for Comments] #821) separates the text and
the header information.  It specifically allows any of the 128 possible
7-bit characters to be sent, and does a trivial encoding to allow the
"end of text" marker to appear in messages.  It requires that 7-bit USASCII
be used, however, which makes things hard on people in Europe and Asia.
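
The "trivial encoding" mentioned above is the transparency rule in RFC 821:
the data ends with a line holding a single ".", so the sender doubles the
leading "." on any text line that begins with one and the receiver strips it
again.  A minimal sketch, assuming the message is already split into lines
with the CRLFs removed; the function names are made up for illustration and
are not from any real mailer:

```python
def dot_stuff(lines):
    """Sender side: add one extra '.' to every line that starts with '.'."""
    return ["." + line if line.startswith(".") else line for line in lines]


def dot_unstuff(lines):
    """Receiver side: stop at the lone '.', drop one leading '.' elsewhere."""
    out = []
    for line in lines:
        if line == ".":          # end-of-text marker, not part of the message
            break
        out.append(line[1:] if line.startswith(".") else line)
    return out


body = ["hello", ".", "..two dots", "bye"]
wire = dot_stuff(body) + ["."]   # what would travel after the DATA command
assert dot_unstuff(wire) == body
```

A text line consisting of just "." goes over the wire as ".." and so can
never be mistaken for the terminator.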

I note that Sendmail has a bug which does not allow ASCII NUL (0x00) to
be sent.  This is in violation of RFC 821.

In article <7242@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) writes:
> > What happens if the string
> > "\nFrom" appears in the binary?  Shouldn't the user agent or
> > delivery system or someone be inserting a '>' before the From?
> 
> Yup.

Actually, the mail transport system should not care about "From "s in
things.  When it gets to the far end, IF it is being delivered to a
mailbox that used "\nFrom " to delimit messages, then the far end has
to worry about this.  Let us work towards making all the software
transparent, then when someone writes a final delivery program that
uses a different format (e.g. delivers straight to an MH folder, which
keeps each message in a separate file), the whole thing will work.
Note that local mail delivery is not built in to sendmail -- you can
change sendmail.cf to have it call /bin/foomail rather than /bin/mail
and it really won't care.
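
A rough sketch of the point above: the ">" quoting belongs in the final
delivery step for a "\nFrom "-delimited mailbox, and a delivery program that
writes one file per message has no need for it.  All names here are
hypothetical, and the mailbox layout is only an approximation of what
/bin/mail writes:

```python
def escape_from_lines(body):
    """Put '>' in front of body lines that start with 'From '."""
    return "\n".join(
        ">" + line if line.startswith("From ") else line
        for line in body.split("\n")
    )


def mailbox_entry(sender, date, body):
    """Text to append to a mailbox that uses '\\nFrom ' as its delimiter."""
    return "From %s %s\n%s\n\n" % (sender, date, escape_from_lines(body))


def separate_file_contents(body):
    """One file per message (MH-folder style): the body is stored untouched."""
    return body


text = "hi\nFrom here on it gets binary\nFromage is fine"
assert escape_from_lines(text) == "hi\n>From here on it gets binary\nFromage is fine"
assert separate_file_contents(text) == text
```

Only mailbox_entry() changes the message; the transport never has to.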
-- 
John Gilmore  {sun,ptsfa,lll-crg,ihnp4}!hoptoad!gnu   jgilmore@lll-crg.arpa
(C) Copyright 1986 by John Gilmore.             May the Source be with you!

jhc@mtune.UUCP (Jonathan Clark) (10/24/86)

In article <1209@hoptoad.uucp> gnu@hoptoad.UUCP writes:
>In article <755@mtune.UUCP>, jhc@mtune.UUCP (Jonathan Clark) writes:
>> >How can you send binaries in mail?
>> You could if /bin/mail supported a logical separation between a letter
>> and its envelope....                            do any other mail
>> subsystems support this separation so that binaries could be mailed?
>
>Yes, the Arpanet mail standard (Simple Mail Transfer Protocol, or SMTP,
>described in RFC [Request for Comments] #821) separates the text and
>the header information.  It specifically allows any of the 128 possible
>7-bit characters to be sent, and does a trivial encoding to allow the
>"end of text" marker to appear in messages.  It requires that 7-bit USASCII
>be used, however, which makes things hard on people in Europe and Asia.

I would claim that 7-bit ASCII isn't binary... but I would believe
that the separation of address and data works, at least for this
case. Extending the encoding mechanism would of course be easy, but
non-standard.

I'm told that X.400 separates its address and data as well.
Also that ATTMAIL provides some sort of mailing of binaries, although I
haven't been able to find out the mechanism yet (and I work there!).

>In article <7242@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) writes:
>> > What happens if the string "\nFrom" appears in the binary?
>Actually, the mail transport system should not care about "From "s in things.

Aye, there's the rub. You're right that it shouldn't, but too many (i.e.
some) do. Bummer. Perhaps we should all convert. To something.
Speaking of X.400, doesn't that use some magic internal format which
means that you really need either conversion programs in and out or a
complete new subsystem, including editors and such? I ask this having
skimmed the X.400 specs some years ago, and having had a couple of
discussions about them. Constructive comments which display my
ignorance are welcome.

-- 
Jonathan Clark
[NAC,attmail]!mtune!jhc

My walk has become rather more silly lately.

jordan@ucbarpa.Berkeley.EDU (Jordan Hayes) (10/27/86)

So what if RFC822 is limited to 7-bit ASCII ... not everyone agrees
on anything else ... just write your own front end to your favorite
mailer ... if you're on a "normal" machine, try "man uuencode" to send
binary files and other >7bit data ...
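
For anyone without the man page handy, here is a small sketch of what
uuencode does to get 8-bit data through a 7-bit mailer.  It assumes Python's
standard binascii module (b2a_uu and a2b_uu, which handle at most 45 bytes
per line); the file name and mode in the "begin" line are placeholders:

```python
import binascii


def uuencode(data, name="file.bin", mode=0o644):
    """Wrap bytes in a uuencode-style 'begin' ... 'end' block."""
    lines = ["begin %o %s" % (mode, name)]
    for i in range(0, len(data), 45):            # 45 input bytes per line
        chunk = binascii.b2a_uu(data[i:i + 45])
        lines.append(chunk.decode("ascii").rstrip("\n"))
    # A zero-length line marks the end of the data.
    lines.append(binascii.b2a_uu(b"").decode("ascii").rstrip("\n"))
    lines.append("end")
    return "\n".join(lines) + "\n"


def uudecode(text):
    """Recover the original bytes from the block produced by uuencode()."""
    out = b""
    for line in text.split("\n")[1:]:            # skip the 'begin' line
        if line == "end":
            break
        out += binascii.a2b_uu(line)
    return out


data = bytes(range(256))                          # every possible 8-bit value
encoded = uuencode(data)
assert all(ord(c) < 128 for c in encoded)         # nothing but 7-bit ASCII
assert uudecode(encoded) == data
```

Every three bytes become four printable characters, so the mailed text is
about a third larger than the original file.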

/jordan

henry@utzoo.UUCP (Henry Spencer) (10/29/86)

> Actually, the mail transport system should not care about "From "s in
> things.  When it gets to the far end, IF it is being delivered to a
> mailbox that used "\nFrom " to delimit messages, then the far end has
> to worry about this...

I thought about this a little bit, and decided not to do it on utzoo.  Why?
Because it would mean that mail reaching our neighbors through us would
have at least two "From " lines on the front, where it now has one "From "
and one or more ">From ", and I am really not sure how their mailers would
react to this.  Please don't tell me how they *should* react; that is not
the issue.  The current situation is a botch and a blemish, but at least
people have pretty much learned to deal with it.  The various mailers find
it difficult to guess return addresses as it is, and often get it wrong;
I really question the wisdom of introducing a new source of variability
in header format, however noble the motives.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

kre@munnari.OZ (Robert Elz) (10/30/86)

In article <7263@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) quotes:

> > Actually, the mail transport system should not care about "From "s in
> > things.  When it gets to the far end, IF it is being delivered to a
> > mailbox that used "\nFrom " to delimit messages, then the far end has
> > to worry about this...

and then writes:

> I thought about this a little bit, and decided not to do it on utzoo.  Why?
> Because it would mean that mail reaching our neighbors through us would
> have at least two "From " lines on the front, where it now has one "From "
> and one or more ">From ", and I am really not sure how their mailers would
> react to this.

This seems like a separate issue.  The unix "From " line at the head of
a mail item, and any subsequent ones, are part of the envelope - you
should feel free to do whatever is necessary to those to maintain the
standard envelope format (eh?  What standard envelope format??).

The "From " lines that it would be better not to touch are those in
the body of the mail - ones that currently tend to have a '>' stuck
in just in case someone, somewhere, were to decide that this is
really two mail items in one file.

Those ones you could leave alone (says he on a sendmail machine
that just loves to insert \n's in the middle of long lines in
the body, etc.  I fixed that once, must get around to doing
it again...)

The slightly more difficult question is what to do with a mail item
that starts with "From " in the first line immediately after the
unix header line - is that part of the body, or is it another header?
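
To make that last question concrete, here is a naive splitter (the function
name and the sample message are made up) that takes every leading "From " or
">From " line as envelope.  It has no way to tell that the third line of the
sample was meant as text:

```python
def split_envelope(message):
    """Peel leading 'From ' / '>From ' lines off the front of a mail item."""
    lines = message.split("\n")
    n = 0
    while n < len(lines) and lines[n].startswith(("From ", ">From ")):
        n += 1
    return lines[:n], lines[n:]


sample = "\n".join([
    "From uucp Thu Oct 30",        # added by the last hop
    ">From henry Wed Oct 29",      # added by an earlier hop
    "From now on, read this.",     # first line of the body, as the author saw it
    "bye",
])
envelope, rest = split_envelope(sample)
assert len(envelope) == 3          # the body's first line was swallowed
assert rest == ["bye"]
```

Unless the body line is quoted with '>' or something explicit separates
envelope from body, the splitter can only guess.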

Robert Elz		munnari!kre