[comp.mail.misc] Content-Length:

argv%eureka@Sun.COM (Dan Heller) (07/08/89)

In article gregg@cbnewsc.ATT.COM (gregg.g.wonderly) writes:
> From article by les@chinet.chi.il.us (Leslie Mikesell):
> > AT&T's PMX mailer products include a new /bin/mail that uses
> > Content-Type: and Content-Length: headers to avoid the problem.

> ..  If the text following Content-length bytes is not "From",
> seek back to Content-length and do normal "look for the From line"
> processing.  See, great solution :-(  [...] going to do with a file that has
> been altered by an unknowing editor (like vi or emacs)?  Once again
> another format has been chosen that absolutely solves nothing!

It's obvious that we know nothing about the proposed scheme right now.
Les didn't give much information at all and I would like more -- a lot
more.  However, I am speculating that this scheme is not intended for
MUAs at all.  It is probably intended for MTAs that are compatible.
If the MTAs are not compatible, then it either doesn't send the mail
(probably the case for the 8-bit binary mode) or it tries some sort of
conversion from its current format to a well-known one.

For example, if I were to mail myself a folder using this new /bin/mail
program, then I'm sure it would count the number of bytes in the folder,
add the Content-Length: header and then send it off to the other site.
Note that the folder I sent has other messages in it that also have
Content-Length: headers.  But the -real- content-length header will be
the only one read.

Ok, so the message goes off to the remote site and it indeed has the
compatible MTA that understands content-length, so the message it gets
is deposited into the recipient's mbox.  Now, what does the MUA do with
it?  Does this also require an MUA that knows content-length?  And if
so, what does it do about error recovery?  Gregg points out that the
user could have edited his messages.  What then?

I think the content-length: and type are fine ideas as long as they
are restricted to the MTA and that the MTAs are compatible.  But
assuming that the MUA understands the MTA is a -big- mistake.  The MTA
should also prepare itself to convert its "message" into a format which
is RFC822 compliant in case it talks to an MTA that isn't compatible
with the originating MTA.

Les' original question to me was: can Mush handle that?  My response
is the question I posed above: What should it do if content-length:
doesn't represent the content length?  I need more "spec".

dan <island!argv@sun.com>
-----
My postings reflect my opinion only -- not the opinion of any company.

les@chinet.chi.il.us (Leslie Mikesell) (07/10/89)

In article <114277@sun.Eng.Sun.COM> island!argv@sun.com (Dan Heller) writes:
[Re: Content-Type: & Content-Length: headers used by AT&T PMX mailer ]

>It's obvious that we know nothing about the proposed scheme right now.
>Les didn't give much information at all and I would like more -- a lot
>more.

Perhaps a more knowledgeable source can supply the details.  I only know
about the headers by observation of the technique used by the PMX-Starmail
package where the MUA is a DOS program that talks over the starlan network
to a unix server where the messages are handled by a slightly modified
/bin/mail. The MUA has a built-in editor that allows you to enter normal
message text and you can attach files with the ability to specify the type
of file (text or binary) or you can let it decide for you.  Each portion
(message text and file attachment(s)) is preceded by a Content-Type:
and Content-Length: header, followed by one blank line that is not counted
in the Content-Length.  If there is more than one portion, an additional
Content-Type: Multipart and Content-Length: that covers the entire item
is added at the beginning.  There are some additional headers used for
file attachments to specify the original filename, an optional description
and perhaps some things that would be used by AT&T's Document Exchange
program that converts among various formats.  Some other headers can be
added according to the filename extension to identify the "type" of the
file so the MUA can start the proper program if you want to view a non-text
attachment.  This scheme suffers from the usual DOS memory limitations. 
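
A minimal Python sketch of the part layout described above.  It assumes
each part is a run of header lines, one blank line that is not counted,
and then exactly Content-Length bytes of body, with blank lines allowed
between parts.  The names read_line, parse_parts and walk are
hypothetical, and the real PMX format may have rules that are not
visible from observation alone.

```python
def read_line(data, pos):
    """Return (line without its terminator, position after the terminator)."""
    end = data.find(b"\n", pos)
    if end == -1:
        return data[pos:].rstrip(b"\r"), len(data)
    return data[pos:end].rstrip(b"\r"), end + 1


def parse_parts(data):
    """Split bytes into a list of (headers, body) pairs, one per part."""
    parts = []
    pos = 0
    while pos < len(data):
        line, nxt = read_line(data, pos)
        if not line.strip():
            pos = nxt                      # blank separator between parts
            continue
        headers = {}
        while line.strip():                # header lines up to the blank line
            name, _, value = line.partition(b":")
            key = name.strip().lower().decode("ascii")
            headers[key] = value.strip().decode("ascii")
            pos = nxt
            if pos >= len(data):
                break
            line, nxt = read_line(data, pos)
        pos = nxt                          # skip the uncounted blank line
        length = int(headers["content-length"])
        body = data[pos:pos + length]
        if len(body) != length:
            raise ValueError("body is shorter than Content-Length")
        parts.append((headers, body))
        pos += length
    return parts


def walk(data):
    """Yield leaf parts, descending into any part typed 'multipart'."""
    for headers, body in parse_parts(data):
        if headers.get("content-type", "").lower() == "multipart":
            yield from walk(body)
        else:
            yield headers, body
```

Because each body is sliced by byte count rather than scanned, a binary
part can contain anything, including lines that begin with "From".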

>However, I am speculating that this scheme is not intended for
>MUAs at all.  It is probably intended for MTAs that are compatible.
>If the MTAs are not compatible, then it either doesn't send the mail
>(probably the case for the 8-bit binary mode) or it tries some sort of
>conversion from its current format to a well-known one.

Actually both, since the MUA uses the headers to do the correct thing
when the user wants to detach or view a binary attachment.  The MTA
only uses them to decide if the next hop is binary-compatible.  It does
have to go to some lengths to insert the headers into messages where
the MUA did not.

>Ok, so the message goes off to the remote site and it indeed has the
>compatible MTA that understands content-length, so the message it gets
>is deposited into the recipient's mbox.  Now, what does the MUA do with
>it?  Does this also require an MUA that knows content-length?  And if
>so, what does it do about error recovery?  Gregg points out that the
>user could have edited his messages.  What then?

If the message is handed to a /bin/(r)mail that doesn't know about
Content-Length: it will have any From_ lines munged into >From_
as usual so the MUA face of /bin/mail won't be confused (but you
also don't get the exact contents that were sent).
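
The munging mentioned above can be sketched in a few lines of Python.
This assumes the common convention of prefixing ">" to any body line
that starts with "From " (the From_ pattern); quote_from_lines is a
hypothetical name, not a function of any /bin/mail.

```python
def quote_from_lines(body):
    """Prefix '>' to every body line that begins with 'From '."""
    out = []
    for line in body.splitlines(keepends=True):
        if line.startswith("From "):
            line = ">" + line
        out.append(line)
    return "".join(out)


original = "Hi Dan,\nFrom here on it gets tricky.\nFromage is unaffected.\n"
quoted = quote_from_lines(original)
print(quoted, end="")
print(len(quoted) - len(original))   # one extra byte per quoted line
```

Each quoted line grows by one byte, so a byte count taken before the
quoting no longer matches the stored text afterward.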

>I think the content-length: and type are fine ideas as long as they
>are restricted to the MTA and that the MTAs are compatible.  But
>assuming that the MUA understands the MTA is a -big- mistake.  The MTA
>should also prepare itself to convert its "message" into a format which
>is RFC822 compliant in case it talks to an MTA that isn't compatible
>with the originating MTA.

I think messages with binary attachments are bounced if you attempt
to send to a site that doesn't understand them (there is a table of
sites known to handle them).  I see no reason why they couldn't be
automatically encoded with a human-readable header showing the format.
You can specify to the PMX MUA that all attachments should be encoded
using btoa, but this increases the size by about a third. Other than
binary attachments, the headers shouldn't affect RFC822 compliance.

>Les' original question to me was: can Mush handle that?  My response
>is the question I posed above: What should it do if content-length:
>doesn't represent the content length?  I need more "spec".

In that case there has obviously been an error in storing your mailbox
file, so the proper thing to do is to generate an error message and
try to recover by finding a likely-looking From_ line.  The most likely
problem would be a line starting with "From" that an MTA changes to
">From" after you calculate the length.  If you are interested in pursuing
this, I can generate some sample headers (or perhaps someone who knows the
"spec" can comment).

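One way an MUA might apply this recovery rule, as a Python sketch rather
than anything taken from the PMX "spec".  It assumes LF-terminated
mailbox headers, a count that starts after the blank line, and a trusted
length only when the byte after the body is end-of-file or a From_ line.
The names split_mbox and next_message_start are hypothetical.

```python
import re
import sys

LENGTH = re.compile(rb"^Content-Length:[ \t]*(\d+)", re.I | re.M)


def next_message_start(data, stop):
    """Return where the next message starts if `stop` is a plausible
    end of body, else None."""
    if stop > len(data):
        return None
    nxt = stop
    while data.startswith(b"\n", nxt):     # allow blank separator lines
        nxt += 1
    at_line_start = nxt > stop or stop == 0 or data[stop - 1:stop] == b"\n"
    if nxt == len(data) or (at_line_start and data.startswith(b"From ", nxt)):
        return nxt
    return None


def split_mbox(data):
    """Yield each message (From_ line included) from mailbox bytes."""
    pos = 0
    while pos < len(data):
        hdr_end = data.find(b"\n\n", pos)
        if hdr_end == -1:                  # headers only: take the rest
            yield data[pos:]
            return
        body_start = hdr_end + 2
        stop = nxt = None
        m = LENGTH.search(data[pos:body_start])
        if m:
            stop = body_start + int(m.group(1))
            nxt = next_message_start(data, stop)
            if nxt is None:
                print("Content-Length mismatch; rescanning for From_",
                      file=sys.stderr)
        if nxt is None:                    # fall back to the From_ scan
            hit = data.find(b"\nFrom ", hdr_end + 1)
            stop = nxt = len(data) if hit == -1 else hit + 1
        yield data[pos:stop]
        pos = nxt
```

A body that itself contains From_ lines survives intact when the count
is right; when the count is wrong, the sketch reports it and splits at
the next line that looks like a From_ line.
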
Les Mikesell

les@chinet.chi.il.us (Leslie Mikesell) (07/14/89)

In article <114277@sun.Eng.Sun.COM> island!argv@sun.com (Dan Heller) writes:

>It's obvious that we know nothing about the proposed scheme right now.
>Les didn't give much information at all and I would like more -- a lot
>more.

Here's a sample of the PMX-mailer headers.  This was sent from a PC
using PMX-starmail from afbf02!les2 to afbf01!les.  The mail appears
to be from the unix host and remains in the normal unix mailbox until
the PC requests it.  The headers up to one line below the first
Content-Length: do not have the PC-style carriage returns at the ends of
the lines.  The rest of the message did and they are included in the
length counts (I deleted them). Going the other direction (unix mail to
PC), CR's are added during the transfer and the length values are adjusted
to match.
(A) It seems a little strange to me to have an End-of-Header: and
    End-of-Protocol: header followed by more headers.
(B) Inserting a Status: header before the blank line will mess up the
    count unless you note that the count starts *after* the blank line.
(C) I could not decode the contents with the version of atob that I had
    on a 3B2 although the format looks right after the CR's are taken out.
    Are there incompatible versions of atob?
|------------sample----------
|From uucp Tue Jul 11 15:22 CDT 1989
|>From les2 Tue Jul 11 15:28 CDT 1989 remote from afbf02
|Message-Version: 2
|>To: afbf01!les 
|From: les2
|Date: Tue Jul 11 14:34:50 CDT 1989
|UA-Content-ID: <PMX-LAN-2.01-910404-afbf02-les2-1205>
|End-of-Header:
|Email-Version: 2
|UA-Message-ID: <PMXSTAR-S2.0-les2-3>
|Subject: test from pmx-starmail
|To: afbf01!les 
|End-of-Protocol: 
|Content-Type: multipart
|Content-Length: 14871        
|
|Content-Type: Text
|Content-Length: 56        
|
|This is text in the body of the message.
|text
|text
|
|
|Content-Type: Binary
|Content-Name: E:\DC\PKXARC.COM
|Content-Abstract: this was entered as a description
|Encoding-Algorithm: btoa
|Content-Length: 14605
|
|xbtoa Begin
|pQAVU!U9Z<!NkQma5+XAeS*<Q(>?Dgo@<bGeRlo%^5E,k]mV?YIK1hGmF`cE 1C'E0UfaT!L*#o.&k
|<o!oEt0!T+%4!T+=D!T*k*zzzzzzzzzz   S1  !It-MgXg4rm8e9JAm]6RGSY*<[hi$03>[zzz7S%
|    [ 179 lines of 78 chars + CR/LF deleted ]
|V\Db+;$*DU$#Bg3_,F$F\*Bh<Ks /AS^E((m0E^jinzzzzzzzz   #$"53`bz
|xbtoa End N 11482 2cda E 82 S 13fdec R 9e78bdcc
|
-----------------
Without encoding, binary attachments would omit the Encoding-Algorithm: line
and include the binary image after the blank line following the
Content-Length: header (useful only if your MTA can handle it).  Text
file attachments are similar but with Content-Type: Text.
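
The "xbtoa End" line in the sample can be checked mechanically.  The
Python sketch below assumes the trailer layout used by common btoa
versions: N is the original byte count in decimal and then in hex, and
E, S and R are hex check values.  The keys "xor", "sum" and "rot" are
only hypothetical labels for those three values, and
parse_btoa_trailer is a hypothetical name.

```python
import re

TRAILER = re.compile(
    r"^xbtoa End N (\d+) ([0-9a-f]+) E ([0-9a-f]+) "
    r"S ([0-9a-f]+) R ([0-9a-f]+)$")


def parse_btoa_trailer(line):
    """Parse an 'xbtoa End' line and cross-check its two length fields."""
    m = TRAILER.match(line.strip())
    if m is None:
        raise ValueError("not an xbtoa End line")
    n_dec = int(m.group(1))
    n_hex, e, s, r = (int(g, 16) for g in m.groups()[1:])
    if n_dec != n_hex:
        raise ValueError("decimal and hex lengths disagree")
    return {"length": n_dec, "xor": e, "sum": s, "rot": r}


info = parse_btoa_trailer("xbtoa End N 11482 2cda E 82 S 13fdec R 9e78bdcc")
print(info["length"])                    # 11482
print(round(14605 / info["length"], 2))  # 1.27
```

For this sample the two length fields agree (hex 2cda is 11482), and the
encoded part, counted with its CRs, is about 1.27 times the size of the
original file.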

Les Mikesell