[comp.mail.misc] Who comments out the From_ lines?

dce@smsc.sony.com (David Elliott) (10/21/90)

When I was using a 4.3BSD machine, I found that mail messages
containing lines beginning with the word "From " would be
changed into ">From " by the mail system (it looks like it was
rmail).

Now that I've switched to SVR4, this no longer happens.  This
means that programs like MH that expect the "From " to be
a message separator get broken.  On the other hand, mailx and
the other SVR4 mailers don't get confused by this.

Is the "commenting" of these lines a requirement, or just a
nicety?  I don't mind changing MH and other tools, but I want
to know that our SVR4 isn't just broken.
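
To make the failure mode concrete, here is a minimal Python sketch of the two behaviours discussed above: the BSD-style ">From " quoting at delivery time, and a reader that naively treats every line beginning with "From " as a message separator. The function names `quote_from_lines` and `split_mbox` are hypothetical illustrations; they are not taken from /bin/mail, rmail, or MH.

```python
def quote_from_lines(body):
    """Prefix '>' to every body line that begins with 'From '."""
    out = []
    for line in body.split("\n"):
        if line.startswith("From "):
            line = ">" + line
        out.append(line)
    return "\n".join(out)


def split_mbox(text):
    """Naively start a new message at every line beginning with 'From '."""
    messages = []
    current = []
    for line in text.splitlines(keepends=True):
        if line.startswith("From ") and current:
            messages.append("".join(current))
            current = []
        current.append(line)
    if current:
        messages.append("".join(current))
    return messages


envelope = "From dce Sun Oct 21 16:09:18 1990\n"
body = "Hi\nFrom now on, be careful.\n"

# Unquoted, the body line is mistaken for a second message.
unquoted_count = len(split_mbox(envelope + "Subject: x\n\n" + body))
# Quoted, the mailbox still holds a single message.
quoted_count = len(split_mbox(envelope + "Subject: x\n\n" + quote_from_lines(body)))
```

A reader that splits this way depends on the delivery agent having quoted the body, which is the dependency in question.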

rickert@mp.cs.niu.edu (Neil Rickert) (10/22/90)

In article <1990Oct21.160918.9684@smsc.sony.com> dce@smsc.sony.com (David Elliott) writes:
>When I was using a 4.3BSD machine, I found that mail messages
>containing lines beginning with the word "From " would be
>changed into ">From " by the mail system (it looks like it was
>rmail).
>
  This is done by /bin/mail during the final delivery of the mail
to your mailbox.  It is done quite crudely, so that valid text lines
in the message are also changed at times.

-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115.                                  +1-815-753-6940

dce@smsc.sony.com (David Elliott) (10/22/90)

In article <1990Oct21.215709.14022@mp.cs.niu.edu> rickert@mp.cs.niu.edu (Neil Rickert) writes:
>In article <1990Oct21.160918.9684@smsc.sony.com> dce@smsc.sony.com (David Elliott) writes:
>>When I was using a 4.3BSD machine, I found that mail messages
>>containing lines beginning with the word "From " would be
>>changed into ">From " by the mail system (it looks like it was
>>rmail).
>>
>  This is done by /bin/mail during the final delivery of the mail
>to your mailbox.  It is done quite crudely, so that valid text lines
>in the message are also changed at times.

I think that's what I said (rmail is often a link to /bin/mail).

My question is not who does it, but whether or not it should
be done.  There are programs that expect all lines beginning with
"From " to have been generated by the mail system.

From what I've heard from other folks, it's not a standard, but
something done by BSD so that programmers can be lazy.  Looks
like I'll be digging into MH tomorrow.

david@twg.com (David S. Herron) (10/23/90)

In article <1990Oct21.160918.9684@smsc.sony.com> dce@smsc.sony.com (David Elliott) writes:
>When I was using a 4.3BSD machine, I found that mail messages
>containing lines beginning with the word "From " would be
>changed into ">From " by the mail system (it looks like it was
>rmail).

From_ to >From_ conversions are a sendmail-ism, so therefore I
don't have anything good to say about them ;-).  MMDF, of course,
does it better by using a non-printable string as the message
separator.  You're guaranteed that "^A^A^A^A" doesn't appear
in messages because RFC-822 doesn't allow for it..

>Now that I've switched to SVR4, this no longer happens.  This
>means that programs like MH that expect the "From " to be
>a message separator get broken.  On the other hand, mailx and
>the other SVR4 mailers don't get confused by this.

The System V stuff looks for the pattern:

/^From user <ctime> remote from <blah>$/

Instead of just "From ".  Given that they're doing something stupid
by separating messages with things which could naturally appear in
the text, this is a pretty good compromise.
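
A rough Python sketch of the stricter test, for illustration only.  The regular expression is an assumed rendering of the pattern quoted above (sender, ctime-style date, "remote from" suffix); the check the System V mailers really perform is not shown in this thread, and treating the "remote from" part as optional is a guess.  `SYSV_FROM` and `is_separator` are hypothetical names.

```python
import re

# Assumed shape: "From <user> <ctime date> [remote from <host>]".
SYSV_FROM = re.compile(
    r"^From \S+ "                             # envelope sender
    r"[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d "   # weekday, month, day of month
    r"\d{2}:\d{2}:\d{2} \d{4}"                # hh:mm:ss and year
    r"(?: remote from \S+)?$"                 # UUCP suffix (assumed optional)
)


def is_separator(line):
    """Return True if the line looks like a full envelope 'From ' line."""
    return SYSV_FROM.match(line) is not None
```

Under these assumptions an ordinary sentence that merely begins with "From " is rejected, while a complete envelope line is still accepted wherever it occurs.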
-- 
<- David Herron, an MMDF & WIN/MHS guy, <david@twg.com>
<- Formerly: David Herron -- NonResident E-Mail Hack <david@ms.uky.edu>
<-
<- Remember:  On System V it's "tar xovf", not "tar xvf"!

les@chinet.chi.il.us (Leslie Mikesell) (10/23/90)

In article <1990Oct22.025301.12573@smsc.sony.com> dce@smsc.sony.com (David Elliott) writes:

>My question is not who does it, but whether or not it should
>be done.  There are programs that expect all lines beginning with
>"From " to have been generated by the mail system.

>From what I've heard from other folks, it's not a standard, but
>something done by BSD so that programmers can be lazy.  Looks
>like I'll be digging into MH tomorrow.

Does the SysVr4 mailer add Content-Type: and Content-Length: headers to
the messages?
This is probably the "best" way to deal with multi-part messages as well
as multiple individual messages in a single file, but it would be nice
if someone would document it.

I have the "AT&T Unix System V Release 4.0 Migration Guide for System V
Developers" in front of me.  Under mail(1)  it says "The format of the
mailbox file has changed, and mail is changed to read the new format".
Thanks, guys.  Hope that wasn't proprietary information I just
released....

Les Mikesell
  les@chinet.chi.il.us

bygg@sunic.sunet.se (Johnny Eriksson) (10/23/90)

In article <8134@gollum.twg.com> david@twg.com (David S. Herron) writes:

) From_ to >From_ conversions are a sendmail-ism, so therefore I
) don't have anything good to say about them ;-).  MMDF, of course,
) does it better by using a non-printable string as the message
) separator.  You're guaranteed that "^A^A^A^A" doesn't appear
) in messages because RFC-822 doesn't allow for it..

Wrong.  RFC822 allows for all ASCII characters (0.-127. inclusive) to
appear in message bodies.  The same goes for the text on the Subject
line.  "Subject: ^A^A^A^A" is perfectly legal...

--Johnny

dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) (10/23/90)

One day, when I invent a mail handler, it will use ^A-stuffing to solve
the problem.  The message separator will be <cr>^A^A^A^A<cr>.  If the
string ^A^A^A^A appears anywhere else, it will get ^A-stuffed into
^A^A^A^A^A, and get de-stuffed by the user agent.
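
A minimal Python sketch of that stuffing idea.  One detail is an assumption added here so that the scheme is reversible: every run of four or more ^A characters grows by one on the way in, and every run of five or more shrinks by one on the way out.  The names `stuff` and `destuff` are hypothetical.

```python
import re

SOH = b"\x01"            # the ^A character
SEPARATOR = SOH * 4      # reserved for the line between messages


def stuff(body):
    """Lengthen every run of 4+ ^A by one so no run of exactly 4 remains."""
    return re.sub(rb"\x01{4,}", lambda m: m.group(0) + SOH, body)


def destuff(body):
    """Undo stuff(): shorten every run of 5+ ^A by one."""
    return re.sub(rb"\x01{5,}", lambda m: m.group(0)[1:], body)
```

With runs handled this way, a stuffed body never contains a run of exactly four ^A, so a line holding exactly four can only be a separator written by the mail handler.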
--
Rahul Dhesi <dhesi%cirrusl@oliveb.ATC.olivetti.com>
UUCP:  oliveb!cirrusl!dhesi

les@chinet.chi.il.us (Leslie Mikesell) (10/23/90)

In article <8134@gollum.twg.com> david@twg.com (David S. Herron) writes:

>The System V stuff looks for the pattern:
>/^From user <ctime> remote from <blah>$/

>Instead of just "From ".  Given that they're doing something stupid
>by separating messages with things which could naturally appear in
>the text, this is a pretty good compromise.

On the contrary, this pattern is going to appear in the body if you
mail a saved mail message, a mailbox file, or if you forward a message
from a mailer like ELM that sends the whole thing verbatim with new
headers above it. 

Les Mikesell
  les@chinet.chi.il.us

les@chinet.chi.il.us (Leslie Mikesell) (10/23/90)

In article <2597@cirrusl.UUCP> dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:

>One day, when I invent a mail handler, it will use ^A-stuffing to solve
>the problem.  The message separator will be <cr>^A^A^A^A<cr>.  If the
>string ^A^A^A^A appears anywhere else, it will get ^A-stuffed into
>^A^A^A^A^A, and get de-stuffed by the user agent.

You already invented a good mail handler.  It's called zoo - it's just
too bad that no one uses it for that.  And like zoo, the correct way
to delimit items in a mailbox is to store the size of each item so you
don't have to slog through looking at every character to find the
separators.  Besides, a good mailer should be able to handle any body
contents that the transport can support.  Using Content-Type: and
Content-Length: headers solves the problem and doesn't cause too much
damage when handed to mailers that don't understand them (i.e. they
don't cause any new problems for other mailers).

Picking an "unusual" character combination for a delimiter is sort of
like saying that the faster cars go through an intersection, the less
likely it is that there will be a collision.

Les Mikesell
   les@chinet.chi.il.us

david@twg.com (David S. Herron) (10/24/90)

In article <2222@sunic.sunet.se> bygg@sunic.sunet.se (Johnny Eriksson) writes:
>Wrong.  RFC822 allows for all ASCII characters (0.-127. inclusive) to
>appear in message bodies.  The same goes for the text on the Subject
>line.  "Subject: ^A^A^A^A" is perfectly legal...
>
>--Johnny

Oops..  I know why I said that:  It generally isn't "safe" to use non-printable
characters because

The end-of-line marker can vary from system to system
The interpretation of TAB can vary

etc..  All of which are open to interpretation during the SMTP
conversation.  In fact, the end-of-line marker used during SMTP is
supposed to be CRLF for *all* messages, not CR, not LF .. CRLF.
(See RFC-822 section 3.3 for the EBNF of a message ..)

You gotta admit though.  Even if ^A^A^A^A is valid for a message
it's an awfully unlikely string as opposed to "From" occurring at
the beginning of the line..  I still prefer mh's way of keeping
messages in separate files.




david@twg.com (David S. Herron) (10/24/90)

In article <1990Oct23.142320.20180@chinet.chi.il.us> les@chinet.chi.il.us (Leslie Mikesell) writes:
>In article <8134@gollum.twg.com> david@twg.com (David S. Herron) writes:
>
>>The System V stuff looks for the pattern:
>>/^From user <ctime> remote from <blah>$/
>
>>Instead of just "From ".  Given that they're doing something stupid
>>by separating messages with things which could naturally appear in
						 ^^^^^^^^^^^^^^^^
>>the text, this is a pretty good compromise.
		      ^^^^^^^^^^^^^^^^^^^^^^
>On the contrary, this pattern is going to appear in the body if you
>mail a saved mail message, a mailbox file, or if you forward a message
>from a mailer like ELM that sends the whole thing verbatim with new
>headers above it. 

Like all "pretty good" compromises it fails in (slightly) obscure situations.

The file system provides very good separation between messages and one
also gains the ability to have a message in multiple folders.  Anything
else, short of defining some encapsulation format for storing messages
into mailboxes, is a compromise.



prc@erbe.se (Robert Claeson) (10/25/90)

In article <2597@cirrusl.UUCP>, dhesi%cirrusl@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:

|> One day, when I invent a mail handler, it will use ^A-stuffing to solve
|> the problem.  The message separator will be <cr>^A^A^A^A<cr>.  If the
|> string ^A^A^A^A appears anywhere else, it will get ^A-stuffed into
|> ^A^A^A^A^A, and get de-stuffed by the user agent.

And in my mail handler, I think that I will keep a count of the number of characters
in a message before each message in the mailbox. That way, a message can contain
virtually anything and besides, reading it into the UA would be faster since I don't
have to scan each line for a message separator.
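
As a rough illustration of the count-prefixed layout, here is a small Python sketch.  The on-disk format (a decimal byte count on its own line, followed by exactly that many bytes) is an assumption made for the example, as are the names `write_message` and `read_messages`; the sketch counts bytes rather than characters.

```python
import io


def write_message(box, message):
    """Append one message, preceded by its length in bytes on its own line."""
    box.write(b"%d\n" % len(message))
    box.write(message)


def read_messages(box):
    """Yield each stored message without scanning its contents."""
    while True:
        header = box.readline()
        if not header:
            return
        size = int(header.decode("ascii"))
        yield box.read(size)


# Bodies may contain "From " lines or ^A runs without any quoting.
box = io.BytesIO()
stored = [b"From here on\n\x01\x01\x01\x01\n", b"", b"plain\n"]
for msg in stored:
    write_message(box, msg)
box.seek(0)
recovered = list(read_messages(box))
```

The reader jumps from count to count, so the body is never inspected for separators.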
-- 
Robert Claeson                  |Reasonable mailers: rclaeson@erbe.se
ERBE DATA AB                    |      Dumb mailers: rclaeson%erbe.se@sunet.se
                                |  Perverse mailers: rclaeson%erbe.se@encore.com
These opinions reflect my personal views and not those of my employer.