garyo@THINK.COM (Gary Oberbrunner) (09/07/89)
When wrapping long headers with repl(1), it seems to use the length of the field name as the indent width rather than a fixed number of spaces or a TAB. I believe (although I don't have the RFC822 spec here) that indented header lines are always supposed to be indented with a TAB, or else they get treated as the beginning of the message.

Repl's indenting breaks my reply mail to people with long return (or From) addresses, such as the following message (enclosed within lines of ==='s):

============================================================================
Date: Wed, 6 Sep 89 13:48:25 EDT
From: Bob Doolittle ({gatech,uunet,petsd}!masscomp!rad) <rad@westford.ccur.com>
To: garyo
Subject: this is a test
------------------
This is a test message to see how reply formatting works.
============================================================================

Repl turns this message into an outgoing header like this:

============================================================================
To: Bob Doolittle ({gatech,uunet,
    petsd}!masscomp!rad) <rad@westford.ccur.com>
Fcc: ccs
Subject: Re: this is a test
In-reply-to: Your message of Wed, 06 Sep 89 13:48:25 -0400.
--------
============================================================================

From the source code, it looks like this behavior is hardwired into fmtscan() (in uip/sbr/formatsbr.c). And fmtscan() is always called from Replout() (in uip/replsbr.c), regardless of the -format switch. So I don't know what I can do, short of an awk script that parses the headers and turns any leading spaces into a TAB.

Any suggestions? Perhaps (I don't know enough about this to really tell) this is a sendmail config-file issue instead? Any help would be appreciated.

Here's my .mh_profile, for completeness:

============================================================================
Path: Mail
Editor: vi
Signature: Gary Oberbrunner
Alternate-Mailboxes: *garyo*,staff@*,*!staff
Send: -verbose -alias aliases
showproc: mhl
Repl: -nocc me -fcc ccs -annotate
Msg-Protect: 0600
Folder-Protect: 0744
Draft-Folder: /u8/garyo/Mail/drafts
Sequence-Negation: ^
whom: -alias aliases
ali: -alias aliases
Unseen-Sequence: unseen
============================================================================

Thanks,
Gary Oberbrunner garyo@think.com {ames,harvard}!think!garyo
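
The header-rewriting workaround mentioned in the message above (replacing the leading spaces of continuation lines with a TAB) could be sketched roughly as follows. This is written in Python rather than awk, and the function name tab_indent_continuations is hypothetical. It assumes the header block ends at the first blank line or at a line made up only of dashes, as in the draft shown above. It illustrates the proposed filter only; it is not taken from MH.

```python
def tab_indent_continuations(draft):
    """Replace the leading white space of header continuation lines with one TAB.

    Assumption: the header block ends at the first empty line or at a line
    consisting only of dashes (the separator seen in the draft above).
    """
    out = []
    in_headers = True
    for line in draft.split("\n"):
        if in_headers:
            if line == "" or set(line) == {"-"}:
                # End of the header block: leave everything after it untouched.
                in_headers = False
            elif line[0] in " \t":
                # Continuation line: normalise its indent to a single TAB.
                line = "\t" + line.lstrip(" \t")
        out.append(line)
    return "\n".join(out)
```

Lines after the separator are passed through unchanged, so indented text in the message body is not affected.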
karlton@fudge.sgi.com (Phil Karlton) (09/07/89)
In article <8909062110.AA00932@prometheus.think.com> garyo@THINK.COM (Gary Oberbrunner) writes:
>When wrapping long headers with repl(1), it seems to use the length of the
>field name as the indent width rather than a fixed number of spaces or a
>TAB. I believe (although I don't have the RFC822 spec here) that indented
>header lines are always supposed to be indented with a TAB, or else they
>get treated as the beginning of the message.

From RFC822, August 13, 1982, page 5:

    ... can be split into a multiple line representation; this is called "folding". The general rule is that wherever there may be linear-white-space (NOT simple LWSP-chars), a CRLF immediately followed by AT LEAST one LWSP-char may instead be inserted.

In other words, it doesn't have to be indented with a TAB.

PK
--
Phil Karlton karlton@sgi.com
Silicon Graphics Computer Systems 415-964-1459, ext. 3018
2011 N. Shoreline Blvd.
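
The folding rule quoted above can be illustrated with a small Python sketch: a line break immediately followed by a SPACE or HTAB is treated as a fold, so a field continued with spaces and one continued with a TAB unfold the same way. The function name split_fields is hypothetical, and the input is assumed to be a header block only (no message body).

```python
import re

def split_fields(header_block):
    """Return (field-name, field-body) pairs from a block of folded headers."""
    # A line break immediately followed by SPACE or HTAB is a fold:
    # drop the break and keep the white space that follows it.
    unfolded = re.sub(r"\r?\n(?=[ \t])", "", header_block)
    fields = []
    for line in re.split(r"\r?\n", unfolded):
        if not line:
            continue  # skip the empty piece left by a trailing line break
        name, _, body = line.partition(":")
        fields.append((name, body.strip()))
    return fields
```

Called on a To: field folded with four spaces or with a TAB, both inputs yield a single To: entry; only the white space kept inside the body differs.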
mdb@ESD.3Com.COM (Mark D. Baushke) (09/07/89)
On 6 Sep 89 21:10:38 GMT, garyo@THINK.COM (Gary Oberbrunner) said:

Gary> When wrapping long headers with repl(1), it seems to use the
Gary> length of the field name as the indent width rather than a fixed
Gary> number of spaces or a TAB. I believe (although I don't have the
Gary> RFC822 spec here) that indented header lines are always supposed
Gary> to be indented with a TAB, or else they get treated as the
Gary> beginning of the message.

I think your problem can be viewed as asking the following questions:

Q1) Is it legal to continue a field with either a SPACE or a TAB [HTAB in RFC822 language]?
A1) The answer is yes. The continuation should be an LWSP-char.

Q2) Is it legal to split a parenthetical comment across continuation lines?
A2) Yes. A parenthetical comment may contain linear-white-space.

Q3) If repl is generating a legal address, is this a bug in the MH address parsing code?
A3) In my opinion, yes, you have found a bug in MH.

RFC 822, pages 10, 12-13, 15

    3.3. LEXICAL TOKENS

                                                ; (  Octal, Decimal.)
    CHAR        =  <any ASCII character>        ; (  0-177,  0.-127.)
    CR          =  <ASCII CR, carriage return>  ; (     15,      13.)
    LF          =  <ASCII LF, linefeed>         ; (     12,      10.)
    CRLF        =  CR LF
    SPACE       =  <ASCII SP, space>            ; (     40,      32.)
    HTAB        =  <ASCII HT, horizontal-tab>   ; (     11,       9.)
    LWSP-char   =  SPACE / HTAB                 ; semantics = SPACE
    linear-white-space =  1*([CRLF] LWSP-char)  ; semantics = SPACE
    comment     =  "(" *(ctext / quoted-pair / comment) ")"
    ctext       =  <any CHAR excluding "(",     ; => may be folded
                    ")", "\" & CR, & including
                    linear-white-space>
    quoted-pair =  "\" CHAR                     ; may quote any char

    [...]

    3.4.2. WHITE SPACE

    Note: In structured field bodies, multiple linear space ASCII characters (namely HTABs and SPACEs) are treated as single spaces and may freely surround any symbol. In all header fields, the only place in which at least one LWSP-char is REQUIRED is at the beginning of continuation lines in a folded field.

    When passing text to processes that do not interpret text according to this standard (e.g., mail protocol servers), then NO linear-white-space characters should occur between a period (".") or at-sign ("@") and a <word>. Exactly ONE SPACE should be used in place of arbitrary linear-white-space and comment sequences.

    Note: Within systems conforming to this standard, wherever a member of the list of delimiters is allowed, LWSP-chars may also occur before and/or after it.

    Writers of mail-sending (i.e., header-generating) programs should realize that there is no network-wide definition of the effect of ASCII HT (horizontal-tab) characters on the appearance of text at another network host; therefore, the use of tabs in message headers, though permitted, is discouraged.

    3.4.3. COMMENTS

    A comment is a set of ASCII characters, which is enclosed in matching parentheses and which is not within a quoted-string The comment construct permits message originators to add text which will be useful for human readers, but which will be ignored by the formal semantics. Comments should be retained while the message is subject to interpretation according to this standard. However, comments must NOT be included in other cases, such as during protocol exchanges with mail servers.

    Comments nest, so that if an unquoted left parenthesis occurs in a comment string, there must also be a matching right parenthesis. When a comment acts as the delimiter between a sequence of two lexical symbols, such as two atoms, it is lexically equivalent with a single SPACE, for the purposes of regenerating the sequence, such as when passing the sequence onto a mail protocol server. Comments are detected as such only within field-bodies of structured fields.

    If a comment is to be "folded" onto multiple lines, then the syntax for folding must be adhered to. (See the "Lexical Analysis of Messages" section on "Folding Long Header Fields" above, and the section on "Case Independence" below.) Note that the official semantics therefore do not "see" any unquoted CRLFs that are in comments, although particular parsing programs may wish to note their presence. For these programs, it would be reasonable to interpret a "CRLF LWSP-char" as being a CRLF that is part of the comment; i.e., the CRLF is kept and the LWSP-char is discarded. Quoted CRLFs (i.e., a backslash followed by a CR followed by a LF) still must be followed by at least one LWSP-char.

    [...]

    3.4.8. FOLDING LONG HEADER FIELDS

    Each header field may be represented on exactly one line consisting of the name of the field and its body, and terminated by a CRLF; this is what the parser sees. For readability, the field-body portion of long header fields may be "folded" onto multiple lines of the actual field. "Long" is commonly interpreted to mean greater than 65 or 72 characters. The former length serves as a limit, when the message is to be viewed on most simple terminals which use simple display software; however, the limit is not imposed by this standard.

    Note: Some display software often can selectively fold lines, to suit the display terminal. In such cases, sender-provided folding can interfere with the display software.

RFC 822, page 40

    B.2. SEMANTICS

    Headers occur before the message body and are terminated by a null line (i.e., two contiguous CRLFs).

    A line which continues a header field begins with a SPACE or HTAB character, while a line beginning a field starts with a printable character which is not a colon.

    A field-name consists of one or more printable characters (excluding colon, space, and control-characters). A field-name MUST be contained on one line. Upper and lower case are not distinguished when comparing field-names.
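
The B.2 rules quoted above can be turned into a small line classifier. The Python sketch below is illustrative only: the function name classify_header_lines and the label strings are hypothetical, and it checks only the first character of each line as B.2 describes.

```python
def classify_header_lines(text):
    """Label each header line per the B.2 rules; stop at the null line."""
    labels = []
    for line in text.splitlines():
        if line == "":
            break  # null line: the headers end here and the body follows
        if line[0] in " \t":
            # Begins with SPACE or HTAB: continues the previous field.
            labels.append("continuation")
        elif line[0] == ":":
            # A line beginning a field may not start with a colon.
            labels.append("invalid")
        else:
            labels.append("field")
    return labels
```

Under these rules a continuation indented with four spaces and one indented with a TAB get the same label, which matches the answer to Q1.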
Gary> Repl's indenting breaks my reply mail to people with long return
Gary> (or From) addresses, such as the following message (enclosed
Gary> within lines of ==='s):

Gary> =========================================================================
Gary> Date: Wed, 6 Sep 89 13:48:25 EDT
Gary> From: Bob Doolittle ({gatech,uunet,petsd}!masscomp!rad) <rad@westford.ccur.com>
Gary> To: garyo
Gary> Subject: this is a test
Gary> ------------------
Gary> This is a test message to see how reply formatting works.
Gary> =========================================================================

Gary> Repl turns this message into an outgoing header like this:

Gary> =========================================================================
Gary> To: Bob Doolittle ({gatech,uunet,
Gary>     petsd}!masscomp!rad) <rad@westford.ccur.com>
Gary> Fcc: ccs
Gary> Subject: Re: this is a test
Gary> In-reply-to: Your message of Wed, 06 Sep 89 13:48:25 -0400.
Gary> --------
Gary> =========================================================================

As near as I can tell, this is a legal continuation per strict RFC 822. I just tested sendmail directly, and I do not seem to have any problem. Of course, post(8) is unable to find any addressees. It looks to me like you found a bug in MH (I am running 6.6).

Gary> From the source code, it looks like this behavior is hardwired
Gary> into fmtscan() (in uip/sbr/formatsbr.c). And fmtscan() is
Gary> always called from Replout() (in uip/replsbr.c), regardless of
Gary> the -format switch. So I don't know what I can do, short of an
Gary> awk script that parses the headers and turns any leading spaces
Gary> into a TAB. Any suggestions? Perhaps (I don't know enough
Gary> about this to really tell) this is a sendmail config-file issue
Gary> instead? Any help would be appreciated.

Well, you could look into fixing uip/post.c ...

--
Mark D. Baushke
Internet: mdb@ESD.3Com.COM
UUCP: {3comvax,auspex,sun}!bridge2!mdb
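
As a rough check of the claim that the folded To: field is legal, the Python sketch below unfolds the field, replaces each (possibly nested) comment with a single SPACE as section 3.4.3 describes, and pulls out the address in angle brackets. The helper names unfold, strip_comments and angle_addr are hypothetical. The white-space collapsing is a simplification that would also alter quoted strings, and none of this reflects how post(8) actually parses addresses.

```python
import re

def unfold(field):
    # A line break followed by SPACE or HTAB is a fold; drop only the break.
    return re.sub(r"\r?\n(?=[ \t])", "", field)

def strip_comments(body):
    """Replace each top-level comment with one SPACE, honouring nesting."""
    out, depth, in_quotes, i = [], 0, False, 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            # quoted-pair: keep it outside comments, drop it inside them
            if depth == 0:
                out.append(body[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                out.append(" ")
        elif depth == 0:
            out.append(ch)
        i += 1
    # Simplification: collapse every run of white space to one SPACE.
    return " ".join("".join(out).split())

def angle_addr(body):
    match = re.search(r"<([^<>]*)>", body)
    return match.group(1) if match else None

folded = ("To: Bob Doolittle ({gatech,uunet,\n"
          "    petsd}!masscomp!rad) <rad@westford.ccur.com>")
name, _, body = unfold(folded).partition(":")
print(strip_comments(body))              # Bob Doolittle <rad@westford.ccur.com>
print(angle_addr(strip_comments(body)))  # rad@westford.ccur.com
```

With the comment removed, the fold inside it no longer matters, which is consistent with the answers to Q2 and Q3 above.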