jc@minya.UUCP (John Chambers) (11/22/87)
Hello.  I'm interested in characterizing the sorts of damage that the
existing electronic mail systems can do to mail as they move it about.
To start the ball rolling, I'll give a few examples.  First, the mail
command that comes with most Unix systems:

  1. Any occurrence of the string "\nFrom " has '>' inserted before
     the 'F'.
  2. If the string "\n.\n" occurs, the tail end of the file (starting
     at the '.') is discarded.

In addition, I know of mailers that do the following:

  3. High-order bits are turned off (or set to parity).
  4. Null bytes are dropped.
  5. If a backspace occurs, it and the preceding character are deleted.
  6. ASCII tabs are expanded to some number of spaces.

Can you add to the list?

--
John Chambers <{adelie,ima,maynard,mit-eddie}!minya!{jc,root}> (617/484-6393)
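Items 3 through 6 in the list above are easy to reproduce in a few lines, which may help when testing whether a given path is "clean". What follows is only a rough sketch of those four transformations; the function names and the 8-column tab stop are assumptions made for illustration, not the behaviour of any particular mailer.

```python
def strip_high_bits(data: bytes) -> bytes:
    """Item 3: clear the high-order bit of every byte."""
    return bytes(b & 0x7F for b in data)


def drop_nulls(data: bytes) -> bytes:
    """Item 4: silently drop NUL bytes."""
    return data.replace(b"\x00", b"")


def apply_backspaces(data: bytes) -> bytes:
    """Item 5: delete each backspace together with the preceding byte."""
    out = bytearray()
    for b in data:
        if b == 0x08:
            if out:
                out.pop()
        else:
            out.append(b)
    return bytes(out)


def expand_tabs(data: bytes, tabsize: int = 8) -> bytes:
    """Item 6: expand tabs to spaces (tab stop width is an assumption)."""
    return data.expandtabs(tabsize)


def mangle(data: bytes) -> bytes:
    """Apply all four transformations in sequence (order is arbitrary)."""
    for step in (strip_high_bits, drop_nulls, apply_backspaces, expand_tabs):
        data = step(data)
    return data
```

Running a known binary test pattern through `mangle` and comparing it with what actually arrives at the far end gives a quick idea of which of these effects a route exhibits.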
pdb@sei.cmu.edu (Patrick Barron) (11/22/87)
In article <408@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>  2. If the string "\n.\n" occurs, the tail end of the file (starting
>     at the '.') is discarded.

This isn't a bug, it's a feature.  And it can be disabled by putting
"unset dot" in your .mailrc file.

>In addition, I know of mailers that do the following:
>
>  3. High-order bits are turned off (or set to parity).
>  4. Null bytes are dropped.

Assuming that your mailer is RFC 822 compatible, these aren't bugs either.
RFC 822 specifies that mail messages can only contain 7-bit printable ASCII
characters, along with formatting codes like <CR> and <LF>.  If you really
have to mail such a file, you can use uuencode/uudecode, or some other
similar program.

>Can you add to the list?

How about:

  1) Mailers that delete trailing spaces from message lines, occasionally
     messing up uuencode/uudecode and other similar programs.

  2) Mailers that let "From:" addresses like "user@host.UUCP", "host!user",
     or "user@host.BITNET" escape on to the Internet without fixing the
     address (e.g., "user@host.UUCP" becomes
     "user%host.UUCP@gateway.do.main").

--Pat.
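The address fix-up in item 2 can be sketched as follows. This is a minimal illustration of the "%" rewriting shown in the example, assuming the gateway is named `gateway.do.main` as in the post; the function name and the list of pseudo-domains are made up for the sketch, and bang-path addresses such as "host!user" are deliberately left alone because the post does not say how they should be rewritten.

```python
# Pseudo-domains that are not resolvable on the Internet (illustrative list).
PSEUDO_DOMAINS = (".uucp", ".bitnet")


def fix_for_internet(addr: str, gateway: str = "gateway.do.main") -> str:
    """Rewrite user@host.UUCP as user%host.UUCP@gateway; leave others alone."""
    local, at, host = addr.rpartition("@")
    if at and host.lower().endswith(PSEUDO_DOMAINS):
        return f"{local}%{host}@{gateway}"
    return addr


if __name__ == "__main__":
    for a in ("user@host.UUCP", "user@host.BITNET", "pdb@sei.cmu.edu"):
        print(a, "->", fix_for_internet(a))
```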
billw@killer.UUCP (11/23/87)
In article <408@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>Hello.  I'm interested in characterizing the sorts of damage that the
>existing electronic mail systems can do to mail as they move it about.
[..]
>Can you add to the list?

The mailer at texsun, which about 1/4 of my mail is routed through, adds
a ^M to the end of every line.  I must manually strip them off of anything
that I plan to work with, like source.

--
Bill Wisner, HASA "A" Division          ..{codas,ihnp4}!killer!billw
"I don't mind at all.." -- Bourgeois Tagg
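For what it's worth, undoing that particular damage is mechanical. A small sketch, assuming the only change is a carriage return (^M) added before each newline; the function name is made up here:

```python
def strip_carriage_returns(data: bytes) -> bytes:
    """Remove the ^M (CR) that precedes each newline; lone CRs are kept."""
    return data.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    damaged = b"int x;\r\nint y;\r\n"
    print(strip_carriage_returns(damaged))
```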
david@ms.uky.edu (David Herron -- Resident E-mail Hack) (11/23/87)
In article <408@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>Hello.  I'm interested in characterizing the sorts of damage that the
>existing electronic mail systems can do to mail as they move it about.
>
>  1. Any occurrence of the string "\nFrom " has '>' inserted before
>     the 'F'.

Actually, the "From " line at the beginning of the file is a misfeature
(in this day and age ... when they came up with mail back then they
weren't connected to the arpanet, rfc822 hadn't been written, and all
sorts of fun things).

>  2. If the string "\n.\n" occurs, the tail end of the file (starting
>     at the '.') is discarded.

geeee ...

>In addition, I know of mailers that do the following:
>
>  3. High-order bits are turned off (or set to parity).
>  4. Null bytes are dropped.

eh?  both of those things aren't part of normal ascii text.  Why would
you expect a system which is designed for passing normal ascii text
to do something reasonable with strange things.  If you wanna do something
like this then uuencode the file before mailing it.

>  5. If a backspace occurs, it and the preceding character are deleted.
>  6. ASCII tabs are expanded to some number of spaces.

Weeeell... my comment above sort-of counts here.  Also, different
operating systems treat tabs in different ways.  There is one OS around
which sets tab stops at 10 columns rather than 8.  Also, for some reason
tabs MUST be expanded with mail (I dunno why ... it simply must).  I have
a vague memory that the system in question is Multics, but may be mistaken.

  7. Prepending "host!" to the From: lines of mail passing through
     the site and going out through UUCP.

This is a problem because it creates unreplyable mail.  For instance,
we are on the mtxinu-users@emory.edu mailing list.  Before we joined
the Internet we were getting our mail from the list via uucp.  The
mail would arrive here with headers like:

    From: emory!someone@utah.edu

And the From: should have just been "someone@utah.edu".

Depending on how we interpret the address, it will behave differently.
I suppose that the people at emory wanted us to interpret ! first.
However we're running MMDF and it's fairly tightly conformant to RFC-822.
It'll interpret the @ first.  So the message trundles off to utah.edu
who is told to deliver the mail to emory!someone, which is almost certainly
not right because it'll cause the mail to be delivered to Georgia!

Note that I'm only mentioning the problem at emory because it was the
first one which came to mind.  Many sites have this problem ...  It's
a severe problem which should be fixed.

  8. Truncation of long lines.

For instance, mail on BITNET is 80 column PUNCH files sent to people's
virtual reader.

  9. Silent discarding (or any other sort of discarding) of mail which
     is "too long".

SendMail has a limit (configurable) of message size ... it's usually
set to something like 100K.  BUT the mail system on the main instructional
system here has a limit of 200 lines.  (!)  Further, it drops the message
silently if it's too long.

One of the common things to do here is print files at cluster sites
by mailing them from a unix machine to "printer-name@ukpr.bitnet".
It works and happens to be fairly fast ... however, if someone isn't
careful the file could be too long and you'd never know it.

--
<---- David Herron -- Resident E-Mail Hack     <david@ms.uky.edu>
<---- or: {rutgers,uunet,cbosgd}!ukma!david, david@UKMA.BITNET
<---- "The market doesn't drop hundreds of points on a normal day..." --
<---- Fidelity Investments Corporation
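The ambiguity in "emory!someone@utah.edu" can be made concrete with a small sketch. It shows the two readings discussed above: the RFC 822 reading, where "@" binds first, and the UUCP reading, where "!" binds first. The function name and the `at_first` flag are invented for the illustration; this is not how MMDF or any other mailer is actually structured.

```python
def next_hop(addr: str, at_first: bool = True) -> tuple[str, str]:
    """Return (host to hand the message to, address that host is given)."""
    if "@" not in addr and "!" not in addr:
        return "", addr                      # purely local name
    if at_first and "@" in addr:
        # RFC 822 reading: everything left of the last '@' is the local part.
        local, _, host = addr.rpartition("@")
        return host, local
    if "!" in addr:
        # UUCP reading: the first component of the bang path is the next hop.
        host, _, rest = addr.partition("!")
        return host, rest
    local, _, host = addr.rpartition("@")
    return host, local


if __name__ == "__main__":
    addr = "emory!someone@utah.edu"
    print(next_hop(addr, at_first=True))    # sent to utah.edu first
    print(next_hop(addr, at_first=False))   # sent to emory first
```

With `at_first=True` the message goes to utah.edu carrying the local part "emory!someone", which is the misdelivery described above; with `at_first=False` it goes back through emory, which is presumably what that site intended.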
jso@edison.GE.COM (John Owens) (11/25/87)
And don't forget the worst damage of all - ASCII/EBCDIC translation! Since there's no one-to-one mapping, and different sites use different translation tables, there's no way you can know what the mail will look like when it gets through. Most commonly caught characters are characters in ASCII range 5B-5F and 7B-7F. And, of course, tabs are expanded to spaces and formfeeds are usually lost....
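The effect of mismatched translation tables is easy to demonstrate with two EBCDIC code pages that ship with Python, `cp037` and `cp500`. They are used here only as stand-ins for the differing site tables mentioned above (an assumption; the tables real gateways use are not named in this thread). The sketch translates ASCII to EBCDIC with one table and back with the other, then lists which printable characters fail to survive.

```python
def cross_translate(text: str, out_table: str = "cp500",
                    in_table: str = "cp037") -> str:
    """ASCII -> EBCDIC with one table, EBCDIC -> ASCII with a different one."""
    return text.encode(out_table).decode(in_table)


def damaged_chars(out_table: str = "cp500", in_table: str = "cp037") -> list[str]:
    """Printable ASCII characters that do not survive the round trip."""
    printable = [chr(c) for c in range(0x20, 0x7F)]
    return [ch for ch in printable
            if cross_translate(ch, out_table, in_table) != ch]


if __name__ == "__main__":
    print("damaged:", " ".join(damaged_chars()))
    print(cross_translate("a[i] = b[j];"))
```

Letters and digits come through unchanged, while brackets and a few other punctuation characters do not, which is why C source is hit especially hard.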
rshwake@irs1.UUCP (rshwake) (11/25/87)
I don't know if the original poster's intent was to suggest that prepending lines beginning with "From " with ">" constitutes damage. Since the "From " string is used as a delimiter, separating one message from another, SOME means is required to prevent lines beginning with such a string from signaling the start of a new message. More critically, it would be TOO easy to fake a message from some user if such potential delimiters were not masked. Ray Shwake IRS User Assistance Branch
blarson@skat.usc.edu (Bob Larson) (11/25/87)
In article <256@irs1.UUCP> rshwake@irs1.UUCP (rshwake) writes:
>	I don't know if the original poster's intent was to suggest that
>prepending lines beginning with "From " with ">" constitutes damage.

I don't see how it could be considered anything but damage.

>Since
>the "From " string is used as a delimiter, separating one message from
>another, SOME means is required to prevent lines beginning with such a
>string from signaling the start of a new message.

This is true in some mail implementations.  They should NOT mess with
outgoing messages or messages just passing through, since not everyone
is using such poor software.

>	More critically, it would be TOO easy to fake a message from some
>user if such potential delimiters were not masked.

They could just figure out a way to delimit messages without messing
with the text of the message.  This isn't even an invertible bogosity.

--
Bob Larson	Arpa: Blarson@Ecla.Usc.Edu
Uucp: {sdcrdcf,cit-vax}!oberon!skat!blarson		blarson@skat.usc.edu
Prime mailing list (requests):	info-prime-request%fns1@ecla.usc.edu
henry@utzoo.UUCP (Henry Spencer) (11/25/87)
> Actually, the "From " line at the beginning of the file is a misfeature
> (in this day and age ...

Unfortunately, when Unix mail got RFC822ized (at Berkeley, I believe), it
did not occur to the people doing it that RFC822 format and the old Unix
format ("From " lines at the beginning) were two *different* formats and
that they should convert between them rather than smushing them together.
There is just no excuse for having the sender's address appear in two
different places in two different forms.  (Well, actually, nowadays it can
be handy to have a domainized address in "From:" and a bang form in "From ",
but that is a kludge if there ever was one.)

--
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry
brian@ncrcan.UUCP (11/27/87)
In article <8991@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes:
>> Actually, the "From " line at the beginning of the file is a misfeature
>> (in this day and age ...
>
>Unfortunately, when Unix mail got RFC822ized (at Berkeley, I believe), it
>did not occur to the people doing it that RFC822 format and the old Unix
>format ("From " lines at the beginning) were two *different* formats and
>that they should convert between them rather than smushing them together.
>There is just no excuse for having the sender's address appear in two
>different places in two different forms.  (Well, actually, nowadays it can
>be handy to have a domainized address in "From:" and a bang form in "From ",
>but that is a kludge if there ever was one.)

I agree!  I hate those "From " lines and "remote from" lines in mail messages.
I have been giving serious thought to hacking on smail so that it removes
those lines.  As long as we have a domain mailer, I don't care how the mail
got here.

Of course this means that all sites that mess with the "From: " line will
have to refrain from doing this :-).

Anyone have any reasons (besides the obvious one above) as to why I should
not go ahead and do this?

Brian.

--
+-------------------+--------------------------------------------------------+
| Brian Onn         | UUCP:..!{uunet!mnetor, watmath!utai}!lsuc!ncrcan!brian |
| NCR Canada Ltd.   | INTERNET: Brian.Onn@Toronto.NCR.COM                    |
+-------------------+--------------------------------------------------------+
daveb@geac.UUCP (11/30/87)
In article <471@ncrcan.Toronto.NCR.COM> brian@ncrcan.Toronto.NCR.COM (Brian Onn) writes:
|In article <8991@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes:
||| Actually, the "From " line at the beginning of the file is a misfeature
||| (in this day and age ...
|I agree!  I hate those "From " lines and "remote from" lines in mail messages.
|I have been giving serious thought to hacking on smail so that it removes
|those lines.  As long as we have a domain mailer, I don't care how the mail
|got here.

  You may... When the mailer frogs trying to reply.

|Of course this means that all sites that mess with the "From: " line will
|have to refrain from doing this :-).
|
|Anyone have any reasons (besides the obvious one above) as to why I should
|not go ahead and do this?

  Well, you probably want to do it in the mail display agent (mail reader)
and not in the transfer agent(s).  Otherwise you'll get flamed by someone
trying to get in and out of a domainized universe to/from a path-based
universe.

--dave
--
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.
alex@.UUCP (Alex Laney) (11/30/87)
In article <2181@killer.UUCP>, billw@killer.UUCP writes:
> In article <408@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
> >Hello.  I'm interested in characterizing the sorts of damage that the
> >existing electronic mail systems can do to mail as they move it about.
> [..]
> >Can you add to the list?

What I find annoying is mail spoolers that reverse the order of the mail
passing through them.  This makes USENET news article replies arrive before
the original article!  This, I know, is not damaging to the articles
themselves, but is part of the mailing/transport process.

--
Alex Laney   alex@xicom.UUCP   ...utzoo!dciem!nrcaer!xios!xicom!alex
Xicom Technologies, 205-1545 Carling Av., Ottawa, Ontario, Canada
We may have written the SNA software you use.
The opinions are my own.
coffin@xroads.UUCP (Chris Coffin) (12/01/87)
In article <471@ncrcan.Toronto.NCR.COM>, brian@ncrcan.UUCP writes:
> I have been giving serious thought to hacking on smail so that it removes
> those lines.  As long as we have a domain mailer, I don't care how the mail
> got here.
>
> Anyone have any reasons (besides the obvious one above) as to why I should
> not go ahead and do this?

We here at crossroads are a new site on the net.  We do not have pathalias
yet (waiting for the next posting).  We did, however, get smail2.5 from our
news-feed and are using it as a "smart-host".  We have had problems with
mail from our site being bounced, and wonder if mail to us has been getting
bounced because we are not in the pathalias data base yet.

Chris Coffin

--
\  /  C r o s s r o a d s  C o m m u n i c a t i o n s
 \/   (602) 971-2240
 /\   (602) 992-5007 300|1200 Baud 24 hrs/day
/  \  ihnp4!mot!nud!xroads!coffin
henry@utzoo.UUCP (Henry Spencer) (12/02/87)
There is nothing wrong with having the mailer use "From " as a way of finding
the boundary of messages, and putting ">" in front to avoid false boundaries
arising from occurrences of that string in text, *provided* that the
transformation is reversible, and is in fact reversed when the message
leaves the mailer.  Unfortunately, neither of these conditions is true of
the existing scheme.

--
Those who do not understand Unix are |  Henry Spencer @ U of Toronto Zoology
condemned to reinvent it, poorly.    | {allegra,ihnp4,decvax,utai}!utzoo!henry
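To make the reversibility point concrete, here is a rough sketch contrasting the existing quoting with one possible reversible variant. The reversible rule used here (also quote lines that already start with one or more ">" followed by "From ") is an assumption chosen for illustration, similar to what some later mailbox formats adopted; it is not something proposed in this thread, and the function names are invented.

```python
import re

FROM_LINE = re.compile(r"^From ")
QUOTABLE = re.compile(r"^>*From ")     # "From ", ">From ", ">>From ", ...
QUOTED = re.compile(r"^>+From ")


def naive_quote(line: str) -> str:
    """Existing scheme: only lines starting with 'From ' get a '>'."""
    return ">" + line if FROM_LINE.match(line) else line


def reversible_quote(line: str) -> str:
    """Add one '>' to any line that is 'From ' preceded by zero or more '>'."""
    return ">" + line if QUOTABLE.match(line) else line


def reversible_unquote(line: str) -> str:
    """Strip exactly one '>' from lines produced by reversible_quote."""
    return line[1:] if QUOTED.match(line) else line


if __name__ == "__main__":
    body = ["From here on it gets worse", ">From an earlier quote", "plain text"]
    # The naive scheme maps two different lines onto the same prefix form...
    print([naive_quote(l) for l in body])
    # ...while the reversible one can be undone exactly.
    stored = [reversible_quote(l) for l in body]
    print([reversible_unquote(l) for l in stored] == body)
```

Under the naive scheme, a reader who sees ">From " cannot tell whether the ">" was in the original text or was added in transit, which is the sense in which the transformation cannot be undone.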
matt@ncr-sd.UUCP (12/05/87)
In article <471@ncrcan.Toronto.NCR.COM> brian@ncrcan.Toronto.NCR.COM (Brian Onn) writes:
>I agree!  I hate those "From " lines and "remote from" lines in mail messages.
>I have been giving serious thought to hacking on smail so that it removes
>those lines.  As long as we have a domain mailer, I don't care how the mail
>got here.
>
>Of course this means that all sites that mess with the "From: " line will
>have to refrain from doing this :-).
>
>Anyone have any reasons (besides the obvious one above) as to why I should
>not go ahead and do this?

At least one From_ line is necessary at the beginning of a mail message for
the normal mailbox format.  The first one (without the '>') is used to
denote the start of a mail message in a mailbox file.  This convention is
used by /bin/mail, mailx, elm, etc.  If you are willing to use a different
scheme such as that used by MH, then getting rid of the From_ lines might be
reasonable.  As it is, you'd just be cutting your throat.

I have to agree that there can be too many From_ lines.  The solution is
to let smail (i.e. the message transport agent) collapse all the From_ lines
to a single line.  Smail has supported this from at least version 1.3.
In smail2.5 the function rline() is defined at line 373 of the file
headers.c; this function collapses all the From_ lines into a single line
and also removes redundant host information.

--
Matt Costello	<matt.costello@SanDiego.NCR.COM>
+1 619 485 2926	<matt.costello%SanDiego.NCR.COM@Relay.CS.NET>
		{sdcsvax,cbosgd,pyramid,nosc.ARPA}!ncr-sd!matt
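The idea of collapsing a stack of From_ lines into one can be sketched roughly as below. This is only an illustration of the general technique, not a transcription of smail's rline(): the function name, the regular expression, the choice to keep the first line's date, and the sample host names are all assumptions, and the "removes redundant host information" step is not attempted.

```python
import re

# "From user date [remote from host]", optionally prefixed by '>'.
FROM_LINE = re.compile(r"^>?From (\S+) (.*?)(?: remote from (\S+))?$")


def collapse_from_lines(lines: list[str]) -> list[str]:
    """Fold the leading From_ lines into a single line with a bang path."""
    hosts: list[str] = []
    user = None
    date = None
    consumed = 0
    for line in lines:
        m = FROM_LINE.match(line)
        if m is None:
            break                      # first non-From_ line ends the stack
        user, stamp, host = m.groups()
        if date is None:
            date = stamp               # keep the most recent hop's date
        if host:
            hosts.append(host)
        consumed += 1
    if consumed == 0:
        return list(lines)
    path = "!".join(hosts + [user])
    return [f"From {path} {date}"] + list(lines[consumed:])


if __name__ == "__main__":
    msg = [
        "From uucp Tue Dec 29 04:00 EST 1987 remote from hostb",
        ">From jc Mon Dec 28 23:00 EST 1987 remote from hosta",
        "To: someone",
        "",
        "body",
    ]
    print("\n".join(collapse_from_lines(msg)))
```

The single surviving From_ line still satisfies mailbox readers that need it as a message delimiter, while the return path is carried in one place instead of a growing stack.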
phil@amdcad.AMD.COM (Phil Ngai) (12/17/87)
In article <483@.UUCP> alex@.UUCP (Alex Laney) writes: >What I find annoying, is mail spoolers that reverse the order of the mail >passing through them. This makes USENET news article replies arrive before >the original article! This, I know, is not damaging to the articles themselves, >but is part of the mailing/transport process. This is, I believe, due to a misguided optimization by some uucps which send the shortest files first. Do modern (ie HDB) uucps do so? -- Let's go to the mall and see how long people will wait for our parking space! Phil Ngai, {ucbvax,decwrl,allegra}!amdcad!phil or amdcad!phil@decwrl.dec.com
mark@sickkids.UUCP (Mark Bartelt) (12/17/87)
In article <408@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
> Hello.  I'm interested in characterizing the sorts of damage that the
> existing electronic mail systems can do to mail as they move it about.
> To start the ball rolling, I'll give a few examples.  First, the mail
> command that comes with most Unix systems:
[ ... ]
>   2. If the string "\n.\n" occurs, the tail end of the file (starting
>      at the '.') is discarded.

Are you sure?  Our /bin/mail (a truly ancient one, dating back to Seventh
Edition days, more or less) contains the following code ...

	onatty = isatty(0);
	[ ... ]
	while (fgets(line, LSIZE, stdin) != NULL) {
		if (line[0] == '.' && line[1] == '\n' && onatty)
			break;
		[ etc. ]

... to avoid exactly that problem.  The /bin/mail that comes with 4.3bsd
contains different, but equivalent, code.  Are there *really* UNIX mailers
that exhibit that bug when passing mail between systems, or have you merely
inferred this because of the fact that a '.' bracketed by a pair of newlines
can be used as a message terminator from a terminal?

> Can you add to the list?

I'd be delighted.

I consider complaints about minor mangling of messages (for example, the
"From" ==> ">From" controversy; talk about getting worked up about trivia!)
to be hardly worth discussing, when compared to the real disasters:  Mailers
that diddle with headers, especially when they diddle in demonstrably WRONG
ways.

For example, one of the sites through which mail from here to other places
often passes (it shall remain nameless, to protect the guilty) seems to like
to play jokes on the recipients, by mis-identifying the senders.  Suppose a
user at our site, "bozo", sends mail to a friend "zippy", using
"neighbor!somewhere!another!cleveland" as a path.  Bozo addresses mail to ...

	neighbor!somewhere!another!cleveland!zippy

... but when the mail arrives at its destination, the "From" line reads ...

	From: bozo@somewhere.uucp

Now if the intermediate mailer had mangled the header to read ...

	From: bozo@oursite.uucp

... I'd be only a tiny bit upset.  But if the recipient of this message uses
a "reply" option with one of the so-called "intelligent" mailers plaguing us
these days, some poor bozo at the wrong site will be getting the replies
intended for our bozo.  Thus far, we've been able to avoid that problem by
telling all our recipients to send mail to the address which will actually
work, but the behaviour of the mailer (hard to tell whether it's at site
"somewhere" or "another") is rather annoying.

This of course brings up another complaint:  Mailers that ignore explicit
uucp routings, and choose one of their own if the all-wise pathalias deems
it to be better.  I have no objection to using a pathalias-generated path
if mail is addressed to "someone@site.uucp", but if a sender explicitly
specifies a path, intermediate mailers have no business messing with it.
If mailers didn't misbehave in this way, we could avoid the first problem
above by routing mail to avoid "somewhere" and "another".  But, sadly, if
"neighbor" thinks that "somewhere!another" is the best path to "cleveland",
there's nothing we can do about it.  Sigh.

-----
Mark Bartelt                          Hospital for Sick Children, Toronto
416/598-6442                          {utzoo,decvax,ihnp4}!sickkids!mark
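The routing policy argued for above (leave explicit bang paths alone, consult pathalias only for "someone@site.uucp") can be sketched as follows. The `paths` dictionary stands in for a pathalias output database mapping a site name to a route template containing "%s"; its contents, and the function name, are hypothetical.

```python
def route(addr: str, paths: dict[str, str]) -> str:
    """Return the bang path to use for addr without touching explicit routes."""
    if "!" in addr:
        return addr                       # sender chose the path: hands off
    user, at, site = addr.partition("@")
    if at and site.lower().endswith(".uucp"):
        name = site[: -len(".uucp")].lower()
        template = paths.get(name)
        if template is not None:
            return template.replace("%s", user)
    return addr                           # nothing we know how to route


if __name__ == "__main__":
    paths = {"cleveland": "somewhere!another!cleveland!%s"}   # made-up entry
    print(route("neighbor!somewhere!another!cleveland!zippy", paths))
    print(route("zippy@cleveland.uucp", paths))
```

A mailer that instead rewrote the first address using its own table would be doing exactly what is being complained about here.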
honey@umix.cc.umich.edu (Peter Honeyman) (12/18/87)
not so phil, or at least not exactly. some old versions of uucp delivered in directory order. honey danber delivers in sequence order. (there's a small problem at sequence number wraparound, which embarrasses me.) however, the behavior you describe, delivering small files before large ones, pertains to sendmail, or so allman told me. (having never run it, i'm no authority on sendmail.) peter
romain@pyrnj.uucp (Romain Kang) (12/30/87)
In article <77@sickkids.UUCP> mark@sickkids.UUCP (Mark Bartelt) writes:
| In article <408@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
| >   2. If the string "\n.\n" occurs, the tail end of the file (starting
| >      at the '.') is discarded.
|
| [...]                              The /bin/mail that comes with 4.3bsd
| contains different, but equivalent, code.  Are there *really* UNIX mailers
| that exhibit that bug when passing mail between systems, or have you merely
| inferred this because of the fact that a '.' bracketed by pair of newlines
| can be used as a message terminator from a terminal?

BSD mail maintainers take note:

There are a great many 4.2-based /bin/rmail's still out there (including
Pyramid's, *blush*) that invoke sendmail without the -i option; this
means that sendmail will use "\n.\n" as a message terminator and flush
anything else that rmail feeds it.  Thus if I run the following shell
script, John's bug surfaces:

	#! /bin/sh
	/bin/rmail $USER << EoF
	From adm Tue Dec 29 04:00 EST 1987 remote from test
	IMPORTANT MESSAGE FOLLOWS:
	.
	.
	.
	***UPDATE /usr/lib/acct/holidays WITH NEW HOLIDAYS***
	EoF

--
Romain Kang			{allegra,cmcl2,mirror,pyramid,rutgers}!pyrnj!romain
Pyramid Technology Corp. / 10 Woodbridge Center. Dr / Woodbridge, NJ  07095

"Eggheads unite! You have nothing to lose but your yolks!" -Adlai Stevenson
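The difference the -i option makes can be modelled in a few lines. This is a toy model of a message collector, not sendmail's code; the function name and the `ignore_dots` flag are invented, with the flag standing in for what -i is described as doing above.

```python
def collect_message(lines: list[str], ignore_dots: bool = False) -> list[str]:
    """Read body lines, stopping at a lone '.' unless told to ignore dots."""
    body = []
    for line in lines:
        if line == "." and not ignore_dots:
            break                 # everything after this line is thrown away
        body.append(line)
    return body


if __name__ == "__main__":
    fed_by_rmail = [
        "IMPORTANT MESSAGE FOLLOWS:",
        ".",
        ".",
        ".",
        "***UPDATE /usr/lib/acct/holidays WITH NEW HOLIDAYS***",
    ]
    print(collect_message(fed_by_rmail))                    # truncated
    print(collect_message(fed_by_rmail, ignore_dots=True))  # complete
```

Without the flag, the recipient sees only the first line and never learns that the rest existed.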
daveb@geac.UUCP (David Collier-Brown) (01/01/88)
>In article <408@minya.UUCP> jc@minya.UUCP (John Chambers) writes:
>
>> Hello.  I'm interested in characterizing the sorts of damage that the
>> existing electronic mail systems can do to mail as they move it about.
>> Can you add to the list?
>
In article <77@sickkids.UUCP> mark@sickkids.UUCP (Mark Bartelt) writes:
>I'd be delighted.
>
>I consider complaints about minor mangling of messages (for example, the
>"From" ==> ">From" controversy; talk about getting worked up about trivia!)
>to be hardly worth discussing, when compared to the real disasters:  Mailers
>that diddle with headers, especially when they diddle in demonstrably WRONG
>ways.

  Another interesting form of header munging is to assume that if I
reach site far via near, that far is a subdomain of near.

  Case in point?  If I send to
	near-host!medium-distance-host!somewhere!fred,
fred gets handed a message which claims that I'm
	daveb@geac.near-host.medium-distance-host[.uucp]

  I can assure you that geac is not a subdomain of near-host, much
less medium-distance-host.  Geac is the only Canadian mainframe
manufacturer, not a subdomain of some #$!&&%*+@!?? unregistered
domain... [Gee I'm grumpy today: happy new year?]

  This can make it hard for fred to reply to me unless he transforms
the address back to !s or to
	@medium-distance-host,@near-host:daveb@geac

  As you might guess, fred gets annoyed with me...

--
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.
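The transformation fred has to do by hand can be sketched like this. It assumes that every label in the bogus domain is really one hop on the path and that host names contain no dots of their own, which will not hold in general; the function name is made up for the illustration.

```python
def undo_fake_subdomains(addr: str) -> tuple[str, str]:
    """Turn user@a.b.c[.uucp] back into a bang path and an RFC 822 route."""
    user, _, domain = addr.partition("@")
    labels = [l for l in domain.split(".") if l.lower() != "uucp"]
    hops = list(reversed(labels))        # rightmost label is the nearest hop
    bang_path = "!".join(hops + [user])
    route = ",".join("@" + h for h in hops[:-1])
    final = f"{user}@{hops[-1]}"
    source_route = f"{route}:{final}" if route else final
    return bang_path, source_route


if __name__ == "__main__":
    bang, routed = undo_fake_subdomains(
        "daveb@geac.near-host.medium-distance-host.uucp")
    print(bang)
    print(routed)
```

Both results name the same sequence of hops; which one is usable depends on whether the replying mailer speaks bang paths or RFC 822 source routes.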