zben@umd5.UUCP (09/10/85)
This is a somewhat contrived example of mailing messages through the ARPA Internet and through a BitNet mail gateway. My apologies for its length and complexity, but there are some subtleties that I especially wanted to show.

To illustrate the processing required for addresses that appear in the SMTP out-of-band (OOBS) information but not in the message header, I decided to use a "BCC:" (blind carbon copy) address for the message, although the implementation of BCC: is probably different everywhere. This same case does occur frequently in the real world, however. For example, mail sent from a "mail reflector" typically has a bundle of OOBS addresses that are not even mentioned in the header.

Note also the rules for the BitNet gateway to which I talk and which is assumed for this example:

The user mail agent (mail program) sends one copy of the message to the mailer virtual machine on his host. After syntax checking, it in turn sends one copy of the message to the mailer virtual machine at EACH site. If two or more users were named at the same site, only ONE copy of the message is sent to the mailer virtual machine at that site. The sending is point-to-point within BitNet and is mediated by the RSCS system, atop which BitNet mail is built, in the same fashion that UUCP mail is built atop the UUCP/UUX network.

When the mailer virtual machine at a site gets mail from a mailer virtual machine at another site, it examines the HEADER fields (yuck!) and sends a copy on to each local user named. It utterly ignores addresses that name other machines (as the originating mailer sent those separately). This is why we carefully move addresses to an ALSO-TO: field below. We do not want to accidentally generate a duplicate copy of the message in the case that it was "split up" by a dumb ARPANET mailer before it got to us.

Technical terms:

MUNGE - The act of translating the syntax and semantics of the header fields of messages passed across domain boundaries.
The aim of this translation is to make the header comprehensible to mail agents in the new (destination) domain.

OOBS - This refers to the SMTP out-of-band information that is NOT a part of the message header at all, but is passed from sending host to receiving host as part of the SMTP mail interaction.

OURNAME - Any of the various names for the site that is actually running this software. This includes the primary (official) name and all defined alias or NIC names.

TO* - Any of the header fields which carry "To:" semantic information. In my implementation these are: To:, CC:, BCC:, Resent-To:, Resent-CC:, and Resent-BCC: fields.

The example is the sending of a message from BEN at site ORIG to several people on both ARPA and BitNet. To set the stage, the hypothetical network configuration is shown here in diagram form. Just to make things a little more complicated, we assume ORIG cannot talk directly to HERE, so several addresses must be forwarded through host RELAY.

                                            ::
(--------)    ---------                     ::           (---------)
(  BILL  )----| OTHER |           ARPA      ::  BitNet   (  MARY   )
(--------)    ---------          domain     ::  domain   (---------)
                  |                         ::                |
                  |                         ::                |
(--------)    ---------     ---------    --------        -----------
(  BEN   )----| ORIG  |-----| RELAY |----| HERE |--------| SCHOOL  |
(--------)    ---------     ---------    --------        -----------
                  |                         ::                |
                  |                         ::                |
(--------)    ---------                     ::           (---------)
( MYBOSS )----| ADMIN |                     ::           ( HERBOSS )
(--------)    ---------                     ::           (---------)
                                            ::

Figure 1: Hypothetical configuration for munging example

---------

Hypothetical commands to mail sending agent at ORIG.ARPA:

    send BILL@OTHER.ARPA @RELAY.ARPA,@HERE.ARPA:MARY@SCHOOL.BITNET
    ?bcc MYBOSS@ADMIN.ARPA @RELAY.ARPA,@HERE.ARPA:HERBOSS@SCHOOL.BITNET

Mail sending agent on ORIG.ARPA generates header:

    FROM: <BEN@ORIG.ARPA>
    TO: <BILL@OTHER.ARPA>,
        <@RELAY.ARPA,@HERE.ARPA:MARY@SCHOOL.BITNET>

Out-of-band (OOBS) information generated as:

    MAIL FROM: <BEN@ORIG.ARPA>
    RCPT TO: <BILL@OTHER.ARPA>
    RCPT TO: <MYBOSS@ADMIN.ARPA>
    RCPT TO: <@RELAY.ARPA,@HERE.ARPA:MARY@SCHOOL.BITNET>
    RCPT TO: <@RELAY.ARPA,@HERE.ARPA:HERBOSS@SCHOOL.BITNET>

---------

When message is passed from ORIG.ARPA to RELAY.ARPA, the header is untouched, and the OOBS information is the subset for which RELAY.ARPA is responsible:

    MAIL FROM: <BEN@ORIG.ARPA>
    RCPT TO: <@RELAY.ARPA,@HERE.ARPA:MARY@SCHOOL.BITNET>
    RCPT TO: <@RELAY.ARPA,@HERE.ARPA:HERBOSS@SCHOOL.BITNET>

---------

When message is passed from RELAY.ARPA to HERE.ARPA, the header is untouched, and the OOBS information has been updated to add RELAY.ARPA to the back path, and to remove RELAY.ARPA from the forward paths:

    MAIL FROM: <@RELAY.ARPA:BEN@ORIG.ARPA>
    RCPT TO: <@HERE.ARPA:MARY@SCHOOL.BITNET>
    RCPT TO: <@HERE.ARPA:HERBOSS@SCHOOL.BITNET>

---------

At HERE.ARPA the munging actually gets done. Note that the header is unchanged, but the OOBS information has been processed by the relay. We extract the addresses from the header and correlate with the OOBS info:

HEADER ADDRESSES
    (1) BILL@OTHER.ARPA
    (2) @RELAY.ARPA,@HERE.ARPA:MARY@SCHOOL.BITNET

OOBS ADDRESSES
    (A) @HERE.ARPA:MARY@SCHOOL.BITNET
    (B) @HERE.ARPA:HERBOSS@SCHOOL.BITNET

CORRELATIONS
    (1)        - Header address not correlated with OOBS (case a)
    (A)<-->(2) - Correlation (case b)
    (B)        - OOBS address not correlated with header (case c)

ACTIONS

(case a): Any header TO* address that does NOT correlate with an OOBS gets moved to a new ALSO-TO: field in the output header. OURNAME is also added so that replies (if any) come back through here.

(case b): Any header TO* address that CORRELATES with an OOBS is passed through UNCHANGED, except for removing OURNAME and rewriting to BitNet address syntax.

(case c): Any OOBS address that does NOT correlate with a header TO* field is MATERIALIZED in a new TO: field in the output header. OURNAME is removed from address so BitNet node can recognize it.
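The three cases can be sketched in a few lines of Python. This is only an illustration, not the code my mailer runs: the names classify and mailbox_only are made up for the example, and mailbox_only is a crude stand-in predicate (it just compares the part after the last ":"), assumed here so the sketch runs on its own. The real correlation test is the subject of Algorithm C.

```python
def classify(header_addrs, oobs_addrs, correlates):
    """Sort addresses into cases (a), (b) and (c).

    correlates(header_addr, oobs_addr) is a caller-supplied predicate;
    it is a hypothetical hook, not part of any real mailer.
    """
    also_to = []          # case a: header address with no OOBS match
    passed_through = []   # case b: header address matching an OOBS address
    matched_oobs = set()
    for h in header_addrs:
        hits = [o for o in oobs_addrs if correlates(h, o)]
        if hits:
            passed_through.append(h)
            matched_oobs.update(hits)
        else:
            also_to.append(h)
    # case c: OOBS addresses that no header address accounts for
    materialized = [o for o in oobs_addrs if o not in matched_oobs]
    return also_to, passed_through, materialized


def mailbox_only(header_addr, oobs_addr):
    # Stand-in for illustration only: ignore any source route and
    # compare the bare user@host parts textually.
    return header_addr.rpartition(":")[2] == oobs_addr.rpartition(":")[2]


header = ["BILL@OTHER.ARPA",
          "@RELAY.ARPA,@HERE.ARPA:MARY@SCHOOL.BITNET"]
oobs = ["@HERE.ARPA:MARY@SCHOOL.BITNET",
        "@HERE.ARPA:HERBOSS@SCHOOL.BITNET"]

case_a, case_b, case_c = classify(header, oobs, mailbox_only)
print("ALSO-TO candidates:", case_a)
print("passed through:    ", case_b)
print("materialized:      ", case_c)
```

Removing or adding OURNAME and rewriting into BitNet syntax would happen after this sorting step; the sketch leaves the addresses as they arrived.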
OUTPUT HEADER                                        (note)

    FROM: <BEN%ORIG.ARPA%RELAY.ARPA@HERE.BITNET>     (i)
    TO: <MARY@SCHOOL.BITNET>,                        (ii)
        <HERBOSS@SCHOOL.BITNET>                      (iii)
    ALSO-TO: <BILL%OTHER.ARPA@HERE.BITNET>           (iv)

NOTES

(i) FROM: field is a recoding of the OOBS MAIL FROM: information, with OURNAME added. This gets advisories (if any) back to the original sender.
    Equivalent ARPA syntax: @HERE.BITNET,@RELAY.ARPA:BEN@ORIG.ARPA
    Equivalent UUCP syntax: HERE.BITNET!RELAY.ARPA!ORIG.ARPA!BEN

(ii) According to (case b), MARY@SCHOOL.BITNET is passed through unchanged. OURNAME is NOT added to the address.

(iii) According to (case c), HERBOSS@SCHOOL.BITNET is materialized into a TO: field in the header, and OURNAME is NOT added.

(iv) According to (case a), BILL@OTHER.ARPA is moved to the ALSO-TO: field. In this example, OURNAME is added, but I don't know what algorithm should determine this. Suppose OTHER is a BitNet site???
    Equivalent ARPA syntax: @HERE.BITNET:BILL@OTHER.ARPA
    Equivalent UUCP syntax: HERE.BITNET!OTHER.ARPA!BILL

A word about correlation:

This example is carefully contrived to pass the message through an ARPA relay site before munging to BitNet, thus highlighting some of the differences in processing between the header addresses and the OOBS addresses. Note that in the above example we had to correlate the OOBS:

    RCPT TO: <@HERE.ARPA:MARY@SCHOOL.BITNET>

with the header TO: field:

    TO: <@RELAY.ARPA,@HERE.ARPA:MARY@SCHOOL.BITNET>

so we have to ignore the leading RELAY.ARPA in the TO* field. At the very least we have to ignore leading sites from the header address until we find an instance of OURNAME. The algorithm I use is a bit more general, and I believe it will work even if the message is looped through the munge site multiple times! To wit:

Algorithm C: Test two address strings for correlation.

[C0] (initiation) Remove leading OURNAMEs from OOBS string.

[C1] (passive string search part 1) Attempt to extract one site from HEADER string. If no more sites, return UNCORRELATED, else goto C2.
[C2] (passive string search part 2) Look up the site extracted to see if it is an "OURNAME". If it is NOT "OURNAME" then goto C1 else goto C3.

[C3] (begin active string search) Remove leading OURNAMEs from HEADER string. Save current location in HEADER string scanning (in case the search fails). Begin scanning OOBS at first non-OURNAME.

[C4] (active string search part 1) Attempt to extract leading site from HEADER string. If successful goto C5 else goto C7.

[C5] (active string search part 2) Attempt to extract leading site from OOBS string. If successful goto C6 else goto C9.

[C6] (active string search part 3) If sites extracted from HEADER and OOBS string are the same, goto C4. If they are different sites, goto C9.

[C7] (End of HEADER string encountered) Attempt to extract leading site from OOBS string. If successful goto C9 else goto C8.

[C8] (Both strings exhausted except for USER parts) Compare user parts, if equal then return CORRELATED, else goto C9.

[C9] (partial failure) Restore scanning position in HEADER string saved in C3 and goto C1.

A word on step C6, above. We want to detect that the site is the same even if the two site names differ only by being different aliases for the same site. I accomplish this step by looking up both of them, and then testing the ARPA site words (#,#,#,#) for equality. If lookup fails for one, but not the other, step C6 returns "not-equal". If lookup fails for BOTH site names, I compare the names textually. This means that correlation can continue beyond the names known to my domain, with only the loss of the alias-detecting facility.

REFERENCES:

"Simple Mail Transfer Protocol", Jonathan B. Postel, ARPA RFC 821

"Standard for the Format of ARPA Internet Text Messages", David H. Crocker, ARPA RFC 822

"Proposed Standard for Message Header Munging", Marshall T. Rose, ARPA RFC 886

ACKNOWLEDGEMENTS:

The author would like to credit Hans Breitenlohner for his modifications to the Sperry RTP system, which implement physical transfer for messages in a reasonable way (i.e. *I* don't have to do the EBCDIC-ASCII translation :-), and to credit Bruce Crabill with the modifications to IBM RSCS to allow the Sperry mainframe (read Remote HASP Workstation) to participate in the RSCS network.

-zben

Ben Cranston <ZBEN@UMD2.ARPA> <ZBEN@UMDC.BITNET> <ZBEN@UMD5.UUCP>
--
Ben Cranston ...{seismo!umcp-cs,ihnp4!rlgvax}!cvl!umd5!zben zben@umd2.ARPA
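P.S. For readers who prefer code to goto-steps, here is a minimal Python sketch of Algorithm C. It is a reading of the steps above, not the code my mailer actually runs. Assumptions made for the sketch: the function names and the hosts table (site name to address words) are hypothetical; the final host of user@host is treated as the last site in the list; and steps C4 through C8 are collapsed into one length-plus-pairwise comparison, which accepts exactly when both site lists run out together and the user parts match.

```python
def split_address(addr):
    """Split '@A,@B:user@host' into (['A', 'B', 'host'], 'user')."""
    route, _, mailbox = addr.rpartition(":")
    sites = [s.lstrip("@") for s in route.split(",") if s]
    user, _, host = mailbox.partition("@")
    if host:
        sites.append(host)
    return sites, user


def same_site(a, b, hosts):
    """Step C6: alias-aware site comparison using a lookup table."""
    ia, ib = hosts.get(a.upper()), hosts.get(b.upper())
    if ia is None and ib is None:
        return a.upper() == b.upper()   # both unknown: compare textually
    return ia is not None and ia == ib  # one unknown: not equal


def correlated(header, oobs, ournames, hosts=None):
    hosts = hosts or {}
    h_sites, h_user = split_address(header)
    o_sites, o_user = split_address(oobs)

    def is_ours(site):
        return site.upper() in ournames

    # C0: remove leading OURNAMEs from the OOBS string
    while o_sites and is_ours(o_sites[0]):
        o_sites.pop(0)
    i = 0
    while True:
        # C1/C2: passive search for the next OURNAME in the header
        while i < len(h_sites) and not is_ours(h_sites[i]):
            i += 1
        if i == len(h_sites):
            return False                # UNCORRELATED
        # C3: skip the run of OURNAMEs; i is now the saved position
        while i < len(h_sites) and is_ours(h_sites[i]):
            i += 1
        rest = h_sites[i:]
        # C4-C8: remaining sites must match pairwise, then the users
        if (len(rest) == len(o_sites)
                and all(same_site(h, o, hosts) for h, o in zip(rest, o_sites))
                and h_user == o_user):
            return True                 # CORRELATED
        # C9: fall through and resume the passive search from i


OURNAMES = {"HERE.ARPA", "HERE.BITNET"}
print(correlated("@RELAY.ARPA,@HERE.ARPA:MARY@SCHOOL.BITNET",
                 "@HERE.ARPA:MARY@SCHOOL.BITNET", OURNAMES))
print(correlated("BILL@OTHER.ARPA",
                 "@HERE.ARPA:MARY@SCHOOL.BITNET", OURNAMES))
```

Note that, exactly as the steps are written, a header address containing no OURNAME at all never correlates, which is what puts BILL@OTHER.ARPA into case a.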