UMFORTH@WEIZMANN.BITNET (11/06/85)
Date: Mon, 4 Nov 85 19:32 EST
From: SECRIST%OAK.SAINET.MFENET@LLL-MFE.ARPA
Subject: ET-FIG News posting #3
Organization: Science Applications Int'l. Corp., Oak Ridge, Tenn.
Geographic-Location: 36 01' 42" N, 84 14' 14" W

This is Yet Another ET-FIG News Installment. These will continue to vary
in length as I edit all of the chapter-specific net-won't-care stuff out.

This edition includes a write-up I did on the Christensen file transfer
protocol, laying the groundwork for a FORTH XMODEM utility I'm writing.
My application is that I want a clean way to port my FIG and '79 screens
from the sans-file-system FORTH implementations (such as raw MVP-FORTH)
to a newer '83 implementation in a "screen file" (a la LMI). The logical
extension of this is to make it a normal XMODEM utility. I did present an
almost-there implementation to the club, which is still not finished. We
intend to port it later to every system in the club so we can share
things with each other, and to publish a kernel and n-system-specific
screens of it in FORTH Dimensions or someplace, as well as on this list.
The latter is frankly a long way off.

-- Richard

-------------------------------------------------------------------------------
        Excerpts from the ET-FIG News          Posting #3 in a series
-------------------------------------------------------------------------------
Volume 1, Number 3  ** East Tennessee FORTH Interest Group **  6-October-1984
-------------------------------------------------------------------------------

              +--------------------------------------------+
              |   The Christensen File Transfer Protocol   |
              +--------------------------------------------+
                          by Richard C. Secrist

After CP/M had made floppy-based computer systems a reality in the late
1970s, public domain software written by computer enthusiasts was in great
demand.
In response to this demand, one hobbyist, Ward Christensen of Chicago,
Illinois, developed a simple, error-checking communications protocol to
reliably transfer files between different computers via modems. Today, this
frame-oriented protocol is known by the names given to Ward's original
public domain implementations: XMODEM or MODEM7, or simply as "the
Christensen protocol".

When trying to pass data from one computer to another, talk is not usually
cheap - particularly if one builds custom data transfer hardware for each
computer device involved. For this reason, the connections between many
computer devices such as modems and terminals were standardized some years
ago. Most computers today are connected to terminals in an almost
universal manner, in both hardware (the EIA RS-232 standard) and software
(the ASCII character code).

The XMODEM protocol takes advantage of the fact that most computers have
been made to adhere to these standards. By tying two computers together
through their RS-232 ports, and tricking each computer into believing that
the other one is a terminal, two computers with cooperating programs can
transfer data to each other by speaking in a communications protocol.

A communications protocol is needed for several reasons:

   1) just because the computers are plugged together doesn't mean they
      know how to send data to or receive data from the other machine,
   2) one computer may be able to send data faster than the other can take
      it in, and some kind of flow control must be employed to prevent
      data overruns,
   3) the data can become garbled during transmission because of
      electrical interference or other causes, corrupting the information
      one wants transferred.

Defining how the machines will speak to each other, and solving all of
these problems, is what a communications protocol is all about.
To prevent corruption of data and to synchronize communication, the
cooperating computers mix this flow control and error-detecting information
with the actual data one wishes to transfer. A data communications
protocol is defined by the way this mix of data is formatted and sent
between computers.

The XMODEM protocol formats its data into individual packets of
information that contain synchronization information, 128 bytes of data,
and a checksum for error detection. Since the information is tucked away
into packets that are interpreted by the XMODEM software and not by the
command level of the host operating system directly, it is possible to
send even binary data without having the host operating system
misinterpret the control characters that may be part of the data itself.

An XMODEM packet looks like this :

   1     2     3    4                                   131   132
+-----+-----+-----+-----------------------------------------+-----+
! SOH ! BLK ! BLC !  128 bytes of 8 bit data padded with 0  ! CHK !
+-----+-----+-----+-----------------------------------------+-----+

Where:  SOH - Start of header, ASCII 1
        BLK - Block number, 0 to 255
        BLC - 1's complement of the block number
        CHK - Checksum of the data field mod 256

The first character is an ASCII SOH (Start-Of-Header, ^A) which delimits
the beginning of the packet. The second byte is a sequential block number
from 0 to 255. The block number is used to make sure the receiver gets
everything in the right order. Since the block number could be corrupted
by line noise, the third byte contains the one's complement of the block
number. If the block number summed with its one's complement is not equal
to 255 ($FF), the XMODEM packet has been corrupted and needs to be
re-transmitted. Bytes 4 through 131 are 128 bytes of file data. This
number of bytes was chosen to coincide with a CP/M disk sector, which makes
sense if you consider XMODEM's CP/M heritage. The final byte is a checksum
of the 128 data bytes.
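
The packet layout described here can be sketched in a few lines of Python.
This is only an illustration, not the code of any actual XMODEM program:
the names checksum, build_packet and parse_packet are made up for the
example, the checksum is assumed to be the low eight bits of the sum of
the data bytes, and a block number plus its one's complement is assumed
to come to $FF.

```python
SOH = 0x01          # ASCII Start-Of-Header, first byte of every packet
DATA_SIZE = 128     # size of the data field, one CP/M sector


def checksum(data: bytes) -> int:
    """Sum of the data bytes, keeping only the low 8 bits (assumed rule)."""
    return sum(data) & 0xFF


def build_packet(block: int, data: bytes, pad: int = 0x00) -> bytes:
    """Assemble SOH, BLK, BLC, 128 data bytes and CHK into one packet."""
    if len(data) > DATA_SIZE:
        raise ValueError("data field holds at most 128 bytes")
    block &= 0xFF                                   # block numbers are 0..255
    field = data + bytes([pad]) * (DATA_SIZE - len(data))
    header = bytes([SOH, block, 0xFF - block])      # BLC = one's complement
    return header + field + bytes([checksum(field)])


def parse_packet(packet: bytes):
    """Return (block, data) for a good packet, or None if it is damaged."""
    if len(packet) != DATA_SIZE + 4 or packet[0] != SOH:
        return None
    block, complement = packet[1], packet[2]
    if block + complement != 0xFF:                  # header hit by line noise
        return None
    field = packet[3:3 + DATA_SIZE]
    if checksum(field) != packet[-1]:               # data hit by line noise
        return None
    return block, field
```

A receiver built along these lines would answer a None result with a NAK
and anything else with an ACK.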
If the transmitted checksum does not equal the checksum the receiver
calculates from the data area, the packet is bad and needs to be re-sent.

The flow control mechanism serves to keep both computers synchronized, as
well as to provide a means to re-transmit bad packets. The flow control
information is represented by certain ASCII characters :

   o Control-F (ASCII ACK, $06) is an ACKnowledgment that the packet has
     been transferred without error
   o Control-U (ASCII NAK, $15) is a Negative AcKnowledgement, meaning the
     packet was not received correctly or at all
   o Control-D (ASCII EOT, $04) signifies end-of-text
   o Control-X (ASCII CAN, $18) is a request from the sender to CANCEL
     the transfer
   o Control-A (ASCII SOH, $01) signifies Start-Of-Header, the beginning
     of an XMODEM packet

A sample transaction between two computers provides the best illustration
of the Christensen protocol in action :

Typically the XMODEM programs have a dumb terminal mode to call up the
other machine and go through any required login sequence. After that,
XMODEM is invoked on the other machine by the caller, and is told to send
or receive a file. Then you return to your machine with some magic control
sequence and issue a complementary send or receive command to your own
machine. After that, the transfer is all up to the computers.

The first thing the machines have to do is get into synch. The receiving
computer looks for a packet for ten seconds. If it hasn't seen a packet
after that time it "times out", sending a NAK to the sender, and then
starts waiting again with a read posted. This NAK is the cue for the
sending machine to start, and off goes the packet. If the receiver gets a
good packet -- if the block number is okay and the data passes the
checksum -- the receiver sends an ACK to the sender. The sender interprets
this as "okay, he said he got that alright, so I'll ship him another one",
and promptly transmits another packet.
Of course if he got a NAK that means "say again, something isn't right",
and the sender would obligingly re-transmit the same packet. This process
is repeated until the end-of-file, at which point the sender transmits an
EOT (end of text). After a final ACK from the receiver, the transfer is
complete. At this point the XMODEM program usually returns to "dumb
terminal" mode.

Trying to send the data over noisy lines of course complicates matters.
For example, if the sender transmits a block and then misses the ACK from
the receiver, the sender will time out. Upon time-out, the sender will
re-transmit the packet - even though the receiver has already got it. When
the receiver checks the block number, it will know it already has this
packet, drop it on the floor in disgust, and ACK the sender, putting them
back in synch.

There are numerous different error cases one can experience. Every time
one of them happens a counter gets bumped. If the sender misses the ACK
over and over again the XMODEM protocol gives up after 10 tries. In fact,
10 is sort of the magic error number: 10 seconds between tries, up to 10
tries. Many implementations also send an ASCII CAN to the sender if the
receiver aborts.

Under scrutiny, this protocol has some holes in it. In practice, it is
fairly reliable and effective. Over the years people have started doing
strange things with the protocol, and as it gets bent and twisted there
are some logical consequences. First off, when transferring
machine-specific files from one machine to another you have to think
through what you're doing. Remember that when transferring ASCII text data
you are going to carry all of the environmentally-dependent features of
how the text is stored on the source system over to the destination
computer. This may require some neat hacking on your own part to set
matters straight.
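
A minimal Python sketch of that kind of clean-up follows. It assumes the
source is a CP/M text file, which conventionally ends its lines with
CR LF and marks end-of-file with Control-Z ($1A), filling out the rest of
the last 128-byte sector with more of the same. The function name
clean_cpm_text is made up for the example, and other source systems will
need different treatment.

```python
CPM_EOF = 0x1A  # Control-Z, the CP/M end-of-file marker for text files


def clean_cpm_text(raw: bytes) -> bytes:
    """Drop the Control-Z tail and turn CR LF line endings into bare LF."""
    eof = raw.find(bytes([CPM_EOF]))
    if eof != -1:
        raw = raw[:eof]             # everything from the first ^Z on is padding
    return raw.replace(b"\r\n", b"\n")
```

The choice of a bare LF as the target line ending is itself an assumption
about the destination system; a different host would want a different
replacement in the last line.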
For example, moving a text file from CP/M-80 to a VAX under VAX/VMS will
make the file look rather scatterbrained when you get it there. Simply
jumping into TECO and exiting back out will clean it up for the most part,
except for the ^Z's all over the end, which can simply be deleted with
your favorite editor.

Other things to watch for include XMODEM implementations that also offer
CRC support instead of data checksums (make sure you're both in the same
mode before you start the transfer - if you "time out" 10 times on the
first try, mismatched modes are frequently the cause), transferring BASIC
files that may be stored in a tokenized format, or porting binaries over
that include operating system specific calls to things like the CP/M BIOS.

Pseudo-code of the SEND algorithm (reference 2)
-----------------------------------------------

    open the file to be sent;
    initialize the modem;
    while (there are still sectors to be sent) {
        repeat {
            send an SOH;
            send the sector number;
            send the complemented sector number;
            send the data and compute a checksum;
            send the checksum;
            wait for a response with timeout;
        } until (the response is an ACK);
    }
    send an EOT character;
    wait for an acknowledgement;
    close the file;

Pseudo-code of the RECEIVE algorithm (see reference 2)
------------------------------------------------------

    create the new file in the directory;
    initialize the modem;
    repeat {
        wait for an initial SOH, EOT, or TIMEOUT;
        if (the character is an SOH) {
            get the sector number;
            get the complemented sector number;
            get the data and compute a checksum;
            get the checksum;
            if (checksum = computed checksum)
                send an ACK;
            else
                send a NAK;
        }
        if (the character is an EOT) {
            close the new file;
            send an ACK;
        }
    } until (the initial character was an EOT);

References:

   1) Kermit Users' Guide, Third Edition; Catchings, et al.; Columbia
      University, 1983
   2) Lmodem: A Small Remote-Communication Program; Clark, David D.; BYTE
      magazine, Nov. 1983
   3) Chapter 16: Protocol Transfers; Blue, et al.; ASCII Express: The
      Professional Instruction Manual; Southwestern Data Systems, 1982

                           * FORTH DEMENTIA *

Q: Is FORTH addictive ? Why is it addictive ?

A: Yes. If you get used to FORTH and then try to go back to another
   language, the withdrawal is painful. FORTH changes the way the
   programmer thinks about his machine, his problem, and the set of
   possible solutions. FORTH is addictive in the way that a pair of
   eyeglasses is addictive to someone with severe myopia - it feels good
   when you stop walking into walls.

        -- Richard Milewski as quoted in InfoWorld, Oct. 11, 1982

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-

                      * ET-FIG NEWSLETTER STAFF *
                          (where are you ?!)

   o Editor: Richard Secrist
   o Staff Writers: Norman Smith, Richard Secrist
   o Publishing & Distribution: Joseph Minarick
   o Treasurer: Steven A. Wallace

+---------------------< BULLETIN BOARDS OF NOTE >-----------------------+
|                                                                       |
| Nat'l FIG BBS (24 hours, 300 baud): 415/538-3580                      |
| Jim Altman RCP/M Atlanta (200mb online, lotsa SIG-M): 404/627-7127    |
| RCP/M Frog Hollow, Vancouver, BC (Mac stuff on B4:): 604/937-0906     |
|                                                                       |
+-----------------------------------------------------------------------+

[ end of posting #3 ]

Acknowledge-To: <UMFORTH@WEIZMANN>