5132ts2@hound.UUCP (T.SCHONFELD) (12/18/85)
I am sure you have seen this before, but if anyone has info describing the Xmodem protocol, I would greatly appreciate it if you could send me the info. ihnp4!hound!5132ts2
reintom@rocky2.UUCP (Tom Reingold) (12/21/85)
>I am sure you have seen this before, but if anyone has info describing
>the Xmodem protocol, I would greatly appreciate it if you could send
>me the info.
> ihnp4!hound!5132ts2
>

Here it is. I got it from a bulletin board. I don't remember which one.

Tom Reingold
New York City

--------------------------------------------------------------

XMODEM File Transfer Protocol
By Larry Jordan

When transferring files between computers using the telephone system, there is always the chance that electrical noise will result in data transmission errors. To ensure proper transfer of files it is necessary to detect data transmission errors and to retransmit data that contains errors. Most people think that asynchronous parity error detection provides that capability. It does not. Parity error detection does tell you when a data transfer error has occurred, but it is up to you to retransmit the data to correct errors.

The problem is that parity error detection is not actually performed by most IBM PC communication packages. If a package does perform the error detection, it may not inform you of errors in such a way that you know to immediately retransmit the data. ASCOM, for example, places an asterisk in a file where parity errors are detected, but you may not realize the errors occurred until long after the file is transferred.

To ensure "error-free" data transfer you need a protocol file transfer technique. Andrew Fluegelman has added such a technique to PC-TALK.III called the XMODEM protocol.

A protocol is a set of rules and conventions that apply to a specific area of communications that allow participants to properly communicate regardless of the hardware brand or software package being used. The protocol file transfer is a set of rules for transferring files which specifies a set of ASCII handshaking characters and the sequence of handshaking required to perform certain file transfer functions.
Protocol handshaking signals allow communication software to transfer text, data and machine code files, and to perform sophisticated error-checking. The handicap in using protocol file transfer techniques is that the computers on both ends of the communications link must be using compatible software; there is no standard that controls these protocols, and almost all communication packages that have a protocol file transfer option use a protocol unique to that package. This means that a business or group of people must standardize its microcomputer communications software to take advantage of protocol transfers.

The Ward Christensen XMODEM protocol is one specific file transfer protocol that may become a default standard in personal communications because of its widespread use on bulletin boards and because of its inclusion in low cost personal computer communication packages such as PC-TALK. It has not gained widespread acceptance in business communication packages partly because the protocol is public domain; most business communication package designers use unique protocols to force businesses to use their software on both ends of communication links.

By providing you with this insight into protocol transfer and explaining in detail the operation of the XMODEM protocol, I hope to add momentum to the development of a "standard protocol" whether it be the XMODEM model or some other model. Users of communication software deserve a standard protocol that will allow them to use the technique with any microcomputer regardless of the software packages employed.

The XMODEM protocol is illustrated in Figure 1. As you can see from that figure, XMODEM does not begin the transfer of data until the receiving computer signals the transmitting computer that it is ready to receive data. The Negative Acknowledge (NAK) character is used for this signal and is sent to the transmitting computer every 10 seconds until the file transfer begins.
If the file transfer does not begin after 9 NAKs are sent, the process has to be manually restarted. After a NAK is received, the transmitting computer uses a Start of Header (SOH) character and two block numbers (a true block number followed by a 1's complement of the number) to signal the start of a 128-byte block of data to be transferred, then sends the block followed by an error-checking checksum. The checksum is calculated by adding the ASCII values of each character in the 128 character block; the sum is then divided by 256 and the remainder is retained as the checksum. After each block of data is transferred, the receiving computer computes its own checksum and compares the result to the checksum received from the transmitting computer. If the two values are the same, the receiving computer sends an Acknowledge (ACK) character to tell the transmitter to send the next sequential block. If the two values are not the same, the receiving computer sends the transmitter a NAK to request a retransmission of the last block. This retransmission process is repeated until the block of data is properly received or until 9 attempts have been made to transmit the block. If the communications link is noisy, resulting in improper block transmission after 9 attempts, the file transfer is aborted.

XMODEM uses two block numbers at the start of each block to be sure the same block is not transmitted twice because of a handshake character loss during the transfer. The receiving computer checks the transmitted block to be sure that it is the one requested, and blocks that are retransmitted by mistake are thrown away. When all data has been successfully transmitted, the transmitting computer sends the receiver an End of Transmission (EOT) character to indicate the end of file.

The XMODEM protocol offers the IBM PC several advantages over other protocols and file transfer methods.
First, the protocol is in the public domain, which makes it readily available for software designers to incorporate into a communications package. Second, the protocol is easy to implement using high level languages such as BASIC or Pascal. Third, the protocol only requires a 256-byte communication receive buffer, which makes it attractive for IBM PC owners who only have 64K systems. Fourth, the protocol allows a user to transfer non-ASCII 8-bit data files (e.g., COM, EXE and tokenized BASIC) between microcomputers because it calculates the end of a file based on file size and uses handshake signals to indicate the end of a file instead of relying on an end of file marker character (control-Z) to terminate a file transfer. Fifth, XMODEM error-checking is superior to normal asynchronous parity error checking. The parity method of error-checking is 95% effective if the software on the receiving end checks for parity errors. XMODEM error-checking is 99.6% effective, and the software on the receiving end must check for errors. Parity errors detected also do not result in automatic retransmission of the bad data; XMODEM detected errors result in data retransmission until no errors are detected or until 9 retransmissions have been attempted. Finally, the protocol is used by many CP/M bulletin boards, and having the protocol in a communications package allows the IBM PC user to receive error-checked files from these bulletin boards.

Andrew Fluegelman has given the XMODEM protocol a real boost in the IBM PC world by including it in his package. He has also added significant power to the package by including the protocol. Rumor has it that Don Withrow will soon add to the XMODEM momentum by adding it to his HOSTCOMM software package. Keep up the good work, guys -- we will get a standard one way or the other!

[This article was derived from material contained in a book written by Larry Jordan and Bruce Churchill to be published this Summer by The Brady Company.
The article will also be in the 5th issue of PC World magazine.]

                XMODEM Protocol File Transfer

    Receiving                                    Transmitting
    Computer                                     Computer

    Ready to                                     Ready to
    Receive                                      Transmit
       |                                                |
       |                                                |
       |---------------------\NAK\--------------------->|
       |                                                |
       |<------/SOH/Blk #1/Blk #1/Good Data/CkSum/------|
       |                                                |
       |---------------------\ACK\--------------------->|
       |                                                |
       |<------/SOH/Blk #2/Blk #2/Good Data/CkSum/------|
       |                                                |
       |---------------------\ACK\--------------------->|
       |                                                |
       |<------/SOH/Blk #3/Blk #3/Garbled Data/CkSum/---|
       |                                                |
       |---------------------\NAK\--------------------->|
       |                                                |
       |<------/SOH/Blk #3/Blk #3/Good Data/CkSum/------|
       |                                                |
       |---------------------\ACK\--------------------->|
       |                                                |
       |<--------------------/EOT/----------------------|
       |                                                |
       |---------------------\ACK\--------------------->|
       |                                                |
       V                                                V
     File                                             File
     Receipt                                          Transmit
     Ends                                             Ends

                          Figure 1

-------------------------------------------------------------------

MODEM PROTOCOL OVERVIEW

1/1/82 by Ward Christensen. I will maintain a master copy of this. Please pass on changes or suggestions via CBBS/Chicago at (312) 545-8086, or by voice at (312) 849-6279.

NOTE this does not include things which I am not familiar with, such as the CRC option implemented by John Mahr.

Last Rev: (none)

At the request of Rick Mallinak on behalf of the guys at Standard Oil with IBM P.C.s, as well as several previous requests, I finally decided to put my modem protocol into writing. It had been previously formally published only in the AMRAD newsletter.

Table of Contents
1. DEFINITIONS
2. TRANSMISSION MEDIUM LEVEL PROTOCOL
3. MESSAGE BLOCK LEVEL PROTOCOL
4. FILE LEVEL PROTOCOL
5. DATA FLOW EXAMPLE INCLUDING ERROR RECOVERY
6. PROGRAMMING TIPS.

-------- 1. DEFINITIONS.

    <soh>  01H
    <eot>  04H
    <ack>  06H
    <nak>  15H
    <can>  18H

-------- 2. TRANSMISSION MEDIUM LEVEL PROTOCOL

Asynchronous, 8 data bits, no parity, one stop bit.

The protocol imposes no restrictions on the contents of the data being transmitted. No control characters are looked for in the 128-byte data messages.
Absolutely any kind of data may be sent - binary, ASCII, etc. The protocol has not formally been adapted to a 7-bit environment for the transmission of ASCII-only (or unpacked-hex) data, although it could be simply by having both ends agree to AND the protocol-dependent data with 7F hex before validating it. I specifically am referring to the checksum, and the block numbers and their ones-complement.

Those wishing to maintain compatibility of the CP/M file structure, i.e. to allow modemming ASCII files to or from CP/M systems, should follow this data format:

  * ASCII tabs used (09H); tabs set every 8.
  * Lines terminated by CR/LF (0DH 0AH)
  * End-of-file indicated by ^Z, 1AH. (one or more)
  * Data is variable length, i.e. should be considered a continuous stream of data bytes, broken into 128-byte chunks purely for the purpose of transmission.
  * A CP/M "peculiarity": If the data ends exactly on a 128-byte boundary, i.e. CR in 127, and LF in 128, a subsequent sector containing the ^Z EOF character(s) is optional, but is preferred. Some utilities or user programs still do not handle EOF without ^Zs.
  * The last block sent is no different from others, i.e. there is no "short block".

-------- 3. MESSAGE BLOCK LEVEL PROTOCOL

Each block of the transfer looks like:

    <SOH><blk #><255-blk #><--128 data bytes--><cksum>

in which:

    <SOH>       = 01 hex
    <blk #>     = binary number, starts at 01, increments by 1, and
                  wraps 0FFH to 00H (not to 01)
    <255-blk #> = blk # after going thru 8080 "CMA" instr, i.e.
                  each bit complemented in the 8-bit block number.
                  Formally, this is the "ones complement".
    <cksum>     = the sum of the data bytes only. Toss any carry.

-------- 4. FILE LEVEL PROTOCOL

---- 4A. COMMON TO BOTH SENDER AND RECEIVER:

All errors are retried 10 times. For versions running with an operator (i.e. NOT with XMODEM), a message is typed after 10 errors asking the operator whether to "retry or quit".

Some versions of the protocol use <can>, ASCII ^X, to cancel transmission.
This was never adopted as a standard, as having a single "abort" character makes the transmission susceptible to false termination due to an <ack>, <nak> or <soh> being corrupted into a <can> and canceling transmission.

The protocol may be considered "receiver driven", that is, the sender need not automatically re-transmit, although it does in the current implementations.

---- 4B. RECEIVE PROGRAM CONSIDERATIONS:

The receiver has a 10-second timeout. It sends a <nak> every time it times out. The receiver's first timeout, which sends a <nak>, signals the transmitter to start. Optionally, the receiver could send a <nak> immediately, in case the sender was ready. This would save the initial 10 second timeout. However, the receiver MUST continue to timeout every 10 seconds in case the sender wasn't ready.

Once into receiving a block, the receiver goes into a one-second timeout for each character and the checksum. If the receiver wishes to <nak> a block for any reason (invalid header, timeout receiving data), it must wait for the line to clear. See "programming tips" for ideas.

Synchronizing: If a valid block number is received, it will be:

  1) the expected one, in which case everything is fine; or
  2) a repeat of the previously received block. This should be considered OK, and only indicates that the receiver's <ack> got glitched, and the sender re-transmitted;
  3) any other block number indicates a fatal loss of synchronization, such as the rare case of the sender getting a line-glitch that looked like an <ack>. Abort the transmission, sending a <can>.

---- 4C. SENDING PROGRAM CONSIDERATIONS.

While waiting for transmission to begin, the sender has only a single very long timeout, say one minute. In the current protocol, the sender has a 10 second timeout before retrying. I suggest NOT doing this, and letting the protocol be completely receiver-driven. This will be compatible with existing programs.
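As a rough illustration of the sending side, here is a minimal Python sketch that cuts a file into the blocks laid out in section 3. The names `make_blocks` and `checksum` are hypothetical, and filling out the final block with ^Z (1AH) is an assumption borrowed from the CP/M convention in section 2; the protocol itself only says there is no "short block". It builds the blocks only and does no serial I/O or retry handling.

```python
SOH = 0x01
CPM_EOF = 0x1A      # ^Z, used here as filler (assumption: CP/M convention)
BLOCK_SIZE = 128


def checksum(data: bytes) -> int:
    """Sum of the data bytes only, carry tossed (kept to 8 bits)."""
    return sum(data) & 0xFF


def make_blocks(payload: bytes):
    """Yield complete <SOH><blk #><255-blk #><128 data bytes><cksum> blocks."""
    blk = 1  # block numbers start at 01
    for start in range(0, len(payload), BLOCK_SIZE):
        data = payload[start:start + BLOCK_SIZE]
        # No "short block": pad the last chunk up to 128 bytes.
        data = data.ljust(BLOCK_SIZE, bytes([CPM_EOF]))
        header = bytes([SOH, blk, 0xFF - blk])
        yield header + data + bytes([checksum(data)])
        blk = (blk + 1) & 0xFF  # wraps 0FFH to 00H, not to 01


blocks = list(make_blocks(b"A" * 130))
# Number of blocks, size of one block, and the first two headers in hex.
print(len(blocks), len(blocks[0]), blocks[0][:3].hex(), blocks[1][:3].hex())
```

A 130-byte payload yields two 132-byte blocks with headers 01 01 FE and 01 02 FD, matching the header bytes shown in the data flow example in section 5.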
When the sender has no more data, it sends an <eot>, and awaits an <ack>, resending the <eot> if it doesn't get one. Again, the protocol could be receiver-driven, with the sender only having the high-level 1-minute timeout to abort.

-------- 5. DATA FLOW EXAMPLE INCLUDING ERROR RECOVERY

Here is a sample of the data flow, sending a 3-block message. It includes the two most common line hits - a garbaged block, and an <ack> reply getting garbaged. <xx> represents the checksum byte.

    SENDER                                  RECEIVER
                                    times out after 10 seconds,
                            <---            <nak>
    <soh> 01 FE -data- <xx> --->
                            <---            <ack>
    <soh> 02 FD -data- xx   --->    (data gets line hit)
                            <---            <nak>
    <soh> 02 FD -data- xx   --->
                            <---            <ack>
    <soh> 03 FC -data- xx   --->
       (ack gets garbaged)  <---            <ack>
    <soh> 03 FC -data- xx   --->            <ack>
    <eot>                   --->
                            <---            <ack>

-------- 6. PROGRAMMING TIPS.

  * The character-receive subroutine should be called with a parameter specifying the number of seconds to wait. The receiver should first call it with a time of 10, then <nak> and try again, 10 times. After receiving the <soh>, the receiver should call the character receive subroutine with a 1-second timeout, for the remainder of the message and the <cksum>. Since they are sent as a continuous stream, timing out of this implies a serious line glitch that caused, say, 127 characters to be seen instead of 128.

  * When the receiver wishes to <nak>, it should call a "PURGE" subroutine, to wait for the line to clear. Recall the sender tosses any characters in its UART buffer immediately upon completing sending a block, to ensure no glitches were misinterpreted. The most common technique is for "PURGE" to call the character receive subroutine, specifying a 1-second timeout, and looping back to PURGE until a timeout occurs. The <nak> is then sent, ensuring the other end will see it.

  * You may wish to add code recommended by John Mahr to your character receive routine - to set an error flag if the UART shows framing error, or overrun.
    This will help catch a few more glitches - the most common of which is a hit in the high bits of the byte in two consecutive bytes. The <cksum> comes out OK since summing in 1 byte produces the same result for 80H + 80H as for 00H + 00H.
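The blind spot described in that last tip can be shown in a few lines of Python. This is only a sketch: the `checksum` helper is a hypothetical name, and the test block is made-up data rather than anything from a real transfer.

```python
def checksum(data: bytes) -> int:
    """Sum of the data bytes only, carry tossed (kept to 8 bits)."""
    return sum(data) & 0xFF


good = bytes(range(128))      # an arbitrary 128-byte data block

# Simulate a line hit that flips the high bit of two consecutive bytes.
hit = bytearray(good)
hit[10] ^= 0x80
hit[11] ^= 0x80
hit = bytes(hit)

# Simulate a hit on the high bit of a single byte, for comparison.
single = bytearray(good)
single[10] ^= 0x80
single = bytes(single)

print(hit != good, checksum(hit) == checksum(good))        # True True
print(single != good, checksum(single) == checksum(good))  # True False
```

The double hit changes the sum by a multiple of 256, which vanishes when the carry is tossed, so only a check outside the checksum (such as the UART framing or overrun flag) has a chance of catching it.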