[net.micro.pc] xmodem protocol description

5132ts2@hound.UUCP (T.SCHONFELD) (12/18/85)

I am sure you have seen this before, but if anyone has info describing
the Xmodem protocol, I would greatly appreciate it if you could send
me the info.
               ihnp4!hound!5132ts2

reintom@rocky2.UUCP (Tom Reingold) (12/21/85)

>I am sure you have seen this before, but if anyone has info describing
>the Xmodem protocol, I would greatly appreciate it if you could send
>me the info.
>               ihnp4!hound!5132ts2
>
Here it is.  I got it from a bulletin board.  I don't remember which
one.

		Tom Reingold
		New York City
--------------------------------------------------------------




              XMODEM File Transfer Protocol

                    By Larry Jordan

When transferring files between computers using the telephone
system, there is always the chance that electrical noise will
result in data transmission errors. To ensure proper transfer of
files it is necessary to detect data transmission errors and to
retransmit data that contains errors. Most people think that
asynchronous parity error detection provides that capability. It
does not. Parity error detection does tell you when a data
transfer error has occurred, but it is up to you to retransmit
the data to correct errors. The problem is that parity error
detection is not actually performed by most IBM PC communication
packages. If a package does perform the error detection, it may
not inform you of errors in such a way that you know to
immediately retransmit the data. ASCOM, for example, places an
asterisk in a file where parity errors are detected, but you
may not realize the errors occurred until long after the file is
transferred. To ensure "error-free" data transfer you need a
protocol file transfer technique. Andrew Fluegelman has added
such a technique to PC-TALK.III called the XMODEM protocol.

A protocol is a set of rules and conventions that apply to a
specific area of communications that allow participants to
properly communicate regardless of the hardware brand or software
package being used. The protocol file transfer is a set of rules
for transferring files which specifies a set of ASCII handshaking
characters and the sequence of handshaking required to perform
certain file transfer functions. Protocol handshaking signals
allow communication software to transfer text, data and machine
code files, and to perform sophisticated error-checking. The
handicap in using protocol file transfer techniques is that the
computers on both ends of the communications link must be using
compatible software; there is no standard that controls these
protocols and almost all communication packages that have a
protocol file transfer option use a protocol unique to that
package. This means that a business or group of people must
standardize its microcomputer communications software to take
advantage of protocol transfers.

The Ward Christensen XMODEM protocol is one specific file
transfer protocol that may become a default standard in personal
communications because of its widespread use on bulletin
boards and because of its inclusion in low cost personal computer
communication packages such as PC-TALK. It has not gained
widespread acceptance in business communication packages partly
because the protocol is public domain; most business
communication package designers use unique protocols to force
businesses to use their software on both ends of communication
links. By providing you with this insight into protocol transfer
and explaining in detail the operation of the XMODEM protocol, I
hope to add momentum to the development of a "standard protocol"
whether it be the XMODEM model or some other model. Users of
communication software deserve a standard protocol that will
allow them to use the technique with any microcomputer regardless
of the software packages employed.

The XMODEM protocol is illustrated in Figure 1. As you can see
from that figure, XMODEM does not begin the transfer of data
until the receiving computer signals the transmitting computer
that it is ready to receive data. The Negative Acknowledge (NAK)
character is used for this signal and is sent to the transmitting
computer every 10 seconds until the file transfer begins. If the
file transfer does not begin after 9 NAK's are sent, the process
has to be manually restarted.

After a NAK is received, the transmitting computer uses a Start
of Header (SOH) character and two block numbers (a true block
number followed by a 1's complement of the number) to signal the
start of a 128-byte block of data to be transferred. It then
sends the block followed by an error-checking checksum.  The
checksum is calculated by adding the ASCII values of each
character in the 128 character block; the sum is then divided by
256 and the remainder is retained as the checksum.  After each
block of data is transferred, the receiving computer computes its
own checksum and compares the result to the checksum received
from the transmitting computer.  If the two values are the same,
the receiving computer sends an Acknowledge (ACK) character to
tell the transmitter to send the next sequential block.  If the
two values are not the same, the receiving computer sends the
transmitter a NAK to request a retransmission of the last block.
This retransmission process is repeated until the block of data
is properly received or until 9 attempts have been made to
transmit the block.  If the communications link is noisy,
resulting in improper block transmission after 9 attempts, the
file transfer is aborted.
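
As a rough illustration of the checksum comparison described above,
here is a minimal Python sketch.  The function names checksum and
reply_for are invented for this example and are not taken from any
XMODEM program; ACK and NAK use the standard ASCII codes.

```python
ACK = 0x06  # ASCII Acknowledge
NAK = 0x15  # ASCII Negative Acknowledge

def checksum(data: bytes) -> int:
    """Add up the byte values of a block and keep the remainder mod 256."""
    return sum(data) % 256

def reply_for(data: bytes, received_cksum: int) -> int:
    """Return ACK if the locally computed checksum matches, otherwise NAK."""
    return ACK if checksum(data) == received_cksum else NAK

block = b"A" * 128                 # a 128-character block of 'A' (41 hex)
good = checksum(block)             # 65 * 128 = 8320, and 8320 mod 256 = 128
damaged = b"B" + block[1:]         # the same block with one character changed
```

Here reply_for(block, good) yields ACK, while reply_for(damaged, good)
yields NAK and would trigger a retransmission.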

XMODEM uses two block numbers at the start of each block to be
sure the same block is not transmitted twice because of a
handshake character loss during the transfer. The receiving
computer checks the transmitted block to be sure that it is the
one requested and blocks that are retransmitted by mistake are
thrown away. When all data has been successfully transmitted, the
transmitting computer sends the receiver an End of Transmission
(EOT) character to indicate the end of file.

The XMODEM protocol offers the IBM PC several advantages over other
protocols and file transfer methods. First, the protocol is in
the public domain which makes it readily available for software
designers to incorporate into a communications package. Second,
the protocol is easy to implement using high level languages such
as BASIC or Pascal. Third, the protocol only requires a 256-byte
communication receive buffer which makes it attractive for IBM PC
owners who only have 64K systems. Fourth, the protocol allows a
user to transfer non-ASCII 8-bit data files (i.e., COM, EXE and
tokenized BASIC) between microcomputers because it calculates the
end of a file based on file size and uses handshake signals to
indicate the end of a file instead of relying on an end of file
marker character (control-Z) to terminate a file transfer.
Fifth, XMODEM error-checking is superior to normal asynchronous
parity error checking.  The parity method of error-checking is
95% effective if the software on the receiving end checks for
parity errors.  XMODEM error-checking is 99.6% effective, and the
software on the receiving end must check for errors.  Parity
errors detected also do not result in automatic retransmission of
the bad data; XMODEM detected errors result in data
retransmission until no errors are detected or until 9
retransmissions have been attempted.  Finally, the protocol is
used by many CP/M bulletin boards and having the protocol in a
communications package allows the IBM PC user to receive
error-checked files from these bulletin boards.

Andrew Fluegelman has given the XMODEM protocol a real boost in
the IBM PC world by including it in his package. He has also
added significant power to the package by including the protocol.
Rumor has it that Don Withrow will soon add to the XMODEM
momentum by adding it to his HOSTCOMM software package. Keep up
the good work guys -- we will get a standard one way or the
other!

[This article was derived from material contained in a book
written by Larry Jordan and Bruce Churchill to be published this
Summer by The Brady Company. The article will also be in the
5th issue of PC World magazine.]



             XMODEM Protocol File Transfer


Receiving                                      Transmitting
Computer                                         Computer
Ready to                                         Ready to
Receive                                          Transmit
   |                                                |
   |                                                |
   |---------------------\NAK\--------------------->|
   |                                                |
   |<------/SOH/Blk #1/Blk #1/Good Data/CkSum/------|
   |                                                |
   |---------------------\ACK\--------------------->|
   |                                                |
   |<------/SOH/Blk #2/Blk #2/Good Data/CkSum/------|
   |                                                |
   |---------------------\ACK\--------------------->|
   |                                                |
   |<------/SOH/Blk #3/Blk #3/Garbled Data/CkSum/---|
   |                                                |
   |---------------------\NAK\--------------------->|
   |                                                |
   |<------/SOH/Blk #3/Blk #3/Good Data/CkSum/------|
   |                                                |
   |---------------------\ACK\--------------------->|
   |                                                |
   |<--------------------/EOT/----------------------|
   |                                                |
   |---------------------\ACK\--------------------->|
   |                                                |
   V                                                V

  File                                             File
Receipt                                          Transmit
  Ends                                             Ends

                           Figure 1


-------------------------------------------------------------------


MODEM PROTOCOL OVERVIEW

1/1/82 by Ward Christensen.  I will maintain a master copy of
this.  Please pass on changes or suggestions via CBBS/Chicago
at (312) 545-8086, or by voice at (312) 849-6279.

NOTE this does not include things which I am not familiar with,
such as the CRC option implemented by John Mahr.

Last Rev: (none)

At the request of Rick Mallinak on behalf of the guys at
Standard Oil with IBM P.C.s, as well as several previous
requests, I finally decided to put my modem protocol into
writing.  It had been previously formally published only in the
AMRAD newsletter.

        Table of Contents
1. DEFINITIONS
2. TRANSMISSION MEDIUM LEVEL PROTOCOL
3. MESSAGE BLOCK LEVEL PROTOCOL
4. FILE LEVEL PROTOCOL
5. DATA FLOW EXAMPLE INCLUDING ERROR RECOVERY
6. PROGRAMMING TIPS.

-------- 1. DEFINITIONS.
<soh>   01H
<eot>   04H
<ack>   06H
<nak>   15H
<can>   18H

-------- 2. TRANSMISSION MEDIUM LEVEL PROTOCOL
Asynchronous, 8 data bits, no parity, one stop bit.

    The protocol imposes no restrictions on the contents of the
data being transmitted.  No control characters are looked for
in the 128-byte data messages.  Absolutely any kind of data may
be sent - binary, ASCII, etc.  The protocol has not formally
been adapted to a 7-bit environment for the transmission of
ASCII-only (or unpacked-hex) data, although it could be simply
by having both ends agree to AND the protocol-dependent data
with 7F hex before validating it.  I specifically am referring
to the checksum, and the block numbers and their ones-
complement.
    Those wishing to maintain compatibility of the CP/M file
structure, i.e. to allow modemming ASCII files to or from CP/M
systems should follow this data format:
  * ASCII tabs used (09H); tabs set every 8.
  * Lines terminated by CR/LF (0DH 0AH)
  * End-of-file indicated by ^Z, 1AH.  (one or more)
  * Data is variable length, i.e. should be considered a
    continuous stream of data bytes, broken into 128-byte
    chunks purely for the purpose of transmission.
  * A CP/M "peculiarity": If the data ends exactly on a
    128-byte boundary, i.e. CR in 127, and LF in 128, a
    subsequent sector containing the ^Z EOF character(s)
    is optional, but is preferred.  Some utilities or
    user programs still do not handle EOF without ^Zs.
  * The last block sent is no different from others, i.e.
    there is no "short block".
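
The data format rules above can be sketched in Python as follows.
This is only an illustration: the name split_into_blocks and the
pad_on_boundary flag are assumptions made for the example, and
padding the final block with ^Z follows the CP/M convention
described in the list rather than anything the protocol enforces.

```python
from typing import List

BLOCK_SIZE = 128
CTRL_Z = b"\x1a"   # CP/M end-of-file character (^Z, 1AH)

def split_into_blocks(data: bytes, pad_on_boundary: bool = True) -> List[bytes]:
    """Break a byte stream into full 128-byte blocks, padding with ^Z."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    if blocks and len(blocks[-1]) < BLOCK_SIZE:
        # No "short block": fill the last block out to 128 bytes.
        blocks[-1] = blocks[-1].ljust(BLOCK_SIZE, CTRL_Z)
    elif pad_on_boundary:
        # Data ended exactly on a boundary (or was empty): add the
        # optional-but-preferred extra sector of ^Z characters.
        blocks.append(CTRL_Z * BLOCK_SIZE)
    return blocks
```

For example, split_into_blocks(b"hello\r\n") gives one block holding
the 7 data bytes followed by 121 ^Z characters.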

-------- 3. MESSAGE BLOCK LEVEL PROTOCOL
 Each block of the transfer looks like:
<SOH><blk #><255-blk #><--128 data bytes--><cksum>
    in which:
<SOH>       = 01 hex
<blk #>     = binary number, starts at 01 increments by 1, and
              wraps 0FFH to 00H (not to 01)
<255-blk #> = blk # after going thru 8080 "CMA" instr, i.e.
              each bit complemented in the 8-bit block number.
              Formally, this is the "ones complement".
<cksum>     = the sum of the data bytes only.  Toss any carry.
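
As a sketch of the block layout just described, the following Python
builds one block and checks a received header.  The names build_block
and header_ok are made up for this illustration.

```python
SOH = 0x01

def build_block(blk_num: int, data: bytes) -> bytes:
    """Return <SOH><blk #><255-blk #><128 data bytes><cksum> as bytes."""
    if len(data) != 128:
        raise ValueError("data must be exactly 128 bytes")
    n = blk_num & 0xFF            # block number wraps 0FFH to 00H
    cksum = sum(data) & 0xFF      # sum of the data bytes only, carry tossed
    return bytes([SOH, n, 0xFF - n]) + data + bytes([cksum])

def header_ok(block: bytes) -> bool:
    """Check length, <SOH>, and that byte 2 is the ones complement of byte 1."""
    return (len(block) == 132
            and block[0] == SOH
            and block[1] ^ 0xFF == block[2])

first = build_block(1, bytes(128))   # header bytes 01 01 FE, checksum 00
```

Note that 255 minus the block number and the bitwise complement of
the block number are the same thing for an 8-bit value, which is why
the sketch can build with one and check with the other.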

-------- 4. FILE LEVEL PROTOCOL

---- 4A. COMMON TO BOTH SENDER AND RECEIVER:

    All errors are retried 10 times.  For versions running with
an operator (i.e. NOT with XMODEM), a message is typed after 10
errors asking the operator whether to "retry or quit".
    Some versions of the protocol use <can>, ASCII ^X, to
cancel transmission.  This was never adopted as a standard, as
having a single "abort" character makes the transmission
susceptible to false termination due to an <ack> <nak> or <soh>
being corrupted into a <can> and canceling transmission.
    The protocol may be considered "receiver driven", that is,
the sender need not automatically re-transmit, although it does
in the current implementations.

---- 4B. RECEIVE PROGRAM CONSIDERATIONS:
    The receiver has a 10-second timeout.  It sends a <nak>
every time it times out.  The receiver's first timeout, which
sends a <nak>, signals the transmitter to start.  Optionally,
the receiver could send a <nak> immediately, in case the sender
was ready.  This would save the initial 10 second timeout.
However, the receiver MUST continue to timeout every 10 seconds
in case the sender wasn't ready.
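
One possible shape for this start-up handshake, in Python.  Both
read_byte (returns a byte value, or None on timeout) and send_byte
are hypothetical callables standing in for whatever serial I/O the
program uses; the 10-attempt limit is borrowed from the general
retry rule in 4A.

```python
from typing import Callable, Optional

NAK = 0x15

def wait_for_start(read_byte: Callable[[float], Optional[int]],
                   send_byte: Callable[[int], None],
                   nak_immediately: bool = False,
                   attempts: int = 10) -> Optional[int]:
    """Prompt the sender with <nak> until a first byte arrives.

    Returns the first byte received, or None if every attempt timed out.
    """
    if nak_immediately:
        send_byte(NAK)            # optional: skip the initial 10-second wait
    for _ in range(attempts):
        first = read_byte(10.0)   # the receiver's 10-second timeout
        if first is not None:
            return first
        send_byte(NAK)            # timed out: (re)signal readiness
    return None
```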
    Once into receiving a block, the receiver goes into a
one-second timeout for each character and the checksum.  If the
receiver wishes to <nak> a block for any reason (invalid
header, timeout receiving data), it must wait for the line to
clear.  See "programming tips" for ideas.
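
Waiting for the line to clear before sending a <nak> might look
like the Python sketch below.  As before, read_byte (returning None
on timeout) and send_byte are assumed helper callables, and the
names purge and nak_block are chosen for the example.

```python
from typing import Callable, Optional

NAK = 0x15

def purge(read_byte: Callable[[float], Optional[int]]) -> int:
    """Discard incoming bytes until the line is quiet for one second.

    Returns the number of bytes thrown away.
    """
    discarded = 0
    while read_byte(1.0) is not None:
        discarded += 1
    return discarded

def nak_block(read_byte: Callable[[float], Optional[int]],
              send_byte: Callable[[int], None]) -> None:
    """Let the line clear, then ask for the block again."""
    purge(read_byte)
    send_byte(NAK)
```

Sending the <nak> only after a full second of silence means the
sender has finished transmitting and is listening when it arrives.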
    Synchronizing:  If a valid block number is received, it
will be: 1) the expected one, in which case everything is fine;
or 2) a repeat of the previously received block.  This should
be considered OK, and only indicates that the receiver's <ack>
got glitched, and the sender re-transmitted; or 3) any other
block number, which indicates a fatal loss of synchronization,
such as the rare case of the sender getting a line-glitch that
looked like an <ack>.  In that case, abort the transmission,
sending a <can>.
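
The three synchronization cases can be expressed as a small Python
function.  The name classify_block and the string labels are
inventions for this sketch.

```python
def classify_block(received: int, expected: int) -> str:
    """Compare a valid received block number with the expected one."""
    expected &= 0xFF
    if received == expected:
        return "expected"          # case 1: store the data and <ack>
    if received == (expected - 1) & 0xFF:
        return "repeat"            # case 2: our <ack> was lost; not an error
    return "fatal"                 # case 3: lost synchronization, abort
```

The masking with 0FFH matters at the wrap point: when block 00H is
expected, a repeat of the previous block carries number 0FFH.  On a
repeat the receiver would normally <ack> again and discard the
duplicate data so that the sender can move on.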

---- 4C. SENDING PROGRAM CONSIDERATIONS.

    While waiting for transmission to begin, the sender has
only a single very long timeout, say one minute.  In the
current protocol, the sender has a 10 second timeout before
retrying.  I suggest NOT doing this, and letting the protocol
be completely receiver-driven.  This will be compatible with
existing programs.
    When the sender has no more data, it sends an <eot>, and
awaits an <ack>, resending the <eot> if it doesn't get one.
Again, the protocol could be receiver-driven, with the sender
only having the high-level 1-minute timeout to abort.
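
A receiver-driven sender along these lines could be sketched in
Python as shown here.  This is an illustration under assumptions,
not any existing program's code: send and read_byte (None on
timeout) are hypothetical I/O callables, ACK uses the standard
ASCII value 06H, and treating any reply other than <ack> as a
request to resend is a choice made for the example.

```python
from typing import Callable, Iterable, Optional

SOH, EOT, NAK = 0x01, 0x04, 0x15
ACK = 0x06            # standard ASCII value for ACK
MAX_TRIES = 10        # at most 10 transmissions per block in this sketch
LONG_TIMEOUT = 60.0   # the sender's single "very long" timeout, in seconds

def send_file(blocks: Iterable[bytes],
              send: Callable[[bytes], None],
              read_byte: Callable[[float], Optional[int]]) -> bool:
    """Send 128-byte blocks; return True if the receiver ACKed everything."""
    # Wait for the receiver's first <nak>; give up if the line stays silent.
    while True:
        reply = read_byte(LONG_TIMEOUT)
        if reply is None:
            return False
        if reply == NAK:
            break

    blk_num = 1
    for data in blocks:
        packet = (bytes([SOH, blk_num, 0xFF - blk_num]) + data
                  + bytes([sum(data) & 0xFF]))
        for _ in range(MAX_TRIES):
            send(packet)
            reply = read_byte(LONG_TIMEOUT)
            if reply == ACK:
                break
            if reply is None:      # nothing heard for a minute: abort
                return False
            # Anything else (normally <nak>) means "send it again".
        else:
            return False           # retry limit reached
        blk_num = (blk_num + 1) & 0xFF

    for _ in range(MAX_TRIES):
        send(bytes([EOT]))
        reply = read_byte(LONG_TIMEOUT)
        if reply == ACK:
            return True
        if reply is None:
            return False
    return False
```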


-------- 5. DATA FLOW EXAMPLE INCLUDING ERROR RECOVERY

Here is a sample of the data flow, sending a 3-block message.
It includes the two most common line hits - a garbaged block,
and an <ack> reply getting garbaged.  <xx> represents the
checksum byte.

SENDER                                  RECEIVER
                                times out after 10 seconds,
                        <---            <nak>
<soh> 01 FE -data- <xx> --->
                        <---            <ack>
<soh> 02 FD -data- <xx> --->    (data gets line hit)
                        <---            <nak>
<soh> 02 FD -data- <xx> --->
                        <---            <ack>
<soh> 03 FC -data- <xx> --->
   (ack gets garbaged)  <---            <ack>
<soh> 03 FC -data- <xx> --->            <ack>
<eot>                   --->
                        <---            <ack>

-------- 6. PROGRAMMING TIPS.

* The character-receive subroutine should be called with a
parameter specifying the number of seconds to wait.  The
receiver should first call it with a time of 10, then <nak> and
try again, 10 times.
  After receiving the <soh>, the receiver should call the
character receive subroutine with a 1-second timeout, for the
remainder of the message and the <cksum>.  Since they are sent
as a continuous stream, timing out of this implies a serious
line glitch that caused, say, 127 characters to be seen instead
of 128.

* When the receiver wishes to <nak>, it should call a "PURGE"
subroutine, to wait for the line to clear.  Recall the sender
tosses any characters in its UART buffer immediately upon
completing sending a block, to ensure no glitches were mis-
interpreted.
  The most common technique is for "PURGE" to call the
character receive subroutine, specifying a 1-second timeout,
and looping back to PURGE until a timeout occurs.  The <nak> is
then sent, ensuring the other end will see it.

* You may wish to add code recommended by John Mahr to your
character receive routine - to set an error flag if the UART
shows framing error, or overrun.  This will help catch a few
more glitches - the most common of which is a hit in the high
bits of the byte in two consecutive bytes.  The <cksum> comes
out OK since, counting in 1 byte, adding 80H + 80H produces the
same result as adding 00H + 00H.
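
The blind spot mentioned in the last tip is easy to demonstrate.  In
this small Python example (the variable names are arbitrary), two
consecutive bytes each take a hit in the high bit and the 8-bit
checksum does not change, while a single hit is caught.

```python
def checksum(data: bytes) -> int:
    """Sum of the data bytes with any carry tossed."""
    return sum(data) & 0xFF

original   = bytes([0x00, 0x00]) + bytes(126)   # 128 zero bytes
double_hit = bytes([0x80, 0x80]) + bytes(126)   # high bit set in two bytes
single_hit = bytes([0x80, 0x00]) + bytes(126)   # high bit set in one byte

same = checksum(original) == checksum(double_hit)      # True: undetected
caught = checksum(original) != checksum(single_hit)    # True: detected
```

80H + 80H is 100H, and tossing the carry leaves 00H, so the damaged
block checks out as good.  A framing or overrun flag from the UART
is one way to catch hits like this that the checksum cannot see.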