[net.sources] Xmodem info summary

mercury@ut-ngp.UUCP (Larry E. Baker) (04/15/85)

Note:
        I'm posting this to net.sources because I feel that it
	is too large to post to the several different newsgroups
	that might be interested.  I am therefore posting shorter
	articles in net.micro, net.micro.pc, net.wanted.sources,
	etc. informing people of its presence here.

Here's the summary of info I got regarding the xmodem protocol.  It
consists mainly of descriptions of the protocol, but if you're interested
in seeing an actual implementation, look at the BYTE reference below.

The information consists of:  a page from the PC-TALK III user's
manual, the original 'specification' written by none other than Ward
Christensen himself, and a description authored by David Roth
<isrnix!pugsly>.  All are informative.

In addition, if you are interested in an actual terminal emulator,
look in the Nov 84 or Nov 83 issue of BYTE (the IBM PC issue) for the
article 'Lmodem.'  It not only defines the Xmodem protocol very
nicely, but presents a terminal emulator written in BDS C.

There was also a 3-part (I think) series in BYTE on the Kermit
protocol fairly recently.

Enjoy!
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Feed this message directly to your printer (it has formfeeds embedded at
appropriate places) and enjoy!





                       --- PC-TALK III User's Guide ---                 70





        Appendix B: A Brief Description of the XMODEM Protocol


        Once the protocol is put into effect ("Holding for Start..."), 
        the transmitter waits for the receiver to send a NAK character 
        (ASCII 21 decimal, 15H).  Meanwhile, the receiver sends NAK 
        signals every ten seconds.  Once the transmitter detects a NAK, 
        it starts to send the file in sections of 128 bytes.


        Actually, more than 128 bytes are sent for each block. At the 
        beginning of the block is an SOH character (ASCII 01), followed 
        by the ASCII character representing the block number, followed by 
        the ASCII character of the "one's complement" of the block 
        number. Then the 128 bytes of the file are sent. Finally, the 
        block concludes with a single character holding the low-order 
        eight bits of the sum of the values of the 128 bytes sent (the 
        "checksum" character).


        The receiver checks the block to verify that everything is in 
        order.  First it makes sure that the block started with an SOH.  
        Then it makes sure that the block number is correct.  Then it 
        performs its own calculations on the 128 data bytes and compares 
        its own checksum with the one received from the transmitter.
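
As a rough sketch of those receiver checks (Python, not part of the
PC-TALK manual): the function name verify_block and the idea of handing
it a whole 132-byte block at once are assumptions made for illustration.
It also checks the one's complement byte, which the paragraph above does
not list explicitly.

```python
SOH = 0x01  # start-of-header character, ASCII 01


def verify_block(block: bytes, expected_num: int) -> bool:
    """Return True if a complete 132-byte block passes the receiver's checks."""
    if len(block) != 132:
        return False
    # 1. The block must start with SOH.
    if block[0] != SOH:
        return False
    # 2. The block number must be the expected one, and the following
    #    byte must be its one's complement (255 - number).
    num, complement = block[1], block[2]
    if num != expected_num % 256 or complement != 255 - num:
        return False
    # 3. Recompute the checksum over the 128 data bytes (carry discarded)
    #    and compare it with the byte the transmitter sent.
    data = block[3:131]
    return sum(data) % 256 == block[131]
```

A True result corresponds to answering with ACK, a False result to NAK.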


        If everything is in order, the receiver sends an ACK character 
        (ASCII 06) to the transmitter, indicating that the next block is 
        to be sent. If the receiver can't verify, it sends a NAK, 
        requesting that the block be sent again.  This continues, block 
        by block, until the entire file has been sent and verified.


        At the end of the file, the transmitter sends an EOT character 
        (ASCII 04).  The receiver acknowledges the EOT with an ACK, and 
        the transfer terminates.





          MODEM PROTOCOL OVERVIEW  178 lines, 7.5K

          1/1/82 by Ward Christensen.  I will maintain a master copy of
          this.  Please pass on changes or suggestions via CBBS/Chicago
          at (312) 545-8086, or by voice at (312) 849-6279.

          NOTE this does not include things which I am not familiar with,
          such as the CRC option implemented by John Mahr.

          Last Rev: (none)

          At the request of Rick Mallinak on behalf of the guys at
          Standard Oil with IBM P.C.s, as well as several previous
          requests, I finally decided to put my modem protocol into
          writing.  It had been previously formally published only in the
          AMRAD newsletter.

              Table of Contents
          1. DEFINITIONS
          2. TRANSMISSION MEDIUM LEVEL PROTOCOL
          3. MESSAGE BLOCK LEVEL PROTOCOL
          4. FILE LEVEL PROTOCOL
          5. DATA FLOW EXAMPLE INCLUDING ERROR RECOVERY
          6. PROGRAMMING TIPS.

          -------- 1. DEFINITIONS.
          <soh>   01H
          <eot>   04H
          <ack>   06H
          <nak>   15H
          <can>   18H

          -------- 2. TRANSMISSION MEDIUM LEVEL PROTOCOL
          Asynchronous, 8 data bits, no parity, one stop bit.

              The protocol imposes no restrictions on the contents of the
          data being transmitted.  No control characters are looked for
          in the 128-byte data messages.  Absolutely any kind of data may
          be sent - binary, ASCII, etc.  The protocol has not formally
          been adapted to a 7-bit environment for the transmission of
          ASCII-only (or unpacked-hex) data, although it could be simply
          by having both ends agree to AND the protocol-dependent data
          with 7F hex before validating it.  I specifically am referring
          to the checksum, and the block numbers and their
          ones-complement.
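
A minimal sketch of what that 7-bit variant might look like (Python).
This is only an illustration of the "AND with 7F hex" idea; the function
names are hypothetical and no existing program is claimed to work this
way.

```python
def header_ok_7bit(blk: int, comp: int) -> bool:
    """Compare block number and its complement using only the low 7 bits."""
    return (blk & 0x7F) == (~comp & 0x7F)


def checksum_ok_7bit(data: bytes, received_cksum: int) -> bool:
    """Compare checksums after masking both sides with 7F hex."""
    return (sum(data) & 0x7F) == (received_cksum & 0x7F)
```

Because only the low 7 bits are compared, the result is the same whether
or not the line has stripped or altered the high bit of each byte.
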
              Those wishing to maintain compatibility with the CP/M file
          structure, i.e. to allow modemming ASCII files to or from CP/M
          systems, should follow this data format:
            * ASCII tabs used (09H); tabs set every 8.
            * Lines terminated by CR/LF (0DH 0AH)
            * End-of-file indicated by ^Z, 1AH.  (one or more)
            * Data is variable length, i.e. should be considered a
              continuous stream of data bytes, broken into 128-byte
              chunks purely for the purpose of transmission.
            * A CP/M "peculiarity": If the data ends exactly on a
              128-byte boundary, i.e. CR in 127, and LF in 128, a
              subsequent sector containing the ^Z EOF character(s)
              is optional, but is preferred.  Some utilities or
              user programs still do not handle EOF without ^Zs.
            * The last block sent is no different from others, i.e.
              there is no "short block".
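
The rules above can be sketched roughly as follows (Python).  The
function name split_into_blocks and the always_add_eof switch are
assumptions for illustration; the list above only says the extra ^Z
sector is "optional, but is preferred".

```python
CPM_EOF = 0x1A     # ^Z
BLOCK_SIZE = 128


def split_into_blocks(data: bytes, always_add_eof: bool = True) -> list:
    """Break a byte stream into 128-byte chunks, padding the last with ^Z."""
    # If the data ends exactly on a 128-byte boundary, optionally add one
    # ^Z so that a final sector carrying the EOF character(s) is sent.
    if always_add_eof and len(data) % BLOCK_SIZE == 0:
        data += bytes([CPM_EOF])
    blocks = []
    for i in range(0, len(data), BLOCK_SIZE):
        chunk = data[i:i + BLOCK_SIZE]
        # There is no "short block": fill out the last chunk with ^Z.
        blocks.append(chunk.ljust(BLOCK_SIZE, bytes([CPM_EOF])))
    return blocks
```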

          -------- 3. MESSAGE BLOCK LEVEL PROTOCOL
           Each block of the transfer looks like:
          <SOH><blk #><255-blk #><--128 data bytes--><cksum>
              in which:
          <SOH>       = 01 hex
          <blk #>     = binary number, starts at 01, increments by 1, and
                        wraps 0FFH to 00H (not to 01)
          <255-blk #> = blk # after going thru 8080 "CMA" instr, i.e.
                        each bit complemented in the 8-bit block number.
                        Formally, this is the "ones complement".
          <cksum>     = the sum of the data bytes only.  Toss any carry.
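
A short sketch of assembling one block from these definitions (Python).
The name build_block and the use of a running sequence counter as input
are illustrative assumptions, not part of the protocol text.

```python
SOH = 0x01


def build_block(seq: int, data: bytes) -> bytes:
    """Assemble <SOH><blk #><255-blk #><128 data bytes><cksum>."""
    if len(data) != 128:
        raise ValueError("data field must be exactly 128 bytes")
    blk = seq & 0xFF          # 8-bit block number: 0FFH is followed by 00H
    comp = blk ^ 0xFF         # every bit complemented, same as 255 - blk
    cksum = sum(data) & 0xFF  # sum of the data bytes only, carry tossed
    return bytes([SOH, blk, comp]) + data + bytes([cksum])
```

The first block of a transfer would be build_block(1, ...), giving the
header bytes 01 01 FE.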

          -------- 4. FILE LEVEL PROTOCOL

          ---- 4A. COMMON TO BOTH SENDER AND RECEIVER:

              All errors are retried 10 times.  For versions running with
          an operator (i.e. NOT with XMODEM), a message is typed after 10
          errors asking the operator whether to "retry or quit".
              Some versions of the protocol use <can>, ASCII ^X, to
          cancel transmission.  This was never adopted as a standard, as
          having a single "abort" character makes the transmission
          susceptible to false termination due to an <ack> <nak> or <soh>
          being corrupted into a <can> and canceling transmission.
              The protocol may be considered "receiver driven", that is,
          the sender need not automatically re-transmit, although it does
          in the current implementations.

          ---- 4B. RECEIVE PROGRAM CONSIDERATIONS:
              The receiver has a 10-second timeout.  It sends a <nak>
          every time it times out.  The receiver's first timeout, which
          sends a <nak>, signals the transmitter to start.  Optionally,
          the receiver could send a <nak> immediately, in case the sender
          was ready.  This would save the initial 10 second timeout.
          However, the receiver MUST continue to timeout every 10 seconds
          in case the sender wasn't ready.
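
The start-up behaviour just described might be sketched like this
(Python).  The read_byte(timeout) and write(data) callables are
hypothetical stand-ins for whatever serial routines a program has;
read_byte is assumed to return None on timeout.  The limit of 10 tries
is taken from section 4A.

```python
SOH, NAK = 0x01, 0x15
TIMEOUT_SECONDS = 10.0


def wait_for_first_block(write, read_byte, tries=10, nak_immediately=False):
    """NAK every 10 seconds until the sender starts a block with <soh>."""
    if nak_immediately:
        # Optional: saves the initial 10-second wait if the sender is ready.
        write(bytes([NAK]))
    for _ in range(tries):
        if read_byte(TIMEOUT_SECONDS) == SOH:
            return True
        # Timed out (or saw something other than <soh>): send <nak> again.
        write(bytes([NAK]))
    return False
```
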
              Once into receiving a block, the receiver goes into a
          one-second timeout for each character and the checksum.  If the
          receiver wishes to <nak> a block for any reason (invalid
          header, timeout receiving data), it must wait for the line to
          clear.  See "programming tips" for ideas.
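
One way to "wait for the line to clear", following the PURGE idea in
the programming tips, is to keep reading with a one-second timeout
until nothing arrives.  A minimal sketch in Python; read_byte(timeout)
is a hypothetical routine assumed to return None when it times out.

```python
from typing import Callable, Optional


def purge(read_byte: Callable[[float], Optional[int]]) -> int:
    """Discard incoming characters until the line is quiet for one second.

    Returns the number of characters thrown away.
    """
    discarded = 0
    while read_byte(1.0) is not None:
        discarded += 1
    return discarded
```

The receiver would call purge() first and only then send its <nak>.
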
              Synchronizing:  If a valid block number is received, it
          will be: 1) the expected one, in which case everything is fine;
          2) a repeat of the previously received block, which should be
          considered OK, and only indicates that the receiver's <ack>
          got glitched, and the sender re-transmitted; or 3) any other
          block number, which indicates a fatal loss of synchronization,
          such as the rare case of the sender getting a line-glitch that
          looked like an <ack>.  In that case, abort the transmission,
          sending a <can>.
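
The three synchronization cases could be sketched as a small decision
function (Python).  The name classify_block_number and the string
results are illustrative assumptions only.

```python
def classify_block_number(received: int, expected: int) -> str:
    """Decide what to do with a block whose header was otherwise valid."""
    if received == expected:
        return "accept"        # case 1: the expected block
    if received == (expected - 1) & 0xFF:
        return "duplicate"     # case 2: our <ack> was lost, sender repeated
    return "abort"             # case 3: fatal loss of sync, send <can>
```

The "& 0xFF" makes the comparison work across the wrap from 0FFH to 00H.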




          ---- 4C. SENDING PROGRAM CONSIDERATIONS.

              While waiting for transmission to begin, the sender has
          only a single very long timeout, say one minute.  In the
          current protocol, the sender has a 10 second timeout before
          retrying.  I suggest NOT doing this, and letting the protocol
          be completely receiver-driven.  This will be compatible with
          existing programs.
              When the sender has no more data, it sends an <eot>, and
          awaits an <ack>, resending the <eot> if it doesn't get one.
          Again, the protocol could be receiver-driven, with the sender
          only having the high-level 1-minute timeout to abort.


          -------- 5. DATA FLOW EXAMPLE INCLUDING ERROR RECOVERY

          Here is a sample of the data flow, sending a 3-block message.
          It includes the two most common line hits - a garbaged block,
          and an <ack> reply getting garbaged.  <xx> represents the
          checksum byte.

          SENDER                          RECEIVER
                                  times out after 10 seconds,
                                  <---    <nak>
          <soh> 01 FE -data- <xx> --->
                                  <---    <ack>
          <soh> 02 FD -data- <xx> --->    (data gets line hit)
                                  <---    <nak>
          <soh> 02 FD -data- <xx> --->
                                  <---    <ack>
          <soh> 03 FC -data- <xx> --->
             (ack gets garbaged)  <---    <ack>
          <soh> 03 FC -data- <xx> --->
                                  <---    <ack>
          <eot>                   --->
                                  <---    <ack>

          -------- 6. PROGRAMMING TIPS.

          * The character-receive subroutine should be called with a
          parameter specifying the number of seconds to wait.  The
          receiver should first call it with a time of 10, then <nak> and
          try again, 10 times.
            After receiving the <soh>, the receiver should call the
          character receive subroutine with a 1-second timeout, for the
          remainder of the message and the <cksum>.  Since they are sent
          as a continuous stream, timing out of this implies a serious
          line glitch that caused, say, 127 characters to be seen instead
          of 128.

          * When the receiver wishes to <nak>, it should call a "PURGE"
          subroutine, to wait for the line to clear.  Recall the sender
          tosses any characters in its UART buffer immediately upon
          completing sending a block, to ensure no glitches were
          misinterpreted.
            The most common technique is for "PURGE" to call the
          character receive subroutine, specifying a 1-second timeout,
          and looping back to PURGE until a timeout occurs.  The <nak> is
          then sent, ensuring the other end will see it.

          * You may wish to add code recommended by John Mahr to your
          character receive routine - to set an error flag if the UART
          shows framing error, or overrun.  This will help catch a few
          more glitches - the most common of which is a hit in the high
          bit of two consecutive bytes.  The <cksum> comes out OK since,
          counting in one byte, adding 80H + 80H produces the same
          result as adding 00H + 00H.
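
The 80H + 80H case in the last tip is easy to demonstrate (Python; the
three-byte messages are made-up sample data, not a real block):

```python
def checksum(data: bytes) -> int:
    """One-byte checksum: sum of the bytes with any carry tossed."""
    return sum(data) % 256


original = bytes([0x00, 0x00, 0x41])
damaged = bytes([0x80, 0x80, 0x41])   # high bit hit in two consecutive bytes

# 80H + 80H = 100H, and the carry out of the byte is discarded, so the
# damaged data produces the same checksum as the original.
same_checksum = checksum(original) == checksum(damaged)
```

Here same_checksum is True even though the data differs, which is why
the UART's framing and overrun flags are worth checking as well.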




Xmodem achieves reliability in its data transfer task by a combination
of three characteristics:

1.      A rigidly defined block structure containing a block number,
        the block number complement, 128 characters of data, and a
        checksum character;

2.      The checksum character noted above is calculated by the
        sending device and appended to the block of data as the
        last character of the block.  The checksum must be
        recalculated and confirmed by the receiving device;

3.      The sending computer will not send another block of data
        until it receives a positive acknowledgement from the
        receiving device for each block transmitted.

        These requirements allow the XMODEM protocol to identify
whether a block of data has been correctly received and to resend
a block of data for which an error was detected.

Block Format
------------

        Each block of data transmitted by the XMODEM protocol
contains exactly 132 bytes as follows:


Position        Contents
--------        --------
      1         Start of header                         (/SOH/)
      2         Block (transmission) number              (Blk#)
      3         Ones complement of block number         (CBlk#)
4 - 131         128 bytes of data                      (<data>)
    132         Checksum                                (Cksum)

Thus, the block to be transmitted will have the following format:

 +-------+-------+--------+---------------------+-------+
 : /SOH/ :  Blk# :  CBlk# : <128 bytes of data> : Cksum :
 +-------+-------+--------+---------------------+-------+

The XMODEM protocol requires a transparent mode of transmission;
that is, it must not be sensitive to any control character values
embedded in the data being transmitted.  To achieve the required
transparency, all of the data transmission is performed using
8 bits and no parity.

The block checksum character is a single byte whose initial
value is zero (0) and whose final value is calculated using the
formula:

NewCheckSumValue := (OldCheckSumValue + CurrentChar ) mod 256
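
The same formula as a small runnable sketch (Python).  The function
names are made up for illustration; the arithmetic is exactly the
"mod 256" update above, applied once per data byte.

```python
def update_checksum(old_value: int, current_char: int) -> int:
    """One step of the running checksum."""
    return (old_value + current_char) % 256


def block_checksum(data: bytes) -> int:
    """Start at zero and fold in every data byte of the block."""
    value = 0
    for ch in data:
        value = update_checksum(value, ch)
    return value
```

For example, the two bytes FFh and 02h sum to 101h, which wraps to a
checksum of 01h.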

This protocol requires that compatible software be installed on
both computer systems being used to transfer files.  This
restriction does not apply to a conversational or emulation mode
of the software.

When the file transfer programs on both computers have been
executed, the sending program waits for a ready signal (called a
handshake) from the receiving program.  For this protocol, this
ready signal is the negative acknowledgment character (NAK - 15h).
Once the NAK has been received by the sending program, block
transfer of the file begins.

The sending routine calculates the checksum as each character
is read from the file to be sent.  The resultant value is then
included as the last byte in the block being transmitted.  The
receiving routine accepts the entire block, calculates a checksum
value, and compares the calculated checksum with the checksum
provided by the sending routine.

If the received and calculated checksums match, the receiving
routine will respond to the sending routine with a positive
acknowledgement (ACK - 06h) character.  Otherwise, the receiving
routine will respond with a NAK.  If the sending routine receives
a response other than an ACK character, it will attempt to resend
the offending block up to ten times.  If after ten tries the
block has not been properly received, the sender will abort the
file transfer attempt and send a cancel (CAN - 18h) character to
the receiver.
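
A minimal sketch of that sender-side retry loop (Python).  The write
and read_byte callables are hypothetical serial routines, read_byte is
assumed to return None on timeout, and the 10-second wait for a reply
is an assumption.  The sketch also reads "ten tries" as ten
transmissions in total, which the text leaves open.

```python
from typing import Callable, Optional

ACK, CAN = 0x06, 0x18
MAX_TRIES = 10


def send_block(block: bytes,
               write: Callable[[bytes], None],
               read_byte: Callable[[float], Optional[int]]) -> bool:
    """Send one block, resending until it is ACKed or the tries run out."""
    for _ in range(MAX_TRIES):
        write(block)
        if read_byte(10.0) == ACK:
            return True            # positively acknowledged
        # NAK, timeout, or anything else: fall through and resend.
    write(bytes([CAN]))            # give up and cancel the transfer
    return False
```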

When the entire file has been sent and the receiving routine
responds with an ACK character, the sending routine will send a
single End-of-Transmission (EOT - 04h) character.  This tells the
receiving routine that the file transmission has been completed.
The normal response will be an ACK character.  When the sender
receives an ACK response from the receiver, it goes to normal
end-of-program.

If the receiver responds with anything other than an ACK, the
sender will retransmit the EOT up to ten times.  If no ACK
response is received after ten tries, the sending routine
recognizes a time-out condition, displays an appropriate status
message, and goes to abnormal end-of-program.

When the XMODEM protocol routines reach either normal or
abnormal end-of-program status, they return to the ATE system
emulation mode and restore the datacomm line to the bits and parity
parameters that were in effect prior to calling the XMODEM file
transfer subsystem.

-- 
-  Larry Baker @ The University of Texas at Austin
-  ... {seismo!ut-sally | decvax!allegra | tektronix!ihnp4}!ut-ngp!mercury
-  ... mercury@ut-ngp.ARPA