[comp.sys.cbm] Xmodem Information

digdon@ug.cs.dal.ca (Mike Digdon) (01/28/91)

I'm working with a term program (well, passing the time away by pretending
to write one) and I would like it to have transfers. I have Punter, and I can
get it to work and such, so that is no problem. My big problem is Xmodem.
I have nothing on it. Does anyone have any kind of source for it, and maybe
also the docs on how to execute it and set it up? Any help would be greatly
appreciated.

-- 
	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
	+   Mike Digdon: digdon@ug.cs.dal.ca  + My 64 can still get the +
	+		 mike@ac.dal.ca	      + job at hand completed!! +
	+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

root@zswamp.fidonet.org (Geoffrey Welsh) (01/30/91)

In a letter to All, Mike Digdon (digdon@ug.cs.dal.ca ) wrote:

 >My big problem is Xmodem. I have nothing on it.
 >Does anyone have any kind of source for it, and maybe
 >also the docs on how to execute it and set it up?
 
   How about the specifications?  The following is extracted from Chuck 
Forsberg's YMODEM spec, which I was going to copy here in its entirety until I 
noticed that it is some 66K in length and guessed that doing so would be a quick 
way to Usenet flame-Hell.

   BTW, if you (or anyone) want some fast CRC routines, drop me E-mail... my 
terms are very reasonable.

    7.  XMODEM PROTOCOL OVERVIEW

    8/9/82 by Ward Christensen.

    I will maintain a master copy of this.  Please pass on changes or
    suggestions via CBBS/Chicago at (312) 545-8086, CBBS/CPMUG (312) 849-1132
    or by voice at (312) 849-6279.

    7.1  Definitions

      <soh> 01H
      <eot> 04H
      <ack> 06H
      <nak> 15H
      <can> 18H
      <C>   43H

    7.2  Transmission Medium Level Protocol

    Asynchronous, 8 data bits, no parity, one stop bit.

    The protocol imposes no restrictions on the contents of the data being
    transmitted.  No control characters are looked for in the 128-byte data
    messages.  Absolutely any kind of data may be sent - binary, ASCII, etc.
    The protocol has not formally been adopted to a 7-bit environment for the
    transmission of ASCII-only (or unpacked-hex) data , although it could be
    simply by having both ends agree to AND the protocol-dependent data with
    7F hex before validating it.  I specifically am referring to the checksum,
    and the block numbers and their ones- complement.

    Those wishing to maintain compatibility of the CP/M file structure, i.e.
    to allow modemming ASCII files to or from CP/M systems should follow this
    data format:

       + ASCII tabs used (09H); tabs set every 8.

       + Lines terminated by CR/LF (0DH 0AH)

       + End-of-file indicated by ^Z, 1AH.  (one or more)

       + Data is variable length, i.e. should be considered a continuous
         stream of data bytes, broken into 128-byte chunks purely for the
         purpose of transmission.

       + A CP/M "peculiarity": If the data ends exactly on a 128-byte
         boundary, i.e. CR in 127, and LF in 128, a subsequent sector
         containing the ^Z EOF character(s) is optional, but is preferred.
         Some utilities or user programs still do not handle EOF without ^Zs.

       + The last block sent is no different from others, i.e.  there is no
         "short block".

                  Figure 8.  XMODEM Message Block Level Protocol

    Each block of the transfer looks like:

          <SOH><blk #><255-blk #><--128 data bytes--><cksum>

    in which:

    <SOH>         = 01 hex
    <blk #>       = binary number, starts at 01 increments by 1, and
                    wraps 0FFH to 00H (not to 01)
    <255-blk #>   = blk # after going thru 8080 "CMA" instr, i.e.
                    each bit complemented in the 8-bit block number.
                    Formally, this is the "ones complement".
    <cksum>       = the sum of the data bytes only.  Toss any carry.

    7.3  File Level Protocol

    7.3.1  Common_to_Both_Sender_and_Receiver

    All errors are retried 10 times.  For versions running with an operator
    (i.e. NOT with XMODEM), a message is typed after 10 errors asking the
    operator whether to "retry or quit".

    Some versions of the protocol use <can>, ASCII ^X, to cancel transmission.
    This was never adopted as a standard, as having a single "abort" character
    makes the transmission susceptible to false termination due to an <ack>
    <nak> or <soh> being corrupted into a <can> and aborting transmission.

    The protocol may be considered "receiver driven", that is, the sender need
    not automatically re-transmit, although it does in the current
    implementations.

    7.3.2  Receive_Program_Considerations

    The receiver has a 10-second timeout.  It sends a <nak> every time it
    times out.  The receiver's first timeout, which sends a <nak>, signals the
    transmitter to start.  Optionally, the receiver could send a <nak>
    immediately, in case the sender was ready.  This would save the initial 10
    second timeout.  However, the receiver MUST continue to timeout every 10
    seconds in case the sender wasn't ready.

    Once into a receiving a block, the receiver goes into a one-second timeout
    for each character and the checksum.  If the receiver wishes to <nak> a
    block for any reason (invalid header, timeout receiving data), it must
    wait for the line to clear.  See "programming tips" for ideas

    Synchronizing:  If a valid block number is received, it will be: 1) the
    expected one, in which case everything is fine; or 2) a repeat of the
    previously received block.  This should be considered OK, and only
    indicates that the receivers <ack> got glitched, and the sender re-
    transmitted; 3) any other block number indicates a fatal loss of
    synchronization, such as the rare case of the sender getting a line-glitch
    that looked like an <ack>.  Abort the transmission, sending a <can>

    7.3.3  Sending_program_considerations

    While waiting for transmission to begin, the sender has only a single very
    long timeout, say one minute.  In the current protocol, the sender has a
    10 second timeout before retrying.  I suggest NOT doing this, and letting
    the protocol be completely receiver-driven.  This will be compatible with
    existing programs.

    When the sender has no more data, it sends an <eot>, and awaits an <ack>,
    resending the <eot> if it doesn't get one.  Again, the protocol could be
    receiver-driven, with the sender only having the high-level 1-minute
    timeout to abort.

    Here is a sample of the data flow, sending a 3-block message.  It includes
    the two most common line hits - a garbaged block, and an <ack> reply
    getting garbaged.  <xx> represents the checksum byte.

                  Figure 9.  Data flow including Error Recovery

    SENDER                                  RECEIVER
                                  times out after 10 seconds,
                                  <---              <nak>
    <soh> 01 FE -data- <xx>       --->
                                  <---              <ack>
    <soh> 02 FD -data- xx         --->       (data gets line hit)
                                  <---              <nak>
    <soh> 02 FD -data- xx         --->
                                  <---              <ack>
    <soh> 03 FC -data- xx         --->
    (ack gets garbaged)           <---              <ack>
    <soh> 03 FC -data- xx         --->              <ack>
    <eot>                         --->
                                  <---       <anything except ack>
    <eot>                         --->
                                  <---              <ack>
    (finished)

    7.4  Programming Tips

       + The character-receive subroutine should be called with a parameter
         specifying the number of seconds to wait.  The receiver should first
         call it with a time of 10, then <nak> and try again, 10 times.

         After receiving the <soh>, the receiver should call the character
         receive subroutine with a 1-second timeout, for the remainder of the
         message and the <cksum>.  Since they are sent as a continuous stream,
         timing out of this implies a serious like glitch that caused, say,
         127 characters to be seen instead of 128.

       + When the receiver wishes to <nak>, it should call a "PURGE"
         subroutine, to wait for the line to clear.  Recall the sender tosses
         any characters in its UART buffer immediately upon completing sending
         a block, to ensure no glitches were mis- interpreted.

         The most common technique is for "PURGE" to call the character
         receive subroutine, specifying a 1-second timeout,[1] and looping
         back to PURGE until a timeout occurs.  The <nak> is then sent,
         ensuring the other end will see it.

       + You may wish to add code recommended by John Mahr to your character
         receive routine - to set an error flag if the UART shows framing
         error, or overrun.  This will help catch a few more glitches - the
         most common of which is a hit in the high bits of the byte in two
         consecutive bytes.  The <cksum> comes out OK since counting in 1-byte
         produces the same result of adding 80H + 80H as with adding 00H +
         00H.

    8.  XMODEM/CRC Overview

    Original 1/13/85 by John Byrns -- CRC option.

    Please pass on any reports of errors in this document or suggestions for
    improvement to me via Ward's/CBBS at (312) 849-1132, or by voice at (312)
    885-1105.

    The CRC used in the Modem Protocol is an alternate form of block check
    which provides more robust error detection than the original checksum.
    Andrew S. Tanenbaum says in his book, Computer Networks, that the CRC-
    CCITT used by the Modem Protocol will detect all single and double bit
    errors, all errors with an odd number of bits, all burst errors of length
    16 or less, 99.997% of 17-bit error bursts, and 99.998% of 18-bit and
    longer bursts.[1]

    The changes to the Modem Protocol to replace the checksum with the CRC are
    straight forward. If that were all that we did we would not be able to
    communicate between a program using the old checksum protocol and one
    using the new CRC protocol. An initial handshake was added to solve this
    problem. The handshake allows a receiving program with CRC capability to
    determine whether the sending program supports the CRC option, and to
    switch it to CRC mode if it does. This handshake is designed so that it
    will work properly with programs which implement only the original
    protocol. A description of this handshake is presented in section 10.

                Figure 10.  Message Block Level Protocol, CRC mode

    Each block of the transfer in CRC mode looks like:

         <SOH><blk #><255-blk #><--128 data bytes--><CRC hi><CRC lo>

    in which:

    <SOH>        = 01 hex
    <blk #>      = binary number, starts at 01 increments by 1, and
                   wraps 0FFH to 00H (not to 01)
    <255-blk #>  = ones complement of blk #.
    <CRC hi>     = byte containing the 8 hi order coefficients of the CRC.
    <CRC lo>     = byte containing the 8 lo order coefficients of the CRC.

    8.1  CRC Calculation

    8.1.1  Formal_Definition
    To calculate the 16 bit CRC the message bits are considered to be the
    coefficients of a polynomial. This message polynomial is first multiplied
    by X^16 and then divided by the generator polynomial (X^16 + X^12 + X^5 +
    1) using modulo two arithmetic. The remainder left after the division is
    the desired CRC. Since a message block in the Modem Protocol is 128 bytes
    or 1024 bits, the message polynomial will be of order X^1023. The hi order
    bit of the first byte of the message block is the coefficient of X^1023 in
    the message polynomial.  The lo order bit of the last byte of the message
    block is the coefficient of X^0 in the message polynomial.

               Figure 11.  Example of CRC Calculation written in C

    The following XMODEM crc routine is taken from "rbsb.c".  Please refer to
    the source code for these programs (contained in RZSZ.ZOO) for usage.  A
    fast table driven version is also included in this file.

    /* update CRC */
    unsigned short
    updcrc(c, crc)
    register c;
    register unsigned crc;
    {
            register count;

            for (count=8; --count>=0;) {
                    if (crc & 0x8000) {
                            crc <<= 1;
                            crc += (((c<<=1) & 0400)  !=  0);
                            crc ^= 0x1021;
                    }
                    else {
                            crc <<= 1;
                            crc += (((c<<=1) & 0400)  !=  0);
                    }
            }
            return crc;
    }

    8.2  CRC File Level Protocol Changes

    8.2.1  Common_to_Both_Sender_and_Receiver

    The only change to the File Level Protocol for the CRC option is the
    initial handshake which is used to determine if both the sending and the
    receiving programs support the CRC mode. All Modem Programs should support
    the checksum mode for compatibility with older versions.  A receiving
    program that wishes to receive in CRC mode implements the mode setting
    handshake by sending a <C> in place of the initial <nak>.  If the sending
    program supports CRC mode it will recognize the <C> and will set itself
    into CRC mode, and respond by sending the first block as if a <nak> had
    been received. If the sending program does not support CRC mode it will
    not respond to the <C> at all. After the receiver has sent the <C> it will
    wait up to 3 seconds for the <soh> that starts the first block. If it
    receives a <soh> within 3 seconds it will assume the sender supports CRC
    mode and will proceed with the file exchange in CRC mode. If no <soh> is
    received within 3 seconds the receiver will switch to checksum mode, send
    a <nak>, and proceed in checksum mode. If the receiver wishes to use
    checksum mode it should send an initial <nak> and the sending program
    should respond to the <nak> as defined in the original Modem Protocol.
    After the mode has been set by the initial <C> or <nak> the protocol
    follows the original Modem Protocol and is identical whether the checksum
    or CRC is being used.

    8.2.2  Receive_Program_Considerations

    There are at least 4 things that can go wrong with the mode setting
    handshake.

      1.  the initial <C> can be garbled or lost.

      2.  the initial <soh> can be garbled.

      3.  the initial <C> can be changed to a <nak>.

      4.  the initial <nak> from a receiver which wants to receive in checksum
          can be changed to a <C>.

    The first problem can be solved if the receiver sends a second <C> after
    it times out the first time. This process can be repeated several times.
    It must not be repeated too many times before sending a <nak> and
    switching to checksum mode or a sending program without CRC support may
    time out and abort. Repeating the <C> will also fix the second problem if
    the sending program cooperates by responding as if a <nak> were received
    instead of ignoring the extra <C>.

    It is possible to fix problems 3 and 4 but probably not worth the trouble
    since they will occur very infrequently. They could be fixed by switching
    modes in either the sending or the receiving program after a large number
    of successive <nak>s. This solution would risk other problems however.

    8.2.3  Sending_Program_Considerations

    The sending program should start in the checksum mode. This will insure
    compatibility with checksum only receiving programs. Anytime a <C> is
    received before the first <nak> or <ack> the sending program should set
    itself into CRC mode and respond as if a <nak> were received. The sender
    should respond to additional <C>s as if they were <nak>s until the first
    <ack> is received. This will assist the receiving program in determining
    the correct mode when the <soh> is lost or garbled. After the first <ack>
    is received the sending program should ignore <C>s.

    8.3  Data Flow Examples with CRC Option

    Here is a data flow example for the case where the receiver requests
    transmission in the CRC mode but the sender does not support the CRC
    option. This example also includes various transmission errors.  <xx>
    represents the checksum byte.

          Figure 12.  Data Flow: Receiver has CRC Option, Sender Doesn't

    SENDER                                        RECEIVER
                            <---                <C>
                                    times out after 3 seconds,
                            <---                <C>
                                    times out after 3 seconds,
                            <---                <C>
                                    times out after 3 seconds,
                            <---                <C>
                                    times out after 3 seconds,
                            <---                <nak>
    <soh> 01 FE -data- <xx> --->
                            <---                <ack>
    <soh> 02 FD -data- <xx> --->        (data gets line hit)
                            <---                <nak>
    <soh> 02 FD -data- <xx> --->
                            <---                <ack>
    <soh> 03 FC -data- <xx> --->
       (ack gets garbaged)  <---                <ack>
                                    times out after 10 seconds,
                            <---                <nak>
    <soh> 03 FC -data- <xx> --->
                            <---                <ack>
    <eot>                   --->
                            <---                <ack>

    Here is a data flow example for the case where the receiver requests
    transmission in the CRC mode and the sender supports the CRC option.  This
    example also includes various transmission errors.  <xxxx> represents the
    2 CRC bytes.

               Figure 13.  Receiver and Sender Both have CRC Option

    SENDER                                       RECEIVER
                              <---                 <C>
    <soh> 01 FE -data- <xxxx> --->
                              <---                 <ack>
    <soh> 02 FD -data- <xxxx> --->         (data gets line hit)
                              <---                 <nak>
    <soh> 02 FD -data- <xxxx> --->
                              <---                 <ack>
    <soh> 03 FC -data- <xxxx> --->
    (ack gets garbaged)       <---                 <ack>
                                         times out after 10 seconds,
                              <---                 <nak>
    <soh> 03 FC -data- <xxxx> --->
                              <---                 <ack>
    <eot>                     --->
                              <---                 <ack>
 

--  
UUCP:     watmath!xenitec!zswamp!root | 602-66 Mooregate Crescent
Internet: root@zswamp.fidonet.org     | Kitchener, Ontario
FidoNet:  SYSOP, 1:221/171            | N2M 5E6 CANADA
Data:     (519) 742-8939              | (519) 741-9553
MC Hammer, n. Device used to ensure firm seating of MicroChannel boards
Try our new Molson 'C' compiler... it specializes in 'case' statements!