[comp.mail.uucp] uucp 'g' protocol description

jeh@dcs.simpact.com (02/08/91)

Here is my long-promised writeup on the uucp 'g' protocol.  It is based on
the paper by Greg Chesson and on other items posted in this newsgroup, salted
with a healthy dose of experience gained while doing the 'g' implementation
for DECUS uucp (VMS).  

Formatting:  Since formatting characters often won't move through the 
networks very well, I have replaced all form-feed characters with the 
two-character sequence 

        ^L

and all backspaces with the two-character sequence

        ^H

Before printing this file, do a global search-and-replace to change these
two-character sequences back to their respective control characters.  
Backspaces appear here because underlining is done via underline-
backspace-letter, on a character-by-character basis; for instance, 
an underlined "abc" appears as 

        _^Ha_^Hb_^Hc

If this is unsuitable for your printer, you'll have to edit the file 
accordingly.  

I'd post the "source" file for this, but it's in VMS Runoff (aka DSR), and 
therefore not too useful to very many people.  If someone is familiar with 
both DSR and, say, troff, and has the time to do the conversion, please 
contact me via e-mail and I will happily send the DSR source.  

	--- Jamie Hanrahan, Simpact Associates, San Diego CA
uucp protocol guru, VMSnet [DECUS uucp] Working Group, DECUS VAX Systems SIG 
Internet:  jeh@dcs.simpact.com, or if that fails, jeh@crash.cts.com
Uucp:  ...{crash,scubed,decwrl}!simpact!jeh

^L















                           UUCP "G" PROTOCOL


                             Session UN047

                       Fall 1990 DECUS Symposium

                           Las Vegas, Nevada


                            Tuesday, 4pm-5pm

                      Pavilion 2, Las Vegas Hilton



                           Jamie E. Hanrahan
        Chair, VMSnet (DECUS uucp) and Internals Working Groups
                         DECUS VAX Systems SIG

                           Simpact Associates
                          9210 Sky Park Court
                          San Diego, CA 92123
                         +1 619-565-1865 x1116
                          jeh@dcs.simpact.com
                           jeh@crash.cts.com
                 {decwrl,scubed,crash,nosc}!simpact!jeh
^L
UUCP "G" PROTOCOL                                                 Page 2
UN047, Tuesday, 4-5 pm


1  INITIAL HANDSHAKE ("SHERE EXCHANGE")


      o  Occurs after call placement and login

      o  Initiated by answering system

      o  All messages begin with \020 (ctrl-P, also known as DLE) and
         end with \000 (NUL)

      o  Sent and received in "raw mode" (no carriage returns or line
         feeds; eight data bits, no parity, etc.)

          ______________________________________________________________

          Caller                                                Answerer
          ------>                                              <--------

          (places call, gets carrier,
          gets login prompt, logs in)
                                                 (uucico program starts)
                                                  \020Shere=_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He\000
          \020S_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He\000
                                                             \020ROK\000
          ______________________________________________________________


      o  _^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He's are the respective systems' uucp host names.

      o  Caller's S_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He message may include any or all of the
         following optional elements:

         S_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He -Q_^Hs_^He_^Hq -x_^Hd_^Hb_^Hg -p_^Hg_^Hr_^Ha_^Hd_^He -vgrade=_^Hg_^Hr_^Ha_^Hd_^He

          -  _^Hs_^He_^Hq = call sequence number (or 0)

          -  _^Hd_^Hb_^Hg is a digit, answerer sets its debug level accordingly

          -  _^Hg_^Hr_^Ha_^Hd_^He indicates what class of files are being transferred
             (e.g. mail, news, arbitrary files).  Taken from systems
             file entry; not sent or supported by all systems.  Some
             implementations support -p, some support -v, some both,
             some neither.  (See discussion of "C." files later on)

      o  Old answerers may send just Shere instead of Shere=_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He .
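
The DLE ... NUL framing described above can be sketched in a few lines
of C.  This is only an illustration, not code from any uucp; the names
hs_frame() and hs_unframe() are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HS_DLE 0x10   /* \020, ctrl-P */

/* Frame text as DLE text NUL into out[].  Returns the number of bytes
 * written, or 0 if out[] is too small.  (Hypothetical helper.) */
size_t hs_frame(const char *text, unsigned char *out, size_t outsize)
{
    size_t len;

    assert(text != NULL && out != NULL);
    len = strlen(text);
    if (outsize < len + 2)
        return 0;
    out[0] = HS_DLE;
    memcpy(out + 1, text, len);
    out[len + 1] = 0;
    return len + 2;
}

/* Pull the first DLE ... NUL message out of in[0..n) and store it in
 * text[] as a C string.  Returns the number of input bytes consumed,
 * or 0 if no complete message is present yet (or text[] is too small).
 * Bytes ahead of the DLE are skipped as line noise. */
size_t hs_unframe(const unsigned char *in, size_t n,
                  char *text, size_t textsize)
{
    size_t i = 0, start, len;

    assert(in != NULL && text != NULL);
    while (i < n && in[i] != HS_DLE)
        i++;
    if (i == n)
        return 0;                 /* no DLE seen */
    start = i + 1;
    for (i = start; i < n && in[i] != 0; i++)
        ;
    if (i == n)
        return 0;                 /* NUL has not arrived yet */
    len = i - start;
    if (len + 1 > textsize)
        return 0;
    memcpy(text, in + start, len);
    text[len] = '\0';
    return i + 1;
}
```

An answerer would frame "Shere=" followed by its host name with
hs_frame(); a caller would run hs_unframe() on what it reads and then
check that the text begins with "Shere".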

The "ROK" message says that the answerer has accepted the call.
Instead, the answerer may reject the call with any of the following
messages:

      o  RLCK - answerer thinks it's already talking to that caller

      o  RCB - answerer wants to call the caller back (to avoid imposter
         callers)
^L
UUCP "G" PROTOCOL                                                 Page 3
UN047, Tuesday, 4-5 pm


      o  RBADSEQ - call sequence number is wrong (caller is an imposter,
         or more likely, sequence numbers weren't updated properly at
         one end or the other)

      o  RLOGIN - caller's login name (username) isn't known to
         answerer's uucp (USERFILE or Permissions file)

      o  RYou are unknown to me - (yes, it really does send all of that)
         caller's hostname isn't known to answerer (as determined by
         L.sys or Systems file)

If caller receives any of these, it simply hangs up.  (The answerer
hangs up after sending the reject message, possibly waiting a few
seconds so that the modem disconnect won't prevent the message from
reaching the caller.)

If answerer "accepts the call":

          ______________________________________________________________

          Caller                                                Answerer
          ------>                                              <--------

                                                               P_^Hp_^Hr_^Ho_^Ht_^Hl_^Hi_^Hs_^Ht
          U_^Hp_^Hr_^Ho_^Ht
          ______________________________________________________________


where

      o  _^Hp_^Hr_^Ho_^Ht_^Hl_^Hi_^Hs_^Ht is a list of the protocols (each identified by a
         single letter, e.g. "g") supported by the answerer

      o  _^Hp_^Hr_^Ho_^Ht is a single protocol from _^Hp_^Hr_^Ho_^Ht_^Hl_^Hi_^Hs_^Ht selected by the
         caller

      o  Caller can send UN (and then hang up) if it supports no
         protocols in common with answerer
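
A minimal sketch of the caller's side of this choice follows.  The
function name choose_protocol() is hypothetical, and the idea that the
caller keeps its own protocols in preference order is an assumption of
this example; the text above only says that one letter is picked from
the list.

```c
#include <assert.h>
#include <string.h>

/* pmsg  : text of the answerer's message, e.g. "Pfgt"
 * mine  : caller's supported protocol letters, most preferred first
 *         (the preference ordering is an assumption of this sketch)
 * reply : receives "U" plus the chosen letter, or "UN" if none match
 * Returns the chosen letter, or 0 if there is nothing in common. */
char choose_protocol(const char *pmsg, const char *mine, char reply[3])
{
    const char *p;

    assert(pmsg != NULL && mine != NULL && reply != NULL);
    reply[0] = 'U';
    reply[1] = 'N';
    reply[2] = '\0';
    if (pmsg[0] != 'P')
        return 0;                     /* not a protocol list at all */
    for (p = mine; *p != '\0'; p++) {
        if (strchr(pmsg + 1, *p) != NULL) {
            reply[1] = *p;
            return *p;
        }
    }
    return 0;
}
```

The reply string would then be sent with the same DLE ... NUL framing
as the other handshake messages.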

At this point the two machines have agreed on a protocol.  We'll assume
that it's the "g" protocol, so that the last two messages may have been
something like

          ______________________________________________________________

          Caller                                                Answerer
          ------>                                              <--------

                                                                    Pfgt
          Ug
          ______________________________________________________________


At this point both systems call the gturnon() function, which causes the
g data link protocol to start up.
^L
UUCP "G" PROTOCOL                                                 Page 4
UN047, Tuesday, 4-5 pm


2  G PROTOCOL PACKET FORMATS

All subsequent communication (until after the G protocol shutdown) is in
"packet mode".

      o  Each packet has a six-byte header

      o  Control packets have _^Ho_^Hn_^Hl_^Hy a header

      o  Data packets have a header and a data segment


2.1  Packet Header Format

          +-------+-------+-------+-------+-------+-------+
          |  DLE  |   K   | cklo  | ckhi  |  ctl  |  XOR  |
          +-------+-------+-------+-------+-------+-------+

The first byte is _^Ha_^Hl_^Hw_^Ha_^Hy_^Hs DLE (octal 020, hex 10).

The second byte is the "K-byte".  For a control packet it is always 9
(ie a tab character).  For data packets it indicates the transmitted
(physical) length of the following data segment, as follows:

                K-byte value    data seg. len. (bytes)
                ------------    ----------------------
                      1                   32
                      2                   64
                      3                  128
                      4                  256
                      5                  512
                      6                 1024
                      7                 2048
                      8                 4096
                      9             control packet


The third and fourth bytes are the low and high bytes, respectively, of
a checksum computed on the data segment (if any) plus the control byte.
(The checksum computation is described later.)

The fifth byte is a bitmapped command byte, described in the next
section.

The last byte is a simple XOR of the middle four bytes.  The first and
last bytes perform a framing and validation function for headers.
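
As an illustration of the layout just described, here is a small sketch
that fills in and validates a six-byte header.  The function names are
invented for this example, and the 16-bit checksum is simply passed in
(its computation is covered later).

```c
#include <assert.h>
#include <stddef.h>

#define G_DLE 0x10

/* Physical data segment length for a K-byte of 1..8.  Returns 0 for
 * anything else, including 9, which marks a control packet. */
size_t g_seglen(unsigned k)
{
    return (k >= 1 && k <= 8) ? (size_t)32 << (k - 1) : 0;
}

/* Fill hdr[0..5] from the K-byte, 16-bit checksum and control byte. */
void g_build_header(unsigned char hdr[6], unsigned k,
                    unsigned cksum, unsigned ctl)
{
    assert(k >= 1 && k <= 9);
    hdr[0] = G_DLE;
    hdr[1] = (unsigned char)k;
    hdr[2] = (unsigned char)(cksum & 0xFF);          /* cklo */
    hdr[3] = (unsigned char)((cksum >> 8) & 0xFF);   /* ckhi */
    hdr[4] = (unsigned char)(ctl & 0xFF);
    hdr[5] = (unsigned char)(hdr[1] ^ hdr[2] ^ hdr[3] ^ hdr[4]);
}

/* Returns 1 if the six bytes look like a header: leading DLE, legal
 * K-byte, and XOR byte matching the middle four bytes. */
int g_header_ok(const unsigned char hdr[6])
{
    return hdr[0] == G_DLE
        && hdr[1] >= 1 && hdr[1] <= 9
        && (hdr[1] ^ hdr[2] ^ hdr[3] ^ hdr[4]) == hdr[5];
}
```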


2.2  The Control Byte

The "control" byte is bit-mapped as follows:

                  7   6   5   4   3   2   1   0
                +-------+-----------+-----------+
                | T   T | X   X   X | Y   Y   Y |
                +-------+-----------+-----------+
^L
UUCP "G" PROTOCOL                                                 Page 5
UN047, Tuesday, 4-5 pm


The T field denotes the type of packet:

        T value         packet type
        -------         ------------------------------------------
           0            control packet 
           1            alternate channel data packet (unused in uucp)
           2            long data block
           3            short data block (to be described later)

The interpretations of the X and Y fields vary with the packet type
(control vs. data).

For control packets the X field is the type of control packet and the Y
field is a parameter, as follows:

        X value    name      Y field interpretation
        -------    ------    ---------------------------------------
           1       CLOSE     no significance
           2       NAK       last correctly received packet number
           3       SRJ       packet number to retransmit
           4       ACK       last correctly received packet number
           5       INITC     max window size to use
           6       INITB     max data segment size to use
           7       INITA     max window size to use


      o  The "data segment size" on the INITB packet is encoded like the
         K-byte but with numbers one lower; e.g. an INITB packet with a Y
         field of 1 indicates a data segment size of 64 bytes.

      o  Most descriptions of uucp refer to types 2 and 4 as "RJ"
         (reject) and "RR" (receiver ready), respectively.

      o  SRJ ("selective reject") is not used in known implementations.
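
The bit layout and the tables above translate directly into shifts and
masks.  The sketch below is illustrative only; the constant and
function names are made up here and are not taken from any uucp source.

```c
#include <assert.h>

/* T field values */
enum { G_CONTROL = 0, G_ALTCHN = 1, G_LONGDATA = 2, G_SHORTDATA = 3 };

/* X field values for control packets */
enum { G_CLOSE = 1, G_NAK = 2, G_SRJ = 3, G_ACK = 4,
       G_INITC = 5, G_INITB = 6, G_INITA = 7 };

/* Pack the T (2 bits), X (3 bits) and Y (3 bits) fields. */
unsigned g_ctl(unsigned t, unsigned x, unsigned y)
{
    assert(t <= 3 && x <= 7 && y <= 7);
    return (t << 6) | (x << 3) | y;
}

unsigned g_ctl_t(unsigned ctl) { return (ctl >> 6) & 3; }
unsigned g_ctl_x(unsigned ctl) { return (ctl >> 3) & 7; }
unsigned g_ctl_y(unsigned ctl) { return ctl & 7; }

/* Data segment size announced by an INITB packet.  The Y field is
 * encoded like the K-byte but one lower, so Y=0 is 32, Y=1 is 64. */
unsigned g_initb_seglen(unsigned ctl)
{
    return 32u << g_ctl_y(ctl);
}
```

For example, an ACK for packet 3 is g_ctl(G_CONTROL, G_ACK, 3), and a
long data packet with sequence number 1 acknowledging packet 0 is
g_ctl(G_LONGDATA, 1, 0).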


2.3  Data Packets

For data packets, the X field is the sequence number of the packet, and
the Y field is the "ACK number" -- the sequence number of the last
packet correctly received by the system sending the data packet.  (See
the section on data transfer and acknowledgement.)

A data packet header is always followed by a data segment of size
indicated by the Kbyte.


2.3.1  Long Data Packets

If the packet type is "long data", all 'n' bytes of the data segment
(where 'n' is denoted by the Kbyte) contain data.
^L
UUCP "G" PROTOCOL                                                 Page 6
UN047, Tuesday, 4-5 pm


2.3.2  Short Data Packets

If the packet type is "short data", the data segment is still sent in
its entirety, but the first one or two bytes indicate the _^Hd_^Hi_^Hf_^Hf_^He_^Hr_^He_^Hn_^Hc_^He
between its physical (transmitted) length and the number of bytes to be
passed to the presentation level (its "logical length"):


      7   6   5   4   3   2   1   0
    +---+---------------------------+
    | 0 | length difference (1-127) |
    +---+---------------------------+

    or, if the difference is too large to fit in seven bits,

      7   6   5   4   3   2   1   0   7   6   5   4   3   2   1   0
    +---+---------------------------+-------------------------------+
    | 1 | length difference(lo bits)| length difference (high bits) |
    +---+---------------------------+-------------------------------+
        first byte of data segment     second byte of data segment


For example, a data packet with a physical segment size of 64 (Kbyte=2),
but an effective (logical) length of zero, would be sent by sending a
"short data" packet where the data segment consisted of a byte with the
numeric value 64, e.g.


      7   6   5   4   3   2   1   0
    +---+---------------------------+
    | 0 | 1   0   0   0   0   0   0 |
    +---+---------------------------+


followed by 63 additional bytes (whose contents would be ignored by the
receiver).
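
A receiver might recover the logical length along the following lines.
This is a sketch, not code from any implementation: g_short_payload()
is a made-up name, and it assumes that the payload begins immediately
after the one or two length bytes and that the length bytes themselves
count toward the difference (which is what the example above implies).

```c
#include <assert.h>
#include <stddef.h>

/* seg[0..physlen) is the data segment of a "short data" packet.
 * On success, stores the logical length in *loglen and returns a
 * pointer to the first payload byte.  Returns NULL if the length
 * field is inconsistent with the physical length. */
const unsigned char *g_short_payload(const unsigned char *seg,
                                     size_t physlen, size_t *loglen)
{
    size_t diff, hdr;

    assert(seg != NULL && loglen != NULL);
    if (physlen < 1)
        return NULL;
    if ((seg[0] & 0x80) == 0) {          /* one-byte form, 1..127 */
        diff = seg[0];
        hdr = 1;
    } else {                             /* two-byte form */
        if (physlen < 2)
            return NULL;
        diff = (size_t)(seg[0] & 0x7F) | ((size_t)seg[1] << 7);
        hdr = 2;
    }
    if (diff < hdr || diff > physlen)
        return NULL;
    *loglen = physlen - diff;
    return seg + hdr;
}
```

With the 64-byte example above (first byte 64), this yields a logical
length of zero.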


3  G PROTOCOL ("DATA LINK LAYER") INITIALIZATION

G protocol is initialized via an exchange of the INITA, INITB, and INITC
packets.  A simple approach:

      o  Each system starts sending INITA's to the other

      o  Upon receiving an INITA, send an INITB

      o  Upon receiving an INITB, send an INITC

      o  When both have sent and received an INITC, initialization is
         complete

When the INITx packets are received each system sets its uucp parameters
(data segment size and maximum transmit window size) to the _^Hs_^Hm_^Ha_^Hl_^Hl_^He_^Hr of
what it can handle and what it gets from the packet.
^L
UUCP "G" PROTOCOL                                                 Page 7
UN047, Tuesday, 4-5 pm


4  G PROTOCOL DATA TRANSFER AND ACKNOWLEDGEMENT

After initialization, data packets are exchanged.  Thus, all subsequent
traffic consists of Shortdata, Longdata, ACK, and NAK packets (until the
end of the session, when CLOSE packets are exchanged).

      o  Each data packet must be _^Ha_^Hc_^Hk_^Hn_^Ho_^Hw_^Hl_^He_^Hd_^Hg_^He_^Hd by the receiver.

      o  Data packets can only be acknowledged in sequence.

      o  If a data packet arrives corrupted (as determined via
         checksum), the receiver sends a NAK (requesting retransmission)
         or simply does not send an ACK, allowing the sender to time out
         awaiting an ACK (which has the same effect).

      o  The first packet sent by each transmitter has sequence number
         one.  Packet sequence numbers are modulo 8 (ie 1, 2, ..., 7, 0,
         1, ...)

      o  Uucp (in most implementations) uses a transmit window size of
         more than 1 (typically 3, maximum 7 due to the three-bit packet
         sequence numbers)

          -  After sending one data packet the sender need not wait for
             an ACK before sending the next data packet, until _^Hw_^Hi_^Hn_^Hd_^Ho_^Hw
             _^Hs_^Hi_^Hz_^He unACK'd packets have been sent

          -  Colloquially this is known as a "windowed" protocol.  (As
             opposed to stop-and-wait, e.g. most Kermit implementations)

          -  After the transmit window is full (ie _^Hw_^Hi_^Hn_^Hd_^Ho_^Hw _^Hs_^Hi_^Hz_^He unACK'd
             packets have been sent), transmitter must wait for an ACK
             of (at least) the first packet in the transmit window
             before sending another packet

      o  g protocol's _^Hr_^He_^Hc_^He_^Hi_^Hv_^He _^Hw_^Hi_^Hn_^Hd_^Ho_^Hw _^Hs_^Hi_^Hz_^He is 1 -- at any given time
         there is only one packet which is valid to be received.

      o  An ACK provides the number of the last correctly-received
         packet, and implies that all earlier packet numbers have also
         been correctly received.  Thus a single ACK may acknowledge
         multiple packets.

      o  The "ACK number" -- the number of the most recent correctly-
         received packet -- also appears in the Y field of the headers
         of data packets.  If a system is about to acknowledge a packet,
         but happens to have a data packet to send, it need not send an
         explicit ACK; putting the number of the packet to be ACK'd in
         the Y field of the data packet is sufficient.  This doesn't
         happen often, because uucp's "upper layers" don't use the link
         in both directions at once.

      o  A NAK provides _^Ht_^Hh_^He _^Hs_^Ha_^Hm_^He _^Hn_^Hu_^Hm_^Hb_^He_^Hr as an ACK but requests
         retransmission of all outstanding packets with higher (modulo
         8) numbers.
^L
UUCP "G" PROTOCOL                                                 Page 8
UN047, Tuesday, 4-5 pm


      o  Both ACKs and NAKs provide acknowledgement for not only the
         indicated packet but also all previous packets in the transmit
         window ("stacked ACKs").

This is known as a "go back n" protocol.  It is relatively easy to
implement but performance suffers when errors occur.

More complex protocols allow a receive window size of greater than one
with "selective reject", and this was in fact the intent of the SRJ
control packet.  An SRJ says "retransmit this one packet, I didn't get
it right".  This avoids having to resend an entire transmit window to
correct an error in just one packet.
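
The modulo-8 bookkeeping behind the transmit window can be sketched as
follows.  The names are hypothetical; last_acked and last_sent are
assumed to both start at 0, so that the first data packet sent is
number 1.

```c
#include <assert.h>

/* Sequence number following seq, modulo 8. */
unsigned g_next(unsigned seq)
{
    return (seq + 1) & 7;
}

/* Number of data packets sent but not yet acknowledged. */
unsigned g_outstanding(unsigned last_acked, unsigned last_sent)
{
    return (last_sent - last_acked) & 7;
}

/* May the sender transmit another data packet without waiting? */
int g_can_send(unsigned last_acked, unsigned last_sent, unsigned window)
{
    assert(window >= 1 && window <= 7);
    return g_outstanding(last_acked, last_sent) < window;
}

/* Does the Y field of an incoming ACK or NAK name one of the packets
 * still outstanding?  If so, it acknowledges that packet and all the
 * earlier ones in the window. */
int g_ack_in_window(unsigned last_acked, unsigned last_sent, unsigned y)
{
    unsigned d = (y - last_acked) & 7;

    return d >= 1 && d <= g_outstanding(last_acked, last_sent);
}
```

On a NAK, a go-back-n sender would first apply the acknowledgement and
then resend every packet from g_next(y) through last_sent.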
^L
UUCP "G" PROTOCOL                                                 Page 9
UN047, Tuesday, 4-5 pm


4.1  A Simple Data Exchange

          ______________________________________________________________

          Sender                                                Receiver
          ------>                                              <--------

          Data 1 
                                                    (receives Data 1 ok)
                                                                   Ack 1
          Data 2
                                                    (receives Data 2 ok)
                                                                   Ack 2
          (receives Ack 1)
          Data 3
                                                    (receives Data 3 ok)
                                                                   Ack 3
          (receives Ack 2)
          Data 4
                                              (receives Data 4 in error)
                                                                   NAK 3
          (receives Ack 3)
          Data 5
                                    (receives Data 5 ok but out of seq.)
          (receives NAK 3)
          (resends everything in window)
          Data 4
          Data 5
                                                    (receives Data 4 ok)
                                                                   Ack 4
          Data 6
                                                    (receives Data 5 ok)
                                                                   Ack 5
          (receives Ack 4)
          Data 7
                                      etc...
          ______________________________________________________________


4.2  Data Link Layer:  Interesting Cases, Implementation Notes, And War
     Stories

4.2.1  Corrupted Packet Headers

When looking for a packet, the receiver should scan for a DLE, then read
the five bytes that follow it and check the XOR.

      o  If the XOR check fails, the "packet header" cannot be trusted.
         Scanning for the next DLE begins at the next byte following the
         current DLE (_^Hn_^Ho_^Ht six bytes after it).

      o  If the XOR check succeeds, but something's wrong with the
         header (illegal Kbyte value, for example), the receiver should
         treat this the same as an XOR failure.  (XOR checks aren't all
         that robust)
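
The scanning rule above might look like this in C.  It is a sketch
under the assumption that the input has been collected into a buffer;
g_find_header() is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Search buf[0..n) for the first six bytes that pass as a header.
 * Returns the offset of the DLE, or n if no complete header was found.
 * A real receiver would keep the last few unexamined bytes and retry
 * once more input arrives. */
size_t g_find_header(const unsigned char *buf, size_t n)
{
    size_t i;

    assert(buf != NULL);
    /* Note the i++: after a failed candidate the scan resumes at the
     * very next byte, not six bytes further on. */
    for (i = 0; i + 6 <= n; i++) {
        const unsigned char *h = buf + i;

        if (h[0] != 0x10)
            continue;                    /* not a DLE */
        if (h[1] < 1 || h[1] > 9)
            continue;                    /* illegal K-byte */
        if ((h[1] ^ h[2] ^ h[3] ^ h[4]) != h[5])
            continue;                    /* XOR check failed */
        return i;
    }
    return n;
}
```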
^L
UUCP "G" PROTOCOL                                                Page 10
UN047, Tuesday, 4-5 pm


4.2.2  Corrupted Data Segments

If the header on a data packet is good (XOR is okay), but the checksum
indicates that the data segment is corrupt:

      o  Send a NAK to indicate receipt of a corrupted data packet (but
         see below)

      o  Rescan the input looking for the next packet header, starting
         with the first byte of the data segment

This seems counterintuitive; if we can trust the header it seems that we
ought to be able to trust it to tell us how long the data segment is,
and just skip it.

Suppose the file being received is a copy of just one direction of a
uucp data transfer.  If we pay no attention to what's inside the data
segments we see this (assuming 32-byte data segments, K = 1):


        HHHHHH DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD HHHHHH DDDD...

        but since the data contains a uucp packet stream, the data 
        fields contain

        HHHHHH ddddhhhhhhdddddddddddddddddddddd HHHHHH ddddddddddhhhhhh
                                                               
        1          2                            3          4     5


Now, suppose that the header at 1 gets corrupted in transmission, so the
XOR check fails.  We junk it and start looking for another DLE, and we
find the "imbedded header" at 2.  It passes xor, so we interpret it as a
data header, and read the next 32 characters.  If it happens to be the
right packet number it'll fail on checksum, but more likely it's the
wrong packet number; whatever, we send a NAK to tell the sender what we
want, and resume scanning.  Since we rescan all of (what we thought was)
the data segment, we'll find the DLE at 3...and we're back in sync.

If instead we had said "bad data segment, skip 32 characters" we'd be
looking for the next header starting at 4.  We'd then find the next
imbedded header at 5....  we'd eventually get back in sync, but it might
take a LONG time!


4.2.3  Missing Packets (out-of-sequence Packets)

Uucp "g" protocol employs a receive window size of one; that is, there
is only one packet that may be received at any time (the next one that
follows the previous correctly-received packet).

If an out-of-sequence packet is received the correct response is to send
a NAK (with the Y field, as always, being the number of the last good
packet).
^L
UUCP "G" PROTOCOL                                                Page 11
UN047, Tuesday, 4-5 pm


4.2.4  Too Many NAKs

It is not good to send a NAK for every error condition.  Let's revisit
the simple data exchange diagrammed previously:

          ______________________________________________________________

          Sender                                                Receiver
          ------>                                              <--------

          Data 3
                                                    (receives Data 3 ok)
                                                                   Ack 3
          (receives Ack 2)
          Data 4
                        (data 4 is corrupted on the link)
                                               (receives Data 4 w/error)
                                                              NAK 3 (#1)
          (receives Ack 3)
          Data 5
                                           (receives Data 5 out of seq.)
                                                              NAK 3 (#2)
          Data 6
                                           (receives Data 6 out of seq.)
                                                              NAK 3 (#3)
          (receives NAK 3 #1)
          Data 4
          Data 5
          Data 6
                                                    (receives Data 4 ok)
                                                                   Ack 4
          (receives NAK 3 #2)
          Data 4
          Data 5
          Data 6
          (receives Ack 4)
                                           (receives Data 4 out of seq.)
                                                                   NAK 4
                                      etc...
          ______________________________________________________________



      o  Unix uucps address this by being very "shy" about sending NAKs,
         mostly just letting the sender time out.

         This works but hurts throughput.

      o  DECUS uucp keeps track of the number of received error packets
         and sends the NAK only if the number modulo the window size =
         1.  Thus we send the first NAK right away, but no more until at
         least a window-size worth of packets have been received.
^L
UUCP "G" PROTOCOL                                                Page 12
UN047, Tuesday, 4-5 pm


4.2.5  Duplicate Packets

If a packet is received with a number that's already been received and
ACK'd, it's either

      o  a duplicate of one we got earlier; our ACKs got lost, so the
         sender timed out and is resending its window

      o  a future packet, and the intervening seven packets were bad
         (but this can't happen, since max window size is seven, not
         eight)

Since packets must always be acked in sequence, the only correct thing
is to send a NAK indicating the last good (in sequence) received packet.


4.2.6  Miscellaneous

4.2.6.1  Control Packet Priority

The control packet types denote the priority of the packets.  That is,
if several control packets are to be sent, the lower-numbered ones are
sent first.  (CLOSE before anything else, for example.) Control packets
in turn have priority over data packets.


4.2.6.2  Varying Physical Packet Lengths

Although the protocol allows for data packets with different K values
(physical lengths) to be sent in a session, in practice, both ends
always use the value negotiated at the start of the session.


4.2.6.3  Interpacket Noise

The DLE that begins a packet should follow right on the heels of the
previous packet.  Berkeley 4.3 uucp has a bug that causes it to send two
null bytes between packets.  A robust implementation will _^Hn_^Ho_^Ht report
errors when it encounters two null bytes while looking for a DLE.


4.2.6.4  Common Parameters

Almost all implementations seem to use a window size of 3 and a data
packet physical length of 64 bytes (Kbyte = 2).  Some implementations
will agree to use a larger window size or packet length, but do not do
so correctly.
^L
UUCP "G" PROTOCOL                                                Page 13
UN047, Tuesday, 4-5 pm


4.2.6.5  Checksum Details

The checksum that is placed in the packet header for data packets has
the value

               MAGIC - (chksum(buf,len) ^ (0xFF & cbyte))

For control packets, the checksum value is simply

                         MAGIC - (0xFF & cbyte)

where
         buf is the data segment
         len is its (physical) length
         chksum() is shown below
         cbyte is the value of the control byte
         MAGIC is 0125252 (octal) (i.e. alternating bits set)

If a data packet is resent, the checksum must be recalculated, as the
checksum includes the control byte and the Y field (last good received
packet number) of the control byte might change between transmissions of
the same data packet.

/*
 * the checksum routine, copied from G. L. Chesson's article,
 * modified by John Gilmore to reflect actual behavior 
 * (see References)
 */

int
chksum(s,n)
register unsigned char *s;
register n;
{
        register short sum;
        register unsigned short t;
        register short x;

        sum = -1;
        x = 0;
        do {
                if (sum < 0) {
                        sum <<= 1;
                        sum++;
                }
                else
                        sum <<= 1;
                t = sum;
                sum += *s++ & 0377;
                x += sum ^ n;
                if ((unsigned short)sum <= t)
                        sum ^= x;
        } while (--n > 0);

        return(sum);
}
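
The routine above is K&R C and relies on "short" being 16 bits and on
left-shifting negative values.  For compilers where that is not safe,
here is a sketch using fixed-width unsigned arithmetic that is intended
to produce the same 16-bit result, together with the two header
checksum formulas given earlier.  The names g_chksum(), g_data_check()
and g_ctl_check() are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define G_MAGIC 0xAAAAu   /* 0125252 octal */

/* Same steps as chksum() above, on 16-bit unsigned values. */
uint16_t g_chksum(const unsigned char *s, size_t n)
{
    uint16_t sum = 0xFFFF;    /* -1 as a 16-bit value */
    uint16_t x = 0;

    assert(s != NULL && n > 0);
    do {
        uint16_t t;

        /* rotate left one bit (the "sum < 0" test above picks out
         * the top bit, which reappears as the low bit) */
        sum = (uint16_t)((sum << 1) | (sum >> 15));
        t = sum;
        sum = (uint16_t)(sum + *s++);
        x = (uint16_t)(x + (sum ^ (uint16_t)n));
        if (sum <= t)         /* byte was zero, or the add wrapped */
            sum ^= x;
    } while (--n > 0);
    return sum;
}

/* Header checksum value for a data packet. */
uint16_t g_data_check(const unsigned char *buf, size_t len, unsigned cbyte)
{
    return (uint16_t)(G_MAGIC - (g_chksum(buf, len) ^ (0xFFu & cbyte)));
}

/* Header checksum value for a control packet. */
uint16_t g_ctl_check(unsigned cbyte)
{
    return (uint16_t)(G_MAGIC - (0xFFu & cbyte));
}
```

Because cbyte goes into the result, g_data_check() has to be called
again whenever a data packet is resent with a different Y field.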
^L
UUCP "G" PROTOCOL                                                Page 14
UN047, Tuesday, 4-5 pm


5  FILE TRANSFER "MESSAGES" ("APPLICATION LAYER")

The data packets and acknowledgements form a _^Hr_^He_^Hl_^Hi_^Ha_^Hb_^Hl_^He _^Hd_^Ha_^Ht_^Ha _^Hl_^Hi_^Hn_^Hk which
the two uucico programs use to exchange _^Hm_^He_^Hs_^Hs_^Ha_^Hg_^He_^Hs and _^Hf_^Hi_^Hl_^He_^Hs.

At the start of a uucp session the caller is in _^Hm_^Ha_^Hs_^Ht_^He_^Hr _^Hm_^Ho_^Hd_^He and the
answerer is in _^Hs_^Hl_^Ha_^Hv_^He.  The caller searches its uucp spool directory for
work (queued requests to send or receive files).  Each such "work
request" results in an "S" (request to send), "R" (request to receive),
or "X" (remote request for uucp) message being sent to the slave:


        S _^Hs_^Hr_^Hc_^Hf_^Hi_^Hl_^He_^H__^Ho_^Hn_^H__^Hm_^Ha_^Hs_^Ht_^He_^Hr _^Ht_^Ha_^Hr_^Hg_^He_^Ht_^Hf_^Hi_^Hl_^He_^H__^Ho_^Hn_^H__^Hs_^Hl_^Ha_^Hv_^He ...

        R _^Hs_^Hr_^Hc_^Hf_^Hi_^Hl_^He_^H__^Ho_^Hn_^H__^Hs_^Hl_^Ha_^Hv_^He _^Ht_^Ha_^Hr_^Hg_^He_^Ht_^Hf_^Hi_^Hl_^He_^H__^Ho_^Hn_^H__^Hm_^Ha_^Hs_^Ht_^He_^Hr ...

        X _^Hs_^Hr_^Hc_^Hf_^Hi_^Hl_^He _^Hu_^Hu_^Hc_^Hp_^Hh_^Ho_^Hs_^Ht_^H!_^Ht_^Ha_^Hr_^Hg_^He_^Ht_^Hf_^Hi_^Hl_^He ...  (not covered here)


The slave will respond with one of:


            (for send requests)

        SY              ok
        SN2             not permitted
        SN4             can't create temporary file

            (for receive requests)

        RY<mode>        ok (gives protection mode of file)
        RN2             not permitted


If the master receives any of the reject messages it simply looks for
its next queued work request.

If the master receives an "ok", the master (for file send requests) or
the slave (for file receive requests) sends the file to the other side.
The receiver of the file then sends one of the following messages:


        CN5             couldn't move temp file into place
        CY              file received ok


to the sender.
^L
UUCP "G" PROTOCOL                                                Page 15
UN047, Tuesday, 4-5 pm


6  ROLE EXCHANGE

When the master has no more queued work requests it asks the slave if it
wants to hang up.  The slave then looks for work and, if any is found,
the two systems exchange master and slave roles.

Example:  Slave has no work, hangup is agreed upon:

          ______________________________________________________________

          Master                                                   Slave
          ------>                                                 <-----

          H
                                                                      HY
          HY
                        (at this point both systems invoke
                         the G protocol shutdown routine)
          ______________________________________________________________



Example:  Slave has work, roles are exchanged:

          ______________________________________________________________

          Master                                                   Slave
          ------>                                                 <-----

          H
                                                                      HN

                       (at this point roles are exchanged)
          Slave                                                   Master
          ----->                                                 <------

                                                       S file1 file2 ...

          ______________________________________________________________


When the new master runs out of work it will again ask the former master
if it wants to hang up (as more work may have been queued on that side
during its tenure as slave).  The process repeats until neither system
finds any work for the other.
^L
UUCP "G" PROTOCOL                                                Page 16
UN047, Tuesday, 4-5 pm


7  SENDING AND RECEIVING MESSAGES AND FILES ("PRESENTATION LAYER")

7.1  Sending Messages


      o  Messages (S, SY, SNn, R, RY, RNn, CY, CNn, H, HY, HN) are sent
         in longdata packets.

      o  Some uucps send the last (or only) part of a message in a
         shortdata packet.

      o  As many packets as necessary for the message are used.

      o  The message is terminated by a null byte.

      o  Only one message is sent at a time.

      o  Some uucps seem to expect the rest of the packet to be padded
         with nulls.
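
On the receiving side, the rules above amount to collecting packet
payloads until a null byte turns up.  A minimal sketch follows; the
struct, the function name and the 256-byte limit are all choices made
for this example only.

```c
#include <assert.h>
#include <stddef.h>

struct g_msg {
    char   text[256];  /* message text, NUL-terminated once done */
    size_t len;        /* characters collected so far */
    int    done;       /* set once the terminating NUL is seen */
};

/* Append the logical payload of one data packet to the message.
 * Returns 1 once the terminating NUL has been seen, 0 if more packets
 * are needed, or -1 if the message does not fit.  Bytes after the NUL
 * (e.g. null padding) are ignored. */
int g_msg_add(struct g_msg *m, const unsigned char *data, size_t n)
{
    size_t i;

    assert(m != NULL && data != NULL);
    for (i = 0; i < n && !m->done; i++) {
        if (m->len + 1 >= sizeof m->text)
            return -1;
        m->text[m->len] = (char)data[i];
        if (data[i] == 0)
            m->done = 1;
        else
            m->len++;
    }
    return m->done;
}
```

The caller starts with a zeroed struct g_msg and feeds it each packet's
payload in turn until the function returns 1.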


7.2  Sending Files


      o  Since Unix, and therefore uucp, has no concept of "file
         attributes", the file is simply copied, byte for byte, into
         packets and transmitted.  (The file protection mode is sent
         either as a parameter on the S message or with the RY message.)

      o  Transmission of a shortdata packet with logical length of zero
         indicates end of file (that is, the previous packet contains
         the last bytes in the file).

These details are specific to the "g" protocol -- thus "g" might be said
to include both the data link and presentation layers (in OSI terms).  
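
The file-sending rules can be sketched the same way.  Again this is an
illustration under assumptions:  the 64-byte segment size and the
function names are invented for the example, and each packet is
represented only by its type and its logical data, not by its actual
bytes on the wire.

```python
SEG_SIZE = 64  # assumed data segment size, in bytes


def file_to_packets(contents, seg_size=SEG_SIZE):
    """Yield (packet_type, logical_data) pairs for one file."""
    for start in range(0, len(contents), seg_size):
        chunk = contents[start:start + seg_size]
        # A full segment goes as longdata, a partial one as shortdata.
        kind = "longdata" if len(chunk) == seg_size else "shortdata"
        yield kind, chunk
    yield "shortdata", b""      # logical length zero marks end of file


def packets_to_file(packets):
    """Rebuild the file, stopping at the zero-length shortdata packet."""
    out = bytearray()
    for kind, data in packets:
        if kind == "shortdata" and len(data) == 0:
            break
        out += data
    return bytes(out)
```

Note that the file is treated as an uninterpreted byte string
throughout, in keeping with the absence of file attributes.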


8  G PROTOCOL SHUTDOWN

When both systems have agreed (via H, HY, HY message exchange) to hang
up, they each call the gturnoff() function, causing CLOSE control
messages to be exchanged:

          ______________________________________________________________

          (either)                                              (either)
          -------->                                            <--------

                                                                   CLOSE
          CLOSE
          ______________________________________________________________


9  "OVER AND OUT"

After the data link layer has exchanged "CLOSE" messages, one more set
of "over and out" messages is exchanged:

          ______________________________________________________________

          Caller                                                Answerer
          ------>                                              <--------

          OOOOOO  (six "O"'s)
                                                  (seven "O"'s)  OOOOOOO
          ______________________________________________________________


Note that since the "g" protocol has already been turned off, these
messages are _^Hn_^Ho_^Ht preceded by six-byte headers.  They are, however,
preceded by DLE and terminated by NUL, as were the Shere, S_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He,
P_^Hp_^Hr_^Ho_^Ht_^Hs, etc., messages at the beginning of the session.
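
As a small Python sketch of this framing (the helper names frame and
unframe are invented for the example; DLE and NUL are the ASCII control
characters with values 0x10 and 0x00):

```python
DLE = b"\x10"  # ASCII DLE (data link escape)
NUL = b"\x00"  # ASCII NUL


def frame(text):
    """Wrap a message as DLE <text> NUL."""
    return DLE + text.encode("ascii") + NUL


def unframe(raw):
    """Strip the DLE ... NUL framing, or complain if it is missing."""
    if not (raw.startswith(DLE) and raw.endswith(NUL)):
        raise ValueError("not a DLE ... NUL framed message")
    return raw[1:-1].decode("ascii")


CALLER_SIGNOFF = frame("O" * 6)     # six "O"'s from the caller
ANSWERER_SIGNOFF = frame("O" * 7)   # seven "O"'s from the answerer
```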


10  A COMPLETE SESSION

Here the caller sends one file and receives one, after which the
answerer sends one file.  (ACKs, NAKs, and retransmits for the g
protocol are not shown.)

          ______________________________________________________________

          Caller                                                Answerer
          ------>                                              <--------

          (places call, logs in, etc.)
                                                 (uucico program starts)

              (subsequent transmissions are framed by DLE .... NUL)

                                                          Shere=_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He
          S_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He
                                                                     ROK
                                                                    Pfgt
          Ug

                  (subsequent packets have "g" protocol headers)

          Control(INITA)                                  Control(INITA)
          Control(INITB)                                  Control(INITB)
          Control(INITC)                                  Control(INITC)
          Longdata(S fromfile1 tofile1 ...)
                                                            Longdata(SY)
          Longdata(file contents)
          Longdata(file contents)
                   ...
          Shortdata(last few bytes of file)
          Shortdata(logical length 0)
                                                            Longdata(CY)
          Longdata(R fromfile1 tofile1 ...)
                                                            Longdata(RY)
                                                 Longdata(file contents)
                                                 Longdata(file contents)
                                                          ...           
                                       Shortdata(last few bytes of file)
                                             Shortdata(logical length 0)
          Longdata(CY)
          Longdata(H)
                                                            Longdata(HN)
                                       Longdata(S fromfile1 tofile1 ...)
          Longdata(SY)
                                                 Longdata(file contents)
                                                 Longdata(file contents)
                                                          ...           
                                       Shortdata(last few bytes of file)
                                             Shortdata(logical length 0)
          Longdata(CY)
                                                             Longdata(H)
          Longdata(HY)
                                                            Longdata(HY)
          Control(CLOSE)                                  Control(CLOSE)

            (subsequent transmissions are framed only by DLE .... NUL)

          OOOOOO
                                                                 OOOOOOO
          ______________________________________________________________




11  UUCP WORK FILES:  ORIGINS OF "S" AND "R" COMMANDS

The uucico programs at each system are driven by _^Hw_^Ho_^Hr_^Hk _^Hf_^Hi_^Hl_^He_^Hs in the uucp
spool directory.

      o  To decide whether to call a uucp neighbor, uucico looks for
         work files associated with the neighbor's uucp host name.

      o  When answering a call from a uucp neighbor, after the neighbor
         asks "do you want to hang up?", the answering uucico looks for
         work files associated with the calling system's uucp host name.


11.1  Work File Names

An older work file naming convention is:

        C._^Hr_^He_^Hm_^Ho_^Ht_^He_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He _^Hg_^Hr_^Ha_^Hd_^He _^Hu_^Hn_^Hi_^Hq

For example,

        C.crashA8128


      o  "C." is a constant

      o  _^Hr_^He_^Hm_^Ho_^Ht_^He_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He is the uucp host name of the remote system (7
         chars max, even if the actual host name is longer)

      o  _^Hg_^Hr_^Ha_^Hd_^He is a single letter denoting the priority of the work
         request

         For example, "A" is the most important, "Z" is very
         unimportant, and "N" is somewhere in between.  Since the files
         are found by searching the spool directory, and since files are
         kept in lexical order by file name, the most important files
         are found first.

         Some uucps can be configured to only place outbound calls for
         certain grades of work at certain times of day; for example,
         mail is typically assigned a very high grade, and one might
         want to transfer mail almost as soon as it is queued, with file
         transfers somewhat less important, and news being relegated to
         evenings and nights.

      o  _^Hu_^Hn_^Hi_^Hq is four characters (four digits in older implementations)
         which make this work file name unique, even though other work
         files for the same uucp host, with the same grade, may exist.

      o  Newer systems use four or six alphanumeric characters, and case
         _^Hi_^Hs significant, e.g. C.noscMabcdef is a different file from
         C.noscMabcdeF.

      o  Some Unix implementations use different conventions.

          -  HoneyDanBer uses a different subdirectory for each neighbor
             system (the subdirectory name is the neighbor system's uucp
             host name).  (see references)

          -  One implementation uses one subdirectory for "C." files,
             and another for all data files.

      o  Other systems with restrictive file names (e.g. MS-DOS) can use
         different conventions, as long as the same functions can be
         performed.  For example,

                 uucp/spool/C.simpactA1234

         might become

                 C:UUCP\SPOOL\C\SIMPACT\A1234
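
The older "C." naming convention described at the start of this section
can be illustrated with a short Python sketch.  The function names are
invented, and the code assumes the four-character uniq field of the
older convention; it would need adjusting for six-character values.

```python
def make_work_name(host, grade, uniq):
    """Build an older-style work file name; host is cut to 7 chars."""
    return "C." + host[:7] + grade + uniq


def parse_work_name(name):
    """Split an older-style name into (host, grade, uniq)."""
    # Shortest legal name: "C." + 1 host char + grade + 4-char uniq.
    if not name.startswith("C.") or len(name) < 8:
        raise ValueError("not an older-style work file name: " + name)
    body = name[2:]
    return body[:-5], body[-5], body[-4:]
```

Because the grade letter immediately follows the host name, sorting
the names for one neighbor lexically puts the most important work
first:  sorted() places C.crashA8128 ahead of C.crashN0002.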

After starting up in master mode, the uucico program processes each work
file associated with the remote system.

When no work files remain, the master mode uucico offers to switch roles
(i.e. it sends an "H" message to the slave).  The slave searches for
work files containing the master's host name.  If any are found, roles
are switched, otherwise the systems agree to hang up.


11.2  Work File Contents

Each work file contains one or more "S", "R", or "X" (not covered here)
commands.  These are processed (i.e. sent to the neighbor, and the
indicated file transferred) in turn.


11.2.1  "S" (Send File From Master To Slave) Commands

Format:
        S _^Hs_^Hr_^Hc_^Hf_^Hi_^Hl_^He _^Ht_^Hr_^Hg_^Ht_^Hf_^Hi_^Hl_^He _^Hu_^Hs_^He_^Hr _^Ho_^Hp_^Ht_^Hs _^Hd_^Ha_^Ht_^Ha_^Hf_^Hi_^Hl_^He _^Hm_^Ho_^Hd_^He _^Hn_^Ho_^Ht_^Hi_^Hf_^Hy


      o  _^Hs_^Hr_^Hc_^Hf_^Hi_^Hl_^He is the fully-qualified pathname of the source file

      o  _^Ht_^Hr_^Hg_^Ht_^Hf_^Hi_^Hl_^He is the fully-qualified pathname to which the file is
         to be copied on the target system

      o  _^Hu_^Hs_^He_^Hr is the sending user's login name

      o  _^Ho_^Hp_^Ht_^Hs is a list of options, preceded by "-"

          -  c   send directly from _^Hs_^Hr_^Hc_^Hf_^Hi_^Hl_^He.  If absent, send from copy
             of file in spool directory, name given by _^Hd_^Ha_^Ht_^Ha_^Hf_^Hi_^Hl_^He, and
             delete copy after transferring; in this case _^Hs_^Hr_^Hc_^Hf_^Hi_^Hl_^He is
             informational, providing the file's original name.
          -  m   notify sender by mail when copy is completed
          -  n   notify user _^Hn_^Ho_^Ht_^Hi_^Hf_^Hy on target system when file arrives

         If no options are present, a hyphen appears here as a
         placeholder.

      o  _^Hd_^Ha_^Ht_^Ha_^Hf_^Hi_^Hl_^He is "D.0" if the "c" option is present; otherwise it
         gives the name of the data file in the spool directory for
         uucico to send.

      o  _^Hm_^Ho_^Hd_^He is the Unix-style file protection mode (in octal) to be
         given to the new file

      o  _^Hn_^Ho_^Ht_^Hi_^Hf_^Hy is the name of a user on the target system; present only
         if "n" option present


11.2.2  "R" (Receive File From Slave To Master) Commands

Same as "S" commands, except _^Hd_^Ha_^Ht_^Ha_^Hf_^Hi_^Hl_^He, _^Hm_^Ho_^Hd_^He, and _^Hn_^Ho_^Ht_^Hi_^Hf_^Hy fields are not
present.
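
A minimal Python sketch of a parser for these two command formats
follows.  The function name and the dictionary keys are invented for
the example, and it assumes that fields are separated by white space
and that file names contain no embedded blanks.

```python
def parse_command(line):
    """Parse one "S" or "R" line from a work file into a dict."""
    fields = line.split()
    kind = fields[0]
    if kind not in ("S", "R"):
        raise ValueError("unsupported command: " + line)
    cmd = {
        "type": kind,
        "srcfile": fields[1],
        "trgtfile": fields[2],
        "user": fields[3],
        # A lone "-" is the placeholder for "no options".
        "opts": set(fields[4].lstrip("-")),
    }
    if kind == "S":
        cmd["datafile"] = fields[5]
        cmd["mode"] = int(fields[6], 8)    # octal protection mode
        # notify is present only when the "n" option is given
        cmd["notify"] = fields[7] if len(fields) > 7 else None
    return cmd
```

The "X" commands, which are not covered here, are rejected by this
sketch rather than guessed at.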


11.3  Origins Of Work Files

Work files are created when:

      o  Users issue "uucp" commands

         (these are explicit requests to copy files to or from uucp
         neighbors)

      o  Users send mail which goes out via uucp

         (the "mailer" places the mail in a file in a spool directory,
         and creates the necessary work file to cause it to be sent to
         the first hop in the uucp path)

      o  "Route-through" uucp mail arrives

         (this is really just the same as the preceding case -- the
         "mailer" is seen as just another user)

      o  News is queued for transmission to another system

         (as with mail, the news transfer programs place the news batch
         in a file in the spool directory, and create the necessary work
         file)


12  EXECUTION FILES ("X." FILES)

The X-file mechanism is used for _^Hr_^He_^Hm_^Ho_^Ht_^He _^Hc_^Ho_^Hm_^Hm_^Ha_^Hn_^Hd _^He_^Hx_^He_^Hc_^Hu_^Ht_^Hi_^Ho_^Hn.

      o  Receipt of a file named X.anything in the spool directory
         triggers the "uuxqt" program

      o  uuxqt interprets the X-file

      o  Typical uses are for mail and news

      o  The uux command allows users to request remote execution of
         arbitrary commands (permissions permitting)


12.1  X-File Contents

The X-File contains one or more lines.  The first character on each line
specifies a command type.  It is followed by a blank and one or more
parameters which are interpreted according to the command type, as
follows:

(From a Usenet article by Dr. Peter Honeyman)

        C   command to be executed
        I   file name for command input
        O   file name for command output
        F   file required to be present before running command; optional
            second argument gives name for the file at execution time
        R   name of user who issued request (relative to the host named
            on the U line)
        U   second arg is name of host that passed this X. file to me;
            first arg is a user name on that host (overridden by R line)
        Z   send status notification (and error output) if command
            failed
        N   send no status notification if command failed
        n   send status notification if command succeeded
        B   return command input on error
        e   requests command be processed with sh(1)
        E   requests command be processed with exec(2)
        M   return status info to the named file on the requesting host
        #   comment line

Unrecognized lines are ignored, i.e. treated as comments.
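
To make the line format concrete, here is a minimal Python sketch of an
X-file reader.  It is not taken from any uuxqt; the names are invented,
and splitting the parameters on white space is a simplification (a real
uuxqt has to treat the command on the C line more carefully).

```python
KNOWN_TYPES = set("CIOFRUZNnBeEM")   # command types listed above


def parse_xfile(text):
    """Return (type, [args]) pairs; skip comments and unknown lines."""
    entries = []
    for line in text.splitlines():
        if not line or line[0] not in KNOWN_TYPES:
            continue                 # "#" and unrecognized lines
        entries.append((line[0], line[1:].split()))
    return entries
```

Treating unknown line types as comments, as the text describes, lets
an older uuxqt cope with X-files written by a newer uux.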


12.2  Example:  Sending Mail

User jeh on system simpact sends mail to user bblue on uucp neighbor
crash.

     1.  simpact's mailer creates mail text file in spool directory
         (file name D.simpactA3214)

     2.  simpact's mailer creates what will become an X-file in
         simpact's spool directory (file name B.simpactA3214)

         U jeh simpact
         F D.simpactA3214
         I D.simpactA3214
         C rmail bblue

     3.  simpact's mailer creates work file in simpact's spool directory
         (file name C.crashA0001)

         S D.simpactA3214 D.simpactA3214 jeh - D.simpactA3214 0666
         S B.simpactA3214 X.simpactA3214 jeh - B.simpactA3214 0666

     4.  simpact calls system "crash" and interprets the work file,
         transferring the "D." and "B." files (in that order).

         Note that when the "B." file arrives at crash, its name is
         X.simpactA3214 (see the second "S" command in the C file
         above).

     5.  The uuxqt program at crash interprets the X file, using the
         file D.simpactA3214 as input to the command "rmail bblue".
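
Steps 1 through 3 can be sketched in Python as a function that returns
the three spool files a mailer would create.  Everything about the
function itself (its name, its parameters, returning a dictionary
instead of writing files) is invented for the illustration; only the
file names and contents follow the example above.

```python
def queue_mail(local, remote, user, rcpt, text, uniq, work_uniq,
               grade="A"):
    """Return {spool file name: contents} for one outbound message."""
    tag = local + grade + uniq
    d_name, b_name, x_name = "D." + tag, "B." + tag, "X." + tag
    # Work file names carry at most 7 characters of the remote host.
    c_name = "C." + remote[:7] + grade + work_uniq
    xfile = "\n".join([
        "U %s %s" % (user, local),
        "F " + d_name,
        "I " + d_name,
        "C rmail " + rcpt,
    ]) + "\n"
    work = "\n".join([
        "S %s %s %s - %s 0666" % (d_name, d_name, user, d_name),
        # The "B." file is renamed to "X." on arrival at the remote.
        "S %s %s %s - %s 0666" % (b_name, x_name, user, b_name),
    ]) + "\n"
    return {d_name: text, b_name: xfile, c_name: work}
```

With the values from the example, queue_mail("simpact", "crash", "jeh",
"bblue", text, "3214", "0001") yields files named D.simpactA3214,
B.simpactA3214, and C.crashA0001.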

Notes:

      o  Some systems use a different _^Hu_^Hn_^Hi_^Hq value in the file name for
         each locally-created file (instead of relying on the first
         letter to keep them separate).

      o  Some systems use the "D." prefix for both the mail message text
         and for the file that will become the "X." file on the target
         system.  (In this case, the _^Hu_^Hn_^Hi_^Hq value must be different for
         these two files)

      o  This mechanism is designed so that, even with a single spool
         directory and uncoordinated _^Hu_^Hn_^Hi_^Hq values, all file names will be
         unique:

          -  Only the local system creates "B." and "C." files.

          -  Only the remote system creates "X." files.

          -  Only the local system creates files called
             "D._^Hl_^Ho_^Hc_^Ha_^Hl_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He*".

          -  Only the remote system creates files called
             "D._^Hr_^He_^Hm_^Ho_^Ht_^He_^Hh_^Ho_^Hs_^Ht_^Hn_^Ha_^Hm_^He*".


13  FILE NAME TRANSLATION

Systems with different file name syntax need not be barred from
participating in uucp transfers, and particularly not from mail and
news.


13.1  Receiving Mail And News


      o  Only received "D." and "X." files are of concern.

      o  Local uucico maps file names specified by Unix system to local
         requirements when processing received "S" commands (in slave
         mode).

      o  Local uuxqt performs the _^Hs_^Ha_^Hm_^He translation when handling file
         names in the "X." file.


13.2  Sending Mail And News


      o  Names of locally-created "B.", "C.", and "D." files can adhere
         to local conventions.

      o  When constructing the local work file ("C." file), build
         Unix-format _^Ht_^Hr_^Hg_^Ht_^Hf_^Hi_^Hl_^He names in the "S" commands.
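
As an illustration of such a mapping, the Python sketch below turns a
Unix-style spool file name into a path laid out like the MS-DOS example
in section 11.1.  The function name, the default spool location, and
the assumption of a four-character uniq field are all choices made for
the example.  Folding to upper case loses information on systems where
the case of the uniq characters is significant, so a real
implementation would need a more careful scheme.

```python
def to_dos_path(unix_name, spool="C:UUCP\\SPOOL"):
    """Map e.g. "C.simpactA1234" to a directory-per-field DOS path."""
    prefix, rest = unix_name.split(".", 1)   # "C", "simpactA1234"
    host, tail = rest[:-5], rest[-5:]        # tail = grade + uniq
    return "\\".join([spool, prefix, host, tail]).upper()
```

The same function would have to be used by both uucico and uuxqt so
that names in received "S" commands and names inside "X." files map to
the same local files.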


14  REFERENCES AND ACKNOWLEDGEMENTS

14.1  Uucp Protocols

(The following two papers were instrumental both in preparing this
article and in implementing uucp "g" for VMS.  I believe that all of the
material in these two papers is incorporated in this article.  I could
be wrong.)

Chuck Wegrzyn posted a Usenet article (date unknown) that described the
Shere exchange, over-and-out exchange, and the application layer.

"Packet Driver Protocol" by G. L. Chesson of Bell Laboratories (October
5, 1988), distributed via Usenet and Internet, is the standard reference
on the "g" protocol data link layer.


14.2  The Uucp System

_^HM_^Ha_^Hn_^Ha_^Hg_^Hi_^Hn_^Hg _^Hu_^Hu_^Hc_^Hp _^Ha_^Hn_^Hd _^HU_^Hs_^He_^Hn_^He_^Ht by Tim O'Reilly and Grace Todino (O'Reilly And
Associates, Newton, MA 02164) has a good description of the Shere
exchange and of work file and execution file contents.

"Uucp Implementation Description" by D. A. Nowitz, part of most complete
Unix documentation sets (check the _^HS_^Hu_^Hp_^Hp_^Hl_^He_^Hm_^He_^Hn_^Ht_^Ha_^Hr_^Hy _^HD_^Ho_^Hc_^Hu_^Hm_^He_^Hn_^Ht_^Hs volumes, if
present), provides a general description of each program in the uucp
suite.

"HoneyDanBer UUCP -- Bringing UNIX Systems into the Information Age" by
Bill Rieken and Jim Webb, _^H;_^Hl_^Ho_^Hg_^Hi_^Hn_^H:  (journal of the Usenix Association),
Volume 11, Numbers 3 and 4 (May/June and July/August, 1986).  Describes
"external" features of one of the newest Unix uucp implementations,
including new file name conventions, logging, and error handling.


14.3  General

A widely-recognized work on the subject of data communications protocols
is _^HC_^Ho_^Hm_^Hp_^Hu_^Ht_^He_^Hr _^HN_^He_^Ht_^Hw_^Ho_^Hr_^Hk_^Hs by Andrew S. Tanenbaum (Prentice-Hall).  Algorithms
are included for several different types of windowed data link
protocols.

_^HK_^He_^Hr_^Hm_^Hi_^Ht_^H, _^HA _^HF_^Hi_^Hl_^He _^HT_^Hr_^Ha_^Hn_^Hs_^Hf_^He_^Hr _^HP_^Hr_^Ho_^Ht_^Ho_^Hc_^Ho_^Hl by Frank da Cruz (Digital Press),
provides a very lucid description of the sliding-window extension to
Kermit.  Although windowed Kermit differs in several important ways from
uucp "g", this material was of great assistance in designing the
windowed "g" implementation for DECUS uucp and in understanding the
subtle nuances of windowed protocols.


14.4  Critiques And Proofreading

The following people graciously reviewed and sent comments on early
versions of this paper:

      o  Christopher J. Ambler, chris@fubarsys.slo.ca.us,
         cambler@polyslo.calpoly.edu

      o  Jordan Brown, jbrown@jato.jpl.nasa.gov

      o  Eric Johansson

      o  Nick Pemberton, Lsuc, uunet!mnetor!aimed!nick

      o  Mark Pizzolato, decwrl!infopiz!mark