UMFORTH@WEIZMANN.BITNET (11/06/85)
Date: Mon, 4 Nov 85 19:32 EST
From: SECRIST%OAK.SAINET.MFENET@LLL-MFE.ARPA
Subject: ET-FIG News posting #3
Organization: Science Applications Int'l. Corp., Oak Ridge, Tenn.
Geographic-Location: 36 01' 42" N, 84 14' 14" W
This is Yet Another ET-FIG News Installment. These will continue to
vary in length as I edit all of the chapter-specific net-won't-care
stuff out. This edition includes a write-up I did on the Christensen
file transfer protocol laying the groundwork for a FORTH XMODEM
utility I'm writing. My application is that I desire a clean way to
port my FIG and '79 screens from the sans-file-system FORTH
implementations (such as raw MVP-FORTH) to a newer '83 implementation
in a "screen file" (a la LMI). The logical extension of this is to
make it a normal XMODEM utility. I did present an almost-there
implementation to the club, which is still not finished. We intend to
later port it to every system in the club so we can share things with
each other, and publish a kernel and n-system-specific-screens of it
in FORTH Dimensions or someplace, as well as this list. The latter is
frankly a long way off.
-- Richard
-------------------------------------------------------------------------------
Excerpts from the ET-FIG News Posting #3 in a series
-------------------------------------------------------------------------------
Volume 1, Number 3 ** East Tennessee FORTH Interest Group ** 6-October-1984
-------------------------------------------------------------------------------
+--------------------------------------------+
|   The Christensen File Transfer Protocol   |
+--------------------------------------------+
by Richard C. Secrist
After CP/M had made floppy-based computer systems a reality in the
late 1970s, public domain software written by computer enthusiasts was
in great demand. In response to this demand, one hobbyist, Ward
Christensen of Chicago, Illinois, developed a simple, error-checking
communications protocol to reliably transfer files between different
computers via modems. Today, this frame-oriented protocol is known by
the names given to Ward's original public domain implementations:
XMODEM or MODEM7, or simply as "the Christensen protocol".
When trying to pass data from one computer to another, talk is not
usually cheap - particularly if one builds custom data transfer
hardware for each computer device involved. For this reason, the
connections between many computer devices such as modems and terminals
were standardized some years ago. Most computers today are connected to
terminals in an almost universal manner, in both hardware (the EIA
RS-232 standard) and software (the ASCII character code).
The XMODEM protocol takes advantage of the fact that most
computers have been made to adhere to these standards. By tying two
computers together through their RS-232 ports, and tricking each
computer into believing that the other one is a terminal, two computers
with cooperating programs can transfer data to each other by speaking
in a communications protocol.
A communications protocol is needed for several reasons: 1) just
because the computers are plugged together doesn't mean they know how
to send or receive data to the other machine, 2) one computer may be
able to send data faster than the other can take it in, and some kind
of flow control must be employed to prevent data overruns, 3) the data
can become garbled during transmission because of electrical
interference or other causes, corrupting the information one desires to
be transferred.
To define how the machines will speak to each other and solve all
of these problems is what a communications protocol is all about. To
prevent corruption of data and to synchronize communication, the
cooperating computers mix this flow control and error-detecting
information with the actual data one wishes to transfer. A data
communications protocol is defined by the way this mix of data is
formatted, and sent between computers.
The XMODEM protocol formats its data into individual packets of
information that contain synchronization information, 128 bytes of
data, and a checksum for error detection. Since the information is
tucked away into packets that are interpreted by the XMODEM software
and not by the command level of the host operating system directly, it
is possible to send even binary data without having the host operating
system misinterpret the control characters that may be part of the data
itself.
An XMODEM packet looks like this:

   1     2     3    4                                   131   132
+-----+-----+-----+-----------------------------------------+-----+
! SOH ! BLK ! BLC ! 128 bytes of 8 bit data padded with 0   ! CHK !
+-----+-----+-----+-----------------------------------------+-----+

Where:
   SOH - Start of header, ASCII 1
   BLK - Block number, 0 to 255
   BLC - One's complement of the block number
   CHK - Checksum of the data field, mod 256
The first character is an ASCII SOH (Start-Of-Header, Control-A) which
delimits the beginning of the packet. The second byte is a sequential
block number from 0 to 255. The block number is used to make sure the
receiver gets everything in the right order. Since the block number could
be corrupted by line noise, the third byte contains the one's complement of
the block number. If the block number summed with its one's complement is
not equal to 255 ($FF), the XMODEM packet has been corrupted and needs to be
re-transmitted. Bytes 4 through 131 are 128 bytes of file data. This
number of bytes was chosen to coincide with a CP/M disk sector, which makes
sense if you consider XMODEM's CP/M heritage. The final byte is a checksum
of the 128 data bytes. If the transmitted checksum does not equal the
checksum the receiver calculates from the data area, the packet is bad and
would need to be re-sent.
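
For readers who would rather see the packet layout as code, here is a
minimal sketch in Python. It only illustrates the byte layout described
above; the names build_packet, parse_packet and checksum are made up for
this example, and the zero fill byte follows the diagram rather than any
particular implementation.

```python
SOH = 0x01          # start-of-header byte that opens every packet
DATA_SIZE = 128     # one CP/M sector of file data per packet
PAD = 0x00          # fill byte for a short final block (assumed, per the diagram)


def checksum(data: bytes) -> int:
    """Arithmetic sum of the data bytes, kept to 8 bits."""
    return sum(data) % 256


def build_packet(block: int, data: bytes) -> bytes:
    """Assemble SOH, block number, its complement, 128 data bytes, checksum."""
    if len(data) > DATA_SIZE:
        raise ValueError("at most 128 data bytes per packet")
    body = data.ljust(DATA_SIZE, bytes([PAD]))
    blk = block % 256
    return bytes([SOH, blk, 255 - blk]) + body + bytes([checksum(body)])


def parse_packet(packet: bytes):
    """Return (block, data) for a good packet, or None if it looks corrupt."""
    if len(packet) != DATA_SIZE + 4 or packet[0] != SOH:
        return None
    blk, blc = packet[1], packet[2]
    if blk + blc != 255:            # block number and complement must agree
        return None
    body = packet[3:3 + DATA_SIZE]
    if checksum(body) != packet[-1]:
        return None
    return blk, body
```

A receiver built this way would answer None from parse_packet with a NAK
and anything else with an ACK.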
The flow control mechanism serves to keep both computers synchronized,
as well as provide a means to re-transmit bad packets. The flow control
information is represented by certain ASCII characters:
o Control-F (ASCII ACK, $06) is an ACKnowledgment that the packet has
been transferred without error
o Control-U (ASCII NAK, $15) is a Negative AcKnowledgement, meaning the
packet was not received correctly or at all
o Control-D (ASCII EOT, $04) signifies end-of-transmission
o Control-X (ASCII CAN, $18) is a request from either side to CANCEL the
transfer
o Control-A (ASCII SOH, $01) signifies Start-Of-Header, the beginning of
an XMODEM packet
A sample transaction between two computers provides the best
illustration of the Christensen protocol in action :
Typically the XMODEM programs have a dumb terminal mode to call up the
other machine and go through any required login sequence. After that,
XMODEM is invoked on the other machine by the caller, and is told to send
or receive a file. Then you return to your machine with some magic control
sequence and issue a complementary send or receive command to your own
machine. After that, the transfer is all up to the computers.
The first thing the machines have to do is get into synch. The receiving
computer looks for a packet for ten seconds. If it hasn't seen a packet
after that time it "times out", sending a NAK to the sender, and then
starts waiting again with a read posted. This NAK is the cue for the
sending machine to start, and off goes the packet. If the receiver gets a
good packet -- if the block number is okay and the data passes the checksum
-- the receiver sends an ACK to the sender. The sender interprets this as
"okay, he said he got that alright, so I'll ship him another one", and
promptly transmits another packet. Of course if he got a NAK that means
"say again, something isn't right", and the sender would obligingly
re-transmit the same packet. This process is repeated until the
end-of-file, at which point the sender transmits an EOT (end of transmission).
After a final ACK from the receiver, the transfer is complete. At this
point the XMODEM program usually returns to "dumb terminal" mode.
Trying to send the data over noisy lines of course complicates
matters. For example, if the sender transmits a block and then misses the
ACK from the receiver, the sender will time-out. Upon time-out, the sender
will re-transmit the packet - even though the receiver has
already got it. When the receiver checks the block number, it will know it
already has this packet, drop it on the floor in disgust, and ACK the
sender, putting them back in synch.
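
The receiver's decision in that duplicate-packet case can be sketched in a
few lines of Python. This is an illustration only: handle_block and its
arguments are hypothetical names, and the choice to NAK a block that is
neither new nor a repeat is my assumption (a real program might well abort
instead).

```python
ACK = 0x06
NAK = 0x15


def handle_block(expected: int, block: int, good: bool):
    """Decide the receiver's reply to one incoming packet.

    expected - block number the receiver is waiting for
    block    - block number found in the packet
    good     - True if the complement and checksum tests passed
    Returns (reply, keep_data, next_expected).
    """
    if not good:
        return NAK, False, expected               # damaged: ask for it again
    if block == expected:
        return ACK, True, (expected + 1) % 256    # new data: store it
    if block == (expected - 1) % 256:
        return ACK, False, expected               # repeat of last block: drop it
    # Out of sequence: this sketch just NAKs (assumption, see above).
    return NAK, False, expected
```

Note that the block number wraps from 255 back to 0, so "the previous
block" has to be computed modulo 256 as well.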
There are numerous different error cases one can experience. Every
time one of them happens a counter gets bumped. If the sender misses the
ACK over and over again the XMODEM protocol gives up after 10 tries. In
fact, 10 is sort of the magic error number: 10 seconds between tries up to
10 tries. Many implementations also send an ASCII CAN to the sender if the
receiver aborts. Under scrutiny, this protocol has some holes in it. In
practice, it is fairly reliable and effective.
Over the years people have started doing strange things with the
protocol and as it gets bent and twisted there are some logical
consequences of all this. First off, when transferring machine-specific
files from one machine to another you have to think through what you're
doing. Remember when transferring ASCII text data you are going to take
all of the environmentally-dependent features of how the text is stored on
the source system to the destination computer. This may require some neat
hacking on your own part to set matter straight. For example, moving a
text file from CP/M-80 to a VAX under VAX/VMS will make the file look
rather scatterbrained when you get it there. Simply jumping into TECO and
exiting back out will clean it up for the most part, except for the
Control-Z's all over the end, which can simply be deleted with your
favorite editor.
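
The end-of-file filler that CP/M leaves in the last sector of a text file
(Control-Z, $1A) can also be removed mechanically. A minimal Python sketch
follows; strip_cpm_padding is a made-up name, and the approach is only safe
for text files, since a binary file may legitimately contain $1A bytes.

```python
CTRL_Z = 0x1A   # CP/M end-of-file marker and filler in text files


def strip_cpm_padding(data: bytes) -> bytes:
    """Cut a CP/M text file at its first Control-Z, dropping the filler."""
    end = data.find(bytes([CTRL_Z]))
    return data if end < 0 else data[:end]
```

For example, a 128-byte sector holding "hi" plus a carriage return and line
feed, followed by 124 Control-Z's, comes back as just the four text bytes.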
Other things to watch for include XMODEM implementations that also
offer CRC-support instead of data checksums (make sure you're both in the
same mode before you start the transfer - if you "time out" 10 times on the
first try, mismatched modes is frequently the cause), transferring BASIC
files that may be stored in a tokenized format, or porting binaries over
that include operating system specific calls to things like the CP/M BIOS.
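
This article does not describe the CRC variant, so the following is offered
only as background: the 16-bit CRC usually associated with XMODEM uses the
CCITT polynomial x^16 + x^12 + x^5 + 1 ($1021) with a starting value of
zero. A bit-at-a-time Python sketch, with crc16_xmodem as a name chosen for
this example:

```python
def crc16_xmodem(data: bytes) -> int:
    """16-bit CRC, polynomial $1021, initial value 0, most significant bit first."""
    crc = 0
    for byte in data:
        crc ^= byte << 8                 # bring the next byte into the top of the register
        for _ in range(8):
            if crc & 0x8000:             # top bit set: shift and apply the polynomial
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc
```

The point for the user is simply that a two-byte CRC and a one-byte
checksum make packets of different lengths, which is why both ends must
agree on the mode before the first packet goes out.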
Pseudo-code of the SEND algorithm (reference 2)
-----------------------------------------------
    open the file to be sent;
    initialize the modem;
    while (there are still sectors to be sent) {
        repeat {
            send an SOH;
            send the sector number;
            send the complemented sector number;
            send the data and compute a checksum;
            send the checksum;
            wait for a response with timeout;
        } until (the response is an ACK);
    }
    send an EOT character;
    wait for an acknowledgement;
    close the file;
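
The SEND pseudo-code above can be turned into a runnable Python sketch.
This is not the Lmodem code from reference 2, just my own illustration. The
two callables put and get stand in for the serial port and are assumptions
of this example, as are the names send_file and make_packet. I have also
added the 10-try limit and the wait for the receiver's opening NAK that the
text describes but the pseudo-code leaves out.

```python
SOH, EOT, ACK, NAK = 0x01, 0x04, 0x06, 0x15
MAX_TRIES = 10


def make_packet(block, chunk):
    """SOH, block number, complement, 128 zero-padded data bytes, checksum."""
    body = chunk.ljust(128, b"\x00")
    return bytes([SOH, block, 255 - block]) + body + bytes([sum(body) % 256])


def send_file(data, put, get):
    """Send data in 128-byte blocks.

    put(message) transmits a bytes object; get() returns one reply byte,
    or None if the wait timed out. Returns True on success, False when
    any one item fails MAX_TRIES times in a row.
    """
    def deliver(message):
        for _ in range(MAX_TRIES):
            put(message)
            if get() == ACK:
                return True
        return False

    # Wait for the receiver's opening NAK before sending anything.
    for _ in range(MAX_TRIES):
        if get() == NAK:
            break
    else:
        return False

    block = 1                               # block numbers start at 1
    for offset in range(0, len(data), 128):
        if not deliver(make_packet(block, data[offset:offset + 128])):
            return False
        block = (block + 1) % 256           # wraps from 255 back to 0
    return deliver(bytes([EOT]))
```

With a real modem, put and get would wrap the port's write and timed read;
for testing they can just as easily be a list's append method and an
iterator of canned replies.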
Pseudo-code of the RECEIVE algorithm (see reference 2)
------------------------------------------------------
    create the new file in the directory;
    initialize the modem;
    repeat {
        wait for an initial SOH, EOT, or TIMEOUT;
        if (the character is an SOH) {
            get the sector number;
            get the complemented sector number;
            get the data and compute a checksum;
            get the checksum;
            if (checksum = computed checksum)
                send an ACK;
            else
                send a NAK;
        }
        if (the character is an EOT) {
            close the new file;
            send an ACK;
        }
    } until (the initial character was an EOT);
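
A matching Python sketch of the RECEIVE side follows, again as an
illustration rather than the code from reference 2. The callables get and
put are stand-ins for the serial port, receive_file is a made-up name, and
I have added the block-number check, the duplicate handling and the error
limit that the text describes. A real program would also flush the line
after a bad packet, which this sketch does not attempt.

```python
SOH, EOT, ACK, NAK = 0x01, 0x04, 0x06, 0x15
MAX_ERRORS = 10


def receive_file(get, put):
    """Collect a file sent in 128-byte blocks.

    get() returns the next incoming byte, or None on timeout;
    put(byte) sends one reply byte. Returns the received bytes
    (still padded to a multiple of 128), or None after MAX_ERRORS
    errors in a row.
    """
    received = bytearray()
    expected = 1
    errors = 0
    put(NAK)                                   # cue the sender to start
    while errors < MAX_ERRORS:
        first = get()
        if first == EOT:
            put(ACK)
            return bytes(received)
        if first == SOH:
            rest = [get() for _ in range(131)]
            if None not in rest:
                block, complement = rest[0], rest[1]
                body, check = bytes(rest[2:130]), rest[130]
                good = block + complement == 255 and sum(body) % 256 == check
                if good and block == expected:
                    received += body           # new block: keep it
                    expected = (expected + 1) % 256
                    errors = 0
                    put(ACK)
                    continue
                if good and block == (expected - 1) % 256:
                    put(ACK)                   # duplicate: acknowledge, keep nothing
                    continue
        errors += 1                            # timeout, garbage, or bad packet
        put(NAK)
    return None
```

The padding on the last block is left in place on purpose; the protocol
carries no file length, so deciding what to trim is up to the caller.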
References:
1) Kermit Users' Guide, Third Edition; Catchings, et al.; Columbia
University, 1983
2) Lmodem: A Small Remote-Communication Program; Clark, David D.; BYTE
magazine, Nov. 1983
3) Chapter 16: Protocol Transfers; Blue, et al.; ASCII Express: The
Professional Instruction Manual; Southwestern Data Systems, 1982
* FORTH DEMENTIA *
Q: Is FORTH addictive ? Why is it addictive ?
A: Yes. If you get used to FORTH and then try to go back to another
language, the withdrawal is painful.
FORTH changes the way the programmer thinks about his machine,
his problem, and the set of possible solutions. FORTH is
addictive in the way that a pair of eyeglasses is addictive
to someone with severe myopia - it feels good when you stop
walking into walls.
-- Richard Milewski as quoted in
InfoWorld, Oct. 11, 1982
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
* ET-FIG NEWSLETTER STAFF *
(where are you ?!)
o Editor: Richard Secrist
o Staff Writers: Norman Smith, Richard Secrist
o Publishing & Distribution: Joseph Minarick
o Treasurer: Steven A. Wallace
+---------------------< BULLETIN BOARDS OF NOTE >-----------------------+
| |
| Nat'l FIG BBS (24 hours, 300 baud): 415/538-3580 |
| Jim Altman RCP/M Atlanta (200mb online, lotsa SIG-M): 404/627-7127 |
| RCP/M Frog Hollow, Vancouver, BC (Mac stuff on B4:): 604/937-0906 |
| |
+-----------------------------------------------------------------------+
[ end of posting #3 ]
Acknowledge-To: <UMFORTH@WEIZMANN>