[comp.protocols.tcp-ip] Protocol Design issue

brian@ucsd.Edu (Brian Kantor) (12/13/89)

I'm currently designing yet another protocol to sit on top of a
reliable stream such as the TCP, but I'm not sure about one issue:

We have, as part of the various transactions that take place during the
lifetime of a connection, to transfer binary data.  Even with differing
machine architectures, the byte-ness of the data must be preserved
(i.e., it's still N bytes whether you store 4 or 6 per machine word).

It seems to me the best way to do this is by doing it in-band, that is,
sending a plaintext header with a byte-count, followed by that exact
number of bytes in literal mode.  This requires accurate byte counting
at both the sender and the recipient, but that isn't very difficult to
do, and it's not computationally expensive.
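
A minimal sketch of this framing, assuming a decimal ASCII count
terminated by a newline as the header (the post does not fix a header
format). The names send_blob, recv_blob and read_exact are hypothetical,
and the stream is any binary file-like object, for example one obtained
from socket.makefile():

```python
import io

def send_blob(stream, payload: bytes) -> None:
    # Header: decimal byte count in ASCII plus newline (assumed format).
    stream.write(b"%d\n" % len(payload))
    # Body: exactly that many bytes, sent literally with no escaping.
    stream.write(payload)

def read_exact(stream, n: int) -> bytes:
    # A read on a stream may return fewer bytes than requested,
    # so keep reading until all n bytes have arrived.
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_blob(stream) -> bytes:
    header = stream.readline()
    if not header.endswith(b"\n"):
        raise EOFError("stream closed inside the header")
    count = int(header.decode("ascii"))
    if count < 0:
        raise ValueError("negative byte count")
    return read_exact(stream, count)

# Round trip through an in-memory stream standing in for a TCP connection.
wire = io.BytesIO()
send_blob(wire, b"\x00\x01\n\xff")
wire.seek(0)
assert recv_blob(wire) == b"\x00\x01\n\xff"
```

Because the receiver consumes exactly the counted bytes, the payload may
contain newlines or any other byte value without confusing the parser.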

This is straightforward and relatively simple, and given the underlying
mechanism of an ordered reliable 8-bit stream, I don't see any
significant drawbacks.

All the other alternatives I've seen or dreamed up require either some
embedded control characters with escaping, or else send the data
out-of-band (for example, FTP using a separate stream from the control
stream for file transmission).  I find both these ideas distasteful.

Are there any drawbacks to byte-counting that I've overlooked?
	- Brian

geoff@hinode.East.Sun.COM (Geoff Arnold @ Sun BOS - R.H. coast near the top) (12/13/89)

Why not just use XDR (RFC1014)? The sources are freely available....

Geoff Arnold, PCDS Group,     | Quote of the week: Too long to include
Sun Microsystems Inc.         | here - read "Being an American" by
Internet: geoff@East.Sun.COM  | Theodore A.Kaldis, <kaldis@topaz.rutgers.edu>
Disclaimer: Obviously....     | in talk.politics.misc. Quite amazing stuff!

rsalz@bbn.com (Rich Salz) (12/14/89)

In <10439@ucsd.Edu> brian@ucsd.Edu (Brian Kantor) asks about shipping
binary data over a TCP stream in-band, by just using byte-counts.

I wrote an rdist-like replacement for our heterogeneous environment some
months ago, and that's exactly what I do.  A header line with a bytecount,
the bytes, and a trailer line to help catch errors.  We ship a couple
of megabytes a week around to a variety of hosts (VMS, Ultrix,
Sun[234], ATT3b2, ATT6386, Sun3Mach, BBN Butterfly, Masscomp, Xenix) and
don't have any problems.  XDR is too big and bulky to port all over
when all you want is an opaque eight-bit binary stream.
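
A sketch of the header/bytes/trailer layout described above. The post
does not say what the trailer line contains, so the fixed "END" line
here is an assumption, as are the function names. The reader assumes a
buffered binary stream whose read(n) returns n bytes unless the stream
ends early:

```python
import io

TRAILER = b"END\n"  # hypothetical trailer line; the real one is not given

def send_with_trailer(stream, payload: bytes) -> None:
    stream.write(b"%d\n" % len(payload))  # header line with the byte count
    stream.write(payload)                 # the bytes themselves
    stream.write(TRAILER)                 # trailer line to catch miscounts

def recv_with_trailer(stream) -> bytes:
    count = int(stream.readline().decode("ascii"))
    payload = stream.read(count)
    if len(payload) != count:
        raise EOFError("stream ended before the counted bytes arrived")
    # If the count was wrong, the next line will not be the trailer.
    if stream.readline() != TRAILER:
        raise ValueError("trailer missing: byte count and data disagree")
    return payload

wire = io.BytesIO()
send_with_trailer(wire, b"\x00binary\n")
wire.seek(0)
assert recv_with_trailer(wire) == b"\x00binary\n"
```

The trailer costs a few bytes per transfer and turns a silent framing
slip into an immediate error.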
	/r$
-- 
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.
Use a domain-based address or give alternate paths, or you may lose out.

smb@ulysses.homer.nj.att.com (Steven M. Bellovin) (12/14/89)

Sending a count followed by the exact number of bytes can work; indeed,
that's what Peter Honeyman and I did in uucp's 'e' protocol.  A few
caveats...  First, do everyone a favor and send the count in ASCII.
Using htonl() would work, but is only convenient if (a) the receiving
machine has ntohl(), and (b) both sides use 32-bit longs.  Me -- unless
there's a major performance issue, I'd prefer ASCII.
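
To make the trade-off concrete, here is a sketch of both encodings of
the count. The function names are hypothetical. struct.pack("!I", ...)
is used as a stand-in for htonl() on a 32-bit value, and the
newline-terminated decimal form is an assumed ASCII format:

```python
import struct

def encode_count_ascii(n: int) -> bytes:
    # Decimal digits plus newline: no word size or byte order involved.
    return b"%d\n" % n

def decode_count_ascii(line: bytes) -> int:
    return int(line.decode("ascii"))

def encode_count_binary(n: int) -> bytes:
    # "!I" means network byte order, unsigned 32-bit, fixed 4 bytes.
    return struct.pack("!I", n)

def decode_count_binary(field: bytes) -> int:
    (n,) = struct.unpack("!I", field)
    return n

assert encode_count_ascii(1989) == b"1989\n"
assert encode_count_binary(1989) == b"\x00\x00\x07\xc5"
# The ASCII form has no built-in upper limit on the count.
assert decode_count_ascii(encode_count_ascii(2**32)) == 2**32
```

The binary form is fixed-width and needs no parsing, but both ends must
agree on the field size; the ASCII form can also be read by eye when
debugging.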

The other drawback is that there's no good way to abort the transfer.
I suppose you could send an 'urgent' message, but that's difficult to
handle sometimes.  We wanted to be able to abort in case the size of
the file changed between when uucico did the fstat(), and when the file
was actually read.  Might your application run into similar issues?

louie@SAYSHELL.UMD.EDU ("Louis A. Mamakos") (12/14/89)

Brian,

MDQS uses a similar mechanism to transfer files from one system to another,
and it works just fine in most cases.  The large problem that we ran into
when implementing MDQS clients and servers on large, ugly unfriendly machines
like Unisys 1100's and IBM 3080's is that there was no easy way to determine
how many bytes were to be sent without parsing the file once.  

MDQS made the assumption that all of the bytes in the file spooled would be
sent as-is over the network.  This is a broken assumption because:

	* It assumes that you can figure out exactly how long in bytes a
	file is.  Some brain damaged file systems make this difficult or
	impossible to do cheaply.

	* It assumes that the bytes are sent as-is over the network, and
	not converted to NVT ASCII, for instance.  This assumption doesn't
	hold on either the IBM or UNISYS mainframes which have, ah,
	interestingly complicated ways of storing "text" in a "file".

We've redesigned the network protocol to be able to send blocks of data,
each prefixed with a byte count.  This allows the sender to use a
finite-sized buffer to accumulate the result of translating the host
system representation (EBCDIC on the IBM, 6 bit FIELDATA or 9 bit ASCII
on the UNISYS) to NVT ASCII.  It will no longer be necessary to
pre-parse the entire file just to find out how many bytes will later be
shoved across the network connection.  The additional complexity is
minimal.

If you'd like to get clever, the two ends of the connection can negotiate
the largest size buffer that can be used, if required.
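
A sketch of the block scheme under stated assumptions: each block
carries a decimal ASCII count line, and a zero count marks the end of
the data. Neither detail comes from the redesigned MDQS protocol, which
is not specified in the post. The LF to CRLF translation stands in for
the host-to-NVT-ASCII conversion, and all names are hypothetical:

```python
import io

def to_nvt(raw: bytes) -> bytes:
    # Stand-in translation: the output length differs from the input
    # length, so the total cannot be known without doing the conversion.
    return raw.replace(b"\n", b"\r\n")

def send_blocks(stream, source, translate=to_nvt, block_size=4096):
    while True:
        raw = source.read(block_size)   # bounded buffer on the sender
        if not raw:
            break
        block = translate(raw)
        if block:
            stream.write(b"%d\n" % len(block))
            stream.write(block)
    stream.write(b"0\n")                # assumed end-of-data marker

def recv_blocks(stream) -> bytes:
    parts = []
    while True:
        header = stream.readline()
        if not header.endswith(b"\n"):
            raise EOFError("stream closed inside a block header")
        count = int(header.decode("ascii"))
        if count == 0:
            return b"".join(parts)
        while count > 0:                # reads may come back short
            chunk = stream.read(count)
            if not chunk:
                raise EOFError("stream closed inside a block")
            parts.append(chunk)
            count -= len(chunk)

wire = io.BytesIO()
send_blocks(wire, io.BytesIO(b"ab\ncd\n"), block_size=4)
assert wire.getvalue() == b"5\nab\r\nc3\nd\r\n0\n"
wire.seek(0)
assert recv_blocks(wire) == b"ab\r\ncd\r\n"
```

The sender never needs the total length in advance; it only has to know
the size of the block it has just translated. A negotiated maximum
would simply replace the fixed block_size.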

I believe that there is a transfer mode defined in FTP which does just
this sort of thing.  Transfer mode BLOCK, I believe.

louie

hwajin@wrs.com (Hwa Jin Bae) (12/15/89)

The file "xdr_rec.c" in SUN RPC release has an implementation that accomplishes
this "record marking" on top of a TCP stream already.  This can be used
to preserve the application's idea of record boundaries.  No need to reinvent
the wheel.

hwajin
--
"If you live on JELLO you have to realize that you live on JELLO."

jthomp@wintermute.Sun.COM (Jim Thompson) (12/18/89)

In article <12485@ulysses.homer.nj.att.com> smb@ulysses.homer.nj.att.com (Steven M. Bellovin) writes:
>Sending a count followed by the exact number of bytes can work; indeed,
>that's what Peter Honeyman and I did in uucp's 'e' protocol.  A few
>caveats...  First, do everyone a favor and send the count in ASCII.
>Using htonl() would work, but is only convenient if (a) the receiving
>machine has ntohl(), and (b) both sides use 32-bit longs.  Me -- unless
>there's a major performance issue, I'd prefer ASCII.

Indeed, even 'rmt' uses the same technique.
(ascii encoding and all.)

Jim Thompson - Network Engineering - Sun Microsystems -	jthomp@central.sun.com
Member of the Fatalistic International Society for Hedonistic Youth (FISHY)
"Unemployment is the solution, not the problem."  -- B.I.R.D.