[comp.protocols.tcp-ip] partial transfer recovery in RFC and OSI protocols

ggm@brolga.cc.uq.oz (George Michaelson) (12/18/89)

ftp doesn't seem to have any concept of an unreliable underlying transport
or link. I would dearly love to see something like UK-NIFTP's partial
file transfer recovery in ftp. (In fact, I believe most UK-NIFTP implementations
don't do file transfer recovery and rewind the retransmission back to the
beginning.)

Once you start looking at that sort of functionality, you might wind up
agreeing with the OSI reference model, which (I think) puts it at the session
level, but given a null session layer you don't have that choice.

I guess I'd rather see suitable code in an underlying layer, so that the same
functionality was available to ALL TCP-based services, nntp-over-slow-lines
and other bulk transfers being obvious candidates. I'm not (quite) silly
enough to expect everyone to hack their kernel, and nobody is offering invisible
glue between TCP and the services, so I guess it's a per-service addition or
nothing.

Why bother? 

    (1)	bandwidth is scarce in several crucial places
		cross-channel links
		atlantic links
		pacific links.

    (2) Point-to-point links are NOT decreasing; SLIP to small boxes
	and dial-in IP are increasing, and thus "noisy" links are
	just as common despite backbone upgrades and other net-level
	advances.

    (3) IP based services aren't going away for years to come.

    (4) OSI services can provide this functionality and it might
	be nice to be able to bridge into Internet services and get
	the same thing. [err.. can they? I *think* they can...]

ACSnet has partial transfer recovery built into the transport(ish) level
as well as a multiplexed link level. This is *extremely* useful and, in my
opinion, far outweighs the other (mis?)features of this package/protocol.
With its probable demise as the "backbone" protocol across Australia I suspect
we are going to see much WORSE service until people learn how to use the IP
based services "properly" ("manual" CSMA-CD when a slow IP path is detected,
a.k.a. try again at 3am).

Question(s):

[history]   Was partial transfer recovery considered? If so, why was it 
	    rejected? I heard arguments against its implementation on
	    JANET such as "too hard" and "doesn't make sense between
	    divergent h/w or opsys", neither of which I think makes sense,
	    especially given machine-independent upper-level encodings
	    like compressed-batch/lempel-ziv/NNTP and hop-based services
	    where you are placed in the role of an intermediate carrier.

[whinge]    Is it sensible/possible to extend existing product(s) to provide
	    this function? We see Telnet and other extensions, PEP is out;
	    clearly the RFCs aren't rolling over and dying yet. Too late to
	    add it into ftp? Can it make 4.5BSD?

[cringe]    Will people be using this feature in OSI applications? Does it
	    get turned on automatically, or is it missing from most existing
	    setups and thus impossible to use in MHS/FTAM over slow/noisy
	    lines?


Enquiring mind wants to know!

	-George
	

Internet: G.Michaelson@cc.uq.oz.au                     Phone: +61 7 377 4079
  Postal: George Michaelson, Prentice Computer Centre
          Queensland University, St Lucia, QLD Australia 4067. 

postel@VENERA.ISI.EDU (12/19/89)

George Michaelson:

Please read the specification of FTP, RFC-959.  See section 3.5 on error
recovery and restart, and read about the REST (restart) command.

--jon.

barns@GATEWAY.MITRE.ORG (12/19/89)

You raise several distinct issues, of which I can respond to only some.

Restart capability is defined for the FTP protocol (RFC 959) as an
optional feature and has been so since at least RFC 542, 12 August
1973, which is as far back as my collection goes.  This design works
between divergent system types.  The written specs had some unclear
areas which I hope have been fixed by text in RFC 1123, section
4.1.3.4.  Implementations are few and far between, but it was hoped
that the clarified, corrected and expanded writeup in RFC 1123 would
encourage people to take this on.  It is not very hard to do.
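
For anyone who has not read RFC 959 section 3.5, the mechanics look roughly
like the exchange below (a paraphrase; the reply texts are approximate and
the marker values invented).  In a block mode STOR, the sending side embeds
restart markers in the data stream, the receiving server reports them on the
control connection, and after a failure the client quotes the server's marker
back in a REST before reissuing the transfer command:

    ... during the original STOR the server reports markers as they arrive ...
    S: 110 MARK 120000 = 118000
    ... the transfer dies; later, on a fresh control connection ...
    C: TYPE A
    S: 200 Type set to A.
    C: MODE B
    S: 200 Mode set to B.
    C: REST 118000
    S: 350 Restarting at 118000.  Send STOR or RETR to resume.
    C: STOR bigfile
    S: 125 Data connection open; resuming transfer.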

There is a different FTP restart scheme by Rick Adams which I have
heard will be in 4.4(?).  It is not compatible with the one defined in
the RFCs, though in principle they can coexist in a single
implementation.  It fits better with a typical UNIX I/O architecture
and thus perhaps gives better throughput and almost certainly is more
CPU-efficient on a number of real world platforms, but it does not try
to handle the more difficult cases of operation between divergent
systems.

The throughput issue regarding the two methods is architecture-dependent
and involves both software and hardware design issues.  It seems that
making the RFC version work using the normal interface routines one
finds on a UNIX box probably means either sending more packets or doing
more data copying.  In some cases it might be possible to just add
library routines to eliminate this, if the hardware has a suitable
design (i.e., scatter/gather DMA).  Also, in some systems, some of the
feared data copying may be happening already anyway.  On some machine
architectures including IBM Big Iron (but there are others too), a
relatively small amount of code together with some smart choices of
block sizes might allow you to do the RFC-style approach with
infinitesimal impact on CPU utilization or network throughput.  The
upshot of all this seems to be that neither version of Restart
maximizes interoperability, portability, and (CPU) efficiency
simultaneously.  I think OSI is trapped in the same solution space.

In either version, the Restart mechanism is basically provided for
recovery from cataclysmic disruptions (disk full, network died, host
died, impatient user blasted the client program into oblivion) and NOT
to deal with bit corruption (noisy links).  Both TCP/IP and OSI hold
that lower layers should do most of the work of protecting against the
latter problem.  I don't find this unreasonable, even on slow serial
point-to-point links.  Data link protocols should be chosen to fit the
error characteristics of the links, and TCP and TP4 can cope with some
residual glitches.  This leaves only the problem of recovery from
higher-layer aspects of service interruptions as the proper domain of
"session" recovery schemes.

Regarding the life and death of TCP/IP family protocols and their
enhancement, I agree that they aren't dead yet and can't be ignored.
However I suggest that there is only one pool of expertise available
for working on generic application domain problems in either TCP/IP or
OSI.  Seems to me that most of the experts with strong feelings are
spending most of their time in the OSI arena.  There are evidently not
enough people to go around, and those that exist evidently see OSI as a
better investment of their time right now (I presume on the theory of
broader impact).

Bill Barns

nelson@sun.soe.clarkson.edu (Russ Nelson) (12/20/89)

In article <8912181942.AA10029@arcturus.mitre.org> barns@GATEWAY.MITRE.ORG writes:

   There is a different FTP restart scheme by Rick Adams which I have
   heard will be in 4.4(?).  ... it does not try to handle the more
   difficult cases of operation between divergent systems.

That is not the case.  His FTP restart scheme relies on the fact that
your file is eventually expressed as a stream of octets over the TCP
connection.  His RESTart command simply says to suppress the first N
octets.  It is brilliantly simple.
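
A rough sketch of what that looks like on the control connection (the reply
texts and file name are made up for illustration):

    C: TYPE I
    S: 200 Type set to I.
    C: REST 1048576
    S: 350 Restarting at 1048576.  Send RETR or STOR to resume.
    C: RETR bigfile.tar
    S: 150 Opening data connection for bigfile.tar.
       (the server skips the first 1048576 octets, e.g. with an lseek()
        in image mode, and sends only the remainder)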

--
--russ (nelson@clutx [.bitnet | .clarkson.edu])  Russ.Nelson@$315.268.6667
Live up to the light thou hast, and more will be granted thee.
A recession now appears more than 2 years away -- John D. Mathon, 4 Oct 1989.
I think killing is value-neutral in and of itself. -- Gary Strand, 8 Nov 1989.
Liberals run this country, by and large. -- Clayton Cramer, 20 Nov 1989.
Shut up and mind your Canadian business, you meddlesome foreigner. -- TK, 23 N.

dricejb@drilex.UUCP (Craig Jackson drilex1) (12/20/89)

In article <NELSON.89Dec19120206@sun.clarkson.edu> nelson@clutx.clarkson.edu writes:
>In article <8912181942.AA10029@arcturus.mitre.org> barns@GATEWAY.MITRE.ORG writes:
>
>   There is a different FTP restart scheme by Rick Adams which I have
>   heard will be in 4.4(?).  ... it does not try to handle the more
>   difficult cases of operation between divergent systems.
>
>That is not the case.  His FTP restart scheme relies on the fact that
>your file is eventually expressed as a stream of octets over the TCP
>connection.  His RESTart command simply says to suppress the first N
>octets.  It is brilliantly simple.

I think what barns was saying is that while it may be "simple" just to read
through 100 megabytes of a file, using a possibly CPU-intensive transformation
to turn it into a Unix-compatible file, just to transmit the last megabyte,
it can't be considered friendly to non-Unix systems.  Especially to systems
that otherwise have no problem saying "move to record 200000 of that
201000-record text file, and transmit the rest".

Just as "All the world isn't a VAX", "All the world isn't Unix".  I would
like to suggest that the world is better for that.

(Did I say enough to make inews happy?)
-- 
Craig Jackson
dricejb@drilex.dri.mgh.com
{bbn,axiom,redsox,atexnet,ka3ovk}!drilex!{dricej,dricejb}

braden@VENERA.ISI.EDU (12/21/89)

    In article <8912181942.AA10029@arcturus.mitre.org> barns@GATEWAY.MITRE.ORG writes:

	   There is a different FTP restart scheme by Rick Adams which I have
	   heard will be in 4.4(?).  ... it does not try to handle the more
	   difficult cases of operation between divergent systems.

	That is not the case.  His FTP restart scheme relies on the fact that
	your file is eventually expressed as a stream of octets over the TCP
	connection.  His RESTart command simply says to suppress the first N
	octets.  It is brilliantly simple.

Well, not exactly.  How do you compute N, or reset your file to N?  N
is a count of bytes in the transmitted data stream, which is related to
file position parameters through a transformation which could be very
complex.  On the machine with a complex file structure, the only way to
compute N in general is to play through the conversion process.
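
To make the "play through the conversion" point concrete, here is a minimal
sketch (mine, not taken from any real FTP implementation) of what a Unix
client restarting an ASCII-type retrieve would have to do: re-read the
partial local file and count the data-stream octets it corresponds to.

    #include <stdio.h>

    /*
     * Count the TCP data-stream octets that a partial local file
     * represents under ASCII type on a Unix-like host, where each
     * stored newline stands for a CR LF pair on the wire.  The result
     * is a candidate value for REST.  Caveat: if the original file
     * contained a "naked LF", this count is wrong (see below in the
     * thread).
     */
    long wire_octets(FILE *fp)
    {
        long n = 0;
        int c;

        while ((c = getc(fp)) != EOF)
            n += (c == '\n') ? 2 : 1;
        return n;
    }

    /* A host with a genuinely complex file structure would have to replay
     * its whole record-to-stream conversion here, which is the point. */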

I believe Bill Barns stated it exactly right.  Rick's scheme is
brilliantly simple only for binary files between Unix systems.

Bob Braden

rick@uunet.UU.NET (Rick Adams) (12/21/89)

> >That is not the case.  His FTP restart scheme relies on the fact that
> >your file is eventually expressed as a stream of octets over the TCP
> >connection.  His RESTart command simply says to suppress the first N
> >octets.  It is brilliantly simple.
> 
> I think what barns was saying is that while it may be "simple" just to read
> through 100 megabytes of a file, using a possibly CPU-intensive transformation
> to turn it into a Unix-compatible file, just to transmit the last megabyte,
> it can't be considered friendly to non-Unix systems.  Especially to systems
> that otherwise have no problem saying "move to record 200000 of that
> 201000-record text file, and transmit the rest".


We all agree it's a win if you can seek to an arbitrary record, right?

Now, let's take your non-vax/non-unix system.

You start to transfer a file, and the transmission fails halfway
through.  Without "my" restart method, your alternative is to read the
entire file again and transmit the entire file again. With my restart
method, you still read the file again but only transmit the new part.

The CPU time to read the file is identical in either case. Since it
doesn't have to spend CPU time sending the first half of the file again,
it MUST use less total time. How is that not a win? On systems that
don't have to do the transformation (i.e. the overwhelming majority)
it's a HUGE win. On systems that do have to do the transformation, it's
a minor win. In no case is it worse than retransmitting the entire file
(which is the alternative).
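
A sketch of the server side of that, with invented names and not Rick's
actual code: for an ASCII-type retrieve, the file is read and converted from
the top as usual, but converted octets before the restart point are counted
and thrown away instead of being written to the data connection.

    #include <stdio.h>
    #include <unistd.h>

    /*
     * Resume an ASCII-type RETR at data-stream offset "restart": read and
     * convert the file from the top as usual, but count and discard the
     * converted octets until the restart point, then write the rest to
     * the data connection.  (Byte-at-a-time writes for brevity only.)
     */
    void send_from(FILE *fp, int data_fd, long restart)
    {
        char out[2];
        long pos = 0;
        int c, i, len;

        while ((c = getc(fp)) != EOF) {
            len = 0;
            if (c == '\n') {            /* newline -> CR LF on the wire */
                out[len++] = '\r';
                out[len++] = '\n';
            } else
                out[len++] = c;
            for (i = 0; i < len; i++, pos++)
                if (pos >= restart)
                    write(data_fd, &out[i], 1);
        }
    }

    /* In image mode there is no conversion at all, so the whole loop
     * collapses to lseek(fd, restart, SEEK_SET), the huge-win case above. */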

Also, note that we're talking about the official "ascii" and "image"
types. There's nothing Unix specific about this. It is specific to the
stream and image types (but then the official restart method is
specific to record mode, which is the main reason it sucks).

There are two approaches. Mine is to presume that the majority of
transfers will be done in ascii or image mode and that the majority
of transfers will succeed. Therefore, you accept making restarts
"expensive" in exchange for no overhead on normal connections.

The other "official" approach, seems to presume that block mode is
normal (I dont even know if there are ANY implementations that support
block mode) and that failures are normal. So, you clutter up the
data stream with lots of markers, etc and make it cheap to
restart. However, you make it expensive to successfully transfer a file.

I know which approach makes more sense to me...

---rick

barns@GATEWAY.MITRE.ORG (Bill Barns) (12/21/89)

I disagree because I don't believe that the byte number in the transfer
stream is sufficient information to determine how to join the data sent
during the restarted transfer with the data sent the first time in
every imaginable case.

There would be no problem if the bytes in the transfer stream were
literally stored in the file.  This is the case in image transfers
between 8-bit-byte machines, so Rick's method should be able to be
successfully implemented for such a case on any system type.  There is
not much problem if the bytes were stored according to some
transformation of bit sequences which can be reliably inverted.  This
is pretty much true of ASCII transfers between UNIX systems and also
many others, although I'm not so sure it is strictly true if the file
being transferred contained a "naked LF" in the part that made it the
first time.  I defer to people who know the code better than I, but I
got the impression of the following: suppose a client on a non-UNIX
system does a STOR onto a UNIX server of a file containing a naked LF,
and the session dies somewhere after the naked LF is stored but before
the end of the file.  When the client later tries to restart, it must
use the SIZE command to get the value to put into the REST command, but
the server cannot tell the naked LF from LF's that were created out of
CR LF sequences, so it will return a size one higher than the actual
byte count received over the data connection. (?)
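
A tiny concrete case (my construction): suppose the octets received over
the data connection before the crash were

    A  LF  B  CR  LF

The UNIX server stores these as the four bytes "A LF B LF", folding the
CR LF into a newline and keeping the naked LF as it is.  Asked for SIZE in
ASCII type, it can only assume that every stored newline stands for CR LF,
so it answers 6 rather than the 5 octets actually received, and a REST
computed from that answer skips one octet too many.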

Besides non-invertibility problems, I suspect the existence of
situations where the state of the receiving FTP's data transformation
state machine cannot be recreated for points in mid-file in a new
session.  With image mode I think this cannot be a problem, but for
other modes it is possible that transformations such as the end-of-line
transforms used by various systems may result in the server having
state information not represented on disk.  Probably in most cases, the
state information can be synthesized at least for some points in the
file, and if so, then fudging the answer to the SIZE command (if file
was being STORed on server) or backing up the REST value based on
scanning the local file (if it was being RETRieved to client) would
enable this method to work OK, provided you can identify some such safe
point in the partial file.

A pragmatic concern for an implementor is to understand the system's
behavior when it crashes while a file is being stored.  If the byte
count can be left out of sync with the data write, a restart might give
bad results.  If the data is always made non-volatile before the byte
count is updated, this will not be a problem.  This sounds like
something the OS "ought to do right" but they sometimes don't.  (They
can also be helped to screw up by hyper-clever disk latency
optimizations or misbegotten network file systems that handle caching
in some way that might reorder these writes.)  I know of no way to
avoid all such problems, but it is probably easier to hack around known
misbehavior with the explicit restart marker method than with implicit
markers.  For example, a server might delay sending its 110 replies by
some interval and then return the byte count in the marker.  This
knowledge would then be sitting on the client end where a server crash
could not clobber it.  For a client crash while retrieving, I suppose
that the client just has to restart at some earlier point than the
filesystem claims it needs to.  This should work equally well (badly)
with either restart scheme.  For really strong assurance of integrity,
you would probably need to run checksums over the files at both ends.

I hope no one will construe this discussion as some sort of "disproof"
of Rick Adams's approach; it isn't one.  It's meant to be an
illustration that the method in the RFCs has relative advantages in
some situations, as Rick's has in many others.  Neither one seems to be
perfect or dominant in every way; either we haven't gotten smart enough
yet to do that, or the problem has no such solution.

/Bill Barns

nelson@sun.soe.clarkson.edu (Russ Nelson) (12/21/89)

In article <74106@uunet.UU.NET> rick@uunet.UU.NET (Rick Adams) writes:

   Also, note that we're talking about the official "ascii" and "image"
   types. There's nothing Unix specific about this. It is specific to the
   stream and image types (but then the official restart method is
   specific to record mode, which is the main reason it sucks).

As I disagreed with Barns, now I must disagree with Adams.  Yes, the official
restart method *does* require record mode.  But there is no reason why a
Unix file cannot be considered to have N records of 1024 (say) bytes plus
a trailing record of M bytes.

   The other "official" approach, seems to presume that block mode is
   normal (I dont even know if there are ANY implementations that support
   block mode) and that failures are normal. So, you clutter up the
   data stream with lots of markers, etc and make it cheap to
   restart. However, you make it expensive to successfully transfer a file.

You don't seem to recall that I implemented block mode and RESTart in
KA9Q's TCP/IP package.  I also put it into a local copy of the BSD
Tahoe FTP server.  That aside, I do agree that there is a tradeoff
between sending restart markers in block mode and sending a file in
stream mode.  If the block size is made large enough, then the
tradeoff is minimized.

Late-breaking news (just got mail from Bill Barns): he points out that
explicit markers (as per the RFC) may be more reliable than implicit
markers (as per Adams) when your transfer fails.

What I did to implement RESTart was to implement block mode (as required).
Then, when a file was retrieved in block mode, I would store the markers
in a specially-named file.  So if you were fetching foo.bar, I would store
the markers in foo.$$$.  Whenever I received a marker, I would fflush()
the data file and the marker file.  In addition to keeping the markers,
I would also keep the position of the data file at the time of receipt.

Then, if the transfer succeeded, I would delete the marker file.  
If they restarted the transfer, I would check for an existing marker file,
and automagically issue an FTP RESTart with the latest marker in the file.
The wisdom of doing this automagically is not clear.  It *did*, however,
work.
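
In outline, the bookkeeping looks something like this (a sketch with
invented names, not the actual KA9Q or Tahoe code):

    #include <stdio.h>

    /*
     * Called for each restart marker that arrives in block mode: make sure
     * the data already received is out of the stdio buffer, then record
     * the marker together with the current data-file position and push the
     * marker file out too.  (fflush() only empties the stdio buffers; a
     * truly paranoid version would fsync() both descriptors as well.)
     */
    void note_marker(FILE *data, FILE *markers, const char *marker)
    {
        fflush(data);
        fprintf(markers, "%s %ld\n", marker, ftell(data));
        fflush(markers);
    }

    /*
     * On a successful transfer, remove the marker file (foo.$$$ above).
     * On a later fetch of the same file, if the marker file still exists,
     * read its last line and send "REST <marker>" before the RETR.
     */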

   I know which approach makes more sense to me...

On one hand your technique, while sound, is interoperable only with
itself.  On the other hand, there are very few implementations of the
"interoperable" version.  If it hasn't been widely implemented, it
doesn't really matter whether the protocol is well designed or not.
--
--russ (nelson@clutx [.bitnet | .clarkson.edu])  Russ.Nelson@$315.268.6667
Live up to the light thou hast, and more will be granted thee.
A recession now appears more than 2 years away -- John D. Mathon, 4 Oct 1989.
I think killing is value-neutral in and of itself. -- Gary Strand, 8 Nov 1989.
Liberals run this country, by and large. -- Clayton Cramer, 20 Nov 1989.
Shut up and mind your Canadian business, you meddlesome foreigner. -- TK, 23 N.

ggm@brolga.cc.uq.oz (George Michaelson) (12/22/89)

postel@VENERA.ISI.EDU writes:

>George Michaelson:
>Please read the specification of FTP, RFC-959.  See section 3.5 on error
>recovery and restart, and read about the REST (restart) command.

Yes, I should have RTFS, I'm sorry. I made the mistake of treating a
BSD-ism as evidence of what an RFC would contain.

I hope the various schemes for solving this problem come together into
a workable whole. To reiterate, the ACSnet experience of having this
capability embedded in the transport layer is very positive: the costs
are sufficiently marginal to make the overhead acceptable on good links,
and the benefits on slow noisy ones are immense. I need hardly add that
by doing it low in the stack ALL upper-layer activity can potentially
take advantage of it, whereas a feature like REST is application-specific.

Rick Adams's suggestion of working on the network-order octet stream seems
pretty close to what ACS is achieving, and has the benefit of being
very simple in concept.

Thank you to everyone who emailed me to point out that the RFC was not at fault...

	-George

Internet: G.Michaelson@cc.uq.oz.au                     Phone: +61 7 377 4079
  Postal: George Michaelson, Prentice Computer Centre
          Queensland University, St Lucia, QLD Australia 4067. 

jdarcy@pinocchio.encore.com (Jeff d'Arcy) (12/24/89)

braden@VENERA.ISI.EDU:
> Well, not exactly.  How do you compute N, or reset your file to N?  N
> is a count of bytes in the transmitted data stream, which is related to
> file position parameters through a transformation which could be very
> complex.  On the machine with a complex file structure, the only way to
> compute N in general is to play through the conversion process.

This kind of restart is obviously a non-trivial problem.  That being the
case, I think it makes a lot of sense to keep the protocol simple and
make the machine- or format-induced complexity invisible to the common
network software.  By making the protocol more complex you introduce
additional overhead even for simple cases or between systems that have
very simple file structures.

Jeff d'Arcy     OS/Network Software Engineer     jdarcy@encore.com
  If Encore endorsed my opinions, they couldn't afford to pay me

nelson@sun.soe.clarkson.edu (Russ Nelson) (12/24/89)

I'm keeping this issue alive because I'm still learning from it.  I hope
that others find it interesting and are also profiting from the discussion.

We are trying to decide which file transfer restarting solution is
more general: sender-controlled or receiver-controlled.
Sender-controlled restarting relies on the sender issuing restart
markers periodically, and on the receiver being able to preserve its
state at these arbitrary (to it) points.  Receiver-controlled
restarting relies on the sender being able to suppress the initial N
octets.

The restart method prescribed by the FTP RFC is sender-controlled.
The restart method implemented by Adams is receiver-controlled.  These
two methods may both be implemented at the same time provided the
markers emitted by the sender-controlled restart are distinguishable
from a string of decimal digits.
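
For instance, one way a single server might tell the two apart (my
assumption, not something any current implementation is known to do) is to
treat an all-digits REST argument as a stream-mode byte count and anything
else as a block-mode marker:

    #include <ctype.h>

    /*
     * Hypothetical dispatcher for a server accepting both forms of REST:
     * an argument that is nothing but decimal digits is taken as a
     * stream-mode byte count, anything else as a block-mode marker.
     */
    int rest_is_byte_count(const char *arg)
    {
        if (*arg == '\0')
            return 0;
        for (; *arg != '\0'; arg++)
            if (!isdigit((unsigned char)*arg))
                return 0;
        return 1;
    }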

In article <8912210357.AA06303@gateway.mitre.org> barns@GATEWAY.MITRE.ORG (Bill Barns) writes:

   I disagree because I don't believe that the byte number in the transfer
   stream is sufficient information to determine how to join the data sent
   during the restarted transfer with the data sent the first time in
   every imaginable case.

It should be sufficient.  It's the receiver that chose the number.
Perhaps an example is in order?  A Unix implementation receiving an
ASCII file [1] would have two states: one for "maybe CR", and another
for "expecting LF".  A receiver-controlled restart mechanism would
only need to remember restarts when in the first state.  A sender-controlled
restart mechanism would need to remember restarts in *both* states.  It
would also need to remember which state it was in.  It would also need
the ability to enter that state upon entry to the routine.

[1] Unix uses a single character (newline) to indicate the end of a
line.  ASCII as transmitted over TCP uses two characters (CR, LF) to
indicate the end of a line.  Every occurrence of CR followed by a LF
should be changed into a newline.  The hack of ignoring CR and
translating LF into newline is not correct.
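
A minimal sketch of that receive-side translation (illustrative only, not
the actual BSD or KA9Q code), with the "maybe CR" condition as the only
state that has to survive between network reads:

    #include <stdio.h>

    static int pending_cr = 0;          /* 1 = last network octet was CR */

    /*
     * Feed one octet from the data connection and write the translated
     * local form: CR LF becomes newline, a lone CR is kept, and a naked
     * LF falls through unchanged (which is exactly what later makes the
     * stored file ambiguous).
     */
    void put_net_octet(int c, FILE *out)
    {
        if (pending_cr) {
            pending_cr = 0;
            if (c == '\n') {            /* CR LF -> newline */
                putc('\n', out);
                return;
            }
            putc('\r', out);            /* lone CR: keep it */
        }
        if (c == '\r')
            pending_cr = 1;             /* wait for the next octet */
        else
            putc(c, out);
    }

    /* A restart point is only "clean" while pending_cr is 0, i.e. in the
     * "maybe CR" state described above. */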

   There would be no problem if the bytes in the transfer stream were
   literally stored in the file.

Unfortunately, the discussion that followed was flawed.  You assumed
that the restart parameter must be reconstructed solely from the data
file.  I don't believe this is possible, for the invertibility reasons
you suggest.  Even if it were possible to do with receiver-controlled
restart, it is certainly impossible with sender-controlled restart,
because you have the problem of remembering the arbitrary (to the
receiver) restart markers.

I think that you were led into the brambles because Adams's receivers
let you choose an arbitrary octet at which to restart.
Receiver-controlled restarting isn't always going to be that easy.
But it *is* going to be easier than sender-controlled restarting.

I'm ignoring the issue of block mode, which is required by
sender-controlled restart.  None of the major anonymous FTP archive
sites have implemented block mode.

And having said all that, I'll close by saying that it doesn't really
matter *which* restart method gets implemented, so long as at least
one of them *does* get implemented, preferably the same one.  Given
that receiver-controlled is simpler to implement, and a freely
copyable implementation of it for 4.3 BSD Unix already exists, I'd go
with receiver-controlled.
--
--russ (nelson@clutx [.bitnet | .clarkson.edu])  Russ.Nelson@$315.268.6667
Live up to the light thou hast, and more will be granted thee.
A recession now appears more than 2 years away -- John D. Mathon, 4 Oct 1989.
I think killing is value-neutral in and of itself. -- Gary Strand, 8 Nov 1989.
Liberals run this country, by and large. -- Clayton Cramer, 20 Nov 1989.
Shut up and mind your Canadian business, you meddlesome foreigner. -- TK, 23 N.

dricejb@drilex.UUCP (Craig Jackson drilex1) (12/29/89)

Seeing the discussion of receiver-controlled vs. sender-controlled restart
in the referenced article, I realized that there could be another problem
with Rick Adams' restart method (where the receiver tells the sender,
"suppress the first n bytes of your transmission").

The problem would occur if the receiver were unable (or unwilling) to
store a byte-image of the transmitted file, or something transformable
back and forth to such an image.  If an irreversible transformation
is necessary to store the received file in the receiver's file system,
then the receiver may not be able to compute the proper byte count.

For example, assume a receiver that implements a record-oriented file
system with fixed-length records.  The receiver might have to blank-pad
each record received via FTP.  (I'm ignoring the issue of long lines.)
Such blank-padding might be innocuous to all uses on the receiving
machine, except for retransmission.

I suppose that the way this would be dealt with is for the receiver to
store an auxiliary file, with indications of "when I began record m,
n bytes had been sent".  This file would be periodically updated (every
few hundred records) and pushed to the disk.  Some care would need to be
taken to ensure that the received file and the marker file would be
consistent after a crash.

I don't mean to say that Rick's method is not useful.  I'm just trying 
to explore the issues.
-- 
Craig Jackson
dricejb@drilex.dri.mgh.com
{bbn,axiom,redsox,atexnet,ka3ovk}!drilex!{dricej,dricejb}

braden@VENERA.ISI.EDU (01/03/90)

	I suppose that the way this would be dealt with is for the receiver to
	store an auxiliary file, with indications of "when I began record m,
	n bytes had been sent".  This file would be periodically updated (every
	few hundred records) and pushed to the disk.  Some care would need to be
	taken to ensure that the received file and the marker file would be
	consistent after a crash.

The restart mechanism built into FTP is essentially a nice version of what
you suggest.

Bob Braden