[comp.protocols.tcp-ip] tcp smooth rtt function

seasterb@CS.UCL.AC.UK (Steve Easterbrook) (04/30/87)

Whilst building a tool to monitor tcp's behaviour under various conditions
I came across an interesting feature of the smooth round trip time function.
As expected, this hovers around a value, responding to such things as
rtx time-outs, etc. However, when running tcp across a local net (i.e. with
a very small round trip time), I noticed a series of apparently random 'kicks'
where the SRTT suddenly shoots up to a relatively enormous value, coming
down only slightly less quickly. After some careful analysis, I discovered
the cause. On the test I was doing, the round trip time was small enough
for the SRTT to reach zero occasionally. Examination of the tcp code reveals
that when modifying the SRTT, a test is made to see if it is zero, and if
so the usual RSRE smoothing function isn't used. 

The question is this: Can anyone give me some pointer as to what the
philosophy behind this is, and maybe some reference as well. I gather
its main purpose is to prevent the SRTT from staying at zero, as this
will cause tcp to be a bit keen on the retransmissions, but it seems
to me that the substitute value is a little on the overkill side.

Ta muchly,
Steve

CLYNN@G.BBN.COM.UUCP (04/30/87)

Steve,
	It is well known that a TCP receiving an ack for a segment
does not know if the ack is for the original transmission or one of
the retransmissions.  The spec does not offer much help to the
implementor; consequently different algorithms have been explored.  If
one measures the rtt from the most recent (re)transmission, then the
possibility exists that an ack of one of the prior transmissions may
arrive "about the same time" as the "next" retransmission.  If it is
just a little before, the segment is acked, it is not retransmitted, and
the rtt is "reasonable", so one uses it; if it is just a little after,
who knows ...  one hopes for the best (and frequently loses); if it
arrives at the "same time", who knows.  It sounds like the
implementation you were measuring decided that it was a case of the
retransmission being acked and therefore didn't want to corrupt the
srtt by including what the implementors thought was bogus data.

Aside: I believe one can prove that if the rtt is not measured from
the original transmission (and no other information is available to
decide what is being acked) then it is possible for the srtt to
converge to a value less than the "correct" value; this causes every
packet to be retransmitted, even if it isn't lost.  I think there is
enough latitude in the protocol to allow an implementer to cause the
other end to unknowingly provide some good "hints" about what is being
acked, but that is another discussion.

Charlie

seasterb@CS.UCL.AC.UK (Steve Easterbrook) (04/30/87)

>   It sounds like the
>   implementation you were measuring decided that it was a case of the
>   retransmission being acked and therefore didn't want to corrupt the
>   srtt by including what the implementors thought was bogus data.

Ah, excellent point, but I've managed to rule this one out. I'm monitoring
retransmissions as well, and there aren't any happening at the right time to
account for the behaviour I described. It seems
the SRTT is falling to zero purely due to rounding errors with very
small round trip times. However, this doesn't preclude the resulting
behaviour being due to the implementors allowing for the circumstances
you describe. 

There seem to be two possible approaches (given that it is undesirable to
have an SRTT of zero):
Let the SRTT do whatever it wants, but never let the RTT be rounded to zero,
or do something different with the SRTT if it does reach zero. Clearly
the second approach (whether intentionally or not) is taken in the tcp
I'm using. 
Given this, what alternative value should it be set to? Has anyone
tackled this problem before? And what happens if it *is* left at zero?
(I shall now go away and find out the answer to the last question!)
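
The first of the two approaches can be sketched in C as below. This is a
minimal illustration, not the code of any particular TCP: it assumes the RTT
is measured in whole clock ticks and that the smoothing is a weighted average
with a gain of 9/10 done in integer arithmetic. The names clamp_rtt_sample
and smooth_rtt are hypothetical.

```c
/* Hypothetical sketch: keep a measured RTT from rounding down to zero ticks. */

/* Round a measured RTT (in clock ticks) up to at least one tick. */
static int clamp_rtt_sample(int rtt_ticks)
{
    return rtt_ticks < 1 ? 1 : rtt_ticks;
}

/*
 * Integer weighted average: srtt = (9*srtt + sample) / 10.
 * The 9/10 gain is an assumption made for illustration.
 */
static int smooth_rtt(int srtt_ticks, int rtt_ticks)
{
    return (9 * srtt_ticks + clamp_rtt_sample(rtt_ticks)) / 10;
}
```

As long as the previous estimate and the clamped sample are both at least 1,
the integer division cannot yield 0: smooth_rtt(1, 0) gives 1, where the
unclamped average would give 0.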

Steve

karels%okeeffe@UCBVAX.BERKELEY.EDU.UUCP (04/30/87)

I presume that you are using 4.3BSD; discussions of implementation
quirks are easiest to follow if the implementation is identified.
Things will behave strangely in 4.3 TCP if the smoothed round-trip
time becomes zero after the connection is established; it would do
just what you described.  The value of 0 is used to mean "unknown",
and causes the default (fairly long) value to be assumed.  The first
RTT sample (from first send of SYN to its ack) becomes the initial
value of the smoothed RTT.  It was assumed that the smoothed RTT
would never again be 0, as the RTT starts at 1.  I don't understand
how this can happen.
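
To illustrate how a zero that doubles as "unknown" could produce the kicks
described, here is a minimal sketch. It is not the 4.3BSD source: the integer
9/10 smoothing, the default of 6 ticks, and all of the names are assumptions.
It also assumes that an RTT sample of zero ticks can occur, which by the
reasoning above was not expected to happen.

```c
#define SRTT_UNKNOWN       0  /* assumption: 0 doubles as "no estimate yet" */
#define DEFAULT_SRTT_TICKS 6  /* hypothetical default, in clock ticks */

/* Fold one RTT sample (in ticks) into the smoothed estimate. */
static int srtt_update(int srtt, int sample)
{
    if (srtt == SRTT_UNKNOWN)
        return sample;               /* first sample seeds the estimate */
    return (9 * srtt + sample) / 10; /* integer division truncates */
}

/* Value the retransmit timer would work from. */
static int srtt_for_timer(int srtt)
{
    return srtt == SRTT_UNKNOWN ? DEFAULT_SRTT_TICKS : srtt;
}
```

Starting from an estimate of 1, a single sample of 0 ticks truncates the
estimate to 0, and srtt_for_timer then jumps from 1 to the default of 6.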

To respond to an early point in this discussion: this implementation
discards RTT estimates for segments that are retransmitted.  This
sometimes reduces the number of RTT estimates that are obtained,
but is much better than restarting the RTT timer when retransmitting.
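
A minimal sketch of discarding samples for retransmitted segments follows.
The struct and function names are hypothetical, and only one segment is timed
at a time; both are assumptions made for illustration, not a description of
the 4.3BSD code.

```c
#include <stdbool.h>

/* Hypothetical per-connection state for timing one segment at a time. */
struct rtt_timer {
    bool timing;        /* a segment is currently being timed */
    bool retransmitted; /* that segment has been sent more than once */
    int  start_tick;    /* clock value at the first transmission */
};

/* Begin timing a segment at its first transmission. */
static void rtt_start(struct rtt_timer *t, int now)
{
    t->timing = true;
    t->retransmitted = false;
    t->start_tick = now;
}

/* Note a retransmission; the clock is deliberately not restarted. */
static void rtt_on_retransmit(struct rtt_timer *t)
{
    t->retransmitted = true;
}

/*
 * Called when the timed segment is acked. Stores the sample and returns
 * true only if the segment was never retransmitted; otherwise the
 * measurement is ambiguous and is thrown away.
 */
static bool rtt_on_ack(struct rtt_timer *t, int now, int *sample)
{
    bool usable = t->timing && !t->retransmitted;

    if (usable)
        *sample = now - t->start_tick;
    t->timing = false;
    return usable;
}
```

A caller would feed the sample into the smoothing function only when
rtt_on_ack returns true, so an ack that might belong to either transmission
never reaches the estimate.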

		Mike

karn@FLASH.BELLCORE.COM (Phil R. Karn) (05/02/87)

My approach to the small-SRTT problem is to let it do what it wants, but
bound the timer to the minimum non-zero value. I.e., if the clock ticks at a
1 Hz rate,

	rto = max(beta*srtt,1);	/* rto is retransmission time-out */
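
Bounding the timer from below means taking the larger of the computed value
and the minimum. A slightly fuller sketch, assuming the timer is kept in whole
clock ticks and that beta is an integer multiplier (the value 2 and the names
below are assumptions, not taken from any implementation):

```c
#define BETA          2  /* assumed multiplier applied to srtt */
#define MIN_RTO_TICKS 1  /* smallest non-zero timer value, in ticks */

/* Retransmission time-out in ticks, never less than one tick. */
static int compute_rto(int srtt_ticks)
{
    int rto = BETA * srtt_ticks;

    return rto < MIN_RTO_TICKS ? MIN_RTO_TICKS : rto;
}
```

With these values compute_rto(0) is 1 and compute_rto(3) is 6, so an SRTT of
zero no longer produces a zero time-out.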


Phil