seasterb@CS.UCL.AC.UK (Steve Easterbrook) (04/30/87)
Whilst building a tool to monitor tcp's behaviour under various conditions I came across an interesting feature of the smoothed round trip time (SRTT) function. As expected, this hovers around a value, responding to such things as rtx time-outs, etc. However, when running tcp across a local net (i.e. with a very small round trip time), I noticed a series of apparently random 'kicks' where the SRTT suddenly shoots up to a relatively enormous value, coming down only slightly less quickly. After some careful analysis, I discovered the cause. In the test I was doing, the round trip time was small enough for the SRTT to reach zero occasionally. Examination of the tcp code reveals that when modifying the SRTT, a test is made to see if it is zero, and if so the usual RSRE smoothing function isn't used. The question is this: Can anyone give me a pointer to the philosophy behind this, and maybe a reference as well? I gather its main purpose is to prevent the SRTT from staying at zero, as this will cause tcp to be a bit keen on the retransmissions, but it seems to me that the substitute value is a little on the overkill side. Ta muchly, Steve
CLYNN@G.BBN.COM.UUCP (04/30/87)
Steve, It is well known that a TCP receiving an ack for a segment does not know if the ack is for the original transmission or one of the retransmissions. The spec does not offer much help to the implementor; consequently different algorithms have been explored. If one measures the rtt from the most recent (re)transmission, then the possibility exists that an ack of one of the prior transmissions may arrive "about the same time" as the "next" retransmission. If it is just a little before, the segment is acked, it is not retransmitted, and the rtt is "reasonable", so one uses it; if it is just a little after, who knows ... one hopes for the best (and frequently loses); if it arrives at the "same time", who knows. It sounds like the implementation you were measuring decided that it was a case of the retransmission being acked and therefore didn't want to corrupt the srtt by including what the implementors thought was bogus data. Aside: I believe one can prove that if the rtt is not measured from the original transmission (and no other information is available to decide what is being acked) then it is possible for the srtt to converge to a value less than the "correct" value; this causes every packet to be retransmitted, even if it isn't lost. I think there is enough latitude in the protocol to allow an implementor to cause the other end to unknowingly provide some good "hints" about what is being acked, but that is another discussion. Charlie
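The ambiguity Charlie describes can be made concrete with a small sketch. The struct and function names below are hypothetical, times are in arbitrary clock ticks, and nothing here is taken from a particular TCP implementation; it only contrasts the two measurement choices for a segment that has been sent more than once.

```c
/* Hypothetical record of one segment's transmission history (in ticks). */
struct seg_timing {
    long first_sent;   /* time of the original transmission */
    long last_sent;    /* time of the most recent (re)transmission */
};

/* RTT sample measured from the original transmission.
 * Never underestimates, but overestimates if the ack was really
 * for a later retransmission. */
long rtt_from_original(const struct seg_timing *s, long ack_time)
{
    return ack_time - s->first_sent;
}

/* RTT sample measured from the most recent (re)transmission.
 * Underestimates if the ack was really for an earlier copy. */
long rtt_from_latest(const struct seg_timing *s, long ack_time)
{
    return ack_time - s->last_sent;
}
```

With an original send at tick 0, a retransmission at tick 10, and an ack at tick 11 that was in fact triggered by the original, the first function yields 11 and the second yields 1. Feeding the second value into the smoothing function pulls the srtt below the true round trip time, which is the convergence problem mentioned in the aside.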
seasterb@CS.UCL.AC.UK (Steve Easterbrook) (04/30/87)
> It sounds like the
> implementation you were measuring decided that it was a case of the
> retransmission being acked and therefore didn't want to corrupt the
> srtt by including what the implementors thought was bogus data.
Ah, excellent point, but I've managed to rule this one out. I'm monitoring retransmissions as well, and there aren't any happening at the right time to account for the behaviour I described. It seems the SRTT is falling to zero purely due to rounding errors with very small round trip times. However, this doesn't preclude the resulting behaviour being due to the implementors allowing for the circumstances you describe. There seem to be two possible approaches (given that it is undesirable to have an SRTT of zero): let the SRTT do whatever it wants, but never let the RTT be rounded to zero, or do something different with the SRTT if it does reach zero. Clearly the second approach (whether intentionally or not) is taken in the tcp I'm using. Given this, what alternative value should it be set to? Has anyone tackled this problem before? And what happens if it *is* left at zero? (I shall now go away and find out the answer to the last question!) Steve
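The rounding effect Steve suspects can be illustrated with a minimal sketch. It assumes the estimate is held in whole clock ticks and smoothed with a gain of 7/8 using truncating integer division; both the gain and the function names are assumptions for illustration, not the code of the tcp under test.

```c
/* Integer smoothing with an assumed gain of 7/8.
 * The division truncates, so small values are rounded down. */
int smooth(int srtt, int rtt)
{
    return (7 * srtt + rtt) / 8;
}

/* Count how many updates with a constant sample it takes for srtt
 * to reach zero. Returns -1 if it has not reached zero within
 * `limit` updates. */
int updates_until_zero(int srtt, int rtt, int limit)
{
    int n;

    for (n = 1; n <= limit; n++) {
        srtt = smooth(srtt, rtt);
        if (srtt == 0)
            return n;
    }
    return -1;
}
```

Starting from an srtt of 3 ticks, three samples measured as 0 ticks are enough to reach zero (3, 2, 1, 0). If every sample is held at 1 tick or more, which is the first of the two approaches above, this particular smoothing settles at 1 and never reaches zero.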
karels%okeeffe@UCBVAX.BERKELEY.EDU.UUCP (04/30/87)
I presume that you are using 4.3BSD; discussions of implementation quirks are easiest to follow if the implementation is identified. Things will behave strangely in 4.3 TCP if the smoothed round-trip time becomes zero after the connection is established; it would do just what you described. The value of 0 is used to mean "unknown", and causes the default (fairly long) value to be assumed. The first RTT sample (from first send of SYN to its ack) becomes the initial value of the smoothed RTT. It was assumed that the smoothed RTT would never again be 0, as the RTT starts at 1. I don't understand how this can happen. To respond to an early point in this discussion: this implementation discards RTT estimates for segments that are retransmitted. This sometimes reduces the number of RTT estimates that are obtained, but is much better than restarting the RTT timer when retransmitting. Mike
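The behaviour Mike describes can be sketched as follows. This is an illustration only, not the 4.3BSD source: the struct, the function names, the 7/8 gain, and the default of 12 ticks are all assumptions chosen for the example.

```c
#define SRTT_UNKNOWN   0    /* sentinel: no estimate yet */
#define DEFAULT_SRTT  12    /* hypothetical "fairly long" default, in ticks */

struct rtt_state {
    int srtt;               /* smoothed RTT in ticks, 0 means unknown */
};

/* Fold one RTT sample into the estimate.
 * Samples for segments that were retransmitted are discarded. */
void rtt_update(struct rtt_state *st, int sample, int was_retransmitted)
{
    if (was_retransmitted)
        return;
    if (st->srtt == SRTT_UNKNOWN)
        st->srtt = sample;                      /* first sample seeds the estimate */
    else
        st->srtt = (7 * st->srtt + sample) / 8; /* assumed smoothing gain */
}

/* Value the retransmission logic works from: the default while the
 * estimate is unknown, otherwise the estimate itself. */
int rtt_effective(const struct rtt_state *st)
{
    return st->srtt == SRTT_UNKNOWN ? DEFAULT_SRTT : st->srtt;
}
```

In this sketch, if the estimate ever decays to 0 on an established connection, rtt_effective jumps from a tick or two to the default, which would look like the 'kicks' reported at the start of the thread.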
karn@FLASH.BELLCORE.COM (Phil R. Karn) (05/02/87)
My approach to the small-SRTT problem is to let it do what it wants, but bound the timer to the minimum non-zero value. I.e., if the clock ticks at a 1 Hz rate,
    rto = max(beta*srtt,1); /* rto is retransmission time-out */
Phil
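Bounding the timer at the minimum non-zero value amounts to a lower clamp on the time-out. A minimal sketch, assuming an integer beta and a time-out counted in whole clock ticks; the function name and the MIN_RTO constant are hypothetical, and the srtt itself is left free to fall to zero.

```c
#define MIN_RTO 1   /* one clock tick: assumed minimum non-zero timer value */

/* Retransmission time-out in ticks: beta * srtt, bounded below by MIN_RTO. */
int rto_ticks(int beta, int srtt)
{
    int rto = beta * srtt;

    return rto > MIN_RTO ? rto : MIN_RTO;
}
```

With beta = 2, an srtt of 0 gives a time-out of 1 tick rather than 0, while an srtt of 3 gives 6 ticks as usual.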