[comp.protocols.tcp-ip] TCP and Loss

craig@NNSC.NSF.NET (Craig Partridge) (10/02/87)

Thomas,

The work I was doing on RDP led me into the problems of how transport
protocols perform in the face of loss.  I haven't had much time to
look at the problem recently (I'm doing this in my spare time -- and
don't have much) but keep plugging away.

I'm interested in situations in which loss is inherent -- that
is, you will get loss, no matter what your TCP does.  Packet radios
suffering from noise or jamming are good examples.  Loss caused by
congestion is a different problem.

The paper Phil Karn and I gave at SIGCOMM touches on the problems. Karn's
algorithm is a method for keeping an accurate round-trip time estimate
even when loss rates reach 50%.

Currently I'm seeing if I can model what happens to acknowledgement and
transmission strategies if the loss rate really climbs.  So far I
know that there are situations in which at least some strategies to
reduce the number of acks you send fail *big* under loss -- you get
more total packets sent than if you had had sent the acks you suppressed.
Beyond that I don't have anything to say yet -- I'm finding the mathematics
of it difficult.

I'd be interested to know of other people working in this area.

Craig

brescia@park-street (Mike Brescia) (10/05/87)

Can anyone from Packet Radio or SURAN projects speak to the retransmission
attempts done on single hops to improve reliability (and increase delay
average and variance)?  Is hop-by-hop retransmission better than end-to-end
retransmission or not?  Why?  Should TCP rely on hop-by-hop reliability and
never retransmit?  (Before you answer that, recall that the vast majority of
lost packets are dropped in gateways because of congestion.)

Mike

     <msg from craig@nnsc.nsf.net to narten@purdue.edu>

     I'm interested in situations in which loss is inherent -- that
     is, you will get loss, no matter what your TCP does.  Packet radios
     suffering from noise or jamming are good examples.  Loss caused by
     congestion is a different problem.

glauer@SURAN.BBN.COM (Gregory Lauer) (10/05/87)

Hop-by-hop retransmission is needed in networks with high loss and long routes
if we are to have a reasonable chance of getting anything through the network.
If the link loss probability is p, then the probability of getting a packet
through N hops without hop-by-hop retransmissions is (1-p)**N.  In a lossy
network (p = .50), with a path length of N=10 hops, the probability of getting
a packet through without hop-by-hop retransmissions is thus ~.001.  On the
other hand, end-to-end retransmissions are needed since a node can crash after
having acked a packet and before having forwarded it.

How many packets get sent in each case?

In the following we assume that p**N is approximately 0 and that 1/p<<N (it
lets us sum from 0 to infinity instead of from 0 to N).

Without hop-by-hop retransmission it takes on average 1/(1-p)**N end-to-end
retransmission before the packet will reach the destination.  Each unsuccessful
end-to-end transmission (obviously) doesn't reach the destination, but goes (on
average) 1/p hops, thus the total traffic generated is 1/[p(1-p)**N] packets.
For p=.5 and N=10, this implies ~2000 packets are generated to get the packet
to the destination.  Of course, an end-to-end ack is also needed, which will
generate another 2000 packets (and, unless the retransmission timers are set
"appropriately", the source will continue to transmit while it is waiting for
the ack to get through).

With hop-by-hop retransmission it takes, on average, 1/(1-p) transmissions to
get a packet over a link.  If the hop-by-hop retransmission timers are set
appropriately (so that a packet is retransmitted only if it didn't get across
the link), then a packet is transmitted 1/(1-p)**2 times before the
acknowledgement is received.  Thus the number of transmissions required to get
the packet to the destination is N/(1-p)**2, which for p=.5 and N=10, is 40
packets.  Another 40 packets are required for the end-to-end ack (during which
time the source may continue to send if the end-to-end retransmission timer is
set inappropriately).  Note that in addition to transmissions of the original
packet, there are 1/(1-p) hop-by-hop acks transmitted on each link, adding
N/(1-p) packets to the overhead of getting a packet to a destination.  (For
p=.5, and N=10, this yields another 20 packets each way.) 

Thus ignoring "spurious" end-to-end retransmissions, we have the following
comparison:

with hop-by-hop acks:
   40 transmissions of the original packet
   20 hop-by-hop acks of the original packet
   40 transmissions of the end-to-end ack
   20 hop-by-hop acks of the end-to-end ack
  ---
  120 packets

without hop-by-hop acks:
   2000 transmissions of the original packet
   2000 transmissions of the end-to-end ack
   ----
   4000 packets

Finally, a note about the variance of the round trip time.  Ignoring the impact
of missing hop-by-hop acks, the variance in the number of transmissions
required to get a packet across a link is (I think) q=2[1/(1-p)+{p/(1-p)}**2],
and the variance in the time required to get it to the destination is Nq.  For
our example, this implies a standard deviation in the number of transmission
required to get a packet to the destination of ~24.5 transmissions.  For the
case with no hop-by-hop acks, the number of end-to-end transmissions required
has a variance given by the formula for q above with p replaced by
1-1/(1-p)**N.  For our example this yields a standard deviation of ~1447
transmissions.  Thus in this example, hop-by-hop acks not only reduce the
number of transmissions required but also reduce the variance in the number of
transmissions required (making it easier to have a good estimate of the
round-trip time, and thus reducing the number of unnecessary retransmissions).

For networks with lower losses, the overhead of the hop-by-hop acks will
outway the reduction in number of transmissions required.  We can get an upper
bound on when hop-by-hop acks help as follows.  An upper bound on the number of
transmissions required without hop-by-hop acks is N/(1-p)**N (i.e. each failed
transmission takes the maximum number of hops plus 1).  A lower bound on the
number required with hop-by-hop acks is 2N/(1-p) (i.e. never retransmit a
packet once it has been received...even if you don't know it's been received).
Equating these yields the fact that hop-by-hop acks don't help if
p<1-(.5)**[1/(N-1)], which for N=10 corresponds to p<.074.

This type of analysis is perhaps appropriate for a packet radio network, where
there may be a significant probability that a radio just doesn't receive the
packet and one can (maybe) argue that the probability of loss is independent
from node to node and from transmission to transmission.  This type of analysis
is less useful in networks where packet loss is due to congestion (and
corresponding buffer shortages), where acks have a higher priority, where there
is likly to be significant correlation between the probability that nodes are
congested and where packet (re)transmission is due mainly to timers going off
before a node has a chance to ack a packet that it correctly received.


Greg

WESTCOTT@G.BBN.COM (10/07/87)

Mike,

Hop by hop retransmissions are necessary in packet radio networks 
because PRs must transmit over links with relatively low probability
of success.  Imagine a 5 hop path with a  probability of 80% success
per hop (~30% over net).  When that is coupled with congestion and other
problems along a longer Internet route, then it is extremely difficult
and costly (in terms of packet transmissions) to attempt to maintain
a reliable connection.

Presently, PR nets also use the timing information from their hop
acknowledgements to apply backpressure for congestion control; its
known as the pacing algorithm.  The reasoning is that rather than fill
all the buffers in a data path, limit the packets forwarded to what
the next PR can handle.  If congestion appears on the far side of the
net, traffic generation may be slowed by increasing the pacing delays
all the way back to the source.  Smart sources slow down, dumb sources 
get their packets dropped by the source's attached PR.

Hop by hop retransmission is limited to several attempts along the
routing path and a few attempts directed to "any PR who can route this
packet toward the destination".  The later is known as alternate
routing and becomes most useful when you've lost connectivity with
your neighbor (perhaps its driven under a bridge or behind a
building).  

Because of the uncertainty of successful retransmission, PR nets count
on a reliable end-to-end protocol as well.  One way to look at your
question is that PRnets, with per hop acknowledgements, approach the
non-lossy nets in probability of successful transmissions.  I agree
with your thought that due to "congestion => dropped packets"
behavior, simply using hop by hop acknowledgements seems fruitless.

Jil

PAP4@AI.AI.MIT.EDU ("Philip A. Prindeville") (10/08/87)

Where can I find a complete description of IP packet radio networks?
I'd like to know everything about them: routing algorithms; link
layer protocols; hardware involved; reliability and throughput, etc.

Also, has anyone given any thought to using digital mobile phones as
a subnet media?  Or are most mobile phone nets in the US still analog?

Thanks in advance,

-Philip

CERF@A.ISI.EDU (10/10/87)

Mike,

On a lossy link, it is better to retransmit on the link to cinfine the
amount of retransmission to the link rather than pay the price throughout
the network. One uses end/end retransmission to recover from more cataclysmic
failures (loss of underlying X.25 VC, network partitioning, crash of a
packet switch, etc.).l

On really noisy links, it is better to use forward error correction
because checksums will fail almost every time and retransmission then
is not a good recovery tool.

Vint

CERF@A.ISI.EDU (10/10/87)

For starters, read the November 1978 IEEE Proceedings to get a
good overview of packet radio nets.

Vint Cerf