[comp.protocols.tcp-ip] More on TCP Performance Limits

craig@SICS.SE (Craig Partridge) (01/11/91)

There seems to be a lot of misinformation running around.

The end-to-end performance of a TCP connection is limited by two different
factors:

    (1) The window size.  Because the window size determines how much
    unacknowledged data can be in flight, the maximum bandwidth a
    connection can achieve is the window size divided by the round-trip
    time.

    (2) The sequence space size.  To ensure that you don't get sequence
    space wrap (two different instances of byte #37 active at the
    same time), TCP places time limits (which depend on IP flushing
    packets of a certain age) on how fast you can cycle through the
    sequence space.

Right now (1) is the usual limit on throughput.  I believe you have to get
substantially past FDDI speeds for (2) to be a problem.
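
[Editor's sketch] The two limits above reduce to simple arithmetic. The
constants below are assumptions, not figures from the post: a 65,535-byte
maximum window (16-bit window field, no window scaling), a 32-bit sequence
space, and the 120-second maximum segment lifetime given in RFC 793. The
function names are illustrative only.

```python
# Rough arithmetic for limits (1) and (2). All constants are assumptions:
# 65,535-byte window, 32-bit sequence space, 120-second MSL (RFC 793).

MAX_WINDOW_BYTES = 65_535
SEQ_SPACE_BYTES = 2 ** 32
MSL_SECONDS = 120.0


def window_limited_rate(window_bytes, rtt_seconds):
    """Limit (1): at most one window of data per round trip, in bytes/s."""
    return window_bytes / rtt_seconds


def sequence_wrap_time(rate_bytes_per_sec):
    """Limit (2): seconds needed to cycle through the whole sequence space."""
    return SEQ_SPACE_BYTES / rate_bytes_per_sec


if __name__ == "__main__":
    for rtt_ms in (1, 10, 100):
        rate = window_limited_rate(MAX_WINDOW_BYTES, rtt_ms / 1000.0)
        print(f"RTT {rtt_ms:>3} ms: window limit {rate / 1e6:.2f} MB/s")

    fddi = 100e6 / 8  # FDDI at 100 Mbit/s, expressed in bytes/s
    print(f"FDDI wraps the sequence space in {sequence_wrap_time(fddi):.0f} s")
    wrap_rate = SEQ_SPACE_BYTES / MSL_SECONDS
    print(f"Wrapping within one MSL needs {wrap_rate / 1e6:.1f} MB/s")
```

Under these assumptions, FDDI needs several minutes to wrap the sequence
space, which is longer than one MSL.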

RFCs 1072 and 1185 discuss these issues in more detail.

Craig Partridge
(on sabbatical at the Swedish Institute of Computer Science).

rpw3@rigden.wpd.sgi.com (Rob Warnock) (01/14/91)

In article <9101111221.AA08912@garuda.sics.se> craig@SICS.SE
(Craig Partridge) writes:
+---------------
| There seems to be a lot of misinformation running around.
| The end-to-end performance of a TCP connection is limited by two different
| factors:
|     (1) The window size...
|     (2) The sequence space size...
+---------------

And (at least) one more:

      (3) The underlying IP ID space size.

Item #1 depends on the round-trip time; #2 and #3 do not.

As Chris Johnson noted, you can only send <#_of_distinct_IP_IDs> (64K) times
<max_IP_pkt_size> (64K) bytes per TTL, where the TTL has to at least be large
enough to cover your number_of_hops, and in any case shouldn't be smaller
than 15 since that's the suggested default reassembly timeout. With TTL=255,
that's 16.8 MB/s; for TTL=15, that's 286 MB/s.

Why be concerned about reassembly timeouts? Because to get the data rates
noted above, you have to send max-sized IP packets (64 Kbytes), which implies
fragmentation on almost all current media (except HiPPI). (It also means a TCP
MSS of 64K, but that's the least of the worries.) And you don't want later
fragments being confused with earlier ones. If your IP holds onto frags for
a minimum of 15 seconds (see RFC 791, Page 27), that puts an effective minimum
on TTL of 15 seconds, at least for the purposes of the rate-limit calculation.
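
[Editor's sketch] The confusion between later and earlier fragments can be
shown with a toy reassembly buffer. It is keyed on (source, destination,
protocol, identification), which is how RFC 791 identifies the fragments of
one datagram. Everything else is a simplifying assumption: the Reassembler
class, the addresses and the payloads are made up, offsets are in bytes
rather than 8-byte units, the total length is passed in directly, and
timers are omitted.

```python
# Toy reassembly buffer. Real IP uses 8-byte offset units, learns the total
# length from the last fragment, and expires partial datagrams on a timer;
# none of that is modelled here.

class Reassembler:
    def __init__(self):
        self.partial = {}  # key -> {byte offset: payload bytes}

    def receive(self, src, dst, proto, ip_id, offset, data, total_len):
        """Store one fragment; return the datagram once all bytes are present."""
        key = (src, dst, proto, ip_id)
        frags = self.partial.setdefault(key, {})
        frags[offset] = data
        if sum(len(d) for d in frags.values()) < total_len:
            return None
        del self.partial[key]
        return b"".join(frags[o] for o in sorted(frags))


if __name__ == "__main__":
    r = Reassembler()
    # Old datagram with ID 7: the first fragment arrives, the second is lost.
    r.receive("A", "B", 6, 7, 0, b"OLD-", 8)
    # The sender wraps the ID space and reuses ID 7 while the old fragment is
    # still held. The new first fragment is lost; the new second one arrives.
    spliced = r.receive("A", "B", 6, 7, 4, b"NEW!", 8)
    print(spliced)  # b'OLD-NEW!': halves of two different datagrams
```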

But RFC 1122 says [page 35]:

                 A fixed value [of TTL] must be at least big enough for the
                 Internet "diameter," i.e., the longest possible path.
                 A reasonable value is about twice the diameter, to
                 allow for continued Internet growth.

And further [page 57]:

         There MUST be a reassembly timeout.  The reassembly timeout
         value SHOULD be a fixed value, not set from the remaining TTL.
         It is recommended that the value lie between 60 seconds and 120
         seconds...

         DISCUSSION:
              The IP specification says that the reassembly timeout
              should be the remaining TTL from the IP header, but this
              does not work well because gateways generally treat TTL as
              a simple hop count rather than an elapsed time.  If the
              reassembly timeout is too small, datagrams will be
              discarded unnecessarily, and communication may fail.  The
              timeout needs to be at least as large as the typical
              maximum delay across the Internet.  A realistic minimum
              reassembly timeout would be 60 seconds.

Using the suggested 60 seconds produces an IP ID re-use rate-limit of 71.6 MB/s,
120 seconds gives 35.8 MB/s.
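
[Editor's sketch] The four rate limits quoted in this post all come from one
formula: the number of distinct IP IDs times the maximum packet size,
divided by the time a packet or fragment may stay alive. The sketch below
follows the post in rounding the maximum packet size to 64K (2**16) bytes;
the function name is illustrative only.

```python
# IP ID re-use rate limit: ID space * max packet size / lifetime.
# Both constants use the post's round "64K" figure.

ID_SPACE = 2 ** 16            # distinct IP identification values
MAX_PACKET_BYTES = 2 ** 16    # "64K" as used in the post


def ip_id_rate_limit(lifetime_seconds):
    """Maximum bytes/s before an IP ID is reused within the lifetime."""
    return ID_SPACE * MAX_PACKET_BYTES / lifetime_seconds


if __name__ == "__main__":
    for lifetime in (255, 15, 60, 120):
        rate = ip_id_rate_limit(lifetime)
        print(f"{lifetime:>3} s: {rate / 1e6:.1f} MB/s")
```

This reproduces the 16.8, 286, 71.6 and 35.8 MB/s figures to the precision
given in the text.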

So the IP ID rate-limit (item #3) is also a serious issue in gigabit/sec TCP
networking. Some of the ideas in RFC 1185 may be helpful here, but in the
presence of fragmentation, TCP options cannot be recognized in any fragment
except the first. The solution may be to use some form of MTU discovery, then
send *all* TCP segments with the "Don't Frag" bit on in the IP packets
(avoiding reassembly aliasing), *let* the IP IDs wrap as they will, and use
the timestamp mechanisms of RFC 1185 to sort out potential duplicates.


-Rob

-----
Rob Warnock, MS-9U/515		rpw3@sgi.com		rpw3@pei.com
Silicon Graphics, Inc.		(415)335-1673		Protocol Engines, Inc.
2011 N. Shoreline Blvd.
Mountain View, CA  94039-7311

mogul@wrl.dec.com (Jeffrey Mogul) (01/15/91)

In article <81033@sgi.sgi.com> rpw3@sgi.com (Rob Warnock) writes:
>In article <9101111221.AA08912@garuda.sics.se> craig@SICS.SE
>(Craig Partridge) writes:
>| There seems to be a lot of misinformation running around.
>| The end-to-end performance of a TCP connection is limited by two different
>| factors:
>|     (1) The window size...
>|     (2) The sequence space size...
>And (at least) one more:
>      (3) The underlying IP ID space size.
>[...]
>So the IP ID rate-limit (item #3) is also a serious issue in gigabit/sec TCP
>networking. Some of the ideas in RFC 1185 may be helpful here, but in the
>presence of fragmentation, TCP options cannot be recognized in any fragment
>except the first. The solution may be to use some form of MTU discovery, then
>send *all* TCP segments with the "Don't Frag" bit on the in the IP packets
>(avoiding reassembly aliasing), *let* the IP IDs wrap as they will, and use
>the timestamp mechanisms of RFC 1185 to sort out potential duplicates.

Note that the Path MTU Discovery Protocol (RFC 1191) specifically
requires the participating sending host to send all TCP segments
with DF set.  With this mechanism in use, the IP ID field becomes
effectively unused, and then the size of the ID space is immaterial
to the maximum rate on the connection.  The rate is thus controlled
only by the sequence space and window sizes.

You might end up sending 8 bytes per segment, but this doesn't limit
the theoretical TCP bandwidth ... only the practical bandwidth.
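
[Editor's sketch] The gap between theoretical and practical bandwidth can be
estimated from header overhead. The sketch assumes 20-byte IP and 20-byte
TCP headers with no options and ignores link-layer framing; the payload
sizes other than 8 bytes are arbitrary comparison points, and the function
name is illustrative only.

```python
# Fraction of the bytes on the wire that are TCP payload, assuming
# option-free headers and no link-layer framing.

IP_HEADER_BYTES = 20
TCP_HEADER_BYTES = 20


def payload_efficiency(payload_bytes):
    """Payload bytes divided by total IP packet bytes."""
    total = payload_bytes + IP_HEADER_BYTES + TCP_HEADER_BYTES
    return payload_bytes / total


if __name__ == "__main__":
    for payload in (8, 536, 1460):
        pct = 100 * payload_efficiency(payload)
        print(f"{payload:>5}-byte segments: {pct:.1f}% of wire bytes are data")
```

With 8-byte segments only about one wire byte in six carries data, yet the
sequence space is still consumed only by the data bytes.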

-Jeff

mostek@PECAN.CRAY.COM (James Mostek) (01/15/91)

It seems that most of the contributors are pointing out that IP
limits the performance of TCP/IP with the TTL and ID fields.
They claim that IP would break if too many packets are in the network
at any point in time (two packets could have the same ID).

However, if IP builds a packet for TCP with bad data,
TCP will drop the packet and it will be retransmitted.
TCP's checksum will certainly show an improper packet.
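
[Editor's sketch] A simplified version of the 16-bit one's-complement
checksum (in the style of RFC 1071) shows what kind of damage the check
catches. This is an illustration under assumptions: it sums the payload
only, whereas the real TCP checksum also covers the TCP header and a
pseudo-header, and the payloads are made up. Because the sum does not
depend on word order, detection is very likely rather than guaranteed.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    good = internet_checksum(b"OLD-DATA")
    spliced = internet_checksum(b"OLD-NEW!")
    print("splice detected:", good != spliced)

    # Reordered 16-bit words give the same sum, so this change is missed.
    same = internet_checksum(b"AABB") == internet_checksum(b"BBAA")
    print("word swap undetected:", same)
```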

This is a very obscure case (an ID from a later
fragment arrives before its predecessor).
So we are not talking about many retransmissions.

TCP is built to handle bad data.
I'm not familiar with UDP; how would it handle this?

I don't think people should make the calculations in previous mails
to claim that TCP/IP's performance is limited by IP's ID and
TTL fields.

Jim Mostek
Cray Research, Inc
mostek@cray.com