snorthc@RELAY.NSWC.NAVY.MIL (08/29/89)
The cost of routing?

scenario: I am collecting data on network protocols and applications. The experiments are conducted on a test subnet only (128.38.45) and from the '45' subnet to another (128.38.48). The '45' cable has only 4 computers: a sun, a dec 3100, a lanalyser and a lan watch. The 48 cable is a "living lab"; various protocols are allowed to exist on it: decnet, novell_ipx, apple_localtalk, osi etc. It is also pretty quiet; a 2% peak for 1 second is the max traffic observed to date (I haven't broken out the LAN MDs yet). The router is currently a Network Systems Corporation EN-641. In a few more weeks it will be replaced by a cisco box and the tests will be rerun.

question: There is a repeatable difference in the number of packets required to conduct an operation on the 45 cable alone or routed from the 45 cable to the 48 cable for certain applications. The difference is fairly high for xterm and telnet, it cannot be detected for ftp. Below are two fragments of the test results form. They are fairly representative; I have been running tests for three weeks and have collected a fair quantity of data.

examples:                  45 CABLE           45 -> 48 CABLE
 Traffic required to initiate an xterm connection:
                          60 packets            71 packets
 Traffic telnet requires to transmit a known string:
                          37 packets            45 packets

YES! arps are stripped out and maintained as a separate stat.
YES! the test has been run to/from similar hw/sw platforms *
NO! fragments have not been observed... in fact in the case of xterm or telnet you tend to have small packets anyway.

* there is only one dec 3100, so an ultrix vax was used on the 48 cable; however, there is a sun 2 sunos 3.5 on both cables and results are quite similar.

So I am confused. What causes this overhead? Is there an RFC I should have read on this subject? Could it be the router? Any ideas?

Thank You,
Stephen Northcutt (snorthc@relay.nswc.navy.mil)
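To put the two reported measurements on a common scale, the routed overhead can be expressed as a percentage of the single-cable packet count. The short Python sketch below only restates the arithmetic on the four numbers given in the post; the function name is made up for illustration.

```python
def overhead_percent(local_packets, routed_packets):
    """Extra packets on the routed path, as a percentage of the local count."""
    return 100.0 * (routed_packets - local_packets) / local_packets


# Packet counts as reported in the post: (test, 45 cable, 45 -> 48 cable).
measurements = [
    ("xterm connection setup", 60, 71),
    ("telnet known string", 37, 45),
]

for name, local, routed in measurements:
    extra = routed - local
    print(f"{name}: +{extra} packets ({overhead_percent(local, routed):.1f}%)")
```

Both tests come out at roughly one extra packet for every five sent on the single cable, which suggests the effect scales with the amount of interactive traffic rather than being a fixed per-connection cost.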
craig@bbn.com (Craig Partridge) (08/29/89)
In article <8908291343.AA07813@ucbvax.Berkeley.EDU> snorthc@RELAY.NSWC.NAVY.MIL writes:
> There is a repeatable difference in the number of packets
>required to conduct an operation on the 45 cable alone or routed
>from the 45 cable to the 48 cable for certain applications.
>The difference is fairly high for xterm and telnet, it cannot be
>detected for ftp.
> ...
>examples:                  45 CABLE           45 -> 48 CABLE
> Traffic required to initiate an xterm connection:
>                          60 packets            71 packets
> Traffic telnet requires to transmit a known string:
>                          37 packets            45 packets
>
>YES! arps are stripped out and maintained as a separate stat.
>YES! the test has been run to/from similar hw/sw platforms *
>NO! fragments have not been observed... in fact in the case of
>xterm or telnet you tend to have small packets anyway.

In general, it would be much easier to diagnose this problem with a packet trace, but...

- Have you stripped out duplicate SYN segments? Some TCPs retransmit SYNs pretty quickly at the start. As a result, you are likely to see a few more SYNs/SYN-ACKs in each direction during connection setup. This is simply the cost of getting the connection calibrated to the path. [Although note you can retransmit SYNs in more or less intelligent ways].

- Similarly, does this problem of extra packets persist during the entire connection, or only at startup? If only at startup you may just be seeing effects of the retransmit timer calibrating itself to the slightly longer delay.

Speculating in the absence of enough data....

Craig
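The duplicate-SYN check suggested above could be sketched as follows, assuming the packet trace has already been decoded into simple records. The record layout (dicts with `src`, `sport`, `dst`, `dport`, `seq` and a `flags` set) is hypothetical and not the output format of any particular analyser; the addresses and ports are invented.

```python
from collections import Counter


def count_duplicate_syns(packets):
    """Count retransmitted SYN and SYN-ACK segments in a decoded trace.

    A segment counts as a duplicate when an earlier segment with the SYN
    flag had the same endpoints, the same sequence number and the same
    ACK-flag state.
    """
    seen = Counter()
    for p in packets:
        if "SYN" in p["flags"]:
            key = (p["src"], p["sport"], p["dst"], p["dport"],
                   p["seq"], "ACK" in p["flags"])
            seen[key] += 1
    # Every occurrence beyond the first is a retransmission.
    return sum(n - 1 for n in seen.values())


# Hypothetical trace: the client's SYN is sent twice before the SYN-ACK arrives.
client = {"src": "128.38.45.10", "sport": 1025, "dst": "128.38.48.20", "dport": 23}
server = {"src": "128.38.48.20", "sport": 23, "dst": "128.38.45.10", "dport": 1025}
trace = [
    {**client, "seq": 100, "flags": {"SYN"}},
    {**client, "seq": 100, "flags": {"SYN"}},         # retransmitted SYN
    {**server, "seq": 500, "flags": {"SYN", "ACK"}},
    {**client, "seq": 101, "flags": {"ACK"}},
]
print(count_duplicate_syns(trace))
```

Subtracting this count from the per-test totals would show how much of the routed overhead is connection setup and how much persists for the rest of the connection.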
jas@proteon.com (John A. Shriver) (08/29/89)
Two possibilities:

1. One of the host TCP implementations is hyper-sensitive to small changes in the round trip time. There will be an increase in end-to-end delay passing through the router. This could affect various congestion control algorithms in the hosts (like the Nagle algorithm).

2. Someone is dropping a packet. It could be that the router is not keeping up, or it could be that the router is sending faster than one of the hosts can receive. Many Ethernet interfaces have subtle "deaf time" problems.

You might want to look at the TCP, IP, and device stats on the two machines. Unfortunately, until 4.3tahoe (?), the TCP stats were not very detailed. Alternately, you could use a Sniffer (or the like), and interpret the TCP packets to see what's happening.
barns@GATEWAY.MITRE.ORG (08/29/89)
It could be many things, and the way to find out for sure is to look at the packets. However, here is an example of a possible candidate.

Telnet is an especially difficult protocol to analyze in terms of theoretical packet behavior. Neither end is exactly sure when it ought to send next, so there are implementation decisions involved. The decisions found in BSD-flavored code (and many others) tend to induce a dependency on round-trip time. If the timing works out favorably, you will have echoes and acknowledgements and window updates traveling together. If it works out less favorably, there will be extra packets carrying "naked ACKs" and possibly also window updates. By going through a router, you increase the RTT by some amount. This may increase the chance that the TCP will feel the need to send an ACK for an incoming segment before the echo or other output is ready to go back.

Many other things affect packetization, such as Nagle algorithm, need for retransmission, all the parameters that affect retransmission (since they indirectly determine what will go in the packets), context or task switching behavior of the OS, etc. I've done a few limited paper vs. reality studies on what makes packets and find that there are so many (nonlinear) factors that even if you know quite a lot about the underlying factors, there are likely to be yet more factors that you didn't know about.

To summarize, real systems have a lot of timing-related properties, and the TCP (and perhaps its higher layer) both contribute to and are affected by them. These influences show up more with irregular flows of small data chunks than with regular data flows of big data chunks. I also feel that the Nagle algorithm is too blunt an instrument to handle such flows nicely in certain sub-congested regimes, but I don't think that is what is biting here.

Bill Barns / MITRE-Washington / barns@gateway.mitre.org
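The "naked ACK" effect described above can be illustrated with a deliberately crude toy model. It assumes (this is an assumption, not something measured in this thread) that the receiving TCP flushes pending ACKs on a fixed 200 ms timer tick, and that the echo becomes ready a fixed time after the keystroke arrives. All timings below are invented, and the function names are hypothetical.

```python
def reply_packets_for_keystroke(arrival_ms, app_delay_ms, tick_ms=200):
    """Packets sent back for one keystroke in the toy model.

    Returns 1 if the ACK can ride on the echo, or 2 if the periodic
    delayed-ACK tick fires before the echo is ready (naked ACK + echo).
    """
    echo_ready_ms = arrival_ms + app_delay_ms
    next_tick_ms = (arrival_ms // tick_ms + 1) * tick_ms
    return 2 if next_tick_ms < echo_ready_ms else 1


def total_reply_packets(send_times_ms, one_way_delay_ms, app_delay_ms, tick_ms=200):
    """Total reply packets for a sequence of keystrokes sent at the given times."""
    return sum(
        reply_packets_for_keystroke(t + one_way_delay_ms, app_delay_ms, tick_ms)
        for t in send_times_ms
    )


# Invented keystroke times (ms) and a 30 ms echo delay in the application.
keystrokes = [0, 168, 330, 480]
print(total_reply_packets(keystrokes, one_way_delay_ms=0, app_delay_ms=30))
print(total_reply_packets(keystrokes, one_way_delay_ms=3, app_delay_ms=30))
```

With these particular numbers, 3 ms of extra one-way delay pushes one echo past a timer tick and costs one extra packet. The same shift can just as easily remove a naked ACK (a keystroke arriving at 198 ms costs two packets, at 201 ms only one), which is consistent with the point that the factors are nonlinear and hard to predict on paper.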
stev@VAX.FTP.COM (08/29/89)
some people (us included) send smaller packets when sending traffic across networks. this is to avoid the routers fragmenting packets. this is considered a good thing. the overhead is more when you are going through a fast router between fast networks, but you end up winning more when you consider more networks and more heavily loaded routers.

stev knowles
ftp software
stev@ftp.com
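A minimal sketch of the policy described above, assuming a host that uses a full Ethernet-sized segment (1460 bytes, i.e. a 1500-byte MTU minus 40 bytes of IP and TCP headers) on its own subnet and falls back to 536 bytes (the 576-byte default datagram size minus 40 bytes of headers) for anything off-net. The /24 prefix length, the host addresses and the function names are assumptions for illustration; real implementations differ in what they treat as "local".

```python
import ipaddress
import math

ETHERNET_MSS = 1460  # 1500-byte Ethernet MTU minus 40 bytes of IP + TCP headers
OFFNET_MSS = 536     # 576-byte default datagram size minus 40 bytes of headers


def choose_mss(src, dst, prefix_len=24):
    """Pick a segment size: large on the local subnet, conservative off-net."""
    local_net = ipaddress.ip_network(f"{src}/{prefix_len}", strict=False)
    return ETHERNET_MSS if ipaddress.ip_address(dst) in local_net else OFFNET_MSS


def segments_needed(payload_bytes, mss):
    """Number of TCP segments needed to carry the payload at the given MSS."""
    return math.ceil(payload_bytes / mss)


# Hypothetical hosts on the two subnets from the original post.
for dst in ("128.38.45.20", "128.38.48.20"):
    mss = choose_mss("128.38.45.10", dst)
    print(dst, mss, segments_needed(4096, mss))
```

Under these assumptions a 4096-byte transfer takes 3 segments on the local cable and 8 through the router. That would matter for bulk transfers; it would not by itself explain extra packets for telnet or xterm, where the segments are small anyway.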