barns@GATEWAY.MITRE.ORG (Bill Barns) (06/05/89)
I am writing a document about how to optimize the effectiveness of network usage, primarily for users of MILNET.  Below is a highly condensed list of items I already plan to mention.  Suggestions for additional topics, points to bring out, misconceptions to (try to) correct, etc., are welcomed.  Surely some of you experts out there have been in the position of trying to explain some of these issues to people who are not protocol gurus and never will be; your insights will be helpful.  Please mail comments and suggestions to me directly unless there is some reason to post them to the world.  The document I am writing will probably be publicly released eventually (around the end of the year, I suppose).  Thanks in advance for all contributions...

Bill Barns / MITRE-Washington / barns@gateway.mitre.org

-------

[The eventual reader will be a semi-independent system manager, planner, buyer, or smart user of some type of computer participating in a wide area network which speaks mostly TCP/IP and charges per packet.  They may or may not reach that WAN through a LAN, which may be of any size or type, and the users of these computers may be local or remote, with potentially any type of equipment.  The readers are sufficiently (un)sophisticated that terms like TCP/IP, TELNET, FTP, network mail, connection, and protocol are familiar but understood only in a rather fuzzy way.  This document is intended to explain both general principles and concrete actions they might take, so that they can do whatever it is they are trying to do with minimal packet charges on the long-haul network.]

Guiding Principles:

Dollar cost depends on the number of packets sent through the long-haul network.  Your LAN is probably free, as long as you stay on it and don't try to exceed its capacity.
Don't send the same data between the same points more than once.
Consolidate data into a smaller number of large packets when feasible.
Convert long-haul usage to short-haul usage (access local resources instead of distant ones) when possible.

Mail:

Probably the most widely used application.  Tends to be reasonably efficient due to its inherent design, and some software gives you ways to do things a little better.
Mailers tend to have retry timers, often with adjustable rates.  Retrying less often is cheaper but takes longer to get the mail delivered when the first attempts fail.
Local mail exploders for massive distributions save a few packets (or in some cases, lots) and are nice to have for other reasons.  In some cases you may want a tree of exploders.
SMTP mailers are cheaper to run if they will send more than one message over a single TCP connection.  They can also deliver all copies for a single host in one transmission.  (An example session appears after the Domain Names notes.)

Domain Names:

You usually have a choice between the host table and domain name servers.  Domain servers are better, at least if you tune them right; FTPing host tables is probably wasteful in many ways.
The DNS carries less information per packet than a host table transfer, but most people don't need that much data anyway.  With correct tuning, the DNS should be cheaper to run in virtually all cases.
Try to contrive things so that a small number of hosts cache a lot of domain information, and query those hosts via some free resource (the LAN).
If you need so much data from a zone that DNS queries cost more than the host table, you should probably set up a local server for that zone, which gets the packet count back down; a zone transfer is probably even better than FTP, given the compaction.  Sensible choice of which zones to serve where is important.
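The SMTP example promised above: a sketch of one TCP connection delivering two messages, the first going to two mailboxes on the receiving host.  The host and mailbox names are invented and the server replies abbreviated; RFC 821 has the real rules.

    S: 220 far.host.mil Simple Mail Transfer Service ready
    C: HELO near.host.mil
    S: 250 far.host.mil
    C: MAIL FROM:<alice@near.host.mil>
    S: 250 OK
    C: RCPT TO:<bob@far.host.mil>
    S: 250 OK
    C: RCPT TO:<carol@far.host.mil>       (second copy, same transmission)
    S: 250 OK
    C: DATA
    S: 354 Start mail input; end with <CRLF>.<CRLF>
    C: ...text of first message...
    C: .
    S: 250 OK
    C: MAIL FROM:<alice@near.host.mil>    (second message, same connection)
       [MAIL/RCPT/DATA exchange as above]
    C: QUIT
    S: 221 far.host.mil Service closing transmission channel

A mailer that opens a fresh connection per message, or sends a separate copy per recipient on the same host, pays for the extra handshakes and duplicate data.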
File transfer:

Files wanted by lots of people should be cached locally on some known server, so that only one copy needs to be moved through the pay-per-packet backbone.
Restart capability in FTP could save packets, especially for people who frequently move huge files around.  Compressed transfer mode in FTP would be good.
Compression with separate software can be very good sometimes (compress(1) often shrinks text files by half or better), but there is a tradeoff against CPU time.
Running FTP in the background or in batch may be cheaper, due to fewer retransmissions and lower prices for off-peak packets.
The system administrator might want to log file transfers to find out which files are transferred redundantly (a sign that caching would pay).  Such a log might be nice for security audit too.

TELNET:

Line mode is cheaper than character-at-a-time mode, and local echoing is cheaper than remote echoing.  In character mode with remote echo, each keystroke can cost a 41-byte packet (1 byte of data plus 40 bytes of TCP and IP headers) plus the echo coming back, so a 60-character line may cost well over a hundred packets instead of one or two.
Application and user requirements have to be considered too; characterwise interaction is worth its cost in some situations.
TACs (and other terminal server devices) have some settable parameters which affect the packetizing.  Changing them may foul up interactive applications but may win for uploads, etc.
For extensive editing of a modest-sized file, it may be cheaper to download, edit locally, and upload than to edit on the remote host.  Doing initial data entry on a PC rather than on a remote host may be smart for the same reason.

Custom Applications:

How you code tends to affect the packetizing, which affects the cost.  Decide whether you are doing interactive, query/response, or batch-type processing, and design accordingly.
Mail can be used as an interface to some query/response applications: send a specially formatted message to a server mailbox and get back a message with the answer.  With custom programming on all of the hosts involved, the same thing can be done in real time (WHOIS).
Minimize SEND calls from your code unless you have a TCP which is clever about repacketizing and won't send lots of tiny packets.  (A sketch appears at the end of this message.)

Other User Facilities:

Batch queues, UNIX at(1), etc., may give you a handle on getting transfers done at off-peak times, when rates are lower.  (An example appears at the end of this message.)  It is also possible to build this into an application, with the user selecting immediate or overnight operation.

TCP/IP General:

Avoid retransmissions ("RXs").  Use TCPs with the Jacobson congestion-control algorithms, especially if your traffic goes through gateways or bogged-down paths much.
Learn how to see how much retransmitting your system is doing (an example appears at the end of this message); if it's a lot, find out why, because you may be paying for those packets.
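The sketch promised under Custom Applications: minimizing SEND calls, shown here in C against the 4BSD socket interface for concreteness (your TCP interface may differ, and the record format and names are made up for illustration).  The idea is to assemble a whole logical record in a buffer and hand it to TCP in one call, rather than one call per field; on a TCP without small-packet avoidance (Nagle, RFC 896), each call can turn into a separate, mostly-header packet.

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <stdio.h>
    #include <string.h>

    /*
     * Write one logical record with a single send() instead of one
     * send() per field.  "sock" is an already-connected TCP socket.
     */
    int
    send_record(sock, name, addr, phone)
    int sock;
    char *name, *addr, *phone;
    {
            char buf[512];

            /* Build the whole record locally, at no packet cost ... */
            (void) sprintf(buf, "%s;%s;%s\r\n", name, addr, phone);

            /*
             * ... then hand it to TCP all at once: one send(), and
             * usually one packet on the wire instead of three or four.
             */
            return (send(sock, buf, strlen(buf), 0));
    }

The same principle applies whatever the interface: let the application, not the keystroke or the field, decide where packet boundaries fall.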
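The off-peak example promised under Other User Facilities: one plausible arrangement using UNIX at(1).  The host name and command file are invented, and at's argument syntax varies from system to system; "ftp -n" suppresses auto-login so the command file can supply the login and transfer commands itself.

    % at 0300
    ftp -n far.host.mil < /usr/local/lib/nightly-ftp-cmds
    ^D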
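And the retransmission example promised under TCP/IP General: on 4BSD-derived systems, netstat -s prints per-protocol statistics, including TCP retransmission counts.  The exact format varies with the release, and the figures below are invented for illustration.

    % netstat -s
    ...
    tcp:
            81054 packets sent
                    1632 data packets (437652 bytes) retransmitted
    ...

If the retransmitted fraction is more than a few percent, something on the path (or in your TCP) deserves a look, because you are probably paying for those duplicate packets.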