jxh@attain.teda.teradyne.com (Jim Hickstein) (10/08/90)
We use rdist(1) to update a large (100MB) tree of product sources from the golden copy on the East coast to two sites, one on the West coast, the other in Japan. The link from East to West coasts is a relatively slow one: through two Ethernet bridges (or routers) and over one 56Kb/s and one 19.2Kb/s link. This introduces some serious latencies, including the actual copper propagation delay over 3000 miles, which isn't zero. (I used to work for a satellite datacomm company, and our spoofing product actually did better over a geosync satellite hop of .25s than over a straight-line copper wire of 1000 miles or more, since we really only had to go 10 feet before getting a reply!)

I have watched rdist running over this link, and it is characterized by small (60-byte) TCP packets bouncing back and forth, one at a time, and once in a while, when it actually has to transfer an updated file, many packets of 566 bytes in length. I suspected that it could do better if we interleaved the rdists of the two major trees, each of which took two hours to run. It now runs twice as fast: two hours in toto. Clearly, rdist is latency-bound rather than bandwidth-bound when most of what it does is check file timestamps. This led me to think that it must be doing some sort of transaction like this:

	I've got this file with this date.  Wanna new one?
	Ummm.  No.
	(repeat)

This transaction is synchronous, that is, the questioner waits for the answer before asking the next question. No overlap of any kind. Clearly, the program itself is nicely structured given this simple protocol, but it's killing me. I am considering running ten rdists, one for each of the ten major subdirectories, for each tree (twenty rdists going at once) to get around this, but that might put a strain on context switches or something: I'm not sure how much better it could get before I run into some other obstacle.
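To see why one-round-trip-per-file dominates everything else, here is a back-of-the-envelope model. All the numbers (RTT, file count) are illustrative assumptions, not measurements of our link:

```python
# Model of a synchronous stat-check protocol: one small packet each way
# per file, and the questioner waits for each answer before asking again.
# All figures below are assumed for illustration.

rtt = 0.5                   # assumed round-trip time over the slow path, seconds
files = 14_400              # assumed number of files in the tree
bytes_per_check = 60 * 2    # one 60-byte packet each way per timestamp check
link_bps = 19_200 / 8       # slowest hop, in bytes per second

latency_time = files * rtt                        # serialized round trips
transmit_time = files * bytes_per_check / link_bps  # actual wire time

print(f"latency-bound time:   {latency_time / 3600:.1f} hours")
print(f"bandwidth-bound time: {transmit_time / 3600:.2f} hours")
```

With these assumed figures the round trips alone account for about two hours, while the bytes themselves would take a small fraction of that, which is consistent with interleaving two rdists halving the wall-clock time.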
Clearly, the right way to do this is to set up two TCP sessions, one of which sends big batches of file names and their timestamps, and receives (eventually) lists of NAKs interspersed with requests to send a new copy of a given file. Each such request would then be queued for transmission on the other session, which (presumably) would already be busy (in a bandwidth-limited sense) sending some other file, so making yet another session (one for each requested file transfer) wouldn't gain you anything.

Has anybody come up with such a thing? Where do I get it? Do I have to pay for it? (I just might be able to do so.) Does "track" (the inverse rdist recently mentioned somewhere on the net) happen to do better in this regard? (I have no particular attachment to rdist: I do 'rdist -r' anyway, giving me an exact copy.) Please email replies, and I will post a summary. Thanks very much.
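The control-session half of the idea above can be sketched in a few lines: the master ships one big manifest of (path, mtime) pairs, and the replica answers with the list of paths it wants refreshed, so the whole tree costs one round trip instead of one per file. This is a hypothetical sketch with made-up names and data, not rdist's actual protocol:

```python
# Batched stat exchange: one manifest out, one NAK/request list back.
# Paths and timestamps here are invented for illustration.

def build_manifest(tree):
    """Master side: tree is a dict mapping path -> modification time."""
    return sorted(tree.items())

def wanted(manifest, local):
    """Replica side: request every path that is missing or stale."""
    return [path for path, mtime in manifest
            if local.get(path, -1) < mtime]

master  = {"src/a.c": 100, "src/b.c": 200, "doc/readme": 150}
replica = {"src/a.c": 100, "src/b.c": 180}   # readme missing, b.c stale

requests = wanted(build_manifest(master), replica)
print(requests)   # these paths get queued on the bulk-transfer session
```

The returned list is exactly what would be handed to the second TCP session for bulk transfer, which can stay busy streaming file data while the next batch of timestamps is still in flight on the first.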