lauren (03/12/83)
I strongly suspect that the best place to try to improve uucp "efficiency" is in the "higher-level" file handling routines -- I recommend AGAINST trying to improve or replace the packet driver. Several points:

1) The packet driver, with its non-trivial buffering techniques, manages to allow fairly high rates of data transfer, even on multiple lines, without dragging most systems into the ground. Most other protocols with which I've experimented have turned out to have substantially lower throughput and presented a greater load (sometimes much greater) on the system. The packet driver is not the problem -- it's a nice piece of work and performs very nicely.

2) Occasionally we hear complaints about uucp not really using the full duplex capabilities of the channel. Frankly, I doubt very much that true "full duplex" file transfers would really improve the situation. Most heavy uucp traffic "tends" to be mostly in one direction (especially netnews traffic!) -- the number of cases where both sites have even approximately equal traffic *at a given time* is pretty small. The possibility of load disruptions and other factors, plus the "unidirectional" nature of most traffic, leads me to suspect that an effort toward more full duplex usage of uucp would also NOT be the best way to proceed.

3) The queue. The place where we see most of the "clogging" clearly seems to be the uucp spool directory itself. The mechanism of having three files for each message to be mailed provides good generality (i.e. mail is but a special case of a very general inter-machine mechanism) but is costly in time and space. However, we don't really want to toss out the generality either!

My suggestion would be to define a new "channel" for mail/netnews which still uses the conventional packet driver. At first, since there would be some compatibility issues, I would propose that the new channel only be used for netnews between cooperating sites.
As more sites started running the appropriate versions of uucico/uuxqt, the channel could be used for mail as well. The old channel would still exist as a fallback in all cases. The easiest way to set this up would be to define the new channel as an "alternative" to the normal "g" channel that we now use.

Perhaps calling this a new channel is a bit deceptive. What I really want to do is establish a new mail/netnews delivery scheme that uses only one or two files (instead of the current three) for delivery of a single message. Sites would indicate their ability to handle the procedure by negotiating to use the "new channel" (perhaps "m"?) instead of "g". Channel "m" would still use the ordinary packet driver and still transfer files in the conventional manner -- all the "m" indicates is that the sites have agreed to use the new mail delivery mechanism.

There are a number of ways in which the new delivery mechanism could function. One of the most obvious is to include the addressing information for the message in the body of the data file *of* the message. This information would be stripped from the message before final delivery. For example, something like:

    *TO-USER: ihnp4!vortex!lauren
    From seymour DateTime remote from foosite
    ...

In this case, the "*TO-USER" line (or whatever we'd call it) contains the addressing information which would normally be contained in the "X.foo" file for mail delivery. As I suggested above, netnews would probably be implemented first:

    *TO-PROGRAM: rnews
    From...

If a site accepted a uucico connection using the "m" protocol, it would be saying that its uuxqt was ready and willing to handle messages in this sort of format, as an alternative to the "g" format.

This new technique involves the use of two files (the C. and D. files) in the sending system's spool dir, and the transfer of one file (D.) to the remote system.
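
As a rough illustration of the in-band addressing idea, here is a minimal Python sketch of what the receiving side might do with such a data file: read the first line, decide whether it names a user or a program, and strip it before delivery. The function and type names are hypothetical, and the two keywords are simply the ones used in the examples above ("or whatever we'd call it"); nothing here is taken from actual uuxqt code.

```python
from typing import NamedTuple

# Hypothetical keywords, copied from the examples in the post.
ADDRESS_KEYWORDS = ("*TO-USER", "*TO-PROGRAM")


class Delivery(NamedTuple):
    kind: str    # "user" or "program"
    target: str  # e.g. "ihnp4!vortex!lauren" or "rnews"
    body: str    # message text with the addressing line stripped


def split_addressing(data: str) -> Delivery:
    """Split a data file's contents into its addressing line and the body."""
    first, _, rest = data.partition("\n")
    keyword, colon, value = first.partition(":")
    if not colon or keyword not in ADDRESS_KEYWORDS:
        # No in-band address: the caller would fall back to the old scheme.
        raise ValueError("no addressing line found")
    target = value.strip()
    if not target:
        raise ValueError("addressing line has no target")
    kind = "user" if keyword == "*TO-USER" else "program"
    return Delivery(kind, target, rest)
```

A file that does not start with one of the keywords raises `ValueError`, which is where a real implementation would presumably fall back to the conventional X. file handling.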
With considerably more work, it would be possible to reduce the number of files in the sending spool dir to one, but I'm not sure that this would really be worth the effort involved. To avoid confusion, it might well be advisable to store the mail data file, under the new format, as something other than a typical D. file -- perhaps M. or something similar would be suitable.

I believe that the overall changes required to implement such a scheme for netnews/mail, both in uucico and uuxqt, are actually quite small. I also suspect that a substantial increase in overall uucp efficiency might result, with minimal compatibility problems. We would still be using an efficient transfer mechanism; only the upper-level file handling/delivery mechanism would change. For netnews, message batching and other techniques could still be used to gain even more benefits.

Comments?

--Lauren--
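
The "m" versus "g" negotiation with fallback that Lauren describes can be modeled very simply. The Python sketch below assumes each side is represented by a string of one-letter channel names and that the caller has an order of preference; the function name and this string representation are illustrative assumptions, and the actual message exchange between the two uucico processes is left out.

```python
from typing import Optional


def choose_protocol(offered: str, preferred: str = "mg") -> Optional[str]:
    """Return the first letter in `preferred` that the remote site also offers.

    `offered` is the set of one-letter channels the remote claims to support;
    `preferred` lists ours, best first. None means there is no common channel.
    """
    for letter in preferred:
        if letter in offered:
            return letter
    return None
```

With the default preference, a remote offering both channels gets "m", while an older site that only offers "g" still gets "g", which is the fallback behavior proposed in the post.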
smb (03/15/83)
Lauren raises several good points. However, I don't completely agree with his proposed solutions.

First, I agree that the packet driver needs to be kept for dial-up use. The MMDF packet mechanism, though apparently simpler, gets far worse throughput and (at least in some versions) is rather susceptible to catatonia. Uucp's driver gets fairly good throughput on 1200 baud lines (though it starts falling off badly at higher speeds), and it seems to be quite reliable. The issue is different, though, if you're using an underlying transmission path that is itself flow-controlled and error-corrected, such as a TCP/IP channel. Performance improvements of at least a factor of 10 can be obtained by replacing the packet driver with some other protocol, as outlined by Lauren.

Which brings up my second point -- the alternate protocol mechanism isn't geared towards higher-level interchanges like "here's some mail"; it's intended for lower-level functions. The primitives a new protocol must provide are "open", "close", "send/receive message", and "send/receive file". Decisions like what file should be sent, and what the contents of it mean, are handled at a higher level, and are not as easily negotiated.

It is also unclear that changing file formats will really help netnews, especially for sites that feed more than one other site. The 2.10 code can make use of the '-c' option (via code changes) to uux; this causes the text of the article to be transmitted directly from the news spool area, and hence avoids the creation of the second D. file in the outbound uucp spool area. This change, plus Truscott's subdirectory mod to uucp (separate subdirectories for C., D., and X. files) should yield a large performance improvement. (To be sure, they're not the whole problem; a lot of the overhead with uucp seems to be the per-file handshaking that goes on. Much of this is directory-search time, but I don't have a feel for just how much.)
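
To make the earlier point about protocol primitives concrete, the four operations named above ("open", "close", "send/receive message", "send/receive file") could be sketched as an interface like the following. This is only an illustration in Python: the class and method names are hypothetical, and the `Loopback` class is a toy in-memory stand-in, not a model of the real packet driver.

```python
from abc import ABC, abstractmethod
from collections import deque


class Protocol(ABC):
    """The primitives a transfer protocol must provide (names are hypothetical)."""

    @abstractmethod
    def open(self) -> None: ...

    @abstractmethod
    def close(self) -> None: ...

    @abstractmethod
    def send_message(self, msg: str) -> None: ...

    @abstractmethod
    def receive_message(self) -> str: ...

    @abstractmethod
    def send_file(self, data: bytes) -> None: ...

    @abstractmethod
    def receive_file(self) -> bytes: ...


class Loopback(Protocol):
    """Toy implementation: whatever is sent can be received back, in order."""

    def __init__(self) -> None:
        self.is_open = False
        self._queue = deque()

    def open(self) -> None:
        self.is_open = True

    def close(self) -> None:
        self.is_open = False

    def send_message(self, msg: str) -> None:
        self._queue.append(msg)

    def receive_message(self) -> str:
        return self._queue.popleft()

    def send_file(self, data: bytes) -> None:
        self._queue.append(data)

    def receive_file(self) -> bytes:
        return self._queue.popleft()
```

Note that nothing in this interface knows what a file contains or why it is being sent, which is the reason "here's some mail" does not fit naturally at this layer.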
The real problem with mail transfer is that uucp is *too* general; it can't do the sorts of special-purpose mail handling that one might like. MMDF and SMTP (the ARPAnet's "simple mail transfer protocol" -- not to be confused with the message format standard), on the other hand, allow a site to validate each address individually before sending the body of the message. They also make it much easier to deal with temporary resource problems, such as no space or no i-nodes -- by the time uuxqt learns of such a problem, it's too late to reject the message with a request for retry, and it may not even be possible to mail it back. With SMTP, the sending site *knows* whether or not the next relay received it correctly. (Uucp also has problems with the stupidity of uuclean; if my mail can't be delivered, I would really like the letter back, rather than being told the uucp filenames....)

Where does this leave us? One idea is to change the C. file mechanism locally to some more efficient scheme. The outbound X. file could probably be generated dynamically, especially for the simple cases, i.e., mail and news. The same might be done for inbound X. files, though there are timing considerations to worry about -- it isn't feasible to attempt delivery immediately upon receipt of an X. file, especially if the delivery attempt involves expensive operations like alias-list expansion.

If you want to experiment with alternate protocols, use uucp (rather than uux) to create Q. files (or some such) in the receiving site's spool directory, and have some variant of uuxqt interpret them. To be sure, that can't be negotiated at transfer time, but it can easily be controlled for news hops via the 'sys' file.

Finally, let me make one appeal to anyone implementing a new queuing mechanism: implement a "requeue" counter. That is, any time a transmission fails and is requeued, a counter should be bumped.
If it reaches a certain limit, that particular job should be abandoned; otherwise, one failing job can wedge the whole queuing system. A good example is an attempt to transmit a gigantic file via uucp. Even if it's received properly -- by no means certain -- if the receiving site has to copy it to another file system from the TM. file, the sending site will time out waiting for a response. And the next time the two systems connect, the file will be sent again....

--Steve
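
The "requeue" counter Steve asks for might look roughly like the Python sketch below. It is a simplification under stated assumptions: the retries that would really happen across separate connections are collapsed into one loop, `transmit` is a hypothetical callback that returns True on success, and the limit of 3 is arbitrary since the post does not suggest a value.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

MAX_REQUEUES = 3  # hypothetical limit; the post does not name a number


def run_queue(jobs: List[str],
              transmit: Callable[[str], bool],
              max_requeues: int = MAX_REQUEUES) -> Tuple[List[str], List[str]]:
    """Process jobs in order and return (delivered, abandoned)."""
    # Each entry carries the job and the number of times it has been requeued.
    pending: Deque[Tuple[str, int]] = deque((job, 0) for job in jobs)
    delivered: List[str] = []
    abandoned: List[str] = []
    while pending:
        job, requeues = pending.popleft()
        if transmit(job):
            delivered.append(job)
        elif requeues + 1 >= max_requeues:
            # The bumped counter has reached the limit: give up on this job.
            abandoned.append(job)
        else:
            # Bump the counter and put the job back at the end of the queue.
            pending.append((job, requeues + 1))
    return delivered, abandoned
```

Because a failing job goes to the back of the queue and is dropped once its counter reaches the limit, the jobs behind it still get through, which is the "wedging" problem the counter is meant to prevent.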