kazar+@andrew.cmu.edu (Mike Kazar) (06/09/89)
[Disclaimer: I'm the AFS development manager at Transarc Corp, and thus have a serious interest in AFS.]

Things seem confusing because Melinda Shore and Bill Sommerfeld are comparing two *very* different versions of AFS. Mt. Xinu's version is a very old system sent unofficially to Mt. Xinu by the CMU CS department, while MIT's Project Athena is running the latest and greatest version of AFS direct from us. Here are some of their differences:

1. The old cache manager (the client-end code) is implemented as a user-level process with limited internal concurrency. It has 2 or 3 light-weight threads to handle user-generated file system requests. If these threads are busy, the user waits. If the Unix process implementing the threads is busy, the user waits. In the new implementation, AFS is just another (Sun) virtual file system, sitting in the kernel. There are essentially no external concurrency constraints imposed. Trivial operations (such as stat of a cached file) run more than 10 times faster, since a procedure call to the VFS "afs_getattr" function is a lot faster than sending an IPC message to a user-level process and waiting for a reply.

2. The old system uses an inferior RPC package, with a separate file transfer protocol invoked to transfer data blocks larger than a packet. The new RPC integrates these two types of operations, cutting down considerably (by a factor of 2.5 or so) on the actual number of packets sent in the most common cases. In addition, we've learned a lot more about file transfers, and have stolen some ideas from people who've done TCP/IP improvements. Thus our new RPC runs faster than the old file transfer protocol, on top of the reduction in overhead from having just one protocol.

3. The old system is an administrative nightmare. All sorts of ad hoc databases had to be maintained via text editors, and then run through obscure programs to convert them to internal forms.
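The performance argument in point 1 can be sketched in a toy model (this is illustrative only, not AFS source; the paths, attribute fields, and queue-based "IPC" are invented): a stat of a cached file in the new design is a direct procedure call into the in-kernel VFS entry point, while the old design had to send a message to a user-level process and wait for one of its few threads to reply.

```python
import queue
import threading

# Hypothetical attribute cache; path and field names are invented.
cached_attrs = {"/afs/readme": {"size": 1024, "mode": 0o644}}

# New design: the cache manager is a VFS in the kernel, so a stat of a
# cached file is just a procedure call into afs_getattr.
def afs_getattr(path):
    return cached_attrs[path]

# Old design: the cache manager is a user-level process; every request
# becomes an IPC message handled by a small pool of worker threads.
requests, replies = queue.Queue(), queue.Queue()

def cache_manager():
    while True:
        path = requests.get()
        if path is None:          # shutdown sentinel
            break
        replies.put(cached_attrs[path])

threading.Thread(target=cache_manager, daemon=True).start()

def old_getattr(path):
    requests.put(path)            # send the IPC message ...
    return replies.get()          # ... and block until the reply arrives

# Both paths return the same answer; the direct call simply skips the
# queueing and context-switch overhead, which is the claimed >10x win.
assert afs_getattr("/afs/readme") == old_getattr("/afs/readme")
requests.put(None)                # stop the worker thread
```

The same answer comes back either way; the difference is purely the round trip through another process's scheduler, which dominates the cost of a trivial operation.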
These binary databases would then propagate out to the other file servers over a period of tens of minutes, during which time things often looked a little inconsistent. In the new system, these databases are implemented by replicated transactional database servers. System administrators update these databases by issuing commands from their own workstations that make authenticated RPCs to the database servers.

4. The old system transferred entire files, and made the reader wait until the entire file had been received before letting the user see *any* of the data. In the new system, files are transferred in 64 Kbyte chunks, and the user process can read the data as soon as the appropriate portion of the chunk has been received at the workstation.

So, here's my interpretation of the differences between M & B's posts:

> 1) The [protection] semantics really are different from Unix
>    filesystem semantics.
>
>The use of access control lists is necessary in large-scale
>environments.

Not a rebuttal, true, but certainly a justification for doing something incompatible with Unix protection given a large environment. Since virtually no Unix programs know anything about ACLs, we had to do some pretty odd things, as compared with straight Unix protection, to get a protection scheme that is reasonably powerful and simple enough that even novices can use it without being surprised.

> 2) Directories, which you and I consider to be files, aren't treated as
>    files by AFS. *No* caching, which means that you can ls until the
>    cows come home but the 80th time is not going to be any faster than
>    the first.
>
>Please check your facts; last I looked, they're cached just like files.
>A significant part of the hair in the AFS client is involved with
>keeping the local copy of a directory in synch with the master copy
>when directory operations are done.

All versions of AFS have always cached directories.
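The chunking scheme in point 4 can be sketched as a toy model (class and field names are invented for illustration; this is not the AFS cache manager): the client fetches 64 Kbyte chunks on demand, so a read of the first few bytes of a large file pulls only one chunk over the wire instead of the whole file.

```python
CHUNK = 64 * 1024  # AFS transfers files in 64 Kbyte chunks

class ChunkedFile:
    """Toy model of a cache that fetches one chunk at a time on demand."""

    def __init__(self, remote_bytes):
        self.remote = remote_bytes    # stand-in for the file server copy
        self.cached = {}              # chunk index -> bytes on local disk
        self.fetches = 0              # chunks pulled over the wire

    def read(self, offset, length):
        out = bytearray()
        while length > 0:
            idx = offset // CHUNK
            if idx not in self.cached:          # fetch only what we need
                start = idx * CHUNK
                self.cached[idx] = self.remote[start:start + CHUNK]
                self.fetches += 1
            within = offset - idx * CHUNK
            piece = self.cached[idx][within:within + length]
            if not piece:                       # past end of file
                break
            out += piece
            offset += len(piece)
            length -= len(piece)
        return bytes(out)

f = ChunkedFile(b"x" * (10 * CHUNK))   # a 640 KB file on the server
head = f.read(0, 100)                  # reading the head of the file ...
assert f.fetches == 1                  # ... pulls one 64 KB chunk, not ten
```

The old system, by contrast, would have transferred all ten chunks and made the reader wait for the last one before returning the first byte.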
AFS does name to low-level ID translation on the workstation; there's no way we could fail to cache directories in any of our releases. Even a bug of this magnitude would be instantly visible.

> 3) Performance. The whole file is copied over at access time, which
>    speeds up future file accesses but can turn "grep string *" into a
>    fairly unpleasant experience.
>
>Yes, but the user process doing the "grep" sees the bits as soon as
>they're available, and doesn't have to wait for them to be written to
>the cache.

It should be clear from the above that the system Shore is using would perform much worse on a 'grep string *' than the current system.

> 4) Disk usage. Because entire files are copied over it can be something
>    of a disk burner.
>
>True; you want a large enough cache that the "working set" of files
>you normally touch in the period of an hour or two fits in its
>entirety; for normal users, 10MB is probably enough, while for "power
>users" doing kernel builds, 30MB+ is more like it.

[Barry quoted Bill out of context, eliminating the last three lines of Bill's reply; I've restored the rest -MLK]

10 MB isn't, in my opinion, a disk burner, when you realize that you don't have to keep virtually anything on your workstation disk aside from the cache and swap space. For example, most developers of large systems have hundreds of megabytes of local disk storage so that they can work effectively. I've worked for years with only a 40 megabyte disk quite comfortably, and only switched to a 70 when I upgraded to a new machine that didn't come with such small disks! If you're only a casual user of AFS (rather than someone getting 99% of all your files from AFS), you can get by with a couple of megabytes of cache.

> 5) Administration is somewhat (!) complex.
>
>Agreed, but managing 10 AFS servers is only slightly harder than

They're also talking about totally different administrative procedures here, folks. Shore's system is *very* painful to administer.
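The 'grep string *' point can be made concrete with back-of-the-envelope arithmetic (the file sizes below are invented for illustration, and the model deliberately ignores packet sizes and I/O overlap): in the old whole-file system, the grep process sees no data from a file until every byte of it has arrived; in the chunked system, it is unblocked after at most one 64 KB chunk per file.

```python
CHUNK_KB = 64

def blocking_delay_kb(file_kb, chunked):
    """KB that must arrive before the reading process sees byte 0 of a file.
    Old AFS: the whole file.  New AFS: at most the first 64 KB chunk
    (actually less, since the reader runs as soon as the needed portion
    of the chunk has arrived).  Toy model with invented numbers."""
    return min(file_kb, CHUNK_KB) if chunked else file_kb

files = [512, 2048, 64, 8192]          # hypothetical sizes (KB) grep scans
old = sum(blocking_delay_kb(f, chunked=False) for f in files)
new = sum(blocking_delay_kb(f, chunked=True) for f in files)
assert new < old    # chunking sharply cuts the time spent blocked
```

With these made-up sizes the old system blocks the reader on about 10.8 MB of transfer before any matching can even start, versus 256 KB for the chunked system, which is why a whole-file grep on the old system is such an unpleasant experience.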
Sommerfeld's is an order of magnitude easier to deal with.

So, in short, these folks really are comparing completely different systems, and that's the main reason for the confusion.
wesommer@athena.mit.edu (Bill Sommerfeld) (06/10/89)
Thanks, Mike, for clearing up some of the confusion. I was not aware that the version Melinda was using was such an old version... I attempted to bring that version up (or one even older than it) about two or three years ago, and pretty much gave up in disgust. The current release from CMU/ITC is in much better shape.

I'd like to apologise to Melinda for making that mistake.

- Bill
--