mogul@decwrl.dec.com (Jeffrey Mogul) (05/09/90)
In article <SAUS.90Apr30131509@media-lab.media.mit.edu> saus@media-lab.media.mit.edu (Mark Sausville) writes:
>Has anyone else noticed that update(8) gets what seems like an
>inordinately large amount of time?
>
>Does anyone know some sort of connection between the buffer cache
>size and the time that update gets?

As you know, the only thing that update does is to issue the sync()
system call once every 30 seconds.  One would think this would run
quickly, since all sync() is supposed to do is force dirty blocks to
the disk.

Since Ultrix is basically 4.xBSD, it uses the same algorithm for sync
that has been used for a long time ... but that algorithm was written
when nobody had more than a few hundred disk buffers, and on a large
system nowadays it's not impossible to have several thousand.  So, it
wasn't until recently that anyone cared that the algorithm is
O(nbuf * n_dirty_buffers * n_filesystems), which in the worst case is
O((nbuf^2) * n_filesystems).  The code looks something like this:

	foreach i (every file system) {
	loop:
		for (every buffer in the system [dirty or not]) {
			if (this buffer lives on filesystem i) {
				if (buffer is dirty) {
					put it on disk driver queue
					goto loop;
				}
			}
		}
	}

Needless to say, this is not the most efficient way to find all the
dirty buffers in the system.

I am pretty sure that this is fixed in Ultrix 4.0 (although I don't
have a copy of the 4.0 sources to check this against).  Meanwhile, I
suppose you will have to experiment with a value for "nbuf" that
trades this problem off against the cost of having too few disk
buffers.

By the way, I discovered this problem while bringing up a kernel on an
experimental system that (1) had 128 Mbytes of memory and (2) ran in
kernel mode with interrupts turned off.  So, not only would this
fiendish loop consume lots of CPU time, but once every 30 seconds the
system would simply stop doing anything for a large fraction of a
second.

-Jeff
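
P.S.  For anyone who wants a feel for how bad the restart-from-the-top
scan gets, here is a little user-space model of it.  This is strictly
an illustration, not the actual Ultrix kernel code; struct buf,
old_sync(), and one_pass_sync() and all the numbers are invented for
the sketch.  The first routine rescans the whole buffer list from the
beginning every time it queues a dirty buffer, the way the old sync
code does; the second makes a single pass.

#include <stdio.h>
#include <stdlib.h>

struct buf {
	int fs;		/* filesystem this buffer belongs to */
	int dirty;	/* 1 if the buffer holds modified data */
};

/*
 * Model of the old algorithm: for each filesystem, restart the scan
 * of the entire buffer list from the top after each dirty buffer is
 * queued for writing.
 */
static long
old_sync(struct buf *bp, int nbuf, int nfs)
{
	long examined = 0;

	for (int fs = 0; fs < nfs; fs++) {
	restart:
		for (int i = 0; i < nbuf; i++) {
			examined++;
			if (bp[i].fs == fs && bp[i].dirty) {
				bp[i].dirty = 0;	/* "queue" the write */
				goto restart;		/* start over at buffer 0 */
			}
		}
	}
	return examined;
}

/*
 * The obvious alternative: one pass over the buffer list, queueing
 * every dirty buffer as it is found.
 */
static long
one_pass_sync(struct buf *bp, int nbuf)
{
	long examined = 0;

	for (int i = 0; i < nbuf; i++) {
		examined++;
		if (bp[i].dirty)
			bp[i].dirty = 0;
	}
	return examined;
}

int
main(void)
{
	enum { NBUF = 4000, NFS = 8 };
	struct buf *a = malloc(NBUF * sizeof *a);
	struct buf *b = malloc(NBUF * sizeof *b);

	if (a == NULL || b == NULL)
		return 1;

	for (int i = 0; i < NBUF; i++) {
		a[i].fs = b[i].fs = i % NFS;
		a[i].dirty = b[i].dirty = (i % 3 == 0);	/* ~1/3 dirty */
	}

	printf("old algorithm examined %ld buffer headers\n",
	    old_sync(a, NBUF, NFS));
	printf("one-pass scan examined %ld buffer headers\n",
	    one_pass_sync(b, NBUF));

	free(a);
	free(b);
	return 0;
}

With 4000 buffers, a third of them dirty, and 8 filesystems, the
restart-from-the-top version ends up examining something on the order
of a few million buffer headers per sync, while the single pass looks
at each of the 4000 exactly once.  That difference is roughly what you
see as update's CPU time on a machine with a big buffer cache.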