chris@mimsy.umd.edu (Chris Torek) (05/04/90)
In article <12578@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
>Interestingly, this aspect of the copy program [reading and writing very
>large blocks] is one place where I think DOS is sometimes faster than
>UNIX.  I suspect that many UNIX versions of 'cp' use block-sized buffers.
>Doing so makes overly pessimistic assumptions about the amount of
>physical memory you're likely to get.

None of the newsgroups to which this is posted are particularly suited
to discussions about O/S level optimisation of file I/O, but I feel
compelled to point out that `big gulp' style copying is not always, and
indeed not often, the best way to go about things.

The optimal point is often not `read the whole file into memory, then
write it out of memory', because this requires waiting for the entire
file to come in before figuring out where to put the new blocks for the
output file.  It is better to get computation done while waiting for the
disk to transfer data, whenever this can be done without `getting
behind'.  Unix systems use write-behind (also known as delayed write)
schemes to help out here; writers need use only block-sized buffers to
avoid user-to-kernel copy inefficiencies.

As far as comp.lang.c goes, the best one can do here is call fread() and
fwrite() with fairly large buffers, since standard C provides nothing
more `primitive' or `low-level', nor does it give the programmer a way
to find a good buffer size.  Better stdio implementations will do well
with large fwrite()s, although there may be no way for them to avoid
memory-to-memory copies on fread().  A useful fwrite() implementation
trick goes about like this:

    set resid = number of bytes to write;
    set p = base of bytes to write;
    while (resid) {
        if (there is stuff in the output buffer ||
            resid < output_buffer_size) {
            n = MIN(resid, space_in_output_buffer);
            move n bytes from p to buffer;
            p += n;
            resid -= n;
            if (buffer is full)
                if (fflush(output_file))
                    goto error;
        } else {
 -->        write output_buffer_size bytes directly;
            if this fails, goto error;
            p += n_written;
            resid -= n_written;
        }
    }

The `trick' is in the line marked with the arrow -->: there is no need
to copy bytes into an internal buffer just to write them, at least in
most systems.  (Some O/Ses may `revoke' access to pages that are being
written to external files.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@cs.umd.edu    Path: uunet!mimsy!chris
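To make the fread()/fwrite() suggestion above concrete: a minimal,
portable ANSI C copy loop along these lines might look as follows.  The
32K buffer size and the name copyfile are arbitrary choices made for
this sketch, not anything standard C (or the article above) prescribes.

    #include <stdio.h>
    #include <stdlib.h>

    #define COPYBUF 32768           /* "fairly large"; an arbitrary pick */

    /* Copy stream in to stream out; 0 on success, -1 on error. */
    int copyfile(FILE *in, FILE *out)
    {
        char *buf = malloc(COPYBUF);
        size_t n;

        if (buf == NULL)
            return -1;
        while ((n = fread(buf, 1, COPYBUF, in)) > 0) {
            if (fwrite(buf, 1, n, out) != n) {
                free(buf);
                return -1;          /* write error */
            }
        }
        free(buf);
        return ferror(in) ? -1 : 0; /* distinguish read error from EOF */
    }

Whether the large fwrite() actually avoids a second memory-to-memory
copy depends on the stdio implementation, as described above.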
jhallen@wpi.wpi.edu (Joseph H Allen) (05/04/90)
In article <24164@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>In article <12578@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
>>Interestingly, this aspect of the copy program [reading and writing very
>>large blocks] is one place where I think DOS is sometimes faster than
>>UNIX.  I suspect that many UNIX versions of 'cp' use block-sized buffers.
>>Doing so makes overly pessimistic assumptions about the amount of
>>physical memory you're likely to get.
>
>The optimal point is often not `read the whole file into memory, then
>write it out of memory', because this requires waiting for the entire
>file to come in before figuring out where to put the new blocks for the
>output file.  It is better to get computation done while waiting for the
>disk to transfer data, whenever this can be done without `getting
>behind'.  Unix systems use write-behind (also known as delayed write)
>schemes to help out here; writers need use only block-sized buffers to
>avoid user-to-kernel copy inefficiencies.

On big, loaded systems this is certainly true, since you want full use
of 'elevator' disk optimizing between multiple users.  This should be
the normal mode of operation.

The problem with this on smaller UNIX systems is that the disk
interleave will be missed unless there is very intelligent read-ahead.
If you're lucky enough to have all your memory paged in, one read call
may, if the system is designed right, read in contiguous sets of blocks
without missing the interleave.  For things like backups you usually
want to tweak it a bit, since this operation is slow and can usually be
done when no one else is on the system.

Also, for copying to tapes and raw disks, 'cp' is usually very bad.  I
think dd can be used to transfer large sets of blocks.  On one system I
know of, if you 'cp' between two raw floppy devices, the floppy lights
will blink on and off for each sector.  Also, you have to be careful
about what is buffered and what isn't, and what happens when you mix
the two.
-- 
jhallen@wpi.wpi.edu (130.215.24.1)
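For the raw-device case, most of dd's advantage is simply its large
block size: one read() and one write() per multi-sector chunk instead
of cp's block-at-a-time behaviour.  A rough sketch, assuming a
Unix-style descriptor interface; the 16K figure and the name blockcopy
are arbitrary choices for this sketch.

    #include <sys/types.h>
    #include <unistd.h>

    /* Copy between two already-open descriptors (e.g. raw devices) in
     * large chunks, roughly what dd with a large bs= does. */
    int blockcopy(int infd, int outfd)
    {
        char buf[16 * 1024];
        ssize_t n;

        while ((n = read(infd, buf, sizeof buf)) > 0)
            if (write(outfd, buf, (size_t)n) != n)
                return -1;          /* write error or short write */
        return n < 0 ? -1 : 0;      /* n < 0 means a read error */
    }

On a real raw device you would match the buffer to the device's
geometry rather than using the arbitrary 16K shown here.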
scs@athena.mit.edu (Steve Summit) (05/05/90)
In article <24164@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>In article <12578@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
>>Interestingly, this aspect of the copy program [reading and writing very
>>large blocks] is one place where I think DOS is sometimes faster than
>>UNIX.  I suspect that many UNIX versions of 'cp' use block-sized buffers.
>>Doing so makes overly pessimistic assumptions about the amount of
>>physical memory you're likely to get.
>
>...`big gulp' style copying is not always, and
>indeed not often, the best way to go about things...  Unix systems
>use write-behind (also known as delayed write) schemes to help out here;
>writers need use only block-sized buffers to avoid user-to-kernel copy
>inefficiencies.

Indeed.  The DOS implementation of cp is only apparently "better"
because it is doing something explicitly which the Unix program has no
need for.  Unaided DOS has no write-behind or read-ahead (and very
little caching), and programs that do 512-byte or 1K reads and writes
(including, tragically, most programs using stdio) run abysmally
slowly.

Using stdio is supposed to be the "right" thing to do; the stdio
implementation should worry about things like correct block sizes,
leaving these unnecessary system-related details out of application
programs.  (Indeed, BSD stdio does an fstat to pick a buffer size
matching the block size of the underlying filesystem.)  If a measly
little not-really-an-operating-system like DOS must be used at all, a
better place to patch over its miserably simpleminded I/O
"architecture" would be inside stdio, which (on DOS) should use large
buffers (up around 10K) if it can get them, certainly not 512 bytes or
1K.  Otherwise, every program (not just backup or cp) potentially needs
to be making explicit, system-dependent blocksize choices.  cat, grep,
wc, cmp, sum, strings, compress, etc., etc., etc. all want to be able
to read large files fast.  (The versions of these programs that I have
for DOS all run unnecessarily slowly, because the stdio package they
are written in terms of is doing pokey little 512-byte reads.  I refuse
to sully all of those programs with explicit blocksize notions.  Sooner
or later I have to stop using the vendor's stdio implementation and
start using my own, so I can have it do I/O in bigger chunks.)

In article <12642@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
>Also, for copying to tapes and raw disks, 'cp' is usually very bad.  I think
>dd can be used to transfer large sets of blocks.  On one system I know of, if
>you 'cp' between two raw floppy devices, the floppy lights will blink on and
>off for each sector.

A certain amount of "flip-flopping" like this is inevitable under
vanilla Unix, at least when using raw devices, since there is no notion
of asynchronous I/O: the system call for reading is active until it
completes, during which time the write call is inactive, and the reader
is similarly idle while writing is going on.

Graham Ross once proposed a clever "double-buffered" device copy
program which forked, resulting in two processes, sharing input and
output file descriptors, and synchronized through a semaphore so that
one was always actively reading while the other one was writing.  (This
trick is analogous to the fork cu used to do to have two non-blocking
reads pending.)  It was amazing to watch a tape-to-tape copy via this
program between high-throughput, 6250 bpi tape drives: both tapes would
spin continuously, without pausing.
(cp or dd under the same circumstances resulted in block-at-a-time
pauses while the writing drive waited for the reader and vice versa.)

                                            Steve Summit
                                            scs@adam.mit.edu
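Graham Ross's program itself is not reproduced here, but a minimal
sketch of the same double-buffering idea for a present-day POSIX-style
system might look roughly like the following.  A pair of pipes stands
in for his semaphore as the turn-taking mechanism, and all of the names
(copier, get_token, put_token) and the 32K block size are inventions of
this sketch, not his.

    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define BLKSIZE (32 * 1024)     /* arbitrary; match the device block size */

    /* Wait for our turn; returns 0 if the peer has exited (pipe EOF). */
    static int get_token(int fd)
    {
        char c;
        return read(fd, &c, 1) == 1;
    }

    /* Hand the turn to the peer; an error just means the peer is gone. */
    static void put_token(int fd)
    {
        char c = 't';
        (void) write(fd, &c, 1);
    }

    /* One of the two copiers.  Both share infd and outfd (and hence
     * their file offsets); the tokens force reads and writes each into
     * strict alternation, so while one process writes block N the other
     * is already reading block N+1.  "first" marks the process that
     * starts out owning both turns. */
    static void copier(int infd, int outfd, int tok_in, int tok_out, int first)
    {
        char buf[BLKSIZE];
        ssize_t n;

        for (;;) {
            if (!first && !get_token(tok_in))   /* wait for read turn */
                return;
            n = read(infd, buf, sizeof buf);
            if (n <= 0)
                return;             /* EOF or error; exiting wakes the peer */
            put_token(tok_out);     /* peer may start reading the next block */
            if (!first && !get_token(tok_in))   /* wait for write turn */
                return;
            first = 0;
            if (write(outfd, buf, (size_t)n) != n) {
                perror("write");
                return;
            }
            put_token(tok_out);     /* peer may now write the block it read */
        }
    }

    int main(int argc, char **argv)
    {
        int infd, outfd, p2c[2], c2p[2];
        pid_t pid;

        if (argc != 3) {
            fprintf(stderr, "usage: %s infile outfile\n", argv[0]);
            return 1;
        }
        infd = open(argv[1], O_RDONLY);
        outfd = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (infd < 0 || outfd < 0 || pipe(p2c) < 0 || pipe(c2p) < 0) {
            perror("setup");
            return 1;
        }
        signal(SIGPIPE, SIG_IGN);   /* a vanished peer should not kill us */

        pid = fork();
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {             /* child: waits for its first turn */
            close(p2c[1]);
            close(c2p[0]);
            copier(infd, outfd, p2c[0], c2p[1], 0);
            _exit(0);
        }
        close(p2c[0]);              /* parent: starts out owning both turns */
        close(c2p[1]);
        copier(infd, outfd, c2p[0], p2c[1], 1);
        close(p2c[1]);              /* wake the child if it is still waiting */
        wait(NULL);
        return 0;
    }

The token hand-offs keep the reads (and, separately, the writes) in
strict alternation on the shared descriptors, so block order is
preserved while one process writes block N and the other is already
reading block N+1 -- the same overlap that kept both tape drives
spinning.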
ralerche@lindy.Stanford.EDU (Robert A. Lerche) (05/06/90)
In Microsoft C (and some others), one can use "setvbuf" to attach a large I/O buffer to a stdio-package file. It's sensible to wish for the operating system to do this itself, but in the DOS world, given the 640K memory limit, it's not totally unreasonable to place the memory allocation burden on the application program (since it probably has to worry about tight memory). Using "setvbuf" makes a biiiiiig difference in file I/O performance.
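A minimal sketch of that approach, usable with any ANSI C library that
provides setvbuf(); the helper name open_buffered and the exact 10K
size are arbitrary choices for this sketch.

    #include <stdio.h>

    #define IOBUFSIZE 10240         /* roughly the 10K suggested above */

    /* fopen() plus a bigger-than-default, fully buffered stdio buffer. */
    FILE *open_buffered(const char *name, const char *mode)
    {
        FILE *fp = fopen(name, mode);

        if (fp != NULL)
            /* A NULL buffer lets the library allocate the space itself.
             * setvbuf() must precede any other operation on the stream;
             * if it fails, the default buffer is simply kept. */
            (void) setvbuf(fp, NULL, _IOFBF, IOBUFSIZE);
        return fp;
    }

The underlying DOS reads and writes on the returned stream then move
data in roughly 10K chunks instead of the library's small default.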