cs00chs@unccvax.UUCP (charles spell) (09/01/89)
Does the kernal optimize seeks within an open file?

Eg. if you have a file descriptor that is currently at offset 500,000 of a
1,000,000 byte file, which would be faster (to get to byte 500,001)?:

    lseek(fd, 1L, 1);    -OR-    lseek(fd, 500001L, 0);

with file descriptors:

    fseek(fp, 1L, 1);    -OR-    fseek(fp, 500001L, 0);

_____________________________________________________________________________
Clemson IPTAY: It's Probation Time Again Y'all....
cpcahil@virtech.UUCP (Conor P. Cahill) (09/02/89)
In article <1631@unccvax.UUCP>, cs00chs@unccvax.UUCP (charles spell) writes:
> Does the kernal optimize seeks within an open file?

There is not much to optimize, because the seek operation is one of the
simplest system calls in the kernel, in both overhead and implementation.
It simply sets the file offset to the new value. The difference between
adding one to the current value and storing a new value would be
unmeasurable (especially when compared to the overhead of the context
switch into kernel mode to perform the system call).

I have worked on systems that had special system calls to perform an lseek
and a read/write in a single system call. The addition of these system
calls had a significant (positive) impact on the performance of database
software, which routinely performs an lseek with just about every
read/write.

> if you have a file descriptor that is currently at offset 500,000 of a
> 1,000,000 byte file, which would be faster (to get to byte 500,001)?:
>
> lseek(fd, 1L, 1); -OR- lseek(fd, 500001L, 0);

See above.

> with file descriptors:
> fseek(fp, 1L, 1); -OR- fseek(fp, 500001L, 0);

For the file POINTERS (not descriptors), I'm not sure whether there are any
local (stdio) operations associated with discarding the current buffer and
getting a new one. My *guess* would be that there is no measurable
difference, but that is only an uneducated guess.

--
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil      703-430-9247           |
| Virtual Technologies Inc., P. O. Box 876, Sterling, VA 22170          |
+-----------------------------------------------------------------------+
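The combined seek-and-read call described above is not named in the post. On POSIX systems, pread() (with pwrite() for writing) plays a comparable role, so the following is only a sketch under that assumption. The helper names read_at_two_calls and read_at_one_call are made up for illustration.

```c
#include <sys/types.h>
#include <unistd.h>

/* Two system calls: move the descriptor's file offset, then read from it. */
ssize_t read_at_two_calls(int fd, void *buf, size_t len, off_t pos)
{
    if (lseek(fd, pos, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, len);
}

/* One system call: pread() takes the position as an argument and
 * leaves the descriptor's file offset unchanged. */
ssize_t read_at_one_call(int fd, void *buf, size_t len, off_t pos)
{
    return pread(fd, buf, len, pos);
}
```

Both helpers return the same bytes for the same position. The single-call form saves one trip into the kernel per access, which is the kind of saving the database case would benefit from.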
chris@mimsy.UUCP (Chris Torek) (09/02/89)
In article <1631@unccvax.UUCP> cs00chs@unccvax.UUCP (charles spell) writes:
>Does the kernal optimize seeks within an open file?

This question is basically meaningless, because the kernel (note spelling)
code for lseek---minus error checks, and with names expanded---is:

	fp = this_process.open_files[file_descriptor];
	switch (whence) {
	case 0: fp->f_offset = offset; break;
	case 1: fp->f_offset += offset; break;
	case 2: fp->f_offset = fp->f_inode->i_file_size + offset; break;
	}
	return;

Offsets from the end of the file are a tiny bit slower than other offsets
due to the extra indirection required to get the file size. If a system
call requires 100 machine instructions (this estimate is probably a bit
low), case 2 might be 1% slower.

>[to go from byte 500000 to byte 500001] with file descriptors:
>fseek(fp, 1L, 1); -OR- fseek(fp, 500001L, 0);

Presumably you mean `with stdio'. In general, existing stdio
implementations are better with offsets from 0 than with offsets from
`current point' or `end of file', so the latter call (the absolute seek)
would be faster. But `(void) getc(fp)' would be faster still.

Stdio has to make two lseek calls per fseek, in the most general case,
since it needs to first discover where it is (consider, e.g.,
`prog >> output', which might be at byte 5131 when it begins).
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris
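The three stdio ways of moving forward one byte discussed above can be put side by side. This is a minimal sketch that uses only standard C calls (fseek, ftell, getc). The helper names are invented for illustration, and the sketch shows only that all three land on the same position, not how fast each one is.

```c
#include <stdio.h>

/* Each helper assumes fp is a binary stream open for reading and returns
 * the resulting position from ftell(), or -1L on failure. */

/* Relative seek: one byte forward from the current position. */
long advance_relative(FILE *fp)
{
    if (fseek(fp, 1L, SEEK_CUR) != 0)
        return -1L;
    return ftell(fp);
}

/* Absolute seek: go straight to the target offset from the start. */
long advance_absolute(FILE *fp, long target)
{
    if (fseek(fp, target, SEEK_SET) != 0)
        return -1L;
    return ftell(fp);
}

/* Read and discard one byte, normally served from the stdio buffer. */
long advance_getc(FILE *fp)
{
    if (getc(fp) == EOF)
        return -1L;
    return ftell(fp);
}
```

Starting from the same offset, all three report the same new position. Which one is cheapest depends on the stdio implementation, as the post says.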
gwyn@smoke.BRL.MIL (Doug Gwyn) (09/03/89)
In article <1631@unccvax.UUCP> cs00chs@unccvax.UUCP (charles spell) writes:
>lseek(fd, 1L, 1); -OR- lseek(fd, 500001L, 0);

The kernel does essentially the same operation in both cases, the only
difference being a minuscule amount of extra arithmetic in the
relative-seek case.

Why are you worrying about such things, anyway?
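To see that the two calls end up in the same place, a small check along the following lines can be run on any POSIX system. It is only a sketch: the helper names step_relative and step_absolute are invented here, and SEEK_SET and SEEK_CUR are the symbolic names for the whence values 0 and 1 used in the question.

```c
#include <sys/types.h>
#include <unistd.h>

/* Position fd at `start`, then step one byte forward with a relative seek.
 * Returns the new offset, or (off_t)-1 on error. */
off_t step_relative(int fd, off_t start)
{
    if (lseek(fd, start, SEEK_SET) == (off_t)-1)
        return (off_t)-1;
    return lseek(fd, 1, SEEK_CUR);
}

/* Position fd at `start`, then go to start + 1 with an absolute seek.
 * Returns the new offset, or (off_t)-1 on error. */
off_t step_absolute(int fd, off_t start)
{
    if (lseek(fd, start, SEEK_SET) == (off_t)-1)
        return (off_t)-1;
    return lseek(fd, start + 1, SEEK_SET);
}
```

For a regular file, both helpers return start + 1. Neither call touches the disk, since lseek only updates the offset stored for the open file.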
jjb@sequent.UUCP (Jeff Berkowitz) (09/03/89)
In reference to the question about lseek (..., L_SET), Chris Torek writes:
>
>This question is basically meaningless, because the kernel
>code for lseek---minus error checks, and with names expanded---is:
>
>	switch (whence) {
>	case 0: fp->f_offset = offset; break;
>	case 1: fp->f_offset += offset; break;
>	case 2: fp->f_offset = fp->f_inode->i_file_size + offset; break;
>	}
>	return;

This is true for 4.3BSD, but slightly misleading for systems that include
NFS. The difference is only in the L_XTND code ("case 2:" in the example).

The reference to the file size - "fp->f_inode->i_file_size" - requires a
VOP_GETATTR() call into the underlying virtual file system code on systems
which include NFS. If the underlying file type is ufs (local disk), the
VOP_GETATTR() call will be reasonably inexpensive (although it will cost a
bit more than the two pointer references in the example).

If the underlying file is being served from another machine, though, the
VOP_GETATTR() call may require an RPC to the file server. This will cost
much more than L_SET or L_INCR. (Caching by the NFS implementation may
eliminate some of the RPCs, but can't eliminate all of them.)
--
Jeff Berkowitz N6QOM			uunet!sequent!jjb
Sequent Computer Systems		Custom Systems Group
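The dispatch described above can be modelled in user space. The sketch below is a toy, not kernel code: every type and function name in it (toy_vnode, toy_lseek and so on) is hypothetical, the "remote" getattr only counts calls where a real NFS client might issue an RPC, and the end-relative case follows the usual lseek convention of adding the offset to the file size.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for kernel structures. */
struct toy_vnode;

struct toy_vnode_ops {
    /* Returns 0 and stores the file size, or nonzero on error. */
    int (*getattr_size)(struct toy_vnode *vp, long *size_out);
};

struct toy_vnode {
    const struct toy_vnode_ops *ops;
    long size;          /* file size known to this toy model */
    int  remote_calls;  /* counts simulated round trips to a server */
};

struct toy_file {
    struct toy_vnode *vnode;
    long offset;
};

/* Local file system: the size is already in memory. */
static int local_getattr_size(struct toy_vnode *vp, long *size_out)
{
    *size_out = vp->size;
    return 0;
}

/* Remote file system: the counter stands in for an RPC to the server. */
static int remote_getattr_size(struct toy_vnode *vp, long *size_out)
{
    vp->remote_calls++;
    *size_out = vp->size;
    return 0;
}

const struct toy_vnode_ops toy_local_ops  = { local_getattr_size };
const struct toy_vnode_ops toy_remote_ops = { remote_getattr_size };

/* whence: 0 = from start, 1 = from current offset, 2 = from end of file. */
int toy_lseek(struct toy_file *fp, long offset, int whence)
{
    long size;

    switch (whence) {
    case 0:
        fp->offset = offset;
        break;
    case 1:
        fp->offset += offset;
        break;
    case 2:
        /* Only this case has to ask the file system for attributes. */
        if (fp->vnode->ops->getattr_size(fp->vnode, &size) != 0)
            return -1;
        fp->offset = size + offset;
        break;
    default:
        return -1;
    }
    return 0;
}
```

In this model, seeks from the start or from the current offset never call into the file system layer, while each seek from the end does so once. A real implementation may avoid some of those calls through attribute caching, as noted above.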