[comp.unix.wizards] Stupid behaviour in fseek

tbray@watsol.UUCP (07/02/87)

fseek() exhibits behaviour which seems to defeat the purpose of stdio.
Briefly, it does not notice that small relative fseek() calls may leave the 
pointer within the current buffer and not require lseek() calls.  E.g.:
/* reverse stdin to stdout */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
main()
{
  register int size;           
  struct stat s;                
  fstat(0, &s); size = s.st_size;
--------------------------OR----------------------OR------------------------
fseek(stdin, size - 1, 0);|                       |size = 0;
while(size--) {           |while(size--) {        |while(size++<s.st_size) {
  putchar(getchar());     | fseek(stdin, size, 0);|  fseek(stdin, -size, 2);
  fseek(stdin, -2, 1);    | putchar(getchar());   |  putchar(getchar());
----------------------------------------------------------------------------
  }
}
Profiling, input a 32K file, output /dev/null, reveals:
Calls to:  left version   middle version          right version
  fseek    32769          32768                   32768
  lseek    16386          49152 (1.5 * 32768?!)   32768
  read     16386          16385 (?)               32768
Results were the same under 4.3bsd and SUN (whatever's current on a 3/160).

Now, the last time I looked at stdio source (on a V6 system), _filbuf would
read() BUFSIZ bytes block-aligned around the file pointer.  If this is still 
true, then a lot of small backward fseeks should merely require updating 
some _iob pointers, nein?  
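
A minimal sketch of the behaviour being asked for, assuming a hypothetical reader (struct rbuf, rb_seek(), rb_getc() and RB_SIZE are made-up names for illustration, not the stdio internals or the _iob layout). A seek whose target is already buffered only moves an index; lseek() and read() are issued only when the target falls in a different block-aligned buffer. The nseek/nread counters are there so the system calls can be tallied the same way as in the profile above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define RB_SIZE 1024            /* hypothetical buffer size, standing in for BUFSIZ */

/* Hypothetical buffered reader; NOT the real stdio FILE/_iob layout. */
struct rbuf {
    int   fd;                   /* underlying descriptor */
    off_t base;                 /* file offset of buf[0], block-aligned */
    long  len;                  /* number of valid bytes in buf */
    long  pos;                  /* index of the next byte to return */
    long  nseek, nread;         /* system calls issued so far */
    unsigned char buf[RB_SIZE];
};

void rb_init(struct rbuf *r, int fd)
{
    r->fd = fd;
    r->base = 0;
    r->len = r->pos = 0;
    r->nseek = r->nread = 0;
}

/* Load the aligned block containing off; the only place lseek()/read() are called. */
static int rb_fill(struct rbuf *r, off_t off)
{
    off_t block = off - off % RB_SIZE;
    ssize_t n;

    assert(off >= 0);
    if (lseek(r->fd, block, SEEK_SET) == (off_t)-1)
        return -1;
    r->nseek++;
    n = read(r->fd, r->buf, RB_SIZE);
    r->nread++;
    if (n < 0)
        return -1;
    r->base = block;
    r->len = (long)n;
    r->pos = (long)(off - block);
    return 0;
}

/* Absolute seek: a pure index update when off is already buffered. */
int rb_seek(struct rbuf *r, off_t off)
{
    if (off < 0)
        return -1;
    if (off >= r->base && off < r->base + r->len) {
        r->pos = (long)(off - r->base);
        return 0;
    }
    return rb_fill(r, off);
}

/* Relative seek, in the spirit of fseek(fp, delta, 1). */
int rb_seek_cur(struct rbuf *r, long delta)
{
    return rb_seek(r, r->base + r->pos + delta);
}

int rb_getc(struct rbuf *r)
{
    if (r->pos >= r->len && rb_fill(r, r->base + r->pos) < 0)
        return EOF;
    if (r->pos >= r->len)
        return EOF;             /* read() returned nothing at this offset */
    return r->buf[r->pos++];
}

/* Same shape as the left-hand loop of the post: copy the file to out, reversed. */
void rb_reverse(struct rbuf *r, long size, FILE *out)
{
    int c;

    if (size <= 0 || rb_seek(r, size - 1) < 0)
        return;
    while (size--) {
        if ((c = rb_getc(r)) == EOF)
            break;
        putc(c, out);
        rb_seek_cur(r, -2);     /* fails harmlessly after byte 0 */
    }
}
```

With this scheme a backward byte-by-byte scan costs one lseek() and one read() per RB_SIZE bytes of file, since each block is loaded once and every other seek stays inside it.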

I have an application which would benefit greatly from being able to do fast 
backward searches through a file of ints.  A correct fseek() implementation 
should give near optimal performance.  The idea is to avoid writing my own 
buffering logic.  Is there a good design reason why stdio acts this way?
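
For comparison, the hand-rolled buffering this is meant to avoid might look roughly like the sketch below. It bypasses stdio entirely and walks the file from the end in aligned chunks using lseek() and read(). The function name find_last_int() and the CHUNK_INTS size are hypothetical, and the sketch assumes the file holds native ints and that the caller already knows how many there are (for instance from st_size / sizeof(int)).

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_INTS 256          /* hypothetical chunk size, counted in ints */

/*
 * Return the index of the last int equal to key among the first nints
 * ints of fd, or -1 if there is none (or on error).
 */
long find_last_int(int fd, long nints, int key)
{
    int chunk[CHUNK_INTS];
    long end = nints;           /* one past the last int not yet scanned */

    assert(nints >= 0);
    while (end > 0) {
        /* Start of the aligned chunk that contains int number end-1. */
        long start = (end - 1) / CHUNK_INTS * CHUNK_INTS;
        long want = end - start;
        long i;
        ssize_t got;

        if (lseek(fd, (off_t)start * (off_t)sizeof(int), SEEK_SET) == (off_t)-1) {
            perror("lseek");
            return -1;
        }
        got = read(fd, chunk, (size_t)want * sizeof(int));
        if (got != (ssize_t)((size_t)want * sizeof(int))) {
            fprintf(stderr, "find_last_int: short read\n");
            return -1;          /* this sketch gives up instead of retrying */
        }
        for (i = want - 1; i >= 0; i--)     /* scan the chunk backward */
            if (chunk[i] == key)
                return start + i;
        end = start;            /* continue with the previous chunk */
    }
    return -1;
}
```

Each chunk costs one lseek() and one read(), which is the cost a buffer-aware fseek() could in principle match without any of this code living in the application.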
Tim Bray, New Oxford English Dictionary Project, tbray@watsol