tbray@watsol.UUCP (07/02/87)
fseek() exhibits behaviour which seems to defeat the purpose of stdio.
Briefly, it does not notice that a small relative fseek() may leave the file
pointer within the current buffer, in which case no lseek() call is needed. E.g.:
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* reverse stdin to stdout; the three columns are alternative loop bodies */
main()
{
    register int size;
    struct stat s;

    fstat(0, &s); size = s.st_size;
--------------------------OR----------------------OR------------------------
fseek(stdin, size - 1, 0); |                         |size = 0;
while(size--) {            |while(size--) {          |while(size++<s.st_size) {
  putchar(getchar());      |  fseek(stdin, size, 0); |  fseek(stdin, -size, 2);
  fseek(stdin, -2, 1);     |  putchar(getchar());    |  putchar(getchar());
----------------------------------------------------------------------------
    }
}
Profiling with a 32K input file and output to /dev/null reveals:
Calls to:   left loop     middle loop                right loop
  fseek     32769         32768                      32768
  lseek     16386         49152 (1.5 * 32768?!)      32768
  read      16386         16385 (?)                  32768
Results were the same under 4.3bsd and SUN (whatever's current on a 3/160).
Now, the last time I looked at stdio source (on a V6 system), _filbuf would
read() BUFSIZ bytes block-aligned around the file pointer. If this is still
true, then a lot of small backward fseeks should merely require updating
some _iob pointers, nein?
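To make the expectation concrete, here is a minimal sketch of a reader that satisfies a seek by adjusting a pointer whenever the target lies inside the block it already holds. It is not the real stdio code and does not use the real _iob layout; struct toybuf, TOY_BUFSIZ and the toy_* functions are hypothetical names invented for illustration, and the block-aligned refill is an assumption modelled on the V6 behaviour described above.

```c
#include <stdio.h>

#define TOY_BUFSIZ 1024          /* illustrative block size, not stdio's BUFSIZ */

/* Hypothetical toy stream; NOT the real FILE/_iob structure. */
struct toybuf {
    FILE *fp;                    /* underlying file, touched only on refill */
    long  base;                  /* file offset of buf[0] */
    long  len;                   /* number of valid bytes in buf */
    long  pos;                   /* index in buf of the next byte to read */
    long  refills;               /* count of real seek+read operations */
    unsigned char buf[TOY_BUFSIZ];
};

void toy_open(struct toybuf *t, FILE *fp)
{
    t->fp = fp;
    t->base = t->len = t->pos = t->refills = 0;
}

/* Load the TOY_BUFSIZ-aligned block that contains absolute offset off. */
static int toy_fill(struct toybuf *t, long off)
{
    long base = off - off % TOY_BUFSIZ;

    if (fseek(t->fp, base, SEEK_SET) != 0)
        return -1;
    t->len = (long)fread(t->buf, 1, TOY_BUFSIZ, t->fp);
    t->base = base;
    t->pos = off - base;
    t->refills++;
    return 0;
}

/* Absolute seek: only a pointer update if off is inside the current block. */
int toy_seek(struct toybuf *t, long off)
{
    if (off < 0)
        return -1;
    if (off >= t->base && off < t->base + t->len) {
        t->pos = off - t->base;
        return 0;
    }
    return toy_fill(t, off);
}

/* Relative seek, expressed in terms of the absolute one. */
int toy_seek_rel(struct toybuf *t, long delta)
{
    return toy_seek(t, t->base + t->pos + delta);
}

/* Read one byte, refilling only when the current block is exhausted. */
int toy_getc(struct toybuf *t)
{
    if (t->pos >= t->len) {
        if (toy_fill(t, t->base + t->pos) != 0 || t->pos >= t->len)
            return EOF;
    }
    return t->buf[t->pos++];
}
```

With this scheme, walking backward one byte at a time through an N-byte file costs roughly N / TOY_BUFSIZ refills rather than one per byte, which is the behaviour I was hoping to see from fseek().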
I have an application which would benefit greatly from being able to do fast
backward searches through a file of ints. A correct fseek() implementation
should give near-optimal performance. The idea is to avoid writing my own
buffering logic. Is there a good design reason why stdio acts this way?
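For comparison, this is roughly the hand-rolled buffering I would rather not write: a sketch that searches backward through a file of ints with one fseek() and one fread() per block instead of one per element. The function name find_int_backward and the constant CHUNK_INTS are hypothetical, and the sketch assumes the file holds native-format ints with no header.

```c
#include <stdio.h>

#define CHUNK_INTS 1024          /* illustrative block size, in ints */

/*
 * Return the index (counted in ints) of the last occurrence of target
 * at or before index from, or -1 if it is absent or an I/O error occurs.
 */
long find_int_backward(FILE *fp, long from, int target)
{
    int  chunk[CHUNK_INTS];
    long hi = from + 1;          /* one past the last index still to search */
    long lo, n, i;

    while (hi > 0) {
        /* Start of the CHUNK_INTS-aligned block containing index hi - 1. */
        lo = (hi - 1) - (hi - 1) % CHUNK_INTS;
        n = hi - lo;

        if (fseek(fp, lo * (long)sizeof(int), SEEK_SET) != 0)
            return -1;
        if ((long)fread(chunk, sizeof(int), (size_t)n, fp) != n)
            return -1;

        /* Scan the block from its high end down to its low end. */
        for (i = n - 1; i >= 0; i--)
            if (chunk[i] == target)
                return lo + i;

        hi = lo;                 /* continue with the preceding block */
    }
    return -1;
}
```

Repeated searches restart from the returned index minus one, e.g. find_int_backward(fp, last - 1, target). The cost is about one seek and one read per CHUNK_INTS elements scanned, which is what a buffer-aware fseek() ought to deliver without any of this code.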
Tim Bray, New Oxford English Dictionary Project, tbray@watsol