tbray@watsol.UUCP (07/02/87)
fseek() exhibits behaviour which seems to defeat the purpose of stdio.
Briefly, it does not notice that a small relative fseek() may leave the
pointer within the current buffer, in which case no lseek() call is needed.
E.g.:

/* reverse stdin to stdout */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

main()
{
    register int size;
    struct stat s;

    fstat(0, &s);
    size = s.st_size;
--------------------------OR----------------------OR------------------------
fseek(stdin, size - 1, 0);|                        |size = 0;
while(size--) {           |while(size--) {         |while(size++<s.st_size) {
  putchar(getchar());     |  fseek(stdin, size, 0);|  fseek(stdin, -size, 2);
  fseek(stdin, -2, 1);    |  putchar(getchar());   |  putchar(getchar());
----------------------------------------------------------------------------
    }
}

Profiling each version with a 32K input file and output to /dev/null
reveals:

Calls to:
    fseek  32769          fseek  32768                   fseek  32768
    lseek  16386          lseek  49152 (1.5 * 32768?!)   lseek  32768
    read   16386          read   16385 (?)               read   32768

Results were the same under 4.3BSD and Sun (whatever's current on a 3/160).

Now, the last time I looked at stdio source (on a V6 system), _filbuf would
read() BUFSIZ bytes block-aligned around the file pointer. If this is still
true, then a lot of small backward fseeks should merely require updating
some _iob pointers, nein?

I have an application which would benefit greatly from being able to do fast
backward searches through a file of ints. An fseek() that stayed within the
buffer whenever it could should give near-optimal performance. The idea is
to avoid writing my own buffering logic.

Is there a good design reason why stdio acts this way?

Tim Bray, New Oxford English Dictionary Project, tbray@watsol
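
For comparison, here is a minimal sketch of the hand-rolled buffering the post hopes to avoid. It assumes a seekable regular file and keeps one BUFSIZ-byte window aligned on a BUFSIZ boundary, so a backward scan only goes to the kernel when the requested offset leaves the window. The names bwin, bwin_init, bwin_getc and reverse_copy are hypothetical and are not part of stdio; the read() counter exists only so the result can be set against a profile like the one above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of stdio: a read-only window of up to
 * BUFSIZ bytes whose start is aligned on a BUFSIZ boundary. */
struct bwin {
    int   fd;           /* underlying file descriptor */
    off_t base;         /* file offset of buf[0], or -1 if the window is empty */
    long  len;          /* number of valid bytes in buf */
    long  nreads;       /* read() calls issued so far */
    char  buf[BUFSIZ];
};

void bwin_init(struct bwin *w, int fd)
{
    w->fd = fd;
    w->base = -1;
    w->len = 0;
    w->nreads = 0;
}

/* Return the byte at absolute offset pos, or EOF on error or end of file.
 * lseek() and read() are called only when pos is outside the window. */
int bwin_getc(struct bwin *w, off_t pos)
{
    assert(pos >= 0);
    if (w->base < 0 || pos < w->base || pos >= w->base + w->len) {
        off_t base = pos - pos % BUFSIZ;    /* round down to a block boundary */
        ssize_t n;

        if (lseek(w->fd, base, SEEK_SET) == (off_t)-1)
            return EOF;
        n = read(w->fd, w->buf, sizeof w->buf);
        w->nreads++;
        if (n <= 0) {
            w->base = -1;
            w->len = 0;
            return EOF;
        }
        w->base = base;
        w->len = (long)n;
        if (pos >= base + n)                /* short read: pos is past EOF */
            return EOF;
    }
    return (unsigned char)w->buf[pos - w->base];
}

/* Write the first size bytes of fd to out in reverse order.
 * Returns the number of read() calls that were needed. */
long reverse_copy(int fd, off_t size, FILE *out)
{
    struct bwin w;

    bwin_init(&w, fd);
    while (size-- > 0) {
        int c = bwin_getc(&w, size);
        if (c == EOF)
            break;
        putc(c, out);
    }
    return w.nreads;
}
```

Calling reverse_copy(0, s.st_size, stdout) after the fstat() in the program above would replace any of the three loops. Under the stated assumptions the scan issues one lseek() and one read() per BUFSIZ-sized block of the file, rather than a number of system calls proportional to the file size.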