[net.unix] Speed of read vs. fread

lied@ihlts.UUCP (Bob Lied) (01/20/85)

Which is faster, read/write or fread/fwrite?

I've been told that fread/fwrite is faster because
it buffers.  On the other hand, I've been told that
read/write is faster because (a) read always does
a one block read-ahead if it can and (b) it avoids
the overhead of the fread/fwrite abstraction.

Of the few examples I know personally, read/write
seems to win, but I'm not really convinced.

Given a choice, and doing I/O in chunks instead of
characters, which should I use for performance?
Please explain your selection, and no lectures on
tuning too early, OK?

	Bob Lied	ihnp4!ihlts!lied

malcolm@ee.UUCP (01/22/85)

	/* Written  9:36 pm  Jan 19, 1985 by lied@ihlts in ee:net.unix */
	Which is faster, read/write or fread/fwrite?
Depends on what you are doing.  If you are doing single character IO then
by all means use fread/fwrite.  The time it takes to put the character in
the buffer is insignificant compared to the expense of a syscall.

On the other hand large chunks of IO (several BUFSIZ's at a time) are much
more efficient with read/write.  One of my research programs saved many
seconds of CPU time by using write() to output large matrices (100k bytes)
for debugging.

The question is where to draw the line.  I would recommend using
fread/fwrite in general.  If you are reading/writing more than 5 or 10k at a
time (one call to read/write) then you'll start saving a little time by
using read/write.
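
To make the large-chunk case concrete, here is a minimal sketch of dumping
a whole matrix with write().  It is an illustration only, not the research
program mentioned above; the name dump_matrix and the raw-bytes output
format are assumptions made for the example.

```c
#include <stddef.h>
#include <unistd.h>

/* Write a rows x cols matrix of doubles as raw bytes, handing the
 * whole array to write() at once instead of going through stdio.
 * Returns 0 on success, -1 on error.  (Hypothetical helper.) */
int dump_matrix(int fd, const double *m, size_t rows, size_t cols)
{
    const char *p = (const char *)m;
    size_t left = rows * cols * sizeof *m;

    while (left > 0) {
        /* Usually one call is enough; loop in case of a short write. */
        ssize_t w = write(fd, p, left);
        if (w < 0)
            return -1;
        p += w;
        left -= (size_t)w;
    }
    return 0;
}
```

A call such as dump_matrix(2, &a[0][0], n, n) would send an n x n array to
stderr in what is normally a single system call.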

					Charter member of the Purdue
					team to stamp out unnecessary
					sys-calls.

					Malcolm
					pur-ee!malcolm
					malcolm@purdue.arpa

perlman@wanginst.UUCP (Gary Perlman) (01/22/85)

> Which is faster, read/write or fread/fwrite?
> 
> 	Bob Lied	ihnp4!ihlts!lied

There is no comparison.  Here are some times
for copying /etc/termcap (38123 chars).

SIZE NAME        COMMENT
6144 fgetputc    using fgetc and fputc with stdio
7168 freadwrite  using fread and fwrite BUFSIZ blocks with stdio
6144 getputc     using getc and putc (macros) with stdio
5120 readwrite   using read and write BUFSIZ blocks (no stdio)

The results:
PROGRAM       user   sys   total
fgetputc      2.1    0.2    2.3
freadwrite    1.1    0.1    1.2
getputc       0.7    0.1    0.8
readwrite     0.0    0.1    0.1


That is a huge effect!  A factor of 23.
Some observations:
	fgetc and fputc are functions while getc and putc are macros.
	The overhead of 38123 (times 2) function calls is immense.

	I think all the stdio functions are doing reads and writes.
	It looks like fread and fwrite are using more.

	I am not sure, but there may be some strange interactions
	if read and write are used directly in conjunction with stdio.

I am posting the benchmarking programs to net.sources.
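
For readers without access to net.sources, the two extremes in the table
might look roughly like the sketch below.  These are NOT the programs that
were posted; the function names copy_readwrite and copy_getputc are made
up here, and timing would be done externally (e.g. with time(1)).

```c
#include <stdio.h>
#include <unistd.h>

/* Copy with read and write in BUFSIZ blocks, no stdio buffering.
 * Returns the number of bytes copied, or -1 on error. */
long copy_readwrite(int in, int out)
{
    char buf[BUFSIZ];
    long total = 0;
    ssize_t n;

    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {               /* write() may be short */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}

/* Copy one character at a time through the getc and putc macros.
 * Returns the number of characters copied, or -1 on error. */
long copy_getputc(FILE *in, FILE *out)
{
    long total = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            return -1;
        total++;
    }
    return ferror(in) ? -1 : total;
}
```

Replacing getc/putc with fgetc/fputc in the second function would give
the fgetputc variant.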

Gary Perlman/Wang Institute/Tyngsboro, MA/01879/(617) 649-9731

stevel@haddock.UUCP (01/23/85)

> SIZE NAME        COMMENT
> 6144 fgetputc    using fgetc and fputc with stdio
> 7168 freadwrite  using fread and fwrite BUFSIZ blocks with stdio
> 6144 getputc     using getc and putc (macros) with stdio
> 5120 readwrite   using read and write BUFSIZ blocks (no stdio)

If you read/write in increments of other than BUFSIZ you will
notice fread/fwrite should be faster. Try read/write in increments
of 1 char.

Steve Ludlum, decvax!yale-co!ima!stevel, {amd|ihnp4!cbosgd}!ima!stevel

pag@hao.UUCP (Peter Gross) (01/23/85)

> Which is faster, read/write or fread/fwrite?

Well, you didn't specify on which version of Unix; that might make a difference.
I can speak for 4.2bsd.  We have a production job here that was important
enough and taking long enough that our director was requesting that we shut
the system down while this job ran.  We decided that was extreme and wanted
to see if we could speed it up.  Profiling showed it spent a great deal of
time in fread()/fwrite().  We checked the code, and lo and behold, fread/fwrite
do single getc/putc's in a loop.  Converting to read/write made the program
run about 3 times faster.

--peter gross
hao!pag

thomas@utah-gr.UUCP (Spencer W. Thomas) (01/23/85)

In article <626@ihlts.UUCP> lied@ihlts.UUCP (Bob Lied) writes:
>I've been told that fread/fwrite is faster because
>it buffers.  On the other hand, I've been told that
>read/write is faster because (a) read always does
>a one block read-ahead if it can and (b) it avoids
>the overhead of the fread/fwrite abstraction.
>
>Given a choice, and doing I/O in chunks instead of
>characters, which should I use for performance?
>Please explain your selection, and no lectures on
>tuning too early, OK?
>
Well, I did a simple experiment, contrasting read and write vs fread and
fwrite performance on a 4.2a (or is it 4.3) system.  This has the fread
and fwrite enhancements (the idea for which was taken from System V, I
believe, so results should be comparable).  My experiment, with results
tabulated below, shows that fread and fwrite are MUCH better for small
record sizes, and are worse for large record sizes (but not too much).
These tests were run on a basically unloaded VAX 750 running the
enhanced 4.2bsd system (which may be called 4.3 someday, I dunno).  The
file system block size for the first test was 8192/1024 and for the
second it was 4096/512.  (For those of you not familiar with the 4.2
file system, the first number is the file system block size, and
the 2nd is the fragment size.  For files this size, you can mostly
ignore the fragment size.)

Reading /vmunix (274432 bytes) with

record	|	    read		    	    fread
size  	|user	system	clock	%	|user	system	clock	%
--------+-------------------------------+-------------------------------
1	|17.4	499.7	9:35	89%	|40.0	1.5	1:10	68%
512	|0.0	1.4	0:02	74%	|0.2	0.4	0:01	47%
8192	|0.0	0.4	0:01	41%	|0.2	0.5	0:01	54%

Writing a 1Mb file with

record	|	    write		    	    fwrite
size  	|user	system	clock	%	|user	system	clock	%
--------+-------------------------------+-------------------------------
28	|3.3	134.5	2:34	89%	|5.6	3.6	0:10	88%
1024	|0.1	6.5	0:07	88%	|0.7	3.6	0:05	77%
8192	|0.0	3.4	0:04	77%	|0.8	3.3	0:04	89%

P.S., this also shows the performance of the 4.2 filesystem -- note that
I can read a 270Kb file in about 1 second, and can write a 1Mb file in
about 4 seconds.  Not too shabby!
-- 
=Spencer
	({ihnp4,decvax}!utah-cs!thomas, thomas@utah-cs.ARPA)
		<<< Silly quote of the week >>>

thomas@utah-gr.UUCP (Spencer W. Thomas) (01/23/85)

In article <1350@hao.UUCP> pag@hao.UUCP (Peter Gross) writes:
>  Profiling showed it spent a great deal of
>time in fread()/fwrite().  We checked the code, and lo and behold, fread/fwrite
>do single getc/putc's in a loop.  Converting to read/write made the program
>run about 3 times faster.

This has been fixed! This has been fixed!  This has been fixed!  This
has been fixed!  This has been fixed!  This has been fixed!

Contact your best friend at Berkeley and see if you can get the fix.  As
described by Sam Leffler at the Salt Lake Usenix, almost all the
places where System V (and others) was more efficient than 4.2 have been
fixed (and most of these fixes were pretty trivial).  See my note for
some speed comparisons.

-- 
=Spencer
	({ihnp4,decvax}!utah-cs!thomas, thomas@utah-cs.ARPA)
		<<< Silly quote of the week >>>

pdj@nbires.UUCP (Paul Jensen) (01/24/85)

> Which is faster, read/write or fread/fwrite?
> ...
> 	Bob Lied	ihnp4!ihlts!lied

We found that fread was MUCH slower than read on our Vax 780 
when reading structures greater than about 256 - 512 bytes, 
and MUCH faster than read below that. It is about a factor of 
3 slower than read for 8K chunks, and about 5 times faster than read
for 16 byte chunks.  fwrite behaves similarly.

The problem is apparently due to the per character
overhead of getc/putc.  Fortunately, there is a simple fix.
If you peek to see how much is currently in the buffer 
and use bcopy (or memcpy) to move in what you need,
you can get fread's performance back up to what read can do 
for large chunks and still have the big 5:1 advantage over read 
for small chunks.  Thus, you shouldn't have to choose.
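
The idea can be sketched outside of stdio as a small buffered reader that
checks how much is left in its buffer and moves it with memcpy instead of
a per-character loop.  This is only an illustration of the technique, not
the actual fix to fread; struct rbuf, rbuf_init, rbuf_read and the
1024-byte RBUFSZ are all names and values assumed for the example.

```c
#include <string.h>
#include <unistd.h>

#define RBUFSZ 1024             /* assumed buffer size */

struct rbuf {
    int    fd;
    size_t pos;                 /* next unread byte in buf */
    size_t len;                 /* number of valid bytes in buf */
    char   buf[RBUFSZ];
};

void rbuf_init(struct rbuf *r, int fd)
{
    r->fd = fd;
    r->pos = r->len = 0;
}

/* Copy up to n bytes into dst.  Returns the number of bytes copied,
 * which is less than n only at end of file or on a read error. */
size_t rbuf_read(struct rbuf *r, void *dst, size_t n)
{
    char *out = dst;
    size_t done = 0;

    while (done < n) {
        size_t avail = r->len - r->pos;

        if (avail == 0) {       /* buffer empty: one read() refills it */
            ssize_t got = read(r->fd, r->buf, RBUFSZ);
            if (got <= 0)
                break;
            r->pos = 0;
            r->len = (size_t)got;
            avail = r->len;
        }

        /* Peek at what is buffered and move it in one block copy. */
        size_t take = (n - done < avail) ? n - done : avail;
        memcpy(out + done, r->buf + r->pos, take);
        r->pos += take;
        done += take;
    }
    return done;
}
```

Small requests are served from the buffer with no system call, and large
requests cost one memcpy per buffer-full rather than one macro expansion
per character.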
-- 
Paul Jensen	NBI {hao,ucbvax,allegra}!nbires!pdj	(303) 444-5710 x3054

guy@rlgvax.UUCP (Guy Harris) (01/26/85)

> Which is faster, read/write or fread/fwrite?
> 
> I've been told that fread/fwrite is faster because
> it buffers.  On the other hand, I've been told that
> read/write is faster because (a) read always does
> a one block read-ahead if it can and (b) it avoids
> the overhead of the fread/fwrite abstraction.

a) is prunejuice; "fread" calls "read", so you get the same
read-ahead.  b) is true as long as your "read"s are the same
size as the ones that "fread" does.  The overhead of the
abstraction is MUCH lower in System V Release 2 and will
be lower in 4.3BSD (originally, "fread" was a loop containing
a "getc" macro; it has been redone to move entire blocks of
characters from the buffer to the user directly).  Still, this
overhead is non-zero.

The point about buffering is that doing "fread" to read 16 bytes
at a time will probably be better than doing "read" to read
16 bytes.  The latter requires 64 times as many "read" calls (assuming
1024-byte standard I/O buffers), and system calls are expensive.
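
The factor of 64 is just the buffer size divided by the record size.  A
tiny sketch of the arithmetic (the helper name reads_needed is made up
for the example):

```c
#include <stddef.h>

/* Number of read() calls needed to consume `total` bytes when each
 * call asks for `chunk` bytes, rounded up.  Ignores the final
 * zero-length read that reports end of file. */
size_t reads_needed(size_t total, size_t chunk)
{
    return (total + chunk - 1) / chunk;
}
```

For one 1024-byte buffer-full, reads_needed(1024, 16) is 64 while
reads_needed(1024, 1024) is 1.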

On 4.2BSD, the win of "fread" is even higher; the buffer size is the
same as the filesystem block size, which is usually 4096 or 8192 bytes.

	Guy Harris
	{seismo,ihnp4,allegra}!

chris@umcp-cs.UUCP (Chris Torek) (01/27/85)

The latest Berkeley C library (which as far as I know is not on any
distribution tapes---in other words, don't bother asking for a new
tape) has optimized fread/fwrite, using bcopy() if necessary, which
runs MUCH faster.  (Put new life into those old programs....)

The big advantage fread/fwrite have is that you don't have to do any
kind of buffering, so that when/if 4.nBSD acquires copy-on-write, true
virtual reads, invisible memory sharing, and all those kinds of
performance goodies that make page-aligned exact-size buffers do
wonders for performance, why, all you need to do is recompile....

(But I wouldn't recommend holding your breath. :-) )
-- 
(This line accidently left nonblank.)

In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland

dave@garfield.UUCP (David Janes) (01/30/85)

In article <1315@utah-gr.UUCP> thomas@utah-gr.UUCP (Spencer W. Thomas) writes:
| 
|                                               ... almost all the
| places where System V (and others) was more efficient than 4.2 have been
| fixed (and most of these fixes were pretty trivial).  ...
|
| =Spencer

What did they do, slow down the System V routines with 'while' loops? _:-)

dave