[unix-pc.general] Synchronous I/O

jcm@mtunb.ATT.COM (was-John McMillan) (02/24/89)

Earlier flames regarding UNIX(R) I/O and databases suggested that
UNIX Block-I/O deferred-writes leave databases in unreliable states.
This was because of the unflushed cache contents at the time of
crashes.  (At least so far as I could identify amidst the flames.)

An E-mail item from John R. MacMillan [!] raised an interesting
point:  in <sys/file.h>, on the 3B1, there's a flag FSYNC.

Examining the sources, and testing, indicates:
	 NON-deferred (block until written) I/O is available.

Add the following line to 
	/usr/include/fcntl.h:

#define	O_SYNC   020	/* synchronous write option */ /* JCM */

Either open(2) or fcntl(2) can be used to set this file attribute.
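
A minimal sketch of both routes, assuming an ANSI C compiler and the
usual <fcntl.h> interface.  The helper names open_sync and set_sync are
made up for illustration, and the fallback value 020 is the 3B1
definition quoted above.  Whether F_SETFL honors O_SYNC varies by
system (Linux, for one, silently ignores it there), so treat the fcntl
route as something to test on your own kernel.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Fallback for headers that lack the flag; 020 is the 3B1 value from
   the post.  Systems that already define O_SYNC keep their own value. */
#ifndef O_SYNC
#define O_SYNC 020
#endif

/* Hypothetical helper: open (creating if needed) a file whose writes
   block until the data has been written through the cache. */
int open_sync(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_SYNC, 0644);
}

/* Hypothetical helper: request synchronous writes on a descriptor that
   is already open.  Returns -1 on failure, as fcntl does. */
int set_sync(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_SYNC);
}
```

Typical use would be fd = open_sync("journal.dat") followed by ordinary
write(2) calls; the file name is only an example.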

This has been there for some time -- probably as long as there has
been an FSYNC entry in 'file.h'.  (As a kernel-repair person, there's
always the problem of living too close to the code to notice the
features!)

It is supported in SVR3, also. (In SVR3, however, there is NO need
to add the 'define'.)

Finally: remember, this is an abusable resource.  It consumes
disk-throughput by performing a write-through of the cache for
each write into the cache.

    E.g.:
    	for (i=0; i<256; i++) write(fid, bfr, 4);
    Using O_SYNC, the above causes a minimum of 256 disk-accesses.
    Using the standard deferred I/O, the above requires 1 disk-access.
    
    Another example:
    	for (l=0;l<1000;l++) {lseek(fid,0,0); write(fid, b, 4096);}
    Using O_SYNC:	Real=118.6s User=.05s Sys=8.7s (6386/135MB)
    			Real=100.8s User=.03s Sys=1.0s (3B1/67MB)
    Otherwise:		Real=  2.5s User=.04s Sys=2.4s (6386/135MB)
    			Real=  1.0s User=.01s Sys= .2s (3B1/67MB)
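
Since each write() on an O_SYNC descriptor waits for the disk, one way
to soften the first example above is to gather the small records in
memory and hand them over in a single call.  This is only a sketch:
write_batched, REC_SIZE and REC_COUNT are hypothetical names, and the
record is assumed to be the same 4 bytes each time, as in the loop
above.  How many physical accesses one 1024-byte write costs still
depends on the filesystem block size.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define REC_SIZE  4      /* bytes per record, as in the 256-write loop */
#define REC_COUNT 256    /* number of records gathered per write       */

/* Copy REC_COUNT copies of the record into one buffer and issue a
   single write(), so a synchronous descriptor sees one request
   instead of 256.  Returns write()'s result (bytes written or -1). */
long write_batched(int fd, const char *rec)
{
    char buf[REC_SIZE * REC_COUNT];
    int i;

    assert(rec != NULL);
    for (i = 0; i < REC_COUNT; i++)
        memcpy(buf + i * REC_SIZE, rec, REC_SIZE);
    return (long)write(fd, buf, sizeof buf);
}
```

The trade-off is the usual one: the records are not on disk until the
single write returns, so a crash before that point loses the whole
batch rather than part of it.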

It's possible the database issues were flared in another newsgroup.
But it's the 3B1 users who may need to add the define -- and I haven't
the foggiest recollection of where it came up ... sigh.

Back under the bridge...

john mcmillan	-- att!mtunb!jcm	-- muttering for himself, ONLY

ditto@cbmvax.UUCP (Michael "Ford" Ditto) (02/25/89)

In article <1417@mtunb.ATT.COM> jcm@mtunb.UUCP (was-John McMillan) writes:
>    	for (l=0;l<1000;l++) {lseek(fid,0,0); write(fid, b, 4096);}
>    Using O_SYNC:	Real=118.6s User=.05s Sys=8.7s (6386/135MB)
>    			Real=100.8s User=.03s Sys=1.0s (3B1/67MB)
>    Otherwise:		Real=  2.5s User=.04s Sys=2.4s (6386/135MB)
>    			Real=  1.0s User=.01s Sys= .2s (3B1/67MB)

The SYNC vs. non-SYNC ratio seems about right, but I'm a bit surprised
at the 6386 vs. 3B1 ratio.  Is the 3B1 really faster?  Did the 6386
have a really slow disk?  Even so, that wouldn't explain the CPU time.

				Just wondering.
-- 
					-=] Ford [=-

"The number of Unix installations	(In Real Life:  Mike Ditto)
has grown to 10, with more expected."	ford@kenobi.cts.com
- The Unix Programmer's Manual,		...!sdcsvax!crash!kenobi!ford
  2nd Edition, June, 1972.		ditto@cbmvax.commodore.com