[comp.unix.wizards] Record locking vs. buffered I/O

850153d@aucs.uucp (Jules d'Entremont) (11/13/89)

I posted the following to comp.unix.questions but got no responses, so I'll
try again here.

I have the task of implementing a set of simple database routines.  Since
there may be multiple processes accessing the database simultaneously, we
are using record locking (via the fcntl mechanism) to ensure data integrity.
However, the manual warns that the buffering done by stdio may cause
unexpected results.  I understand that this happens because the buffer may
not be written to the file before the lock is released.  To get around this,
I use fflush() to flush the buffer before I release the lock.  So my routine
for writing to the database has the following skeleton:

	FILE *fptr;
	struct flock lock;
	data_record *datrec;

	lock.l_type = F_RDLCK;			/* shared lock */
	lock.l_whence = 0;			/* offsets from start of file */
	lock.l_start = 0L;
	lock.l_len = 0L;			/* zero length = to end of file */

	fcntl(fileno(fptr), F_SETLK, &lock);	/* Apply a shared lock */

	/* fseek to the desired place in the file */
	/* Code deleted */

	lock.l_type = F_WRLCK;
	lock.l_whence = 1;			/* now relative to current offset */

	fcntl(fileno(fptr), F_SETLK, &lock);	/* Apply an exclusive lock */
	fwrite((char *) datrec, sizeof(data_record), 1, fptr);
	fflush(fptr);

	lock.l_type = F_UNLCK;
	lock.l_whence = 0;
	fcntl(fileno(fptr), F_SETLK, &lock);	/* Release all locks */

There is a similar routine for reading a record which contains the following
code:

	fcntl(fileno(fptr), F_SETLK, &lock);	/* Apply a shared lock */

	/* fseek to the desired place in the file */

	fread((char *) datrec, sizeof(data_record), 1, fptr);

	lock.l_type = F_UNLCK;
	lock.l_whence = 0;
	fcntl(fileno(fptr), F_SETLK, &lock);	/* Release all locks */

I think this scheme should work, but I am not convinced.  Is there a problem
with using buffered reads?  Could the data returned by fread() be stale
because of stdio's buffering?  Should I even be using buffered I/O at all?
(sizeof(data_record) = 24).

This needs to work on an 80386 machine running AT&T SysV/386 Unix, but portable
solutions would be preferred.

Thanks for any help.

-- 
Jules d'Entremont       Acadia University, Wolfville, N.S., Canada
BITNET/Internet 	850153d@AcadiaU.CA
UUCP	{uunet | watmath | utai | garfield}!cs.dal.ca!aucs!850153d
	"To iterate is human, to recurse divine." - L. Peter Deutsch

deastman@pilot.njin.net (Don Eastman) (11/13/89)

Jules d'Entremont writes (which I edited liberally):

> I posted the following to comp.unix.questions but got no responses, so I'll
> try again here.
> 
> I have the task of implementing a set of simple database routines.  
> Should I even be using buffered I/O at all? 

[ Perhaps the mail I sent last week will arrive by Christmas? ]

I believe you should be using the read and write system calls rather than
stdio for your database access routines.

Flushing after each fwrite() will eliminate the performance benefit that
usually accrues from using stdio: fflush() just ends up calling write(),
which you could have called directly.  The extra copy through the stdio
buffer only makes the code run more slowly.

On the fread side, the concerns behind your questions are well
founded.  The buffering done by stdio cannot safely be used: a later query
must retrieve its data from the file, not from a copy cached in the process,
because another process may have changed the file between calls.
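
To make that concrete, here is a rough sketch of a write routine built
directly on the system calls.  The names put_record, fd, and recno are mine,
not Jules', data_record is the type from his posting, and most error
handling is elided:

	#include <sys/types.h>
	#include <fcntl.h>

	long	lseek();

	/* Sketch only: write one record while holding an exclusive lock
	 * on just that record.  Assumes fd came from open(2) with O_RDWR
	 * and recno is the record's index within the file. */
	int
	put_record(fd, recno, datrec)
	int fd;
	long recno;
	data_record *datrec;
	{
		struct flock lock;
		long pos = recno * (long) sizeof(data_record);
		int n;

		lock.l_type = F_WRLCK;		/* exclusive lock */
		lock.l_whence = 0;		/* l_start is from start of file */
		lock.l_start = pos;
		lock.l_len = (long) sizeof(data_record);
		if (fcntl(fd, F_SETLKW, &lock) == -1)	/* F_SETLKW blocks until */
			return (-1);			/* the lock is granted  */

		lseek(fd, pos, 0);			/* 0 == from beginning */
		n = write(fd, (char *) datrec, sizeof(data_record));

		lock.l_type = F_UNLCK;
		fcntl(fd, F_SETLK, &lock);		/* release only this record */
		return (n == sizeof(data_record) ? 0 : -1);
	}

Note the use of F_SETLKW, which waits for a conflicting lock to be released
rather than failing the way F_SETLK does.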

You might also reconsider your locking implementation.  Are you attempting
to provide serializable transactions?

Don Eastman     deastman@pilot.njin.net

cpcahil@virtech.uucp (Conor P. Cahill) (11/13/89)

In article <1989Nov12.192441.9270@aucs.uucp>, 850153d@aucs.uucp (Jules d'Entremont) writes:
> I have the task of implementing a set of simple database routines.  Since
> there may be multiple processes accessing the database simultaneously, we
> are using record locking (via the fcntl mechanism) to ensure data integrity.
> However, the manual warns that the buffering done by stdio may cause
> unexpected results.  I understand that this happens because the buffer may
> not be written to the file before the lock is released.  To get around this,
> I use fflush() to flush the buffer before I release the lock.  So my routine

Not only that, but the entire buffered block will be written out to the
file, thereby overwriting a portion of the file that you did not intend to
write.  The rest of that buffer may well be a stale copy of data that
another process has since updated.

If you are performing small random-access reads and writes of data (as
opposed to sequential reading/writing) you should use the read/write system
calls, not the stdio.h buffering.  An fwrite() of x bytes followed by an
fflush() will cause a write of at least 1 full stdio block (usually around
4k), and if the x bytes happen to straddle 2 blocks, you have 2 full blocks
to write.

Another complication of mixing fread() and fwrite() on the same stream is
that you must perform an fseek() when switching between the two.
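
For example, a read-modify-write of one record through a stream opened with
"r+" would need something like this (a hypothetical fragment, not code from
the original posting):

	fread((char *) datrec, sizeof(data_record), 1, fptr);
	/* ... modify *datrec in place ... */
	fseek(fptr, -(long) sizeof(data_record), 1);	/* 1 == from current
							 * position; required
							 * when switching from
							 * fread to fwrite */
	fwrite((char *) datrec, sizeof(data_record), 1, fptr);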

> for writing to the database has the following skeleton:
> 
     [ example code deleted... ]

> 	fcntl(fileno(fptr), F_SETLK, &lock);	/* Apply a shared lock */

Why place a shared lock on the entire file when performing a write
operation?  To get maximum concurrency you should place a write lock on only
the area of the file that you wish to update (set l_start to the record's
offset and l_len to sizeof(data_record)).

For reading, the same applies, except that the type of lock is shared
(F_RDLCK).
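
Roughly like this (a fragment only; fd is the open descriptor, recno the
record number, datrec a data_record pointer, and lock a struct flock, all
assumed from the surrounding routine):

	lock.l_type = F_RDLCK;		/* shared: readers don't block each
					 * other, only writers do */
	lock.l_whence = 0;
	lock.l_start = recno * (long) sizeof(data_record);
	lock.l_len = (long) sizeof(data_record);
	fcntl(fd, F_SETLKW, &lock);	/* wait for any writer to finish */

	lseek(fd, recno * (long) sizeof(data_record), 0);
	read(fd, (char *) datrec, sizeof(data_record));

	lock.l_type = F_UNLCK;
	fcntl(fd, F_SETLK, &lock);	/* release only this record */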

> I think this scheme should work, but I am not convinced.  Is there a problem
> with using buffered reads?  Could the data returned by fread() be stale
> because of stdio's buffering?  Should I even be using buffered I/O at all?
> (sizeof(data_record) = 24).

No, you should not be using buffered I/O for random access to a file of
this type.


-- 
+-----------------------------------------------------------------------+
| Conor P. Cahill     uunet!virtech!cpcahil             703-430-9247    |
| Virtual Technologies Inc.,    P. O. Box 876,   Sterling, VA 22170     |
+-----------------------------------------------------------------------+