[comp.bugs.4bsd] 8k holes in NFS files bug: a solution?

hugo@acorn.co.uk (Hugo "Bignose" Tyson) (01/18/90)

This is a follow up to my earlier posting (wizards and protocols.nfs)
about finding 8k holes in files written over NFS in one large fwrite.

First, thanks to the people who helped me with advice and test code -
I used the hammer program to provoke what evidence I have that we
found a fix.  I don't think what we were suffering was to do with the
suggestions that the truncate operation was getting tangled with the
writes, but I'll remember the idea just in case...

We did find a bug which could account for this nasty behaviour: there
is a race in do_bio which can cause an error status for an async write
to be discarded in favour of an OK status, thus masking the error.  I
enclose a context diff below (NFS 3.2 version), for those who are
interested.  We managed to provoke the debugging printf, so it can
happen.  Whether this was masking an error which caused our original
problem is a different question; we shall see.

I have a supplementary question: the NFS 4.0 version appears to have
'fixed' this bug by ALWAYS discarding error status from an async
write, ie. the "if (bp->b_flags & B_ASYNC) { }" clause is completely
absent, and there appears to be no substitute functionality elsewhere.
Does anyone know if this is a bug, or a design decision (to not report
async errors because you'll notice the kernel printf()s in the log and
on the console), or am I missing something?  If I'm being stupid,
don't flame me, I admit I've only had chance for a quick look at the
4.0 code for comparison ;-).

Ta,
	- Huge

------ nothing but code after this, so skip now ------
in do_bio(bp) in the kernel in sys/nfs/nfs_vnodeops.c:

*** nfs_vnodeops.c	Wed Jan 17 15:39:23 1990
--- nfs_vnodeops.c.orig	Wed Jan 17 15:36:32 1990
***************
*** 1592,1609 ****
	if ((bp->b_flags & B_READ) == B_READ) {
	   	...read stuff...
	} else {
		/*
		 * If the write fails and it was asynchronous
		 * all future writes will get an error.
		 */
		rp = vtor(bp->b_vp);
		if (rp->r_error == 0) {
			int rnode_size = rp->r_size;
			count = MIN(bp->b_bcount,
				    rnode_size - bp->b_blkno * DEV_BSIZE);
			if (count < 0) {
				panic("do_bio: write count < 0");
			}
  			bp->b_error = nfswrite(bp->b_vp, bp->b_un.b_addr,
  				(int) bp->b_blkno * DEV_BSIZE,
  				count, rp->r_cred);
! 			if (bp->b_flags & B_ASYNC) {
! 			  	if (rp->r_error && !bp->b_error)
! 				  	printf(
!			"DEBUG: Async NFS write error %d nearly obliterated\n", 
! 					       rp->r_error);
! 
! 			        if (bp->b_error)
! 					rp->r_error = bp->b_error;
! 			     /* BUG!! only write if there was an error
! 			      * - avoid the race with other chaps with
! 			      * an error when we succeeded.
!			      */
! 			}
--- 1592,1599 ----
! 			if (bp->b_flags & B_ASYNC)
! 				rp->r_error = bp->b_error;
  		} else {
  			bp->b_error = rp->r_error;
  		}
	}
	if (bp->b_error) {
		bp->b_flags |= B_ERROR;
	}
	iodone(bp);
}

---------- the end ----------