[net.bugs.2bsd] Bug in umount code

jfw@mit-eddie.MIT.EDU (John Woods) (03/18/86)

There appears to be a bug in umount on slow devices (and in principle on
faster ones).  umount invalidates the entries in the buffer cache by setting
their b_dev field to -1.  However, slow devices like RK05s and RX02s, when
they finally get around to looking at these entries some time later, find
that they point to non-existant devices (like minor device 7 in the case of
the RK05).  Is there a standard fix for this problem?
-- 
John Woods, Charles River Data Systems
decvax!frog!john, mit-eddie!jfw, JFW%mit-ccc@MIT-XX

ach@pucc-h (Stephen Uitti) (03/19/86)

In article <1307@mit-eddie.MIT.EDU> jfw@mit-eddie.MIT.EDU (John Woods) writes:
>There appears to be a bug in umount on slow devices (and in principle on
>faster ones).  umount invalidates the entries in the buffer cache by setting
>their b_dev field to -1.  However, slow devices like RK05s and RX02s, when
>they finally get around to looking at these entries some time later, find
>that they point to non-existant devices (like minor device 7 in the case of
>the RK05).  Is there a standard fix for this problem?
>-- 
>John Woods, Charles River Data Systems
>decvax!frog!john, mit-eddie!jfw, JFW%mit-ccc@MIT-XX

	Yes.  This is not a bug that's limited to 2.9.  It's
probably found everywhere, and has existed from the dawn of
time.
	I remember reading a note from someone at Digital
Equipement Corp, saying he'd fixed the bug.  He said something
like, "it was a pain to fix, and required several tries to get
it right", but gave no real hints on how to fix it, other than
he said that "sumount" should wait until all the blocks are
written out (after the call to "update" (internal name for
sync) (which schedules the writes but returns immediately))
before the file systems are actually unmounted.
	Actually, maybe the solution is to have "update" wait
for the I/O to finish, so a command line "sync" really does
what you want it to do.
	I haven't looked at the problem other than to verify
that it exists.  I have been bitten by the bug with rk07's.  I
ran "umount", then physically unmounted the pack, then mounted
a new pack, at which point, the kernel finished it's writes.
The new pack's file systems were completely screwed up.
	Stephen Uitti, Purdue University Computing Center.
	ach@pucc-j.

wescott@sauron.UUCP (Michael Wescott) (03/19/86)

In article <1307@mit-eddie.MIT.EDU> jfw@mit-eddie.MIT.EDU (John Woods) writes:
>There appears to be a bug in umount on slow devices (and in principle on
>faster ones).  umount invalidates the entries in the buffer cache by setting
>their b_dev field to -1.  However, slow devices like RK05s and RX02s, when
>they finally get around to looking at these entries some time later, find
>that they point to non-existent devices (like minor device 7 in the case of
>the RK05).  Is there a standard fix for this problem?

We found a similar problem in an early v7 port purchased for MC68000.  The
umount() call works (roughly speaking since I'm not looking at the code) by
first calling update(), the kernel's internal version of sync.  This is to
insure that the buffer cache has been flushed prior to proceeding with the
umount().  Umount() eventually marks the buffers invalid in this case by
changing the b_dev to -1.

There is, however, a race condition here.  Update(), when called, first checks
to see if there already is a sync in progress.  If so it returns immediately.
In this case, umount() can proceed to put -1 into the b_dev field and when
the system tries to look this number up in the bdevsw table...

In our system it was a kernel bus error, "panic: ..."

Any fix must still mark the buffer invalid otherwise the system can get very
confused when a disk is unmount and a new one mounted soon thereafter. We
eventually fixed it by using the SysIII approach.  Put another bit into the
flags field of the buffer header that indicates that the data is no longer
valid when read.  Then the appropriate checks have to go into routines in
bio.c.  Then the b_dev is no longer invalid.

Another approach would be to have update() sleep if another update() is in
progress.  Then, upon wakeup just return, or continue and do it again.

	-Mike Wescott
	ncrcae!wescott

chris@umcp-cs.UUCP (Chris Torek) (03/23/86)

In article <2713@pucc-h> ach@pucc-h.UUCP (Stephen Uitti) writes:
>[A bug in umount] is not a bug that's limited to 2.9.  It's
>probably found everywhere, and has existed from the dawn of
>time.

It is not present in 4.2 or 4.1, and probably was not in 4.1 (though
I have to admit I never was familiar with the 4.1 code).  The
`proper' way to fix it is clear; indeed, there is a comment in the
4.2 and 4.3 ufs_bio.c that reads as follows:

/*
 * Invalidate in core blocks belonging to closed or umounted filesystem
 *
 * This is not nicely done at all - the buffer ought to be removed from the
 * hash chains & have its dev/blkno fields clobbered, but unfortunately we
 * can't do that here, as it is quite possible that the block is still
 * being used for i/o. Eventually, all disc drivers should be forced to
 * have a close routine, which ought ensure that the queue is empty, then
 * properly flush the queues. Until that happy day, this suffices for
 * correctness.                                         ... kre
 */

`kre', of course, is Robert Elz.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

chris@umcp-cs.UUCP (Chris Torek) (03/23/86)

Oops.  Where I wrote

>It is not present in 4.2 or 4.1, and probably was not in 4.1 (though

I meant `in 4.2 or 4.3'.  Sorry about that.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu