[net.bugs.4bsd] 4.2 hangs with locked root inode

joe@fluke.UUCP (Joe Kelsey) (02/13/85)

Index:	Probably sys/{ufs,sys}_inode.c

Description:
	We occasionally have systems hang in such a state that no
	useful work seems to be getting done.  When we force a crash,
	the dump reveals several interesting facts:
	1) Almost no runnable processes.  If there are any runnable
	   processes, it's usually something innocuous like rwhod.
	2) LOTS of processes stuck in disk wait, with WCHAN pointing
	   to the ROOT INODE!  Also, the root inode has ILOCK|IWANT
	   set, indicating someone locked it and lots of people want
	   it (count is large).
Repeat-By:
	I really don't know how to repeat this.  It seems to occur
	randomly enough that I can't seem to pin down the cause.
	How do you tell which process has set a lock on an inode that
	others want to access?
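
	The dump analysis described above can be sketched in C. This is
	only a user-space illustration: the struct layouts, the flag and
	state values, and the names count_waiters() and inode_contended()
	are assumptions made for the example, not the 4.2BSD declarations.
	Note that it identifies the waiters, not the holder; the
	ILOCK|IWANT flags by themselves do not say which process set
	the lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the 4.2BSD declarations. */
#define ILOCK  0x1   /* inode is locked */
#define IWANT  0x2   /* some process is waiting for the lock */
#define SSLEEP 1     /* process is sleeping */
#define SRUN   3     /* process is runnable */

struct inode {
    int i_flag;      /* ILOCK, IWANT, ... */
    int i_number;    /* inode number */
};

struct proc {
    int p_pid;
    int p_stat;           /* SSLEEP, SRUN, ... */
    const void *p_wchan;  /* address the process is sleeping on */
};

/* Count sleeping processes whose wait channel is the given inode. */
int count_waiters(const struct proc *tab, size_t n, const struct inode *ip)
{
    size_t i;
    int count = 0;

    assert(tab != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (tab[i].p_stat == SSLEEP && tab[i].p_wchan == (const void *)ip)
            count++;
    }
    return count;
}

/* True when the inode is both locked and wanted, as seen in the dump. */
int inode_contended(const struct inode *ip)
{
    return (ip->i_flag & (ILOCK | IWANT)) == (ILOCK | IWANT);
}
```

	Run over a copy of the process table taken from a dump, a scan
	like this reports how many processes are queued on one inode,
	which is the symptom listed in item 2.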

/Joe Kelsey

joe@fluke.UUCP (Joe Kelsey) (02/19/85)

Well, I think I know where the problem is.  Everyone have their mt.
Xinu bug lists handy?  Ok, now turn to the listing for:

sys/ufs_alloc.c		decvax!jmcg (Jim Mcginness)	6 Feb 84 +FIX sys/94
	cylinder group allocation bug causes hung system.

Mark Plotnick <allegra!mp> pointed this out to me, and also pointed to
a case that Jim missed. I'll submit a separate bug fix for the missing
condition.  If anyone is missing this fix, or doesn't have the Xinu
tapes to refer to, I will send individual copies of this particular bug
via mail.

/Joe

ITAI@TECHUNIX.BITNET (03/14/85)

From: itai@techunix.bitnet (Itai Nahshon)



Well, I know of another bug that causes the same result and can be
repeated easily. The bug is in the function getmdev() in
sys/sys/ufs_mount.c. When you try to mount/umount a non-block
device, its inode is locked (by namei) but never released (iput is
called only in the block-device case). The next attempt to access that
inode blocks while holding the lock on its directory, and so on up the
tree until the root is locked.

repeat-by:
	I tried to mount /dev/rmtxx (a character device).
	After that failed I tried again (this time /dev was locked),
	and after a short time the whole system was locked.
	P.S. Mounting /dev/mtxx (the tape block device, read-only)
	worked on 4.1BSD but does not work on 4.2. Does anyone know why?
fix:
	Change the routine getmdev() to iput() the non-block file
	before returning an error. (Warning: do not iput a NULL
	when namei fails; iput only when namei succeeds but the
	file mode doesn't match.)
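
	The shape of the fix can be sketched as a small user-space
	model. This is not the kernel source: model_namei(),
	model_iput(), the M_* and E_* constants, and both getmdev_*
	functions are hypothetical stand-ins that only mimic the
	lock/unlock behaviour described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the 4.2BSD definitions. */
#define ILOCK    0x1      /* inode is locked */
#define M_IFMT   0170000  /* file type mask */
#define M_IFCHR  0020000  /* character special file */
#define M_IFBLK  0060000  /* block special file */
#define E_NOENT  2        /* lookup failed */
#define E_NOTBLK 15       /* not a block device */

struct inode {
    int i_flag;
    int i_mode;
    int i_rdev;
};

/* Model of namei(): returns the inode locked, or NULL if lookup fails. */
struct inode *model_namei(struct inode *found)
{
    if (found == NULL)
        return NULL;
    found->i_flag |= ILOCK;
    return found;
}

/* Model of iput(): releases the lock; must never be handed NULL. */
void model_iput(struct inode *ip)
{
    assert(ip != NULL);
    ip->i_flag &= ~ILOCK;
}

/* Buggy shape: the non-block error path returns with the inode locked. */
int getmdev_buggy(struct inode *found, int *pdev)
{
    struct inode *ip = model_namei(found);

    if (ip == NULL)
        return E_NOENT;
    if ((ip->i_mode & M_IFMT) != M_IFBLK)
        return E_NOTBLK;          /* lock is leaked here */
    *pdev = ip->i_rdev;
    model_iput(ip);
    return 0;
}

/* Fixed shape: release the inode before returning the error. */
int getmdev_fixed(struct inode *found, int *pdev)
{
    struct inode *ip = model_namei(found);

    if (ip == NULL)
        return E_NOENT;           /* nothing to release; do not iput NULL */
    if ((ip->i_mode & M_IFMT) != M_IFBLK) {
        model_iput(ip);           /* the added release */
        return E_NOTBLK;
    }
    *pdev = ip->i_rdev;
    model_iput(ip);
    return 0;
}
```

	In the model, calling getmdev_buggy() on a character-device
	inode leaves ILOCK set, while getmdev_fixed() returns the same
	error with the inode unlocked.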

			Itai Nahshon.
			Technion, Israel Institute of Technology
			Haifa, Israel

BITNET:		itai@techunix.BITNET
ARPANET:	itai%techunix.BITNET@wiscvm.ARPA