[net.unix-wizards] 2.8bsd pure text bug?

peachey (04/21/83)

	Some time ago I found a bug which exists in V7 and System III,
	and very probably in 2.81 BSD as well (though I don't have the
	code handy to check).  The bug seems to have been fixed in 4.1 BSD.
	The symptoms sound rather like the ones which Jim Reuter reported.

	In order to understand the bug, it is necessary to realize that
	the swap scheduler must never swap out a process which has a
	pure text and has locked the text (XLOCK bit on).  If the swap
	scheduler does this, it may hang waiting for the lock to clear.
	To avoid such disasters, the swap scheduler checks for the
	text lock on each process it might like to swap out, and
	never chooses a process with the text lock on.
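
	The check described above can be sketched in user-space C roughly
	as follows.  This is a simplified model, not the kernel source:
	the function name may_swap_out() is hypothetical, the flag values
	are illustrative, and the structures carry only the fields the
	check needs.

```c
#include <stddef.h>

#define SLOAD	01	/* illustrative value: process is in core */
#define XLOCK	010	/* illustrative value: text is locked */

struct text {
	int x_flag;		/* text flags, including XLOCK */
};

struct proc {
	int p_flag;		/* process flags, including SLOAD */
	struct text *p_textp;	/* pure text, or NULL if none */
};

/*
 * Hypothetical helper: returns 1 if the swap scheduler may choose p
 * as a swap-out victim, 0 otherwise.
 */
int may_swap_out(const struct proc *p)
{
	if ((p->p_flag & SLOAD) == 0)
		return 0;	/* not in core, nothing to swap out */
	if (p->p_textp != NULL && (p->p_textp->x_flag & XLOCK))
		return 0;	/* holds a locked text: must not be chosen */
	return 1;
}
```

	Note that the lock is only visible through p_textp; a process
	whose text pointer is NULL always passes the second test.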

	Unfortunately, there is a bug in the xfree() routine of text.c.
	In this routine, the process text pointer is cleared ...

		u.u_procp->p_textp = NULL;

	... before xccdec(xp) is called.  The text indicated by xp
	is locked by xccdec().  If the XWRIT flag is on, xccdec(xp) may
	go to sleep waiting for the I/O to complete.  This gives the swap
	scheduler an opportunity to swap the process out.  The
	check for locked texts in the swap scheduler doesn't work,
	because the process text pointer has been cleared, so it
	looks like the process has no pure text.  If the text is shared
	with another process that is already swapped out, and the
	swap scheduler tries to swap in that second process, it
	will hang up waiting for the text lock.  Only the first process
	can release the lock, but it is swapped out, with no way
	to get swapped back in!

	I found that repeatedly running the program ...

		main()
		{
			if (fork() == 0) {
				/* do something for a while */
			}
			exit(0);
		}

	would hang the system every so often.  The bug is
	not easily demonstrated on demand, because it depends
	on some tricky timing.

	The easiest solution to this problem is to move the line
	which clears the process text pointer to two places
	further down in xfree(), namely, right after the if statement
	that tests x_count and ISVTX (before "xp->x_iptr = NULL;")
	and right after the xccdec(xp) call.
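
	The effect of moving the assignment can be seen in a small
	user-space model.  This is only a sketch under stated assumptions,
	not the code from text.c: xfree_model(), xccdec_model() and
	may_swap_out() are hypothetical stand-ins, the flag values are
	illustrative, the ISVTX test and the actual freeing of resources
	are omitted, and the "sleep" is replaced by recording what the
	swap scheduler's check would decide at that moment.

```c
#include <assert.h>
#include <stddef.h>

#define XWRIT	02	/* illustrative value: text must be written to swap */
#define XLOCK	010	/* illustrative value: text is locked */

struct text { int x_flag; int x_count; };
struct proc { struct text *p_textp; };

struct proc *curproc;			/* stands in for u.u_procp */
int swappable_during_sleep = -1;	/* set at the simulated sleep point */

/* Model of the swap scheduler's text-lock check. */
int may_swap_out(const struct proc *p)
{
	return !(p->p_textp != NULL && (p->p_textp->x_flag & XLOCK));
}

/* Model of xccdec(): lock the text and, if XWRIT is on, "sleep" for I/O. */
void xccdec_model(struct text *xp)
{
	xp->x_flag |= XLOCK;
	if (xp->x_flag & XWRIT) {
		/* The kernel would sleep here; the swap scheduler may run. */
		swappable_during_sleep = may_swap_out(curproc);
		xp->x_flag &= ~XWRIT;
	}
	xp->x_flag &= ~XLOCK;
}

/* Model of xfree(): fixed == 0 is the old ordering, fixed == 1 the new. */
void xfree_model(struct proc *p, int fixed)
{
	struct text *xp;

	assert(p != NULL);
	if ((xp = p->p_textp) == NULL)
		return;
	curproc = p;
	if (!fixed)
		p->p_textp = NULL;	/* old placement: before xccdec() */
	if (--xp->x_count == 0) {	/* ISVTX test omitted in this model */
		if (fixed)
			p->p_textp = NULL;	/* new placement, first copy */
		/* last user: the text's resources would be released here */
	} else {
		xccdec_model(xp);
		if (fixed)
			p->p_textp = NULL;	/* new placement, second copy */
	}
}
```

	With the old ordering the model reports the process as swappable
	while it holds the text lock; with the new ordering it does not,
	and in both cases the text pointer is NULL once xfree_model()
	returns.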

				Darwyn Peachey
				Hospital Systems Study Group
				harpo!utah-cs!sask!hssg40!peachey