[comp.os.minix] Deadlock in Minix

bart@kulcs.UUCP (Bart De Decker) (11/30/87)

The scenario I sketched in my previous message, would inevitably lead to
a deadlock IF the minix `sendrec' call (i.e. sys_call(BOTH, ...))
behaved properly.

However, you may have noticed that when the system hangs, echoing still goes on.
Hence, the TTY task managed to get readied again.

What actually happens is the following:

	TTY				FS

tty_reply (REVIVE, ...)	 (line 3571)					+
   send (...)								| time
      sys_call (SEND, ...)						|
    	mini_send (...)							V
           flags |= SENDING,
	   FS->p_callerq = TTY
	   unready (TTY)
      ======== blocked
   
				dev_io (...)
				 rw_dev (...)  (line 11282)
				   sendrec (...)   (line 12356)
				      sys_call (BOTH, ..., req)
					 mini_send (...)
					     flags |= SENDING
					     TTY->p_callerq = FS
					     unready (FS)
					 mini_rec (...)
					     OK !!!!
					     REVIVE message is copied in req !
   on ready queue <------------------------  ready(TTY)
				      ================ blocked
main-loop
receive (ANY, ...)
  sys_call (RECEIVE, TTY, ANY, ...)
     mini_rec (...)
	OK !!!
	FS req message is copied ! BUT the original message has been destroyed
		(i.e. overwritten by the REVIVE message). What a pity !
	ready (FS) --------------------------> on ready queue
					
TTY does not understand REVIVE message	
Hence, tty_reply (..., EINVAL, ...)	
   send (...)		
      sys_call (SEND, ...)	
	 mini_send (...)
	     FS->p_callerq = TTY
	     unready (TTY)
      ========== blocked
				 
				return from sendrec
				since REVIVE message, enter while loop body
				   receive (TTY, ...)
				      sys_call (RECEIVE, ...)
					 mini_rec (...)
					     copy TTY message (EINVAL)
     on ready queue <----------------------  ready (TTY)

		...

				second while test : still not reply for first
					request ...
				   receive (TTY, ...)
				      sys_call (RECEIVE, ...)
					 mini_rec (...)
					     flags |= RECEIVING
					     TTY->p_callerq = FS
					     unready (FS)
				      =============== BLOCKED FOREVER
		

Hence, this is not a real deadlock, since the FS is waiting for
a message from the TTY task (and not vice versa).
However, the situation is really as bad as a deadlock, since the
TTY task will NEVER send a message to the FS, since there is no
outstanding request.
This weird situation could happen because of a bug in the implementation
of the `sys_call' procedure. It allows that a message is received
BEFORE the request has been sent. But that (received) message destroys the
original request-message ...

Note: changing sys_call does not solve the original `deadlock' problem !!!

-- Bart

================================================================================
|| Bart De Decker		        mail: Katholieke Universiteit Leuven  ||
|| - research assistant	-		      Dept. Computer Science	      ||
|| Tel: +32 16 200656 x 3556		      Celestijnenlaan 200 A	      ||
|| E-mail: bart@kulcs.UUCP 		      B-3030 Heverlee		      ||
|| 	 ... mcvax!prlb2!kulcs!bart 	      Belgium			      ||
||         bart@kulcs.BITNET              				      ||
================================================================================

bart@kulcs.UUCP (Bart De Decker) (12/04/87)

I added the following code to kernel/proc.c.
It detects a deadlock that arises when two processes (or tasks) try to send
each other a message at the same time.

/--------------------------- proc.c.diff ------------------------------/
156a157,163
> 	/* check for deadlock */
> 	if (dest_ptr->p_flags & SENDING) {
> 		printf ("%d sending to %d and vice versa\n",
> 			caller, dest);
> 		p_dmp();
> 		panic ("... Deadlock detected in mini_send ...", NO_NUM);
> 	}
207a215,221
> 		/* avoid receiving reply before having sent request */
> 		if (caller_ptr->p_flags & SENDING) {
> 			printf ("%d sending to %d and vice versa\n",
> 				caller, sender);
> 			p_dmp();
> 			panic ("... Deadlock detected in mini_rec ...", NO_NUM);
> 		}
/------------------------------ end -------------------------------------/

Not really a solution, but it shows clearly the deadlock problem  ...

-- Bart

================================================================================
|| Bart De Decker		        mail: Katholieke Universiteit Leuven  ||
|| - research assistant	-		      Dept. Computer Science	      ||
|| Tel: +32 16 200656 x 3556		      Celestijnenlaan 200 A	      ||
|| E-mail: bart@kulcs.UUCP 		      B-3030 Heverlee		      ||
|| 	 ... mcvax!prlb2!kulcs!bart 	      Belgium			      ||
||         bart@kulcs.BITNET              				      ||
================================================================================

bing@galbp.LBP.HARRIS.COM (Bing Bang) (12/19/87)

In article <1064@kulcs.UUCP> bart@kulcs.UUCP (Bart De Decker) writes:
>
>I added the following code to kernel/proc.c.
>It detects a deadlock that arises when two processes (or tasks) try to send
>each other a message at the same time.
>
[dif code]
>
>Not really a solution, but it shows clearly the deadlock problem  ...
>
>-- Bart

long ago, i posted changes to fs and the kernel to solve this problem.
it involved creating a que for messages, so that the sender is never
blocked by not being able to send. of course the messages must be copied
to a system resourced message structure so that the sender can safely re-
use his message structure.


-- 
Bing H. Bang           +----------------------------------------------------+
Harris/Lanier          |MSDOS and OS/2 (whenever it gets here): just say no.|
Atlanta GA             +----------------------------------------------------+