[comp.os.mach] interrupt and simple_locks

rchen@m.cs.uiuc.edu (Rong Chen) (04/04/91)

While I was reading the source code of Mach3.0, I noticed that
the hardware interrupts, if any, are always turned off before
a simple-lock is applied to a Mach object.   For example:

	s = splsched();
	simple_lock(&timer_lock);
	...
	simple_unlock(&timer_lock);
	splx(s);

What's the reason for it?  Shouldn't we do it the other way round?
Like:

	simple_lock(&timer_lock);
	s = splsched();
	...
	splx(s);
	simple_unlock(&timer_lock);

Since simple-locks are spin locks, they may prolong the period
of disabling interrupts unnecessarily, in my humble opinion.
It is true that an interrupt may arrive after an object has
been locked, but that should not cause any harm, right?

Thanks to anyone who can enlighten me on this issue.

-Rong Chen
 Department of Computer Science
 University of Illinois at Urbana-Champaign

dswartz@bigbootay.sw.stratus.com (Dan Swartzendruber) (04/04/91)

In article <1991Apr3.184610.12580@m.cs.uiuc.edu> rchen@m.cs.uiuc.edu (Rong Chen) writes:
:While I was reading the source code of Mach3.0, I noticed that
:the hardware interrupts, if any, are always turned off before
:a simple-lock is applied to a Mach object.   For example:
:
:	s = splsched();
:	simple_lock(&timer_lock);
:	...
:	simple_unlock(&timer_lock);
:	splx(s);
:
:What's the reason for it?  Shouldn't we do it the other way round?
:Like:
:
:	simple_lock(&timer_lock);
:	s = splsched();
:	...
:	splx(s);
:	simple_unlock(&timer_lock);
:
:Since simple-locks are spin locks, they may prolong the period
:of disabling interrupts unnecessarily, in my humble opinion.
:It is true that an interrupt may arrive after an object has
:been locked, but that should not cause any harm, right?
:
:Thanks to anyone who can enlighten me on this issue.
:
:-Rong Chen
: Department of Computer Science
: University of Illinois at Urbana-Champaign

Interrupts must be masked first, since otherwise you have a race condition
where the lock is acquired on call-side and an interrupt comes in before
the spl call.  If the interrupt handler tries to acquire the same lock,
you're stuck, since the only code which can release the lock is the same
thread which is now servicing the interrupt.

--

Dan S.
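
The race described above can be modelled on a single simulated CPU. This
is a minimal sketch, not Mach source: sim_splsched(), sim_splx(),
sim_try_lock(), sim_timer_interrupt() and IPL_SCHED are hypothetical
stand-ins, and a try-lock replaces the real spinning simple_lock() so that
the "handler would spin forever" case can be reported instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for splsched()/splx()/simple_lock(); not Mach source. */
#define IPL_SCHED 5                       /* assumed level that masks the timer */

typedef struct { atomic_bool held; } sim_lock_t;

static int cur_ipl = 0;                   /* simulated interrupt priority level */

static int  sim_splsched(void) { int old = cur_ipl; cur_ipl = IPL_SCHED; return old; }
static void sim_splx(int s)    { cur_ipl = s; }

/* Returns true if the lock was free and is now held by the caller. */
static bool sim_try_lock(sim_lock_t *l) { return !atomic_exchange(&l->held, true); }
static void sim_unlock(sim_lock_t *l)   { atomic_store(&l->held, false); }

/* Models a timer interrupt arriving on this CPU.
 *   0: masked, stays pending until splx()
 *   1: handler ran and took the lock
 *  -1: handler found the lock held; a real simple_lock() would spin forever */
static int sim_timer_interrupt(sim_lock_t *l)
{
    if (cur_ipl >= IPL_SCHED) return 0;
    if (!sim_try_lock(l)) return -1;
    sim_unlock(l);
    return 1;
}

/* Ordering used in the Mach source: mask first, then lock. */
static int mask_then_lock(void)
{
    sim_lock_t timer_lock = { false };
    int s = sim_splsched();
    bool got = sim_try_lock(&timer_lock);
    assert(got);
    int r = sim_timer_interrupt(&timer_lock);   /* interrupt inside the critical section */
    sim_unlock(&timer_lock);
    sim_splx(s);
    return r;
}

/* Proposed ordering: lock first, then mask. */
static int lock_then_mask(void)
{
    sim_lock_t timer_lock = { false };
    bool got = sim_try_lock(&timer_lock);
    assert(got);
    int r = sim_timer_interrupt(&timer_lock);   /* interrupt in the window before splsched() */
    int s = sim_splsched();
    sim_splx(s);
    sim_unlock(&timer_lock);
    return r;
}
```

In this model mask_then_lock() reports 0 (the interrupt is held off until
the lock has been released), while lock_then_mask() reports -1: the handler
runs on top of the code that holds the lock, and that code cannot resume to
release it until the handler returns.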

rchen@m.cs.uiuc.edu (Rong Chen) (04/04/91)

From: Michael_Young@transarc.com
To: rchen@m.cs.uiuc.edu
Subject: Re: interrupt and simple_locks

> While I was reading the source code of Mach3.0, I noticed that
> the hardware interrupts, if any, are always turned off before
> a simple-lock is applied to a Mach object. ...

If an interrupt handler may try to take that latch, interrupt
protection is required.  In your example, the scheduling
system needs to take the latches in question at timer
interrupt time, so those scheduling latches require protection.
Many objects (e.g., in the VM system) are never handled at
interrupt level, and therefore don't need protection.

> It is true that an interrupt may arrive after an object has
> been locked, but that should not cause any harm, right?

Wrong.  The interrupt handler may try to take the latch.  If the latch
was already taken, the interrupt handler would spin forever.
Interrupts cannot be rescheduled the way normal threads can be.

The network service thread (in Mach 2.5 at least) is an example of how
to get around this problem.  The interrupt handler queues work for a
real thread.  The real thread can get rescheduled, so it can take
latches.  Those latches don't need interrupt protection.

Feel free to repost this; I cannot post directly from here.
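
One way to picture the hand-off from interrupt handler to service thread is
a word of pending-work bits. This is only a sketch of the general idea and
makes no claim about how the Mach 2.5 network thread is actually written;
net_interrupt(), service_thread_pass() and the WORK_* bits are hypothetical
names, and the wakeup of the service thread is left out.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical work bits; the names are illustrative only. */
#define WORK_NET_RX 0x1u
#define WORK_NET_TX 0x2u
#define WORK_ALL    (WORK_NET_RX | WORK_NET_TX)

static atomic_uint pending_work;          /* shared between handler and thread */
static unsigned rx_handled, tx_handled;   /* touched only by the service thread */

/* Interrupt level: record what needs doing and return. No latch is taken,
 * so there is nothing here that can spin. */
static void net_interrupt(unsigned what)
{
    assert((what & ~WORK_ALL) == 0);
    atomic_fetch_or(&pending_work, what);
    /* a real kernel would also wake the service thread here */
}

/* Thread level: claim everything posted so far in one atomic step, then do
 * the real work. This code can be rescheduled, so it may take ordinary
 * latches without interrupt protection. Returns the bits it handled. */
static unsigned service_thread_pass(void)
{
    unsigned work = atomic_exchange(&pending_work, 0u);
    if (work & WORK_NET_RX) rx_handled++;
    if (work & WORK_NET_TX) tx_handled++;
    return work;
}
```

Repeated interrupts of the same kind collapse into one bit here, so the
thread-level code has to find out for itself how much work is waiting.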

rchen@m.cs.uiuc.edu (Rong Chen) (04/04/91)

dswartz@bigbootay.sw.stratus.com (Dan Swartzendruber) writes:
>Interrupts must be masked first, since otherwise you have a race condition
>where the lock is acquired on call-side and an interrupt comes in before
>the spl call.  If the interrupt handler tries to acquire the same lock,
>you're stuck, since the only code which can release the lock is the same
>thread which is now servicing the interrupt.

Why do we want the interrupt handlers to acquire locks?  I thought
the interrupt handlers should always be executed from beginning to end
without any complications.  Usually the handlers will only change
some volatile data structures in the kernel space.  Shouldn't we
limit interrupt interface only between kernel and hardware, and
spin-locks only among kernel objects?

-Rong Chen
 Department of Computer Science
 University of Illinois at Urbana-Champaign

dswartz@bigbootay.sw.stratus.com (Dan Swartzendruber) (04/05/91)

In article <1991Apr4.035548.25439@m.cs.uiuc.edu> rchen@m.cs.uiuc.edu (Rong Chen) writes:
:dswartz@bigbootay.sw.stratus.com (Dan Swartzendruber) writes:
::Interrupts must be masked first, since otherwise you have a race condition
::where the lock is acquired on call-side and an interrupt comes in before
::the spl call.  If the interrupt handler tries to acquire the same lock,
::you're stuck, since the only code which can release the lock is the same
::thread which is now servicing the interrupt.
:
:Why do we want the interrupt handlers to acquire locks?  I thought
:the interrupt handlers should always be executed from beginning to end
:without any complications.  Usually the handlers will only change
:some volatile data structures in the kernel space.  Shouldn't we
:limit interrupt interface only between kernel and hardware, and
:spin-locks only among kernel objects?
:

Just out of curiosity, how do you suggest that the interrupt handler
"change the volatile structure" in kernel space in a multiprocessor
environment when there are other threads of control trying to do the
same thing?

:-Rong Chen
: Department of Computer Science
: University of Illinois at Urbana-Champaign


--

Dan S.

mib@geech.gnu.ai.mit.edu (Michael I Bushnell) (04/05/91)

In article <1991Apr4.225931.20534@ready.eng.ready.com> buz@ready.com (Greg Buzzard) writes:

   Certainly one possibility is to not expect the interrupt level code to
   "change the volatile structure" -- it could queue a request for a
   non-interrupt thread to do it.  If there is a synchronization
   "problem" the onus ought to be on the non-interrupt threads to "do the
   right thing"

Pray tell, how do you propose to keep the interrupt handler and the
thread reading from the queue from clashing with each other?  You
can move the volatility from one structure to another, but it remains.
Putting it in a queue doesn't solve the problem, it just moves it.

	-mib

buz@ready.com (Greg Buzzard) (04/05/91)

In article <4813@lectroid.sw.stratus.com>, dswartz@bigbootay.sw.stratus.com (Dan Swartzendruber) writes:
|> In article <1991Apr4.035548.25439@m.cs.uiuc.edu> rchen@m.cs.uiuc.edu (Rong Chen) writes:
|> ...
|> Just out of curiosity, how do you suggest that the interrupt handler
|> "change the volatile structure" in kernel space in a multiprocessor
|> environment when there are other threads of control trying to do the
|> same thing?

Certainly one possibility is to not expect the interrupt level code to
"change the volatile structure" -- it could queue a request for a
non-interrupt thread to do it.  If there is a synchronization
"problem" the onus ought to be on the non-interrupt threads to "do the
right thing"
-- 
Greg Buzzard, Ph.D. 			internet:  buz@eng.ready.com
Ready Systems
470 Potrero Ave.			phone:     408/736-2600
Sunnyvale, CA  94086

dswartz@bigbootay.sw.stratus.com (Dan Swartzendruber) (04/05/91)

In article <1991Apr4.225931.20534@ready.eng.ready.com> buz@ready.com (Greg Buzzard) writes:
>
>In article <4813@lectroid.sw.stratus.com>, dswartz@bigbootay.sw.stratus.com (Dan Swartzendruber) writes:
:|: In article <1991Apr4.035548.25439@m.cs.uiuc.edu> rchen@m.cs.uiuc.edu (Rong Chen) writes:
:|: ...
:|: Just out of curiosity, how do you suggest that the interrupt handler
:|: "change the volatile structure" in kernel space in a multiprocessor
:|: environment when there are other threads of control trying to do the
:|: same thing?
:
:Certainly one possibility is to not expect the interrupt level code to
:"change the volatile structure" -- it could queue a request for a
:non-interrupt thread to do it.  If there is a synchronization
:"problem" the onus ought to be on the non-interrupt threads to "do the
:right thing"

Ummmm, just where is the interrupt level code supposed to queue this
request of yours?  And how is it supposed to arbitrate for access to
that queue with the non-interrupt thread which will be looking on
that queue for requests???  As an aside, are you seriously proposing
TWO context switches for every interrupt or page fault which occurs?

:-- 
:Greg Buzzard, Ph.D. 			internet:  buz@eng.ready.com
:Ready Systems
:470 Potrero Ave.			phone:     408/736-2600
:Sunnyvale, CA  94086


--

Dan S.

bolosky@cs.rochester.edu (Bill Bolosky) (04/05/91)

In article <MIB.91Apr4201536@geech.gnu.ai.mit.edu> mib@geech.gnu.ai.mit.edu (Michael I Bushnell) writes:
>In article <1991Apr4.225931.20534@ready.eng.ready.com> buz@ready.com (Greg Buzzard) writes:
>
>   Certainly one possibility is to not expect the interrupt level code to
>   "change the volatile structure" -- it could queue a request for a
>   non-interrupt thread to do it.  If there is a synchronization
>   "problem" the onus ought to be on the non-interrupt threads to "do the
>   right thing"
>
>Pray tell, how do you propose to keep the interrupt handler and the
>thread reading from the queue from clashing with each other?  You
>can move the volatility from one structure to another, but it remains.
>Putting it in a queue doesn't solve the problem, it just moves it.
>
>	-mib


You could always force the non-interrupt thread that reads the queue to run
on the processor on which the interrupts happen, and use interrupt masking!

This, of course, is just what you had wanted to avoid in the first place
(having interrupts masked), requires an extra context switch per interrupt,
and requires that you have some kind of thread sitting around just to handle
these queue entries.  Hardly among the best ideas I've heard recently.

So...remember to mask those interrupts before grabbing a lock!

Bill

PS. Does this really belong on comp.os.mach?  It's hardly a Mach
    specific discussion at this point.
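
The same-processor arrangement Bill describes can be sketched with a
simulated interrupt level: the thread needs no lock, because raising the
level keeps the handler on that processor from running until the level is
restored. Everything here (sim_splhigh(), sim_splx(), raise_irq(),
queue_len) is a hypothetical model, not kernel code, and it only holds if
the thread really is bound to the interrupting processor.

```c
#include <assert.h>
#include <stdbool.h>

static int  cur_ipl;        /* simulated interrupt level, 0 = everything enabled */
static bool irq_pending;    /* an interrupt arrived while masked */
static int  queue_len;      /* the shared "queue", reduced to its length */

static void irq_handler(void) { queue_len++; }

/* Models the device raising an interrupt on this CPU. */
static void raise_irq(void)
{
    if (cur_ipl > 0) irq_pending = true;   /* masked: held until sim_splx() */
    else irq_handler();
}

static int sim_splhigh(void) { int old = cur_ipl; cur_ipl = 7; return old; }

static void sim_splx(int s)
{
    cur_ipl = s;
    if (cur_ipl == 0 && irq_pending) {     /* deliver what was held off */
        irq_pending = false;
        irq_handler();
    }
}

/* Thread side: empty the queue with interrupts masked.
 * Returns the number of entries it removed. */
static int drain_queue(void)
{
    int s = sim_splhigh();
    int n = queue_len;
    raise_irq();                 /* an interrupt arriving mid-section is held */
    assert(queue_len == n);      /* the handler has not touched the queue */
    queue_len = 0;
    sim_splx(s);                 /* the held interrupt is delivered here */
    return n;
}
```

The model delivers at most one held interrupt per sim_splx(), which is
roughly how a single level-triggered line behaves; the costs Bill lists
(masked interrupts, an extra context switch, a dedicated thread) are not
modelled.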

buz@ready.com (Greg Buzzard) (04/05/91)

In article <MIB.91Apr4201536@geech.gnu.ai.mit.edu>, mib@geech.gnu.ai.mit.edu (Michael I Bushnell) writes:
|> In article <1991Apr4.225931.20534@ready.eng.ready.com> buz@ready.com (Greg Buzzard) writes:
|> 
|>    Certainly one possibility is to not expect the interrupt level code to
|>    "change the volatile structure" -- it could queue a request for a
|>    non-interrupt thread to do it.  If there is a synchronization
|>    "problem" the onus ought to be on the non-interrupt threads to "do the
|>    right thing"
|> 
|> Pray tell, how do you propose to keep the interrupt handler and the
|> thread reading from the queue from clashing with each other?  You
|> can move the volatility from one structure to another, but it remains.
|> Putting it in a queue doesn't solve the problem, it just moves it.

Yeowww...

If the non-interrupt thread is always the reader and the interrupt
level code is always the writer it's pretty easy -- one reads (only)
the tail pointer and updates the head, the other does the converse
(this assumes you're willing to directly share data and that you're
willing to add some low-level intra-kernel signalling mechanism that
the interrupt level code can use).
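
The one-writer, one-reader queue described above might look roughly like
the ring buffer below. It is a sketch under assumptions: spsc_ring_t,
ring_init(), ring_put(), ring_get() and RING_SIZE are made-up names, C11
atomics stand in for whatever memory-ordering guarantees the hardware
gives, and the signalling mechanism that wakes the reader is omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8                /* assumed capacity; must be a power of two */

typedef struct {
    int           slots[RING_SIZE];
    atomic_size_t head;            /* advanced only by the writer (interrupt level) */
    atomic_size_t tail;            /* advanced only by the reader (thread level) */
} spsc_ring_t;

static void ring_init(spsc_ring_t *r)
{
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);
}

/* Writer: reads tail, updates head. Returns false if the ring is full. */
static bool ring_put(spsc_ring_t *r, int v)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    assert(head - tail <= RING_SIZE);
    if (head - tail == RING_SIZE) return false;
    r->slots[head % RING_SIZE] = v;
    /* publish the slot contents before the new head becomes visible */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Reader: reads head, updates tail. Returns false if the ring is empty. */
static bool ring_get(spsc_ring_t *r, int *out)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail) return false;
    *out = r->slots[tail % RING_SIZE];
    /* release the slot only after its contents have been copied out */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

This works only while there is exactly one writer and one reader. With
several processors taking the same interrupt, or several service threads,
the two sides need arbitration again, which is the objection raised
earlier in the thread.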

I agree with Bill Bolosky, this is degenerating away from a Mach
specific discussion.  I seem to be getting the keen sense that this
group has a pretty parochial focus and since I don't want to get into a
wee wee contest with anyone :-) I'll save my "abstract" thoughts on
the general issues involved on this subject for some other audience...
after one final comment. :-)

As for the cost of multiple context switches, if waiting on a lock is
appropriate w.r.t. your overall system and you know that the lock won't
be held for *too* long, then by all means do it.  If your system can't
tolerate this (possibly non-deterministic) wait, then look at the costs
associated with other alternatives, they do exist...  I didn't mean to
rile anyone I was just viewing the problem from a substantially
different perspective, I'll just sit back and observe. :-#

-- 
Greg Buzzard	 			internet:  buz@eng.ready.com
Ready Systems
470 Potrero Ave.			phone:     408/736-2600
Sunnyvale, CA  94086
