hascall@atanasoff.cs.iastate.edu (John Hascall) (02/12/89)
A couple of days ago I posted a question about saving/restoring registers during context switch. A couple of people sent me mail saying that register save/restore was a very small part of the time taken by a context switch. One person claimed about 1%!

What all are you people doing during context switch??? And are they things which are required by the architecture of the machine or by a particular OS?

Tasks done in response to a reschedule interrupt (as I see it):

  1) Block interrupts
  2) Save the current process's registers (GP, pagetable-base, etc.)
  3) Select new process
  4) Invalidate per-process TLB entries
  5) Restore registers
  6) Return from interrupt
  ?) assorted bit and register twiddling

A check of the VMS scheduler interrupt handler shows it to be similar (the longest path through the code looks to be 28 instructions; of course SVPCTX and LDPCTX do the copying of the PCB registers to/from memory (24 longwords), and most assuredly they take a majority of the time spent). You can see for yourself (if you have access to a VMS system) with:

  $ analyze/system
  SDA> exam/noskip SCH$RESCHED;84

(I think it's covered in Levy's book on the VAX-11 architecture too.) I haven't looked at this section of the BSD code (for VAXen) in a while (and it's at home, of course :-), but I'm sure it's not all that different.

John Hascall
ISU Comp Center
(I just work here; any ideas or opinions which may have found their way into the above nonsense are strictly my own.)
asg@pyuxf.UUCP (alan geller) (02/18/89)
In article <788@atanasoff.cs.iastate.edu>, hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
> A couple of days ago I posted a question about saving/restoring
> registers during context switch.
>
> A couple of people sent me mail saying that register save/restore
> was a very small part of the time taken by a context switch. One
> person claimed about 1%!
>
> What all are you people doing during context switch???
> And are they things which are required by the architecture of the
> machine or by a particular OS?
>
> Tasks done in response to a reschedule interrupt (as I see it):
>
>   1) Block interrupts
>   2) Save the current process's registers (GP, pagetable-base, etc.)
>   3) Select new process
>   4) Invalidate per-process TLB entries
>   5) Restore registers
>   6) Return from interrupt
>   ?) assorted bit and register twiddling
>
> A check of the VMS scheduler interrupt handler shows it to be similar
> (the longest path through the code looks to be 28 instructions; of
> course SVPCTX and LDPCTX do the copying of the PCB registers to/from
> memory (24 longwords), and most assuredly they take a majority of the
> time spent).
>
> ...

Well, I don't know what V.2 looks like, but I was working at Bell Labs when we first got System V (this was early 1983) for our VAXen. My office-mate and I were both curious about these things, so we took a walk through the scheduler code. After recoiling in disgust, we rewrote the scheduler (from scratch, mostly in assembler, rather than the 90% C it was delivered in), and got an enormous (5x to 20x measured) performance improvement.

Why? Well, first off, the assembler code that implemented the actual context switch, once the new process had been selected, explicitly saved the registers to a register save area, and then did a SVPCTX; similarly, after the LDPCTX, the registers were restored from the save area. Not that the register values in the save area were any different from those saved by SVPCTX, mind you; the duplication was just to make sure, I guess. We cleaned this up by eliminating the duplication of effort.

Secondly, the scheduler at that time always did a context switch into process 0 first, and then process 0 would switch into whatever the next process should be. This was done so that the system could idle in proc 0 if no other process was able to run; this would keep some poor user's process from getting charged for all of that idle time. We added an idle process that always ran at the lowest possible priority, so that it would get charged for the idle time, and eliminated the extraneous context switch.

Finally, the scheduler always made a full pass through the run list to find the highest-priority runnable process. The run list was a singly-linked, unordered list of process structures. If there were many runnable processes, the time it took to scan the process list could stall the system entirely (about 15 processes doing 1-millisecond naps would stall a VAX-11/780). We replaced the run list with an array of doubly-linked lists (using remque and all those other neat instructions), and used a bitmap that flagged which lists had procs on them; then we could use find-first-set to pick a list, and just dequeue the head of that list to get our next process (we ran over 250 nappers without problems). We made some other modifications to the priority algorithm, but that's not relevant here.

Anyway, there are LOTS of things that operating systems do beyond saving and restoring register sets. Many of them may be pretty silly, but many OSs are not as clean as VMS in this regard.

Alan Geller
Bellcore
...!{princeton,rutgers}!bcr!asg
barmar@think.COM (Barry Margolin) (02/19/89)
In article <497@pyuxf.UUCP> asg@pyuxf.UUCP (alan geller) writes:
> My office-mate and I were both curious about these things, so we took
> a walk through the scheduler code. After recoiling in disgust, we
> rewrote the scheduler (from scratch, mostly in assembler, rather than
> the 90% C it was delivered in), and got an enormous (5x to 20x measured)
> performance improvement.

Why was it necessary to rewrite it in assembler? From your description of the improvements you made, they mostly seemed to be at the algorithm level, not the implementation level. It's true that you used some nice instructions in the implementation of the run queues (which, by the way, seem a lot like the Multics run queues), but you could have done that by writing a small number of assembly subroutines.

And it's not clear how important those special instructions actually are to your code; I'm not really familiar with the VAX, but I suspect that their primary benefit is in user code operating on shared memory, because they are indivisible operations. They're probably not a whole lot faster than the corresponding C code.

Barry Margolin
Thinking Machines Corp.
barmar@think.com
{uunet,harvard}!think!barmar
rpeck@Jessica.stanford.edu (Raymond Peck) (02/28/89)
> In article <788@atanasoff.cs.iastate.edu>, hascall@atanasoff.cs.iastate.edu (John Hascall) writes:
>> A couple of days ago I posted a question about saving/restoring
>> registers during context switch.
>>
>> A couple of people sent me mail saying that register save/restore
>> was a very small part of the time taken by a context switch. One
>> person claimed about 1%!
>>
>> What all are you people doing during context switch???
>> And are they things which are required by the architecture of the
>> machine or by a particular OS?
>>
>> Tasks done in response to a reschedule interrupt (as I see it):
>> [ obvious context-switch steps deleted ]

I think it's interesting that people define context-switch time as the time it takes to do the register saves, etc. There's a whole other issue that is usually ignored, and is only hinted at in Mr. Hascall's posting: cache flush.

Most architectures today do not use PIDs in the cache tags. This, of course, requires all the caches to be flushed (invalidated) upon context switch. So every time you do a switch, you run like molasses until your cache becomes warm again. I think this is *really* where you get hit, not in how sloppily you save the registers. Cold cache misses generate a lot more memory references than register save/restore (unless you have register windows. . . ;-)

(*****************************************************************************)
"Submarines are lurking in my foggy ceiling/ they keep me sleepless at night"
"Read some Kerouac and it put me on the track to burn a little brighter now"
(*****************************************************************************)