mlord@bwdls58.UUCP (Mark Lord) (01/15/91)
How smart are the stack-cache designs that have been looked at?
I'm curious whether a proposal such as the following has already
been researched.

-------

The procedure call stack is probably the most heavily used chunk of
memory in most time-sharing systems, such as *nix or whatever. It
follows that hardware optimization of this resource might provide
significant performance improvements in just about any RISC/CISC system.

Specifically, how about an operating system controlled cache, devoted to
caching the call stack of the current task and/or interrupt handler?

This stack would have the following key characteristics:

  1) copyback cache - line size of 64 bits. (determine optimum size ?)
  2) one-way direct mapping.
  3) O/S writes start & size registers for memory range to be cached,
     during context switches.
  4) cache size >= average task stack size (4-8K ?)
  5) copyback has lowest bus priority - gets done only when nothing
     else wants the bus (reads, writes, dma, other caches).
  6) at context switch time, O/S uses hardware mechanism to clear the
     copyback flags for locations beyond top of stack. This could
     eliminate a lot of unnecessary copybacks. The same strategy could
     also be used by interrupt handlers, if the cache is to be shared.

So the idea is that, at each context switch, the O/S purges the cache of
any pending copybacks for items no longer "on the stack", and then sets
a new range of addresses to be cached for the next task. Stack locations
do not get copied back to memory until absolutely necessary (some other
location needs to be cached in the same slot of the cache.. gotta clear
it first!), unless there are free bus cycles that would otherwise be
unused.

I've seen discussion of stack caching here in the past, but I don't
recall seeing suggestions 3) and 6) discussed. Without them, stack
caching appears to gain nothing over simply using a larger general
purpose cache. But with them.. ?

Undoubtedly this has problems, but perhaps others here can help iron
them out.
--
 ___Mark S. Lord__________________________________________
| ..uunet!bnrgate!mlord%bmerh724 | Climb Free Or Die (NH) |
| MLORD@BNR.CA   Ottawa, Ontario | Personal views only.   |
|________________________________|________________________|
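
The six characteristics in the post above can be sketched as a small simulation. This is a minimal model under stated assumptions, not a hardware design: the class `StackCache`, its method names, the 512-line size (4K of 64-bit lines), a downward-growing stack, and whole-line writes are all illustrative choices that the post does not specify. Item 5 (copyback during idle bus cycles) is not modelled; dirty lines are copied back only on eviction or an explicit flush.

```python
from dataclasses import dataclass

LINE_BYTES = 8  # 64-bit lines, per item 1 of the proposal


@dataclass
class Line:
    valid: bool = False
    dirty: bool = False   # the "copyback flag"
    tag: int = 0          # line-aligned address held in this slot
    data: int = 0


class StackCache:
    """Hypothetical direct-mapped copyback cache for one stack range."""

    def __init__(self, memory, num_lines=512):   # 512 * 8 bytes = 4K (assumed)
        self.memory = memory                     # dict modelling RAM: line address -> value
        self.lines = [Line() for _ in range(num_lines)]
        self.base = 0
        self.size = 0
        self.copybacks = 0                       # number of lines written back to memory

    def set_range(self, base, size):
        """Item 3: the O/S loads start and size registers at context switch."""
        self.base, self.size = base, size

    def covers(self, addr):
        """True if this address would be routed to the stack cache."""
        return self.base <= addr < self.base + self.size

    def _slot(self, addr):
        line_addr = addr - addr % LINE_BYTES
        index = (line_addr // LINE_BYTES) % len(self.lines)   # item 2: direct mapping
        return line_addr, self.lines[index]

    def _evict(self, line):
        if line.valid and line.dirty:            # copy back only when dirty
            self.memory[line.tag] = line.data
            self.copybacks += 1
        line.valid = line.dirty = False

    def write(self, addr, value):
        line_addr, line = self._slot(addr)
        if not (line.valid and line.tag == line_addr):
            self._evict(line)                    # slot conflict: clear it first
            line.valid, line.tag = True, line_addr
        line.data, line.dirty = value, True      # simplification: whole-line write

    def read(self, addr):
        line_addr, line = self._slot(addr)
        if not (line.valid and line.tag == line_addr):
            self._evict(line)
            line.valid, line.tag = True, line_addr
            line.data = self.memory.get(line_addr, 0)
        return line.data

    def discard_dead(self, sp):
        """Item 6: drop dirty lines lying wholly below sp (stack grows down)."""
        dropped = 0
        for line in self.lines:
            if line.valid and line.dirty and line.tag + LINE_BYTES <= sp:
                line.valid = line.dirty = False
                dropped += 1
        return dropped

    def flush(self):
        for line in self.lines:
            self._evict(line)


memory = {}
cache = StackCache(memory)
cache.set_range(base=0x7000, size=0x1000)   # assumed stack region 0x7000..0x7FFF
sp = 0x8000
for i in range(16):                          # push 16 words down a call chain
    sp -= LINE_BYTES
    cache.write(sp, i)
sp += 12 * LINE_BYTES                        # most calls return; 12 lines are now dead
dropped = cache.discard_dead(sp)             # what the O/S would do at context switch
cache.flush()                                # force out whatever is still live
print(dropped, cache.copybacks)              # prints: 12 4
```

In this run, clearing the copyback flags below the stack pointer avoids 12 of the 16 write-backs that a general-purpose copyback cache would eventually have to perform, which is the saving that items 3 and 6 are aiming at.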
ddr@cs.edinburgh.ac.uk (Doug Rogers) (01/15/91)
In article <5229@bwdls58.UUCP>, mlord@bwdls58.UUCP (Mark Lord) writes:
> How smart are the stack-cache designs that have been looked at?
> I'm curious whether a proposal such as the following has already
> been researched.
>
> Specifically, how about an operating system controlled cache, devoted to
> caching the call stack of the current task and/or interrupt handler?
>
> This stack would have the following key characteristics:
>
> 5) copyback has lowest bus priority - gets done only when nothing
>    else wants the bus (reads, writes, dma, other caches).

The problem is maintaining cache consistency in multiprocessor systems.
Within the Futurebus spec. one user has the right to modify at any time.
Clearly, if there were only one copy of the information, it would not
matter how slowly the information is passed back. But what happens if,
while this is going on, another cache wishes to gain access to this
information? (This would include the data cache for the same processor.)

--
Douglas Rogers                   JANET: ddr@uk.ac.ed.lfcs
Department of Computer Science   UUCP:  ..!mcvax!ukc!lfcs!ddr
University of Edinburgh          ARPA:  ddr%lfcs.ed.ac.uk@nsfnet-relay.ac.uk
Edinburgh EH9 3JZ, UK.           Tel:   031-650 5172 (direct line)
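
The hazard raised in the post above can be illustrated with a toy model. This is only a sketch: the names `pending`, `stack_write`, and `other_read`, and the `snoop` flag, are hypothetical. Snooping is one common way of keeping caches consistent and is assumed here for contrast; the post does not prescribe a mechanism.

```python
memory = {0x7FF8: 1}   # main memory still holds the old value
pending = {}           # dirty stack-cache lines awaiting low-priority copyback


def stack_write(addr, value):
    """The processor writes to its stack; the line stays dirty in the stack cache."""
    pending[addr] = value


def other_read(addr, snoop):
    """A read by another cache. With snoop=True (an assumed mechanism),
    the stack cache supplies its dirty line instead of memory."""
    if snoop and addr in pending:
        return pending[addr]
    return memory[addr]


stack_write(0x7FF8, 2)
stale = other_read(0x7FF8, snoop=False)   # sees the old value, 1
fresh = other_read(0x7FF8, snoop=True)    # sees the new value, 2
```

The longer a copyback is deferred, the longer the window in which the unsnooped read returns the old value.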
mlord@bwdls58.bnr.ca (Mark Lord) (01/16/91)
In article <4513@skye.cs.ed.ac.uk> ddr@cs.edinburgh.ac.uk (Doug Rogers) writes:
<In article <5229@bwdls58.UUCP>, mlord@bwdls58.UUCP (Mark Lord) writes:
<>
<> Specifically, how about an operating system controlled cache, devoted to
<> caching the call stack of the current task and/or interrupt handler?
<>
<> 5) copyback has lowest bus priority - gets done only when nothing
<>    else wants the bus (reads, writes, dma, other caches).
<
<The problem is maintaining cache consistency in multiprocessor systems.
<Within the Futurebus spec. one user has the right to modify at any time.
<Clearly, if there were only one copy of the information, it would not
<matter how slowly the information is passed back. But what happens if,
<while this is going on, another cache wishes to gain access to this
<information? (This would include the data cache for the same processor.)

I'm not particularly worried about the multi-processor case for this
stack cache. Shared memory variables tend not to be procedure locals on
a call stack, as it becomes tricky to communicate their current
addresses to other processors.

Also, note that this proposal is in addition to regular data caches,
which can handle multi-processor consistency using one's favorite means
for the REST of memory, just not the process STACKs.

We could agree to document this very unlikely deficiency and let the O/S
programmers read our notes before they try to set up globals in stack
variables which are shared between CPUs. :)