[comp.arch] Icache flushes, self-modifying code and multiprocessors

rouellet@crhc.uiuc.edu (Roland G. Ouellette) (12/19/90)

I think we are almost in violent agreement...

> When the code stream changes, it's necessary to cause all data caches in
> an MP system to writeback, and then to invalidate all instruction caches

Let's not say they need to write back...  The instruction cache will not
be a writeback cache on any sensible system.  The caches need to
become coherent.  Flushing them is sufficient.

> (Harvard architecture, instruction caches don't snoop...

Yes I agree.  I missed the MP issue here, where different processors
share the same instruction stream.
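
To make the MP case concrete, here is a minimal toy model in C of the
sequence being discussed: write back the dirty data-cache entries, then
invalidate every instruction cache.  Everything in it (machine_t, store,
ifetch, sync_code, the sizes) is a hypothetical name invented for
illustration, not any real machine's interface, and the caches are
simplified to one entry per memory word.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_WORDS = 16, NCPU = 2 };   /* hypothetical sizes */

typedef struct {                     /* write-back data cache, one entry per word */
    uint32_t data[MEM_WORDS];
    uint8_t  valid[MEM_WORDS];
    uint8_t  dirty[MEM_WORDS];
} dcache_t;

typedef struct {                     /* instruction cache: read-only, never snoops */
    uint32_t data[MEM_WORDS];
    uint8_t  valid[MEM_WORDS];
} icache_t;

typedef struct {
    uint32_t mem[MEM_WORDS];         /* shared main memory */
    dcache_t dc[NCPU];
    icache_t ic[NCPU];
} machine_t;

void machine_init(machine_t *m) { memset(m, 0, sizeof *m); }

/* A store lands only in the writing CPU's data cache (write-back policy). */
void store(machine_t *m, int cpu, int addr, uint32_t value) {
    assert(cpu >= 0 && cpu < NCPU && addr >= 0 && addr < MEM_WORDS);
    dcache_t *dc = &m->dc[cpu];
    dc->data[addr]  = value;
    dc->valid[addr] = 1;
    dc->dirty[addr] = 1;
}

/* An instruction fetch hits in the icache or fills from memory.
   It never looks at any data cache (Harvard split, no snooping). */
uint32_t ifetch(machine_t *m, int cpu, int addr) {
    assert(cpu >= 0 && cpu < NCPU && addr >= 0 && addr < MEM_WORDS);
    icache_t *ic = &m->ic[cpu];
    if (!ic->valid[addr]) {
        ic->data[addr]  = m->mem[addr];
        ic->valid[addr] = 1;
    }
    return ic->data[addr];
}

/* Step 1: write back every dirty data-cache entry to memory.
   Step 2: invalidate every instruction cache on every CPU.
   (The toy ignores two CPUs holding the same word dirty.) */
void sync_code(machine_t *m) {
    for (int cpu = 0; cpu < NCPU; cpu++) {
        dcache_t *dc = &m->dc[cpu];
        for (int addr = 0; addr < MEM_WORDS; addr++) {
            if (dc->valid[addr] && dc->dirty[addr]) {
                m->mem[addr]    = dc->data[addr];
                dc->dirty[addr] = 0;
            }
        }
    }
    for (int cpu = 0; cpu < NCPU; cpu++)
        memset(m->ic[cpu].valid, 0, sizeof m->ic[cpu].valid);
}
```

In this model, if both CPUs have fetched word 3 and CPU 0 then stores a
new value there, ifetch() on either CPU keeps returning the old word
until sync_code() runs; afterwards both refill from memory and see the
new one.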

> Life is somewhat easier on VAX and similar proprietary CISC
> architectures, because there is much flexibility to move the
> implementation from hardware to microcode to an OS trap handler while
> preserving the user-level illusion of direct hardware support for the
> desired functionality.  But even there, I would expect that many
> implementations would implement what appear to be user-level cache
> control operations by trapping to kernel software or microcode, which
> is not exactly direct hardware support.

Actually the cache flush function is trivial...  it is one monster
driver and a few gates per valid bit on the icache.  There's no real
need to trap to the OS, microcode, whatever ... at least for a VLSI
cache tag store.

> Surely the VAX REI instruction doesn't flush all instruction caches in a
> multi-processor?

No...  I think that OS support might be appropriate here, as
processes shouldn't be locked onto specific processors (gang
scheduling is fairly icky).  With gang scheduling, just pinging your
cooperating processes to flush their caches would be OK.

Hmmm... that is interesting... I never thought of multiprocessor
self-modifying code applications.
--
= Roland G. Ouellette			ouellette@tarkin.enet.dec.com	=
= 1203 E. Florida Ave			rouellet@[dwarfs.]crhc.uiuc.edu	=
= Urbana, IL 61801	   "You rescued me; I didn't want to be saved." =
=							- Cyndi Lauper	=

cprice@mips.COM (Charlie Price) (12/29/90)

In article <ROUELLET.90Dec18223208@pinnacle.crhc.uiuc.edu> rouellet@crhc.uiuc.edu (Roland G. Ouellette) writes:
>
>Actually the cache flush function is trivial...  it is one monster
>driver and a few gates per valid bit on the icache.  There's no real
>need to trap to the OS, microcode, whatever ... at least for a VLSI
>cache tag store.

This identifies one of the problems with providing an
invalidate_the_entire_cache operation --
you need to be able to write a whole bunch of bits at one time.
Even if that can be done in custom parts or on-chip,
if you ever hope to build a system that uses more-or-less-standard SRAMs
for the cache (regarded as a "Good Thing" here at MIPS) then it
is probably a mistake to put such a feature into your architecture.
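
The implementation difference can be sketched in C as two toy designs.
This is only an illustration under assumed layouts: onchip_icache_t keeps
all valid bits in one dedicated register that can be cleared at once,
while sram_icache_t keeps the valid bit inside each tag word of an
ordinary SRAM, so every entry has to be written separately.  All names,
the 64-line size, and the "write cycle" counts are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NLINES = 64 };                /* hypothetical cache size in lines */
#define SRAM_VALID 0x80000000u       /* assumed: bit 31 of a tag word is the valid bit */

/* Design A: valid bits held in a dedicated register, one bit per line. */
typedef struct {
    uint64_t valid;                  /* bit i set => line i is valid */
    uint32_t tag[NLINES];
} onchip_icache_t;

/* Design B: valid bit stored in each tag word of a standard SRAM. */
typedef struct {
    uint32_t tagword[NLINES];        /* bit 31 = valid, bits 30..0 = tag */
} sram_icache_t;

/* Clear every valid bit in one operation; returns write cycles used. */
size_t onchip_flush(onchip_icache_t *c) {
    c->valid = 0;
    return 1;
}

/* Walk the SRAM and rewrite each tag word; returns write cycles used. */
size_t sram_flush(sram_icache_t *c) {
    size_t writes = 0;
    for (size_t i = 0; i < NLINES; i++) {
        c->tagword[i] &= ~SRAM_VALID;   /* tag bits are left untouched */
        writes++;
    }
    return writes;
}

int onchip_is_valid(const onchip_icache_t *c, size_t line) {
    assert(line < NLINES);
    return (int)((c->valid >> line) & 1u);
}

int sram_is_valid(const sram_icache_t *c, size_t line) {
    assert(line < NLINES);
    return (c->tagword[line] & SRAM_VALID) != 0;
}
```

In this sketch a full invalidate costs one write in design A and NLINES
writes in design B, and the gap grows with the size of the cache.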

Entirely aside from the implementability of the feature,
I think you would want to examine how programs actually
modify or create code in the data space and then execute it.
Tossing the entire contents of the I-cache and refilling
the useful lines can be a costly proposition.
Unless incremental compiler systems or other users of the feature
create a whole lot of code in between synchronization calls,
they could easily be faster if they selectively invalidate a range
of addresses via user-level instructions or an OS call.
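
A rough way to see the cost argument is a toy direct-mapped I-cache in C
that counts refills after each kind of invalidate.  This is a sketch
under assumed parameters (16-byte lines, 256 lines), and every name in it
(ic_fetch, ic_flush_all, ic_invalidate_range, ic_run) is hypothetical
rather than any real instruction or OS call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { LINE_BYTES = 16, IC_LINES = 256 };   /* assumed geometry: 4 KB direct-mapped */

typedef struct {
    uint8_t  valid[IC_LINES];
    uint32_t tag[IC_LINES];
} icache_t;

static size_t   index_of(uint32_t addr) { return (addr / LINE_BYTES) % IC_LINES; }
static uint32_t tag_of(uint32_t addr)   { return addr / (LINE_BYTES * IC_LINES); }

/* Fetch one address; returns 1 if the line had to be refilled, else 0. */
int ic_fetch(icache_t *c, uint32_t addr) {
    size_t i = index_of(addr);
    if (c->valid[i] && c->tag[i] == tag_of(addr))
        return 0;
    c->valid[i] = 1;
    c->tag[i]   = tag_of(addr);
    return 1;
}

/* Toss the entire contents of the cache. */
void ic_flush_all(icache_t *c) {
    for (size_t i = 0; i < IC_LINES; i++)
        c->valid[i] = 0;
}

/* Invalidate only the lines that hold bytes in [start, start + len). */
void ic_invalidate_range(icache_t *c, uint32_t start, uint32_t len) {
    assert(len > 0);
    uint32_t first = start / LINE_BYTES;
    uint32_t last  = (start + len - 1) / LINE_BYTES;
    for (uint32_t ln = first; ln <= last; ln++) {
        uint32_t addr = ln * LINE_BYTES;
        size_t i = index_of(addr);
        if (c->valid[i] && c->tag[i] == tag_of(addr))
            c->valid[i] = 0;
    }
}

/* Execute straight through [base, base + bytes); returns refill count. */
size_t ic_run(icache_t *c, uint32_t base, uint32_t bytes) {
    size_t misses = 0;
    for (uint32_t a = base; a < base + bytes; a += LINE_BYTES)
        misses += (size_t)ic_fetch(c, a);
    return misses;
}
```

With a warm 4 KB working set, rewriting 64 bytes of code and calling
ic_invalidate_range() costs 4 refills on the next pass in this model,
while ic_flush_all() costs all 256.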


-- 
Charlie Price    cprice@mips.mips.com        (408) 720-1700
MIPS Computer Systems / 928 Arques Ave. / Sunnyvale, CA   94086-23650