[net.micro.68k] 68030 data cache vs. IO devices

tgl@zog.cs.cmu.edu (Tom Lane) (10/06/86)

I'm a little disturbed by the reports of the new 68030's on-chip
cache for data accesses.  I'm concerned about accesses to I/O devices,
which are (by necessity) memory-mapped in 68xxx machines.

Problem #1: if the chip decides to cache the value read from an
I/O device status register, subsequent reads will not produce the
current contents of that register.
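
A minimal sketch of Problem #1, as a host-side simulation rather than 68030 code. The one-entry `tiny_cache`, the `STATUS_READY` bit, and the `device_status` variable are all assumptions made up for illustration; none of them come from Motorola documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: one status register whose READY bit the
 * hardware sets on its own schedule. */
#define STATUS_READY 0x01u

static uint8_t device_status;   /* what the device really holds */

/* A one-entry "data cache" sitting in front of that register
 * (illustrative only; a real cache has tags, lines, and so on). */
struct tiny_cache {
    bool    valid;
    uint8_t value;
};

/* Read through the cache: once the entry is valid, a read never
 * reaches the device again. */
uint8_t cached_read(struct tiny_cache *c)
{
    assert(c != NULL);
    if (!c->valid) {
        c->value = device_status;   /* miss: fetch from the device */
        c->valid = true;
    }
    return c->value;                /* hit: possibly stale */
}

/* Read with caching inhibited: every access goes to the device. */
uint8_t uncached_read(void)
{
    return device_status;
}

/* Simulate the device changing state behind the CPU's back. */
void device_becomes_ready(void)
{
    device_status |= STATUS_READY;
}
```

A polling loop built on `cached_read` would spin forever once the first read has filled the entry, while the same loop built on `uncached_read` sees the READY bit as soon as the device sets it.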

This problem also applies to shared variables in a multi-CPU, shared
memory machine (which is supposedly a design target for Motorola ...
remember TAS, CAS and them other funny instructions?)

Problem #2: the caches are apparently built to slurp in 16 bytes
around any referenced location, on the theory that adjacent words
may be required soon.  (While this makes good sense for instructions,
I'm not at all sure that I buy it for data.)  This is *extremely
dangerous* for I/O devices, as adjacent locations (1) may not exist,
or (2) may have some active response to being read ... e.g. clearing
an interrupt request.

Now I assume that the 68030 group are not fools, and that they
thought about this problem.  What did they do about it?  Is there
a mechanism for keeping certain (ranges of?) addresses from being
cached?  How do they expect software to handle shared variables?

Looking forward to enlightenment...

				tom lane
-----
ARPA: lane@A.CS.CMU.EDU
UUCP: ...!seismo!a.cs.cmu.edu!lane

witters@fluke.UUCP (10/09/86)

> Problem #1: if the chip decides to cache the value read from an
> I/O device status register, subsequent reads will not produce the
> current contents of that register.
> 

I've read the data sheet for the 68851 PMMU, and the PMMU in the 68030 is
supposed to be a stripped down version of the 68851.  The 68851's translation
descriptor for a page has a cache inhibit bit which, when set, will cause the
CLI* (Cache Load Inhibit) signal to be asserted when the page is accessed.
This signal can be used to inhibit the loading of an external data cache.  I
assume that the PMMU in the 68030 does the same thing internally with its data
cache as the 68851 would do with an external data cache.

> Problem #2: the caches are apparently built to slurp in 16 bytes
> around any referenced location, on the theory that adjacent words
> may be required soon.  (While this makes good sense for instructions,
> I'm not at all sure that I buy it for data.)  This is *extremely
> dangerous* for I/O devices, as adjacent locations (1) may not exist,
> or (2) may have some active response to being read ... e.g. clearing
> an interrupt request.

Well, I would guess that the cache load inhibit bit would prevent this if it is
set.  I assume that it is the cache management hardware in the 68030 which does
the 'slurping', and not the CPU.  There isn't an analog to this in the
68020/68851 combination, so I can't speak with authority here.  Anyone from
Motorola care to comment?

Your second problem raises another question.  What happens if one of the 16
bytes crosses a page boundary?  Does the slurping stop at the page boundary,
or does the 68030 try to read from the next page?  Without thinking about it
deeply, it seems possible that this could cause problems if the next page isn't
mapped, or is mapped to an I/O device.

-- 
						I'm not a lumberjack
						and I'm not O.K.

						John Witters
						John Fluke Mfg. Co.  Inc.
						P.O.B. C9090 M/S 245F
						Everett, Washington  98206

						(206) 356-5274

markp@valid.UUCP (Mark P.) (10/09/86)

> I'm a little disturbed by the reports of the new 68030's on-chip
> cache for data accesses.  I'm concerned about accesses to I/O devices,
> which are (by necessity) memory-mapped in 68xxx machines.
> 
> Problem #1: if the chip decides to cache the value read from an
> I/O device status register, subsequent reads will not produce the
> current contents of that register.
> 
> This problem also applies to shared variables in a multi-CPU, shared
> memory machine (which is supposedly a design target for Motorola ...
> remember TAS, CAS and them other funny instructions?)
> 
> Problem #2: the caches are apparently built to slurp in 16 bytes
> around any referenced location, on the theory that adjacent words
> may be required soon.  (While this makes good sense for instructions,
> I'm not at all sure that I buy it for data.)  This is *extremely
> dangerous* for I/O devices, as adjacent locations (1) may not exist,
> or (2) may have some active response to being read ... e.g. clearing
> an interrupt request.
> 
> Now I assume that the 68030 group are not fools, and that they
> thought about this problem.  What did they do about it?  Is there
> a mechanism for keeping certain (ranges of?) addresses from being
> cached?  How do they expect software to handle shared variables?
> 
> Looking forward to enlightenment...
> 
> 				tom lane

In a virtual memory system, the solution is easy.  The 68851 provided
a bit in the page descriptor called CI (cache inhibit).  When a reference
is made to a page with CI set, a pin CLI/ is asserted by the 68851,
indicating to an external data cache that it should bypass.  The 68030
MMU is said to be compatible with the 68851, so one assumes that a
similar (i.e. identical) arrangement exists.  What this means to a Unix
implementation is that, upon initialization of the I/O devices, these
bits must selectively be set for the noncacheable areas of the memory
map.  Of course, it undoubtedly involves more, but I am not a wizard.
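
The initialization step might look roughly like the sketch below. Everything concrete in it is an assumption: the flat one-level `page_table`, the 4 KB page size, and above all the bit position given to `PTE_CI`, which is illustrative and not taken from the 68851 or 68030 documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor layout: the position of the cache-inhibit
 * bit here is made up for the example. */
#define PTE_CI     (1u << 6)
#define PAGE_SHIFT 12u          /* assumed 4 KB pages */

/* Set CI on every page descriptor covering [start, end).
 * page_table is assumed to be a flat array indexed by page number;
 * a real PMMU translation tree has more levels than this. */
void mark_uncacheable(uint32_t *page_table, size_t n_entries,
                      uint32_t start, uint32_t end)
{
    assert(page_table != NULL);
    assert(start < end);

    size_t first = start >> PAGE_SHIFT;
    size_t last  = (end - 1u) >> PAGE_SHIFT;
    assert(last < n_entries);

    for (size_t i = first; i <= last; i++)
        page_table[i] |= PTE_CI;
}
```

A kernel would call something like this once per device region at boot, before any driver touches the registers.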

The solution is somewhat harder when you are not using virtual memory--
in essence you must surround the possibly offending accesses by instructions
to disable operation of the data cache.  This would be done by a bit
similar to the freeze-I-cache bit in the CACR (cache control register) of
the 68020.

Similar arguments apply to software semaphores-- i.e. they must be
either placed in a non-cacheable region of memory or accessed without
fear of intervention by the data cache.  Presumably, Motorola was
intelligent enough to force read-modify-write cycles (i.e. TAS/CAS) to
ALWAYS bypass cache, though.  Motorola appears to have ignored the cache
consistency issue, with no apparent support for external invalidations.
This will, of course, delight those who would exploit the 68030 in
multiprocessor shared-memory systems by the added baggage which programming
requires.
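
For readers who have not used TAS, here is a sketch of the kind of semaphore at issue, written with the C11 `atomic_flag` operations as a stand-in for the 68k instruction. The `spinlock` type and function names are hypothetical. The sketch shows only the test-and-set idiom; whether the underlying read-modify-write cycle bypasses a data cache is a property of the hardware, not of this code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* A lock built on a single test-and-set flag. */
typedef struct {
    atomic_flag flag;
} spinlock;

/* Try to take the lock.  Test-and-set returns the previous value, so
 * the lock was free (and is now ours) only if that value was clear. */
bool spin_trylock(spinlock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set(&l->flag);
}

/* Release the lock by clearing the flag. */
void spin_unlock(spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->flag);
}
```

The idiom works only if every processor's test-and-set sees the one true copy of the flag, which is why the flag must either live in a non-cacheable region or be accessed by cycles that go around the cache.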

You have brought out a very important point here, that the data cache is
undoubtedly the most severe source of potential incompatibility between
the 68030 and the 68020.  However, as most systems using it will use
virtual memory (due to its price placing it in that market), such
incompatibility will be hidden from the application level, and require
only changes to the kernel/device drivers.

Always glad to enlighten... :-)

	Mark Papamarcos
	Valid Logic Systems
	hplabs!ridge!valid!markp

"Have you hugged your Futurebus today?"

henry@utzoo.UUCP (Henry Spencer) (10/10/86)

> ...  Is there
> a mechanism for keeping certain (ranges of?) addresses from being
> cached?  ...

The orthodox solution to the problem is to have a "don't cache" bit in
the page table entries, and have the MMU and the cache collaborate so
that pages with that bit on do not get cached.  Given that both the MMU
and the cache are on-chip in the 030, that is probably what Motorola
has done.  (I don't remember the PMMU specs well enough to know whether
it has provisions for this...)
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

davet@oakhill.UUCP (Dave Trissel) (10/10/86)

In article <1007@zog.cs.cmu.edu> tgl@zog.cs.cmu.edu (Tom Lane) writes:
>
>Problem #1: if the chip decides to cache the value read from an
>I/O device status register, subsequent reads will not produce the
>current contents of that register.

Just like the MC68851 Paged Memory Management Unit for the MC68020, the
MC68030 MMU descriptors support a cache inhibit status bit.  This forces
all bus cycles on pages so marked to bypass the on-chip caches.  All I/O
devices are normally attached to contiguous memory blocks so the only
requirement is to have the cache inhibit bit set for the descriptors
associated with these blocks.

>This problem also applies to shared variables in a multi-CPU, shared
>memory machine (which is supposedly a design target for Motorola ...
>remember TAS, CAS and them other funny instructions?)

Several solutions here.  First, the cache inhibit bit can be set for sharable
memory as described above.  Second, any external bus read request by the
processor can have the data returned with a signal indicating that it should
not be cached.  Of course, if you have an O.S. which arbitrarily makes all
memory sharable (this is rare) you can always disable the data cache, a rather
extreme thing to have to do.  We are at present examining ways to get around
this last solution.
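
The second option, a signal returned with the data, amounts to an address decoder in external logic. A minimal sketch of that decode follows, written as a C predicate for clarity; the memory map (`IO_BASE`, `IO_SIZE`) and the function name are hypothetical, and real hardware would implement this in gates or a PAL, not in software.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory map: a 64 KB I/O region. */
#define IO_BASE 0x00FF0000u
#define IO_SIZE 0x00010000u

static_assert((IO_SIZE & (IO_SIZE - 1u)) == 0,
              "I/O region size must be a power of two");
static_assert((IO_BASE & (IO_SIZE - 1u)) == 0,
              "I/O region must be aligned to its size");

/* True when external logic should tell the processor not to cache
 * the data being returned for this address. */
bool cache_inhibit_in(uint32_t addr)
{
    return (addr & ~(IO_SIZE - 1u)) == IO_BASE;
}
```

Because the decision is made per bus cycle from the address alone, this approach needs no MMU tables at all.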

All instructions which lock the external CPU (such as those you mentioned)
always bypass the on-chip data cache.

  -- Dave Trissel Motorola, Austin  {ihnp4,siesmo}!ut-sally!im4u!oakhill!davet

markp@valid.UUCP (Mark P.) (10/11/86)

> [stuff deleted]
> 
> Your second problem raises another question.  What happens if one of the 16
> bytes crosses a page boundary?  Does the slurping stop at the page boundary,
> or does the 68030 try to read from the next page?  Without thinking about it
> deeply, it seems possible that this could cause problems if the next page
> isn't mapped, or is mapped to an I/O device.
> 
> 						John Witters

This is physically (pun intended) impossible, as cache block transfers are
necessarily aligned to block-aligned boundaries (i.e. transfer always starts
with A<3..2>=00).  So as long as you don't use 8-byte or smaller pages,
which you probably won't, you're okay.
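
The alignment argument can be checked with a few lines of arithmetic. This is a sketch assuming 16-byte blocks and power-of-two page sizes; the function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LINE_BYTES 16u

/* Base address of the aligned 16-byte block containing addr. */
uint32_t line_base(uint32_t addr)
{
    return addr & ~(LINE_BYTES - 1u);
}

/* True if the whole block containing addr lies within one page.
 * page_size must be a nonzero power of two. */
bool line_stays_in_page(uint32_t addr, uint32_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1u)) == 0);

    uint32_t first = line_base(addr);
    uint32_t last  = first + (LINE_BYTES - 1u);
    uint32_t page_mask = ~(page_size - 1u);

    /* Same page number for the first and last byte of the block. */
    return (first & page_mask) == (last & page_mask);
}
```

For an access at 0x1FFC the block runs from 0x1FF0 to 0x1FFF, so it stays inside one page for any page size of 16 bytes or more, and only straddles a boundary with pages of 8 bytes or fewer.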

	Mark Papamarcos
	Valid Logic
	hplabs!ridge!valid!markp

hunter@oakhill.UUCP (Hunter Scales) (10/11/86)

In article <1624@vax1.fluke.UUCP> witters@fluke.UUCP writes:
>> Problem #1: if the chip decides to cache the value read from an
>> I/O device status register, subsequent reads will not produce the
>> current contents of that register.
>> 
>
>I've read the data sheet for the 68851 PMMU, and the PMMU in the 68030 is
>supposed to be a stripped down version of the 68851.  The 68851's translation
>descriptor for a page has a cache inhibit bit which, when set, will cause the
>CLI* (Cache Load Inhibit) signal to be asserted when the page is accessed.
>This signal can be used to inhibit the loading of an external data cache.  I
>assume that the PMMU in the 68030 does the same thing internally with its data
>cache as the 68851 would do with an external data cache.

	This is exactly correct.


>
>> Problem #2: the caches are apparently built to slurp in 16 bytes
>> around any referenced location, on the theory that adjacent words
>> may be required soon.  (While this makes good sense for instructions,
>> I'm not at all sure that I buy it for data.)  This is *extremely
>> dangerous* for I/O devices, as adjacent locations (1) may not exist,
>> or (2) may have some active response to being read ... e.g. clearing
>> an interrupt request.
>
>Well, I would guess that the cache load inhibit bit would prevent this if it is
>set.  I assume that it is the cache management hardware in the 68030 which does
>the 'slurping', and not the CPU.  There isn't an analog to this in the
>68020/68851 combination, so I can't speak with authority here.  Anyone from
>Motorola care to comment?

	Burst fetches are inhibited if the CI (cache inhibit) bit is set.


>
>Your second problem raises another question.  What happens if one of the 16
>bytes crosses a page boundary?  Does the slurping stop at the page boundary,
>or does the 68030 try to read from the next page?  Without thinking about it
>deeply, it seems possible that this could cause problems if the next page isn't
>mapped, or is mapped to an I/O device.
>

	The burst fetches are modulo(4) (long words).  As long as your
page size is greater than 16 bytes, you will never cross a page boundary
on bursts.
>-- 
>
>						John Witters


-- 
Motorola Semiconductor Inc.                Hunter Scales
Austin, Texas           {ihnp4,seismo,ctvax,gatech}!ut-sally!oakhill!hunter

(I am responsible for myself and my dog and no-one else)

john@frog.UUCP (John Woods, Software) (10/24/86)

> > ...  Is there
> > a mechanism for keeping certain (ranges of?) addresses from being
> > cached?  ...
> 
> The orthodox solution to the problem is to have a "don't cache" bit in
> the page table entries, and have the MMU and the cache collaborate so
> that pages with that bit on do not get cached.  Given that both the MMU
> and the cache are on-chip in the 030, that is probably what Motorola
> has done.  (I don't remember the PMMU specs well enough to know whether
> it has provisions for this...)
> -- 
> 				Henry Spencer @ U of Toronto Zoology
> 				{allegra,ihnp4,decvax,pyramid}!utzoo!henry
> 

The PMMU has a "don't even THINK of caching references through me" bit in
the PTEs (or whatever they are called).  For amusement's sake, there is also
a "don't cache this reference" pin coming into the chip; I suppose that a
system running with the PMMU disabled could use address decoders to recognize
"the IO page" and specifically disable caching for it.

--
John Woods, Charles River Data Systems, Framingham MA, (617) 626-1101
...!decvax!frog!john, ...!mit-eddie!jfw, jfw%mit-ccc@MIT-XX.ARPA

"Soylent Green is People Helping People!"