[comp.arch] Magnum Workstation and Cached Framebuffers

rowen@mips.com (Chris Rowen) (04/24/91)

> From: keith@xenon.lcs.mit.edu (Keith Packard)
> Subject: Re: Caching the frame buffer
> Organization: MIT Laboratory for Computer Science
> 
> Henry Spencer writes
> 
> > Anybody who caches a frame buffer is crazy.  Especially if the cache isn't
> > write-through, in which case your frame-buffer updates show up on the screen
> > some arbitrary time later!
> 
> While this may in general be true (and I've got numbers to prove it for
> several machines), there are a few memory architectures which are broken
> to the point where it becomes advantageous to cache the frame buffer.
> 
> In particular, the MIPS Magnum memory system will not generate page-mode
> writes except through a cache line flush; therefore writes to uncached
> locations always take a full memory cycle, while sequential writes through
> the cache hit page mode, making large area refreshes quite a bit faster.

Actually, the MIPS Magnum does generate page-mode writes for both
cached and uncached data.  Caching the frame buffer has an important
effect on BLTs.  On an R3000 design with an 8-word cache refill, it
effectively improves the frame buffer read bandwidth by 4x for
scanline-oriented algorithms, and does not affect the write bandwidth.  In
the Magnum 3000/33, for example, this allows aligned memory-to-frame-buffer
and frame-buffer-to-memory copies to run at about 33 Mpixels/s.

However, if the implementer chooses to mix cached and uncached references to
the frame buffer, some care is needed to keep the data consistent.

> 
> To alleviate the cache-busting nature of the frame buffer, they map each
> scan line into 4K bytes, so only 1/4 of the cache is destroyed by frame
> buffer access.

In normal use, successive 2K (1280 visible, 768 non-visible) scan lines
are mapped at 32K intervals.  This means that all scan lines map to
the same 2K region of the cache -- only 1/16 of the cache is destroyed by
the frame buffer.  If you assume that the non-visible portion of the scan
line is rarely touched, only 3.9% of the cache is clobbered.
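
The arithmetic behind those two figures can be sketched in a few lines of C.
This is only an illustration: the 32K direct-mapped cache size is an
assumption inferred from the "2K is 1/16" figure above, and the function
names are made up for the example.

```c
/* Assumed parameters; CACHE_BYTES is inferred from "2K is 1/16 of the cache". */
enum {
    CACHE_BYTES   = 32 * 1024,  /* assumed direct-mapped physical cache size */
    LINE_STRIDE   = 32 * 1024,  /* distance between successive scan lines    */
    LINE_BYTES    = 2 * 1024,   /* full scan line, visible plus non-visible  */
    VISIBLE_BYTES = 1280        /* visible part of a scan line               */
};

/* Byte offset of scan line y from the start of the frame buffer. */
static unsigned long scanline_start(unsigned y)
{
    return (unsigned long)y * LINE_STRIDE;
}

/* Position in a direct-mapped cache that a frame buffer offset lands on. */
static unsigned long cache_offset(unsigned long fb_offset)
{
    return fb_offset % CACHE_BYTES;
}

/* Fraction of the cache covered by a region of the given size. */
static double cache_fraction(unsigned bytes)
{
    return (double)bytes / CACHE_BYTES;
}
```

Under these assumptions cache_offset(scanline_start(y)) is the same for
every y, so every scan line lands on the same 2K of cache;
cache_fraction(LINE_BYTES) is 1/16 and cache_fraction(VISIBLE_BYTES) is
1280/32768, or about 3.9%.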

> This makes things pretty awful for the TLB however.  

Yes, instead of fitting 2 or 3 scan lines on a TLB page, only one fits.

> Which
> would tend to make drawing lines very slow, except that they also provide an
> alternative mapping scheme which maps square sections of the screen into
> sequential memory locations.

This alternative mapping ("packed mode") puts a 64 x 64 pixel square
from the frame buffer into a page -- typically the TLB miss overhead
drops to 0.2 cycles per pixel, from ~7 cycles per pixel for the
conventionally mapped frame buffer or ~14 cycles per pixel for the MIPS
"unpacked mode" frame buffer.  It does add a couple of instructions to the
inner loop of the vector drawing code, but that's still much better than
the conventional mapping.
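
To see why the page count matters, here is a rough sketch that counts how
many 4K pages a vertical line touches under each mapping.  The packed
address calculation is hypothetical: it assumes 8-bit pixels and tiles
stored row-major, 20 to a screen row (1280/64), which is not necessarily
how the hardware orders them.  All names are invented for the example.

```c
/* All layout details below are assumptions made for illustration. */
enum {
    PAGE_BYTES    = 4096,  /* R3000 page size                         */
    TILE          = 64,    /* packed mode: 64 x 64 pixels per page    */
    TILES_PER_ROW = 20     /* assumed: 1280 visible pixels / 64       */
};

/* Conventional mapping: scan lines laid out `pitch` bytes apart. */
static unsigned long linear_offset(unsigned x, unsigned y, unsigned pitch)
{
    return (unsigned long)y * pitch + x;
}

/* Hypothetical packed mapping: each 64 x 64 tile fills one 4K page. */
static unsigned long packed_offset(unsigned x, unsigned y)
{
    unsigned long tile = (unsigned long)(y / TILE) * TILES_PER_ROW + x / TILE;
    return tile * PAGE_BYTES + (unsigned long)(y % TILE) * TILE + x % TILE;
}

/* Count page changes along a vertical line of n pixels starting at (x, y0).
 * `pitch` is ignored when `packed` is nonzero. */
static unsigned pages_touched(unsigned x, unsigned y0, unsigned n,
                              int packed, unsigned pitch)
{
    unsigned count = 0;
    unsigned long last = (unsigned long)-1;  /* no page seen yet */

    for (unsigned i = 0; i < n; i++) {
        unsigned long off = packed ? packed_offset(x, y0 + i)
                                   : linear_offset(x, y0 + i, pitch);
        unsigned long page = off / PAGE_BYTES;
        if (page != last) {
            count++;
            last = page;
        }
    }
    return count;
}
```

For a 64-pixel vertical line starting at the top of a tile, this gives 64
pages with a 4K pitch (one scan line per page), 32 pages with a 2K pitch,
and a single page in the packed layout.  The extra shifts and masks in
packed_offset() correspond to the "couple of instructions" added to the
inner loop.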

> 
> Hack upon Hack upon Hack.


Chris Rowen			- I speak only for myself, etc. -

dd@mips.com (Oh, duct tape has its place...) (04/26/91)

In article <2559@spim.mips.COM> rowen@mips.com (Chris Rowen) writes:
>In normal use, successive 2K (1280 visible, 768 non-visible) scan lines
>are mapped at 32K intervals.  This means that all scan lines map to
>the same 2K region of the cache -- only 1/16 of the cache is destroyed by
>the frame buffer.  If you assume that the non-visible portion of the scan
>line is rarely touched, only 3.9% of the cache is clobbered.

As I mentioned in a previous posting, RISC/os remaps the frame buffer so
that to a user process the scan lines appear to be 4K apart.  However, as
Chris politely reminded me, the Magnum has physical caches, so that this
remapping does not affect the portion of the cache that is "clobbered".
Chris's 3.9% number is correct, and I apologize for claiming otherwise.

If cache conflicts severely degrade performance for a particular operation
such as text drawing, it would probably be worthwhile to make an effort to
allocate storage for the data used by that operation so it doesn't alias
with the frame buffer.  I guess this would be considered a hack.

--
David DiGiacomo, MIPS Computer Systems, Sunnyvale, CA  dd@mips.com