[comp.arch] Memory hierarchy

devine@shodha.enet.dec.com (Bob Devine) (03/28/91)

In article <1998@kuling.UUCP>, irf@kuling.UUCP (Bo Thide') writes:
> Now that the Snakes (HP9000/700 series HP-PA 1.1 RISC workstations) are let
> loose, the official HP info has become available.  Some of this info follows.
> 
> Cache: 128 kB instr/256 kB data (720, 730), 256 kB instr/256 kB data.

The sizes of the caches used in HP Snake systems are interesting.
A bit more than a decade ago (or more than 3 generations ago
if expressed in "product cycle years"), the first release of
the VAX 11/780 had a minimum main memory of 256 kB.

The memory hierarchy has, so far, been contained inside the "box"
and hasn't extended to the disks.  Disks are, for the most part,
still treated as a single level.  Even RAIDs are made to look
like one big disk.  Some research efforts have been made to
move files off and return files to disk based upon access patterns.

My questions are to the folks with a /dev/crystal_ball: when will
two level processor caches be here?  When will a storage hierarchy
be extended to disks (or to ram-disk, or ... etc)?

in a wondering mood,
Bob Devine

preston@ariel.rice.edu (Preston Briggs) (03/28/91)

devine@shodha.enet.dec.com (Bob Devine) writes:

>The memory hierarchy has, so far, been contained inside the "box"
>and hasn't extended to the disks.
...
>My questions are to the folks with a /dev/crystal_ball: when will
>two level processor caches be here?  When will a storage hierarchy
>be extended to disks (or to ram-disk, or ... etc)?

I think it's happening now.
To the aggressive compiler or programmer, the memory hierarchy
is already extensive:

		registers
		cache
		(secondary cache -- does MIPS have one already?)
		TLB
		RAM
		disk

and some efforts are made to block code for each of these levels.
It's the exercise of blocking code (say matrix multiply) for
cache that makes you want lots of set associativity.
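The blocking of matrix multiply mentioned above can be sketched like so (Python used for compactness; the tile size is a stand-in for whatever fraction of your cache a tile should occupy):

```python
# Sketch: blocked (tiled) matrix multiply.  The three outer loops walk
# tiles; the three inner loops do the arithmetic on a small working set
# that can stay cache-resident while it's being reused.

def matmul_blocked(A, B, n, tile=2):
    """Multiply two n x n matrices (lists of lists), tile by tile."""
    C = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # one tile's worth of work: these loops touch only
                # tile-sized slices of A, B, and C
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        a = A[i][k]
                        for j in range(jj, min(jj + tile, n)):
                            C[i][j] += a * B[k][j]
    return C
```

Note that the active tiles of A, B, and C can land on the same cache sets, which is exactly why blocked code wants set associativity: a direct-mapped cache can thrash on conflict misses even when the tiles fit.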

It took me a while to think of the TLB as a level in the hierarchy,
but it's true.  Data "in" the TLB can be accessed more quickly than
data outside the TLB.  Same problems with replacement and set associativity
as with cache.  Much faster replacement, though!
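A toy model of the TLB-as-a-cache-level idea (fully associative with LRU replacement here for simplicity; real TLBs vary, and every name below is made up for illustration):

```python
from collections import OrderedDict

class TinyTLB:
    """Toy fully-associative TLB with LRU replacement."""

    def __init__(self, entries=4, page_size=4096):
        self.entries = entries
        self.page_size = page_size
        self.map = OrderedDict()      # vpn -> pfn, kept in LRU order
        self.hits = self.misses = 0

    def translate(self, vaddr, page_table):
        vpn = vaddr // self.page_size
        if vpn in self.map:
            self.hits += 1
            self.map.move_to_end(vpn)         # mark most-recently-used
        else:
            self.misses += 1                  # would stall for a walk/refill
            if len(self.map) >= self.entries:
                self.map.popitem(last=False)  # evict the LRU entry
            self.map[vpn] = page_table[vpn]
        return self.map[vpn] * self.page_size + vaddr % self.page_size
```

Same replacement and associativity questions as a data cache, just over pages instead of lines.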

Preston Briggs

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (03/29/91)

In article <2832@shodha.enet.dec.com> devine@shodha.enet.dec.com (Bob Devine) writes:

| My questions are to the folks with a /dev/crystal_ball: when will
| two level processor caches be here?  When will a storage hierarchy
| be extended to disks (or to ram-disk, or ... etc)?

  You don't need the hardware assist to answer this one; you can do it
with no balls at all.  NOW.

  The intel 486 has the pipeline, on chip cache, off chip cache, main
memory, and virtual memory. That's multilevel by any definition.

  Companies like Epoch have fileservers now which do caching, using
many MB of memory, then a GB or so of hard disk, and ending with 30GB
or so of optical. There are other companies doing it, too, and I think
Plexus (the software reincarnation) is doing something like this on a
portable basis.

-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
        "Most of the VAX instructions are in microcode,
         but halt and no-op are in hardware for efficiency"

devine@shodha.enet.dec.com (Bob Devine) (03/30/91)

I asked:
> My questions are to the folks with a /dev/crystal_ball: when will
> two level processor caches be here?  When will a storage hierarchy
> be extended to disks (or to ram-disk, or ... etc)?

Several folks have sent mail answering the first question by telling
me that some current generation and near-future processors will
have two level caches (eg, MIPS R6000, some Intel 486s, future Moto
68040, and future SPARC chips were mentioned).  Thanks!

However, the point I would like to make is that the processors are
beating the rest of the system for performance.  Many problems
that were cpu-bound are now I/O bound or have bottlenecks that moved.
The basic cpu/memory/disk hierarchy has been expanding so that at
each of the interfaces more levels will be used.  There are a few
companies that are looking at the storage issues but no revolutionary
proposal has emerged -- news on that was what I was fishing for...

Bob Devine

mash@mips.com (John Mashey) (03/31/91)

In article <1991Mar28.152952.18380@rice.edu> preston@ariel.rice.edu (Preston Briggs) writes:

>I think it's happening now.
>To the aggressive compiler or programmer, the memory hierarchy
>is already extensive
>
>		registers
>		cache
>		(secondary cache -- does MIPS have one already?)
>		TLB
>		RAM
>		disk

To answer the questions: sure.

1) R3000s have occasionally had secondary caches bolted on the outside
(ex:SGI MP).
This is fairly similar to 680x0 and i486, where the primary design
point seemed to be for 1-level caches, while allowing 2-level,
but with less support built directly into the chip.  [Maybe designers
of these would comment on whether this impression of priorities is accurate.]

2) R6000s were designed from day 1 to use 2-level caches.

3) For R4000s, the primary design point in many ways is for 2-level caches,
although the 1-level single-chip version is also an important point, 
and gets more prevalent with time as the on-chip caches get bigger,
reducing the performance delta between R4000 & R4000+scache for
some applications.  A distinct difference found in the R4000s is the
inclusion of full 2-level cache-coherency control on the chip.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	 mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash 
DDD:  	408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: 	MIPS Computer Systems MS 1/05, 930 E. Arques, Sunnyvale, CA 94086

elg@elgamy.RAIDERNET.COM (Eric Lee Green) (03/31/91)

From article <2845@shodha.enet.dec.com>, by devine@shodha.enet.dec.com (Bob Devine):
> However, the point I would like to make is that the processors are
> beating the rest of the system for performance.  Many problems
> that were cpu-bound are now I/O bound or have bottlenecks that moved.

Looking at the LFS paper, with enough memory, disk caching can keep up to
90% of the disk working-set in RAM for "typical" users.

This, however, ignores two particular application sticky points: random
access of large databases, and streamed access of large data sets. Often
large databases in the commercial world take up entire disk ranches, with
minimal predictability regardless of any cache size you can think of. The
possibility of a disk hierarchy (similar to the early days) of large slow
disk and smaller faster disk isn't really valid, because you simply can't
predict where you're getting data next (except perhaps for often-accessed
index files and such, but RAM cache can often handle that). The only real
answer there is what the mainframe folks do -- a bunch of controllers and
disks, and do things in parallel (disk access wise) as much as possible.

Sequential access... operating systems are the basic foe here. Many
operating systems do not contain provisions for easily streaming files into
a user's dataspace. To some extent virtual memory is an enemy here... what
is, on a non-VM machine, a simple operation (simply hand off the address of
the user's data space to the DMA disk controller, and go do something else
while it streams the entire file into memory) becomes fraught with
complexity when dealing with a MMU. If I recall right, the vast majority of
Unix kernels basically do I/O a block at a time into cache RAM, then copy
it by hand into the user's data buffers. Some do pre-reading, which
speeds sequential access, but puts random access into the doghouse
(because every time the user asks for a random block, the disk system
transfers twice as much data as what the user really wants). The caching is
nice when dealing with compilers and text editors and such, whose data will be
swiftly re-used.  But when you're bringing in a huge array to feed to the
array processor, you're most likely not going to bring that array back into
memory anytime soon. Blowing the cache out of the water and then doing a
memory-to-memory copy of a huge amount of data doesn't strike me as too
attractive of a thing to do.
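The block-at-a-time-plus-read-ahead behavior described above can be modeled with a transfer counter (names and structure are illustrative, not any real kernel's read path):

```python
# Toy buffer cache: every read of logical block n goes through a kernel
# cache, and with read-ahead enabled the kernel also fetches block n+1.
# Counting disk transfers shows why read-ahead pays off for sequential
# access but roughly doubles traffic for scattered random access.

class BufferCache:
    def __init__(self, readahead=False):
        self.cache = {}           # block number -> block data
        self.disk_reads = 0       # transfers actually sent to the disk
        self.readahead = readahead

    def _fetch(self, n):
        if n not in self.cache:
            self.disk_reads += 1
            self.cache[n] = ("block", n)   # stand-in for real disk data
        return self.cache[n]

    def read_block(self, n):
        data = self._fetch(n)
        if self.readahead:
            self._fetch(n + 1)    # speculative pre-read of the next block
        return data               # kernel would now copy this to the user buffer
```

Sequential reads of blocks 0..9 cost 11 transfers (each pre-read is used by the next call); ten widely scattered reads cost 20, twice the 10 you'd pay with read-ahead off.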

Which brings us back to the question of storage hierarchies. As established
elsewhere, for "normal" usage the memory cache is quite adequate for
maintaining a "working set" of the disk drive. For random access no cache
really works too well (except perhaps for index files), and about the only
sort of "disk" that would be suitable for an intermediate position would be
a "ram disk". Which is possible, and I do seem to recall that some folks
have put something of the sort out on the market. Someone else will have to
supply details on that. But no miracles here.

And finally, for sequential access, when you're continually streaming
multi-megabyte files into memory one after the other, forget about
caching... all you can do there is concentrate on things like, e.g., data
striping, getting multiple i/o channels pouring their data into memory as
fast as the memory system will handle. And with things such as, e.g., wide
memory busses, you can design memory systems that'll handle sequential
accesses basically as fast as you have money for.
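Round-robin striping is just layout arithmetic; a minimal sketch (purely illustrative, not any particular array controller's scheme):

```python
# Round-robin data striping: logical block b of a file lands on disk
# (b mod N) at offset (b div N), so N consecutive blocks land on N
# different disks and can be transferred in parallel.

def stripe_map(block, ndisks):
    """Map a logical block number to (disk, offset_on_that_disk)."""
    return block % ndisks, block // ndisks
```

With 4 disks, blocks 0-3 hit all four spindles at once, which is where the bandwidth comes from.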

In other words, in the cpu/memory/disk hierarchy, a) there isn't any need
for an additional level in the general case (90% cached in memory, given
enough memory), b) an additional RAM-disk might be suitable to put between
memory and disk for random access, if it's not cheaper to just add memory
to the main CPU to begin with, c) any hierarchy you can think of won't help
sequential access; all that'll help there is simply improving bandwidth.

> The basic cpu/memory/disk hierarchy has been expanding so that at
> each of the interfaces more levels will be used.  There are a few
> companies that are looking at the storage issues but no revolutionary
> proposal has emerged -- news on that was what I was fishing for...

Revolutions, alas, happen only once every 200 years or so. The last major
revision was when the disk hierarchy contracted from disk/drum to just
plain disk. Unless research says that for large databases a large cheap
RAM cache between disk and CPU is viable, things are likely to stay as they
are hierarchy-wise. The factor implicit in two-level CPU caches (lack of
space for large on-chip caches) doesn't hold when we're talking disk.

--
Eric Lee Green   (318) 984-1820  P.O. Box 92191  Lafayette, LA 70509
elg@elgamy.RAIDERNET.COM               uunet!mjbtn!raider!elgamy!elg
 Looking for a job... tips, leads appreciated... inquire within...

peter@ficc.ferranti.com (Peter da Silva) (04/02/91)

In article <00670398761@elgamy.RAIDERNET.COM> elg@elgamy.RAIDERNET.COM (Eric Lee Green) writes:
> Revolutions, alas, happen only once every 200 years or so. The last major
> revision was when the disk hierarchy contracted from disk/drum to just
> plain disk.

How about magnetic disk-optical disk, as in things like the Epoch Infinite
Storage server? A new level... between tape and disk, not between disk and
RAM.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`  "Have you hugged your wolf today?"

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (04/02/91)

In article <-+EAGCE@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:

| How about magnetic disk-optical disk, as in things like the Epoch Infinite
| Storage server? A new level... between tape and disk, not between disk and
| RAM.

  Point taken, for any r/w optical format. 

  The problem is that the optical disks fall at a funny place in the
hierarchy. Typically the ratio of cache to memory, memory to disk,
disk to tape is an order of magnitude in size, access time, and
cost/bit. Optical is not on that path in terms of size or cost, and with
new larger disks, neither are tapes. There is a real need for a large
backup medium right now.

  With PC and workstation disk going multi gigabyte at reasonable
prices, there is no acceptable way to back up. Whatever the hardware
costs, the labor of having someone change media drives the price of
backup to the point where you can't afford to do the backups you should.

  I have faith in capitalism, there will be something better soon.

-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
        "Most of the VAX instructions are in microcode,
         but halt and no-op are in hardware for efficiency"

rmc@snitor.UUCP (Russell Crook) (04/03/91)

From a previous posting... (sorry, my news software bunged up):

  The problem is that the optical disks fall at a funny place in the
  hierarchy. Typically the ratio of cache to memory, memory to disk,
  disk to tape is an order of magnitude in size, access time, and
  cost/bit. Optical is not on that path in terms of size or cost, and with
  new larger disks, neither are tapes. There is a real need for a large
  backup medium right now.


Actually, there *is* such a beast (albeit rare and expensive) called
optical (write once) tape.  A company called CREO (in Vancouver)
makes the drive (at something like $250,000 or so), and a single
tape reel stores several terabytes (I believe).  NASA now has
one, and I think it was shown at last year's CeBit in Hannover.
Transfer rate is somewhere under 1Mbyte/sec., and it takes some
number of days to write a complete tape!  There is a high speed search
mode (the tape is block addressable).

Sorry for the fuzziness in the above numbers... I had some blurbs on it,
but I can't find them at the moment.
------------------------------------------------------------------------------
Russell Crook, Siemens Nixdorf Information Systems, Toronto Development Centre
2235 Sheppard Ave. E., Willowdale, Ontario, Canada M2J 5B5   +1 416 496 8510
uunet!{imax,lsuc,mnetor}!nixtdc!rmc,  rmc%nixtdc.uucp@{eunet.eu,uunet.uu}.net,
      rmc.tor@nixdorf.com (in N.A.), rmc.tor@nixpbe.uucp (in Europe)
      "... technology so advanced, even we don't know what it does."


wcs@erebus.att.com (Bill Stewart) (04/04/91)

In article <3306@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
]   The problem is that the optical disks fall at a funny place in the
] hierarchy. Typically the ratio of cache to memory, memory to disk,
] disk to tape is an order of magnitude in size, access time, and
] cost/bit. Optical is not on that path in terms of size or cost, and with
] new larger disks, neither are tapes. There is a real need for a large
] backup medium right now.

Tapes haven't been much bigger than disks for ages.  Back in 1980 or so,
my VAX had 256MB removable disk drives, a huge 4MB of RAM, and the expensive
6250bpi tape drives, which held 140MB on a good day, when they were working.
Lots of people were stuck with 1600bpi tape drives, 30-40 MB.
We did incremental backups to reduce tape-changing labor as well as to
allow on-line backups when we had the space.  
Disk platters cost ~$1000, tapes $25.  Disk drives were $35K.
We could afford 4 disk drives = 1GB, which needed about 8 tapes to back up.
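The tape count above is just ceiling division (a trivial worked example, using the figures from this post):

```python
# 4 drives * 256 MB each = 1024 MB of disk, dumped to 140 MB tapes.

def tapes_needed(disk_mb, tape_mb):
    """Whole tapes needed to hold disk_mb of data (rounds up)."""
    return -(-disk_mb // tape_mb)   # ceiling division via negated floor
```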

Today, DATs are 1 GB (2GB real soon), and slow 8mm tape is 5GB.
You can finally back up most disk drives on a single tape, and you can
get a DAT stacker holding 10 tapes.  It's a lot less annoying.
With our current reduced budgets, we can afford about 20GB of disk :-)

Removable disks now cost ~$1000-2000 for 500MB - 1GB, including the drive,
and tapes are still about $25.

]   I have faith in capitalism, there will be something better soon.
Well, if you've got the cash, you can do anything from a pair of DATs
or a small stacker, which gives you a fair amount of unattended backup/recovery,
to optical disk jukebox-based systems like Epoch or AT&T's nice
CommVault 3-D File System (plug, plug), to evil monstrosities like
the 50-foot-long robotic mag-tape handler at Lawrence Livermore Labs.
-- 
				Pray for peace;		  Bill
# Bill Stewart 908-949-0705 erebus.att.com!wcs AT&T Bell Labs 4M-312 Holmdel NJ
"Don't Use Racist or Sexist Language" - Political Correctness Police Slogan
"Let's Beat Up That African-American" - Los Angeles Police Department Slogan