[comp.arch] memory speed & futurology

george@ditmela.oz (George michaelson) (08/24/88)

	The August edition of 'Electronics' is about memory technology,
current and future. It had this table suggesting price/performance trends
over the next 10 years. I suspect we'll see another sub-generation of workstations
coming online before this, but a 5 year design timescale might mean the sun-6
series and its peers could take advantage of these changes.

(from Electronics, Aug '88 page 70, reproduced without permission)

	HOW SYSTEM STORAGE WILL CHANGE 
	==============================

		Cache		Main Mem	Magnetic Disk	Optical Disk
Cost/Mb:	
 1987		$500		$250		$6.30		$3.50 - $5.50
 1995		$ 60		$ 30		$1.75		  under 50c
Av access time:
 1987		45-55ns		120-150ns	15-20ms		35-100ms
 1995		15-20ns		under 25ns	 10ms		20-25ms


My questions:

(1)	how plausible are their figures? I have no other info to hand.

(2)	main memory gets 6 times faster and 8 times cheaper. cache gets
	2-3 times faster and 8 times cheaper. Thus their speeds almost
	converge. Does that make cache sufficiently unattractive to stop
	being used?

	-price/speed ratio looks ugly (twice the cost for saving 5-10ns)
	-overall access speed to main memory will be about twice today's
	access speed to cache, so for existing processor architectures you
	could possibly do without cache, unless there are `logical' reasons
	for using it other than buffering for slower memory (rough
	arithmetic check below).
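
	The check, in C for want of a better notation. The 50, 135, 17.5
	and 22.5 ns midpoints are my own reading of the ranges in the
	table, not figures from the article:

	    /* speed & cost ratios from the table above; midpoints guessed */
	    #include <stdio.h>

	    int main(void)
	    {
	        double cache87 = 50.0,  cache95 = 17.5;  /* ns */
	        double main87  = 135.0, main95  = 22.5;  /* ns */

	        printf("cache speedup %.1fx, main memory speedup %.1fx\n",
	               cache87 / cache95, main87 / main95);
	        printf("main/cache ratio %.1f in 1995 (was %.1f in 1987)\n",
	               main95 / cache95, main87 / cache87);
	        printf("cost: cache %.0fx cheaper, main mem %.0fx cheaper\n",
	               500.0 / 60.0, 250.0 / 30.0);
	        return 0;
	    }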

(3)	do the new speeds still look good alongside predicted clock speeds
	for CPU or do we have another development lag here?

	there is another table in the mag showing possible access times
	for existing 32bit cpus & available speeds from different memory
	technology, but it's hard to reproduce. It implies memory access
	delay is one of many bottlenecks; I think these speedups might
	cure it, but only at existing clock speeds.

(4)	at $30/Mb do we start to get ramdisk coming back into fashion?

	$3k for a 100meg `drive' looks pretty neat, assuming there are
	architectural reasons for not making it simply look like resident
	memory.

(5)	do we start to get 32/64Mb by default in our workstations? 
	does the opsys change its memory usage when that much memory
	is around? 

(6)	with optical disk getting down to current access times for `real'
	disk do they become standard or are the disadvantages still too
	great? does unix sprout file version numbers? -I'm assuming WORM speeds
	are similar to pre-recorded speeds here...

(7)	dual ported memory & video rams: do they benefit too?

(8)	put your own question here!

	-george

-- 
        George Michaelson, CSIRO Division of Information Technology

ACSnet: G.Michaelson@ditmela.oz                      Phone: +61 3 347 8644
Postal: CSIRO, 55 Barry St, Carlton, Vic 3053 Oz       Fax: +61 3 347 8987

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (08/26/88)

In article <2179@ditmela.oz> george@ditmela.oz (George michaelson) writes:
| 
| 	The August edition of 'electronics' is about memory technology

	[ numbers followed by questions ]

| (3)	do the new speeds still look good alongside predicted clock speeds
| 	for CPU or do we have another development lag here?
| 
| 	there is another table in the mag showing possible access times
| 	for existing 32bit cpus & available speeds from different memory
| 	technology, but it's hard to reproduce. It implies memory access
| 	delay is one of many bottlenecks; I think these speedups might
| 	cure it, but only at existing clock speeds.

	I think you've hit right on it. Fortunately, I believe that CPU
	speeds are not going up the way they have in the past five
	years (see below).

| 
| (5)	do we start to get 32/64Mb by default in our workstations? 
| 	does the opsys change its memory usage when that much memory
| 	is around? 

	I think you're a lot closer than that now. When we ordered some
	Sun 386i workstations with 4MB, we were told the delivery would
	be five weeks slower than the 8 or 16MB models. It seems that
	there is little demand for 4MB, and they only set up the line
	for them every six weeks or so. Most of the stations we order
	have 8-16MB, and 4 is confining, even for a single user.

|
| (6)	with optical disk getting down to current access times for `real'
| 	disk do they become standard or are the disadvantages still too
| 	great? does unix sprout file version numbers? -I'm assuming WORM speeds
| 	are similar to pre-recorded speeds here...

	It would seem that the access time on hard or optical disks is
	limited by rotational speed in the long run. You can add sectors
	and tracks to reduce the track to track seeks, but there is that
	limiting factor. Depending on who you believe, it is obvious that
	the WORM has a higher limit than the magnetic disk, or exactly
	the opposite.

	There are good reasons for WORM drives, mainly legal. Things
	written on a WORM can't be changed, and therefore are more
	likely to be admissible as evidence. This exists in some courts
	now, but it's a topic for the legal group, not here.	
| 
| (7)	dual ported memory & video rams: do they benefit too?

	I would expect them to benefit more, at least compared to what
	we use now. Add ROMs to that list! With faster, cheaper ROMs
	there will be more applications available for them. I would
	hope that EAPROMs would be here, too.

================ My view on where computers are going ================

  There are some reasons why running CPUs at higher and higher speeds is not
the most cost effective way to improve performance. As the speeds get higher
the traces on the boards start to look like transmission lines: you get
radiation, standing waves, etc. These problems are soluble, but looking at
the trouble today in meeting FCC class B certification, at some
point it will be more cost effective to use other approaches.

  There are two well known ways to use slower clock speeds effectively, and
both are in use now. The easiest is lowering the number of clock cycles per
instruction. This is a benefit of RISC. There are also developments under way
toward mixed hard-logic and microcode systems: by implementing more of the
common instructions in hard logic, the code runs faster without trying to get
all of the special purpose instructions into hard logic.
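
  To put rough numbers on the cycles-per-instruction point, here is a
throwaway sketch in C. The clock and CPI figures are invented purely for
illustration, not measurements of any real machine:

	#include <stdio.h>

	int main(void)
	{
	    double instr = 1.0e6;                 /* instructions executed */
	    double cisc  = instr * 5.0 / 16.0e6;  /* ~5 CPI at 16 MHz      */
	    double risc  = instr * 1.3 / 16.0e6;  /* ~1.3 CPI, same 16 MHz */

	    printf("CISC %.1f ms, RISC %.1f ms: %.1fx at the same clock\n",
	           cisc * 1000.0, risc * 1000.0, cisc / risc);
	    return 0;
	}

Same clock rate, same board-level headaches, most of the speedup.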

  The other is using a wider bus. This allows slower memory (and slower
busses) and takes advantage of larger register sets inside the CPU. Cache
memory also reduces the effects of slow memory, and by placing cache on the
CPU chip itself, it may be possible to run the CPU at higher speeds with
fewer problems than if the faster signals were run on the backplane.

  It can be proven that there is a limit to how fast a computer may be,
independent of the techniques used. There was an article by a physicist
several years ago on this, and he quoted the limit. The problem is the speed
of light. To reduce delays induced by the SOL requires making devices
smaller. When wires become extremely small they turn into statistical problems
rather than conductors. If a wire is a few molecules in diameter (he quoted
the values), putting an electron in one end does not ensure that an electron
comes out the other.
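
  For scale, here is the distance a signal can cover in one clock period
at the vacuum speed of light (signals on real boards and chips are slower
still):

	#include <stdio.h>

	int main(void)
	{
	    double c = 3.0e8;                        /* m/s, vacuum    */
	    double mhz[] = { 25.0, 100.0, 1000.0 };  /* example clocks */
	    int i;

	    for (i = 0; i < 3; i++)
	        printf("%5.0f MHz: at most %6.1f cm per clock\n",
	               mhz[i], 100.0 * c / (mhz[i] * 1.0e6));
	    return 0;
	}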

  Add to this having to use lower voltage to keep power down so the whole
thing won't melt, and you hit a firm lower limit on size, and thereby
performance. This does not preclude parallel processing for problems which
have the right characteristics.

  The good news is that we are currently about 21 orders of magnitude from
the limit. I can't guess what reducing the size of a CPU/memory system by
that level would do for performance, but if you put one on my desk I'll
report the benchmarks.

  Anyone who can find a copy of the original article, please supply more
details.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

anand@vax1.acs.udel.EDU (Anand Iyengar) (08/26/88)

In article <2179@ditmela.oz> george@ditmela.oz (George michaelson) writes:
...
>	great? does unix sprout file version numbers? -I'm assuming WORM speeds
>	are similar to pre-recorded speeds here...
...
	If Tandy's THOR eventually becomes what it's currently being touted to
be, WORM optical drives may not be necessary.  Anyone know more about the
current state of "optical storage"?

scott@labtam.OZ (Scott Colwell) (08/26/88)

In article <2179@ditmela.oz>, george@ditmela.oz (George michaelson) writes:
> (2)	main memory gets 6 times faster and 8 times cheaper. cache gets
> 	2-3 times faster and 8 times cheaper. Thus their speeds almost
> 	converge. Does that make cache sufficiently unattractive to stop 
> 	being used?
> 
> 	-price/speed ratio looks ugly (twice the cost for saving 5-10ns)
> 	-overall access speed to memory is 2 X current speed to cache so
> 	for existing processor architectures you can possibly do without it
> 	unless there are `logical' reasons for using cache other than
> 	buffering for slower memory.

	There are two major factors in the access times of memory
	systems. The actual speed of the memory devices and the overheads
	associated with controlling them. (i.e. the time taken to drive
	address lines to stable final values etc.) Caches are typically
	fewer chips, simpler to control (no muxed addresses) and closely
	coupled to the CPU. These issues probably won't change and hence
	caches will remain faster than main memory (even if the same RAM
	chips are used for both). (And main memory will usually have either
	a bus or switching network between it and the CPU, hence more
	delay).

	Main memory is accessed by more than just the cpu in current
	machines. DMA and other processors need access to perform I/O,
	and in multiprocessors, the main memory is often the point
	of sharing. This implies that a very important parameter for
	main memory is the access latency (how long it will take for
	the main memory to become available for a given master). This
	is where caches are also useful: they reduce the amount of
	traffic that hits the main memory.
	(From the above it isn't clear whether you know this. If so, sorry.)
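
	A crude model of both effects (latency and traffic) follows. The
	latencies and hit rates are assumptions chosen only to show the
	shape of it, not measurements:

	    #include <stdio.h>

	    int main(void)
	    {
	        double t_cache = 17.5;  /* ns, assumed cache access time  */
	        double t_main  = 60.0;  /* ns, assumed miss penalty, incl.
	                                   bus and controller overhead    */
	        double hit;

	        for (hit = 0.80; hit < 0.96; hit += 0.05)
	            printf("hit rate %.2f: %.1f ns effective, %2.0f%% of "
	                   "references reach main memory\n",
	                   hit, t_cache + (1.0 - hit) * t_main,
	                   (1.0 - hit) * 100.0);
	        return 0;
	    }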

> (3)	do the new speeds still look good alongside predicted clock speeds
> 	for CPU or do we have another development lag here?

	The current crop of SRAM is not really up to the demands of
	the new processors being touted for release in calendar year
	'89 (if your criterion is zero wait states).

	80386 @ 20MHz requires 35ns SRAM.
		25MHz requires 30ns SRAM.
		32MHz requires ??
		faster ??
	Admittedly the 80386 makes more demands on memory than most
	of the current CPUs.
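
	Filling in the ?? rows with very naive arithmetic: take the
	2-clock 80386 bus cycle and assume that the roughly 65ns of
	address-valid, buffer and setup overhead implied by the
	20MHz/35ns figure stays constant. It doesn't really (pipelined
	addressing wins some of it back, which is presumably how 25MHz
	gets away with 30ns parts), so read the output as a rough bound,
	not a spec:

	    #include <stdio.h>

	    int main(void)
	    {
	        double overhead = 2.0 * 1000.0 / 20.0 - 35.0;  /* 65 ns */
	        double mhz[] = { 20.0, 25.0, 32.0, 40.0 };
	        int i;

	        for (i = 0; i < 4; i++) {
	            double cycle = 2.0 * 1000.0 / mhz[i];  /* 2-clock cycle, ns */
	            printf("%2.0f MHz: %5.1f ns bus cycle, <= %5.1f ns SRAM\n",
	                   mhz[i], cycle, cycle - overhead);
	        }
	        return 0;
	    }

	Negative numbers just mean zero wait states is out of reach
	without pipelining, an on-chip cache, or both.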

> (5)	do we start to get 32/64Mb by default in our workstations? 
> 	does the opsys change its memory usage when that much memory
> 	is around? 

	Using 1Mx1 chips and assuming two-way interleave with a 32-bit
	memory, the minimum configuration possible is 8Mbytes. If we
	assume 4Mx1 parts then the minimum is 32Mbytes. (I hope they
	bring out the x4 parts first this time :-)
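
	The arithmetic, for anyone who wants to try other widths or
	interleave factors:

	    #include <stdio.h>

	    int main(void)
	    {
	        long depth[] = { 1L << 20, 4L << 20 };  /* 1Mx1, 4Mx1 parts   */
	        int  width = 32, banks = 2, i;          /* two-way interleave */

	        for (i = 0; i < 2; i++)
	            printf("%ldMx1: %d chips minimum = %ld Mbytes\n",
	                   depth[i] >> 20, width * banks,
	                   (depth[i] / 8) * width * banks >> 20);
	        return 0;
	    }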

	Scott
-- 
Scott Colwell			ACSnet:	scott@labtam.oz
Design Engineer			UUCP:	..uunet!munnari!labtam.oz!scott
Information Systems Division	ARPA:	scott%labtam.oz@UUNET.UU.NET
Labtam Ltd Melbourne, Australia PHONE:	+61-3-587-1444

pal@murdu.OZ (Philip Leverton) (08/27/88)

In article <11978@steinmetz.ge.com> davidsen@crdos1.UUCP (bill davidsen) writes:
[deleted]
>  It can be proven that there is a limit to how fast a computer may be,
>independent of the techniques used. There was an article by a physicist
>several years ago on this, and he quoted the limit. The problem is the speed
>of light. To reduce delays induced by the SOL requires making devices
>smaller. When wires becode extremely small they become statistical problems
>rather than conductors. If a wire is a few molecules in diameter (he quoted
>the values), putting an electron in one end does not insure that an electron
>comes out the other.
>
>  Add to this having to use lower voltage to keep power down so the whole
>thing won't melt, and you hit a firm lower limit on size, and thereby
>performance. This does not preclude parallel processing for problems which
>have the right characteristics.
>
>  The good news is that we are currently about 21 orders of magnitude from
>the limit. I can't guess what reducing the size of a CPU/memory system by
>that level would do for performance, but if you put one on my desk I'll
>report the benchmarks.
>
>  Anyone who can find a copy of the original article, please supply more
>details.

I think the article that you're referring to is "Quantum Mechanical Computers"
by the late Prof. Richard Feynman. A copy arrived as a Caltech preprint at the
University of Melbourne Physics Dept in 1984. The preprint says that it is
a "Plenary Talk presented to IQEC-CLEO Meeting, Anaheim, June 19, 1984."
What follows is essentially a paraphrase of several sections of Feynman's
original article, which I have. I have no idea what IQEC-CLEO stands for,
either. :-)

Feynman considered the problem of determining the physical limitations
imposed on the future development of computers by quantum mechanics
and the uncertainty principle. He formulated a Hamiltonian to describe
an ideal computing system (ignoring the effect of small imperfections).
His conclusion was that "the laws of physics present no barrier to reducing
the size of computers until bits are the size of atoms, and quantum
behaviour holds dominant sway."

The minimum free energy that must be expended to operate a computer composed
of the ideal primitives AND, NOT, [could be combined into NAND], 
FAN OUT(2 "wires" -> 1 "wire") and EXCHANGE (crossed "wires") was thought 
to be kT ln(2) [from the AND case]. At present, with a transistor
system, the heat dissipation is 10**10 kT, because to change the voltage
of a wire it is dumped to ground through a resistance, and to build up the
voltage we feed it charge, again through a resistance. Nature, in her DNA
copying machine, dissipates about 100 kT per bit copied. Feynman goes on to
say that "Being, at present so very far from this kT ln(2) figure, it seems
ridiculous to argue that even this is too high and the minimum is essentially
zero. But, we are going to be even more ridiculous later and consider bits
written on one atom instead of the present 10**11 atoms. Such nonsense is
very entertaining to professors like me. I hope you will find it interesting
and entertaining also." Bennett showed that the kT ln(2) limit was wrong
because you don't have to use irreversible (in the thermodynamic sense)
primitives. You can use reversible machines that contain only reversible
primitives. Then the minimum free energy required is *independent* of the
complexity or number of logical steps in the calculation. The energy would
probably be kT per bit of the output answer! 
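
For scale, here are those energy figures at room temperature, in C rather
than maths. Only the physical constants are mine; the 10**10 kT and 100 kT
numbers are just Feynman's, replayed:

	#include <stdio.h>
	#include <math.h>

	int main(void)
	{
	    double k  = 1.38e-23;   /* Boltzmann constant, J/K  */
	    double T  = 300.0;      /* room temperature, K      */
	    double eV = 1.602e-19;  /* joules per electron volt */
	    double kT = k * T;

	    printf("kT ln(2)             = %.2e J (%.3f eV) per bit operation\n",
	           kT * log(2.0), kT * log(2.0) / eV);
	    printf("100 kT (DNA)         = %.2e J per bit copied\n", 100.0 * kT);
	    printf("1e10 kT (transistor) = %.2e J per switching event\n",
	           1e10 * kT);
	    return 0;
	}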

"The time needed to make a step in a calculation in such a machine depends on
the strength or the energy of the interactions in the terms of the Hamiltonian.
If each of the terms in the Hamiltonian is supposed to be of the order of
0.1 electron volts, then it appears that the time for the "cursor" to make
each step, if done in a ballistic fashion, is of the order 6.0E-15 sec.
This does not represent an enormous improvement, perhaps only about four
orders of magnitude, over the present values of the time delays in
transistors, and is not much shorter than the very short times possible to
achieve in many optical systems."
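
As a sanity check on the 6.0E-15 sec: an interaction energy E corresponds
to a time scale of roughly hbar/E, and for 0.1 eV that lands in the same
place. This is my own order-of-magnitude check, not a line from the
preprint:

	#include <stdio.h>

	int main(void)
	{
	    double hbar = 6.58e-16;  /* reduced Planck constant, eV*s  */
	    double E    = 0.1;       /* eV, the figure Feynman assumes */

	    printf("hbar / %.1f eV = %.1e s per step\n", E, hbar / E);
	    return 0;
	}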

The article goes on to consider the Hamiltonian of a quantum mechanical
computer, state representation in such a machine, imperfections and 
irreversible free energy losses, and simplifying the implementation.

Here are the references of the article:
1. Bennett, C.H. "Logical Reversibility of Computation," IBM Journal of
Research and Development, *6* (1979) 525-532.
2. Fredkin, E. and Toffoli, T. "Conservative Logic," Int. Journal of
Theoretical Physics, *21* (1982) 219-253.
3. Bennett, C.H. "Thermodynamics of Computation - A Review," Int. Journal of
Theoretical Physics, *21* (1982) 905-940.
4. Toffoli, T. "Bicontinuous Extensions of Irreversible Combinatorial
Functions," Mathematical Systems Theory, *14* (1981) 13-23.
5. Priese, L. "On a Simple Combinatorial Structure Sufficient for Sublying
Non-Trivial Self Reproduction," Journal of Cybernetics, *6* (1976) 101-137.

Of course, all this work dates from 1984; no doubt further research has been
done since then. The 10**11 figure for the size of a transistor might
have been reduced by a bit too!

Phil Leverton, University Computing Services, University of Melbourne.
A once and future physics student.
ACSnet: pal@murdu 	CSNET: pal%murdu.oz@australia
ARPA: pal%murdu.oz@uunet.css.gov
UUCP: {uunet,hplabs,mcvax,ukc,nttlab}!munnari!murdu.oz!pal

steve@cit5.cit.oz (Steve Balogh) (08/30/88)

In article <1829@udccvax1.acs.udel.EDU>, anand@vax1.acs.udel.EDU (Anand Iyengar) writes:
> ...
> 	If Tandy's THOR eventually becomes what it's currently being touted to
> be, WORM optical drives may not be necessary.  Anyone know more about the
> current state of "optical storage"?

I saw an advertisement recently for the following drive in a local computer
newspaper.... 

1Gbyte  5.25"  Erasable Optical Disk Drive
Average seek time 30 ms
Read/write storage - claimed to be erasable and fast.

The drive is called the TAHITI from Maxtor. The price was not quoted.


-		-		-		-		-
		(It's my opinion and not my employers)
Steve Balogh	VK3YMY			| steve@cit5.cit.oz (...oz.au)
Chisholm Institute of Technology	| steve%cit5.cit.oz@uunet.UU.NET
PO Box 197, Caulfield East		| 
Melbourne, AUSTRALIA. 3145		| {hplabs,mcvax,uunet,ukc}!munnari\
+61 3 573 2266 (Ans Machine)		|  !cit5.cit.oz!steve

cjh@hpausla.HP.COM (Clifford Heath) (09/26/88)

>	It would seem that the access time on hard or optical disks is
>	limited by rotational speed in the long run. You can add sectors
>	and tracks to reduce the track to track seeks, but there is that
>	limiting factor. Depending who you believe, it is obvious that
>	the WORM has a higher limit than the magnetic disk, or exactly
>	the opposite.

(Drifting slightly...)

Given the (perhaps almost) unlimited density of optical disks, the
limiting factor on rotational speed is the speed of the encoder/decoder.
If you want to spin a disk with 1 million bits/track at 1000
revs/second, you've got to detect that information at 1Gbit/second.
That's about two orders of magnitude more than is required by current
magnetic technology.  The example given is probably within reach, but it
still gives a rotational latency of .5 ms, which, although faster than
current devices, is still not that fast.  To go beyond that you
eventually reach a limit at which electronics isn't fast enough to
detect and serialize the information any more.  Note that the optical
read head is capable of these speeds, although the write head may find
it harder (because of the use of heating effects).  Optics in general
are capable of speeds far in excess of what electronics can handle,
n'est-ce pas?
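
The arithmetic behind those numbers, for anyone who wants to vary them
(the bits/track and revs/second are just the example figures above, not
any real drive's spec):

	#include <stdio.h>

	int main(void)
	{
	    double bits_per_track = 1.0e6;
	    double revs_per_sec   = 1000.0;

	    printf("read channel %.1f Gbit/s, avg rotational latency %.2f ms\n",
	           bits_per_track * revs_per_sec / 1.0e9,
	           0.5 / revs_per_sec * 1000.0);  /* half a revolution, in ms */
	    return 0;
	}
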
(Blue sky on... I'm looking forward to optical computers, built as
optical chips mounted directly on the fixed head of a super-fast
optical disk.  Pocket super-computers, here we come!)

Clifford Heath, Hewlett Packard Australian Software Operation.
(UUCP: hplabs!hpfcla!hpausla!cjh, ACSnet: cjh@hpausla.oz)

lindsay@k.gp.cs.cmu.edu (Donald Lindsay) (09/29/88)

In article <2220001@hpausla.HP.COM> cjh@hpausla.HP.COM (Clifford Heath) writes:
>>	It would seem that the access time on hard or optical disks is
>>	limited by rotational speed in the long run.
>Given the (perhaps almost) unlimited density of optical disks, the
>limiting factor on rotational speed is the speed of the encoder/decoder.
>If you want to spin a disk with 1 million bits/track at 1000
>revs/second, you've got to detect that information at 1Gbit/second.

I would say (from under my futurology hat) that it's basically silly to
have moving objects. We want to scan read/write beam[s] across unmoving
media. Of course, a disc is an inefficient shape: a rectangle has more
area. Also, it can't be too small, or people will lose it. I'd say that a
credit card has field-tested human factors. It would have to hold at least
a gigabyte, since we wouldn't want to split encyclopedias onto two cards.
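
A quick density check, taking the standard 85.6 x 54 mm card and 2**30
bytes as "a gigabyte"; the rest is arithmetic:

	#include <stdio.h>
	#include <math.h>

	int main(void)
	{
	    double area_mm2 = 85.6 * 54.0;               /* standard card size */
	    double bits     = 8.0 * 1024 * 1024 * 1024;  /* one gigabyte       */
	    double pitch_um = sqrt(area_mm2 / bits) * 1000.0;

	    printf("%.0f mm^2 / %.2e bits: about %.2f micron spot pitch\n",
	           area_mm2, bits, pitch_um);
	    return 0;
	}

That comes out under a micron per spot, which is at least in the ballpark
of what optical recording does today, so the capacity isn't the silly part.
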
-- 
Don		lindsay@k.gp.cs.cmu.edu    CMU Computer Science