[comp.arch] N11 - i860 XP - some details, who knows more?

noah@cs.washington.edu (06/08/91)

	I just heard a few things about the i860 XP (formerly known as
the N11) from an Intel person.  This is the successor to i860 and was
announced last week (probably overshadowed by the Touchstone and TMI
announcements).  It is binary code compatible.  It uses the same basic
architecture with the following differences.

	The caches are bigger.  I-cache is 16K (v. 4K).  The D-cache is
also 16K (v. 8K).  The burst mode to memory is 400 MB/sec (the old one
was around 150-160 MB/sec?).  There is something built-in to facilitate
a snoopy cache scheme.  There are also some enhancements for dealing with
fault tolerance.  The first version is to run at 50 MHz and then there
will be a 66 MHz one.  It is implemented in 0.8 micron technology.
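
	A rough sketch of the arithmetic behind these figures follows.
It assumes that MB means 10^6 bytes and that the 400 MB/sec burst
figure applies to the 50 MHz part; the post states neither, and the
variable names are purely illustrative.

```python
# Back-of-the-envelope check of the quoted burst-mode figures.
# ASSUMPTIONS (not stated in the post): 1 MB = 10**6 bytes, and the
# 400 MB/sec figure refers to the 50 MHz version of the chip.
MB = 1_000_000

new_burst = 400 * MB          # quoted i860 XP burst rate, bytes/sec
old_burst_low = 150 * MB      # low end of the quoted range for the old part
old_burst_high = 160 * MB     # high end of the quoted range for the old part
clock_hz = 50_000_000         # first announced clock rate

# Improvement over the old burst rate, for both ends of the quoted range.
speedup_min = new_burst / old_burst_high
speedup_max = new_burst / old_burst_low

# Bytes the bus would have to move per clock to sustain the burst rate.
bytes_per_clock = new_burst / clock_hz

if __name__ == "__main__":
    print(f"speedup: {speedup_min:.2f}x to {speedup_max:.2f}x")
    print(f"bytes per clock at 50 MHz: {bytes_per_clock:.1f}")
```

	Under those assumptions the improvement is between 2.5 and about
2.7 times, and the burst rate works out to 8 bytes per clock, which
would correspond to one 64-bit transfer every cycle.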

	This is all the information I have.  Does anyone know any more?
Among the many complaints that have surfaced about the i860, a major one
is the memory bandwidth being inadequate, especially for keeping the
floating point pipeline fed.  Is this really going to help?  They have
doubled the D-cache size, but 16K is still small.  What about this burst
mode?  I am not familiar with it in the i860.  Is increasing its speed
2.5 times going to make a difference in trying to get near peak floating
point performance?

						Rick N. Zucker
						noah@cs.washington.edu

fritz@saturn.ucsc.edu (Frederick Staats) (06/09/91)

In article <1991Jun7.193507.3733@beaver.cs.washington.edu> noah@cs.washington.edu writes:
>
>	I just heard a few things about the i860 XP (formerly known as
>the N11) from an Intel person.  

>Among the many complaints that have surfaced about the i860, a major one
>is the memory bandwidth being inadequate, especially for keeping the
>floating point pipeline fed.  Is this really going to help?  They have
>doubled the D-cache size, but 16K is still small.  What about this burst
>mode?  I am not familiar with it in the i860.  Is increasing its speed
>2.5 times going to make a difference in trying to get near peak floating
>point performance?
>
>						Rick N. Zucker

   There are two key differences in the i860 XP that should increase
performance of real machines built on the i860 family architecture.

	1) Updated caches (I-cache 4K --> 16K, D-cache 8K --> 16K):
	   These caches now have both virtual tags and physical tags
	   and use the MESI snooping protocol to support shared memory
	   multiprocessing and DMA.  Hooks for large external snooping
	   caches are also included.  The i860 XP appears to have
	   solved the major caching performance problems that made it
	   hard to use in multiprocessor computers.

	2) Quad word pipelined load/store.  The double word pipelined
	   load/store of the original i860 provided insufficient memory
	   bandwidth for many algorithms.  Pipelined loads and stores
	   bypass the cache and use the full memory bus bandwidth
	   (burst mode) for maximum performance on large external
	   datasets.  More than doubling the memory bandwidth was
	   required to balance the FP unit with memory.
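
   The MESI protocol mentioned in item 1 can be sketched as a small
state machine.  This is the simplified textbook form, not a description
of the i860 XP's actual cache controller; the function names (on_local,
on_snoop, demo) and the others_have_copy flag are hypothetical, and
write-backs and bus signalling are left out.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"   # only copy, dirty
    EXCLUSIVE = "E"  # only copy, clean
    SHARED = "S"     # clean, other caches may hold it too
    INVALID = "I"    # no valid copy


def on_local(state, op, others_have_copy=False):
    """Next state of a line after this CPU's own read or write."""
    if op not in ("read", "write"):
        raise ValueError(f"unknown op: {op}")
    if op == "write":
        # A write always ends in MODIFIED (other copies get invalidated).
        return State.MODIFIED
    if state is State.INVALID:
        # Read miss: SHARED if another cache holds the line, else EXCLUSIVE.
        return State.SHARED if others_have_copy else State.EXCLUSIVE
    return state  # read hit leaves the state unchanged


def on_snoop(state, op):
    """Next state after observing another bus master's read or write."""
    if op not in ("read", "write"):
        raise ValueError(f"unknown op: {op}")
    if state is State.INVALID or op == "write":
        return State.INVALID
    # A snooped read demotes MODIFIED/EXCLUSIVE to SHARED
    # (a real MODIFIED line would also be written back first).
    return State.SHARED


def demo():
    """Two caches, A and B, touching the same line."""
    a = b = State.INVALID
    trace = []
    a = on_local(a, "read", others_have_copy=(b is not State.INVALID))
    trace.append((a, b))                      # A reads alone
    b_next = on_local(b, "read", others_have_copy=(a is not State.INVALID))
    a, b = on_snoop(a, "read"), b_next
    trace.append((a, b))                      # B reads, A snoops the read
    a, b = on_snoop(a, "write"), on_local(b, "write")
    trace.append((a, b))                      # B writes, A snoops the write
    return trace


if __name__ == "__main__":
    for a, b in demo():
        print(a.value, b.value)
```

   A DMA device can be thought of as one more bus master here: its
accesses would reach each cache through the snoop path.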

   Other minor changes (i.e., registers for operating system support,
hardware for parallel loop execution on multiple CPUs) are nifty
features but do not appear to significantly affect the performance
of the architecture.  The one thing I would like to see in the future
is more registers.  The current number is cramped and, I am told, makes
it a real pain to write a good compiler for the architecture.

Frederick Staats		University of California, Santa Cruz
fritz@saturn.ucsc.edu		Supercomputer Research Group