chris@mddc.UUCP (Chris Maloney) (12/22/83)
Though we have all seen parts of this discussion before and could live without some of the repetitions of the same ideas, I would like to see some discussion of the issues. I am not really qualified to comment, but would like to relate some ideas of a friend of mine. He is a devoted Motorola fan and claims they have by far the best support chips (Memory Management, Communications, ...). He also feels that Motorola will in the future, as it has in the past, offer much better support than NS or Intel. About Intel, he says he wouldn't like to compete with Big Blue using chips from the company they control. He will admit that the 16k has a better instruction set than the 68k. Let's hear it (but only once).
tjt@kobold.UUCP (T.J.Teixeira) (12/28/83)
mddc!chris says (quoting a friend of his):

    He is a devoted Motorola fan and claims they have by far the best
    support chips (Memory Management, Communications, ...).

Does any 68000 board vendor (besides Motorola) use Motorola MMU chips? Perhaps their future MMU chips will be better, but on this basis the current Motorola MMUs aren't winners (they may be the best of a bad lot, though -- at the moment, I think most 68000 systems roll their own memory management).
--
Tom Teixeira, Massachusetts Computer Corporation. Westford MA
...!{ihnp4,harpo,decvax,ucbcad,tektronix}!masscomp!tjt   (617) 692-6200
henry@utzoo.UUCP (Henry Spencer) (12/29/83)
Anyone who thinks Motorola's memory-management chip is superior to the National one for the 16k has been totally brainwashed. The Motorola chip is next to useless, while the 16032 does a full demand paging environment and does it *right*. (Or at least, a lot closer to right than many other versions, e.g. the one on the VAX.) Note also that the 16032 has a rather fast floating-point chip already operational, whereas Motorola is still just mumbling about one. I am told that a number of people are using the 16032 FPU on 68000 systems, despite howling and gnashing of teeth from Motorola.

I tend to agree about some of the other Motorola peripheral chips, with the 68121 a particular win. But these chips don't have any real competition from National yet; it will be interesting to see just what appears.

National did an awful lot of things right on the 16032 and its support chips. If they can continue this practice and overcome their slow start, they have a big winner on their hands. (No, I don't work for National. But the closer I look at the 16032, the more impressed I am.)
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
fair@dual.UUCP (12/31/83)
The 68000 wasn't just out earlier, it is second/third/fourth/fifth sourced. I don't believe National can say the same... When National starts doing things like that, then maybe more systems manufacturers will look seriously at them... (Of course, my information could be incorrect. If so, I will sit corrected.)

working for a 68k system mfgr,

Erik E. Fair   {ucbvax,amd70,zehntel,unisoft,onyx,its}!dual!fair
Dual Systems Corporation, Berkeley, California
kurt@fluke.UUCP (Kurt Guntheroth) (01/03/84)
Good things about the 16000:

1. 8-, 16-, and 32-bit versions of the 16000 architecture are shipping now. Motorola is shipping only 8- and 16-bit versions, because the 32-bit version is much different internally.

2. We don't need to mention instruction sets. Everybody acknowledges that this is NSC's strongest selling point. Motorola is weaker here than most people realize. Their basic architecture with 32-bit pointers is right, but their instruction set is not symmetrical enough to be really useful.

3. Peripheral chips. NSC has some very important peripherals that Motorola hasn't got. I am thinking specifically of the Floating Point Unit.

4. Demand Paged Virtual Memory. It seems to work better no matter what the 'experts' say.
--
Kurt Guntheroth
John Fluke Mfg. Co., Inc.
{uw-beaver,decvax!microsof,ucbvax!lbl-csam,allegra,ssc-vax}!fluke!kurt
kissell@flairvax.UUCP (Kevin Kissell) (01/04/84)
First of all, the 432 is either a machine of the past or of the future. The price is too high (but dropping) and the performance is appalling (but improving). Except for certain applications with heavy security and/or reliability constraints, it is a very interesting and expensive toy. A revolution in semiconductor technology could change that, but the benefits of such a revolution would apply equally well to simpler architectures.

On to the less easy matter of 16k vs 68k. The overall capabilities of the two are really pretty similar, with the 68k *generally* yielding somewhat better performance (depending, as always, on how one chooses one's benchmarks) and the 16k yielding better code density (ditto ditto). The 32-bit internal architecture of the 16k (along with the ALU design) makes for faster multiplies and divides than the 68k, but the generality and orthogonality of the 16k instruction set are expensive in microcode cycles. Exception handling in particular is significantly slower on the 16k than on the 68k. The 68k has more registers, but the 16k offers runtime intermodule linkage. The 16k has a much nicer MMU, but the 68k has those handy postincrement/predecrement addressing modes. Then there are the quasi-religious issues: the 68k bus is asynchronous and unmultiplexed while the 16k's is synchronous and muxed, the 68k is big-endian while the 16k is little-endian, etc.

Which is better? Neither, necessarily. If you want to build a simple, clean virtual memory machine, use the 16032/16082. If you have a severe real-time problem, use a 68000 and *no* MMU.

Kevin D. Kissell
Fairchild Research Center
Advanced Processor Development
uucp: {ucbvax!sun decvax allegra}!decwrl!flairvax!kissell

And yes, Fairchild is still planning to second-source 16k's. Not my department, though, so don't ask me when.
chongo@nsc.UUCP (Landon Noll) (01/04/84)
Having used both NS16032-based systems and DUAL's 68000-based system, I make the following observation: DUAL's systems are nice, but one of the two major limiting factors is the 68000 hardware. To this problem I note:

1) Floating point on the DUAL is very slow. Tests show that it is between 1 and 2 orders of magnitude slower than National's FPU. (I'll be vague since I don't have the figures right in front of me.)

2) Due to the lack of extended multiplication on the 68000, the multiplication of two long ints needs to be performed via subroutine (so a fellow DUAL user showed me). National's CPU can multiply two 32-bit values and get a 64-bit result.

3) Due to the poor memory management hardware, programs are only divided and swapped in two sections: data and text. You are limited in size by the amount of physical memory. (Most programs are not allowed to poke at the MMU.) National's MMU uses 512-byte pages and can support a demand-paged area of up to 16 Meg, regardless of physical memory. [Well, we assume you do have some! :~) ]

Now don't get me wrong, the DUAL is a GOOD SYSTEM for the price. But if they could just get away from the MC68000-based systems and over to ...

chongo <the other system that i use is a DUAL> /\DD/\
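To make point 2 concrete, here is a rough C sketch of what such a multiply subroutine has to do: build a 32x32 -> 64-bit product out of the 16x16 -> 32-bit multiplies the 68000 does provide. The function name and the u64 struct are invented for illustration; this is not Dual's or UniSoft's actual library routine, and on a machine with an extended multiply the whole thing collapses to one instruction.

/* Illustrative only: 32x32 -> 64-bit unsigned multiply composed from
 * 16x16 partial products, roughly what a 68000 runtime routine must do
 * because the chip has no extended multiply.  Assumes unsigned long is
 * at least 32 bits; the masks keep the arithmetic honest either way. */
#include <stdio.h>

struct u64 { unsigned long hi, lo; };   /* hypothetical 64-bit result type */

struct u64 umul32x32(unsigned long a, unsigned long b)
{
    unsigned long al = a & 0xFFFFUL, ah = (a >> 16) & 0xFFFFUL;
    unsigned long bl = b & 0xFFFFUL, bh = (b >> 16) & 0xFFFFUL;
    unsigned long p0 = al * bl;         /* four 16x16 -> 32 partial products */
    unsigned long p1 = al * bh;
    unsigned long p2 = ah * bl;
    unsigned long p3 = ah * bh;
    unsigned long mid = (p0 >> 16) + (p1 & 0xFFFFUL) + (p2 & 0xFFFFUL);
    struct u64 r;

    r.lo = (p0 & 0xFFFFUL) | ((mid & 0xFFFFUL) << 16);
    r.hi = p3 + (p1 >> 16) + (p2 >> 16) + (mid >> 16);
    return r;
}

int main()
{
    struct u64 r = umul32x32(0xFFFFFFFFUL, 0xFFFFFFFFUL);
    printf("%08lx %08lx\n", r.hi, r.lo);    /* prints fffffffe 00000001 */
    return 0;
}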
phil@amd70.UUCP (Phil Ngai) (01/06/84)
You say that programs on a DUAL are divided into two pieces: code and data. This sounds like segmentation to me. But then, they never claimed to have demand paged virtual memory. -- Phil Ngai (408) 988-7777 {ucbvax,decwrl,ihnp4,allegra,intelca}!amd70!phil
fair@dual.UUCP (Erik E. Fair) (01/06/84)
OK, I will respond to the points one at a time.

>>From amd70!fortune!nsc!chongo (Landon Noll)
>>
>>Having used both NS16032-based systems and DUAL's 68000-based system,
>>I make the following observation:
>>
>>1) Floating point on the DUAL is very slow. Tests show that it is between
>>   1 and 2 orders of magnitude slower than National's FPU. (I'll be vague
>>   since I don't have the figures right in front of me.)

I'm not surprised. There is also a problem with accuracy, since UniSoft chose to leave the DEC standard floating point implementation as is. Fortunately, we also have an IEEE floating point standard C compiler (and you thought that the DEC standard was slow?), and we support the SKY Fast Floating Point board (which makes things faster than the DEC standard, and still IEEE standard too). Clearly, though, we can't (nor can any other 68k without an onboard FPU) beat a 16032 with a 16082 FPU right next to it.

>>2) Due to the lack of extended multiplication on the 68000, the
>>   multiplication of two long ints needs to be performed via subroutine
>>   (so a fellow DUAL user showed me). National's CPU can multiply two
>>   32-bit values and get a 64-bit result.

Guilty as charged. Thank you, Motorola.

>>3) Due to the poor memory management hardware, programs are only divided
>>   and swapped in two sections: data and text. You are limited in size
>>   by the amount of physical memory. (Most programs are not allowed
>>   to poke at the MMU.) National's MMU uses 512-byte pages and can
>>   support a demand-paged area of up to 16 Meg, regardless of physical
>>   memory. [Well, we assume you do have some! :~) ]

OK, this is apples and oranges time. DUAL uses the 68451 MMU with the 68000 processor. That combination can't do virtual memory. However, the 68010 with the 68451 CAN. I won't argue that the 68451 doesn't leave \much/ to be desired as an MMU in a virtual system. Most of the 68k boxes out there (at least most of the UniSoft ports) are still `swap' based systems, where any given process is limited to the size of physical memory minus the space that the kernel takes up. In the DUAL, though, you can drop 3.25 Mbytes on the bus, and that is usually sufficient for most applications. Even a program as large as the Franz Lisp interpreter fits in that quite comfortably.

>>Now don't get me wrong, the DUAL is a GOOD SYSTEM for the price. But if
>>they could just get away from the MC68000-based systems and over to ...
>>
>>chongo <the other system that i use is a DUAL> /\DD/\

The NS16000 series is a very nice chip set, but to switch we would have to about double in size as a company, and start dealing with Yet Another Unix Porting House, which is not my idea of fun (although the two that are of any note have both done 4.1BSD, and that, in my book, is a definite plus).

One minor note on the NS 16081 MMU is that apparently it has 512-byte pages ingrained into its structure, and so those of you who want larger page sizes than that are in for an interesting time. Or so our CPU board designer says.

sitting on the software side of the fence,
waiting for my microVAX,
with 4.2BSD,

Erik E. Fair   {ucbvax,amd70,zehntel,unisoft,onyx,its}!dual!fair
Dual Systems Corporation, Berkeley, California

P.S. If I got the 16081 and 16082 confused, sorry. Hardware hack, I ain't.
henry@utzoo.UUCP (Henry Spencer) (01/10/84)
Erik Fair observes:

    .......One minor note on the NS [16032] MMU is that apparently it has
    512-byte pages ingrained into its structure, and so those of you who
    want larger page sizes than that are in for an interesting time....

It's quite true that the MMU has 512-byte pages ingrained into it. So? The VAX does too. If you want bigger pages, you simply do everything N 512-byte pages at a time instead of one 512-byte page at a time. As an example, National's 4.xBSD port does everything 2 pages at a time to give the effect of 1024-byte pages. Almost any paging hardware has to make a firm decision about page size and stick to it.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
tjt@kobold.UUCP (01/10/84)
The problem with compensating for a small page size by pretending the page size is bigger (as is done in 4BSD for the VAX and in National's 4.xBSD port) is that page tables take up N times as much space, and the inner loops for setting up page tables run N times slower. On the other hand, a large page size causes you to lose more storage to fragmentation.

Actually, I had heard that the real reason the VAX used 512-byte pages was because the unibus disks used 512-byte blocks. Actually, the physical disk block sizes are less important than the logical file system block size.

Anyway, does anyone know if it is possible to use the NS 16032 MMU with some additional logic to look at function codes and fool it into using a larger page size directly? I.e., the MMU would normally ignore the bottom 9 bits of the address and only look at the page number. You can feed whatever bits you want in as the page number. Similarly, you can route the translated page number onto whatever high-order bits you like. By definition, the low-order bits (page offset) of the address aren't changed by the MMU. However, if the MMU does not find the right page table entry in its on-chip cache, it will fetch it from memory, in which case the memory system needs to use exactly the address generated by the MMU.
--
Tom Teixeira, Massachusetts Computer Corporation. Westford MA
...!{ihnp4,harpo,decvax,ucbcad,tektronix}!masscomp!tjt   (617) 692-6200
chongo@nsc.UUCP (Landon Noll) (01/11/84)
Good comments on the DUAL. Now in reply to:

>One minor note on the NS 16081 MMU is that apparently it has
>512-byte pages ingrained into its structure, and so those of you who
>want larger page sizes than that are in for an interesting time. Or so
>our CPU board designer says.
>
>	sitting on the software side of the fence,
>	waiting for my microVAX,
>	with 4.2BSD,

At National we don't find a "problem" with the 512-byte hardware pages. If you want 1k pages, cluster them two at a time. A nice #define or two and you don't notice the difference at the source file level.

Now some of you folks might say that such clustering work-arounds are a problem, but that's picky. As far as hardware is concerned, our systems do it quite well. Perhaps your hardware person picked up another random rumor?

chongo <lowercase them letters!> /\oo/\
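For what it's worth, the #define-level clustering chongo describes might look roughly like the sketch below. All the names here (HW_PGBYTES, CLSIZE, map_logical_page, the stubbed-out set_hw_pte) are invented for illustration, not taken from National's actual 4.xBSD port; the point is only the shape of the work-around: one "logical" 1K page becomes two consecutive 512-byte hardware mappings.

/* Illustrative sketch of page clustering; names are hypothetical.
 * A "logical" page is CLSIZE hardware pages, so mapping one logical page
 * just installs CLSIZE consecutive 512-byte hardware page table entries. */
#include <stdio.h>

#define HW_PGBYTES  512                     /* page size the MMU understands */
#define CLSIZE      2                       /* hardware pages per logical page */
#define CLBYTES     (CLSIZE * HW_PGBYTES)   /* 1024-byte logical pages */

/* Stub standing in for whatever really writes an MMU page table entry. */
static void set_hw_pte(unsigned long hw_vpn, unsigned long hw_pfn)
{
    printf("hardware vpage %lu -> hardware frame %lu\n", hw_vpn, hw_pfn);
}

/* Map logical virtual page lvpn onto logical page frame lpfn. */
static void map_logical_page(unsigned long lvpn, unsigned long lpfn)
{
    int i;

    for (i = 0; i < CLSIZE; i++)
        set_hw_pte(lvpn * CLSIZE + i, lpfn * CLSIZE + i);
}

int main()
{
    map_logical_page(3, 7);     /* installs hardware pages 6,7 -> frames 14,15 */
    return 0;
}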
crandell@ut-sally.UUCP (Jim Crandell) (01/11/84)
> If you have a severe real-time problem, use a 68000 and *no* MMU.
A severe real-time problem OTHER THAN NUMBER CRUNCHING, you mean.
--
Jim ({ihnp4,kpno,ut-ngp}!ut-sally!crandell or crandell@ut-sally.UUCP)
henry@utzoo.UUCP (Henry Spencer) (01/11/84)
Tom Teixeira asks:

    Anyway, does anyone know if it is possible to use the NS 16032 MMU
    with some additional logic to look at function codes and fool it into
    using a larger page size directly? I.e., the MMU would normally ignore
    the bottom 9 bits of the address and only look at the page number.
    You can feed whatever bits you want in as the page number. Similarly,
    you can route the translated page number onto whatever high-order bits
    you like. By definition, the low-order bits (page offset) of the
    address aren't changed by the MMU. However, if the MMU does not find
    the right page table entry in its on-chip cache, it will fetch it from
    memory, in which case the memory system needs to use exactly the
    address generated by the MMU.

Given that the 16032 MMU is *not* just a combinatorial logic block, but a complex state machine which sits on the cpu bus, manipulates it in non-trivial ways, and does its own memory accesses using that same bus, and given that the same set of wires carries virtual addresses from the cpu, physical addresses from the MMU, and data to and from both, pulling devious tricks involving reshuffling the bits strikes me as asking for trouble. I would recommend against it.

It's true that simply using N "real" pages as a single "logical" page involves extra overhead in space and time, but if these overheads are significant then I would suspect the software of doing something stupid.
--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
tjt@kobold.UUCP (01/12/84)
The time and space overhead of treating N "real" pages as a single "logical" page should indeed be insignificant if N is small (say 2 or 4). Once N gets as large as 8, I think the overheads may become noticeable, although I do not have measurements to back this up.

However, the amount of page table space required to map a 4M image with 512-byte pages is 8K entries, which is 32K bytes if each entry requires 4 bytes. This represents about 3% of a 1M physical memory system used for storing page tables. A system with 4K-byte pages would only require 1K entries, or 4K bytes, or about .4% of the same physical memory. Of course, a system with 4K-byte pages will waste additional space due to fragmentation.

Because of these opposing tradeoffs, there is an optimum page size for a given segment size, which grows roughly as the square root of the segment size (assuming uniformly distributed segment sizes). It turns out that for segments ranging from 0 to ~64K bytes and 4-byte page table entries, the optimal page size is 512 bytes. For segments from 0-.25M the optimal page size is 1K bytes, for 0-1M segments 2K bytes, and for 0-4M segments the optimal page size is 4K bytes.

Although there are lots of unix programs in the 100-200K byte range (on the vax: csh, vi, emacs), there aren't many larger (however, there is always franz and macsyma). On the other hand, the extra 2% of memory you would recover by using a larger page size is probably not going to seriously affect the performance of one of these huge programs running on a 1M system anyway.
--
Tom Teixeira, Massachusetts Computer Corporation. Westford MA
...!{ihnp4,harpo,decvax}!masscomp!tjt   (617) 692-6200 x275
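For anyone who wants to check those figures, the standard model behind them balances page-table bytes against internal fragmentation: a segment of s bytes with p-byte pages and e-byte entries costs about s*e/p bytes of page table plus p/2 bytes of wasted space, which is minimized at p = sqrt(2*e*s), with s taken as the mean segment size (half the maximum for a uniform distribution). This is my reconstruction of the model, not necessarily the exact one Tom used, but the small C program below reproduces his numbers.

/* Rough check of the optimal-page-size figures.  The model is an assumption:
 * overhead(p) = s*e/p (page table) + p/2 (fragmentation), minimized at
 * p = sqrt(2*e*s), with s the mean segment size.  Compile with -lm. */
#include <stdio.h>
#include <math.h>

int main()
{
    static const double max_seg[] = { 65536.0, 262144.0, 1048576.0, 4194304.0 };
    const double e = 4.0;                   /* bytes per page table entry */
    int i;

    for (i = 0; i < 4; i++) {
        double mean = max_seg[i] / 2.0;     /* uniform on 0..max */
        double p = sqrt(2.0 * e * mean);    /* optimal page size in bytes */
        printf("segments 0-%4.0fK: optimal page size %4.0f bytes\n",
               max_seg[i] / 1024.0, p);
    }
    return 0;
}
/* Prints 512, 1024, 2048 and 4096 bytes for the four ranges. */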
tjt@kobold.UUCP (01/14/84)
nsc!chongo says:

    Now some of you folks might say that such clustering work-arounds
    are a problem, but that's picky.

As a software engineer, I'm sick and tired of producing software work-arounds for hardware designed by people who think I have nothing better to do than deal with the intellectual challenge of getting that hardware to work. These same designers often have the gall to complain about how awkward the software is to deal with.

If you think people complain about unix, just think what it would be like if it was designed like a typical microprocessor support chip: read-only files, write-only files, editors that only work on 16-line files, ....
--
Tom Teixeira, Massachusetts Computer Corporation. Westford MA
...!{ihnp4,harpo,decvax}!masscomp!tjt   (617) 692-6200 x275
wmb@sun.uucp (Mitch Bradley) (01/15/84)
> As a software engineer, I'm sick and tired of producing software
> work-arounds for hardware designed by people who think I have nothing
> better to do than deal with the intellectual challenge of getting that
> hardware to work. These same designers often have the gall to complain
> about how awkward the software is to deal with.
> If you think people complain about unix, just think what it would be
> like if it was designed like a typical microprocessor support chip:
> read-only files, write-only files, editors that only work on 16-line
> files, ....

I agree that lots of chips are hard to deal with, and that a lot of hardware is poorly designed from a software standpoint. On the other hand, the constraints that hardware designers work under are often different from those of software engineers. Silicon, in particular, is an unforgiving medium. A mistake takes months to correct. Every feature is additional complexity, which translates to increased risk, longer time to market, and more silicon area (which translates to reduced yield and thus higher cost).

The pat answer to your complaint is: if you don't like a chip, don't use it. Unfortunately, there may be no alternative. Why? Because the manufacturer who tried to add all the nice features isn't shipping yet, while the manufacturer who traded niceness for getting functionality to the market sooner is raking in the dough selling his brain-damaged chips. Look at Intel with the 8080, then the 8086.

The point I am trying to make is that everybody has constraints, including hardware designers. Functionality sells, not elegance. Let's wish for both, but take what we can get.

Hardwarily speaking,
Mitch Bradley
Sun Microsystems, Inc.
guy@rlgvax.UUCP (Guy Harris) (01/16/84)
But Tom wasn't complaining about the lack of features as much as he was complaining about misdesigned features. An MMU with 4KB pages shouldn't take any more design time or silicon than one with 512-byte pages. I won't get into the debate about which is better, but there really is no excuse for a chip that does something in a fashion that is awkward for software to deal with, when doing it the right way wouldn't have taken any more time or silicon.

Guy Harris
{seismo,ihnp4,allegra}!rlgvax!guy