[net.micro.ns32k] National's 32332

clif@intelca.UUCP (05/30/86)

> In article <216@motsj1.UUCP> kjm@motsj1.UUCP (Kevin Meyer) writes:
> >>
> >> I know someone who did benchmarks comparing the 32332 to the 68020,
> >> with all else being identical (RAM, etc).  A 15MHz 32332 is faster
> >> than a 16MHz 68020.
> >>
> >Shouldn't this be posted to net.rumor?
> 
> 
> Or maybe net.jokes?
> -- 
> David Herron,  cbosgd!ukma!david, david@UKMA.BITNET, david@uky.csnet

	Would someone post the benchmarks so the rest of us can
evaluate the results?

	I think someone should run the Dhrystone benchmark on a 32332
machine; otherwise, I'd be forced to conclude that the part really
isn't that fast.

	The only 32K machine that has Dhrystone numbers posted to the net
is the Sequent Balance for a single processor.  

	The results were 1250 without Reg variables and 1315 with reg
variables.  These numbers were slower than a 6 MHz PC-AT with MicroSoft
C 3.0.  

	In order for the 15 MHz 32332 to be faster than a 68020 (much less
a 386), it would have to be 3x faster than a 10 MHz 32032.  The increased
clock frequency accounts for 1.5x; I am hard pressed to imagine where
the other 2x could come from.
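
The arithmetic behind that estimate can be sketched as follows. This is only an illustration of the reasoning in the post: the 3x requirement is the poster's own estimate, the 1315 figure is the Sequent Balance result quoted above, and the function name is made up for the example.

```python
# Figures quoted in the post: a 10 MHz 32032 (Sequent Balance) scored
# 1315 Dhrystones/sec with register variables.
BASE_DHRYSTONES = 1315
BASE_CLOCK_MHZ = 10
NEW_CLOCK_MHZ = 15
REQUIRED_SPEEDUP = 3.0  # the poster's estimate, not a measured value


def split_speedup(required, old_mhz, new_mhz):
    """Split a required speedup into (clock factor, per-clock factor)."""
    clock_factor = new_mhz / old_mhz
    per_clock_factor = required / clock_factor
    return clock_factor, per_clock_factor


clock_factor, per_clock_factor = split_speedup(
    REQUIRED_SPEEDUP, BASE_CLOCK_MHZ, NEW_CLOCK_MHZ
)
target = BASE_DHRYSTONES * REQUIRED_SPEEDUP

print(f"from clock alone:     {clock_factor:.1f}x")
print(f"needed per clock:     {per_clock_factor:.1f}x")
print(f"implied Dhrystones/s: {target:.0f}")
```

The per-clock factor is what would have to come from architectural changes rather than from the faster clock.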

	These are just the observations of a disinterested :-) person.

-- 
Clif Purkiser, Intel, Santa Clara, Ca.
HIGH PERFORMANCE MICROPROCESSORS
{pur-ee,hplabs,amd,scgvaxd,dual,idi,omsvax}!intelca!clif
	
{standard disclaimer about how these views are mine and may not reflect
the views of Intel, my boss, or USENET goes here. }

seifert@hammer.UUCP (Snoopy) (05/31/86)

In article <699@oakhill.UUCP> davet@oakhill.UUCP (Dave Trissel) writes:

>We hope to be in standard production of 25 Mhz units by the end of this year.
>
>  --  Dave Trissel  Motorola Semiconductor, Austin, Texas
>    {ihnp4,seismo}!ut-sally!im4u!oakhill!davet

When will MMUs for these be available in quantity?

(really! I'm sure the 440x group would like to know!)

Snoopy
tektronix!tekecs!doghouse.TEK!snoopy

bjorn@alberta.UUCP (Bjorn R. Bjornsson) (06/06/86)

In article <55@intelca.UUCP> clif@intelca.UUCP (Clif Purkiser) writes:
> 	The only 32K machine that has Dhrystone numbers posted to the net
> is the Sequent Balance for a single processor. 
>
> 	The results were 1250 without Reg variables and 1315 with reg
> variables.  These numbers were slower than a 6 MHz PC-AT with MicroSoft
> C 3.0. 

Far be it from me to call Mr. Purkiser biased B-).
But let's get it straight that the PC/AT benchmark being
referred to is "SMALL MODEL" and (correct me if I'm wrong)
16 bit ints.  Contrast that to 32 bit addresses and ints
in the Balance.

Note also that benchmarks figures for other NS32k systems are
available from the latest Dhrystone posting.  Here are the figures
for 80286s and NS32ks:

(Excuse the length of the benchmark results.  The PC/AT is one of
 the most benchmarked systems, by virtue of its popularity
 unfortunately [That's my reasoned bias showing through].
 I felt that the full breadth of 80286 "performance" would
 not be as easily discernible without all the numbers that
 are available).

From message <1369@homxb.UUCP>:

|*----------------DHRYSTONE VERSION 1.1 RESULTS BEGIN--------------------------
|*
|* MACHINE	MICROPROCESSOR	OPERATING	COMPILER	DHRYSTONES/SEC.
|* TYPE				SYSTEM				NO REG	REGS
|* --------------------------	------------	-----------	---------------
|*
|* Compaq II	80286-8Mhz	MSDOS 3.1	MS C 3.0 	1086	1140 LM
		      ----			----------------------------
|* IBM PC/AT    80286-7.5Mhz    Venix/286 SVR2  cc              1159    1254 *15
|* Compaq II	80286-8Mhz	MSDOS 3.1	MS C 3.0 	1190	1282 MM
|* Compaq II	80286-8Mhz	MSDOS 3.1	MS C 3.0 	1351	1428
|*
|*----------------DHRYSTONE VERSION 1.0 RESULTS BEGIN--------------------------
|*
|* IBM PC/AT	80286-6Mhz	PCDOS 3.0	CI-C86 2.1	 666	 684
|* IBM PC/AT	80286-6Mhz	Xenix 3.0	cc		 684	 704 MM
|* IBM PC/AT	80286-6Mhz	Xenix 3.0	cc		 704	 714 LM
|* IBM PC/AT	80286-6Mhz	PCDOS 3.0	MS 3.0(large)	 833	 847 LM
		      ----			----------------------------
|* IBM PC/AT	80286-6Mhz	Xenix 3.0	cc -i		 909	 925
|* IBM PC/AT	80286-6Mhz	Xenix 3.0	cc		 892	 961
|* IBM PC/AT	80286-6Mhz	Venix/86 2.1	cc		 961	1000
|* IBM PC/AT	80286-6Mhz	PCDOS 3.0	b16cc 2.0	 943	1063
|* NSC ICM-3216 NSC 32016-10Mhz	UNIX SVR2	cc		1041	1084
|* IBM PC/AT	80286-6Mhz	PCDOS 3.0	MS 3.0(small)	1063	1086
						^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|* IBM PC/AT    80286-6Mhz      Venix/286 SVR2  cc              1056    1149
|* IBM PC/AT	80286-6Mhz	PCDOS 3.0	Datalight 1.10	1190	1190
|* ATT PC6300+	80286-6Mhz	MSDOS 3.1	b16cc 2.0	1111	1219
|* IBM PC/AT	80286-6Mhz	PCDOS 3.1	Wizard 2.1	1136	1219
|* IBM PC/AT	80286-6Mhz	PCDOS 3.0	CI-C86 2.20M	1219	1219
|* IBM PC/AT	80286-6Mhz	PCDOS 3.1	Lattice 2.15	1250	1250
|* IBM PC/AT	80286-7.5Mhz	Venix/86 2.1	cc		1190	1315 *15
|* Intel 380	80286-8Mhz	Xenix R3.0up1	cc		1250	1315 *16
|* Sequent Balance 8000	NS32032-10MHz	Dynix 2.0	cc	1250	1315 N12
|* IBM PC/DSI-32 32032-10Mhz	MSDOS 3.1	GreenHills 2.14	1282	1315 C3
|* IBM PC/AT	80286-8Mhz	Venix/86 2.1	cc		1275	1380 *16
|* IBM PC/AT	80286-6Mhz	MSDOS 3.0	Microsoft 3.0	1250	1388
						^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|* ATT PC6300+	80286-6Mhz	MSDOS 3.1	CI-C86 2.20M	1428	1428
|* COMPAQ/286   80286-8Mhz      Venix/286 SVR2  cc              1326    1443
|* IBM PC/AT    80286-7.5Mhz    Venix/286 SVR2  cc              1333    1449 *15
|* IBM PC/AT    80286-9Mhz      SCO Xenix V     cc              1540    1556 *18
|* NEC PC-98XA	80286-8Mhz	PCDOS 3.1	Lattice 2.15	1724	1724 @
|* IBM PC/STD	80286-8Mhz	MSDOS 3.0 	Microsoft 3.0	1724	1785 C2
|* Intel 310AP	80286-8Mhz	Xenix 3.0	cc		1893	2009

(NOTE: LM == Large Model, MM == Medium Model, all other 80286 == small model)
(Presumably the ICM-3216 and the Balance are running with an MMU,
 while on the DSI-32 an MMU is optional)

I would like to call attention to the results for LARGE MODEL code,
which are not nearly as impressive as when the 80286 is operating closer
to its origins, in more senses than one.
It would be interesting to obtain figures for LARGE MODEL code
with 32 bit ints (My guess is a further deterioration in speed by
15 to 30%).
Last time I checked, the Xenix C compiler is Microsoft C version 3,
but it's not clear whether that is the compiler referred to as "cc"
in the Xenix 3.0 benchmarks.
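
To put a number on the large model penalty, the sketch below compares the small and large model rows for Microsoft C 3.0 from the table above. The figures are copied from the table; the helper name and the pairing of rows are choices made for this illustration, and the Compaq II row without a tag is taken to be small model per the note above.

```python
# (no-register, register) Dhrystones/sec, copied from the table above.
RESULTS = {
    "IBM PC/AT 6 MHz, MS C 3.0 (V1.0 section)": {
        "small": (1063, 1086),
        "large": (833, 847),
    },
    "Compaq II 8 MHz, MS C 3.0 (V1.1 section)": {
        "small": (1351, 1428),
        "large": (1086, 1140),
    },
}


def slowdown_percent(small, large):
    """Percentage by which the large model result trails the small model one."""
    return 100.0 * (1.0 - large / small)


for machine, models in RESULTS.items():
    for label, small, large in zip(
        ("no reg", "reg"), models["small"], models["large"]
    ):
        pct = slowdown_percent(small, large)
        print(f"{machine:42s} {label:6s} large model is {pct:4.1f}% slower")
```

On these rows the large model costs roughly a fifth of the small model throughput, before any further penalty for 32 bit ints.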

In any case it's not known which of the above benchmarks under
the version 1.0 heading are actually 1.1 benchmarks.  The difference
in speed between the two versions is on the order of 10 to 20%.
It would seem that the former of the ^^^^ underlined figures is
for 1.1 and the latter for 1.0.

> 	In order for the 15 MHz 32332 to be faster than an 68020 (much less
> a 386) it would have be 3x faster than an 10MHz 32032.  The increased
> clock frequency accounts for 1.5x, I am hard pressed to imagine where
> the other 2x could come from.

I may find that hard to imagine.  But why would you?
If we are to believe the Dhrystone figures posted to the net,
your company did just that in going from the 8086 to the 80286, ie.
obtained a 2x speedup for the same clock frequencies.

> 	This is just the observations of a disinterested :-) person.

You may be disinterested, but I am not.  I'm tempted to resort
to personal insults here, as the amount of resources that have
been/will be spent on x86 systems does not bear thinking (or
smiling) about, but of course I'll refrain.

> --
> Clif Purkiser, Intel, Santa Clara, Ca.
> HIGH PERFORMANCE MICROPROCESSORS
> {pur-ee,hplabs,amd,scgvaxd,dual,idi,omsvax}!intelca!clif

			Bjorn R. Bjornsson
			Department of Computing Science
			University of Alberta
			Edmonton

			ihnp4!alberta!bjorn

jans@tekecs.UUCP (Jan Steinman) (06/11/86)

>> 	In order for the 15 MHz 32332 to be faster than an 68020 (much less
>> a 386) it would have be 3x faster than an 10MHz 32032.  The increased
>> clock frequency accounts for 1.5x, I am hard pressed to imagine where
>> the other 2x could come from.
>
>I may find that hard to imagine.  But why would you?  If we are to believe
>the Dhrystone figures posted to the net, your company did just that in
>going from the 8086 to the 80286, ie. obtained a 2x speedup for the same
>clock frequencies.

Those who can't imagine where Nati could find the time have not carefully
examined the instruction times.  Two areas stand out like sore thumbs:
memory access time (5 clocks per int on the 32032 with MMU, even more for
the 32016), and address calculation time.  Since this was Nati's first major
CPU in quite a while, I see no reason to doubt that memory access times
could be improved to the 'state of the art' times found in Intel or Mota
products.

The address times are particularly depressing on the 32032.  As a
specialist in fast interpreters, I'm not particularly pleased with how
Nati apparently dealt with this, but deep prefetch is the classical way
of improving address calculation *throughput*, if not time, and the
addition of the barrel shifter would certainly improve the scaled index
calculation times.

In all, I find it just as plausible that Nati can improve their first cut
by a factor of three as it is that Mota can do the same.  I find it harder
to imagine that Intel can improve their `nth' cut by a similar margin!
How about it, guys?  Why so much silicon to get to the same place Nati got
in two tries?
-- 
:::::: Artificial   Intelligence   Machines   ---   Smalltalk   Project ::::::
:::::: Jan Steinman		Box 1000, MS 60-405	(w)503/685-2956 ::::::
:::::: tektronix!tekecs!jans	Wilsonville, OR 97070	(h)503/657-7703 ::::::

clif@intelca.UUCP (06/11/86)

> 
> Far be it from me to call Mr. Purkiser biased B-).
> But let's get it straight that the PC/AT benchmark being
> referred to is "SMALL MODEL" and (correct me if I'm wrong)
> 16 bit ints.  Contrast that to 32 bit addresses and ints
> in the Balance.
> 
I was kidding about myself being unbiased, hence the :-) symbol.
Yes, the PC-AT Dhrystone numbers I quoted were small model; last
I looked, Dhrystone fit easily in 64K of code and data.

> A long list of Dhrystone numbers was deleted here along
> with some comments about the "unfortunate" popularity of PC AT
> machines 




(quotes from my original article in >> )
> > 	In order for the 15 MHz 32332 to be faster than an 68020 (much less
> > a 386) it would have be 3x faster than an 10MHz 32032.  The increased
> > clock frequency accounts for 1.5x, I am hard pressed to imagine where
> > the other 2x could come from.
> 
> I may find that hard to imagine.  But why would you?
> If we are to believe the Dhrystone figures posted to the net,
> your company did just that in going from the 8086 to the 80286, ie.
> obtained a 2x speedup for the same clock frequencies.
> 
> > 	This is just the observations of a disinterested :-) person.
> 
> You may be disinterested, but I am not.  I'm tempted to resort
> to personal insults here, as the amount of resources that have
> been/will be spent on x86 systems does not bear thinking (or
> smiling) about, but of course I'll refrain.
> 
> 			Bjorn R. Bjornsson
> 			Department of Computing Science
> 			University of Alberta
> 			Edmonton
> 
> 			ihnp4!alberta!bjorn


My point was not that PC-ATs and 286s are faster than the 32032.  This
point (really 16 bits vs. 32 bits) has been debated for some time on the
network, and I doubt that there is anything more either one of us
could add to the discussion.

Rather, my question was: what significant architectural improvements were made
to the 32332 to give it a 2x performance boost at the same clock frequency?

After reading the data sheet and other information on the part I am only
aware of a few improvements made to the 32332 vs 32032.
	1. Burst Mode Bus, useful for cache systems and possibly
	   DMA-like transfers
	2. 32 address lines, no performance increase
	3. large prefetch unit, some performance boost.

How could these architectural improvements double the performance?
It is certainly possible that I may have overlooked some important things
since I can no longer find my 32332 literature.  However in my opinion
the main improvement that the 32332 has over the 32032 is clock speed.
The clock speed boost alone would not make it faster than a 68020. 

In contrast, there were many architectural improvements in the 286 vs. the 8086:
additional commonly used instructions (e.g. push immediate), faster multiply, etc.
The most important 8086-to-286 improvement was the addition of special-purpose
hardware to calculate the effective address.  Effective address calculation,
which took 6-12 clocks on an 8086, takes only 1 clock on the 286.  This resulted
in an almost 2/3 reduction in average clocks/instruction between an 8086 and a
286, and is responsible for the 2.5-3x performance advantage at the same clock
frequency.
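
The link between a clocks/instruction reduction and a same-clock speedup can be sketched as below. It assumes the instruction count stays the same, so that run time scales directly with average clocks per instruction; the function names are invented for the illustration.

```python
from fractions import Fraction


def same_clock_speedup(cpi_reduction):
    """Speedup at a fixed clock when average clocks/instruction drops
    by the given fraction (instruction count assumed unchanged)."""
    if not 0 <= cpi_reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return 1 / (1 - cpi_reduction)


def cpi_reduction_for(speedup):
    """Inverse: the clocks/instruction reduction implied by a speedup."""
    return 1 - 1 / Fraction(speedup)


print(same_clock_speedup(Fraction(2, 3)))   # a full 2/3 reduction
print(cpi_reduction_for(Fraction(5, 2)))    # reduction implied by 2.5x
print(cpi_reduction_for(3))                 # reduction implied by 3x
```

Under that assumption, the quoted 2.5-3x range corresponds to a reduction of between 3/5 and 2/3 in clocks per instruction, which matches the "almost 2/3" wording.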

The major architecture improvement of the 386 compared to the 286 is going from
16-bits to 32-bits.  The other improvements, orthogonal registers, the
addition of scaled-index addressing, and the shaving of 1 clock off the
execution of most common instructions, while important, are not as
dramatic as the 8086 to 80286 performance improvements.  

I stand by my statement that I do not believe that a 15MHz 32332 is faster
than a 16.7 MHz 68020 (much less an 80386).  If somebody actually gets
4000+ Dhrystones from a 32332 I would be surprised.   While I do not 
believe that Dhrystones are a perfect measurement of performance, I
think they are pretty darn good.  
-- 
Clif Purkiser, Intel, Santa Clara, Ca.
{pur-ee,hplabs,amd,scgvaxd,dual,idi,omsvax}!intelca!clif
	
{Stamp Out Stupid Signatures}

bjorn@alberta.UUCP (Bjorn R. Bjornsson) (06/13/86)

In <88@intelca.UUCP> clif@intelca.UUCP (Clif Purkiser) writes:
> After reading the data sheet and other information on the part I am only
> aware of a few improvements made to the 32332 vs 32032.
>	1. Burst Mode Bus, useful for cache systems and possibly
>	   DMA-like transfers
>	2. 32 address lines, no performance increase
>	3. large prefetch unit, some performance boost.

The speed improvements (32332 over 32032) come from:

	1. Separate address path ALU with barrel shifter.
	   As with the 80286 vs 8086, the extra addressing hardware
	   is responsible for a major part of the performance gains.

	2. Burst mode bus.  Transfers up to 16 bytes on operand reads
	   and instruction prefetch.

	3. 20 byte prefetch queue (vs. 8 bytes for the 32032).

	4. Slight microcode improvements.

In a system implementation, additional improvements come
from the following:

	a) In memory managed systems, there is 1 less clock per
	   bus cycle, ie. 4 (vs. 5).
	b) Pages are 4k bytes (vs. 512).
	c) MMU has 32 bit data bus (vs. 16) (for TLB misses).
	d) Slave processors use 32 bit transfers.



			Bjorn R. Bjornsson
			Department of Computing Science
			University of Alberta
			Edmonton

			ihnp4!alberta!bjorn

kevin@tolerant.UUCP (06/14/86)

> > > 	In order for the 15 MHz 32332 to be faster than an 68020 (much less
> > > a 386) it would have be 3x faster than an 10MHz 32032.  The increased
> > > clock frequency accounts for 1.5x, I am hard pressed to imagine where
> > > the other 2x could come from.
> > 
> > I may find that hard to imagine.  But why would you?
> > If we are to believe the Dhrystone figures posted to the net,
> > your company did just that in going from the 8086 to the 80286, ie.
> > obtained a 2x speedup for the same clock frequencies.
> > 
> My point was not that PC-ATs and 286 are faster than 32032.  This 
> point (really 16-bits vs 32-bits) has being debated for some time on the 
> network and I doubt that there is anything more either one of us
> could add to the discussion. 
> 
> Rather my point was what significant architectural improvements were made
> to the 32332 to give it a 2x performance boost at the same clock frequency?
> 
> After reading the data sheet and other information on the part I am only
> aware of a few improvements made to the 32332 vs 32032.
> 	1. Burst Mode Bus, useful for cache systems and possibly
> 	   DMA-like transfers
> 	2. 32 address lines, no performance increase
> 	3. large prefetch unit, some performance boost.
> 
> How could these architectural improvements double the performance?
> It is certainly possible that I may have overlooked some important things
> since I can no longer find my 32332 literature.  However in my opinion
> the main improvement that the 32332 has over the 32032 is clock speed.
> The clock speed boost alone would not make it faster than a 68020. 
> 
> In contrast, there were many architectural improvements in the 86 vs the 286.
> Additional commonly used instructions e.g. push immediate, faster multiply etc.
> The most important 8086 to 286 improvement was the addition of special
> purpose hardware to calculate the effective address.  Effective address
> calculation which took 6-12 clocks on an 8086 take only 1 clock on the
> 286.  This resulted in an almost 2/3 reduction in average clocks/instruction
> between an 8086 and 286.  This is responsible for the 2.5-3x performance
> advantage at the same clock frequency.

National also added a special-purpose ALU to the 32332 to calculate
the effective address, and I believe that this accounts for a significant
amount of the additional speed of the 32332.

> 
> The major architecture improvement of the 386 compared to the 286 is going from
> 16-bits to 32-bits.  The other improvements, orthogonal registers, the
> addition of scaled-index addressing, and the shaving of 1 clock of the
> execution of most common instructions, while important, are not as
> dramatic as the 8086 to 80286 performance improvements.  
> 

National also reduced the number of clock cycles per bus cycle. I believe
that bus cycles are now 2 clock cycles without an MMU and 3 with an MMU,
whereas the 32032 takes 4 clock cycles per bus cycle without an MMU and 5
clock cycles with an MMU. It's been quite some time since I saw National's
presentation so I may not be entirely correct about the numbers, but I
do remember that there are fewer clock cycles per bus cycle.

> I standby my statement that I do not believe that an 15MHz 32332 is faster
> than a 16.7 MHz 68020 (much less a 80386).  If somebody actually gets
> 4000+ Dhrystones from a 32332 I would be surpised.   While I do not 
> believe that Dhrystones are a perfect measurement of performance, I
> think they are pretty darn good.  
> -- 

The National 32032 and 32332 and the Motorola 680XX all have much better
instruction sets than the Intel 8086, 186, 286, and 386.  This may not be
as noticeable at the machine code level, but in compiled code, especially
'C', the National and Motorola instruction sets are much more efficient.


-- 
Kevin Flory @ Tolerant Systems, San Jose CA
..{bene,mordor,nsc,oliveb,pyramid,ucbvax}!tolerant!kevin

farren@hoptoad.UUCP (06/16/86)

In article <367@tolerant.UUCP> kevin@tolerant.UUCP (Kevin Flory) writes:
>
>The National 32032, 32332, and the Motorola 680XX both have a much better 
>instuction set than the Intel 8086, 186, 286, and 386. This may not be
>as noticible at the machine code level, but when in compiled codes, especialy
>'C', the National and Motorola instrucitons sets are much more efficient.

Executable file sizes, 6502 assembler program:

Intel, 8086, Microsoft C 3.0 -> 15110
Motorola 68000, UniSoft cc   -> 19500

This is more efficient?

----------------
Mike Farren
hoptoad!farren

kevin@tolerant.UUCP (Kevin Flory) (06/16/86)

> In article <367@tolerant.UUCP> kevin@tolerant.UUCP (Kevin Flory) writes:
> >
> >The National 32032, 32332, and the Motorola 680XX both have a much better 
> >instuction set than the Intel 8086, 186, 286, and 386. This may not be
> >as noticible at the machine code level, but when in compiled codes, especialy
> >'C', the National and Motorola instrucitons sets are much more efficient.
> 
> Executable file sizes, 6502 assembler program:
> 
> Intel, 8086, Microsoft C 3.0 -> 15110
> Motorola 68000, UniSoft cc   -> 19500
> 
> This is more efficient?
> 
> ----------------
> Mike Farren
> hoptoad!farren

No, not as far as memory space goes, but let's not compare different compilers
with different libraries and expect people to use this as a measure of
efficiency.

But I wasn't referring to memory space. I was referring to execution; we were
talking about execution speed. If you would like to demonstrate this, you
can take a compiler that either runs on both machines or produces
code for both processors. Write a small 'C' routine that uses a number of
'C' instructions, with not many library calls. I believe that you will
find that the 68K and the 32K are much more useful to the compiler.


 

-- 
Kevin Flory @ Tolerant Systems, San Jose CA
..{bene,mordor,nsc,oliveb,pyramid,ucbvax}!tolerant!kevin

pete@octopus.UUCP (06/16/86)

In article <865@hoptoad.uucp> farren@hoptoad.UUCP (Mike Farren) writes:
>(a quote):
>>...when in compiled codes, especialy
>>'C', the National and Motorola instrucitons sets are much more efficient.
>
>Executable file sizes, 6502 assembler program:
>
>Intel, 8086, Microsoft C 3.0 -> 15110
>Motorola 68000, UniSoft cc   -> 19500
>
>This is more efficient?
>

Come on folks! Now we're *really* comparing apples and oranges! Small
executable files are only minimally affected by the compiled source code.
A *much* larger factor is how well the compiler/library manufacturer has
broken down the library into small chunks. The above numbers are really
a comparison of Microsoft's PC C library vs. UniSoft's Unix/68000 C library,
in terms of how small a subset of the full library is pulled in at a time.
Let's get back to apples vs. apples... and let's talk about something more
interesting and more obviously comparable, like support chips (whose
floating point is better?), etc...
-- 
  OOO   __| ___      Peter Holzmann, Octopus Enterprises
 OOOOOOO___/ _______ USPS: 19611 La Mar Court, Cupertino, CA 95014
  OOOOO \___/        UUCP: {hplabs!hpdsd,pyramid}!octopus!pete
___| \_____          Phone: 408/996-7746

wje@lewey.UUCP (Bill Earl) (06/17/86)

>>The National 32032, 32332, and the Motorola 680XX both have a much better 
>>instuction set than the Intel 8086, 186, 286, and 386. This may not be
>>as noticible at the machine code level, but when in compiled codes, especialy
>>'C', the National and Motorola instrucitons sets are much more efficient.
> Executable file sizes, 6502 assembler program:
> Intel, 8086, Microsoft C 3.0 -> 15110
> Motorola 68000, UniSoft cc   -> 19500
> This is more efficient?

    I have had occasion to compile 4.2 BSD system code for both 80286
and 32032, under comparable execution conditions, meaning that both 
compilers generated 32-bit pointers and 32-bit int's (not 16-bit pointers
and 16-bit int's as with some 80286 compilers).  The 32032 code was
about the same size as the VAX code.  The 80286 code was about 70
per cent larger.  Some of the difference was due to slightly less
sophisticated optimization in the 80286 compiler, but there was no
reasonable way to get the 80286 code to be less than 50 per cent
larger than the 32032 code, given comparable levels of optimization.
The main problems on the 80286 were the small number of pointer
registers (2) and the lack of built-in 32-bit arithmetic.  The 80386
is much better than the 80286, in 80386 "flat" model (32-bit registers,
32-bit addresses, no segmentation), but 80386 flat model is essentially
incompatible with 8086 and 80286 compilers.  

    Some people seem to like the intellectual challenge of fitting
complex programs onto inadequate hardware, such as the 80286, but
doing so is not likely to be cost-effective, except for mass-market
products such as Lotus.  (By complex, I mean programs which require
several megabytes of memory to execute.)

	William J. Earl
	American Information Technology, Cupertino, CA
	408-252-8713
	...!decwrl!nsc!voder!lewey!wje



daveh@cbmvax.cbm.UUCP (Dave Haynie) (06/17/86)

> 
> In article <367@tolerant.UUCP> kevin@tolerant.UUCP (Kevin Flory) writes:
>>
>>The National 32032, 32332, and the Motorola 680XX both have a much better 
>>instuction set than the Intel 8086, 186, 286, and 386. This may not be
>>as noticible at the machine code level, but when in compiled codes, especialy
>>'C', the National and Motorola instrucitons sets are much more efficient.
> 
> Executable file sizes, 6502 assembler program:
> 
> Intel, 8086, Microsoft C 3.0 -> 15110
> Motorola 68000, UniSoft cc   -> 19500
> 
> This is more efficient?
> 
> ----------------
> Mike Farren
> hoptoad!farren

Well, first of all, you should be comparing implementations of the same
compiler on both machines.  And what's perhaps even more important, the
action of the linker on added library files.  Using the Lattice 3.03
compiler on the Amiga computer and Lattice's run-time library, it is
impossible to create a file much smaller than 13000 bytes.  The fault
here is not the efficiency of the compiler or the 68000 instruction set
in general, but the fact that the Lattice linker library is composed of
only a few very large object modules, and if any one function in an
object module is called, the linker includes the whole thing.  In this case,
if one could separate each function into its own object module, then place
them all in a library, the minimum code size would drop dramatically.  I
think the point of the above article was that the 68xxx and 32xxx instruction
sets will provide better overall compiled code, all other things being 
equal.  Besides testing the memory efficiency of the compiled code, take
a look at the execution efficiency; I'd expect the 68000 to do quite a bit
better than the 8086 at the same clock speed with comparable compilers,
especially with program and data spaces over 64K bytes.
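
The effect of library granularity described above can be sketched with a toy model. Everything here is hypothetical: the function names, sizes, and module groupings are invented for illustration and do not describe the real Lattice library, and the model ignores calls between library functions.

```python
# Hypothetical library: function name -> code size in bytes.
FUNCTION_SIZES = {
    "printf": 4000,
    "scanf": 3500,
    "malloc": 900,
    "free": 300,
    "strlen": 40,
    "strcpy": 60,
}

# Coarse packaging: a few large object modules.
COARSE_MODULES = [
    {"printf", "scanf", "malloc", "free"},
    {"strlen", "strcpy"},
]

# Fine packaging: one function per object module.
FINE_MODULES = [{name} for name in FUNCTION_SIZES]


def linked_size(modules, called):
    """Bytes pulled in when every module that contains at least one
    called function is linked in its entirety."""
    total = 0
    for module in modules:
        if module & called:
            total += sum(FUNCTION_SIZES[name] for name in module)
    return total


called = {"strlen", "malloc"}
print("coarse modules:", linked_size(COARSE_MODULES, called), "bytes")
print("fine modules:  ", linked_size(FINE_MODULES, called), "bytes")
```

The program and the instruction set are identical in both cases; only the packaging of the library changes the size of the executable.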

-- 
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
Dave Haynie    {caip,ihnp4,allegra,seismo}!cbmvax!daveh

"As a dreamer of dreams and a travellin' man, I had chalked up many a mile."
"I read dozens of books about heros and crooks, and I learned much from both 
	of their styles.."
						-Jimmy Buffett

	These opinions are my own, though for a small fee they be yours too.
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

jpm@quad1.UUCP (06/17/86)

> >The National 32032, 32332, and the Motorola 680XX both have a much better 
> >instuction set than the Intel 8086, 186, 286, and 386. This may not be
> >as noticible at the machine code level, but when in compiled codes, especialy
> >'C', the National and Motorola instrucitons sets are much more efficient.
> 
> Executable file sizes, 6502 assembler program:
> 
> Intel, 8086, Microsoft C 3.0 -> 15110
> Motorola 68000, UniSoft cc   -> 19500
> 
> This is more efficient?
> 
> Mike Farren
> hoptoad!farren

Talk about apples and oranges...

You are comparing file size, which can have a lot more to do with internal
structure than code size.  Unix a.out files have a lot of stuff in them
that MSDOS .EXE files do not.  The larger size could easily be explained
by the difference in file structure and not by code efficiency.
-- 
John P. McNamee					Quadratron Systems Inc.

UUCP: {sdcrdcf|ttdica|scgvaxd|mc0|bellcore|logico|ihnp4}!psivax!quad1!jpm
ARPA: jpm@BNL.ARPA

farren@hoptoad.uucp (Mike Farren) (06/18/86)

daveh@cbmvax.cbm.UUCP (Dave Haynie) writes:
>Well, first of all, you should be comparing implementations of the same
>compiler on both machines.

  Agreed, this would be the best of all alternatives.  Don't know of an
instance of the SAME compiler on 8086 and 68000, though.  Note that the
UniSoft cc seems to be quite efficient, as far as 68000 compilers go;
the same program that resulted in a 19.5K file with UniSoft created a
32K (!) file on a SUN-3.

> I think the point of the above article was that the 68xxx and 32xxx
> instruction sets will provide better overall compiled code, all other
> things being equal.

  In any event, there was general agreement (with which my experience
agrees) in a discussion in net.arch that comparable 68K programs will be
approx. 20% larger than 8086 programs, simply because of the larger
number of bytes used by the respective instructions.  I have no data for
the 32K series - I remember it being a very nicely designed instruction
set, though.

> Besides testing the memory efficiency of the compiled code, take
>a look at the execution efficiency; I'd expect the 68000 to do quite a bit
>better than the 8086 at the same clock speed with comparable compilers,
>especially with program and data spaces over 64K bytes.

On this point, I have to agree, although you have to consider that clock speed
is one of the most meaningless measures you can possibly come up with.  The
point that I would make here is that immediate rejection of Intel parts 
simply because they have faults is, perhaps, the silliest position I've seen
taken in a religious war in a long time.  Certainly, the Intel processors
have deficiencies - what processor doesn't?  The Intel processors also have
one SCREAMING advantage --- there are one hell of a lot of them out there.
As a (currently) independent programmer, I'd rather make a lot of money by
writing for a machine with faults but a huge market than starve by only
writing for the hottest architecture available.  Not that I would mind
seeing the 68K or 32K processors take off, mind you - it would make my job
a lot easier.  Ain't gonna happen this century, though.

----------------
Mike Farren
hoptoad!farren

daveh@cbmvax.cbm.UUCP (Dave Haynie) (06/19/86)

> 
>> Besides testing the memory efficiency of the compiled code, take
>>a look at the execution efficiency; I'd expect the 68000 to do quite a bit
>>better than the 8086 at the same clock speed with comparable compilers,
>>especially with program and data spaces over 64K bytes.
> 
> On this point, I have to agree, although you have to consider that clock speed
> is one of the most meaningless measures you can possibly come up with.  

I did say clock speed, STUPID ME.  What I MEANT was bus speed, which is one
of the most meaningful measures I can come up with, since much of the cost of
a system is based on the speed you run the bus at.  A 68000 at 8MHz runs a
2MHz bus; I thought that the 8086 at 8MHz was very similar.  I don't know the
ratio of clock to bus speed on the 32K series, but I think it's more like
5:1 than 4:1; anyone know for sure?

> ----------------
> Mike Farren
> hoptoad!farren
-- 
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
Dave Haynie    {caip,ihnp4,allegra,seismo}!cbmvax!daveh

"As a dreamer of dreams and a travellin' man, I had chalked up many a mile."
"I read dozens of books about heros and crooks, and I learned much from both 
	of their styles.."
						-Jimmy Buffett

	These opinions are my own, though for a small fee they be yours too.
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

dunk@ihuxf.UUCP (Tom Duncan) (06/19/86)

> Executable file sizes, 6502 assembler program:
> 
> Intel, 8086, Microsoft C 3.0 -> 15110
> Motorola 68000, UniSoft cc   -> 19500

A number of people have pointed out that executable file size
is not a good indicator of memory resource requirements associated
with a given machine.  For example:


> 					        Using the Lattice 3.03
> compiler on the Amiga computer and Lattice's run-time library, it is
> impossible to create a file much smaller than 13000 bytes.  The fault
> here is not the efficiency of the compiler or the 68000 instruction set
> in general, but the fact that the Lattice linker library is composed of
> only a few very large object modules, and if any one function in an
> object module is called, the linker includes the whole thing.


Why not avoid this distortion by comparing "object file" size, not
"executable file" size.  That is, compile with the "-c" option.

	Tom Duncan

elg@usl.UUCP (Eric Lee Green) (06/20/86)

In article <865@hoptoad.uucp> farren@hoptoad.UUCP (Mike Farren) writes:
>In article <367@tolerant.UUCP> kevin@tolerant.UUCP (Kevin Flory) writes:

>>... when in compiled codes, especialy
>>'C', the National and Motorola instrucitons sets are much more efficient.

>Executable file sizes, 6502 assembler program:
>
>Intel, 8086, Microsoft C 3.0 -> 15110
>Motorola 68000, UniSoft cc   -> 19500
>

The 68000 uses 16-bit instructions. The 8086 uses 8 bit instructions,
with many of them having postbytes but not all. You need to look at
the number of machine instructions, not the number of bytes. Remember,
the 68000 is a 16/32 bit machine, while an 8086 is an 8/16 bit machine.

Also note that the difference in operating system and compiler is just
as likely to make a difference in code size as a different processor.

Please, people, these religious quarrels get tiresome... this is
ns32k, not net.flame.
-- 
Computing from the Bayous,
       Eric Green {akgua,ut-sally}!usl!elg
            (Snail Mail P.O. Box 92191, Lafayette, LA 70509)

lee@nscpdc.UUCP (Lee Tapper) (06/21/86)

In article <428@cbmvax.cbmvax.cbm.UUCP> daveh@cbmvax.cbm.UUCP (Dave Haynie) writes:
>
>I did say clock speed, STUPID ME.  What I MEANT was bus speed, which is one
>of the mosy meaningful measures I can come up with, since much of the cost of
>a system is based on the speed you run the bus at.  A 68000 at 8MHz runs a
>2MHz bus, I though that the 8086 at 8MHz was very similar.  I don't know the
>ratio of clock to bus speed on the 32K series, but I think its more like 
>5:1 than 4:1; anyone know for sure?

Bus speed for the 332 depends on a number of factors. The
first is the presence or absence of an MMU. The second is
the bus cycle type. The 332 has the ability to take all of
its instruction stream and any misaligned operands in burst
mode. In this mode it sends out an address and then gets the
next four 32 bit words back to back without using any bus time
for additional addresses. This works quite well for nibble mode
memory chips and can speed execution. The ratios of
clocks to bus time are as follows:

* no MMU, single cycle 3:1
* MMU, single cycle 4:1
* no MMU, burst fetch 9:4
* MMU, burst fetch 10:4

Lee Tapper
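
The four ratios in the post above can be turned into clocks per transfer and a rough peak transfer rate. This is a sketch under stated assumptions: each transfer is taken to be one 32 bit word (4 bytes) as the post describes, the 15 MHz clock is the figure discussed earlier in the thread, and wait states are ignored. The function names are made up for the example.

```python
from fractions import Fraction

BYTES_PER_TRANSFER = 4   # assumption: each transfer moves one 32 bit word
CLOCK_MHZ = 15           # assumption: clock rate discussed earlier in the thread

# (clocks, transfers) pairs as listed in the post.
RATIOS = {
    "no MMU, single cycle": (3, 1),
    "MMU, single cycle": (4, 1),
    "no MMU, burst fetch": (9, 4),
    "MMU, burst fetch": (10, 4),
}


def clocks_per_transfer(clocks, transfers):
    """Average clocks spent per 32 bit transfer."""
    return Fraction(clocks, transfers)


def peak_mbytes_per_sec(clocks, transfers, clock_mhz):
    """Peak transfer rate in millions of bytes per second, no wait states."""
    bytes_per_clock = Fraction(transfers * BYTES_PER_TRANSFER, clocks)
    return float(bytes_per_clock * clock_mhz)


for name, (clocks, transfers) in RATIOS.items():
    cpt = float(clocks_per_transfer(clocks, transfers))
    rate = peak_mbytes_per_sec(clocks, transfers, CLOCK_MHZ)
    print(f"{name:22s} {cpt:.2f} clocks/word, {rate:.1f} MB/s peak")
```

Burst mode lowers the average cost per word because the address is sent once for four words.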

jpm@quad1.UUCP (John McMamee) (06/24/86)

> > Executable file sizes, 6502 assembler program:
> > 
> > Intel, 8086, Microsoft C 3.0 -> 15110
> > Motorola 68000, UniSoft cc   -> 19500
> 
> A number of people have pointed out that executable file size
> is not a good indicator of memory resource requirements associated
> with a given machine.  For example:
> 
> 	. . . .
> 
> Why not avoid this distortion by comparing "object file" size, not
> "executable file" size.  That is, compile with the "-c" option.

That method fails as well because object modules can have a lot of
junk in them besides the code.  The only way to compare code size
is to get the code size figures (i.e. from size(1) on Unix, linker
maps on MSDOS, etc.).  Looking at file size, any sort of file,
just doesn't cut it.
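
As a sketch of what comparing code size rather than file size might look like, the snippet below parses Berkeley-style size(1) output and reports the text segment separately. The sample line is made up purely for illustration (including the file name), and the column layout is an assumption; System V style output is laid out differently and would need a different parser.

```python
def parse_berkeley_size(output):
    """Parse Berkeley-format size(1) output into
    {filename: {"text": int, "data": int, "bss": int}}."""
    result = {}
    lines = output.strip().splitlines()
    for line in lines[1:]:          # skip the header row
        fields = line.split()
        text, data, bss = (int(value) for value in fields[:3])
        result[fields[-1]] = {"text": text, "data": data, "bss": bss}
    return result


# Hypothetical sample, not a measurement of any real program.
SAMPLE = """\
   text    data     bss     dec     hex filename
  11264    2048    1024   14336    3800 asm6502
"""

sizes = parse_berkeley_size(SAMPLE)
for name, segments in sizes.items():
    print(f"{name}: code (text) = {segments['text']} bytes, "
          f"initialized data = {segments['data']} bytes")
```

Only the text figure says anything about how compact the generated code is; headers, symbol tables, and relocation data never appear in it.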
-- 
John P. McNamee					Quadratron Systems Inc.

UUCP: {sdcrdcf|ttdica|scgvaxd|mc0|bellcore|logico|ihnp4}!psivax!quad1!jpm
ARPA: jpm@BNL.ARPA