[comp.unix.xenix] 16-bit versus 32-bit memory performance

dyer@spdcc.COM (Steve Dyer) (12/06/87)

In article <620@omen.UUCP>, caf@omen.UUCP (Chuck Forsberg WA7KGX) writes:
> SCO warns not to use 16 bit memory on 386 motherboards.  I have not
> seen any problems resulting from this, other than speed.  Code running
> from 16 bit memory behaves like a 4 mHz 286 box.  So I only use the
> extra memory when running VP/ix.

Let me (re)offer my experience with 16-bit memory when using an Intel Inboard 386
in an 8mhz PC/AT.  The Inboard has a 64K cache, which offsets some of the
performance problems that Chuck reports with systems based on the Intel 386
motherboard.  It's still true that 16-bit memory is a lose, although not as
bad as Chuck reports--an Inboard system with only 16-bit memory is still better
at running 286 code than the equivalent 286-based 8mhz AT.  To be honest, I
haven't examined the XENIX 386 system with only 16-bit memory (my tests
reflected the chronology of my system upgrades, and I'm loath to open it up
and rip out the 32-bit memory to try it now...)  So, there might yet be some
disproportionate disadvantage to running 386 code on an Inboard with only
16-bit memory.  It works and it's cheap, though, so you can always upgrade
later.

> As far as I know, none of the 386 Unix systems have any understanding of
> the fact that some memory is fast 32 bit and some is slow 16 bit.  Many
> 386 boxes have a limited amount of fast 32 bit memory which should be
> used for active text and stack segments, not the buffer cache.

I had 3mb of 16-bit memory installed in my system together with the 3mb maximum
of 32-bit memory on the Inboard; this issue was nagging me, too, especially
considering that XENIX 386 keeps everything around for as long as possible
in its object cache--that is, some data objects might get "stuck" in 16-bit
memory and never leave.  When I began to get parity errors in the 16-bit memory,
I just removed it permanently rather than finding and replacing the chips.
At least in my case, 3mb 32-bit memory is more than enough.  In an Inboard 386
system, the first 256kb of memory is the 16-bit memory on the PC/AT motherboard.
I don't know whether the "inboard" keyword on the boot string, i.e.,
"hd(40)xenix inboard" takes that memory into account (it DOES turn on 16mhz
mode and cache.)

I agree that it would be nice to have a system which could recognize these
two "classes" of memory, if only to determine what the magnitude, if any,
of the improvement might be.  It would be amusing if the side-effect of
locating the buffer cache in 16-bit memory was to slow the entire system
down (I don't know that, of course, but I never fail to be surprised at how
"intuitive" assumptions can be shown incorrect.)
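
To make the idea of two memory "classes" concrete, here is a toy sketch of a page allocator that tries the fast pool first for text, stack and data pages, and the slow pool first for buffer-cache pages. Everything in it (the class name, the page kinds, the preference table) is hypothetical and chosen for illustration; it is not how XENIX or any other 386 Unix actually manages memory.

```python
from collections import deque

FAST, SLOW = "32-bit", "16-bit"

# Hypothetical placement policy: which pool each kind of page tries first.
PREFERENCE = {
    "text": (FAST, SLOW),
    "stack": (FAST, SLOW),
    "data": (FAST, SLOW),
    "buffer": (SLOW, FAST),
}


class TwoClassAllocator:
    """Toy allocator with separate free lists for fast and slow pages."""

    def __init__(self, fast_pages, slow_pages):
        # Page numbers 0..fast_pages-1 are fast; the rest are slow.
        self.free = {
            FAST: deque(range(fast_pages)),
            SLOW: deque(range(fast_pages, fast_pages + slow_pages)),
        }

    def alloc(self, kind):
        """Return (pool, page_number), or None when both pools are empty."""
        for pool in PREFERENCE[kind]:
            if self.free[pool]:
                return pool, self.free[pool].popleft()
        return None

    def release(self, pool, page):
        """Put a page back on the free list of the pool it came from."""
        self.free[pool].append(page)


if __name__ == "__main__":
    mem = TwoClassAllocator(fast_pages=2, slow_pages=2)
    for kind in ("text", "buffer", "stack", "text", "buffer"):
        print(kind, "->", mem.alloc(kind))
```

With a policy like this, one could at least measure whether keeping the buffer cache out of fast memory helps or hurts, which is the open question raised above.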

Anyway, here is a reposting of my test results:
---------------------------------------
All tests were run on the same hardware and software environment,
with the exception of the replacement of the 286 for the 386 card,
under SCO XENIX 286 OS v. 2.1.3 with Development System v. 2.1.4
or SCO XENIX 386 OS and Development System beta test with beta update (6/16/87).

                 IBM PC/AT 8mhz    IBM PC/AT with Intel Inboard 386/AT
                                   at 16mhz, cache enabled
                 XENIX 286         XENIX 286         XENIX 286         XENIX 386
                 16-bit mem        16-bit mem        32-bit mem        32-bit mem

Dhrystone 1.1    no reg   reg      no reg   reg      no reg   reg      no reg   reg
                 1084     1094     1957     1963     2906     2893     4603     4922

Buchholz (sum of user & sys times in sec)

short cpu        0.3               0.2               0.1               0.1
medium cpu       3.3               1.9               1.2               0.5
long cpu         ***               *** (values out of range)           2.6
short I/O        0.9               0.6               0.4               0.1
I/O bound        3.1               1.9               1.4               0.5
long mixed       56.9              33.8              21.8              8.9
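
For readers who want the relative speedups rather than the raw figures, the following small sketch divides each configuration by the plain 8mhz AT baseline. The numbers are copied from the table above; the configuration labels and function names are made up for this example, and the ratios are derived arithmetic, not additional measurements.

```python
# Figures copied from the posted table; labels are illustrative only.
BASELINE = "AT 286, XENIX 286, 16-bit mem"

# Dhrystone 1.1 scores as (no reg, reg); higher is better.
dhrystone = {
    "AT 286, XENIX 286, 16-bit mem": (1084, 1094),
    "Inboard, XENIX 286, 16-bit mem": (1957, 1963),
    "Inboard, XENIX 286, 32-bit mem": (2906, 2893),
    "Inboard, XENIX 386, 32-bit mem": (4603, 4922),
}

# Buchholz "long mixed" user+sys time in seconds; lower is better.
long_mixed_sec = {
    "AT 286, XENIX 286, 16-bit mem": 56.9,
    "Inboard, XENIX 286, 16-bit mem": 33.8,
    "Inboard, XENIX 286, 32-bit mem": 21.8,
    "Inboard, XENIX 386, 32-bit mem": 8.9,
}


def dhrystone_speedup(config):
    """Ratio of the 'no reg' Dhrystone score to the baseline score."""
    return dhrystone[config][0] / dhrystone[BASELINE][0]


def time_speedup(config):
    """Ratio of baseline 'long mixed' time to this configuration's time."""
    return long_mixed_sec[BASELINE] / long_mixed_sec[config]


if __name__ == "__main__":
    for config in dhrystone:
        print(f"{config:32s} Dhrystone x{dhrystone_speedup(config):.2f}"
              f"  long mixed x{time_speedup(config):.2f}")
```

On these figures the Inboard with only 16-bit memory comes out at roughly 1.8 times the plain AT on Dhrystone, and the XENIX 386 configuration at a little over 4 times.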

-- 
Steve Dyer
dyer@harvard.harvard.edu
dyer@spdcc.COM aka {ihnp4,harvard,linus,ima,bbn,m2c}!spdcc!dyer

jpp@slxsys.specialix.co.uk (John Pettitt) (12/10/87)

This should perhaps belong in comp.arch

It would appear that most 8088, 8086, 186 and 286 systems are
limited by the number of cycles taken to execute instructions
(i.e., the clock speed).  However, the 80386 (at 16 and especially at 20 MHz)
is limited by its memory bus bandwidth.  That is, the memory subsystem
on most 286 boxes is fast enough to have little or no real effect on
performance compared to a change in clock speed.  An 80386,
however, is largely limited by the rate at which it can be 'fed' data
and instructions.

16 bit memory subsystems have a devastating effect on the 80386
for two reasons.  Firstly, two memory accesses are required rather than
one, thus doubling the access time.  Secondly, most 16 bit memory cards
are designed for 8 or 10 MHz operation, not 16 MHz, so a significant
number of wait states are needed when used with a 386.  It would
appear that a 'cache miss' on the Intel Inboard(tm) generates between
10 and 12 wait states, thus making access to 16 bit ram slower than
from the original 286.
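
The two effects can be put into rough numbers. The sketch below is a back-of-the-envelope model, not a measurement: it assumes a 2-clock zero-wait bus cycle on both CPUs, one wait state on the 8 MHz AT, and that the 10 to 12 wait states quoted above apply to each 16 bit access. The function names are invented for the example.

```python
def clock_ns(mhz):
    """Length of one CPU clock in nanoseconds."""
    return 1000.0 / mhz


def fetch32_ns(cpu_mhz, bus_width_bits, base_clocks, wait_states):
    """Estimated time to move one 32-bit word across the memory bus."""
    accesses = 32 // bus_width_bits          # 1 for 32-bit ram, 2 for 16-bit
    clocks_per_access = base_clocks + wait_states
    return accesses * clocks_per_access * clock_ns(cpu_mhz)


# Assumption: 2 clocks per bus cycle before wait states are added.
fast_386 = fetch32_ns(16, 32, base_clocks=2, wait_states=0)
slow_386_best = fetch32_ns(16, 16, base_clocks=2, wait_states=10)
slow_386_worst = fetch32_ns(16, 16, base_clocks=2, wait_states=12)
# Assumption: the 8 MHz AT inserts one wait state per memory cycle.
at_286 = fetch32_ns(8, 16, base_clocks=2, wait_states=1)

if __name__ == "__main__":
    print("386, 32-bit 0-wait ram:", fast_386, "ns")
    print("386, 16-bit ram       :", slow_386_best, "to", slow_386_worst, "ns")
    print("286 AT, 16-bit ram    :", at_286, "ns")
```

Under those assumptions a 32 bit fetch from 16 bit ram costs the 386 about twice what it costs the 8 MHz 286, which is consistent with the claim above; different wait-state counts would shift the numbers accordingly.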

In conclusion - if you want a 32 bit CPU use 32 bit ram.  If you
just want the instruction set use the P9 (80388) - if it ever appears.

(This posting written on a Dell 386 with 6 MB of 0 wait static 32 bit ram)

-- 
John Pettitt - 144.5 MHz: G6KCQ, CIX: jpettitt,  Voice: +44 1 398 9422
UUCP:  ...uunet!mcvax!ukc!pyrltd!slxsys!jpp  (jpp@slxsys.specialix.co.uk)
Disclaimer: I don't even own a cat to share my views !

dave@micropen (David F. Carlson) (12/16/87)

In article <109@slxsys.specialix.co.uk>, jpp@slxsys.specialix.co.uk (John Pettitt) writes:
> This should perhaps belong in comp.arch
> 
... other stuff deleted -- dfc
> 16 bit memory subsystems have a devastating effect on the 80386
> for two reasons.  Firstly, two memory accesses are required rather than
> one, thus doubling the access time.  Secondly, most 16 bit memory cards
> are designed for 8 or 10 MHz operation, not 16 MHz, so a significant
> number of wait states are needed when used with a 386.  It would
> appear that a 'cache miss' on the Intel Inboard(tm) generates between
> 10 and 12 wait states, thus making access to 16 bit ram slower than
> from the original 286.
> 

First, *the* standard of the AT class machines (which most of the non-IBM
386 boxes are) is the 8MHz 16 bit bus.  Thus, your 16MHz 386 runs all I/O and
16 bit memory operations at 8MHz.  Wait states a-plenty.  10MHz AT buses
have been tried, but *many* XT class cards won't work at that rate, so
all AT-class buses on 386 boxes (that I know of) are strictly 8MHz.

> In conclusion - if you want a 32 bit CPU use 32 bit ram.  If you
> just want the instruction set use the P9 (80388) - if it ever appears.

The reason that I use 386 machines is that the 386 is a real architecture
(read: 32 bit linear address space, virtual memory support, etc.), and the
286 machines that we did use had a fatally flawed architecture for supporting
multi-user systems (UNIX in particular).  I use 4 Meg of static "32-bit"
ram and 3 meg of 16 bit ram.  Reason:  Most kernel activities and most
user processes are in fast ram.  Disk buffers (read: slow, I/O bound) are kept
in 16 bit memory.  It will always be faster to have extra ram pages in
a demand-paged VM environment than to go with less ram and be forced into page
thrash.  That 28msec disk system is *much* slower than 16 bit memory any
way you slice it.  So, get enough *affordable* ram in any combination
that will keep your system from thrashing.  Intel Inboard had 1meg of fast
ram and that simply isn't enough to prevent thrash.  Solution is to muddle
through with 16 bit ram and like it!
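
The trade-off can be sketched with the usual effective-access-time formula for demand paging. The 28 ms figure is the disk time quoted above, treated here as the cost of servicing one page fault (an assumption); the 125 ns and 1500 ns memory times are illustrative guesses for fast and slow ram, not measurements, and the function names are invented for the example.

```python
def effective_access_ns(mem_ns, fault_rate, fault_service_ms=28.0):
    """Average cost of one memory reference when some references page-fault."""
    fault_ns = fault_service_ms * 1_000_000
    return (1.0 - fault_rate) * mem_ns + fault_rate * fault_ns


def break_even_fault_rate(fast_ns, slow_ns, fault_service_ms=28.0):
    """Fault rate at which fast-ram-plus-paging is as slow as fault-free slow ram."""
    fault_ns = fault_service_ms * 1_000_000
    return (slow_ns - fast_ns) / (fault_ns - fast_ns)


FAST_NS = 125.0    # assumed 32-bit ram access time
SLOW_NS = 1500.0   # assumed 16-bit ram access time

if __name__ == "__main__":
    # Small fast memory that faults once per 10,000 references...
    print("fast ram, paging :", effective_access_ns(FAST_NS, 1e-4), "ns")
    # ...versus enough slow memory that nothing has to be paged.
    print("slow ram, no page:", effective_access_ns(SLOW_NS, 0.0), "ns")
    print("break-even rate  :", break_even_fault_rate(FAST_NS, SLOW_NS))
```

With these assumed numbers, slow ram wins as soon as it removes more than about one page fault per 20,000 references, which is why a thrashing system is so much worse than one padded out with 16 bit memory.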

Personally, I can't wait for the vaporware 80388 just to get the architecture
of a real computer.  And I can't afford 6 Meg of static column ram.  Besides,
8 MHz AT bus speed is sufficient for the tasks required of my system.  And it
certainly beats paging thrash!

> John Pettitt - 144.5 MHz: G6KCQ, CIX: jpettitt,  Voice: +44 1 398 9422

-- 
David F. Carlson, Micropen, Inc.
...!{ames|harvard|rutgers|topaz|...}!rochester!ur-valhalla!micropen!dave

"The faster I go, the behinder I get." --Lewis Carroll