[comp.sys.ibm.pc.hardware] Memory Parity. Is It Really Needed

steveh@tasman.cc.utas.edu.au (Steven Howell) (03/19/91)

	Evening All.

I have just lashed out and scored myself 4 megs of SIMM Memory. However it is
the parity-less type, as in Macintosh memory (1Mx8).

I installed them into my 386, disabled parity checking and it worked fine. 

The question is, is parity really needed, and if so what am I going to be
missing out on?

Thanks In Advance
Steveh

plim@hpsgwp.sgp.hp.com (Peter Lim) (03/20/91)

/ steveh@tasman.cc.utas.edu.au (Steven Howell) /  8:34 pm  Mar 19, 1991 /
writes:

> I installed them into my 386, disabled parity checking and it worked fine. 
> 
> The question is, is parity really needed, and if so what am I going to be
> missing out on.
> 
You will miss nothing if your memory and your PC system play well together.
If not, you could get one or more of the following:

	[1] Unexpected program behaviour ranging from crashing system
	    to uncontrollable activities (??).
	[2] Loss of data or storing corrupted data.
	[3] Loss of hair ?? (trying to figure out what is going on  ;-))
	... (may be more ...).

By turning off the parity check, your system no longer has the hardware
feature to catch any memory failure, so any such failure will show up
as something else dubious.

Anyway, I've got a friend who uses his PC system with no parity for
more than a year now and he's fine.


Regards,     . .. ... .- -> -->## Life is fast enough as it is ........
Peter Lim.                     ## .... DON'T PUSH IT !!          >>>-------,
                               ########################################### :
E-mail:  plim@hpsgwg.HP.COM     Snail-mail:  Hewlett Packard Singapore,    :
Tel:     (065)-279-2289                      (ICDS, ICS)                   |
Telnet:        520-2289                      1150 Depot Road,           __\@/__
                                             Singapore   0410.           SPLAT !

#include <standard_disclaimer.hpp>

town@hpspkla.spk.hp.com (Brian R. Town) (03/20/91)

steveh@tasman.cc.utas.edu.au writes:

>I installed them into my 386, disabled parity checking and it worked fine. 
>The question is, is parity really needed, and if so what am I going to be
>missing out on.

The parity bit is used for error detection.  Here is a brief description of
how it works:

   Whenever a byte (8 bits) is written to memory, special hardware adds up the
   number of bits set to 1.  The 9th bit is then set to 1 or 0 such that
   the total number of 1's is even (if using even parity) or odd (if using odd
   parity).  Example:
         
        DATA       TOTAL 1's        9th BIT FOR     9th BIT FOR
                   BEFORE PARITY    EVEN PARITY     ODD PARITY
        
        01110010       4                 0              1
        01010100       3                 1              0
        00001000       1                 1              0
        00011000       2                 0              1

Now, whenever data is read back from memory, the hardware adds up the number of
1's (including the 9th bit).  If the system is using even parity, then the
total should be even.  If a read occurs and the total bits set to 1 is odd,
then a memory parity error has occurred and the CPU gets an interrupt.  I am
not sure what PCs do to report this, but most systems will simply report
something like 'Parity error at <address>'.
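
As a rough illustration of the scheme described above (not how any particular PC chipset implements it), the write and read steps can be sketched in Python. The function names are made up for this example, and even parity is assumed unless stated otherwise.

```python
def parity_bit(byte, even=True):
    """Return the 9th bit that would be stored alongside an 8-bit value."""
    ones = bin(byte & 0xFF).count("1")
    bit = ones % 2            # 1 if the data alone has an odd number of 1s
    return bit if even else bit ^ 1


def read_ok(byte, stored_parity, even=True):
    """Recompute parity on read; False means a parity error would be raised."""
    return parity_bit(byte, even) == stored_parity


stored = 0b01010100                     # three 1s
p = parity_bit(stored)                  # even parity, so the 9th bit is 1
print(read_ok(stored, p))               # unchanged byte passes the check
print(read_ok(stored ^ 0b00000100, p))  # one flipped bit fails it
```

In hardware the same check is done by a chain of XOR gates on every access, so it costs no extra time that software would notice.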

So, now that you know what it is for, you can be the judge as to whether you
need it or not.  What you will be missing is the ability for your hardware to
detect single-bit memory errors.  How a memory failure affects you depends
on whether it is in code space or data space.  What you really have to worry about
is having an error that doesn't cause a crash, but corrupts some of your data.
Somehow, it helps me sleep at night knowing that I don't have to rely on a
system crash or data corruption to inform me that I have a memory problem. 8-)
I am of course sunk if I have more than a 1 bit error. 8-(
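
The limitation mentioned above (one flipped bit is caught, two are not) can be shown with a short Python sketch. It assumes even parity over a 9-bit word; the helper name is invented for the example.

```python
def ones_are_even(word9):
    """Even-parity check over a 9-bit word (8 data bits plus the parity bit)."""
    return bin(word9 & 0x1FF).count("1") % 2 == 0


good = 0b0_0111_0010             # parity bit 0, data 01110010 (four 1s)
one_flip = good ^ 0b0_0000_0001  # a single bit changes in memory
two_flips = good ^ 0b0_0000_0011 # two bits change in the same byte

print(ones_are_even(good))       # True
print(ones_are_even(one_flip))   # False: the error is detected
print(ones_are_even(two_flips))  # True: the error slips through
```

Any odd number of flipped bits is detected and any even number is missed, which is why parity is described as single-bit error detection.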

Now, want to hear about error correcting memory?????

Hope this helps,

Brian (I dream about this stuff) Town

wbonner@eecs.wsu.edu (Wim Bonner) (03/24/91)

In article <3370017@hpsgwp.sgp.hp.com> plim@hpsgwp.sgp.hp.com (Peter Lim) writes:
>Anyway, I've got a friend who uses his PC system with no parity for
>more than a year now and he's fine.

My Wyse 286 box came with memory that doesn't have parity checking at all in
the first 640k.  Some of us are forced to deal with the hope that nothing 
goes wrong with that memory.  Anyway, it has worked OK for the past two years 
running both DOS and OS/2 so I figure it must be OK.

-- 
|  wbonner@yoda.eecs.wsu.edu  |
| 27313853@wsuvm1.csc.wsu.edu |
|  72561.3135@CompuServe.com  |

berggren@eecs.cs.pdx.edu (Eric Berggren) (04/04/91)

wbonner@eecs.wsu.edu (Wim Bonner) writes:
                                                        nice touch ;) -v
>In article <3370017@hpsgwp.sgp.hp.com> plim@hpsgwp.sgp.hp.com (Peter Lim) writes:
>>Anyway, I've got a friend who uses his PC system with no parity for
>>more than a year now and he's fine.

>My Wyse 286 box came with memory that doesn't have parity checking at all in
>the first 640k.  Some of us are forced to deal with the hope that nothing 
>goes wrong with that memory.  anyway, it has worked ok for the past two years 
>running both DOS and OS/2 so I figure it must be OK.

  The part about memory parity I don't understand is that I am told one
wants memory parity checking done to "prevent loss of important data". Well
every time I got a memory parity error, I lost important data because it
brought the whole system to a halt. What next? Helicopters with emergency
ejector seats? Weird...

-e.b.

==============================================================================
  Eric Berggren             |  "The force of the 'Dark Side' eminates from 
  Computer Science/Eng.     |    the ominous DeathStar looming overhead." 
  berggren@eecs.cs.pdx.edu  |            - Down with AT&T! -

wsineel@wsooti01.info.win.tue.nl (Eelco Vriezekolk) (04/04/91)

In article <2217@pdxgate.UUCP> berggren@eecs.cs.pdx.edu (Eric Berggren) writes:
>wbonner@eecs.wsu.edu (Wim Bonner) writes:
>                                                        nice touch ;) -v
>>In article <3370017@hpsgwp.sgp.hp.com> plim@hpsgwp.sgp.hp.com (Peter Lim) writes:
[PC with and without memory parity]
>
>  The part about memory parity I don't understand is that I am told one
>wants memory parity checking done to "prevent loss of important data". Well
>everytime I got a memory parity error, I lost important data because it
>brought the whole system to a halt. What next? Helicopters with emergency
>ejector seats? wierd...

In professional situations it could be much more important
that you get *reliable* data than that you get data at all. If I
did my financial administration on a PC (making regular
backups, of course) I'd prefer a total crash to some errors
slipping into the database.
That's the point for memory parity checking. It is the same
reason as why you do write-verify on disks, only these kinds
of errors are more easily recoverable.

`Fault tolerance' consists of both interception of errors
(very important) and automatic recovery of errors (very
useful).

-- 
Eelco Vriezekolk, wsineel@win.tue.nl, (+31)40-118338.
 Software engineer looking for career in any civilized country. Software
 development for technical uses. Experience with various programming
 languages, hardware, AI, user interfaces, etc.

town@hpspkla.spk.hp.com (Brian R. Town) (04/05/91)

berggren@eecs.cs.pdx.edu (Eric Berggren) writes:

>   The part about memory parity I don't understand is that I am told one
> wants memory parity checking done to "prevent loss of important data". Well
> everytime I got a memory parity error, I lost important data because it
> brought the whole system to a halt. What next? Helicopters with emergency
> ejector seats? wierd...

Yes you lost important data, but only what you were working on at the time.  I
would guess that the parity error kept your application from writing the
corrupt information to disk, didn't it??  The 'important data' that you were
saved from losing is the data on the disk that the parity checking kept the 
program from trashing.  Just consider what shape any data files that your
program writes would have been in if you had used the program for a few
days without knowing that there was a problem.

Brian

berggren@eecs.cs.pdx.edu (Eric Berggren) (04/07/91)

wsineel@wsooti01.info.win.tue.nl (Eelco Vriezekolk) writes:

>In article <2217@pdxgate.UUCP> berggren@eecs.cs.pdx.edu (Eric Berggren) writes:
>>wbonner@eecs.wsu.edu (Wim Bonner) writes:
>>                                                        nice touch ;) -v
>>>In article <3370017@hpsgwp.sgp.hp.com> plim@hpsgwp.sgp.hp.com (Peter Lim) writes:
>[PC with and without memory parity]
>>
>>  The part about memory parity I don't understand is that I am told one
>>wants memory parity checking done to "prevent loss of important data". Well
>>everytime I got a memory parity error, I lost important data because it
>>brought the whole system to a halt. What next? Helicopters with emergency
>>ejector seats? wierd...

>In professional situations it could be much more important
>that you get *reliable* data than that you get data at all. If I
>did my financial administration on a PC (making regular
>backups, ofcourse) I'd prefer a total crash to some errors
>slipping into the database.
>That's the point for memory parity checking. It is the same
>reason as why you do write-verify on disks, only these kinds
>of errors are more easily recoverable.

>`Fault tolerance' consists of both interception of errors
>(very important) and automatic recovery of errors (very
>useful).


  I agree, but I would rather be advised and allowed to make a decision
about how to handle it. In some databases (and probably some spreadsheets
too) if you don't close up properly, you may lose everything. Of
course, that's what backups are for. When a write-verify fails on a disk,
it doesn't erase the disk for you. It gives you an error and, depending
on the application, allows you to use another disk.
  My main point was not necessarily the concept of memory parity (I
probably strayed from the subject a little) but more how most systems handle it.
Anyway, no big deal. It doesn't happen that often... <splat>

-e.b.


achilles@unixland.uucp (David Holland) (04/07/91)

town@hpspkla.spk.hp.com (Brian R. Town) writes:

> Yes you lost important data, but only what you were working on at the time.  I
> would guess that the parity error kept your application from writing the
> corrupt information to disk, didn't it??  The 'important data' that you were
> saved from loosing is the data on the disk that the parity checking kept the 
> program from trashing.  Just consider what shape any data files that your
> program writes would have been in if you would have used the program for a few
> days without knowing that there was a problem.
> 
> Brian

 
 However, wouldn't the memory test that your computer does when you turn it 
on or do a "cold" reboot detect memory failures? If so, adding an extra chip 
for parity not only wastes silicon, board space, money, power, and 
everything else, but also DECREASES reliability - if the chance of any 
particular memory chip failing is 1/10,000, the chance of any one of your 
memory chips failing is 8/10,000 without parity, or 9/10,000 with... 
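
To make the arithmetic above concrete, here is a small Python sketch. It assumes chip failures are independent and uses the 1/10,000 per-chip figure from the post, which is only an illustrative number; the function name is made up for this example.

```python
def p_any_failure(chips, p_chip=1 / 10_000):
    """Probability that at least one of `chips` independent chips fails."""
    return 1 - (1 - p_chip) ** chips


without_parity = p_any_failure(8)   # eight data chips per bank
with_parity = p_any_failure(9)      # eight data chips plus one parity chip
print(f"{without_parity:.6f} vs {with_parity:.6f}")
print(f"relative increase: {with_parity / without_parity - 1:.1%}")
```

The 8/10,000 and 9/10,000 figures are the linear approximation of the same calculation; under these assumptions the ninth chip raises the estimate by roughly one eighth.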
 
 ------------
 David A. Holland
 
 pro-angmar!achilles@alfalfa.com ... alphalpha!pro-angmar!achilles
 
 CAD/CAM: Computer Aided Disaster/Computer Assisted Mayhem :-)

fr@compu.com (Fred Rump from home) (04/07/91)

berggren@eecs.cs.pdx.edu (Eric Berggren) writes:

>wbonner@eecs.wsu.edu (Wim Bonner) writes:
>                                                        nice touch ;) -v
>>In article <3370017@hpsgwp.sgp.hp.com> plim@hpsgwp.sgp.hp.com (Peter Lim) writes:
>>>Anyway, I've got a friend who uses his PC system with no parity for
>>>more than a year now and he's fine.

>>My Wyse 286 box came with memory that doesn't have parity checking at all in
>>the first 640k.  Some of us are forced to deal with the hope that nothing
>>goes wrong with that memory.  anyway, it has worked ok for the past two years
>>running both DOS and OS/2 so I figure it must be OK.

>  The part about memory parity I don't understand is that I am told one
>wants memory parity checking done to "prevent loss of important data". Well
>everytime I got a memory parity error, I lost important data because it
>brought the whole system to a halt. What next? Helicopters with emergency
>ejector seats? wierd...

Boy, does that bring back memories.

We installed a number of Wyse 286 and early model 386 boxes running Xenix and 
none are left in the field.

These things may run DOS just fine and we use them now for junk work in the 
office but trusting them at customer locations was simply not worth the agony.
Since we sold them we had to replace them with something that worked reliably.
While it cost us some money, we only lost one client in the process. (He said 
we should have known better than to sell them junk.)

The whole Wyse fiasco rippled through the marketplace and put them into a tenuous 
financial condition until the buyout with additional cash.

We still sell their terminals and wish they would stick to them.

I hear that their current computer line works as well as any, but I have no 
direct experience there.


Fred
-- 
W. Fred Rump 			office:		   fred.COMPU.COM	
26 Warren St.   	          home:     fred@icdi10.COMPU.COM 
Beverly, NJ. 08010                bang:   ...{dsinc uunet}!cdin-1!icdi10!fred
609-386-6846          "Freude... Alle Menschen werden Brueder..."  -  The Ode

fr@compu.com (Fred Rump from home) (04/07/91)

wsineel@wsooti01.info.win.tue.nl (Eelco Vriezekolk) writes:


.>That's the point for memory parity checking. It is the same
.>reason as why you do write-verify on disks, only these kinds
.>of errors are more easily recoverable.

Exactly. And what we were not getting is reliable data.

Problems would simply show up in databases long after junk was written to 
them.

Makes for very unhappy clients.

Fred

fr@compu.com (Fred Rump from home) (04/08/91)

town@hpspkla.spk.hp.com (Brian R. Town) writes:


>Yes you lost important data, but only what you were working on at the time.  I
>would guess that the parity error kept your application from writing the
>corrupt information to disk, didn't it??  The 'important data' that you were
>saved from loosing is the data on the disk that the parity checking kept the
>program from trashing.  Just consider what shape any data files that your
>program writes would have been in if you would have used the program for a few
>days without knowing that there was a problem.

Sorry to keep harping on this, but the original question came from a Wyse 286 
system user.

In our experience those systems did not have parity checking. Therefore errors 
were silently being ignored while garbage was being written to disk.

A parity checking hardware feature would have indicated exactly what Brian 
above indicates: that there is a memory problem.

The Wyse machines had only 8 chips in a row of memory. The ninth chip for 
parity checking (actually the ninth bit) was not there to ensure proper memory 
function. This was done to save a few bucks as memory was assumed to be 
reliable. This was normally the case with only 640KB but as several megabytes 
were added on additional memory cards, the chance of error grew dramatically. 
It simply made those machines not usable for multi-user functions.

fred

campbell@dev8o.mdcbbs.com (Tim Campbell) (04/09/91)

In article <2258@pdxgate.UUCP>, berggren@eecs.cs.pdx.edu (Eric Berggren) writes:
> wsineel@wsooti01.info.win.tue.nl (Eelco Vriezekolk) writes:
> 
>>In article <2217@pdxgate.UUCP> berggren@eecs.cs.pdx.edu (Eric Berggren) writes:
>>>wbonner@eecs.wsu.edu (Wim Bonner) writes:
>>>                                                        nice touch ;) -v
>>>>In article <3370017@hpsgwp.sgp.hp.com> plim@hpsgwp.sgp.hp.com (Peter Lim) writes:
>>[PC with and without memory parity]
>>>
>>>  The part about memory parity I don't understand is that I am told one
>>>wants memory parity checking done to "prevent loss of important data". Well
>>>everytime I got a memory parity error, I lost important data because it
>>>brought the whole system to a halt. What next? Helicopters with emergency
>>>ejector seats? wierd...
> 
>   I agree, but I would rather be advised and allow me to make a decision
> about how I would handle it. In some databases (and probably some spread-
> sheets too) if you don't close up properly, you may lose everything. Of
> course, that's what backups are for. When a write-verify fails on a disk,
> it doesn't erase the disk for you. It gives you an error and, depending,
> on the application, allows you to use another disk.
>   My main point was not necessarily the concept of memory parity (which
> I probably left the subject a little) but more how most systems handle it.
> Anyway, no big deal. It doesn't happen that often... <splat>
> 
If you get parity errors - something's WRONG - you should probably do something
about it (like replace the chip, clean the dust out of the computer (now
there's an interesting concept) or check to make sure all the chips are
properly seated on the boards (SIMMs properly seated in slots)).

You _CAN_ control what happens when the error occurs.  The occurrence of the
error generates an interrupt.  Anyone can attach that interrupt vector (sorry,
don't recall which interrupt it is - 02h or 03h come to mind).  The default
action is to print a rather cryptic error message (which allegedly tells you
at what address the error occurred - fingering the bad chip) and promptly
lock your machine.  If you want you can plug in a far pointer to an 
IRET instruction in which case the machine will do absolutely nothing 
(although _I_ personally wouldn't advise this approach) and proceed merrily
along with corrupt data.
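
The choice described above, between the default "report and lock up" behaviour and a handler that does nothing, can be modelled with a toy Python sketch. This is not real-mode DOS code and does not touch any actual interrupt vector; `ToyMemory`, `halt_handler` and `ignore_handler` are hypothetical names standing in for the memory hardware, the default handler, and the bare IRET respectively.

```python
class ParityError(Exception):
    pass


def halt_handler(address):
    """Stand-in for the default behaviour: report the address and stop."""
    raise ParityError(f"Parity error at {address:#06x}")


def ignore_handler(address):
    """Stand-in for pointing the vector at a bare IRET: do nothing."""
    return None


class ToyMemory:
    def __init__(self, size, handler=halt_handler):
        self.data = [0] * size
        self.parity = [0] * size
        self.handler = handler          # plays the role of the interrupt vector

    def write(self, address, value):
        self.data[address] = value & 0xFF
        self.parity[address] = bin(value & 0xFF).count("1") % 2

    def read(self, address):
        value = self.data[address]
        if bin(value).count("1") % 2 != self.parity[address]:
            self.handler(address)       # the "interrupt" fires on a mismatch
        return value


mem = ToyMemory(16, handler=ignore_handler)
mem.write(3, 0x41)
mem.data[3] ^= 0x01                     # simulate a flipped bit in the chip
print(hex(mem.read(3)))                 # handler ignored it: corrupt value comes back
```

Swapping `ignore_handler` for `halt_handler` makes the same read raise `ParityError` instead, which is the toy equivalent of the machine stopping before the bad byte is used.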
  ---------------------------------------------------------------------------
	  In real life:  Tim Campbell - Electronic Data Systems Corp.
     Usenet:  campbell@dev8.mdcbbs.com   @ McDonnell Douglas M&E - Cypress, CA
       also:  tcampbel@einstein.eds.com  @ EDS - Troy, MI
 CompuServe:  71631,654	 	 (alias  71631.654@compuserve.com)
 P.S.  If anyone asks, just remember, you never saw any of this -- in fact, I 
       wasn't even here.

davet@cbnewsj.att.com (Dave Tutelman) (04/09/91)

In article <1991Apr8.183732.1@dev8o.mdcbbs.com> campbell@dev8o.mdcbbs.com (Tim Campbell) writes:
>
>You _CAN_ control what happens when the error occurs.  The occurance of the
>error generates an interrupt.  Anyone can attach that interrupt vector...
	Tim is absolutely right.  I've done that myself, specifically for
	a memory test program.  (You have to, if your memory tester is to
	function long enough to tell you about the errors it discovers.
	Think about it.)
	But...

>...If you want you can plug in a far pointer to an 
>IRET instruction in which case the machine will do absolutely nothing 
>(although _I_ personally wouldn't advise this approach) and proceed merily
>along with corrupt data.
	Again, I agree with Tim (both that you can and that you shouldn't).

	There's an assumption here that the DATA is bad.  But a stored-program
	computer is as likely to have the bit error in the PROGRAM.  Three
	things could happen:
	   1.	The program could hang, because of the bad instruction.
		(For practical purposes, indistinguishable from just
		locking up due to BIOS default parity error handling,
		except that you don't get the message on-screen.)
	   2.	The program could do something wrong, visibly (trashing
		the screen) or invisibly (trashing your data).  The latter
		is the most pathological possible consequence, much worse
		than simply saving a byte with a bad bit (what happens
		if the data has the parity error, and is ignored).
	   3.	You could luck out, and no serious consequence accrues.

	I don't think I'd trust anything I cared about, on the chance that
	I'll get consequence #3.

Dave
+---------------------------------------------------------------+
|    Dave Tutelman						|
|    Physical - AT&T Bell Labs  -  Lincroft, NJ	  07738		|
|    Logical -  dmt@pegasus.att.com				|
|    Audible -  (908) 576 2194  (Office)			|
|		(908) 922 9576  (Home)				|
+---------------------------------------------------------------+

town@hpspkla.spk.hp.com (Brian R. Town) (04/11/91)

achilles@unixland.uucp (David Holland) writes: 
 
> However, wouldn't the memory test that your computer does when you turn it 
>on or do a "cold" reboot detect memory failures? If so, adding an extra chip 
>for parity not only wastes silicon, board space, money, power, and 
>everything else, but also DECREASES reliability - if the chance of any 
>particular memory chip failing is 1/10,000, the chance of any one of your 
>memory chips failing is 8/10,000 without parity, or 9/10,000 with... 
> 
> ------------
> David A. Holland

This is only true for 'hard' memory failures.  The ones that are going to hurt
you the most are the intermittent ones.  These are the ones that show up once
in a while.  Examples:  A poor connection or joint which is influenced by heat
                        and/or vibrations.  An electrical problem within a
                        memory or support chip which causes the device to have
                        an output signal which is compromised (weak, for example).
                        This may only show up as an error under certain heat
                        conditions.

Boy, I sure would love to have the 'decreased reliability' of error correcting
memory (you use 12 bits for every byte with it).   ;) ;)
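
The "12 bits for every byte" figure matches a single-error-correcting Hamming code over 8 data bits (8 data bits plus 4 check bits). The post does not say which code the hardware in question uses, so the Python sketch below is an assumption for illustration, with made-up function names.

```python
DATA_POSITIONS = [3, 5, 6, 7, 9, 10, 11, 12]   # the non-power-of-two slots
CHECK_POSITIONS = [1, 2, 4, 8]


def encode(byte):
    """Spread 8 data bits over a 12-bit codeword and fill in 4 check bits."""
    word = [0] * 13                      # index 0 unused; positions are 1..12
    for i, pos in enumerate(DATA_POSITIONS):
        word[pos] = (byte >> i) & 1
    for c in CHECK_POSITIONS:
        # Check bit c makes parity even over every position with bit c set.
        word[c] = sum(word[pos] for pos in range(1, 13) if pos & c) % 2
    return word


def decode(word):
    """Return (byte, syndrome); a syndrome of 0 means no error was seen."""
    syndrome = 0
    for c in CHECK_POSITIONS:
        if sum(word[pos] for pos in range(1, 13) if pos & c) % 2:
            syndrome |= c
    fixed = list(word)
    if 1 <= syndrome <= 12:
        fixed[syndrome] ^= 1             # the syndrome names the flipped position
    byte = sum(fixed[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return byte, syndrome


codeword = encode(0xA5)
codeword[6] ^= 1                         # simulate a single-bit memory fault
print(decode(codeword))                  # original byte recovered, position 6 reported
```

Unlike plain parity, the failed checks together identify which bit went bad, so the read can be repaired rather than just flagged. Correcting codes are commonly extended with one more overall parity bit so that double errors are at least detected; that extension is not shown here.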

Brian (rather have mem fail 12.5% more often, but always know it did) Town
 

davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (04/11/91)

Most (perhaps all) Macs have no parity. Parity problems byte seldom but deep!
-- 
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc and 80386 mailing list
"Stupidity, like virtue, is its own reward" -me

hdrw@ibmpcug.co.uk (Howard Winter) (04/12/91)

Memory doesn't always fail 'hard' by any means, and the power-on test
only detects hard failures.  Back in the mid 70's I had a Z80 based machine
which I played with, and it developed a single-bit fault that took between
5 and 15 seconds to happen - you could set a value in a memory location
and then run a display loop which showed the value.  Suddenly, with no
provocation the value of a single bit would change.  Set it back and 
'verify' it - OK.  A few seconds later it changed back again.
I isolated the chip, and changed it - all was well after that.  It seemed
a bit of a shame to throw away a whole chip for a single-bit fault,
but then most of the Titanic didn't leak...
This system didn't have parity checking, so if I hadn't known about the
fault, an error occurring in data (and being written back to disk) could
have gradually corrupted a database, and an error in program code could
have caused almost any unwanted side effect - just look at the difference
changing a bit in a machine-code instruction makes...
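
As a small illustration of that last point, the Python sketch below flips one bit in a couple of single-byte 8086 opcodes. The table is a tiny hand-picked subset chosen for the example, not a disassembler, and `flip` is a made-up helper.

```python
# A tiny hand-picked subset of single-byte 8086 opcodes, for illustration only.
OPCODES = {
    0x40: "INC AX",
    0x48: "DEC AX",
    0x74: "JZ  rel8",
    0x75: "JNZ rel8",
}


def flip(opcode, bit):
    """Return the opcode with one bit inverted, as a faulty cell might."""
    return opcode ^ (1 << bit)


for original, bit in ((0x74, 0), (0x40, 3)):
    damaged = flip(original, bit)
    print(f"{OPCODES[original]:8} -> {OPCODES[damaged]:8} (bit {bit} flipped)")
```

A conditional jump that inverts its sense, or an increment that becomes a decrement, leaves a program that still runs but does something quite different.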

I think memory parity checking is *vital* for any serious equipment -
certainly for commercial machines, and in my case, for my home machine.
I am a little surprised that 32-bit machines use simple 1-per-byte
parity instead of more sophisticated correction, but there you are.
I don't know if 4-per-32 can do any correction - perhaps someone else
out there can say ?
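
One way to approach the question above is a counting argument: to correct any single-bit error, the check bits must be able to name every bit position in the stored word plus the "no error" case, so r check bits over m data bits need 2**r >= m + r + 1. The Python sketch below just evaluates that bound; the function name is made up for this example.

```python
def min_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


for width in (8, 32):
    r = min_check_bits(width)
    print(f"{width} data bits: {r} check bits to correct, "
          f"{r + 1} to also detect double errors")
```

By this bound, four check bits are enough to correct a single byte but not a 32-bit word, which needs at least six. Four parity bits per 32 bits can therefore only detect errors, one per byte.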

Howard.
-- 
Automatic Disclaimer:
The views expressed above are those of the author alone and may not
represent the views of the IBM PC User Group.
-- 
hdrw@ibmpcug.Co.UK     Howard Winter     0W21'  51N43'