[comp.arch] parity is for farmers?

scott@poincare.geom.umn.edu (Scott S. Bertilson) (05/22/91)

  Does anyone else get nervous about the fact that NeXT ships their machines
with 8 megabytes of non-parity memory?  Is memory so reliable today that
parity doesn't give enough benefit to bother with?  Does only ECC give a
strong enough guarantee - and that is too expensive, so we should just
go without?
  I might be paranoid, but I don't ever remember buying memory for a
"real" computer before that didn't have some sort of error checking
included.  Please help me to find a good answer to this question -
I've got to add memory to a number of NeXTStations and I'd sure prefer
to use parity memory, but not if it is a waste of time.
				Thanks..Scott S. Bertilson
					scott@geom.umn.edu

danw@hob8.prime.com (Dan Westerberg) (05/22/91)

Parity is by no means a dead issue with memory.  We recently installed a
number of SPARCstation 2s here and ran into a memory problem.  One of our
machines was maxed out on the motherboard with 64MB of memory, and we began
to take 'transient' parity errors with alarming frequency.  Sun maintained
that this wasn't a known problem of any kind, and we ended up tracing it
to the third-party 4MB SIMMs we were using.  The SPARCstation 2 *requires*
80ns RAM parts; the third-party SIMMs were spec'd at 80ns, but it turned out
that the particular DRAMs used could not maintain 80ns access times with a
fully loaded memory bus.  Our vendor replaced all of these 4MB SIMMs and we
haven't had a problem since.

Without parity checking, I have no idea how this problem could have been
tracked down.  I do believe, however, that parity is no longer useful on
the busses between chips on a single board.  Parity on busses was useful
when those busses spanned long lengths of trace and/or crossed a backplane
through several connectors.  But on today's motherboards, where similar
busses travel a few inches at most, I think the usefulness of parity wanes.

ECC is the only way to provide reliable error protection/correction
in today's large memory architectures.  Also, in order to maintain extremely
high *availability* of a machine (i.e., the ratio of hours running to hours
down), ECC is the only way to go.

Dan

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~  "These walls that still surround me      ~                                 ~
~   Still contain the same old me           ~     Dan Westerberg              ~
~   Just one more who's searching for       ~        danw@toucan.prime.com    ~
~   The world that ought to be"             ~                                 ~
~                             - Neil Peart  ~  Prime Computer, Framingham, MA ~
~                                           ~                                 ~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

chuck@irene.mit.edu (CHUCK PARSONS 617-253-4157) (05/22/91)

In article <1991May21.232331.24888@cs.umn.edu>, scott@poincare.geom.umn.edu (Scott S. Bertilson) writes...
> 
>  Does anyone else get nervous about the fact that NeXT ships their machines
>with 8 megabytes of non-parity memory?  Is memory so reliable today that
>parity doesn't give enough benefit to bother with?  Does only ECC give a
>strong enough guarantee - and that is too expensive, so we should just
>go without?
>  I might be paranoid, but I don't ever remember buying memory for a
>"real" computer before that didn't have some sort of error checking
>included.  Please help me to find a good answer to this question -
>I've got to add memory to a number of NeXTStations and I'd sure prefer
>to use parity memory, but not if it is a waste of time.

  Parity memory costs you an extra bus wait state on the NeXT,
or so I'm told.

  If you are a bank, a hospital, or SDI, get ECC.  If you are in a
university research or programming environment, then IMHO

 Probability(serious program bug) > 10^4 * Probability(undetected memory fault)

 Most memory problems will make themselves apparent; of course, some
won't, but in many areas the majority of programs running have serious bugs.
The chance of an occasional cosmic-ray bit flip going undetected is real,
say something like one per five years; at least on the parity systems
I've used, that's about how often you see one.  But how does that affect
your productivity, versus buying an extra 1-gigabyte disk drive with the
money you save?


  I find that properly designed memory systems are pretty reliable.  The
biggest problem is cutting the timing specs too close.  The NeXT calls
for 100ns RAM.  I'm sure that spec is well thought out.  You should
be fine.

chuck@mitlns.mit.edu

henry@zoo.toronto.edu (Henry Spencer) (05/22/91)

In article <1991May21.232331.24888@cs.umn.edu> scott@poincare.geom.umn.edu (Scott S. Bertilson) writes:
>  Does anyone else get nervous about the fact that NeXT ships their machines
>with 8 megabytes of non-parity memory? 

Apple does the same thing.

>Is memory so reliable today that
>parity doesn't give enough benefit to bother with?

It's marginal, and depends on circumstances.  Modern memory *is* pretty
reliable, and it's attractive to the bean-counters to pinch pennies by
removing parity circuitry.  Doing parity requires 12.5% more memory, adds
circuitry for checking and for testing the checker (not being able to test
it, e.g. the IBM PC, is a mistake), can introduce problematic delays in
fast machines, and can be a considerable headache when writing partial
words to memory in some situations.  On well-broken-in hardware, parity
errors are quite rare.  (Utzoo gets maybe one or two a year on 24MB of
relatively old memory.)
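
(The 12.5% is just one extra bit per 8-bit byte, holding the byte's
parity.  A minimal sketch in C of the principle; on a real machine the
check is done by hardware on every memory access, not by software:)

    #include <stdio.h>

    /* Even parity: the stored ninth bit is chosen so that the total
       number of 1s across data and parity bit is even, i.e. it is
       the XOR of the eight data bits. */
    static unsigned parity_bit(unsigned char b)
    {
        unsigned p = 0;

        while (b) {
            p ^= b & 1;
            b >>= 1;
        }
        return p;
    }

    int main(void)
    {
        unsigned char byte = 0x5A;          /* 01011010: four 1s */
        unsigned stored = parity_bit(byte); /* 0, count is even  */

        byte ^= 0x08;                       /* one bit flips     */

        if (parity_bit(byte) != stored)
            printf("parity error: NMI on a PC, panic on many Unixes\n");
        return 0;
    }

Parity catches any odd number of flipped bits in a byte, misses any
even number, and can't say *which* bit flipped; fixing both of those
is what ECC is for.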

Of course, when a bit does go bad, you'd kind of like to know about it...

>Does only ECC give a
>strong enough guarantee - and that is too expensive...

ECC is more painful in all the above ways, and tends to be used only for
server-class machines where availability sells.  In practice, with current
error rates, unless the application is one where crashes are utterly
unacceptable, parity is amply sufficient.  It's nice to be able to repair
the error, but relatively unimportant:  most parity errors will not bring
the system down if intelligently managed, and most systems crash more
often than that for other reasons anyway.
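
(For the curious, here is a toy single-error-correcting Hamming code in
C, purely as an illustration of the principle; real memory ECC uses a
wider SEC-DED code over whole words, computed in hardware:)

    #include <stdio.h>

    /* Hamming(12,8): 8 data bits ride in positions 3,5,6,7,9,10,11,12
       of a 12-bit codeword; even-parity check bits sit in positions
       1,2,4,8.  The syndrome, the XOR of the positions of all 1 bits,
       is 0 for a clean word and equals the position of the flipped
       bit after any single-bit error. */
    static const int data_pos[8] = {3, 5, 6, 7, 9, 10, 11, 12};

    static unsigned encode(unsigned char d)
    {
        unsigned w = 0;
        int i, c, j;

        for (i = 0; i < 8; i++)
            if (d & (1u << i))
                w |= 1u << data_pos[i];
        for (c = 1; c <= 8; c <<= 1) {  /* set each check bit so its */
            unsigned p = 0;             /* parity group comes out    */
            for (j = 1; j <= 12; j++)   /* even                      */
                if ((j & c) && (w & (1u << j)))
                    p ^= 1;
            w |= p << c;
        }
        return w;
    }

    static unsigned syndrome(unsigned w)
    {
        unsigned s = 0;
        int j;

        for (j = 1; j <= 12; j++)
            if (w & (1u << j))
                s ^= j;
        return s;
    }

    int main(void)
    {
        unsigned w, s;

        w = encode(0xA7);
        w ^= 1u << 6;           /* a cosmic ray flips position 6 */
        s = syndrome(w);
        if (s)
            w ^= 1u << s;       /* correct it and carry on       */
        printf("syndrome %u, data intact: %s\n", s,
               w == encode(0xA7) ? "yes" : "no");
        return 0;
    }

(At realistic widths the overhead is smaller: 7 or 8 check bits cover
a 32- or 64-bit word.)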
-- 
And the bean-counter replied,           | Henry Spencer @ U of Toronto Zoology
"beans are more important".             |  henry@zoo.toronto.edu  utzoo!henry

lindsay@gandalf.cs.cmu.edu (Donald Lindsay) (05/23/91)

In article <1991May22.155818.20148@zoo.toronto.edu> 
	henry@zoo.toronto.edu (Henry Spencer) writes:
>On well-broken-in hardware, parity errors are quite rare.

True, if the hardware is treated well. Parity can make a pretty good
electronic detector for air conditioning failure.

Further, not all hardware is well broken in. Leaving aside machines
that were shipped before their time, there's the fact that a lot of
memory upgrades are done by untrained personnel, with chips they
bought by mail order.

(Anyone who doesn't see the risks in the above should subscribe to
alt.folklore.computers. Sadly, the stories of hobbyists who left
cigar ash under the sockets are not apocryphal.)
-- 
Don		D.C.Lindsay 	Carnegie Mellon Robotics Institute

mrc@milton.u.washington.edu (Mark Crispin) (05/23/91)

In article <1991May21.232331.24888@cs.umn.edu> scott@poincare.geom.umn.edu (Scott S. Bertilson) writes:
>  Does anyone else get nervous about the fact that NeXT ships their machines
>with 8 megabytes of non-parity memory?  Is memory so reliable today that
>parity doesn't give enough benefit to bother with?  Does only ECC give a
>strong enough guarantee - and that is too expensive, so we should just
>go without?

With core memory, a single magnetic core failing would cause a single
bit error at a specific location.  Parity is great for detecting that
kind of error.  Chances are, it didn't happen at a critical location
(critical for the operating system, anyway) so if your operating
system is clever enough it could abort the affected process (along
with suitable logging), and mark that memory page as being bad (and
hence shouldn't be used).

Another possibility with core memory is the failure of a single line
(row or column) that causes the loss of bit n in locations in a
particular memory range.  This sort of failure has greater impact, but
there is still the chance of a software recovery (albeit not of the
process that hit the error) and the continuation of the system in a
degraded mode.

Semiconductor memory is a different story.  My experience with
semiconductor memory suggests that failures are catastrophic and
massive.  Also, modern software using virtual memory tends to scatter
kernel-critical pages throughout physical memory.

Put another way, if any of the SIMMs in a NeXT were to fail while the
system was running, the resulting data scrambling would tend to cause
an immediate failure of the system, probably before the parity trap
code would get to run, much less print out any diagnostics.

Finally, note that you are not running a multi-user timesharing
system.  The crash of an individual NeXT is not as horrible an event
as the crash of a timesharing system with 150 logged-in users.  There
are enough system-crash software bugs in 2.1 that crashes are to be
expected.  The main danger of a memory error is one in which the error
happens *without* the system crashing -- in effect, undetected.

jewett@hpl-opus.hpl.hp.com (Bob Jewett) (05/24/91)

> >Is memory so reliable today that
> >parity doesn't give enough benefit to bother with?
> 
> It's marginal, and depends on circumstances.  Modern memory *is* pretty
> reliable,
...
> On well-broken-in hardware, parity errors are quite rare.  (Utzoo gets
> maybe one or two a year on 24MB of relatively old memory.)

On the ~20 systems in this department, we see DRAM error rates that vary
according to the type of memory chips used.  Systems that have 1Mb chips
seem to average about one error for every 400 megabyte-months of
operation.  That's one error on a 16MB system every 25 months.  Systems
that use 4Mb chips (i.e., all the new ones) have one error every 100
megabyte-months, or four times a year for a 32Meg system.
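
(That scaling is linear in memory size, so the arithmetic is easy to
check; a back-of-the-envelope in C using the rates quoted above:)

    #include <stdio.h>

    /* Expected DRAM errors per year, given a rate in megabyte-months
       per error: about 400 for the 1Mbit-chip systems and about 100
       for the 4Mbit-chip systems described above. */
    static double errors_per_year(double megabytes, double mb_months)
    {
        return megabytes * 12.0 / mb_months;
    }

    int main(void)
    {
        printf("16MB of 1Mbit DRAM: %.2f errors/year\n",
               errors_per_year(16.0, 400.0));   /* one per ~25 months */
        printf("32MB of 4Mbit DRAM: %.2f errors/year\n",
               errors_per_year(32.0, 100.0));   /* about four a year  */
        return 0;
    }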

>> Does only ECC give a
>> strong enough guarantee - and that is too expensive...

> ECC is more painful in all the above ways, and tends to be used only for
> server-class machines where availability sells.  In practice, with current
> error rates, unless the application is one where crashes are utterly
> unacceptable, parity is amply sufficient.

Yes, it depends on what the costs of a crash are.  If you have spent
a couple of days working on an IC design and have neglected to write a
checkpoint version, a crash costs at least several hundred dollars.

All the new systems here have ECC.  We have had about 50 corrected
errors in the last year, not counting two bursts of errors on two
systems that had hardware problems.  In our situation, it would have
been unacceptable to have had that many more crashes.  ECC is required.
Parity is not sufficient.

Bob Jewett
[Not an official statement, etc.]

henry@zoo.toronto.edu (Henry Spencer) (05/24/91)

In article <1991May22.234515.24685@milton.u.washington.edu> mrc@milton.u.washington.edu (Mark Crispin) writes:
>With core memory, a single magnetic core failing would cause a single
>bit error at a specific location.  Parity is great for detecting that
>kind of error.  Chances are, it didn't happen at a critical location...

This is exactly the dominant error mode for DRAMs:  one bit going bad
either once (alpha particle or cosmic ray) or permanently.  The same
comments apply:  odds are, given intelligent management, you can either
recover transparently (if it hits a page that exists on disk) or just
blow away one process.

>Another possibility with core memory is the failure of a single line
>(row or column) that causes the loss of bit n in locations in a
>particular memory range...

Again, this is exactly analogous to a DRAM failure mode:  one chip dying.
Almost all current DRAM configurations put one bit of each word in each
chip.  That's starting to change as DRAMs get huge and x4 and x8 versions
start to appear, but most systems still do one bit per chip.

>Put another way, if any of the SIMMs in a NeXT were to fail while the
>system was running...

A whole SIMM failing is indeed pretty disastrous.  It should also be
pretty uncommon, unless you've got marginal parts or marginal installers.
-- 
And the bean-counter replied,           | Henry Spencer @ U of Toronto Zoology
"beans are more important".             |  henry@zoo.toronto.edu  utzoo!henry

adoyle@bbn.com (Allan Doyle) (05/24/91)

In article <1991May22.234515.24685@milton.u.washington.edu> mrc@milton.u.washington.edu (Mark Crispin) writes:
>Semiconductor memory is a different story.  My experience with
>semiconductor memory suggests that failures are catastrophic and
>massive.  Also, modern software using virtual memory tends to scatter
>kernel-critical pages throughout physical memory.

True enough, but whatever happened to the alpha-particle hits we
were hearing so much about a few years ago?  Alpha-particle hits would
affect only single bits, by flipping, clearing, or setting a bit.
I was under the impression that newer memories were getting
increasingly vulnerable, since the cell size was shrinking and the
energy needed to flip a bit was decreasing.  Have the semiconductor
manufacturers figured out how to shield the chips with a special
coating or something?

	Allan



Allan Doyle                                        adoyle@bbn.com
Bolt Beranek and Newman,  Incorporated             +1 (617) 873-3398
10 Moulton Street, Cambridge, MA 02138             USA

gillies@m.cs.uiuc.edu (Don Gillies) (05/25/91)

Actually, wouldn't it be pretty easy to provide "optional parity" on
non-bank-switched memory?  At boot time, sense whether parity SIMMs
are installed, and if so, enable the parity-checking hardware, which
uses a single SIMM to parity-check all the other SIMMs in the system.
One byte from the parity SIMM is used to check 8 words of main memory.

One 4Mb SIMM would be enough to parity-check 128Mb of memory.  If your
system was experiencing random crashes, you could install a parity
SIMM to detect the defective memory bank.  Once you had found the
errant bank, you could replace the bad part and remove the parity
SIMM.  This would save a lot of money, since you wouldn't be paying
for 1/32nd more memory on a permanent basis.
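
(To make the proposed bookkeeping concrete, here is a sketch in C of
the mapping it implies.  The hardware is hypothetical: one even-parity
bit per 32-bit word, packed eight to a byte in the parity SIMM, which
is how one byte checks 8 words and a 4Mb SIMM covers 32 times its own
size:)

    #include <stdint.h>
    #include <stdio.h>

    /* One even-parity bit per 32-bit word of main memory. */
    static unsigned word_parity(uint32_t w)
    {
        unsigned p = 0;

        while (w) {
            p ^= w & 1;
            w >>= 1;
        }
        return p;
    }

    /* Word number n is guarded by bit (n % 8) of byte (n / 8) in the
       parity SIMM.  Returns 1 if the word checks out. */
    static int check_word(const uint32_t *mem, const uint8_t *psimm,
                          uint32_t n)
    {
        unsigned stored = (psimm[n / 8] >> (n % 8)) & 1;

        return word_parity(mem[n]) == stored;
    }

    int main(void)
    {
        uint32_t mem[8] = {0xDEADBEEF, 1, 2, 3, 4, 5, 6, 7};
        uint8_t psimm[1] = {0};
        uint32_t n;

        for (n = 0; n < 8; n++)                 /* build parity bits */
            psimm[n / 8] |= word_parity(mem[n]) << (n % 8);

        mem[3] ^= 0x100;                        /* a bit rots        */
        for (n = 0; n < 8; n++)
            if (!check_word(mem, psimm, n))
                printf("parity error in word %lu\n", (unsigned long)n);
        return 0;
    }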

Don Gillies	     |  University of Illinois at Urbana-Champaign
gillies@cs.uiuc.edu  |  Digital Computer Lab, 1304 W. Springfield, Urbana IL

ratshana@triton.unm.edu (R.L.) (06/03/91)

Parity memory isn't really necessary for two reasons:
1) Unless you buy really cheap memory, you should NEVER have a problem with
   screwed-up RAM.  Besides, if the RAM is screwy, what's the worst that can
   happen?  Your term window dies, so you kill it and move on to another task...
2) I don't know about the NeXT, but the Amiga also used to store checksums
   at the end of a file stored in RAM.  Every time it went to run the program
   it would calculate the checksum and compare it to the stored one; if they
   didn't match, it'd give you a dialog box saying so.  I ONLY had that problem
   when I was using this nifty utility that compressed programs, which
   would automagically decompress in RAM: the checksums didn't match cuz one
   was compressed and the other wasn't!
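
(For flavor, the kind of load-time check being described; this is a
guess at the mechanism, not the actual Amiga scheme:)

    #include <stdio.h>
    #include <stdint.h>
    #include <stddef.h>

    /* Sum the program image once when it is stored, again just before
       running it; a mismatch means something changed in between. */
    static uint32_t checksum(const uint8_t *image, size_t len)
    {
        uint32_t sum = 0;
        size_t i;

        for (i = 0; i < len; i++)
            sum += image[i];
        return sum;
    }

    int main(void)
    {
        uint8_t image[4] = {0x4E, 0x75, 0x4E, 0x71};  /* 68k rts, nop */
        uint32_t stored = checksum(image, sizeof image);

        image[2] ^= 0x01;                  /* RAM got scribbled on */
        if (checksum(image, sizeof image) != stored)
            printf("dialog box: program is corrupt\n");
        return 0;
    }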

hrubin@pop.stat.purdue.edu (Herman Rubin) (06/03/91)

In article <1991Jun03.040242.15406@ariel.unm.edu>, ratshana@triton.unm.edu (R.L.) writes:
> Parity memory isn't really necessary for two reasons:
> 1) Unless you buy really cheap memory, you should NEVER have a problem with
>    screwed-up RAM.  Besides, if the RAM is screwy, what's the worst that can
>    happen?  Your term window dies, so you kill it and move on to another task...

If all you are using your computer for is producing documents, you MAY be
right.  But errors can be important, and the worst kind is the soft error,
which is not even reproducible.

When computers are used for COMPUTING, and the results are used, this
attitude cannot be taken.  The soft error problem has been considered
unavoidable in very dense electronic memories.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)

brian@umbc4.umbc.edu (Brian Cuthie) (06/03/91)

In article <1991Jun03.040242.15406@ariel.unm.edu> ratshana@triton.unm.edu (R.L.) writes:
>Parity memory isn't really necessary for two reasons:
>1) Unless you buy really cheap memory, you should NEVER have a problem with
>   screwed-up RAM.  Besides, if the RAM is screwy, what's the worst that can
>   happen?  Your term window dies, so you kill it and move on to another task...

What!?  Are you kidding?  Maybe instead it changes the amount of a deposit
while I am balancing my checkbook.  Bit errors are as unpredictable in
location as they are in frequency, so it is not reasonable to make any
assumptions about their effects.

>2) I don't know about the NeXT, but the Amiga also used to store checksums
>   at the end of a file stored in RAM.  Every time it went to run the program
>   it would calculate the checksum and compare it to the stored one; if they
>   didn't match, it'd give you a dialog box saying so.  I ONLY had that problem
>   when I was using this nifty utility that compressed programs, which
>   would automagically decompress in RAM: the checksums didn't match cuz one
>   was compressed and the other wasn't!

Well, I can say with some certainty that the NeXT (or Mach, for that matter)
doesn't run around checksumming things in memory.  That would cost *way*
too much time.  It's easy to do at program launch time, but for a running
program's data area it would be necessary to recompute and store the
checksum *each time something was knowingly changed*.  Not feasible.

Unfortunately, although not without its own problems, parity is the only
cost-effective way of adding some detection mechanism for bit errors in
memory.  Of course, it still doesn't help you if you mangle your data
yourself :-)

-brian

kenton@abyss.zk3.dec.com (Jeff Kenton OSG/UEG) (06/03/91)

|> In article <1991Jun03.040242.15406@ariel.unm.edu>
|> ratshana@triton.unm.edu (R.L.) writes:
|> 
|> >2) I don't know about the NeXT, but the Amiga also used to store checksums
|> >   at the end of a file stored in RAM.  Every time it went to run the program
|> >   it would calculate the checksum and compare it to the stored one;
|> >   if they didn't match, it'd give you a dialog box saying so.

The Amiga needed to do this because it supported multi-tasking without an MMU.
Programs used to regularly trash each other's memory space.  It was one of the
most painful development environments I ever worked with.  Neat graphics
hardware, but pretty weak operating system.

-----------------------------------------------------------------------------
==	jeff kenton		Consulting at kenton@decvax.dec.com        ==
==	(617) 894-4508			(603) 881-0011			   ==
-----------------------------------------------------------------------------

ssr@taylor.Princeton.EDU (Steve S. Roy) (06/03/91)

In article <1991Jun03.040242.15406@ariel.unm.edu> ratshana@triton.unm.edu (R.L.) writes:
>Parity memory isn't really necessary for two reasons:
>1) Unless you buy really cheap memory, you should NEVER have a problem with
>   screwed-up RAM.  Besides, if the RAM is screwy, what's the worst that can
>   happen?  Your term window dies, so you kill it and move on to another task...

I'm sorry, but this is an awfully glib statement, and it should come
with a bunch of qualifiers that cluster around a particular theme.
It only holds:

    ...if you do a given calculation many times and can compare the
results to see if something is wrong.  This includes being able to redo
a compile or something similar.

    ...if you are operating far from the limits of the machine.  This
means that you will be able to rerun the job.

    ...if you don't really care whether a given calculation was correct.
The vast majority of the memory in many (especially scientific) codes
is in large arrays of floating-point numbers, and if a bit flips
somewhere in there it will not crash the program; it will just give
wrong answers.

Now, I realize that these things are typically true for the majority
of the readers of this group: code hackers for whom most of the
burden on the machine comes from editing and the occasional compile.
There are, however, many users for whom these things are not true, and
it bugs me to see glib generalizations like the one above.

Steve Roy.

kludge@grissom.larc.nasa.gov ( Scott Dorsey) (06/04/91)

In article <1991Jun03.040242.15406@ariel.unm.edu> ratshana@triton.unm.edu (R.L.) writes:
>1) Unless you buy really cheap memory, you should NEVER have a problem with
>   screwed-up RAM.  Besides, if the RAM is screwy, what's the worst that can
>   happen?  Your term window dies, so you kill it and move on to another task...

Well, one thing that can happen is that a simulation produces incorrect
output, which is a very curious anomaly.  Someone writes a thesis
to explain that anomaly, then two people write books citing that thesis.
A piece of equipment is constructed based upon the data in one of those books
and it fails, killing someone.  Then a posting is made to comp.risks about
the danger of computer simulations...
--scott
   (the example, incidentally, occurred on a Cyber machine)

bard@jessica.stanford.edu (David Hopper) (06/04/91)

In article <1991Jun3.144909.24609@decvax.dec.com> kenton@abyss.zk3.dec.com (Jeff Kenton OSG/UEG) writes:
>|> In article <1991Jun03.040242.15406@ariel.unm.edu>
>|> ratshana@triton.unm.edu (R.L.) writes:
>|> 
>|> >2) I don't know about the NeXT, but the Amiga also used to store checksums
>|> >   at the end of a file stored in RAM.  Every time it went to run the program
>|> >   it would calculate the checksum and compare it to the stored one;
>|> >   if they didn't match, it'd give you a dialog box saying so.
>
>The Amiga needed to do this because it supported multi-tasking without an MMU.
>Programs used to regularly trash each other's memory space.  It was one of the
>most painful development environments I ever worked with.  Neat graphics
>hardware, but pretty weak operating system.

That's a matter of opinion.  For a preemptive microkernel OS, it doesn't
get any better.

Followups to comp.sys.amiga.advocacy.

>==	jeff kenton		Consulting at kenton@decvax.dec.com        ==

Dave Hopper      |MUYOM!/// Anthro Creep | NeXT Campus Consultant at Stanford
                 | __  ///    .   .      | Smackintosh/UNIX Consultant - AIR
bard@jessica.    | \\\///    Ia! Ia!     | Independent Amiga Developer
   Stanford.EDU  |  \XX/ Shub-Niggurath! | & (Mosh) Pit Fiend from Acheron

jlee@sobeco.com (j.lee) (06/04/91)

In <1991May25.062358.13694@m.cs.uiuc.edu> gillies@m.cs.uiuc.edu (Don Gillies) writes:

>Actually, wouldn't it be pretty easy to provide "optional parity" on
>non-bank-switched memory?  At boot time, sense whether parity SIMMs
>are installed, and if so, enable the parity-checking hardware, which
>uses a single SIMM to parity-check all the other SIMMs in the system.
>One byte from the parity SIMM is used to check 8 words of main memory.

Sorry, parity is one game where either you're in or you're out.
The complexity is in the parity generation and detection circuitry;
even the checking itself is not instantaneous, and careful design is
needed to avoid slowing down accesses to memory.  The expense of
the extra bit-per-byte is minimal (or would be, in bulk, if enough
people used parity).

For speed, you need to access the parity bits in parallel with the
RAM bits that they guard; for simplicity, you want them on the same
bus; for byte-addressable memory, you want to have a separate parity
bit for each byte to simplify the update problem.  You *don't* want
to require a Read-Modify-Write cycle to update the parity bits.
That means using 9-bit SIMMs (or similar schemes).  You could
arrange to disable the parity checking (or at least the trap), but
if you have gone to the trouble to support it, why turn it off?
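
(A sketch in C of the update problem just described; illustrative, not
a description of any real memory controller.  With a parity bit per
byte the new parity comes from the write data alone; with one parity
bit per word, a byte store would need the read-modify-write cycle:)

    #include <stdint.h>

    static unsigned parity8(uint8_t b)
    {
        unsigned p = 0;

        while (b) {
            p ^= b & 1;
            b >>= 1;
        }
        return p;
    }

    /* 9-bit-SIMM style, one parity bit per byte: a byte store needs
       no memory read; parity is generated from the write data. */
    static void store_byte(uint8_t *mem, uint8_t *par, uint32_t a,
                           uint8_t val)
    {
        mem[a] = val;
        par[a] = parity8(val);
    }

    /* One parity bit per 32-bit word: the same byte store must first
       READ the word to recompute parity over all four bytes, which is
       the read-modify-write cycle you want to avoid. */
    static void store_byte_rmw(uint32_t *mem, uint8_t *par, uint32_t a,
                               uint8_t val)
    {
        int shift = 8 * (a % 4);
        uint32_t w = mem[a / 4];                /* the extra read */

        w = (w & ~(0xFFu << shift)) | ((uint32_t)val << shift);
        mem[a / 4] = w;
        par[a / 4] = parity8(w) ^ parity8(w >> 8)
                   ^ parity8(w >> 16) ^ parity8(w >> 24);
    }

    int main(void)
    {
        uint8_t  bmem[8] = {0}, bpar[8] = {0};
        uint32_t wmem[2] = {0};
        uint8_t  wpar[2] = {0};

        store_byte(bmem, bpar, 5, 0xA5);        /* no read needed */
        store_byte_rmw(wmem, wpar, 5, 0xA5);    /* reads wmem[1]  */
        return 0;
    }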

Jeff Lee -- jlee@sobeco.com || jonah@cs.toronto.edu

jbn35564@uxa.cso.uiuc.edu (J.B. Nicholson) (06/05/91)

Can someone please post a summary of what the deal on parity RAM is?
Basically, I'd like to have all parity RAM (and make use of the 9th bit), but
would that involve having to find another use for the 8MB of non-parity RAM
that the NeXT comes with?

Jeff
--
jeffo@uiuc.edu

jak@interlan.Interlan.COM (Jeff Koehler) (06/05/91)

        Here's my two cents on parity RAM -- anyone who wants to 'use it'
        for purposes other than parity, or believes they can get around it,
        most likely has never had any bad RAMs!

	I have had parity errors on two of 'my' machines, both 386-AT style
	boxes -- one at home, one at work.  In both cases, I found the
	parity errors to be real.  The errors broke down like this:

		1)	motherboard 256kx1 'original', in the lower 640k.

		2)	motherboard 'add-on' SIMMs, 4Mx9's, which also
			turned out to be in the lower 640k. (I removed
			the original 256kx1's).

        I feel lucky because the errors actually occurred in memory below
        the 1MB limit, where a quick hack-job memory tester I wrote
        could easily find them.  Sure enough, the errors were real.

        Case 2) only happened when walking a '1' across zeros, where
        the '1' was in bit 0 and the error occurred in bit 16 (I am
        pointing this out to illustrate the quirky nature of bad RAMs).
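
        (For reference, the skeleton of such a tester: a walking-ones
        pass in C.  This is a sketch; under an OS you are testing a
        buffer the OS gave you rather than raw physical memory, and a
        real tester also walks zeros, varies patterns, and checks
        addressing:)

            #include <stdio.h>
            #include <stdlib.h>

            /* March a single 1 bit through a field of 0s in each word.
               Catches stuck bits and coupling between bits in a word,
               like the bit-0-disturbs-bit-16 case above. */
            static long walk_ones(volatile unsigned long *mem, long nwords)
            {
                long i, errors = 0;
                int bit;

                for (i = 0; i < nwords; i++) {
                    for (bit = 0; bit < 32; bit++) {
                        unsigned long pat = 1ul << bit;
                        unsigned long got;

                        mem[i] = pat;
                        got = mem[i];
                        if (got != pat) {
                            printf("word %ld: wrote %08lx, read %08lx\n",
                                   i, pat, got);
                            errors++;
                        }
                    }
                }
                return errors;
            }

            int main(void)
            {
                long nwords = 256L * 1024;      /* a 1MB test region */
                unsigned long *buf = malloc(nwords * sizeof *buf);

                if (buf == NULL)
                    return 1;
                printf("%ld errors\n", walk_ones(buf, nwords));
                free(buf);
                return 0;
            }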

	Needless to say, I was unconvinced the first few times I experienced
	a system halt (by the BIOS), but as it occurred more and more often,
	I thought it was time to do something.

        How would I have found this without parity?  Well, my memory tester
        would find it, but not the simple power-up sequence tests in the
        BIOS ROMs (in fact, I had to disable the NMI for parity errors in
        order for my program to be able to print the failing location to
        the screen).  But how long would strange things have been happening
        if there were no parity?  How many hours would I have wasted on
        errant programs, etc.?  Sound absurd?  Wait till *YOUR* memory
        goes bad!  I certainly have a few 1-2-3 spreadsheets with some '|'
        characters that changed to '{' because I told the BIOS's NMI
        handler to 'keep going' rather than reboot.

	The machines we use today at home may easily have more memory than
	the original CRAY-1S I used to write diagnostics for back in college.
	Technicians there pointed out that the ECC memory would log an error
        every few days when under heavy use -- and believe me, if there were
        a PC with ECC instead of parity, I would be waiting in line ...
	but just like the person who will drive around for years with no spare
	tire in the trunk, there will be people trying to subvert the purpose
	of those 'wasted bits' in the ECC memory.
							Jak

((((((((((((((((((((((((((((((((((|))))))))))))))))))))))))))))))))))))))))
(( Sr Hardware Engineer       ))    ((  Jeff Koehler                     ))
(( Racal InterLan, Inc.       ))    ((  jak@interlan.com                 ))
(( Boxboro, MA  508-263-9929  ))    ((  imagine code with this many '('s!))
((((((((((((((((((((((((((((((((((|))))))))))))))))))))))))))))))))))))))))

rhealey@digir4.digibd.com (Rob Healey) (06/06/91)

In article <1991Jun3.144909.24609@decvax.dec.com> kenton@abyss.zk3.dec.com (Jeff Kenton OSG/UEG) writes:
>|> >2) I don't know about the NeXT, but the Amiga also used to store checksums
>|> >   at the end of a file stored in RAM.  Every time it went to run the program
>|> >   it would calculate the checksum and compare it to the stored one;
>|> >   if they didn't match, it'd give you a dialog box saying so.
>
>The Amiga needed to do this because it supported multi-tasking without an MMU.
>Programs used to regularly trash each other's memory space.  It was one of the
>most painful development environments I ever worked with.  Neat graphics
>hardware, but pretty weak operating system.
>
	Then you must not have used any of the other nonprotected environments
	like DOS or the Mac...  It's a common problem for systems that try to
	do multiple things at once without an MMU.  Add a few TSRs and a
	spooler together, or the equivalent for the Mac, then try to develop
	buggy code: you'd get the same result no matter what system you used.
	The problem is certainly not unique to the Amiga.  We had the same
	problem on Apple IIs and TRS-80s way back when.  TRS-80s were real
	fun 'cause they ran tasks off a 60Hz timer interrupt; code would
	overwrite the vector and kaboom!  Fun stuff from the good ol' days!

		-Rob

lee@pipe.cs.wisc.edu (Soo Lee) (06/06/91)

In article <1991Jun5.155332.485@interlan.Interlan.COM> jak@interlan.interlan.com writes:
>
>	I have had parity errors on two of 'my' machines, both 386-AT style
>	boxes -- one at home, one at work.  In both cases, I found the
>	parity errors to be real.  The errors broke down like this:
>	~~~~~~~~~~~~~~~~~~~~~~~~
I was a TRUE BLUE believer and have an IBM PC.  But I had a bad memory
problem which showed up a year after I bought it: infrequent crashes
with programs of different sizes.  I spent a couple of days on it until I
wrote a very short asm program to check memory in 4 different test modes
on the user area, and I was shocked by what I found.  Of course, I am
definitely investing my extra money in parity memory! ;-|

Soo	lee@cs.wisc.edu

herman@corpane.uucp (Harry Herman) (06/09/91)

In <1991Jun03.040242.15406@ariel.unm.edu> ratshana@triton.unm.edu (R.L.) writes:

>Parity memory isn't really necessary for two reasons:
>1) Unless you buy really cheap memory, you should NEVER have a problem with
>   screwed-up RAM.  Besides, if the RAM is screwy, what's the worst that can
>   happen?  Your term window dies, so you kill it and move on to another task...

Or the screwy memory is in the middle of a disk buffer and you end up
trashing your company's accounting database, which then takes several
days to restore: you have to figure out when the database got
screwed up, restore a backup from before that point (hopefully a
recent one), and re-enter ALL the lost transactions between when the
bad data was written and when it was discovered.

Of course, this assumes that regular backups are taken...

					Harry Herman
					herman@corpane