hunt@spar.SPAR.SLB.COM (Neil Hunt) (09/10/87)
In article <1184@itm.UUCP> danny@itm.UUCP (Danny) writes:
> [...]
> How about a computer with say, 300 Meg of RAM. There is, also,
> [...]
> Nevertheless, whatever else may be happening, a scoo-bah of memory
> has mucho appeal. Comments?

Better make sure that it has full error correction! On a Sun 3, I believe that you can install 28 Mbytes of memory, at which point you should expect a parity error to be detected about once a month, with current technology. Thus 300 Megabytes will get a soft error every three days or so (bit of a pain!).

I understand that Sun 4s will have error *correction* hardware so that they can correct single bit (?) errors, and thus go to larger memories without crashing too often.

Does anyone know about soft failure modes of DRAMs? How likely is it to find double bit errors? With denser and denser memory chips, one might expect that one day soon, background alpha particles will be able to flip several adjacent bits.

By the way, my dream machine would have much more than 300 M! Some people here have swap discs in the 100s of M on their lispms, and still could use more!

Also, I don't know why you would have a conventional disc to back up your DRAM. I would trust my (EC) memory more than a disc, but do conventional backups to an optical WORM disc now and then.

Neil/.
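The three-day figure above follows from simple proportional scaling. A minimal sketch of that arithmetic, assuming the soft-error rate grows linearly with installed memory and taking "once a month" as 30 days (the function name and defaults are illustrative, not from the post):

```python
def days_between_soft_errors(mbytes, ref_mbytes=28.0, ref_interval_days=30.0):
    """Expected days between soft errors for `mbytes` of memory.

    Assumes the error rate is proportional to memory size, anchored to the
    post's estimate of one parity error a month (taken as 30 days) at 28 Mbytes.
    """
    if mbytes <= 0:
        raise ValueError("memory size must be positive")
    return ref_interval_days * ref_mbytes / mbytes


# Compare the Sun 3 maximum with the proposed 300 Mbyte machine.
for size in (28, 100, 300):
    interval = days_between_soft_errors(size)
    print(f"{size:4d} Mbytes: one soft error every {interval:.1f} days")
```

For 300 Mbytes this gives a little under three days, in line with the estimate in the post.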
bobw@wdl1.UUCP (Robert Lee Wilson Jr.) (09/11/87)
Concerning the failure rates for DRAMs: there are data available, from the manufacturers and others, but most present DRAM configurations for large memories are arranged so that two-bit (and larger) errors come from simultaneous failures in two DRAM chips (or associated logic) rather than from a double-bit failure in a single chip.

Most DRAM chips (again, I mean as used in large memories) are 1 bit wide by 256K or 1M or 4M or ... bits deep. If your memory design is n bits wide (including whatever check bits are used), it is typically composed of some multiple of n memory chips, in blocks. In each block are n chips, each holding 1 bit for each of 256K (or 1M, etc.) locations. Thus for a cosmic ray to have a multi-bit effect on one word, it must simultaneously affect several chips, and more than that, it must affect those chips in the same bit locations. That certainly is possible, but it seems less likely than affecting several bits in one chip, and probability is the central issue when designing codes to handle different kinds of errors.

This appears again when you look at other single-point-of-failure possibilities. Since a single failure in some auxiliary logic might easily produce all 1's or all 0's, some ECC schemes are careful to detect those failures as special cases, even though they are many-bit errors.
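To make the bit-per-chip organization concrete, here is a toy model of one block of x1 chips. It is only a sketch: the 8-bit word, the 16-location depth, and all the names are hypothetical, and check bits are left out. It shows that an upset spanning several cells of a single chip puts at most one bad bit into any one word.

```python
WORD_BITS = 8   # hypothetical word width; check bits omitted for brevity
DEPTH = 16      # hypothetical (tiny) number of locations per chip


def make_block():
    """One block: WORD_BITS chips, each 1 bit wide and DEPTH locations deep."""
    return [[0] * DEPTH for _ in range(WORD_BITS)]


def read_word(block, addr):
    """A word is assembled from the same location in every chip."""
    return [chip[addr] for chip in block]


def upset_cells(block, chip_index, start_addr, count):
    """Flip `count` consecutive cells inside a single chip."""
    for addr in range(start_addr, start_addr + count):
        block[chip_index][addr] ^= 1


def flipped_bits_per_word(block):
    """Memory starts as all zeros, so any 1 bit is a flipped bit."""
    return [sum(read_word(block, addr)) for addr in range(DEPTH)]


block = make_block()
upset_cells(block, chip_index=3, start_addr=5, count=4)
print(flipped_bits_per_word(block))
```

Four words each show a single flipped bit, which a single-error-correcting code could repair; no word has two.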
richard@aiva.ed.ac.uk (Richard Tobin) (09/11/87)
In article <797@spar.SPAR.SLB.COM> hunt@spar.UUCP (Neil Hunt) writes:
>Does anyone know about soft failure modes of DRAMs? How likely
>is it to find double bit errors? With denser and denser memory chips,
>one might expect that one day soon, background alpha particles will be
>able to flip several adjacent bits.

I don't know how likely such adjacent-bit errors are, but it shouldn't matter much. Most memory chips are <some large number> x 1 bit, which means that a given byte will consist of a bit from each of 8 chips. So an error of the kind described will produce correctable 1-bit errors in several adjacent bytes, rather than an uncorrectable multi-bit error in one byte.

> Thus 300 Megabytes will get a soft error every
> three days or so (bit of a pain!).

If this is accurate, it means that a given byte has roughly a 1-in-10^9 chance of getting a single-bit error in a given day, which means the chance of it getting 2 errors in one day (from different alpha particles) is roughly 1-in-10^18. That is fairly safe, since to provoke an uncorrectable error, the second bit has to be corrupted before error correction puts the first one right (this suggests that you should make sure all your physical memory gets read frequently).

--
Richard Tobin,                 JANET: R.Tobin@uk.ac.ed
AI Applications Institute,     ARPA:  R.Tobin%uk.ac.ed@nss.cs.ucl.ac.uk
Edinburgh University.          UUCP:  ...!ukc!ed.ac.uk!R.Tobin
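The orders of magnitude above can be checked with a few lines. This is a rough sketch, assuming errors are independent and uniformly spread over memory, taking 300 Megabytes as 3 x 10^8 bytes, and treating "corrected when read" as a fixed scrub interval (the function names and the scrub model are illustrative assumptions, not from the post):

```python
def per_byte_daily_rate(total_bytes, days_between_errors):
    """Chance that one given byte takes a soft error in one day."""
    return 1.0 / (days_between_errors * total_bytes)


def double_hit_daily_rate(p_day, scrub_interval_hours=24.0):
    """Rough daily chance that one byte is hit twice before being corrected.

    Assumes every byte is read (and corrected) once per scrub interval, so
    both hits must land in the same window. Order-of-magnitude only.
    """
    windows_per_day = 24.0 / scrub_interval_hours
    p_window = p_day / windows_per_day
    return windows_per_day * p_window ** 2


p = per_byte_daily_rate(300e6, 3.0)
print(f"single error, per byte per day: {p:.2e}")
print(f"double hit, daily scrub:        {double_hit_daily_rate(p):.2e}")
print(f"double hit, hourly scrub:       {double_hit_daily_rate(p, 1.0):.2e}")
```

Under these assumptions the double-hit chance falls in proportion to the scrub interval, which is the reason for reading all of physical memory frequently.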