knop@dutesta.UUCP (Peter Knoppers, Delft Univ. of Technology) (11/22/88)
>From: bill@bilver.UUCP (Bill Vermillion) >Can anyone recommend a reliable memory test. (Talking '286 and '386 >machines here). Yes, below I will give a summary of part of a tutorial on memory testing presented by Prof. A.J. van de Goor at the Internation Test Conference 1988. It is not very likely that you can obtain copies of the documents used in that tutorial. We are in the process of converting the information to a book that will be obtainable trough the normal channels. This will become during in 1989. Faults in memory arrays come in several flavors: SA: Stuck-At faults in the memory array. This means that one or more cells contain a constant value. AF: Address decoder faults. This means that several addresses access the same memory cell, or some addresses access no cell at all. TF: Transition faults. This means that a cell changes value when another cell is written to. CF: Coupling faults. This means that a cell shows a stuck-at fault or transition fault only when a certain pattern is present in neighboring cells. Traditionally, memory test algorithms were ad hoc designs, no one knew what fault coverage was obtained by them. To describe some memory tests I will present a language that was invented when the so-called march tests were introduced. A march test consists of a number of march elements. Each march element tells whether the full address range is accessed from lowest to highest address, of vice-versa. Furthermore the march element tells what read and write actions must be performed at each address. The simplest (ad-hoc) memory test is the Zero-One test. The march notation for Zero-One is { up(w0); up(r0); up(w1); up(r1) }. "up" is usually depicted by an up-arrow, but I don't have that on my keyboard/display. The opposite of up is down, my keyboard has no key for that either. "w0" means to write 0 in the memory location. "r0" means to read the memory location and assert that it contains 0. This march can be translated to an equivalent piece of a C program: for(address = lowest; address <= highestl; address++) *address = 0; for(address = lowest; address <= highestl; address++) if(*address != 0) error(...); for(address = lowest; address <= highestl; address++) *address = 1; for(address = lowest; address <= highestl; address++) if(*address != 1) error(...); The Zero-one test is extremely simple and it only detects SA faults in the memory array. Every memory location is accessed 4 times, therefore the time it takes to execute Zero-one is proportional to 4n (n is the number of addresses). The Zero-one test is include here for its simplicity. It has very bad fault coverage, for instance, it will not detect that a 64 kb chip was inserted in a socket designed for a 256 kb chip. (Such faults are detected by memory tests that detect address decoder faults.) A slightly better test is called MATS. The march description for MATS is { up(w0); up(r0,w1); up(r1) }. In addition to all SA faults, MATS detects all address decoder faults if an additional condition is met. Address decoder faults come in several types: 1: no cell is accessed 2: a different cell is accessed 3: several cells are accessed, the output value is the logical OR, or the logical AND of the contents of all selected cells. Which logical function is applicable depends on the technology of the memory chips. Cases 1 and 2 are always covered by MATS. Case 3 is only covered if the resulting value is the logical OR of the contents of the memory cells is output. If the logical AND is output, the march can be replaced by { up(w1; up(r1,w0); up(r0) } to obtain complete coverage of all address decoder faults. The example of the 64 kb chip in the 256 kb chip's socket would be detected, as this is equivalent to a type 2 address decoder fault. MATS+ is an improved version, that detects all address decoder faults, independent of the way case 3 faults operate. It is described by the march { up(w0); up(r0,w1); up(r1,w0) }. MATS+ takes 5n accesses. The equivalent C code is: for(address = lowest; address <= highestl; address++) *address = 0; for(address = lowest; address <= highestl; address++) { if(*address != 0) error(...); *address = 1; } for(address = lowest; address <= highestl; address++) { if(*address != 1) error(...); *address = 0; } There exists an even more improved version of MATS, called MATS++. It is described as the march { up(w0); up(r0,w1,r1); down(r1,w0,r0) }. MATS++ takes 7n accesses. In addition to detecting all SA faults and all AF faults it detects all transition faults. An algorithm that detects the simple coupling faults where only two cells are involved is MARCH-B: { up(w0); up(r0,w1,r1,w0,r0,w1); up(r1,w0,w1); down(r1,w0,w1,w0); down(r0,w1,w0) } This test takes 17n accesses. As you can see, several writes to the same memory location are performed, without any read operation, or accesses to other memory locations. There are tests that can detect even more complex coupling faults, but these cannot be described as a march. These tests are of order n*log(n) (or worse). I won't describe them here. Although MARCH-B does not detect all complex coupling faults, it will detect the vast majority of them. The algorithms described above apply to 1-bit wide memory. Computers usually access memory in bytes, words, or longwords. For these cases the same algorithms can be used, if you replace 1 in the C code by -1 (which expands to a byte/word/longword with all bits set). To detect shorts between the bits in a byte/word/longword you can precede the test with a few writes and reads to one memory location with a few different values. A good set of values to use for byte oriented memories is: binary hex 11110000 F0 11001100 CC 10101010 AA These patterns detect shorts between any two data lines. For word or longword oriented memories you must double or quadruple the patterns and add 1 or 2 patterns at the top of the list. To detect all shorts to ground or vcc you must add a pattern consisting of all 1's and a pattern consisting of all 0's. This should be enough to get you started writing your own memory test programs. Only this time you should KNOW the faults it detects. You can even write a program that interprets marches in the notation used in this article. Although I tried to be very careful, it is possible that this article contains errors. Use the algorithms in this article at your own risk. -- _____ __ __ __ __ ____ _____ _____ ______ _____ _____ | _ \ | |/ || \| | / \ | _ \ | _ \ | ___|| _ \ / ___| | __/ _ | < | || || || __/ | __/ | >__ | < \__ \ |__| |_| |__|\__||__|\__| \____/ |__| |__| |______||__|\__||_____/ P. Knoppers, Delft Univ. of Technology, The Netherlands - knop@dutesta.UUCP