lethin@athena.mit.edu (Richard A Lethin) (12/15/88)
I'm curious about how people designing RISC chips and microprocessors
test them.  We're designing a small VLSI processor here and would like
to do it right.

There seem to be lots of great algorithms for strictly combinational
circuits, but when state gets added, the answer seems to be "make every
register shiftable" to allow the circuit to be analyzed as a
combinational circuit.

A RISC chip, with a bunch of on-chip registers, a TLB, pipeline
registers, a limited number of IO pins, and irregular logic would seem
to be a testability nightmare.  But gunking up a pipeline register to
make it shiftable seems drastic.  The state's not regular enough, like a
DRAM, to just run patterns, so how is it done?

How about testing the small, on-chip data cache?  That certainly isn't
going to be made shiftable...

What specific testability features are added to the chip?  LSSD?  OCMS?
Special opcodes?  RESET?  Special test pins or pads?

Are test vectors generated by hand?  If so, how long does that take?
And who does it?  If not, how are they generated?

How much coverage do you get?  Is the coverage satisfactory?

How long does it take to run the vectors on the chip?  What hardware do
you use?

-- Rich
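The "make every register shiftable" approach can be sketched abstractly.  This toy model (in Python, with invented names, nothing like a real LSSD implementation) shows the shift-in / capture / shift-out cycle a tester would drive:

```python
# Toy model of a full-scan register bank: in scan mode the state
# registers form one long shift register, so a tester can load
# arbitrary internal state, apply one functional clock, and shift the
# captured result back out for comparison with a simulated response.

class ScanChain:
    def __init__(self, nbits):
        self.bits = [0] * nbits

    def shift_in(self, pattern):
        # Serially load a test pattern; shifting it in back-to-front
        # leaves the chain holding exactly `pattern`.
        for b in reversed(pattern):
            self.bits = [b] + self.bits[:-1]

    def capture(self, logic):
        # One system clock in normal mode: the combinational logic's
        # response is captured back into the same registers.
        self.bits = logic(self.bits)

    def shift_out(self):
        # Unload the captured state; the chain is left cleared.
        out, self.bits = self.bits[:], [0] * len(self.bits)
        return out

# "Logic" that just inverts every bit:
chain = ScanChain(4)
chain.shift_in([1, 0, 1, 1])
chain.capture(lambda bits: [b ^ 1 for b in bits])
print(chain.shift_out())   # -> [0, 1, 0, 0]
```

The point of the trick: the tester only ever generates vectors for the combinational `logic`, which is exactly what the good combinational algorithms need.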
mark@mips.COM (Mark G. Johnson) (12/16/88)
In article <8453@bloom-beacon.MIT.EDU>, lethin@wheaties.ai.mit.edu
(Richard A Lethin) writes:
$ A RISC chip, with a bunch of on-chip registers, a TLB, pipeline
$ registers, a limited number of IO pins, and irregular logic would seem
$ to be a testability nightmare.  But gunking up a pipeline register to
$ make it shiftable seems drastic.  The state's not regular enough, like
$ a DRAM, to just run patterns, so, how is it done?
$
$ What specific testability features are added to the chip?  LSSD?
$ OCMS?  Special opcodes?  RESET?  Special test pins or pads?

Several of the RISC chips under construction in bipolar ECL technology
are using LSSD-like scan paths.

And people have this belief, founded or unfounded, that however
difficult it is to test a RISC chip, it's *more* difficult to test a
CISC chip having an equal number of circuits.  You know, simplicity
breeds observability.  :-)
-- 
 -- Mark Johnson
MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
...!decwrl!mips!mark   (408) 991-0208
aglew@mcdurb.Urbana.Gould.COM (12/17/88)
..> Richard A. Lethin of MIT asks about testing of VLSI RISCs
..> and Mark Johnson of MIPS makes some comments.

Talking about testing, I finally think that I have figured out one of
the things that has been bothering me about hardware testability
methodology.

Most VLSI test methodology seems oriented towards detecting
*implementation* or *fabrication* errors, not *design* errors.  I.e.,
the tests look for bad transistors, or mis-wirings; they don't look for
adherence to higher-level specs.

When people talk about test coverage, they mean test coverage over a
limited space of implementation and fabrication errors, not over the
much larger space of design errors.
seeger@beach.cis.ufl.edu (F. L. Charles Seeger III) (12/17/88)
In article <28200252@mcdurb> aglew@mcdurb.Urbana.Gould.COM writes:
|
|..> Richard A. Lethin of MIT asks about testing of VLSI RISCs
|..> and Mark Johnson of MIPS makes some comments.
|
|Talking about testing, I finally think that I have figured out
|one of the things that has been bothering me about hardware
|testability methodology.
|   Most VLSI test methodology seems oriented towards detecting
|*implementation* or *fabrication* errors, not *design* errors.
|I.e., the tests look for bad transistors, or mis-wirings; they
|don't look for adherence to higher-level specs.
|   When people talk about test coverage, they mean test coverage
|over a limited space of implementation and fabrication errors,
|not over the much larger space of design errors.

Why does that bother you?  Design errors are supposed to be caught
during the design phase, where the design can be simulated with all
nodes equally observable.  A physical test is only concerned with
testing that individual part, not the design of the part.  Very few
nodes of the physical part are directly observable, which makes this
kind of test most unattractive for testing/debugging the design.

Test coverage is usually computed based on the percentage of nodes where
stuck-at faults can be detected.  Of course, these are only the grossest
faults, and there is a much larger space of more subtle physical faults.
However, more extensive testing of parts is extremely expensive.  This
sort of thing is done when new processing technology is developed, and
is to some extent done during development of new parts, but production
testing and characterization of parts must be affordable and fast.

Apologies in advance, if I have misunderstood your posting.
--
  Charles Seeger            216 Larsen Hall
  Electrical Engineering    University of Florida
  seeger@iec.ufl.edu        Gainesville, FL 32611
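The stuck-at coverage computation Seeger describes can be made concrete with a toy single-stuck-at fault simulator.  The two-gate circuit and the vector set below are made up for illustration, not taken from any real part:

```python
# Toy single-stuck-at fault simulator for a made-up two-gate circuit:
#   n1 = a AND b;  out = n1 OR c
# Coverage = fraction of stuck-at faults that some vector detects.

NODES = ['a', 'b', 'c', 'n1', 'out']

def simulate(a, b, c, fault=None):
    # `fault` is (node_name, stuck_value), or None for the good circuit.
    def v(name, val):
        return fault[1] if fault and fault[0] == name else val
    a, b, c = v('a', a), v('b', b), v('c', c)
    n1 = v('n1', a & b)
    return v('out', n1 | c)

# A deliberately incomplete vector set, to show coverage below 100%.
vectors = [(1, 1, 0), (0, 1, 0), (0, 0, 1)]
faults = [(n, s) for n in NODES for s in (0, 1)]
detected = [f for f in faults
            if any(simulate(*vec, fault=f) != simulate(*vec)
                   for vec in vectors)]
print(f"coverage: {len(detected)}/{len(faults)}")   # -> coverage: 9/10
# The escape is b stuck-at-1: no vector sets b=0 with a=1 and c=0,
# so the fault is never both activated and observable at `out`.
```

Note that a 100%-covered circuit can still be a wrong *design*; every fault here is measured against the schematic, which is exactly the earlier poster's complaint.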
lethin@athena.mit.edu (Richard A Lethin) (12/17/88)
In article <28200252@mcdurb> aglew@mcdurb.Urbana.Gould.COM writes:
>
>..> Richard A. Lethin of MIT asks about testing of VLSI RISCs
>..> and Mark Johnson of MIPS makes some comments.
>
>Talking about testing, I finally think that I have figured out
>one of the things that has been bothering me about hardware
>testability methodology.
>   Most VLSI test methodology seems oriented towards detecting
>*implementation* or *fabrication* errors, not *design* errors.
>I.e., the tests look for bad transistors, or mis-wirings; they
>don't look for adherence to higher-level specs.
>   When people talk about test coverage, they mean test coverage
>over a limited space of implementation and fabrication errors,
>not over the much larger space of design errors.

Doesn't that depend on the context?  If you're designing a new computer,
you've got at least three sets of tests to develop:

	"Manufacturing Tests"
	"Validation Tests"
	"Diagnostic Tests"

with different time and coverage constraints.

Most VLSI test methodologies seem geared toward getting the most
coverage with the smallest set of test vectors, primarily because VLSI
test equipment is very expensive -- they're manufacturing tests.

Diagnostics and validation suites are run in a context where time is
free, but coverage has to be TOTAL.  Diagnostics have the additional
constraint that they ought to be able to identify the faulty component.

Are there some principles people follow when designing diagnostics and
validation tests, or is everything bound to be ad hoc?  And how do
people go about finding design errors anyway?

Would any real-life diagnostic engineers care to comment?
aglew@mcdurb.Urbana.Gould.COM (12/18/88)
>/* Written 6:39 pm Dec 16, 1988 by seeger@beach.cis.ufl.edu in mcdurb:comp.arch */
>In article <28200252@mcdurb> aglew@mcdurb.Urbana.Gould.COM writes:
>|
>|..> Richard A. Lethin of MIT asks about testing of VLSI RISCs
>|..> and Mark Johnson of MIPS makes some comments.
>|
>|Talking about testing, I finally think that I have figured out
>|one of the things that has been bothering me about hardware
>|testability methodology.
>|   Most VLSI test methodology seems oriented towards detecting
>|*implementation* or *fabrication* errors, not *design* errors.
>|I.e., the tests look for bad transistors, or mis-wirings; they
>|don't look for adherence to higher-level specs.
>|   When people talk about test coverage, they mean test coverage
>|over a limited space of implementation and fabrication errors,
>|not over the much larger space of design errors.
>
>Why does that bother you?  Design errors are supposed to be caught
>during the design phase, where the design can be simulated with all
>nodes equally observable.  A physical test is only concerned with
>testing that individual part, not the design of the part.  Very
>few nodes of the physical part are directly observable, which makes
>this kind of test most unattractive for testing/debugging the design.
>
> ...
>
>Apologies in advance, if I have misunderstood your posting.
>
>  Charles Seeger            216 Larsen Hall
>  Electrical Engineering    University of Florida
>  seeger@iec.ufl.edu        Gainesville, FL 32611

No need to apologize - you have understood the essence of my posting.

Why does a testing methodology that tests for implementation errors
rather than design errors bother me?

First, because when I am designing a circuit I would like guidance as to
what test vectors to simulate to test design correctness.  Usually, of
course, I have a separate software model to compare the simulation
results to, but I would still like a methodology to help me choose the
conditions that I run through both simulations.

Testability tools that I know of take my circuit implementation, and
give me vectors that cover it.  Usually, running these through the
circuit simulator and the higher-level simulator catches trivial design
errors well enough, but I know of at least one situation where they did
not.  So, I am left with all the standards: test boundary conditions, a
few random patterns, etc., much like in software testing (see next
point), and with no idea of "sufficiency".

I suppose what I really want is a test generator that can take a
high-level spec and a low-level spec, and generate tests that make you
reasonably confident that the low-level design actually implements the
high-level design.  Whereas the test tools that I know of take a
low-level schematic and generate tests that make you reasonably certain
that a physical implementation actually implements that schematic.  You
could get picky, and say that the low-level design is actually an
"implementation" of the high-level design, but how does this help me
gain any more confidence in the correctness of my design?

Second, I get tired of hearing people say that VLSI hardware test
methodology is far more advanced than software test methodology.
Software "implementation" is what would be called "design" in a VLSI
design system, so software testing is mainly design testing, not
implementation testing.  Implementation testing would correspond to
testing for correct generation of code by the compiler; this needs to be
done, but software testing goes much further than that.  VLSI design
correctness testing seems to be at the same level as (I would almost say
a little bit behind) software design testing methodology.

I hope that I am terribly mistaken.  Can anyone give me references for
VLSI testing that cover design correctness, not fabrication?

Andy "Krazy" Glew   aglew@urbana.mcd.mot.com   uunet!uiucdcs!mcdurb!aglew
Motorola Microcomputer Division, Champaign-Urbana Design Center
1101 E. University, Urbana, Illinois 61801, USA.

My opinions are my own, and are not the opinions of my employer, or any
other organisation.  I indicate my company only so that the reader may
account for any possible bias I may have towards our products.
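A miniature of the design-correctness check Glew is asking for: drive the same vectors through a high-level spec and a gate-level model and diff the results.  Both models below are hypothetical stand-ins (a one-bit full adder), not any real tool flow:

```python
# Design-correctness check in miniature: the same vectors go through a
# high-level spec and a gate-level "implementation"; any mismatch is a
# *design* error of the kind stuck-at vectors never target.
from itertools import product

def spec_adder(a, b, cin):
    # High-level spec: the arithmetic definition of a full adder.
    s = a + b + cin
    return s & 1, s >> 1              # (sum, carry_out)

def gate_adder(a, b, cin):
    # "Implementation": the usual XOR/AND/OR gate network.
    p = a ^ b
    return p ^ cin, (a & b) | (p & cin)

# Exhaustive comparison is feasible for 3 inputs; choosing vectors for
# a real design is exactly the methodology gap complained about above.
mismatches = [v for v in product((0, 1), repeat=3)
              if spec_adder(*v) != gate_adder(*v)]
print("design check:", "PASS" if not mismatches else mismatches)
```

For anything bigger than a toy, exhaustive comparison is out, and the "sufficiency" question Glew raises comes right back.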
friedl@vsi.COM (Stephen J. Friedl) (12/18/88)
In article <28200252@mcdurb>, aglew@mcdurb.Urbana.Gould.COM writes:
> Most VLSI test methodology seems oriented towards detecting
> *implementation* or *fabrication* errors, not *design* errors.

Design errors are meant to be caught by customers :-)

     Steve
-- 
Stephen J. Friedl        3B2-kind-of-guy            friedl@vsi.com
V-Systems, Inc.                                  attmail!vsi!friedl
Santa Ana, CA  USA       +1 714 545 6442      {backbones}!vsi!friedl
Nancy Reagan on my new '89 Mustang GT Convertible: "Just say WOW!"
kenm@sci.UUCP (Ken McElvain) (12/18/88)
In article <8453@bloom-beacon.MIT.EDU>, lethin@athena.mit.edu (Richard A
Lethin) writes:
> 
> I'm curious about how people designing RISC chips and microprocessors
> test them.  We're designing a small VLSI processor here and would like
> to do it right.
> 
> There seem to be lots of great algorithms for strictly combinational
> circuits, but when state gets added, the answer seems to be "make
> every register shiftable" to allow the circuit to be analyzed
> as a combinational circuit.

There are a few available programs that can cope with sequential circuit
test generation.  We supply one, and to be fair, Zycad and
HHB(->Cadnetix->Daisy???) also have products for sequential test
generation.  Research in this area is much lighter than in combinational
test generation, but the approaches are much more varied.  Differences
in capability of available programs lie in the size of the circuits
handled, effectiveness in dealing with sequential behaviour, and
available modeling primitives (usually gates and latches; we support
tristate and precharge bus drivers and RAM and ROM primitives).  I like
to believe that mine is the best.

> A RISC chip, with a bunch of on-chip registers, a TLB, pipeline
> registers, a limited number of IO pins, and irregular logic would seem
> to be a testability nightmare.  But gunking up a pipeline register to
> make it shiftable seems drastic.  The state's not regular enough, like
> a DRAM, to just run patterns, so, how is it done?

The way we do it is to turn the test generator loose on the chip
overnight with no design for test.  In the morning we look at what parts
of the chip weren't tested.  We then determine, via a combination of
coverage percentages, testability analysis, and brain power, what the
problem was, and then we set up a control file that tells the test
generation program which internal nodes to treat as external inputs and
outputs.  This is far less overhead than making every latch scannable.
You can usually keep the intrusion off of the critical path as well.

> How about testing the small, on-chip data cache?  That certainly
> isn't going to be made shiftable...

You should be able to isolate it from the rest of the chip with a
boundary scan path.

> What specific testability features are added to the chip?  LSSD?
> OCMS?  Special opcodes?  RESET?  Special test pins or pads?
> 
> Are test vectors generated by hand?  If so, how long does that take?
> And who does it?  If not, how are they generated?

Generation of test vectors by hand has been known to take longer than
the design, especially if you want high test coverage.  Fault simulation
time can eat you alive.

> How much coverage do you get?  Is the coverage satisfactory?

One thing to remember is that 99% fault coverage (ignoring the question
of fault models) does not mean that 99% of the chips that pass your test
are working parts.  Chip size and defect density also affect the answer.
To see this, think of the remaining 1% of the faults as an area A.  If
the defect density D is such that A * D = .1, then 10% of the passing
chips will have a fault in the untested area.  Realistically then, large
chips need to have better fault coverage than small ones to achieve the
same defect rate in tested chips.

> How long does it take to run the vectors on the chip?  What hardware
> do you use?
> 
> -- Rich

A couple of references if you are curious.  The basic idea in [1] is to
analyze the circuit and derive a partial state transition table.  [2] is
the basis for both Zycad's and HHB's products.  The idea is to start at
the outputs and work back towards the fault, maintaining a current and
previous instance of the circuit.  EBT stands for Extended Back Trace.

[1] Hi-Keung Tony Ma, Srinivas Devadas, A. Richard Newton, and Alberto
    Sangiovanni-Vincentelli, "Test Generation for Sequential Circuits,"
    IEEE Trans. Computer-Aided Design, Oct. 1988.

[2] R. Marlett, "EBT: A Comprehensive Test Generation Technique for
    Highly Sequential Circuits," in Proc. 15th Design Automation
    Conference, June 1978, pp. 332-338.
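McElvain's coverage-vs-defect-rate arithmetic, spelled out.  The Williams-Brown defect-level formula used below is an addition for illustration (it is not from his posting), relating shipped defect rate to yield and coverage:

```python
# The A * D arithmetic above, generalized by the Williams-Brown
# defect-level formula DL = 1 - Y**(1 - T): the fraction of chips that
# PASS the test but still contain a defect, given process yield Y and
# fault coverage T.

def defect_level(yield_y, coverage_t):
    # Fraction of passing chips that are actually defective.
    return 1.0 - yield_y ** (1.0 - coverage_t)

# Larger chips have lower yield, so they need higher coverage to ship
# the same defect rate -- McElvain's point:
for y in (0.9, 0.5, 0.2):
    for t in (0.90, 0.99):
        print(f"Y={y:.1f}  T={t:.2f}  ->  DL={defect_level(y, t):.3%}")
```

At 100% coverage the defect level is zero regardless of yield; at any lower coverage, the lower-yield (bigger) chip ships more defective parts.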
wad@houxv.UUCP (R.WADSACK) (12/20/88)
> you've got at least three sets of tests to develop:
> 
> 	"Manufacturing Tests"
> 	"Validation Tests"
> 	"Diagnostic Tests"
> 
> with different time and coverage constraints.
> 
> And how do people go about finding design errors anyway?
> 
> Would any real-life diagnostic engineers care to comment?

I wrote up my experiences with the AT&T WE 32100 CPUs in an article in
the August 1984 issue of "IEEE Design and Test of Computers" (pp.
66-75).  It's titled "Design Verification and Testing of the WE 32100
CPUs".  It treats the differences between DV tests, silicon tests, and
diagnostic tests.

I have also done equivalent work on the Bell Labs CRISP "RISC" CPU.  The
concepts and approaches are much the same regardless of whether the chip
at hand is CISC or RISC.

Ronald L. Wadsack
AT&T Bell Labs
Holmdel, NJ
fwb@demon.siemens.com (Frederic W. Brehm) (12/20/88)
In article <28200252@mcdurb> aglew@mcdurb.Urbana.Gould.COM writes:
>Talking about testing, I finally think that I have figured out
>one of the things that has been bothering me about hardware
>testability methodology.
>...
>   When people talk about test coverage, they mean test coverage
>over a limited space of implementation and fabrication errors,
>not over the much larger space of design errors.

Hmmmm.  Are you suggesting that each and every part out of the fab be
tested for DESIGN errors?  I hope not.  Tests for design errors should
be done before the design is committed to high-volume production.
Design tests are usually done by running simulation software (e.g.,
Spice) and testing the first samples off the production line in high-end
testers and prototype circuits.

Fred
----------------------------------------------------------------------------
Frederic W. Brehm                  phone:    (609)-734-3336
Siemens Corporate Research         FAX:      (609)-734-6565
755 College Road East              uucp:     princeton!siemens!demon!fwb
Princeton, NJ 08540                internet: fwb@demon.siemens.com
"From there to here, from here to there, funny things are everywhere."
                                                           - Dr. Seuss