rick@nyit.UUCP (Rick Ace) (11/15/85)
Back in October, Chris Torek posted the following description of a bug
that appeared in a Vax-11/785 CPU:

> On our 785, this bug takes the form of computing the value 0x7dfffffc
> on `extzv $0,$4,-4(r0),r0' instructions when r0 has the value 0x80000000.

Well, last week one of our 780s got a 785 upgrade, and it started
crashing with "panic: Segmentation fault". Investigation revealed
that the crash occurred at the very same EXTZV instruction that
Chris's 785 was stumbling over, but our CPU mangled a different bit:
the effective address calculated by the CPU in our case was 0x5ffffffc,
which caused 4.2bsd to die hard and fast. It was rather odd, though,
because most of the time the kernel sailed right through this
instruction without a hitch.

I wrote a small test program to exercise the bug:

	main()
	{
		while (1) {
			/* base register for the -4(r0) operand */
			asm("	movl	$0x80000000,r0");
			/* correct operand address is 0x7ffffffc */
			asm("	extzv	$0,$4,-4(r0),r0");
		}
	}

Sometimes the program would dump core right away; other times it took
a few minutes.

I made the point to Marc Merrill, our Field Service technician, who
agreed that *something* was amiss. He ran diagnostics, but they found
nothing. Marc then swapped CPU boards between the failing 785 and
another working 785 until the problem moved. The failing board was the
M7468 data path module, and replacing it with a spare from the F-S
office put us back on the air.

DEC - if you're listening - maybe you should beef up your FA&T
diagnostics to catch this specific problem. (It didn't show up
under VMS!)

-----
Rick Ace
Computer Graphics Laboratory
New York Institute of Technology
Old Westbury, NY 11568
(516) 686-7644
{decvax,seismo}!philabs!nyit!rick