arne@hpserv1.uit.no (Arne Helme) (08/10/90)
Yesterday I saw a program on comp.lang.c that claimed it could crash certain RISC architectures from user mode. I tried it on our DECsystem 5200 running Ultrix V3.1A and on a Sun 4 SPARCstation running SunOS 4.0.3c. Both machines crashed.

The program was quite simple. It allocated a chunk of memory and filled it with garbage. Then it tried to execute this garbage as instructions.

I thought this should be impossible! Would anyone out there like to comment on the strange behaviour I have observed on our machines?

-- Arne Helme --

Arne Helme, science assistant    Email: arne@sfd.uit.no
Computer Science Department      "Going on means going far. Going far
University of Tromsoe             means returning." (Tao Te Ching)
N-9000 Tromsoe, NORWAY           Phone: +47 83 44035
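For readers who have not seen the program, here is a minimal sketch of the technique described above: fill a buffer with garbage, then jump into it. It is not the original crashme.c. The names `fill_garbage` and `run_garbage`, the LCG constants, and the two-second timeout are all assumptions made for illustration. The sketch assumes a POSIX system and runs the garbage in a forked child, so the parent can report how the child ended.

```c
/* Sketch only: NOT the original crashme.c. All names are hypothetical. */
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fill buf with pseudo-random bytes. A small LCG is used so that the
   same seed always produces the same "garbage". */
void fill_garbage(unsigned char *buf, size_t len, uint32_t seed)
{
    uint32_t state = seed;
    for (size_t i = 0; i < len; i++) {
        state = state * 1103515245u + 12345u;
        buf[i] = (unsigned char)(state >> 16);
    }
}

/* Execute len bytes of garbage in a child process.
   Returns the number of the signal that killed the child, 0 if the
   child exited on its own, or -1 if fork/waitpid failed. */
int run_garbage(size_t len, uint32_t seed)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        unsigned char *buf = malloc(len);
        if (buf == NULL)
            _exit(1);
        fill_garbage(buf, len, seed);
        alarm(2); /* assumed timeout: stop a child that loops forever */
        /* Converting a data pointer to a function pointer is not
           strictly ISO C, but it is what the technique requires. */
        void (*entry)(void) = (void (*)(void))buf;
        entry();
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling `run_garbage(4096, 1)` from a `main` and printing the result shows how the child died. On a correctly behaving OS only the child should be affected, which is the behaviour the VAX/VMS and SUN-3 reports describe. On many current systems heap memory is not executable, so the child would typically be killed on the first instruction fetch.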
guy@auspex.auspex.com (Guy Harris) (08/16/90)
>I thought this should be impossible!

Yeah, it *should* be impossible to crash the OS from non-privileged user-mode code, but sometimes there are bugs in the OS.

>Would anyone out there like to comment the strange behaviour I have
>observed on our machines?

From a quick look at a crash dump on an SS1 running 4.0.3c, my suspicion is that the kernel code for handling the floating-point unit isn't being careful enough in looking at the floating-point state; it appears to be handing a bad pointer to another routine that calls the procedure to which that pointer is supposed to point, only it points into the nether reaches of Hell instead.

In other words, it doesn't appear to bear out the conclusions the original poster of the program, in "comp.os.vms", drew:

    OK. Here is a quick summary of the HOW TO CRASH A RISC machine from a USER-MODE program test. Reports have arrived that all of these machines can be crashed using CRASHME.C: IBM RT, MIPS, DECSTATION 5000, SPARC. On the two CISC architectures tried, VAX/VMS and SUN-3, the program either completed or exited with a core or register dump, as expected.

    Some background/motivation. My experience with microcode programming taught me that some sequences of MICROINSTRUCTIONS could wedge or jam the hardware in such a way that recovery was impossible without a reboot of some kind. The RISC architectures have some of the same properties of MICROCODE in that certain instruction sequences have UNDEFINED behavior.

    Now one of the great costs in a CISC machine is usually the trouble the designers go through to make sure that every instruction returns the MACHINE to a KNOWN STATE. That way the behavior of every instruction can be well defined, tested, and documented, individually verified and tested, and by simple induction be valid for arbitrary SEQUENCES of instructions. (In general). Engineers of RISC machines don't bother to do this, which is one of the reasons they are CHEAPER (the hardware, not the engineers).

    The problem of proving that an arbitrary sequence of instructions "N" long will not crash the machine is much more costly if N > 1. (To say the least, if you know anything about mathematical logic). If there are M instructions (and M is probably around 1 BILLION) then there may be about M^N cases to check. And what is N? For a classic CISC machine a price is paid to make N = 1, or at least small. But for a RISC machine, might N be 10 or more?

    Anyway, no need to make too big a deal about this. Probably all the vendors can fix things in software alone, and certainly CISC chips with bugs in them have been shipped in the past too. Just a reminder though. There is no free lunch. There really is a trade-off between ROBUSTNESS-PRICE/PERFORMANCE-TIME_TO_MARKET.

The *only* way in which you *might* be able to agree with this as being the source of the problem - at least in the SPARC case, and maybe in the MIPS case as well - would be to claim that the floating-point support software was part of the implementation of the architecture, and that the checks he alleges are made for CISC but not RISC machines weren't made in the software part of the architecture.

It certainly doesn't seem to be the case that the *processor* gets stuck in some state "in such a way that recovery was impossible without a reboot of some kind."
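To put a number on the M^N estimate in the quoted article, here is a small sketch that counts instruction sequences and reports when the count no longer fits in 64 bits. The function name `count_sequences` is hypothetical, and M = 1,000,000,000 is simply the figure the quoted poster suggests, not a measured value.

```c
/* Sketch only: illustrates the M^N growth claimed in the quoted article. */
#include <stdbool.h>
#include <stdint.h>

/* Compute m^n, the number of sequences of n instructions drawn from m
   distinct instruction encodings. Stores the result in *out and returns
   true, or returns false if the count does not fit in 64 bits. */
bool count_sequences(uint64_t m, unsigned n, uint64_t *out)
{
    uint64_t total = 1;
    for (unsigned i = 0; i < n; i++) {
        /* Check before multiplying so the product cannot wrap around. */
        if (m != 0 && total > UINT64_MAX / m)
            return false;
        total *= m;
    }
    *out = total;
    return true;
}
```

With M = 1,000,000,000 the count is 10^9 for N = 1 and 10^18 for N = 2, and it already overflows a 64-bit counter at N = 3. That is the scale of exhaustive testing the quoted argument has in mind, whatever one thinks of its conclusion.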
bwong@cbnewsc.att.com (bruce.f.wong) (08/16/90)
In article <3899@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>>I thought this should be impossible!
>
>Yeah, it *should* be impossible to crash the OS from non-privileged
>user-mode code, but sometimes there are bugs in the OS.

I gave up trying to crash a SUN4/40 running SunOS 4.1 with crashme.c. I tried about 30 different combinations of arguments and got bored when it didn't crash. Then I tried crashme with the arguments suggested in the original article on a SUN4/110 running SunOS 4.0, and it crashed immediately.

My conclusion is that the OS is to blame, and this is reinforced by a posting that reported no success in crashing two different RISC machines running the Mach OS.

--
Bruce F. Wong        ATT Bell Laboratories
att!iexist!bwong     200 Park Plaza, Rm 1B-232
708-713-5111         Naperville, Ill 60566-7050 USA
my@dtg.nsc.com (Michael Yip) (08/17/90)
I crashed my machine here! And I also crashed some other ones.

Machines that I crashed:
    SUN4/160(?)
    SparcStation 1+

-- Mike