martin@aster.UCAR.EDU (Charlie Martin) (11/18/89)
Has anybody devised a method for remotely booting (ie. over the network) a vxWorks system which has crashed?
sbrandt@herds.SRC.Honeywell.COM (Scott Brandt) (12/06/89)
In article <5321@ncar.ucar.edu> martin@aster.UCAR.EDU (Charlie Martin) writes:
>Has anybody devised a method for remotely booting
>(ie. over the network) a vxWorks system which has crashed?

In answer to your question:

If the system has crashed, it would probably be impossible to talk to it to tell it to reboot. Therefore you would need some external way to toggle the reset line in the chassis. One of my hardware-hacking coworkers connected a device to one of our vxWorks systems that toggles the reset line if the phone rings more than 8 times in the lab. This is hardly an elegant solution, but in order to remotely reboot the system you will need something analogous to this. If you are creative you can probably put some sort of device on whichever host your vxWorks system boots off of to accomplish the same thing.

Scott Brandt
wbrown@beva.bev.lbl.gov (Bill Brown) (12/06/89)
In article <41645@srcsip.UUCP> sbrandt@src.honeywell.com (Scott Brandt) writes:
>In article <5321@ncar.ucar.edu> martin@aster.UCAR.EDU (Charlie Martin) writes:
>>Has anybody devised a method for remotely booting
>>(ie. over the network) a vxWorks system which has crashed?
>
>In answer to your question:
> If the system has crashed, it would probably be impossible to talk to
>it to tell it to reboot. Therefore you would need some external way to toggle
>the reset line in the chassis. One of my hardware-hacking coworkers
>connected a device to one of our vxWorks systems that toggles the reset line
>if the phone rings more than 8 times in the lab. This is hardly an elegant
>solution, but in order to remotely reboot the system you will need something
>analogous to this. If you are creative you can probably put some sort of
>device on whichever host your vxWorks system boots off of to accomplish
>the same thing.

We've developed a dead-man system that seems to work fairly well. It's our intent to release it to the archive when we're sure it's working. It uses the built-in deadman timer on the MVME-147 and is also being moved to the '133 and '135. The scheme depends on the task scheduling invoking a function that keeps the dead-man from timing out.

The problem is that all 3 boards have different hardware, so most of the code is system-dependent. Probably we'll add stuff to the vxWorks system-dependent code.

I certainly wouldn't want it to stand between me and the end of the world, but it does seem to take care of quite a few system hangs.

-bill
wlbrown@lbl.gov

Disclaimer: These opinions are my own and have nothing to do with the official policy or management of L.B.L, who probably couldn't care less about employees who play with trains.
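[Editor's sketch] The post does not show the code, so the following is only an illustration of the structure it describes: a single function called from the scheduling path, with the board-dependent part isolated behind a function pointer. Every name here (`deadman_select`, `deadman_sched_hook`, the `kick_*` stubs and the `kicks` counters) is hypothetical, and the stubs just count calls where real code would write to each board's timer hardware.

```c
#include <stddef.h>

/* The three boards named in the post. */
typedef enum { BOARD_MVME147, BOARD_MVME133, BOARD_MVME135, BOARD_COUNT } board_t;

typedef void (*deadman_kick_fn)(void);

/* Hypothetical stand-ins: real versions would touch board-specific
 * timer registers. Here they only count how often they are called. */
static unsigned kicks[BOARD_COUNT];
static void kick_147(void) { kicks[BOARD_MVME147]++; }
static void kick_133(void) { kicks[BOARD_MVME133]++; }
static void kick_135(void) { kicks[BOARD_MVME135]++; }

/* One entry per board: the only system-dependent part of the scheme. */
static const deadman_kick_fn kick_table[BOARD_COUNT] = {
    [BOARD_MVME147] = kick_147,
    [BOARD_MVME133] = kick_133,
    [BOARD_MVME135] = kick_135,
};

static deadman_kick_fn active_kick = NULL;

/* Choose the kick routine for this board. Returns 0 on success,
 * -1 if the board id is out of range. */
int deadman_select(board_t board)
{
    if ((unsigned)board >= BOARD_COUNT)
        return -1;
    active_kick = kick_table[board];
    return 0;
}

/* Board-independent entry point, meant to be invoked from the
 * scheduling path. Does nothing until a board has been selected. */
void deadman_sched_hook(void)
{
    if (active_kick != NULL)
        active_kick();
}
```

If scheduling stops, `deadman_sched_hook` stops being called and the hardware timer would run out. How the hook is attached to the scheduler (for example, a task switch hook) is not stated in the post and is left open here.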
biocca@bevb.bev.lbl.gov (Alan Biocca) (12/07/89)
In article <5321@ncar.ucar.edu> martin@aster.UCAR.EDU (Charlie Martin) writes:
>Has anybody devised a method for remotely booting
>(ie. over the network) a vxWorks system which has crashed?

This is not quite what you are asking, but may be of use:

We have had good success using the deadman timer available on many CPU boards. Essentially, a monitor task periodically checks on the health of the system (in whatever way you like) and resets the deadman timer. This timer produces a reset when it times out, causing the crate to reboot automatically when crashes occur.

In the Motorola 147 we are using, the deadman timer has a very short maximum timeout, and we found that a regular task could not reliably reset it often enough. We added some code to the clock interrupt handler to manage a longer software timer and to keep resetting the hardware deadman until that longer timer expires.

We are planning to put this in the vxWorks archive soon.

Alan K Biocca
Deputy Project Manager
BEVALAC Controls Group
Lawrence Berkeley Laboratory
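[Editor's sketch] The two-level timer described above might look roughly like this. It is not the LBL code, which the post does not include. All names (`soft_deadman_t`, `soft_deadman_tick`, `soft_deadman_checkin`, `hw_deadman_kick`) are hypothetical, and the hardware access is replaced by a counter so the logic can be exercised on any host.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the write that restarts the board's
 * short hardware deadman timer. Here it only counts calls. */
static unsigned long hw_kick_count;
static void hw_deadman_kick(void) { hw_kick_count++; }

typedef struct {
    volatile uint32_t ticks_left; /* software timeout remaining, in clock ticks */
    uint32_t reload;              /* value restored by each health check-in */
} soft_deadman_t;

void soft_deadman_init(soft_deadman_t *d, uint32_t timeout_ticks)
{
    d->reload = timeout_ticks;
    d->ticks_left = timeout_ticks;
}

/* Called by the monitor task each time its health check passes. */
void soft_deadman_checkin(soft_deadman_t *d)
{
    d->ticks_left = d->reload;
}

/* Called once per clock tick from the clock interrupt handler.
 * While the software timer is running, the hardware deadman is kicked.
 * Once it reaches zero the kicks stop, so on real hardware the short
 * deadman would time out and reset the board.
 * Returns true if the hardware was kicked on this tick. */
bool soft_deadman_tick(soft_deadman_t *d)
{
    if (d->ticks_left == 0)
        return false;
    d->ticks_left--;
    hw_deadman_kick();
    return true;
}
```

With a timeout of N ticks, the monitor task has N clock ticks between check-ins instead of the hardware's short limit. In this host-side sketch a late check-in revives the timer; on a real board the reset would already have happened by then.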