[comp.unix.wizards] SIGPWR signal in system v

ohurley@cs.tcd.ie (Oisin Hurley) (05/06/91)

Does anybody out there have information on the SYS V SIGPWR signal? The man
says that this signal occurs if there's a power failure. I have a couple of
questions which I haven't been able to answer myself:

1. When power goes, is this signal sent to every process currently running?

2. How far does the power have to drop before the signal is activated?

3. How long does the power have to stay at that level to ensure activation?
	(how about transients, etc.)

4. Is the signal generated on the mboard or is there a line from the psu?

5. Has anybody used it - is it useful? Is there enough time to sync disks
   upon receipt of SIGPWR (I presume there's hardly time for anything)?
   
6. Why is it there?


Any help would be much appreciated.
thanks in advance
	O[-<

---
ombhurley@cs.tcd.ie
Oisin Hurley, Dept. of Comp. Sci., TCD, Dublin 2, Ireland
---

dennis@nebulus.ampr.org (Dennis S. Breckenridge) (05/06/91)

ohurley@cs.tcd.ie (Oisin Hurley) writes:

>1. When power goes, is this signal sent to every process currently running?

No, only processes that listen for it. Init is one of those processes.

>2. How far does the power have to drop before the signal is activated?

Specs say that a frequency change of 5 Hz or a voltage level outside the
specified range (on a 3B2 it's about 92 volts) triggers it.

>3. How long does the power have to stay at that level to ensure activation?
>	(how about transients, etc.)

If the frequency changes, it is a very good sign that you are really going
to have a power failure. With the tolerance in the power supply, 94-130 
volts, anything outside that range is a real drop. Once the signal is caught,
init starts the shutdown process.

>4. Is the signal generated on the mboard or is there a line from the psu?

The power supply is responsible.

>5. Has anybody used it - is it useful? Is there enough time to sync disks
>   upon receipt of SIGPWR (I presume there's hardly time for anything)?
>   
>6. Why is it there?

The 3B2 series of machines use the power on/off switch to send this signal.
Other than that, I don't know. I have considered tying my UPS alarm into
this bit and shutting down the system if the power drops. You're right about
the time. If the disks could write fast enough, UNIX could shut down
gracefully.
-- 
-----------------------------------------------------------------------------
    Dennis S. Breckenridge VE7TCP@VE7TCP [44.135.160.59]  dennis@nebulus.UUCP
Lately it occurs to me...what a long strange trip it's been. - Grateful Dead
-----------------------------------------------------------------------------

paul@prcrs.prc.com (Paul Hite) (05/06/91)

In article <1991May6.112253.5344@cs.tcd.ie>, ohurley@cs.tcd.ie (Oisin Hurley) writes:
> 
> Does anybody out there have information on the SYS V SIGPWR signal? The man
> says that this signal occurs if there's a power failure. 

Well, I have only seen one system, the HP 9000/800, that uses this.  The 800's
have a battery back-up that keeps main memory (only) alive while power is out.
(The battery is optional on some models.)  If power is restored before the 
battery dies, the system will recover.  First, some special entries in 
inittab get executed:
	pf::powerwait:/etc/powerfail >/dev/console 2>&1 #power fail routines

This reloads software into smart cards on our system.  After this is done,
SIGPWR is sent to every process, but the default on SIGPWR is to ignore it.
vi (for one) will catch the signal and repaint the screen.
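
A minimal sketch of the "catch SIGPWR and repaint" idea, in Python for
brevity. It assumes nothing about how vi actually does it: the `state`
dictionary, `on_sigpwr`, `install_sigpwr_handler` and `main_loop_step` are
hypothetical names for illustration. SIGPWR is looked up defensively because
not every platform defines it, and the default action differs between systems
(on Linux it terminates the process rather than being ignored).

```python
import signal

# SIGPWR is a System V signal and is not defined on every platform,
# so look it up defensively instead of assuming it exists.
SIGPWR = getattr(signal, "SIGPWR", None)

# Hypothetical application state; a real editor would keep this elsewhere.
state = {"redraw_needed": False}


def on_sigpwr(signum, frame):
    """Only record the event; the main loop does the actual repaint."""
    state["redraw_needed"] = True


def install_sigpwr_handler():
    """Install the handler if SIGPWR exists here. Returns True if installed."""
    if SIGPWR is None:
        return False
    signal.signal(SIGPWR, on_sigpwr)
    return True


def main_loop_step(repaint):
    """Call from the application's main loop; repaints once per SIGPWR."""
    if state["redraw_needed"]:
        state["redraw_needed"] = False
        repaint()
        return True
    return False
```

The handler only sets a flag so that the repaint itself runs in ordinary
program context rather than inside the signal handler.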

Paul Hite   PRC Realty Systems  McLean,Va   paul@prcrs.prc.com    (703) 556-2243
        You can't tell which way the train went by studying its tracks.

rwhite@nusdecs.uucp (Robert White) (05/07/91)

RE: SIGPWR questions....

On the systems I use (386 and 3B2) you need special hardware attached
to a UPS (uninterruptible power supply) in order for your system to
support SIGPWR.  You are exactly correct that there isn't any time
for anything to happen in software during a normal power failure.

If you don't have a built-in UPS, or an external UPS and a built-in
board, SIGPWR is never generated; the system simply dies.  You
know power-in=power-out, power-in=0 => activity=0 (etc.)

The typical mechanism (as implemented on a 3B2) goes like this.

The UPS powers the system and has "external alarm closures", which
are relays that close switches across known pins on a connector
on the side of the UPS.  A remote management board has two conductor
pairs that represent "loss of AC feed" (LOACF) and "low battery".
Continuity across a pair indicates the condition is present.  When a
power outage takes place, closure occurs across the LOACF pair.  A
sanity daemon (or other software hack, usually running as a normal
process) discovers the condition within a known period (granularity
usually about one minute, modified by your cron(1) period).  A
notification message, "Warning: System Power Fault" or some such, is
generated and the program goes to sleep for a set interval.  The
interval is arbitrary, chosen as approximately 3/4 of the available
battery time.  When the time has expired the program checks the
hardware to see if the power outage condition is still present
(clear the hardware flag and see if the hardware resets it).
If there is still no AC, the program initiates a "normal"
shutdown.
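
The monitoring pass described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: `ac_lost` and
`clear_flag` are hypothetical hooks standing in for whatever the real board
driver provides, and `check_power` and `grace_interval` are names invented
here, not part of any 3B2 software.

```python
import time

WARNING = "Warning: System Power Fault"


def grace_interval(battery_seconds, fraction=0.75):
    """Sleep time before rechecking: roughly 3/4 of the battery runtime."""
    return battery_seconds * fraction


def check_power(ac_lost, clear_flag, battery_seconds,
                warn=print, sleep=time.sleep):
    """One pass of the monitor.

    ac_lost() and clear_flag() are hypothetical hooks for the board driver.
    Returns "ok", "recovered", or "shutdown".
    """
    if not ac_lost():
        return "ok"
    warn(WARNING)
    sleep(grace_interval(battery_seconds))
    # Clear the latched flag; if the outage persists the hardware sets it again.
    clear_flag()
    if ac_lost():
        return "shutdown"  # the caller would start a normal shutdown here
    return "recovered"
```

Passing `warn` and `sleep` as parameters keeps the decision logic separate
from the hardware and the clock, so it can be exercised without a UPS.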

If the "low battery" condition becomes true while the LOACF is true,
the software will *only then* issue SIGPWR (eh, this is supposed to
be a "standard"; sometimes SIGPWR will be the "normal shutdown"
method mentioned above in the case of really clever/obscure system
drivers) to indicate that the AC is off and there is no way to predict
how long the UPS can keep the system up, so it is going down *NOW*
*ASAP* with only minimum preparation.  In a tightly implemented
environment the time between SIGPWR and a totally synced and
quiescent system should be about 15 seconds (e.g. SIGPWR, 3 sec,
SIGTERM, 5 sec, SIGKILL, 2 sec, sync(s) and umount(s), halt or
some such).  The SIGTERM is recommended if you have lots of programs
that will not catch SIGPWR, that will catch SIGTERM, and which
you *really* want to do their emergency shutdown routines [e.g.
database apps]; otherwise, just skip it and bail asap.
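
To make the timing concrete, here is a small sketch that turns the example
sequence into offsets measured from the moment SIGPWR is sent. The step list
comes from the figures in the paragraph above; `schedule` and
`without_sigterm` are hypothetical helper names, and dropping the SIGTERM
pause together with the step is an assumption about what "skip it" means.

```python
# Steps and the pause after each one, taken from the example sequence.
# The final step has no following pause.
STEPS = [
    ("SIGPWR", 3),
    ("SIGTERM", 5),
    ("SIGKILL", 2),
    ("sync and umount", 0),
]


def schedule(steps):
    """Return (offset_seconds, action) pairs, offsets measured from SIGPWR."""
    timeline = []
    offset = 0
    for action, pause_after in steps:
        timeline.append((offset, action))
        offset += pause_after
    return timeline


def without_sigterm(steps):
    """Drop the SIGTERM step (and its pause) for the bail-out-fast variant."""
    return [(action, pause) for action, pause in steps if action != "SIGTERM"]
```

With these numbers the fixed pauses add up to 10 seconds, which leaves about
5 of the quoted 15 seconds for the sync, umount and halt themselves.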

In the 386 environment the UPS(s) that support the Netware Feature
Connectors are the ones that support this kind of environment.
I don't know whether the netware feature boards are supported in
any/many/most/all UNIX implementations, but if they are not it
should be a trivial driver to implement (wrt power sensing)
if the technical manual is available.  Also, a software driver
was posted here a week or so ago that lets a serial port use
CD to indicate a power failure when the associated device is
plugged into raw AC and the computer is plugged into a UPS.

john@sco.COM (John R. MacMillan) (05/07/91)

|Does anybody out there have information on the SYS V SIGPWR signal? The man
|says that this signal occurs if there's a power failure. I have a couple of
|questions which I haven't been able to answer myself:

On the only system I've used that implemented SIGPWR, the signal is
raised when power is _restored_ not just before it goes away.  In the
event of a power loss the kernel alone had any warning and was able to
write memory contents and state info to non-volatile storage.  When
power was restored, the kernel restored the previous state, and sent
SIGPWR to all processes.  I can't remember for certain, but I think
the default handling may have been SIG_IGN, so many processes
wouldn't even notice the power loss.  Those that caught SIGPWR could
refresh their screens, or whatever.

hecker@federal.uucp (Frank Hecker) (05/07/91)

In article <1991May6.112253.5344@cs.tcd.ie> ohurley@cs.tcd.ie (Oisin
Hurley) writes:

>Does anybody out there have information on the SYS V SIGPWR signal? The man
>says that this signal occurs if there's a power failure. I have a couple of
>questions which I haven't been able to answer myself:

I have no idea how widely this signal is implemented, but it's
available on the Tandem Integrity S2 fault-tolerant system, which has
built-in batteries sufficient to keep the system up for several
minutes.  The S2 implementation provides a good example of how SIGPWR
could be used in any system with a well-integrated uninterruptible
power system.

>1. When power goes, is this signal sent to every process currently running?

On the S2, it depends on how you've configured your system.  If you've
configured it to do a shutdown, then the standard shutdown procedure
is followed and SIGPWR is not used; processes are sent SIGTERM instead.

On the other hand, if you've configured it to do a powerfail auto
restart then (almost) all processes get sent SIGPWR.  They then get
suspended and the contents of system memory written to disk, after
which the system turns itself off.  When power resumes the contents of
memory are reloaded from disk, the processes are resumed, and they're
sent SIGPWR again.

The reason for the "almost" above is that you can set selected
processes to not resume across a power failure (e.g., to avoid
security problems with open sessions).  On power failure these
processes get sent SIGTERM instead, followed by SIGKILL.

Also, since processes get sent SIGPWR twice (once before going down
and once after coming back up), additional information must be
included with the signal to indicate what's happening.  This is done
using an integer code passed as the second argument to the signal
handler.

>2. How far does the power have to drop before the signal is activated?

On the S2, below the lower voltage limits of the system, at which
point the batteries start supplying power to the system and the kernel
is notified.

>3. How long does the power have to stay at that level to ensure activation?
>	(how about transients, etc.)

On the S2 this is configurable.  The standard figure is 15 seconds,
but 30 seconds or even a minute are also reasonable values.  If power
is restored during that period then the kernel takes no action and
applications are not affected.

>4. Is the signal generated on the mboard or is there a line from the psu?

The kernel is interrupted by the S2's I/O processor, which samples
environmental info like power and temperature.  (For example, the
system can survive the failure of one fan, but the failure of two fans
is treated like a power failure.)  I'm not familiar with the exact
hardware details.  The kernel then starts checking every five seconds
to see if power is still off, until either power is restored or the
allotted time period elapses.
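
The "wait and recheck" behaviour described here amounts to debouncing the
power-fail indication. A rough sketch, using the 15 second hold time and
5 second polling interval mentioned above as defaults; `outage_confirmed` and
the `power_is_off` callback are hypothetical names, and this is not how the
S2 kernel is actually written.

```python
import time


def outage_confirmed(power_is_off, hold_seconds=15, poll_seconds=5,
                     sleep=time.sleep):
    """Return True only if power stays off for the whole hold period.

    power_is_off() is a hypothetical callback that samples the hardware.
    If power comes back at any poll, give up and report False so that
    applications are never disturbed by a short transient.
    """
    waited = 0
    while waited < hold_seconds:
        sleep(poll_seconds)
        waited += poll_seconds
        if not power_is_off():
            return False
    return True
```

A longer `hold_seconds` trades battery runtime for fewer false alarms, which
is presumably why the figure is left configurable.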

>5. Has anybody used it - is it useful? Is there enough time to sync disks
>   upon receipt of SIGPWR (I presume there's hardly time for anything)?

I've written demo programs to catch SIGPWR on the S2, but haven't used
it in an actual application, nor have I modified any freeware programs
to support it.  I leave it to others to propose applications for which
it would be most useful.

By default S2 processes get 30 seconds to do SIGPWR handling before
they're suspended.  This can be decreased or increased by the system
administrator.  (Disks get synced in any case, whether the system
administrator has configured for a shutdown or for a restart.)

Handling SIGPWR correctly in the general case is quite tricky, since
you have to account for interrupted system calls, date/time changes
across the power failure, and the like.  There's a whole section on it
in the Integrity S2 Programmer's Guide, with an example I still
haven't been able to puzzle out fully.

There are also some problematic issues from the system point of view.
For example, if the kernel starts a power fail shutdown it has to
finish it even if power comes back on during the shutdown.  (In this
case the system reloads memory after it finishes dumping it.)

>6. Why is it there?

It's a very useful signal to have if your system is supported by a UPS
that can keep it up for a long enough period of time for appropriate
actions to be taken.  It's even more useful if your system can resume
applications across a power failure, like the S2, as opposed to just
initiating a standard shutdown.

I'm interested in hearing about any comparable implementations of
SIGPWR on other systems.
-- 
Frank Hecker
...!uunet!tdmfed!hecker

martin@adpplz.UUCP (Martin Golding) (05/08/91)

In <1991May6.224407.22544@federal.uucp> hecker@federal.uucp (Frank Hecker) writes:

>[I]f you've configured [the S2] to do a powerfail auto
>restart then (almost) all processes get sent SIGPWR.  They then get
>suspended and the contents of system memory written to disk, after
>which the system turns itself off.  When power resumes the contents of
>memory are reloaded from disk, the processes are resumed, and they're
>sent SIGPWR again.
>...you can set selected processes to not resume across a power failure 
>[they] get sent SIGTERM instead, followed by SIGKILL.

>I'm interested in hearing about any comparable implementations of
>SIGPWR on other systems.

Last time we looked, NCR and Altos had the save/resume capability,
although perhaps not as thorough as Tandem's. We've been explaining
to Motorola how easy it is, and what a good idea, but no sale yet.

Just a data point: McDonnell Douglas Reality systems have done this
for years, ever since we showed them how :-)

Another data point: I just saw a _card_ for a PC that implements the
feature: it's got on-board battery backup to hold the disk up long
enough for a memory save. I have a faint suspicion that there are some 
applications that won't come back clean.


Martin Golding    | sync, sync, sync, sank ... sunk:
Dod #0236         |  He who steals my code steals trash.
A poor old decrepit Pick programmer. Sympathize at:
{mcspdx,pdxgate}!adpplz!martin or martin@adpplz.uucp

chap@art-sy.detroit.mi.us (j chapman flack) (05/13/91)

In article <1991May6.112253.5344@cs.tcd.ie> ohurley@cs.tcd.ie (Oisin Hurley) writes:
>
>Does anybody out there have information on the SYS V SIGPWR signal? The man
>says that this signal occurs if there's a power failure. I have a couple of
>questions which I haven't been able to answer myself:

I, too, would like information on this facility.

>2. How far ... 3. How long ... 4. ... generated on mboard or line from the psu?

I assume the answers to those questions are very system-dependent.  Somewhere
there has to be hardware that detects a power drop and asserts an IRQ, and I
suppose somewhere there has to be a driver that handles that IRQ and causes
signals to be sent.

I have SCO SysV/386, and have been looking seriously at the ITT PowerMate,
which is several minutes' worth of battery on a card (internal) which will
assert an IRQ when it senses main power dropping and goes to batteries.
I assume I'll need to write a driver with an interrupt handler for that,
and that the driver should call some kernel routine that causes SIGPWRs to
be sent.  Can anyone offer details?

... My man page for `init' says that if init receives a SIGPWR, it will
fire up anything in /etc/inittab with type 'powerfail' or 'powerwait'.
I experimented with "kill -19 1" to send init a SIGPWR (19) after putting
some 'powerfail' entries in inittab; it didn't do diddley.  Either init
doesn't really support SIGPWR as documented, or "kill -19" doesn't send a
SIGPWR, or I'm missing something else.  Ideas?
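
One thing worth ruling out is the signal number itself, since it is not the
same on every system (19 is the value quoted above for this SCO system; Linux
on x86, for instance, uses 30). A small sketch that looks the number up by
name instead of hard-coding it; `signal_number` is a hypothetical helper, and
it only reports what the local platform defines.

```python
import signal


def signal_number(name):
    """Return the local number for a signal name, or None if undefined here."""
    sig = getattr(signal, name, None)
    # Reject names that exist in the module but are not signals (e.g. SIG_IGN).
    if not isinstance(sig, signal.Signals):
        return None
    return int(sig)
```

For example, `signal_number("SIGPWR")` gives the value to pass to kill on the
machine where it runs, or None where SIGPWR does not exist at all.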

I don't have the development system (yet) which is why my experiments are on
such a superficial level so far.
-- 
Chap Flack                         Their tanks will rust.  Our songs will last.
chap@art-sy.detroit.mi.us                                    -MIKHS 0EODWPAKHS

Nothing I say represents Appropriate Roles for Technology unless I say it does.