[comp.sys.sun] Reliability v. Fire Risk

pm@cs.city.ac.uk (Pete Mellor) (02/18/90)

Scott Stone asks (v9n21):

> One of the companies I work with is considering turning their Suns off
> every evening, and on again in the morning.  They wish to do this in order
> to minimize the possibility of a fire...What opinion do you have about this?

The gain in safety from switching off is probably outweighed by the
inconvenience caused. This is a guess: I don't have any statistics on
fires in computing machinery, and I don't know who has. SUN would probably
be happy to tell you how safe their machines are provided you don't fool
around with 3rd party memory upgrades with which the fan can't cope (or,
worse, install your own fan, as was suggested recently in SUN-SPOTS).
Other than that, try asking a few insurance companies how they assess the
risk for computer installations.

> What percentage of people with networked Suns leave them on...?

In my limited experience, 100%. Our centre (4 machines) certainly does,
and as far as I know, so does every other department in the university
(and that's quite a few SUNs), and so do at least one large firm and two
departments in other universities with whom we work. I did see one guy
turn off the monitor alone to save the phosphor, but he'd forgotten how to
run screenblank.

The only time we switch off our network is when we have been warned by
the electricians of a scheduled loss of power during maintenance work. I
did switch off my own machine at night for a time when the fan was making
a funny noise.  I had visions of the fan packing up completely and the
machine overheating while nobody was there to spot it.

> Would turning it off every night, and on in the morning reduce the
> reliability/MTBF of the machine significantly?  

Probably yes. I don't have any data on this myself, but an ex-colleague in
the quality, reliability and statistics department of a large computer
manufacturer has been investigating the effect of various 'explanatory
variables' on the reliability of printed circuit boards. He did tell me
that a 'duty cycle' involving regular power-off showed a significant
positive correlation with PCB failure rate. (I don't know if these results
have been published.)

My understanding has always been that SUNs are designed to be permanently
powered on.

> Have you known anyone that has had a fire due to a computer, particularly
> a Sun?

See v9n20! The only serious fire in a computer installation with which I
was connected was caused by an operator on the night shift dropping a
lighted cigarette into a waste-bin.

> What pro's/con's do you see?

Each of our machines has its own hard disc, and each disc is remotely
mounted on every other machine via nfs. To bring back up more than one
machine in the network involves the dreaded 'nfs: server not responding'
deadlock. Also, if any machine is off, everyone else on the network is
deprived of that machine's filestore. Add to this the fact that here we
work what is politely described as flexi-time (i.e. you've no idea when
any particular user will be sleeping off a hangover until lunchtime, or
working until 3 in the morning to make up for it), and that our central
mail server would probably not like it if it found a machine off-line when
trying to distribute overnight e-mail, and you will see that we have no
choice but to leave everything switched on.

Regularly powering down a network can *only* work if everyone works from 9
to 5 and e-mail is suitably stored until power-on time.

On the other hand, I wonder how much of the earth's resources are spent in
driving machines which spend around 75% of their time waiting for another
machine to talk to them? What is the green party's policy on this?  It was
with this thought in mind that I used to switch off my old ICL PERQ every
night, but that was a stand-alone machine. (It also required 2 new hard
discs in 12 months!)

I hope that this is a fair assessment, and that I don't get flamed
(metaphorically) by a lot of people who have been flamed (literally) by
SUNs!  In CSR, we're more interested in software reliability than in
boring things like the probability of the centre going up in smoke one
night. If anyone out there has any relevant data (statistical or
anecdotal), I'd be very interested to talk to you.

bob@morningstar.com (Bob Sutterfield) (02/20/90)

In article <5078@brazos.Rice.edu> pm@cs.city.ac.uk (Pete Mellor) writes:
| My understanding has always been that SUNs are designed to be
| permanently powered on.

Power cycling at night is something perhaps appropriate to a personal
computer class machine, like a Mac or something running MS-DOS.  Though
workstations are getting to be the same physical size, the general design
and philosophy in UNIX comes down from timesharing systems.  Thus we see
the tremendous overhead in shutting down and starting up a modern UNIX
system - trace through /etc/rc* someday.  A Sun running UNIX is still a
timesharing system at heart, even though it is primarily dedicated to a
single user's needs.  The hardware seems to reflect this software
philosophy, and is happy powered up for months at a time.

On the other hand, the local Sun field service guy told me last fall that
"the latest word from the factory says that monitors should be switched
off when not used, even overnight" to save the phosphors.  This, he said,
was regardless of whether a screen-saver was in use.  I still wonder about
that, and we still leave our monitors (with the rest of the systems) on
all the time.  I'm still wondering whether we'll ever see a more
authoritative pronouncement from "the factory" on that one...

dmc@cam.sri.com (David Carter) (02/21/90)

[original: v9n21; commenting now on v9n50]

Pete Mellor <pm@cs.city.ac.uk> says:
>I did see one guy turn off the monitor alone to save the phosphor, but he'd 
>forgotten how to run screenblank.

Phosphors, fire and amnesia are not the only considerations. We turn all
our screens off here every night and have saved quite a lot of money. We
started doing this after our local Sun office told us they did so too. If
it's cost-effective for them, it must be for anyone with a hardware
maintenance contract.

>I wonder how much of the earth's resources are spent in
>driving machines which spend around 75% of their time waiting for another
>machine to talk to them? What is the green party's policy on this?

Green Party policy, as I understand it, is to encourage the development of
appropriate technology. In this case, I imagine it could take the form of
some box, with minimal power consumption, which would listen to an
ethernet and whose sole function would be to power a Sun up or down on
receipt of the appropriate request. Does such a box exist? If not,
couldn't someone make a lot of money by manufacturing one? Would it save
resources, or would the effect be nullified by the need to replace (and
therefore manufacture) PCBs more frequently? If the latter, is anyone
developing "green"(ish) PCBs that can stand up to regular powering on and
off just like every other electrical appliance?
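
To make the idea of such a box concrete, here is a minimal sketch of its
listening loop. Everything in it is an assumption made for illustration:
the port number, the "POWER ON <host>" / "POWER OFF <host>" message
format, and the names parse_request, switch_power and serve are all
hypothetical, and the step that would actually drive a relay is only a
placeholder. No existing product or protocol is being described.

```python
import socket

PORT = 9999  # hypothetical port, chosen only for this sketch


def parse_request(payload: bytes):
    """Return (action, host) for a well-formed request, else None.

    Assumed message format: b"POWER ON <host>" or b"POWER OFF <host>".
    """
    parts = payload.decode("ascii", errors="replace").split()
    if len(parts) == 3 and parts[0] == "POWER" and parts[1] in ("ON", "OFF"):
        return parts[1].lower(), parts[2]
    return None


def switch_power(action: str, host: str) -> None:
    # Placeholder: a real box would drive a relay on the machine's supply.
    print(f"would switch {host} {action}")


def serve(port: int = PORT) -> None:
    """Listen for UDP datagrams and act on well-formed requests only."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("", port))
        while True:
            payload, _sender = sock.recvfrom(512)
            request = parse_request(payload)
            if request is not None:
                switch_power(*request)
```

Calling serve() would start the loop. A real device would also need some
way to authenticate requests, and a clean shutdown of the Sun before
power is cut, neither of which is attempted here.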

David Carter
SRI Cambridge Research Centre
Cambridge, UK
dmc@ai.sri.com, dmc@uk.co.sri

BHamilton.osbuSouth@xerox.com (02/22/90)

Here at Xerox we did some studies a few years ago on powering down Xerox
(not Sun) workstations.  The hard facts are: powering down a Xerox 8010 or
6085 workstation each night saves on the order of $1/night in electricity
and air conditioning costs.  Multiply by approx. 25K workstations and it
adds up to MEGABUCKS per year.
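
As a rough check on the "MEGABUCKS" figure, the arithmetic can be
sketched as below. The per-night saving and the workstation count are
the ones quoted above; the number of power-down nights per year is an
assumption (working nights only versus every night), not a figure from
the study.

```python
# Back-of-envelope annual saving from powering workstations down at night.
SAVING_PER_NIGHT_USD = 1.0   # "on the order of $1/night" (from the post)
WORKSTATIONS = 25_000        # "approx. 25K workstations" (from the post)


def annual_saving(nights_per_year: int) -> float:
    """Total yearly saving in dollars across all workstations."""
    return SAVING_PER_NIGHT_USD * WORKSTATIONS * nights_per_year


# Assumed scenarios: roughly 250 working nights, or all 365 nights.
for nights in (250, 365):
    print(f"{nights} nights/year: ${annual_saving(nights):,.0f}")
```

Under either assumption the total comes to several million dollars a
year.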

Reliability: there was a SLIGHT increase in "infant mortality" of some
components, such that a large site BEGINNING a power-down program might
want to stock some additional spares.

We plan to start a similar study soon with Sun 4/110's and SPARCstations.
Anecdote: a fellow across the hall from me has been powering down his
4/110 every day for about a year with no problems.  "exit SunView",
"Halt", and the boot-up cycle only take a couple of minutes.

Most PC users are accustomed to powering down after use.  A SPARCstation
should be able to be treated like a PC.  Public files belong on a server
(which stays up), not on a slow workstation disk.

Even if equipment is left powered up, it is every electronics engineer's
responsibility to design equipment to tolerate power cycling.  In my
experience over the past ten years in greater Los Angeles, you can expect
an average of three or four power failures per year.  Also, most systems
have failure modes where the only way to perform a complete system reset
is to cycle power.

--Bruce
BHamilton.osbuSouth@Xerox.COM
213/333-8075

dwight@relay.eu.net (Dwight Ernest) (02/27/90)

Yesterday when I came in to the office, one of the first things I
noticed was a very noxious odour of burnt components or crisped
circuit boards. Those who had arrived before me filled me in:
at about 8:30 a.m., all of the smoke alarms on our office floor
had been activated. The London Fire Brigade responded with a full
battalion, and one of my colleagues met them rushing to our floor
as he was arriving for work.

One of the Sun 3/50 workstations we work with had "caught fire,"
he said. The firefighters apparently only had to cut the power to
prevent it spreading. A bit worrisome. The offending system unit
had already been removed to a storage location to prevent the
spread of the noxious odour.

When the Sun CE arrived to look at it, I got my first look at it as well.
One of the largest components on the PCB had been completely burnt up,
although the exterior of the box showed absolutely no damage from fire nor
from smoke. The PCB itself had at one location been burnt through. The
component involved was a "DC-DC Convertor".

The Sun CE swore that he'd never seen this before in three years of
experience with Suns. He even wrote this down on the Call Sheet.

We leave our Suns up and running on a 24-hour 365-day duty cycle.  They
all act as servers for something or other, even if just for shared
printers (we run a lot of PC-NFS in both normal office environments and
journalistic environments).

After this scary experience I'm wondering whether we ought to change that
policy.

		--Dwight Ernest
		  Technical Systems Coordinator
		  The Independent (Newspaper Publishing PLC)
		  40 City Road, London EC1Y 2DB United Kingdom
		     Phone: +44 1 956 1633
		     Fax:   +44 1 956 1996
		       UUCP: ...ukc!independent!dwight