[comp.risks] RISKS DIGEST 9.64

risks@CSL.SRI.COM (RISKS Forum) (02/02/90)

RISKS-LIST: RISKS-FORUM Digest  Thursday 1 February 1990   Volume 9 : Issue 64

        FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS 
   ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  SENDMAIL horrors (PGN)
  Software error at Bruce nuclear station (Mark Bartelt)
  New South Wales Police deregisters police cars (Diomidis Spinellis)
  Fire and 753 controllers (need a light?) (Neal Immega via Mark Seiden)
  The substantiative error made by AT&T (Robert Ullmann)
  Re: AT&T Crash Statement: The Official Report (Bob Munck)
  Re: Airbus crash of June 88 (Robert Dorsett)
  Re: Virology and an infectious date syndrome (Gene Spafford)

The RISKS Forum is moderated.  Contributions should be relevant, sound, in good
taste, objective, coherent, concise, and nonrepetitious.  Diversity is welcome.
CONTRIBUTIONS to RISKS@CSL.SRI.COM, with relevant, substantive "Subject:" line
(otherwise they may be ignored).  REQUESTS to RISKS-Request@CSL.SRI.COM.
TO FTP VOL i ISSUE j:  ftp CRVAX.sri.com<CR>login anonymous<CR>AnyNonNullPW<CR>
  cd sys$user2:[risks]<CR>get risks-i.j .  Vol summaries now in risks-i.0 (j=0)

----------------------------------------------------------------------

Date: Thu, 1 Feb 1990 10:59:38 PST
From: RISKS Forum <risks@csl.sri.com> (Really from Neumann@csl.sri.com)
Subject: SENDMAIL horrors

My sincerest apologies to those of you who received multiple copies of
RISKS-9.63.  I keep breaking the mailing list up into smaller sublists, but for
the first time TWO of the sublists were victimized by the lurking SENDMAIL
flaw.  I will try mailing this issue to the sublists sequentially, waiting for
each sublist to complete before going on to the next, just in case there are
competing effects.  I have been putting out fewer issues in hopes of minimizing
your annoyance, but that is also counterproductive.  So, please bear with me as
I try this issue serially over the sublists...  Painful, but it might help!

By the way, the end of the AT&T "Official report" message in RISKS-9.63 had a
few trailing blank characters that overflowed the line and made the next line
look blank on my screen, which means that the undigestifiers probably gagged on
it and refused to recognize the following message separately.  I try hard to
avoid that, but this one slipped through.  For those of you who get RISKS as
individual messages, Oops.  Sorry. :-(
                                                         PGN
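
  [A minimal sketch of how an undigestifier could tolerate stray trailing
  blanks around the message separators.  The 30-hyphen separator is assumed
  from this digest's own layout, and the function name split_digest is
  hypothetical; this is not the code of any actual undigestifier.]

```python
SEPARATOR = "-" * 30  # assumed: the 30-hyphen line used between messages


def split_digest(text):
    """Split digest text into messages at separator lines.

    Trailing blanks are stripped before comparison, so a separator (or the
    line before it) padded with spaces is still recognized.
    """
    messages, current = [], []
    for line in text.splitlines():
        if line.rstrip() == SEPARATOR:
            # Close the message collected so far.
            messages.append("\n".join(current).strip())
            current = []
        else:
            current.append(line.rstrip())
    # Whatever follows the last separator is the final message.
    messages.append("\n".join(current).strip())
    # Drop empty chunks (e.g. text beginning or ending with a separator).
    return [m for m in messages if m]
```

  [The design choice is simply to normalize whitespace before matching,
  rather than require an exact blank line next to each separator.]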

------------------------------

Date: 	Thu, 1 Feb 90 06:32:11 EST
From: Mark Bartelt <mark@sickkids.toronto.edu>
Subject: Software error at Bruce nuclear station

Mark Bartelt, Hospital for Sick Children, Toronto   mark@sickkids.utoronto.ca
416/598-6442                          UUCP: {utzoo,decvax}!sickkids!mark

Group questions software's reliability after Bruce Accident  
(Canadian Press)

A computer software error that released thousands of litres of radioactive
water at the Bruce nuclear station raises questions about the reliability of
software at the new Darlington station, Energy Probe says.  "This spurious
software accident at Bruce Unit 4 should be regarded as a warning about
software safety in general," Tom Adams, utility analyst with the energy
watchdog group, said yesterday.  But Ontario Hydro says the group is comparing
"apples to oranges."

The computers at Darlington have three backup systems that automatically shut
down the reactors in certain emergencies, spokesman Dave Stevens said.  The
Crown utility says a software flaw caused last week's accident, when a
mechanical fuelling machine was loading and unloading fuel bundles into reactor
Unit 4 at the Bruce station at Kincardine, near Owen Sound.  No one was injured
and neither workers nor the environment were directly contaminated by the
tritium-irradiated heavy water, which escaped from the reactor as steam.

Although Mr. Stevens said he did not know whether Hydro had anticipated such an
accident, he said the utility had contingency plans for the release of heavy
water.

A spokesman for the Atomic Energy Control Board, which regulates the safety of
the nuclear industry, described the accident as "very unusual and also very
significant" because of what could have happened.  "It has the appearance of
something that could have been worse," Zygmund Domaratzki said in a telephone
interview from Ottawa.  "If that fuelling machine had kept on going it would
have ripped the end of that channel," allowing the fuel bundles to tumble out,
Mr. Domaratzki said.  With that kind of mishap, he said, "it wouldn't take much
to cause widespread contamination within the reactor building."  As it was, the
accident cost Hydro time and money, he said.

Meanwhile, heavy water continued leaking from the Bruce reactor yesterday at
the rate of about seven litres per hour.  Workers planned to remove 13 fuel
bundles from the damaged fuel channel and retrieve the fuelling machine, still
dangling on the face of the reactor.  The unit is expected to be down for at
least six weeks.

Because two of the other three reactors at Bruce are still down, the giant
utility may have problems meeting demand if there is a severe cold snap.

------------------------------

Date:       Tue, 30 Jan 90 12:20:55 GMT
From: Diomidis Spinellis <dds@cc.imperial.ac.uk>
Subject: New South Wales Police deregisters police cars

Found in the British ``Guardian'' newspaper of 25 January 1990:

  ``The entire state police force in New South Wales, Australia, found
  itself driving illegal cars after an enthusiastic computer deregistered them,
  Paul Zucker reports on the Newsbytes online newsletter.  The police were
  instructed to book themselves, or each other.

  ``The problem was caused by a high-ranking officer (not named) who failed to
  pay illegal parking tickets.  His unmarked car was registered to the police
  department.  After statutory warnings were ignored, the computer program
  deregistered all cars belonging to the offender: that is, all cars belonging
  to the police department.''

It may be that I am still under the influence of Dijkstra's CACM article, but
notice how the computer is given the human attribute "enthusiastic".
 
Diomidis Spinellis, Imperial College, London.
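
  [A minimal sketch of the kind of rule the report describes: deregistering
  by registered owner rather than by vehicle.  The registry layout, plate
  numbers, and function names are hypothetical; the actual program's logic
  is not given in the report.]

```python
def deregister_by_owner(registry, offending_plate):
    """Remove every car registered to the owner of the offending car."""
    owner = registry[offending_plate]
    return {plate: o for plate, o in registry.items() if o != owner}


def deregister_by_vehicle(registry, offending_plate):
    """Remove only the offending car."""
    return {plate: o for plate, o in registry.items() if plate != offending_plate}


# Hypothetical registry: plate -> registered owner.
registry = {
    "AAA111": "Police Department",  # the unmarked car with unpaid tickets
    "AAA112": "Police Department",
    "AAA113": "Police Department",
    "ZZZ999": "Private Citizen",
}

after_owner_rule = deregister_by_owner(registry, "AAA111")
after_vehicle_rule = deregister_by_vehicle(registry, "AAA111")
```

  [Under the owner rule the whole fleet disappears from the registry;
  under the vehicle rule only the one offending car does.]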

------------------------------

Date: Wed, 31 Jan 90 21:39:22 EST
From: mis@Seiden.com (Mark Seiden)
Subject: Fire and 753 controllers (need a light?)

posted recently (edited slightly):

Subject: Fire and 753 controllers
Date: 30 Jan 90 14:25:25 GMT
[From: Neal Immega]
Organization: Shell Development Company, Bellaire Research Center, Houston, TX

Shell Development Co. had a fire in a SUN 4/280 when diodes on a
Xylogics 753 disk controller overheated and caught fire. The plastic
card guides on the Illmanite double wide to triple wide controller
also appear to have burned and may have contributed to the damage of
the two adjacent disk controller cards (which operated perfectly while
burning!). Flames four inches high were coming out of the top of the
card cage when the fire was extinguished.

This problem could have been prevented if we had been notified of the
10/10/89 field change notice from Xylogics to make a free upgrade to
all boards by adding a heatsink for the diodes. The new design has a
bronze colored strip 1.5 inch wide running the length of the card.
Xylogics said that I should have been notified by the distributor
(CITA Technologies) and CITA says that they were not told of the
problem. The Xylogics contact for this is Laurie Walker at
617-272-8140. She will need the serial number from the board (X plus 7
numbers from the back).

 Neal Immega
 Staff Geologist , Geology Research                                
 Shell Development Company, Bellaire Research Center     (713) 663-2572
 ...!rice!shell!immega   immega@shell.com

  [This raises the interesting question of how we poor users at the bottom of the
  food chain are expected to find out about safety-related problems....
  Mark Seiden, mis@seiden.com, 203 329 2722]

------------------------------

Date: 31 Jan 90 22:04:49 EST
From: Robert Ullmann <Ariel@RELAY.PRIME.COM>
Subject: The substantiative error made by AT&T

I am surprised that no one has pointed to the [IMHO] real error committed by
AT&T: that is, upgrading all of the ESS4s to the same s/w revision at the same
time.

This exposes the network to a single error, with consequences that affect the
entire network simultaneously.

To contrast, from personal experience: I run the internal mail network in Prime
Computer, which runs SMTP mail on 3000 systems in 27 countries.  I have been
criticized for not being aggressive in upgrading the revision of PDNmail (an
RFC1090 implementation) to current release.

My reason is that the network is much more robust running a range of different
versions (much like Long-Lines was, before everything was ESSn, with DLL'd s/w:
which is why this didn't happen before). Many of the systems run other software
implementations entirely, which also helps. Except when those implementations
are "ports" of the same software, witness the continuing generic sendmail
problems.  Not that anything is wrong with sendmail, per se, but there ought to
be more independent implementations-from-specification.

The resulting heterogeneous network of systems is so robust that I can test new
implementations of PDNmail by replacing the mailer on the most heavily loaded
system (!), and watching closely for several hours. (Not that I am generally
advocating this sort of thing!  "ah, Schickelgruber, you have a new version of
the master S/W?  It compiles, Ja? Sehr gut, load it on Hinsdale ... " :-)

Systemic problems with new versions become apparent after only a few systems
are using it in production service; and only affect a small part of the
network, even if undiagnosed or uncorrected for long periods of time.

Robert Ullmann, Prime Computer, Inc.
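
  [A minimal sketch of the exposure argument above: the share of a network
  that a flaw in one software revision can reach.  The revision labels and
  system counts are made up for illustration and are not figures from
  either network.]

```python
from collections import Counter


def exposed_fraction(versions, faulty):
    """Fraction of systems running the faulty revision."""
    counts = Counter(versions)
    return counts[faulty] / len(versions)


# Every system upgraded to the same revision at the same time.
uniform = ["rev2"] * 10

# A mixed network where the new revision runs on one system only.
mixed = ["rev1"] * 9 + ["rev2"]

uniform_exposure = exposed_fraction(uniform, "rev2")
mixed_exposure = exposed_fraction(mixed, "rev2")
```

  [With a uniform network a flaw in rev2 reaches every system at once;
  in the mixed one it reaches only the systems already upgraded.]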

------------------------------

Date: Thu, 01 Feb 90 13:43:02 -0500
From: Bob Munck <munck@community-chest.mitre.org>
Subject: Re: AT&T Crash Statement: The Official Report

From Telecom-Digest: Volume 10, Issue 59 and Risks Digest: 9.63

> Here's AT&T's _official_ report on the Martin Luther King day network
> problems, courtesy of the AT&T Consultant Liaison Program.
> ...
> While the software had been rigorously tested in laboratory
> environments before it was introduced, the unique combination of
> events that led to this problem   couldn't be   predicted.
                                    ^^^^^^^^ ^^

Don't they mean "wasn't"?  The rest of the report seems (to me) to be
reasonably detailed, well explained, and apparently honest, but this one little
dissemblance ruins the whole thing.  Is there any justification for the
assertion that the prediction was (and is) _impossible_ in these circumstances?

                         -- Bob Munck, MITRE Corporation, McLean VA

------------------------------

Date: Thu, 1 Feb 90 02:23:53 CST
From: rdd@walt.cc.utexas.edu (Robert Dorsett)
Subject: Re: Airbus crash of June 88

In RISKS 9.63, Olivier Crepin-Leblond provided a number of interesting
conclusions regarding the A320 crash at Mulhouse-Habsheim.  _Flight
International_ also leaked the commission's findings, which accord heavily with
what he translated.  It should be noted, however, that while the engines were
at comparatively high power settings at the time of impact, the flight path of
the airplane was quite unusual -- it approached the field from a high altitude,
"dirty" (flaps and landing gear extended), and with engines at *flight idle.*
It is entirely conceivable that the airplane was far behind the power curve at
the point the flight crew decided to go around; moreover, this maneuver is
unusual enough in and of itself to bring into serious doubt the crew's judgement.

The altimeter issue is also interesting, but has largely been applied to the
incident after later experiences with altitude displays going haywire on IFR
approaches (including one well-publicised incident into Zurich).  In the
crash, the visibility was unlimited VFR; at low altitudes in such weather, 
one does not pay much attention to the altimeter, anyway--even in airliners.
The captain still insists that his flight instruments did him in.

Lastly, I note with some amusement the captain's new place of employment.  
Australia has been undergoing a very bitter pilots' dispute.  The largest
airline recently fired all striking pilots (which is to say, all of its 
pilots) and has been recruiting heavily overseas, offering relatively 
senior positions to anyone with even marginal qualifications.  

Robert Dorsett                                   	Moderator, aeronautics
Internet: rdd@rascal.ics.utexas.edu                        mailing list.
UUCP: ...cs.utexas.edu!rascal.ics.utexas.edu!rdd  

------------------------------

Date: 1 Feb 90 18:03:44 GMT
From: spaf@cs.purdue.edu (Gene Spafford)
Subject: Re: Virology and an infectious date syndrome (RISKS-9.63)

  >The appendices provide extensive references to other publications, security
  >organizations, anti-viral software sources, applicable (U.S.)  state and
  >Federal laws against computer crime, and detailed descriptions of all IBM and
  >Apple Macintosh viruses known as of 1 October 1990.
                                                 ^^^^     No, the authors aren't
psychic.  I've been writing 1989 on all my checks for the past month, and now
I'm writing 1990 everywhere I should write 1989!  I'm glad so many people found
this amusing....someday you'll grow old and senile too :-) 

Gene Spafford, NSF/Purdue/U of Florida Software Engineering Research Center, 
Dept. of Computer Sciences, Purdue University, W. Lafayette IN 47907-2004 

------------------------------

End of RISKS-FORUM Digest 9.64
************************