[mod.risks] RISKS-3.45

RISKS@CSL.SRI.COM (RISKS FORUM, Peter G. Neumann -- Coordinator) (08/29/86)

RISKS-LIST: RISKS-FORUM Digest,  Thursday, 28 August 1986  Volume 3 : Issue 45

           FORUM ON RISKS TO THE PUBLIC IN COMPUTER SYSTEMS 
   ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  Nonviolent Resistor Destroys Aries Launch (PGN)
  Risks in the design of civil engineering projects (Annette Bauman)
  ATMs (Lindsay F. Marshall)
  Re: Typing Profiles (Lindsay F. Marshall)
  Human errors prevail (Ken Dymond, Nancy Leveson)

The RISKS Forum is moderated.  Contributions should be relevant, sound, in good
taste, objective, coherent, concise, nonrepetitious.  Diversity is welcome. 
(Contributions to RISKS@CSL.SRI.COM, Requests to RISKS-Request@CSL.SRI.COM)
  (Back issues Vol i Issue j available in CSL.SRI.COM:<RISKS>RISKS-i.j.
  Summary Contents in MAXj for each i; Vol 1: RISKS-1.46; Vol 2: RISKS-2.57.)

----------------------------------------------------------------------

Date: Thu 28 Aug 86 21:30:48-PDT
From: Peter G. Neumann <Neumann@CSL.SRI.COM>
Subject: Nonviolent Resistor Destroys Aries Launch
To: RISKS@CSL.SRI.COM

From SF Chronicle wire services, 28 Aug 1986: White Sands Missile Range NM

A rocket carrying a scientific payload for NASA was destroyed 50 seconds
after launch because its guidance system failed...  The loss of the $1.5
million rocket was caused by a mistake in the installation of a ... resistor
of the wrong size in the guidance system.  "It was an honest error", said
Warren Gurkin...  "This rocket has been a good rocket, and we continue to
have a lot of faith in it."  Saturday's flight was the 27th since the first
Aries was launched in 1973, and it was the third failure.

------------------------------

Date: 28 Aug 86 06:40 EDT
From: ABauman @ DDN1
Subject: Risks in the design of civil engineering projects
To: risks @ csl.sri.com

Computer-Aided Engineering, Penton Publishing, Cleveland OH, April 1986 page 4:

"Impressive computer analysis, however, may tempt some engineers into
developing designs that barely exceed maximum expected operational loads.
In these cases there is no room for error, no allowance for slight
miscalculations, no tolerance for inaccuracy.  In engineering parlance, the
design is "close to the line".  The reasoning, of course, is that relatively
small safety factors are justified because computer analysis is so accurate.
     The major flaw in this logic, however, lies in the fact that the
initial mathematical model set up by the designer may itself contain gross
inaccuracies...  These errors are carried through the entire analysis by
the computer, of course, which uses the model as the sole basis for its
calculations...  And wrong answers are easily obscured by flashy color
graphics and high-speed interactive displays.  In most cases, the engineer
must be extremely familiar with the design and the programs used in its
development to spot errors in results."  -- John K. Krouse, editor

Annette C. Bauman, DDN-PMO
Test & Evaluation Branch, DDN Network software Test Director

------------------------------

From: "Lindsay F. Marshall" <lindsay%kelpie.newcastle.ac.uk@Cs.Ucl.AC.UK>
Date: Wed, 27 Aug 86 08:38:38 bst
To: risks@csl.sri.com
Subject: Re: ATMs

>....Their dispensing machines cannot be cheated in this way, because they have
>a steel door in front of the machine which does not open until you insert a
>valid plastic card.

People who swindle ATMs don't have cash cards?????

ATM swindles don't seem to have caught on in the UK too much yet (at least
not that I've heard), but the new "vandal proof" phone boxes which have
special money compartments seem to be rather more vulnerable. I have heard
reports of people touring regions of the UK on a regular basis emptying
these phones.  Another interesting scam at the moment (which I presume has
swept the US long ago....) and which is not illegal is that of beating quiz
machines. Teams of 3 "experts" (sport, TV/film and general knowledge
usually) tour pubs and play the video quiz machines. These have money prizes
and they simply strip them of everything in them by answering all the
questions. Most landlords are now removing these games as they are losing
money.......  

------------------------------

From: "Lindsay F. Marshall" <lindsay%kelpie.newcastle.ac.uk@Cs.Ucl.AC.UK>
Date: Wed, 27 Aug 86 08:29:32 bst
To: risks@csl.sri.com
Subject: Re: Typing Profiles

John Ellenby (of GRiD Systems) told me that they installed just such a thing
into an operating system they were building and used it to distinguish
between the various operators who used the console. The operators never
could work out how the system "knew" who they were. (I may say that I am not
totally convinced, however -- particularly in a non-keyboard oriented society
such as the UK where very few people can actually type properly.)

------------------------------

Date: 28 Aug 86 14:11:00 EDT
From: "DYMOND, KEN" <dymond@nbs-vms.ARPA>
Subject: Human errors prevail -- Comment on Nancy's Comment on ...
To: "risks" <risks@csl.sri.com>

Nancy Leveson's comment (on PGN's comment on human error in RISKS-3.43)
makes some very good points.  We do need to discuss the terms we use to
describe the various ways systems fail if only because system safety and
especially software safety are fairly young fields.  And it seems natural
for practitioners of a science, young or not, to disagree on what they are
talking about.  (Recall the discussion a few years ago in SEN on what the
term "software engineering" meant and whether what software engineers did
was really engineering.)

But what scientists say in these discussions about science may not be
science, at least in the sense of experimental science -- it's more like
philosophy, especially when the talk is about "causes".  Aristotle, for one,
talked a lot about causes and categories.  When we are urged to constrain
our use of "cause" ("Trying to simplify and ascribe accidents to one cause
will ALWAYS be misleading.  Worse, it leads us to think that by eliminating
one cause, we have then done everything necessary to eliminate accidents
(e.g. train the operators better, replace the operators by computer,
etc.)"), we are being given a prescription, something value-laden.  (I don't
mean to imply that science is or should be value-free.)  The implication in
the prescription seems to be that we (those interested in software and
system safety) should avoid using "cause" in a certain way; otherwise we are
in danger of seducing ourselves as well as everybody else not specifically
so interested (the public) into a dangerous (unsafe) way of thinking.

But a way of supplementing the philosophical or prescriptive bent to our
discussion about the fundamental words is to look at how other disciplines
use the same words.  For example, structural engineers seem to be doing a lot
of thinking about what we would call safety.  They even say "Human error is
the major cause of structural failures." (Nowak and Carr, "Classification of
Human Errors," in Structural Safety Studies, American Society of Civil
Engineers, 1985.)  It may be that our discussions about the basic words we
use can be helped by consulting similar areas in more traditional types of
engineering.

There is another prescriptive aspect to the subject of constraining our
discourse as raised by Nancy, namely not admitting into that discourse
statements from certain sources.  ("Also, the nature of the mass media, such
as newspapers, is to simplify.  This is one of the dangers of just quoting
newspaper articles about computer-related incidents.  When one reads accident
investigation reports by government agencies, the picture is always more
complicated.")  Our thinking about this prescription may also benefit from
looking at other engineering disciplines to see how they investigate and
report on failures and what criteria and categories (the jargon word is
"methodology") they use, implicitly or explicitly, in assigning causes to
failure.  "Over-simplified" might be the best adjective to describe some of
the contributions to RISKS from newspapers -- one doesn't know whether to
believe them or not.  A problem may arise when writers on safety start to
quote SEN and the safety material collected there, most of which is
previewed here on RISKS, as authoritative sources on computer and other
types of failures.  The question is whether SEN's credibility is being
lessened or the newspaper's enhanced by the one being the source for the
other.  Compare some of the newspaper stories reproduced on this list with
the lucidity and thoroughness of Garman's report on "The 'Bug' Heard
'Round the World," (SEN, Oct. 1981).  That seems a model for a software
engineering analysis and report of a failure.  We might compare it to other
thorough engineering analyses of failures, say the various commissions'
reports on Three Mile Island or the NBS (no chauvinism intended) report on
the skywalk collapse at the Hyatt Regency in Kansas City.  (The report of
the Soviet government on Chernobyl will perhaps bear reading, too.)

If we evolve some kind of standard for analyzing and reporting system
failure, we'll be able to categorize the trustworthiness of newspaper and,
for that matter, any other failure reports so that their appearance on RISKS
will not necessarily count as an endorsement, either in our own minds or in
that of the public.
   
Ken Dymond, NBS

------------------------------

Date: 28 Aug 86 19:42:14 PDT (Thu)
From: Nancy Leveson <nancy@ICSD.UCI.EDU>
To: risks@csl.sri.com
Subject: Human errors prevail -- Comment on Alan Wexelblat's Comment on 
   Nancy Leveson's... (ad infinitum?)      [but not quite yet ad nauseam!]

From Alan Wexelblat's comment on my comment on ... (RISKS-3.44):

    >... she denies that there are "human errors" but believes that
    >there are "management errors."  It seems that the latter is simply
    >a subset of the former (at least until we get computer managers).

With some risk of belaboring a somewhat insignificant point, after reading
[Alan's message], it is clear to me that I did not make myself very clear.
So let me try again to make a more coherent statement.  I did not mean to
deny that there are human errors; in fact, the problem is that all "errors"
are human errors.

I divide the world of things that can go wrong into human errors and random
hardware failures (or "acts of God" in the words of the insurance
companies).  My real quibble is with the term "computer errors".  Since I do
not believe that computers can perform acts of volition (they tend to
slavishly and often frustratingly follow directions to my frequent chagrin),
erroneous actions on the part of computers must either stem from errors made
by programmers and/or software engineers (who, for the most part, are humans
despite rumors to the contrary) or from underlying hardware failures or a
combination of both.  I suppose we could also include operator errors such
as "pushing the wrong button" or "following the wrong procedure" as either
part of "computer errors" or as a separate category.  The point is that the
term "computer error" includes everything (or nothing depending on how you
want to argue) and the term "human error" includes most everything and
overlaps with most of the computer errors.  And the term "computer error" is
also misleading since to me (and apparently to others since they tend to
talk about human errors vs. computer errors and to imply that we will get
rid of human errors by replacing humans with computers) it seems to imply
some sort of volition on the part of the computer as if it were acting on
its own, without any human influence, to do these terrible things.

That is why I do not find the terms particularly useful in terms of
diagnosing the cause of accidents or devising preventative measures.  I was
just trying to suggest a breakdown of these terms into more useful
subcategories, not to deny that there are "human errors" (in fact, just the
opposite).  And in fact, to be useful, we probably need to further
understand and subdivide my four or five categories which included design
flaws, random hardware failures, operational errors, and management errors
(along with the possibility of including production or manufacturing errors
for hardware components).  Note that three out of the first four of these
are definitely human errors and manufacturing errors could be either
human-caused (most likely) or random.

Actually, I thought the part of my original comment that followed the 
quibbling about terms was much more interesting...

 Nancy Leveson
 ICS Dept.
 University of California, Irvine

------------------------------

End of RISKS-FORUM Digest
************************