[mod.risks] RISKS-3.57

RISKS@CSL.SRI.COM (RISKS FORUM, Peter G. Neumann -- Coordinator) (09/17/86)

RISKS-LIST: RISKS-FORUM Digest, Tuesday, 16 September 1986  Volume 3 : Issue 57

           FORUM ON RISKS TO THE PUBLIC IN COMPUTER SYSTEMS 
   ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  Computers and the Stock Market (again) (Robert Stroud)
  The Old Saw about Computers and TMI (Ken Dymond)
  Do More Faults Mean (Yet) More Faults? (Dave Benson)
  A critical real-time application worked the first time (Dave Benson)
  Autonomous weapons (Eugene Miya)
  "Unreasonable behavior" and software (Eugene Miya on Gary Chapman)
  Risks of maintaining computer timestamps revisited (John Coughlin)

The RISKS Forum is moderated.  Contributions should be relevant, sound, in good
taste, objective, coherent, concise, nonrepetitious.  Diversity is welcome. 
(Contributions to RISKS@CSL.SRI.COM, Requests to RISKS-Request@CSL.SRI.COM)
  (Back issues Vol i Issue j available in CSL.SRI.COM:<RISKS>RISKS-i.j.
  Summary Contents in MAXj for each i; Vol 1: RISKS-1.46; Vol 2: RISKS-2.57.)

----------------------------------------------------------------------

From: Robert Stroud <robert%cheviot.newcastle.ac.uk@Cs.Ucl.AC.UK>
Date: Mon, 15 Sep 86 16:53:37 gmt
To: risks@csl.sri.com
Subject: Computers and the Stock Market (again)

According to an item on the BBC TV news, computers had a hand in the dramatic
fall on Wall Street last week. Apparently, the systems were not designed to
cope with the sheer volume of sales (does anybody know more about this?). The
report continued:

	"In London they still do it the old fashioned way with bits
	of paper, which makes people think twice before joining in
	a mindless selling spree. However, all this could change in
	October with the Big Bang..."

What price progress?

Robert Stroud,
Computing Laboratory,
University of Newcastle upon Tyne.

ARPA robert%cheviot.newcastle@ucl-cs.ARPA
UUCP ...!ukc!cheviot!robert

------------------------------

Date: 16 Sep 86 09:25:00 EDT
From: "DYMOND, KEN" <dymond@nbs-vms.ARPA>
Subject: The Old Saw about Computers and TMI
To: "risks" <risks@csl.sri.com>
cc: dymond      
Reply-To: "DYMOND, KEN" <dymond@nbs-vms.ARPA>

Ihor Kinal says in RISKS-3.55

	>Obviously, one can present arguments for each side [human
	> vs computer having the last say -- at TMI, computers
	>were right, but ...]   I would say that if humans do
	>override CRITICAL computer control [like TMI], then
	>some means of escalating the attention level must be
	>invoked [e.g., have the computers automatically notify
	>the NRC].

This belief keeps surfacing but is false.  There was no computer
control in safety grade systems at TMI -- see the documentation in
the Kemeny report and probably elsewhere.  There was a computer in
the control room, but it only drove a printer to provide a hardcopy
log of alarms in the sequence in which they occurred, an aid in
diagnosing events.  The computer (a Bendix G-15 ??) did play a role
in the emergency: at one point its buffer became full and something
like 90 minutes of alarms went unrecorded, hampering diagnosis.
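
For illustration only (I have not seen the actual logging code, and the names
and buffer size below are invented), a bounded alarm queue that silently drops
entries when the printer falls behind loses information at exactly the moment
it matters most:

  #include <stdio.h>

  #define BUFSIZE 128                    /* assumed capacity, for illustration */

  static const char *alarm_buf[BUFSIZE]; /* queue drained by the (slow) printer */
  static int count = 0;                  /* entries waiting to be printed       */

  /* Called for each incoming alarm.  Returns -1 when the alarm is dropped;
     note that nothing in the log itself records that a gap occurred. */
  static int log_alarm(const char *msg)
  {
      if (count == BUFSIZE)
          return -1;
      alarm_buf[count++] = msg;
      return 0;
  }

  int main(void)
  {
      int i, lost = 0;

      /* An alarm flood while the printer cannot keep up (drain omitted). */
      for (i = 0; i < 500; i++)
          if (log_alarm("alarm text") != 0)
              lost++;

      printf("%d of 500 alarms lost, with no trace in the printed log\n", lost);
      return 0;
  }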

On a couple of occasions I have asked NRC people why computers aren't
used to control critical plant systems and have been told that "they aren't
safety grade."  I'm not quite sure what this means, but I take it
to mean that computers (and software) aren't trustworthy enough for
such safety areas as the reactor protection system.  This is not to
say that computers aren't used in monitoring plant status, which is
quite different from control.

Ken Dymond
(the opinions above don't necessarily reflect those of my employer
or anybody else, for that matter.)

------------------------------

Date: Sun, 14 Sep 86 19:00:30 pdt
From: Dave Benson <benson%wsu.csnet@CSNET-RELAY.ARPA>
To: risks%csl.sri.com@CSNET-RELAY.ARPA
Subject: Do More Faults Mean (Yet) More Faults?

  |In RISKS 3.50 Dave Benson comments in "Flight Simulator
  |Simulators Have Faults" that
  |
  |    >We need to understand that the more faults found at
  |    >any stage of engineering software, the less confidence one has in the
  |    >final product.  The more faults found, the higher the likelihood that
  |    >faults remain.
  |
  |This statement makes intuitive sense, but does anyone know of any data
  |to support this?  Is this true of any models of software failures?
  |Is this true of the products in any of the hard engineering fields -- civil,
  |mechanical, naval, etc. -- and do those fields have the confirming data?
  |
  |Ken Dymond, NBS

Please read the compendium of (highly readable) papers by M. M. Lehman and
L. A. Belady, Program Evolution: Processes of Software Change, APIC Studies
in Data Processing No. 27, Academic Press, Orlando, 1985.  This provides data.
It is (sorry -- should be, but probably isn't) standard in software quality
assurance efforts to throw away modules which show a high proportion of
the total evidenced failures.  The (valid, in my opinion) assumption is
that the engineering on these is so poor that it is hopeless to continue
trying to patch them up.
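
As a back-of-the-envelope illustration of that practice (the module names,
failure counts, and the 25% threshold below are all invented, not drawn from
any real project), one might flag for redesign any module that accounts for an
outsized share of the observed failures:

  #include <stdio.h>

  struct module { const char *name; int failures; };

  int main(void)
  {
      /* Invented module names and failure counts, for illustration only. */
      struct module m[] = { {"guidance", 3}, {"telemetry", 41},
                            {"sequencer", 5}, {"thermal", 2} };
      int n = sizeof m / sizeof m[0], total = 0, i;

      for (i = 0; i < n; i++)
          total += m[i].failures;

      for (i = 0; i < n; i++)
          if (4 * m[i].failures > total)       /* more than 25% of the total */
              printf("%s: %d of %d failures -- rewrite it, don't patch it\n",
                     m[i].name, m[i].failures, total);
      return 0;
  }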

Certain models of software failure assign increased "reliability" to software
which has been exercised for long periods without fault.  One must
understand that this is simply formal modelling of the intuition that
some faults mean (yet) more faults.  This is certainly true of all
engineering fields.  While I don't have the "confirming data", I suggest you
consider your car, your friend's car, etc.  Any good history of engineering
will suggest that many designs are never marketed because of an unending
sequence of irremediable faults.

The intuitive explanation is: good design and careful implementation work.
This is teleological.  We define good design and careful implementation by
"that which works".

However, I carefully said "confidence".  Confidence is an intuitive
assessment of reliability.  I was not considering the formalized notion
of "confidence interval" used in statistical studies.  To obtain high
confidence in the number of faults requires observing very many errors,
thus lowering one's confidence in the product.  To obtain high confidence
in a product requires observing very few errors while using it.
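
A crude way to put numbers on that intuition (this is just the arithmetic of a
constant-failure-rate assumption, not any particular published model): after T
failure-free hours of operation, a 95% upper confidence bound on the failure
rate is -ln(0.05)/T, roughly 3/T.  A few lines of C make the point:

  #include <stdio.h>
  #include <math.h>

  int main(void)
  {
      /* With zero failures observed in T hours, exp(-rate*T) = 0.05 gives
         the 95% upper bound on the failure rate: -ln(0.05)/T, about 3/T. */
      double hours[] = { 10.0, 100.0, 1000.0, 10000.0 };
      int i;

      for (i = 0; i < 4; i++)
          printf("%7.0f failure-free hours: rate < %.5f failures/hour\n",
                 hours[i], -log(0.05) / hours[i]);
      return 0;
  }

Only long, failure-free use drives the bound down; piling up observed failures
can only push one's assessment the other way.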

------------------------------

Date: Sun, 14 Sep 86 22:40:21 pdt
From: Dave Benson <benson%wsu.csnet@CSNET-RELAY.ARPA>
To: risks%csl.sri.com@CSNET-RELAY.ARPA
Subject: I found one! (A critical real-time application worked the first time)

Last spring I issued a call for hard data to refute a hypothesis which I,
perhaps mistakenly, called the Parnas Hypothesis:
	No large computer software has ever worked the first time.
Actually, I was only interested in military software, so let me repost the
challenge in the form I am most interested in:
	NO MILITARY SOFTWARE (large or small) HAS EVER WORKED IN ITS FIRST
	OPERATIONAL TEST OR ITS FIRST ACTUAL BATTLE.
Contradict me if you can. (Send citations to the open literature
to benson@wsu via csnet)

Last spring's request for data has finally led to the following paper:
	Bonnie A. Claussen, II
	VIKING '75 -- THE DEVELOPMENT OF A RELIABLE FLIGHT PROGRAM
	Proc. IEEE COMPSAC 77 (Computer Software & Applications Conference)
	IEEE Computer Society, 1977
	pp. 33-37

I offer some quotations for your delectation:

	The 1976 landings of Viking 1 and Viking 2 upon the surface of
	Mars represented a significant achievement in the United States
	space exploration program. ... The unprecedented success of the Viking
	mission was due in part to the ability of the flight software
	to operate in an autonomous and error free manner. ...
	Upon separation from the Orbiter the Viking Lander, under autonomous
	software control, deorbits, enters the Martian atmosphere,
	and performs a soft landing on the surface. ... Once upon the surface,
	... the computer and its flight software provide the means by
	which the Lander is controlled.  This control is semi-autonomous
	in the sense that Flight Operations can only command the Lander
	once a day at 4 bit/sec rate.

(Progress occurred in a NASA contract over a decade ago, in that)

	In the initial stages of the Viking flight program development,
	the decision was made to test the flight algorithms and determine
	the timing, sizing and accuracy requirements that should be 
	levied upon the flight computer prior to computer procurement.
	... The entire philosophy of the computer hardware and
	software reliability was to "keep it simple."  Using the
	philosophy of simplification, modules and tasks tend toward 
	straight-line code with minimum decisions and minimum
	interactions with other modules.

(It was lots of work, as)

	When questioning the magnitude of the quality assurance task,
	it should be noted that the Viking Lander flight program development
	required approximately 135 man-years to complete.

(But the paper gives no quantitative data about program size or complexity.)

Nevertheless, we may judge this as one of the finest software engineering
accomplishments to date.  The engineers on this project deserve far more
plaudits than they've received.  I know of no similar piece of software
with so much riding upon its reliable behavior that has done so well.
(If you do, please do tell me about it.)

However, one estimates that this program is on the order of kilolines of FORTRAN
and assembly code, probably less than one hundred kilolines.  Thus
Parnas will need to judge for himself whether or not the Viking Lander
flight software causes him to abandon (what I take to be) his hypothesis
about programs not working the first time.

It doesn't cause me to abandon mine because there were no Martians shooting
back, as far as we know...

David B. Benson, Computer Science Department, Washington State University,
Pullman, WA 99164-1210  csnet: benson@wsu

------------------------------

Date: Tue, 16 Sep 1986  08:31 EDT
From: LIN@XX.LCS.MIT.EDU
To:   eugene@AMES-NAS.ARPA (Eugene Miya)
Cc:   risks@CSL.SRI.COM, arms-d@XX.LCS.MIT.EDU
Subject: Autonomous weapons
In-reply-to: Msg of 10 Sep 1986  16:17-EDT from eugene at AMES-NAS.ARPA (Eugene Miya)


    From: eugene at AMES-NAS.ARPA (Eugene Miya)

    ... another poster brought up the issue of autonomous weapons.
    We had a discussion of this at the last Palo Alto CPSR meeting.
    Are autonomous weapons moral?  If an enemy shows a white flag or hands up,
    is the weapon "smart enough" to know the Geneva Convention (or is that too
    moral for the programmers of such systems)?

What do you consider an autonomous weapon?  Some anti-tank devices are
intended to recognize tanks and then attack them without human
intervention after they have been launched (so-called fire-and-forget
weapons).  But they still must be fired under human control.  *People*
are supposed to recognize white flags and surrendering soldiers.

------------------------------

Date: Tue, 16 Sep 1986  09:01 EDT
From: LIN@XX.LCS.MIT.EDU
To:   Gary Chapman <chapman@RUSSELL.STANFORD.EDU>
Cc:   arms-d@XX.LCS.MIT.EDU, RISKS@CSL.SRI.COM
Subject: "Unreasonable behavior" and software

    From: Gary Chapman <chapman at russell.stanford.edu>
    	Information about targets can be placed into the munitions
    	processor prior to firing along with updates on meteorologi-
    	cal conditions and terrain.  Warhead functioning can also be
    	selected as variable options will be available.  The intro-
    	duction of VHSIC processors will give the terminal homing
    	munitions the capability of distinguishing between enemy and
    	friendly systems and finite target type selection.  Since
    	the decision of which target to attack is made on board the
    	weapon, the THM will approach human intelligence in this area.
    	The design criteria is pointed toward one munition per target
    	kill.

    (I scratched my head along with the rest of you when I saw this; I've
    always thought if you fire a bullet or a shell out of a tube it goes until
    it hits something, preferably something you're aiming at.  But maybe the
    Army has some new theories of ballistics we don't know about yet.)

The THM is an example of what the army calls a "fire-and-forget"
munition. A human being fires it in the general direction of the
target, and then the munition seeks out its target without further
intervention.  The munition has mechanisms to alter its course from a
ballistic trajectory.

    What level of confidence would we have to give soldiers (human soldiers--we
    may have to get used to using that caveat) operating at close proximity to
    THMs that the things are "safe"?

That is indeed the question.  My own guess is that THMs and other
smart munitions will never be able to distinguish friend from foe.
That's why most current concepts are directed towards attacking
enemy forces deep behind enemy lines, where you can ASSUME that
anything you see is hostile.

------------------------------

Date:     15 Sep 86 12:14:00 EDT
From:       John Coughlin  <JC%CARLETON.BITNET@WISCVM.WISC.EDU>
To:  <risks@csl.sri.COM>
Subject:  Risks of maintaining computer timestamps revisited


Some time ago I submitted an item to RISKS describing the way in which the
CP-6 operating system requires the time to be set manually during every warm
or cold boot.  The latest release of this OS contains an improvement: in most
cases the time need only be set manually on a cold boot.  Unfortunately, with
this enhancement came an unusual bug.

The timestamp is stored in a special hardware register, which is modified by
certain diagnostic procedures run during preventive maintenance.  It seems
these diagnostic procedures were not updated to reflect the new use to which
the timestamp register had been put.  As a result, any time a warm boot was
performed after PM, the monitor would freak out at the illegal timestamp and
mysteriously abort the boot with a memory fault.  Until this bug was patched,
the only fix was to power the computer down, thus clearing the offending value.
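
A defensive check of the sort that would have avoided the hang might look like
the sketch below (in C, with invented names and limits; the real CP-6 monitor
is of course not structured this way): validate the register at warm boot and
fall back to prompting the operator rather than trusting the value.

  #include <stdio.h>

  /* All names and limits here are invented; this is not CP-6 code. */
  #define EARLIEST 504921600L            /* assumed floor: 1 Jan 1986, Unix time */
  #define LATEST   (EARLIEST + 10L*365*24*3600L)

  /* Stub standing in for the hardware timestamp register; here it returns
     the sort of impossible value the PM diagnostics left behind. */
  static long read_hw_clock(void) { return -1L; }

  int main(void)
  {
      long t = read_hw_clock();

      if (t < EARLIEST || t > LATEST) {
          /* Don't abort with a memory fault, and don't silently accept a
             bogus date either: fall back to prompting the operator. */
          printf("warm boot: timestamp register invalid; operator must set time\n");
          return 1;
      }
      printf("warm boot: restored time %ld\n", t);
      return 0;
  }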

Luckily, the PM procedure set the timestamp register to an impossible value
rather than a realistic but incorrect value, so the problem manifested itself
in an obvious way instead of subtly changing the date and time.  Of course,
this came at the cost of having to fix a hung system.  This is yet another
illustration of the risk of breaking one thing while fixing another.

                                                                     /jc

------------------------------

End of RISKS-FORUM Digest
************************