stock@uwvax.UUCP (10/17/83)
A while ago I asked net.misc fans to send me references for expensive or interesting bugs. I got about a dozen replies, some with good references and some expressing interest in having the results posted to the net.

The purpose of the original article was to get a series of examples for a short talk I have now given on the reliability of computer systems. The examples I used in the talk follow, separated into broad categories of data, hardware, and software errors. Many of the errors were listed in ACM SIGSOFT's "Software Engineering Notes," in the Letters from the Editor; the editor, Peter Neumann, often publishes accounts of interesting bugs that come to his attention.

DATA ERRORS:

In 1972, a town in Rhode Island miscalculated its tax base by seven million dollars because of a keypunch error; as a result, the tax rates were set far too low.
[Kernighan and Plauger, *The Elements of Programming Style*, McGraw-Hill 1974, pp. 59-60]

A nuclear attack alert was caused by loading of a "war game" tape onto a NORAD computer in November, 1979.
["Computers and the U.S. Military Don't Mix," in *Science*, Volume 207, pp. 1183-1187]

HARDWARE ERRORS:

". . . in current chip technology . . . about one system error a week could be attributable to cosmic ray interference at the electron level."
[SEN, April 1980]
(not a specific bug, but an interesting point -- dLs)

Two nuclear attack alerts were caused by an integrated circuit in a communications multiplexor in June, 1980.
[SEN, July 1980]

SOFTWARE ERRORS:

A communication protocol error delayed the first space shuttle flight.
[article by J. Garman in SEN, October 1981]

An infinite loop in a simulated shuttle flight was caused by an attempt to abort the mission after a canceled previous abort (the designers of the program had not anticipated that the same mission might need to be aborted TWICE).
[SEN, January 1982]

The first Mariner flight to Venus failed due to a FORTRAN statement of the form "DO 3 I = 1.3"; this was intended to be a DO loop, but was interpreted as an assignment to the variable "DO 3 I" because the period should have been a comma.
[Myers, Glenford J., *Software Reliability*, Wiley & Sons, 1976, p. 275]

Another Mariner flight failed because of a missing "NOT" in a program.
[SEN, April 1980]

(NOTE -- I am not certain that these last two bugs are not just different versions of what happened to the same Mariner flight. I would appreciate any references that ascertain the facts either way -- dLs)

The first version of the F16 navigation software would have inverted the plane whenever it crossed the equator (fortunately this bug was caught in simulation)
[SEN, April 1980]

Five nuclear plants were closed when a serious bug was discovered in the software that helped to design the plants to be safe from earthquakes.
[SEN, April 1979]

Sorry for the length of this article. Many thanks to Steve Bellovin, Robert Duncan, Jim Jenal, Timothy L. Kay, Jeff Klein, Brian Marick, Hal Perkins, Rick Smith, Don Stanwyck, Chris Torek, and dciem!ntt for their replies (and to any others who tried to send replies, but did not succeed).

-- daniel stock
stock@uwisc
...!seismo!uwvax!stock