risks@CSL.SRI.COM (RISKS Forum) (10/12/90)
RISKS-LIST: RISKS-FORUM Digest  Thursday 11 October 1990  Volume 10 : Issue 49

        FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS
   ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  Programmer error kills phones for 30 minutes (John R. Dudeck)
  Answering Machine Cheats at Phone Tag (Ed McGuire)
  Discovery misprogrammed (Fernando Pereira)
  Airliner story (Rich Epstein via Gene Spafford)
  An IBM interface glitch & RISKS masthead FTP instructions (Lorenzo Strigini)
  Automobile Computer RISKS - A Real Life Experience (Marc Lewert)
  Re: BA 747-400 Engine Failure (Jerry Hollombe)
  Re: Equinox on A320 (Ken Tindell, Henry Spencer)
  Re: Ada and multitasking (Stephen Tihor, Henry Spencer)

The RISKS Forum is moderated. Contributions should be relevant, sound, in good
taste, objective, coherent, concise, and nonrepetitious. Diversity is welcome.
CONTRIBUTIONS to RISKS@CSL.SRI.COM, with relevant, substantive "Subject:" line
(otherwise they may be ignored). REQUESTS to RISKS-Request@CSL.SRI.COM.
TO FTP VOL i ISSUE j: ftp CRVAX.sri.com<CR>login anonymous<CR>AnyNonNullPW<CR>
CD RISKS:<CR>GET RISKS-i.j<CR>; j is TWO digits. Vol summaries in risks-i.00
(j=0); "dir risks-*.*<CR>" gives directory; bye logs out.
ALL CONTRIBUTIONS ARE CONSIDERED AS PERSONAL COMMENTS; USUAL DISCLAIMERS APPLY.
The most relevant contributions may appear in the RISKS section of regular
issues of ACM SIGSOFT's SOFTWARE ENGINEERING NOTES, unless you state otherwise.

----------------------------------------------------------------------

Date: Wed, 10 Oct 90 20:49:32 -0700
From: jdudeck@polyslo.CalPoly.EDU (John R. Dudeck)
Subject: Programmer error kills phones for 30 minutes

Phone lines - including the 911 emergency number - were dead throughout San
Luis Obispo [California] for thirty minutes Tuesday night [Oct 9, 1990]. The
interruption, which affected 30,000 customers, occurred at 10:15 p.m.
and affected virtually all phone lines in the city, said Jim Bower, Pacific
Bell's Central Coast area manager.

Bower said the interruption was caused by a computer programming error made by
a Pacific Bell employee. "We were putting in a new program and an error was
made," he said. "We were responsible for the error and corrected it as soon as
we could."

The error also caused phone lines at the Sheriff's department, including the
911 number, to go dead for 30 minutes, said Sheriff's Sgt. Scott Thompson. The
disruption didn't cause any serious problems, he said.

  - San Luis Obispo Telegram Tribune, Oct. 10, 1990, p. A-5

John Dudeck, jdudeck@Polyslo.CalPoly.Edu ESL: 62013975 Tel: 805-545-9549

------------------------------

Date: Thu, 11 Oct 90 15:08:40 CST
From: Ed McGuire <emcguire@cadfx.ccad.uiowa.edu>
Subject: Answering Machine Cheats at Phone Tag

Our organization has been playing the game of Phone Tag a lot recently. We
reasoned that, if the other player could leave a voice message for us instead
of asking a receptionist to have us call them back, we would win the game more
often. (Not to mention improve our response to our customers.) Accordingly, we
installed telephone answering machines for several people, including myself,
on the desks next to the telephones.

The telephones are just the visible part of the campus' fancy private branch
exchange (PBX). The PBX was insulted when I installed my machine. Accordingly,
when one of our secretaries tried to call me, it rang my phone only two times
while it rang in the caller's ear four times, then "forwarded on no answer."
When the secretary answered her phone, she was talking to herself. So I told
my machine to answer on two rings.

Then I found out that my machine has a bad attitude. For two days it left me
"1 message waiting" but there was nothing on the tape. I discovered that it
had been telling people to leave their name, company and phone, then hanging
up on them.
This was because I had mistakenly moved the wrong one of two identical
switches on the side to fix the earlier problem.

Today I caught my machine cheating at Phone Tag. I started the game by making
a call to a person in our Inventory Department. The line was busy, so I tagged
her by instructing our telephone system to call me back when she hung up.
YOU'RE IT. A few minutes later I left my desk on a short errand. And while I
was gone, she hung up and my phone rang. But now my machine saw its
opportunity! It answered the call, apologized to her for my absence and
instructed her to leave her name, company and phone. YOU'RE IT. The message I
received on the machine made it clear that this was breaking the rules of the
game.

                                * * *

Fortunately, this last event never actually happened. I realized the risk I
was taking due to the interaction of two Phone Tag technologies just before I
left my office. A colleague and I simply tested and verified the correctness
of my scenario.

Ed

------------------------------

Date: Wed, 10 Oct 90 22:30:10 EDT
From: pereira@research.mercury.nj.att.com (Fernando Pereira)
Subject: Discovery misprogrammed

Summary of story by AP Science Writer Lee Siegel ``Wrong Computer Instructions
Were Given to Discovery Before Liftoff''.

According to Discovery flight director Milt Heflin, the shuttle was launched
with incorrect instructions on how to operate some of its programs. The error
was discovered by the shuttle crew about one hour into the mission, and was
quickly corrected. NASA claims that automatic safeguards would have prevented
any ill effects even if the crew had not noticed the error on a display. The
error was made before the launch and discovered when the crew was switching
the shuttle computers from launch to orbital operations.
The switching procedure involves shutting down computers 3 and 5, while
computers 1 and 2 carry on normal operations, and computer 4 monitors the
shuttle's ``vital signs'' [I'm just following the article on this, I don't
know whether it is accurate]. However, the crew noticed that the instructions
for computer 4 were in fact those intended for computer 2. Heflin stated that
the problem is considered serious because the ground pre-launch procedures
failed to catch it.

Fernando Pereira, 2D-447, AT&T Bell Laboratories,
600 Mountain Ave, Murray Hill, NJ 07974 pereira@research.att.com

------------------------------

Date: Wed, 10 Oct 90 12:25:11 EST
From: Gene Spafford <spaf@cs.purdue.edu>
Subject: Airliner story

A few months ago, I told a friend about the various stories I had read here
and elsewhere about the A320. The subject came up when I explained why I would
never again fly Northwest Airlines (they bought a bunch of A320s for domestic
use). He just recently sent me this mail:

>> From: Rich Epstein <@VM.CC.PURDUE.EDU:REPSTEIN@GWUVM>
>> To: Spaf <spaf>
>> Date: Tue, 09 Oct 90 16:13:44 EDT
>>
>> I just came back from the IEEE Visual Languages Workshop, which
>> I thoroughly enjoyed. However, I thought you would find my
>> air horror story of interest:
>>
>> The conference was in Skokie, ILL, so I had to fly in and out
>> of O'Hare. We had a slight mishap on the plane on the way back
>> to Dulles. Heavy rains leaked into the plane and knocked out
>> the transponders and the auto-pilot computer. About 15 minutes
>> into the flight the pilot announced that we had to return to
>> O'Hare because the air traffic controllers couldn't "pick us
>> up". In other words, we were invisible, in the clouds, at
>> O'Hare. According to an Air Force ROTC student here at GW
>> the pilot meant this literally. Radar picks up aircraft by
>> means of the signal sent out by the transponders.
>>
>> We flew around in the clouds for 15 more minutes.
>> We landed with all sorts of emergency vehicles on the runway.
>> Then we waited for almost three hours until they finally replaced
>> the transponders and computer and we left Chicago on the
>> same plane (which I didn't like too much).
>>
>> The pilot got on the p.a. system after we were successfully
>> on our way to Dulles and he made an interesting remark. He
>> said that this was a good plane because it had "stainless
>> steel aeronautical control cables", a reference to the fact
>> that an Airbus would probably have been disabled completely
>> in a similar circumstance. I have no doubt that the pilot
>> was referring to the Airbus when he made this remark.

I wrote asking his permission to send this on to Risks, and in his reply, he
said:

>> By the way, I think the ground crew at O'Hare might have been
>> negligent in my airline incident. When we entered the airplane
>> water was LITERALLY POURING INTO THE AIRCRAFT at the door to
>> the airplane. Passengers had to JUMP THROUGH a sheet of water,
>> a thin veil, maybe 1/4" thick, but continuous. The water was
>> coming in from the top of the door and onto the floor of
>> the airplane. Obviously, the water went from there into the
>> underbelly of the craft. The reason for this was that the
>> airport walkway was not meeting the fuselage correctly.

------------------------------

Date: Mon, 08 Oct 90 09:49:54 MET
From: Lorenzo Strigini <STRIGINI@ICNUCEVM.CNUCE.CNR.IT>
Subject: An IBM interface glitch & RISKS masthead FTP instructions

Just to signal another minor problem similar to that of truncation to 80
columns:

After several unsuccessful attempts to follow the masthead instructions for
FTPing RISKS issues, I discovered this morning that my IBM3278 emulator eats
square brackets. CRVAX runs VMS, I guess, but I hadn't used such a system for
years: only when I moved my attempts to a Unix machine did I realize why I
could not cd to the risks directory.
The 3278 keyboards don't have square brackets, but square brackets entered
through an emulator are stored as escape sequences. ASCII square brackets that
exist in mailed or FTPed files are stored as such, and displayed as blank
spaces.

and here are a few left square brackets embedded in a series of dashes:
   --------------------
and a few right brackets:
   ----------:::::----------

Amazing... I thought this was worth signalling in case you receive requests
for help from other IBM users.

Lorenzo

Lorenzo Strigini, IEI del CNR, Via Santa Maria 46 I-56126 Pisa ITALY
Tel: +39-50-553159 ; Fax: +39-50-554342 ; strigini@icnucevm.bitnet

   [Lorenzo and RISKSers, I have long been annoyed at the miserable VAX
   command "cd sys$user2:[risks]" to get the anonymous FTPer into the RISKS
   directory. In response to Lorenzo's message, SRI's CRVAX wizard Ray Curiel,
   at Steve Milunovic's request, has provided a terse alias: "cd risks:".
   Upper/lower case does not matter, but the colon does. HooRAY! Thanks. I
   changed the masthead. Now you don't need to escape from the colonease? PGN]

------------------------------

Date: Wed, 10 Oct 90 12:58:25 PDT
From: marc@frederic.octel.com (Marc Lewert)
Subject: Automobile Computer RISKS - A Real Life Experience

With all of the discussion on the risks of computerized and/or electronic
controls in Aircraft, one should not overlook the fact that there could be
more down to earth (pun intended) risks along the same lines.

One of our cars has a computer, and various other electronic sensors that
control the engine. A couple of years back, shortly after we bought the car,
it started intermittently losing power. Not too much trouble on city streets,
but on the freeway, it was downright dangerous. The symptom was that the car's
engine would drop to idle speed, and would not speed up, no matter what we did
with the gas pedal. It could be reset by turning off, and restarting the
engine.
We had brought the car in several times, but the dealership could not find the
problem.

Then came the fateful day...My wife was driving the car when the engine
dropped to idle at the merge of two freeways, and she was in the slow lane of
one freeway that merged with the fast lane of the other...I was not happy when
I got the call (I told her to call the dealership to come get the car!).
Somehow she was talked into driving the car to the dealership, when the same
problem occurred. This time, though, it occurred in front of a truck hauling
an oversized load in the slow lane of the freeway. If there had not been an
offramp and a truck driver that was on his toes, we might be talking about my
wife in the past tense at this point.

The eventual problem was an intermittent failure of a sensor in the air intake
system. The computer responded by cutting down the fuel flow to its minimum
setting.

I just wonder how many of these types of problems exist out in the world, and
if anyone has been killed by them.

All in all, everyone was very lucky... This Time.

------------------------------

Date: 11 Oct 90 01:49:21 GMT
From: hollombe@ttidca.tti.com (The Polymath)
Subject: Re: BA 747-400 Engine Failure (Thomas, RISKS-10.47)

}... It's a FADEC failure" [FADEC = Full Authority Digital Engine Controller].
}... There has been a number of instances of spurious signals causing
}747-400 engines to throttle back or shut down, according to Flight ...

This begins to sound a bit like the discussion of electronic vs. mechanical
rail line switching controls.

I earned my Airframe & Powerplant mechanic's license (A&P) nearly 25 years ago
(when pilots were made of iron and planes were made of wood ... but I digress
(-: ). At that time, jet engine controls were almost entirely mechanical,
consisting of amazingly complex blocks of pneumatic and hydraulic sensors and
actuators.
Each engine had a main controller and a backup, "get you down alive"
controller that provided just enough control to keep the engine running if the
main failed.

Has the concept of such backups been lost in the rush to computerize?

Jerry Hollombe, M.A., CDP, Citicorp, 3100 Ocean Park Blvd. Santa Monica, CA 90405
(213) 450-9111, x2483 {csun | philabs | psivax}!ttidca!hollombe

------------------------------

Date: 11 Oct 1990 18:09:03 GMT
From: ken@minster.york.ac.uk
Subject: Re: Equinox on A320 (Pete Mellor, RISKS-10.48)

>The programme went on to consider the crash of the A320 at Bangalore. A pilot
>was interviewed saying that it was virtually unknown for an aircraft to lose
>height in such a way in clear conditions on a landing approach.

We know that the Bangalore crash _was_ pilot error. Both the `black box' and
the cockpit voice recorder indicate that the pilots were to blame. Flight
International has given a good account of this, including the CVR transcript.
The captain left the aircraft in idle descent mode and flew into the ground.
The aircraft warned the pilots (both visually and aurally), but they ignored
the warnings. Equinox chose not to report this (the rest of the programme
seemed very convincing).

A lot of people have a lot of axes to grind over Airbus Industrie, and
receiving totally impartial and accurate information is almost impossible.
Listen to Boeing and you hear that the A320 is a death-trap. Listen to
Aerospatiale and you hear supreme confidence. Watch TV programmes and you see
sensationalism.

Ken Tindell, Computer Science Dept., York University YO1 5DD, UK
Tel.: +44-904-433244 UUCP: ..!mcsun!ukc!minster!ken

------------------------------

Date: Wed, 10 Oct 90 12:39:31 EDT
From: henry@zoo.toronto.edu
Subject: Re: Equinox on A320 (UK Channel 4, Sun., 30th Sep)

>- The DFDR recording stops 4 seconds *prior* to impact with the trees.
>  (Davis added that, in his entire career, he had *never* come across a
>  similar instantaneous stoppage of a recorder.)

Is it possible that Davis is not familiar with *digital* flight recorders?
I've seen some commentary on such an issue in the aviation press recently: the
underlying problem is that some (all?) digital flight recorders buffer
incoming data in semiconductor memory, which loses its contents on power
failure. The airworthiness authorities are starting to be seriously displeased
with the potential for loss of crucial data, and there are mutterings about
requiring non-volatile memory. I don't know for sure that this accounts for
the above claim, but it certainly sounds like the right sort of symptoms.

(Would a simple explanation like this go unconsidered? Quite possibly,
especially in the context of a media story whose basic slant is "dirty work at
the crossroads". As I've commented before, there is a problem with the A320
business in that almost all participants have axes to grind and it is very
difficult to get a balanced view. The media are not exempt from this, since
sensation sells and boring truth doesn't.)

Henry Spencer at U of Toronto Zoology henry@zoo.toronto.edu utzoo!henry

------------------------------

Date: Wed, 10 Oct 1990 17:03:25 EDT
From: TIHOR@ACFcluster.NYU.EDU (Stephen Tihor)
Subject: Re: Ada and multitasking (Kristiansen, RISKS-10.48)

The areas left to the implementer were left that way due to disagreements on
the proper and useful choices. All such options must be fully specified in
mandatory sections of the Ada reference manual. The general phrase I remember
being used whenever such items were discussed is that the market will select
among conforming compilers.

In hindsight it might have been better to add some language clauses that allow
you to specify or explicitly leave unspecified the tasking priority
requirements.
On the other hand many people believe that the Ada tasking model, while
interesting, is not general enough to begin with.

------------------------------

Date: Wed, 10 Oct 90 12:59:53 EDT
From: henry@zoo.toronto.edu
Subject: Re: Ada and multitasking

> The author does not seem to realize the contradiction between the
> *reliability* and *portability* quoted as features of Ada on one hand, and
> the lack of definition of crucial features on the other.

There is no inherent contradiction here, unless "reliability" and
"portability" are taken to include the phrase "guaranteed or your money back".
(Mind you, some of the Ada enthusiasts essentially do claim this.)
"Reliability" and "portability" are not absolutes, especially in a language
constrained to be implemented efficiently on current machines. If such
constraints mean that a particular feature is not completely defined, this
just means that reliable/portable programs must avoid depending on it.

This does require competent programmers, however, and one gets the impression
that some of Ada's big backers hoped that their wonderful language would do
away with the need for competence. After all, it's much easier to run a test
suite through a compiler than to decide whether a programmer is competent.

The C community regularly sees broadsides on the subject, with ignorant people
claiming that the large number of "implementation defined" or even "undefined"
items in ANSI C implies that C programs cannot possibly be portable or
reliable. Not so; these are just indications of where the programmer must
avoid depending on implementation-dependent behavior. (There is room for
legitimate debate about whether C expects too much from its programmers, but
that is a different issue. Portable, reliable C code is verifiably possible.)
C gets more of this than Ada, because C is a rather unforgiving language meant
for people who know what they are doing, but almost any efficient language
will run into similar issues.
To draw an analogy from more traditional engineering, the basic art of
designing circuits with transistors is organizing things so that the
characteristics of individual transistors do not affect the outputs much.
Transistor characteristics are quite variable, especially if you want the
transistors to be cheap. This does not make it impossible to design transistor
circuits with predictable properties. It merely requires that designers take
care to use circuits that allow for the variability and cancel it out.

Henry Spencer at U of Toronto Zoology henry@zoo.toronto.edu utzoo!henry

------------------------------

End of RISKS-FORUM Digest 10.49
************************