[comp.risks] RISKS DIGEST 10.49

risks@CSL.SRI.COM (RISKS Forum) (10/12/90)

RISKS-LIST: RISKS-FORUM Digest  Thursday 11 October 1990  Volume 10 : Issue 49

        FORUM ON RISKS TO THE PUBLIC IN COMPUTERS AND RELATED SYSTEMS 
   ACM Committee on Computers and Public Policy, Peter G. Neumann, moderator

Contents:
  Programmer error kills phones for 30 minutes (John R. Dudeck)
  Answering Machine Cheats at Phone Tag (Ed McGuire)
  Discovery misprogrammed (Fernando Pereira)
  Airliner story (Rich Epstein via Gene Spafford)
  An IBM interface glitch & RISKS masthead FTP instructions (Lorenzo Strigini)
  Automobile Computer RISKS - A Real Life Experience (Marc Lewert)
  Re: BA 747-400 Engine Failure (Jerry Hollombe)
  Re: Equinox on A320 (Ken Tindell, Henry Spencer)
  Re: Ada and multitasking (Stephen Tihor, Henry Spencer)

The RISKS Forum is moderated.  Contributions should be relevant, sound, in good
taste, objective, coherent, concise, and nonrepetitious.  Diversity is welcome.
CONTRIBUTIONS to RISKS@CSL.SRI.COM, with relevant, substantive "Subject:" line
(otherwise they may be ignored).  REQUESTS to RISKS-Request@CSL.SRI.COM.  TO
FTP VOL i ISSUE j: ftp CRVAX.sri.com<CR>login anonymous<CR>AnyNonNullPW<CR> CD
RISKS:<CR>GET RISKS-i.j<CR>; j is TWO digits.  Vol summaries in risks-i.00
(j=0); "dir risks-*.*<CR>" gives directory; bye logs out.  ALL CONTRIBUTIONS
ARE CONSIDERED AS PERSONAL COMMENTS; USUAL DISCLAIMERS APPLY.  The most
relevant contributions may appear in the RISKS section of regular issues of ACM
SIGSOFT's SOFTWARE ENGINEERING NOTES, unless you state otherwise.

----------------------------------------------------------------------

Date: Wed, 10 Oct 90 20:49:32 -0700
From: jdudeck@polyslo.CalPoly.EDU (John R. Dudeck)
Subject: Programmer error kills phones for 30 minutes

   Phone lines - including the 911 emergency number - were dead throughout San
Luis Obispo [California] for thirty minutes Tuesday night [Oct 9, 1990].
   The interruption, which affected 30,000 customers, occurred at 10:15 p.m.
and affected virtually all phone lines in the city, said Jim Bower, Pacific
Bell's Central Coast area manager.
   Bower said the interruption was caused by a computer programming error made
by a Pacific Bell employee.  "We were putting in a new program and an error was
made," he said.  "We were responsible for the error and corrected it as soon as
we could."
   The error also caused phone lines at the Sheriff's department, including
the 911 number, to go dead for 30 minutes, said Sheriff's Sgt. Scott Thompson.
The disruption didn't cause any serious problems, he said.

	- San Luis Obispo Telegram Tribune, Oct. 10, 1990, p. A-5

John Dudeck, jdudeck@Polyslo.CalPoly.Edu ESL: 62013975 Tel: 805-545-9549

------------------------------

Date: Thu, 11 Oct 90 15:08:40 CST
From: Ed McGuire <emcguire@cadfx.ccad.uiowa.edu>
Subject: Answering Machine Cheats at Phone Tag

Our organization has been playing the game of Phone Tag a lot recently.  We
reasoned that, if the other player could leave a voice message for us instead
of asking a receptionist to have us call them back, we would win the game more
often.  (Not to mention improve our response to our customers.)  Accordingly,
we installed telephone answering machines for several people, including myself,
on the desks next to the telephones.  The telephones are just the visible part
of the campus' fancy private branch exchange (PBX).

The PBX was insulted when I installed my machine.  Accordingly, when one of our
secretaries tried to call me, it rang my phone only two times while it rang in
the caller's ear four times, then "forwarded on no answer."  When the secretary
answered her phone, she was talking to herself.

So I told my machine to answer on two rings.  Then I found out that my machine
has a bad attitude.

For two days it left me "1 message waiting" but there was nothing on the tape.
I discovered that it had been telling people to leave their name, company and
phone, then hanging up on them.  This was because I had mistakenly moved the
wrong one of two identical switches on the side to fix the earlier problem.

Today I caught my machine cheating at Phone Tag.

I started the game by making a call to a person in our Inventory Department.
The line was busy, so I tagged her by instructing our telephone system to call
me back when she hung up.  YOU'RE IT.

A few minutes later I left my desk on a short errand.  And while I was gone,
she hung up and my phone rang.  But now my machine saw its opportunity!  It
answered the call, apologized to her for my absence and instructed her to leave
her name, company and phone.  YOU'RE IT.

The message I received on the machine made it clear that this was breaking the
rules of the game.

			*	*	*

Fortunately, this last event never actually happened.  I realized the risk I
was taking due to the interaction of two Phone Tag technologies just before I
left my office.  A colleague and I simply tested and verified the correctness
of my scenario.
                                        Ed
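
The interaction between the two Phone Tag technologies can be sketched as a
small simulation.  This is only an illustration under stated assumptions: the
names Extension and callback_when_free are hypothetical, and the PBX is assumed
to ring the originator's extension first and to connect the other party to
whatever answers within the allowed number of rings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Extension:
    """A desk phone, optionally with an answering machine (hypothetical model)."""
    owner: str
    at_desk: bool = True
    machine_rings: Optional[int] = None  # rings before the machine answers; None = no machine
    messages: List[str] = field(default_factory=list)

    def answer(self, rings_allowed: int) -> str:
        """Return who picks up within rings_allowed rings."""
        if self.at_desk:
            return "person"
        if self.machine_rings is not None and self.machine_rings <= rings_allowed:
            return "machine"
        return "nobody"


def callback_when_free(caller: Extension, callee: Extension,
                       rings_allowed: int = 2) -> str:
    """Assumed PBX behaviour: when the callee hangs up, ring the caller's
    extension and connect the callee to whatever answers it."""
    picked_up_by = caller.answer(rings_allowed)
    if picked_up_by == "machine":
        # The PBX cannot tell a machine from a person, so the callee,
        # who never placed a call, is asked to leave a message.
        caller.messages.append(f"message from {callee.owner}")
        return f"{callee.owner} is connected to {caller.owner}'s answering machine"
    if picked_up_by == "person":
        return f"{callee.owner} is connected to {caller.owner}"
    return "callback abandoned"


ed = Extension("Ed", at_desk=False, machine_rings=2)
inventory = Extension("Inventory")
print(callback_when_free(ed, inventory))
```

With the machine set to four rings instead of two, the same model abandons the
callback, which mirrors the earlier "forwarded on no answer" behaviour.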

------------------------------

Date: Wed, 10 Oct 90 22:30:10 EDT
From: pereira@research.mercury.nj.att.com (Fernando Pereira)
Subject: Discovery misprogrammed

Summary of story by AP Science Writer Lee Siegel ``Wrong Computer
Instructions Were Given to Discovery Before Liftoff''.

According to Discovery flight director Milt Heflin, the shuttle was launched
with incorrect instructions on how to operate some of its programs. The error
was discovered by the shuttle crew about one hour into the mission, and was
quickly corrected. NASA claims that automatic safeguards would have prevented any
ill effects even if the crew had not noticed the error on a display. The error
was made before the launch and discovered when the crew was switching the
shuttle computers from launch to orbital operations. The switching procedure
involves shutting down computers 3 and 5, while computers 1 and 2 carry on
normal operations, and computer 4 monitors the shuttle's ``vital signs'' [I'm
just following the article on this, I don't know whether it is accurate].
However, the crew noticed that the instructions for computer 4 were in fact
those intended for computer 2.  Heflin stated that the problem is considered
serious because the ground pre-launch procedures failed to catch it.

Fernando Pereira, 2D-447, AT&T Bell Laboratories, 600 Mountain Ave, Murray
Hill, NJ 07974                                    pereira@research.att.com

------------------------------

Date: Wed, 10 Oct 90 12:25:11 EST
From: Gene Spafford <spaf@cs.purdue.edu>
Subject: Airliner story

A few months ago, I told a friend about the various stories I had read here and
elsewhere about the A320.  The subject came up when I explained why I would
never again fly Northwest Airlines (they bought a bunch of A320s for domestic
use).

He just recently sent me this mail:

>> From:    Rich Epstein <@VM.CC.PURDUE.EDU:REPSTEIN@GWUVM>
>> To:      Spaf <spaf>
>> Date:    Tue, 09 Oct 90 16:13:44 EDT 
>>
>> I just came back from the IEEE Visual Languages Workshop, which
>> I thoroughly enjoyed. However, I thought you would find my
>> air horror story of interest:
>> 
>> The conference was in Skokie, ILL, so I had to fly in and out
>> of O'Hare. We had a slight mishap on the plane on the way back
>> to Dulles. Heavy rains leaked into the plane and knocked out
>> the transponders and the auto-pilot computer. About 15 minutes
>> into the flight the pilot announced that we had to return to
>> O'Hare because the air traffic controllers couldn't "pick us
>> up". In other words, we were invisible, in the clouds, at
>> O'Hare. According to an Air Force ROTC student here at GW
>> the pilot meant this literally. Radar picks up aircraft by
>> means of the signal sent out by the transponders.
>> 
>> We flew around in the clouds for 15 more minutes. We landed
>> with all sorts of emergency vehicles on the runway. Then we
>> waited for almost three hours until they finally replaced
>> the transponders and computer and we left Chicago on the
>> same plane (which I didn't like too much).
>> 
>> The pilot got on the p.a. system after we were successfully
>> on our way to Dulles and he made an interesting remark. He
>> said that this was a good plane because it had "stainless
>> steel aeronautical control cables", a reference to the fact
>> that an Airbus would probably have been disabled completely
>> in a similar circumstance. I have no doubt that the pilot
>> was referring to the Airbus when he made this remark.

I wrote asking his permission to send this on to Risks, and in his
reply, he said:

>> By the way, I think the ground crew at O'Hare might have been
>> negligent in my airline incident. When we entered the airplane
>> water was LITERALLY POURING INTO THE AIRCRAFT at the door to
>> the airplane. Passengers had to JUMP THROUGH a sheet of water,
>> a thin veil, maybe 1/4" thick, but continuous. The water was
>> coming in from the top of the door and onto the floor of
>> the airplane. Obviously, the water went from there into the
>> underbelly of the craft. The reason for this was that the
>> airport walkway was not meeting the fuselage correctly.

------------------------------

Date: Mon, 08 Oct 90 09:49:54 MET 
From: Lorenzo Strigini <STRIGINI@ICNUCEVM.CNUCE.CNR.IT>
Subject: An IBM interface glitch & RISKS masthead FTP instructions 

Just to signal another minor problem similar to that of truncation to 80
columns: After several unsuccessful attempts to follow the masthead
instructions for FTPing RISKS issues, I discovered this morning that my IBM3278
emulator eats square brackets.  CRVAX runs VMS, I guess, but I hadn't used such
a system for years: only when I moved my attempts to a Unix machine did I
realize why I could not cd to the risks directory.

The 3278 keyboards don't have square brackets, but square brackets entered
through an emulator are stored as escape sequences. ASCII square brackets that
exist in mailed or FTPed files are stored as such, and displayed as blank
spaces.

and here are a few left square brackets embedded in a series of dashes:
     --------------------
and a few right brackets:
     ----------:::::----------

Amazing... I thought this was worth signalling in case you receive requests
for help from other IBM users.
                                          Lorenzo

Lorenzo Strigini, IEI del CNR, Via Santa Maria 46   I-56126 Pisa   ITALY
Tel: +39-50-553159 ; Fax: +39-50-554342 ; strigini@icnucevm.bitnet

     [Lorenzo and RISKSers, I have long been annoyed at the miserable
     VAX command "cd sys$user2:[risks]" to get the anonymous FTPer into
     the RISKS directory.  In response to Lorenzo's message, SRI's CRVAX 
     wizard Ray Curiel, at Steve Milunovic's request, has provided a terse
     alias: "cd risks:".  Upper/lower case does not matter, but the colon 
     does.  HooRAY!  Thanks.  I changed the masthead.  Now you don't need
     to escape from the colonease?  PGN]
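
The display behaviour Lorenzo describes can be imitated in a few lines.  This
is a sketch only: the DISPLAYABLE set and the render function are hypothetical
stand-ins for the emulator's character handling, assuming that any character
missing from its repertoire is shown as a blank.

```python
# Hypothetical display repertoire: printable ASCII minus the square
# brackets, which the 3278 character set is said to lack.
DISPLAYABLE = {chr(c) for c in range(32, 127)} - {"[", "]"}


def render(text: str) -> str:
    """Show text as the assumed emulator would: unmapped characters become blanks."""
    return "".join(ch if ch in DISPLAYABLE else " " for ch in text)


print(repr(render("cd sys$user2:[risks]")))  # brackets are replaced by spaces
print(repr(render("cd risks:")))             # the new alias survives intact
```

Under this model the old command cannot be read correctly off the screen,
while the bracket-free alias is displayed unchanged.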

------------------------------

Date: Wed, 10 Oct 90 12:58:25 PDT
From: marc@frederic.octel.com (Marc Lewert)
Subject: Automobile Computer RISKS - A Real Life Experience

With all of the discussion on the risks of computerized and/or electronic
controls in Aircraft, one should not overlook the fact that there could be
more down to earth (pun intended) risks along the same lines.

One of our cars has a computer, and various other electronic sensors that
control the engine.  A couple of years back, shortly after we bought the
car, it started intermittently losing power.  Not too much trouble on city
streets, but on the freeway, it was downright dangerous.  

The symptom was that the car's engine would drop to idle speed, and would not
speed up, no matter what we did with the gas pedal.  It could be reset by
turning off, and restarting the engine.

We had brought the car in several times, but the dealership could not find
the problem.

Then came the fateful day...My wife was driving the car when the engine dropped
to idle at the merge of two freeways, and she was in the slow lane of one freeway
that merged with the fast lane of the other...I was not happy when I got the
call (I told her to call the dealership to come get the car!).

Somehow she was talked into driving the car to the dealership, when the
same problem occurred.  This time, though, it occurred in front of a truck
hauling an oversized load in the slow lane of the freeway.  If there had not
been an offramp and a truck driver that was on his toes, we might be talking
about my wife in the past tense at this point.

The eventual problem was an intermittent failure of a sensor in the air intake
system.  The computer responded by cutting down the fuel flow to its minimum
setting.
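
The failure mode described here can be modelled as a toy controller.  It is a
sketch under assumptions, not the car's actual logic: EngineController,
MIN_FUEL and the latching flag are hypothetical, and the latch is inferred only
from the observation that restarting the engine cleared the symptom.

```python
from typing import Optional

MIN_FUEL = 0.1  # assumed idle fuel fraction, not a real calibration value


class EngineController:
    """Toy model: one invalid air-intake reading latches fuel at its minimum."""

    def __init__(self) -> None:
        self.fault_latched = False

    def fuel_command(self, pedal: float, airflow: Optional[float]) -> float:
        """pedal is 0.0..1.0; airflow is None when the sensor reading is invalid."""
        if airflow is None:
            self.fault_latched = True
        if self.fault_latched:
            return MIN_FUEL  # the pedal is ignored until the engine is restarted
        return max(MIN_FUEL, pedal)

    def restart(self) -> None:
        """Turning the engine off and on clears the latched fault."""
        self.fault_latched = False


ecu = EngineController()
print(ecu.fuel_command(0.8, airflow=42.0))  # normal operation
print(ecu.fuel_command(0.8, airflow=None))  # intermittent sensor failure
print(ecu.fuel_command(0.8, airflow=42.0))  # sensor is back, engine still at idle
ecu.restart()
print(ecu.fuel_command(0.8, airflow=42.0))  # normal again after restart
```

The point of the sketch is that a response which looks safe in the workshop
(minimum fuel) is not necessarily safe in the fast lane.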

I just wonder how many of these types of problems exist out in the world, and
if anyone has been killed by them.  All in all, everyone was very lucky...
	This Time.

------------------------------

Date: 11 Oct 90 01:49:21 GMT
From: hollombe@ttidca.tti.com (The Polymath)
Subject: Re: BA 747-400 Engine Failure (Thomas, RISKS-10.47)

}... It's a FADEC failure" [FADEC = Full Authority Digital Engine Controller].
}... There has been a number of instances of spurious signals causing
}747-400 engines to throttle back or shut down, according to Flight ...

This begins to sound a bit like the discussion of electronic vs. mechanical
rail line switching controls.

I earned my Airframe & Powerplant mechanic's license (A&P) nearly 25 years
ago (when pilots were made of iron and planes were made of wood ... but I
digress (-: ).  At that time, jet engine controls were almost entirely
mechanical, consisting of amazingly complex blocks of pneumatic and
hydraulic sensors and actuators.  Each engine had a main controller and a
backup, "get you down alive" controller that provided just enough control
to keep the engine running if the main failed.

Has the concept of such backups been lost in the rush to computerize?

Jerry Hollombe, M.A., CDP, Citicorp, 3100 Ocean Park Blvd.  Santa Monica, CA
90405        (213) 450-9111, x2483 {csun | philabs | psivax}!ttidca!hollombe

------------------------------

Date: 11 Oct 1990 18:09:03 GMT
From: ken@minster.york.ac.uk
Subject: Re: Equinox on A320 (Pete Mellor, RISKS-10.48)

>The programme went on to consider the crash of the A320 at Bangalore. A pilot
>was interviewed saying that it was virtually unknown for an aircraft to lose
>height in such a way in clear conditions on a landing approach.

We know that the Bangalore crash _was_ pilot error. Both the `black box'
and the cockpit voice recorder indicate that the pilots were to blame.
Flight International has given a good account of this, including the
CVR transcript. The captain left the aircraft in idle descent mode and
flew into the ground. The aircraft warned the pilots (both visually and
aurally), but they ignored the warnings. Equinox chose not to report
this (the rest of the programme seemed very convincing).

A lot of people have a lot of axes to grind over Airbus Industrie, and
receiving totally impartial and accurate information is almost impossible.
Listen to Boeing and you hear that the A320 is a death-trap. Listen to
Aerospatiale and you hear supreme confidence. Watch TV programmes and you see
sensationalism.

Ken Tindell, Computer Science Dept., York University YO1 5DD, 
UK Tel.: +44-904-433244   UUCP:     ..!mcsun!ukc!minster!ken

------------------------------

Date: Wed, 10 Oct 90 12:39:31 EDT
From: henry@zoo.toronto.edu
Subject: Re: Equinox on A320 (UK Channel 4, Sun., 30th Sep)

>- The DFDR recording stops 4 seconds *prior* to impact with the trees. (Davis
>  added that, in his entire career, he had *never* come across a similar
>  instantaneous stoppage of a recorder.)

Is it possible that Davis is not familiar with *digital* flight recorders?
I've seen some commentary on such an issue in the aviation press recently:
the underlying problem is that some (all?) digital flight recorders buffer
incoming data in semiconductor memory, which loses its contents on power
failure.  The airworthiness authorities are starting to be seriously
displeased with the potential for loss of crucial data, and there are
mutterings about requiring non-volatile memory.

I don't know for sure that this accounts for the above claim, but it
certainly sounds like the right sort of symptoms.
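
The buffering problem is easy to demonstrate with a toy model.  This sketch
assumes a recorder that collects samples in volatile memory and copies them to
the protected medium one full block at a time; BufferedRecorder and the block
size are hypothetical and do not describe any particular DFDR.

```python
class BufferedRecorder:
    """Toy recorder: samples wait in volatile memory until a block is full,
    then the whole block is copied to the crash-protected medium."""

    def __init__(self, block_size: int) -> None:
        self.block_size = block_size
        self.volatile = []    # contents are lost on power failure
        self.persistent = []  # contents survive power failure

    def record(self, sample) -> None:
        self.volatile.append(sample)
        if len(self.volatile) == self.block_size:
            self.persistent.extend(self.volatile)
            self.volatile.clear()

    def power_failure(self) -> list:
        """Return what could be read back after power is lost."""
        self.volatile.clear()
        return list(self.persistent)


rec = BufferedRecorder(block_size=4)  # e.g. four one-second samples per block
for second in range(10):
    rec.record(second)
print(rec.power_failure())  # samples 8 and 9 never reached the protected medium
```

In this model up to block_size - 1 of the most recent samples vanish, so the
recovered trace ends abruptly some time before the power was actually lost.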

(Would a simple explanation like this go unconsidered?  Quite possibly,
especially in the context of a media story whose basic slant is "dirty
work at the crossroads".  As I've commented before, there is a problem
with the A320 business in that almost all participants have axes to
grind and it is very difficult to get a balanced view.  The media are
not exempt from this, since sensation sells and boring truth doesn't.)

     Henry Spencer at U of Toronto Zoology  henry@zoo.toronto.edu  utzoo!henry

------------------------------

Date:    Wed, 10 Oct 1990 17:03:25 EDT
From: TIHOR@ACFcluster.NYU.EDU (Stephen Tihor)
Subject: Re: Ada and multitasking (Kristiansen, RISKS-10.48)

The areas left to the implementer were left that way due to disagreements on
the proper and useful choices.  All such options must be fully specified in
mandatory sections of the Ada reference manual.

The general phrase I remember being used whenever such items were discussed is
that the market will select among conforming compilers.

In hindsight it might have been better to add some language clauses that allow
you to specify or explicitly leave unspecified the tasking priority
requirements.

On the other hand many people believe that the Ada tasking model, while
interesting, is not general enough to begin with.

------------------------------

Date: Wed, 10 Oct 90 12:59:53 EDT
From: henry@zoo.toronto.edu
Subject: Re: Ada and multitasking

> The author does not seem to realize the contradiction between the
> *reliability* and *portability* quoted as features of Ada on one hand, and
> the lack of definition of crucial features on the other.

There is no inherent contradiction here, unless "reliability" and "portability"
are taken to include the phrase "guaranteed or your money back".  (Mind you,
some of the Ada enthusiasts essentially do claim this.)  "Reliability" and
"portability" are not absolutes, especially in a language constrained to be
implemented efficiently on current machines.  If such constraints mean that a
particular feature is not completely defined, this just means that
reliable/portable programs must avoid depending on it.  This does require
competent programmers, however, and one gets the impression that some of Ada's
big backers hoped that their wonderful language would do away with the need for
competence.  After all, it's much easier to run a test suite through a compiler
than to decide whether a programmer is competent.

The C community regularly sees broadsides on the subject, with ignorant people
claiming that the large number of "implementation defined" or even "undefined"
items in ANSI C implies that C programs cannot possibly be portable or
reliable.  Not so; these are just indications of where the programmer must
avoid depending on implementation-dependent behavior.  (There is room for
legitimate debate about whether C expects too much from its programmers, but
that is a different issue.  Portable, reliable C code is verifiably possible.)
C gets more of this than Ada, because C is a rather unforgiving language meant
for people who know what they are doing, but almost any efficient language will
run into similar issues.

To draw an analogy from more traditional engineering, the basic art of
designing circuits with transistors is organizing things so that the
characteristics of individual transistors do not affect the outputs much.
Transistor characteristics are quite variable, especially if you want the
transistors to be cheap.  This does not make it impossible to design transistor
circuits with predictable properties.  It merely requires that designers take
care to use circuits that allow for the variability and cancel it out.
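
One standard way of cancelling out device variability is negative feedback.
The posting above does not name a technique, so this is offered only as an
illustration; the feedback fraction and the gain values are assumed numbers.
The closed-loop gain is G = A / (1 + A*B), where A is the device-dependent
open-loop gain and B is the fraction of the output fed back.

```python
def closed_loop_gain(open_loop_gain: float, feedback_fraction: float) -> float:
    """Classic negative-feedback relation: G = A / (1 + A * B)."""
    return open_loop_gain / (1.0 + open_loop_gain * feedback_fraction)


B = 0.1  # assumed feedback fraction, set by resistors rather than by the transistor
for A in (1_000, 3_000, 10_000):  # a tenfold spread in device gain
    print(A, round(closed_loop_gain(A, B), 3))
```

Although A varies by a factor of ten, G stays within about 1% of 1/B = 10,
which is the sense in which a well-chosen circuit makes the characteristics of
the individual part nearly irrelevant.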

     Henry Spencer at U of Toronto Zoology  henry@zoo.toronto.edu  utzoo!henry

------------------------------

End of RISKS-FORUM Digest 10.49
************************