gd%sri-spam@sri-unix.UUCP (06/29/84)
From: gd@sri-spam (Greg DesBrisay)

Today's Chicago Tribune (Tues., 6/26) has an article about the replacement of Discovery's backup computer, described as costing $1.2 million and weighing 57 tons. No wonder it won't fly. I remember seeing news pictures of a technician picking up and carrying a box that the reporter said was the backup computer that failed on the first launch try; it sure didn't look like 57 tons! By the way, does the Chicago Tribune mean to imply that NASA is going to throw that $1.2 million computer on the scrap heap?! Even the government tries to fix such expensive pieces of equipment and use them again, don't they?

Greg DesBrisay
gd@sri-spam
lmc@denelcor.UUCP (Lyle McElhaney) (07/10/84)
No, the computer will not go into the scrap heap (yet, anyway). Much more important (at least in the NASA way of doing things) is to discover exactly WHY the computer failed. It will most probably be sent back to the manufacturer to be tested to death in order to find out whether the fault was a fluke (and why) or whether it is a design error. Unresolved failures in a piece of man-rated hardware are very serious. If the problem is found to be a single error in that computer alone, and can be fixed without stressing other components, and the computer can be requalified, then it may join the available spare-parts list. Testing may well stress it beyond its useful life, in which case it will be scrapped, but not before the secrets of its problems have been squeezed out of it.

This is one of the major reasons why the exploration of space has been so expensive. In a commercial operation, that computer would be tested, and if it failed, it would be scrapped and replaced. Another $1.2 million burned. NASA will more than likely expend far more than the $1.2 million tracing the error. The largest part of that cost per computer, in the first place, is the paperwork that the manufacturer has to generate on each piece of hardware in order that a fault can be traced to the exact step where it was introduced. Yes, this probably kept a few astronauts from being killed in space; but it has made space exploration a very expensive proposition, indulged in only by governments.

-- 
Lyle McElhaney
(hao,brl-bmd,nbires,csu-cs,scgvaxd)!denelcor!lmc
al@ames.UUCP (Al Globus) (07/13/84)
> No, the computer will not go into the scrap heap (yet, anyway). Much more
> important (at least in the NASA way of doing things) is to discover exactly
> WHY the computer failed. It will most probably be sent back to the
> manufacturer to be tested to death in order to find out whether the fault
> was a fluke (and why) or whether it is a design error....
>
> This is one of the major reasons why the exploration of space has been so
> expensive. In a commercial operation, that computer would be tested, and if
> it failed, it would be scrapped and replaced. Another $1.2 million burned.
> NASA will more than likely expend far more than the $1.2 million tracing
> the error. The largest part of that cost per computer, in the first place,
> is the paperwork that the manufacturer has to generate on each piece of
> hardware in order that a fault can be traced to the exact step where it
> was introduced. Yes, this probably kept a few astronauts from being killed
> in space; but it has made space exploration a very expensive proposition,
> indulged in only by governments.
>
> -- Lyle McElhaney

As a recent spate of in-flight failures has shown, extreme caution is needed in space to make things work. The margins for error are tiny, and the consequences of mistakes are in the hundred-million-dollar range or more. Insurance money for spacecraft is drying up and getting very expensive due to failures by PAM-D, Ariane, and other upper stages. NASA is extremely careful because that is what it takes to make spacecraft work. Even the vast documentation requirements failed to note a critical pin on Solar-Max that almost caused the mission to fail. The paperwork could be replaced by computer work at lower cost and greater reliability, but leaving out the tests and documentation is asking for megabucks down the tubes.
henry@utzoo.UUCP (Henry Spencer) (07/17/84)
> As a recent spate of in-flight failures has shown, extreme caution is
> needed in space to make things work. The margins for error are tiny, and
> the consequences of mistakes are in the hundred-million-dollar range or
> more. Insurance money for spacecraft is drying up and getting very
> expensive due to failures by PAM-D, Ariane, and other upper stages. NASA
> is extremely careful because that is what it takes to make spacecraft
> work. Even the vast documentation requirements failed to note a critical
> pin on Solar-Max that almost caused the mission to fail. The paperwork
> could be replaced by computer work at lower cost and greater reliability,
> but leaving out the tests and documentation is asking for megabucks down
> the tubes.

As several projects have demonstrated, vast documentation systems are *not* necessary for the (rare) projects that are run *right*. A good example of this is the SR-71 Blackbird. It's still the world's fastest aircraft (if you don't count the Shuttle's brief reentry), and 25 years ago it was a formidable challenge. New ground had to be broken in a dozen areas, including metallurgy. [I mention this because tracking every last piece of metal is one of the reasons frequently advanced for needing bales of paper for everything.] Nevertheless, it got by with several orders of magnitude less documentation than "ordinary" aircraft projects needed, even then. "Do not confuse effort with work."

The basic problem with the space business right now is not the lack of still-more-detailed documentation. It is the "everything is required to work right the first time" attitude. Now, don't get me wrong. There is nothing wrong with "we will do our best to make sure it works the first time"; it's definitely the only way to go. The problem is when you start insisting that failures are not just undesirable, but unacceptable. This means that it is impossible to do meaningful experiments, because they might fail.
*OF COURSE* it is expensive to build, say, a Space Shuttle, when the roof falls in if the tiniest thing goes wrong. How many aircraft are required to be perfect after only a handful of test flights? Yet the Shuttle program not only organized things this way, it based the whole viability of the program on the notion that the Shuttle would be fully operational almost instantly. This is madness, and awesomely expensive madness too. Even in military aircraft programs, not noted for being well-managed, it's common for the first dozen aircraft to be allocated solely to test work, with no expectation that they will ever be useful otherwise. Where are the test shuttles? Please don't tell me that the orbiters are too expensive to be used this way; this is known as "painting yourself into a corner", and does not connote good design to me.

In retrospect, it is clear that the Shuttle was too ambitious a project, trying to meet too many needs simultaneously. The US would be much better off with a large fleet of much smaller reusable spacecraft, plus big expendable boosters for heavy-lift work. Oh, true, the heavy-lift jobs ought to be done with reusables, too -- EVENTUALLY. But one must learn to crawl before one can walk, and NASA is now paying the price for trying to take shortcuts. "Of course it'll work." Sure.

Of course "extreme caution is needed... to make things work", of course "margins for error are tiny", of course the consequences of mistakes are severe -- because the whole system is organized on the assumption that mistakes will never happen! The margins for error should never have been allowed to get that small, because Murphy's Law really does apply here, as everywhere else. "Even the vast documentation ... failed to note a critical pin on Solar-Max that almost caused the mission to fail...", and as we all know, it's a good thing for the Shuttle's credibility that the Solar Max repair worked. This sort of cliffhanger should not be allowed to happen.
It's a travesty to design a spacecraft to be repaired in orbit by the Shuttle and then forget to include an emergency de-spin system, which would permit the thing to be despun for repair in the presence of attitude-control failure. It's ridiculous to set up a repair mission which cannot adapt to the smallest problem. My understanding was that the docking failure was caused by a spike of fiberglass sticking up; why didn't the astronaut have clippers on hand for coping with such things? (Yes, I know, because the spacesuits are too clumsy for such fine work in tricky conditions... please don't set me off about the wretched misdesign of current spacesuits...) It's a credit to the cleverness of the astronauts and the people on the ground that they managed successful completion of such a zero-defects mission after the inevitable defects showed up.

I hope the rescue mission for the PAMmed satellites is indeed mounted. It would be another small step towards a system that is somewhat tolerant of unexpected difficulties. Unless, of course, the mission is a failure because NASA, once again, assumes that the plan is perfect and nothing will go wrong...

I realize that I am, to some degree, slandering NASA unfairly. They do put a lot of attention into contingency plans and such. But this is all to meet *expected* troubles; building in enough flexibility to meet the *unexpected* problems is a subtly different thing. Sometimes NASA pulls this off, sometimes not. It was fortunate for the Apollo 13 crew that some smart people insisted on making the LM computer identical to the CM one, rather than specializing it for the lunar landing only. It was a potentially disastrous inconvenience to them that nobody thought to apply the same philosophy to the lithium-hydroxide air-purifier cartridges; fortunately they managed to improvise around that one.
This same phenomenon has been noted in other contexts, notably military aircraft projects: lots of attention to known problem areas, but a firm subconscious assumption that everything else will work, because it's required to.

The only real solution to this is a firm emphasis on getting real working hardware -- not computerized guesswork and theoretical pontifications -- going *early*, so that the inevitable mistakes can be found and fixed. Testing must be thorough, and must be done on whole systems, not just components! The tests, and preferably the operational service thereafter, must not be structured on the assumption that there will be no failures: failure-tolerance must be built into the plans, not just the hardware. Note that this implies designing the whole system so that a single failure is neither disastrous nor astronomically expensive. (I don't even want to *think* about the results of a Shuttle crashing.) Everyone, especially Congress and the media, should be clearly told that trouble is expected and is not cause for panic. ["You say your program still needs debugging, because you didn't write it correctly the very first time? Unacceptable. You're fired."]

I know, it's easier said than done. Especially for a US government bureaucracy. Best argument I've heard yet for private industry in space...

-- 
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
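Failure-tolerance built into the hardware usually takes a concrete shape: fly redundant channels and act only on a majority vote, so that a single failure is detected and masked rather than trusted. A toy sketch of that voting core, in modern Python -- the function name and interface here are invented for illustration, not anything NASA flew:

```python
from collections import Counter

def vote(outputs):
    """Majority vote over the outputs of redundant channels.

    Returns the value that a strict majority of channels agree on,
    or None when no majority exists -- a condition the surrounding
    system must be designed to survive (e.g. by switching to a
    backup), rather than assumed never to happen.
    """
    value, count = Counter(outputs).most_common(1)[0]
    return value if count * 2 > len(outputs) else None

# Three channels: one faulty reading is outvoted and masked.
print(vote([42, 42, 41]))  # 42
# Total disagreement is reported as a failure, not papered over.
print(vote([1, 2, 3]))     # None
```

The point of returning None instead of guessing is exactly the attitude argued for above: the plan around the voter must allow for the vote failing.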
mcgeer%ucbkim%Berkeley@sri-unix.UUCP (07/19/84)
From: Rick McGeer (on an h19-u) <mcgeer%ucbkim@Berkeley>

Damn right you're being unfair. Remember the flap in the late fifties and early sixties when NASA did have a series of tests and did have busted launches? There was a howl from the public and the Congress that could be heard from Moscow. Given that, and the Mondale/Proxmire bills in the seventies to kill NASA, it's not surprising that NASA feels that it can't test and can't have any failures -- the lawyers in Congress, who don't understand engineering design and don't want to, would cut funds in a minute if, say, an orbiter blew up on the pad.

Also, isn't it always the case that prototypes are more expensive than the production version?

	Rick.