October 1999
Five ways to kill a project
Technology doesn’t kill projects; people do
After 13 years of fruitless effort and countless taxpayer dollars, the largest civilian computer project ever attempted rolled over and died. It was the FAA’s Advanced Automation System, and Robert Britcher was there ’til the last dog was hung. His book on the subject, The Limits of Software, is filled with lessons about what not to do. What follows is his litany of top-management mismanagement: five project killers, each of which takes a perfectly sound IT principle and twists it into a cancerous parody of itself. Read ’em and weep.
In the 1980s and early 1990s, in the wheelhouse of the FAA’s Advanced Automation System, government procurement procedures and regulations mushroomed. They mandated everything from contractor pricing to the number of “IF” statements that could be nested within one programming procedure. It seemed that for every layer of complexity, a layer of supervision was added. (The government has recently swayed to the left, allowing private companies some freedom. But with the year 2000 bug spreading fear throughout the nation—thanks to the rare collusion between the administration and the press—there may again be a shift to the right.)
Micromanage to kill productivity
“Then, thinking that the chair work was not going well, I moved the crowbar to the right, to get a better look. This done, the area of work decreased, the area of supervision increased. Less and less chair work was getting done. (Not just because the area is smaller, but because every increase in supervision taxes the work. Thermodynamically, more heat is exchanged.) I noticed that the work was still lax and I moved the crowbar farther to the right. The work area shrunk. The supervision area grew. I moved the bar to the right until work was crowded out and stopped dead.”
Make sure it works, even if it’s useless
The Major is a surefire kind of guy. He doesn’t go in for probabilities. His favorite question isn’t a question at all: “Do you receive my meaning?” He’s a combat veteran. In countless battles, he has fought it out with Hughes, with TRW, with Boeing, with a lot of contractors building large computer systems. His father fought in the Balkans 50 years ago.
The Major is likable. He has quoted Thucydides at project reviews. I’ve heard he stands on his head for five minutes before lunch. He plays contract bridge and is quite good at it: a tournament player. Sometimes he “hypnotizes himself” doing needlepoint. He used to drag his needlepoint around with him, but never worked on it during meetings or on planes. The Major told me years ago he “played a key role in improving the quality of large computer systems by upping the level of supervision, because basically, these prime contractors can’t be trusted.”
Who writes military standards? We do: the citizens, or a few government officials and subcontractors acting on our behalf. This is as it should be. We pay for these systems and they support us. And we don’t take to systems that do not work. Standards need to be severe so that we can filter out those systems for which we have no consensus, or clear need, or that are too complicated to build and use. Sometimes they are filtered out while they are being built. These are often called failures by people who know how much money was spent on them or who lost out on promotions. But the systems are just being filtered later, after we’ve had a better look at them.
The real danger lies in successfully building systems that aren’t good for us. So far, there have only been a few. Not that we aren’t trying for more. Every week some new system or a new version of an old one is left to wither, its books stuck away in vaults, the code libraries purged. I wonder how many thousands of programs have been warehoused, left to some custodian. It’s too bad we don’t record the history of these rehearsals, scrupulously, like scientists. We could learn something. Maybe we could learn not to make the same mistakes, if mistakes are what we are making. But there is no provision for this, no journal of lost systems. The dead are not even named, just forgotten.
Turn specifications into straitjackets
The elaboration goes on in direct proportion to the number of government employees and subcontractors and the severity of the procurement policies. It is less apt to roll on when government streamlining is popular. When the elaboration rolls on, as it did on the Advanced Automation System project, it weighs heavily against the contractor’s flexibility to modify the design. When the design details and the rules for designing are imposed by contract, common sense is replaced by litigation.
The practice of government-designed computer systems began early in our digital history. In 1963, the FAA insisted the automated en route air traffic control system use a multiprocessor—the 9020. It was believed to be the best way to protect against failures. IBM, one of the bidders, felt a duplex would be fine, one computer backing up another. A much less complicated solution, it would require no investment in a new computer. Very few multiprocessors were on the market then, and none met the FAA’s requirements.
Joe Fox, IBM’s salesman for the project and a writer of children’s books, met often with T.V. Learson, IBM’s chairman of the board, about building a multiprocessor. Learson balked at the idea, but after many meetings he gave in. Fox convinced him that IBM couldn’t enter the air traffic control business without a multiprocessor. So IBM designed and manufactured the 9020, and wrote a specialized operating system to support it: the NAS Monitor. This took seven years. Custom-built, the 9020 and the NAS Monitor were never sold commercially.
The FAA wanted a commercially available multiprocessor that met the air traffic control availability requirements. Put another way, the FAA wanted to buy commercial products and it wanted them highly customized, in this case, for air traffic control.
For over 40 years, the U.S. government has been specifying conflicting requirements—and changing them throughout the development cycle. This costs the taxpayers a considerable sum of money, and it is one of the reasons so many large systems fail.
If the give and take between the sponsor and the contractor is reasonable, and the system is not too large or complex, chances are good that our government-sponsored systems can be built on time and at a fair cost. Frequent communications and reviews are essential. No one can foresee how a system will unfold. No one gets it right the first time. The buyers and the users and the builders must look things over from time to time, especially in the beginning, before it’s too late.
Unfortunately, the give and take can become personal. Then, before you know it, the project runs away. Auditors are brought in. They drill into the layers of documentation, looking for errors. Every aspect of the project is briefed. This costs more money and dilutes the production of the vital materials. But it creates a mound of new ones: correspondences that explain and correct facts and impressions, volumes of rejoinders, reports, sometimes hundreds of pages long. Nothing is overlooked. A misplaced gerund might cause a satellite to fall from the sky. The recommendation: more supervision. A few more years go by and the cost for a single programming statement exceeds $1500, and Congress wants to know why.
Only test what you’re sure will work
Why? Because both the government and the prime contractor benefit from early acceptance. For the government, acceptance is a major milestone; often it is on record with Congress. The sponsor’s promise to deliver a system to the people, to us, is fulfilled. The contractor is rewarded financially for meeting or beating this deadline. The sooner both parties enter what is called the maintenance phase, the sooner the pressure is off.
The intentions are honorable: to demonstrate that the private sector has complied with its part of the bargain, that it has met the terms of the contract. In practice, the story’s different. With this objective looming on the horizon (no matter how distant), both parties, the sponsor and the contractor, shade testing toward acceptance. So much emphasis is placed on “qualifying the system” that the tough business of testing every aspect of the system, to find as many failures as possible, is compromised. Precious time and money are diverted.
The acceptance concept may once have made sense, when systems were hewed of cloth and steel, when materials, not yet worn by time and use, could be declared fit. In contrast, software is never completely fit; its only constant is the presence of faults and their potential to cause failures. Faults don’t so much inhabit the material as suffuse it. So, in computer systems, acceptance testing is anathema to both science and engineering. Testing should simply continue until all of the performance criteria have been met, including the failure rate.
The very existence of an acceptance test ensures that, sooner or later, there will be pressure not to find failures, to reconcile test scenarios and the system they exercise so that the system will “pass.” Remove this concept and testers would pursue the discovery of failures with a vengeance, the way NASA and its contractors pursued them on the manned space systems.
On the Advanced Automation System project, the FAA put great stock in qualifying each requirement: not the system that is derived from the requirements (and many other variables), but each sentence describing the system’s potential behavior. The approach turned into a massive, ongoing acceptance test. As a result, IBM had more than two hundred people writing test plans and procedures for what was called requirements testing. A single department, of about a dozen testers, was assigned to integrate and find failures in the system. Five years into the project’s acquisition phase, after tens of millions of dollars had been spent preparing for the acceptance tests, the imbalance was corrected—somewhat.
Sacrifice utility for metrics
Requirements testing is a little like trying to reduce crime in England by running a test for each line in the Magna Carta. In practice, rhetoric replaces the scientific method. As with mathematical proofs, the quest for “true or not false” soon turns into a party—a not very cordial party—marked by endless debate over whether this sentence or that sentence has been verified by this moment of behavior or that.
On the Advanced Automation System project, the floors sagged beneath the requirements test procedures. Each step was notarized, as if the very land on which the computers rest were changing hands. IBM was pressed into a gargantuan effort. Tests were practiced over and over, always witnessed by the FAA. Then there were the pretest briefings and the posttest briefings, and their respective rehearsals. As each requirement was “verified,” it was checked off, like soldiers entering boot camp. All this so that we could count the uncountable. But a subsystem with 99 percent of its requirements checked off could be unusable.
The FAA called this formal testing. The term “formal” in computer programming has come to mean using mathematical methods, such as reasoning. But on the Advanced Automation System, “formal” meant ceremonial. Considering the arbitrary phrasing of the requirements statements and their paucity in relation to the system’s romping behavior and its uncountable number of states … well, it reminds me of my father’s neighbor’s approach to raking leaves: leaning on the wash-line pole, rake in hand, pontificating, while the whole world marches by and leaves trickle across the sidewalk. Later, everyone is surprised when failures show up in the files and missiles hit schools.