April 6, 2008

Some must employ the scythe
Posted by Abi Sutherland at 06:40 PM *

Once again, a major implementation goes pear-shaped. On Thursday, March 27, Heathrow Airport opened Terminal 5 with great fanfare. It promised a revolution in passenger convenience, and included a new automated baggage handling system[1]. But things did not go well, and the opening weeks are sure to become a case study in project failure. Hundreds of flights to and from the terminal were redirected or cancelled. The stranded luggage mountain reached a peak of 28,000 bags, but appears to be declining with intensive manual effort.

I’m not going to bore you with the details of what I think went wrong in this specific case; I’ve mused about it on my own blog. But this implementation is part of a larger picture of bad decisions—expensive bad decisions—that have a wider impact than flight delays and missing luggage. The human factors here are the same as those that let the levees fail in New Orleans, and spread the Challenger across the Florida sky. Taken to national scale, they’re part of why we’re in Iraq.

Listen to Mustn’ts, child, listen to the Don’ts.
Listen to the Shouldn’ts, the Impossibles, the Won’ts.
Listen to the Never Haves, then listen close to me.
Anything can happen, child,
Anything can be.

Shel Silverstein, Listen to the Mustn’ts

We love heroes and leaders, from Alexander the Great and his iconic descendants[2] to Captain Kirk and his. Whether real or fictional, they stretch the bounds of the possible. They show us a world where our fears don’t limit us, and inspire us to try to live there.

The problem is that great leadership is about more than “the vision thing” or being “the Decider”. A poor leader can sound like a great one by choosing a direction and sticking to it, counting on his “will” to carry him (and the people following him) over the obstacles that they encounter. And if the obstacles are small and their momentum great, that is all that’s needed. But that doesn’t make him a great leader. That makes him lucky, and luck runs out.

What a real leader needs is people who disagree with him. I don’t mean the needlessly contrary, the ornery and the difficult. I mean people who share his ultimate goal, but whose job and passion it is to pick holes in his plans to get there in order to improve them. Sometimes that’s the loyal opposition; sometimes it’s the court jester. Sometimes it’s citizens exercising their First Amendment rights. Sometimes it’s me.

Since human beings tend to be highly goal-oriented, establishing the proper goal has an important psychological effect. If our goal is to demonstrate that a program has no errors, then we shall tend to select test data that have a low probability of finding errors. On the other hand, if our goal is to demonstrate that a program has errors, our test data will have a higher probability of finding errors.

Glenford J Myers, The Art of Software Testing

Testers (like me) are, in the small scale, the disagreeable advisors to the king. We share the ultimate goal with our project managers: releasing the product to the admiring masses. But our job is to find problems, from the design stage through to the final build, so they can be fixed. To do that well, we must stand in opposition to the belief that the code, or the product, or the plan is bug-free. We start with the supposition that something is wrong and go looking for it.

This mindset never earns testers, or anyone who questions a leader’s vision, many friends. In the public sphere, where motivations are part of the discourse, it is taken as evidence of bias and a reason to ignore any inconvenient views. In the corporate world, it can lead to the perception that the testers are never satisfied, and can therefore be overruled at will.

Some must employ the scythe
Upon the grasses,
That the walks may be smooth
For the feet of the angel.
Some keep in repair
The locks, that the visitor
Unhindered passes
To the innermost chamber.

Philip Larkin, from The Dedicated

Being a tester, or any kind of disagreeable advisor, can be unglamorous and unrewarding. Our successes are rarely advertised. What leader proclaims, “I had this idea that I thought was good, but then my advisers pointed out that it was impractical”? I can think of one major cancelled plan in my own field, when the London Eye’s initial passenger run on the Turn of the Century was called off because of a failed safety test[3]. Usually, though, the successes are marked by delayed implementation dates or revised press releases. Not the stuff of fame.

But I believe that those of us who question leaders, who look for failures and are only reluctantly convinced of successes, are important. We can stop a bad idea before it becomes a bad plan, a bad product, a bad policy. I think we need more people like that in the wider world, whether it be consumers adopting a security mindset or citizens questioning government officials.

And even if all we do is teach our leaders the mantra reserved for those who ignore their disagreeable advisors, well, that too is a form of recognition.

Ivanova is always right. I will listen to Ivanova. I will not ignore Ivanova’s recommendations. Ivanova is God. And, if this ever happens again, Ivanova will personally rip your lungs out!

Susan Ivanova, Babylon 5

  1. This is the kiss of death for a new terminal, and has been since Denver International Airport’s long-running and epic failure to automate their baggage handling. It was introduced in 1995 after a two year delay, and spent a decade expensively mangling and mislaying passenger luggage. It was put out of its misery in 2005, after parallel trials proved that manual handling was more accurate, faster, and cheaper.
  2. All the way down to JFK and Bill Clinton, everyone who has used the icon of the tousle-haired young idealist who conquered the world hearkens back to him.
  3. And if I ever meet the tester who made that recommendation stick, I will buy the drinks all night.

Comments on Some must employ the scythe:
#1 ::: elise ::: (view all by) ::: April 06, 2008, 06:44 PM:

I am sending this to a tester friend immediately. Nicely done.

#2 ::: Jon Meltzer ::: (view all by) ::: April 06, 2008, 06:49 PM:

But - we wouldn't need to waste money on testers if the programmers and engineers would only do their work right.

(Yes, I've really heard this)

#3 ::: clew ::: (view all by) ::: April 06, 2008, 06:49 PM:

I was a software tester a while back, and our group was told to develop a mission statement. We stuck to short ones, knowing the long ones would be unverifiable...

Our manager's bid was "Ship great software", but we argued him down on the grounds that that goal wasn't unique to us. Our banner finished "Don't ship bad software." It's got no glamor to it, but evidently it needs to be done.

#4 ::: AliceB ::: (view all by) ::: April 06, 2008, 07:18 PM:

Hear! Hear!

#5 ::: Kip W ::: (view all by) ::: April 06, 2008, 07:30 PM:

I'm so worried about what's happening today,

In the Middle east you know,

And I'm so worried about the baggage retrieval

system they've got at Heathrow.

--John Cleese (years and years ago -- I heard this in the mid 80s)

#6 ::: Kip W ::: (view all by) ::: April 06, 2008, 07:33 PM:

...By the way, I love Silverstein, and that's always been one of my favorite poems of his.

#7 ::: Kip W ::: (view all by) ::: April 06, 2008, 07:35 PM:

Silly me! I spelled "Terry Jones" wrong. I'm, uh, going somewhere else for a while.

#8 ::: Fragano Ledgister ::: (view all by) ::: April 06, 2008, 07:38 PM:

It's the little things that always seem to matter.

#9 ::: michelel ::: (view all by) ::: April 06, 2008, 07:40 PM:

Kip W. @5 -- Okay, that's scary. And that song came up on my randomizer last week, I think.

---

As a coder, I scold my QCers when they apologize or call themselves pains. "I'd rather you find it than it get to the customer," I tell them. Sometimes I grouse, but I like to think they know that it's a backhanded kind of praise. I'll even phrase it that way: "That Jane, always doing her job right and making more work for me, feh."

My group recently started an actual peer review program, and I try to approach it the same way: I'm gonna break it. I inflate my own ego, at least for the content I specialize in, and approach each review with the mindset that the programmer can't possibly have considered all these quirks I know about.

I hear that Abe Lincoln built a Cabinet of people who would disagree with him; I think Bill Clinton may have had a few himself. These days, only military experts dare disagree, and those that do always seem to find the door soon after (or earlier) ... shame.

#10 ::: Steve Taylor ::: (view all by) ::: April 06, 2008, 07:45 PM:

I'm one of the people who *makes* the errors by trade, but I've always respected the weasely mind of a good tester. The really good ones - the ones who can bring a solid looking program down like a house of cards - are fairly rare.

I remember Denver airport with great affection - it was the first major software disaster I'd read about in detail, and it was gripping stuff. Now I wish I could read as much about Melbourne's failed deployment of an ambulance dispatch system, or the internals of Melbourne's public transport swipe card system ('Myki'), currently a couple of years late and not looking happy.

I wonder what the Heathrow implementors are feeling? I remember working at MelbourneIT when a freshly released domain registration system had to be rolled back because it Just Didn't Work. Everyone (except lucky me, the contractor) had been working late nights for weeks, and it had all come to nothing, and depleted as they were, they *still* had to come to work and try to fix things. It felt like the morning after the Dieppe Raid...

#11 ::: P J Evans ::: (view all by) ::: April 06, 2008, 08:11 PM:

My boss, the one who retired twenty years ago, used to say things like 'If you don't have time to do it right, where will you find the time to do it over?' My current boss doesn't understand why I react badly to statements about 'fixing it later'.

#12 ::: Nat ::: (view all by) ::: April 06, 2008, 08:31 PM:

michelel @ 9: I always try to tell them that I'd much rather they tell me about the bug when I can still fix it than have to apologize to my mom when she runs the software she paid for and it breaks.

You can quantify how much money it costs to fix a defect at every single stage of the development process, and the cost-per-defect doesn't exactly go down as you get to the later stages. It really does amaze me that so many people don't grasp this.

#13 ::: David Harmon ::: (view all by) ::: April 06, 2008, 08:34 PM:

Amen! Far too many managers and executives think they can make any problem go away by quoting platitudes at their subordinates -- or worse, by firing whoever insists there's a problem.

#14 ::: xeger ::: (view all by) ::: April 06, 2008, 08:38 PM:

I've occasionally wondered if part of the problem isn't the inability to visualize and understand the magnitude of the problem.

It's pretty easy to have a mental image of "the bridge will fall down" -- it's not nearly so easy to have a mental image of "people might lose some data".

Similarly, if it's not going to make any particular difference to you, nobody will know that you were involved, and you won't have to fix it... that's almost invisible, isn't it...

#15 ::: bellatrys ::: (view all by) ::: April 06, 2008, 08:38 PM:

We start with the supposition that something is wrong and go looking for it.

Clearly this is all the sign of Bad Will, the lack of a Can-Do Attitude, and a perverse desire to See Good Things Fail.

[/snark, from someone who eventually gave up trying to point out the inherent problems in boss-ly enthusiasms due to the fact that nobody *ever* remembers that Cassandra did, in fact, say so before things went beyond all recognition, even when the circumstances are awfully similar to last time...]

#16 ::: B. Durbin ::: (view all by) ::: April 06, 2008, 08:48 PM:

Many of the people in my life are the functional equivalent of testers. My father went to the Pentagon on a yearly basis for some time because he was good at spotting errors— and wasn't afraid of pointing them out*. (One colonel admiringly referred to him as "the [subordinate] who chewed me out.")

*More specifically, he could explain in simple and direct terms why a procedure or process wouldn't work, so they sent him off every so often to do the explaining.

#17 ::: Luthe ::: (view all by) ::: April 06, 2008, 08:49 PM:

This is why my personal motto is "Hope for the best, plan for the worst." People (and machines) are bound to screw up. There's always a bigger fool.

This is why I dislike theorists in some fields. They miss the fact that people never do what one expects them to do.

#18 ::: Sam Kelly ::: (view all by) ::: April 06, 2008, 09:05 PM:

Nat at #12 wrote: You can quantify how much money it costs to fix a defect at every single stage of the development process, and the cost-per-defect doesn't exactly go down as you get to the later stages.

And in addition to that, the pile of bugs-to-be-fixed will only increase over time. Fixing them is usually easier if you do them one at a time and don't have to worry about changing the code in three different directions at once.

I started off as a tester, became a developer, and more than once I've thrown away a pile of code and started from scratch, because that's what it's there for.

#19 ::: Greg London ::: (view all by) ::: April 06, 2008, 09:18 PM:

I worked on the software for the fly-by-wire computer for the Boeing 777 for about three years. That's where I met my lovely Ada, a strict Victorian, but always a lady when it came to type conversions, which is to say she wouldn't convert unless you made the proper introductions.

Sometimes I miss that sort of work, because it could feel quite rewarding when you got something right. Sometimes I don't miss it, because you could get ulcers when things went wrong.

It was a very odd feeling the first time you come to grips with the fact that your coding error could get someone killed.

A few years later, I rode on a 777, and that was an even stranger feeling knowing that my coding error might get me killed.

You do the absolute best you can, but man, the stress was crazy sometimes.

#20 ::: lightning ::: (view all by) ::: April 06, 2008, 09:44 PM:

#12: You can quantify how much money it costs to fix a defect at every single stage of the development process, and the cost-per-defect doesn't exactly go down as you get to the later stages.

The usual metric that I've seen is a factor of 10 at each stage of the production process. If a programmer can fix an error in his/her own code for a cost of one unit, it costs 10 in unit test, 100 in system test, 1000 in alpha, 10000 in beta, and 100000 in production. My experience shows that this is, if anything, an underestimate.

At each stage, the errors become more subtle -- a typo is trivial to fix, an obscure timing bug that shows up in beta may take weeks, and a total collapse of an installed system (like the Denver or Heathrow baggage systems) may mean a total redesign.
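The factor-of-10 rule of thumb is easy to tabulate. A minimal sketch, assuming a constant multiplier at every stage and an arbitrary unit cost of 1; the name `fix_cost` and the stage list are illustrative, taken only from the figures quoted above, not from any measured data:

```python
# Stage names and the factor of 10 come from the comment above.
# The unit cost of 1 is an arbitrary illustration, not a real figure.
STAGES = ["coding", "unit test", "system test", "alpha", "beta", "production"]


def fix_cost(stage: str, factor: int = 10, unit_cost: int = 1) -> int:
    """Cost of fixing a defect first caught at `stage`,
    assuming the cost multiplies by `factor` at each later stage."""
    return unit_cost * factor ** STAGES.index(stage)


if __name__ == "__main__":
    for stage in STAGES:
        print(f"{stage:12s} {fix_cost(stage):>7d}")
```

Changing `factor` shows how sensitive the last row is to the assumed multiplier, which is the part of the rule that is hardest to verify.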

#21 ::: Shannon ::: (view all by) ::: April 06, 2008, 10:32 PM:

In some ways, being a copyeditor is similar to being a tester. You need a keen eye and attention to details that other people miss. Fortunately, a few typos usually won't mean anyone's deaths, but occasionally it will mean a very angry reader (or article subject!). That is one thing I'm glad about as a writer - the likelihood of killing someone is low.

#22 ::: Marilee ::: (view all by) ::: April 06, 2008, 10:42 PM:

My first QA job I was always waking the head coder up at night because I'd killed the software. He was always sure I couldn't have, but they wrote really bad software. And the reason I called him in the middle of the night was because he'd goofed off and delivered the s/w to me near the end of the time I was supposed to have to test and he needed to come in early so we could deliver the product on time. I can remember him saying "We'll just train the sailors well, so they never hit the wrong key."

And to most of the public, my volunteer job for the small charity that I'm on the board of is auctions this time of year. But what the board actually brought me on for was to make sure things get done on time and properly. In other words, I'm the nag.

#23 ::: Clifton Royston ::: (view all by) ::: April 06, 2008, 10:43 PM:

Speaking of which, I don't suppose anybody knows a great software tester (and better still, test process designer) looking for work in Hawaii?

See, a bug was just found Friday in some of the code changes I've been working on over the last few months - and it's been out in the field, affecting customers because our present testing didn't catch it. What, me infallible? No.

Alas, I asked my one-time favorite counterpart in QC, but she is doing other types of work and is not available or interested.

I suppose it's not strictly true that I would kill for a good tester around now, but I might be willing to pummel and maim a bit for one. Seriously, if you know a really good software tester looking for work in Hawaii, email me.

#24 ::: Lee ::: (view all by) ::: April 06, 2008, 11:13 PM:

Abi: The "security mindset" article also goes a long way toward explaining how many people get caught by scammers, e-mail or otherwise.

#25 ::: Josh Jasper ::: (view all by) ::: April 07, 2008, 12:00 AM:

I tend to get kudos for my bug catching, but if my company's software is broken, we lose clients and money.

#26 ::: Mary Dell ::: (view all by) ::: April 07, 2008, 12:18 AM:

Part of why I love my job is that they utilize my deep, deep pessimism. They've put me in a role where I do a lot of risk mitigation, so my tendency to expect the worst works well for me.

Developing a discipline around it can be challenging -- sometimes something just feels wrong to me, but that doesn't help me to make my case. I know there's a rational analysis underlying the feeling, but I can't always articulate it into an argument.

#27 ::: Kevin Andrew Murphy ::: (view all by) ::: April 07, 2008, 12:28 AM:

The situation at Heathrow has just gotten more interesting with Naomi Campbell's arrest:

http://www.eonline.com/news/article/index.jsp?uuid=fa694956-b2f9-4f35-8bf6-77ca170719d8

Extra data not included in the linked article but mentioned on Entertainment Tonight: Supposedly the lost bag, besides being a ridiculously expensive Louis Vuitton number, also held a (presumably) ridiculously expensive outfit which Campbell was going to be wearing on the Tonight Show.

As much as we can laugh about Zsa Zsa cop-slapping or spitting behavior and stars and their absurdly priced couture, I'm on Campbell's side here: She's traveling to LA to attend a memorial service, and is of course distraught about that, and is also doing a professional media appearance likely arranged by her publicist. The clothes for both are probably in the same bag, and she doesn't have time to shop for alternates when she gets there, and it's not like British Airways lost bag clothing allowance is going to cover the cost of whatever an international fashion model would be wearing or would be expected to be wearing to a funeral or on a major national television program.

Plus she paid for 1st class and that's supposed to mean something more than wine and better snacks.

Campbell's ruined trip is simply a more high-profile version of everyone else's and there's no excuse for it.

I remember when I had my dog shipped to San Jose International and they sent him to the wrong terminal and left him shivering in his crate out on the tarmac until I finally got him. The stupid young woman the airport sent to find him said they were sorry it took so long but they had "an incident." I tore into her, saying I wasn't paying her to have "an incident," and the main cause of the "incident" was the airport being too cheap and stupid to hire enough staff to deal with unexpected "incidents" and regular business at the same time.

Thankfully my dog was merely very cold and hadn't died of hypothermia in the time it took the airport to deal with their "incident" but it all comes down to a case of cheap and stupid.

If Heathrow had just paid to have enough staff on hand to deal with the bags in case the new baggage machines broke down, there wouldn't be any troubles going on except for having to pay a lot of people overtime behind the scenes.

#28 ::: Steve Taylor ::: (view all by) ::: April 07, 2008, 12:33 AM:

David Harmon at #13 writes:

> Amen! Far too many managers and executives think they can make any problem go away by quoting platitudes at their subordinates

David - I think you just need to learn to work smarter, not harder!



sorry. don't hit me.

#29 ::: Terry Karney ::: (view all by) ::: April 07, 2008, 12:47 AM:

One of the things I've learned in the army is worst case planning. We don't do it as well as we might, but we do it (casualty projections are sobering, and doing risk assessments for a classroom lecture seems pointless, but the habit is really good, and has stood me in good stead; insisting we turn around when we were in "The Narrows" of The Paria, when we felt raindrops was a really good idea, the water, back at our camp, rose more than a foot; where the river was widened from 20 feet to 100, you do the math).

I took this thinking to my machining job. My boss (who thought he was "one of the guys" never mind that he had a degree from Brown, and his mother in law was paying him to go to law school; while nominally making him the shop manager, but I digress) asked me how long a job was going to take.

"Worst case..."

"I don't want to hear about worst case. What's the best case?"

That was also how he told Don to bid jobs. The jobs came in, but somehow lots of them ended up not making as much money as they were supposed to.

And then the jobs started to slow down.

Greg: One of the strange things about my career has been the strange level of "Little jobs with big import". I've been a combination translator, protocol and cultural advisor. We were doing a theater missile defense exercise at SpaceCom. It was all fun and games until I realised we were actually establishing protocols which would be used in the event of a real event.

Sobering.

#30 ::: Distraxi ::: (view all by) ::: April 07, 2008, 02:01 AM:

lightning @ 20:

I've heard the factor-of-10 thing used before too, and it sounds plausible to me. But the only study of it I've ever seen (from Boehm's Software Engineering Economics, which admittedly dates back to the dark ages) made it more like x100 across the whole project from Requirements to Operational, which is much less than x10 per phase*.

Does anyone know of more recent evidence on this one?

*unless you're skipping a lot of steps
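The arithmetic behind that comparison: if the cost grows by the same factor at every phase boundary, a total of x100 implies a per-phase factor of 100^(1/n) for n boundaries. A small sketch; the number of boundaries is an assumption, since the comment does not say how many phases the study counted, and `per_phase_factor` is a name made up for illustration:

```python
def per_phase_factor(total_factor: float, transitions: int) -> float:
    """Constant per-phase multiplier that compounds to `total_factor`
    over `transitions` phase boundaries."""
    if transitions < 1:
        raise ValueError("need at least one phase boundary")
    return total_factor ** (1.0 / transitions)


if __name__ == "__main__":
    # Hypothetical phase counts; the source does not give one.
    for n in (2, 3, 5):
        print(n, round(per_phase_factor(100, n), 2))
```

Only with two boundaries does x100 overall work out to x10 per phase, which is the point of the footnote about skipping steps.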

#31 ::: Dave Bell ::: (view all by) ::: April 07, 2008, 03:43 AM:

They still need some baggage handling staff, but forgot to provide for them getting to work--parking space and such.

They brought in extra staff from other terminals, and it's been reported that a dozen of them started fighting over what was the correct procedure.

Terminal 5 is apparently a hub. They couldn't start off at reduced capacity, because they couldn't easily shift passengers to other terminals to catch connecting flights.

It's a common domestic/international shopping mall departure lounge. It isn't the only such airport terminal in the UK, and they installed a special check system so that they could confirm the passenger who has been issued the boarding pass was the one who tried to board the plane. The core of that system is well-tested at large airports. Up until now it has used a digital photograph system, but for T5 they decided to use fingerprints.

It's not just a huge project, it seems to have repeatedly been arranged to maximise the risk, and the cost, of failure.

#32 ::: Peter Erwin ::: (view all by) ::: April 07, 2008, 05:04 AM:

This article in the Economist has some interesting comments about Heathrow in the larger context of UK and European airports. It mentions the Terminal 5 foul-up only in passing, and is rather sanguine about T5's immediate future, but is pessimistic about Heathrow's long-term future, particularly as an international hub for transfers. (There's no real room to expand, continental hubs have more capacity, and current plans for expansion seem based on dodgy accounting.)

There's a (grimly) amusing passage which suggests that the T5 mess is part of long-term tradition:

When the first permanent terminal (today's Terminal 2) was built in 1955, it was decided to stick with the original layout and reach it through a narrow road tunnel, which is still the main way in. The next two terminals were also placed in the centre, ensuring perpetual traffic congestion. “Heathrow's history”, says Sir Peter, “is a series of minor planning disasters that together make up one of the country's truly great planning catastrophes.”

#33 ::: John Chu ::: (view all by) ::: April 07, 2008, 06:11 AM:

I spent years doing microprocessor design verification. (I'm now on the other side doing architecture. i.e., putting in the bugs, not finding them.) Every verification team I've been on or near has championed the "bugs are good" philosophy. That is, architects will make their best efforts, but they'll ultimately put in bugs anyway. So, a high bug rate is a good thing, not a bad thing. (Well, a design which never stabilizes is a bad thing. There are usually other signs besides a bug rate which never lowers though.)

In retrospect, I knew exactly when a microprocessor for which I was a verification unit lead was doomed. At a status meeting, project managers asked if I could make the bug rate for my unit look better (i.e., lower) for the sake of upper management. I was appalled. My job was to find all the bugs that can be found, not cook the status for upper management. (At other meetings, they took me to task for milestone dates which were consistently several months later than everyone else's. They were, in general, not thrilled with me. However, I'll note that most everyone else missed their target dates... by several months, usually. That people were submitting irrational target dates was a symptom, not the root problem, IMHO.)

For the record, we did eventually get to the point where, despite our best efforts, the bug rate remained stubbornly low. This was after much yelling ensued (not my proudest moments as a professional.) We also had to replace the unit's design team and do a complete redesign. (Apparently, the third design team was the charm. The 2nd design team had chosen to maintain the 1st design team's design. I hopped on board as verification unit lead with the 2nd design team and stayed on to verify the 3rd.) However, we brought it in on schedule. It ended up being one of the first units ready for tape out. Unfortunately, by then, it was obvious the project as a whole was doomed. *sigh*

#34 ::: Connie H. ::: (view all by) ::: April 07, 2008, 07:11 AM:

>I can remember him saying "We'll just train the sailors well, so they never hit the wrong key."

Ah! My best boss ever would say things like "This would be a perfect application, if it weren't for the damn users!" Happily, that was said in jest, which is why he is probably the very best boss I'll have ever had.

#35 ::: heresiarch ::: (view all by) ::: April 07, 2008, 07:33 AM:

"What a real leader needs is people who disagree with him. I don’t mean the needlessly contrary, the ornery and the difficult. I mean people who share his ultimate goal, but whose job and passion it is to pick holes in his plans to get there in order to improve them. Sometimes that’s the loyal opposition; sometimes it’s the court jester. Sometimes it’s citizens exercising their First Amendment rights. Sometimes it’s me."

Patriotism is the highest form of dissent, as that one fellow said. Or something like that...

#36 ::: iain ::: (view all by) ::: April 07, 2008, 08:06 AM:

Good article. I knew one guy who had a sign pinned to the wall above his desk. It just said 'remember the Therac 25'.

Good advice.

#37 ::: Pete ::: (view all by) ::: April 07, 2008, 08:20 AM:

Abi - excellent sentiments, eloquently expressed.

As a software developer/systems analyst/business tester (long story), I'd have to say I agree with basically all of it.

To my mind the absolute hardest idea to get across to the people who matter is a remarkably simple one: QA does not create bugs, it exposes them. The bugs already exist, and if QA isn't given the time and resources to find them, then the customers almost certainly will (usually at the worst possible time).

The underlying assumption is, and always has to be, that the software does not work properly until demonstrated otherwise.

Trust, but verify.

#38 ::: Jon Meltzer ::: (view all by) ::: April 07, 2008, 08:51 AM:

I suspect that, if a Parliamentary investigation of the Heathrow disaster ever occurs, it will be found that the engineering was passed through several levels of outsourcing with most of the monies on each level spent on other than product development/testing. But I suppose that makes me anticapitalist.

#39 ::: Greg London ::: (view all by) ::: April 07, 2008, 09:04 AM:

Another interesting job was hardware design on a satellite. There was concern to find our design errors, but the big thing about that job was that everything is designed from the point of view of "Space is gonna get you, sucka." Radiation can cause a normally static bit to either flip or oscillate or break. The level of radiation would give you the average rate of bit flips. We then had to make sure all our data paths had error correction that happened at a faster rate than the radiation would mess it up.
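One hedged way to put numbers on that constraint, assuming upsets arrive as a Poisson process and that each word is protected by a code that corrects a single flipped bit (both are assumptions; the comment specifies neither): a word is lost only if two or more flips land in it between correction passes. The function name and the rates below are hypothetical.

```python
import math


def p_uncorrectable(flips_per_word_per_s: float, scrub_interval_s: float) -> float:
    """Probability that one word collects two or more flips between
    correction passes, under a Poisson model of radiation upsets."""
    mean_flips = flips_per_word_per_s * scrub_interval_s
    # Poisson: P(0 or 1 event) = exp(-m) * (1 + m)
    return 1.0 - math.exp(-mean_flips) * (1.0 + mean_flips)


if __name__ == "__main__":
    rate = 1e-3  # hypothetical flips per word per second
    for interval in (1.0, 10.0, 100.0):
        print(interval, p_uncorrectable(rate, interval))
```

For small means the result is roughly the square of the mean divided by two, so under this model halving the correction interval cuts the risk to about a quarter.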

#40 ::: xeger ::: (view all by) ::: April 07, 2008, 09:12 AM:

iain @ 36 ...

This talk about the security butterfly effect includes a section about the Therac-25 (mp3 can be found here).

#41 ::: Jon Meltzer ::: (view all by) ::: April 07, 2008, 09:14 AM:

'remember the Therac 25'

My friend the nuclear plant cooling engineer says that if nuclear engineering had design and management like software, most of America would be radioactive. (Let's hope their standards remain true once Chernobyl passes out of direct engineering memory.)

#42 ::: Bruce Cohen (SpeakerToManagers) ::: (view all by) ::: April 07, 2008, 10:18 AM:

iain @ 36

A former colleague of mine worked on a project at Los Alamos Labs inspired by the Therac-25 fiasco. Their task was to design, prototype, and test a fail-safe / fail-tolerant cobalt-60 radiotherapy system. From first principles, realizing that software always has bugs, and hardware always fails, they decided to include a final fail-safe: a large block of lead held up by an electromagnet and a latch with a mechanical timer release. If the lead block fell, it blocked the opening the radiation came out of. If the power went out, the electromagnet went off and the block fell. If everything failed to shut down, the mechanical timer hit its end point within a safe time, and opened the latch to drop the block. Suspenders, belt, clips, and velcro. Software always has bugs, hardware always fails.
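The logic of that final fail-safe can be written down in a few lines. This is a simplified sketch of the behaviour as described in the comment, not the Los Alamos design itself; the function and parameter names are invented for illustration. The key property is that the beam is exposed only while every hold condition is true, so any single failure drops the block.

```python
def beam_exposed(magnet_powered: bool, elapsed_s: float, timer_limit_s: float) -> bool:
    """True only while the lead block is held up.

    The block needs BOTH the electromagnet energised and the latch closed.
    Losing power or reaching the timer limit lets it fall and block the beam.
    """
    # The mechanical timer opens the latch once it reaches its end point.
    latch_closed = elapsed_s < timer_limit_s
    return magnet_powered and latch_closed


if __name__ == "__main__":
    print(beam_exposed(True, 5.0, 60.0))    # normal operation
    print(beam_exposed(False, 5.0, 60.0))   # power failure
    print(beam_exposed(True, 60.0, 60.0))   # timer ran out
```

Note that the safe state is the default: nothing has to work correctly for the block to fall, which is the opposite of a shutdown that depends on software issuing a command.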

#43 ::: Bruce Cohen (SpeakerToManagers) ::: (view all by) ::: April 07, 2008, 10:30 AM:

Wonderful post, abi. A subject that just gets more topical all the time as we press further into the age of Positive Thinking and Best-Case Analysis. The transportation industry seems to be a never-ending source of bad examples, perhaps because the budgets are so huge, the projects so visible, and the politics so ferocious.

The one I'll never forget was the first test run of a train in the BART system in San Francisco. I was living near Sacramento and looking into jobs and housing in the Bay, and the idea of using a train to get to work was very enticing (I grew up in the Northeast, and was going all over Philadelphia on their then-excellent transit system before my 11th birthday). So I followed the stories about BART on the TV and in the SF Chronicle closely.

The first trial run was to involve one car going from one station to the next with a load of politicians, celebrities, and reporters. About 30 feet into the trip a control system failed. There was no backup or fail-soft, even though the design philosophy had been to have at least a backup, if not a majority-vote design, on every critical subsystem (the brakes are critical, right?). The car came to a screeching halt, throwing expensive clothes and coiffures with powerful people inside them all over the car.

When the head of the BART system, who was at least theoretically responsible for the failure, was asked about the lack of a backup for the crystal oscillator that had failed, he said (paraphrase): "It was designed not to fail, so it didn't need a backup." A paradigm of the problem.

#44 ::: inge ::: (view all by) ::: April 07, 2008, 10:39 AM:

Jon Meltzer @2: I heard that last Friday...

#45 ::: Jon Meltzer ::: (view all by) ::: April 07, 2008, 10:49 AM:

#43: I suppose the test train was an ED-209 model ...

#46 ::: Christopher Davis ::: (view all by) ::: April 07, 2008, 10:53 AM:

Peter Erwin (#32): The tradition continued, of course. When the Piccadilly line was extended to Heathrow, the overrun tunnels were designed to allow continuing the line to the projected location of the next terminal.

BAA then built the new terminal in a completely different location, requiring a loop to be built to serve both T4 and the original Heathrow station. (In addition, they didn't coordinate construction times for the terminal with the folks doing the tube station, resulting in more expense and a longer connection to the terminal building.)

Naturally T5 was built where the original plans had put T4, so the line has been extended to that point...but because of the loop, trains can't serve both T4 and T5.

#47 ::: Nix ::: (view all by) ::: April 07, 2008, 11:33 AM:

Terry @#29, hear, hear. I've had bosses tell me 'I don't want to hear about the worst case' and then treat it as my fault when, amazingly enough, the best case does *not* come to pass.

I've had bosses tell me to give them estimates and refuse to accept them because they were 'too long' (then why did they ask? Validation of their own guesses?)

And, of course, you have to be able to give an estimate of the time taken to do something to a system you've never heard of before, in ten minutes, and it must never slip.

Bah.

#48 ::: Nick Kiddle ::: (view all by) ::: April 07, 2008, 11:40 AM:

This reminds me of a conversation I had with my aunt the last time I was looking for work. I said something about making sure I could still get everything done in a worst-case scenario, and she said that there was no point planning because the scenario you get is never one of the ones you planned for. Which may be true, but I couldn't convince her that making a plan with plenty of redundancy and fail-safes was more likely to bring a reasonable outcome than winging it with absolutely no plan.

#49 ::: John L ::: (view all by) ::: April 07, 2008, 12:43 PM:

Our Fiscal department decided that printing every employee's pay stubs biweekly was costing the state way too much money, so they purchased a software program that would allow every employee to do it themselves. It would allow individual customization, convenience, and current info on both your pay and your leave status.

It doesn't work; not only does it not work, but instead of the state paying for printing 4"x11" pay stubs for everyone, now each employee has to print two 8.5"x11" sheets to get the same info that little pay stub contained.

Except for leave time; that's what still doesn't work, so there are a LOT of unhappy and frustrated employees not knowing exactly how much leave time they currently have...

#50 ::: Serge ::: (view all by) ::: April 07, 2008, 01:11 PM:

My own motto is that if someone can do it wrong, someone will.

And, if some undocumented feature still gets thru, hopefully the person whodunit will remember the steps that he/she followed before the computer accidentally ripped a hole in space.

#51 ::: Trey ::: (view all by) ::: April 07, 2008, 02:45 PM:

This is interesting - in the ugliest sense of the word.

It applies to me because I work for an organization (a Dhnyvgl Vzcebirzrag Betnavmngvba, specifically Vasbezngvba naq Dhnyvgu Urnygupner (use ROT 13)) that is writing a contract for the Center for Medicare and Medicaid Services where the upper management is deliberately moving itself down what Abi calls the "happy path" in her blog.

Folks, this is going to get ugly. They're assuming that the physician offices will give us unrestricted time and access to their electronic health record systems. They also seem to assume that the physician offices will gleefully pony up $15k to have an interface written to the CMS data warehouse just to participate in this. Or that the vendors will do it for free, out of the goodness of their hearts (pull the other one - it's got bells on it).

Then there is CMS' info security policy - as in the "let's not use IT to get anything done remotely" policy. Can't hook the laptops up to external networks. Can't use unauthorized software (which takes ~ 18 months to approve or move) and that unauthorized software will be required to do this. The software? Crystal Reports.

And I've been called alarmist and unrealistic for bringing these issues up.

It's frightening to me since I'm the one that will be out in the offices doing all the work, most of the calls and all of the project management.

Bleah. I doubt this cluster fuck will make the headlines like Heathrow, but I am sharpening my resumé these days.

#52 ::: John L ::: (view all by) ::: April 07, 2008, 02:59 PM:

No way will physicians grant unlimited access to their medical records; the spectre of lawsuits from every one of their patients is too great for them to risk having someone (by mistake or otherwise) release that information out into the public.

#53 ::: abi ::: (view all by) ::: April 07, 2008, 03:03 PM:

Trey @51:

I hear you. I've worked on projects like that*.

Get your resume up to date and network a lot. That'll give you understanding company while you work through the problems you can, and the feeling that you can escape when it gets too hairy.

Tell everyone, at every opportunity, what their assumptions are. Make them have the shotgun conversation†. Put stated assumptions in every test plan, test strategy, and results document you write.

And don't let the twits rent space in your head. Their screwup, their penalty. Don't work yourself into the ground to save their tails.

-----

* One was named for one of Columbus' ships, because it was to take us to a new world of processing. I always felt that the name was well chosen, because as the project continued they realised that almost nothing would go down the happy path they had planned, built and tested. They slowly retreated from the original functional profile to the merest sliver of the planned processing, never decommissioned the systems we were replacing, but still called it a victory. So it was much like Columbus' expedition to India: they ran aground partway, looked around, labelled everyone around them Indians, and called the project a success!

† The "shotgun conversation" is where you clarify the real-world outcome of all those blithe assumptions. I had it a lot when I was building test environments. It goes like this, in very thin disguise:

Me: Let's go over your test plan. You want me to load a shotgun.

Business Rep: What does the plan say? Yep. Shotgun. Load it.

Me: And then you want me to point it at your foot. Left foot or right foot?

Business Rep: Let me check. Plan says left. Definitely left.

Me: And then you want me to pull the trigger.

Business Rep: That's what the plan says.

Me: So you want me to blow your foot off.

Business Rep: WHAT? What do you mean by that?

#54 ::: Serge ::: (view all by) ::: April 07, 2008, 03:24 PM:

Abi... the real-world outcome of all those blithe assumptions

"It's better to ask stupid questions than to make stupid assumptions."

#55 ::: Clifton Royston ::: (view all by) ::: April 07, 2008, 03:37 PM:

Trey @ 51: Document your concerns and predictions in writing, no matter how much they hate for you to do that. At some point, depending on the sanity level of those who take charge, there will either be a desperate hunt for someone who is realistic enough to salvage the situation or (more likely) a hunt for scapegoats, and in either case you want to be able to pull up your dated analysis of the problem. Also polish your resume, because even if they can't make you the scapegoat, they can still fire you for having the bad taste to have been right.

abi @ 53: I hear you on the "shotgun conversation"; I've been in that role too, on the developer's side trying to explain it to the marketing/sales side.

For a while I was terming it the "run at a wall" discussion:

"OK, so there's a concrete wall right here, right?"

"Yep."

"Big thick concrete wall. Immovable."

"Yep."

"And so you want us to do what marketing says, right? You want the company to put its head down, right?"

"Yep."

"And all of us run at the wall, head down, as hard as we all can?"

"Yes!"

"So you want the company to get a massive concussion?"

"Why are you being so negative? Why are you trying to tear us down?"

#56 ::: Clifton Royston ::: (view all by) ::: April 07, 2008, 03:42 PM:

Going back to the original post by Abi, there's a lovely phrase in there which jarred a different series of associations:

"We love heroes and leaders, from Alexander the Great and his iconic descendants..."

And that's one reason The Man Who Would Be King is such a great parable, both in the original Kipling story and the movie. It's about what happens when a leader begins believing what the people tell him about how great he is, and that he need not listen to objections because he is the spiritual descendant of Alexander the Great and he can't fail.

#57 ::: Lori Coulson ::: (view all by) ::: April 07, 2008, 03:51 PM:

Trey @51: I work for HHS -- when are they going to attempt to roll out this turkey?

~IF~ the target date is any time after January 21, 2009, then it may never get off the ground, as new Administrations often scrap programs that the old Administration thought would be wonderful.

You also have the option of calling the IG hotline, and reporting the problems to them. There is no point in HHS funding this, if it will not do the job it's supposed to -- and expecting physicians to pay $15k to participate is just insane...

#58 ::: Bruce Cohen (SpeakerToManagers) ::: (view all by) ::: April 07, 2008, 03:57 PM:

Trey @ 51

And when Clifton says keep documentation, he also means "never delete email". In fact, keep all email relating, even tangentially, to the project in a separate mail folder, and back it up to some removable medium like optical disk every day or so. The first thing the scapegoat hunters are going to say is, "We didn't ask for that." and being able to point to the email that asked for that will often send them running off hunting for someone else to stake out on the anthill.

#59 ::: Jon Meltzer ::: (view all by) ::: April 07, 2008, 04:33 PM:

One was named for one of Columbus' ships, because it was to take us to a new world of processing.

Hmm. I guess it had to be the Nina; not the Santa Maria, which sank, or the Pinta, whose captain went rogue.

Only one out of three actually completed the mission. Yeah, that's management naming.

#60 ::: Jon Meltzer ::: (view all by) ::: April 07, 2008, 04:39 PM:

Can't use unauthorized software (which takes ~ 18 months to approve or move) and that unauthorized software will be required to do this. The software? Crystal Reports.

Oh, yes. I remember once trying to get a particular software vendor onto the company's "approved" list. You may know the vendor - their headquarters is in Redmond. After several rounds of bull from a flunky in Cleveland (I was in Boston) I finally sent an email "please do your job so that I can do mine." My boss was a bit pissed at me, but I did get the approval.

#61 ::: abi ::: (view all by) ::: April 07, 2008, 04:46 PM:

I once had to write a business case to get a driver for my mouse wheel.

This was in the company that also had a "Bureaucracy Hotline" webform, which you could fill in to report cases of excessive bureaucracy.

Whenever I'm wondering why I don't work for big corporate any more, whenever I get tired of being the lone tester and QA guru in my new place, I think of these things, and I feel better.

#62 ::: cajunfj40 ::: (view all by) ::: April 07, 2008, 05:10 PM:

@#36 iain Remember the Therac 25.

*shudder*

Here, it's more along the lines of "Remember the Telectronics Accufix Atrial J Lead."

Me, I have a printout stuck up in my cube that is basically a "thank you" letter from a gentleman who, thanks to a warning device I helped design part of, was able to get to hospital *before* he had a massive myocardial infarction. Failure Mode Effects Analysis reports that include "Death" as a potential effect are sobering, indeed.

I've been lucky here - my boss has always understood that the patient comes first. Missing a launch date or the leading edge of a market window is expensive. Having the company fold under lawsuits from bereaved families is more so, and not just monetarily.

Later,

-cajun

#63 ::: Trey ::: (view all by) ::: April 07, 2008, 05:11 PM:

All,

Document, document, document. Backed up off-site on my home machines and in a remote location.

I've also kept the drafts of the proposal with the barrier analyses pointing these out.

And the resumé has my nifty new job title on it as well! It's only 2 weeks old!

John @ 52,

Oh yes. Even with our prize physician for this project, the one who wants this, the best we could manage was 2-3 hours a week in his office. That was with a Business Associate Agreement and a long history of working with him on quality improvement projects. Add in that there is the distinct possibility that queries and reports will slow the system down, or worse possibly mess it up (especially if someone uses append, modify or delete queries).

Abi @ 53,

Shotgun conversation - what a lovely concept! I'll borrow that. And I'll continue to make them state their blithe little assumptions as well. Maybe they'll begin to understand why they are so terrifying.

Lori @57,

Does the phrase Q u a l i t y I m p r o v e m e n t O rg a n i z a t i o n (sorry, I'm getting paranoid) mean anything to you? Well the start date for the contract is Aug 1, 2008, with major contract measures taking place in 2009. As to junking it, I don't see it. All 3 candidates have said Health IT is great and will save lots of money. None have said they want to pay for it, or otherwise provide incentives.

As to the interfaces, the classic vendor quote is ~100 man hours at $150/hr billing rate. Thus, $15k.

I don't think it's reached the IG level yet, but I'll keep it in mind.

Bruce @58,

Shit. Need to do that now. Thanks!

Jon @60,

The problem is that if I do that (contact those deciding whether software is authorized or not) I acquire another black mark for it - our IT dept lands quickly and with both feet on those contacting the CMS help desks directly.

#64 ::: Pfusand ::: (view all by) ::: April 07, 2008, 05:38 PM:

Oh, you QA guys are making my mouth water. I've been out of the software field for several years now. (I took the work I could get: tutoring, mostly math and stats.) My husband thinks that I could get back in as a tester, but who would hire a cough, mumble year old woman as a novice QA person?

#65 ::: Clifton Royston ::: (view all by) ::: April 07, 2008, 05:53 PM:

Pfusand: Maybe you should try it and see what kind of interest or interviews you get?

I've sometimes suggested testing as a way to get a foot in the door for SW development, for people stuck in the no-experience Catch-22, because QA groups often have trouble recruiting. People who have a talent for and want to do testing are a rare breed in my experience. If you actually want to do testing and aren't just putting in time until you can get into a SW development job, I would think that would look pretty attractive to a QA head.

#66 ::: Mycroft W ::: (view all by) ::: April 07, 2008, 07:07 PM:

QA and sysadmin are two ill-paid, ill-respected professions. They're "simply overhead", and clearly a needless expense. Sysadmin is a needless expense, when she's good, because she sits around the office doing nothing, and the system only needs 2-3 hours a week "work" (i.e. the admin out desk monkeying or plugging in wires in the server room). Until, of course, the job is downsized, or the admin decides she can make more money programming or running cable, and all the things she kept an eye on and updated and automated start failing to edge cases...

QA is the opposite. They get paid to delay release dates. "The code works!" say the happy path people (and the managers who don't know the Corollary to Clarke's Law). The programmers, who have "solved the problem" and both need to get on to the next one and don't want to do all the boring work of getting the fix right, don't want to hear from QA. And QA says "nope, can't release - there are problems".

What I find interesting is that the Comptroller and the receptionist are also "overhead"; but they're treated better (well, the Comptroller is, at least) because the PTB know why they need a money man. A computer person? Well, it just works or doesn't, doesn't it? Why would it need to be managed?

Funny thing is that I work support for a tech company, and my "users" are clued sysadmins. Who find stuff. And who are interested in security and system-wide breaches and that sort of thing. And who appreciate the security mindset being applied by the people who are supplying the product, double-checking their guesses. And when QA is rushed, and we test only the happy path, guess who gets it in the neck.

However, the security mindset does cause a problem IRL. Everybody thinks I'm a massive pessimist, who doesn't like any new idea. Not true. I may like it; but by default I see the flaws. Resolve all the flaws, and it's probably a good idea. But I'll try to break it first, because that's how my mind's warped.

You want one of me on every project, at least. Not on the initial design, or you won't get anywhere; but preferably before the design gets to implementation, and part of the oversight committee when changes to the design need to be made (as they always will). What you don't want is to decide on a plan, implement it, test it, and then have one of me say "so, our clients do this, and this, and this. How are they going to do that now?"

#67 ::: Jon Meltzer ::: (view all by) ::: April 07, 2008, 07:22 PM:

#61: I once had to write a business case to get a driver for my mouse wheel.

You win.

#68 ::: Bruce Cohen (SpeakerToManagers) ::: (view all by) ::: April 07, 2008, 08:52 PM:

abi @ 61

I left AMD after six months working in the Linear IC design group, for 2 reasons. One, the engineer who hired me, and to whom I effectively apprenticed myself, got pushed out of the company, and Two, the purchase request for a new oscilloscope I'd been asked to write (we had just one for 5 engineers and 3 technicians) got sent back for the 3rd time by the CEO for additional justification. This was a $10,000 purchase in a $100 million/year company, and had been approved by my group manager and his boss as well.

#69 ::: P J Evans ::: (view all by) ::: April 07, 2008, 09:54 PM:

I keep saying that they should have spent more time figuring out what we were going to have to do before starting to do it.

Now we have stuff that has to be fixed - I spent two days last week fixing one of the problems, so I could do my job properly on some other stuff that needed the fix done first - and other stuff with people wanting it done differently 'because if this part fails it will do that'.

Which, if they'd said so two or three years ago, we wouldn't be needing to go back and look at two or three years of work to fix the stuff. (They're actually still arguing about that change. Meanwhile, we're maybe-maybe not doing it the changed way.)

#70 ::: xeger ::: (view all by) ::: April 07, 2008, 10:14 PM:

If I could -find-[0] the things that we already know need to be fixed, I'd be ahead of the game.

[0] Yes, physically and logically.

#71 ::: Erik Olson ::: (view all by) ::: April 07, 2008, 10:55 PM:

Ahh, DIA, where the bags are MIA.

I loved this system. The belts ran too fast, so when you'd hit a curve, ZING, the bag would fly off. So, they slowed down the belts...but only in the curves.

So, Bag #1 slows down -- and the bag right behind whips in at full speed and BAM, slams that first bag into a wall like a well struck pool break. It then falls over on the slow belt, allowing bag #3 to make a competitive vault over it for the gold.

End result. Bag #2 would make it through the curve, but dented from two impacts. #1 and #3 hit the wall at different heights.

The answer? Rip the belts out and run carts.

#72 ::: John Fiala ::: (view all by) ::: April 07, 2008, 11:22 PM:

This is the kiss of death for a new terminal, and has been since Denver International Airport’s long-running and epic failure to automate their baggage handling. It was introduced in 1995 after a two year delay, and spent a decade expensively mangling and mislaying passenger luggage. It was put out of its misery in 2005, after parallel trials proved that manual handling was more accurate, faster, and cheaper.

Although I'll agree that the baggage handling in DIA was a horror and an epic failure, it wasn't the kiss of death for the terminal. In fact, having flown around a bit, I'm quite fond of DIA as a whole. (And not entirely because half the time I'm there I'm almost home. :)

That said, total agreement on the QA/Testing... I've been enjoying the concept of unit testing since I've been able to start using it on my code, and although that doesn't find all the bugs, it really does help me nail them down and prevent them from reoccurring.
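For anyone who hasn't seen the "nail it down" idea in practice, here is a minimal sketch using Python's standard `unittest` module. The function and the bug it guards against are invented for illustration: assume `mean` once crashed on an empty list, and the second test pins that fix in place so the bug can't quietly come back.

```python
import unittest

def mean(values):
    """Arithmetic mean; returns 0.0 for empty input (the hypothetical old bug)."""
    values = list(values)
    if not values:
        # The imagined original version divided by len(values) here and crashed.
        return 0.0
    return sum(values) / len(values)

class MeanRegressionTest(unittest.TestCase):
    def test_typical_input(self):
        self.assertEqual(mean([1, 2, 3]), 2.0)

    def test_empty_input_regression(self):
        # Pins the fix: empty input must not raise ZeroDivisionError again.
        self.assertEqual(mean([]), 0.0)

if __name__ == "__main__":
    unittest.main()
```

Every bug that gets fixed earns a test like the second one; the suite then runs on each change, so a recurrence shows up immediately rather than in production.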

#73 ::: Marilee ::: (view all by) ::: April 07, 2008, 11:55 PM:

John L., #49, Kaiser's old pharmacy software printed a front sheet with basic info, and then one or two more sheets with more specific info, but without any identifying text on them. The new software prints out four or five sheets with the specific info and my name and ID on them, so I have to shred them instead of just recycling them. Very annoying.

#74 ::: Greg London ::: (view all by) ::: April 08, 2008, 02:21 AM:

Standard hardware verification methodology these days is to create a model of the hardware, tie it together to the code for the real hardware, then throw random data at both and check that the results match.

It's an interesting concept, though it does require that the model be pretty complicated, though nowhere near as problematic as the real hardware code.

The odd thing about randomized testing is that you sort of get, from the beginning, that your "coverage" has a certainty that is purely statistical, a matter of probability.

Which is a big switch from aviation verification, which requires directed tests for every corner case. Directed tests are still probabilistic, in that some errors might not be uncovered, but the statistics are hidden slightly more.

#75 ::: Dave Bell ::: (view all by) ::: April 08, 2008, 03:17 AM:

A little local example: the local Doctors ran their own pharmacy service for out-of-town patients (they weren't allowed to compete with the in-town commercial pharmacies). They recently split it off into a separate business (and while it's described as a particular small company, the signs associate it with a pharmacy chain) running with 260% of the opening hours.

There's more queueing. The deliveries don't get unpacked as soon in the day. The order deadline is earlier. At the moment it can easily take a couple of days longer to get a prescription filled.

Oh, and it looks as though they don't have as many computer terminals.

I wonder if they thought that some of the peak-time customers would be willing to come in at 10pm.

#76 ::: Ken MacLeod ::: (view all by) ::: April 08, 2008, 07:48 AM:

We are just back from Australia, where I was a guest at Swancon, and we decided to make holiday of it. We had essentially no problems until this morning, when we landed at Heathrow. The first indication was a twenty-minute wait in Terminal 4 for the bus to Terminal 5 (these are supposed to run every ten minutes).

On arriving via a very circuitous route at T5, we had a long long walk to Flight Connections, followed by a routing to a small office, not signposted, to get our boarding cards for the final hop to Edinburgh. This was followed by three queues for successive passport checks including a final one where we were photographed (a novelty here) and a security queue at the gate. After clearing security, our passports and boarding cards were checked (cursorily this time) again.

At Edinburgh there was a mound of uncollected luggage taken off the conveyer belt, and a long queue of people, including us, whose luggage was not on the belt. Our luggage is either in Singapore or, more worryingly, Terminal 5.

I intend to avoid Heathrow for the foreseeable future, and especially not use it as a hub.

#77 ::: Debbie ::: (view all by) ::: April 08, 2008, 08:25 AM:

P J Evans @69 -- I keep saying that they should have spent more time figuring out what we were going to have to do before starting to do it.

Now we have stuff that has to be fixed...

So true! A few years ago the German education ministry announced they were going to institute a 12-year high school diploma (Abitur) for the highest tier of the school system (Gymnasium and Gesamtschulen). Previously it had been 13 years, with kids graduating at age 19. I had no problem with that, theoretically. The program started in 2005, applying to all who were then in the 5th grade, my son included.

So, we are now into the third year of this system. Reports in the press have substantiated our experience and hearsay, and made it clear that there was inadequate planning* before the implementation. It seemed clear -- to me, anyway -- that the curriculum would have to be adjusted systematically, perhaps dropping some topics or speeding up on the instruction. Apparently that wasn't done. Everywhere schools, students and parents have complained that they're not sure which parts of the curriculum should be retained, which should be dropped, and when subjects should be introduced, because they weren't given the information by the ministry. "We used to teach this in the ninth grade, but should we introduce it in the eighth now?" So a few weeks ago there was a Big Conference to finally make these decisions.

I am extremely sceptical about the quality of those decisions, given the circumstances under which they were made. It was also a jaw-dropping realization to discover how such a hugely important program could be instituted with so little attention to detail beforehand.

*to put it extremely politely!

#78 ::: John Chu ::: (view all by) ::: April 08, 2008, 11:57 AM:

#74: Greg, you make it sound like an either-or proposition. The projects I've worked on have done both random and directed testing.

With the former, you can get into long navel-gazing arguments over what coverage really means, how we measure it, and how we know that we've truly covered the design. With the latter, you can get into long navel-gazing arguments over whether we've truly thought of all the possible corner cases.

My preferred solution is to not argue. Do both. You explicitly cover the cases you know will be trouble. You cover the rest stochastically because the stimulus space is too large for exhaustive testing.
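A toy sketch of "do both", combining the model-versus-implementation comparison from #74 with a handful of directed corner cases. Everything here is hypothetical: the "design" is an 8-bit saturating adder, the reference model is the obvious way to write it, and the implementation under test is a bit-twiddled stand-in for real hardware code.

```python
import random

def reference_add(a, b):
    """Reference model: 8-bit saturating add, written the obvious way."""
    return min(a + b, 255)

def dut_add(a, b):
    """Stand-in for the real implementation: checks the carry-out bit."""
    s = (a + b) & 0x1FF
    return 255 if s & 0x100 else s

def run_directed(dut, model):
    """Explicitly cover the cases we already know are likely trouble."""
    corner_cases = [(0, 0), (255, 255), (255, 0), (0, 255), (128, 127), (128, 128)]
    return [(a, b) for a, b in corner_cases if dut(a, b) != model(a, b)]

def run_random(dut, model, trials=10_000, seed=1234):
    """Cover the rest stochastically; a fixed seed keeps failures reproducible."""
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        a, b = rng.randrange(256), rng.randrange(256)
        if dut(a, b) != model(a, b):
            failures.append((a, b))
    return failures

if __name__ == "__main__":
    print("directed failures:", run_directed(dut_add, reference_add))
    print("random failures:  ", len(run_random(dut_add, reference_add)))
```

This toy input space is small enough to test exhaustively, which real designs are not; the point is only the shape of the harness. Recording the seed matters in practice, since a random failure you can't replay is nearly useless.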

#79 ::: Greg London ::: (view all by) ::: April 08, 2008, 12:13 PM:

John@78: you make it sound like an either-or proposition.

Last time I worked on an FAA project (which was a while ago and maybe they've revised their spec), the requirement for getting certified was based solely on directed testing. A lot of directed testing. I don't think randomized testing was considered or even allowed.

I hadn't really thought about it until this thread, but it's interesting to see the spectrum of scythes. Some are wider and heavier than others.

The thread is basically saying "Test your design". I was considering the stranger question of "how much?" and then realized the differences I've experienced in previous jobs. Not that I have an answer for how much is enough, but it's interesting to see the different approaches used. You can never prove you've found all the bugs in your code. But the FAA tries to exhaustively near that asymptote.

#80 ::: C. Wingate ::: (view all by) ::: April 08, 2008, 12:58 PM:

Testing, I think, is something that can be taught to most anyone with the right temperament. The problem is that a huge chunk of those people, just as with documentation, can program, and therefore (quite reasonably) go for the better paying job.

#81 ::: guthrie ::: (view all by) ::: April 08, 2008, 01:09 PM:

Our managing director is so stupid that he goes down to packing and final inspection and fiddles inspection results because he is desperate to get material shipped out the door each month, to ensure the monthly target is met.

#82 ::: xeger ::: (view all by) ::: April 08, 2008, 01:37 PM:

guthrie @ 81...

I'm still boggled by the guy who paid sales their commission based on what they said they thought they were going to sell in the next month...

#83 ::: P J Evans ::: (view all by) ::: April 08, 2008, 01:53 PM:

#81

I had a supervisor who was like that. He was never around when I needed his input on stuff, either. I left before I felt the need to use the gas-bottle wrench (box-end wrench, something like 1 1/4in) on his pointy head. (It was the place where we had several people named 'John' or variants thereof.)

#84 ::: abi ::: (view all by) ::: April 08, 2008, 02:42 PM:

C Wingate @80:

Testing, I think, is something that can be taught to most anyone with the right temperament. The problem is that a huge chunk of those people, just as with documentation, can program, and therefore (quite reasonably) go for the better paying job.

This is both true and stupid. (By which I do not mean that you're stupid to say so, but that the business world is stupid to work that way.) Coupla reasons:

  1. Testing is a vocation in its own right, and it's a shame that it seems to come with a vow of poverty.

    I started as a developer. I can program. As a matter of fact, I am doing a fair bit of programming these days, writing test drivers and working with test tools that require programming skills to operate. But I don't want to program; being a developer doesn't make my soul sing.

    Learning to test well requires as long a time as learning to develop well. I've been testing software for 10 years now, and that decade has taught me a lot. I know where some of the problems with testing effort are likely to be, and I know where the bugs like to hang out. Some of that can be found in books, but most of it is (often bitter) experience.

    I love developers who can switch-hit and test as well, but there is a body of knowledge that they haven't spent the time acquiring. And testing wears them down, because they're makers by calling, not breakers.

  2. That which we obtain too cheaply, we esteem too lightly.
  3. When it's a choice between using a developer's time and using a tester's time (for instance, by having the developer explain what he's just designed so the tester has some expected results without rereading the code check-ins for clues), the higher paid person frequently wins.

    In many organizations, when a developer and a tester go head to head ("It's not a bug, it's a feature!"), the developer has more pull. And project managers often treat test time as dev contingency time, because the pecking order is baked into the structure of the company.

    It can get dispiriting.

Joel Spolsky, whom many people respect in the development world, is a particular vector of this disease. Point 4 of the article I linked to complains that so many testers are monkeys, while point 5 explains that testers are paid peanuts.

Sorry for the ranting. The original blog post took a long time to write because I have a lot of things to say on the subject, and some very strong feelings. I love being a tester insanely much, but I hate how much crap I've sometimes had to take to do this necessary job.

(To be fair, I have pay parity with similarly experienced developers at my current company, and the only person making me feel small there is me. But I was a long time in worse circumstances, and I still twitch.)

#85 ::: guthrie ::: (view all by) ::: April 08, 2008, 03:22 PM:

#82, 83-

I weep tears of joy! I am not alone!

Seriously, there is no reason for this guy to be in office. He is despised by the higher ups in the company (we're part of a privately owned conglomerate), and the only useful function he serves is as a talking dummy, and I could do that for less money. He started in the company as quality manager, despite as far as I can tell never being more than a quality technician before. There, he managed to butcher the quality system that had been built up over the previous 10 years. Then he moved onto production, and proceeded to mismanage it such that we are still paying for the mistakes.

Now, as MD, he has presided over loss of personnel and wastage of money. The only reason any rational person would keep him in place right now would be as a sacrificial goat, to be executed to placate the owner and the agencies such as the HSE and SEPA who are watching us like hawks.

(We've had nearly 20 enforcement and prohibition notices from the HSE, and SEPA were a bit annoyed to find we were putting around 2.5 tonnes of phenol down the drain every year)

And such a manager engenders stupidity below him. His diddy men have, to the best of my knowledge, cost the company half a million pounds or more this century, but nobody seems to care. (I've cost us about £2,500, but also saved some money here and there)

We lost our QC manager in 2006. It took them 4 or 5 months to recruit a replacement. He lasted 3 days before leaving, saying they weren't paying him enough for the huge job they were expecting him to do.

Since then we've limped on with consultants, but the main issue is the senior management are over their heads or dangerously incompetent.

Dammit, if only I was that incompetent, I could be promoted too!

#86 ::: Bruce Cohen (SpeakerToManagers) ::: (view all by) ::: April 08, 2008, 04:16 PM:

guthrie @ 85

Dammit, if only I was that incompetent, I could be promoted too!

Is your name Peter? If not, forget promotion on principle.

abi @ 84

What you said, in bold, small caps, underlined. I started out as a "Hardware Systems Evaluation Engineer", but found that my kink was really in design of software (which, towards the end of my hardware work was about 80% of my job anyway). But I've always had respect for testers and other quality assurance people because I know that the job can be interesting and challenging, and because I know just how much the deck is stacked against you guys in most organizations. The best job I ever had was with a company that was fanatical about testing, maintained never more than a 2 designers per tester ratio, and insisted that testers be involved from the very beginning of any project. We got product out the door and into mission critical enterprise applications with as few bugs as I've seen at any development shop.

The tester attitude: the senior engineer in terms of testing experience was responsible for validating the distributed, shared, and transactional nature of the software (3 red flags to anyone who's worked with networked software). The test harness he designed beat the snot out of the software, and (in the late 90's) loaded it equivalent to running on dozens of computers with thousands of processes each spread across large latency networks. After running it for awhile, he decided to name it "Stick it in and break it off."

#87 ::: Jon Meltzer ::: (view all by) ::: April 08, 2008, 04:41 PM:

Point 4 of the article I linked to complains that so many testers are monkeys

Tester monkey wishes that developer would test ^%#^%#$ login page himself.

(it's been a long day ... )

#88 ::: R. M. Koske ::: (view all by) ::: April 08, 2008, 04:58 PM:

Testing sounds like something I have the temperament to find enjoyable, but I'm not sure I'd have the temperament to handle the job situation.

Since I'm apparently feeling a bit masochistic today, can anyone tell me about how one might get started in testing? Unlike Pfusand (comment #64) I've got no software experience other than being a fairly savvy user. Are QA groups desperate enough to hire someone like that? Or is testing a "go back to school" career change?

#89 ::: C. Wingate ::: (view all by) ::: April 08, 2008, 05:31 PM:

re 84: Spolsky's statement that "programmers don't make good testers" is not entirely true, either. A lot of developers are poor testers, but on the other hand, those people should be at a discount anyway, because they will produce more bugs. It is true that they can't keep it up, especially if they find lots of stupid bugs.

I suspect that part of the status thing is "tester" is one of those things you don't get a degree in.

One of the things I do here is produce what I call a "change document" for every functional change I make. I send these to the testers and to the documentation people so they don't have to work out for themselves what I did. It also gives them a fighting chance at updating the automated test scripts. It just makes everyone's life easier. Unfortunately I don't have the clout to make everyone else do it.

Can we rag on ISO 9001 while we're at it?

#90 ::: guthrie ::: (view all by) ::: April 08, 2008, 05:34 PM:

I quite like ISO 9001, but then they did re-write it to say "senior management shall" etc. We'd fail the audit right there and then.

#91 ::: xeger ::: (view all by) ::: April 08, 2008, 05:47 PM:

What's not to like about a standard that boils down to "Do it the same (wrong) way every time" ?

#92 ::: abi ::: (view all by) ::: April 08, 2008, 06:00 PM:

Pfusand @64 & RM Koske @88:

I'm assuming you're in the US, which means that I have no idea how to break* into software testing. I got into it by circuitous routes.

I have a BA in Latin, and did a postgraduate computer science year in a British university (during which we were taught no testing, not even unit testing). I then joined a large bank as a software developer (Cobol).

Within a few months, someone was looking for a warm body to support our system in the newly formed Y2K test environment. Being the newby, I was naturally tapped for this thankless task. Shortly after that, it became apparent that someone had forgotten to get us ready for a major nationwide collaborative test. Because I was the only one who very stupidly went to a meeting I was invited to†, I ended up running our participation in this test effort. I simultaneously developed a stomach ulcer and had a blast.

I went back to the development side after that, and lasted about a month before I was climbing the walls. So I wangled my way onto the biggest, messiest project I could find and started testing again**. I've been doing it ever since.

There are a ton of courses on software testing, and very few have any accreditation or formal recognition. I've done the two British qualifications of any merit, the Information Systems Examination Board Foundation and Practitioner Certificates, which are a nice accessory to real world experience.

But I have no idea how to go about getting a job in software testing in any circumstances but mine.

----

* pun very much intended

† this was much like the scene in a film where a sergeant asks for volunteers to step forward and everyone else takes a step back

** That was the project that gave me exclusive use of a 1,000 MIPS mainframe for a weekend, representing 25% of our entire mainframe estate at the time. All loved me and despaired.

#93 ::: David Harmon ::: (view all by) ::: April 08, 2008, 06:00 PM:

It's not so much that "programmers don't make good testers", though not all of them will. But even the best "two-fer" can't do both jobs on the same project, or the same day. That's because they're human, and have certain cognitive limits.



#94 ::: xeger ::: (view all by) ::: April 08, 2008, 07:46 PM:

Having finally had a user of software I'd written with the property that I usually have for software of others (weird, weird corner cases) ... I'd have to agree with abi - all shall love the tester... and despair.

#95 ::: Bruce Cohen (SpeakerToManagers) ::: (view all by) ::: April 08, 2008, 07:50 PM:

I used to rag on ISO 9001 all the time, because I saw it in action up close in one large corporation, and at a little more distance in a lot of customers of a small company I worked for later. But then Sarbanes-Oxley came along, and every IT group in the universe used it as an excuse to have audits for completely irrelevant factors*. So let's rag on Sarbanes-Oxley instead.



* irrelevant to the actual operation of an IT group, and irrelevant to the original purpose of Sarbanes-Oxley, which was intended to clean up corporate governance after Enron. Of course, no corporate executive wanted that, so they started a rumor it was actually about IT operational compliance, and everybody believed it.

#96 ::: Niall McAuley ::: (view all by) ::: April 08, 2008, 08:08 PM:

Ken McLeod writes:

I intend to avoid Heathrow for the foreseeable future, and especially not use it as a hub.

I started doing that around 1990, based on their proven ability to lose luggage far more often than anyone else. Unfortunately, it's still hard to avoid Heathrow when travelling from Ireland to places other than London.

#97 ::: Serge ::: (view all by) ::: April 08, 2008, 10:18 PM:

abi @ 92... pun very much intended

Waitaminnit! That's my beat!

#98 ::: Terry Karney ::: (view all by) ::: April 08, 2008, 11:34 PM:

I made a ROT-13 decoding error, and got, Duality Improvement Organization

It's funny, but I realise that a large part of my favorite job in the army is a sort of QC. When we teach someone (esp. interrogation) we are assessing them for suitability. For interrogation that involves not less than 20 separate 2 hour evaluations (usually, in my schools, by 5-6 people. In the active army school I attended it was 21 evaluations, and some of those were reviewed [they were all taped]).

We've had people we decided weren't the right sort of person, and they failed. Usually we knew they were going to fail before we knew why.

I once called my boss and said, "Come Phase II, 'x' needs to fail". We talked about it, and he said that we'd pay careful attention to him. Happily he chose to fail out of Phase I. I think (looking back) that he'd have been washed out.

My boss had faith in me (still does), and trusted my judgement. Then again, he understood people; having been a middle school teacher for years. He knew that not all were cut out for it.

So I think that my estimation carried a lot of weight with him (I think that's the only time I've called someone out as a "needs to be failed", instead of, "going to fail, and nothing we can do about it").

I think, looking at those who are good at this job, that being good at it (as opposed to merely competent) requires some of that mindset.

#99 ::: Greg London ::: (view all by) ::: April 09, 2008, 12:04 AM:

Oooh, I almost forgot my favorite quality control story (legend? myth?)

It takes place at a military parachute packing unit. The commander was a former parachute packer himself and would always have his own chute ready and waiting. Every once in a while (once a day? Couple of days? Not sure of the exact time frame), the CO would walk into the area where his people were packing chutes, grab a random chute off a pile and toss it to the guy who packed it, do that for everyone in the building, and then take everyone, including himself, up for a little skydiving.

Pack a chute wrong, and you might be jumping with it strapped to your back.

I have no idea if it's true, but it always makes me smile.

#100 ::: Steve C. ::: (view all by) ::: April 09, 2008, 12:17 AM:

Bruce Cohen @ 95 -

You're singing an all-too-familiar song. I had a taste of SOX in my prior job, producing flurries of documents, flowcharts, and control matrices for programs I developed years before and still supported, systems that had always gone through internal and external audit reviews, and now were subjected to the auditing-on-steroids of SOX 404.

I know controls are necessary, but Enron wasn't brought to grief by anyone in IT. Nothing in SOX addresses collusion by senior decision makers, nor can it.

The one thing that the emphasis on controls did that was good was that topside General Ledger journal entries had to be blessed by external auditors. But these were not entered by anyone in IT (we just made sure that the processes that edited and posted them worked correctly).

#101 ::: Terry Karney ::: (view all by) ::: April 09, 2008, 01:43 AM:

Greg London: It's more than that. Riggers (the guys who pack chutes) have to be jump qualified, because any chute they pack can be handed to them to test.

One of the guys I went to interrogation school with was reclassifying because he'd blown his knees out, and could no longer jump. Not being able to jump meant he couldn't pack.

He told a story about being in Alaska. Three HALO jumpers (High Altitude, Low Opening: guys who jump out of airplanes with O2 bottles, from as high as 40,000 feet) walked into his shop, told him to grab four chutes he'd packed, and come with them.

They tossed him one, and donned the others, climbed up to about 5,000 feet, watched him jump the chute and then he watched the plane head toward Kamchatka.

#102 ::: C. Wingate ::: (view all by) ::: April 09, 2008, 06:53 AM:

re 101: My company wrote and maintains the software that sends these guys to training. Everyone except the parachute riggers goes to jump school last; the riggers go directly from BT to jump school. It's the number one irregularity in the system.

#103 ::: guthrie ::: (view all by) ::: April 09, 2008, 07:33 AM:

Xeger #91- I'd love production here to do the same thing every time. But they don't.

In my experience here and in an analytical lab, and in a place making up doses for animal testing, the "right" way was sorted out first. Indeed, that was what was done here 20 years ago when they were developing the product- work out how to do it right, then have the quality system monitor the manufacture. If the operators would actually make things the same way every time we would be able to track down what goes wrong, but no, they have to take short cuts and not fill out the paperwork etc etc, thus leaving us without the information to tell what they have done wrong (or right).

#104 ::: Nix ::: (view all by) ::: April 09, 2008, 07:33 AM:

C. Wingate @ #80, hear hear, but you have a false dichotomy. Some of us developers like testing, too: as a result, the dedicated testing people hardly ever find any bugs in code I write, because the automated tests I run over that code are far more comprehensive than anything the testers run... in part this proves that we need better testers, really: but a good tester has to be conversant with coding as well, because white-box I-know-where-this-might-go-wrong and black-box I-can-feel-the-risky-corner-cases both require some knowledge of what's actually being done.

#105 ::: Nix ::: (view all by) ::: April 09, 2008, 07:40 AM:

Abi @#84, developers who treat testers badly are morons who deserve to have their stuff poorly tested by annoyed testers and then break at the worst possible instant after it ships.

And it will, it will.

(What's more, the *developers* should get the blame. I'm always disgusted when I see testers get the blame for bugs: they didn't put the damn bugs in, did they?)

#106 ::: Bruce Cohen (SpeakerToManagers) ::: (view all by) ::: April 09, 2008, 10:46 AM:

Nix @ 105

Agreed, they deserve it, but the customers who bought the software in good faith, to use in some application that matters to them, don't deserve that. That's why I favor keelhauling; it's ever so much more personal. *g*

#107 ::: Nicole J. LeBoeuf-Little ::: (view all by) ::: April 09, 2008, 01:02 PM:

On a positive note about DIA, they now have free wi-fi.

Which I suppose means that, for those who enjoy having such, the wait for one's bags to show up is more enjoyable.

(I like DIA. It's spacious and pretty - well, not from the outside, from the outside it looks like the udders of a cow that has fallen dead on its back, but inside it's lovely - and a generally pleasant place to be forced to spend several hours, as such places go. And having learned from the snowstorms of 2006, they're trying to get a hotel connected directly to the terminal. So that'll be a plus.)

#108 ::: OtterB ::: (view all by) ::: April 09, 2008, 04:41 PM:

Terry Karney @29 insisting we turn around when we were in "The Narrows" of The Paria, when we felt raindrops was a really good idea, the water, back at our camp, rose more than a foot; where the river was widened from 20 feet to 100, you do the math).

This reminded me of the guy who taught a class I took in First Aid for Girl Scout leaders, who repeatedly made the point that it was a lot easier to think ahead and avoid getting into a dicey situation (e.g. hypothermic girls on a hike in the rain, cut off from the road by rising water) than to get out of it well. And that you avoid getting into those situations by being aware of your hazards and options, being properly equipped, and having a plan for worst case possibilities. And by making sure that the girls knew when to tell you bad news (e.g., they were developing blisters) and that you listened when they did.

cf. many of Jim Macdonald's posts on emergency preparedness.

There's value in looking on the bright side, but substantial risk in looking only on the bright side.

#109 ::: Chris ::: (view all by) ::: April 09, 2008, 06:28 PM:

It occurred to me today that this thread doesn't (yet) mention one of the best-known examples of this sort of thing: the Morton Thiokol engineers' concerns about the cold-weather performance of the seals on the space shuttle's solid rocket boosters.

The engineers' concerns were overruled, the launch of the Challenger proceeded and (as you probably already know) the shuttle exploded in midair, killing its entire crew.

A presidential commission formed to investigate (famously including Nobel Prize winner Richard Feynman) stated:

1. The Commission concluded that there was a serious flaw in the decision making process leading up to the launch of flight 51-L. A well structured and managed system emphasizing safety would have flagged the rising doubts about the Solid Rocket Booster joint seal. Had these matters been clearly stated and emphasized in the flight readiness process in terms reflecting the views of most of the Thiokol engineers and at least some of the Marshall engineers, it seems likely that the launch of 51-L might not have occurred when it did.

2. The waiving of launch constraints appears to have been at the expense of flight safety. There was no system which made it imperative that launch constraints and waivers of launch constraints be considered by all levels of management.

3. The Commission is troubled by what appears to be a propensity of management at Marshall to contain potentially serious problems and to attempt to resolve them internally rather than communicate them forward. This tendency is altogether at odds with the need for Marshall to function as part of a system working toward successful flight missions, interfacing and communicating with the other parts of the system that work to the same end.

4. The Commission concluded that the Thiokol Management reversed its position and recommended the launch of 51-L, at the urging of Marshall and contrary to the views of its engineers in order to accommodate a major customer.

(Commission report found here.)

#110 ::: Terry Karney ::: (view all by) ::: April 09, 2008, 07:12 PM:

Then there's the MD-80 downtime right now. I've been doing a lot of driving, and the radioheads are an interesting mix.

One guy (yesterday) said the FAA is being pointlessly obsessive about this, and ought to have let the airlines do this in the routine downtime of the planes.

The better one was the fellow (opposite an FAA guy on another line) who pointed out the detailed nature of the directions, and that AA could have appealed for an alternate procedure to attain the same end.

Me, yes, it would suck if I were travelling and my flight was cancelled, but it would suck more to have my (or a loved one's; even just a friend's) plane catch fire from an arcing short in the neighborhood of flammable hydraulic fluids.

#111 ::: Steve C. ::: (view all by) ::: April 09, 2008, 10:24 PM:

Far better to have a thousand pointless inspections than one crash. Everything I've read about accidents indicates that it's never one single thing that causes the crash - it's always a chain of incidents.

#112 ::: Neil Willcox ::: (view all by) ::: April 10, 2008, 06:36 AM:

This was in the company that also had a "Bureaucracy Hotline" webform, which you could fill in to report cases of excessive bureaucracy.

I can't even begin to match that story. I haven't laughed so much since... well actually since I was reading a Tank Girl script last week, but before that, for at least a month.

...from Alexander the Great and his iconic descendants...

Let's not forget what happened to Alexander's actual descendants, due to his inadequate succession plan.

#113 ::: guthrie ::: (view all by) ::: April 10, 2008, 03:53 PM:

Bruce Cohen #58- yet another sign that dysfunctional companies are everywhere and not any different.

Today my colleauge (However you spell it) dealing with our new moulding station was asked by the directors to print out and arrange in order all his communications with the company we had contracted the moulding station to. Said contractors were suspected of not telling us things, or of being told things and not listening. Of course the primary reason for the moulding station not being up and running already is that 6 months ago the management changed the design spec about a dozen times, thus losing 3 months. The contractor may have not spoken to a sub contractor about something which is now late, but that is because their engineer left half way through the project, not through wilful oversight.

#114 ::: C. Wingate ::: (view all by) ::: April 10, 2008, 04:10 PM:

The best I can do in that line is an old visual presentation standards document that said at one point, "This is to be avoided for reasons of avoid clarity to promote."

#115 ::: Bruce Cohen (SpeakerToManagers) ::: (view all by) ::: April 10, 2008, 07:29 PM:

C. Wingate @ 114

Wow, I didn't know Yoda was a tech writer!

#116 ::: Christopher Davis ::: (view all by) ::: April 11, 2008, 01:10 AM:

Steve C. (#111): Yup. There are usually several things that could have happened just a little differently in a way that would have avoided or mitigated the situation.

One example that comes to mind is the crash of AA 191 out of O'Hare. 273 people were killed in what was the deadliest crash on US soil from May 1979 to August 2001.

At takeoff, the #1 engine separated from the aircraft, flipping up over the wing and severing hydraulic lines. (Fix #1: not using an unapproved method for removing and reattaching engines, which caused pylon cracking.)

The severed hydraulic lines lost pressure and the slats on the left wing retracted. (Fix #2: some aircraft have mechanical locking systems to keep slats extended; the DC-10 used hydraulic pressure without mechanical locks.)

The pilots reduced the aircraft's speed to the recommended engine-out climb speed, as their procedures provided for. (Fix #3: have the procedures tell pilots not to reduce speed if the aircraft is already going faster than the engine-out climb speed.)

Due to the reduced speed and lack of slats on the left wing, it started to stall. The slat disagreement warning and stick shaker were only on the captain's side, and were powered by the engine that was no longer there. (Fix #4: replicate these warning systems on the first officer's side; fix #5, automatically provide backup power from a different engine.)

Any of those changes could have prevented the accident.

#117 ::: Lee ::: (view all by) ::: April 11, 2008, 01:15 AM:

Re airplane inspections, I'd like to point out a couple of things which are being (as usual) largely overlooked in many quarters:

1) Southwest got hit with a bunch of "inspection violation" notices which basically amounted to being late with the paperwork. They cleared the entire list in one day, without finding any significant mechanical problems. Media response: Eek, horrors, Southwest has Feet Of Clay, airline customers should Beware!

2) American got hit with a bunch of "inspection violation" notices which have resulted in two weeks of schedule disruption, hundreds of flights being canceled (with no end in sight), and planes being grounded for BROKEN SHIT. Media response: yawn, nothing happening here.

I don't think it's coincidental that Southwest also treats their employees rather better than most of the other major airlines. There's a reason I have some Southwest stock in my portfolio.

#118 ::: Greg London ::: (view all by) ::: April 11, 2008, 09:05 AM:

Lee, the first question that comes to mind is how does an airline like AA bribe the media. I assume advertising dollars. The second question that comes to mind is how much less of an advertising budget does Southwest have compared to AA.

#119 ::: Bruce Cohen (SpeakerToManagers) ::: (view all by) ::: April 11, 2008, 09:37 AM:

Lee @ 117

The issue with Southwest was that, according to the FAA engineer responsible for overseeing their inspections, his boss, who used to work for Southwest and was on very good terms with some of their managers, had been preventing him from flagging the missed inspections for more than three years. Furthermore, when he tried to press the issue, his boss threatened him with a bad review and transfer, and Southwest sent the FAA a letter saying that he was "harassing" them. There's a sufficiently suggestive pattern in the history of the Southwest actions that when the FAA finally acted, Southwest rolled over and took a $1 E7 fine rather than get bad PR from an investigation or trial. Since it is the policy of the current administration in general not to embarrass large corporations by even seeming to hold them accountable, it's doubtful if that pattern will ever be examined carefully to determine what actually happened, but I'm inclined to believe that inspector.

#120 ::: Lori Coulson ::: (view all by) ::: April 11, 2008, 10:37 AM:

Bruce Cohen @119:

Even if Southwest doesn't appear in court, the people in the FAA who prevented the inspections could lose their jobs, and are probably subject to some pretty hefty fines as well.

My guess is that the FAA's Inspector General will be looking at this to see if there's a pattern and how many other airlines may have received similar treatment.

It may not be as satisfying as having this come out in court, but measures will be taken, and there's always the possibility that there will be Congressional hearings on this...

#122 ::: joann ::: (view all by) ::: April 11, 2008, 01:19 PM:

Bruce @119

I would have said that *any* mention of all this in the news was already sufficiently bad PR for Southwest. The payment of a ten million dollar fine is also rather hard to live down. I've been avoiding Southwest for some time now, and I've just received extra impetus to continue doing so in the future.

#123 ::: Alex ::: (view all by) ::: April 13, 2008, 11:08 AM:

There is only one problem with this story: there's no evidence of anything wrong with the software at all, rather than problems with security passes (wrong ones/not enough issued), parking (problem of ex-T1 staff not recognised), familiarisation (we've never done the job in here before), and industrial relations (BAA antagonising the unions).

"Software", in this sense, is actually another word for "magic"; stuff we don't understand. Giving out "software" as an explanation is a PR man's reflex; giving out "computers always fail" as an argument for conservatism is similar. They all have the effect of hiding political/economic realities.

#124 ::: Greg London ::: (view all by) ::: April 13, 2008, 11:17 AM:

Bruce@119: according to the FAA engineer responsible for overseeing their inspections, his boss, who used to work for Southwest and was on very good terms with some of their manager

Ah, that explains it. A double agent is much better than a bribe.

#125 ::: abi ::: (view all by) ::: April 13, 2008, 11:34 AM:

Alex @123:

I never said that this was a software failure; nor did anyone from BAA. It's a testing failure.

The processes for issuing passes, getting people through security, and ensuring that they know where to park are all testable. So is the baggage handling hardware.

I happen to be a tester who specialises in software. Before I did that, I was a tester who specialised in financial processes and the production of financial records (translation: financial auditor). Other people specialise in the testing of hardware, for instance, or security processes. We all share a particular mindset, which is what I was trying to unpick in this post.

The political and economic reality is that BAA did not spend enough attention and time (translation: money) on certain aspects of the Terminal 5 opening. One of the corners they cut was testing.

#126 ::: abi ::: (view all by) ::: April 15, 2008, 03:53 PM:

I see that two senior British Airways executives have left the airline. There is of course rampant speculation. Did they jump or were they pushed?*

BA said Gareth Kirkwood, director of operations, and David Noyes, director of customer services, would be leaving.

Frankly, I think someone in BAA should have their head served up on a platter for the fiasco, though I can see how BA might take a hit for their ways of dealing with the aftereffects.

Also in that article, it mentions that many travel insurers are not paying out for delayed flights and lost luggage in the terminal. BA's and BAA's customers, in other words, are going to continue getting it in the neck for some time to come.

We are not amused.

-----

* The answer, of course, is mu.

#127 ::: abi ::: (view all by) ::: June 28, 2009, 04:47 PM:

I see that no one learned a thing from the opening of the airport.

Baggage handling breaks, airport snarls up.

Surely someone could have thought to take a few hours and create a contingency plan, just in case the baggage handling system breaks?

#128 ::: Serge ::: (view all by) ::: June 29, 2009, 03:52 PM:

abi @ 127... Was that a (bag o') trick question?

Dire legal notice
Making Light copyright 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012 by Patrick & Teresa Nielsen Hayden. All rights reserved.