Predicting Testing Outcomes (Part 1)

Over the past 25 years I’ve been capturing data on every project that I’ve worked on. As a result, on the majority of the projects I’ve worked on within the past 10 years I have been able to predict, to within 5%, how many bugs we will uncover in System Integration Testing and/or User Acceptance Testing.

The projects I’ve worked on mainly fall into three categories – retail banking, retail utilities (gas and electricity) and telecommunications. I’ve also worked in government agencies, wealth management, online retailing, transport and logistics. Over 90% of these projects have followed some form of waterfall approach and have been multi-million dollar initiatives with hundreds (sometimes thousands) of people involved. The largest (by dollars and personnel) was my most successful in terms of predicting a “Go Live” date – we were 1 week late on a prediction we made 9 months earlier. This isn’t to say we went live with no bugs – far from it; but we did go live with software that was good enough to support the business.

So, how is this possible? In simple terms – by fully understanding the context within which I work. It is absolutely critical for me to understand the overall experience of the project team, the major risks and dependencies associated with delivering the expected outcome(s), the impact and importance of each feature/function being delivered etc. It is also crucial for me to have complete control of the Testing and Implementation schedules.

I am not a fan of making grand statements or promises about what my (Test) team can achieve on a project, but I learnt many years ago that not entering into a conversation/negotiation with senior management types regarding what is achievable and what is not means you’re on a hiding to nothing.

Fortunately (for me) I have a mild form of OCD, so I check everything at least three times, recheck it another dozen times and keep records of those checks. This means that I have excellent records of what I’ve worked on, what worked and what didn’t, and why. Therefore, when I join a new project I know what to look for, what to ask, who to ask and who to believe. This means that I can build a clear and realistic picture of what is likely to happen, how it’s going to happen and when it’s going to happen. There are very few new problems on IT projects, so being prepared saves an enormous amount of time.

I spoke at a workshop in Wellington recently about known unknowns and unknown unknowns and the impact that these can have on projects. There are always unknown unknowns, it’s just a matter of how you deal with them and how you manage their impact – that’s why we conduct ongoing Risk assessments.

Knowing how many bugs we might find on any given project is useful, but it has little to do with the overall end-game – delivering a successful project. The fact that the majority of Project Managers are focused on this measure and several other (relatively unimportant) numbers means that no matter what I may think (of the value of keeping count) I still have to do it. This doesn’t mean that I manage my part of the project around these numbers either; it just means that I have to allocate some effort to keeping the PM off my back.

Even though I started off by stating that I have profiled the projects I’ve worked on over the past 25 years, I’m not saying that it’s easy to predict outcomes. I have a defined process that continues to work for me. Some may say that I’ve been lucky with my predictions, to which I’d reply “maybe, but I believe you make your own luck, and thorough preparation and meticulous attention to detail can definitely shift the odds in your favour”. Being lucky for 10 years isn’t a bad record, but then again I have walked away from projects because I didn’t like the risk profile they presented!

The bottom line for me is that if you are prepared to put in the effort, keep meticulous records and become adept at managing risk, you too can successfully predict Testing outcomes. Focus on the unknowns rather than the knowns, because the unknowns are far more likely to prevent you from being successful.

In Part 2 I will expand on the unknown unknowns (sometimes referred to as Black Swans) as these situations have the potential to derail any project.

(Revised) Dateline: Saturday January 11, 2014


28 thoughts on “Predicting Testing Outcomes (Part 1)”

  1. Boy, I really don’t like this blog’s reply template. Not at all. Nothing personal, Colin, but I wish you could change it.

    My last post was meant to refer to this, along with the commentary that follows:

    http://www.developsense.com/blog/2010/10/project-estimation-and-black-swans/

    This is a long series, but I think it has some important things to say about why we should be very skeptical about estimation and planning, and more oriented towards preparation and steering.

    —Michael B.

    • Hi Michael,

      Apologies for the WordPress software; I obviously get a different view to you guys, as I can see everything quite clearly.

      I’ll do the reading you suggest, so that I understand better your viewpoint and then I’ll be more than happy to set up a Skype session. Give me a few days and we’ll see what we can do.

      On the point about estimation vs prediction, I thought I differentiated these previously. I’ll go back over my original post and see if I can make it clearer.

      Thanks again for your interest.
      Colin

  2. Colin,

    I’ve also seen those projects with the 200%, 100%, 50%, … variance clauses. I generally tend to rip into someone for that. If you can’t make a prediction that you think is within 10-25%, why make it at all? There is no planning, or anything else, that you can base on a prediction that could be out by a three-figure percentage.

    But even then, predictions are all fine and well, but PMs suffer from acute reactionism. Anything you give them will end in a flurry of knee-jerk reactions. So I stipulate that you should stop giving predictions and stand firm on what you described above (changing the company mantra). If you can’t, you should walk. That would be the ethical thing to do. And I’m guilty of that too, but it doesn’t mean we should state here that it is a valid and conscientious decision. It is a bad one at that, and it perpetuates the foul core we have in the IT business.

    Geoff,

    90%? Sure it’s not 91%? You list all these numbers, sounding very (pseudo-)scientific, and then you take it all back by saying “it’s finger-in-the-wind”. That’s what we’ve been discussing with you for ages now. Can I put it plainly? You are deceiving your customers by having clever smoke and mirrors with all those assumptions, and when it all turns to shit, then it wasn’t you because it was all just “finger-in-the-wind”. I know the tactic. You’re surely not alone in using this sales methodology and I despise it to the extreme.

    Put some meat on it, mate, and stand by what you say, or don’t say it at all. Stop doing management hogwash and calling it testing or test management. If you’d call it finger-in-the-wind, that is a sure-fire signal to say nothing at all.

    • With respect Oliver, that’s a load of bull. I do not deceive my clients & resent those accusations, which I consider totally unprofessional behaviour. As I state, I have used this approach successfully for many years & suggest that only if you had been on one of my programmes and witnessed the proceedings first-hand would you be qualified to make that sort of comment. Therefore I have to dismiss your input as fundamentalist claptrap (hmmm… perhaps I should publish in NZTester?). Grrr.

      • I am sorry that I worded my reply to Geoff above as I did. I stated a few things that are simply not true and they imply things I actually didn’t want to say. I in no way believe that Geoff is doing a bad job or is actually deceiving his customers. What I should have said is that we need to be careful about what figures we present and what risks are inherent in them. Geoff’s reply didn’t show the actual spectrum he uses when doing his job (and I doubt any blog post from anyone can).

        Geoff has a wealth of knowledge and experience that makes him sought after in the market. And there is no doubt that he has had a long string of successes throughout his career. I especially commend his involvement in the testing community by publishing testing magazines and holding testing meetings. As he has pointed out, I have never worked with or for him, so far be it from me to comment on what he does or has done.

        I’d like you (Geoff) to excuse my post above as a really bad rant on my part. I will in future stay on point and discuss the actual issues, as I should have done in the first place.

      • All good Oliver, thanks. I apologise for my knee-jerk response; I really must learn not to respond without properly thinking things through!

      • I believe Oliver was correct to say what he did. I had a three hour conversation with you, Geoff, wherein you advocated using reckless and bogus mathematics– statistics that you freely and repeatedly claimed that you were not qualified even to understand. You told me that you didn’t know about statistics– that you didn’t even know what statistical variance is– but that was okay because all you were doing was “crunching numbers.”

        If you want to use numbers in magical ways, we can’t stop you. But we can call it what it is: Pseudo-science. Reckless. A load of bull.

        Please clean up your act.

      • I think relatively few people consciously deceive their clients. I think relatively more people deceive themselves subconsciously, and deceive their clients inadvertently. I perceive self-deception when I see people say things that are in opposition to one another. Here’s an example:

        (in defense of predictions of +/- 5% accuracy) “If we know that we are passing on average x no. of test cases/day, logging x no. of defects/day & closing x no. of defects/day, we can use these velocities to forecast where we may end up…”
        followed by
        “nothing more than a glorified finger-in-the-wind”

        When I find myself in a contradiction like that, I worry that I’m fooling myself, especially about the first part.

        I’m aware that I can prepare myself to deal with lots of different project problems and respond to lots of different circumstances, refining my tactics and my logistics as I go. A statement of the second kind above is all I need to justify that. I would be concerned that a statement of the first kind would distract me or my clients from the truth of the second kind.

      • Gents, as this is Colin’s blog, I shall refrain from posting any more comments here on this particular subject.

        James: I’ll respond to your comment directly.

        Michael/Oliver: more than happy to continue discussions directly.

        Geoff

    • Hi Oliver,

      I know this may sound like semantics but I have to distinguish between predictions and estimates. The estimates I am asked to provide (sometimes 12 or 24 months ahead of a planned Implementation Date) are just that – “estimates”; they’re not predictions. My predictions are never made before we have begun meaningful and controlled testing activities. Please also remember that the title of this piece includes the proviso “(if I have to)”.

      Going back to the comments regarding the “100/200%” variances: I agree that it’s not ideal, I agree that it’s not easy, and it’s also fraught with danger in the wrong environment. But most of the projects I’ve worked on (and specifically the ones I was referring to in my original post) have very long lead times, very large budgets and often thousands of people involved, and this requires multiple levels of approval. It’s not uncommon for budgetary approvals to take 3 to 6 months and therefore very early estimates are required. If the PM and the rest of the management team know their jobs then there is little risk in providing these estimates. If, however, they do not, I will in all likelihood walk away sooner or later. I have probably walked away from half a dozen projects over the past 20 years and it was always because of unrealistic and/or poor project management.

      I like to think of myself as a pragmatist and that’s why I have to be flexible when it comes to estimation and prediction. Even though I may produce the “numbers” I may not always be held accountable because my input is part of a greater body of work for which my expertise has been sought. This is especially true when it comes to bidding for outsourced projects. Again, I have walked away when I have disagreed with the management of a bid.

      This is tough stuff that needs to be debated and made public. If it was easy I wouldn’t be writing about it….

      With respect to your comments to Geoff (and his to you), it may be better if you and Geoff discuss your differences away from my Blog, as I do not wish to comment on either side’s assertions or allegations. I can’t administer comments on my Blog about stuff I didn’t write – sorry guys.

      Thanks again Oliver for your insights and continued support of my Blog.

      • Colin…

        “I know this may sound like semantics”

        Yes. Semantics are important. Semantics is about what people mean by what they say. That’s why it’s important, when you mean “estimation”, that you say “estimation” and not “prediction”. That helps to avoid all kinds of confusion triggered by the fact that your original blog post refers entirely to prediction, and not to estimation.

        I’d reply to some of your other points, but the threading system on this blog is too incoherent (in reverse order by initial reply, in ordinary date order after that, and a limited number of reply levels on each initial) for me to untangle it. We should have a Skype conversation instead, I think, if you’re interested and willing.

        For now, suffice it to say that I believe Oliver is on the right track here. Make an estimate; make a prediction. Self-aware systems (that is, individuals and groups of people) will do lots of things, mostly unconsciously, in response to the prediction. Your context shapes your choices, which in turn shape your context, which in turn shapes subsequent choices, and so on, until the project ends.

        After the project ends, we tell a story about what happened—but because we know the result, our narrative tends to focus on the things that help us make sense of the story, and typically on the things that we remember had a happy outcome. In looking over “successful” estimation, we tend not to count the dropped features; the reductions or increases in project scope; the “outliers” that were, in retrospect, so obviously bad predictions that we shouldn’t have made them at all; the bugs that we would count as bugs but that the project manager dismissed; the severe bugs that the project manager downgraded to “cosmetic”; the prediction that we made “too early” (because, as we now realize, we didn’t have valid data); and the projects we walked away from. And if we do count those things, we usually do so to justify the errors in our predictions. Trouble is, there’s no experiment that we can do to show that our estimates would have been right “if only that one (those several) thing(s) had gone differently”.

        It’s your option as to whether you read The Black Swan before or after our chat, but I urge you to read it eventually.


        —Michael B.

      • Can’t reply to Michael’s comment below, but I did a quick Google on the semantics. I think it’s not clear-cut. In the colloquial sense that we are using here both terms are next to synonyms. In statistics they do have a clearer mathematical difference, but that doesn’t apply here.

        One thing a prediction (vs. an estimation) allows is to make a forecast without any proof/process of calculating a number.

        I stipulate that what we’re actually doing here is giving a forecast. To call the models we have an estimation, I think, goes over and above what they are and – as Michael said – can allude to more than there actually is.

        The word estimation, in my opinion, is overused and misused in IT. It’s a darling of PMs everywhere and the deception works, never mind whether it’s conscious or not.

        Maybe it’s time to change that mantra, call things what they really are, and then actually have the chance of dealing with the outcome of that discussion. We cannot calculate away project risks. Let’s call them forecasts so we highlight the limited input we have had – more akin to the weather forecast.

        As for success and successful, again I think that is an overused and misused word in IT. Success can have several criteria, and next to never does a project define what success is for it.

        If someone says that a project was successful, that can cover the full bandwidth of projects out there, including the failed ones! Success or not is politically charged.

        To me, success in the simplest form means on or below budget, full functionality, on the original release date and a quality product. Of all the projects I’ve been on, I doubt more than 1-2 have actually managed that. So under my definition next to all projects are failures. Maybe there needs to be a term between success and failure, or something above success, to reflect things more accurately. But the fact remains that most projects sold as a success today are not.

        I am not saying that success is actually a possible thing, because the goal that is given comes from an estimate (actually a forecast), and as we can see those have a varied track record.

        So success nowadays is mostly in the eye of the beholder and, as Michael pointed out, we definitely apply selective remembering.

      • Hi Oliver,

        Excellent points. I’m quite interested in the “forecast” concept, so I’ll consider using it instead of estimate, but I’ll stick with my Prediction usage for now.

        With regards to success criteria, I agree with most of your comments. However, too many projects fail to clearly define “success” at the outset, and therefore achieving it becomes a matter of opinion, which then leads to all sorts of claims and counter-claims.

        I think my next Blog may be on this subject – “SUCCESS & FAILURE – the same or different?”….

  3. I do wonder if this is a bit of the old physics conundrum, where the observation of something alters the result.

    Maybe you stop (i.e. lean back and cruise) testing because you have reached the number of defects you were expecting. That would certainly give you a VERY predictable outcome. And if this is several testers, maybe they are all aiming to please their manager by hitting the target!
    And I’m not even suggesting that it’s done consciously! So reflect on it and see how the estimates and numbers actually affect your execution.

    • Interesting point Oliver. I’ve never been aware of this, though. My ultimate goal is always to ship the best possible product. In my world the number of defects is irrelevant; quality will always overrule quantity (of tests or defects). However, there is always the danger of constraining or defining one’s own reality via preconceived ideas and biases.

      Good to hear from you. Happy New Year…

      • HNY to you too. My current observation is that we are bigger slaves to instinct, illogic and unconscious processes than we’d like to admit. We need to heed our own reactions more than we currently do. As Michael pointed out, there are a lot of publications that should be on the reading list of testers, just to make us more conscious of the bad decisions and conclusions we make.

  4. Pingback: Testing Bits – 12/29/13 – 1/4/14 | Testing Curator Blog

  5. Perhaps I could toss 5c worth or so in here as well.

    On the question around % variance: in my experience estimates are of greater validity the further through a project we are, simply because we have more water under the bridge to draw estimates upon. If we know that we are passing on average x no. of test cases/day, logging x no. of defects/day & closing x no. of defects/day, we can use these velocities to forecast where we may end up, assuming of course that we are able to maintain them. We can also apply “what ifs”, e.g. can we increase the velocity if we add a couple of testers, if we fix fewer defects, etc.

    Re: predictions, yes anyone can make a prediction. However, once you’ve been in charge of a few programmes, especially large and/or complex ones, you quickly learn that patterns do emerge around particular circumstances & it is often scary how many workstreams in difficulty are so due to the same problems. Therefore it’s usually quite easy to make predictions about where things will go without remedial action & which strategies will be best to deploy to turn things around.

    Re: stopping testing, I’ve recommended this a few times (although no-one has been game enough to act on it). The intent is to allow the developers sufficient opportunity to 1) fix the known defects & 2) conduct further testing, without the testers continually bombarding them with more & more problems & thus creating a snowballing backlog. I sometimes set up a pre-test phase that is intended to determine whether a release is in fact testable or not. If a certain no. of higher-severity defects are found within a certain space of time, back it goes for further unit testing. Interestingly enough, each time I’ve recommended test suspension & it has been ignored, the programme has gone exactly the way I thought it would – further into the quicksand, with 2-3 deadline delays before finally getting over the line, & even then the client has spent at least 6 months mopping up the mess.

    Geoff
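    The arithmetic behind the velocity-based forecast described above can be sketched in a few lines. This is a minimal illustration only, with invented numbers, and it assumes the daily rates really do stay constant:

    ```python
    # Minimal sketch of a velocity-based forecast (illustrative numbers only).
    # It assumes the daily velocities observed so far stay constant, which is
    # exactly the assumption flagged above.

    def days_to_clear_backlog(open_defects, logged_per_day, closed_per_day):
        """Days until the defect backlog reaches zero at constant velocities."""
        net_burn_per_day = closed_per_day - logged_per_day
        if net_burn_per_day <= 0:
            return float("inf")  # backlog never clears at these velocities
        return open_defects / net_burn_per_day

    # Baseline: 120 open defects, ~5 logged per day, ~9 closed per day.
    print(days_to_clear_backlog(120, logged_per_day=5, closed_per_day=9))   # 30.0
    # "What if" variations: faster fixing, or more testers finding more bugs.
    print(days_to_clear_backlog(120, logged_per_day=5, closed_per_day=12))  # ~17.1
    print(days_to_clear_backlog(120, logged_per_day=7, closed_per_day=9))   # 60.0
    ```

    A forecast of this kind is only as reliable as the constant-velocity assumption, which is precisely what is contested in the replies below.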

    • Geoff…

      Before you talk about variances and averages, could you please read the second paragraph of my first reply (“One problem with predictions about bugs”) instead of pretending that it (and more importantly, the problem it refers to) doesn’t exist? Talking about “the average defect” or “the average test case” and the “average time to fix a bug” is like talking about “the average problem” and “the average vehicle” and “the average repair” when a problem could be a squeaky wheel or a fuel leak, the repair could be a drop of oil or an engine redesign, and the vehicle could be a tricycle or an Airbus A380. Yes, programmers can make some reasonable guesses about how long it takes to fix a particular problem when they know something about its nature, and a reasonably dim project manager might aggregate the estimates for each bug (and double it) to predict a ship date. (Once again, the product will ship when the showstopper bugs are fixed, and the prediction won’t have anything to do with it except to provide some anchoring bias as to when the ship date “should” be.)

      But if you’re determined to persist in abusing mathematics, try this at your first opportunity: go to a senior programmer; tell him that over the last three months, programmers have fixed 245 bugs; and tell him that there are 137 bugs currently in the database. Without giving him any other information, ask him to predict when the product will ship. If he gives you a date instead of a scornful look, try reminding him that some testing will happen between now and the ship date he’s predicting, and that the number of bugs in the product at this moment is in fact unknown.

      We can use these velocities to forecast where we may end up, assuming of course that we are able to maintain them.

      Where we may end up, assuming we can maintain the velocities. Right. And if we don’t maintain the velocities, we’ll continue to work on the product until it’s in a shippable state, ignoring the prediction, and (if someone is rude enough to bring it up) chalking up the prediction error to an unfortunate assumption or two.

      People, if you haven’t already, please read The Black Swan. Please read Kaner & Bond’s paper, “Software Engineering Metrics: What Do They Measure and How Do We Know?” Please read Kahneman’s Thinking, Fast and Slow.

      “The intent is to allow the developers sufficient opportunity to 1) fix the known defects & 2) conduct further testing, without the testers continually bombarding them with more & more problems & thus creating a snowballing backlog.”

      a) Presumably the stopping (you said stopping, but I believe you mean pausing, not stopping) of testing wouldn’t make the unknown problems go away. It would simply delay their discovery. Right?

      b) Is stopping testing the only way to prevent the programmers from being overwhelmed by problem reports (you said problems, but I believe that you mean problem reports)? Could the programmers work on the backlog while the testers continue to investigate the product? Could the testers find problems and the managers prioritize them, such that the programmers could fix the most important ones when the backlog is empty, and such that the testers could find potential workarounds for the not-so-bad ones?

      What you’re describing sounds to me like a car journey in which someone says “We’re behind schedule… it takes time to look out the window, so let’s stop looking out the window until we’re back on schedule.”

      • Gday Michael, thanks for the response.

        Your perspectives lead me to believe that you’re referring more to a software development-type scenario. The reality in my work, which is 90% directing or managing software implementations, is that these techniques do work; I’ve proven them over & over again, therefore it is difficult for me to accept some of your points as real-world valid. Regarding the abuse of mathematics, software testing IMHO cannot be confined within the boundaries of an absolute science. There are too many variables, which is why I tend to forecast on a daily basis an approx. finish date based on the approaches I’ve outlined & clearly state the assumptions around my workings – nothing more than a glorified finger-in-the-wind, because in my experience this is the best we can get to.

        To address a couple of specific points:

        Stopping Testing: yes, I mean suspension, not termination.
        Your comment: “It would simply delay their discovery. Right?”
        No, the supposition is that the developers perform more unit testing on the product, hopefully finding those defects before releasing back to Testing. In this example I, as Test Manager, have effectively rejected the release because it has failed my entry criteria by being too buggy to test any further. Bear in mind that my expectation is that the release has been properly unit tested by developers before it is released to my team.

        Your comment: “And if we don’t maintain the velocities, we’ll continue to work on the product until it’s in a shippable state, ignoring the prediction, and (if someone is rude enough to bring it up) chalking up the prediction error to an unfortunate assumption or two.”
        Again no, if it turns out that the velocity cannot be maintained then our daily assessment will make that clear & I then have the option to make changes: add testers, fix fewer defects or whatever. However, I hope I’d be smarter than that & pre-empt the situation by making those changes before we slip further behind, or as a mitigation exercise.

        Geoff

  6. Anyone can predict anything. That’s not at issue. Other things are at issue, though. How do other people use your prediction? Suppose it comes to pass, or doesn’t; what difference does it make? What are the costs of making the prediction? What are the consequences when you’re not only wrong, but dramatically wrong? I agree that it’s a good idea that you should be prepared to manage the unknown unknowns (and the known unknowns, and the knowns)… but what does the prediction have to do with that?

    One problem with predictions about bugs is the problem of asymmetry and the fact that bugs live in fat-tailed domains. That is: it’s easy to postulate that there’s an “average” bug, with an “average” time to fix and “average” consequences. But whatever average you come up with is an average based on your observations so far in an open-ended domain. The time to fix a bug may be minutes, days, or weeks… or years. The impact of a bug may be nil or it may destroy the value of the company. Many bugs get fixed before they appear in your measurements—how do you account for those? And plenty of bugs never get fixed, so don’t get counted in the same way as the ones that do. It would be easy for you to say that since you can’t do anything about it, you simply ignore them—and I say you’d be right to do that, but you might as well abandon the rest of it too. Here’s why:

    a) Suppose you found dramatically more bugs than you had predicted. You’d probably be surprised, and concerned, and you’d probably try to figure out what was going on in the development process (or maybe you’d be concerned that the testers were hypersensitive). And you’d take steps to fix what you had found. b) Suppose that you found dramatically fewer bugs than you had predicted. You’d probably be surprised and concerned, and you’d probably try to figure out how to broaden or deepen your test coverage (or perhaps you’d be concerned that the programmers weren’t working hard enough or quickly enough). And you’d adjust the project to fit your findings.

    OK. But here’s my point: How is (a) fundamentally different in character from “suppose you found a lot of bugs?” How is (b) characteristically different from “suppose you found few bugs?” In each case, you’d probably take steps to manage the project (or to help the people who are doing that). In other words, what does the prediction have to do with it?

    In fact, I think that you probably do something different from what you’re laying out here. You probably take the time to observe and analyze and understand every bug–every piece of evidence that things aren’t according to Hoyle. Predicting how many bugs you’ll find in a product is like a chef predicting how many customers are going to succumb to food poisoning. Wise chefs don’t bother with that; they anticipate, detect, and address the problems that would sicken their customers. That is, they deal with what needs to be dealt with, and I suspect that, in reality, that’s what you do too.

    You say “Managers are focused on this measure and several other (relatively unimportant) numbers means that no matter what I may think (of the value of keeping count) I still have to do it.” I disagree. You don’t have to do anything. You may want or prefer to do it, presumably because it saves you from a more challenging conversation. In fact, I see it as a disservice to the manager to keep encouraging him to depend on his security blanket, rather than helping him to grow up. All of the effort that you expend keeping the PM off your back, instead of helping him learn how to be an effective PM, is wasted effort with respect to the next time you have to do it.
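    A toy simulation can make the fat-tail point above concrete. The distribution and parameters below are invented purely for illustration; the idea is that when a rare fix takes orders of magnitude longer than a typical one, the running “average time to fix” keeps drifting rather than settling on a usable number:

    ```python
    # Toy illustration of averages in a fat-tailed domain (invented parameters).
    # Most fixes are quick, but a rare few are enormous, so the running mean
    # of "time to fix" keeps jumping as the extreme cases arrive.
    import random

    random.seed(1)

    def fix_time_hours():
        # Heavy-tailed draw: a Pareto with shape close to 1 has a very fat tail.
        return random.paretovariate(1.1)

    total = 0.0
    for n in range(1, 2001):
        total += fix_time_hours()
        if n in (10, 100, 500, 1000, 2000):
            print(f"after {n:4d} bugs, mean fix time so far: {total / n:8.2f} hours")
    # The printed mean typically shifts in jumps as rare huge fixes land, so an
    # "average bug" computed from observations so far is a shaky basis for prediction.
    ```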

    • Hi Michael,

      Thanks (1) for taking the time to read my post and (2) for providing some really insightful comments and questions. I’ll do my best to address each point.

      Firstly, let me provide some context to my post. As I have worked predominantly on very large (sometimes even enormous) projects, some normalisation around defects is pertinent when it comes to setting expectations, planning resources and scheduling major events. This means that most predictions are made once I have (as you put it) observed and analyzed key factors and drivers for the overall project. I don’t make predictions before ANY Testing has been undertaken; I only estimate based upon scope and requirements. As I said, these projects are sometimes massive and therefore estimates are provided based upon the timing and project activity at the time. If we are over 2 years away from our proposed Implementation Date then estimates may have up to a 200% variance. If the Implementation Date is 12 months away, estimates may have up to a 100% variance, and so on. My predictions, however, are not made until we have evidence regarding Requirements volatility and sturdiness, Design techniques, Coding velocity and quality/predictability, along with User involvement and experience. Project governance is also a factor.

      Secondly, let me address your comments and questions individually.

      1) Anyone can predict anything – I’m not sure I agree with this statement. If someone predicts something that turns out to be utter rubbish no-one will ask their opinion again. If someone predicts something that turns out to be accurate then they’re opinion will be sought again. When I have performed governance-type roles on projects I have always questioned estimation and prediction techniques very carefully and it was through performing this type of role that I began my own prediction work.
      2) How do other people use your prediction? – the obvious answer to this is “it depends”. Sometimes my predictions are used in project reports to senior management, sometimes they are used by vendors (or by PM’s to challenge vendors), sometimes they are used to manipulate project momentum. There was a case a few years ago when a banking project (where I was the Program Test Manager) was stalled due to a defect fixing backlog. I predicted that unless we STOP Integration Testing for at least two weeks the product would never reach the required stability for the project to EVER enter UAT. Eventually (after about 4 weeks of negotiation with the PM) Testing was stopped for a minimum of 2 weeks (it ended up being around two and a half weeks before we resumed) and we subsequently entered UAT on schedule and met an acceptable Implementation date as well.
      3) Because predictions can come in various sizes and guises the cost or impact of my predictions can also vary. In summary, I undertake predictions as part of my daily what-if analyses. Some of my predictions are for my management team only, some are for a wider audience and some are aimed at changing project/program momentum or culture.
      4) I accept your points regarding bug impact. I probably spend more time assessing and understanding the impact of bugs than anything else. My approach has always been from multiple perspectives, mainly categorised by impact on the business, impact on the customers, impact on the schedule, impact on Testing, etc. This then leads to discussing and attributing priorities, with the business representatives determining what happens next.
      5) Your (a) and (b) points/assumptions require additional clarification. Are you saying that no matter whether things are going better than expected or worse than expected, the same type of action is required? Also, are you saying that predictions have no value because you have to do the work anyway?
      6) I’m not sure that I get your reasoning regarding the chef comments either. There are overarching/underlying risks associated with any project or program of work and some outcomes may have the capacity to kill or injure people – I just see this as risk analysis. However, I don’t see how this invalidates my strategy for predicting outcomes.
      7) Lastly – I think, in retrospect, my comments regarding Project Managers probably require more context and background. I have (obviously) spent many years influencing PMs (sometimes successfully and sometimes unsuccessfully). On balance I believe that I have gained far more ground than I have lost. However, sometimes the PM (and I) also fail to influence the client sufficiently and corporate guidelines and history over-ride our attempts to change the culture within the organisation with respect to how projects are run. A great example of this was when I undertook an extensive review of how a major government agency conducted its Testing activities. My recommendations regarding measurement and metrics were greeted with “we’re not ready for that type of transformation yet”, to which I replied “you can’t afford not to do this”. The final response was “we will wear it for now and try again sometime in the future” – code for “we’re putting that in the too-hard basket”…. So, you see, sometimes I win and sometimes I have to bide my time; however, I am a pragmatist and sometimes you just have to walk away and say “I’m afraid I can’t help you after all”.

      I hope that answers your questions/concerns. I really do appreciate your time on this and look forward to your responses to my questions.

      Cheers
      Colin

      • Hi, Colin…

        1) I don’t understand what you mean by this statement “… some normalisation around defects is pertinent when it comes to setting expectations, planning resources and scheduling major events.”

        2) “If we are over 2 years away from our proposed Implementation Date then estimates may have up to a 200% variance. If the Implementation Date is 12 months away, estimates may have up to a 100% variance, and so on.” I don’t understand how this squares with your earlier statement, “I can tell you that I can predict to within 5% how many bugs we will uncover in System Integration Test and User Acceptance Test – IF the project falls within the scope of projects I’ve worked on previously.” I was going to ask “To what p-value is that? Plus or minus what percentage 19 times out of 20?” But I guess I now also have to ask, “How close to release do you have to be before you’re going to give your +/-5% estimate?”

        Apropos of (1) You claim that you don’t agree with the statement “Anyone can make a prediction.” But that’s just a fact. Anyone CAN make a prediction. You make a different assertion: ” If someone predicts something that turns out to be utter rubbish no-one will ask their opinion again. If someone predicts something that turns out to be accurate then they’re (sic) opinion will be sought again.” That is, you seem to be asserting that people who make inaccurate predictions will be ignored. From what you’ve said, that’s not true; you’ve apparently made predictions that vary by 100 or 200 per cent. Those don’t sound like accurate predictions to me (they’re certainly inconsistent with your claim of 5% accuracy), and yet you seemed to keep your job; and presumably they sought your opinion again. Maybe the bosses paid attention to what you advised while forgetting that you made a spectacularly inaccurate prediction.

        Apropos of (2) “…a banking project … was stalled due to a defect fixing backlog…I predicted that unless we STOP Integration Testing for at least two weeks the product would never reach the required stability for the project to EVER enter UAT.” I’d like you to consider this question very carefully: how does stopping integration testing help the product to reach required stability? That is, how does stopping testing affect fixing? To me, what you’re saying is like claiming “we cleaned up the corruption problem at City Hall because we stopped the reporters from investigating for two weeks.” (It might be the case, for example, that triaging or prioritizing or otherwise discussing problems was costing time for the developers such that they were distracted from fixing problems. If that’s true, OK—but then it wouldn’t be because you stopped testing, but because you stopped meetings or some other form of distraction.) But again: I don’t see any link here between your prediction and what was actually going on in the project. Had you predicted an enormous volume of bugs? A small volume? Were you way over or way under the prediction? In any case, so what? Would you have behaved any differently if your prediction had been wildly out of whack or razor precise?

        I’m going to stop here, for now, because each subsequent paragraph raises even more questions. I’d like to start clearing this first bit up first.

        —Michael B.

      • Hi Michael,

        1) “… some normalisation around defects is pertinent when it comes to setting expectations, planning resources and scheduling major events.”
        I’ll give you two examples of this:
        i) When major initiatives are in the early stages of Planning, funding and resource estimates are required to gain various approvals to continue. I’ve been asked on these occasions to provide information regarding which Testing phases are required, estimate how long they will take and how many people I’ll need. This question may be asked two years before the “anticipated” Implementation Date. Because I have nowhere near enough information at this point in time I estimate (with many provisos and assumptions) my requirements – this is where the 100% / 200% variances come in. It usually requires many iterations of analysis and discussion (with Business and Tech leads) but we have to provide a basis upon which the project can proceed.

        ii) If my company is bidding for an outsourcing contract my team and I are asked to provide estimates for the Testing phases – again this may be well in advance of having all the necessary information but my CEO wants to bid for the work, so I use my knowledge and experience (and various software tools) to estimate what is required. Again provisos and assumptions are presented with the estimate.

        As I said in my original post, I don’t like having to do this but a major part of my job is to undertake this kind of analysis. It’s really arduous and time consuming but a necessary evil in the type of work I’ve been engaged to do.

        2) The “5%” variance is possible because we have gathered enough data to be able to plot sufficient what-if scenarios to come up with a set of numbers (including the expected number of bugs). I qualified this, as you highlighted, and yes, I don’t always get it right, but I get it right often enough (say 8 out of 10 times). If I’m wrong it’s not something that transpires instantaneously; major projects move slowly and therefore I can add staff, negotiate more time or money etc. and add the information into my knowledge base for next time.

        3) I’ll accept that in your interpretation ANYONE can make a Prediction.

        4) In the case of the banking project my main reason for stopping testing was that the prevalence of bugs within certain (very important) functions was so great that the level of instability/volatility of the test environment meant that test results could not be relied upon. I believe that testing outcomes are only valid within a “controlled and predictable” environment so it was a waste of time and effort to continue testing until an acceptable level of stability had been restored (via bug fixes).

        Does that all make sense?
