24 January, 2018

Heuristics in Testing

Definition of heuristic: involving or serving as an aid to learning, discovery, or problem-solving by experimental and especially trial-and-error methods.

Human brains use strategies, or mental short-cuts, all the time to ease the mental load of processing information and making decisions. Examples of these mental short-cuts include using a rule of thumb, making an educated guess, guesstimating, stereotyping, following your intuition, or using common sense. They are also known as heuristics.

Without heuristics we’d need to carefully consider and analyse every possible outcome of every single choice we make, including many choices that we’re not even aware that we’re making, and we’d be completely unable to function and get things done. Here we’ll take a look at where our own heuristics might trip us up, and at some heuristics that have been developed to help with testing specifically. Please always bear in mind, though, what Boris Beizer, in his 1990 book Software Testing Techniques, Second Edition, called the Pesticide Paradox: software undergoing the same repetitive tests eventually builds up resistance to them. Keep your eyes open and keep varying your heuristics.

The first heuristic that most of us are aware of is the "Rule of thumb". This is a very broad approach to problem solving which allows us to use information we already have. For example, a small child who doesn’t like mashed carrot might decide that they also don’t like mashed sweet potato, without trying it, because it is a similar orange mush. In testing we also use rules of thumb, albeit with a slightly more educated thumb: we might already know that a calculation behind a form is tricky, so we’ll always add a good regression test pack to ensure it is tested thoroughly before release (and how do we know what makes a “good” regression test pack? That might be considered making an educated guess).

Most of our heuristics are based on our own experiences, or on our witnessing someone else’s experience, and they can be incredibly useful in speeding up our decision making. There is also a danger of building in bias, like the toddler missing out on sweet potato. In testing we come back to Boris Beizer’s Pesticide Paradox, because if we keep looking at the same scope for bugs, we’ll probably miss any bugs outside that scope.

There are two approaches that can help us avoid getting caught in our own bias: firstly, we can challenge the assumptions that our own heuristics are based upon, and secondly, we can call on more diverse experience by applying heuristics from other testers. Neither approach is perfect, but they both reduce the risk of personal bias.

Human brains like to pattern-match (it’s the basis for a lot of our heuristics), and this leads to three areas that are worth challenging:

Perceived Regularity

Without rational statistics to back it up, we tend to estimate the regularity with which something occurs as either far higher than it actually is (like pregnant people suddenly seeing babies and other pregnant people everywhere, when statistically there are no more than usual) or far lower (like the lucky people who never have to queue to buy a train ticket in the morning, although statistically there is an average four-minute wait). Sometimes this bias can work in our favour. For example, when a risk is perceived as coming to fruition far more regularly than the statistics bear out, that tends to indicate that while the probability of the risk is lower than perceived, the impact is high and it should therefore be tested. Even so, it’s worth trying to put some numbers on the “feelings” to see if the statistics bear out the estimated regularity.
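One way to put numbers on the “feelings” is to write down the gut estimate next to the counts from a defect log or monitoring tool and compare the two. The sketch below is only an illustration: the class name, the tolerance, and the example figures are all hypothetical and would need replacing with a team’s own data.

```python
from dataclasses import dataclass


@dataclass
class RegularityCheck:
    """One 'feeling' about how often something happens, plus the observed counts."""
    label: str
    perceived_rate: float  # the team's gut estimate, as a fraction between 0 and 1
    occurrences: int       # how many times it actually happened (hypothetical data)
    opportunities: int     # how many times it could have happened

    @property
    def observed_rate(self) -> float:
        # Avoid dividing by zero when nothing has been measured yet.
        return self.occurrences / self.opportunities if self.opportunities else 0.0

    def verdict(self, tolerance: float = 0.05) -> str:
        """Say whether the gut estimate is roughly right, too high, or too low."""
        gap = self.perceived_rate - self.observed_rate
        if abs(gap) <= tolerance:
            return "about right"
        return "overestimated" if gap > 0 else "underestimated"


# Illustrative figures only: replace with counts from your own logs.
checks = [
    RegularityCheck("payment step times out", perceived_rate=0.50,
                    occurrences=6, opportunities=200),
    RegularityCheck("login fails on first attempt", perceived_rate=0.01,
                    occurrences=30, opportunities=200),
]

for check in checks:
    print(f"{check.label}: perceived {check.perceived_rate:.0%}, "
          f"observed {check.observed_rate:.0%} ({check.verdict()})")
```

An overestimated risk is not automatically one to drop: as noted above, it may simply be one whose impact is high enough to be memorable.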


Probability

Probability feels like it should be very rational, and statistics may not help with challenging it. For example, if you live in a country that’s not prone to seismic activity, and you’re next to a railway line, the vibration you feel through the ground is probably due to a train rather than an earthquake. Most of the time that’s a really useful heuristic, but it can cause problems when the assumption is made that probably means definitely: ignoring all the signs that it’s an earthquake, and not standing in a doorway, could very well cause an issue!

The majority of the time in testing, probability will come up during risk assessment and defect management: our users will probably all be over 18, or a fault is probably due to widget X failing. Again, we should really focus our challenges on the assumption that probably means definitely, and make sure we’re not excluding other possibilities completely simply because they’re less probable.
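As a minimal sketch of not letting “probably” turn into “definitely”, investigation effort can be shared out by estimated likelihood while reserving a minimum for every possibility, so that nothing is excluded completely. The function name, the weights, and the candidate causes below are assumptions made up for illustration.

```python
def allocate_checks(likelihoods: dict[str, int], total_checks: int,
                    minimum: int = 1) -> dict[str, int]:
    """Share out checks by estimated likelihood, but never give a possibility zero.

    Rounding means the shares may not add up exactly to total_checks.
    """
    total_weight = sum(likelihoods.values())
    if total_weight <= 0:
        raise ValueError("likelihoods must include at least one positive weight")
    # Reserve the minimum for every possibility first, then split what is left.
    spare = max(total_checks - minimum * len(likelihoods), 0)
    return {
        cause: minimum + round(spare * weight / total_weight)
        for cause, weight in likelihoods.items()
    }


# Hypothetical gut-feel percentages for the cause of one fault.
suspected_causes = {"widget X failing": 85, "bad configuration": 10, "network timeout": 5}
plan = allocate_checks(suspected_causes, total_checks=20)
print(plan)
```

The most probable cause still gets most of the attention, but the less probable ones are always looked at.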


Stereotyping

If it looks like a duck, and quacks like a duck, it’s probably a duck. Stereotyping is one of our most basic pattern-matching heuristics and is also one of the most problematic. It’s why our toddler isn’t eating sweet potato, and there are many examples of where stereotyping has failed in society at large. To look more specifically at testing, it can result in an unconscious decision to focus testing based on a negative or positive prevailing stereotype (“this database always causes all our problems”, “that programme is always perfect”). Again, this is somewhere that statistics can help, and seeking diverse opinions, perhaps from someone not directly involved with the project, can also help challenge any prevailing stereotypes.

Heuristics from Other Sources

There are a large number of testing heuristics available for a Tester to use – some are specific to particular types of testing, others are applicable to specific testing tasks.

A useful heuristic for identifying Functional Requirements comes from Gause & Weinberg’s Exploring Requirements. It is a slightly different version of the old heuristic “Who, What, Where, When, How, Why?”, and it is: Users/Functions/Attributes/Constraints.

i. Who will be using the System?

ii. What will they be doing with the system?

iii. How will the System be doing the thing?

iv. What will stop the users or system doing things?
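The four questions above can be kept as a simple checklist so that an unanswered heading is easy to spot. This is a rough sketch only; the class and field names are hypothetical and simply mirror Users/Functions/Attributes/Constraints.

```python
from dataclasses import dataclass, field


@dataclass
class RequirementNotes:
    """Notes gathered under the four headings for one feature."""
    users: list[str] = field(default_factory=list)        # who will be using the system?
    functions: list[str] = field(default_factory=list)    # what will they be doing with it?
    attributes: list[str] = field(default_factory=list)   # how will the system be doing it?
    constraints: list[str] = field(default_factory=list)  # what will stop users or system?

    def unanswered(self) -> list[str]:
        """Return the headings that have no notes yet, in the order listed above."""
        return [heading for heading, notes in vars(self).items() if not notes]


# Hypothetical notes for one feature, part-way through requirements gathering.
notes = RequirementNotes(
    users=["registered customers"],
    functions=["renew a subscription"],
)
print(notes.unanswered())
```

Any heading returned by `unanswered()` is a prompt for another conversation rather than a sign that the requirement does not exist.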

Many heuristics have been developed with an easy-to-remember mnemonic, like James Bach and Michael Bolton’s FEW HICCUPPS. This is the mnemonic for a very useful heuristic to examine expectations and consistency, and it can be a powerful tool when looking at those bugs logged when software doesn’t “meet expectations”.

i. Familiarity. We expect the system to be inconsistent with patterns of familiar problems – i.e. Testers will go looking for bugs they’ve seen before and expect not to find them.

ii. Explainability. We expect a system to be understandable to the degree that we can articulately explain its behaviour to ourselves and others.

iii. World. We expect the product to be consistent with things that we know about or can observe in the world.

iv. History. We expect the present version of the system to be consistent with past versions of it.

v. Image. We expect the system to be consistent with an image that the organization wants to project, with its brand, or with its reputation.

vi. Comparable Products. We expect the system to be consistent with systems that are in some way comparable. This includes other products in the same product line; competitive products, services, or systems; or products that are not in the same category but which process the same data; or alternative processes or algorithms.

vii. Claims. We consider that the system should be consistent with things important people say about it, whether in writing (reference specifications, design documents, manuals…) or in conversation (meetings, public announcements, lunchroom conversations…).

viii. Users’ Desires. We believe that the system should be consistent with ideas about what reasonable users might want.

ix. Product. We expect each element of the system (or product) to be consistent with comparable elements in the same system.

x. Purpose. We expect the system to be consistent with the explicit and implicit uses to which people might put it.

xi. Statutes. We expect a system to be consistent with relevant statutes, acts, laws, regulations, or standards – i.e. consistent with those things required by external laws and regulations.

Note that there are still some open questions with this heuristic (who is an “important person”, and what is a “reasonable user”, for example) that would need to be agreed by the project culture as a whole.
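When triaging a bug that “doesn’t meet expectations”, the FEW HICCUPPS list can be walked through as a checklist, recording which expectations were found to be violated and which have not been considered yet. The sketch below is an illustration under assumptions: the function names and the example findings are hypothetical, and only the eleven headings come from the list above.

```python
FEW_HICCUPPS = [
    "Familiarity", "Explainability", "World", "History", "Image",
    "Comparable Products", "Claims", "Users' Desires", "Product",
    "Purpose", "Statutes",
]


def mnemonic(headings: list[str]) -> str:
    """Join the first letter of each heading, as a check against the mnemonic."""
    return "".join(heading[0] for heading in headings)


def review(findings: dict[str, str]) -> dict[str, list[str]]:
    """Split the headings into those with a recorded problem and those not yet considered."""
    unknown = [name for name in findings if name not in FEW_HICCUPPS]
    if unknown:
        raise ValueError(f"not a FEW HICCUPPS heading: {unknown}")
    return {
        "flagged": [name for name in FEW_HICCUPPS if name in findings],
        "not_considered": [name for name in FEW_HICCUPPS if name not in findings],
    }


# Hypothetical findings for one bug report.
findings = {
    "History": "export worked in the previous release",
    "Claims": "user manual says CSV export is supported",
}
result = review(findings)
print(mnemonic(FEW_HICCUPPS))
print(result["flagged"])
```

The `not_considered` list is the useful part for challenging bias: it shows which expectations nobody has looked at yet.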

Testers tend to find a few external heuristics that suit them, and that they’ll work with consistently, but eventually those external heuristics will be prone to the same issues as the Tester’s own, original rules of thumb: built-in bias and the resurgence of the Pesticide Paradox. The resolution is the same thing that drove the tester to find external heuristics in the first place: challenge the assumptions that the heuristics in regular use are based upon, and call on more diverse experience by applying other heuristics from other testers.


By Claire Anderson, Senior Test Analyst at Edge Testing
