You’re making incorrect assumptions. And that’s hard-wired into your brain.
Our brains have evolved to analyse vast amounts of data and make assumptions and judgements that reduce billions of data points down to a few key facts. Psychologists believe that these shortcuts are adaptive and help us make smart choices. At one time they helped us connect a moving shadow in the grass to a predator, triggering a fight-or-flight response.
You’re also influencing your participants. As science tells us, by observing something we are already interacting with and influencing it, whether that’s particles or the people we are researching.
So you’re not only biased, but you’re probably influencing your research without realising it.
But don’t worry - we all are. Here are 5 ways your research is being impacted by bias:
1. You make assumptions about what’s happening
You can’t avoid making assumptions; it happens. But the effect of incorrect assumptions can be huge. Here are a couple of quick examples.
Stevenage & Bennett recently performed a study in which forensic students were asked to match fingerprints after being given an initial report. Their matching effort should have been the same regardless of whether that initial report suggested the fingerprints matched. In reality, a much higher number of students stated the fingerprints matched when the initial report suggested so too - they were influenced by the suggestion of a match, and followed that suggestion through.
Another study found that judges in domestic violence protection order cases were influenced by the behaviour of the applicant. If the applicant had waited for a period of time after the alleged incident before making an application, or if they did not appear ‘visibly afraid’ during the court appearance, then the judge's statements suggested a bias against them, based on their perception that the person “must be less afraid”.
This obviously has huge and potentially deadly consequences. And that’s a judge, who is trained to be impartial. How many researchers have similar levels of training and experience to a judge?
I have seen a lot of UX insight come out of assumptions. For example, I once worked with a UX researcher who insisted that she was an ‘intuitive’ who could tell what people were thinking. She would often insist that people hadn’t understood something because she “could tell they didn’t understand it, even when they didn’t mention it”.
Try taking that one to the bank.
At heart we are a scientific community and we need to ensure that research is based on fact, not assumption. That means insights need to come from observed behaviour that is repeated and understood, and/or from recorded comments that were made without bias (see below). Findings should be:
Objective and unbiased (read this article)
Based on direct observation of behaviour or user comment
Clear and repeatable - running the test again repeats the finding
Actionable (findings you can’t act on are rarely worth reporting on)
2. Prior knowledge is a blessing and a curse
Knowing nothing makes you a terrible facilitator. Knowing too much can mean the same, however. And that goes double for participants.
Raita and Oulasvirta (PDF link, sorry) performed an interesting study into bias. 36 participants were put through a website test, having viewed a positive, negative or neutral review of the platform first.
The standard System Usability Scale (SUS) was used to rate the tasks they performed. There were two key findings from this:
There was no significant impact on success rate. No matter which review they read, all participants succeeded or failed at the same rates.
Their perceived rating of the site was impacted significantly. They rated the platform 30% easier to use if they had been influenced first by a positive review. See the table below for more detail.
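If you haven’t scored a SUS questionnaire by hand before, the sketch below shows the standard scoring rule: ten statements rated 1 to 5, with the odd-numbered (positively worded) and even-numbered (negatively worded) items scored in opposite directions, and the total scaled to 0 - 100. The function name sus_score is my own, and this is only an illustration of the scoring arithmetic, not the analysis used in the study above.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one participant's ten 1-5 responses.

    `responses` is assumed to be in questionnaire order (item 1 first).
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for index, response in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribute response - 1.
            total += response - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribute 5 - response.
            total += 5 - response

    # The raw total runs from 0 to 40; multiplying by 2.5 scales it to 0-100.
    return total * 2.5


if __name__ == "__main__":
    # A participant who answers 3 (neutral) to every statement.
    print(sus_score([3] * 10))
```

Because the score is built entirely from the participant’s own ratings, it is exactly the kind of measure a review read beforehand can shift, even when task success stays the same.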
As a facilitator if you know exactly how the system works then you’ll start making assumptions about choices and behaviour.
When it comes to facilitating the research it’s best to know a lot about people, about good research, and about the goals of the research itself.
But it’s best to know very little about the subject interface you’re testing. The less you know, the more in tune you are with a participant who is seeing it for the first time. Knowing too much places a gulf of knowledge and understanding between you, and that makes it harder to empathise and connect with their comments and actions.
For participants knowing too much can also sink the research. There are two key actions here to help with this:
Ensure participants match the profile
Make sure they are the right people with the right level of expertise you are looking for. When you’re testing for existing audiences then ensure you find people who match that audience profile well - but also explicitly tell them not to bone up or study the interface prior to the session.
Ensure new user tests are blind
When you’re testing for acquisition or new audiences, ensure the participants match the audience profile but also ensure the test is blind, right up to the last minute. Don’t let the participants know what it is they are testing - otherwise human nature is for them to get nosy and start looking at that site or app or software on their own.
3. You see what you expect to see
Confirmation bias is the bias we all know and love.
You think that you’re late so you’re going to get all the red lights - and of course you do. Only you didn’t, you just noticed the red ones more because you were late and you were looking for them. Your bias is confirmed.
In research confirmation bias is a risky bias to have. You think people are going to find sign-up difficult, so you start looking out for problems in sign-up. Did they hesitate just then? That must be because it’s not clear to them. Did the mouse move a little to the left? That must be because they are thinking they want to go back to the previous page and choose again. You think it’s going to cause problems and like a magic eight ball all signs point to Yes, yes it is.
What’s harder to pull apart at times is the fact that we’re facing at least four different levels of confirmation bias, each and every time we test:
The client thinks their checkout process is too hard to use, and wants to redesign it. This test is just to prove the problem exists so they can justify redesigning and selecting new technology. Confirmation bias is baked in, from the start.
The UX team are provided with the brief and set up the research. But given other websites they’ve looked at, this sign up and checkout process really does look a little old fashioned, and it’s surely going to have some issues they’ve seen elsewhere. A second layer of confirmation bias begins to build.
The facilitator has played with the design and found it non-standard. They struggled to understand how to complete some of the goals - and they’re a UX specialist. How on earth would a standard user fare, especially an older one (because we all know older people struggle with technology and websites, right?). Layer three confirmation bias, calling layer three.
The participant has heard of Biggend shopping before, their friends told them it has bargains but can be painful to use. As soon as they see the name, they wince ever so slightly. They’re expecting to struggle, whilst the people behind the glass watch on. Layer four pulls into the station.
Confirmation bias can - like scary clowns at a travelling circus - be an unpleasant surprise around every corner.
We also need to guard against hindsight findings. These are when a participant justifies a decision in hindsight. They used control B instead of control A, and then we ask them why. Often this will lead to the participant judging their own actions in hindsight, and coming up with a logical reason why - when there may not have been one, or when in fact their decision was unconsciously driven by various design aspects their conscious mind doesn’t process.
And if those justifications happen to match our preconceived ideas of what’s wrong with the design, then we hit the jackpot and confirm our own bias again.
There are three key steps to avoiding an overload of scary-clown-confirmation-bias.
An independent facilitator
The facilitator for any test should be as independent as possible. Using an outsider or independent is best, but using a professional facilitator who practises avoidance of bias can be just as good. Just avoid using one of your team who’s heavily invested in the project and/or the design - or their bias cup may runneth over.
A bias-free script
The test script (or questionnaire or task list or whatever you like to call it) is the crux of any test and the key to avoiding confirmation bias, as far as possible. The goals of the project will probably be biased to some degree, and the client and the team may be bringing to the table their own biases, but the script can avoid much of this. Direction, notes, questions and tasks should be carefully reviewed to ensure they are not leading or biasing the participant. A good facilitator will follow the script and avoid talking too much outside of it, as this will ensure bias is kept to a minimum.
A blind test
As outlined in a previous point, blind testing will help to avoid the participant carrying too much bias or pre-learning into the room. You can’t completely control for their opinions based on others, but you can ask them to objectively approach this test from scratch. And you can ensure you take observed performance over voiced opinion, a key difference.
4. Participants just aren’t normal
Well, okay, they are - but their behaviour often isn’t, when in the room. Take a look at these known biases:
The Hawthorne effect: people change their behaviours when they know others are watching. And of course cameras, mics and a huge one-way mirror won’t add to that at all. Participants may be more careful, go slower, read more, take more considered action than they would in the real world. Or, as we sometimes see, they will play up to the attention, act more rashly and with more bravado (Generation Z, I’m looking at you here).
Social desirability: people tend to look for confirmation, and act in ways that the current social group would approve of. This can lead to participants playing down the issues they see (“oh that’s fine, I found it easily enough - it’s just me being a bit slow!”) or under-reporting their dislikes and grumbles. It can also lead to more positive scoring when subjectively scoring interfaces - for example on a 1 - 5 site scoring scale where 5 is best, we rarely see a site score less than 3, no matter how painful the experience.
Task selection bias: people will try longer and use more likely success paths (even if they are slower and more painful) when being observed at a task. You may see people sticking at tasks far longer than they would in the real world - or using search and trawling through pages of results, to avoid failing in the navigation.
The bandwagon: people tend to jump on the bandwagon. That means if the facilitator or the script start to talk positively or negatively about the test subject, they may well be biased by this and join in. Equally when running group research, they may begin to emulate the comments and behaviours of the others in the group, just to fit in.
That’s not an exhaustive list, but it gives you an idea of the problem we face.
At the risk of repeating myself, the solutions here are much the same as those mentioned earlier - an independent facilitator, a bias-free script and a blind test. When asking questions, open and non-leading questions are best, for example “How do you feel about the level of information on this page?”
Avoid building a personal relationship with the participant - research has shown that you can build rapport from the simple act of asking someone to hold your cup of coffee for you - imagine how much rapport you can build up over an hour of chummy chat, if you’re not careful.
Scripts should sometimes be explicit with task paths, to avoid task selection bias - tell the participant that this time you are searching, and this time browsing. Use the paths that your research shows you the majority of people use. Mix them up, and have some open tasks where the participant can choose for themselves.
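One way to mix the paths up without doing it by feel is to generate the plan before the sessions start. The sketch below is a minimal illustration: it shuffles the task order for each participant and rotates through three path instructions so nobody gets the same pattern as the person before them. The name assign_paths, the three labels and the even rotation are my own assumptions; in practice you would weight the explicit paths towards the ones your research shows most people use.

```python
import random

# Hypothetical path labels: "open" means the participant chooses for themselves.
PATHS = ("search", "browse", "open")


def assign_paths(tasks, participant_ids, seed=0):
    """Return {participant: [(task, path), ...]} with mixed-up paths.

    Task order is shuffled per participant, and the path rotation starts
    one step later for each successive participant.
    """
    rng = random.Random(seed)  # fixed seed so the plan can be reproduced
    plan = {}
    for offset, participant in enumerate(participant_ids):
        order = list(tasks)
        rng.shuffle(order)
        plan[participant] = [
            (task, PATHS[(position + offset) % len(PATHS)])
            for position, task in enumerate(order)
        ]
    return plan


if __name__ == "__main__":
    # Placeholder task and participant names, purely for illustration.
    sessions = assign_paths(["task A", "task B", "task C"], ["P1", "P2", "P3"])
    for participant, steps in sessions.items():
        print(participant, steps)
```

Writing the plan down in advance also stops the facilitator from picking the path on the day, which is one less place for their own expectations to creep in.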
5. You’re leading not following
This is a common one, and I see it regularly when reviewing research. It’s so easy to slip into that I find myself falling foul of this one every now and then.
You see a problem, and you think you know what’s going wrong. So you start looking for it. You begin to subtly hint at or probe at the issue with other participants.
For example, I had a facilitator once who would see someone struggle with a label or an icon - and would from that point on ask every other participant whether they understood it or not, whether it was completely clear to them. Lo and behold, many people did indeed agree with him that it wasn’t clear. He felt supremely justified for having spotted this problem and rooted it out into the light of day.
Only he didn’t, really. He was biasing the participants to focus on something they would not normally focus on, and look at it with a critical eye to see if they felt it could be improved. And since he was showing them that he felt it could be improved, they were happy to agree.
We can also fall foul of this easily, both in the script and in our language.
As I’ve pointed out before, the script should be non-leading. Some great examples of leading script questions I’ve seen include:
Which content on this page is most important to you?
What do you love most about this site?
What function would you remove from this site?
How hard was it to get to this point?
If you want to avoid priming the participant, then make them non-leading. For example:
Is there anything on this page that you particularly like or dislike?
Is there anything on this site that you particularly like or dislike?
Is there anything you would add or remove from this site?
How easy or hard was it to get to this point?
When I used to run training on facilitation I had a simple example of the Yes/No game. For those that don’t know it, the Yes/No game is where you have to talk for a period of time or answer questions without using the words yes or no. It’s incredibly hard, especially under stress (e.g. at high speed), as those words are integral to the language we use. Removing them is like trying to walk without bending your ankles - technically possible, but fairly tricky at speed.
Being a good facilitator is much the same. You have to remove words that will bias or lead the participant. The simple example I used was to imagine you were user testing Microsoft Word. If you asked someone how they would save a file, you would have led them in three key ways:
You explained that there is a system concept of keeping information for later
You told them that the concept might be called Save
You told them that the thing they were saving might be called a File.
That’s a pretty big leading statement. So, you have to remove the words that lead, and change it to something like “If you wanted to come back to this information later, but had to turn off the PC now, is there a way we can do that?”
Unwieldy, but far less leading. Just like the Yes/No game, a good facilitator has to constantly parse their language to ensure they aren’t leading the participant.
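If you want a rough safety net when reviewing a script, a few lines of code can flag questions that give away the interface’s own vocabulary, as “save” and “file” do in the Word example. This is only a sketch under my own assumptions: flag_interface_terms is a made-up helper, you supply the list of labels to watch for, and it matches whole words only, so it will miss plurals and synonyms. It is no substitute for a careful human review.

```python
import re


def flag_interface_terms(question, interface_terms):
    """Return the words in `question` that also appear in `interface_terms`.

    Matching is case-insensitive and on whole words only, so "files"
    will not match "file". The terms list is supplied by the reviewer.
    """
    watch = {term.lower() for term in interface_terms}
    words = re.findall(r"[a-z]+", question.lower())
    return [word for word in words if word in watch]


if __name__ == "__main__":
    terms = ["Save", "File"]  # labels taken from the example above
    leading = "How would you save a file?"
    reworded = (
        "If you wanted to come back to this information later, but had to "
        "turn off the PC now, is there a way we can do that?"
    )
    print(flag_interface_terms(leading, terms))
    print(flag_interface_terms(reworded, terms))
```

A check like this only catches the words you already know are risky. The habit of listening to your own language during the session is still what does most of the work.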
A fully non-biased test is not possible - even with an impartial machine-based facilitator and a science-based, peer-reviewed script, you’d still have to contend with the biases and assumptions made by the human who used the system.
But we can reduce bias in our research. We’re not aiming for perfect.
Better is good enough for today.