The UX research tasks you can let AI help with

And what I believe you’ll always do yourself

By Caitlin Sullivan
June 3, 2024

In the past year, AI has been inserted into nearly every tool for customer insights. 

There are tools that find pain points in customer calls, platforms that write your insights and roadmaps for you, and even services that run user sessions while you’re bouldering (or whatever you’re doing if you’re not running your own research).

But with huge trends come huge variations in quality.

How well does AI actually help us with customer research, and which tasks does it do for us best? 

I’ve spent significant time this year testing tools and experimenting with AI in my own workflows. 

With my experience so far, I want to help you:

  • Get an overview of the AI landscape 
  • Understand where it’s worth your time to integrate AI into your own work
  • Decide whether we can trust AI’s help or not

The landscape of AI for customer research 

I recently saw a survey on LinkedIn that showed 70-80% of UX Researchers are using AI at work.

And yet, no one I’ve talked to about AI is using it consistently. 

Most UX Researchers and Product people I know don’t understand what tasks to use AI for, and how well they can trust it to do the work. It feels like there’s a new tool every week, and most of us don’t have time to evaluate them all. 

But in replies to my own LinkedIn posts about AI, the most common questions I get are:

  • “What kind of tasks are you finding AI best for?”
  • “How well does AI do the job?”
  • “I’ve tried [tool] and its transcripts were terrible. Have you found anything better than that?”

Over many months and experiments, I’ve pushed tools to the limits with sessions and content in three languages: English (the most common language for AI), German (also a commonly handled language), and Swedish (not so common). 

In all my testing, I’ve looked at tasks that tools are trying to carry out, tasks where I find AI most helpful, and tasks where I believe AI is a long way from being valuable.

Which tasks are AI tools trying to take over?

Naturally, the AI tools available so far try to help with one or more specific steps of a typical research process:

  • 🧑‍💻 Desk research
  • 🗺 Planning research
  • 🙋‍♀️ Moderating sessions
  • 📝 Transcribing, notes, and summaries
  • 💡 Research analysis and themes, insights
  • 🔀 Combining input from multiple sources
  • 🛣️ Turning output into product plans

Some tools focus exclusively on one core task from that list. Others try to do many of them at once. But how well are they doing those tasks across the board?

The good, the bad & the ugly of AI Research tools

Right now…it’s mostly bad and ugly.

Transcription and note-taking are two related tasks that many AI tools attempt to do. Unfortunately, I’ve found very few that do a decent job. There are fewer still that do well with languages other than English.

For transcribing and taking helpful notes, I give even the best AI tools a rating of 7/10. That may sound harsh, but when it comes to research, accuracy matters.

If we can’t capture a participant’s words and expressions correctly, our subsequent analysis and insights fall apart.

While I would love to remove myself from the tedious step of transcribing sessions and checking my notes, I can’t completely do that yet. No tool is currently able to document language well enough for me to trust it fully. I also want to check its work.

There’s a chain of dependencies here: if transcriptions are imperfect, then notes will be, too, as will summaries, insight statements, and the prioritization of which observations and insights to act on in the product.

The farther away from common languages like English you get, the worse this becomes. Rarer languages have far less support. Looking for Norwegian or Cantonese? You’re out of luck. I’ve had some entertaining experiences with Swedish transcripts, but the current state of AI here doesn’t even replace my Swedish-as-a-second-language ability.

So far, I’m often disappointed by the level of analysis coming from AI tools. But since I started more methodical testing just months ago, the progress I’ve seen in some tools’ handling of analysis is surprising.

One thing most tools aren’t advanced enough to do is catch insights where different customers don’t use the same terms. It’s a lot to ask of AI to make distant connections that we humans — especially the ones with years of research experience — have become skilled at. 

I want AI to find every instance of a customer commenting on “frustrating wait times,” as well as less obvious cases of the same issue, like “took too long” or “30 minutes” mentioned by users in their first 24 hours after signup. Making connections between those statements is still the type of skill most AI tools lack. When customer feedback describes the same problem, say, new users struggling to get onboarded, without using the same words, AI doesn’t reliably make the connection.
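To make that gap concrete, here’s a toy Python sketch (the feedback strings and matching logic are hypothetical, not taken from any real tool) showing how literal term matching catches the exact phrase but misses paraphrases of the same complaint:

```python
# Hypothetical feedback snippets, all about the same underlying issue: slow waits.
feedback = [
    "The frustrating wait times made me almost give up.",
    "Onboarding took too long for my team.",
    "I sat there for 30 minutes before anything loaded.",
]

def keyword_match(entries, phrase):
    """Naive matcher: keep entries containing every word of the phrase."""
    words = phrase.lower().split()
    return [e for e in entries if all(w in e.lower() for w in words)]

# Literal matching finds only the comment that reuses the exact wording;
# the two paraphrases of the same problem are missed entirely.
hits = keyword_match(feedback, "wait times")
print(len(hits))  # 1 of 3 related comments found
```

Closing that gap takes semantic matching rather than string overlap, which is exactly where most current tools still fall short.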

Where there’s hope

Is it worth using AI for the analysis phase? You’ve probably heard a few big names already recommend that we see AI as our intern or assistant. I agree. I don’t remove myself from the process yet, but I have measurably sped up my analysis by 2-5x by using AI to look for themes and patterns in addition to my own analysis (which is admittedly lighter these days, thanks to AI).

Recently, I ran my own test with Great Question’s AI notes, highlights and analysis help — and was pleasantly surprised. The functionality that helped me most was the ability to query for something specific — like the struggles faced by interview participants — and get a clear list complete with quotes and links to the video clip timestamp where they talked about the topic at hand. For years, I’ve jotted down shorthand notes on paper beside a live user session, complete with timestamps to check again later. I’m happy to say that I no longer need to rely on meticulous time-stamping of notes on my own.
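Conceptually, that timestamp-linked querying can be pictured with a toy sketch like the one below (hypothetical transcript data and search logic, not Great Question’s actual implementation): each transcript segment carries its start time, so any matching quote can point back to the moment in the recording.

```python
# Hypothetical transcript: (seconds_from_start, speaker, text) per segment.
transcript = [
    (75, "participant", "Setting up the integration was a struggle."),
    (312, "moderator", "What happened next?"),
    (340, "participant", "I struggled to find the export button."),
]

def find_mentions(segments, term):
    """Return (timestamp, quote) pairs whose text mentions the term."""
    return [(t, text) for t, _, text in segments if term.lower() in text.lower()]

# Print each match as an mm:ss reference plus the quote, like shorthand notes.
for seconds, quote in find_mentions(transcript, "struggl"):
    print(f"{seconds // 60}:{seconds % 60:02d}  {quote}")
```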

Thankfully, planning research can be formulaic for many teams, and gives AI a task it can easily perform while saving us some valuable time. 

As a repeatable process based on established norms for methods like user interviews, usability testing, and diary studies, a few tools generate good enough starting points for a study plan. 

I give AI in those best cases an 8/10 for planning. I haven’t seen a test where I completely disagreed with the plan it created. I’ve even experienced a few cases where it did save me significant time. For example, I hadn’t designed one type of study in years, and used AI to generate a rough plan. That saved me from rummaging around in my files to remind myself what to include. 

Nonetheless, I haven’t seen AI tools “think” creatively about research plans that involve more than best practices to put together. Don’t expect AI to think outside the standard box about how to test pricing better than a Research Director or CMO with years of experience would.  

Perhaps the task I’ve found AI does best is desk research. Plenty of secondary research platforms have made everything from scientific literature to a consultancy’s SaaS survey results easier to find, synthesize, and compare with AI. Since desk research relies on synthesizing information that’s usually documented in a standardized format (e.g., the reporting of a scientific study’s results), AI tends to do this job well. I’ll give it a 10/10 here because AI helped cut my time on this task significantly when needed. I haven’t yet thought I would have done a better job without AI tools for desk research.

Then there’s the benefit of “combining” many sources of customer input into one place, all analyzable and synthesizable by the mighty AI faster than you can say, “customer-centricity”. How well does that work? 

Most of the tools that collect, compare, and synthesize many customer feedback formats in one place are a bit messy. Some have a good handle on importing multiple sources and formats and pulling out themes present across all of them; others certainly disappoint. The range of quality here is broad. When it works, though, it speeds up the formerly manual process of hunting down sources of customer feedback and triangulating them myself. If this continues to improve, I’m all for it and looking forward to it.

Are we being replaced as moderators, and should we be?

What about when AI moderates user sessions? Gasp! 

The UX Research community spent 2023 hating on services offering synthetic “AI users” to interview and test products with. It’s equally strange to imagine AI conversing with real customers in place of experienced researchers, designers, or product managers, but this is becoming a trend, too.

I’ve tested a few tools that claim they can run your user sessions with AI, and none of them got me excited — but not because I’m afraid of being replaced. 

I’m concerned with this trend because the human element in research is something I believe we must retain.

I tried a few of these tools both as a participant and as the person setting up an AI to run my sessions. I did my best to keep an open mind. But as a participant, it felt strange and impersonal. The sessions had the stiffness of a junior researcher’s first survey, and they didn’t make me feel comfortable being there. I pride myself on having run many sessions with customers who ended our talk by thanking me for my time and an engaging conversation.

I personally don’t want us to lose track of the fact that this is also what customer research is about: making people feel comfortable enough to open up, and having a human-to-human conversation to solve real problems after caring enough to listen to someone share them. 

Maybe AI video tools with CGI humans will merge with these customer research tools and we’ll soon get the sense that we’re actually talking to someone, not something, even with a robot on the other side. Until then, I can’t give any tool more than a 3/10 for moderation. The three points feel generous, but let’s say it can save me time if I’m desperate.

The tasks AI does best

  • Desk research
  • Planning common research studies
  • Combining input from multiple sources

The tasks AI does worst

  • Transcribing and note-taking
  • Analysis and catching insights from statements that don’t share the same terms/language 
  • Moderating sessions

Proceed with caution: AI might be more biased than you

There’s a plethora of things AI isn’t good at yet, as I’ve detailed above. But there are also a few things we need to be especially vigilant about with AI.

I’ve been hopeful about the possibility for AI to check my own biases in the research process. I’m truly excited about the opportunity for AI to become my trusted partner in the future, one who doesn’t look at the world through exactly the same biased lens I have.

But today, AI is nowhere near playing that role. It’s more likely to do the opposite: unknowingly trick us into viewing the world through biases we didn’t know were part of the process.

AI has been the subject of many debates around racism and sexism. Where the data we put into the machine has harmful bias, so too does the output. 

There are three main stages of AI where bias can become part of your research without your knowledge:

  • Pre-processing: Bias can exist in the data we feed into the algorithm without our being aware of it.
  • In-processing: How an algorithm is trained to assess, prioritize, and ignore data can also incorporate bias.
  • Post-processing: How the model selects and presents its final output to us can introduce bias as well, including common forms of discrimination.

How do we know when our AI results can be trusted? 

The truth is, we can’t. When we use AI tools for customer research, we have no view into the datasets these models have been trained on. We don’t know where the data came from, how the model was trained, or what selection criteria shape its output.

If we let AI do a research task on its own, we can’t be sure that it has taken all the evidence into account objectively and weighed it equally before delivering the result. Machines are only as good as the humans who made them, and how can we know what kind of bias the makers have?

The sad truth is that most founders of AI tools as well as most other businesses are male, white, middle class, educated, and so on. That leaves a lot of bias to be accounted for in AI processes.

What’s most disturbing is that a biased AI affects the information you carry with you into the rest of your work. When biased AI output is trusted to form the basis of our product development, we usually don’t even know about it.

As a UX Researcher, Designer, or Product Manager, you may have influence over the rest of your team, your company, even your entire industry. When the information you have is flawed — based heavily on AI models that may have more harmful bias than you yourself — the risk that you misinform others increases.

As much as I hope for a helpful, accurate and inclusive AI in the future, we’re nowhere close. That’s the part of AI that keeps me up at night.

AI + You: Teamwork makes the dream work

The current state of AI isn’t as miraculous as many will tell you it is. As an AI-curious person testing as much as I can, I’m not as impressed as I’d like to be — yet.

I’m hopeful that future iterations of AI tools will continue to provide more, better support for all those repetitive tasks most of us need to do weekly to learn from customers fast enough to survive in competitive industries.

But has it revolutionized how I run customer research for my clients? Not yet.

I’ve heard the argument that learning to use AI tools takes so much time that it equals the time we’d spend running research the good ol’ fashioned way. That’s probably true now, but I think this will change.

However, no matter how fast AI helps us work, it needs to be trustworthy to do the job. If the input, processing, and output aren’t using data that present an accurate picture, we can’t rely on it alone. 

In my experience, the best way for us relatively early adopters to use AI tools for customer research is to treat AI as a partner — with a dose of healthy skepticism.

I’m using AI without relying on it to do any task on its own. What does that look like? I use a tool for transcribing and note-taking, but still use my tried and trusted note-taking process to jot down important observations myself. Then I compare the two.

I take note of time-stamps from user sessions, and go back to check that AI-generated transcripts documented the important parts well. 

I use a few tools to help me pull out pains, needs, and other opportunities, but think of this as double-checking that I haven’t missed something myself. I don’t let AI inform how and where I focus on what participants said without questioning it. And if AI notes an insight I didn’t think was there, I hunt for it myself.

In many cases, AI has sped up my process. But will it replace me? I think we’ll need to be partners and equals, for now and long-term.

Caitlin is a former Head of User Research at a Spotify-backed SaaS company with 13 years of experience running research and experiments. She loves nerding out about product-market fit and helping early-stage startups test new products before launching. For more of her tactical advice, follow her on LinkedIn, check out her new Substack, or contact her directly.
