What UX researchers get wrong about AI

By Ned Dwyer
Published May 21, 2026
A field note from r/UXResearch, customer conversations, and the published literature.

I spend a lot of time in two places: on r/UXResearch and on calls with researchers - both those evaluating Great Question, and the hundreds of customers we work with across the Fortune 500. The same misperceptions about AI come up in both. They show up confidently - often in upvoted comments, often in the same breath as "I haven't actually tried it lately" - and in my humble opinion they're holding back a field that already has plenty to be anxious about.

The word that should appear in almost every confident claim about AI in research right now is "yet". AI synthesis can't create a novel insight - yet. AI moderators can't catch what a great human can - yet. Synthetic users aren't a substitute for real participants - yet. The omission of that one word is what turns a defensible 2026 observation into a 2027 mistake. Each of the six misperceptions below is, at its core, a missing "yet."

The AI cynicism is rational, even when the conclusions aren't

A note before I start taking apart specific claims, because I don't want this to read as dismissive of the people making them. The cynicism in UX research right now is not irrational. It comes from somewhere real, and the reasons deserve to be named out loud:

People have already been laid off. The Stanford "Canaries in the Coal Mine" data isn't an abstraction. It's careers, mortgages, identities. Telling someone whose role was eliminated last quarter that AI is "just a tool" lands badly, and rightly so.

The field has been dismissed for years. Long before AI, UX research was the line item finance teams cut first, the function product teams skipped, the discipline companies treated as a luxury when budgets tightened. AI didn't invent the contempt; it gave executives a fresh justification for what they already wanted to do.

Adjacent fields got hit first and badly. Writers, illustrators, translators, voice actors. Researchers have watched what happens when a creative discipline gets told "the AI is good enough." Assuming the same playbook is coming for them is a reasonable inference, not paranoia.

The pace is exhausting. Being told every quarter that the state of the art has moved, that your skepticism is now stale, that you need to re-evaluate everything you concluded six months ago — that's its own kind of cognitive load, on top of doing the actual work.

The hype is genuinely insufferable. LinkedIn is full of confident proclamations from people who haven't sat in a research seat in years. So is this article, just in a different direction. The pattern-matching skepticism toward boosters is a healthy immune response.

Identity is at stake, not just employment. A lot of researchers spent a decade developing a craft - interviewing, sensemaking, the felt judgment of when a participant is lying or hedging. Being told an LLM can approximate that is not just a job threat; it's a referendum on what their working life meant. That hits personally and differently than "we're automating your spreadsheets."

None of this means the six misperceptions below are correct. But I do believe these are the ones worth calling out.

1. "AI hallucinates, so it can't be trusted for research."

This is the most common objection I hear, and it's almost always grounded in how AI felt in 2023.

On a recent r/UXResearch thread asking whether any AI-driven usability testing tools actually work, the top-voted reply is unambiguous:

"Nope. They are highly prone to hallucinations, faking data, and also, why would you do this, since it is not real data in the first place. We used to call that a dry lab." — u/Insightseekertoo

I replied in that thread asking when they'd last tested these tools. The answer, as is usually the case, was "a while ago." The state of the art is moving in months, not years.

Three things are true in 2026 that weren't true in 2023:

Hallucination rates have dropped sharply. Vectara's HHEM leaderboard, the primary public benchmark for LLM hallucination rates on grounded summarization, currently puts Gemini 2.0 Flash at around 0.7%, GPT-4o at 1.5%, and Claude Sonnet models in the 4–5% range (see TechRepublic's coverage of the leaderboard). For context, the same benchmark had top models in the double digits two years ago. The category-level dismissal of LLMs as "unreliable" is based on a generation of models that has since been superseded.

Well-built research platforms now constrain hallucination at the system level. This is what retrieval-augmented generation (RAG) does, and it's not just a nice vendor pitch - it's a well-established technique. A 2024 arXiv paper from ServiceNow researchers reports that RAG-based grounding significantly reduces hallucination in structured outputs across multiple LLMs. In practice that means: every AI-surfaced theme has to cite a source, and the source has to be verifiable. If the system can't find the quote, it can't make the claim.

The way to evaluate hallucination risk is to look at the platform, not the underlying model. ChatGPT-with-no-grounding is not the same product as a research repository with retrieval, citation, and deep-link verification. Lumping them together is like saying "spreadsheets are unreliable" because someone got a bad answer from a calculator. The Reddit thread above wasn't wrong to be skeptical. It was wrong to generalize that all AI does this.
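To make "if the system can't find the quote, it can't make the claim" concrete, here is a minimal sketch of a citation check. It is an illustration under stated assumptions, not how any particular platform works: the `Claim` structure, the transcript IDs, and the `verify_claims` helper are all hypothetical, and real systems would likely use fuzzier matching than the exact substring test used here.

```python
import re
from dataclasses import dataclass


@dataclass
class Claim:
    theme: str          # the AI-surfaced theme (hypothetical structure)
    transcript_id: str  # the source the model says the evidence came from
    quote: str          # the verbatim evidence the model cites


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def verify_claims(claims, transcripts):
    """Split claims into (grounded, rejected) by checking each quote against its cited source."""
    grounded, rejected = [], []
    for claim in claims:
        source = transcripts.get(claim.transcript_id, "")
        quote = normalize(claim.quote)
        # An empty quote, or a quote missing from the cited transcript, is not evidence.
        if quote and quote in normalize(source):
            grounded.append(claim)
        else:
            rejected.append(claim)
    return grounded, rejected


# Made-up data for illustration only.
transcripts = {
    "p01": "Honestly I gave up on the export flow.  It took me four tries to find the button.",
}
claims = [
    Claim("Export is hard to discover", "p01", "it took me four tries to find the button"),
    Claim("Users love the dashboard", "p01", "the dashboard is my favorite part"),
]

grounded, rejected = verify_claims(claims, transcripts)
print([c.theme for c in grounded])  # themes whose quote was found in the cited source
print([c.theme for c in rejected])  # themes dropped for lack of a verifiable quote
```

The design point is that the check sits outside the model: the theme about the dashboard is discarded no matter how plausible it sounds, because nothing in the cited transcript backs it.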

It's also worth noting that the implicit baseline in most hallucination debates is "perfect human performance," which doesn't exist. Professional transcribers achieve roughly 1–2% word error rate on clean audio, but Microsoft's paper "Achieving Human Parity in Conversational Speech Recognition" documented professional transcriber error rates of 5.9% on Switchboard and 11.3% on CallHome. Both are ordinary multi-speaker conversational audio. Human note-takers paraphrase and miss things. Human moderators mis-remember what a participant said two interviews ago. The right comparison isn't AI-vs-perfection; it's AI-with-citations vs an over-tired researcher synthesizing twelve hours of audio from memory.

2. "Synthetic users mean we don't need real research."

The most-upvoted AI-adjacent post on r/UXResearch in the last 60 days is this one: "The Largest Review of Synthetic Participants Ever Conducted Found Exactly What You'd Expect. Synthetic Participants Don't Work." (135 upvotes). It links to a preprint systematic review of 182 studies finding four fundamental issues with synthetic participants: cognitive misalignments, distortions, misleading believability, and overfitting.

This paper got a lot of airtime, and for good reason - it's one of the most comprehensive analyses the discipline has. But here's what's worth noting if you actually open the preprint: a large share of the 182 underlying studies were run in 2023 or earlier, when the models doing the simulating were GPT-3.5 and early GPT-4. That's two to three model generations ago. The systematic review was published in 2026; the evidence base inside it largely isn't. This is the meta-version of the same trap I'll talk about throughout this piece: confident 2026 conclusions resting on 2023 data.

That doesn't mean synthetic users now work in all use cases. It means the strongest available evidence about their limitations is itself a few model generations behind. The honest read is that synthetic users can't yet replace real users for many use cases, such as validation, and the current academic literature documents real limitations. But the rate of change in this area means anyone making a categorical "they don't work" claim (or, bolder still, "they'll never work") should be expected to ground it in 2026-era trials, not in surveys of 2023-era papers.

The peer-reviewed ACM Interactions paper "The Challenges of Synthetic Users in UX Research" (Jan–Feb 2026; note: click the PDF viewer button) catalogs the durable limitations: inability to capture emotional depth, dependence on training-data biases (including cultural skew), and a tendency to produce flat, overly optimistic responses. Some of these are likely to remain true. Others - particularly emotional fidelity and behavioral realism - are moving targets as multi-modal models improve. MeasuringU's review of twelve published comparison studies found nine encouraging findings against fourteen discouraging ones. That's a slightly negative balance, not the wholesale endorsement or dismissal you'd guess from LinkedIn or Reddit, respectively.

Reddit is mostly venomous. From u/bibliophagy:

"'Fake data' or 'made up bullshit' would be more accurate. 'Users' are definitionally humans capable of using your product, which AI-generated responses are not."

This is satisfying to read and largely correct for validation work today. It's also the kind of categorical framing that ages badly. The misperception isn't "synthetic users have limits" - they do. It's "synthetic users are a category mistake that will never have any role." That second claim isn't supported by anything except vibes. I would suggest bad vibes.

Simulated personas for early prototype review, AI-generated probe questions, synthetic question banks for screener design - these already have a real, narrow role today, and the surface area will grow. Hell, we use agents to smoke-test new feature development all the time.

The right framing for now is: synthetic users are bad at being users; AI is good at being an interviewer's intern. Whether the first half of that sentence holds in 3, 6, 12, or 18 months is an open question, given how fast AI is moving.

3. "AI is going to replace UX researchers."

The most-commented AI thread in the last 60 days is "anyone else worried about AI layoffs in UX?" — 55 comments, mostly grim. A representative top reply from u/Comfortable-Hair7958:

"Anyone who isn't worried isn't paying attention to what AI is already doing to white collar work… UXR is certainly no exception."

The fear is real and I'm not going to dismiss it. NN/g's State of UX in 2026 explicitly notes that "AI hype created a misleading narrative that new tools could rapidly replace designers and researchers - a story that was convenient in a cost-cutting environment." That misleading narrative caused real layoffs.

But the framing conflates two very different things:

AI is eating execution work. Manual tagging, transcription, theme summarization, first-draft report writing. This is true. NN/g cites research showing AI cuts qualitative analysis time by up to 80%, and notes that 88% of researchers already use AI-assisted analysis and synthesis.

AI is not eating the strategic function - yet, and arguably not for a long time. What it's doing - for the teams using it well - is dramatically scaling it. NN/g's read: "Human direction, curation, and verification will continue to be essential for distilling insights for good products… The researchers who thrive will not be the fastest operators; they will be the clearest thinkers." I think that's right for the next several years. I also don't think it's a permanent truth, and I'm suspicious of anyone who tells you it is.

Dylan Field has the cleanest framing of this dynamic, and Figma has explicitly adopted it as a mantra: AI should "lower the floor, but raise the ceiling — make it so more people can participate in the design process, while also enabling professionals to do even more with AI." (Fortune coverage here.) Both things happen at once. The researchers who lose their jobs in 2026 will mostly be the ones who insisted on staying at the floor.

The contrarian Reddit take worth reading comes from u/luwaonline1 (44 upvotes): "People want to connect with people, not AI." Both can be true. Humans still do the connecting; AI does the work around it.

Related note: in my experience, and from talking to seasoned practitioners, the role of Research Ops is going to become more important than ever - figuring out how to safely pull all of these tools together, ensuring everyone has the right context and governance, and so on. It's a massive job that doesn't just happen. Yet.

4. "AI is killing the junior-to-senior pipeline."

I think this one is mostly correct, and it deserves more honest discussion than it gets. From r/UXResearch this week:

"Big Tech companies broke that whole junior-to-senior level chain with the vendor pipelines that do not really set junior UXRs up for success. A lot of the time, it feels like: 'You stay here and keep doing execution work while we hire more staff-level people and create more AI workflows that slowly eliminate the lower-level…'" — OP

The data backs this up across white-collar work, not just UXR. The Stanford Digital Economy Lab's "Canaries in the Coal Mine? Six Facts about the Recent Employment Effects of Artificial Intelligence" (Brynjolfsson, Chandar, and Chen, Nov 2025) found a 13% relative decline in employment for early-career workers in the most AI-exposed jobs since generative AI's mainstream adoption - while employment for older, more experienced workers in the same occupations stayed stable or grew. The paper drew on ADP payroll records covering millions of U.S. workers across thousands of firms (Fortune's coverage of the original study). MIT's Andrew McAfee warned in Fortune that automating Gen Z entry-level jobs "could backfire and cost companies their future workforce."

The misperception, though, is the implicit conclusion: "therefore juniors should stay out of the field." The teams that will still hire juniors are the ones who give them agentic tooling on day one and skip them straight to orchestration and quality review. The skills that matter for a UXR coming into the field in 2026 aren't "can you code an interview transcript by hand" - they're "can you write a research skill that codifies how we evaluate quality," and "can you spot when an AI-generated insight is structurally wrong."

If you're hiring juniors and still putting them on tagging, you're training them for a job that won't exist. AND you're missing out on an opportunity to work with folks "close to the metal" - who have come of age during the AI transformation and are therefore more likely to be truly AI native.

5. "Real researchers don't use AI. It kills the rigor."

This misperception shows up in two forms. The harder form appears in this thread from a former government-consulting QR/QDA practitioner on r/UXResearch:

"From the outside looking in, it seems UXR in the tech space couldn't care less about QR/QDA. They say they want rigor but they actually don't. AI only gives them some…" — u/caramelgelatto

The softer form is the one I see in person at conferences - researchers and designers saying "we don't want to use AI, we want to do it the old-fashioned way." The pattern I keep noticing: most of those people only have access to a hobbled enterprise Copilot that can't actually talk to anything useful. They're judging the entire category based on a tool that was deliberately neutered by their IT department.

Or perhaps they tried something once or twice in 2023. Either way, they rarely have access to state-of-the-art AI tools like Claude Code.

The actual researcher data tells a different story. NN/g reports 88% of UX researchers already use AI-assisted analysis and synthesis. The discipline isn't divided into AI-users and rigor-keepers. It's mostly people who use AI well, people who use it badly, and a vocal minority refusing to use it at all.

The rigor argument is real, and it's the reason platforms need to invest in citation, source-grounded retrieval, and antagonistic AI that pushes back on weak claims. But the conclusion "AI kills rigor" only holds if your AI is ungrounded. RAG-based systems with citation requirements measurably reduce hallucination rates (ServiceNow research, 2024) - they change the calculus entirely.

It's also worth being honest about the human baseline here once again. Inter-rater reliability (say that three times fast…) - how often two human researchers coding the same transcripts arrive at the same themes - is a long-studied problem with sobering numbers. On the Landis and Koch interpretation scale widely used in qualitative methods, a Cohen's kappa of 0.41–0.60 is "moderate" agreement and 0.61–0.80 is "substantial" — anything above 0.80 is rare in real qualitative work. In other words, the gold standard the field already accepts for human-coded analysis isn't "perfect agreement" but "two humans agreeing most of the time." That's the bar AI needs to clear. It's not as high as the rigor purists imply.
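For readers who haven't computed kappa by hand, a small worked sketch may help show why raw agreement overstates reliability. The coded excerpts below are invented purely for illustration, and returning 1.0 when chance agreement is total is a convention chosen for this sketch (kappa is formally undefined in that case). The band labels follow Landis and Koch's published scale.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one code to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items where both raters chose the same code.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's own code frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # sketch convention: both raters used one identical code throughout
    return (p_observed - p_expected) / (1 - p_expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch descriptive bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Invented example: two researchers coding the same ten interview excerpts.
coder_1 = ["price", "price", "trust", "trust", "usability",
           "usability", "price", "trust", "usability", "price"]
coder_2 = ["price", "trust", "trust", "trust", "usability",
           "price", "price", "trust", "usability", "usability"]

kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2), landis_koch(kappa))  # 0.55 moderate
```

In this made-up case the two coders agree on seven of ten excerpts, yet once chance agreement is subtracted the result is only "moderate", which is the kind of human baseline the paragraph above describes.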

The teams I see doing the best research in 2026 are using AI more, not less.

6. "AI moderation can't catch what a human can."

This one is the most rapidly-changing, and the dismissals are aging fastest. From the same usability-tools thread:

"Simulation is not the same as authentic interest. Additionally, I've not seen a single instance that can produce insights. By that I mean real insights rather than behavioral summaries." — u/Insightseekertoo

Eighteen months ago, this was correct. AI moderators worked off transcripts only. They missed tone, body language, hesitation, on-screen behavior. A study run by a confused AI moderator was, in fact, a behavioral summary at best.

It's also worth saying out loud: the human baseline for moderation isn't as clean as the field likes to pretend. Elizabeth Loftus's decades of work on the misinformation effect - most famously her demonstration that simply changing a verb from "hit" to "smashed" caused eyewitnesses to remember broken glass that wasn't there - applies directly to UX interviews. Human moderators ask leading questions. They probe into themes they were already looking for. They reconstruct what participants said with their own biases attached. A well-instrumented AI moderator following a consistent script can actually be less contaminated in some instances.

It can also review human-moderated sessions to identify that bias, label it, and help address it in subsequent studies.

Today the frontier is multi-modal: AI moderation that hears intonation, sees facial expressions, watches what's happening on the screen, captures rage-clicks, and adapts its follow-up questions in real time. The first generation of these tools was built by specialist startups (Genway, Listen Labs, Outset). The second generation is now being built natively inside research platforms - which means the moderation, the recruitment, the data governance, and the synthesis all share the same model context. This is the category that's moving fastest, and any take on AI moderation more than 12 months old should be assumed stale.

The reasonable position in 2026 is not "AI moderation doesn't work." It's "AI moderation doesn't work yet for some methods, works fine for others, and the boundary is moving every quarter." Anyone telling you it categorically doesn't work — or categorically does — should be asked the same thing I ask on Reddit: what did you try, and when?

Related note: It was interesting to visit The Market Research Event in October last year where 48 out of 50 exhibitors were touting AI moderated research, with virtually no differentiation in messaging. This is the equivalent of a research company saying "we do surveys". Equally the hate on AI moderation is the equivalent of saying "a survey can't answer this!" - it's often right, but it's also missing the point of where a survey can be powerful.

Bonus: "AI is destroying the planet."

It's not specific to UX research but this one comes up often enough that it deserves a line. Yes, training and inference consume water and energy.

But the per-query consumption is small enough that it shouldn't drive individual workflow decisions. Epoch AI's February 2025 analysis — which revised earlier estimates downward by roughly 10x — puts a typical GPT-4o query at about 0.3 Wh. Sam Altman has since stated that an average query uses about 0.34 Wh and 0.000085 gallons of water; Google's published figure for a median Gemini query is 0.24 Wh. Sean Goedecke's careful breakdown lands a ChatGPT query at roughly 5 mL of water — not the 500 mL figure that circulates widely on social media.

Put in human terms:

• The average US household uses about 29 kWh of electricity per day. That's roughly 100,000 ChatGPT queries' worth of energy in a single home's daily consumption.

• A single 4oz hamburger requires about 616 gallons of water to produce. At 5 mL per query, that's the water cost of around 470,000 ChatGPT queries.

• One 8-minute shower uses ~17 gallons of water — about 13,000 ChatGPT queries' worth.

• A single almond takes roughly one gallon of water, or about 760 queries.
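The comparisons above are simple unit conversions, and a short sketch makes them easy to re-run with different assumptions. It uses the per-query figures quoted earlier (0.3 Wh and 5 mL); the function names are my own, and the only outside constant is the standard size of a US gallon.

```python
ML_PER_US_GALLON = 3785.41  # millilitres in one US liquid gallon
WH_PER_QUERY = 0.3          # Epoch AI estimate quoted above
ML_PER_QUERY = 5.0          # Sean Goedecke's estimate quoted above


def queries_for_energy(kwh, wh_per_query=WH_PER_QUERY):
    """Number of queries whose energy use adds up to `kwh` kilowatt-hours."""
    return kwh * 1000 / wh_per_query


def queries_for_water(gallons, ml_per_query=ML_PER_QUERY):
    """Number of queries whose water use adds up to `gallons` US gallons."""
    return gallons * ML_PER_US_GALLON / ml_per_query


comparisons = {
    "US household, one day (29 kWh)": queries_for_energy(29),
    "4 oz hamburger (616 gal)": queries_for_water(616),
    "8-minute shower (17 gal)": queries_for_water(17),
    "One almond (1 gal)": queries_for_water(1),
}

for label, queries in comparisons.items():
    print(f"{label}: about {queries:,.0f} queries")
```

The unrounded results (roughly 96,700, 466,000, 12,900 and 757) sit close to the rounded figures in the list. Swapping in the 0.34 Wh figure, via `queries_for_energy(29, 0.34)`, lowers the household comparison to roughly 85,000 queries, so the order of magnitude does not depend on which estimate you pick.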

And in exchange for that modest per-query footprint, we are getting genuinely transformative scientific capability:

• DeepMind's AlphaFold, which predicted the structure of over 200 million proteins and made them freely available to more than 2 million researchers in 190 countries, earned Demis Hassabis and John Jumper a share of the 2024 Nobel Prize in Chemistry, one of the first Nobels ever awarded for an AI breakthrough.

• An AI-designed molecule for OCD reached human trials in 12 months instead of the typical 4–5 years, with the same platform now being applied to oncology.

• An MIT model achieved an 83% success rate identifying synergistic drug combinations for pancreatic cancer.

The reasonable environmental position is: there are real questions to ask about AI at hyperscale - data center siting, grid impact, the carbon profile of frontier training runs. Those are policy questions. "Should I run a research synthesis on Claude" is not one of them. The marginal cost of a research query is microscopic next to a hamburger, and the civilizational upside is curing cancer faster.

How to evaluate any claim you read about AI in research

The single highest-leverage habit you can build right now isn't learning to use any specific AI tool. It's learning to pressure-test the people making claims about them - including this piece. Confident takes about AI in UX research are a dime a dozen on Reddit, LinkedIn, X, and conference panels. Most of them rest on shaky personal evidence that the author hasn't disclosed.

Before you take any AI-in-research claim at face value — whether it's a "this changes everything" booster post or a "this is all slop" takedown - ask:

1. What tool did they actually use? "AI" is not a tool. ChatGPT with no document upload is a different product from Claude with a project, from a research platform with retrieval and citation, from an AI-moderated interview platform. A claim about "AI" that doesn't name the specific tool is a claim about a vibe, not about a technology.

2. When did they last use it? Six months is a generation in AI land. 18 months is three. Most categorical dismissals I see were formed in 2023–2024 and have not been re-tested. Most categorical endorsements were formed in the last 30 days and haven't been stress-tested. Recent ≠ correct; old ≠ wrong; but the date is load-bearing context that's almost always missing. And the generations are happening faster and faster.

3. Are they even allowed to use the real thing at work? A huge share of "AI is useless" takes come from people whose IT department gave them a hobbled enterprise Copilot that can't connect to anything. They're judging the category based on a deliberately neutered tool. This is a real constraint, not a personal failing - but it shouldn't drive category-level conclusions.

4. What's their background and incentive? A vendor selling AI tools, a vendor competing with AI tools, a researcher worried about their job, a consultant whose practice depends on the old way of working, a junior who just got an AI workflow that 5x'd their output - each of these comes with a worldview. None of them are disqualifying. All of them are context. Even in this piece you should reflect on my incentives as the founder of an AI-centric UX research platform.

5. Are they reflexively cynical or reflexively bullish? Some people are professional naysayers; some are professional hype amplifiers. If a person's last ten takes on a topic all went the same direction, the eleventh probably will too. Their take is signal about their identity, not necessarily about the technology.

If three of these five answers are "I don't know" or "they didn't say," the claim doesn't deserve much weight yet - including the claims in this article. The right counter-question to any assertion in this space - including mine - is the same one I keep asking on Reddit: what tool, and when?
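If you want to apply the five questions consistently, for example when triaging takes for a team reading list, the rule of thumb above can be written down as a tiny rubric. This is only a sketch: the field names and the `ClaimContext` structure are hypothetical, and the threshold simply encodes "three or more unknowns means little weight."

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ClaimContext:
    """Answers to the five questions; None means 'I don't know' or 'they didn't say'."""
    tool_named: Optional[str] = None        # 1. What tool did they actually use?
    last_used: Optional[str] = None         # 2. When did they last use it?
    access_at_work: Optional[str] = None    # 3. Are they allowed to use the real thing?
    incentive: Optional[str] = None         # 4. What's their background and incentive?
    usual_direction: Optional[str] = None   # 5. Reflexively cynical or bullish?


def unknown_count(ctx: ClaimContext) -> int:
    """Count how many of the five questions have no answer."""
    return sum(value is None for value in asdict(ctx).values())


def deserves_weight(ctx: ClaimContext, max_unknowns: int = 2) -> bool:
    """Three or more unknowns out of five: the claim doesn't deserve much weight yet."""
    return unknown_count(ctx) <= max_unknowns


# Invented examples for illustration.
vague_take = ClaimContext(incentive="researcher worried about their job",
                          usual_direction="consistently skeptical")
detailed_take = ClaimContext(tool_named="research platform with retrieval and citation",
                             last_used="last month",
                             access_at_work="full access",
                             incentive="vendor selling AI tools",
                             usual_direction="consistently bullish")

print(unknown_count(vague_take), deserves_weight(vague_take))
print(unknown_count(detailed_take), deserves_weight(detailed_take))
```

Note that a fully answered rubric only earns a claim a hearing. The detailed take above still comes from a vendor with an obvious incentive, which is context to weigh, not a reason to accept or reject it.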

Apply the framework to me

In the interest of taking my own medicine, here are the strongest counter-arguments to the piece you've just read. I think the article still holds up, but these are the parts where a careful reader should push back hardest.

1. I'm commercially biased and you should weight my claims accordingly. I'm the CEO of an AI-native research platform. My livelihood depends on AI in research being more capable than skeptics claim. I haven't sat in a UXR seat in years. In fact, I've never held UX Researcher as a title, though I have performed the function many times. Run the framework on me and I score badly on at least two of the five questions. The cleanest way to read this piece is as one practitioner's view, weighted against my obvious incentive - not as neutral analysis.

2. "Yet" cuts both ways. I've leaned heavily on the word yet to soften categorical AI-skeptic claims. But "AI can't do this yet" is itself a confident prediction about a future state I don't have evidence for. Saying "synthetic users will get better" is no more falsifiable than saying "they never will." I'm asking skeptics to drop confident predictions while making my own.

3. Some of my counter-claims are arguments from absence. Pointing out that the 182-study synthetic-users review is built on 2023-era citations doesn't actually demonstrate that synthetic users work meaningfully better in 2026. I don't have peer-reviewed evidence to that effect. (Yet.) The honest version of section 2 is "the strongest available critique is stale, and we don't yet have the replacement evidence." That's a weaker position than I let on.

4. The junior-pipeline prescription is glib. "Give juniors agentic tooling on day one and move them to orchestration" sounds great when you're a senior researcher writing an op-ed. It doesn't help a laid-off junior UXR if organizations have stopped hiring at the junior level entirely - which is what a lot of the labor-market data actually shows. The structural problem is real and my prescription is partial at best.

5. I'm comparing AI to its worst human alternative. "AI vs a tired researcher synthesizing twelve hours of audio from memory" is a real comparison, but it's also a rhetorical trick. The fairer comparison is AI vs a well-resourced research team with time, rest, and peer review. AI doesn't always win that one yet.

None of these dissolve the piece, in my view. But they're the lines a serious skeptic should attack first, and any response from me - or from anyone else writing in this space - should engage with them directly rather than around them.

The takeaway: most AI claims are dated

If you look across all six misperceptions, they share a structure: a true observation from 2023 has been frozen into a categorical claim about 2026, with the word yet quietly removed. Hallucinations were a real problem; they are now a manageable one. AI moderation was bad; it's getting good. Synthetic users can't substitute for real participants; that's true today and may not be true in 18 months.

The way through this isn't to argue. It's to do what I asked the redditor to do: tell me what tool you tried, and when. The conversation gets unstuck immediately. The skeptics aren't wrong to demand evidence. They're just often evaluating against evidence that's now eighteen months stale.

The most dangerous misperception in UX research right now is not any of the six above. It's that the field can sit out this cycle and wait for it to settle. It won't. The researchers who will define what good research looks like in 2027 are the ones writing the skills, the rubrics, the data-governance rules, and the orchestration patterns right now - while the ones arguing on LinkedIn about whether AI is "real" research stay frozen.

Sources

Primary practitioner sentiment — r/UXResearch (last 60 days):

"The Largest Review of Synthetic Participants Ever Conducted Found Exactly What You'd Expect" — 134 upvotes

"anyone else worried about AI layoffs in UX?" — 55 comments

"Are there any AI driven usability testing tools that actually work?"

"What's your plan after UXR"

"Is 'synthetic users' the right name for it?"

"[Rant] There are not many Junior → Senior bridges left."

Hallucination rates and RAG (primary sources):

• Vectara HHEM Hallucination Leaderboard (GitHub) - primary public benchmark

• TechRepublic: "AI Models Least & Most Likely to Invent Information" (covers Vectara HHEM rankings)

• arXiv: "Reducing Hallucination in Structured Outputs via Retrieval-Augmented Generation" (ServiceNow, 2024)

Synthetic users — academic and analyst:

• ResearchSquare preprint: 182-study systematic review of synthetic participants (note: preprint, not peer-reviewed; underlying corpus skews 2023-era)

• ACM Interactions: "The Challenges of Synthetic Users in UX Research" (Jan–Feb 2026, peer-reviewed)

• MeasuringU (Jeff Sauro): "A Review of Experiments with Synthetic Users"

• ACM Interactions blog: "The Synthetic Persona Fallacy"

UX field state and AI:

• NN/g: State of UX in 2026

• NN/g: Accelerating Research with AI

• Figma blog (primary source): Dylan Field and Garry Tan on design, AI, and the power of "locking in"

• Fortune: Dylan Field on AI's long-term power to "raise the ceiling"

Entry-level labor market and AI (primary sources):

• Stanford Digital Economy Lab (Brynjolfsson, Chandar, Chen, Nov 2025): "Canaries in the Coal Mine? Six Facts about the Recent Employment Effects of Artificial Intelligence"

• Fortune: Stanford study on AI's "significant and disproportionate impact" on entry-level workers

• Fortune: MIT's Andrew McAfee on automating Gen Z entry-level jobs

• World Economic Forum: How AI is changing the nature of entry level work

• IEEE Spectrum: AI Shifts Expectations for Entry Level Jobs

Human baselines (for comparison against AI performance):

• Microsoft Research: "Achieving Human Parity in Conversational Speech Recognition" — documents professional transcriber error rates of 5.9% (Switchboard) and 11.3% (CallHome)

• Taylor & Francis: "Achieving inter-rater reliability in text-based studies" — overview of Landis and Koch's kappa interpretation for qualitative coding

• Sage Open: "Intercoder Reliability in Qualitative Research" (O'Connor & Joffe, 2020)

• Simply Psychology: "Loftus and Palmer 1974: Car Crash Experiment" — foundational eyewitness/misinformation work

• APA: "How memory can be manipulated, with Elizabeth Loftus, PhD"

AI environmental impact (primary sources):

• Epoch AI: "How much energy does ChatGPT use?" (Feb 2025 revision)

• Sean Goedecke: "Talking to ChatGPT costs 5ml of water, not 500ml"

• IE Insights: From Cloud to Cup: How Much Water Does Your ChatGPT Drink?

Ned is the co-founder and CEO of Great Question. He has been a technology entrepreneur for over a decade and after three successful exits, he’s founded his biggest passion project to date, focused on customer research. With Great Question he helps product, design and research teams better understand their customers and build something people want.
