For the past two years, I’ve chronicled how the rise in generative and agentic AI applications has pushed many organizations to rethink their quality assurance (QA) and testing strategies.
These companies have been under pressure for years to release software faster, and that pressure has only intensified since this latest AI boom kicked off. The need to balance speed, quality and cost is still there, as are the resource bottlenecks and technical immaturity. We’re at an inflection point: teams increasingly rely on AI and automation to test quickly, but many lack the expertise to ensure that the systems they use for testing have been properly trained and tuned to reduce risk.
Table of Contents
- AI Use Surges While Confidence Plunges
- One More Time — Humans In The Loop Are a Necessity
- The Teams Getting It Right
AI Use Surges While Confidence Plunges
The 2026 State of Digital Quality report from Applause shows that this gap is having a significant effect on quality. In a survey of more than 1,000 developers and QA professionals and over 4,000 consumers:
- 40% said they experienced hallucinations this year, up from 32% in 2025
- 46% said AI misunderstood their prompts
- 41% said responses lacked sufficient detail
- 75% reported quality issues, an increase that followed years of steady decline
From an organizational perspective, 55% of organizations have deployed AI features, yet more than half of AI initiatives still fail to reach full production due to integration challenges, quality risks and cost outweighing value.
In a recent McKinsey survey, 74% of respondents said they consider inaccuracy a highly relevant risk as adoption expands. Deloitte saw similar results. Its State of AI report found that workforce access to AI expanded by 50% in just one year, but fewer than 60% of workers with access use it in their daily workflow, a pattern similar to 2025. And an Nvidia AI report found that when workers were asked about the benefits of AI, 30% of respondents named lack of clarity on AI’s ROI as one of their top challenges.
So what do all these numbers mean? AI adoption is surging, and AI testing tools have to keep up. But the quality issues persist, and organizational confidence is declining as inaccuracy and security risks grow. How can the average organization keep up?
One More Time — Humans In The Loop Are a Necessity
In my last article, I poked a bit of fun at how often I talk about the importance of keeping humans in the loop from a testing perspective. It seems like the message is getting out. In the Applause report, more than 61% of organizations surveyed said they rely on human input to evaluate AI performance, with fewer than one-quarter of respondents claiming to develop AI systems that operate independently.
Fully agentic workflows are not on the roadmap for many organizations. The mix of AI-driven and human-led testing includes fine-tuning with synthetic and human-generated data, human-led and automated red teaming, AI-first testing agents and human-in-the-loop monitoring.
Over the past two years, user expectations around generative AI have evolved quickly, with 84% of those surveyed noting that multimodal functionality (the ability to process and generate text, images, audio and video) is critical. This creates even more of a need for human-led evals and fine-tuning, as well as structured red teaming by both domain experts and generalists. Without these, organizations risk scaling systems they don’t fully understand or control.
The Teams Getting It Right
AI adds speed and scale, but human critique is what earns trust. The companies getting it right combine AI and domain expertise to evaluate and fine-tune their systems, ensuring outputs are more relevant, accurate and inclusive. This includes reviewing all costs early in the product design stage and looking for ways to optimize design and development decisions without sacrificing quality.
The importance of independent, comprehensive evals can’t be overstated. I’ll get into more details on how organizations should approach AI evaluations in my next article.
As AI continues its meteoric rise and takes on even more dimensions, QA must keep up so that organizations can reap the benefits of AI while maximizing accuracy, relevance, inclusivity and safety.