Being confident about what you know is always an asset in business. Making plans for a future satellite office, strategizing about a marketing and communications campaign or rolling out a new piece of software requires a high level of expertise and experience. Presenting a plan without good assurances about costs, timeline and risks is never a good practice.
Interestingly, new research released by OpenAI reveals a much more complex picture of what it means to be confident in the age of artificial intelligence. The study argues that AI’s habit of making educated (and often wrong) guesses, known as hallucinating, persists largely because the systems are rewarded for providing an answer rather than admitting they are unsure.
Experts who have examined the research say this could pose problems for companies deploying the technology and relying on AI-generated answers to make key decisions.
Breaking Down OpenAI’s Latest Study on Chatbot Behavior
“Hallucinations are not bugs, they are features."
- Conor Grennan
Chief AI Architect, NYU Stern School of Business
The OpenAI study makes one thing clear about AI: it’s not always accurate. The reason that’s true may not be a surprise to anyone who has used ChatGPT recently. Even looking up a simple email address sometimes returns an incorrect or confusing response. The data used to train chatbots is sometimes woefully outdated.
Knowledge workers in particular may be fooled into thinking a powerful AI chatbot like Google Gemini or ChatGPT can solve complex problems and provide accurate answers, but the study suggests the alternative — not knowing the answer at all — is a worse problem.
“Hallucinations are not bugs, they are features,” said Conor Grennan, the chief AI architect at NYU Stern School of Business and founder and CEO of AI Mindset, adding that OpenAI technology allows “creativity” in how their chatbots respond.
Response Predictions
Models, according to Grennan, are trained on large data sets to predict text responses, which means errors are inevitable. One way to think about this is when you send a Gmail message. Built-in AI can make suggestions, or guesses, about what you want to write next, but it’s not always what you intend. For example, you might want to say you will meet someone for lunch, but the bot will suggest meeting at the office instead. This type of prediction is common and pervasive with AI.
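To make the prediction idea concrete, here is a minimal, hypothetical sketch in Python. The phrase and the probabilities are invented for illustration; a real language model works over tokens and billions of parameters, but the selection logic is the same one Grennan describes: pick the statistically likeliest continuation, not necessarily the one that is correct.

```python
# Toy illustration of next-word prediction; the phrase and probabilities are invented.
next_word_probs = {
    "Let's meet at the": {
        "office": 0.46,   # most common continuation in this made-up data
        "cafe": 0.31,     # what the user actually meant
        "park": 0.14,
        "airport": 0.09,
    }
}

def predict(prompt: str) -> str:
    """Return the highest-probability continuation for the prompt."""
    candidates = next_word_probs[prompt]
    return max(candidates, key=candidates.get)

print(predict("Let's meet at the"))  # prints "office", even if you meant "cafe"
```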
Poor Training Data
Thomas Randall, research director at Info-Tech Research Group, explained that AI answers will include “unavoidable” hallucinations, especially when there is not a good set of training data available for unusual and rare facts that are hard to quantify.
“For very rare facts or facts not well-represented in training data (think tacit knowledge or unwritten cultural information), errors are almost inevitable unless we provide better data or better mechanisms for knowing what the model doesn’t know."
In other words, it’s garbage in, garbage out all over again, like the early days of personal computing.
Related Article: Are AI Models Running Out of Training Data?
The Problem With Predictive Responses
Inside many corporations, workers frequently rely on AI technology, asking questions about how to plan a new marketing campaign or resolve a budget discrepancy. The most popular AI chatbots, like ChatGPT and Grok, only take a few seconds to respond.
Grennan explained that hallucinations occur not just because the data sets and language models are wrong, but because the bots are trained to be confident about their answers. In fact, he said, research shows AI not having an answer is considered unacceptable in 90% of cases.
“The models learned to guess confidently instead of admitting uncertainty,” said Grennan. “We basically created an incentive structure that rewards overconfident mistakes.”
This “confidence” can be misconstrued as arrogance, and there’s a parallel with humans. Often, pride is the gap between what you actually know and what you want people to think you know; it’s the bridge between reality and false assumptions. Similarly, bots are built to project a knowledgeable demeanor and are trained to appear capable to humans.
Rewarding Confidence Over Correctness
Mitchel Crookson, an AI analyst with AI Tools who read through the study, noted that chatbots have “statistical pressure” to be right most of the time, or at least provide a coherent answer.
“If an AI model is entirely judged on its ability to answer, not the correctness of the answer, then there is that extra pressure to produce an answer always, even when it is pure guesswork,” he said. “After all, there is no punishment or penalty for providing a wrong answer to a user query.”
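A back-of-the-envelope sketch shows the arithmetic behind that pressure. The grading schemes below are assumptions made for illustration: a correct answer earns one point, and under accuracy-only grading a wrong answer costs nothing, so guessing always beats admitting uncertainty.

```python
# Hypothetical grading schemes, to illustrate the incentive Crookson describes.
def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score for guessing when the model is right with probability p_correct."""
    return p_correct * 1.0 - (1 - p_correct) * wrong_penalty

ABSTAIN = 0.0  # "I don't know" earns zero under either scheme

for p in (0.1, 0.3, 0.5):
    accuracy_only = expected_score(p, wrong_penalty=0.0)  # wrong answers cost nothing
    penalized = expected_score(p, wrong_penalty=1.0)      # wrong answers cost a point
    print(f"p={p:.1f}  accuracy-only: guess {accuracy_only:+.2f} vs abstain {ABSTAIN:+.2f} | "
          f"penalized: guess {penalized:+.2f} vs abstain {ABSTAIN:+.2f}")

# Under accuracy-only grading, any p > 0 makes guessing the better strategy,
# which is the "incentive structure that rewards overconfident mistakes."
```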
Indeed, study after study has found an alarming number of hallucinations. One research paper found that bots missed the mark about 40% of the time when they used general internet training data for questions related to cancer research. And recent studies found that AI hallucinations nearly doubled between 2024 and 2025.
Of course, these errors are not stopping people from using the bots. OpenAI estimates that more than one billion people will use ChatGPT by the end of this year.
The question to ask is: What can companies do about hallucinations when so many people rely on AI answers and assistance?
How Enterprises Can Mitigate AI Hallucination Risk
Dealing with AI hallucinations can be challenging: workers rely on the bots often, yet there are typically no well-established workflows or processes for using AI tools in the workplace.
One example of how to approach hallucinations comes from Moti Gamburd, CEO at CARE Homecare. The home care firm has multiple offices and dozens of employees who rely on AI tools in their jobs. “We apply AI for assistance in our intake, scheduling and policy drafting activities, but in the field of care work, even a minor hallucination can create confusion for families who are already overwhelmed.”
To combat the AI hallucination problem, Gamburd said his staff uses a methodical workflow. Whenever employees use the tools, they always ask for vetted sources or ask the bot to let the user know if there is no clear answer.
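A lightweight version of that workflow can even be written into the prompt itself. The sketch below is illustrative only, not CARE Homecare’s actual process; it assumes the OpenAI Python SDK (openai>=1.0), an OPENAI_API_KEY in the environment, and a placeholder model name.

```python
# Illustrative sketch of a "cite your sources or say you don't know" wrapper.
# Not CARE Homecare's actual workflow; assumes the OpenAI Python SDK (openai>=1.0)
# and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

GUARDRAIL_PROMPT = (
    "Answer only from well-established, citable sources and list them. "
    "If there is no clear, verifiable answer, reply exactly: "
    "'No clear answer - please verify with a human.'"
)

def ask_with_guardrails(question: str) -> str:
    """Send a question with instructions to cite sources or admit uncertainty."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; swap in whatever model your team uses
        messages=[
            {"role": "system", "content": GUARDRAIL_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(ask_with_guardrails("What documents does a new client need for intake?"))
```

Prompt instructions like this reduce, but do not eliminate, hallucinated answers, which is why the vetted sources still need a human to check them.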
“Everyone using these tools needs to understand their specific strengths and limitations,” said Grennan. “Quality control and fact-checking are core to any responsible workflow. The more your team works with these systems, the better they'll get at knowing when to trust the results and when verification is essential. Which is very similar to how you work with colleagues.”
Turning AI Errors Into Enterprise Insights
While the OpenAI study is new and has yet to be peer-reviewed, there are several takeaways companies can implement now to deal with hallucinations and other errors in AI responses.
Grennan made a key point about trusting the bots: Large companies are not monoliths; they are still collections of people. You can imagine workers in every segment of an enterprise trying to solve real problems in accounting and finance, or relying on AI to suggest answers about shipping and delivery logistics. Workers are not just names in a database; they are in the trenches trying to uncover the insights and data that help them make decisions. And the bots are just as fallible as humans.
“The key for individuals is to stop treating large language models like calculators. They're more like specialized colleagues who are brilliant in some areas and unreliable in others."
The biggest actionable lesson for enterprises, Crookson added, is to stop treating AI hallucinations purely as a bug to be patched.
“Hallucination is the side-effect of design choices around training objectives, incentive structures, and evaluation. It’s important to educate stakeholders, including end users, about what hallucination is and where it comes from. If people expect the AI models to always be right, hallucinations undermine trust.”
Related Article: Reducing AI Hallucinations: A Look at Enterprise and Vendor Strategies
Training Your People Matters More Than Training the Model
Part of the answer here is to educate workers about the hallucinations — e.g., how common they have become and why the training data is sometimes suspect. Even learning the term “large language model” can go a long way once you explain that bots use prediction techniques to formulate answers. There’s also value in creating a documented workflow and process — distributed to anyone who might use AI chatbots — that encourages verification and citations.
At the end of the day, a best practice with AI technology mirrors one from the security field: trust, but verify. The bots are here to stay and offer real value, but only if you know how to use them.