Name: The Inference: The Leadership Mindset Needed to Scale AI
Released: 2026-05-27T00:00:00.000Z

In this inaugural episode of VKTR's The Inference, host Michelle Hawley speaks with Daniel Wu, AI strategy and engineering leader and course facilitator for Stanford's AI Professional Program, about what it actually takes to build and scale AI in the enterprise.

Drawing on his experience leading AI and machine learning for commercial banking at JPMorgan Chase, Daniel shares how organizations can move past the pilot phase and into real, value-driven AI transformation. The conversation covers how to pick the right metrics without falling into the Goodhart's Law trap, why AI literacy is fundamentally a leadership challenge, the most urgent security risks as agentic AI matures and how to distinguish a meaningful capability shift from a flashy demo.

Tune in to learn about Daniel's three-pillar framework for measuring AI ROI that goes beyond mere cost savings.

Host

Michelle Hawley

Michelle Hawley is Editorial Director of VKTR, where she covers AI disruption, enterprise technology and the leaders shaping what comes next.

Guest

Daniel Wu

Daniel Wu is an AI strategy and engineering leader who built the commercial banking AI and ML practice at JPMorgan Chase from the ground up, deploying the firm's largest LLM production system in 2024 and its first agentic AI solution in 2025.

What Stood Out From Our Chat

Start With the Business Problem, Not the Tool
You Can’t Measure AI ROI With Vanity Metrics
The AI Cost Question You Can’t Ignore
Security Risks Now Move Faster Than Governance
AI Literacy Is a Leadership Requirement
The Real Opportunity Lives Beyond the Demo

Key Takeaways

Avoid the AI measurement trap. Lose the vanity metrics with Daniel's three-part ROI measurement framework, which assesses operational efficiency, growth and velocity, and human impact.
Find that practical AI lens. Daniel covers how to run a feasibility and requirements test on any new tool, spot scalability gaps before they derail a project and identify whether a capability is truly foundational or just a niche solution.
Know the risks. Daniel walks viewers through alignment risk, internal policy violations, prompt injection and the emerging threat of model collapse.

Enterprise leaders are feeling the heat. It’s time to move beyond proving AI can generate excitement (we know it can) to proving it can reduce costs, improve ops and create real, measurable ROI.

Daniel Wu, an AI strategy and engineering leader, author and course facilitator for Stanford’s AI Professional Program, argues that the biggest problem is a lack of clarity.

Too many organizations start with the tool and then move on to the problem (a business no-no, even before AI came along). They chase the latest model or demo before they define what they actually need AI to accomplish.

That, said Wu, is where enterprise AI initiatives begin to drift.

Start With the Business Problem, Not the Tool

For Wu, successful AI adoption starts with a simple question: What does the business actually need?

That may sound obvious, but it’s often the step enterprises rush past. During his time leading AI and machine learning work in commercial banking, Wu said the priority was not simply to “do AI.” It was to build AI systems that could scale inside one of the most regulated business environments.

The companies that see value are the ones that can connect AI capabilities to a clear operational need. That’s especially important now as businesses move deeper into agentic AI, where agents can plan, reason and take action (without human intervention) across workflows. Without a clearly defined business purpose, that autonomy can create more risk than value.

You Can’t Measure AI ROI With Vanity Metrics

One of the biggest mistakes Wu sees is treating activity as impact.

Think token usage, number of chatbot sessions, lines of AI-generated code, volume of automated responses. These numbers show that employees are using AI, but they don’t prove the business is better off.

In fact, those metrics can distort behavior. If engineering teams are measured by how much AI-generated code they produce (AKA tokenmaxxing), they may produce more code, but not necessarily better software. The result might just be more complexity, more maintenance and a larger attack surface.

Instead, Wu recommends measuring AI value across three categories:

Operational efficiency: Cost savings, productivity gains and reduced overhead.
Growth and velocity: Faster product launches, improved speed to market and revenue per employee.
Human impact: Employee engagement, retention and whether AI improves or degrades the work experience.

That last category is often overlooked. If AI turns skilled workers into “glorified editors,” the organization may see short-term productivity gains while creating long-term talent problems.

The same logic applies to customer service automation. Deflecting more calls or chats may look efficient on a dashboard. But if customers are frustrated, issues go unresolved or sentiment declines, the ROI story falls apart.

The AI Cost Question You Can’t Ignore

For the first wave of generative AI adoption, cost discipline often took a back seat to experimentation. Companies were eager to get in the game before they got left behind, and they treated model access as the price of staying competitive.

That’s starting to change.

More leaders are starting to ask whether every use case really needs the most powerful model, said Wu. In many cases, it doesn’t. That’s where AI strategy becomes a portfolio decision. Not every workflow needs frontier AI. Not every problem requires reasoning. Not every employee interaction should trigger expensive token consumption. A more mature enterprise AI program will match the tool to the task.
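One way to picture "match the tool to the task" is a small routing rule that picks the cheapest model tier able to handle a request. The sketch below is illustrative only: the tier names, the prices and the `pick_tier` helper are hypothetical and do not come from Wu or from any vendor's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # assumed USD figure, for illustration only
    handles_reasoning: bool


# Hypothetical tiers, ordered from least to most capable.
TIERS = [
    ModelTier("small", 0.0005, False),
    ModelTier("mid", 0.003, False),
    ModelTier("frontier", 0.03, True),
]


def pick_tier(needs_reasoning: bool, min_tier: str = "small") -> ModelTier:
    """Return the cheapest tier that meets the task's requirements."""
    floor = [t.name for t in TIERS].index(min_tier)
    candidates = [
        t for i, t in enumerate(TIERS)
        if i >= floor and (t.handles_reasoning or not needs_reasoning)
    ]
    return min(candidates, key=lambda t: t.cost_per_1k_tokens)


def estimated_cost(tier: ModelTier, tokens: int) -> float:
    """Rough cost of one request at the tier's assumed price."""
    return tier.cost_per_1k_tokens * tokens / 1000
```

Under these assumed prices, a routine classification request goes to the "small" tier, and only requests flagged as needing multi-step reasoning reach the "frontier" tier.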

That mindset is also essential for companies building or buying AI agents. The question is not whether the agent can perform a task in a flashy demo. It is whether that agent can perform the task reliably and cost-effectively.

Security Risks Now Move Faster Than Governance

Wu is especially concerned about the shift from chatbots to autonomous agents.

Traditional cybersecurity often focuses on external attackers. But AI introduces a different class of risk: systems that misinterpret policies, overshare information, take unauthorized actions or execute flawed instructions at machine speed.

In an agentic environment, the breach might come from an internal workflow where an AI system has too much access and too little oversight.

Enterprises need to account for risks such as:

Prompt injection
Data poisoning
Shadow AI use
Over-permissioned agents
Sensitive data leakage
Poor human oversight
Misaligned automation goals

As companies add more AI touchpoints (APIs, retrieval systems, internal knowledge bases and autonomous workflows), they also expand the attack surface. Security can’t be bolted on after the fact. It has to be built into the system architecture from the start.
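As a rough illustration of what "built in from the start" could mean for over-permissioned agents, the sketch below applies a deny-by-default rule: an agent may take only the actions on its allowlist, and sensitive actions wait for a human. The `AgentPolicy` class, the `authorize` function and the action names are hypothetical, not part of any specific agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPolicy:
    agent_id: str
    allowed_actions: frozenset
    # Subset of allowed actions that still require human sign-off.
    needs_human_approval: frozenset = frozenset()


def authorize(policy: AgentPolicy, action: str, approved_by_human: bool = False) -> str:
    """Deny by default; hold sensitive actions until a human approves."""
    if action not in policy.allowed_actions:
        return "deny"
    if action in policy.needs_human_approval and not approved_by_human:
        return "hold_for_review"
    return "allow"


# Hypothetical policy for a document-handling agent.
policy = AgentPolicy(
    agent_id="doc-classifier",
    allowed_actions=frozenset({"read_document", "tag_document", "send_email"}),
    needs_human_approval=frozenset({"send_email"}),
)
```

With this policy, `authorize(policy, "delete_record")` is denied because the action was never granted, and `authorize(policy, "send_email")` is held until a reviewer approves it.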

AI Literacy Is a Leadership Requirement

Many enterprise AI problems trace back to a basic knowledge gap. Leaders feel pressure to adopt AI, but they might not understand how it works.

Execs don’t need to be machine learning engineers. But they do need to understand that GenAI models are probabilistic, not deterministic. They generate likely outputs, not guaranteed truths. The same prompt can produce different answers. A confident response can still be wrong.
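A toy example can make "probabilistic, not deterministic" concrete. The sketch below samples the next word from a set of scores, which is broadly how generative models choose output. The scores and the `sample_token` helper are invented for illustration and are not taken from any real model.

```python
import math
import random


def sample_token(scores: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Sample one token with probability proportional to exp(score / temperature)."""
    tokens = list(scores)
    scaled = [scores[t] / temperature for t in tokens]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical scores for the next word after "The invoice is ..."
SCORES = {"approved": 2.0, "pending": 1.6, "rejected": 0.4}

rng = random.Random(0)
samples = [sample_token(SCORES, temperature=1.0, rng=rng) for _ in range(200)]
counts = {token: samples.count(token) for token in SCORES}
```

Across the 200 draws, the same "prompt" yields more than one answer, with likelier words appearing more often. Lowering `temperature` toward zero makes the top-scoring word win almost every time, which is more consistent but no guarantee that the answer is correct.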

Leaders also need to understand the black box nature of many AI systems, the risk of model collapse as AI-generated content floods training data and the alignment challenges created when models are optimized to satisfy users rather than tell the truth.

That’s why AI upskilling can’t be limited to technical teams. Executives, managers, legal teams, HR leaders and frontline employees all need a working understanding of how to use AI responsibly.

The Real Opportunity Lives Beyond the Demo

The enterprise AI market is full of impressive demos. But Wu argues leaders need to separate flashy tools from meaningful capability shifts.

The difference comes down to several questions:

Does this solve a real business problem?
Does it work with the data the company actually has?
Can it scale beyond a narrow pilot?
What infrastructure does it require?
What governance does it need?
Does it improve a workflow or merely decorate it?

Production AI is usually narrower and less magical than the demo. But it’s also more valuable. A task-specific agent that classifies documents reliably and at scale may be less exciting than a general-purpose assistant, yet it may deliver far more enterprise value.

As Wu put it, ROI does not live in the demo. It lives in the messy work of integration and scaling.
