Gemini 3 Deep Think Sets New Scientific Reasoning Benchmark

2 minute read
By Michelle Hawley
Google’s Gemini 3 Deep Think is now available via API, outperforming Anthropic and OpenAI on scientific benchmarks and expanding enterprise AI reasoning.

Key Takeaways

  • Gemini 3 Deep Think now tackles complex science and engineering problems.
  • The new release outperformed Anthropic and OpenAI on major scientific and reasoning benchmarks.
  • Google is expanding access to Deep Think through the Gemini API for researchers, engineers and enterprises.

Google's upgraded Gemini 3 Deep Think now gives researchers and enterprises API access to reasoning capabilities that outperform rivals on key scientific benchmarks.

The company released the major upgrade on Feb. 12, 2026, making its specialized reasoning mode available to Google AI Ultra subscribers and, for the first time, via the Gemini API to select researchers, engineers and enterprises. According to company officials, the update blends deep scientific knowledge with engineering utility to drive practical applications.

The upgraded model achieved 84.6% on ARC-AGI-2 — verified by the ARC Prize Foundation — compared to Anthropic's Opus 4.6 at 68.8% and OpenAI's GPT-5.2 at 52.9%. Google also unveiled Aletheia, a math agent that autonomously solves open problems and verifies proofs.

In one case study, Lisa Carbone, a mathematician at Rutgers University working on mathematical structures for high-energy physics, used Deep Think to review a highly technical mathematics paper. The model identified a subtle logical flaw that had passed through human peer review unnoticed.

Gemini 3 Deep Think: Benchmark Performance & Capabilities

Google says the latest upgrade pushes Gemini 3 into leadership territory. The results below detail the model's reported standing across major industry evaluations.

Benchmark/Evaluation | Reported Performance
ARC-AGI-2 | 84.6% accuracy, verified by ARC Prize Foundation
Humanity's Last Exam | 48.4% without tools, setting new standard
Codeforces Elo | 3,455 rating, nearly 1,000 points above Opus 4.6
Math Olympiad 2025 | Gold-medal level performance

The AI benchmarks listed above represent Google's reported results; independent verification of some metrics remains ongoing.

Gemini Powers Google’s Market Resurgence

2024 marked a bit of a stumble for Google. The tech giant had a slow start to the GenAI race, one that turned near catastrophic when its products began generating images of Nazis and telling users to eat rocks. By January 2025, rumors were circulating about CEO Sundar Pichai's job security.

The turnaround throughout 2025, however, delivered a 68% stock surge to a $3.8 trillion market cap — surpassing Microsoft — and the company's first $100 billion quarter.

Google DeepMind's Gemini 3 family now rivals Claude and ChatGPT atop performance leaderboards. The company launched thinking controls and pricing shifts for the Gemini 3 API in late 2025. And in early 2026, Apple agreed to use Gemini to power a revamped Siri and future Apple Intelligence features.

Google Cloud revenue also surged 48% to $17.7 billion in Q4 2025. Both Anthropic and OpenAI signed major infrastructure deals, with Anthropic's October 2025 partnership providing access to up to one million Tensor Processing Units.

Granular Reasoning Controls Arrive in Gemini API

With this latest release, Google DeepMind introduced new API controls that give developers granular authority over reasoning depth, multimodal processing and workflow reliability.

The update added a thinking_level parameter that adjusts internal reasoning depth for cost or quality optimization. This enables developers to tune model behavior based on task complexity — deeper analysis for demanding workflows or faster inference for routine operations.
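
As a rough illustration of that trade-off, the sketch below picks a reasoning depth from task complexity and places it in a request body. Only the parameter name thinking_level comes from the release; the level values ("low", "high"), the complexity labels, the build_request helper and the layout of the request body are assumptions made for this example, so check the Gemini API reference before relying on any of them.

```python
from typing import Any

# Hypothetical mapping from task complexity to reasoning depth.
# The labels and the level values "low" and "high" are assumptions.
LEVEL_BY_COMPLEXITY = {
    "routine": "low",     # faster, cheaper inference
    "demanding": "high",  # deeper analysis at higher cost
}


def build_request(prompt: str, complexity: str) -> dict[str, Any]:
    """Build a request body whose thinking_level follows task complexity."""
    if complexity not in LEVEL_BY_COMPLEXITY:
        raise ValueError(f"unknown complexity: {complexity!r}")
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        # Assumed placement of the parameter inside the request body.
        "generation_config": {
            "thinking_config": {
                "thinking_level": LEVEL_BY_COMPLEXITY[complexity],
            },
        },
    }


if __name__ == "__main__":
    # A routine task gets the shallow setting, a demanding one the deep setting.
    print(build_request("Summarize this support ticket.", "routine"))
    print(build_request("Check this proof for logical gaps.", "demanding"))
```

Keeping the mapping in one table means a team can revisit the cost and quality balance in a single place instead of at every call site.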

Google at a Glance

Founded in 1998, Google serves mass-market consumers, advertisers, enterprises and public sector organizations with a broad portfolio of digital products and cloud services. The company offers consumer platforms such as Search, YouTube, Maps, Gmail, Android and Chrome, alongside Google Cloud Platform and Workspace for enterprise customers.

About the Author
Michelle Hawley

Michelle Hawley is an experienced journalist who specializes in reporting on the impact of technology on society. As editorial director at Simpler Media Group, she oversees the day-to-day operations of VKTR, covering the world of enterprise AI and managing a network of contributing writers. She's also the host of CMSWire's CMO Circle and co-host of CMSWire's CX Decoded. With an MFA in creative writing and background in both news and marketing, she offers unique insights on the topics of tech disruption, corporate responsibility, changing AI legislation and more. She currently resides in Pennsylvania with her husband and two dogs.

Main image: gguy | Adobe Stock