
LMArena Raises $150M Series A at $1.7B Valuation

By Michelle Hawley
AI benchmarking startup LMArena nearly triples its valuation eight months after its seed round, signaling investor appetite for model evaluation infrastructure.

Key Takeaways

  • LMArena secured a $150M Series A led by Felicis and UC Investments, with participation from several other investors.
  • The platform's community grew more than 25x over the past year, alongside widespread adoption by AI labs.
  • AI labs and enterprise users gain trusted model performance insights, supporting responsible AI deployment.

Investor appetite for AI evaluation infrastructure surges as enterprises struggle to validate model performance at scale.

LMArena, the AI model evaluation platform developed by UC Berkeley researchers, raised $150 million in Series A funding on Jan. 6. Felicis and UC Investments led the round, with participation from Andreessen Horowitz, The House Fund, LDVP, Kleiner Perkins, Lightspeed Venture Partners and Laude Ventures.

The platform's community grew by more than 25x over the past year, alongside rapid adoption by AI labs that view it as a standard for evaluating real-world model performance. The funding follows strong revenue growth from LMArena's first commercial evaluation product, which launched in September 2025.


Inside LMArena’s Benchmarking Engine

LMArena officials said the funding addresses increased competition among AI labs, which has created demand for rigorous, reproducible evaluations. Benchmarking large language models requires standardized methods that account for diverse use cases.

Platform Component | How It Works
Community Voting | 50 million votes across text, vision, video and image modalities
Model Evaluations | 400+ assessments spanning open and proprietary models
Open-Source Data | 145,000 battle data points across multiple categories
Commercial Evaluations | Paid service for enterprises and model labs (launched September 2025)
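The community-voting row describes the platform's core mechanic: users compare two anonymous model responses side by side, vote for the better one, and the pairwise results ("battles") are aggregated into a leaderboard. The sketch below shows one common way such battle logs can be converted into rankings, a simple online Elo update. The model names, starting rating and K-factor are illustrative assumptions, not LMArena's published methodology, which relies on more robust statistical rating models.

```python
import math
from collections import defaultdict

K = 32  # illustrative K-factor; production systems tune this or fit ratings statistically

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + math.pow(10, (rating_b - rating_a) / 400))

def update_ratings(ratings: dict, model_a: str, model_b: str, outcome: float) -> None:
    """Apply one battle result. outcome: 1.0 = A wins, 0.0 = B wins, 0.5 = tie."""
    e_a = expected_score(ratings[model_a], ratings[model_b])
    ratings[model_a] += K * (outcome - e_a)
    ratings[model_b] += K * ((1.0 - outcome) - (1.0 - e_a))

# Hypothetical battle log: (model_a, model_b, outcome of the community vote)
battles = [
    ("model-x", "model-y", 1.0),
    ("model-y", "model-z", 0.5),
    ("model-x", "model-z", 0.0),
]

ratings = defaultdict(lambda: 1000.0)  # every model starts at a baseline rating
for a, b, outcome in battles:
    update_ratings(ratings, a, b, outcome)

# Print the resulting leaderboard, highest rating first
for model, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {rating:.1f}")
```

Run over a real battle log, the loop produces ratings whose ordering approximates each model's win probability against the field; platforms operating at LMArena's scale typically refit ratings over the full vote history rather than updating them one battle at a time.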

Related Article: The Benchmark Trap: Why AI’s Favorite Metrics Might Be Misleading Us

Behind the Valuation: What Drove LMArena’s Growth

LMArena's rapid ascent from UC Berkeley research project to $1.7 billion company shows just how urgently the market needs reliable tools to measure AI model performance in a competitive, fast-moving landscape.

The company incorporated as Arena Intelligence Inc. in April 2025, then closed a $100 million seed round the following month at a $600 million valuation. That same April, competitors published allegations that partnerships with OpenAI, Google and Anthropic enabled benchmark gaming — claims LMArena denied.

In September 2025, the startup launched AI Evaluations, a paid service allowing enterprises and model labs to commission crowdsourced assessments. By December the product reached a $30 million annualized run-rate, demonstrating strong market demand.

Evaluation Is the New Enterprise AI Priority

Recent data shows that 65% of organizations regularly use generative AI, yet adoption has outpaced readiness. Enterprises attempting to scale AI face a major challenge: ensuring systems perform as intended without introducing new risks.

Common gaps include:

  • Inconsistent team alignment
  • Fragmented data pipelines
  • Governance frameworks lacking rigor

The pressure to remain competitive has fueled demand for rigorous evaluation platforms that go beyond vendor-supplied benchmarks. 

Additionally, for enterprises in regulated sectors, transparency has become as valuable as capability. Organizations must trace data lineage, provide audit trails and explain AI-driven outcomes. Transparency around evaluation methods and benchmark scope has become a trust factor for enterprises considering AI adoption at scale. 


Complex multi-step tasks require sophisticated validation before deployment, especially as organizations roll out AI agents and agentic systems that operate with greater autonomy. New agent-to-agent testing solutions address gaps in AI validation as enterprises move pilots into production.

Related Article: Poetiq’s AI Reasoning Layer Hits 54% on ARC-AGI-2 at Half the Cost

LMArena: Berkeley-Born, Enterprise-Built

LMArena is an open platform targeting enterprises, AI model labs and developer teams seeking transparent benchmarking of artificial intelligence models. Founded in 2025 by UC Berkeley researchers, the platform enables public comparison of AI model outputs across text, vision, video and image domains through community-driven evaluation.

About the Author
Michelle Hawley

Michelle Hawley is an experienced journalist who specializes in reporting on the impact of technology on society. As editorial director at Simpler Media Group, she oversees the day-to-day operations of VKTR, covering the world of enterprise AI and managing a network of contributing writers. She's also the host of CMSWire's CMO Circle and co-host of CMSWire's CX Decoded. With an MFA in creative writing and background in both news and marketing, she offers unique insights on the topics of tech disruption, corporate responsibility, changing AI legislation and more. She currently resides in Pennsylvania with her husband and two dogs.
