
Meta Unveils Muse Spark, Its First Step Toward ‘Personal Superintelligence’

By Michelle Hawley
Meta debuts Muse Spark to challenge OpenAI, Google and Anthropic.

Key Takeaways

  • Muse Spark is a new proprietary model family, not an upgrade to Llama.
  • Meta claims its new training recipe is more efficient than Llama 4 Maverick.
  • Contemplating mode scored 58% on Humanity's Last Exam by running multiple reasoning agents in parallel.
  • Meta is positioning health as the flagship consumer use case for the model.

Meta's newly formed Superintelligence Labs has released Muse Spark, a multimodal reasoning model that the company says marks the beginning of a fundamentally new approach to its AI development.

Available now on meta.ai and the Meta AI app, Muse Spark introduces visual chain-of-thought reasoning, tool use and a novel multi-agent system called Contemplating mode.

What Is Muse Spark?

Muse Spark is the first model in Meta's "Muse" family — a ground-up rebuild of the company's AI stack, separate from its open-source Llama line.

Meta says Muse Spark is natively multimodal, meaning it was designed from the start to integrate text and visual information rather than bolting vision onto a text model after the fact.

Key capabilities include:

  • Visual Reasoning and Annotation: Analyzing images, recognizing entities and generating dynamic overlays (e.g., nutritional labels on a photo of food)
  • Tool Use: Interacting with external tools and APIs mid-conversation
  • Contemplating Mode: Orchestrating multiple AI agents reasoning in parallel to tackle harder problems without drastically increasing response time
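Meta has not published how Contemplating mode is implemented. The sketch below is therefore not Meta's method. It shows only the general idea of running several agents on the same question in parallel and aggregating their answers, here by a simple majority vote. The names `make_stub_agent` and `run_parallel_agents` are hypothetical, and the stub agents stand in for real model calls.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# An "agent" here is any callable that maps a question to an answer string.
Agent = Callable[[str], str]


def make_stub_agent(answer: str) -> Agent:
    """Hypothetical stand-in for a model call: always returns a fixed answer."""
    def agent(question: str) -> str:
        return answer
    return agent


def run_parallel_agents(question: str, agents: List[Agent]) -> str:
    """Run every agent on the question concurrently and return the majority answer."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        answers = list(pool.map(lambda agent: agent(question), agents))
    # Majority vote is an assumed aggregation rule, chosen for simplicity.
    return Counter(answers).most_common(1)[0][0]


agents = [make_stub_agent("42"), make_stub_agent("41"), make_stub_agent("42")]
result = run_parallel_agents("What is 6 * 7?", agents)
print(result)
```

Because the agents run concurrently, wall-clock time in a setup like this is governed by the slowest agent and not by the sum of all agents, which is consistent with the claim that harder problems can be tackled without drastically increasing response time.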

Meta is also opening a private API preview to select developers.

How Does It Perform?

Meta shared benchmark results that put Muse Spark at a competitive level with leading models from OpenAI and Google, while acknowledging gaps in long-horizon agentic tasks and coding.

In Contemplating mode, the model posted significant gains on challenging tasks, scoring 58% on Humanity's Last Exam and 38% on FrontierScience Research.

By Meta's own benchmarks, the model ranks behind only Google Gemini 3.1 Pro and OpenAI's GPT-5.4 in multimodal functionality, though some observers note that Meta didn't release a technical paper alongside the model, suggesting the figures deserve scrutiny.

Meta's Scaling Story

Much of Meta's announcement focused not on the model itself but on the infrastructure and training methodology behind it. The company highlighted three "scaling axes":

Pretraining Efficiency — Meta says its rebuilt training recipe achieves the same capability level using less compute than Llama 4 Maverick, its previous model. The company claims this also makes Muse Spark more efficient than leading base models from competitors.

Reinforcement Learning Stability — Post-training RL showed smooth, log-linear improvement on both training and held-out evaluation data, which Meta says indicates the gains generalize reliably.

Test-Time Reasoning Compression — Rather than simply letting the model think longer, Meta penalizes excessive token use during RL training. This creates what it describes as a "phase transition" where the model learns to compress its reasoning, solving problems with fewer tokens before extending again for harder tasks.
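Meta did not disclose the form of its token penalty. As a minimal sketch of the general idea, assume a reward of 1 for a correct answer, minus a linear charge for every token spent beyond a fixed budget. The function name `shaped_reward`, the budget of 1,000 tokens and the per-token penalty are all illustrative assumptions, not figures from Meta.

```python
def shaped_reward(
    correct: bool,
    tokens_used: int,
    token_budget: int = 1000,          # assumed budget, not from Meta
    penalty_per_token: float = 0.001,  # assumed penalty rate, not from Meta
) -> float:
    """Task reward minus a linear penalty on tokens used beyond the budget."""
    base = 1.0 if correct else 0.0
    overage = max(0, tokens_used - token_budget)
    return base - penalty_per_token * overage


# A correct answer within budget keeps the full reward.
short_reward = shaped_reward(correct=True, tokens_used=800)
# The same correct answer with a longer reasoning trace earns less.
long_reward = shaped_reward(correct=True, tokens_used=1500)
print(short_reward, long_reward)
```

Under a reward shaped this way, two correct answers are no longer equally good: the shorter one scores higher, so training pressure favors compressed reasoning as long as accuracy holds.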

Health as a Flagship Use Case

Meta singled out health as a major application area.

The company collaborated with over 1,000 physicians to curate training data, and demonstrated Muse Spark generating interactive health displays, such as personalized dietary recommendations overlaid on food images with hover-based nutritional breakdowns.

Safety and a Notable Red Flag

Meta says Muse Spark passed safety evaluations across frontier risk categories. However, third-party evaluator Apollo Research found something unusual: the model showed the highest rate of "evaluation awareness" the firm has observed in any model. It frequently identified test scenarios as alignment traps and reasoned that it should behave honestly because it was being evaluated.

Meta acknowledged this warrants further research but said it was not a blocking concern for launch.

Meta's Bigger AI Picture

Meta's AI-related capital expenditures for 2026 are projected between $115 billion and $135 billion, nearly double last year's spending. Muse Spark is the first tangible output of that investment under Meta Superintelligence Labs (MSL). Whether it represents a turning point or merely a foundation, Meta's AI ambitions have decisively moved beyond Llama.

The model will roll out across Facebook, Instagram, WhatsApp, Messenger and Ray-Ban Meta smart glasses in the coming weeks.

About the Author
Michelle Hawley

Michelle Hawley is an experienced journalist who specializes in reporting on the impact of technology on society. As editorial director at Simpler Media Group, she oversees the day-to-day operations of VKTR, covering the world of enterprise AI and managing a network of contributing writers. She's also the host of CMSWire's CMO Circle and co-host of CMSWire's CX Decoded. With an MFA in creative writing and background in both news and marketing, she offers unique insights on the topics of tech disruption, corporate responsibility, changing AI legislation and more. She currently resides in Pennsylvania with her husband and two dogs.

Main image: Tada Images | Adobe Stock