Anthropic built elaborate safeguards around Claude Mythos Preview, an AI model so capable of finding and exploiting software vulnerabilities that the company refused to release it publicly. Instead, it shared access with just a dozen trusted enterprise partners. Now, according to a Bloomberg report, that carefully controlled rollout may already be compromised.
An unauthorized group reportedly gained access to Mythos Preview on the same day the initiative was publicly announced — not through a sophisticated hack, but through what Bloomberg describes as an educated guess about the model's location, combined with access held by a current employee at one of Anthropic's third-party contractors.
"We're investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments," an Anthropic official told TechCrunch. The company said it has found no evidence the activity has impacted Anthropic's own systems.
Table of Contents
- How the Breach Reportedly Happened
- Why Mythos Preview Is Different From Other AI Models
- Project Glasswing: A Controlled Release Now Under Scrutiny
- The Bigger Picture: AI and the End-to-End Cyberattack
- What Happens Next
How the Breach Reportedly Happened
The group, whose members have not been publicly identified, operates within a Discord channel focused on tracking and accessing unreleased AI models. According to Bloomberg, they attempted multiple strategies to reach Mythos Preview before succeeding — ultimately leveraging access belonging to a source who works at a contractor that provides services to Anthropic.
Once inside, the group allegedly used the model regularly and provided Bloomberg with screenshots and a live demonstration as evidence. Their stated motivation, according to the source, was curiosity rather than malice: they said they were interested in experimenting with new models, not using them to cause harm.
The timing is notable. The group purportedly gained access on the same day Project Glasswing — Anthropic's controlled cybersecurity initiative built around Mythos Preview — was publicly announced, with the announcement itself potentially providing clues about the model's deployment architecture.
Related Article: Anthropic Launches Project Glasswing to Fix Cybersecurity's Blind Spots
Why Mythos Preview Is Different From Other AI Models
The potential consequences of this access are not theoretical. Mythos Preview is not a general-purpose chatbot. It’s an AI system that Anthropic's own evaluations show can autonomously discover and exploit vulnerabilities at a scale and severity no prior model has approached.
Among the model's documented capabilities:
- Autonomously discovered thousands of zero-day vulnerabilities across every major operating system and browser — without human direction
- Uncovered a 27-year-old flaw in OpenBSD and a 16-year-old FFmpeg bug that survived five million automated test passes
- Chained together multiple Linux kernel vulnerabilities to achieve full system takeover
- Developed 181 working exploits against a patched Firefox JavaScript engine in benchmark testing, versus just two produced by Anthropic's current flagship model
- Completed the full pipeline from a CVE identifier to a functional root exploit in under a day, at a cost of less than $2,000
Anthropic has said these capabilities were not explicitly trained into the model — they emerged as a byproduct of general improvements in reasoning, code generation and autonomous task completion.
Project Glasswing: A Controlled Release Now Under Scrutiny
The access concern strikes at the heart of Project Glasswing, Anthropic's initiative to use Mythos Preview defensively. The program was named after the glasswing butterfly and structured deliberately to keep the model out of the wrong hands.
Anthropic committed up to $100 million in usage credits and restricted access to a small cohort of enterprise partners. The twelve initial partners were:
| Partner | Sector |
|---|---|
| Amazon Web Services | Cloud Infrastructure |
| Apple | Consumer Technology |
| Anthropic | AI Research |
| Broadcom | Semiconductors / Software |
| Cisco | Networking / Security |
| CrowdStrike | Cybersecurity |
| | Cloud / AI |
| JPMorganChase | Financial Services |
| The Linux Foundation | Open-Source Infrastructure |
| Microsoft | Cloud / Enterprise Software |
| NVIDIA | Semiconductors / AI Hardware |
| Palo Alto Networks | Cybersecurity |
Access was also extended to more than 40 additional organizations that build or maintain software infrastructure — meaning the attack surface for third-party credential exposure was broader than the headline partner list suggests.
The Bigger Picture: AI and the End-to-End Cyberattack
The reported breach arrives as researchers are only beginning to document what models like Mythos Preview can actually do in adversarial hands. Work from Carnegie Mellon University's Incalmo Project demonstrated that large language models, when integrated with cybersecurity tooling, can conduct complete attack sequences autonomously, including installing malware and moving laterally across networks.
"Suddenly, the LLM was able to do an end-to-end attack, install malware on hosts and infect multiple hosts throughout the network," said Brian Singer, a Carnegie Mellon University researcher.
Singer was careful to note that this capability is still limited in scope — functional against roughly 40 known network configurations, but not yet adaptable to the full complexity of real-world enterprise environments. Anthropic officials said they worked directly with CMU to understand the implications of that research.
The financial stakes are also significant. Estimates place annual global cybercrime costs at approximately $500 billion, a figure that has only increased as software ecosystems grow more interconnected and complex.
Related Article: Anthropic's Mythos AI Discovers Thousands of Zero-Days
What Happens Next
Anthropic said its investigation is ongoing. The company has not confirmed exactly how access was obtained, whether that access has been revoked, or whether the unauthorized group used any of the model's offensive capabilities during their reported sessions.
For an initiative premised on controlled, trusted deployment, the incident raises immediate questions: How robust is the vendor vetting process for third-party contractors granted proximity to restricted models? What technical controls prevent shared credentials from translating into model access?
Anthropic has pledged to publish Glasswing findings within 90 days and warned in its launch statement that "frontier AI developers, other software companies, security researchers, open-source maintainers and governments across the world all have essential roles to play" in getting ahead of the threat — a message that lands differently today than it did when it was written.