Claude Mythos Leaked: Anthropic's Most Powerful AI Is Too Dangerous for Public Release

A data leak exposed Anthropic's next-generation AI model weeks before its official announcement. Claude Mythos — codenamed Capybara — achieves 93.9% on SWE-bench, saturates Cybench, and autonomously discovers thousands of zero-day vulnerabilities. Anthropic says it won't be publicly released.


The accidental reveal

On March 26, 2026, security researchers Roy Paz (LayerX Security) and Alexandre Pauwels (University of Cambridge) made an unexpected discovery: approximately 3,000 unpublished Anthropic assets had been left in an unsecured, publicly searchable data cache. Among them was a draft blog post announcing a new AI model named Claude Mythos.

Fortune notified Anthropic the same day. The company quickly removed public access to the data store, attributing the exposure to "human error" in its content management system configuration.

Twelve days later, on April 7, 2026, Anthropic made it official.


What is Claude Mythos?

Claude Mythos Preview is Anthropic's most capable model to date — and the first member of a new, fourth tier the company is calling Capybara, sitting above the existing Haiku, Sonnet, and Opus tiers.

Anthropic describes it not as an incremental improvement but as a "step change" in capabilities. The leaked draft described its scores as "dramatically higher" than Claude Opus 4.6's across every benchmark measured.

Benchmark performance

Benchmark             Claude Mythos        Previous best
SWE-bench Verified    93.9%                80.8%
SWE-bench Pro         77.8%                57.7%
GPQA Diamond          94.6%
USAMO 2026            97.6%
CyberGym              83.1%                66.6% (Opus 4.6)
Cybench               100% (saturated)
OSWorld               79.6%

The model leads on 17 of the 18 benchmarks Anthropic measured. On USAMO 2026 (the USA Mathematical Olympiad), it posted a 55-point improvement over Opus 4.6.


Why it won't be publicly released

The reason Anthropic is keeping Mythos restricted comes down to one word: cybersecurity.

The company's 244-page System Card found that Mythos exhibits "cyber capabilities currently far ahead of any other AI model." Specifically, the model can:

  • Autonomously discover zero-day vulnerabilities in production software
  • Develop working exploits without human intervention
  • Chain multiple vulnerabilities together for complex attack paths
  • Reverse-engineer closed-source binaries

Anthropic wrote directly in the System Card: making Mythos generally available would make "large-scale AI-driven cyberattacks far more likely." The company briefed U.S. government officials before proceeding.

"We have seen Mythos Preview write exploits in hours that expert penetration testers said would have taken them weeks to develop." — Anthropic red team report


The data leak's significance

The accidental exposure raised questions Anthropic had to answer quickly:

Who saw it? The data store was publicly searchable, meaning any automated crawler or security researcher could have indexed the draft blog post — and possibly the technical details within it.

Was more exposed? Anthropic has not confirmed whether any exploit code, vulnerability details, or model weights were part of the 3,000 exposed assets. The company said the exposure was limited to "content management assets."

Could competitors benefit? A draft blog post describing a model's capabilities doesn't reveal its architecture or training procedure — but it does give competitors a roadmap of what benchmarks to prioritize.


Access and pricing

Mythos is available only to vetted participants in Anthropic's Project Glasswing initiative:

  • 12 charter partners: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks, and Anthropic itself
  • 40+ additional organizations maintaining critical open-source software
  • Platforms: Claude API, Amazon Bedrock (US East), Google Cloud Vertex AI, Microsoft Foundry

Pricing is $25 per million input tokens and $125 per million output tokens — five times the cost of Claude Opus 4.6. Partners receive access to a $100 million credit pool during the research preview period.
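As a rough sketch of what those rates imply in practice — the per-token prices are from the announcement, but the workload in the example is hypothetical:

```python
# Claude Mythos research-preview pricing, per the announcement:
# $25 per million input tokens, $125 per million output tokens.
INPUT_RATE = 25 / 1_000_000    # dollars per input token
OUTPUT_RATE = 125 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in dollars for a single API request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical workload: a 50k-token codebase prompt with an 8k-token reply.
cost = request_cost(50_000, 8_000)
print(f"${cost:.2f}")  # prints $2.25
```

At these rates, even a modest vulnerability-hunting run over a large codebase adds up quickly, which is presumably what the $100 million credit pool is meant to absorb.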

There is no waitlist, no public API access, and no timeline for broader availability.


How Anthropic framed the announcement

The April 7 announcement was careful to position Mythos as a defensive tool in a coming security crisis, not a product launch.

"After navigating the transition to the Internet in the early 2000s, we have spent the last twenty years in a relatively stable security equilibrium," Anthropic wrote. The company argues Mythos represents a disruption to that equilibrium — and that the right response is to use it to patch vulnerabilities before adversaries gain comparable capabilities.

The historical parallel offered: software fuzzers. When automated fuzz testing emerged, it initially raised concerns about enabling attackers. Today, fuzzing powers projects like OSS-Fuzz, which has found over 10,000 vulnerabilities in critical open-source software. Anthropic is betting Mythos will follow the same arc.


The road ahead

Anthropic has committed:

  • $100M in Mythos usage credits for Project Glasswing
  • $2.5M to Alpha-Omega and the Open Source Security Foundation (OpenSSF)
  • $1.5M to the Apache Software Foundation
  • Public disclosure of discovered vulnerabilities within 90 days of vendor notification
  • Cryptographic hashes of all vulnerability/exploit documents published now, enabling future verification of possession
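The hash-commitment idea in the last bullet can be sketched in a few lines. The report contents below are placeholders, and SHA-256 is an assumption — Anthropic has not said which hash function it uses:

```python
import hashlib

def commit(document: bytes) -> str:
    """Publish this digest now; it reveals nothing about the document."""
    return hashlib.sha256(document).hexdigest()

def verify(document: bytes, published_digest: str) -> bool:
    """Later, releasing the document lets anyone confirm it existed
    (unmodified) at the time the digest was published."""
    return hashlib.sha256(document).hexdigest() == published_digest

# Hypothetical vulnerability report, committed today, revealed after 90 days.
report = b"CVE-XXXX-YYYY: heap overflow in example parser"
digest = commit(report)

assert verify(report, digest)
assert not verify(b"a different document", digest)
```

The design choice is the point: publishing only hashes lets Anthropic later prove it possessed a given vulnerability writeup on a given date, without disclosing exploit details before vendors have patched.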

The company also plans to launch "new safeguards with an upcoming Claude Opus model" before Mythos gets any broader distribution — suggesting internal work on capability controls is ongoing.

For now, Claude Mythos remains what may be the most significant unreleased AI model in history: technically accessible to a small group of tech giants, and off-limits to everyone else.



webencher Editorial

Software engineers and technical writers with 10+ years of combined experience in algorithms, systems design, and web development. Every article is reviewed for accuracy, depth, and practical applicability.
