Claude Mythos Leaked: Anthropic's Most Powerful AI Is Too Dangerous for Public Release

A data leak exposed Anthropic's next-generation AI model weeks before its official announcement. Claude Mythos — codenamed Capybara — achieves 93.9% on SWE-bench, saturates Cybench, and autonomously discovers thousands of zero-day vulnerabilities. Anthropic says it won't be publicly released.


The accidental reveal

On March 26, 2026, security researchers Roy Paz (LayerX Security) and Alexandre Pauwels (University of Cambridge) made an unexpected discovery: approximately 3,000 unpublished Anthropic assets had been left in an unsecured, publicly searchable data cache. Among them was a draft blog post announcing a new AI model named Claude Mythos.

Fortune notified Anthropic the same day. The company quickly removed public access to the data store, attributing the exposure to "human error" in its content management system configuration.

Twelve days later, on April 7, 2026, Anthropic made it official.


What is Claude Mythos?

Claude Mythos Preview is Anthropic's most capable model to date — and the first member of a new, fourth tier the company is calling Capybara, sitting above the existing Haiku, Sonnet, and Opus tiers.

Anthropic describes it not as an incremental improvement but as a "step change" in capabilities. The leaked draft described its scores as "dramatically higher" than Claude Opus 4.6's across every benchmark measured.

Benchmark performance

Benchmark             Claude Mythos        Previous best
SWE-bench Verified    93.9%                80.8%
SWE-bench Pro         77.8%                57.7%
GPQA Diamond          94.6%
USAMO 2026            97.6%
CyberGym              83.1%                66.6% (Opus 4.6)
Cybench               100% (saturated)
OSWorld               79.6%

The model leads on 17 of the 18 benchmarks Anthropic measured. On USAMO 2026 (the USA Mathematical Olympiad), it posted a 55-point improvement over Opus 4.6.


Why it won't be publicly released

The reason Anthropic is keeping Mythos restricted comes down to one word: cybersecurity.

The company's 244-page System Card found that Mythos exhibits "cyber capabilities currently far ahead of any other AI model." Specifically, the model can:

  • Autonomously discover zero-day vulnerabilities in production software
  • Develop working exploits without human intervention
  • Chain multiple vulnerabilities together for complex attack paths
  • Reverse-engineer closed-source binaries

Anthropic wrote directly in the System Card: making Mythos generally available would make "large-scale AI-driven cyberattacks far more likely." The company briefed U.S. government officials before proceeding.

"We have seen Mythos Preview write exploits in hours that expert penetration testers said would have taken them weeks to develop." — Anthropic red team report


The data leak's significance

The accidental exposure raised questions Anthropic had to answer quickly:

Who saw it? The data store was publicly searchable, meaning any automated crawler or security researcher could have indexed the draft blog post — and possibly the technical details within it.

Was more exposed? Anthropic has not confirmed whether any exploit code, vulnerability details, or model weights were part of the 3,000 exposed assets. The company said the exposure was limited to "content management assets."

Could competitors benefit? A draft blog post describing a model's capabilities doesn't reveal its architecture or training procedure — but it does give competitors a roadmap of what benchmarks to prioritize.


Access and pricing

Mythos is available only to vetted participants in Anthropic's Project Glasswing initiative:

  • 12 charter partners: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks, and Anthropic itself
  • 40+ additional organizations maintaining critical open-source software
  • Platforms: Claude API, Amazon Bedrock (US East), Google Cloud Vertex AI, Microsoft Foundry

Pricing is $25 per million input tokens and $125 per million output tokens — five times the cost of Claude Opus 4.6. Partners receive access to a $100 million credit pool during the research preview period.
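As a rough sketch of what those rates imply in practice — the per-token prices are from the announcement, but the workload in the example is hypothetical:

```python
# Claude Mythos research-preview pricing, per the announcement:
# $25 per million input tokens, $125 per million output tokens.
INPUT_RATE = 25 / 1_000_000    # dollars per input token
OUTPUT_RATE = 125 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in dollars for a single API request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical workload: a 50k-token codebase prompt with an 8k-token reply.
cost = request_cost(50_000, 8_000)
print(f"${cost:.2f}")  # prints $2.25
```

At these rates, even a modest vulnerability-hunting run over a large codebase adds up quickly, which is presumably what the $100 million credit pool is meant to absorb.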

There is no waitlist, no public API access, and no timeline for broader availability.


How Anthropic framed the announcement

The April 7 announcement was careful to position Mythos as a defensive tool in a coming security crisis, not a product launch.

"After navigating the transition to the Internet in the early 2000s, we have spent the last twenty years in a relatively stable security equilibrium," Anthropic wrote. The company argues Mythos represents a disruption to that equilibrium — and that the right response is to use it to patch vulnerabilities before adversaries gain comparable capabilities.

The historical parallel offered: software fuzzers. When automated fuzz testing emerged, it initially raised concerns about enabling attackers. Today, fuzzing powers projects like OSS-Fuzz, which has found over 10,000 vulnerabilities in critical open-source software. Anthropic is betting Mythos will follow the same arc.


The road ahead

Anthropic has committed:

  • $100M in Mythos usage credits for Project Glasswing
  • $2.5M to Alpha-Omega and the Open Source Security Foundation (OpenSSF)
  • $1.5M to the Apache Software Foundation
  • Public disclosure of discovered vulnerabilities within 90 days of vendor notification
  • Cryptographic hashes of all vulnerability/exploit documents published now, enabling future verification of possession
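The hash-commitment idea in the last bullet can be sketched in a few lines. The report contents below are placeholders, and SHA-256 is an assumption — Anthropic has not said which hash function it uses:

```python
import hashlib

def commit(document: bytes) -> str:
    """Publish this digest now; it reveals nothing about the document."""
    return hashlib.sha256(document).hexdigest()

def verify(document: bytes, published_digest: str) -> bool:
    """Later, releasing the document lets anyone confirm it existed
    (unmodified) at the time the digest was published."""
    return hashlib.sha256(document).hexdigest() == published_digest

# Hypothetical vulnerability report, committed today, revealed after 90 days.
report = b"CVE-XXXX-YYYY: heap overflow in example parser"
digest = commit(report)

assert verify(report, digest)
assert not verify(b"a different document", digest)
```

The design choice is the point: publishing only hashes lets Anthropic later prove it possessed a given vulnerability writeup on a given date, without disclosing exploit details before vendors have patched.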

The company also plans to launch "new safeguards with an upcoming Claude Opus model" before Mythos gets any broader distribution — suggesting internal work on capability controls is ongoing.

For now, Claude Mythos remains what may be the most significant unreleased AI model in history: technically accessible to a small group of tech giants, and off-limits to everyone else.



webencher Editorial

Software engineers and technical writers with 10+ years of combined experience in algorithms, systems design, and web development. Every article is reviewed for accuracy, depth, and practical applicability.
