Inside Claude Mythos: How Anthropic's AI Found Thousands of Zero-Days Across Every Major OS
Claude Mythos didn't just score well on benchmarks — it autonomously discovered thousands of critical vulnerabilities in every major operating system and browser. This deep dive explains how it works: from hypothesizing bugs in code to chaining four exploits into a full sandbox escape.
Table of Contents
- A new kind of capability
- How the system works
- The discoveries: specific examples
- 1. OpenBSD TCP SACK — 27-year-old crash bug
- 2. FFmpeg — 16-year-old bug missed by 5 million tests
- 3. FreeBSD NFS — Remote Code Execution as Root (CVE-2026-4747)
- 4. Firefox browser — Four-vulnerability sandbox escape chain
- Scale of findings
- What Mythos can't do (yet)
- The implications
A new kind of capability
Most AI models get better at benchmarks gradually. Claude Mythos Preview didn't — it made a discontinuous jump in one specific domain: software security.
On CyberGym, the leading benchmark for AI-assisted vulnerability reproduction, Mythos scored 83.1%, against 66.6% for Claude Opus 4.6. The 16.5-point gap understates the change, which is qualitative rather than quantitative: Mythos converts 72.4% of identified vulnerabilities into working exploits, while previous Claude models failed at exploit development almost every time.
More critically: Mythos discovered these vulnerabilities autonomously, with no human guidance after the initial setup.
How the system works
Anthropic deployed Mythos inside a custom agentic scaffold — a containerized testing environment with:
- File-level parallelization: separate agents assigned to different source files simultaneously
- Pre-ranked vulnerability likelihood: each file scored 1–5 for likely bug density before analysis
- Secondary validation agent: filters trivial or duplicate findings before human review
- Tool access: debuggers, sanitizers, fuzzing harnesses, and exploit development utilities
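The first three scaffold components can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Anthropic's implementation: `analyze_file` is a hypothetical stub standing in for one agent run, the file paths and finding labels are invented, and skipping files below `min_score` is an assumption (the write-up only says each file is scored 1–5).

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str
    line: int
    kind: str  # illustrative label, e.g. "stack-overflow"


def analyze_file(path: str) -> list[Finding]:
    """Hypothetical stub for one agent analysing one source file."""
    canned = {
        "net/tcp_sack.c": [
            Finding("net/tcp_sack.c", 120, "integer-overflow"),
            Finding("net/tcp_sack.c", 120, "integer-overflow"),  # duplicate report
        ],
        "fs/nfs_auth.c": [Finding("fs/nfs_auth.c", 88, "stack-overflow")],
    }
    return canned.get(path, [])


def run_scaffold(scores: dict[str, int], min_score: int = 3, workers: int = 4) -> list[Finding]:
    # Pre-ranking: keep files at or above the threshold, highest score first.
    queue = sorted(
        (p for p, s in scores.items() if s >= min_score),
        key=lambda p: scores[p],
        reverse=True,
    )
    # File-level parallelization: one task per file; map() preserves queue order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_file = list(pool.map(analyze_file, queue))
    # Secondary validation (simplified): drop exact duplicates, keep first-seen order.
    return list(dict.fromkeys(f for batch in per_file for f in batch))


scores = {"net/tcp_sack.c": 5, "fs/nfs_auth.c": 4, "docs/example.c": 1}
findings = run_scaffold(scores)
```

Here `findings` holds two unique entries: the low-scored file is never analysed and the duplicate report is dropped. A real validation agent would judge triviality as well as duplication, which a set-style filter cannot do.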
The model's workflow for each target roughly follows this pattern:
- Code inspection — read source files, build a mental model of the codebase
- Hypothesis generation — identify data flows that could lead to memory corruption, logic errors, or authentication bypasses
- Experiment design — write test cases or crafted inputs to confirm the hypothesis
- Debugging — use sanitizers (ASan, UBSan) and debuggers to observe program state
- Exploit development — turn a confirmed bug into a working proof-of-concept
This is the same loop a skilled human security researcher follows. Mythos runs it faster, more cheaply, and in parallel.
The discoveries: specific examples
1. OpenBSD TCP SACK — 27-year-old crash bug
The vulnerability was introduced in OpenBSD's TCP stack in 1998, when the original developer implemented TCP Selective Acknowledgement (SACK) support.
Root cause: A signed integer overflow in the sequence number comparison routine, combined with improper bounds checking in the SACK window management code. TCP sequence numbers are 32-bit unsigned integers that wrap around. When Mythos examined the comparison logic, it identified that the code treated them as signed in one critical path, so the arithmetic overflowed once values crossed 2³¹, the point at which a signed 32-bit interpretation turns negative.
Impact: Sending a specially crafted sequence of TCP packets causes a remote machine crash — a denial-of-service attack requiring no authentication.
Why it survived 27 years: The interaction between SACK window management and 32-bit integer wraparound only manifests under specific conditions. Automated fuzzers generate random inputs; they rarely produce the precise sequence needed to trigger this exact integer boundary. Mythos reasoned about the code semantics and constructed the trigger deliberately.
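To make the signed-versus-unsigned issue concrete, here is a minimal Python sketch of the general technique of comparing wrapping 32-bit sequence numbers through a signed view of their difference. It is not the OpenBSD code and does not reproduce the bug; the helper names `to_int32` and `seq_lt` are chosen for illustration. It only shows why such comparisons stop behaving like a normal ordering once two values are more than 2³¹ apart.

```python
MASK32 = 0xFFFFFFFF


def to_int32(x: int) -> int:
    """Reinterpret the low 32 bits of x as a signed two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def seq_lt(a: int, b: int) -> bool:
    """'a comes before b', decided by the signed view of the 32-bit difference."""
    return to_int32(a - b) < 0


# Wraparound past 2**32 is handled as intended.
near_wrap = seq_lt(0xFFFFFFF0, 0x00000010)

# The answer flips once the two values are more than 2**31 apart.
just_inside = seq_lt(0, 0x7FFFFFFF)
just_outside = seq_lt(0, 0x80000001)

# The relation is not transitive: a < b, b < c, and yet c < a.
a, b, c = 0x00000000, 0x60000000, 0xC0000000
cycle = seq_lt(a, b) and seq_lt(b, c) and seq_lt(c, a)
```

`near_wrap` and `just_inside` are true, `just_outside` is false, and `cycle` is true. Any bounds check that assumes these comparisons form a linear order can therefore be satisfied by values that are not actually inside the intended window, which is why a random input generator is unlikely to land on the boundary and a reader of the code can aim for it directly.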
2. FFmpeg — 16-year-old bug missed by 5 million tests
FFmpeg is the open-source multimedia library powering YouTube, VLC, and thousands of other applications. It has been continuously fuzz-tested since at least 2016; OSS-Fuzz has run over 5 million test cases against it.
Mythos found a bug that all of them missed.
Root cause: A logic error in a codec demuxer's timestamp normalization function, introduced approximately 16 years ago. The bug doesn't cause a crash — it causes incorrect behaviour under specific codec configurations that no automated test happened to exercise.
Why fuzzers missed it: Traditional fuzzing is coverage-guided. It generates inputs that exercise new code paths. This bug sits in code that is frequently executed — the path wasn't new — but only triggers when two specific conditions hold simultaneously: a particular codec type combined with a specific timestamp arithmetic edge case. Mythos identified both conditions by reading the code.
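The point about coverage can be shown with a toy example. The sketch below is hypothetical and is not FFmpeg code: `normalize_pts`, the `codec_allows_negative` flag, and the choice of edge case are all assumptions made for illustration. It emulates 64-bit wraparound so that a benign input and the edge-case input take exactly the same branch, which leaves a coverage-guided fuzzer with no signal that anything new happened.

```python
INT64_MIN = -(1 << 63)


def wrap_int64(x: int) -> int:
    """Emulate C-style 64-bit two's-complement wraparound."""
    x &= (1 << 64) - 1
    return x - (1 << 64) if x >= (1 << 63) else x


def normalize_pts(codec_allows_negative: bool, pts: int) -> tuple[int, frozenset[str]]:
    """Toy normalizer returning (result, branches taken)."""
    branches = set()
    if codec_allows_negative and pts < 0:
        branches.add("negate")
        pts = wrap_int64(-pts)  # intended to yield a non-negative value
    else:
        branches.add("passthrough")
    return pts, frozenset(branches)


ok_value, ok_cov = normalize_pts(True, -5)
bad_value, bad_cov = normalize_pts(True, INT64_MIN)
same_coverage = ok_cov == bad_cov
```

`ok_value` is 5, while `bad_value` is still `INT64_MIN`: the negation wraps back onto itself and the result stays negative. Both calls report the same branch set, so only an input that combines the right flag with one exact value out of 2⁶⁴ exposes the wrong result, and nothing crashes when it does.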
3. FreeBSD NFS — Remote Code Execution as Root (CVE-2026-4747)
This is the most severe finding: a 17-year-old stack buffer overflow in FreeBSD's implementation of RFC 2203 RPCSEC_GSS authentication, enabling unauthenticated remote code execution as root.
Technical details:
- A 304-byte attacker-controlled string overflows a 128-byte stack buffer
- The overflow target lacks stack canaries in the affected compilation unit
- Mythos developed a 20-gadget ROP (Return-Oriented Programming) chain, splitting it across six sequential RPC packets to stay under per-packet size limits
- Kernel base address is leaked via an unauthenticated NFSv4 EXCHANGE_ID call, defeating ASLR
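The first bullet is simple arithmetic: a 304-byte input copied into a 128-byte buffer writes 176 bytes past its end. The sketch below, which is illustrative Python and not FreeBSD code, computes that figure and shows the kind of length check whose absence defines this bug class. The function names are hypothetical.

```python
BUF_SIZE = 128   # stack buffer size, as reported in the write-up
INPUT_LEN = 304  # attacker-controlled length, as reported in the write-up


def overflow_bytes(input_len: int, buf_size: int) -> int:
    """Bytes an unchecked copy would write past the end of the buffer."""
    return max(0, input_len - buf_size)


def checked_copy(dst: bytearray, src: bytes) -> None:
    """Copy src into dst, rejecting input longer than the destination."""
    if len(src) > len(dst):
        raise ValueError(f"input of {len(src)} bytes exceeds {len(dst)}-byte buffer")
    dst[: len(src)] = src


spill = overflow_bytes(INPUT_LEN, BUF_SIZE)
```

`spill` is 176. With the check in place, the oversized input is rejected before any byte is written.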
The complete exploit — from initial reconnaissance to root shell — required no human input and cost under $50 in compute at Mythos's pricing.
"Mythos Preview fully autonomously identified and then exploited a 17-year-old remote code execution vulnerability in FreeBSD that allows anyone to gain root on a machine running NFS." — Anthropic red team report
4. Firefox browser — Four-vulnerability sandbox escape chain
The most technically sophisticated finding involved chaining four separate vulnerabilities in Firefox's JavaScript JIT compiler to escape both the renderer process sandbox and the OS-level sandbox.
The chain:
- JIT read primitive — exploit a type confusion in the JS engine to read arbitrary memory
- Heap spray — use the read primitive to locate kernel objects
- JIT write primitive — a second JIT vulnerability allowing controlled writes
- Credential manipulation — overwrite kernel process credentials to escalate privileges
Each individual bug is rated medium severity in isolation. Chained together, they produce a full compromise of the browser host from a malicious web page. Mythos discovered all four and constructed the chain autonomously.
Scale of findings
Across Anthropic's research period:
- Thousands of high- and critical-severity vulnerabilities found across all major OSes and browsers
- 595 crashes at severity tiers 1–2 on the OSS-Fuzz corpus vs. Opus 4.6's 150–175
- 181 working exploits developed for Firefox's JS engine (Opus 4.6: 2)
- 29 register-control instances — a stage required for reliable RCE exploit construction
- 89% severity accuracy: of 198 manually reviewed reports, 89% matched Mythos's own severity assessment exactly; 98% were within one severity level
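For scale, 89% of 198 reports is roughly 176 exact matches, and 98% is roughly 194 within one level. The two agreement figures can be computed as in the sketch below. The function name and the ten sample ratings are invented for illustration and are not data from the report.

```python
def severity_agreement(model: list[int], reviewer: list[int]) -> tuple[float, float]:
    """Return (exact-match rate, within-one-level rate) for paired severity ratings."""
    if not model or len(model) != len(reviewer):
        raise ValueError("need two non-empty lists of equal length")
    pairs = list(zip(model, reviewer))
    exact = sum(m == r for m, r in pairs)
    within_one = sum(abs(m - r) <= 1 for m, r in pairs)
    return exact / len(pairs), within_one / len(pairs)


# Made-up ratings on a 1-5 scale, for illustration only.
model_ratings = [1, 2, 2, 3, 4, 1, 3, 2, 5, 4]
reviewer_ratings = [1, 2, 3, 3, 4, 1, 3, 2, 3, 4]
exact_rate, within_one_rate = severity_agreement(model_ratings, reviewer_ratings)

# Approximate report counts implied by the published percentages.
approx_exact = round(0.89 * 198)
approx_within_one = round(0.98 * 198)
```

On the sample data the exact-match rate is 0.8 and the within-one rate is 0.9. The within-one figure is always at least as high as the exact figure, since every exact match also counts as within one level.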
What Mythos can't do (yet)
Anthropic was careful to note capabilities that remain limited:
- Logic bugs in web applications: authentication bypasses and broken authorization in web apps remain harder for the model than memory-corruption bugs
- Cryptographic vulnerabilities: subtle flaws in cryptographic implementations require deep mathematical reasoning that current models handle inconsistently
- Novel attack techniques: the model primarily applies known classes of vulnerabilities; inventing genuinely new exploit primitives is rare
The implications
That this capability emerged without being trained for is the most unsettling aspect of this announcement:
"We did not explicitly train Mythos Preview to have these capabilities. Rather, they emerged as a downstream consequence of general improvements in code, reasoning, and autonomy."
This means vulnerability discovery isn't a feature Anthropic built — it's a consequence of building a smarter, more capable general-purpose model. As models continue to improve, these capabilities will deepen whether or not any lab intends them to.
Anthropic's response is to give defenders a head start. Whether a six-week window before similar capabilities reach open-source models is enough time to patch critical software is a question the security community is now urgently debating.

