Leaked Documents Reveal Anthropic's Secret 'Claude Mythos' Model — Described as a 'Step Change' with Unprecedented Cybersecurity Risks — ThisDayInAI

Anthropic's Secret Weapon Exposed by Accident

It wasn't a hacker. It wasn't a disgruntled employee. It was a misconfigured content management system. And it may have just changed the AI industry's understanding of what Anthropic is quietly building.

On Thursday, March 26, Fortune reported that a cache of nearly 3,000 unpublished Anthropic assets had been left in a publicly accessible and searchable data store — including a draft blog post announcing a new model the company had been developing in strict secrecy. The model is called Claude Mythos, and if the leaked documents are to be believed, it's unlike anything the company has released before.

What the Leak Revealed

The draft blog post described Mythos as representing "a step change" in AI performance — a phrase Anthropic itself later confirmed in a statement. More provocatively, the company's own internal materials acknowledged that the model poses "unprecedented cybersecurity risks."

"We're developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity," an Anthropic spokesperson told Fortune. "Given the strength of its capabilities, we're being deliberate about how we release it."

The leaked documents also introduced a new naming tier: "Capybara" — described as a new class of model above Opus, Anthropic's current top tier. The draft stated: "'Capybara' is a new name for a new tier of model: larger and more intelligent than our Opus models — which were, until now, our most powerful." Capybara and Mythos appear to refer to the same underlying model, with different codenames used at different stages of development.

How It Was Found

The exposure was discovered independently by Roy Paz, a senior AI security researcher at LayerX Security, and Alexandre Pauwels, a cybersecurity researcher at the University of Cambridge. Fortune contacted Anthropic on Thursday, and the company removed public access to the data store shortly after.

In total, Pauwels estimated close to 3,000 assets linked to Anthropic's blog were publicly discoverable — none of which had been published on official company channels.

The Performance Claims

The leaked draft post made specific and sweeping claims about Mythos's capabilities:

Software coding: Dramatically higher scores than Claude Opus 4.6, which had only recently topped Terminal-Bench 2.0 at 65.4%, surpassing GPT-5.2-Codex.
Academic reasoning: Significant leaps on standardized academic benchmarks.
Cybersecurity: Unprecedented performance — and unprecedented risk. The company's own draft acknowledged this openly, a candor that raised eyebrows across the security community.

The Pentagon Angle

The timing of the leak is notable. Anthropic has been locked in a legal battle with the Trump administration over the Pentagon's attempt to use its AI models in autonomous weapons systems and mass surveillance — uses Anthropic explicitly refused to sanction. A federal judge sided with Anthropic just days before the leak broke. The revelation that Mythos poses "unprecedented cybersecurity risks" has reportedly drawn fresh attention from both security researchers and Pentagon officials alike, adding another layer of geopolitical complexity to the situation.

Anthropic's Response

The company characterized the incident as "human error" in configuring its CMS, and described the leaked materials as "early drafts of content considered for publication." It confirmed the model's existence and that a "small group of early access customers" is currently testing it.

Notably, Anthropic said it considers Mythos "the most capable we've built to date" — a phrase that, in the current competitive environment, carries enormous weight. Whether the unplanned reveal accelerates or delays a public launch remains to be seen.