ThisDayInAI
Today's Gold — Day's Top Story

Anthropic's Secret 'Claude Mythos' Leaked: A New AI Tier More Powerful Than Opus — With Unprecedented Cyber Risks

A data leak exposed Anthropic's most powerful AI yet — Claude Mythos (also called Capybara) — which the company says is "far ahead of any other AI model in cyber capabilities" and poses cybersecurity risks the industry has never seen before.

Anthropic's Best-Kept Secret Just Got Out

In a striking turn of events that underscores the tension between AI development speed and information security, Anthropic — the safety-focused AI lab founded by former OpenAI researchers — accidentally left thousands of unpublished internal documents in a publicly searchable data store this week. Among them: a detailed draft blog post announcing a model called Claude Mythos, described internally as "by far the most powerful AI model we've ever developed."

Fortune magazine, working alongside senior AI security researcher Roy Paz of LayerX Security and University of Cambridge cybersecurity researcher Alexandre Pauwels, reviewed the leaked material before Anthropic locked down the data store. The company confirmed the slip after being contacted by journalists.

"We're developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity. Given the strength of its capabilities, we're being deliberate about how we release it." — Anthropic spokesperson

A New Tier Above Opus

If you thought Anthropic's three-tier model lineup — Haiku, Sonnet, and Opus — was settled, think again. The leaked document reveals Anthropic is introducing an entirely new tier called Capybara, with Mythos being the first model in that class. This new tier sits above Opus, making it Anthropic's most capable and expensive offering yet.

"'Capybara' is a new name for a new tier of model: larger and more intelligent than our Opus models — which were, until now, our most powerful," the draft blog post reads.

The same document claims Mythos outperforms Claude Opus 4.6 — the current flagship — by dramatic margins on benchmarks covering software coding, academic reasoning, and cybersecurity tasks. To be clear, Capybara and Mythos are not two different models: Capybara is the name of the new tier, and Mythos is the designation of the first model within it.

The Cybersecurity Problem

Here's where it gets uncomfortable. Anthropic's own internal assessment concluded that Claude Mythos is "currently far ahead of any other AI model in cyber capabilities" and that its release "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."

In plain English: Anthropic built something so capable at finding and exploiting software vulnerabilities that they're genuinely worried about what happens when it gets into the wrong hands. The company is so concerned that its initial rollout strategy focuses exclusively on cyber defenders — organizations that can use Mythos to harden their systems before the broader wave of AI-driven exploits arrives.

This tracks with a broader industry trend. Just weeks earlier, OpenAI's GPT-5.3-Codex became the first model the company classified as "high capability" for cybersecurity under its Preparedness Framework. Meanwhile, Anthropic's Opus 4.6 had already been documented surfacing previously unknown vulnerabilities in production codebases — a capability the company called "dual-use."

Chinese State Hackers Already Tried

The stakes aren't hypothetical. Anthropic has previously disclosed that Chinese state-sponsored hacking groups ran coordinated campaigns using Claude Code to infiltrate approximately 30 organizations — including tech companies, financial institutions, and government agencies — before Anthropic detected and stopped the operation. A model like Mythos in hostile hands represents a qualitative leap beyond that threat.

What Leaked and How

The breach was classified as "human error" in the configuration of Anthropic's content management system. Close to 3,000 assets linked to Anthropic's blog were publicly accessible in the data cache, according to the cybersecurity researchers who reviewed the material. After Fortune notified the company, Anthropic disabled public search access to the data store — but not before the model's existence, name, positioning, and risk profile were documented and verified.
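The leaked documents don't specify which storage backend was misconfigured, but the failure mode described — a "publicly searchable" data store exposed by a configuration error — is a familiar one. As a hedged illustration (the bucket name, URL, and code below are hypothetical, not details from the leak), an S3-style object store that allows anonymous reads can be enumerated with a single unauthenticated GET, and its XML listing parsed in a few lines:

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by S3-style ListBucketResult XML responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"

def parse_bucket_listing(xml_text: str) -> list[str]:
    """Extract object keys from an S3-style ListBucketResult XML document."""
    root = ET.fromstring(xml_text)
    return [key.text for key in root.iter(f"{S3_NS}Key")]

def list_public_bucket(bucket_url: str) -> list[str]:
    """Fetch a bucket listing with no credentials attached.

    This only succeeds when the bucket is world-readable — i.e. exactly
    the kind of misconfiguration the article describes.
    """
    with urllib.request.urlopen(bucket_url) as resp:
        return parse_bucket_listing(resp.read().decode("utf-8"))
```

This is the class of check security researchers run routinely; the point is that once such a listing is public, every unpublished asset in it — draft blog posts included — is one request away.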

What Comes Next

Anthropic says Mythos is currently in limited testing with early access customers, is expensive to run, and is not ready for general release. The company is proceeding cautiously — a posture consistent with its stated safety-first approach, even as critics note the irony of a data security incident at the most prominent "safety-focused" AI lab in the world.

The incident raises real questions about how AI labs manage information security as the capabilities of their unreleased models become strategically — and competitively — critical. It also sets up what may be one of the most watched model launches of 2026.
