Claude Fable 5 is here: what's real, and what to own

By Dave (Author)
8 min read

Updated June 9, 2026: Claude Fable 5 is now live. The original version of this post was written hours before launch and carefully separated Anthropic's confirmed facts from the rumors. Here's the updated version: the bets resolved almost exactly as the prediction markets called them, with the circulated benchmark figures as the main exception.

Anthropic has a model that found a 27-year-old bug in OpenBSD, chained four vulnerabilities into a working browser exploit, and got root on a FreeBSD server. Its public, safeguarded version went generally available today as Claude Fable 5. If you build software, that's good news — a much stronger engine for everything you ship.

So here's the honest rundown: what Anthropic actually shipped, how the pre-launch rumors resolved, and the one move that pays off now that a model this strong is in everyone's hands.

What Anthropic actually shipped

Claude Fable 5 is, in Anthropic's words, "a Mythos-class model that we've made safe for general use." It's the public version of Claude Mythos — the model that's strikingly strong at finding and exploiting software vulnerabilities. The figures Anthropic published for Mythos aren't tidy benchmark percentages — they're real exploits:

  • It turned vulnerabilities in Firefox's JavaScript engine into working exploits 181 times. The same experiment on Opus 4.6 produced working exploits twice.
  • It found a 27-year-old bug in OpenBSD, wrote a remote-code-execution exploit on FreeBSD's NFS server granting unauthenticated root, and achieved full control-flow hijack on ten separate, fully patched targets.

Two models shipped today, not one. Fable 5 is the safeguarded model anyone can use right now: claude-fable-5 via the API, on claude.ai, and across the major clouds. Mythos 5 is the same underlying model with the safeguards lifted in some areas, still restricted to Project Glasswing cyber partners and select researchers. Within that program, partners have already found 10,000+ high- and critical-severity vulnerabilities.

How the pre-launch rumors resolved

Before launch, plenty was circulating as fact that Anthropic hadn't confirmed. Here's how each bet landed:

| Pre-launch claim | How it resolved |
| --- | --- |
| Releases June 9, 2026 | Confirmed. Shipped today. |
| Public name is "Claude Fable 5" | Confirmed. That's the name. |
| Specific pricing | Confirmed. $10 per million input tokens, $50 per million output tokens, with a 90% prompt-caching discount. |
| The "93.9% SWE-bench, 94.6% GPQA" numbers | Partly off. Anthropic has now published its table: 80.3% on SWE-Bench Pro and 88.0% on Terminal-Bench 2.1. Strong, but not the figures that circulated. The "93.9% SWE-bench Verified" number people screenshotted isn't on Anthropic's table at all; the table reports SWE-Bench Pro, a harder test. |

The pattern held. The claims tied to a real, datable plan (the date, the name, the price) were right. The precise benchmark numbers people screenshotted were the part to be careful with: the 93.9% SWE-bench Verified figure isn't on Anthropic's table at all, and the scores that are published come in lower than the screenshots claimed. If a figure isn't on the vendor's own page, don't build on it as fact.
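
To make the confirmed pricing concrete, here is a minimal cost sketch using the $10 and $50 per-million-token figures from the table. One assumption to flag loudly: the post only says there is a 90% prompt-caching discount, so the sketch assumes it applies to input tokens served from cache. The `Usage` shape and `estimateCostUsd` name are illustrative, not part of any SDK.

```typescript
// Prices quoted in this post, in USD per million tokens.
const INPUT_USD_PER_MTOK = 10;
const OUTPUT_USD_PER_MTOK = 50;
// ASSUMPTION: the 90% prompt-caching discount applies to cached input tokens,
// so they are billed at 10% of the normal input price.
const CACHED_INPUT_MULTIPLIER = 0.1;

interface Usage {
  inputTokens: number;       // input tokens not served from cache
  cachedInputTokens: number; // input tokens served from the prompt cache
  outputTokens: number;
}

// Returns the estimated cost of one request in USD.
function estimateCostUsd(usage: Usage): number {
  const cachedRate = INPUT_USD_PER_MTOK * CACHED_INPUT_MULTIPLIER;
  const total =
    usage.inputTokens * INPUT_USD_PER_MTOK +
    usage.cachedInputTokens * cachedRate +
    usage.outputTokens * OUTPUT_USD_PER_MTOK;
  return total / 1e6;
}

// A 200k-token repo context with 4k tokens of output, uncached vs mostly cached.
const uncached = estimateCostUsd({ inputTokens: 200000, cachedInputTokens: 0, outputTokens: 4000 });
const cached = estimateCostUsd({ inputTokens: 20000, cachedInputTokens: 180000, outputTokens: 4000 });
console.log(uncached.toFixed(2), cached.toFixed(2)); // prints "2.20 0.58"
```

Under that assumption, a stable, cacheable repo context is the difference between roughly $2.20 and $0.58 per long-context call, which is one more reason to give the agent a consistent base to read.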

What Anthropic's benchmarks actually say

Anthropic's published comparison is genuinely strong for builders. On SWE-Bench Pro — agentic coding against real production issues — Fable 5 scores 80.3%, against Opus 4.8's 69.2%, GPT 5.5's 58.6%, and Gemini 3.1 Pro's 54.2%. On Terminal-Bench 2.1 it hits 88.0% (Opus 4.8: 82.7%). On the harder FrontierCode set it more than doubles Opus 4.8 (29.3% vs 13.4%). For the long-running, real-repo work you actually ship, that's a real step up.

One caveat the table itself flags. The eye-popping cybersecurity score (78.0% on ExploitBench, against Opus 4.8's 40%) is starred because it belongs to Mythos 5: the version with safeguards lifted, locked to Project Glasswing partners. The publicly usable Fable 5 falls back closer to Opus 4.8 on the cyber and biology benchmarks by design, a direct result of its blocking safeguards. The model that wrote 181 Firefox exploits and the model you call from the API aren't the same on the dangerous stuff. That's worth knowing before you assume the headline number is what you're getting.

That's the news. The more important question is what you do now that a model this strong is in everyone's hands.

Why a stronger model is a tailwind, not a threat

A smarter agent is straightforwardly good for builders. It reads more of your repo before it acts, plans longer chains without losing the thread, and catches its own mistakes before you do. The wall people hit today — agents that forget what they were doing halfway through a task — gets pushed back with every model jump.

But the hype skips the catch: a better engine only pays off if it has somewhere stable to run.

The model is the engine. Your codebase is the road.

A frontier model is an engine. Your codebase — the conventions it follows, the components it reuses, the structure it reasons over — is the road it drives on. Drop a stronger engine into a car on a paved road and it just goes faster. Drop the same engine into a car with no road and you've built a more powerful way to drive into a ditch.

That's why two builders running the identical model get wildly different results. One has a clean, owned codebase the agent can navigate; the other has a pile of regenerated screens no one understands. Same engine, different road. The bottleneck was never the model — it's the code the model has to work with.

Sandboxed, regenerate-everything tools make this worse. When the agent re-rolls your whole UI on each prompt, every model upgrade is a fresh roll of the dice instead of a step forward. A stronger model just generates the inconsistencies faster.

Own your code so the upgrade is free

There are two positions you can be in now that Fable 5 is here.

One: you own your code. The new model reads your repo, follows your conventions, and ships the next feature on top of what's already there. The upgrade is free — you change one line, the model name, and everything you've built gets better.

Two: you rent a platform. You don't get a vote on when the model, the pricing, or even the owner changes. When the platform swaps in a new model, your app gets regenerated by something that doesn't know your conventions — and you find out in production. That's the case for owning your code instead of renting it, and a stronger model only widens the gap between the two.
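
The "one line" claim only holds if the model name lives in exactly one place. A minimal sketch of that idea follows. The model ID `claude-fable-5` is the API name quoted earlier in this post; the `AgentRequest` shape and `buildAgentRequest` helper are hypothetical and stand in for whatever client code you already have.

```typescript
// Single source of truth for the model ID. Upgrading means editing this line.
const MODEL_ID = "claude-fable-5";

// Hypothetical request shape; adapt it to the client library you actually use.
interface AgentRequest {
  model: string;
  system: string;
  prompt: string;
}

// Every call site builds its request here, so no other file hard-codes a model name.
function buildAgentRequest(prompt: string, conventions: string): AgentRequest {
  return { model: MODEL_ID, system: conventions, prompt };
}

const request = buildAgentRequest(
  "Add a settings screen using the existing components.",
  "Reuse components from our shared UI layer; do not hand-roll primitives."
);
console.log(request.model); // prints "claude-fable-5"
```

The design choice is deliberately boring: if a search for the old model name across your repo returns one hit, the upgrade really is one line.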

A stable component base your agent extends, not rewrites

The highest-leverage thing you can own is your UI layer.

When your app sits on a stable set of components — one <Button>, one <Card>, one <Form> with a consistent API — a coding agent's job changes from "invent a button" to "use ours and add a screen." It extends a base instead of regenerating primitives every prompt. The stronger the model, the more screens it can build correctly on that base — because it's reasoning over something consistent instead of guessing at a fresh design each time. That's why your design system is the context your agent is missing.

It matters most across platforms. If the same component renders on web, iOS, and Android from one API, a stronger agent builds three platforms' worth of screens on one stable base — not three diverging rewrites you have to reconcile by hand. That's the difference between a model jump that compounds and one that just multiplies your cleanup.
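
To picture what "one API, several platforms" can look like, here is a deliberately small sketch. It is not our SDK's actual API: `ButtonProps`, `renderButtonWeb`, `renderButtonNative`, and the `NativeView` descriptor are all hypothetical names chosen for illustration. The point is only that one props contract feeds every renderer, so the agent has a single thing to learn.

```typescript
// Hypothetical shared contract: one set of props for every platform.
type ButtonVariant = "primary" | "secondary";

interface ButtonProps {
  label: string;
  variant?: ButtonVariant;
  disabled?: boolean;
}

// Hypothetical description of a view that a native renderer might consume.
interface NativeView {
  type: "Pressable";
  accessibilityLabel: string;
  enabled: boolean;
  style: ButtonVariant;
}

// Escape text before interpolating it into HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Web renderer: turns the shared props into an HTML string.
function renderButtonWeb(props: ButtonProps): string {
  const variant = props.variant ?? "primary";
  const disabled = props.disabled ? " disabled" : "";
  return `<button class="btn btn-${variant}"${disabled}>${escapeHtml(props.label)}</button>`;
}

// Native renderer: turns the same props into a view description.
function renderButtonNative(props: ButtonProps): NativeView {
  return {
    type: "Pressable",
    accessibilityLabel: props.label,
    enabled: !props.disabled,
    style: props.variant ?? "primary",
  };
}

const save: ButtonProps = { label: "Save", variant: "primary" };
console.log(renderButtonWeb(save)); // prints <button class="btn btn-primary">Save</button>
console.log(renderButtonNative(save).enabled); // prints true
```

An agent asked to "add a Save button" now has one typed contract to satisfy, and a wrong prop name fails at compile time on every platform at once.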

This is the bet behind our free, MIT-licensed cross-platform SDK: same component, web and native, one API, code you keep. The framework isn't the point. The stable base is what turns every future model into an upgrade instead of a migration.

What to do this week, now that it's here

Fable 5 is generally available today, so this isn't hypothetical anymore. Four moves, all worth doing now:

  1. Pin your conventions where the agent reads them. A CLAUDE.md or .cursorrules file that actually spells out your patterns is the difference between an agent that extends your code and one that reinvents it.
  2. Factor your shared UI into a real component layer. Every screen that reuses a <Button> instead of hand-rolling one is a screen a stronger model gets right on the first try.
  3. Keep your code in your own git, not a platform's sandbox. Ownership is what makes the next model a one-line upgrade.
  4. Run your hardest agentic flow through Fable 5 first. That's your honest benchmark — your gnarliest real task on your real codebase — not someone's screenshot.
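
The fourth move can be as small as a script. Below is a rough sketch of a personal benchmark harness, under stated assumptions: `RunAgent` is a hypothetical adapter you would write around your own agent setup (no real SDK call is assumed), and each task's `passes` check is whatever "done" means in your repo, such as a test command succeeding. The demo uses a stub runner so the sketch runs without network access.

```typescript
// One real job from your own repo, plus a check that decides pass or fail.
interface AgentTask {
  name: string;
  prompt: string;
  passes: (output: string) => boolean;
}

// Hypothetical adapter: sends a prompt to a model and resolves with its final output.
type RunAgent = (model: string, prompt: string) => Promise<string>;

interface BenchResult {
  model: string;
  passed: string[];
  failed: string[];
}

// Runs every task against one model and records which ones passed.
async function runBench(model: string, tasks: AgentTask[], runAgent: RunAgent): Promise<BenchResult> {
  const result: BenchResult = { model, passed: [], failed: [] };
  for (const task of tasks) {
    try {
      const output = await runAgent(model, task.prompt);
      if (task.passes(output)) {
        result.passed.push(task.name);
      } else {
        result.failed.push(task.name);
      }
    } catch {
      // A crashed or timed-out run counts as a failure, not a skipped task.
      result.failed.push(task.name);
    }
  }
  return result;
}

// Demo with a stubbed runner; replace it with a call into your real agent.
const demoTasks: AgentTask[] = [
  { name: "adds-settings-screen", prompt: "Add a settings screen", passes: (out) => out.includes("settings") },
];
const stubRunner: RunAgent = async (_model, prompt) => `stub output for: ${prompt.toLowerCase()}`;

runBench("claude-fable-5", demoTasks, stubRunner).then((r) => {
  console.log(r.model, "passed:", r.passed.length, "failed:", r.failed.length);
});
```

Run the same task list against the model you use today and against Fable 5, and compare the two `BenchResult` objects. That comparison, on your code, is worth more than any screenshot.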

What this gets you

A stronger Claude is here, and it's good. The builders who benefit most won't be the ones who guessed the release date — they'll be the ones who already own the durable layer: their code, their conventions, their components. Own that, and every model jump from here is a free upgrade you opt into by changing one line.

If you want a head start on the component layer, our free cross-platform SDK is one place to begin — same component, web and native, code you keep. The model will keep changing. Make the thing it builds on something that doesn't have to.
