Learning brief
Generated by AI from multiple sources. Always verify critical information.
TL;DR
Anthropic built Claude Mythos, an AI so good at finding security holes that the company won't release it to the public. It discovered thousands of zero-day bugs, including a 27-year-old OpenBSD flaw, and then details about it accidentally leaked to the internet anyway.
What changed
Claude Mythos found thousands of zero-day vulnerabilities, including bugs that had gone undetected for decades.
Why it matters
This AI can find security flaws better than most human professionals, and Anthropic won't let anyone use it.
What to watch
Whether other AI companies follow Anthropic's lead or race to ship their own hacking models.
What Happened
Anthropic just announced Claude Mythos Preview — their most powerful AI model yet — and immediately said you can't have it. Think of it like building the world's best lockpick and then deciding it's too dangerous to sell. The company published a 243-page report detailing what Mythos can do, then locked it away (Source 4, 5, 15).
What makes Mythos scary? It found thousands of zero-day vulnerabilities — security flaws that nobody knew existed. A zero-day is like discovering your front door has had a hidden spare key under the mat for years, and you never noticed. Mythos found one in OpenBSD (a computer operating system) that had been sitting there for 27 years (Source 15). That's older than most YouTube creators.
Here's where it gets messy: someone at Anthropic misconfigured a data store and accidentally exposed 3,000 internal files to the public internet. One of those files contained information about Mythos. So the model they said was too dangerous to release? Details about it leaked anyway (Source 8, 18). It's like declaring your secret recipe is locked in a vault, then leaving a copy on the office printer.
Anthropic announced Mythos as part of Project Glasswing, their cybersecurity initiative (Source 3, 10). The Python SDK was even updated to support the model (Source 29), but you still can't actually use it unless you're part of a select research group. The model exists. The code to access it exists. You just don't get access.
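For a concrete sense of what that gate looks like, here is a minimal sketch of calling the model through the Anthropic Python SDK. The model ID claude-mythos-preview and the exact error you'd get back are assumptions for illustration; the sources only say the SDK supports the model while access stays restricted.

```python
# Hypothetical sketch: what hitting a gated model looks like from the outside.
# "claude-mythos-preview" is an assumed model ID, not publicly documented.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

try:
    response = client.messages.create(
        model="claude-mythos-preview",  # assumed identifier for the gated model
        max_tokens=1024,
        messages=[
            {"role": "user", "content": "Audit this function for memory-safety bugs."}
        ],
    )
    print(response.content[0].text)
except (anthropic.PermissionDeniedError, anthropic.NotFoundError):
    # Which error an un-allowlisted account actually sees is an assumption;
    # either way, the request is rejected at the API boundary and the model never runs.
    print("Access denied: this model is gated to approved researchers.")
```

The point of the sketch: the plumbing on your side is identical to any other model call. The restriction lives entirely on Anthropic's side of the API.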
Before Mythos: Security researchers manually hunt for bugs, often taking weeks to find a single serious flaw.
After Mythos: An AI can scan software and surface thousands of critical vulnerabilities automatically — including ones that have existed longer than the iPhone.
So What?
The real story here is that AI has crossed a threshold in offensive cybersecurity. Mythos isn't just better at finding bugs than humans; it's finding bugs humans missed for decades. That 27-year-old OpenBSD vulnerability means someone could have been exploiting that hole for nearly three decades while every security audit missed it (Source 15). An AI found it in testing.
This puts Anthropic in an impossible position. Release the model, and every attacker from nation-states to teenagers gets an automated vulnerability scanner that outperforms most security teams. Don't release it, and you've effectively declared that frontier AI models can be too dangerous to ship. Either choice sets a precedent the industry isn't ready for.
The uncomfortable truth is that the leak undercuts the whole "too dangerous to release" position. Once 3,000 internal files hit the public internet (Source 8, 18), the information is out there. Anthropic can refuse to give you API access, but it can't un-leak the files, and it had already published that 243-page system card describing what the model can do (Source 5). A motivated actor trying to replicate Mythos's capabilities now has a detailed roadmap. The company tried to have it both ways, publishing the research for transparency while restricting the model for safety, and ended up with neither.
Sources