Apr 17, 2026 • B. Schneier
Mythos and Cybersecurity
Executive Summary
Anthropic developed Claude Mythos Preview, an AI model capable of identifying and exploiting software vulnerabilities at unprecedented scale. The system discovered thousands of flaws across major operating systems and browsers, including a 27-year-old OpenBSD bug and a 16-year-old FFmpeg vulnerability. Mythos generated 181 weaponized attacks from Firefox vulnerabilities alone, significantly outperforming previous models. Anthropic restricted access to approximately 50 critical infrastructure organizations under Project Glasswing to prevent misuse. Key concerns include limited transparency about false positive rates, the model's reduced effectiveness on specialized systems like industrial control software and medical devices, and the risk that sophisticated attackers with domain expertise could leverage such AI as a force multiplier. Experts advocate for greater transparency and broader community engagement in decisions about AI model release.
Published Analysis
Last week, Anthropic pulled back the curtain on Claude Mythos Preview, an AI model so capable at finding and exploiting software vulnerabilities that the company decided it was too dangerous to release to the public. Instead, access has been restricted to roughly 50 organizations—Microsoft, Apple, Amazon Web Services, CrowdStrike, and other vendors of critical infrastructure—under an initiative called Project Glasswing.
The announcement was accompanied by a barrage of hair-raising anecdotes: thousands of vulnerabilities uncovered across every major operating system and browser, including a 27-year-old bug in OpenBSD and a 16-year-old flaw in FFmpeg. Mythos was able to weaponize a set of vulnerabilities it found in the Firefox browser into 181 usable attacks; Anthropic's previous flagship model could manage only two.

This is, in many respects, exactly the kind of responsible disclosure that security researchers have long urged. And yet the public has been given remarkably little with which to evaluate Anthropic's decision. We have been shown a highlight reel of spectacular successes, but we can't tell whether we have a blockbuster until they let us see the whole movie.

For example, we don't know how many times Mythos mistakenly flagged code as vulnerable. Anthropic said its security contractors agreed with the AI's severity ratings in 198 cases, an 89 percent agreement rate. That's impressive, but incomplete. Independent researchers examining similar models have found that systems that detect nearly every real bug also hallucinate plausible-sounding vulnerabilities in patched, correct code. This matters. A model that autonomously finds and exploits hundreds of vulnerabilities with inhuman precision is a game changer; a model that generates thousands of false alarms and non-working attacks still needs skilled and knowledgeable humans. Without knowing the rate of false alarms in Mythos's unfiltered output, we cannot tell whether the examples showcased are representative.

There is a second, subtler problem. Large language models, including Mythos, perform best on inputs that resemble their training data: widely used open-source projects, major browsers, the Linux kernel, and popular web frameworks. Concentrating early access among the largest vendors of precisely this software is sensible; it lets them patch first, before adversaries catch up. But the inverse is also true.
Software outside the training distribution—industrial control systems, medical device firmware, bespoke financial infrastructure, regional banking software, older embedded systems—is exactly where out-of-the-box Mythos is likely least able to find or exploit bugs. But a sufficiently motivated attacker with domain expertise in one of these fields could still wield Mythos's advanced reasoning as a force multiplier, probing systems that Anthropic's own engineers lack the specialized knowledge to audit. The danger is not that Mythos fails in those domains; it is that Mythos may succeed for whoever brings the expertise.

Broader, structured access for academic researchers and domain specialists—clinical partners in medical device security, control-systems engineers, researchers working in less prominent languages and ecosystems—would meaningfully reduce this asymmetry. Fifty companies, however well chosen, cannot substitute for the distributed expertise of the entire research community.

None of this is an indictment of Anthropic....