Apr 17, 2026 • B. Schneier
Mythos and Cybersecurity
Executive Summary
Anthropic has restricted public access to its new AI model, Claude Mythos Preview, citing significant security risks due to its advanced capability to identify and exploit software vulnerabilities. Under Project Glasswing, access is limited to approximately 50 critical infrastructure vendors, including Microsoft and Apple. The model reportedly uncovered thousands of vulnerabilities across major operating systems and browsers, demonstrating a substantial leap in automated exploit generation compared to previous iterations. While this enables responsible disclosure, concerns remain regarding potential misuse by adversaries possessing domain expertise to weaponize the tool against specialized systems. The article highlights the asymmetry in defense capabilities and calls for greater transparency and broader community engagement to mitigate risks associated with autonomous vulnerability discovery. Regulatory oversight and information sharing are deemed essential to manage the societal impact of such powerful AI technologies.
Published Analysis
Last week, Anthropic pulled back the curtain on Claude Mythos Preview, an AI model so capable of finding and exploiting software vulnerabilities that the company decided it was too dangerous to release to the public. Instead, access has been restricted to roughly 50 organizations—Microsoft, Apple, Amazon Web Services, CrowdStrike and other vendors of critical infrastructure—under an initiative called Project Glasswing. The announcement was accompanied by a barrage of hair-raising anecdotes: thousands of vulnerabilities uncovered across every major operating system and browser, including a 27-year-old bug in OpenBSD and a 16-year-old flaw in FFmpeg. Mythos was able to weaponize a set of vulnerabilities it found in the Firefox browser into 181 usable attacks; Anthropic's previous flagship model managed only two.

This is, in many respects, exactly the kind of responsible disclosure that security researchers have long urged. And yet the public has been given remarkably little with which to evaluate Anthropic's decision. We have been shown a highlight reel of spectacular successes, but we can't tell whether we have a blockbuster until they let us see the whole movie.

For example, we don't know how many times Mythos mistakenly flagged code as vulnerable. Anthropic said security contractors agreed with the AI's severity ratings 198 times, an 89 per cent agreement rate. That's impressive, but incomplete. Independent researchers examining similar models have found that an AI that detects nearly every real bug also hallucinates plausible-sounding vulnerabilities in patched, correct code. This matters. A model that autonomously finds and exploits hundreds of vulnerabilities with inhuman precision is a game changer; a model that generates thousands of false alarms and non-working attacks still needs skilled and knowledgeable humans. Without knowing the rate of false alarms in Mythos's unfiltered output, we cannot tell whether the showcased examples are representative.
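To see why the disclosed figures are incomplete, consider a minimal sketch in Python. It assumes, purely for illustration, that the 198 agreed-upon findings are genuine bugs, and it tries two hypothetical totals for the model's raw output; neither total comes from Anthropic, which is exactly the gap in the public record.

```python
# Illustration only. The 198 agreements and the 89 per cent figure are the
# disclosed numbers; every other quantity below (raw flag counts, which
# findings are genuine) is assumed, because Anthropic has not published it.

def triage_load(confirmed: int, raw_flags: int) -> tuple[float, int]:
    """Return (precision, false_alarms) for a hypothetical disclosure scenario."""
    false_alarms = raw_flags - confirmed
    return confirmed / raw_flags, false_alarms

CONFIRMED = 198  # stand-in: treat the 198 agreed-upon findings as real bugs

# Two raw-flag totals that are equally consistent with the published figures:
for raw in (220, 5_000):  # assumed totals, not disclosed by Anthropic
    precision, false_alarms = triage_load(CONFIRMED, raw)
    print(f"{raw:>5} raw flags -> precision {precision:.0%}, "
          f"{false_alarms:,} false alarms left for humans to triage")
```

Both scenarios are consistent with everything Anthropic has published, yet one describes a nearly autonomous tool and the other a noisy assistant that still needs expert triage.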
There is a second, subtler problem. Large language models, including Mythos, perform best on inputs that resemble what they were trained on: widely used open-source projects, major browsers, the Linux kernel and popular web frameworks. Concentrating early access among the largest vendors of precisely this software is sensible; it lets them patch first, before adversaries catch up. But the inverse is also true. Software outside the training distribution—industrial control systems, medical device firmware, bespoke financial infrastructure, regional banking software, older embedded systems—is exactly where out-of-the-box Mythos is likely least able to find or exploit bugs. A sufficiently motivated attacker with domain expertise in one of these fields, however, could still wield Mythos's advanced reasoning capabilities as a force multiplier, probing systems that Anthropic's own engineers lack the specialized knowledge to audit. The danger is not that Mythos fails in those domains; it is that Mythos may succeed for whoever brings the expertise.

Broader, structured access for academic researchers and domain specialists—cardiologists' partners in medical device security, control-systems engineers, researchers working in less prominent languages and ecosystems—would meaningfully reduce this asymmetry. Fifty companies, however well chosen, cannot substitute for the distributed...