Anthropic argues that Claude Mythos Preview marks a serious escalation in AI-driven cybersecurity capabilities, with models now approaching a level that could dramatically accelerate both vulnerability discovery and real-world exploitation before defenders are fully prepared.

Anthropic Says New Model Can Discover and Exploit Critical Software Flaws

Anthropic says its new Claude Mythos Preview model represents a major leap in cybersecurity capability, particularly in identifying and exploiting software vulnerabilities. In the company’s testing, the model was able to autonomously discover zero-day bugs in major operating systems, browsers, and widely used open-source software, then in some cases develop working exploits without human guidance. Anthropic frames the release as a turning point for the security industry and says it is limiting access while launching Project Glasswing to help defenders prepare.

According to an article on red.anthropic.com, the company argues that Mythos Preview’s capabilities emerged not from narrow exploit training, but from broader gains in coding, reasoning, and autonomy. According to the report, the model can find subtle memory-safety flaws, logic bugs, and cryptographic weaknesses, and can also reverse engineer stripped binaries to search for vulnerabilities in closed-source systems. Anthropic says this sharply improves on previous models, which were better at finding and fixing vulnerabilities than exploiting them.

Much of the post focuses on detailed case studies meant to show the model’s practical range. Anthropic describes Mythos Preview finding a 27-year-old OpenBSD bug, a 16-year-old FFmpeg flaw, a FreeBSD remote code execution vulnerability, Linux privilege-escalation chains, browser exploit primitives, and weaknesses in cryptographic libraries and web applications. Because most of the bugs remain unpatched, the company withholds many specifics, publishing cryptographic commitments instead as proof it has the findings while following coordinated disclosure practices.

Anthropic’s broader message is that AI-assisted cyber offense is advancing fast enough to destabilize the current security balance before defenses fully catch up. The company argues defenders should begin using today’s frontier models for vulnerability hunting, patch triage, incident response, and faster remediation, while also shortening software patch cycles and revisiting disclosure policies. In Anthropic’s view, AI will eventually favor defense more than offense, but the near-term transition could be dangerous, and the industry should act urgently now rather than assume current safeguards will hold.

Read more at red.anthropic.com