Anthropic Restricts Access to New AI Model Due to Misuse Concerns

In a significant development for cybersecurity, Anthropic has announced an artificial intelligence model it deems too risky for public release. The company contends that the new model, named Claude Mythos Preview, represents a transformative shift in the landscape of cyber defense.
Amid ongoing legal battles with the U.S. federal government regarding the deployment of models for autonomous weaponry and surveillance, Anthropic revealed that Mythos Preview has already uncovered numerous severe vulnerabilities, affecting major operating systems and web browsers. “In light of the rapid advancements in AI, it will not be long before such capabilities proliferate, posing serious threats to economies, public safety, and national security,” the company stated.
A consortium comprising over 40 tech firms, including Microsoft, Google, and Cisco, has been granted access to this frontier model, with $100 million in usage credits allocated for the purpose of identifying and mitigating system vulnerabilities. Dubbed “Project Glasswing,” this coalition reflects the collaborative effort to bolster cybersecurity in a rapidly evolving threat landscape.
Anthropic executives emphasized that Mythos Preview is not simply a sophisticated vulnerability scanner; it is also capable of surfacing long-standing flaws. For instance, the model recently discovered a 27-year-old vulnerability in OpenBSD that allowed attackers to remotely disrupt operations on any machine running that operating system. It also identified critical vulnerabilities in the Linux kernel that could enable privilege escalation by circumventing established protections such as kernel address space layout randomization (KASLR).
Moreover, the model demonstrated an advanced capability to combine multiple vulnerabilities into working exploits. Anthropic researchers said they have documented instances in which the model successfully chained flaws to compromise the security of the Linux kernel.
In a related blog post, the researchers asserted that the model understands the underlying logic of code, allowing it to identify and exploit weaknesses in functions that enforce access control. They predict a future in which attackers and defenders reach an equilibrium, but caution that a tumultuous transition phase could ensue if adversaries gain access to such models before defenders are adequately prepared.
To address these risks, Anthropic has committed to developing safeguards that prevent malicious outputs. They also intend to publish recommendations on critical cybersecurity practices, including vulnerability disclosure, patch management, and design principles for secure applications. The implications of these advancements underscore the importance of vigilance in a landscape where both attack and defense capabilities are rapidly advancing.