Anthropic Collaborates with Competitors to Prevent AI from Compromising Security

Leaked reports in late March revealed that Anthropic had developed a new AI model named Mythos, which the company formally announced on Tuesday. Alongside the announcement, Anthropic introduced an industry consortium called Project Glasswing to address the cybersecurity implications of the model and of rapidly advancing capabilities across the AI landscape.

Project Glasswing includes Microsoft, Apple, Google, Amazon Web Services, the Linux Foundation, Cisco, Nvidia, and Broadcom, along with more than 40 other organizations spanning technology, cybersecurity, critical infrastructure, and finance. These groups are set to gain exclusive access to Mythos, which has not yet been released to the general public. The initiative gives developers of foundational technology the opportunity to test Mythos against their own systems, using it to find vulnerabilities and simulate attacks so the flaws can be fixed. Anthropic says it convened the consortium to spark urgent discussion about the impending transformation of software security and digital defense practices worldwide.

Logan Graham, Anthropic’s frontier red team lead, said the core message of the initiative is not solely about the model or Anthropic itself. He stressed the need to prepare for a future in which such capabilities become widely available within 6 to 24 months, at which point many established security assumptions may no longer hold and current defensive strategies will need to be reevaluated.

AI models are becoming increasingly skilled at identifying vulnerabilities in code and suggesting ways to exploit them, and that is reshaping cybersecurity. The result is a next-generation cat-and-mouse game: the same sophisticated tools available to defenders could also empower malicious actors, potentially simplifying attacks that were previously too costly or too complex to carry out.

Dario Amodei, CEO of Anthropic, called the Mythos model a substantial leap forward. He emphasized that the model was not specifically trained for cybersecurity applications, but that its coding prowess makes it effective at them as a side effect. He acknowledged that more powerful models may emerge from both Anthropic and other developers, underscoring the need for a comprehensive plan to respond to these advances.

Graham added that Mythos Preview can not only identify vulnerabilities but also handle advanced exploit development, penetration testing, endpoint security assessments, and the evaluation of software binaries without access to source code. As the staggered release of Mythos Preview begins with industry collaboration, Anthropic aims to follow the principles of coordinated vulnerability disclosure, giving developers a window to fix bugs before they are discussed publicly.

Behind this careful approach is Graham’s assertion that Mythos Preview has achieved results comparable to those of a seasoned security researcher, which has significant ramifications for how such capabilities should be brought to market. Released without appropriate caution, models like Mythos could dramatically accelerate malicious activity.

In statements issued at the launch, Project Glasswing partners, including some of Anthropic’s rivals, struck a collaborative tone about the cybersecurity initiative. Heather Adkins, Google’s vice president of security engineering, welcomed the formation of the cross-industry consortium, recognizing both the challenges and the opportunities that AI presents for cyber defense.

Potential Tactics and Techniques: Adversaries equipped with capabilities like those of Mythos may employ several tactics outlined in the MITRE ATT&CK framework. Key tactics could include initial access through phishing, persistence through the establishment of backdoors, and privilege escalation to gain higher-level access within targeted environments. The advent of such powerful models heightens the urgency for businesses to reassess their cybersecurity posture and prepare for these adversarial strategies.
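For teams that want to track the three tactics named above, a minimal Python sketch of a lookup table might look like the following. The tactic IDs (TA0001, TA0003, TA0004) and the phishing technique ID (T1566) come from the MITRE ATT&CK Enterprise matrix; the dictionary layout and the helper names `describe` and `uncovered` are hypothetical and chosen only for illustration.

```python
# Illustrative only: the three ATT&CK tactics mentioned in the article.
# The structure and helper names are assumptions, not part of any MITRE tooling.
ATTACK_TACTICS = {
    "initial-access": {
        "id": "TA0001",
        "name": "Initial Access",
        "example": "phishing (T1566)",
    },
    "persistence": {
        "id": "TA0003",
        "name": "Persistence",
        "example": "establishing backdoors",
    },
    "privilege-escalation": {
        "id": "TA0004",
        "name": "Privilege Escalation",
        "example": "gaining higher-level access",
    },
}


def describe(tactic_key: str) -> str:
    """Return a one-line summary for a tactic in the table."""
    tactic = ATTACK_TACTICS[tactic_key]
    return f"{tactic['id']} {tactic['name']}: {tactic['example']}"


def uncovered(covered: set[str]) -> list[str]:
    """List the tactics (in table order) that a review has not yet covered."""
    return [key for key in ATTACK_TACTICS if key not in covered]


if __name__ == "__main__":
    for key in ATTACK_TACTICS:
        print(describe(key))
    print("Still to review:", uncovered({"persistence"}))
```

A security team could extend a table like this with its own detections or controls per tactic; the three entries here simply mirror the tactics named in the paragraph above and are not a complete view of the framework.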
