New Research Uncovers Security Risks Posed by AI-Generated Code
Recent studies have revealed alarming vulnerabilities in AI-generated computer code, particularly in the context of software supply chains. Researchers found that a significant portion of the code generated by large language models (LLMs) references fictitious third-party libraries. This phenomenon not only jeopardizes the integrity of legitimate software products but also creates potential gateways for supply-chain attacks that could lead to data theft and malware deployment.
The investigation analyzed 16 prominent language models, generating a total of 576,000 code samples. Of the package dependencies referenced in those samples, approximately 440,000 were "hallucinated," meaning they point to libraries that do not exist. Open-source models produced these errors most often, with a striking 21 percent of their dependencies referring to non-existent libraries. Dependencies are external code libraries that software needs in order to function; they spare developers from rewriting common functionality and are a foundational part of the modern software supply chain.
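To make that concrete, the short sketch below shows what a hallucinated dependency can look like from a developer's point of view. The package name used here, fastjson_utils, is a hypothetical stand-in for any plausible-sounding name a model might invent; it is not drawn from the study.

    # What a package hallucination looks like in practice. "fastjson_utils" is a
    # hypothetical, plausible-sounding name; no such project has to exist for an
    # LLM to import it with complete confidence in generated code.
    try:
        import fastjson_utils                  # hallucinated dependency
    except ModuleNotFoundError:
        print("fastjson_utils is not installed -- and may not exist at all")
        # The hazard: a developer "fixes" the error with `pip install fastjson-utils`,
        # which succeeds the moment anyone, including an attacker, registers that
        # name on the public registry.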
These fictitious dependencies significantly heighten the risk of dependency confusion attacks, a technique that exploits the trust placed in third-party components. An attacker publishes a malicious package under the same name as a legitimate one but with a higher version number, so that dependency resolvers prefer the attacker's code over the original. This type of exploitation was first publicly demonstrated in 2021, when proof-of-concept attacks successfully infiltrated the networks of major corporations, including Apple and Microsoft.
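The core of the attack is a resolver rule that favors the highest available version when the same package name appears in more than one place. The sketch below illustrates that preference in a few lines of Python, using the widely available packaging library for version comparison; the package sources and version numbers are invented for illustration, and real installers are considerably more involved.

    # Why dependency confusion works: given the same package name on two indexes,
    # a naive "highest version wins" rule picks whichever candidate advertises the
    # larger number. Sources and versions here are hypothetical.
    from packaging.version import Version

    candidates = {
        "internal company index": Version("1.2.0"),   # the legitimate in-house release
        "public registry":        Version("99.0.0"),  # an attacker's upload under the same name
    }

    source, chosen = max(candidates.items(), key=lambda item: item[1])
    print(f"Resolver selects version {chosen} from the {source}")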
Joseph Spracklen, a Ph.D. student at the University of Texas at San Antonio and the lead researcher on the study, highlighted the risks associated with these hallucinations. “When an attacker publishes a package under a fabricated name containing malicious code, they rely on unsuspecting users to trust the model’s output,” Spracklen explained. If users install these packages without thorough verification, the malicious payload can be executed on their systems.
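One low-cost safeguard that follows from this is to confirm that a suggested package actually exists before installing it. The sketch below, written against PyPI's public JSON API using only the Python standard library, shows the idea; the second package name it checks is hypothetical.

    # A defensive sketch: before installing a dependency an LLM suggested, confirm
    # the name is actually registered on PyPI.
    import urllib.error
    import urllib.request

    def package_exists_on_pypi(name: str) -> bool:
        """Return True if PyPI's JSON API recognizes this package name."""
        url = f"https://pypi.org/pypi/{name}/json"
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                return response.status == 200
        except urllib.error.HTTPError as err:
            if err.code == 404:        # unknown to PyPI: a hallucination, or a free name
                return False
            raise

    if __name__ == "__main__":
        for suggested in ["requests", "fastjson-utils"]:   # second name is hypothetical
            verdict = "exists" if package_exists_on_pypi(suggested) else "NOT on PyPI"
            print(f"{suggested}: {verdict}")

Existence alone is not proof of safety, of course; an attacker may already have registered the hallucinated name, so unfamiliar packages still warrant a closer look before installation.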
In the realm of AI, “hallucinations” are instances in which an LLM generates output that is incorrect, fabricated, or unrelated to the task at hand. The problem has long challenged LLMs, undermining their reliability and trustworthiness. The research team has termed this particular phenomenon "package hallucination," an emerging concern within the cybersecurity landscape.
Across 30 separate tests covering programming in Python and JavaScript, the researchers found that 19.7 percent of the 2.23 million package references in the generated samples pointed to non-existent libraries, and that 205,474 of those hallucinated references carried unique package names. More troubling is how consistent the hallucinations were: 43 percent of hallucinated packages recurred every time the same prompt was re-run ten times, and 58 percent recurred more than once across those ten iterations. That persistence gives malicious actors a predictable target to exploit.
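That persistence metric amounts to a simple frequency count over repeated runs of the same prompt. The sketch below, using made-up sample data rather than the study's, shows the kind of tally involved.

    # A rough sketch of measuring hallucination persistence: re-run one prompt
    # several times, keep the package names that turn out not to exist, and count
    # how often each recurs. The sample data below is invented for illustration.
    from collections import Counter

    runs = [                                   # hallucinated names per re-run (hypothetical)
        ["fastjson-utils", "pyquickcsv"],
        ["fastjson-utils"],
        ["fastjson-utils", "torchlite-io"],
        # ... remaining re-runs omitted
    ]

    counts = Counter(name for run in runs for name in run)
    persistent = [name for name, total in counts.items() if total > 1]
    print(f"{len(persistent)} of {len(counts)} hallucinated names recurred more than once")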
The implications of package hallucination are significant, particularly for businesses that depend on the integrity of their software supply chains. By understanding and countering these risks, organizations can reinforce their defenses against malware attacks rooted in AI-generated code. The MITRE ATT&CK framework offers a useful lens for analyzing the threat: tactics such as initial access, persistence, and privilege escalation all help describe how a dependency confusion attack might unfold.
As organizations increasingly integrate AI into their programming workflows, awareness of the threats posed by hallucinated package references becomes critical. By recognizing the vulnerabilities embedded in AI-generated output and taking proactive measures, such as verifying suggested dependencies before installing them, businesses can better safeguard their assets against the growing sophistication of cyber threats.