A Chroma database managed by the Russian AI chatbot startup My Jedai has been found exposed online, resulting in a significant data leak that includes survey responses from over 500 Canva Creators. This compromised dataset features personal email addresses, feedback on Canva’s Creator Program, and insights into the experiences of designers from more than a dozen countries, raising concerns about data security in modern AI applications.
The vulnerability was identified by the cybersecurity firm UpGuard, which confirmed that the database was publicly accessible without any authentication measures in place. Although much of the exposed data was generic or publicly available, a notable collection included responses to an extensive survey directed at Canva Creators, who contribute content to the design platform globally.
This survey data encompassed 571 unique email addresses along with detailed responses to 51 questions covering topics such as royalties, user experience, and the adoption of artificial intelligence. Interestingly, some email addresses were recorded multiple times, suggesting that certain users participated in the survey on more than one occasion.
UpGuard’s report, shared with Hackread.com before its official release, described this incident as the first known exposure involving a Chroma database—an open-source vector database commonly used to let chatbots retrieve relevant documents when answering queries. The database itself was hosted on an IP address in Estonia and appeared to be controlled by My Jedai, which allows users to enhance chatbots with their own documents, often with minimal oversight.
The inclusion of Canva-related data within this database raises critical questions about how sensitive information is handled once it is fed into AI systems. Chroma itself is not inherently insecure, but it must be configured correctly to avoid public exposure. In this instance, the database was left completely accessible on the internet, with no authentication in place.
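To illustrate the kind of misconfiguration at issue, the sketch below probes whether a Chroma server answers requests without credentials. It assumes the server exposes Chroma's standard `/api/v1/heartbeat` liveness endpoint (the exact path varies by Chroma version); the function name and the checking approach are hypothetical illustrations, not UpGuard's actual methodology.

```python
# Hypothetical sketch: check whether a Chroma endpoint responds without
# authentication. A 200 response to the heartbeat path, with no credentials
# supplied, suggests the server is reachable by anyone on the internet.
import urllib.request
import urllib.error


def check_chroma_exposed(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the Chroma heartbeat answers an unauthenticated request."""
    url = base_url.rstrip("/") + "/api/v1/heartbeat"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error (including an auth
        # challenge; HTTPError subclasses URLError) all mean the server is
        # not openly reachable, so report it as not exposed.
        return False
```

A properly deployed Chroma instance would sit behind authentication or a private network, so a probe like this would fail; only scan your own infrastructure with such checks.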
In response to the incident, Canva provided a statement to Hackread, asserting: “We recently became aware that a file containing email addresses and survey responses from a small group of Canva Creators was uploaded to a third-party website. The information was not connected to Canva accounts or platform data in any way. The database owned by the third-party site was not adequately secured, which led to the information being accessible.”
The company further indicated that the exposed information was discovered by a security researcher using specialized tools, and was not broadly accessible to the general public or indexed by major search engines. Canva has confirmed that the file has since been removed and reported no unauthorized access attempts. The company has also reached out to the impacted Creators and is taking steps to ensure compliance with relevant legal obligations, emphasizing its commitment to data security.
Despite the current lack of evidence for misuse of the data, experts warn that even limited personal information combined with survey responses can facilitate targeted phishing attacks. The survey participants disclosed insights into their professional roles, creative processes, and satisfaction levels with the Canva platform—details that could be leveraged maliciously if they fall into the wrong hands.
My Jedai is a small Russian enterprise focused on enabling users to create chatbots powered by their own documents. Following UpGuard’s notification, the company promptly acted to secure the exposed database within a day. This leak highlights the emerging and unpredictable channels for data exposure resulting from the adoption of AI technologies. As companies begin utilizing advanced tools like Chroma for customer-facing bots or internal assistance, the urgency to integrate data into these systems can sometimes lead to oversights and configuration errors.
This instance underscores a broader trend in the global application of AI tools, where data collected by a prominent Australian tech company has ended up in an unsecured database managed by a small Russian firm, hosted in Estonia. As the use of large language models and third-party chatbot services becomes more prevalent, maintaining traditional boundaries around data custody is increasingly complex.
UpGuard has noted that many documents in the compromised database were innocuous or nonsensical, comprising items like “mystical doctrines” and romantic advice sourced from public platforms such as Marie Claire and WikiHow. However, the presence of genuine corporate data—including internal conversation transcripts and links to restricted file-sharing platforms—demonstrates how sensitive information can inadvertently permeate AI systems when proper safeguards are neglected.