Security Flaws in NVIDIA Triton Allow Unauthenticated Attacks to Execute Code and Compromise AI Servers

Published: August 4, 2025
Category: AI Security / Vulnerability

A newly disclosed set of vulnerabilities in NVIDIA’s Triton Inference Server, an open-source platform for deploying artificial intelligence (AI) models on Windows and Linux, puts unpatched servers at risk of takeover. Wiz researchers Ronen Shustin and Nir Ohfeld said in a report published today that, when chained together, the flaws could allow a remote, unauthenticated attacker to gain full control of the server and achieve remote code execution (RCE). The identified vulnerabilities are:

  • CVE-2025-23319 (CVSS Score: 8.1): A flaw in the Python backend that allows an out-of-bounds write via a specially crafted request.
  • CVE-2025-23320 (CVSS Score: 7.5): A flaw in the Python backend where an attacker can exceed the shared memory limit by sending an excessively large request.
  • CVE-2025-23334 (CVSS Score: 5.9): A vulnerability in the Python backend that could lead to an out-of-bounds read.

NVIDIA Triton Vulnerabilities Enable Unauthenticated Code Execution Risks in AI Servers

NVIDIA’s Triton Inference Server, a widely used open-source platform for deploying AI models on Windows and Linux, is affected by a set of vulnerabilities that unauthenticated attackers could exploit. According to a report from Wiz researchers Ronen Shustin and Nir Ohfeld, chaining the flaws could give a remote attacker complete control over an affected server through remote code execution (RCE).

All three vulnerabilities are rooted in the Python backend of the Triton server. The most severe, CVE-2025-23319, has a CVSS score of 8.1, which falls in the High severity band of the Common Vulnerability Scoring System (CVSS), and lets an attacker trigger an out-of-bounds write by sending a specially crafted request. CVE-2025-23320, rated 7.5, lets an attacker cause the shared memory limit to be exceeded by sending a very large request. The third issue, CVE-2025-23334, carries a CVSS score of 5.9 and could lead to an out-of-bounds read.
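
For reference, the CVSS v3.x specification groups base scores into qualitative bands (None, Low, Medium, High, Critical). A minimal Python sketch of that mapping, applied to the three scores quoted above, might look like the following. The function name `cvss_severity` is illustrative and not part of any library.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity band."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base scores range from 0.0 to 10.0, got {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"       # 0.1 - 3.9
    if score < 7.0:
        return "Medium"    # 4.0 - 6.9
    if score < 9.0:
        return "High"      # 7.0 - 8.9
    return "Critical"      # 9.0 - 10.0


# Scores as quoted in the article.
SCORES = {
    "CVE-2025-23319": 8.1,
    "CVE-2025-23320": 7.5,
    "CVE-2025-23334": 5.9,
}

for cve, score in SCORES.items():
    print(f"{cve}: {score} ({cvss_severity(score)})")
```

Under these bands, 8.1 and 7.5 both rate as High and 5.9 as Medium. The Critical band starts at 9.0.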

Because exploitation requires no authentication, any attacker who can reach an exposed server over the network could attempt it. A successful compromise could lead to unauthorized data manipulation and service disruption. As businesses integrate AI models more deeply into their workflows, securing these platforms against such threats becomes imperative.

The attack path described by the researchers maps onto tactics in the MITRE ATT&CK matrix, particularly initial access and privilege escalation: an attacker could use the vulnerabilities as an entry point and then obtain the privileges needed to execute arbitrary code in the AI server environment. This underlines the need for organizations to maintain robust security controls and vulnerability management processes.

In light of these findings, organizations deploying NVIDIA’s Triton Inference Server should review their security posture and apply the necessary updates. Doing so protects sensitive data and strengthens the overall resilience of their AI infrastructure against remote exploitation.
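
One practical starting point for such a review is to find out which Triton version each server reports. The sketch below assumes the server's HTTP endpoint is reachable and exposes the standard server metadata route (`GET /v2`), whose JSON response includes a `version` field. The base URL and the minimum fixed version are placeholders to be taken from your own deployment and from NVIDIA's security bulletin; neither value comes from the report, and the helper names are hypothetical.

```python
import json
import re
import urllib.request


def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '2.10.0' into (2, 10, 0)."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def is_outdated(reported: str, minimum: str) -> bool:
    """True if the reported version is numerically older than the minimum."""
    return parse_version(reported) < parse_version(minimum)


def fetch_server_version(base_url: str, timeout: float = 5.0) -> str:
    """Read the 'version' field from the server metadata endpoint (GET /v2)."""
    with urllib.request.urlopen(f"{base_url}/v2", timeout=timeout) as resp:
        return json.load(resp)["version"]


def needs_update(base_url: str, minimum: str) -> bool:
    """Query one server and compare its version with the supplied minimum.

    `minimum` is an assumption supplied by the caller: look up the fixed
    version in NVIDIA's bulletin rather than relying on a hard-coded value.
    """
    return is_outdated(fetch_server_version(base_url), minimum)
```

A call might look like `needs_update("http://localhost:8000", minimum=...)`, with the minimum filled in from the vendor advisory. Versions are compared numerically rather than as strings, so `2.9.0` is correctly treated as older than `2.10.0`.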
