Safety Concerns Arise with Launch of o3, o4-mini, and GPT-4.1

OpenAI recently unveiled its most sophisticated reasoning models to date, o3 and o4-mini, alongside GPT-4.1, a new family of models geared toward coding. In an unexpected move, the company also signaled the phase-out of its costliest model, GPT-4.5, focusing instead on newer models designed to improve safety and performance.
The new models are advertised as offering notable advances, including, in GPT-4.1's case, a context window capable of handling 1 million tokens, along with stronger chain-of-thought reasoning. But as OpenAI revises its pricing structure and model classifications, pressure is mounting on the organization to be clear about its testing methodologies and its strategies for mitigating the risks the newly deployed technology introduces.
On launch day, OpenAI released o3 and o4-mini, which natively integrate web browsing, Python execution, image analysis, and other tools into their reasoning process. Positioned as the company’s most advanced models to date, o3 and o4-mini reportedly post substantial gains over earlier models such as o1 and GPT-4, scoring higher on multidisciplinary assessments covering mathematics, coding, and science.
In OpenAI's preliminary testing, o3 scored 69.1% on the SWE-bench Verified coding benchmark, with o4-mini close behind at 68.1%. Notably, o4-mini is also the more economical of the two, priced at $1.10 per million input tokens and $4.40 per million output tokens.
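For a sense of what those rates translate to in practice, here is a minimal back-of-the-envelope sketch; the per-million-token prices are the ones cited above, while the example request sizes are hypothetical.

```python
# Rough cost estimate for o4-mini at the published rates.
# Rates are the per-million-token prices cited above; the example
# workload (20K prompt tokens, 2K completion tokens) is hypothetical.

INPUT_RATE = 1.10 / 1_000_000   # USD per input token
OUTPUT_RATE = 4.40 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a single o4-mini request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

print(f"${request_cost(20_000, 2_000):.4f}")  # -> $0.0308
```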
Both models are built around a chain-of-thought reasoning process that lets them parse and manipulate multimodal inputs, such as rotating an image or analyzing a blurred diagram, before generating a response. OpenAI is making the models available to Pro, Plus, and Team subscribers in ChatGPT, as well as to developers through its Chat Completions and Responses APIs, with an upgraded version of o4-mini expected shortly to further improve reliability.
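As an illustrative sketch rather than official documentation, the snippet below shows how a developer might reach o4-mini through both endpoints with the official openai Python SDK; the prompt is invented, and the client is assumed to read an OPENAI_API_KEY environment variable.

```python
# Sketch: calling o4-mini via the Chat Completions and Responses APIs
# using the official openai Python SDK (pip install openai).
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

# Chat Completions API
chat = client.chat.completions.create(
    model="o4-mini",
    messages=[{"role": "user", "content": "Explain chain-of-thought reasoning."}],
)
print(chat.choices[0].message.content)

# Responses API
resp = client.responses.create(
    model="o4-mini",
    input="Explain chain-of-thought reasoning.",
)
print(resp.output_text)
```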
Given the models' advanced capabilities, OpenAI has deployed a “safety-focused reasoning monitor” for o3 and o4-mini. The monitor intercepts prompts that may relate to biological or chemical threats, drawing on more than 1,000 hours of red-teaming exercises. OpenAI reported that, in simulated testing, the monitor declined to respond to high-risk biothreat inquiries 98.7% of the time. The company acknowledges, however, that adversaries may adapt their approaches to bypass these safeguards, and it advocates pairing the automated monitoring with ongoing human oversight.
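OpenAI has not published how the monitor works internally, but the general pattern, a dedicated classifier screening each prompt before the main model answers, can be sketched as follows. Everything here is hypothetical: classify_risk, its keyword heuristic, and the 0.5 threshold are invented placeholders standing in for a classifier trained on red-teaming data.

```python
# Hypothetical sketch of a pre-inference safety monitor. This is NOT
# OpenAI's implementation; the classifier and threshold are placeholders.
from dataclasses import dataclass
from typing import Callable

RISK_THRESHOLD = 0.5  # assumed cutoff; a real system would tune this

@dataclass
class MonitorResult:
    risk_score: float  # 0.0 (benign) to 1.0 (clearly hazardous)
    category: str      # e.g. "biorisk" or "benign"

def classify_risk(prompt: str) -> MonitorResult:
    """Stand-in for a classifier trained on red-teaming transcripts."""
    hazardous_terms = ("synthesize a pathogen", "weaponize", "nerve agent")
    hit = any(term in prompt.lower() for term in hazardous_terms)
    return MonitorResult(0.99 if hit else 0.01, "biorisk" if hit else "benign")

def guarded_generate(prompt: str, model_call: Callable[[str], str]) -> str:
    """Refuse flagged prompts; otherwise pass them to the model."""
    result = classify_risk(prompt)
    if result.risk_score >= RISK_THRESHOLD:
        return f"Refused: prompt flagged as {result.category}."
    return model_call(prompt)

print(guarded_generate("How would one weaponize a virus?", lambda p: "..."))
# -> Refused: prompt flagged as biorisk.
```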
Critics are raising concerns about potential gaps in OpenAI's evaluation process. Red-teaming collaborator Metr said its evaluations were conducted quickly and under simplified conditions, suggesting that more extensive testing might elicit stronger performance. It emphasized that the existing setup may overlook certain risks that could emerge in real-world applications.
In the days preceding the o3 and o4-mini launch, OpenAI quietly introduced the GPT-4.1 family, which includes mini and nano variants tailored for software development tasks. The models sport a context window that can process up to 1 million tokens in a single input, more than the full text of “War and Peace,” and the design improvements draw on direct developer feedback to improve usability in real-world programming environments.
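To put the 1-million-token figure in perspective, here is a rough estimate built on two stated assumptions: a commonly cited word count for the English translation of the novel, and a rule-of-thumb ratio of about 1.3 tokens per English word under common tokenizers.

```python
# Rough estimate: does "War and Peace" fit in a 1M-token context window?
# Both constants are approximations, not measured values.
WAR_AND_PEACE_WORDS = 587_000  # commonly cited English word count
TOKENS_PER_WORD = 1.3          # rule-of-thumb ratio for English prose

estimated_tokens = int(WAR_AND_PEACE_WORDS * TOKENS_PER_WORD)
print(f"~{estimated_tokens:,} tokens")                               # ~763,100
print(f"fits in a 1M-token window: {estimated_tokens < 1_000_000}")  # True
```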
Notably, GPT-4.1 shipped without a public safety report or comprehensive system card, a departure from industry norms. An OpenAI representative said that because GPT-4.1 is not categorized as a frontier model, a separate system card was deemed unnecessary. The decision has drawn scrutiny from former OpenAI safety researchers, who stress that system cards are key to transparency about how models were safety-tested and which protocols were followed.
In a further development, OpenAI said it will retire API access to GPT-4.5 by July 14, while the model will remain available in ChatGPT's research preview for paying users. Launched in late February and internally codenamed Orion, GPT-4.5 emphasized improvements in writing and persuasiveness but carried a steep price of $75 per million input tokens and $150 per million output tokens.