OpenAI Pitches GPT-5 as Quicker, Smarter, and More Precise


Company Claims Enhanced Model Reduces Hallucination, Excels in Benchmarks

Image: Rokas Tenys/Shutterstock

OpenAI has officially launched its latest AI model, GPT-5, with bold claims about its capabilities in a crowded generative AI market. The company calls GPT-5 its “smartest, fastest, and most useful model to date.” Such superlatives have become common in 2025 as rival labs unveil flagship models of their own and claim superior performance.

The metrics OpenAI shared indicate that GPT-5 is a significant advance over its predecessors. The Pro version of the model scored 88.4% on the Graduate-Level Google-Proof Question Answering (GPQA) benchmark, which OpenAI reports is higher than GPT-4o’s score. The company also states that the rate of “sycophancy,” its term for overly agreeable responses, has dropped from 14.5% to below 6%.

OpenAI calls GPT-5 its most capable coding model, citing scores of 74.9% on the SWE-bench Verified benchmark and 88% on the Aider Polyglot benchmark. That performance places it slightly ahead of Anthropic’s Claude Opus 4.1. OpenAI says GPT-5 can now complete complex coding tasks autonomously and can generate entire interface designs for non-developers.

OpenAI is also emphasizing GPT-5’s capabilities in specialized domains. In health care, for instance, the model scored 46.2% on a new benchmark named HealthBench Hard, which the company touts as its strongest health-related result yet. OpenAI nonetheless cautions that “ChatGPT does not replace a medical professional,” acknowledging the risks of relying on AI for critical health care decisions.

Accuracy is another key focus for OpenAI, which claims that GPT-5, with web search enabled, is approximately 45% less likely to produce factual errors than GPT-4o. In “thinking” mode, it reportedly makes about 80% fewer errors than o3. On long-form content benchmarks, GPT-5 is said to show six times fewer inconsistencies than its predecessor, although the company acknowledges that no AI system is entirely immune to generating misleading information.
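Relative figures such as “45% less likely” or “six times fewer” are easier to weigh once converted to absolute rates. The sketch below does that arithmetic in Python under a purely hypothetical baseline of 10 errors per 100 responses; OpenAI's actual baseline rates are not given here, and each claim is measured against a different model and benchmark, so the outputs illustrate the math only.

```python
def reduced_rate(baseline: float, relative_reduction: float) -> float:
    """Error rate left after cutting `baseline` by a fraction (0.45 means 45% fewer)."""
    if not 0.0 <= relative_reduction <= 1.0:
        raise ValueError("relative_reduction must be between 0 and 1")
    return baseline * (1.0 - relative_reduction)


def reduction_from_factor(factor: float) -> float:
    """Express 'N times fewer' as a relative reduction: 1 - 1/N."""
    if factor < 1.0:
        raise ValueError("factor must be at least 1")
    return 1.0 - 1.0 / factor


# Hypothetical baseline (an assumption, not a published figure):
# 10 errors per 100 responses.
baseline = 0.10

with_search = reduced_rate(baseline, 0.45)                      # "45% less likely"
thinking = reduced_rate(baseline, 0.80)                         # "80% fewer errors"
long_form = reduced_rate(baseline, reduction_from_factor(6.0))  # "six times fewer"

for label, rate in [("web search", with_search),
                    ("thinking mode", thinking),
                    ("long-form", long_form)]:
    print(f"{label}: {rate * 100:.1f} errors per 100 responses")
```

Under this assumed baseline, even the largest claimed reduction leaves a nonzero error rate, which is consistent with the company's caveat that no system is immune.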

OpenAI describes GPT-5 not simply as a model but as a more integrated tool for users. It is available in three tiers: Pro for intensive tasks, mini for quicker operations, and nano for embedded contexts. Free-tier users will initially access GPT-5 mini and shift to more constrained options once they reach usage caps. Subscribers paying $20 a month can continue using the Pro version, while developers can access all variants under existing API pricing.

The model’s developer toolkit has also evolved. GPT-5’s “Actions” system builds on earlier function-calling capabilities, giving developers finer control over how applications interact with external tools. For businesses building AI-driven products, this should make it easier to integrate the model into their own systems and to complete tasks with less user input.
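For readers unfamiliar with function calling, the general pattern is that the model emits a structured request naming a tool and its arguments, and the application decides whether and how to run it. The Python sketch below illustrates that pattern only. It is not OpenAI's API or the “Actions” wire format, which this article does not describe. The tool `get_order_status`, the JSON shape, and the simulated model output are all hypothetical.

```python
import json
from typing import Any, Callable, Dict


def get_order_status(order_id: str) -> Dict[str, Any]:
    """Hypothetical business tool; a real one would query an order system."""
    return {"order_id": order_id, "status": "shipped"}


# Allow-list of tools the application is willing to expose to the model.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "get_order_status": get_order_status,
}


def dispatch(tool_call_json: str) -> str:
    """Validate a model-issued tool call and run it, returning JSON text."""
    call = json.loads(tool_call_json)
    name = call.get("name")
    args = call.get("arguments", {})
    if name not in TOOLS:
        # Refuse anything outside the allow-list instead of guessing.
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        result = TOOLS[name](**args)
    except TypeError as exc:
        # Raised when the arguments do not match the tool's signature.
        return json.dumps({"error": str(exc)})
    return json.dumps(result)


# Simulated model output (assumed shape: tool name plus JSON arguments).
simulated_call = '{"name": "get_order_status", "arguments": {"order_id": "A-1001"}}'
reply = dispatch(simulated_call)
print(reply)
```

Keeping the allow-list and argument checks in application code is one way to retain the control over tool use that the announcement emphasizes: the model proposes a call, and the application decides whether to execute it.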

GPT-5’s smallest variant, nano, is optimized for resource-constrained scenarios, although OpenAI has yet to confirm fully offline operation on consumer devices. The move toward smaller models reflects a broader industry shift toward making AI more accessible while preserving core functionality.

The announcement comes amid intense competition: Anthropic, Google DeepMind, and Meta are also developing advanced AI models, and all face regulatory scrutiny over privacy and data management. GPT-5’s success will ultimately depend on whether OpenAI can substantiate its claims about the model’s utility and safety.
