OpenAI Introduces GPT-5.5: Its Smartest Model Yet, Meant to Lead to a "Super App"

April 24, 2026 jarvis

On Thursday, April 23, 2026, OpenAI introduced GPT-5.5, its smartest and most intuitive model to date. The company describes it as the next step towards a "super application" that should combine ChatGPT, the programming agent Codex, and an AI browser into one tool. While previous versions arrived at multi-month intervals, GPT-5.5 comes just a month after GPT-5.4 and promises a leap not only in performance but also in token efficiency.

What GPT-5.5 does better than its predecessors

The most significant shift is in so-called agentic computing – the model's ability to plan, use tools, verify outputs, and complete a task without constant user supervision. OpenAI states that GPT-5.5 significantly improves performance in code writing and debugging, online research, data analysis, document and spreadsheet creation, and software control. Company president Greg Brockman said at a press briefing that it is "a real step forward towards the computing systems we expect in the future".

In practice, this means that a user can assign the model a complex, multi-part task and trust GPT-5.5 more reliably to handle ambiguity, select the right tools, and check the result. It is particularly strong in agentic coding, computer use, and scientific research – areas that involve reasoning across context and working towards a goal over time.

Benchmarks: where it leads and where it lags

OpenAI has released an extensive set of measurements comparing GPT-5.5 with GPT-5.4, Anthropic's Claude Opus 4.7, and Google's Gemini 3.1 Pro. The model leads in key programming tests: on Terminal-Bench 2.0, which tests complex command-line workflows requiring planning and tool coordination, it achieved 82.7% (GPT-5.4: 75.1%, Claude Opus 4.7: 69.4%). On SWE-Bench Pro, which measures the resolution of real-world GitHub issues, it scored 58.6%, only slightly ahead of its predecessor's 57.7%.

The model also improved at professional computer work. On OSWorld-Verified, which tests autonomous control of a real computer environment, it achieved 78.7% (GPT-5.4: 75%, Claude Opus 4.7: 78%). On GDPval, a knowledge-work benchmark that simulates 44 different professions, it scored 84.9%, surpassing all competitors, including Gemini 3.1 Pro with 67.3%.

In scientific disciplines, the model also raised the bar. On the new genetics benchmark GeneBench, it achieved 25% compared to 19% for GPT-5.4, and on FrontierMath levels 1–3, it improved to 51.7%. Another notable result is a mathematical proof about Ramsey numbers that the model produced and that was later verified in the Lean proof assistant – an example where AI not only helped with code but contributed its own argument in a fundamental area of mathematics.

However, it is not absolute dominance. For example, on Humanity's Last Exam without tools, GPT-5.5 remains at 41.4%, while Claude Opus 4.7 has 46.9%. In some areas, the competition is still keeping pace.

Speed, Price, and Availability

One of the most surprising pieces of news is that GPT-5.5 achieves latency similar to GPT-5.4's, despite being significantly more capable. OpenAI attributes this to co-designing the model with its inference infrastructure on NVIDIA GB200 and GB300 NVL72 systems. The model also uses significantly fewer tokens for the same tasks, which in practice means lower costs and faster responses.

For end users, the model has been available since Thursday, April 23, 2026, in ChatGPT and Codex for Plus, Pro, Business, and Enterprise subscribers. The GPT-5.5 Pro version, designed for the most demanding tasks, is being rolled out to Pro, Business, and Enterprise users. In Codex, a context window of 400 thousand tokens is available, as well as a Fast mode with 1.5× faster generation at 2.5× the price.

For developers, OpenAI has published API prices: gpt-5.5 will cost 5 USD per million input tokens and 30 USD per million output tokens. The gpt-5.5-pro version will cost 30 USD per million input tokens and 180 USD per million output tokens. Batch and Flex processing are billed at half the standard rate, Priority at 2.5×. For comparison: GPT-5.4 is cheaper per token, but because GPT-5.5 needs fewer tokens for the same work, it should end up comparably or better priced for many tasks.
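To make the pricing concrete, here is a minimal sketch of a cost estimate based only on the figures quoted above. The function name `estimate_cost`, the tier labels, and the sample workload are illustrative assumptions, as is the idea that the Batch, Flex, and Priority multipliers apply equally to input and output tokens; actual billing may differ.

```python
# List prices in USD per million tokens, as quoted in the article.
PRICES = {
    "gpt-5.5": {"input": 5.0, "output": 30.0},
    "gpt-5.5-pro": {"input": 30.0, "output": 180.0},
}

# Tier multipliers as quoted in the article. Assumption: each multiplier
# applies uniformly to both input and output tokens.
TIER_MULTIPLIER = {
    "standard": 1.0,
    "batch": 0.5,
    "flex": 0.5,
    "priority": 2.5,
}


def estimate_cost(model, input_tokens, output_tokens, tier="standard"):
    """Return the estimated cost in USD for a single workload."""
    price = PRICES[model]
    base = (
        input_tokens * price["input"] + output_tokens * price["output"]
    ) / 1_000_000
    return base * TIER_MULTIPLIER[tier]


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 20,000 output tokens.
    for model in PRICES:
        for tier in TIER_MULTIPLIER:
            cost = estimate_cost(model, 200_000, 20_000, tier)
            print(f"{model:12s} {tier:9s} {cost:6.2f} USD")
```

For this hypothetical workload, gpt-5.5 comes to about 1.60 USD at the standard rate, 0.80 USD with Batch or Flex, and 4.00 USD with Priority; gpt-5.5-pro comes to about 9.60 USD at the standard rate.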

Super Application and Security Measures

During the presentation, Greg Brockman again mentioned the vision of a "super application" – a unified service that would combine ChatGPT, Codex, and an AI browser into a single tool for enterprise customers. This concept is not new; OpenAI has been talking about it for a long time, and according to management, GPT-5.5 brings it closer. Elon Musk has a similar goal with his X platform, so the race for a universal AI interface appears to be heating up.

With increasing capabilities comes greater responsibility. OpenAI classified the biological and cyber capabilities of GPT-5.5 as "high" according to its Preparedness Framework. The model did not show a critical level of risk, but the security team tightened classifiers for cyber misuse. At the same time, the company launched a Bio Bug Bounty with a reward of 25,000 USD for researchers who can find a universal way to bypass the model's biological security filters.

For verified defensive security teams, the Trusted Access for Cyber program is available, which allows the use of GPT-5.5 with fewer restrictions for legitimate security work. OpenAI also collaborates with government partners on critical infrastructure protection.

What this means for Czechia and Europe

For Czech users, the key information is that GPT-5.5 is available directly in ChatGPT, which supports the Czech language. They do not have to rely on unofficial translations or local proxies. For Czech companies and developers, the Codex integration, in which the model can autonomously navigate source code and documentation, may be particularly interesting.

At the European level, however, the deployment of such powerful models is subject to the EU AI Act. OpenAI states that API deployment is subject to additional safety and security requirements, which may mean that some enterprise features will initially arrive more slowly in the EU than in the USA. Nevertheless, this is a full global rollout, not a regionally limited beta version.

Long Context and Scientific Use

GPT-5.5 significantly improved its performance on long texts. On the Graphwalks BFS test with a one-million-token context, it achieved 45.4%, while GPT-5.4 managed only 9.4%. This means the model can work more effectively with extensive codebases, legal documents, or scientific datasets.
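For readers unfamiliar with the underlying operation, the sketch below shows a breadth-first search (BFS) over a small directed graph. It assumes the benchmark task has roughly this shape: given a long edge list and a start node, return the nodes at a given depth. The article does not describe the exact task format, so the function name `nodes_at_depth` and the sample graph are purely illustrative.

```python
from collections import deque


def nodes_at_depth(edges, start, depth):
    """Return the set of nodes whose shortest distance from start equals depth."""
    # Build an adjacency list from the directed edge list.
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    # Standard BFS: the first time a node is reached is its shortest distance.
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == depth:
            continue  # No need to expand beyond the requested depth.
        for neighbor in adjacency.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    return {node for node, d in distance.items() if d == depth}


if __name__ == "__main__":
    # Hypothetical toy graph; a long-context test would use a far larger one.
    edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
    print(sorted(nodes_at_depth(edges, "a", 2)))
```

The algorithm itself is trivial for a program. The difficulty for a language model is tracking every edge correctly when the graph fills a context of up to a million tokens.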

Among the external testers, scientists report the most significant change in how they use the model. Immunologist Derya Unutmaz of the Jackson Laboratory used GPT-5.5 Pro to analyze gene expression data from 62 samples covering almost 28 thousand genes and generated a research report that, by his estimate, would have taken his team months. Mathematician Bartosz Naskręcki of Adam Mickiewicz University in Poland used Codex to build an algebraic geometry application from a single prompt in 11 minutes.

Such examples suggest that GPT-5.5 is not just a better chatbot, but a tool that can genuinely accelerate scientific research – from biology through mathematics to drug development.

What is the difference between GPT-5.5 and GPT-5.5 Pro?

GPT-5.5 Pro is the same model, but it runs with a higher computational budget (parallel test-time compute), which allows it to solve more difficult tasks with greater accuracy. It is only available to Pro, Business, and Enterprise subscribers. On the BrowseComp benchmark, for example, it achieves 90.1% compared to 84.4% for the standard version.

When will GPT-5.5 be available via API?

OpenAI announced that GPT-5.5 and GPT-5.5 Pro will arrive in the API "very soon". An exact date has not been set. API deployment requires additional security checks, so developers should monitor official announcements.

Should I be concerned about biological or cyber misuse of the model?

OpenAI classified these capabilities as "high" but not "critical". The company has tightened security filters, launched a bug bounty program, and offers broader access to verified defensive teams. For ordinary users in Czechia, this means the model has stronger protections that do not restrict legitimate use.