AI Solves Open Math Problems in an Hour: GPT-5.5 Pro Stuns a Fields Medalist, Claude Learns From Its Own Memory

May 14, 2026 Daniel Cesak

This week in AI research brought two remarkable milestones. GPT-5.5 Pro from OpenAI solved open, doctoral-level mathematical problems in less than an hour, surprising even Fields Medalist Timothy Gowers. Meanwhile, Anthropic taught Claude to "dream": a new memory reflection system lets agents learn from past tasks and improve without human supervision. In addition, Google DeepMind introduced an AI co-mathematician, OpenAI expanded its cybersecurity tools, and Microsoft published global AI adoption statistics.

GPT-5.5 Pro and Doctoral-Level Mathematics: What the Fields Medalist's Verdict Means

Timothy Gowers, one of the most significant contemporary mathematicians and a Fields Medal recipient, published a detailed account of his interaction with the GPT-5.5 Pro model. The result shocked not only him but the entire mathematical community. Within roughly an hour, the model solved several open problems in additive number theory that Gowers had selected from a recent article by mathematician Mel Nathanson.

Specifically, GPT-5.5 Pro first proposed, within 17 minutes, a construction that improved Nathanson's estimate concerning the size of sumsets, that is, sets that arise from adding elements of a given set of integers. The model then took another 16 minutes to work through a far more complex problem concerning a more general case, and did so convincingly enough that MIT student Isaac Rajagopal, whose work was the starting point, labeled the result "almost certainly correct."
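To make the notion of a sumset concrete, here is a minimal Python sketch of the basic definition. The function name `sumset` and both example sets are illustrative assumptions; they are not taken from Nathanson's article or Gowers's account, and the sketch does not reproduce the construction GPT-5.5 Pro found.

```python
from itertools import product


def sumset(a: set[int], b: set[int]) -> set[int]:
    """Return A + B = {x + y : x in A, y in B}."""
    return {x + y for x, y in product(a, b)}


# Illustrative sets (assumptions, not from the article).
progression = {0, 1, 2, 3}  # arithmetic progression: many sums coincide
spread_out = {0, 1, 3, 7}   # all pairwise sums are distinct

print(len(sumset(progression, progression)))  # 7
print(len(sumset(spread_out, spread_out)))    # 10
```

Two sets of the same size can therefore have sumsets of very different sizes, which is why estimates of sumset size are a natural object of study.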

In his article, Gowers stated that the level of the result corresponds to "a perfectly reasonable chapter of a PhD dissertation in combinatorics." For laypeople, the key point is that this is not a routine computational task but the creation of an original mathematical proof, work that previously required weeks to months of human effort. GPT-5.5 Pro also came up with a specific idea, the use of so-called dissociated sets, which Rajagopal described as "completely original and ingenious."

What does this mean in practice? Gowers warns that the bar for beginning mathematics PhD students has just been raised. Problems that used to serve as valuable training material for young researchers, because they seemed solvable without being trivial, can now be handled by AI within an hour. This does not mean the end of mathematics, but it does mean a fundamental change in how people will collaborate with AI and in which skills will be valued.

Claude Learns to "Dream": Anthropic Builds Self-Improving Agents

While OpenAI is astounding the world with mathematics, Anthropic is pushing the boundaries of agentic AI. In a research preview, it introduced a "dreaming" function for Claude Managed Agents: a process that goes through records of previous sessions, looks for patterns in errors and successes, and updates the agent's memory. As a result, agents learn from their own experience, much as a person processes the previous day's experiences during sleep.
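The loop described above (review past sessions, find patterns in successes and errors, update memory) can be sketched in a few lines of Python. This is a toy illustration only: `SessionRecord`, `dream`, the `min_support` threshold, and the memory format are all hypothetical names chosen for this example and do not reflect Anthropic's actual implementation or API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SessionRecord:
    """Hypothetical record of one past agent session."""
    task: str
    approach: str
    succeeded: bool


def dream(records: list[SessionRecord], memory: dict[str, str],
          min_support: int = 2) -> dict[str, str]:
    """Return a copy of memory updated with lessons that recur in past sessions."""
    wins: Counter = Counter()
    losses: Counter = Counter()
    for r in records:
        (wins if r.succeeded else losses)[(r.task, r.approach)] += 1

    updated = dict(memory)
    # An approach that repeatedly worked and never failed becomes a preference.
    for (task, approach), n in wins.items():
        if n >= min_support and losses[(task, approach)] == 0:
            updated[task] = f"prefer: {approach}"
    # An approach that repeatedly failed is noted only if no preference exists.
    for (task, approach), n in losses.items():
        if n >= min_support and wins[(task, approach)] == 0:
            updated.setdefault(task, f"avoid: {approach}")
    return updated


records = [
    SessionRecord("contract-review", "cite clause numbers", True),
    SessionRecord("contract-review", "cite clause numbers", True),
    SessionRecord("log-triage", "read whole log", False),
    SessionRecord("log-triage", "read whole log", False),
    SessionRecord("log-triage", "search for errors first", True),
]
print(dream(records, {}))
```

In this sketch a single success is not enough to change memory; a pattern has to recur at least `min_support` times, which is one simple way to avoid learning from noise.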

Harvey, a company that uses Claude to automate legal tasks, saw roughly a six-fold increase in task completion success in its tests thanks to dreaming. Netflix, meanwhile, deployed multi-agent orchestration in which a lead agent delegates the analysis of logs from hundreds of builds to specialized sub-agents that work in parallel and share results.
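The lead-agent pattern mentioned for Netflix is essentially a fan-out and fan-in. The sketch below shows that shape with Python's standard `ThreadPoolExecutor`; the functions `analyze_build_log` and `lead_agent` are hypothetical stand-ins (a real sub-agent would call a model rather than count lines), and nothing here describes Netflix's or Anthropic's actual system.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze_build_log(log: str) -> dict:
    """Hypothetical sub-agent: here just a function that counts ERROR lines."""
    lines = log.splitlines()
    return {"lines": len(lines),
            "errors": sum("ERROR" in line for line in lines)}


def lead_agent(logs: dict[str, str], max_workers: int = 8) -> dict:
    """Fan the logs out to parallel workers, then merge their results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with the keys.
        results = dict(zip(logs, pool.map(analyze_build_log, logs.values())))
    failing = sorted(name for name, r in results.items() if r["errors"] > 0)
    return {"builds": len(results), "failing": failing}


logs = {
    "build-101": "compile ok\ntests ok",
    "build-102": "compile ok\nERROR: test_login failed",
    "build-103": "ERROR: missing dependency\nERROR: link failed",
}
print(lead_agent(logs))
```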

Anthropic simultaneously launched the "outcomes" tool, which lets users define a success rubric and have the agent correct its own work until it meets the criteria. In internal tests, this improved task success by up to 10 percentage points, most significantly for the hardest problems. The quality of generated documents rose by 8.4% and that of presentations by 10.1%.
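The idea of a rubric-driven correction loop can be illustrated with a short Python sketch. Everything named here is an assumption made for illustration: the `Rubric` type, `refine_until_passing`, and the toy `revise` function (which stands in for a model call) are hypothetical and are not the interface of Anthropic's "outcomes" tool.

```python
from typing import Callable

# A rubric maps a criterion name to a check on the current draft.
Rubric = dict[str, Callable[[str], bool]]


def refine_until_passing(draft: str, rubric: Rubric,
                         revise: Callable[[str, list[str]], str],
                         max_rounds: int = 5) -> tuple[str, list[str]]:
    """Revise the draft until every rubric check passes or rounds run out.

    Returns the final draft and the names of criteria that still fail.
    """
    for _ in range(max_rounds):
        failed = [name for name, check in rubric.items() if not check(draft)]
        if not failed:
            return draft, []
        draft = revise(draft, failed)
    failed = [name for name, check in rubric.items() if not check(draft)]
    return draft, failed


rubric: Rubric = {
    "has_summary": lambda d: d.startswith("Summary:"),
    "short_enough": lambda d: len(d) <= 80,
}


def revise(draft: str, failed: list[str]) -> str:
    """Toy stand-in for asking the agent to fix the failed criteria."""
    if "has_summary" in failed:
        draft = "Summary: " + draft
    if "short_enough" in failed:
        draft = draft[:80]
    return draft


final, remaining = refine_until_passing(
    "The build failed because of a missing dependency.", rubric, revise)
print(final, remaining)
```

The `max_rounds` cap is a design choice in this sketch: without it, a criterion the agent cannot satisfy would make the loop run forever.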

On the hardware front, Anthropic announced an agreement with SpaceX to use the full capacity of the Colossus 1 data center: more than 300 megawatts and over 220,000 NVIDIA GPUs. This will enable higher API limits and faster inference for paid users. The agreement also includes interest in developing orbital AI computing capacity.

Google and OpenAI Expand Their Offerings

Google DeepMind published a research paper on the "AI co-mathematician", an agentic workbench for mathematical research that achieved a 48% success rate on Tier 4, the hardest level of the FrontierMath benchmark. The system mirrors a real research workflow, from literature search through computational exploration and hypothesis creation to proof verification.

OpenAI, meanwhile, expanded its cybersecurity platform Daybreak, which combines GPT-5.5 with the Codex Security model for vulnerability detection, patch generation, and malware analysis. Alphabet launched Googlebook, a new category of notebooks built around Gemini Intelligence with a "Magic Pointer" developed by the Google DeepMind team. For developers, Google released an inference optimization for the open Gemma 4 model that uses speculative decoding to speed up generation by up to three times.
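Speculative decoding gets its speedup by letting a small, cheap draft model guess several tokens ahead and having the large target model verify them together. The toy Python sketch below shows the greedy variant of that idea with plain functions standing in for models. The names `speculative_decode`, `target`, and `draft` are assumptions for illustration; this is not Google's Gemma implementation, and it omits details of real systems such as probabilistic acceptance and the extra token the target contributes when every guess is accepted.

```python
from typing import Callable, Sequence

NextToken = Callable[[Sequence[str]], str]


def speculative_decode(prompt: list[str], target: NextToken, draft: NextToken,
                       num_tokens: int, k: int = 4) -> tuple[list[str], int]:
    """Greedy speculative decoding; returns (new tokens, verification rounds)."""
    out = list(prompt)
    goal = len(prompt) + num_tokens
    rounds = 0
    while len(out) < goal:
        # 1. The draft model cheaply proposes up to k tokens.
        proposal: list[str] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target checks the proposal. In a real system this is one
        #    batched forward pass; here it is counted as one round.
        rounds += 1
        for tok in proposal:
            expected = target(out)
            out.append(expected)  # always keep the target's own token
            if expected != tok:
                break             # first mismatch: discard the rest
    return out[len(prompt):], rounds


TEXT = "the cat sat on the mat and the dog sat on the rug".split()


def target(ctx: Sequence[str]) -> str:
    """Toy target model: recites TEXT by position."""
    return TEXT[len(ctx) % len(TEXT)]


def draft(ctx: Sequence[str]) -> str:
    """Toy draft model: agrees with the target except at every fifth position."""
    return "?" if len(ctx) % 5 == 4 else target(ctx)


tokens, rounds = speculative_decode([], target, draft, num_tokens=12)
print(" ".join(tokens), rounds)
```

Because every emitted token is the target's own choice, the output is identical to decoding with the target alone; the gain is that the 12 tokens here need only 5 verification rounds instead of 12 sequential target steps. How large the speedup is in practice depends on how often the draft model guesses correctly.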

What This Means for Czech Users and Companies

For Czech readers, these developments are relevant for several reasons. GPT-5.5 Pro is available to ChatGPT Pro subscribers for $200 per month (approximately CZK 4,500) and via API with variable pricing. Claude is accessible through the web interface in both English and Czech, while the API allows Czech companies to integrate the model into their own applications. Anthropic recently expanded its international infrastructure, including inference in Europe.

The Czech context is rapidly evolving: the Czech National Bank recently built its own AI center on NVIDIA chips, where it tests models from Mistral, OpenAI, and Alibaba for financial market oversight. The Czech AI Factory is emerging in Ostrava as a hub of European artificial intelligence. These investments show that Czech institutions are not falling behind in adopting the most advanced models.

Microsoft stated in its global AI adoption report that the share of active AI users among the working population worldwide rose from 16.3% to 17.8% in the first quarter of 2026, a gain of 1.5 percentage points. Twenty-six economies have already crossed the 30% adoption threshold. While no specific figure was given for the Czech Republic, the trend is clear: AI is ceasing to be an experiment and is becoming a standard work tool.

The regulatory framework in the EU, particularly the AI Act, meanwhile emphasizes transparency and model safety. Anthropic with its "outcomes" and "dreaming" and OpenAI with Daybreak are moving in this direction, creating tools that are not only more powerful but also easier to audit and control.

Conclusion

This week showed that AI models are improving not only their language abilities but are penetrating areas that require deep abstract thinking. GPT-5.5 Pro proves that artificial intelligence can generate original mathematical ideas at the level of top university researchers. Claude, meanwhile, shows the path to agents that not only follow instructions but actively improve themselves. For Czech companies, researchers, and ordinary users, this means one thing: the era when AI served only as advanced search is over. The era of true AI collaborators is beginning.

Do I need a paid account to use GPT-5.5 Pro?

Yes, GPT-5.5 Pro is currently available primarily to ChatGPT Pro subscribers for $200 per month (approximately CZK 4,500) or via API with pay-as-you-go token pricing. For regular use, the GPT-5.5 Instant version, which is included in cheaper subscriptions, is sufficient.

How does "dreaming" in Claude differ from regular chat memory?

While regular chat only remembers the current conversation, dreaming is a planned process that goes through an archive of previous sessions, looks for recurring patterns, structures knowledge, and updates the agent's long-term memory. Thanks to this, the agent learns across tasks and improves even without direct human guidance.

Can the Czech scientific community use the AI mathematician from Google DeepMind?

Google DeepMind published a research paper on the AI co-mathematician, but a full-fledged public tool is not yet available. However, Czech researchers can use the public API of Gemini 2.5 Pro models or the freely available open Gemma 4 model for their own experiments. With Google's expanding European infrastructure, better availability for Czech academic institutions can be expected.