GPT-5.5 Pro Solved PhD-Level Math in an Hour. Anthropic's Claude Learned to "Dream"

May 14, 2026 Daniel Cesak


Last week, artificial intelligence approached a boundary that until recently sounded like science fiction: solving open mathematical problems at PhD level in a single hour. While Fields Medal laureate Timothy Gowers was testing the capabilities of GPT-5.5 Pro, competitors at Anthropic were teaching their Claude to "dream" and, at the same time, closed a computing capacity deal with Elon Musk. What happened in AI research over the last seven days?


GPT-5.5 Pro: When AI Writes Doctoral-Level Math Faster Than a Human

Timothy Gowers, a British mathematician and holder of the Fields Medal (often described as the mathematical equivalent of the Nobel Prize), described on his blog an experience that just a year ago would have met with disbelieving smirks. According to him, ChatGPT 5.5 Pro solved several open problems in additive number theory that Mel Nathanson, the author of a recent article, had left as a challenge for young researchers. And it did not merely solve them: it did so "in about an hour, without significant mathematical input from me," as Gowers puts it.

These were problems from the area of so-called "diversity, equity and inclusion for problems in additive number theory." Gowers himself admits that he had to "substantially raise his estimate of the mathematical abilities of large language models." In his view, if an open problem has an elegant argument that human mathematicians have not yet noticed, there is a decent chance that an LLM will find it. That message has far-reaching consequences: open problems from recent articles, which previously served as valuable training material for beginning scientists, now come with a new difficulty, namely solving them before AI does.

Interestingly, Gowers is not alone. At the same time, a team at OpenAI announced that a customized variant of GPT-5.5 helped discover a new proof concerning the asymptotic properties of so-called off-diagonal Ramsey numbers, and that the proof was verified in Lean, a formal system for computer verification of mathematical proofs. This means that AI is not just guessing; it also produces rigorously checkable results.
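
To give a sense of what verification in Lean means in practice, here is a deliberately trivial Lean 4 sketch. It is unrelated to the Ramsey-number proof itself, and the theorem name is made up for illustration. The point is only that Lean accepts a statement when the supplied proof type-checks and rejects it otherwise.

```lean
-- Illustrative only: a toy statement, not part of the proof mentioned above.
-- `add_comm_toy` is a hypothetical name chosen for this example.
theorem add_comm_toy (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete fact that Lean checks by computation.
example : 2 + 2 = 4 := rfl
```

A research-level proof differs in scale, not in kind: every step must pass the same mechanical check.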

For Czech readers, it is important that this level of mathematical reasoning is currently available only through the highest ChatGPT subscription tiers, typically ChatGPT Pro for USD 200 per month (approximately CZK 4,500). Neither the free version nor the regular paid plans offer this capability. At the same time, Czech universities and scientific institutions should monitor how quickly this type of tool spreads into academic research, and what computational resources Czech science will need so that it does not fall behind.

Google DeepMind Wants to Be a "Co-Mathematician"

The competition is not standing still. Google and Google DeepMind introduced on arXiv an "AI co-mathematician": an agent workbench designed around the actual workflow of mathematical research. That workflow includes generating ideas, searching the literature, computational experimentation, proving theorems, tracking unsuccessful hypotheses, and building theories.

According to the authors, the system achieved a 48% success rate on the FrontierMath Tier 4 benchmark, one of the most difficult tests of mathematical reasoning for AI. Set alongside competing results such as those of GPT-5.5 Pro, this shows that mathematical research is becoming a new arena of competition among the world's largest AI laboratories. For European research, it is a signal that investment in mathematical AI is not a luxury but a necessity for maintaining competitiveness.

Claude Is Learning to "Dream." Anthropic Closed a Deal with SpaceX

While OpenAI and Google are racing in mathematics, Anthropic took a different tack. The company announced a feature called "dreaming," a memory refinement process for Claude Managed Agents. In practice, after completing its tasks the agent "reviews" previous sessions, looks for patterns, updates its memory, and learns from past experience across various tasks.

Anthropic states that in tests at the legal AI company Harvey, agents with dreaming achieved approximately 6x higher task completion rates. Netflix, meanwhile, uses multi-agent orchestration to process logs from hundreds of software builds. The dreaming feature is currently in research preview, while multi-agent orchestration is moving to public beta.

What does this mean for the average user? Claude is becoming less of a tool for one-off queries and more of a personal assistant that remembers context across projects. For Czech companies and developers who use Claude, for example, via API, this can mean significantly more efficient automation of internal workflows — from document processing to data analysis.

Even more remarkable is the second piece of news from Anthropic: the company closed a deal with SpaceX to use the entire capacity of the Colossus 1 data center. The cooperation will add more than 300 megawatts of power and over 220,000 NVIDIA graphics processors to Anthropic's infrastructure within a few weeks. This is a massive increase in computing capacity that will allow the training and inference of Claude models to scale to a level that until now only the largest players, such as Google or Microsoft, could reach.

For the European context, it is worth noting that Anthropic also announced a partnership with NEC to build Japan's largest AI workforce and is expanding its presence in Australia and New Zealand. The Czech market remains outside its primary focus, but Claude's cloud services are available through Google Cloud Vertex AI and AWS, and therefore also to companies in the Czech Republic.

OpenAI Daybreak: AI as a Defender in Cyber Warfare

OpenAI is also keeping pace in another key area: cybersecurity. The company introduced Daybreak, a system that applies frontier models to discovering vulnerabilities, generating patches, and verifying fixes. The system combines GPT-5.5, Codex Security, and a layered approach for verified defensive workflows.

Daybreak offers secure code review, vulnerability triage, malware analysis, and patch validation. At the same time, it formalizes the division between general and specialized cybersecurity models, with tighter account controls for high-risk security tasks. For Czech companies and institutions facing growing cyber threats, this can be a valuable tool; however, availability and pricing for the European market remain open questions.

Isomorphic Labs Raised $2.1 Billion for AI Drug Design

Alphabet spin-off Isomorphic Labs announced a Series B of $2.1 billion (approximately CZK 48 billion) for scaling its AI engine for drug design. The company combines DeepMind's research heritage, Demis Hassabis's Nobel Prize, and pharmaceutical partnerships into a commercial platform for drug discovery using artificial intelligence.

This amount is one of the largest investments in AI in the life sciences in history and signals that the pharmaceutical industry is entering an era where AI plays a key role in molecular design and drug effect prediction. For the Czech biotech sector, this is a reminder that cooperation with AI platforms can be crucial for competitiveness.

Googlebook and Gemma 4: AI Penetrates Hardware

Alphabet introduced the Googlebook — a new category of notebooks designed primarily for Gemini. The device features a so-called "Magic Pointer" developed in cooperation with Google DeepMind, its own Gemini widgets, and tight integration with Android. While specific availability and prices in Europe are not known, this is another step towards making AI the operating layer of the entire ecosystem — not just a cloud service.

At the same time, Google released multi-token prediction drafters for the open model Gemma 4. This speculative decoding approach can speed up inference up to 3x without degrading output quality or logical reasoning. For local deployment of models — for example, in Czech companies with sensitive data — this means significantly more efficient hardware use and lower costs.
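
The idea behind speculative decoding can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not Google's multi-token prediction drafter: both "models" are hypothetical deterministic functions, the names `target_model`, `draft_model`, and `speculative_decode` are invented here, and only the simple greedy variant is shown. A cheap drafter proposes several tokens, the expensive target model checks them, and the longest agreeing prefix is kept, so the output is identical to what the target model would have produced alone.

```python
from typing import Callable, List, Tuple

Token = int
Model = Callable[[List[Token]], Token]  # maps a context to its greedy next token


def target_model(ctx: List[Token]) -> Token:
    # Hypothetical "large" model: a deterministic toy rule standing in for an LLM.
    return (sum(ctx) * 31 + len(ctx)) % 7


def draft_model(ctx: List[Token]) -> Token:
    # Hypothetical "small" drafter: agrees with the target except when the
    # context length is a multiple of 4, where it deliberately guesses wrong.
    guess = target_model(ctx)
    return (guess + 1) % 7 if len(ctx) % 4 == 0 else guess


def greedy_decode(model: Model, prompt: List[Token], n_new: int) -> List[Token]:
    # Baseline: one model call per generated token.
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


def speculative_decode(
    target: Model, draft: Model, prompt: List[Token], n_new: int, k: int = 4
) -> Tuple[List[Token], int]:
    out = list(prompt)
    goal = len(prompt) + n_new
    target_passes = 0  # counts verification rounds of the expensive model
    while len(out) < goal:
        # 1) The drafter proposes up to k tokens, one after another.
        proposal: List[Token] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2) The target verifies the proposal. A real system scores all
        #    positions in one batched forward pass; here we loop for clarity.
        target_passes += 1
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if tok != expected:
                # Keep the agreeing prefix, then the target's own token.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            out.extend(proposal)  # every drafted token was accepted
    return out, target_passes


if __name__ == "__main__":
    prompt = [1, 2, 3]
    tokens, passes = speculative_decode(target_model, draft_model, prompt, 20)
    print(tokens == greedy_decode(target_model, prompt, 20), passes)
```

Because rejected tokens are replaced by the target model's own choice, quality is unchanged; the speedup depends entirely on how often the drafter guesses correctly. Production systems typically verify against sampled probabilities rather than exact greedy matches, which this sketch omits.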

Microsoft: AI Is Used by Almost a Fifth of the World's Workforce

Microsoft published its Global AI Diffusion Report for Q1 2026. According to the report, 17.8% of the world's workforce uses artificial intelligence, up from 16.3% in the previous quarter. A total of 26 economies have already exceeded a 30% adoption rate. The numbers show that AI adoption is accelerating, but unevenly: differences between countries, industries, and labor markets are deepening.

For the Czech Republic, this means that if we want to keep pace with Western Europe, we must accelerate digital literacy and the availability of AI tools not only for large companies, but also for small and medium-sized enterprises. The European Union is meanwhile tightening regulations — the AI Act is already beginning to affect which models and applications can be legally deployed on the European market.

What Does This Mean for the Future?

The past week showed that AI is not developing linearly, but exponentially. Mathematics, which until recently was the domain of human intuition, is now terrain where AI not only competes, but sometimes surpasses human researchers. Agent systems like Claude with the dreaming feature learn from their own experiences, and multi-agent orchestration enables solving tasks hitherto considered too complex for automation.

For Czech readers and companies, it is key to realize that this development is not a distant theoretical game. Tools like Claude, ChatGPT, and Gemini are available through cloud APIs even in the Czech Republic, and the ability to use them effectively is quickly becoming a competitive advantage for those who have it and a handicap for those who do not.

How is it possible that AI solves mathematical problems that humans have failed to solve?

Large language models like GPT-5.5 Pro search enormous spaces of possible solutions and combine proof techniques from the entire mathematical literature. According to Timothy Gowers, they have the ability to find elegant arguments that people did not notice — especially for problems that did not receive sufficient attention. Still, it is more about "connecting existing ideas" than about true creative originality.

Can a Czech company order Claude with the "dreaming" feature?

The dreaming feature is currently in research preview and available primarily to enterprise customers. Regular users and smaller companies will get access to it only later. Claude itself, however, is available through the web, the API, and integrations with Google Cloud and AWS, and is therefore also available to Czech companies.

What are "off-diagonal Ramsey numbers" and why are they important?

Ramsey theory is an area of combinatorics that studies the conditions under which certain structures must appear in any sufficiently large arrangement. "Off-diagonal" variants deal with asymmetric conditions. A new proof verified in Lean means that AI is contributing to the discovery of fundamental mathematical truths that have applications in computer science, graph theory, and algorithm design.
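
For readers who want the definition behind the term, the standard formulation can be written as follows. This is textbook background only, not a statement of the new result, whose exact content is not specified here.

```latex
% Standard definition (textbook background, not the new result).
% R(s,t): the smallest n such that every red/blue colouring of the
% edges of the complete graph K_n contains a red K_s or a blue K_t.
\[
  R(s,t) \;=\; \min\bigl\{\, n : \text{every 2-colouring of } E(K_n)
  \text{ contains a red } K_s \text{ or a blue } K_t \,\bigr\}
\]
% Diagonal case: s = t, for example R(3,3) = 6.
% Off-diagonal case: s \neq t, typically s fixed and t \to \infty.
% A classical asymptotic result (upper bound: Ajtai, Komlos, Szemeredi;
% lower bound: Kim):
\[
  R(3,t) \;=\; \Theta\!\left(\frac{t^{2}}{\log t}\right)
\]
```

"Asymptotic properties" in this context means statements of the second kind: how fast R(s,t) grows as one parameter tends to infinity.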