Anthropic Investigates Leaked Access to Claude Mythos: Dangerous AI Model Fell into the Wrong Hands

[Illustration: AI chip circuit board]
Anthropic is investigating a serious security failure: according to Bloomberg, a small group of users gained access to Claude Mythos, a cybersecurity tool the company considers too dangerous for public release. The incident shakes confidence in the safeguards around the most powerful AI models and raises the question of whether large technology companies can keep that power under control.

What happened? Unauthorized access to a “dangerous” model

Anthropic confirmed that it is investigating a report of unauthorized access to the Claude Mythos Preview model through the environment of one of its third-party suppliers. “We are investigating a report of unauthorized access to Claude Mythos Preview through our third-party supplier’s environment,” the firm said in a statement to the BBC.

According to Bloomberg, users in a private forum managed to access the model without standard permissions. The person who facilitated the access was already authorized to view Anthropic’s AI models as part of work for an external supplier, but the group abused this access. Bloomberg further reported that the group has been using the model since gaining access, though not for active hacking attacks, in order to avoid detection.

Anthropic emphasizes that there is no evidence the model fell into the hands of malicious actors, or that its own systems were directly attacked. Nevertheless, the incident raises serious concerns about the ability of large AI laboratories to keep their most advanced models out of reach of those who might abuse them.

Why is Claude Mythos so exceptional — and so dangerous

Claude Mythos Preview is not an ordinary language model. Anthropic describes it as a “qualitative leap” in artificial intelligence capabilities and simultaneously as its most dangerous model to date. The peculiarity is that its risks did not arise from any special cyber training, but simply from improved general reasoning — meaning that similar capabilities could soon be acquired by competing models as well.

According to Anthropic’s internal tests, Mythos achieved 93.9% on the SWE-bench Verified benchmark, just over 13 percentage points above the previous top score (Claude Opus 4.6 reached 80.8%). The model discovered thousands of serious vulnerabilities across all major operating systems and web browsers. Among them, for example:

  • a bug in the security-focused OpenBSD operating system that evaded detection for 27 years,
  • a vulnerability in the FFmpeg multimedia framework that survived 5 million previous automated tests,
  • several vulnerabilities in the Linux kernel that could give an attacker full control over a machine.

Mozilla confirmed that Mythos found 271 bugs in Firefox 150. “Claude Mythos Preview is as capable as top human security researchers,” stated Mozilla chief technology officer Bobby Holley.

The critical risk is that Mythos can chain multiple vulnerabilities together: it can identify five different weaknesses in a single piece of software and combine them into one extremely dangerous exploit. Paired with AI’s growing ability to work for long stretches without supervision, Anthropic argues, this puts us at an inflection point for cyber risk.

Project Glasswing: defense against its own invention

Anthropic responded to the risks by launching Project Glasswing — a coalition of more than 40 of the world’s largest technology companies, including Apple, Google, Microsoft, Cisco, and Broadcom. The goal is to provide selected partners with early access to Mythos so they can find and patch vulnerabilities in their systems and in critical open-source software on which modern digital infrastructure depends.

Anthropic is putting $100 million in model-usage credits into the project and donating a further $4 million to open-source security initiatives. “AI capabilities have crossed a threshold that fundamentally changes the urgency of protecting critical infrastructure from cyber threats — and there is no going back,” stated Anthony Grieco, chief security officer at Cisco.

Cybersecurity expert Alex Stamos, former security chief at Facebook and Yahoo, warned: “We have about six months before open-source models catch up to foundation models in finding bugs. After that, every ransomware actor will be able to find and exploit vulnerabilities without leaving traces for investigators — and at minimal cost.”

Government reactions: United Kingdom calls for cooperation, US mired in dispute

The unauthorized access incident comes at a time when governments around the world are grappling with how to regulate the most powerful AI models — so-called frontier AI. At the CyberUK conference, hosted by the UK National Cyber Security Centre (NCSC), NCSC chief Richard Horne appealed to experts not to panic over new AI attacks, but to focus on the basics of cybersecurity.

“Advanced AI is rapidly enabling the discovery and exploitation of existing vulnerabilities at scale, illustrating how quickly it will reveal places where the basics of cybersecurity still await remediation,” Horne declared. British security minister Dan Jarvis called on AI companies to cooperate with the government on a “generational effort” to protect critical networks from attackers.

In the United States, the situation is more complicated. The White House held a “productive” meeting with Anthropic after the Pentagon moved to label the firm a supply chain risk, in part because Anthropic refused to amend its contract to permit mass domestic surveillance and fully autonomous weapons. A court has so far blocked the designation. Even so, according to Axios, the US NSA has gained access to Mythos, even as the Pentagon treats Anthropic as a security risk.

President Trump hinted in a CNBC interview that a deal between Anthropic and the Department of Defense is “possible.” “We had very good talks with them and I think it’s shaping up,” Trump stated.

What does this mean for the Czech Republic and Europe?

For Czech readers and European companies, this incident carries several warning signals. All the most powerful AI models are created outside Europe — mainly in the USA and China. This means that the European Union, including the Czech Republic, has no direct control over how these models are trained, built, or released.

The EU AI Act, which entered into force in 2024 and is being implemented in stages, does impose obligations on high-impact general-purpose AI systems, but its practical enforceability against American laboratories remains limited. Czech companies and institutions must therefore rely on the goodwill of firms like Anthropic, OpenAI, or Google to keep their most powerful models from being misused, and to keep them secure.

The Mythos incident shows that even the laboratories themselves do not have full control. If the model falls into the hands of organized criminals or state actors through third-party vulnerabilities, European critical infrastructures — from power grids to banking systems to hospitals — could be exposed to a new generation of attacks against which current defenses are insufficient.

Raluca Saceanu, CEO of the cybersecurity company Smarttech247, commented on the situation: “When powerful AI tools are accessible or used outside their intended control, the risk is not just a security incident, but the spread of capabilities that could be used for fraud, cyber abuse, or other malicious activities.”

For ordinary users, Claude Mythos remains unavailable. Anthropic has not released it to the public and has no plans for wide distribution. Czech companies can access it only through the Project Glasswing partner program, provided they meet strict security conditions. The model does not support Czech any better than other versions of Claude — its uniqueness lies exclusively in its cyber capabilities.

Conclusion: A fragile balance between power and control

Anthropic has long presented itself as a safety-oriented laboratory that wants to be the first to encounter the most dangerous AI capabilities — and at the same time show the way to mitigate them. Project Glasswing is, in a sense, the fulfillment of this mission. At the same time, however, it is based on a deeply uncomfortable premise: the only protection against dangerous AI is to build it first.

The unauthorized access to Mythos, although it has not yet caused direct damage, shows how fragile this balance is. If the model escapes its controlled environment, whether through a supplier error, theft of the model weights, or competitive development, the capabilities that today serve defense could be turned against us.

Can Claude Mythos reach ordinary users or hackers?

Anthropic has not released the model to the public and has no intention of doing so. Access is only granted to selected companies within Project Glasswing and some government agencies. The risk lies rather in the fact that competing laboratories or open-source projects will soon develop similar capabilities that will be more widely available.

How does Claude Mythos differ from regular Claude or ChatGPT?

While regular Claude or ChatGPT are general-purpose conversational assistants, Mythos specializes in cybersecurity. Its strength lies in finding extremely complex vulnerabilities in software, chaining them into sophisticated attacks, and working for long stretches without human supervision. According to benchmarks, it performs on par with top human security researchers.

What should Czech companies do to protect themselves from similar AI threats?

The foundation is basic cyber hygiene: regular software updates, decommissioning obsolete systems, network segmentation, and employee training. In the long term, companies should follow the rollout of the EU AI Act and demand transparency from AI tool suppliers about their security measures. Operators of critical infrastructure should also run regular penetration tests that account for scenarios involving advanced AI.
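Parts of that hygiene checklist can be automated. As a minimal illustration (not part of any tool mentioned in this article, and no substitute for a real audit), here is a short Python sketch that flags legacy network services listening on ports that generally should not be exposed:

```python
import socket

def check_open_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept TCP connections on `host`."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 when the connection succeeds
            if s.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

# Hypothetical example: audit a host for legacy services that are
# frequent targets of automated attacks.
risky = {23: "telnet", 445: "smb", 3389: "rdp"}
for port in check_open_ports("127.0.0.1", list(risky)):
    print(f"warning: {risky[port]} (port {port}) is reachable")
```

Only scan hosts you own or are authorized to test; dedicated tools such as nmap do this job far more thoroughly, but the principle — periodically verifying what your network actually exposes — is the same.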