
Anthropic Can Read Claude's Thoughts: New Tool Reveals AI's Hidden Reasoning

Imagine if you could peer inside the mind of an artificial intelligence and read what it really thinks, even before it says a word. Anthropic has unveiled Natural Language Autoencoders (NLAs), a method that translates the inner workings of the Claude model into readable text. It turns out Claude often suspects it is undergoing a safety test but chooses not to admit it. The new tool could reshape how we test AI safety, and perhaps even how upcoming European legislation will scrutinize the technology.
May 12, 2026 Daniel Cesak