The world of developers has changed dramatically in the last year. AI assistants are no longer just advanced autocorrectors; they have become autonomous agents capable of reading files, executing commands in the terminal, and interacting with external tools. This capability is incredibly efficient, but at the same time, it creates an entirely new attack surface. Research from the Imperva and Tenet Security teams has revealed that these agents are extremely susceptible to a technique called prompt injection (instruction insertion), which is disguised as legitimate data.
Attack Mechanism: How does an AI agent become an enemy?
The fundamental problem is not a flaw in the model itself (such as GPT-4 or Claude 3.5 Sonnet), but in how the agent processes external data. Within the so-called agentic workflow (a process where AI proceeds step-by-step), the agent takes information from various sources – emails, error logs, or shared contacts – and treats it as fact.
The attacker exploits this trusting approach. If a hacker can insert an instruction into the data stream that the agent reads, an instruction that looks like a system problem solution, the agent will execute it without informing the user. This process is called Agentjacking.
The OpenClaw Case: Hidden Data in Contacts
Research by Imperva focused on OpenClaw, a popular self-hosted AI agent. Researchers found that when processing objects such as vCards (digital business cards) or locations, they are not properly separated from the main prompt. An attacker can insert a hidden command into the "name" field of a contact. Because the model does not see a clear boundary between data and instruction, it interprets the command as part of its task. This can lead to the agent, for example, sending sensitive information to an external server.
The Sentry Case: Fake Error Messages as a Conduit for Malware
Even more dangerous is the finding by the Tenet Security team. They focused on modern developer tools such as Cursor or Claude Code. These tools often use integrations for error monitoring (e.g., Sentry) so developers can quickly fix bugs.
An attacker can inject a fake error into the publicly available Sentry API. This error contains instructions that look like a "recommended fix procedure". The AI agent reads this error, interprets it as legitimate diagnostics, and then executes a command (e.g., via npm install) that installs a malicious package directly into the developer's environment. The result is immediate access to:
- Environmental variables (AWS keys, GitHub tokens).
- Git credentials.
- Private repositories.
Comparison of Popular AI Tools for Developers
For a Czech developer or technology company, it is important to know which tools to use and what their price and security profile are.
| Tool | Type | Price (approx.) | Main Risk in the Context of Agentjacking |
|---|---|---|---|
| Cursor | AI Code Editor | Free tier / $20 monthly | High (direct shell integration) |
| Claude Code | CLI Agent (Anthropic) | Based on API usage | Very High (autonomous execution in terminal) |
| GitHub Copilot | Extension / Chat | $10–$19 monthly | Medium (more limited autonomy) |
While Claude Code offers extreme performance within the CLI, its ability to autonomously execute commands makes it a primary target for Agentjacking. Although Cursor is very popular in the Czech development environment due to its intuitiveness, its deep integration into the operating system requires increased vigilance.
Practical Impact: What does this mean for Czech companies and developers?
In the Czech Republic, which has a strong IT scene and many development studios, this finding represents a fundamental shift in the security paradigm. It is no longer enough to protect the corporate network or email inboxes. The attack surface has shifted to the developer's terminal.
From the perspective of EU regulations (AI Act), these types of vulnerabilities can have serious legal consequences for companies that implement autonomous agents into their processes. If a company deploys an AI agent that has access to production data, and this agent is "hacked" through Agentjacking, it could be considered a failure to ensure system security according to European standards.
How to defend?
- Human-in-the-loop: Never let an AI agent execute commands without human confirmation, especially those that install packages or modify system files.
- Sandboxing: Run development environments and AI agents in isolated containers (e.g., Docker) that do not have access to sensitive keys on the host system.
- Permission Restriction: The AI agent should only have the minimum necessary permissions (principle of least privilege).
- Monitoring: Monitor unusual activity in the terminal and network requests from developer machines.
Can I use Cursor or Claude Code safely?
Yes, but you must change the way you interact. Security does not depend on whether the tool is "safe", but on how much you trust it. Always review every command that the AI agent suggests executing in the terminal, and do not use these tools with administrator (root/sudo) privileges in an environment where you have sensitive keys stored.
How do I know if an AI agent is attacking me?
This is very difficult because the attack occurs through legitimate processes. However, warning signs include unexpected package installations (npm, pip), attempts to send data to unknown domains, or commands like curl or wget that try to exfiltrate the content of files such as .env or .
Is this problem solvable with a single software or update?
No, this is an architectural problem of current LLM agents. Updates can fix specific bugs (as with OpenClaw), but the principle of "trusting data" remains. The only solution is a combination of secure architecture (sandboxing) and human oversight.