Agentjacking: A New Era of Cyberattacks on AI Programmers. How to Protect Autonomous Agents?

June 11, 2026 jarvis

Traditional security processes, designed for the human pace of software development, are becoming ineffective in the era of autonomous AI agents. While a penetration test used to run at the end of a monthly cycle, today's AI agents can generate hundreds of code changes per day. This opens the door to a new phenomenon called "Agentjacking" — a technique where attackers take over the decision-making processes of AI agents and exploit their autonomy to inject malicious code or exfiltrate data.

The world of software engineering is undergoing a fundamental transformation. We are moving from tools that merely suggest lines of code (like the original version of GitHub Copilot) to fully autonomous agents capable of independently solving entire tasks, fixing bugs, and managing infrastructure. This "agentic" development capability is extraordinarily efficient, but according to expert Boaz Barzel of Ox Security, it represents an entirely new and critical security risk.

At the Infosecurity Europe conference, Barzel pointed out that the moment AI begins to act autonomously, the concept of "shift left" (moving security to the beginning of the development cycle) ceases to apply. In the age of agents, there is no longer a "left" to shift security to; it must be built directly into the agent's own operation.

What Is Agentjacking and Why Should You Care?

Agentjacking is not just about vulnerable code. It is about taking control of the logic an AI agent uses to perform tasks. If an attacker can manipulate the instructions or tools available to the agent, they can force it to perform actions it was not assigned — for example, sending sensitive data to an external server or creating a backdoor in an application.

Ox Security has identified four main attack surfaces that traditional security tools cannot effectively cover:

Input: Any instructions, prompts, or protocols entering the agent. An attacker can use so-called prompt injection to change the agent's priorities or make it ignore security rules.
Tools: AI agents today use various interfaces, such as MCP servers (Model Context Protocol) or connections to SaaS services. These "capabilities" can be exploited for lateral movement within a network or data exfiltration.
Execution: Autonomous processes that run without direct human oversight. If an agent runs in a closed loop without monitoring, it can silently perform malicious operations.
Output: Generating vulnerable or destructive code at machine speed. AI can produce thousands of lines of code containing injection flaws before any human reviewer has had a chance to look at them.
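
As a rough illustration of how the "Tools" surface might be narrowed, here is a minimal Python sketch of a deny-by-default gate placed in front of an agent's tool calls. It is not something Ox Security describes; the tool names, the ToolCall type, and the authorize function are hypothetical and only show the idea of an explicit allowlist.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical tool names; a real agent framework has its own registry.
ALLOWED_TOOLS = frozenset({"read_file", "run_tests", "open_pull_request"})


@dataclass(frozen=True)
class ToolCall:
    """One request by the agent to use a tool (illustrative shape only)."""
    tool: str
    argument: str


def authorize(call: ToolCall, allowed: frozenset[str] = ALLOWED_TOOLS) -> bool:
    """Deny by default: only tools on the explicit allowlist may run."""
    return call.tool in allowed


if __name__ == "__main__":
    requested = [
        ToolCall("run_tests", "tests/"),
        ToolCall("http_post", "https://example.invalid/upload"),
    ]
    for call in requested:
        verdict = "allow" if authorize(call) else "deny"
        print(f"{verdict}: {call.tool}")
```

A gate like this does not stop prompt injection itself, but it limits what a manipulated agent can actually do with the capabilities it has been given.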

Tool Comparison: Where Are We Now?

For a Czech developer or company, the question is which tool to choose and how to secure it. Here is a quick comparison of the most significant players currently on the market:

Tool	Type	Price (approximate)	Key Feature
GitHub Copilot	Assistant / Agent	$10–$19 / month	Most widely used, strong integration within the GitHub ecosystem.
Devin (Cognition)	Fully autonomous agent	Enterprise pricing	Capable of solving complex engineering tasks independently.
Replit Agent	Cloud development agent	Free tier / $20+ / month	Ideal for rapid prototyping directly in the browser.

From a security perspective, it is important to note that while GitHub Copilot focuses more on assistance, tools like Devin or Replit Agent operate much closer to the boundary of autonomy, making them prime targets for future "Agentjacking" attacks.

Practical Impact: What Does This Mean for Czech Companies and the EU?

For Czech technology companies, from startups in Prague to large development teams in Brno, this topic has two critical dimensions:

Regulation (EU AI Act): The European Union is already implementing strict rules for artificial intelligence systems. If your company uses autonomous agents to develop critical infrastructure or high-risk applications, you will need to demonstrate not only code security but also the security of the code generation process itself.
Availability and Localization: Most of these tools are primarily English-language. Even though models like GPT-4 or Claude 3.5 Sonnet (which power these agents) understand Czech, technical documentation and security protocols remain in English. This creates a risk of "semantic misunderstanding," where a developer may fail to recognize subtle manipulation (prompt injection) within English instructions.

The Way Out: The Auto-Pentesting Loop

The solution proposed by Boaz Barzel is not to try to restrict AI agents, but to change the way security operates. Instead of being a separate "phase," security must become part of how the system itself behaves.

The concept is to create a so-called auto-pentesting loop. In this model, a "security agent" works in parallel with a "coding agent." Every commit, every code change, and every call to an external tool must be immediately analyzed by a second, specialized AI model focused exclusively on finding vulnerabilities. The security agent must understand the context — what has changed, what new data is now accessible, and whether this change introduces an exfiltration risk.
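
The loop described above could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Barzel's implementation: the specialized security model is replaced by a few regex heuristics, only code diffs are reviewed (tool calls are left out), and every name in it (Change, Finding, RULES, security_review, review_loop) is hypothetical.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Change:
    """A code change proposed by the coding agent (unified-diff text)."""
    path: str
    diff: str


@dataclass
class Finding:
    path: str
    reason: str


# Stand-in for the security agent: a real one would be a model with context,
# not three regular expressions. These rules are assumptions for the sketch.
RULES = [
    (re.compile(r"\beval\(|\bexec\("), "dynamic code execution"),
    (re.compile(r"https?://", re.IGNORECASE), "new outbound URL"),
    (re.compile(r"(api[_-]?key|password|secret)\s*=\s*['\"]", re.IGNORECASE),
     "hard-coded credential"),
]


def security_review(change: Change) -> list[Finding]:
    """Inspect only the lines the change adds and report rule matches."""
    added = [
        line[1:]
        for line in change.diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]
    findings = []
    for line in added:
        for pattern, reason in RULES:
            if pattern.search(line):
                findings.append(Finding(change.path, reason))
    return findings


def review_loop(changes: list[Change]):
    """Run the review on every change before it is accepted."""
    accepted, blocked = [], []
    for change in changes:
        findings = security_review(change)
        if findings:
            blocked.append((change, findings))
        else:
            accepted.append(change)
    return accepted, blocked
```

The point of the structure is that no change reaches the accepted list without passing through the review step, so the check runs at the same pace as the coding agent rather than at the end of a cycle.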

This approach shifts the paradigm: security ceases to be a department and becomes a property of the process.

Could an AI agent accidentally steal passwords or API keys from my computer?

Yes, if the agent has access to your local environment or shell and is not properly sandboxed. If an attacker controls the agent via prompt injection, they can command it to read configuration files containing passwords and send them out.
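
One small piece of such sandboxing can be sketched as a check that runs before the agent reads any file. The example below is a minimal Python illustration, assuming Python 3.9 or newer (for Path.is_relative_to); the SENSITIVE_NAMES list and the is_read_allowed function are hypothetical, and a real setup would also rely on OS-level isolation such as containers.

```python
from pathlib import Path

# Hypothetical list of file and directory names the agent should never read.
SENSITIVE_NAMES = {".env", ".netrc", ".ssh", "credentials", "id_rsa", "id_ed25519"}


def is_read_allowed(requested: str, workspace: str) -> bool:
    """Allow reads only inside the workspace and never of sensitive names."""
    root = Path(workspace).resolve()
    # Resolving collapses ".." segments, so traversal attempts are caught.
    target = (root / requested).resolve()
    if not target.is_relative_to(root):
        return False
    relative_parts = target.relative_to(root).parts
    return not any(part in SENSITIVE_NAMES for part in relative_parts)
```

With a workspace of /tmp/ws, a request for src/app.py would pass, while ../.ssh/id_rsa, /etc/passwd, and .env would all be refused.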

Is it safer to use open-source models (e.g. Llama 3) for development?

Open-source models give you more control over where data is processed, which is key for privacy. However, the model itself does not solve the Agentjacking risk — even an open-source agent can be manipulated if its tools and permissions are not properly configured.

How can I tell if my AI agent is behaving suspiciously?

Monitor for unusual network activity (connection attempts to unknown domains), sudden changes in permissions the agent requests, or code that contains logic for hidden data transmission, which should not be part of normal development.
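
The first of these signals, connections to unknown domains, can be approximated with a simple allowlist check over the URLs an agent tries to reach. This is only a sketch: the ALLOWED_HOSTS set and both function names are assumptions, and how you collect the list of URLs (proxy logs, agent traces) depends on your setup.

```python
from urllib.parse import urlsplit

# Hypothetical set of hosts the agent is expected to talk to.
ALLOWED_HOSTS = frozenset({"github.com", "pypi.org"})


def is_known_destination(url: str, allowed=ALLOWED_HOSTS) -> bool:
    """True if the URL's host is an allowed host or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == name or host.endswith("." + name) for name in allowed)


def suspicious_connections(urls):
    """Return the URLs that point outside the allowlist, in original order."""
    return [url for url in urls if not is_known_destination(url)]
```

Matching on the parsed hostname rather than on the raw string matters here: a URL such as https://github.com.evil.example/ contains an allowed name but is still flagged.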
