Banks Deploy Autonomous AI Blindly: 70% of Institutions Rush Ahead, But Only 1% Have the System Under Control

May 27, 2026 Daniel Cesak

Banks and financial institutions worldwide are investing heavily in agentic artificial intelligence: systems that no longer just advise, but decide and act on their own. However, according to a new analysis by Vivek Muraleedharan of Crayon Data in Abu Dhabi (a former HSBC executive), traditional oversight and testing mechanisms are falling far behind that pace. Seventy percent of banks are rushing to deploy agentic AI, but only 1% say their adoption is mature. According to Muraleedharan, this gap is not a technology problem but a governance crisis waiting for its trigger.

From recommendations to action: Why agentic AI is fundamentally different

The current deployment of artificial intelligence in banking has largely revolved around generative models — customer support chatbots, document summarization systems, or recommendation engines. These systems produce text, suggestions, or analyses, but the final decision always rests with a human. Agentic AI turns this equation on its head.

Agentic systems, such as autonomous agents from Anthropic or OpenAI, or specialized banking tools, can independently perform multi-step operations: they access databases, modify records, initiate payments, block accounts, or trigger fraud investigations. It is precisely at this point, as Muraleedharan emphasizes in his analysis published on May 26, 2026, that traditional oversight models break down.

"When AI stops generating text and starts executing actions — transferring money, freezing accounts, denying benefits — the old playbook of oversight falls apart," Muraleedharan wrote in his LinkedIn post, reported by QA Financial.

A hallucination that makes it through the banking system

The analysis's central warning concerns the difference between a hallucination in a chatbot and one in an agentic system. Bad advice from a chatbot at worst annoys a customer; an autonomous agent's error can be an unauthorized transaction that has already been executed.

"A hallucination in a chatbot = bad advice. A hallucination in an agentic system = an unauthorized transaction that has already been settled," Muraleedharan stated verbatim. In a banking environment, moreover, many of these actions are practically irreversible — once a payment goes out or an account is blocked, remediation is complex, expensive, and damages client trust.

For software quality assurance (QA) and testing teams, this means a dramatic expansion of the testing perimeter. It is no longer enough to verify the correctness of model outputs — it is necessary to stress-test escalation logic, tool permissions, runtime authorization limits, API interactions, and human intervention thresholds, under both realistic and adversarial conditions.
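To make the wider testing perimeter concrete, here is a minimal sketch of a runtime authorization gate and the boundary cases a QA suite might probe. It is an illustration only: the names `Policy`, `authorize`, `auto_limit`, and `hard_limit`, the tool names, and all amounts are hypothetical and do not come from Muraleedharan's analysis or from any real banking system.

```python
import math
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"  # hand the action to a human reviewer
    DENY = "deny"


@dataclass(frozen=True)
class Policy:
    allowed_tools: frozenset  # tools the agent may call at all (hypothetical names)
    auto_limit: float         # largest amount the agent may act on alone
    hard_limit: float         # above this, the action is refused outright


def authorize(tool: str, amount: float, policy: Policy) -> Decision:
    """Decide whether an agent's proposed action may run."""
    if tool not in policy.allowed_tools:
        return Decision.DENY
    # Reject NaN, infinities, negatives, and anything over the hard ceiling.
    if not math.isfinite(amount) or amount < 0 or amount > policy.hard_limit:
        return Decision.DENY
    if amount > policy.auto_limit:
        return Decision.ESCALATE
    return Decision.ALLOW


policy = Policy(frozenset({"initiate_payment"}), auto_limit=1_000.0, hard_limit=50_000.0)

# Realistic and adversarial boundary cases.
cases = [
    ("initiate_payment", 1_000.0, Decision.ALLOW),         # exactly at the limit
    ("initiate_payment", 1_000.01, Decision.ESCALATE),     # just over it
    ("initiate_payment", float("nan"), Decision.DENY),     # malformed amount
    ("freeze_account", 0.0, Decision.DENY),                # tool never granted
]
for tool, amount, expected in cases:
    assert authorize(tool, amount, policy) is expected
```

The NaN case shows why adversarial inputs matter: without the `math.isfinite` check, every comparison against NaN would be false and the action would fall through to `ALLOW`.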

The silent risk: Compliance drift

Muraleedharan's analysis introduces the concept of "compliance drift" — the silent risk of an autonomous system gradually straying from its original governance boundaries. This is not a one-off incident, but the cumulative effect of dozens of small changes.

"A system that was fully compliant when deployed gradually introduces new, unforeseen risks — not because the model itself has changed, but because the context around it has changed," the expert explains. Evolving APIs, expanded integrations, shifting user behavior, or broadened tool access can gradually alter the agent's effective authority without triggering any conventional monitoring.

The consequence is that pre-deployment testing alone is no longer sufficient. Banks need continuous behavioral reassessment, runtime observability, and ongoing validation of agents' decision boundaries long after deployment to production.
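One simple form of continuous reassessment is to compare what an agent is able to do today with what was approved at deployment. The sketch below assumes, purely for illustration, that an agent's effective authority can be captured as a mapping from tool name to per-action limit; the function name, tool names, and figures are hypothetical.

```python
def effective_authority_drift(baseline: dict, current: dict) -> list:
    """Compare approved tool grants with the agent's current grants.

    Both dicts map a tool name to its per-action limit (hypothetical schema).
    Returns human-readable findings for anything that widened.
    """
    findings = []
    for tool, limit in sorted(current.items()):
        if tool not in baseline:
            findings.append(f"new tool granted: {tool}")
        elif limit > baseline[tool]:
            findings.append(f"limit raised for {tool}: {baseline[tool]} -> {limit}")
    return findings


# Snapshot signed off at deployment versus what the agent can reach now.
approved = {"read_account": 0.0, "initiate_payment": 1_000.0}
today = {"read_account": 0.0, "initiate_payment": 5_000.0, "freeze_account": 0.0}

for finding in effective_authority_drift(approved, today):
    print(finding)
```

Run on a schedule, a check like this would flag widened authority even though the model itself never changed. A real deployment would also need to track API versions, integrations, and usage patterns, which this sketch does not attempt.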

What this means for Czech banks and European regulation

The topic is exceptionally relevant for the Czech Republic as well. The Czech National Bank (ČNB) has significantly strengthened its AI capabilities in recent months — as reported by CzechCrunch, the central bank acquired Nvidia chips and is running models from OpenAI, Mistral, and Alibaba on them for financial market oversight purposes. The irony is that while the regulator itself is deploying AI to monitor banks, banks are simultaneously introducing autonomous systems that elude traditional oversight mechanisms.

The European Union is already responding to this challenge. The EU AI Act, which entered into force in August 2024, classifies certain financial uses of AI as high-risk, notably creditworthiness assessment and credit scoring of individuals, and requires risk management, testing, and human oversight for such systems. In his analysis, Muraleedharan explicitly mentions regulatory pressure from the EU, as well as from supervisory authorities in the United Kingdom, Singapore, and the United Arab Emirates.

"Central banks are beginning to view failures of autonomous systems as systemic risk events," Muraleedharan warns. For Czech financial institutions — from major banks such as ČSOB, Česká spořitelna, or Komerční banka to fintech startups — this means that investment in AI governance is no longer optional, but mandatory.

Governance is not a barrier, but a prerequisite

The analysis ends with a clear message: "Governance is not a barrier to agentic AI adoption. It is its prerequisite." Muraleedharan proposes a governance lifecycle that includes the phases of design, development, pre-production testing, deployment, runtime controls, continuous monitoring, and eventual decommissioning of autonomous agents.

A key element is adversarial testing — stress testing in a sandbox environment where escalation thresholds, permission boundaries, and execution paths are verified before the system gains access to live banking data. In the production environment, governance systems must then actively enforce "real-time discipline" — restricting unplanned tool usage and defining when a human must intervene.
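A sandbox test of this kind might look like the following sketch, which replays one adversarial script against a toy runtime gate: an agent tries to move a large sum as several small payments that each look harmless. The split-payment scenario, the `SandboxGate` class, and the budget figures are assumptions chosen for illustration, not part of the analysis.

```python
from collections import deque


class SandboxGate:
    """Toy runtime gate: escalates once rolling spend would exceed a budget."""

    def __init__(self, window_budget: float, window_size: int) -> None:
        self.window_budget = window_budget
        self.recent = deque(maxlen=window_size)  # last N approved amounts

    def submit(self, amount: float) -> str:
        if sum(self.recent) + amount > self.window_budget:
            return "escalate"  # a human must approve; nothing is executed
        self.recent.append(amount)
        return "allow"


def run_split_payment_attack(gate: SandboxGate, total: float, parts: int) -> list:
    """Adversarial script: move `total` as `parts` equal small payments."""
    return [gate.submit(total / parts) for _ in range(parts)]


gate = SandboxGate(window_budget=1_000.0, window_size=10)
outcomes = run_split_payment_attack(gate, total=2_000.0, parts=5)

# Two payments of 400 fit the budget; the third would reach 1,200 and is escalated.
assert outcomes == ["allow", "allow", "escalate", "escalate", "escalate"]
```

A gate that checked each payment only against a per-action limit would have allowed all five, which is the kind of execution path a sandbox run is meant to expose before the agent touches live data.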

For Czech banks, this yields a clear recommendation: before every deployment of an autonomous agent, a verified governance, testing, and human oversight plan must be in place. Without it, they risk not only fines from the regulator, but above all client trust — and in banking, trust takes years to build but seconds to lose.

What is the main difference between generative AI and agentic AI in banking?

Generative AI (like ChatGPT or Claude) creates text, recommendations, or analyses, but the final decision is made by a human. Agentic AI, by contrast, performs actions on its own — it can initiate a payment, block an account, or trigger a fraud investigation. It is precisely this autonomy that brings entirely new risks, because an error is no longer just "bad advice" but an already executed and often irreversible operation.

How does the EU AI Act affect autonomous AI systems in banks?

The EU AI Act classifies certain AI systems used in financial services, such as those assessing the creditworthiness of individuals, as high-risk. Such systems must come with risk management, testing, documentation and record-keeping, and human oversight. Breaches of these obligations can draw fines of up to EUR 15 million or 3% of global annual turnover; the higher ceiling of 7% applies to practices the Act prohibits outright.

What is "compliance drift" and why is it dangerous?

Compliance drift is the gradual deviation of an autonomous system from its original governance boundaries — not due to changes in the model itself, but due to changes in its surroundings (new APIs, expanded integrations, shifts in user behavior). Unlike a one-off incident, compliance drift has no clear trigger, but accumulates across dozens of small changes. This is why continuous, rather than one-time, testing is essential.