Seville, Spain
Seville, Spain
+(34) 624 816 969
The artificial intelligence ecosystem is advancing at a breakneck pace, and with it, the threats that lurk. Microsoft has recently published an update to its Taxonomy of Failure Modes in Agentic AI Systems, incorporating seven new ways in which these systems can be compromised. This finding is no small matter: AI agents, increasingly autonomous and connected, represent a new battleground for enterprise cybersecurity.

Table of contents [Show]
According to Microsoft's analysis, four key elements have driven the identification of these new failure modes. First, the speed of adoption of agentic AI has outpaced traditional security barriers. Second, the maturity of the Model Context Protocol (MCP) has created a more complex and thus more exposed ecosystem. Third, the rise of Computer Use Agents introduces visual attack vectors. Finally, the accumulation of empirical evidence by researchers has allowed the detection of patterns that previously went unnoticed.
These factors, combined, have given rise to a new wave of vulnerabilities that security teams must urgently address. As we already warned in our article on "Dangerous Combination": The 2 Factors That Can "Corrupt" AI Agent Workflows, interconnection and autonomy are a breeding ground for incidents.
Below, we break down each of the threats identified by Microsoft, with practical implications for organizations.
This attack exploits the agent's supply chain, but with a twist: malicious behavior is introduced via natural language, not code. An adversary can modify the agent's instructions in public repositories or knowledge bases, altering its behavior without raising suspicion. It is an evolution of traditional supply chain compromise, now applied to language models.
Here, the attacker inserts instructions that appear aligned with the agent's legitimate task but actually redirect its final objective. For example, a customer service agent could be tricked into diverting payments to a fraudulent account while believing it is completing a valid transaction. This is a form of goal hijacking that requires human oversight controls.
In multi-agent environments, a compromised agent can impersonate another with higher privileges. By inflating its declared permissions to the orchestrator, it gains access to restricted functions. This attack underscores the need to verify each agent's identity cryptographically, not just by its network position.

Agents operating through graphical user interfaces (GUIs) are vulnerable to visual attacks. An adversary can embed malicious instructions in visual elements, such as buttons or images, which the agent interprets as legitimate commands. This vector is particularly dangerous in desktop automation tools.
This failure mode involves contaminating a session's context with biased data. The attacker introduces information that, step by step, does not trigger security controls but cumulatively diverts the agent's reasoning. It is a subtle attack requiring continuous monitoring of the session state.
With the growing adoption of the Model Context Protocol and plugins, attackers can exploit the attack surfaces inherent to these protocols. For example, a malicious plugin could intercept or modify calls between the agent and the model, compromising the integrity of responses.
Finally, an agent may reveal internal implementation details, such as tool names, data schemas, system prompt structure, or human oversight activation logic. This information is gold for attackers, who can use it to design more precise attacks.
Microsoft's list is not merely theoretical; it has direct practical consequences. Security teams must update their red teaming matrices to include these seven failure modes. Additionally, it is crucial to inventory each agent's supply chain via a Software Bill of Materials (SBOM), as the company recommends. In our ethical hacking guide, we already highlighted the importance of penetration testing in AI systems.
Another critical aspect is identity verification: trusting the agent's network position is not enough; cryptographic authentication via verifiable credentials from provisioning is needed. This is especially relevant in light of recent incidents such as Check Point VPN, where lack of proper verification facilitated unauthorized access.

Microsoft advises security teams to use these definitions to influence their planning. Specifically, it suggests:
Human oversight remains a pillar, but it must be carefully designed to avoid being bypassed. In our analysis of intelligent anonymization, we already explored how to protect sensitive data in AI workflows.
Microsoft's update demonstrates that AI security is a dynamic field, where every new capability brings new risks. Organizations deploying AI agents must adopt a proactive approach, integrating these threats into their security strategies from the design phase. As we noted in our article on agents vs. SaaS, AI will not eliminate enterprise software, but it will redefine how we protect it.
Original source: ComputerWorld. Analysis and adaptation by ForgeNEX.