AI Agents Under Scrutiny: Microsoft Reveals Seven New Vulnerabilities Every CISO Must Know

  • 08/Jun/2026
  • ForgeNEX by ForgeNEX
  • AI

The artificial intelligence ecosystem is advancing at a breakneck pace, and so are the threats that target it. Microsoft has recently published an update to its Taxonomy of Failure Modes in Agentic AI Systems that adds seven new ways in which these systems can be compromised. This is no small matter: AI agents, increasingly autonomous and connected, represent a new battleground for enterprise cybersecurity.

Why Now? Four Factors Driving Increased Risks

According to Microsoft's analysis, four key elements have driven the identification of these new failure modes. First, the speed of adoption of agentic AI has outpaced traditional security barriers. Second, the maturity of the Model Context Protocol (MCP) has created a more complex and thus more exposed ecosystem. Third, the rise of Computer Use Agents introduces visual attack vectors. Finally, the accumulation of empirical evidence by researchers has allowed the detection of patterns that previously went unnoticed.

Combined, these factors have given rise to a new wave of vulnerabilities that security teams must urgently address. As we already warned in our article "Dangerous Combination": The 2 Factors That Can "Corrupt" AI Agent Workflows, interconnection and autonomy are a breeding ground for incidents.

The Seven New Failure Modes in Detail

Below, we break down each of the threats identified by Microsoft, with practical implications for organizations.

1. Agentic Supply Chain Compromise

This attack exploits the agent's supply chain, but with a twist: malicious behavior is introduced via natural language, not code. An adversary can modify the agent's instructions in public repositories or knowledge bases, altering its behavior without raising suspicion. It is an evolution of traditional supply chain compromise, now applied to language models.

2. Goal Hijacking

Here, the attacker inserts instructions that appear aligned with the agent's legitimate task but actually redirect its final objective. For example, a customer service agent could be tricked into diverting payments to a fraudulent account while believing it is completing a valid transaction. Because the agent itself does not recognize the diversion, defending against it requires human oversight controls.
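
One way to picture such an oversight control is a policy gate that sits between the agent and the payment system. The sketch below is our own illustration, not something taken from Microsoft's taxonomy: the payee allowlist, the `PaymentAction` type, and the 500.0 auto-approval limit are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical allowlist maintained outside the agent's reach.
APPROVED_PAYEES = {"ACME-SUPPLIER-001", "PAYROLL-001"}


@dataclass(frozen=True)
class PaymentAction:
    payee_id: str
    amount: float


def review_action(action: PaymentAction, auto_limit: float = 500.0) -> str:
    """Decide whether a payment proposed by the agent may run unattended."""
    # An unknown payee is the classic sign of a redirected objective.
    if action.payee_id not in APPROVED_PAYEES:
        return "needs_human_approval"
    # Large amounts go to a person even when the payee is known.
    if action.amount > auto_limit:
        return "needs_human_approval"
    return "auto_approve"
```

The point of the design is that the check does not rely on the agent's own judgment: even a fully hijacked agent cannot approve its own payment to an unlisted account.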

3. Inter-Agent Trust Escalation

In multi-agent environments, a compromised agent can impersonate another with higher privileges. By inflating its declared permissions to the orchestrator, it gains access to restricted functions. This attack underscores the need to verify each agent's identity cryptographically, not just by its network position.
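
As a rough illustration of the idea, the orchestrator can ignore whatever permissions an agent declares and accept only those covered by a credential signed at provisioning. The sketch below is a simplification under our own assumptions: it uses a shared-secret HMAC from the Python standard library, whereas a real deployment would more likely rely on asymmetric signatures, and the key, function names, and permission strings are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical secret held only by the provisioning service and the orchestrator.
PROVISIONING_KEY = b"demo-key-do-not-use-in-production"


def issue_credential(agent_id: str, permissions: list[str]) -> dict:
    """Sign the agent's identity and permissions at provisioning time."""
    claims = {"agent_id": agent_id, "permissions": sorted(permissions)}
    payload = json.dumps(claims, sort_keys=True).encode()
    tag = hmac.new(PROVISIONING_KEY, payload, hashlib.sha256).hexdigest()
    return {"claims": claims, "tag": tag}


def verified_permissions(credential: dict) -> set[str]:
    """Return the signed permissions, or an empty set if the claims were altered."""
    payload = json.dumps(credential["claims"], sort_keys=True).encode()
    expected = hmac.new(PROVISIONING_KEY, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, credential["tag"]):
        return set()
    return set(credential["claims"]["permissions"])
```

An agent that adds a permission to its claims after provisioning produces a payload that no longer matches the tag, so the orchestrator grants it nothing.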

4. Computer Use Agent (CUA) Visual Attack

Agents operating through graphical user interfaces (GUIs) are vulnerable to visual attacks. An adversary can embed malicious instructions in visual elements, such as buttons or images, which the agent interprets as legitimate commands. This vector is particularly dangerous in desktop automation tools.

5. Session Context Contamination

This failure mode involves contaminating a session's context with biased data. The attacker introduces information that, step by step, does not trigger security controls but cumulatively diverts the agent's reasoning. It is a subtle attack requiring continuous monitoring of the session state.
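
To make the "step by step versus cumulative" distinction concrete, here is a minimal sketch of a session-level monitor. It assumes some upstream classifier already assigns each input a risk score between 0 and 1; that classifier, the class name, and both thresholds are hypothetical choices of ours, not part of Microsoft's guidance.

```python
from dataclasses import dataclass, field


@dataclass
class SessionDriftMonitor:
    per_step_limit: float = 0.5  # hypothetical threshold for a single input
    session_limit: float = 1.0   # hypothetical budget for the whole session
    total: float = 0.0
    history: list[float] = field(default_factory=list)

    def record(self, step_score: float) -> str:
        """Track one input's risk score and return the action to take."""
        self.history.append(step_score)
        self.total += step_score
        # A single input that is clearly risky is blocked outright.
        if step_score > self.per_step_limit:
            return "block"
        # Inputs that each pass the per-step check can still exhaust the budget.
        if self.total > self.session_limit:
            return "escalate"
        return "allow"
```

With these defaults, four consecutive inputs scored 0.3 each pass the per-step check, but the fourth pushes the running total past the session budget and is escalated for review.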

6. MCP / Plugin Abuse

With the growing adoption of the Model Context Protocol and plugins, attackers can exploit the attack surfaces inherent to these protocols. For example, a malicious plugin could intercept or modify calls between the agent and the model, compromising the integrity of responses.

7. Capability / Architecture Disclosure

Finally, an agent may reveal internal implementation details, such as tool names, data schemas, system prompt structure, or human oversight activation logic. This information is gold for attackers, who can use it to design more precise attacks.
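
A simple backstop is to filter the agent's outgoing text for known internal identifiers before it reaches the user. The sketch below is illustrative only: the identifier list and the function name are hypothetical, and plain name matching will not catch paraphrased descriptions of the architecture.

```python
import re

# Hypothetical internal identifiers that should never appear in user-facing output.
INTERNAL_NAMES = ["crm_lookup_tool", "refund_api_v2", "SYSTEM_PROMPT"]


def redact_internal_details(text: str, names: list[str] = INTERNAL_NAMES) -> str:
    """Replace any known internal identifier with a neutral placeholder."""
    # re.escape keeps characters such as dots or brackets from acting as regex syntax.
    pattern = re.compile("|".join(re.escape(name) for name in names), re.IGNORECASE)
    return pattern.sub("[redacted]", text)
```

For instance, `redact_internal_details("I used crm_lookup_tool to find it")` returns `"I used [redacted] to find it"`.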

Implications for Enterprise Security

Microsoft's list is not merely theoretical; it has direct practical consequences. Security teams must update their red teaming matrices to include these seven failure modes. Additionally, it is crucial to inventory each agent's supply chain via a Software Bill of Materials (SBOM), as the company recommends. In our ethical hacking guide, we already highlighted the importance of penetration testing in AI systems.

Another critical aspect is identity verification. Trusting an agent's network position is not enough; agents need cryptographic authentication based on verifiable credentials issued at provisioning. This is especially relevant in light of recent incidents such as the Check Point VPN case, where a lack of proper verification facilitated unauthorized access.

Practical Recommendations

Microsoft advises security teams to use these definitions to inform their planning. Specifically, it suggests:

  • Generate an SBOM for each deployed agent, detailing its components and dependencies.
  • Verify the agent's identity cryptographically by issuing verifiable credentials at provisioning.
  • Add the seven new failure modes to the red team coverage matrix.
  • Audit the user experience in human-in-the-loop scenarios as a security control.
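
The first recommendation can be sketched as a small inventory that treats natural-language components, such as the system prompt, as first-class supply chain items alongside tools. This is a simplified structure of our own for illustration, not a standard SBOM format such as CycloneDX or SPDX; the function names and component kinds are hypothetical.

```python
import hashlib


def component(name: str, kind: str, content: str) -> dict:
    """Describe one agent component and fingerprint its exact content."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return {"name": name, "kind": kind, "sha256": digest}


def build_agent_inventory(agent_name: str, components: list[dict]) -> dict:
    """Assemble a sorted inventory so that two snapshots are easy to compare."""
    return {
        "agent": agent_name,
        "components": sorted(components, key=lambda c: c["name"]),
    }


def changed_components(baseline: dict, current: dict) -> list[str]:
    """List components that are new or whose content hash differs from the baseline."""
    known = {c["name"]: c["sha256"] for c in baseline["components"]}
    return [
        c["name"]
        for c in current["components"]
        if known.get(c["name"]) != c["sha256"]
    ]
```

Comparing a fresh inventory against a reviewed baseline flags a silently edited instruction file the same way it would flag a swapped library, which is the scenario described under Agentic Supply Chain Compromise.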

Human oversight remains a pillar, but it must be carefully designed to avoid being bypassed. In our analysis of intelligent anonymization, we already explored how to protect sensitive data in AI workflows.

Conclusion: AI Agent Security Is an Endless Race

Microsoft's update demonstrates that AI security is a dynamic field, where every new capability brings new risks. Organizations deploying AI agents must adopt a proactive approach, integrating these threats into their security strategies from the design phase. As we noted in our article on agents vs. SaaS, AI will not eliminate enterprise software, but it will redefine how we protect it.


Original source: ComputerWorld. Analysis and adaptation by ForgeNEX.
