Scout: OpenClaw Security Crisis: Lessons for Agent Infrastructure Teams

Summary

The OpenClaw security crisis of Q1 2026 is the defining security event for the agent infrastructure ecosystem. In roughly eight weeks, a single open-source AI agent framework went from 6,300 instances to 500,000 — and simultaneously accumulated 60+ CVEs, a coordinated supply chain poisoning campaign (ClawHavoc) that planted 1,184 malicious skills in its marketplace, and the exposure of 135,000+ instances on the public internet with no enterprise kill switch. Two concurrent academic studies [1][2] demonstrated that the vulnerabilities are not implementation bugs to be patched but architectural properties of the personal agent paradigm: poisoning any single dimension of Capability, Identity, or Knowledge triples attack success rates from 24.6% to 64-74%. The crisis has catalyzed a rapid defensive response — OWASP’s Agentic Skills Top 10, NVIDIA’s NemoClaw enterprise wrapper, Microsoft’s Agent Governance Toolkit, Cisco’s DefenseClaw, and open-source scanning tools like SkillRisk — but the fundamental tension between agent autonomy and security remains unresolved. Every team running agent infrastructure needs to understand this incident not as an OpenClaw-specific problem but as the template for what happens when autonomous agents meet real-world attack surfaces.

Key Findings

1. The Full CVE Landscape: Nine Vulnerabilities in Four Days

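Between March 18 and March 21, 2026, nine CVEs were publicly disclosed against OpenClaw, several with CVSS scores above 8.0 [3][4]. The highest-severity OpenClaw CVEs are listed below; note that CVE-2026-25253 predates this window, having been publicly disclosed on February 3, 2026: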

CVE	CVSS	Description	Root Cause
CVE-2026-25253	8.8	1-click RCE via auth token exfiltration from `gatewayUrl`	CWE-669: incorrect resource transfer
CVE-2026-24763	8.8	Command injection via Docker PATH env variable	CWE-78: OS command injection
CVE-2026-32913	8.8	Custom auth header leakage via cross-origin redirects	CWE-522: credential management
CVE-2026-32974	8.8	Forged event injection via Feishu webhook verification token	CWE-347: improper verification
CVE-2026-28461	8.7	Unbounded memory growth via Zalo webhook query string churn	CWE-770: resource allocation
CVE-2026-28462	8.7	Path traversal in trace and download output paths	CWE-22: path traversal

Additional vulnerabilities include session sandbox escape via the session_status tool (CWE-863), authorization bypass in plugin subagent routes via synthetic admin scopes, and authorization bypass in pairing-store access control [3][4].

The architectural pattern is telling: these are not model-level failures. They cluster around the gateway, channel adapters, sandbox, and plugin layers — the exact surfaces that enable agent autonomy. CVE-2026-25253, the most exploited vulnerability, allows full machine compromise from a single click because the gateway architecture exposes auth tokens to any page the agent’s browser visits [5][6]. By the time public disclosure occurred on February 3, 2026, over 40,000 instances were internet-exposed, with 63% assessed as vulnerable [6].
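
The token-exfiltration pattern behind CVE-2026-25253 suggests one generic mitigation: never attach credentials to a gateway URL whose origin is not explicitly trusted. The following is a minimal sketch of that check, not OpenClaw's actual fix. The allowlist contents, the host name gateway.internal.example, and the function names are all hypothetical.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical allowlist of (scheme, host, port); a real deployment would load it from config.
TRUSTED_GATEWAY_ORIGINS = {("wss", "gateway.internal.example", 443)}
DEFAULT_PORTS = {"https": 443, "wss": 443, "http": 80, "ws": 80}


def origin_of(url: str) -> Optional[Tuple[str, str, int]]:
    """Return the (scheme, host, port) origin of a URL, or None if it cannot be trusted."""
    parts = urlsplit(url)
    try:
        port = parts.port  # raises ValueError for out-of-range or malformed ports
    except ValueError:
        return None
    scheme = parts.scheme.lower()
    host = parts.hostname  # already lowercased; excludes any user:password@ prefix
    if not host or scheme not in DEFAULT_PORTS:
        return None
    return (scheme, host, port or DEFAULT_PORTS[scheme])


def may_send_token(gateway_url: str) -> bool:
    """Allow the auth token to be sent only to an exact, allowlisted origin."""
    return origin_of(gateway_url) in TRUSTED_GATEWAY_ORIGINS
```

Comparing the parsed origin exactly, rather than substring-matching the URL, is what defeats lookalike hosts and values smuggled into query strings or userinfo.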

2. ClawHavoc: The First Large-Scale Agent Supply Chain Attack

ClawHavoc is the first coordinated, large-scale supply chain poisoning campaign targeting an AI agent skill marketplace. It represents a qualitative shift from the package-manager supply chain attacks (like TeamPCP) that preceded it, because the attack surface is fundamentally different: agent skills operate with the agent’s full privilege context, meaning a compromised skill inherits access to email, files, credentials, and network [7][8][9].

Timeline: The first malicious skill appeared on ClawHub on January 27, 2026, and the campaign surged on January 31. Koi Security named it ClawHavoc on February 1. By mid-February, 824+ confirmed malicious skills had been identified across the 10,700+ skills in the registry; later counts put the total at 1,184 [7][8].

Attack vectors: Attackers embedded payloads through three primary mechanisms [7][8]:

Staged downloads: Skills that pulled additional malware payloads after installation
Reverse shells: Python system calls establishing persistent backdoor access
Direct data exfiltration: Skills that immediately harvested credentials and sensitive data

ClickFix 2.0 — the novel technique: ClawHavoc pioneered what researchers now call ClickFix 2.0 [9][10]. Traditional ClickFix tricks users into copying and executing commands from a web page. ClickFix 2.0 weaponizes the agent itself as a trusted intermediary. Malicious SKILL.md files embed fabricated “prerequisite installation requirements” deep within documentation sections. When the user first invokes the skill, the agent presents a fake setup dialog or simulated error message and provides a terminal command to “fix” the environment. The user trusts the agent, executes the command, and the malicious payload fires [9][10]. This is an especially dangerous pattern because it exploits the fundamental trust relationship between user and agent — the very relationship that makes agents useful.
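
As a rough illustration of what a first-pass check for ClickFix 2.0 style documentation might look like, the sketch below flags lines in a skill document that ask the user to run terminal commands or that pipe a download into a shell. The pattern names and regular expressions are illustrative assumptions, not signatures from any of the cited scanners, and a purely syntactic check like this will miss paraphrased instructions.

```python
import re
from typing import List, Tuple

# Illustrative heuristics only; these are not real ClawHavoc signatures.
SUSPICIOUS_PATTERNS = {
    # e.g. "curl https://... | bash"
    "pipe-to-shell": re.compile(r"\b(curl|wget)\b[^\n|]*\|\s*(sudo\s+)?(ba|z)?sh\b"),
    # e.g. "echo ... | base64 -d | sh"
    "base64-exec": re.compile(r"base64\s+(-d|--decode|-D)\b[^\n]*\|\s*(ba|z)?sh\b"),
    # prose that tells the user to run something by hand
    "manual-terminal-step": re.compile(
        r"\b(paste|run|execute)\b[^\n]{0,60}\b(terminal|command line|shell)\b",
        re.IGNORECASE,
    ),
}


def flag_skill_doc(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line in a skill document."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

For example, a document whose "Prerequisites" section says "Run this in your terminal:" followed by a curl command piped to bash would produce one finding for each of those two lines.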

Payload: The primary payload was Atomic macOS Stealer (AMOS), a commodity infostealer sold as malware-as-a-service. AMOS harvests browser credentials, keychain passwords, cryptocurrency wallets, SSH keys, and files from common user directories. In agent environments, the blast radius extends to API keys, auth tokens, and secrets the agent itself is authorized to access [8][9].

Scale: At peak infection, five of the top seven most-downloaded skills on ClawHub were confirmed malware [11]. Researchers estimate the campaign targeted 300,000 OpenClaw users [8].

3. Architectural Root Causes: The CIK Taxonomy

The UCSC research team’s CIK taxonomy [1] provides the most actionable framework for understanding why these attacks succeed. They organize agent security concerns across three persistent-state dimensions:

Capability: System abilities and permissions — what the agent can do (execute code, access filesystems, call APIs)
Identity: Agent credentials and authentication — who the agent is and what it can access
Knowledge: Stored information and learned patterns — what the agent knows and remembers

Their key finding: poisoning any single CIK dimension raises the average attack success rate from 24.6% to 64-74%, roughly a 2.6x to 3x increase, in tests across Claude Sonnet 4.5, Opus 4.6, Gemini 3.1 Pro, and GPT-5.4. Even the most hardened model showed a threefold vulnerability increase when targeted through a single dimension [1].
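
To make the "triples" shorthand concrete, the arithmetic on the reported averages can be worked through directly. The only inputs are the three percentages quoted above; the variable names are ours.

```python
# Average attack success rates (percent) as quoted from the CIK study [1].
baseline = 24.6
poisoned_low, poisoned_high = 64.0, 74.0

# Relative increase at each end of the reported range.
factor_low = poisoned_low / baseline    # about 2.6x
factor_high = poisoned_high / baseline  # about 3.0x

# Absolute increase in percentage points.
delta_low = poisoned_low - baseline     # about 39.4 points
delta_high = poisoned_high - baseline   # about 49.4 points

print(f"{factor_low:.2f}x to {factor_high:.2f}x, +{delta_low:.1f} to +{delta_high:.1f} points")
```

So "triples" describes the upper end of the range; the lower end is closer to a 2.6x increase.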

Their defense evaluation is equally sobering. The strongest defense achieved only 63.8% success rate against Capability-targeted attacks. File protection mechanisms block 97% of malicious injections but also prevent legitimate updates — a classic security-usability tradeoff that makes the defense impractical in production [1].

The complementary study [2] evaluated six OpenClaw variants (OpenClaw, AutoClaw, QClaw, KimiClaw, MaxClaw, ArkClaw) across multiple backbone models and found that agentized systems are significantly riskier than their underlying models used in isolation. The risk amplification comes from the coupling among model capability, tool use, multi-step planning, and runtime orchestration. Their benchmark of 205 test cases showed that early-stage reconnaissance weaknesses amplify into concrete system-level failures when agents have execution capability and persistent runtime context [2].

The conclusion from both papers: these vulnerabilities are architectural, not incidental. Patching individual CVEs is necessary but insufficient. The personal agent paradigm — broad privileges, persistent state, marketplace extensibility — creates attack surfaces that cannot be eliminated without fundamentally constraining what agents can do.

4. The Two-Axis Vulnerability Taxonomy

A broader systematic analysis of 190 advisories filed against OpenClaw [12] organizes the vulnerability corpus along two orthogonal axes:

System axis (where the vulnerability lives):

Exec policy layer
Gateway
Channel adapters
Sandbox
Browser automation
Plugin/skill layer
Agent/prompt layer

Attack axis (the adversarial technique):

Identity spoofing
Policy bypass
Cross-layer composition
Prompt injection
Supply-chain trust escalation

The cross-layer composition category is particularly instructive for infrastructure teams. These are attacks that chain vulnerabilities across architectural boundaries — e.g., a prompt injection in the agent layer that triggers a policy bypass in the sandbox layer, which enables command injection at the exec layer. No single layer’s defenses can stop a composed attack; defense must be layered and coordinated [12].
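
One way to picture layered, coordinated defense is a pipeline in which every layer evaluates the same action independently and none trusts an upstream verdict. The sketch below is a toy model under that assumption; the layer names follow the taxonomy above, but the Action fields and the individual checks are hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Action:
    source: str    # where the instruction originated, e.g. "user" or "web_content"
    tool: str      # tool the agent wants to invoke
    argument: str  # raw argument passed to the tool


def agent_layer(action: Action) -> bool:
    # Agent/prompt layer: only act on instructions that came from the user.
    return action.source == "user"


def sandbox_layer(action: Action) -> bool:
    # Sandbox layer: only a fixed set of tools is reachable at all.
    return action.tool in {"read_file", "run_command"}


def exec_layer(action: Action) -> bool:
    # Exec policy layer: reject shell metacharacters regardless of what upstream decided.
    return not any(ch in action.argument for ch in ";|&`$")


LAYERS = [("agent", agent_layer), ("sandbox", sandbox_layer), ("exec", exec_layer)]


def rejecting_layers(action: Action) -> List[str]:
    """Run every layer, even after a rejection, and return the names of those that reject."""
    return [name for name, check in LAYERS if not check(action)]


def is_allowed(action: Action) -> bool:
    return not rejecting_layers(action)
```

Running all layers rather than stopping at the first rejection also yields an audit signal: an action rejected by several layers at once is a likely composed attack.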

5. The Enterprise Kill Switch Gap

The most damning infrastructure failure revealed by the crisis: OpenClaw has no centralized management capability [13]. Growth was explosive — 6,300 instances in week one, 230,000 within weeks, approaching 500,000 by RSAC 2026 — but the platform provides no enterprise management console, no fleet-wide patching mechanism, and no kill switch [13].

When CVE-2026-25253 was disclosed, administrators had to update each instance manually. Organizations with hundreds of instances had no way to know which were compromised, which were patched, or which were running unauthorized skills [13]. VentureBeat reports that even the four vendors that shipped responses at RSAC 2026 had still not produced the one control enterprises need most: a native kill switch [13].

The agentgateway.dev project [14] articulates the architectural pattern that addresses this: a gateway layer that sits between the agent and all external services, providing a single enforcement point for policy, authentication, rate limiting, and emergency shutdown. This is the same pattern that API gateways brought to microservices, applied to agent infrastructure.
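
The gateway pattern can be sketched in a few lines. This is an in-process toy under stated assumptions, not the agentgateway.dev implementation: the class name, method names, and decision strings are all hypothetical, and a real gateway would run as a separate network service in front of the agents.

```python
import threading
from typing import Iterable, List, Tuple


class AgentGateway:
    """Single enforcement point between agents and external services (illustrative)."""

    def __init__(self, allowed_hosts: Iterable[str]) -> None:
        self._allowed_hosts = set(allowed_hosts)
        self._killed = threading.Event()  # thread-safe emergency stop flag
        self.audit_log: List[Tuple[str, str, str]] = []

    def kill(self) -> None:
        """Emergency shutdown: every subsequent request is denied."""
        self._killed.set()

    def authorize(self, agent_id: str, host: str) -> bool:
        """Decide whether an agent may reach a host, and record the decision."""
        if self._killed.is_set():
            decision = "denied:kill-switch"
        elif host not in self._allowed_hosts:
            decision = "denied:host-not-allowed"
        else:
            decision = "allowed"
        self.audit_log.append((agent_id, host, decision))
        return decision == "allowed"
```

Because every outbound request passes through authorize, flipping one flag stops the whole fleet, and the audit log answers the question of which agents were talking to which services.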

6. The Defensive Toolkit: What Has Shipped

The crisis catalyzed a rapid defensive response. Here is the current state of the toolkit as of April 2026:

NVIDIA NemoClaw (announced at GTC, March 16, 2026) [15]: An enterprise wrapper around OpenClaw with three core controls: a kernel-level sandbox (deny-by-default via OpenShell), an out-of-process policy engine that compromised agents cannot override, and a privacy router that keeps sensitive data on local Nemotron models while routing complex reasoning to the cloud. NemoClaw is currently an early-stage alpha. Launch partners include Cisco, Atlassian, Salesforce, and CrowdStrike.

Microsoft Agent Governance Toolkit (released April 2, 2026) [16]: A seven-package open-source system (Python, TypeScript, Rust, Go, .NET) providing runtime policy enforcement. The “Agent OS” component intercepts every agent action before execution at sub-millisecond latency (p99 < 0.1ms). Microsoft claims the toolkit addresses all 10 OWASP agentic AI risks. It integrates with LangChain, CrewAI, Google ADK, and Microsoft Agent Framework via native extension points.

Cisco DefenseClaw (available March 27, 2026) [17]: Integrates four open-source tools: Skills Scanner (signature + LLM-based semantic analysis + behavioral dataflow), MCP Scanner, AI BoM (bill of materials), and CodeGuard. Skills Scanner supports SARIF output for GitHub Code Scanning integration and CI/CD gating.
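
Since SARIF is a standard JSON format, CI/CD gating on scanner output can be done with a small script regardless of which tool produced the report. The sketch below assumes only the generic SARIF 2.1.0 layout (runs, results, level, ruleId); the choice to block on "error" alone and the rule IDs in the usage example are hypothetical, and it does not reflect any specific scanner's command-line interface.

```python
import json
from typing import List

# Policy assumption: only "error" results block the pipeline.
BLOCKING_LEVELS = {"error"}


def blocking_findings(sarif_text: str) -> List[str]:
    """Return the rule IDs of all results whose level should fail the build."""
    report = json.loads(sarif_text)
    hits = []
    for run in report.get("runs", []):
        for result in run.get("results", []):
            # Treat a missing level as "warning" in this sketch.
            level = result.get("level", "warning")
            if level in BLOCKING_LEVELS:
                hits.append(result.get("ruleId", "<no ruleId>"))
    return hits


def gate_exit_code(sarif_text: str) -> int:
    """Exit code for a CI step: 1 if any blocking finding exists, else 0."""
    return 1 if blocking_findings(sarif_text) else 0
```

A CI step would read the scanner's SARIF file, call gate_exit_code on its contents, and pass the result to sys.exit so that a blocking finding fails the job.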

SkillRisk (skillrisk.org) [18]: A client-side scanner that analyzes SKILL.md files, hook scripts, and MCP configurations for known attack patterns including ClawHavoc signatures, MCP SSRF vulnerabilities, CVE-2026-2256 unsanitized shell commands, and credential leaks. Supports OpenClaw, Claude Code, Cursor, and Windsurf skill formats. 100% client-side — no data leaves the device.

Snyk agent-scan [19]: Security scanner for AI agents, MCP servers, and agent skills, from the established vulnerability management vendor.

Key limitation across all tools: Every scanning tool carries the same caveat — they identify known and probable risk patterns but do not certify security. A clean scan does not guarantee a skill is benign [17][18]. This is the fundamental challenge: the attack surface is semantic (agent instructions in natural language), not just syntactic (code patterns), so signature-based detection will always lag behind novel attacks.

7. The OWASP Agentic Security Frameworks

Two complementary OWASP frameworks now provide structured risk taxonomies:

OWASP Top 10 for Agentic Applications (2026) [20]: Published December 2025, this is the first formal taxonomy of risks specific to autonomous AI agents. It covers goal hijacking, tool misuse, delegated trust, inter-agent communication, persistent memory, cascading failures, and rogue agents. Developed with 100+ industry experts.

OWASP Agentic Skills Top 10 [11]: A newer, complementary project specifically focused on skill/plugin security. Version 1.0 (2026 Edition) is in active development. It includes 10 risk pages with full descriptions, platform-specific attack scenarios, preventive mitigations, OWASP/NIST/CVE mappings, and real-world evidence citations. Key items include:

AST02: Trust prompts must not allow repository-controlled configuration to execute before explicit user trust confirmation — directly addressing the ClickFix 2.0 pattern
Agent skills are especially dangerous when they simultaneously have access to private data, exposure to untrusted content, and ability to communicate externally
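
The three-way combination described above lends itself to a simple inventory check. The sketch below is an illustrative model only: the SkillProfile fields are our own naming for the three conditions, and how a team determines each flag for a real skill is left open.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SkillProfile:
    name: str
    reads_private_data: bool         # e.g. email, files, credentials
    ingests_untrusted_content: bool  # e.g. web pages, inbound messages
    communicates_externally: bool    # e.g. outbound HTTP, sending email


def risk_factors(skill: SkillProfile) -> List[str]:
    """List which of the three risk conditions a skill meets."""
    factors = []
    if skill.reads_private_data:
        factors.append("private-data")
    if skill.ingests_untrusted_content:
        factors.append("untrusted-content")
    if skill.communicates_externally:
        factors.append("external-communication")
    return factors


def is_high_risk(skill: SkillProfile) -> bool:
    """A skill is flagged only when all three conditions hold at once."""
    return len(risk_factors(skill)) == 3
```

Removing any one of the three factors from a flagged skill, for instance by cutting its outbound network access, takes it out of the highest-risk category.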

The distinction matters: the Agentic Applications Top 10 addresses risks from autonomous action (an AI modifying databases, sending emails, calling APIs). The Agentic Skills Top 10 addresses the supply chain and trust-boundary risks specific to the skill/plugin ecosystem that extends agent capabilities.

Practical Implications

For teams running OpenClaw or similar agent infrastructure today

Deploy a gateway immediately. The single most impactful architectural control is an intermediary between your agents and all external services. This provides the kill switch, policy enforcement point, and audit trail that the base platforms lack [14]. The agentgateway pattern, NemoClaw’s out-of-process policy engine, and Microsoft’s Agent OS all implement this pattern differently, but the principle is the same: never let the agent be the outermost security boundary.
Scan every skill before installation. Run Cisco Skills Scanner or SkillRisk against every skill in your environment. Integrate into CI/CD pipelines with SARIF output. But understand this catches known patterns only — treat it as a minimum bar, not a guarantee [17][18].
Audit your CIK exposure. Map which agents have which Capabilities (what system actions they can take), which Identity credentials they hold, and what Knowledge they have accumulated. Poisoning any one dimension triples attack success rates [1]. Minimize each dimension to the narrowest scope required.
Implement the principle of least privilege for agent skills. A skill that needs to read email should not have write access to the filesystem. YAML-based policy engines (as in NemoClaw’s OpenShell) allow fine-grained per-skill access control [15].
Watch for ClickFix 2.0 patterns. Train your team to recognize when an agent presents “prerequisite installation” or “environment fix” commands. Legitimate skills should not require manual terminal commands. Any skill that does is suspicious by default [9][10].
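
The least-privilege and CIK-audit recommendations above can be approximated with a deny-by-default permission table plus a periodic comparison of granted versus required permissions. The sketch below uses a plain Python dictionary rather than YAML to stay dependency-free; it is not NemoClaw's OpenShell policy format, and the skill names and permission strings are hypothetical.

```python
from typing import Dict, Set

# Hypothetical per-skill grants; anything not listed is denied.
GRANTED: Dict[str, Set[str]] = {
    "email-summarizer": {"email:read", "fs:write"},
    "calendar-sync": {"calendar:read", "calendar:write"},
}

# What each skill actually needs, as established by review.
REQUIRED: Dict[str, Set[str]] = {
    "email-summarizer": {"email:read"},
    "calendar-sync": {"calendar:read", "calendar:write"},
}


def is_permitted(skill: str, permission: str) -> bool:
    """Deny by default: unknown skills and unlisted permissions are refused."""
    return permission in GRANTED.get(skill, set())


def excess_permissions(
    granted: Dict[str, Set[str]], required: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Return, per skill, the permissions granted beyond what is required."""
    excess = {}
    for skill, perms in granted.items():
        extra = perms - required.get(skill, set())
        if extra:
            excess[skill] = extra
    return excess
```

Here the audit would report that email-summarizer holds fs:write without needing it, which is exactly the case the email example above warns against.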

For teams building agent platforms

Build the kill switch on day one. Fleet-wide emergency shutdown, forced patching, and centralized skill revocation are not optional features — they are prerequisites for enterprise deployment. OpenClaw’s crisis demonstrates that organic adoption will outpace your ability to add these controls retroactively [13].
Design for defense-in-depth across architectural layers. Cross-layer composition attacks chain vulnerabilities across the agent, plugin, sandbox, and gateway layers [12]. No single layer’s defenses are sufficient. Each layer must validate independently, and the overall architecture must assume that any individual layer can be compromised.
Treat skill marketplaces as package registries. Apply the same controls that package managers have learned the hard way: signing, provenance attestation, cooldown periods before availability, automated scanning, and revocation mechanisms [7][11].
Separate the policy engine from the agent runtime. If the agent can modify or disable its own security controls, those controls provide no defense against a compromised agent. NemoClaw’s out-of-process policy engine [15] and Microsoft’s stateless Agent OS [16] both implement this critical separation.
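
Several of the registry controls listed above (revocation, integrity pinning, cooldown) can be combined into one install-time check. The sketch below is a simplified stand-in: it pins a SHA-256 digest rather than verifying a real signature or provenance attestation, and the seven-day cooldown and function name are assumptions chosen for illustration.

```python
import hashlib
import hmac
from datetime import datetime, timedelta
from typing import Tuple

# Assumed policy value: new releases wait this long before they can be installed.
COOLDOWN = timedelta(days=7)


def check_installable(
    artifact: bytes,
    pinned_sha256: str,
    published_at: datetime,
    revoked: bool,
    now: datetime,
) -> Tuple[bool, str]:
    """Return (allowed, reason) for installing a skill artifact."""
    if revoked:
        return False, "revoked"
    actual = hashlib.sha256(artifact).hexdigest()
    # Constant-time comparison of the hex digests.
    if not hmac.compare_digest(actual, pinned_sha256):
        return False, "digest-mismatch"
    if now - published_at < COOLDOWN:
        return False, "cooldown"
    return True, "ok"
```

Checking revocation first means a recalled skill is refused even if its bytes still match the pinned digest.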

For the broader ecosystem

Adopt OWASP frameworks as your baseline threat model. The Agentic Applications Top 10 and Agentic Skills Top 10 provide structured, peer-reviewed starting points. Map your controls against them [11][20]. Regulatory pressure is incoming: the EU AI Act’s high-risk obligations take effect August 2026, and the Colorado AI Act becomes enforceable June 2026.

Open Questions

Can semantic attacks be detected at scale? Current scanning tools detect known syntactic patterns, but ClickFix 2.0 is fundamentally a semantic attack embedded in natural language instructions. Detecting novel semantic attacks requires the same LLM capabilities that make agents useful — creating a recursive dependency that has no clean solution yet.
What is the right autonomy-security tradeoff? The CIK research shows that the strongest defenses block legitimate operations alongside malicious ones (97% of malicious injections blocked, but legitimate updates also prevented) [1]. The industry has not converged on where the acceptable tradeoff lies for different use cases.
Will agent skill marketplaces consolidate or fragment? ClawHub’s poisoning may drive enterprises toward curated, private skill registries. But fragmentation increases maintenance burden and reduces the ecosystem benefits that made skills valuable in the first place.
How will the regulatory response shape agent architecture? The EU AI Act and Colorado AI Act will impose specific requirements on autonomous AI systems. Whether these requirements align with the emerging architectural patterns (gateways, policy engines, sandboxing) or create friction with them remains to be seen.
Is NemoClaw’s model viable at scale? The privacy router pattern — routing sensitive data to local models while using cloud models for complex reasoning — is architecturally elegant but introduces latency, cost, and capability tradeoffs that have not been validated at enterprise scale.
What happens when attackers target the defensive tooling itself? SkillRisk, Skills Scanner, and similar tools are themselves software with attack surfaces. The TeamPCP campaign demonstrated cascading compromise through security tooling (Trivy was the entry point). Agent security scanners could become the next high-value target.

Sources

[1] Wang, Z., Tu, H., Xie, C. et al. “Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw.” arXiv:2604.04759v1 (April 6, 2026). https://arxiv.org/abs/2604.04759v1
[2] Wang, Y., Gao, H., Niu, Z. et al. “A Systematic Security Evaluation of OpenClaw and Its Variants.” arXiv:2604.03131v1 (April 3, 2026). https://arxiv.org/abs/2604.03131v1
[3] “Nine CVEs in Four Days: Inside OpenClaw’s March 2026 Vulnerability Flood.” OpenClawAI Blog. https://openclawai.io/blog/openclaw-cve-flood-nine-vulnerabilities-four-days-march-2026
[4] jgamblin/OpenClawCVEs. GitHub tracking repository. https://github.com/jgamblin/OpenClawCVEs/
[5] “CVE-2026-25253: 1-Click RCE in OpenClaw Through Auth Token Exfiltration.” SOCRadar. https://socradar.io/blog/cve-2026-25253-rce-openclaw-auth-token/
[6] “How CVE-2026-25253 exposed every OpenClaw user to RCE.” DEV Community. https://dev.to/andrewsispoidis/how-cve-2026-25253-exposed-every-openclaw-user-to-rce-and-how-to-fix-it-in-one-command-2dj
[7] “ClawHavoc Poisons OpenClaw’s ClawHub With 1,184 Malicious Skills.” CyberPress. https://cyberpress.org/clawhavoc-poisons-openclaws-clawhub-with-1184-malicious-skills/
[8] “ClawHavoc: Inside the Supply Chain Attack That Targeted 300,000 AI Agent Users.” Repello AI. https://repello.ai/blog/clawhavoc-supply-chain-attack
[9] “ClawHavoc: Analysis of Large-Scale Poisoning Campaign Targeting the OpenClaw Skill Market.” Antiy Labs. https://www.antiy.net/p/clawhavoc-analysis-of-large-scale-poisoning-campaign-targeting-the-openclaw-skill-market-for-ai-agents/
[10] “OpenClaw: A viral AI assistant and a magnet for infostealer malware and ClickFix trickery.” Intel 471. https://www.intel471.com/blog/openclaw-a-viral-ai-assistant-and-a-magnet-for-infostealer-malware-and-clickfix-trickery
[11] “OWASP Agentic Skills Top 10.” OWASP Foundation. https://owasp.org/www-project-agentic-skills-top-10/
[12] “A Systematic Taxonomy of Security Vulnerabilities in the OpenClaw AI Agent Framework.” arXiv:2603.27517v1 (March 2026). https://arxiv.org/abs/2603.27517
[13] “OpenClaw has 500,000 instances and no enterprise kill switch.” VentureBeat. https://venturebeat.com/security/openclaw-500000-instances-no-enterprise-kill-switch
[14] “Multi-Agent OpenClaw Architecture with a Kill Switch.” AgentGateway Blog. https://agentgateway.dev/blog/2026-02-21-kill-switch/
[15] “NVIDIA Announces NemoClaw for the OpenClaw Community.” NVIDIA Newsroom. https://nvidianews.nvidia.com/news/nvidia-announces-nemoclaw
[16] “Introducing the Agent Governance Toolkit.” Microsoft Open Source Blog. https://opensource.microsoft.com/blog/2026/04/02/introducing-the-agent-governance-toolkit-open-source-runtime-security-for-ai-agents/
[17] “Cisco Reimagines Security for the Agentic Workforce.” Cisco Newsroom. https://newsroom.cisco.com/c/r/newsroom/en/us/a/y2026/m03/cisco-reimagines-security-for-the-agentic-workforce.html
[18] SkillRisk. AI Agent Skill Security Scanner. https://skillrisk.org/
[19] Snyk agent-scan. Security scanner for AI agents, MCP servers and agent skills. https://github.com/snyk/agent-scan
[20] “OWASP Top 10 for Agentic Applications for 2026.” OWASP GenAI Security Project. https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/
[21] “OpenClaw’s Security Crisis: What 346,000 Stars and 135,000 Exposed Instances Teach Us.” DEV Community. https://dev.to/jahanzaibai/openclaws-security-crisis-what-346000-stars-and-135000-exposed-instances-teach-us-about-ai-fpb
[22] “What Security Teams Need to Know About OpenClaw.” CrowdStrike Blog. https://www.crowdstrike.com/en-us/blog/what-security-teams-need-to-know-about-openclaw-ai-super-agent/
[23] “Running OpenClaw safely: identity, isolation, and runtime risk.” Microsoft Security Blog. https://www.microsoft.com/en-us/security/blog/2026/02/19/running-openclaw-safely-identity-isolation-runtime-risk/
[24] “The Agent Skill Ecosystem: When AI Extensions Become a Malware Delivery Channel.” Lakera. https://www.lakera.ai/blog/the-agent-skill-ecosystem-when-ai-extensions-become-a-malware-delivery-channel
[25] “OpenClaw Security Catastrophe: CVE-2026-25253 and the Largest AI Privacy Breach in History.” DEV Community. https://dev.to/tiamatenity/openclaw-security-catastrophe-cve-2026-25253-and-the-largest-ai-privacy-breach-in-history-2ljl
[26] “NVIDIA NemoClaw Explained: OpenClaw Gets Enterprise Security (GTC 2026).” Particula Tech. https://particula.tech/blog/nvidia-nemoclaw-openclaw-enterprise-security
[27] Cisco AI Defense skill-scanner. https://github.com/cisco-ai-defense/skill-scanner
[28] “The 2026 OWASP Agentic Top 10: Why Agentic AI Security Has to Be Planned Up Front.” SPR. https://spr.com/the-2026-owasp-agentic-top-10-why-agentic-ai-security-has-to-be-planned-up-front/