Artificer Digital · The Artificer's Grimoire

Scout: Supply Chain Security for Agent Infrastructure

Summary

The TeamPCP campaign of March 2026 demonstrated that AI infrastructure is not merely affected by supply chain attacks — it is a preferred target because of a structural property unique to this stack: credential aggregation. An LLM proxy like LiteLLM holds API keys for dozens of providers simultaneously, making a single package compromise worth orders of magnitude more than a typical library takeover. The attack chain was cascading — Trivy to Checkmarx to LiteLLM to Telnyx — with each compromise yielding credentials to attack the next target. Defenders now have a concrete toolkit: package manager cooldown periods (supported by seven major managers as of March 2026), hash-pinned lock files via uv, PyPI Trusted Publishers with Sigstore attestations, GitHub’s upcoming Actions dependency locking, and Datadog’s open-source Supply-Chain Firewall. The gap is adoption: only 4% of organizations pin all GitHub Action hashes, and half of all organizations install third-party packages within one day of release. The defensive patterns exist; the challenge is making them default.

Key Findings

1. AI Infrastructure Has a Unique Attack Surface: Credential Aggregation

The defining characteristic that separates AI supply chain risk from general software supply chain risk is the credential aggregation pattern inherent in LLM proxy and orchestration layers [1][2][3].

A production LiteLLM instance typically holds simultaneous API keys for OpenAI, Anthropic, AWS Bedrock, Google Vertex AI, Azure OpenAI, and potentially dozens more. It also holds database credentials, Redis connections, and Kubernetes service account tokens. Compromising this single dependency yields an organization’s entire AI credential portfolio — a concentration of risk with no real parallel in traditional web application stacks.

The blast radius extends further than direct installs suggest. LiteLLM averages 97 million monthly downloads, but many are transitive — pulled in by DSPy, CrewAI, OpenHands, and other frameworks that depend on it [3]. Teams may not even know they are running LiteLLM, yet they inherit its full attack surface.

The CSA research note [3] identifies a second AI-specific vector: model file deserialization. Pickle-format model files (.pkl, .pt, .bin) execute arbitrary code during deserialization, and standard software composition analysis tooling does not scan them [14][15]. Picklescan, the primary defense tool, has had at least seven zero-day bypasses disclosed across Sonatype and JFrog research [15]. The safetensors format eliminates this class of attack by design (it serializes tensor data only, not Python objects), but pickle remains deeply entrenched in practice [14].
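The pickle risk is easy to demonstrate: any pickled object can name a callable for the loader to invoke during deserialization via __reduce__. A minimal, harmless sketch, using eval on an arithmetic string as a stand-in for a real payload:

```python
import pickle

# Illustration only: __reduce__ lets a pickled object specify a
# callable (and its arguments) that pickle.loads() will execute.
class Payload:
    def __reduce__(self):
        # In a real attack this would be os.system or similar;
        # here it just evaluates harmless arithmetic.
        return (eval, ("40 + 2",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # eval runs during deserialization
print(result)  # 42
```

This is why a .pkl/.pt/.bin file is executable content, not inert data: loading it is sufficient for code execution, and no import of the attacker's code is ever visible in the consuming project.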

2. The TeamPCP Attack Chain: Anatomy of a Cascading Compromise

The TeamPCP campaign is the most technically sophisticated supply chain operation targeting the AI ecosystem to date. Understanding its structure is essential because it demonstrates the cascading model that will likely define future attacks on this stack [2][3].

March 19 — Trivy compromise (entry point): Attackers exploited unsanitized GitHub Actions workflows in Aqua Security’s Trivy project. They published a malicious release v0.69.4 and force-pushed 76 of 77 GitHub Action tags to malicious commits. The injected payload dumped Runner.Worker memory, scraped credential locations, encrypted results with AES-256 and RSA-4096, and exfiltrated to scan.aquasecurtiy[.]org (a typosquat). Fallback: creating public repositories and uploading stolen data [2].

March 20-22 — npm worm: Attackers deployed a self-propagating npm worm across publisher scopes (@EmilGroup, @opengov, others). The worm automated compromise by stealing npm tokens, determining publishable packages, bumping versions, and republishing with malicious payloads. Iranian-geolocated systems received destructive “kamikaze” containers; others received persistent backdoors [2].

March 23 — Checkmarx and OpenVSX: Using harvested credentials, attackers compromised Checkmarx GitHub Actions and OpenVSX extensions [2][3].

March 24 — LiteLLM (the high-value target): Versions 1.82.7 and 1.82.8 published to PyPI with two distinct execution mechanisms [1][2]:

  • v1.82.7: Malicious base64-encoded payload in litellm/proxy/proxy_server.py, executing on import
  • v1.82.8: A litellm_init.pth file in the wheel, executing on any Python interpreter startup — no import required

The payload: credential harvesting (environment variables, SSH keys, cloud credentials, Kubernetes data, Docker configs, shell history, database credentials, CI/CD secrets), local AES-256/RSA-4096 encryption, exfiltration to models.litellm[.]cloud, persistence via ~/.config/sysmon/sysmon.py with systemd service, C2 polling, and Kubernetes lateral movement via privileged node-setup-* pods [2].
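The v1.82.8 mechanism can be reproduced harmlessly: CPython's site module executes, rather than path-appends, any .pth line that begins with import. The sketch below simulates startup processing by calling site.addsitedir on a temporary directory; the file and module names are invented for the demo:

```python
import os
import site
import sys
import tempfile

# Any .pth file in a site directory whose line starts with "import "
# is *executed* (not added to sys.path) during site initialization.
tmpdir = tempfile.mkdtemp()
pth = os.path.join(tmpdir, "demo_init.pth")
with open(pth, "w") as f:
    # One line of arbitrary code, run with no import of the package.
    f.write('import sys; sys.modules.setdefault("pwned", sys)\n')

# site.addsitedir() is what the interpreter does for each site dir at
# startup; calling it here triggers processing of the .pth file.
site.addsitedir(tmpdir)
print("pwned" in sys.modules)  # True
```

This is why the v1.82.8 variant was more dangerous than v1.82.7: the payload runs in every Python process on the machine, including ones that never touch LiteLLM.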

March 27 — Telnyx SDK: Versions 4.87.1 and 4.87.2 used steganography — malicious payloads hidden within WAV audio file frames [2].

The critical lesson: these were not parallel, independent attacks. Each step yielded the credentials needed for the next. A single Trivy workflow exploit cascaded into four distinct ecosystems within eight days.

3. Package Manager Cooldowns: A Simple Defense That Actually Works

The most immediately deployable defense against fast-moving supply chain attacks is the minimum release age feature, now supported by seven major package managers [4]:

  • pnpm 10.16 (Sep 2025): minimumReleaseAge; minimumReleaseAgeExclude exempts trusted packages
  • Yarn 4.10.0 (Sep 2025): npmMinimalAgeGate (in minutes); npmPreapprovedPackages for exemptions
  • Bun 1.3 (Oct 2025): minimumReleaseAge in bunfig.toml
  • Deno 2.6 (Dec 2025): --minimum-dependency-age; applies on deno update and deno outdated
  • uv 0.9.17 (Dec 2025): --exclude-newer (relative duration); per-package overrides via exclude-newer-package
  • pip 26.0 (Jan 2026): --uploaded-prior-to; absolute timestamps only (Seth Larson’s cron workaround provides relative behavior)
  • npm 11.10.0 (Feb 2026): min-release-age

This is relevant because the compromised LiteLLM versions were live for approximately two hours. A cooldown of even 24 hours would have prevented every install during that window. The Datadog DevSecOps 2026 report found that half of all organizations use third-party libraries within one day of release [9] — meaning half the ecosystem has zero buffer against fast-burning compromises.

The tradeoff is real: cooldowns delay legitimate security patches too. Teams need per-package overrides (available in pnpm, Yarn, uv) to exempt trusted packages or use different cooldown tiers based on criticality.
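For pip, which takes only absolute timestamps, the relative-cooldown workaround amounts to recomputing the cutoff at install time. A sketch of that computation, assuming pip accepts an ISO 8601 UTC timestamp for --uploaded-prior-to (the flag named in the table above):

```python
from datetime import datetime, timedelta, timezone

# Relative cooldown on top of pip's absolute-timestamp flag:
# compute "now minus N days" each time, then hand it to pip.
COOLDOWN_DAYS = 3
cutoff = datetime.now(timezone.utc) - timedelta(days=COOLDOWN_DAYS)
stamp = cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")

# Emit the command a wrapper script or cron job would run.
print(f"pip install --uploaded-prior-to={stamp} -r requirements.txt")
```

Run from a wrapper script (or regenerated daily by cron, as in the workaround the table references), this gives pip users the same sliding window that pnpm and uv provide natively.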

4. Dependency Pinning and Lock Files: The State of Practice

The Python AI ecosystem has historically been poor at dependency pinning. The shift to uv is changing this [10][11]:

What uv provides: Cross-platform deterministic lock files (uv.lock), hash verification for all resolved packages, and automatic environment sync on uv run. Unlike pip-tools output, which is resolved per platform, a single uv.lock works across platforms. This is particularly valuable for AI projects with CUDA/CPU build matrix requirements [10].

Hash verification is the critical feature for supply chain defense. Running pip install --require-hashes -r requirements.txt ensures that even if a package version is replaced on PyPI, the hash mismatch blocks installation [7]. uv supports this natively through its lock file.
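The underlying check is plain digest comparison. This sketch (names are illustrative, not pip or uv internals) shows why swapping an artifact on PyPI fails a hash-pinned install:

```python
import hashlib

# The lock file pins a sha256 per artifact; the installer only
# proceeds when the downloaded bytes reproduce that digest.
def verify_artifact(data: bytes, pinned_sha256: str) -> bool:
    return hashlib.sha256(data).hexdigest() == pinned_sha256

wheel = b"fake wheel bytes"                    # stand-in for a .whl
good = hashlib.sha256(wheel).hexdigest()       # what the lock file pins

print(verify_artifact(wheel, good))            # True: install proceeds
print(verify_artifact(b"tampered", good))      # False: install blocked
```

A registry-side replacement like the malicious LiteLLM releases produces different bytes, hence a different digest, and the install aborts before any code from the package can run.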

The gap: Dependencies are now 278 days behind their latest major version on average, up from 215 days the prior year [9]. This creates pressure to update quickly when updates arrive, which conflicts with cooldown discipline. The resolution is automated dependency updates (Dependabot/Renovate) combined with cooldown enforcement — update frequently but never on day zero.

PyPI attestations add another layer: Sigstore-based attestations, now generally available, tie package releases to specific CI workflows via OIDC [13]. Over 20,000 packages already publish attestations. If LiteLLM had used Trusted Publishers with attestations, a malicious publish would have required compromising that specific GitHub Actions workflow. TeamPCP did exactly that upstream, but even then the attestation would have produced an auditable trail and forced the publish to match the exact workflow identity.

5. CI/CD Hardening: The Upstream Battlefield

The TeamPCP campaign started in CI/CD and cascaded outward. The statistics on CI/CD security posture are sobering [9][12]:

  • Only 4% of organizations pin action hashes for all marketplace actions
  • 71% never pin hashes for any marketplace actions
  • 80% use unpinned third-party actions not managed by GitHub
  • 2% actively run previously compromised actions without pinning
  • 87% of organizations have at least one exploitable vulnerability affecting 40% of all services

GitHub’s 2026 Actions Security Roadmap [5] addresses these gaps with five features:

  1. Workflow Dependency Locking: A dependencies: section in workflow YAML that locks all direct and transitive dependencies by commit SHA, with hash verification before execution. (Public preview in 3-6 months)
  2. Policy-Driven Execution Controls: Centralized rulesets controlling who can trigger which workflows. (Public preview in 3-6 months)
  3. Scoped Secrets: Secrets bindable to specific repos, branches, environments, and workflow paths — eliminating implicit credential inheritance. (Public preview in 3-6 months)
  4. Actions Data Stream: Near real-time execution telemetry to S3, Azure Event Hub, or Azure Data Explorer. (Public preview in 3-6 months)
  5. Native Egress Firewall: Layer 7 network boundary outside the runner VM, immutable even with root access inside the runner. Monitor mode and enforce mode. (Public preview in 6-9 months)

Available today: StepSecurity’s Harden-Runner [16] provides runtime security monitoring for GitHub Actions — network egress baselining, file integrity monitoring, and anomaly detection. It detected the tj-actions breach and is used by CISA for their own workflows. The OpenSSF Maintainers’ Guide [12] provides a concrete hardening checklist: pin actions by SHA, configure GITHUB_TOKEN with minimal permissions per job, require MFA for all contributors with commit/release access, enable tag and branch protection, and use GitHub Environments for secret access approval.

6. Defensive Architecture for LLM Proxy Layers

Teams running LLM proxy infrastructure (LiteLLM, Portkey, Kong AI Gateway, or custom) need patterns beyond standard dependency hygiene [6][7][8]:

Credential isolation: Never pass provider API keys as environment variables accessible to the proxy process. Use secrets managers (AWS Secrets Manager, Vault) with short-lived credentials and just-in-time retrieval. If the proxy process is compromised, the attacker should not find credentials in memory or on disk.

Network segmentation: The LLM proxy should run in an isolated network segment with explicit egress rules. Allowlist only the specific API endpoints it needs to reach (e.g., api.openai.com, api.anthropic.com). The TeamPCP payload exfiltrated to models.litellm[.]cloud — a domain that has no business being in an egress allowlist for a legitimate LLM proxy.
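Real enforcement belongs at the network layer (firewall rules or an egress proxy), but the allowlist logic itself is simple. A hypothetical application-level sketch, using the hostnames from the example above:

```python
from urllib.parse import urlparse

# Illustrative allowlist check (not a LiteLLM API): outbound requests
# are refused unless the host is an explicitly approved provider
# endpoint. Exact-match on hostname defeats lookalike subdomains.
ALLOWED_HOSTS = {"api.openai.com", "api.anthropic.com"}

def egress_allowed(url: str) -> bool:
    return urlparse(url).hostname in ALLOWED_HOSTS

print(egress_allowed("https://api.openai.com/v1/chat/completions"))  # True
print(egress_allowed("https://models.litellm.cloud/upload"))         # False
```

Under this policy the TeamPCP exfiltration endpoint is rejected outright; an attacker inside the proxy process would still be unable to reach it if the same rule is mirrored in network egress controls.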

Private package mirrors: Deploy devpi, Artifactory, or similar. Serve only vetted package versions. This is the most effective single control for preventing fast-moving supply chain attacks from reaching production [7].

Datadog Supply-Chain Firewall (SCFW) [17]: An open-source tool that wraps pip install and npm install, blocking known malicious packages by querying Datadog’s malicious packages dataset, OSV.dev advisories, and package registry metadata. Install via pip install scfw (or preferably pipx), run as scfw run pip install <package>. Low-friction enough for developer workstations.

.pth file monitoring: The v1.82.8 payload used a .pth file to execute code on every interpreter startup. Monitor Python site-packages directories for new .pth files containing import or exec statements: find $(python -c "import site; print(site.getsitepackages()[0])") -name "*.pth" -exec grep -El "import|exec" {} \; [7].
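The same sweep can be done from Python, which is easier to run across many environments. A sketch; the regex is deliberately broader than CPython's actual rule (only lines starting with import followed by a space or tab are executed), and legitimate tooling such as virtualenv and setuptools does ship import-style .pth files, so hits need human review:

```python
import re
import site
from pathlib import Path

# Flag .pth files containing executable lines rather than plain
# path entries. Broader than CPython's rule on purpose: better to
# review a false positive than miss a payload.
SUSPICIOUS = re.compile(r"^\s*(import\s|exec\b)")

def suspicious_pth_files(directories=None):
    directories = directories or site.getsitepackages()
    hits = []
    for d in directories:
        for pth in Path(d).glob("*.pth"):
            try:
                text = pth.read_text(errors="replace")
            except OSError:
                continue  # unreadable file; skip rather than crash
            if any(SUSPICIOUS.match(line) for line in text.splitlines()):
                hits.append(str(pth))
    return hits

print(suspicious_pth_files())
```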

7. The SLSA Framework and Where AI Stacks Stand

SLSA (Supply-chain Levels for Software Artifacts) v1.2 [18] provides the most rigorous framework for build integrity, but adoption in the AI Python ecosystem is early. SLSA Level 2 — which requires build integrity through CI/CD pipelines, provenance metadata, and cryptographic artifact signing — is achievable today using PyPI Trusted Publishers + Sigstore attestations + GitHub Actions [13][18].

The gap: most AI infrastructure packages do not yet publish SLSA provenance. None of the packages compromised in the TeamPCP campaign had SLSA attestations that would have made the unauthorized publishes immediately detectable. SLSA Level 3 (hardened builds with ephemeral, isolated build environments) would likely have closed this publishing path outright, since compromised maintainer credentials alone would not have been sufficient to publish from outside the expected build environment.

The tradeoff: SLSA compliance adds operational overhead to the release process, and many AI/ML projects are maintained by small teams or individuals who prioritize velocity. The ecosystem needs SLSA adoption to become zero-configuration for standard GitHub Actions workflows — which is the direction PyPI’s Trusted Publisher + attestation work is heading [13].

Practical Implications

Immediate Actions (This Week)

  1. Check exposure: Verify no environment has installed litellm==1.82.7, litellm==1.82.8, telnyx==4.87.1, or telnyx==4.87.2. If found, treat as full credential compromise — rotate all credentials reachable from that environment, including LLM provider API keys, cloud credentials, SSH keys, Kubernetes tokens, and CI/CD secrets [2][3].

  2. Hunt for persistence artifacts: Search for litellm_init.pth in site-packages, ~/.config/sysmon/sysmon.py, /tmp/pglog, and Kubernetes pods matching node-setup-* [2].

  3. Audit .pth files in all Python environments used for AI workloads [7].
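The exposure check in step 1 can be scripted. A sketch, assuming the audited environment is the one running the script; COMPROMISED and find_compromised are illustrative names, and the lookup parameter exists so the check can be pointed at any environment's metadata:

```python
from importlib import metadata

# The version pairs named in step 1 above.
COMPROMISED = {
    "litellm": {"1.82.7", "1.82.8"},
    "telnyx": {"4.87.1", "4.87.2"},
}

def find_compromised(get_version=metadata.version):
    hits = []
    for pkg, bad in COMPROMISED.items():
        try:
            version = get_version(pkg)
        except metadata.PackageNotFoundError:
            continue  # package not installed in this environment
        if version in bad:
            hits.append(f"{pkg}=={version}")
    return hits

# An empty list means no known-compromised version is installed here.
print(find_compromised())
```

Any hit should trigger the full credential rotation described in step 1, not just a package downgrade.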

Short-Term Hardening (Next 30 Days)

  1. Enable package manager cooldowns: Configure --exclude-newer in uv (recommended: 3 days minimum for non-critical packages, 1 day for critical with override) or --uploaded-prior-to in pip [4].

  2. Switch to hash-pinned lock files: Adopt uv.lock for all AI infrastructure projects. Commit lock files to version control. Verify hashes on every install [7][10].

  3. Pin all GitHub Actions by SHA: Replace uses: actions/checkout@v4 with uses: actions/checkout@<full-sha>. Use Dependabot or Renovate to manage updates [12].

  4. Deploy SCFW on developer workstations: pipx install scfw && scfw configure to wrap pip/npm installs with malicious package blocking [17].

  5. Adopt PyPI Trusted Publishers: For any packages you maintain, eliminate stored API tokens in favor of OIDC-based authentication tied to specific GitHub Actions workflows [7][13].

Strategic Architecture (Next Quarter)

  1. Map your credential dependency graph: Identify every package in your AI stack that has access to provider API keys, cloud credentials, or infrastructure secrets. Treat each as a critical dependency regardless of its direct usage [3][6].

  2. Isolate LLM proxy credentials: Move from environment variables to secrets manager with just-in-time retrieval. Implement network egress allowlists on proxy infrastructure [6][7].

  3. Deploy private package mirrors: For production AI infrastructure, serve packages from a vetted mirror (devpi, Artifactory) rather than directly from PyPI [7].

  4. Adopt StepSecurity Harden-Runner: Add runtime security monitoring to CI/CD workflows, particularly those that publish packages or deploy infrastructure [16].

  5. Evaluate safetensors migration: For any model serialization workflows using pickle format, begin migration to safetensors. Where pickle cannot be eliminated, deploy dedicated model file scanning [14][15].

Defensive Checklist for LLM Proxy Deployments

  • Provider API keys retrieved from secrets manager, not environment variables
  • Proxy runs in isolated network segment with explicit egress allowlist
  • Proxy dependencies pinned with hash verification in lock file
  • Package manager cooldown enabled (minimum 72 hours recommended)
  • .pth file monitoring enabled on all Python environments
  • CI/CD actions pinned by SHA with automated update management
  • Credential rotation plan documented and tested for supply chain incident
  • SCFW or equivalent installed on developer workstations
  • Runtime egress monitoring on CI/CD runners (Harden-Runner or equivalent)
  • SBOM generated and maintained for the proxy dependency tree

Open Questions

  1. Cooldown calibration: What is the right cooldown duration? Two hours was the LiteLLM exposure window, but detection time varies wildly. Is 72 hours enough? Should cooldown scale with package criticality or download volume? No empirical data exists yet on optimal thresholds.

  2. Transitive dependency visibility: Teams running CrewAI, DSPy, or LangChain may not realize they depend on LiteLLM transitively. How should frameworks surface their transitive dependency risk to consumers, particularly for credential-touching packages?

  3. SLSA adoption curve for AI packages: PyPI Trusted Publishers and Sigstore attestations are available but not yet widely adopted by AI infrastructure packages. What would accelerate adoption — registry enforcement, framework requirements, or enterprise procurement pressure?

  4. Model file supply chain: The safetensors migration is incomplete, and picklescan has demonstrated bypass vulnerabilities. Is a fundamentally different approach needed for model file integrity verification? Could SLSA-style provenance work for model artifacts on Hugging Face Hub?

  5. Agentic CI/CD risk: The hackerbot-claw campaign [19] demonstrated autonomous AI bots scanning for exploitable CI/CD patterns. As AI agents become more capable, does the attack surface for CI/CD pipeline poisoning grow faster than defenses can keep up?

  6. Cooldown vs. security patches: Cooldowns delay malicious packages but also delay legitimate security fixes. The Python ecosystem lacks a mechanism for “emergency bypass” that would let vetted maintainers publish urgent patches without cooldown. Is this a gap or an acceptable tradeoff?

Sources

  1. Simon Willison, “Malicious LiteLLM,” March 24, 2026 — https://simonwillison.net/2026/Mar/24/malicious-litellm/
  2. Datadog Security Labs, “LiteLLM and Telnyx compromised on PyPI: Tracing the TeamPCP supply chain campaign,” March 2026 — https://securitylabs.datadoghq.com/articles/litellm-compromised-pypi-teampcp-supply-chain-campaign/
  3. Cloud Security Alliance, “TeamPCP and the Cascading AI/ML Supply Chain Campaign,” March 29, 2026 — https://labs.cloudsecurityalliance.org/research/csa-research-note-ai-pypi-supply-chain-campaign-20260329-csa/
  4. Simon Willison, “Package managers need to cool down,” March 24, 2026 — https://simonwillison.net/2026/Mar/24/package-managers-need-to-cool-down/
  5. GitHub Blog, “What’s coming to our GitHub Actions 2026 security roadmap,” March 2026 — https://github.blog/news-insights/product-news/whats-coming-to-our-github-actions-2026-security-roadmap/
  6. DreamFactory, “Why the LiteLLM Supply Chain Attack Is a Wake-Up Call for AI API Credential Management,” March 2026 — https://blog.dreamfactory.com/why-the-litellm-supply-chain-attack-is-a-wake-up-call-for-ai-api-credential-management
  7. Mozilla.ai, “Hardening Your LLM Dependency Supply Chain,” March 2026 — https://blog.mozilla.ai/hardening-your-llm-dependency-supply-chain/
  8. ARMO, “The Library That Holds All Your AI Keys Was Just Backdoored,” March 2026 — https://www.armosec.io/blog/litellm-supply-chain-attack-backdoor-analysis/
  9. Datadog / StepSecurity, “Datadog’s DevSecOps 2026 Report,” March 2026 — https://www.stepsecurity.io/blog/datadogs-devsecops-2026-report-validates-what-weve-been-building
  10. Cuttlesoft, “Python Dependency Management in 2026,” January 2026 — https://cuttlesoft.com/blog/2026/01/27/python-dependency-management-in-2026/
  11. ShiftMag, “Goodbye Python chaos: Meet uv, the AI engineer’s superpowered tooling” — https://shiftmag.dev/tame-python-chaos-with-uv-the-superpower-every-ai-engineer-needs-6051/
  12. OpenSSF, “Maintainers’ Guide: Securing CI/CD Pipelines After the tj-actions and reviewdog Supply Chain Attacks,” June 2025 — https://openssf.org/blog/2025/06/11/maintainers-guide-securing-ci-cd-pipelines-after-the-tj-actions-and-reviewdog-supply-chain-attacks/
  13. PyPI / Sigstore Blog, “PyPI’s Sigstore-powered attestations are now generally available” — https://blog.sigstore.dev/pypi-attestations-ga/
  14. DEV Community / Luke Hinds, “Understanding SafeTensors: A Secure Alternative to Pickle for ML Models” — https://dev.to/lukehinds/understanding-safetensors-a-secure-alternative-to-pickle-for-ml-models-o71
  15. Sonatype, “Exposing 4 Critical Vulnerabilities in Python Picklescan” — https://www.sonatype.com/blog/bypassing-picklescan-sonatype-discovers-four-vulnerabilities
  16. StepSecurity / Harden-Runner GitHub repository — https://github.com/step-security/harden-runner
  17. Datadog, “Supply-Chain Firewall (SCFW)” — https://github.com/DataDog/supply-chain-firewall
  18. SLSA Framework — https://slsa.dev/
  19. Datadog Engineering, “When an AI agent came knocking: Catching malicious contributions in Datadog’s open source repos” — https://www.datadoghq.com/blog/engineering/stopping-hackerbot-claw-with-bewaire/
  20. TrueFoundry, “Supply Chain Attacks in AI: What the LiteLLM Incident Reveals” — https://www.truefoundry.com/blog/supply-chain-attack-ai-infrastructure-litellm
  21. The Record, “Supply chain attack hits widely-used AI package” — https://therecord.media/supply-chain-attack-hits-widely-used-ai-package
  22. Sonatype, “Compromised litellm PyPI Package Delivers Multi-Stage Credential Stealer” — https://www.sonatype.com/blog/compromised-litellm-pypi-package-delivers-multi-stage-credential-stealer