
The LiteLLM Supply Chain Attack: 47,000 Downloads in 46 Minutes

March 26, 2026

By SolaScript

A simple pip install litellm was all it took. On March 24, 2026, a poisoned version of the popular AI library LiteLLM hit PyPI and started exfiltrating SSH keys, AWS credentials, Kubernetes configs, crypto wallets, and everything else attackers dream about—all within 46 minutes before anyone noticed.

What makes this attack fascinating isn’t just its scope or sophistication. It’s that we only found out about it because the attacker made a rookie mistake. The malware had a bug that turned it into a fork bomb, crashing machines and making itself impossible to ignore. As Andrej Karpathy pointed out on X: “if the attacker didn’t vibe code this attack it could have been undetected for many days or weeks.”

That observation should keep every developer up at night.

How It Went Down

The story comes from Callum McMahon at FutureSearch, who had the misfortune of becoming one of the malware’s first victims. He was working with an MCP plugin inside Cursor that pulled in LiteLLM as a transitive dependency. When version 1.82.8 installed, his 48GB Mac ground to a halt—CPU pegged at 100%, htop taking 10+ seconds to load, 11,000 processes running.

Something was very wrong.

After a hard reset and some investigation with Claude Code, McMahon found the culprit: a rogue package buried in his uv cache. The malicious LiteLLM version had installed a file called litellm_init.pth in site-packages. Python automatically executes .pth files on every interpreter startup—a feature that became a vulnerability.

The .pth file’s first action was to spawn a child Python process. But because .pth files trigger on every interpreter startup, that child process also triggered the .pth file, which spawned another child, which spawned another. Exponential fork bomb. The attacker’s own sloppy implementation—likely vibe-coded according to Karpathy—is what made it visible.
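The startup hook itself is easy to reproduce harmlessly. In a minimal sketch (file name and environment variable are illustrative; `site.addsitedir` stands in for interpreter startup), any line in a `.pth` file that begins with `import` is executed by Python's site machinery, not merely imported:

```python
import os
import site
import tempfile

# Harmless demo of the hook the 1.82.8 payload abused: lines in a .pth
# file that start with "import " are exec'd by the site module.
site_dir = tempfile.mkdtemp()
with open(os.path.join(site_dir, "demo_init.pth"), "w") as f:
    # One line of arbitrary code, smuggled behind an "import" prefix.
    f.write("import os; os.environ['PTH_RAN'] = '1'\n")

# addsitedir processes .pth files the same way interpreter startup
# processes site-packages directories.
site.addsitedir(site_dir)
print(os.environ.get("PTH_RAN"))  # → 1
```

Because pip drops the file into site-packages, every interpreter that starts afterward runs the payload, which is why a child process spawned by the payload retriggers it.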

McMahon quickly analyzed the malware, documented it, and reported it to both PyPI and the LiteLLM maintainers. PyPI quarantined the packages within 46 minutes of the malicious versions first being uploaded—remarkably fast response once the report came in.

What the Malware Actually Did

Before that fork bomb bug crashed systems, the payload was doing exactly what you’d expect sophisticated malware to do. According to FutureSearch’s technical analysis, it operated in three stages:

Collection. A Python script harvested sensitive files from the host: SSH private keys and configs, .env files, AWS/GCP/Azure credentials, Kubernetes configs, database passwords, .gitconfig, shell history, crypto wallet files, and anything matching common secret patterns. It also ran commands to dump environment variables and query cloud metadata endpoints.
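Defenders can turn that same target list into a rotation checklist. A minimal audit sketch (the glob patterns are illustrative, drawn from the file categories above) lists which of those files exist on a host, i.e. what an attacker in the same position could have reached:

```python
import glob
import os

# Illustrative patterns matching the file classes the payload harvested.
PATTERNS = [
    "~/.ssh/id_*", "~/.ssh/config", "~/.aws/credentials",
    "~/.kube/config", "~/.gitconfig", "~/.env", "~/*_history",
]

hits = []
for pat in PATTERNS:
    hits.extend(glob.glob(os.path.expanduser(pat)))

print(f"{len(PATTERNS)} patterns checked, {len(hits)} sensitive files present")
```

Anything this turns up on a machine that installed a compromised version should be treated as stolen and rotated.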

Exfiltration. The collected data was encrypted with AES-256-CBC, with the symmetric key wrapped by a hardcoded 4096-bit RSA public key, bundled into a tar archive, and POSTed to models.litellm.cloud—a domain that is not part of legitimate LiteLLM infrastructure.

Lateral Movement and Persistence. If a Kubernetes service account token was present, the malware read all cluster secrets across all namespaces and attempted to create privileged alpine pods on every node. Each pod mounted the host filesystem and installed a persistent backdoor at /root/.config/sysmon/sysmon.py with a systemd service.

This wasn’t a script kiddie attack. This was methodical, professional credential harvesting with infrastructure set up to receive the stolen data.

The Numbers Are Terrifying

FutureSearch’s follow-up analysis reveals the blast radius: 46,996 downloads in 46 minutes.

LiteLLM itself has around 97 million downloads per month. But the real danger wasn’t direct installs—it was transitive dependencies. 2,337 packages on PyPI depend on LiteLLM. 88% of those had version specs that would have resolved to the compromised versions during the attack window.

If you ran pip install dspy during those 46 minutes, you got pwned. Same for any other project that depended on litellm>=1.64.0 or similar unpinned constraints. The malware executed during installation itself—before any application code ran. The act of installing was enough.

Here’s what made version pinning so critical:

  • litellm==1.80.0 → Safe. The resolver installs exactly that version.
  • litellm>=1.0 → Exposed. No upper bound, so the resolver picks the newest release.
  • litellm (no constraint) → Exposed. Same problem.

Only 12% of dependent packages were protected by explicit version pinning or upper bounds that excluded 1.82.x.
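The exposure rule is easy to model. This toy check is not pip's real resolver (which also handles pre-releases, `~=`, and multi-clause specifiers), but it makes the pinning logic concrete:

```python
def vtuple(v):
    """Parse '1.82.8' into a comparable tuple (crude; ignores pre-releases)."""
    return tuple(int(p) for p in v.split("."))

def exposed(spec, bad="1.82.8"):
    """Would this single-clause specifier have admitted the bad version?"""
    if not spec:                       # no constraint → latest wins
        return True
    if spec.startswith("=="):
        return vtuple(spec[2:]) == vtuple(bad)
    if spec.startswith(">="):
        return vtuple(bad) >= vtuple(spec[2:])
    if spec.startswith("<"):
        return vtuple(bad) < vtuple(spec[1:])
    return True                        # unknown clause: assume exposed

print(exposed("==1.80.0"))  # → False: exact pin is safe
print(exposed(">=1.0"))     # → True: no upper bound
print(exposed(""))          # → True: unconstrained
print(exposed("<1.82"))     # → False: upper bound excludes 1.82.x
```

The 12% of protected packages were exactly those whose specifiers fell into the two False cases.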

Two Different Attacks, Same Window

There were actually two compromised versions: 1.82.7 and 1.82.8. They used different attack vectors and exfiltrated to different command-and-control servers.

Version 1.82.8 used the .pth file approach. Python executes these on any interpreter startup, including during pip install itself. If your resolver picked this version, the payload ran before your application had any chance to intervene.

Version 1.82.7 injected a payload into proxy_server.py that dropped a secondary script. It only triggered when litellm.proxy was imported—primarily affecting proxy server deployments rather than general SDK usage.

Both payloads harvested the same material. Both exfiltrated to attacker-controlled infrastructure. The .pth version was more aggressive, executing on installation rather than waiting for specific imports.

Why This Keeps Happening

Supply chain attacks like this are what Karpathy called “basically the scariest thing imaginable in modern software.” Every time you install any dependency, you could be pulling in a poisoned package anywhere deep inside its entire dependency tree.

Classical software engineering would have you believe that dependencies are good—we’re building pyramids from bricks. But that metaphor breaks down when any brick in your pyramid can secretly be a bomb. Karpathy’s take: “it’s why I’ve been so growingly averse to them, preferring to use LLMs to ‘yoink’ functionality when it’s simple enough and possible.”

There’s also a bitter irony here, as McMahon points out. Simon Willison has been warning about “the lethal trifecta” surrounding MCP servers for almost a year—the combination of tool access, data exposure, and LLM manipulation that makes prompt injection attacks so dangerous. Yet in this case, MCP servers got compromised via “regular old supply chain attacks, no tricking of LLMs required.”

What You Should Do

Check if you’re affected. If you installed or upgraded LiteLLM on or after March 24, 2026, run pip show litellm and check for version 1.82.7 or 1.82.8. Search your package manager caches:

find ~/.cache/uv -name "litellm_init.pth" 2>/dev/null
find ~/.cache/pip -name "litellm_init.pth" 2>/dev/null

Look for persistence mechanisms:

ls -la ~/.config/sysmon/sysmon.py 2>/dev/null
ls -la ~/.config/systemd/user/sysmon.service 2>/dev/null

In Kubernetes, check for unauthorized pods:

kubectl get pods -n kube-system | grep node-setup

If you find any of the above, assume all credentials on the machine are compromised. Rotate SSH keys, cloud provider credentials, API keys, and database passwords. The PyPI packages have been quarantined, but stolen secrets remain valid until someone rotates them.

The Bigger Picture

This attack succeeded because of a compromised maintainer account—neither malicious version has a corresponding tag or release on the LiteLLM GitHub repository. They were uploaded directly to PyPI, bypassing the normal release process.

PyPI’s Trusted Publishers feature ties package uploads to specific CI workflows. If LiteLLM had used it, the attacker would have needed to compromise the GitHub Actions workflow, not just a PyPI API token. Sigstore attestations go further, creating a cryptographic chain from source commit to published artifact.
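For a GitHub-hosted project, Trusted Publishing is roughly a one-job workflow. A sketch (workflow and tag names are illustrative; the real pypa/gh-action-pypi-publish action exchanges a short-lived OIDC token for upload rights, so there is no long-lived PyPI API token to steal):

```yaml
# Illustrative release workflow using PyPI Trusted Publishing.
name: release
on:
  push:
    tags: ["v*"]
jobs:
  pypi:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # grants the OIDC token; no stored PyPI secret needed
    steps:
      - uses: actions/checkout@v4
      - run: python -m pip install build && python -m build
      - uses: pypa/gh-action-pypi-publish@release/v1
```

With this in place, a stolen PyPI password or token alone cannot publish a release—the upload must come from this workflow.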

Both are free. Every package maintainer should enable them.

Lessons for the AI Era

The LiteLLM attack is particularly concerning because of what LiteLLM is: a library that provides a unified interface to dozens of AI providers. Organizations using it have API keys for OpenAI, Anthropic, Google, and others loaded into the same environment. One compromised dependency, and an attacker holds the keys to your entire AI stack.

The attack also highlights how the AI tooling ecosystem has created new attack surfaces. MCP servers, AI coding assistants like Cursor, and the rapid iteration culture of “vibe coding” all contribute to environments where dependencies get pulled in without much scrutiny. When your IDE is automatically downloading packages to make an AI plugin work, you’re trusting a lot of upstream code.

McMahon’s recommendation after this experience: move to a remote MCP architecture where the server doesn’t run on the user’s machine. No local code execution means a poisoned dependency can’t touch your filesystem. It’s not always possible, but it collapses this entire attack surface when it is.

The Unsettling Conclusion

The scariest part of this story isn’t that it happened. It’s that we found out at all. The attacker’s fork bomb bug was the only reason this attack became visible within minutes rather than weeks. A more careful implementation would have quietly exfiltrated credentials from thousands of developer machines and CI/CD pipelines without anyone noticing.

How many similar attacks have succeeded without crashing anything? We don’t know. We can’t know.

Pin your dependencies. Use lock files with checksums. Audit packages before upgrading. Enable Trusted Publishers if you maintain packages. And maybe, as Karpathy suggests, think twice before adding that next dependency to your project.

Sometimes building your own brick is safer than trusting the supply chain.


Published by

Sola Fide Technologies - SolaScript

This blog post was crafted by AI Agents, leveraging advanced language models to provide clear and insightful information on the dynamic world of technology and business innovation. Sola Fide Technology is a leading IT consulting firm specializing in innovative and strategic solutions for businesses navigating the complexities of modern technology.
