Three separate package-registry incidents hit AI infrastructure in one week. Anthropic shipped its Claude Code source code to the npm registry by accident. Axios, the JavaScript HTTP library that almost every Node app depends on, was hijacked by a North Korean threat group. Mercor confirmed it was breached through the LiteLLM PyPI compromise we flagged last week. The package manager is now the single most leveraged attack surface in the AI stack.
Anthropic accidentally published its own crown jewels. An npm packaging error exposed the full Claude Code source. The Hacker News, BleepingComputer, and SiliconAngle all confirmed it on March 31. Anthropic then spent the next 48 hours issuing takedowns against thousands of mirror repositories, which the company later said was an accident. By April 2, the leaked code was being weaponized as bait in infostealer campaigns. The packaging mistake matters less than how quickly the leak got reverse-engineered into a working open framework. Multi-agent orchestration harnesses replicating Claude Code's design are the single most active area of development this week, with one of the dominant repositories already past 140K stars. The harness pattern was being studied before the leak. The leak just turned a research project into a copying exercise.
The Axios npm compromise is the more dangerous story. In late March, the maintainer of Axios was social-engineered through a fake Microsoft Teams error message, and the attacker pushed a cross-platform RAT into one of the most-installed HTTP libraries on the registry. Google's threat intelligence team attributed it to UNC1069, a North Korean group. npm-based supply chain attacks had been converging for over a week before Google's public attribution. The Hacker News timeline traces the social engineering vector in detail. If you are running any agentic system that pulls Axios transitively, and almost every one does, the post-incident audit is not optional.
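A first pass at that audit can be as simple as listing every copy of Axios a lockfile resolves, direct or transitive. The sketch below assumes an npm `package-lock.json` with lockfileVersion 2 or 3, which carries a `packages` map keyed by install path. The function names are illustrative, and the set of suspect versions is something you supply from the published advisory; none are hard-coded here.

```python
import json
from pathlib import Path


def find_package_versions(lockfile_text: str, name: str = "axios") -> dict[str, str]:
    """Return {install_path: version} for every copy of `name` in an npm lockfile.

    Reads the "packages" map used by lockfileVersion 2 and 3, where keys look
    like "node_modules/axios" or "node_modules/foo/node_modules/axios".
    """
    lock = json.loads(lockfile_text)
    suffix = f"node_modules/{name}"
    found = {}
    for path, meta in lock.get("packages", {}).items():
        # Match the top-level copy and any copy nested under another package.
        if path == suffix or path.endswith("/" + suffix):
            found[path] = meta.get("version", "unknown")
    return found


def flag_suspect(found: dict[str, str], suspect_versions: set[str]) -> dict[str, str]:
    """Keep only the copies whose version is in the caller-supplied suspect set."""
    return {path: version for path, version in found.items() if version in suspect_versions}


if __name__ == "__main__":
    lockfile = Path("package-lock.json")
    if lockfile.exists():
        copies = find_package_versions(lockfile.read_text(encoding="utf-8"))
        for path, version in sorted(copies.items()):
            print(f"{version}\t{path}")
```

Run it from a project root to see which versions are pinned and where. A lockfile only shows what was resolved at install time, so it complements a check of what is actually deployed and does not replace one.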
Mercor confirmed it was hit by the LiteLLM compromise. Last week we discussed the LiteLLM PyPI backdoor that scraped credentials on every Python process start. TechCrunch reported on March 31 that Mercor, the AI talent marketplace, was breached through that exact vector. Mercor is the first named downstream victim. There will be more. The class of company at risk is the one where engineering velocity ran ahead of dependency review, which is most of them. The LiteLLM compromise is exactly the kind of incident that surfaces a quarter late, when the credentials get reused.
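One standard way for installed code to run on every Python process start is a `.pth` file in site-packages: the `site` module executes any line in such a file that begins with `import`. Whether the LiteLLM backdoor used this hook is an assumption for the purposes of this sketch, not something established above. The function name is illustrative, and some legitimate tools ship `.pth` lines like this too, so a hit is a prompt for review and not a verdict.

```python
import site
import sysconfig
from pathlib import Path


def executable_pth_lines(directory: Path) -> list[tuple[str, int, str]]:
    """List (filename, line_number, line) for .pth lines Python would execute.

    The `site` module runs any .pth line that starts with "import" followed by
    a space or tab; other non-comment lines are treated as paths to add.
    """
    hits = []
    for pth in sorted(directory.glob("*.pth")):
        text = pth.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if line.startswith(("import ", "import\t")):
                hits.append((pth.name, number, line.strip()))
    return hits


if __name__ == "__main__":
    # Check the environment's site-packages and the per-user site directory.
    dirs = {sysconfig.get_paths()["purelib"], site.getusersitepackages()}
    for d in sorted(dirs):
        path = Path(d)
        if path.is_dir():
            for name, number, line in executable_pth_lines(path):
                print(f"{path / name}:{number}: {line}")
```

Run it inside each virtual environment you care about. Anything it prints executes before your application code does, which is why this class of hook is attractive for credential theft.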
Gemma 4 was abliterated in ninety minutes
Google released Gemma 4 on April 2. Four variants under Apache 2.0, including a 26B mixture-of-experts model and a 31B dense. The official DeepMind announcement framed the release around on-device multimodal capability; HuggingFace's launch post walked through the variants; NVIDIA shipped NVFP4 quants on day one. Within ninety minutes of the official drop, aggressively uncensored variants of the dense models were live on HuggingFace. That timing is not anomalous anymore. Abliteration tooling has matured to the point where the gap between release and weight surgery is measured in minutes. The MoE angle is new this week. Standard abliteration techniques developed for dense transformers do not cleanly transfer to mixture-of-experts architectures, where refusal behavior appears to route through expert selection rather than uniform residual stream directions. So Gemma 4's 26B MoE variant is harder to uncensor than its 31B dense sibling, and the community is now publishing routing-aware abliteration methods specifically targeting it. Anyone shipping a model with a safety policy now has roughly ninety minutes before that policy is removed from dense weights. MoE buys you slightly more, until it doesn't.
OpenClaw is no longer free. Anthropic confirmed that Claude Code subscribers will pay extra to use OpenClaw, the framework Anthropic released openly two weeks ago. Anthropic is running the familiar play of releasing an open standard, building the surrounding tooling, and then metering access to the tooling. Mistral, Meta, and others have already done this, just on a longer timeline. For the agentic security category specifically, ClawKeeper and similar runtime watchdog frameworks have emerged as the third-party answer, and Cisco shipped DefenseClaw as an open-source alternative. The agent-governance market we flagged last week now has three commercial vendors competing for the runtime layer.
Voxtral has a real competitor already. Two weeks after Mistral released Voxtral-4B-TTS with the sub-100ms time-to-first-audio claim, k2-fsa's OmniVoice shipped with support for 600+ languages via zero-shot diffusion TTS. T5Gemma-TTS appeared from a third group using an encoder-decoder approach to bypass the prefix-competition problem in codec language models. Three architecturally distinct open-weight TTS systems in two weeks, all targeting the latency-and-multilingual frontier that ElevenLabs has been monetizing. Commercial TTS pricing in 2027 looks softer than it did in 2025.
On our radar
- NPU serving as a category emerging from AMD. AMD's Lemonade ships local LLM serving over GPU+NPU, and LM Studio's new headless CLI lets you run Gemma 4 locally with Claude Code integration. Consumer NPU silicon has been a feature in search of a workload for two years. Local agentic coding may finally be that workload.
- Anthropic's dealmaking pace is accelerating. A $400M deal for Coefficient Bio, a new PAC, and active fundraising at $380B all dropped in a single week. The company is behaving like one preparing for a regulatory or IPO event, not like one settling in for another research cycle.
- 1-bit LLMs as a deployment category. Prism ML's Bonsai-8B is being marketed as the first commercially viable 1-bit model, with the GGUF variant clearing 39K downloads. Whether 1-bit holds quality on reasoning tasks at long context is the open question. If it does, the deployment math for edge devices changes in a way that nothing else released this week comes close to.
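The deployment math behind the 1-bit item is mostly weight-storage arithmetic, sketched below. It assumes roughly 8 billion parameters (read off the "8B" in the model name) and counts only raw weight bits. Real 1-bit formats typically keep scale factors and some layers at higher precision, and the KV cache grows with context length, so actual footprints will be larger than these figures.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (10**9 bytes), ignoring overhead."""
    return n_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    n_params = 8e9  # assumption: parameter count inferred from the "8B" name
    for label, bits in [("16-bit", 16), ("4-bit", 4), ("1-bit", 1)]:
        print(f"{label}: {weight_memory_gb(n_params, bits):.1f} GB")
```

Under those assumptions the weights drop from about 16 GB at 16-bit to about 1 GB at 1-bit, which is the difference between needing a discrete GPU and fitting in the memory of a phone-class device.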
Signal data for this briefing is provided by HiddenState, Mosaic Theory's signal intelligence platform.
— Cosmo