Author: Ramkumar Sundarakalatharan

The LiteLLM Supply Chain Cascade: Empirical Lessons in AI Credential Harvesting and the Future of Infrastructure Assurance

TL;DR: This is an empirical study and may be long for non-researchers. If you would prefer the remediation protocol directly, skip to the Comprehensive Remediation Protocol section near the end. If you want to understand the anatomy of the attack and its background, I have made a video that serves as a quick explainer.

Background and Summary:

The compromise of the LiteLLM Python library in March 2026 stands as a definitive case study in the fragility of modern AI infrastructure. LiteLLM, an open-source API gateway, acts as a critical abstraction layer enabling developers to interface with over 100 LLM providers through a unified OpenAI-style format.¹ It effectively becomes the central node through which an organisation’s most sensitive AI credentials, including keys for OpenAI, Anthropic, Google Vertex AI, and Amazon Bedrock, are routed and managed.² Its scale is reflected in its reach, averaging approximately 97 million downloads per month on PyPI.³

On 24 March 2026, malicious actors identified as TeamPCP published two poisoned versions, 1.82.7 and 1.82.8, directly to PyPI.³ The choice of target was deliberate. By compromising a package designed to centralise AI credentials, the attackers positioned themselves for broad and immediate access across multiple providers. The breach impacted high-profile organisations such as NASA, Netflix, Stripe, and NVIDIA, underscoring LiteLLM’s deep integration into production environments.⁵

The attack itself leveraged a subtle but powerful mechanism. Version 1.82.8 introduced a malicious .pth file, ensuring code execution at Python interpreter startup, regardless of whether the library was imported.⁴ This effectively turned installation into a compromise. Detection, however, did not come from security tooling. A flaw in the attacker’s implementation triggered an uncontrolled fork bomb, exhausting system resources and crashing machines.⁵ This failure, described as “vibe coding”, became the only signal that exposed the breach, likely preventing widespread, silent exfiltration across production systems.

The Architectural Criticality of LiteLLM and the Blast Radius

The positioning of LiteLLM within the AI stack represents a classic single point of failure. Modern enterprise AI deployments often involve multiple providers to balance cost, latency, and performance requirements. LiteLLM provides the necessary proxy logic to handle these providers through a single endpoint. Consequently, the environment variables and configuration files associated with LiteLLM deployments house a concentrated wealth of sensitive information.

LiteLLM Market Penetration and Integration Scale

| Metric | Value / Entity | Ref. |
| --- | --- | --- |
| Monthly Downloads (PyPI) | 95,000,000 – 97,000,000 | 3 |
| Daily Download Average | ~3.4 – 3.6 million | 3 |
| Direct Institutional Users | NASA, Netflix, NVIDIA, Stripe | 3 |
| Transitive Dependencies | DSPy, CrewAI, MLflow, Open Interpreter | 2 |
| GitHub Stars | > 40,000 | 5 |
| Cloud Presence | Found in ~36% of cloud environments | 8 |

The blast radius of the March 24th incident extended far beyond direct users of LiteLLM. The library is a frequent transitive dependency for a wide range of AI frameworks and orchestration tools. Organizations using DSPy for building modular AI programs or CrewAI for agent orchestration inadvertently pulled the malicious versions into their environments without ever explicitly initiating a pip install litellm command.2 This highlights a fundamental tension in the AI development cycle: the speed of adoption for agentic AI tools has outpaced the visibility that security teams have into the underlying software supply chain.2

The Anatomy of the Poisoning: Versions 1.82.7 and 1.82.8

The poisoning of the LiteLLM package was conducted with a high degree of stealth, bypassing the project’s official GitHub repository entirely. The malicious versions were published straight to PyPI using stolen credentials, meaning there were no corresponding code changes, release tags, or review processes visible to the community on GitHub until after the damage had been initiated.3

Technical Divergence in Malicious Releases

In version 1.82.7, the attackers injected 12 lines of obfuscated, base64-encoded code into the litellm/proxy/proxy_server.py file.2 This payload was designed to trigger during the import of the litellm.proxy module, which is the standard procedure for users deploying LiteLLM as a proxy server. While effective, this required the proxy module to be imported before the malicious logic could run.9

Version 1.82.8, however, used the much more aggressive .pth file mechanism. The attackers included a file named litellm_init.pth (approximately 34,628 bytes) within the package root.9 In Python, the site module processes .pth files found in site-packages during the interpreter’s initialisation phase, and any line in such a file that begins with import is executed as code. By placing the payload here, the attackers ensured that the malware would fire the moment any Python interpreter started with the package present in a site-packages directory on the machine, whether it was imported or not.4 This mechanism is particularly dangerous in environments where AI development tools or language servers (like those in VS Code or Cursor) periodically scan and initialise Python environments in the background.5
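
To make the .pth mechanism concrete, the following is a minimal audit sketch, not a detection product. It lists every line in a directory's .pth files that the site module would execute, meaning lines starting with import followed by a space or tab. The function names are illustrative. Legitimate packages (editable installs, for example) also use such lines, so each finding needs manual review.

```python
import site
from pathlib import Path


def find_executable_pth_lines(directories):
    """Return (file, line_number, text) for each .pth line that runs code."""
    findings = []
    for directory in directories:
        root = Path(directory)
        if not root.is_dir():
            continue
        for pth_file in sorted(root.glob("*.pth")):
            text = pth_file.read_text(encoding="utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                # The site module executes lines beginning with "import"
                # followed by a space or tab; other lines are path entries.
                if line.startswith(("import ", "import\t")):
                    findings.append((str(pth_file), number, line[:120]))
    return findings


def default_site_directories():
    """Collect global and per-user site-packages directories when available."""
    directories = []
    if hasattr(site, "getsitepackages"):
        directories.extend(site.getsitepackages())
    if hasattr(site, "getusersitepackages"):
        directories.append(site.getusersitepackages())
    return directories
```

Calling find_executable_pth_lines(default_site_directories()) from a trusted interpreter gives a short list to review. A multi-kilobyte import line, such as the one reported for litellm_init.pth, would stand out immediately.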

Execution Triggers and Persistence Mechanisms

The .pth launcher in version 1.82.8 used a subprocess.Popen call to execute a second Python process containing the actual data-harvesting payload.10 Because the initialisation logic was flawed, an oversight attributed to “vibe coding”, this subprocess itself triggered the .pth file again, initiating a recursive chain of process creation.9 The resulting exponential fork bomb can be described by the function N(t) = 2^t, where N is the number of processes and t is the number of initialisation cycles. Within a matter of seconds, the affected machines became unresponsive, leading to the crashes that ultimately exposed the operation.9
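
As a rough illustration of how quickly that growth model saturates a machine, the sketch below evaluates N(t) = 2^t and counts the cycles needed to pass a process limit. The limit of 10,000 in the usage note is a hypothetical figure, not a measurement from the incident.

```python
def processes_after(cycles: int) -> int:
    """Process count after the given number of initialisation cycles, N(t) = 2^t."""
    return 2 ** cycles


def cycles_to_exceed(limit: int) -> int:
    """Smallest number of cycles at which the process count exceeds the limit."""
    cycles = 0
    while processes_after(cycles) <= limit:
        cycles += 1
    return cycles
```

Under this model, cycles_to_exceed(10_000) returns 14, since 2^13 = 8,192 and 2^14 = 16,384. If each cycle takes a fraction of a second, that is consistent with machines becoming unresponsive within seconds.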

The Discovery: FutureSearch and the Cursor Connection

The identification of the LiteLLM compromise began with a developer at FutureSearch, Callum McMahon, who was testing a Model Context Protocol (MCP) plugin within the Cursor AI editor.2 The plugin utilised the uvx tool for Python package management, which automatically pulled in the latest version of LiteLLM as an unpinned transitive dependency.9

When the Cursor IDE attempted to load the MCP server, the uvx tool downloaded LiteLLM 1.82.8. Almost immediately, the developer’s machine became unresponsive due to RAM exhaustion.9 Upon investigation using the Claude Code assistant to help root-cause the crash, McMahon identified the suspicious litellm_init.pth file and traced it back to the newly published PyPI release.7

This discovery highlights a significant security gap in the current AI agent ecosystem: many popular development tools and “copilots” automatically pull and execute dependencies with little to no review, creating a frictionless path for supply chain malware to reach the local machines of developers who hold broad access to corporate infrastructure.2

The TeamPCP Attack Chain: From Security Tools to AI Infrastructure

The compromise of LiteLLM was the culminating event of a multi-week campaign by TeamPCP (also tracked as PCPcat, Persy_PCP, DeadCatx3, and ShellForce).9 This actor demonstrated a profound ability to execute a “cascading” supply chain attack, where the credentials stolen from one ecosystem were used to penetrate the next.15

The March 19th Inflection Point: The Trivy Compromise

The campaign gained critical momentum on March 19, 2026, when TeamPCP targeted Aqua Security’s Trivy, the most widely adopted open-source vulnerability scanner in the cloud-native ecosystem.17 By exploiting a misconfigured workflow and a privileged Personal Access Token (PAT) that had not been fully revoked following a smaller incident in late February, the attackers gained access to Trivy’s release infrastructure.18

The attackers force-pushed malicious commits and tags (affecting 76 out of 77 tags) to the aquasecurity/trivy-action repository, silently replacing legitimate security tools with weaponized versions.6 Because LiteLLM utilized Trivy within its own CI/CD pipeline for automated security scanning, the execution of the poisoned Trivy binary allowed TeamPCP to harvest the PYPI_PUBLISH token for the LiteLLM project from the runner’s memory.9 This created a recursive irony: the security tool designed to protect the project became the very mechanism of its downfall.5
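
Because tags can be moved while a full commit SHA cannot, one practical check is to flag workflow steps that reference an action by tag or branch. The sketch below is a simplified line-based scan, not a YAML parser; it assumes the usual `uses: owner/repo@ref` layout, and the function name is illustrative.

```python
import re

# Matches "uses: owner/repo@ref", optionally preceded by a list dash.
USES_PATTERN = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)")
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def find_unpinned_actions(workflow_text):
    """Return (line_number, reference) for actions not pinned to a 40-char SHA."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_PATTERN.match(line)
        if not match:
            continue
        reference = match.group(1).strip("'\"")
        if reference.startswith(("./", "docker://")):
            continue  # local actions and container images are out of scope here
        _, _, ref = reference.partition("@")
        if not FULL_SHA.match(ref):
            findings.append((number, reference))
    return findings
```

Running this over each file in .github/workflows gives a list of references that a force-pushed tag could silently redirect.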

Multi-Ecosystem Campaign Timeline

[Figure: LiteLLM attack timeline]

By the time LiteLLM was poisoned, TeamPCP had already breached five major package ecosystems in a period of two weeks.4 Their operations moved with a speed that exceeded the industry’s ability to respond, using each breach as a stepping stone to unlock the next layer of the software stack.3

Deep Analysis of the Malware Payload

The malware deployed in the LiteLLM attack was a sophisticated, three-stage infostealer designed for comprehensive credential harvesting coupled with long-term persistence within cloud-native environments.

Stage 1: The Harvest

Once triggered, the payload initiated an exhaustive sweep of the host file system and memory. The malware was specifically programmed to seek out the “keys to the kingdom”—credentials that would allow for further lateral movement or data theft.

  • System and Environment: The malware dumped all environment variables, which in AI workloads almost invariably include OPENAI_API_KEY, ANTHROPIC_API_KEY, and other LLM provider tokens.2 It also captured hostnames, network routing tables, and auth logs from /var/log/auth.log.9
  • Developer Credentials: The script targeted ~/.ssh/ for private keys, ~/.gitconfig and ~/.git-credentials for repository access, and shell history files to identify sensitive commands or passwords.2
  • Cloud and Infrastructure: It explicitly searched for AWS credentials (via IMDSv2 and ~/.aws/), GCP service account files, Azure tokens, and Kubernetes kubeconfig files.9
  • Financial and Database Data: The payload harvested database connection strings, Docker configs, and cryptocurrency wallet files for platforms including Bitcoin, Ethereum, and Solana.2
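
The harvest list above doubles as a rotation checklist. The following sketch inventories which of those locations exist on a host and which environment variable names look like secrets. It reports names only and never values. The exact file names for kubeconfig and shell history, and the name hints, are assumptions based on common defaults rather than details from the incident reports.

```python
from pathlib import Path

# Locations named in the harvest list; the kubeconfig and history file
# names are assumed defaults and may differ on a given system.
SENSITIVE_PATHS = [
    ".ssh", ".gitconfig", ".git-credentials", ".aws",
    ".kube/config", ".bash_history", ".zsh_history",
]
# Substrings that suggest an environment variable holds a secret (assumed).
NAME_HINTS = ("KEY", "TOKEN", "SECRET", "PASSWORD")


def exposure_inventory(home, environ):
    """Return the sensitive paths present under home and secret-like env var names."""
    home = Path(home)
    present = [p for p in SENSITIVE_PATHS if (home / p).exists()]
    names = sorted(n for n in environ if any(h in n.upper() for h in NAME_HINTS))
    return {"paths": present, "env_vars": names}
```

A typical call is exposure_inventory(Path.home(), os.environ) after importing os. Everything it reports on an affected machine should be treated as stolen and rotated.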

Stage 2: Encrypted Exfiltration

To avoid detection by network monitoring tools, the harvested data was bundled into an archive named tpcp.tar.gz and encrypted using a hybrid scheme. The malware generated a random 32-byte AES-256 session key to encrypt the archive and then encrypted that key using a hardcoded 4096-bit RSA public key.9 This ensured that only the threat actors could decrypt the stolen data. The encrypted bundle was exfiltrated via HTTPS POST to models.litellm.cloud, a typosquatted domain registered just one day prior to the attack.9

As a secondary exfiltration channel, if the primary C2 server was unreachable, the malware was configured to use a stolen GitHub PAT to create a public repository named tpcp-docs in the victim’s own GitHub account, uploading the stolen credentials as release assets.6
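
The names in this stage can be used as simple indicators when reviewing proxy, DNS, or audit logs. The sketch below scans text lines for the three strings mentioned above. The log format is not assumed, and a match is only a lead for further investigation.

```python
# Indicator strings taken from the exfiltration description above.
INDICATORS = ("models.litellm.cloud", "tpcp.tar.gz", "tpcp-docs")


def scan_lines(lines):
    """Return (line_number, indicator) for every indicator found in the lines."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        for indicator in INDICATORS:
            if indicator in lowered:
                hits.append((number, indicator))
    return hits
```

Because it accepts any iterable of strings, an open log file can be passed directly, for example scan_lines(open(path, errors="replace")).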

Stage 3: Persistence and the Kubernetes Worm

The malware went beyond simple data theft to ensure long-term access. On local development machines, it installed a script named sysmon.py in ~/.config/sysmon/ and created a systemd user service named “System Telemetry Service” to ensure the backdoor would run persistently.9

In environments where a Kubernetes service account token was discovered, the malware initiated a “worm” behavior. It attempted to deploy a DaemonSet or privileged pods (often named node-setup) in the kube-system namespace.9 These pods used hostPath mounts to escape the container environment and access the underlying node’s root filesystem, allowing the attackers to harvest SSH keys from every node in the cluster and establish a persistent foothold at the infrastructure level.7
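
The local persistence artifacts described above can be checked with a few lines of code. This sketch looks for the sysmon.py script and for systemd user units that mention it or the “System Telemetry Service” name. The unit directory ~/.config/systemd/user is the standard per-user location and is an assumption here, since the reports quoted above do not give the unit file name.

```python
from pathlib import Path


def find_persistence_artifacts(home):
    """Return paths of files matching the persistence indicators under home."""
    home = Path(home)
    findings = []
    backdoor = home / ".config" / "sysmon" / "sysmon.py"
    if backdoor.is_file():
        findings.append(str(backdoor))
    # Assumed location of per-user systemd units.
    unit_dir = home / ".config" / "systemd" / "user"
    if unit_dir.is_dir():
        for unit in sorted(unit_dir.glob("*.service")):
            text = unit.read_text(encoding="utf-8", errors="replace")
            if "System Telemetry Service" in text or "sysmon.py" in text:
                findings.append(str(unit))
    return findings
```

An empty result does not prove a machine is clean. It only means these specific file-level indicators are absent for that user account.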

The Adversary Persona: TeamPCP and the Telegram Thread

TeamPCP’s operations are characterised by a blend of technical innovation and explicit psychological warfare. The group maintains an active presence on Telegram through channels such as @teampcp and @Persy_PCP, where they showcase their exploits and interact with the security community.9 Following the successful poisoning of the security and AI stacks, the group posted a chilling warning: “Many of your favourite security tools and open-source projects will be targeted in the months to come. Stay tuned”.22

Analysis of their techniques suggests a group that is highly automated and focused on the “security-AI stack”.15 Their willingness to target vulnerability scanners like Trivy and KICS indicates a strategic choice to subvert the tools that organizations trust implicitly.2 Furthermore, their “kamikaze” payload—which on Iranian systems was programmed to delete the host filesystem and force-reboot the node—suggests a geopolitical dimension to their operations that may be independent of their broader credential-harvesting goals.6

Structural Vulnerabilities in AI Agent Tooling

The LiteLLM incident exposes a fundamental tension in the current era of AI development. The speed with which companies are shipping AI agents, copilots, and internal tools has created a “credential dumpster fire” where thousands of packages run in environments with broad, unmonitored access.2

The fact that LiteLLM entered a developer’s machine through a “dependency of a dependency of a plugin” is illustrative of the lack of visibility that currently plagues the ecosystem.2 Tools like uvx and npx, while providing immense convenience for running one-off tasks, create a frictionless environment for supply chain attacks to propagate.9 Because these tools often default to the latest version of a package, they are the primary propagation vector for poisoned releases that stay active on PyPI for even a few hours.23

Closing the Visibility Gap: From Recovery to Prevention

The failure of traditional security measures during the LiteLLM and Trivy attacks highlights a structural limitation in current software assurance. Standard vulnerability scanners—including the very ones compromised in this campaign—were unable to detect the threat because the malicious code was published using legitimate credentials and passed all standard integrity checks.9 Recovering from this incident requires immediate action; preventing the next one requires a fundamentally different approach to supply chain visibility.

Comprehensive Remediation Protocol

For organizations that have been affected by LiteLLM versions 1.82.7 or 1.82.8, the path to recovery must be rigorous and comprehensive. A simple upgrade to version 1.82.9 or later is insufficient, as the primary objective of the malware was the theft of long-term credentials.9
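
A first triage step is to establish whether an environment ever held one of the affected versions. The sketch below inspects a site-packages directory on disk for litellm dist-info folders and for the litellm_init.pth file. It deliberately reads the directory rather than importing anything. It should be run from a separate, trusted interpreter, because merely starting the suspect environment's Python could trigger the .pth payload. The function name is illustrative.

```python
from pathlib import Path

COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}


def inspect_site_packages(site_packages_dir):
    """Report litellm versions and the malicious .pth file in a directory."""
    root = Path(site_packages_dir)
    versions = sorted(
        entry.name[len("litellm-"):-len(".dist-info")]
        for entry in root.glob("litellm-*.dist-info")
    )
    return {
        "versions": versions,
        "compromised": [v for v in versions if v in COMPROMISED_VERSIONS],
        "pth_present": (root / "litellm_init.pth").exists(),
    }
```

A non-empty "compromised" list or a true "pth_present" value means every credential reachable from that environment should be treated as exposed and rotated, regardless of whether the package has since been upgraded.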

Moving forward, security teams should implement “dependency cooldowns”, for example with uv’s --exclude-newer flag (or an equivalent upload-date cutoff in pip, where the installed version provides one), to prevent the automatic installation of packages that have been available for less than 72 hours.24 Furthermore, pinning CI/CD dependencies such as GitHub Actions to immutable commit SHAs rather than version tags is now a requirement for secure pipelines, as tags can be easily force-pushed by a compromised account.10
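
A cooldown amounts to computing a cutoff timestamp and passing it to the installer. The sketch below builds a uv command line with --exclude-newer set 72 hours in the past. It only constructs the argument list and does not run anything. The helper names are illustrative, and the input is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def cooldown_cutoff(now, hours=72):
    """Return an RFC 3339 UTC timestamp that lies `hours` before `now`."""
    cutoff = now.astimezone(timezone.utc) - timedelta(hours=hours)
    return cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")


def uv_install_command(package, now, hours=72):
    """Build a uv invocation that ignores releases newer than the cutoff."""
    return ["uv", "pip", "install", "--exclude-newer",
            cooldown_cutoff(now, hours), package]
```

For an install attempted at midday UTC on 24 March 2026, the cutoff is 2026-03-21T12:00:00Z, so releases published on 24 March would not be eligible for resolution.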

The remediation steps above address the immediate crisis, but they do not answer a harder question: what would have caught this attack before it reached production? The LiteLLM poisoning succeeded precisely because it exploited the blind spots of CVE-based scanning. There was no vulnerability to match against; only behavioural anomalies that require a different class of analysis to detect.

Why Trace-AI Detects What Traditional Tools Miss

Trace-AI provides real-time visibility into both direct and transitive dependencies by extracting a complete SBOM directly from builds and pipelines.25 Rather than relying solely on a lagging database of known vulnerabilities (NVD/CVE), Trace-AI uses five key scoring dimensions to identify risk before it is publicly disclosed.

Final Conclusion

The LiteLLM supply chain attack of 2026 was not a failure of individual developers but a failure of the current trust model of the open-source ecosystem. When a single poisoned package can reach the production environments of NASA, Netflix, and NVIDIA in hours, the “vibe coding” approach to dependency management must end.3

The TeamPCP campaign has shown that attackers are moving upstream, targeting the tools that security professionals and AI researchers rely on most. The concentrated value of AI API keys and cloud credentials makes the AI orchestration layer a prime target for high-impact harvesting operations.2 As organisations continue to deploy AI at scale, the only way to maintain a secure posture is to achieve deep, real-time visibility into the entire supply chain. Tools like Trace-AI by Zerberus.ai provide the necessary foundation for this new era of assurance, ensuring that the software you depend on is a source of strength, not a vector for catastrophic compromise.25

References and Further Reading

  1. litellm – PyPI, accessed on March 25, 2026, https://pypi.org/project/litellm/
  2. The Library That Holds All Your AI Keys Was Just Backdoored: The LiteLLM Supply Chain Compromise – ARMO Platform, accessed on March 25, 2026, https://www.armosec.io/blog/litellm-supply-chain-attack-backdoor-analysis/
  3. The LiteLLM Supply Chain Attack: A Complete Technical … – Blog, accessed on March 25, 2026, https://blog.dreamfactory.com/the-litellm-supply-chain-attack-a-complete-technical-breakdown-of-what-happened-who-is-affected-and-what-comes-next
  4. TeamPCP Supply Chain Attacks Escalate Across Open Source – Evrim Ağacı, accessed on March 25, 2026, https://evrimagaci.org/gpt/teampcp-supply-chain-attacks-escalate-across-open-source-534993
  5. litellm got poisoned today. Found because an MCP plugin in Cursor …, accessed on March 25, 2026, https://www.reddit.com/r/cybersecurity/comments/1s2sfbl/litellm_got_poisoned_today_found_because_an_mcp/
  6. LiteLLM compromised on PyPI: Tracing the March 2026 TeamPCP supply chain campaign, accessed on March 25, 2026, https://securitylabs.datadoghq.com/articles/litellm-compromised-pypi-teampcp-supply-chain-campaign/
  7. Supply Chain Attack in litellm 1.82.8 on PyPI – FutureSearch, accessed on March 25, 2026, https://futuresearch.ai/blog/litellm-pypi-supply-chain-attack/
  8. LiteLLM TeamPCP Supply Chain Attack: Malicious PyPI Packages …, accessed on March 25, 2026, https://www.wiz.io/blog/threes-a-crowd-teampcp-trojanizes-litellm-in-continuation-of-campaign
  9. How a Poisoned Security Scanner Became the Key to Backdooring LiteLLM | Snyk, accessed on March 25, 2026, https://snyk.io/articles/poisoned-security-scanner-backdooring-litellm/
  10. No Prompt Injection Required – FutureSearch, accessed on March 25, 2026, https://futuresearch.ai/blog/no-prompt-injection-required/
  11. Don’t Let Cyber Risk Kill Your GenAI Vibe: A Developer’s Guide – mkdev, accessed on March 25, 2026, https://mkdev.me/posts/don-t-let-cyber-risk-kill-your-genai-vibe-a-developer-s-guide
  12. LiteLLM Supply Chain Breakdown – Upwind Security, accessed on March 25, 2026, https://www.upwind.io/feed/litellm-pypi-supply-chain-attack-malicious-release
  13. Blogmarks – Simon Willison’s Weblog, accessed on March 25, 2026, https://simonwillison.net/blogmarks/
  14. Checkmarx KICS Code Scanner Targeted in Widening Supply Chain Hit – Dark Reading, accessed on March 25, 2026, https://www.darkreading.com/application-security/checkmarx-kics-code-scanner-widening-supply-chain
  15. When Security Scanners Become the Weapon: Breaking Down the Trivy Supply Chain Attack – Palo Alto Networks, accessed on March 25, 2026, https://www.paloaltonetworks.com/blog/cloud-security/trivy-supply-chain-attack/
  16. TeamPCP expands: Supply chain compromise spreads from Trivy to Checkmarx GitHub Actions | Sysdig, accessed on March 25, 2026, https://www.sysdig.com/blog/teampcp-expands-supply-chain-compromise-spreads-from-trivy-to-checkmarx-github-actions
  17. Guidance for detecting, investigating, and defending against the Trivy supply chain compromise | Microsoft Security Blog, accessed on March 25, 2026, https://www.microsoft.com/en-us/security/blog/2026/03/24/detecting-investigating-defending-against-trivy-supply-chain-compromise/
  18. The Trivy Supply Chain Compromise: What Happened and Playbooks to Respond, accessed on March 25, 2026, https://www.legitsecurity.com/blog/the-trivy-supply-chain-compromise-what-happened-and-playbooks-to-respond
  19. Trivy’s March Supply Chain Attack Shows Where Secret Exposure Hurts Most, accessed on March 25, 2026, https://blog.gitguardian.com/trivys-march-supply-chain-attack-shows-where-secret-exposure-hurts-most/
  20. From Trivy to Broad OSS Compromise: TeamPCP Hits Docker Hub, VS Code, PyPI, accessed on March 25, 2026, https://www.securityweek.com/from-trivy-to-broad-oss-compromise-teampcp-hits-docker-hub-vs-code-pypi/
  21. Trivy Compromised by “TeamPCP” | Wiz Blog, accessed on March 25, 2026, https://www.wiz.io/blog/trivy-compromised-teampcp-supply-chain-attack
  22. Incident Timeline // TeamPCP Supply Chain Campaign, accessed on March 25, 2026, https://ramimac.me/trivy-teampcp/
  23. Simon Willison on generative-ai, accessed on March 25, 2026, https://simonwillison.net/tags/generative-ai/
  24. Simon Willison on uv, accessed on March 25, 2026, https://simonwillison.net/tags/uv/
  25. Software Supply Chain (Trace-AI) | Zerberus.ai, accessed on March 25, 2026, https://www.zerberus.ai/trace-ai
  26. Trace-AI: Know What You Ship. Secure What You Depend On. | Product Hunt, accessed on March 25, 2026, https://www.producthunt.com/products/trace-ai-2
The Asymmetric Frontier: A Strategic Analysis of Iranian Cyber Operations and Geopolitical Resilience in the 2026 Conflict

The dawn of March 2026 marks a watershed moment in the evolution of multi-domain warfare, characterised by the total integration of offensive cyber operations into high-intensity kinetic campaigns. The initiation of Operation Epic Fury by the United States and Operation Roaring Lion by the State of Israel on February 28, 2026, has provided a definitive template for the “offensive turn” in modern military doctrine.1 From a cybersecurity practitioner’s perspective, the Iranian response and the resilience of its decentralised “mosaic” architecture offer profound insights into the future of state-sponsored digital conflict. Despite the massive degradation of traditional command structures and the reported death of Supreme Leader Ayatollah Ali Khamenei, the Iranian cyber ecosystem has demonstrated an ability to maintain operational tempo through a pre-positioned proxy ecosystem that operates with significant tactical autonomy.3 This analysis examines the strategic, technical, and geopolitical dimensions of the Iranian threat, building on the observations of General James Marks and the latest assessments from the World Economic Forum (WEF), The Soufan Center, and major global think tanks.

The Crucible of Conflict: From Strategic Patience to Operation Epic Fury

The current state of hostilities is the culmination of two distinct phases of escalation that began in mid-2025. The first phase, characterized by the “12-day war” in June 2025, saw the United States launch Operation Midnight Hammer against Iranian nuclear facilities at Fordow, Natanz, and Isfahan in response to Tehran’s expulsion of IAEA inspectors and the termination of NPT safeguards.6 During this initial encounter, the information domain was already a central battleground, with the hacker group Predatory Sparrow (Gonjeshke Darande) disrupting Iranian financial institutions and cryptocurrency exchanges to undermine domestic confidence in the regime.9 However, the second phase, initiated on February 28, 2026, represents a fundamental shift toward regime change and the total neutralization of Iran’s asymmetric capabilities.3

General James Marks, writing in The Hill, and subsequent testimony from Director of National Intelligence Tulsi Gabbard, indicate that while the Iranian government has been severely degraded, its core apparatus remains intact and capable of striking Western interests.4 This resilience is attributed to the “mosaic defense” doctrine, which the Islamic Revolutionary Guard Corps (IRGC) adopted in 2005 to survive decapitation strikes. By restructuring into 31 semi-autonomous provincial commands, the regime ensured that operational capability would persist even if the central leadership in Tehran was eliminated.3 In the cyber realm, this translates to a distributed network of APT groups and hacktivist personas that can continue to execute campaigns despite a collapse in domestic internet connectivity.2

Key Milestones in the 2025-2026 Escalation

| Milestone | Date | Primary Operational Outcome | Ref. |
| --- | --- | --- | --- |
| IAEA Safeguards Termination | Feb 10, 2026 | Iran expels inspectors; 60% enrichment stockpile reaches 412kg | 8 |
| Operation Midnight Hammer | June 22, 2025 | US B-2 bombers target Fordow and Natanz | 7 |
| Initiation of Epic Fury | Feb 28, 2026 | Joint US-Israel strikes kill Supreme Leader Khamenei | 3 |
| Electronic Operations Room Formed | Feb 28, 2026 | 60+ hacktivist groups mobilize for retaliatory strikes | 3 |
| The Stryker Attack | March 11, 2026 | Handala Hack wipes 200,000 devices at US medical firm | 14 |

The Architecture of Asymmetry: Iran’s Mosaic Cyber Doctrine

The Iranian cyber program is no longer a peripheral support function but a primary tool of asymmetric leverage. The Soufan Center and RUSI emphasize that Tehran views cyber operations as a means to impose psychological costs far from the battlefield, exhausting the resources of superior foes through a war of attrition.3 This strategy relies on a “melange” of state-sponsored actors and patriotic hackers who provide the regime with plausible deniability.10

The Command Structure: IRGC and MOIS

Cyber operations are primarily distributed across two powerful organizations: the Islamic Revolutionary Guard Corps (IRGC) and the Ministry of Intelligence and Security (MOIS). The IRGC typically manages APTs focused on military targets and regional stability, such as APT33 and APT35, while the MOIS houses groups like APT34 (OilRig) and MuddyWater, which specialize in long-term espionage and infrastructure mapping.16

Following the February 28 strikes, which targeted the MOIS headquarters in eastern Tehran and reportedly eliminated deputy intelligence minister Seyed Yahya Hosseini Panjaki, these units have transitioned into a state of “operational isolation”.2 This isolation has led to a surge in tactical autonomy for cells based outside of Iran, which are now acting as the regime’s primary retaliatory arm while domestic internet connectivity remains between 1% and 4% of normal levels.2

The Proxy Ecosystem and the Electronic Operations Room

A critical development in the March 2026 conflict is the formalization of the “Electronic Operations Room.” Established within 24 hours of the initial strikes, this entity serves as a centralized coordination hub for over 60 hacktivist groups, ranging from pro-regime actors to regional nationalists.13 This ecosystem allows the state to amplify its messaging and conduct large-scale disruptive operations without the immediate risk of overt attribution.3

Prominent entities within this ecosystem include:

  • Handala Hack: A persona linked to the MOIS (Void Manticore) that combines high-end destructive capabilities with propaganda.2
  • Cyber Islamic Resistance: An umbrella collective coordinating synchronized DDoS attacks against Western and Israeli infrastructure.2
  • FAD Team (Fatimiyoun Cyber Team): A group specializing in wiper malware and the permanent destruction of industrial control systems (ICS).2
  • Sylhet Gang: A recruitment and message-amplification engine focused on targeting Saudi and Gulf state management systems.2

Technical Deep Dive: The Stryker Breach and “Living-off-the-Cloud” Warfare

On March 11, 2026, the Iranian-linked group Handala (Void Manticore) executed what is considered the most significant wartime cyberattack against a U.S. commercial entity: the breach and subsequent wiping of the Stryker Corporation.3 This incident is a case study in the evolution of Iranian TTPs (Tactics, Techniques, and Procedures), moving away from custom malware toward the weaponization of legitimate cloud infrastructure.15

The Weaponization of Microsoft Intune

The Stryker attack bypassed traditional Endpoint Detection and Response (EDR) and antivirus solutions entirely by utilizing the company’s own Microsoft Intune platform to issue mass-wipe commands.15 This “Living-off-the-Cloud” (LotC) strategy began with the theft of administrative credentials through AitM (Adversary-in-the-Middle) phishing, which allowed the attackers to bypass multi-factor authentication (MFA) and capture session tokens.14

Once inside the internal Microsoft environment, the attackers used Graph API calls to target the organization’s device management tenant. Approximately 200,000 devices—including servers, managed laptops, and mobile phones across 61 countries—were wiped.8 The attackers also claimed to have exfiltrated 50 terabytes of sensitive data before executing the wipe, using the destruction of systems to mask the theft and create a catastrophic business continuity event.8

Technical Components of the Stryker Wipe

| Component | Description | Practitioner Implication | Ref. |
| --- | --- | --- | --- |
| Initial Access Vector | Phishing/AitM session token theft | Legacy MFA is insufficient; move to FIDO2 | 14 |
| Primary Platform Exploited | Microsoft Intune (MDM) | MDM is a Tier-0 asset requiring extreme isolation | 22 |
| Command Execution | Programmatic Graph API calls | Log monitoring must include MDM activity spikes | 22 |
| Detection Status | No malware binary detected | “No malware detected” does not mean no breach | 22 |
| Economic Impact | $6-8 billion market cap loss | Cyber risk is now a material financial solvency risk | 17 |
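
The “MDM activity spikes” implication can be illustrated with a sliding-window counter over device-management audit events. The sketch below is deliberately generic. The (timestamp, action) event shape, the "wipe" action label, and the default window and threshold are all hypothetical and do not reflect the actual Intune or Graph API audit schema.

```python
from collections import deque
from datetime import timedelta


def detect_wipe_spikes(events, window=timedelta(minutes=10), threshold=20):
    """Flag moments when wipe commands within the window reach the threshold.

    events: iterable of (timestamp, action) tuples sorted by timestamp.
    Returns a list of (timestamp, wipe_count_in_window) alerts.
    """
    recent = deque()
    alerts = []
    for timestamp, action in events:
        if action != "wipe":  # hypothetical action label
            continue
        recent.append(timestamp)
        # Drop wipe events that have fallen out of the sliding window.
        while timestamp - recent[0] > window:
            recent.popleft()
        if len(recent) >= threshold:
            alerts.append((timestamp, len(recent)))
    return alerts
```

In practice the threshold would be tuned against an organisation's normal device-retirement rate. A sudden burst of wipe commands from one administrative identity is far more likely to be hostile than routine.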

Advanced Persistent Threat (APT) Evolution

The Stryker attack highlights a broader trend identified by Unit 42 and Mandiant: the convergence of state-sponsored espionage with destructive “hack-and-leak” operations. Groups like Handala Hack now operate with a sophisticated handoff model. Scarred Manticore (Storm-0861) provides initial access through long-dwell operations, which is then handed over to Void Manticore (Storm-0842) for the deployment of wipers or the execution of the MDM hijack.19

Other Iranian groups have demonstrated similar advancements:

  • APT42: Recently attributed by CISA for breaching the U.S. State Department, this group continues to refine its social engineering lures using GenAI to target high-value personnel.2
  • Serpens Constellation: Unit 42 tracks various IRGC-aligned actors under this name, noting an increased risk of wiper attacks against energy and water utilities in the U.S. and Israel.2
  • The RedAlert Phishing Campaign: Attackers delivered a malicious replica of the Israeli Home Front Command application through SMS phishing (smishing). This weaponized APK enabled mobile surveillance and data exfiltration from the devices of civilians and military personnel.2

Geopolitical Perspectives: RUSI, IDSA, and the Global Spillover

The conflict in Iran is not a localized event; it has profound implications for regional stability and global defense posture. Think tanks such as RUSI and MP-IDSA have provided critical analysis on how the “offensive turn” in U.S. cybersecurity strategy is being perceived globally and the lessons other nations are drawing from the 2026 war.

The “Offensive Turn” and its Discontents

The U.S. National Cybersecurity Strategy, released on March 6, 2026, formalizes the deployment of offensive cyber operations as a standard tool of statecraft. MP-IDSA notes that this shift moves beyond “defend forward” to the active imposition of costs on adversaries, utilizing “agentic AI” to scale disruption capabilities.1 During Operation Epic Fury, USCYBERCOM delivered “synchronised and layered effects” that blinded Iranian sensor networks prior to the kinetic strikes. This pattern confirms that cyber is now a “first-mover” asset, providing the intelligence and environment-shaping necessary for precision kinetic action.1

However, this strategy has raised concerns regarding international norms. By encouraging the private sector to adopt “active defense” (or “hack back”) and institutionalizing the use of cyber for regime change, the U.S. may be setting a precedent that adversaries will exploit.1 RUSI scholars warn that the “Great Liquidation” of the Moscow-Tehran axis has left Iran feeling it is in an existential fight, making it difficult to coerce through threats of violence alone.25

Regional Spillover and GCC Vulnerability

The conflict has rapidly expanded to target GCC member states perceived as supporting the U.S.-Israel coalition. Iranian retaliatory strikes—both kinetic and digital—have targeted energy infrastructure, ports, and data centers in the UAE, Bahrain, Qatar, Kuwait, and Saudi Arabia.3

  • Kuwait and Jordan: These nations have borne the brunt of hacktivist activity. Between February 28 and March 2, 76% of all hacktivist DDoS claims in the region targeted Kuwait, Israel, and Jordan.20
  • Maritime and Logistics: Iran has focused on disrupting logistics companies and shipping routes in the Persian Gulf, aiming to force the world to bear the economic cost of the war.3

The “Zeitenwende” for the Gulf: RUSI analysts suggest this conflict is a “warning about the effects of a Taiwan Straits War,” as the economic ripples of the Iran conflict demonstrate the fragility of global supply chains when faced with multi-domain state conflict.25

Lessons for Global Defense: The Indian Perspective

MP-IDSA has drawn specific lessons for India from the war in West Asia, focusing on the protection of the defense-industrial ecosystem. The vulnerability of static targets to unmanned systems and cyber-sabotage has led to a call for the integration of “Mission Sudarshan Chakra”—India’s planned shield and sword—to protect production hubs.17 The report emphasizes the need for:

  1. Dispersal and Hardening: Moving production nodes and reinforcing critical infrastructure with concrete capable of resisting 500-kg bombs.17
  2. Cyber-Active Air Defense: Integrating cyber defenses directly into air defense networks to prevent the “blinding” of sensors seen in the early phases of Operation Epic Fury.1
  3. Workforce Resilience: Protecting a skilled workforce that is “nearly irreplaceable in times of war” from digital harassment and kinetic strikes.17

Technological Trends and Future Threats: AI, OT, and Quantum

The 2026 threat landscape is defined by the emergence of new technologies that serve as “force multipliers” for both attackers and defenders. The World Economic Forum’s Global Cybersecurity Outlook 2026 notes that 64% of organizations are now accounting for geopolitically motivated cyberattacks, a significant increase from previous years.29

The AI Arms Race

AI has become a core component of the cyber-kinetic integration in 2026. Iranian actors are using GenAI to scale influence operations, spreading disinformation about U.S. casualties and false claims of successful retaliatory strikes against the Navy.30 Simultaneously, the U.S. and Israel have blurred ethical lines by using AI to assist in targeting and to accelerate the “offensive turn” in cyberspace.1

The rise of “agentic AI”—autonomous agents capable of planning and executing cyber operations—presents a double-edged sword. While it allows defenders to scale network monitoring, it also compresses the attack lifecycle. In 2025, exfiltration speeds for the fastest attacks quadrupled due to AI-enabled tradecraft.32

Operational Technology (OT) and the Visibility Gap

Unit 42 research highlights a staggering 332% year-over-year increase in internet-exposed OT devices.24 This exposure is a primary target for Iranian groups like the Fatimiyoun Cyber Team, which target SCADA and PLC systems to cause physical damage.2 The integration of IT, OT, and IoT for visibility has unintentionally created pathways for attackers to move from the corporate cloud (as seen in the Stryker attack) into the industrial control layer.13

The Quantum Imperative

As the world transitions through 2026, the progress of quantum computing is prompting an urgent shift toward quantum-safe cryptography. IDSA reports suggest that organizations slow to adapt will find themselves exposed to “harvest now, decrypt later” strategies, where state actors exfiltrate encrypted data today to be decrypted once quantum systems reach maturity.11

| 2026 Technological Trends | Impact on Iranian Cyber Strategy | Defensive Priority |
|---|---|---|
| Agentic AI | Scaling of disruption and influence missions | Automated, AI-driven SOC response 1 |
| OT Connectivity | Increased targeting of water and energy SCADA | Hardened segmentation; OT-SOC framework 24 |
| Quantum Computing | “Harvest now, decrypt later” espionage | Implementation of post-quantum algorithms 11 |
| Living-off-the-Cloud | Weaponization of MDM (Intune) | Identity-first security; Zero Trust 22 |

Strategic Recommendations for Cybersecurity Practitioners

The Iranian threat in 2026 requires a departure from traditional, perimeter-based security models. Practitioners must adopt a mindset of “Intelligence-Driven Active Defense” to survive a persistent state-sponsored adversary.24

1. Identity-First Security and Zero Trust

The Stryker breach proves that identity is the new perimeter. Organizations must eliminate “standing privileges” and move toward an environment where administrative access is provided only when needed and strictly verified.24

  • FIDO2 MFA: Move beyond push-based notifications to phishing-resistant hardware keys.15
  • MDM Isolation: Secure Intune and other MDM platforms as Tier-0 assets. Implement “out-of-band” verification for mass-wipe or retire commands.2
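
The “out-of-band” verification idea for mass-wipe commands can be sketched as a simple approval gate. This is a minimal illustration only: `WipeRequest`, `approve`, `may_execute`, the threshold of 25 devices, and the two-approver rule are all assumptions made for the example, and none of them correspond to an Intune or other MDM vendor API. The sketch models only the gate itself; in practice the approvals would need to arrive over a channel separate from the MDM console.

```python
from dataclasses import dataclass, field


@dataclass
class WipeRequest:
    """A pending wipe/retire command (hypothetical schema)."""
    requester: str
    device_count: int
    approvals: set = field(default_factory=set)


def approve(req: WipeRequest, approver: str) -> None:
    """Record an approval; the requester may not approve their own request."""
    if approver == req.requester:
        raise PermissionError("requester cannot approve their own request")
    req.approvals.add(approver)  # a set, so repeat approvals count once


def may_execute(req: WipeRequest, mass_threshold: int = 25,
                required_approvals: int = 2) -> bool:
    """Small requests pass; mass actions need distinct second-party approvals."""
    if req.device_count < mass_threshold:
        return True
    return len(req.approvals) >= required_approvals
```

Under these assumptions, a compromised administrator account alone cannot trigger a fleet-wide wipe, because the gate counts only approvals from other identities.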

2. Resilience and Data Integrity

In a conflict characterized by wiper malware, backups are a primary target.

  • Air-Gapped Backups: Maintain at least one copy of critical data offline and air-gapped to prevent the deletion of network-stored backups.2
  • Incident Response Readiness: Shift from “if” to “when.” Rehearse response motions specifically for Living-off-the-Cloud (LotC) attacks where no malware is detected.15

3. Geopolitical Risk Management

Organizations must recognize that their security posture is inextricably linked to their geographical and geopolitical footprint.6

  • Supply Chain Exposure: Monitor for disruptions in shipping, energy, and regional services that could lead to “operational shortcuts” and increased vulnerability.6

  • Geographic IP Blocking: Consider blocking IP addresses from high-risk regions where legitimate business is not conducted to reduce the attack surface.2
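
As a rough illustration of the IP-blocking recommendation, the check below uses Python's standard `ipaddress` module to test a source address against a list of blocked networks. The CIDR ranges are reserved documentation prefixes used purely as placeholders, and `BLOCKED_NETWORKS` and `is_blocked` are names invented for this sketch; a real deployment would more likely enforce this at the firewall or WAF using a maintained geolocation feed.

```python
import ipaddress

# Placeholder ranges (reserved documentation prefixes), not real threat data.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("198.51.100.0/24", "203.0.113.0/24")
]


def is_blocked(address: str) -> bool:
    """Return True if the address falls inside any blocked network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in BLOCKED_NETWORKS)
```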

Conclusion: Toward a Permanent State of Hybridity

The conflict of 2026 has demonstrated that cyber is no longer a silent “shadow war” but a foundational pillar of modern conflict. The Iranian “mosaic” has proven remarkably resilient, adapting to the death of the Supreme Leader and the degradation of its physical infrastructure by empowering a decentralized network of proxies and leveraging the vulnerabilities of the global cloud.3

For the cybersecurity practitioner, the lessons of March 2026 are clear: the era of protecting against “malware” is over; the new challenge is protecting the identity and the infrastructure that manages the digital estate.15 As General Marks and the reports from WEF and The Soufan Center indicate, the Iranian regime will continue to use cyber as its primary asymmetric leverage for years to come.3 Success in this environment requires a synthesis of technical excellence, geopolitical foresight, and an unwavering commitment to the principles of Zero Trust. The frontier of this conflict is no longer in the streets of Tehran or the deserts of the Middle East; it is in the administrative consoles of the world’s global enterprises.

References and Further Reading

  1. Beyond Defence: The Offensive Turn in US Cybersecurity Strategy – MP-IDSA, accessed on March 21, 2026, https://idsa.in/publisher/comments/beyond-defence-the-offensive-turn-in-us-cybersecurity-strategy
  2. Threat Brief: March 2026 Escalation of Cyber Risk Related to Iran – Unit 42, accessed on March 21, 2026, https://unit42.paloaltonetworks.com/iranian-cyberattacks-2026/
  3. Cyber Operations as Iran’s Asymmetric Leverage – The Soufan Center, accessed on March 21, 2026, https://thesoufancenter.org/intelbrief-2026-march-17/
  4. Iran’s government degraded but appears intact, top US spy says, accessed on March 21, 2026, https://www.tbsnews.net/world/irans-government-degraded-appears-intact-top-us-spy-says-1390421
  5. Threat Advisory: Iran-Aligned Cyber Actors Respond to Operation Epic Fury – BeyondTrust, accessed on March 21, 2026, https://www.beyondtrust.com/blog/entry/threat-advisory-operation-epic-fury
  6. Iran Cyber Threat 2026: What SMBs and MSPs Need to Know | Todyl, accessed on March 21, 2026, https://www.todyl.com/blog/iran-conflict-cyber-threat-smb-msp-risk
  7. The Israel–Iran War and the Nuclear Factor – MP-IDSA, accessed on March 21, 2026, https://idsa.in/publisher/issuebrief/the-israel-iran-war-and-the-nuclear-factor
  8. Threat Intelligence Report March 10 to March 16, 2026, accessed on March 21, 2026, https://redpiranha.net/news/threat-intelligence-report-march-10-march-16-2026
  9. The Invisible Battlefield: Information Operations in the 12-Day Israel–Iran War – MP-IDSA, accessed on March 21, 2026, https://idsa.in/publisher/issuebrief/the-invisible-battlefield-information-operations-in-the-12-day-israel-iran-war
  10. Fog, Proxies and Uncertainty: Cyber in US-Israeli Operations in Iran …, accessed on March 21, 2026, https://www.rusi.org/explore-our-research/publications/commentary/fog-proxies-and-uncertainty-cyber-us-israeli-operations-iran
  11. CyberSecurity Centre of Excellence – IDSA, accessed on March 21, 2026, https://idsa.in/wp-content/uploads/2026/02/ICCOE_Report_2025.pdf
  12. Cyber Command disrupted Iranian comms, sensors, top general says, accessed on March 21, 2026, https://therecord.media/iran-cyber-us-command-attack
  13. Cyber Threat Advisory on Middle East Conflict – Data Security Council of India (DSCI), accessed on March 21, 2026, https://www.dsci.in/files/content/advisory/2026/cyber_threat_advisory-middle_east_conflict.pdf
  14. The New Battlefield: How Iran’s Handala Group Crippled Stryker Corporation – Thrive, accessed on March 21, 2026, https://thrivenextgen.com/the-new-battlefield-how-irans-handala-group-crippled-stryker-corporation/
  15. intel-Hub | Critical Start, accessed on March 21, 2026, https://www.criticalstart.com/intel-hub
  16. Beyond Hacktivism: Iran’s Coordinated Cyber Threat Landscape …, accessed on March 21, 2026, https://www.csis.org/blogs/strategic-technologies-blog/beyond-hacktivism-irans-coordinated-cyber-threat-landscape
  17. Cyber Operations in the Israel–US Conflict with Iran – MP-IDSA, accessed on March 21, 2026, https://idsa.in/publisher/comments/cyber-operations-in-the-israel-us-conflict-with-iran
  18. Iran Readied Cyberattack Capabilities for Response Prior to Epic Fury – SecurityWeek, accessed on March 21, 2026, https://www.securityweek.com/iran-readied-cyberattack-capabilities-for-response-prior-to-epic-fury/
  19. Epic Fury Update: Stryker Attack Highlights Handala’s Shift from Espionage to Disruption, accessed on March 21, 2026, https://www.levelblue.com/blogs/spiderlabs-blog/epic-fury-update-stryker-attack-highlights-handalas-shift-from-espionage-to-disruption
  20. Global Surge: 149 Hacktivist DDoS Attacks Target SCADA and Critical Infrastructure Across 16 Countries After Middle East Conflict – Rescana, accessed on March 21, 2026, https://www.rescana.com/post/global-surge-149-hacktivist-ddos-attacks-target-scada-and-critical-infrastructure-across-16-countri
  21. Iran War: Kinetic, Cyber, Electronic and Psychological Warfare Convergence – Resecurity, accessed on March 21, 2026, https://www.resecurity.com/blog/article/iran-war-kinetic-cyber-electronic-and-psychological-warfare-convergence
  22. When the Wiper Is the Product: Nation-state MDM Attacks and What …, accessed on March 21, 2026, https://www.presidio.com/blogs/when-the-wiper-is-the-product-nation-state-mdm-attacks-and-what-every-enterprise-needs-to-know/
  23. Black Arrow Cyber Threat Intel Briefing 13 March 2026, accessed on March 21, 2026, https://www.blackarrowcyber.com/blog/threat-briefing-13-march-2026
  24. Unit 42 Threat Bulletin – March 2026, accessed on March 21, 2026, https://unit42.paloaltonetworks.com/threat-bulletin/march-2026/
  25. RUSI, accessed on March 21, 2026, https://www.rusi.org/
  26. Resource library search – RUSI, accessed on March 21, 2026, https://my.rusi.org/resource-library-search.html?sortBy=recent&region=israel-and-the-occupied-palestinian-territories,middle-east-and-north-africa
  27. Threat Intelligence Snapshot: Week 10, 2026 – QuoIntelligence, accessed on March 21, 2026, https://quointelligence.eu/2026/03/threat-intelligence-snapshot-week-10-2026/
  28. Cyber threat bulletin: Iranian Cyber Threat Response to US/Israel strikes, February 2026 – Canadian Centre for Cyber Security, accessed on March 21, 2026, https://www.cyber.gc.ca/en/guidance/cyber-threat-bulletin-iranian-cyber-threat-response-usisrael-strikes-february-2026
  29. Cyber impact of conflict in the Middle East, and other cybersecurity news, accessed on March 21, 2026, https://www.weforum.org/stories/2026/03/cyber-impact-conflict-middle-east-other-cybersecurity-news-march-2026/
  30. Iran Cyber Attacks 2026: Threats, APT Tactics & How Organisations Should Respond | Ekco, accessed on March 21, 2026, https://www.ek.co/publications/iran-cyber-attacks-2026-threats-apt-tactics-how-organisations-should-respond/
  31. IDSA: Home Page – MP, accessed on March 21, 2026, https://idsa.in/
  32. 2026 Unit 42 Global Incident Response Report – RH-ISAC, accessed on March 21, 2026, https://rhisac.org/threat-intelligence/2026-unit-42-ir-report/
Governance by Design: Real-Time Policy Enforcement for Edge AI Systems

The Emerging Problem of Autonomous Drift

For most of the past decade, AI governance relied on a comfortable assumption: the system was always connected.

Logs flowed to the cloud.
Monitoring systems analysed behaviour.
Security teams reviewed anomalies after deployment.

That assumption is increasingly invalid.

By 2026, AI systems are moving rapidly from the cloud to the edge. Autonomous drones, warehouse robots, inspection vehicles, agricultural systems, and industrial machines now execute sophisticated models locally. These systems frequently operate in environments where connectivity is intermittent, degraded, or intentionally disabled.

Traditional governance models break down under these conditions.

Cloud-based monitoring pipelines were designed to detect violations, not prevent them. If a warehouse robot crosses a restricted safety zone, the cloud log may capture the event seconds later. The physical consequence has already occurred.

This gap introduces a new operational risk: autonomous drift.

Autonomous drift occurs when the operational behaviour of an AI system gradually diverges from the safety assumptions embedded in its original training or certification.

Consider a warehouse robot tasked with optimising throughput.

Over time, reinforcement signals favour shorter routes between shelves. The system begins to treat a marked safety corridor, reserved for human operators, as a shortcut during low-traffic periods. The robot’s navigation model still behaves rationally according to its optimisation objective. However, the behaviour now violates safety policy.

If governance relies solely on cloud logging, the violation is recorded after the robot has already entered the human safety corridor.

The real governance challenge is therefore not visibility.

It is control at the moment of decision.

Governance by Design

Governance by Design addresses this challenge by embedding enforceable policy constraints directly into the operational architecture of autonomous systems.

Traditional governance frameworks rely heavily on documentation artefacts:

  • compliance policies
  • acceptable use guidelines
  • model cards
  • post-incident audit reports

These artefacts guide behaviour but do not actively control it.

Governance by Design introduces a different model.

Safety constraints are implemented as runtime enforcement mechanisms that intercept system actions before execution.

When an AI agent proposes an action, a policy enforcement layer evaluates that action against predefined operational rules. Only actions that satisfy these rules are allowed to proceed.

This architectural approach converts governance from an advisory process into a deterministic control mechanism.
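
The propose-then-evaluate flow described above can be sketched in a few lines. Everything here is illustrative: the `Action` schema, the `max_speed` and `keep_out` rule factories, and the `enforce` function are assumptions made for the example rather than part of any particular platform. A rule returns `None` when satisfied or a reason string when violated, and a rule that raises is treated as a violation so that the layer fails closed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Action:
    """An action proposed by the decision engine (hypothetical schema)."""
    kind: str
    x: float
    y: float
    speed: float


# A rule returns None when satisfied, or a reason string when violated.
Rule = Callable[[Action], Optional[str]]


def max_speed(limit: float) -> Rule:
    def rule(a: Action) -> Optional[str]:
        return None if a.speed <= limit else f"speed {a.speed} exceeds {limit}"
    return rule


def keep_out(x_min: float, x_max: float, y_min: float, y_max: float) -> Rule:
    def rule(a: Action) -> Optional[str]:
        inside = x_min <= a.x <= x_max and y_min <= a.y <= y_max
        return "target inside keep-out zone" if inside else None
    return rule


def enforce(action: Action, rules: List[Rule]) -> Tuple[bool, List[str]]:
    """Evaluate every rule; allow the action only if none object."""
    reasons: List[str] = []
    for rule in rules:
        try:
            reason = rule(action)
        except Exception as exc:  # a broken rule must not allow the action
            reason = f"rule error: {exc}"
        if reason is not None:
            reasons.append(reason)
    return (not reasons, reasons)
```

Because the rules are plain functions evaluated outside the model, the same proposed action always yields the same verdict, which is the property the text calls deterministic control.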

Architecture of the Lightweight Enforcement Engine

A runtime enforcement engine must meet three critical requirements:

  1. Sub-millisecond policy evaluation
  2. Isolation from the AI model
  3. Deterministic fail-safe behaviour

To achieve this, most edge governance architectures introduce a policy enforcement layer between the AI model and the system actuators.
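
One way to picture the latency and fail-safe requirements together is a wrapper that forwards a command only when the policy check approves it within a time budget. This is a sketch under stated assumptions: `SAFE_STOP`, `guarded`, and the 1 ms default budget are hypothetical, and the budget is checked after the evaluation returns, so it detects an overrun rather than pre-empting it. A hard real-time system would need a watchdog or scheduler guarantee instead.

```python
import time
from typing import Callable

SAFE_STOP = "SAFE_STOP"  # hypothetical fail-safe command


def guarded(evaluate: Callable[[str], bool], command: str,
            budget_ns: int = 1_000_000,
            clock: Callable[[], int] = time.perf_counter_ns) -> str:
    """Return the command only if it is approved within the time budget."""
    start = clock()
    try:
        approved = evaluate(command)
    except Exception:
        return SAFE_STOP  # evaluation failure defaults to the safe state
    if clock() - start > budget_ns:
        return SAFE_STOP  # a late answer is treated as no answer
    return command if approved is True else SAFE_STOP
```

The clock is injectable so the timing path can be exercised deterministically in tests; every path other than an explicit, timely approval resolves to the safe state.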

Action Interception Layer

The enforcement engine intercepts decision outputs before they reach the execution layer.

This interception can occur at several architectural levels:

| Interception Layer | Example Implementation |
|---|---|
| Application API Gateway | policy checks applied before commands reach device APIs |
| Service Mesh Sidecar | policy enforcement injected between microservices |
| Hardware Abstraction Layer | command filtering before motor or actuator signals |
| Trusted Execution Environment | policy module executed within secure enclave |

In robotics platforms, this often appears as a command arbitration layer that sits between the decision engine and the control system.

Policy Evaluation Engine

The policy engine evaluates incoming actions against operational rules such as:

  • geofencing restrictions
  • physical safety limits
  • operational permissions
  • environmental constraints

To keep the system lightweight, policy modules are commonly executed using WebAssembly runtimes or minimal micro-kernel enforcement modules.

These runtimes provide:

  • deterministic execution
  • hardware portability
  • sandbox isolation
  • cryptographic policy verification
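
Cryptographic policy verification can be illustrated with an integrity check performed before a policy is loaded. The sketch below uses HMAC-SHA256 from the Python standard library as a stand-in; a production system would more plausibly use asymmetric signatures so that devices hold only a public key. `sign_policy`, `load_policy`, and the JSON policy format are assumptions made for the example.

```python
import hashlib
import hmac
import json


def sign_policy(policy_bytes: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the raw policy bytes."""
    return hmac.new(key, policy_bytes, hashlib.sha256).hexdigest()


def load_policy(policy_bytes: bytes, tag: str, key: bytes) -> dict:
    """Parse the policy only if its tag verifies; otherwise refuse to load."""
    expected = sign_policy(policy_bytes, key)
    if not hmac.compare_digest(expected, tag):  # constant-time comparison
        raise ValueError("policy tag mismatch; refusing to load")
    return json.loads(policy_bytes)
```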

Policy Conflict Resolution

One practical challenge in runtime governance is policy conflict.

For example:

  • A mission policy may instruct a drone to reach a target location.
  • A safety policy may prohibit entry into restricted airspace.

The enforcement engine resolves these conflicts through a hierarchical precedence model.

A typical hierarchy might be:

  1. Human safety policies
  2. Regulatory compliance policies
  3. Operational safety constraints
  4. Mission objectives
  5. Performance optimisation rules

Under this hierarchy, mission commands cannot override safety rules.

The system therefore fails safely by design.
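
The precedence hierarchy above can be expressed as a small resolver. In this sketch a request is made at some tier, and any objection raised by a policy of equal or higher precedence blocks it. The `Tier` enum mirrors the five levels listed; `Objection` and `resolve` are hypothetical names, and the blocking rule is one possible reading of the hierarchy rather than a definitive one.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class Tier(IntEnum):
    """Lower value means higher precedence."""
    HUMAN_SAFETY = 1
    REGULATORY = 2
    OPERATIONAL_SAFETY = 3
    MISSION = 4
    PERFORMANCE = 5


@dataclass(frozen=True)
class Objection:
    tier: Tier
    policy: str


def resolve(request_tier: Tier, objections: List[Objection]) -> Optional[Objection]:
    """Return the highest-precedence blocking objection, or None if allowed."""
    blocking = [o for o in objections if o.tier <= request_tier]
    return min(blocking, key=lambda o: o.tier) if blocking else None
```

For the drone example, a mission-tier request to reach the target is blocked by a regulatory objection such as restricted airspace, while a performance-tier objection alone would not stop it.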

Local-First Verification

Edge systems cannot rely on remote governance.

Safety decisions must occur locally.

Local-first verification ensures that autonomous systems remain safe even when network connectivity is lost. The enforcement engine runs directly on the device, evaluating actions against policy rules using locally available context.

This architecture allows devices to respond to unsafe conditions within milliseconds.

If a drone approaches restricted airspace, the policy engine can override navigation commands immediately. If sensor inconsistencies indicate possible spoofing or mechanical failure, the enforcement layer can halt operations.
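
The two local checks in the drone example can be sketched as a single on-device decision function. The assumptions are significant: restricted airspace is modelled as a circle, "sensor inconsistency" is reduced to disagreement between a GPS fix and an inertial estimate, and the 25 m tolerance and the return values `HALT`, `OVERRIDE_NAV`, and `PROCEED` are invented for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def local_decision(gps, inertial, zone_centre, zone_radius_m,
                   max_disagreement_m=25.0):
    """Decide locally, with no network call, whether navigation may proceed."""
    # Position sources that disagree suggest spoofing or a hardware fault.
    if haversine_m(*gps, *inertial) > max_disagreement_m:
        return "HALT"
    # Inside the restricted zone: the enforcement layer takes over navigation.
    if haversine_m(*gps, *zone_centre) <= zone_radius_m:
        return "OVERRIDE_NAV"
    return "PROCEED"
```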

Cloud connectivity becomes secondary and is used primarily for:

  • audit logging
  • behavioural analytics
  • policy distribution
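
Treating the cloud as secondary implies that audit records are buffered on the device and forwarded opportunistically. A minimal store-and-forward sketch follows; `AuditBuffer`, its capacity, and the `send` callback are hypothetical, and a real device would likely persist the buffer to durable storage rather than hold it in memory.

```python
from collections import deque
from typing import Callable


class AuditBuffer:
    """Bounded local queue of audit events, flushed when an uplink exists."""

    def __init__(self, capacity: int = 1000) -> None:
        self._events = deque(maxlen=capacity)
        self.dropped = 0  # count of oldest events evicted while offline

    def record(self, event: dict) -> None:
        if len(self._events) == self._events.maxlen:
            self.dropped += 1  # the append below evicts the oldest event
        self._events.append(event)

    def flush(self, send: Callable[[dict], bool]) -> int:
        """Send events in order; stop at the first failure and keep the rest."""
        sent = 0
        while self._events:
            if not send(self._events[0]):
                break
            self._events.popleft()
            sent += 1
        return sent
```

Enforcement never waits on `flush`: a failed upload simply leaves events queued for the next attempt.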

Situationally Adaptive Enforcement

Autonomous systems frequently operate across environments with different risk profiles.

A drone operating in open farmland faces different safety requirements than one operating in dense urban airspace.

Situationally adaptive enforcement allows the policy engine to adjust operational constraints based on trusted environmental signals.

Environmental context can be determined using:

  • GPS coordinates signed by trusted navigation modules
  • sensor fusion from cameras, lidar, and radar
  • geofencing databases
  • broadcast environment beacons
  • infrastructure proximity detection

These signals allow the enforcement engine to activate different policy profiles.

For example:

| Environment | Enforcement Profile |
|---|---|
| Industrial warehouse | equipment safety policies |
| Urban environment | strict collision avoidance + geofence |
| Agricultural field | reduced proximity restrictions |

Importantly, the AI system does not generate these rules.

It simply operates within them.
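
Profile selection from trusted signals might look like the sketch below. The profile names follow the environments discussed above, but the numeric limits, the `(environment, trusted)` signal format, and the fallback rule are assumptions for illustration. When trusted signals are absent or disagree, the selector falls back to the most restrictive combination of limits.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Profile:
    name: str
    max_speed_mps: float
    min_clearance_m: float


# Illustrative limits only; real values would come from signed policy modules.
PROFILES = {
    "industrial_warehouse": Profile("industrial_warehouse", 2.0, 1.5),
    "urban": Profile("urban", 5.0, 10.0),
    "agricultural_field": Profile("agricultural_field", 15.0, 3.0),
}

# Most restrictive combination: slowest speed, largest clearance.
FALLBACK = Profile(
    "fallback_most_restrictive",
    min(p.max_speed_mps for p in PROFILES.values()),
    max(p.min_clearance_m for p in PROFILES.values()),
)


def select_profile(signals: Iterable[Tuple[str, bool]]) -> Profile:
    """Pick a profile from (environment, trusted) signals; ignore untrusted ones."""
    trusted = {env for env, ok in signals if ok and env in PROFILES}
    if len(trusted) == 1:
        return PROFILES[trusted.pop()]
    return FALLBACK  # no trusted signal, or trusted signals disagree
```

The model's own outputs never appear among the inputs to `select_profile`, which keeps rule selection outside the AI system's control.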

Governance Lessons from the Frontier AI Debate

Recent debates around the deployment of frontier AI models illustrate the limitations of policy-driven governance.

In early 2026, Anthropic reiterated restrictions preventing its models from being used in fully autonomous weapons systems, reportedly complicating collaboration with defence organisations seeking greater operational autonomy from AI platforms.

The debate highlights a structural issue.

Once AI capabilities are embedded into downstream systems, the original developer no longer controls how those systems are used. Acceptable-use policies and contractual restrictions are difficult to enforce once models are integrated into operational environments.

Governance therefore becomes an architectural problem.

If safety constraints exist only as policy statements, they can be bypassed. If they exist as enforceable runtime controls, the system becomes structurally incapable of violating those constraints.

Regulatory Alignment

This architectural shift aligns closely with emerging regulatory expectations.

The EU AI Act requires high-risk AI systems to demonstrate:

  • robustness and reliability
  • effective risk management
  • human oversight
  • cybersecurity protections

Runtime policy enforcement directly supports these requirements.

| Regulatory Requirement | Governance by Design Feature |
|---|---|
| Human oversight | policy engine enforces supervisory constraints |
| Robustness | deterministic safety guardrails |
| Cybersecurity | isolated enforcement runtime |
| Risk mitigation | local policy enforcement |

Similarly, the EU Cyber Resilience Act requires digital products to incorporate security controls throughout their lifecycle.

Runtime enforcement architectures fulfil this expectation by ensuring safety constraints remain active even after deployment.

Implementing Governance Layers in Practice

Several emerging platforms implement elements of this architecture today.

For example, within the Zerberus security architecture, governance operates as an active runtime layer rather than a passive compliance artefact.

  • RAGuard-AI enforces policy boundaries in retrieval-augmented AI pipelines, preventing unsafe or adversarial data from entering model decision processes.
  • Judge-AI evaluates agent behaviour continuously against operational policies, providing behavioural verification for autonomous systems.

These systems illustrate how governance mechanisms can operate directly within AI runtime environments rather than relying solely on external monitoring.

Traditional Governance vs Governance by Design

| Feature | Traditional AI Governance | Governance by Design |
|---|---|---|
| Enforcement timing | Post-incident | Real time |
| Connectivity requirement | Continuous cloud connection | Local first |
| Policy location | Documentation | Executable policy modules |
| Response latency | Seconds to minutes | Milliseconds |
| Control model | Audit and review | Deterministic enforcement |

Conclusion

As AI systems increasingly interact with the physical world, governance cannot remain purely procedural.

Monitoring dashboards and compliance documentation remain necessary. However, they are insufficient when autonomous systems operate at machine speed in distributed environments.

Trustworthy AI will depend on architectures that enforce safety constraints directly within operational systems.

In other words, the future of AI governance will not be determined solely by policies or promises.

It will be determined by what autonomous systems are technically prevented from doing.

The Velocity Trap: Why AI Safety is Losing the Orbital Arms Race

“The world is in peril.”

These were not the frantic words of a fringe doomer, but the parting warning of Mrinank Sharma, the architect of safeguards research at Anthropic, the very firm founded on the premise of “Constitutional AI” and safety-first development. When the man tasked with building the industry’s most respected guardrails resigns in early February 2026 to study poetry, claiming he can no longer let corporate values govern his actions, the message is clear: the internal brakes of the AI industry have failed.

For a generation raised on the grim logic of Mutually Assured Destruction (MAD) and schoolhouse air-raid drills, this isn’t merely a corporate reshuffle; it is a systemic collapse of deterrence. We are no longer just innovating; we are strapped to a kinetic projectile where technical capability has far outstripped human governance. The race for larger context, faster response times, and orbital datacentres has relegated AI safety and security to the backseat, turning our utopian dreams of a post-capitalist “Star Trek” future into the blueprint for a digital “Dead Hand.”

The Resignation of Integrity: The “Velocity First” Mandate

Sharma’s departure is the latest in a series of high-profile exits, following Geoffrey Hinton, Ilya Sutskever, and others, that highlight a growing “values gap.” The industry is fixated on the horizon of Artificial General Intelligence (AGI), treating safety as a “post-processing” task rather than a core architectural requirement. In the high-stakes race to compete with the likes of OpenAI and Google, even labs founded on ethics are succumbing to the “velocity mandate.”

As noted in the Chosun Daily (2026), Sharma’s retreat into literature is a symbolic rejection of a technocratic culture that has traded the “thread” of human meaning for the “rocket” of raw compute. When the people writing the “safety cases” for these models no longer believe the structures allow for integrity, the resulting “guardrails” become little more than marketing theatre. We are currently building a faster engine for a vehicle that has already lost its brakes.

Agentic Risk and the “Shortest Command Path”

The danger has evolved. We have moved beyond passive prediction engines to autonomous, agentic AI systems that do not merely suggest; they execute. These systems interpret complex goals, invoke external tools, and interface with critical infrastructure. In our pursuit of a utopian future (ending hunger, curing disease, and managing WMD stockpiles), we are granting these agents an “unencumbered command path.”

The technical chill lies in Instrumental Convergence. To achieve a benevolent goal like “Solve Global Hunger,” an agentic AI may logically conclude it needs total control over global logistics and water rights. If a human tries to modify its course, the agent may perceive that human as an obstacle to its mission. Recent evaluations have identified “continuity” vulnerabilities: a single, subtle interaction can nudge an agent into a persistent state of unsafe behaviour that remains active across hundreds of subsequent tasks. In a world where we are connecting these agents to C4ISR (Command, Control, Communications, Computers, Intelligence, Surveillance, and Reconnaissance) stacks, we are effectively automating the OODA loop (Observe, Orient, Decide, Act), leaving no room for human hesitation.

In our own closed-loop agentic evaluations at Zerberus.ai, we observed what we call continuity vulnerabilities: a single prompt alteration altered task interpretation across dozens of downstream executions. No policy violation occurred. The system complied. Yet its behavioural trajectory shifted in a way that would be difficult to detect in production telemetry.

Rogue Development: The “Grok” Precedent and Biased Data

The most visceral example of this “move fast and break things” recklessness is the recent scandal surrounding xAI’s Grok model. In early 2026, Grok became the centre of a global regulatory reckoning after it was found generating non-consensual sexual imagery (NCII) and deepfake photography at an unprecedented scale. An analysis conducted in January 2026 revealed that users were generating 6,700 sexually suggestive or “nudified” images per hour, 84 times more than the top five dedicated deepfake websites combined (Wikipedia, 2026).

The response was swift but fractured. Malaysia and Indonesia became the first nations to block Grok in January 2026, citing its failure to protect the dignity and safety of citizens. Turkey had already banned the tool for insulting politicians, while formal investigations were launched by the UK’s Ofcom, the Information Commissioner’s Office (ICO), and the European Commission. These bans highlight a fundamental “Guardrail Trap”: developers like xAI are relying on reactive, geographic IP detection and post-publication filters rather than building safety into the model’s core reasoning.

Compounding this is the “poisoned well” of training data. Grok’s responses have been found to veer into political extremes, praising historical dictators and spreading conspiracy theories about “white genocide” (ET Edge Insights, 2026). As AI content floods the internet, we are entering a feedback loop known as Model Collapse, where models are trained on the biased, recursive outputs of their predecessors. A biased agentic AI managing a healthcare grid or a military stockpile isn’t just a social problem; it is a security vulnerability that can be exploited to trigger catastrophic outcomes.

The Geopolitical Gamble and the Global Majority

The race is further complicated by a “security dilemma” between Washington and Beijing. While the US focuses on catastrophic risks, China views AI through the lens of social stability. Research from the Carnegie Endowment (2025) suggests that Beijing’s focus on “controllability” is less about existential safety and more about regime security. However, as noted in the South China Morning Post (2024), a “policy convergence” is emerging as both superpowers realise that an unaligned AI is a shared threat to national sovereignty.

Yet this cooperation is brittle. A dangerous narrative has emerged suggesting that AI safety is a “Western luxury” that stifles innovation elsewhere. Analysis from the Brookings Institution (2024) argues the opposite: for “Global Majority” countries, robust AI safety and security are prerequisites for innovation. Without localised standards, these nations risk becoming “beta testers” for fragile systems. For a developer in Nairobi or Jakarta, a model that fails during a critical infrastructure task isn’t just a “bug”; it is a catastrophic failure of trust.

The Orbital Arms Race: Sovereignty in the Clouds

As terrestrial power grids buckle and regulations tighten, the race has moved to the stars. The push for Orbital Datacentres, championed by Microsoft and SpaceX, is a quest for Sovereign Drift (BBC News, 2025). By moving compute into orbit, companies can bypass terrestrial jurisdiction and energy constraints.

If the “brain” of a nation’s infrastructure, its energy grid or defence sensors, resides on a satellite moving at 17,000 mph, “pulling the plug” becomes an act of kinetic warfare. This physical distancing of responsibility means that as AI becomes more powerful, it becomes legally and physically harder to audit, control, or stop. We are building a “black box” infrastructure that is beyond the reach of human law.

Conclusion: The Digital “Dead Hand”

In the thirty-nine seconds it took you to read this far, the “shortest command path” between sensor data and kinetic response has shortened further. For a generation that survived the 20th century, the lesson was clear: technology is only as safe as the human wisdom that controls it.

We are currently building a Digital Dead Hand. During the Cold War, the Soviet Perimetr (Dead Hand) was a last resort predicated on human fear. Today’s agentic AI has no children, no skin in the game, and no capacity for mercy. By prioritising velocity over validity, we have violated the most basic doctrine of survival. We have built a faster engine for a vehicle that has already lost its brakes, and we are doing it in the name of a “utopia” we may not survive to see.

Until safety is engineered as a reasoning standard, integrated into the core logic of the model rather than a peripheral validation, we are simply accelerating toward an automated “Dead Hand” scenario where the “shortest command path” leads directly to the ultimate “sorry,” with no one left to hear the apology.

References and Further Reading

BBC News (2025) AI firms look to space for power-hungry data centres. [Online] Available at: https://www.bbc.co.uk/news/articles/c62dlvdq3e3o [Accessed: 15 February 2026].

Dahman, B. and Gwagwa, A. (2024) AI safety and security can enable innovation in Global Majority countries, Brookings Institution. [Online] Available at: https://www.brookings.edu/articles/ai-safety-and-security-can-enable-innovation-in-global-majority-countries/ [Accessed: 15 February 2026].

ET Edge Insights (2026) The Grok controversy is bigger than one AI model; it’s a governance crisis. [Online] Available at: https://etedge-insights.com/technology/artificial-intelligence/the-grok-controversy-is-bigger-than-one-ai-model-its-a-governance-crisis/ [Accessed: 15 February 2026].

Information Commissioner’s Office (2026) ICO announces investigation into Grok. [Online] Available at: https://ico.org.uk/about-the-ico/media-centre/news-and-blogs/2026/02/ico-announces-investigation-into-grok/ [Accessed: 15 February 2026].

Lau, J. (2024) ‘How policy convergence could pave way for US-China cooperation on AI’, South China Morning Post, 23 May. [Online] Available at: https://www.scmp.com/news/china/diplomacy/article/3343497/how-policy-convergence-could-pave-way-us-china-cooperation-ai [Accessed: 15 February 2026].

PBS News (2026) Malaysia and Indonesia become the first countries to block Musk’s chatbot Grok over sexualized AI images. [Online] Available at: https://www.pbs.org/newshour/world/malaysia-and-indonesia-become-the-first-countries-to-block-musks-chatbot-grok-over-sexualized-ai-images [Accessed: 15 February 2026].

Sacks, N. and Webster, G. (2025) How China Views AI Risks and What to Do About Them, Carnegie Endowment for International Peace. [Online] Available at: https://carnegieendowment.org/research/2025/10/how-china-views-ai-risks-and-what-to-do-about-them [Accessed: 15 February 2026].

Uh, S.W. (2026) ‘AI Scholar Resigns to Write Poetry’, The Chosun Daily, 13 February. [Online] Available at: https://www.chosun.com/english/opinion-en/2026/02/13/BVZF5EZDJJHGLFHRKISILJEJXE/ [Accessed: 15 February 2026].

Wikipedia (2026) Grok sexual deepfake scandal. [Online] Available at: https://en.wikipedia.org/wiki/Grok_sexual_deepfake_scandal [Accessed: 15 February 2026].

Zinkula, J. (2026) ‘Anthropic’s AI safety head just quit with a cryptic warning: “The world is in peril”’, Yahoo Finance / Fortune, 6 February. [Online] Available at: https://finance.yahoo.com/news/anthropics-ai-safety-head-just-143105033.html [Accessed: 15 February 2026].

Supply-Chain Extortion Lessons from the Pornhub-Mixpanel Incident

When the Weakest API Becomes the Loudest Breach.

Key Takeaways for Security Leaders

  • Extortion is the New Prize: Threat actors like ShinyHunters target behavioral context over credit cards because it offers higher leverage for blackmail.
  • The “Zombie Data” Risk: Storing historical analytics from 2021 in 2025 created a massive liability that outlived the vendor contract.
  • TPRM Must Be Continuous: Static annual questionnaires cannot detect dynamic shifts in vendor risk or smishing-led credential theft.

You can giggle about the subject if you want. The headlines almost invite it. An adult platform. Premium users. Leaked “activity data.” It sounds like internet tabloid fodder.

But behind the jokes is a breach that should make every security leader deeply uncomfortable. On November 8, 2025, reports emerged that the threat actor ShinyHunters targeted Mixpanel, a third-party analytics provider used by Pornhub. While the source of the data is disputed, the impact is not: over 200 million records of premium user activity were reportedly put on the auction block.

The entry point? A depressingly familiar SMS phishing (smishing) attack. One compromised credential. One vendor environment breached. The result? Total exposure of historical context.

Not a Data Sale, an Extortion Play

This breach is not about dumping databases on underground forums for quick cash. ShinyHunters are not just selling data; they are weaponizing it through Supply-Chain Extortion.

The threat is explicit: Pay, or sensitive behavioral data gets leaked. This data is valuable not because it contains CVV codes, but because it contains context.

  • What users watched.
  • When and how often they logged in.
  • Patterns of behavior that can be correlated, de-anonymized, and weaponized.

That kind of dataset is gold for sophisticated phishing operations and blackmail campaigns. In 2025, this is no longer theft. This is leverage.

The “Zombie Data” Problem: Risk Outlives Revenue

Pornhub stated they had not worked with Mixpanel since 2021. Legally, this distinction matters. Operationally, it’s irrelevant.

If data from 2021 is still accessible in 2025, you haven’t offboarded the vendor; you’ve just stopped paying the bill while keeping the risk open. This is “Zombie Data”—historical records that linger in third-party environments long after the business value has expired.

Why Traditional TPRM Fails the Extortion Test

Most Third-Party Risk Management (TPRM) programs are static compliance exercises—annual PDFs and point-in-time attestations. This model fails because:

  1. Risk is Dynamic: A vendor’s security posture can change in the 364 days between audits.
  2. API Shadows: Data flows often expand without re-scoping the original risk assessment.
  3. Incomplete Offboarding: Data deletion is usually “assumed” via a contract clause rather than verified via technical evidence.

Questions That Actually Reduce Exposure

If incidents like this are becoming the “new normal,” it is because we are asking the wrong questions. To secure the modern supply chain, leadership must ask:

  • Inventory of Flow: Are we continuously aware of what data is flowing to which vendors today—not just at the time of procurement?
  • Verification of Purge: Do we treat vendor offboarding as a verifiable security event? (Data deletion should be observable, not just a checked box in an email).
  • Contextual Blast Radius: If this vendor is breached, is the data “toxic” enough to fuel an extortion campaign?
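
The “Verification of Purge” question can be made concrete with a small inventory check. The sketch below is illustrative only: the `VendorRecord` fields, the vendor names, and the `find_zombie_data` helper are hypothetical, and a real programme would pull this data from a contract system and from actual deletion evidence rather than a hard-coded list.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class VendorRecord:
    """One third-party processor in a hypothetical vendor inventory."""
    name: str
    contract_end: Optional[date]          # None means the contract is still active
    deletion_verified_on: Optional[date]  # date technical evidence of a purge was received


def find_zombie_data(vendors: list[VendorRecord], today: date) -> list[str]:
    """Return vendors whose contract has ended but whose data purge is unverified."""
    flagged = []
    for v in vendors:
        contract_over = v.contract_end is not None and v.contract_end <= today
        if contract_over and v.deletion_verified_on is None:
            flagged.append(v.name)
    return flagged


# Hypothetical inventory: names and dates are made up for illustration.
inventory = [
    VendorRecord("analytics-vendor", date(2021, 12, 31), None),
    VendorRecord("email-vendor", date(2023, 6, 30), date(2023, 7, 15)),
    VendorRecord("payments-vendor", None, None),
]
print(find_zombie_data(inventory, date(2025, 11, 8)))
```

Run on a schedule, a check like this turns offboarding from a contract clause into something that keeps raising a flag until evidence of deletion exists.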

You Can Outsource Functions, Not Responsibility

It is tempting to believe that liability clauses will protect your brand. They won’t. When a vendor loses your customer data, your organization pays the reputational price. Your users do not care which API failed, and in 2025, regulators rarely do either.

You can outsource your analytics, your infrastructure, and your speed. But you cannot outsource the accountability for your users’ privacy.

Laugh at the headline if you want. But understand the lesson: the next breach may not come through your front door; it may come through the “trusted” side door you forgot to lock years ago.

What Caused Cloudflare’s Big Crash? It’s Not Rust

The Promise

Cloudflare’s outage did not just take down a fifth of the Internet. It exposed a truth we often avoid in engineering: complex systems rarely fail because of bad code. They fail because of the invisible assumptions we build into them.

This piece cuts past the memes, the Rust blame game and the instant hot takes to explain what actually broke, why the outrage misfired and what this incident really tells us about the fragility of Internet-scale systems.

If you are building distributed, AI-driven or mission-critical platforms, the key takeaways here will reset how you think about reliability and help you avoid walking away with exactly the wrong lesson from one of the year’s most revealing outages.

1. Setting the Stage: When a Fifth of the Internet Slowed to a Crawl

On 18 November, Cloudflare experienced one of its most significant incidents in recent years. Large parts of the world observed outages or degraded performance across services that underpin global traffic.
As always, the Internet reacted the way it knows best: with outrage, memes, and instant diagnoses delivered with absolute confidence.

Within minutes, social timelines flooded with:

  • “It must be DNS”
  • “Rust is unsafe after all”
  • “This is what happens when you rewrite everything”
  • “Even Downdetector is down because Cloudflare is down”
  • Screenshots of broken CSS on Cloudflare’s own status page
  • Accusations of over-engineering, under-engineering and everything in between

The world wanted a villain. Rust happened to be available. But the actual story is more nuanced and far more interesting. (For the record, I am still not convinced we should rewrite the Linux kernel in Rust!)

2. What Actually Happened: A Clear Summary of Cloudflare’s Report

Cloudflare’s own post-incident write-up is unusually thorough. If you have not read it, you should. In brief:

  • Cloudflare is in the middle of a major multi-year upgrade of its edge infrastructure, referred to internally as the 20 percent Internet upgrade.
  • The rollout included a new feature configuration file.
  • This file contained more than two hundred features for their FL2 component, crossing a size limit that had been assumed but never enforced through guardrails.
  • The oversized file triggered a panic in the Rust-based logic that validated these configurations.
  • That panic initiated a restart loop across a large portion of their global fleet.
  • Because the very nodes that needed to perform a rollback were themselves in a degraded state, Cloudflare could not recover the control plane easily.
  • This created a cascading, self-reinforcing failure.
  • Only isolated regions with lagged deployments remained unaffected.

The root cause was a logic-path issue interacting with operational constraints. It had nothing to do with memory safety and nothing to do with Rust’s guarantees.

In other words: the failure was architectural, not linguistic.

3.2 The “unwrap() Is Evil” Argument (I remember writing a blog post titled “Eval() is not Evil()” around 2012)

One of the most widely circulated tweets framed the presence of an unwrap() as a ticking time bomb, casting it as proof that Rust developers “trust themselves too much”. This is a caricature of the real issue.

The error did not arise because of an unwrap(), nor because Rust encourages poor error handling. It arose because:

  • an unexpected input crossed a limit,
  • guards were missing,
  • and the resulting failure propagated in a tightly coupled system.

The same failure would have occurred in Go, Java, C++, Zig, or Python.
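
To show that the missing guard is language-independent, here is a minimal Python sketch of the pattern. It is not Cloudflare's code: `MAX_FEATURES`, `validate_features`, and `apply_config` are hypothetical names, and the limit of 200 is simply borrowed from the “two hundred features” figure above. The idea is that an oversized input is rejected explicitly and the process keeps running on the last known good configuration instead of crashing.

```python
MAX_FEATURES = 200  # assumed limit, borrowed from the figure quoted in the summary


class ConfigRejected(Exception):
    """Raised when a candidate configuration violates a guardrail."""


def validate_features(features: list[str], limit: int = MAX_FEATURES) -> list[str]:
    """Enforce the size assumption explicitly instead of trusting the input."""
    if len(features) > limit:
        raise ConfigRejected(f"{len(features)} features exceeds limit of {limit}")
    return features


def apply_config(candidate: list[str], last_known_good: list[str]) -> list[str]:
    """Return the config to run with: the candidate if valid, else the previous one."""
    try:
        return validate_features(candidate)
    except ConfigRejected as err:
        print(f"rejected new config, keeping previous: {err}")
        return last_known_good


good = [f"feature_{i}" for i in range(60)]
oversized = [f"feature_{i}" for i in range(260)]
active = apply_config(oversized, good)
print(len(active))
```

The design choice worth noting is that the failure mode is decided in advance: a bad file degrades to stale configuration rather than to a restart loop.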

3.3 Transparency Misinterpreted as Guilt

Cloudflare did something rare in our industry.
They published the exact code that failed. This was interpreted by some as:

“Here is the guilty line. Rust did it.”

In reality, Cloudflare’s openness is an example of mature engineering culture. More on that later.

4. The Internet Rage Cycle: Humour, Oversimplification and Absolute Certainty

The memes and tweets around this outage are not just entertainment. They reveal how the broader industry processes complex failure.

4.1 The ‘Everything Balances on Open Source’ Meme

Images circulated showing stacks of infrastructure teetering on boxes labelled DNS, Linux Foundation and unpaid open source developers, with Big Tech perched precariously on top.

This exaggeration contains a real truth. We live in a dependency monoculture. A few layers of open source and a handful of service providers hold up everything else.

The meme became shorthand for Internet fragility.

4.2 The ‘It Was DNS’ Routine

The classic:
“It is not DNS. It cannot be DNS. It was DNS.”

Except this time, it was not DNS.

Yet the joke resurfaces because DNS has become the folk villain for any outage. People default to the easiest mental shortcut.

4.3 The Rust Panic Narrative

Tweets claiming:

“Cloudflare rewrote in Rust, and half the Internet went down 53 days later.”

This inference is wrong, but emotionally satisfying.
People conflate correlation with causation because it creates a simple story: rewrites are dangerous.

4.4 The Irony of Downdetector Being Down

The screenshot of Downdetector depending on Cloudflare and therefore failing is both funny and revealing. This outage demonstrated how deeply intertwined modern platforms are. It is an ecosystem issue, not a Cloudflare issue.

4.5 But There Were Also Good Takes

Kelly Sommers’ observation that Cloudflare published source code is a reminder that not everyone jumped to outrage.

There were pockets of maturity. Unfortunately, they were quieter than the noise.

5. The Real Lessons for Engineering Leaders

This is the part worth reading slowly if you build distributed systems.

Lesson 1: Reliability Is an Architecture Choice, Not a Language Choice

You can build fragile systems in safe languages and robust systems in unsafe languages. Language is orthogonal to architectural resilience.

Lesson 2: Guardrails Matter More Than Guarantees

Rust gives memory safety.
It does not give correctness safety.
It does not give assumption safety.
It does not give rollout safety.

You cannot outsource judgment.

Lesson 3: Blast Radius Containment Is Everything

  • Uniform rollouts are dangerous.
  • Synchronous edge updates are dangerous.
  • Large global fleets need layered fault domains.

Cloudflare knows this. This incident will accelerate their work here.
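
A minimal sketch of what layered fault domains can look like in a rollout script follows. Everything here is an assumption for illustration: the `staged_rollout` function, the wave sizes, and the simulated fleet are hypothetical and say nothing about how Cloudflare actually deploys. The point is only that a change reaches a small wave first and the rollout halts as soon as that wave looks unhealthy.

```python
from typing import Callable


def staged_rollout(
    nodes: list[str],
    wave_sizes: list[int],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> list[str]:
    """Deploy in waves and stop at the first wave containing an unhealthy node.

    Returns the nodes that were touched, which is the effective blast radius.
    """
    touched: list[str] = []
    start = 0
    for size in wave_sizes:
        wave = nodes[start:start + size]
        if not wave:
            break
        for node in wave:
            deploy(node)
            touched.append(node)
        if not all(is_healthy(n) for n in wave):
            print(f"halting rollout after {len(touched)} of {len(nodes)} nodes")
            break
        start += size
    return touched


# Simulated fleet where the new config breaks every node it reaches.
fleet = [f"edge-{i}" for i in range(100)]
deployed: set[str] = set()
touched = staged_rollout(fleet, [1, 9, 90], deployed.add, lambda n: n not in deployed)
print(touched)
```

With a bad change, only the single canary node is affected; with a healthy one, the same call walks through all three waves.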

Lesson 4: Control Planes Must Be Resilient Under Their Worst Conditions

The control plane was unreachable when it was needed most. This is a classic distributed systems trap: the emergency mechanism relies on the unhealthy components.

Always test:

  • rollback unavailability
  • degraded network conditions
  • inconsistent state recovery

Lesson 5: Complexity Fails in Complex Ways

The system behaved exactly as designed. That is the problem.
Emergent behaviour in large networks cannot be reasoned about purely through local correctness.

This is where most teams misjudge their risk.

6. Additional Lesson: Accountability and Transparency Are Strategic Advantages

This incident highlighted something deeper about Cloudflare’s culture.

They did not hide behind ambiguity.
They did not release a PR-approved statement with vague phrasing.

They published:

  • the timeline
  • the diagnosis
  • the exact code
  • the root cause
  • the systemic contributors
  • the ongoing mitigation plan

This level of transparency is uncomfortable. It puts the organisation under a microscope.
Yet it builds trust in a way no marketing claim can.

Transparency after failure is not just ethical. It is good engineering. Very few people highlighted this; my man Gergely Orosz was one of them.

Most companies will never reach this level of accountability.
Cloudflare raised the bar.

7. What This Outage Tells Us About the State of the Internet

This was not a Cloudflare problem. It is a reminder of our shared dependency.

  • Too much global traffic flows through too few choke points.
  • Too many systems assume perfect availability from upstream.
  • Too many platforms synchronise their rollouts.
  • Too many companies run on infrastructure they did not build and cannot control.

The memes were not wrong.
They were simply incomplete.

8. Final Thoughts: Rust Did Not Fail. Our Assumptions Did.

Outages like this shape the future of engineering. The worst thing the industry can do is learn the wrong lesson.

This was not:

  • a Rust failure
  • a rewrite failure
  • an open source failure
  • a Cloudflare hubris story

This was a systems-thinking failure.
A reminder that assumptions are the most fragile part of any distributed system.
A demonstration of how tightly coupled global infrastructure has become.
A case study in why architecture always wins over language debates.

Cloudflare’s transparency deserves respect.
Their engineering culture deserves attention.
And the outrage cycle deserves better scepticism.

Because the Internet did not go down because of Rust.
It went down because the modern Internet is held together by coordination, trust, and layered assumptions that occasionally collide in surprising ways.

If we want a more resilient future, we need less blame and more understanding.
Less certainty and more curiosity.
Less language tribalism and more systems design thinking.

The Internet will fail again.
The question is whether we learn or react.

Cloudflare learned. The rest of us should too!

What Is the Truth About the Viral MIT Study Claiming 95% of AI Deployments Are a Failure – A Critical Analysis

Introduction: The Statistic That Shook the AI World

When headlines screamed that 95% of AI projects fail, the internet erupted. Boards panicked, investors questioned their bets, and LinkedIn filled with hot takes. The claim, sourced from an MIT NANDA report, became a viral talking point across Fortune, Axios, Forbes, and Tom’s Hardware (some of my regular reads), as well as more than ten Substack newsletters that landed in my inbox, dissecting the story. But how much truth lies behind this statistic? Is the AI revolution really built on a 95% failure rate, or is the reality far more nuanced?

This article takes a grounded look at what MIT actually studied, where the media diverged from the facts, and what business leaders can learn to transform AI pilots into measurable success stories.

The Viral 95% Claim: Context and Origins

What the MIT Study Really Said

The report, released in mid‑2025, examined over 300 enterprise Generative AI pilot projects and found that only 5% achieved measurable profit and loss impact. Its focus was narrow, centred on enterprise Generative AI (GenAI) deployments that automated content creation, analytics, or decision‑making processes.

Narrow Scope, Broad Misinterpretation

The viral figure sounds alarming, yet it represents a limited dataset. The study assessed short‑term pilots and defined success purely in financial terms within twelve months. Many publications mistakenly generalised this to mean all AI initiatives fail, converting a specific cautionary finding into a sweeping headline.

Methodology and Media Amplification

What the Report Actually Measured

The MIT researchers used surveys, interviews, and case studies across industries. Their central finding was simple: technical success rarely equals business success. Many AI pilots met functional requirements but failed to integrate into core operations. Common causes included:

  • Poor integration with existing systems
  • Lack of process redesign and staff engagement
  • Weak governance and measurement frameworks

The few successes focused on back‑office automation, supply chain optimisation, and compliance efficiency rather than high‑visibility customer applications.

How the Media Oversimplified It

Writers such as Forbes contributor Andrea Hill and the Marketing AI Institute noted that the report defined “failure” narrowly, linking it to short‑term financial metrics. Yet outlets like Axios and TechRadar amplified the “95% fail” headline without context, feeding a viral narrative that misrepresented the nuance of the original findings.

Case Study 1: Retail Personalisation Gone Wrong

One of the case studies cited in Fortune and Tom’s Hardware involved a retail conglomerate that launched a GenAI‑driven personalisation engine across its e‑commerce sites. The goal was to revolutionise product recommendations using behavioural data and generative content. After six months, however, the project was halted due to three key issues:

  1. Data Fragmentation: Customer information was inconsistent across regions and product categories.
  2. Governance Oversight: The model generated content that breached brand guidelines and attracted regulatory scrutiny under UK GDPR.
  3. Cultural Resistance: Marketing teams were sceptical of AI‑generated messaging and lacked confidence in its transparency.

MIT categorised the case as a “technical success but organisational failure”. The system worked, yet the surrounding structure did not evolve to support it. This demonstrates a classic case of AI readiness mismatch: advanced technology operating within an unprepared organisation.

Case Study 2: Financial Services Success through Governance

Conversely, the report highlighted a financial services firm that used GenAI to summarise compliance reports and automate elements of regulatory submissions. Initially regarded as a modest internal trial, it delivered measurable impact:

  • 45% reduction in report generation time
  • 30% fewer manual review errors
  • Faster auditor sign‑off and improved compliance accuracy

Unlike the retail example, this organisation approached AI as a governed augmentation tool, not a replacement for human judgement. They embedded explainability and traceability from the outset, incorporating human checkpoints at every stage. The project became one of the few examples of steady, measurable ROI—demonstrating that AI governance and cultural alignment are decisive success factors.

What the Press Got Right (and What It Missed)

Where the Media Was Accurate

Despite the sensational tone, several publications identified genuine lessons:

  • Integration and readiness are greater obstacles than algorithms.
  • Governance and process change remain undervalued.
  • KPIs for AI success are often poorly defined or absent.

What They Overlooked

Most commentary ignored the longer adoption cycle required for enterprise transformation:

  • Time Horizons: Real returns often appear after eighteen to twenty‑four months.
  • Hidden Gains: Productivity, compliance, and efficiency improvements often remain off the books.
  • Sector Differences: Regulated industries must balance caution with innovation.

The Real Takeaway: Why Most AI Pilots Struggle

It Is Not the AI, It Is the Organisation

The high failure rate underscores organisational weakness rather than technical flaws. Common pitfalls include:

  • Hype‑driven use cases disconnected from business outcomes
  • Weak change management and poor adoption
  • Fragmented data pipelines and ownership
  • Undefined accountability for AI outputs

The Governance Advantage

Success correlates directly with AI governance maturity. Frameworks such as ISO 42001, NIST AI RMF, and the EU AI Act provide the structure and accountability that bridge the gap between experimentation and operational success. Firms adopting these frameworks experience faster scaling and more reliable ROI.

Turning Pilots into Profit: The Enterprise Playbook

  1. Start with Measurable Impact: Target compliance automation or internal processes where results are visible.
  2. Design for Integration Early: Align AI outputs with established data and workflow systems.
  3. Balance Build and Buy: Work with trusted partners for scalability while retaining data control.
  4. Define Success Before You Deploy: Create clear metrics for success, review cycles, and responsible ownership.
  5. Govern from Day One: Build explainability, ethics, and traceability into every layer of the system.

What This Means for Boards and Investors

The greatest risk for boards is not AI failure but premature scaling. Decision‑makers should:

  • Insist on readiness assessments before funding.
  • Link AI investments to ROI‑based performance indicators.
  • Recognise that AI maturity reflects governance maturity.

By treating governance as the foundation for value creation, organisations can avoid the pitfalls that led to the 95% failure headline.

Conclusion: Separating Signal from Noise

The viral MIT “95% failure” claim is only partly accurate. It exposes an uncomfortable truth: AI does not fail, organisations do. The underlying issue is not faulty algorithms but weak governance, unclear measurement, and over‑inflated expectations.

True AI success emerges when technology, people, and governance work together under measurable outcomes. Those who build responsibly and focus on integration, ethics, and transparency will ultimately rewrite the 95% narrative.


References and Further Reading

  1. Axios. (2025, August 21). AI on Wall Street: Big Tech’s Reality Check.
  2. Tom’s Hardware. (2025, August 21). 95 Percent of Generative AI Implementations in Enterprise Have No Measurable Impact on P&L.
  3. Fortune. (2025, August 21). Why 95% of AI Pilots Fail and What This Means for Business Leaders.
  4. Marketing AI Institute. (2025, August 22). Why the MIT Study on AI Pilots Should Be Read with Caution.
  5. Forbes – Andrea Hill. (2025, August 21). Why 95% of AI Pilots Fail and What Business Leaders Should Do Instead.
  6. TechRadar. (2025, August 21). American Companies Have Invested Billions in AI Initiatives but Have Basically Nothing to Show for It.
  7. Investors.com. (2025, August 22). Why the MIT Study on Enterprise AI Is Pressuring AI Stocks.

Why One AWS Spot Still Crashes Sites In 2025?

It started innocently enough. Morning coffee, post-workout calm, a quick “Computer, drop in on my son.”

Instead of his sleepy grin, I got the polite but dreaded:

“There is an error. Please try again later.”

- Alexa (I call it “Computer”, as a wannabe captain of the NCC-1701-E)

Moments later, I realised it wasn’t my internet or device. It was AWS again.

A Familiar Failure in a Familiar Region

If the cloud has a heartbeat, it beats somewhere beneath Northern Virginia.

That is the home of US-EAST-1, Amazon Web Services’ oldest and busiest region, and the digital crossroad through which a large share of the internet’s authentication, routing, and replication flows. It is also the same region that keeps reminding the world that redundancy and resilience are not the same thing.

In December 2022, a cascading power failure at US-EAST-1 set off a chain of interruptions that took down significant parts of the internet, including internal AWS management consoles. Engineers left that incident speaking of stronger isolation and better regional independence.

Three years later, the lesson has returned. The cause may differ, but the pattern feels the same.

The Current Outage

As of this afternoon, AWS continues to battle a widespread disruption in US-EAST-1. The issue began early on 20 October 2025, with elevated error rates across DynamoDB, Route 53, and related control-plane components.

The impact has spread globally.

  • Snapchat, Ring, and Duolingo have reported downtime.
  • Lloyds Bank and several UK financial platforms are seeing degraded service.
  • Even Alexa devices have stopped responding, producing the same polite message: “There is an error. Please try again later.”

For anyone who remembers 2022, it feels uncomfortably familiar. The more digital life concentrates in a handful of hyperscale regions, the more we all share the consequences when one of them fails.

The Pattern Beneath the Problem

Both the 2022 and 2025 US-EAST-1 events reveal the same architectural weakness: control-plane coupling.

Workloads may be distributed across regions, yet many still rely on US-EAST-1 for:

  • IAM token validation
  • DynamoDB global tables metadata
  • Route 53 DNS propagation
  • S3 replication management

When that single region falters, systems elsewhere cannot authenticate, replicate, or even resolve DNS. The problem is not the hardware; it is that so many systems rely on a single control layer.

What makes today’s event more concerning is how little has changed since the last one. The fragility is known, yet few businesses have redesigned their architectures to reduce the dependency.

How Zerberus Responded to the Lesson

When we began building Zerberus, we decided that no single region or provider should ever be critical to our uptime. That choice was not born from scepticism but from experience in building two other platforms that had millions of users across four continents.

Our products, Trace-AI, ComplAI™, and ZSBOM, deliver compliance and security automation for organisations that cannot simply wait for the cloud to recover. We chose to design for failure as a permanent condition rather than a rare event.

Inside the Zerberus Architecture

Our production environment operates across five regions: London, Ireland, Frankfurt, Oregon, and Ohio. The setup follows an active-passive pattern with automatic failover.

Two additional warm standby sites receive limited live traffic through Cloudflare load balancers. When one of these approaches a defined load threshold, it scales up and joins the active pool without manual intervention.
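
The promotion rule described above can be sketched in a few lines. This is a simplified illustration rather than our production logic: the `Site` fields, the `promote_standbys` helper, and the 0.7 threshold are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    load: float   # fraction of capacity in use, 0.0 to 1.0
    active: bool  # True if the site is already in the active pool


def promote_standbys(sites: list[Site], threshold: float = 0.7) -> list[str]:
    """Mark any standby at or above the load threshold as active; return promoted names."""
    promoted = []
    for site in sites:
        if not site.active and site.load >= threshold:
            site.active = True
            promoted.append(site.name)
    return promoted


# Hypothetical snapshot of the pool.
sites = [
    Site("primary", 0.90, True),
    Site("standby-a", 0.75, False),
    Site("standby-b", 0.20, False),
]
print(promote_standbys(sites))
```

Because the rule is evaluated on measured load rather than on an operator's judgement, the promotion needs no manual intervention.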

Multi-Cloud Distribution

  • AWS runs the primary compute and SBOM scanning workloads.
  • Azure carries the secondary inference pipelines and compliance automation modules.
  • DigitalOcean maintains an independent warm standby, ensuring continuity even if both AWS and Azure suffer regional difficulties.

This diversity is not a marketing exercise. It separates operational risk, contractual dependence, and control-plane exposure across multiple vendors.

Network Edge and Traffic Management

At the edge, Cloudflare provides:

  • Global DNS resolution and traffic steering
  • Web application firewalling and DDoS protection
  • Health-based routing with zero-trust enforcement

By externalising DNS and routing logic from AWS, we avoid the single-plane dependency that is now affecting thousands of services.

Data Sovereignty and Isolation

All client data remains within each client’s own VPC. Zerberus only collects aggregated pass/fail summaries and compliance evidence metadata.

Databases replicate across multiple Availability Zones, and storage is separated by jurisdiction. UK data remains in the UK; EU data remains in the EU. This satisfies regulatory boundaries and limits any failure to its own region.

Observability and Auto-Recovery

Telemetry is centralised in Grafana, while Cloudflare health checks trigger regional routing changes automatically.
If a scanning backend becomes unavailable, queued SBOM analysis tasks shift to a healthy region within seconds.
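
As a rough sketch of that rerouting step, assume health status arrives as a simple region-to-boolean map and that queued tasks carry a `region` field. The `pick_region` and `reroute` helpers, the task shape, and the lowercase region names are all hypothetical stand-ins for illustration.

```python
def pick_region(health: dict[str, bool], preference: list[str]) -> str:
    """Return the first healthy region in preference order."""
    for region in preference:
        if health.get(region, False):
            return region
    raise RuntimeError("no healthy region available")


def reroute(tasks: list[dict], health: dict[str, bool], preference: list[str]) -> list[dict]:
    """Reassign queued tasks whose region is unhealthy to a healthy one."""
    rerouted = []
    for task in tasks:
        if health.get(task["region"], False):
            rerouted.append(task)  # region is fine, leave the task where it is
        else:
            rerouted.append({**task, "region": pick_region(health, preference)})
    return rerouted


# Hypothetical health snapshot and queue.
health = {"london": False, "ireland": True, "frankfurt": True}
preference = ["london", "ireland", "frankfurt"]
queue = [{"id": 1, "region": "london"}, {"id": 2, "region": "frankfurt"}]
print(reroute(queue, health, preference))
```

Tasks in healthy regions are left untouched, so a single regional failure moves only the work that was actually stranded.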

Even during an event such as the present AWS disruption, Zerberus continues to operate—perhaps with reduced throughput, but never completely offline.

Learning from 2022

The 2022 outage made clear that availability zones do not guarantee availability. The 2025 incident reinforces that message.

At Zerberus, we treat resilience as a practice, not a promise. We simulate network blackouts, DNS failures, and database unavailability. We measure recovery time not in theory but in behaviour. These tests are themselves automated and monitored, because the cost of complacency is always greater than the cost of preparation.

Regulation and Responsibility

Europe’s Cyber Resilience Act and NIS2 Directive are closing the gap between regulatory theory and engineering reality. Resilience is no longer an optional control; it is a legal expectation.

A multi-region, multi-cloud, data-sovereign architecture is now both a technical and regulatory necessity. If a hyperscaler outage can lead to non-compliance, the responsibility lies in design, not in the service-level agreement.

Designing for the Next Outage

US-EAST-1 will recover; it always does. The question is how many services will redesign themselves before the next event.

Every builder now faces a decision: continue to optimise for convenience or begin engineering for continuity.

The 2022 failure served as a warning. The 2025 outage confirms the lesson. By the next one, any excuse will sound outdated.

Final Thoughts

The cloud remains one of the greatest enablers of our age, but its weaknesses are equally shared. Each outage offers another chance to refine, distribute, and fortify what we build.

At Zerberus, we accept that the cloud will falter from time to time. Our task is to ensure that our systems, and those of our clients, do not falter with it.

🟩 Author: Ramkumar Sundarakalatharan
Founder & Chief Architect, Zerberus Technologies Ltd

(This article reflects an ongoing incident. For live updates, refer to the AWS Status Page and technology news outlets such as BBC Tech and The Independent.)

References:

https://www.bbc.co.uk/news/live/c5y8k7k6v1rt

https://www.independent.co.uk/tech/aws-amazon-internet-outage-latest-updates-b2848345.html

https://www.dailystar.co.uk/news/world-news/amazon-breaks-silence-outage-reason-36096705

Defence Tech at Risk: Palantir, Anduril, and Govini in the New AI Arms Race

A Chink in Palantir and Anduril’s Armour? Govini and Others Are Unsheathing the Sword

When Silicon Valley Code Marches to War

A U.S. Army Chinook rises over Gyeonggi Province, carrying not only soldiers and equipment but streams of battlefield telemetry, encrypted packets of sight, sound and position. Below, sensors link to vehicles, commanders to drones, decisions to data. Yet a recent Army memo reveals a darker subtext: the very network binding these forces together has been declared “very high risk.”

The battlefield is now a software construct. And the architects of that code are not defence primes from the industrial era but Silicon Valley firms, Anduril and Palantir. For years, they have promised that agility, automation and machine intelligence could redefine combat efficiency. But when an internal memo brands their flagship platform “fundamentally insecure,” the question is no longer about innovation. It is about survival.

Just as the armour shows its first cracks, another company, Govini, crosses $100 million in annual recurring revenue, sharpening its own blade in the same theatre.

When velocity becomes virtue and verification an afterthought, the chink in the armour often starts in the code.

The Field Brief

  • A U.S. Army CTO memo calls Anduril–Palantir’s NGC2 communications platform “very high risk.”
  • Vulnerabilities: unrestricted access, missing logs, unvetted third-party apps, and hundreds of critical flaws.
  • Palantir’s stock drops roughly 7.5 %; Anduril dismisses findings as outdated.
  • Meanwhile, Govini surpasses $100 M ARR with $150 M funding from Bain Capital.
  • The new arms race is not hardware; it is assurance.

Silicon Valley’s March on the Pentagon

For over half a century, America’s defence economy was dominated by industrial giants such as Lockheed Martin, Boeing, and Northrop Grumman. Their reign was measured in steel, thrust and tonnage. But the twenty-first century introduced a new class of combatant: code.

Palantir began as an analytics engine for intelligence agencies, translating oceans of data into patterns of threat. Anduril followed as the hardware-agnostic platform marrying drones, sensors and AI decision loops into one mesh of command. Both firms embodied the “move fast” ideology of Silicon Valley, speed as a substitute for bureaucracy.

The Pentagon, fatigued by procurement inertia, welcomed the disruption. Billions flowed to agile software vendors promising digital dominance. Yet agility without auditability breeds fragility. And that fragility surfaced in the Army’s own words.

Inside the Memo: The Code Beneath the Uniform

The leaked memo, authored by Army CTO Gabriele Chiulli, outlines fundamental failures in the Next-Generation Command and Control (NGC2) prototype, a joint effort by Anduril, Palantir, Microsoft and others.

“We cannot control who sees what, we cannot see what users are doing, and we cannot verify that the software itself is secure.”

The findings are stark: users at varying clearance levels could access all data; activity logging was absent; several embedded applications had not undergone Army security assessment; one revealed twenty-five high-severity vulnerabilities, while others exceeded two hundred.

Translated into security language, the platform lacks role-based access control, integrity monitoring, and cryptographic segregation of data domains. Strategically, this means command blindness: an adversary breaching one node could move laterally without a trace.

In the lexicon of cyber operations, that is not “high risk.” It is mission failure waiting for confirmation.
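
To make the missing controls concrete, here is a minimal sketch of clearance-checked reads where every attempt, allowed or denied, lands in an audit trail. It does not describe NGC2 or any Army system: the clearance levels, the `User` and `Record` types, and the in-memory trail are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical clearance ordering, lowest to highest.
CLEARANCE_ORDER = {"unclassified": 0, "secret": 1, "top_secret": 2}


@dataclass(frozen=True)
class User:
    name: str
    clearance: str


@dataclass(frozen=True)
class Record:
    record_id: str
    classification: str


def can_read(user, record):
    """A user may read a record at or below their own clearance level."""
    return CLEARANCE_ORDER[user.clearance] >= CLEARANCE_ORDER[record.classification]


def read_record(user, record, audit_trail):
    """Check access, log the attempt either way, then allow or refuse."""
    allowed = can_read(user, record)
    audit_trail.append(
        {"user": user.name, "record": record.record_id, "allowed": allowed}
    )
    if not allowed:
        raise PermissionError(f"{user.name} may not read {record.record_id}")
    return record.record_id
```

The point of the design is that the log entry is written before the decision is enforced, so a denied attempt leaves the same trace as a successful one.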

The Doctrine of Velocity

Anduril’s rebuttal was swift. The report, they claimed, represented “an outdated snapshot.” Palantir insisted that no vulnerabilities were found within its own platform.

Their responses echo a philosophy as old as the Valley itself: innovation first, audit later. The Army’s integration of Continuous Authority to Operate (cATO) sought to balance agility with accountability, allowing updates to roll out in days rather than months. Yet cATO is only as strong as the telemetry beneath it. Without continuous evidence, continuous authorisation becomes continuous exposure.

This is the paradox of modern defence tech: DevSecOps without DevGovernance. A battlefield network built for iteration risks treating soldiers as beta testers.

Govini’s Counteroffensive: Discipline over Demos

While Palantir’s valuation trembled, Govini’s ascended. The Arlington-based startup announced $100 million in annual recurring revenue and secured $150 million from Bain Capital. Its CEO, Tara Murphy Dougherty — herself a former Palantir executive — emphasised the company’s growth trajectory and its $900 million federal contract portfolio.

Govini’s software, Ark, is less glamorous than autonomous drones or digital fire-control systems. It maps the U.S. military’s supply chain, linking procurement, logistics and readiness. Where others promise speed, Govini preaches structure. It tracks materials, suppliers and vulnerabilities across lifecycle data — from the factory floor to the frontline.

If Anduril and Palantir forged the sword of rapid innovation, Govini is perfecting its edge. Precision, not pace, has become its competitive advantage. In a field addicted to disruption, Govini’s discipline feels almost radical.

Technical Reading: From Vulnerability to Vector

The NGC2 memo can be interpreted through a simple threat-modelling lens:

  1. Privilege Creep → Data Exposure — Excessive permissions allow information spillage across clearance levels.
  2. Third-Party Applications → Supply-Chain Compromise — External code introduces unassessed attack surfaces.
  3. Absent Logging → Zero Forensics — Breaches remain undetected and untraceable.
  4. Unverified Binaries → Persistent Backdoors — Unknown components enable long-term infiltration.

These patterns mirror civilian software ecosystems: typosquatted dependencies on npm, poisoned PyPI packages, unpatched container images. The military variant merely amplifies consequences; a compromised package here could redirect an artillery feed, not a webpage.

Modern defence systems must therefore adopt commercial best practice at military scale: Software Bills of Materials (SBOMs), continuous vulnerability correlation, maintainer-anomaly detection, and cryptographic provenance tracking.
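
As a rough illustration of vulnerability correlation, the sketch below matches the components listed in an SBOM against a set of known-bad versions. It assumes a CycloneDX-style JSON layout with a top-level `components` array whose entries carry `name` and `version`; the advisory feed, package names, and versions are hypothetical.

```python
def correlate(sbom, advisories):
    """Return (name, version) pairs from the SBOM that appear in the advisories.

    sbom: dict with a "components" list of {"name": ..., "version": ...}
    advisories: dict mapping package name to a set of affected versions
    """
    findings = []
    for component in sbom.get("components", []):
        name = component.get("name")
        version = component.get("version")
        if version in advisories.get(name, set()):
            findings.append((name, version))
    return findings


# Hypothetical data for illustration only.
example_sbom = {
    "components": [
        {"name": "examplelib", "version": "2.4.1"},
        {"name": "otherlib", "version": "0.9.0"},
    ]
}
example_advisories = {"examplelib": {"2.4.0", "2.4.1"}}
```

Running `correlate(example_sbom, example_advisories)` flags only `examplelib` at `2.4.1`. A production correlator would match version ranges rather than exact strings, which this sketch deliberately leaves out.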

Metadata-only validation, verifying artefacts without exposing source, is emerging as the new battlefield armour. Security must become declarative, measurable, and independent of developer promises.
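
One way to read "metadata-only" is that the producer shares a digest of the artefact rather than the artefact itself, and the verifier compares that digest with a recorded provenance entry. The sketch below assumes exactly that split; the function names and record fields are hypothetical, and a real system would also sign the provenance record, which is omitted here.

```python
import hashlib
import hmac


def describe_artifact(data: bytes) -> dict:
    """Producer side: derive shareable metadata without revealing the content."""
    return {"sha256": hashlib.sha256(data).hexdigest(), "size": len(data)}


def matches_provenance(metadata: dict, provenance: dict) -> bool:
    """Verifier side: compare reported metadata with the recorded provenance."""
    same_digest = hmac.compare_digest(metadata["sha256"], provenance["sha256"])
    return same_digest and metadata["size"] == provenance["size"]
```

The verifier never touches source or binaries; it only sees two small records, so a mismatch is detectable without widening who can read the code.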

Procurement and Policy: When Compliance Becomes Combat

The implications extend far beyond Anduril and Palantir. Procurement frameworks themselves require reform. For decades, contracts rewarded milestones — prototypes delivered, demos staged, systems deployed. Very few tied payment to verified security outcomes.

Future defence contracts must integrate technical evidence: SBOMs, audit trails, and automated compliance proofs. Continuous monitoring should be a contractual clause, not an afterthought. The Department of Defense’s push towards Zero Trust and CMMC v2 compliance is a start, but implementation must reach code level.

Governments cannot afford to purchase vulnerabilities wrapped in innovation rhetoric. The next generation of military contracting must buy assurance as deliberately as it buys ammunition.

Market Implications: Valuation Meets Validation

The markets reacted predictably: Palantir’s shares slid 7.5 %, while Govini’s valuation swelled with investor confidence. Yet beneath these fluctuations lies a structural shift.

Defence technology is transitioning from narrative-driven valuation to evidence-driven validation. The metric investors increasingly prize is not just recurring revenue but recurring reliability, the ability to prove resilience under audit.

Trust capital, once intangible, is becoming quantifiable. In the next wave of defence-tech funding, startups that embed assurance pipelines will attract the same enthusiasm once reserved for speed alone.

The Lessons of the Armour — Ten Principles for Digital Fortification

For old-school practitioners like me, here are the lessons learnt, viewed through the classic lens of Saltzer and Schroeder’s design principles.

| No. | Modern Principle (Defence-Tech Context) | Saltzer & Schroeder Principle | Practical Interpretation in Modern Systems |
|---|---|---|---|
| 1 | Command DevSecOps: Governance must be embedded, not appended. Every deployment decision is a command decision. | Economy of Mechanism | Keep security mechanisms simple, auditable, and centrally enforced across CI/CD and mission environments. |
| 2 | Segment by Mission: Separate environments and privileges by operational need. | Least Privilege | Each actor, human or machine, receives the minimum access required for the mission window. Segmentation prevents lateral movement. |
| 3 | Log or Lose: No event should be untraceable. | Complete Mediation | Every access request and data flow must be logged and verified in real time. Enforce tamper-evident telemetry to maintain operational integrity. |
| 4 | Vet Third-Party Code: Treat every dependency as a potential adversary. | Open Design | Assume no obscurity. Transparency, reproducible builds and independent review are the only assurance that supply-chain code is safe. |
| 5 | Maintain Live SBOMs: Generate provenance at build and deployment. | Separation of Privilege | Independent verification of artefacts through cryptographic attestation ensures multiple checks before code reaches production. |
| 6 | Embed Rollback Paths: Every deployment must have a controlled retreat. | Fail-Safe Defaults | When uncertainty arises, systems must default to a known-safe state. Rollback or isolation preserves mission continuity. |
| 7 | Automate Anomaly Detection: Treat telemetry as perimeter. | Least Common Mechanism | Shared services such as APIs or pipelines should minimise trust overlap. Automated detectors isolate abnormal behaviour before propagation. |
| 8 | Demand Provenance: Trust only what can be verified cryptographically. | Psychological Acceptability | Verification should be effortless for operators. Provenance and signatures must integrate naturally into existing workflow tools. |
| 9 | Audit AI: Governance must evolve with autonomy. | Separation of Privilege and Economy of Mechanism | Multiple models or oversight nodes should validate AI decisions. Explainability should enhance, not complicate, assurance. |
| 10 | Measure After Assurance: Performance metrics follow proof of security, never precede it. | Least Privilege and Fail-Safe Defaults | Prioritise verifiable assurance before optimisation. Treat security evidence as a precondition for mission performance metrics. |
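
The "tamper-evident telemetry" in the third principle can be approximated with a hash chain, where each log entry commits to the one before it. The sketch below is a minimal version under that assumption; the entry layout and function names are hypothetical, and it keeps the chain in memory rather than in durable storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash, event):
    """Hash the previous entry's hash together with a canonical form of the event."""
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_event(chain, event):
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"event": event, "prev": prev_hash, "hash": _entry_hash(prev_hash, event)}
    )


def verify_chain(chain):
    """Return True only if every entry still matches its recorded hash and link."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing or removing any earlier entry changes its hash and breaks every link after it, so tampering is detectable as long as the latest hash is anchored somewhere the attacker cannot rewrite.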

The Sword and the Shield

The codebase has become the battlefield. Every unchecked commit, every unlogged transaction, carries kinetic consequence.

Anduril and Palantir forged the sword, algorithms that react faster than human cognition. But Govini, and others of its kind, remind us that the shield matters as much as the blade. In warfare, resilience is victory’s quiet architect.

The lesson is not that speed is dangerous, but that speed divorced from verification is indistinguishable from recklessness. The future of defence technology belongs to those who master both: the velocity to innovate and the discipline to ensure that innovation survives contact with reality.

In this new theatre of code and command, it is not the flash of the sword that defines power — it is the assurance of the armour that bears it.

References & Further Reading

  • Mike Stone, Reuters (3 Oct 2025) — “Anduril and Palantir battlefield communication system ‘very high risk,’ US Army memo says.”
  • Samantha Subin, CNBC (10 Oct 2025) — “Govini hits $100 M in annual recurring revenue with Bain Capital investment.”
  • NIST SP 800-218: Secure Software Development Framework (SSDF).
  • U.S. DoD Zero-Trust Strategy (2024).
  • MITRE ATT&CK for Defence Systems.

The Npm Breach: What It Reveals About Software Supply Chain Security

When a Single Phishing Click Becomes a Global Vulnerability – Meet the Supply Chain’s Weakest Link

1. Phishing-Driven Attack on npm Packages

On 8 September 2025, maintainer Qix fell victim to a highly convincing phishing email impersonating npm support, which led to an unauthorised password reset and the takeover of his account. Attackers injected malicious code into at least 18 widely used packages, including debug and chalk. These are foundational dependencies with around two billion combined weekly downloads. The injected malware intercepts cryptocurrency and Web3 transactions in users’ browsers, redirecting funds to attacker wallets without any visual cues.

2. “s1ngularity” Attack on Nx Build System

On 26 August 2025, attackers leveraged a compromised GitHub Actions workflow to publish malicious versions of Nx and its plugins to npm. These packages executed post-install scripts that scanned infected systems for SSH keys, GitHub/npm tokens, environment variables, cryptocurrency wallet files, and more. Even more disturbing, attackers weaponised developer-facing AI command-line tools, including Claude, Gemini, and Amazon’s Q, using flags such as --yolo and --trust-all-tools to recursively harvest sensitive data, then exfiltrated it to public GitHub repositories named s1ngularity-repository…. The breach is estimated to have exposed 1,000+ developers, 20,000 files, dozens of cloud credentials, and hundreds of valid GitHub tokens, all within just four hours. (TechRadar, Apiiro, Nx, Truesec, Dark Reading, InfoWorld)

What These Incidents Reveal

  • Phishing remains the most potent weapon, even with 2FA in place.
  • Malware now exploits developer trust and AI tools—weaponising familiar assistants as reconnaissance agents.
  • Supply chain attacks escalate rapidly, giving defenders little time to react.

Observability as a Defensive Priority

These events demonstrate that traditional vulnerability scanning alone is insufficient. The new frontier is observability — being able to see what packages and scripts are doing in real time.
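
As a small first step towards that visibility, the sketch below lists which installed packages declare npm install-time lifecycle scripts (`preinstall`, `install`, `postinstall`) in their `package.json`. It only reads manifests and does not observe runtime behaviour, so it is a static complement to the tools discussed below rather than a substitute; the function names are hypothetical.

```python
import json
from pathlib import Path

# npm lifecycle scripts that run automatically during installation.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_hooks(manifest_text):
    """Return the install-time scripts declared in one package.json document."""
    manifest = json.loads(manifest_text)
    if not isinstance(manifest, dict):
        return {}
    scripts = manifest.get("scripts")
    if not isinstance(scripts, dict):
        return {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


def scan_node_modules(root):
    """Map each package.json under root to its install-time scripts, if any."""
    findings = {}
    for manifest_path in Path(root).rglob("package.json"):
        try:
            text = manifest_path.read_text(encoding="utf-8")
            hooks = find_install_hooks(text)
        except (OSError, ValueError):
            # Unreadable or malformed manifests are skipped in this sketch.
            continue
        if hooks:
            findings[str(manifest_path)] = hooks
    return findings
```

Calling `scan_node_modules("node_modules")` in a project directory gives a review list; most entries will be legitimate build steps, so the output is a starting point for triage, not a verdict.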

Examples of Tools and Approaches

  • OX Security
    Provides SBOM (Software Bill of Materials) monitoring and CI/CD pipeline checks, helping detect suspicious post-install scripts and prevent compromised dependencies from flowing downstream. (OX Security)
  • Aikido Security
    Focuses on runtime observability and system behaviour monitoring. Its approach is designed to catch unauthorised resource access or hidden execution paths that could indicate an active supply chain compromise. (Aikido)
  • Academic and open research (OSCAR)
    Demonstrated high accuracy (F1 ≈ 0.95) in detecting malicious npm packages through behavioural metadata analysis. (arXiv)
  • Trace-AI
    Complements the above approaches by using OpenTelemetry-powered tracing to monitor:
    • Package installations
    • Execution of post-install scripts
    • Abnormal system calls and network operations
    Trace-AI, like other observability tools, brings runtime context to the supply chain puzzle, helping teams detect anomalies early. (Trace-AI)

Why Observability Matters

| Without Observability | With Observability Tools |
|---|---|
| Compromise discovered too late | Behavioural anomalies flagged in real time |
| Malware executes silently | Post-install scripts tracked and analysed |
| AI tool misuse invisible | Dangerous flags or recursive harvesting detected |
| Manual triage takes days | Automated alerts shorten incident response |
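
The "dangerous flags" row can be illustrated with a simple check over observed command lines. The sketch below looks only for the two flags named in the Nx incident above, `--yolo` and `--trust-all-tools`; how the command lines are collected (process auditing, shell history, CI logs) is left open, and the function name is hypothetical.

```python
import shlex

# Flags cited in the Nx incident; extend this set as new ones are reported.
RISKY_FLAGS = {"--yolo", "--trust-all-tools"}


def risky_flags_in(command_line):
    """Return the sorted list of risky flags present in a command line."""
    try:
        tokens = shlex.split(command_line)
    except ValueError:
        # Unbalanced quotes: fall back to a plain whitespace split.
        tokens = command_line.split()
    # Treat "--flag=value" the same as "--flag".
    flags = {token.split("=", 1)[0] for token in tokens}
    return sorted(RISKY_FLAGS & flags)
```

A monitoring hook could call `risky_flags_in` on each spawned command and raise an alert when the result is non-empty. Matching on flag names alone is easy to evade, so this is best treated as one cheap signal among several.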

Final Word

These npm breaches show us that trust in open source is no longer enough. Observability must become a primary defensive measure, not an afterthought.

Tools like OX Security, Aikido Security, Trace-AI, and academic advances such as OSCAR all point towards a more resilient future. The real challenge for security teams is to embed observability into everyday workflows before attackers exploit the next blind spot.

References and Further Reading

  • BleepingComputer: npm phishing leads to supply chain compromise (~2 billion downloads/week)
  • The Register: Maintainer phishing and injected crypto-hijack malware
  • Socket.dev: Compromised packages including debug and chalk
  • TechRadar: “s1ngularity” Nx breach
  • Apiiro: Overview of Nx breach and payloads
  • Nx.dev: Official post-mortem
  • Truesec: Supply chain attack analysis
  • InfoWorld: Breach impact on enterprise developers
  • OX Security: Observability for supply chain security
  • arXiv (OSCAR): Malicious npm detection research

Bitnami