Extortion is the New Prize: Threat actors like ShinyHunters target behavioral context over credit cards because it offers higher leverage for blackmail.
The “Zombie Data” Risk: Storing historical analytics from 2021 in 2025 created a massive liability that outlived the vendor contract.
TPRM Must Be Continuous: Static annual questionnaires cannot detect dynamic shifts in vendor risk or smishing-led credential theft.
You can giggle about the subject if you want. The headlines almost invite it. An adult platform. Premium users. Leaked “activity data.” It sounds like internet tabloid fodder.
But behind the jokes is a breach that should make every security leader deeply uncomfortable. On November 8, 2025, reports emerged that the threat actor ShinyHunters targeted Mixpanel, a third-party analytics provider used by Pornhub. While the source of the data is disputed, the impact is not: over 200 million records of premium user activity were reportedly put on the auction block.
The entry point? A depressingly familiar SMS phishing (smishing) attack. One compromised credential. One vendor environment breached. The result? Total exposure of historical context.
Not a Data Sale, an Extortion Play
This breach is not about dumping databases on underground forums for quick cash. ShinyHunters are not just selling data; they are weaponizing it through Supply-Chain Extortion.
The threat is explicit: Pay, or sensitive behavioral data gets leaked. This data is valuable not because it contains CVV codes, but because it contains context.
What users watched.
When and how often they logged in.
Patterns of behavior that can be correlated, de-anonymized, and weaponized.
That kind of dataset is gold for sophisticated phishing operations and blackmail campaigns. In 2025, this is no longer theft. This is leverage.
The “Zombie Data” Problem: Risk Outlives Revenue
Pornhub stated they had not worked with Mixpanel since 2021. Legally, this distinction matters. Operationally, it’s irrelevant.
If data from 2021 is still accessible in 2025, you haven’t offboarded the vendor; you’ve just stopped paying the bill while keeping the risk open. This is “Zombie Data”—historical records that linger in third-party environments long after the business value has expired.
Why Traditional TPRM Fails the Extortion Test
Most Third-Party Risk Management (TPRM) programs are static compliance exercises—annual PDFs and point-in-time attestations. This model fails because:
Risk is Dynamic: A vendor’s security posture can change in the 364 days between audits.
API Shadows: Data flows often expand without re-scoping the original risk assessment.
Incomplete Offboarding: Data deletion is usually “assumed” via a contract clause rather than verified via technical evidence.
Questions That Actually Reduce Exposure
If incidents like this are becoming the “new normal,” it is because we are asking the wrong questions. To secure the modern supply chain, leadership must ask:
Inventory of Flow: Are we continuously aware of what data is flowing to which vendors today—not just at the time of procurement?
Verification of Purge: Do we treat vendor offboarding as a verifiable security event? (Data deletion should be observable, not just a checked box in an email; a minimal sketch of such a check follows this list.)
Contextual Blast Radius: If this vendor is breached, is the data “toxic” enough to fuel an extortion campaign?
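To make the "Verification of Purge" question concrete, here is a minimal sketch of what observable deletion could look like. The vendor endpoints, field names, and token handling are entirely hypothetical; the point is that offboarding should end with collected technical evidence rather than an email confirmation.

```python
# Illustrative only: the "example-vendor" endpoints and fields are hypothetical,
# not any real vendor's API. Offboarding should produce evidence, not a clause.
import datetime
import requests

VENDOR_API = "https://api.example-vendor.com"   # hypothetical
API_TOKEN = "..."                                # scoped, read-only audit token

def verify_vendor_purge(tenant_id: str) -> dict:
    """Collect technical evidence that a former vendor no longer holds our data."""
    headers = {"Authorization": f"Bearer {API_TOKEN}"}

    # 1. Ask the vendor for a signed deletion attestation (if they expose one).
    attestation = requests.get(
        f"{VENDOR_API}/tenants/{tenant_id}/deletion-attestation",
        headers=headers, timeout=10,
    ).json()

    # 2. Independently probe: can any historical records still be queried?
    probe = requests.get(
        f"{VENDOR_API}/tenants/{tenant_id}/events?limit=1",
        headers=headers, timeout=10,
    )
    records_still_readable = probe.status_code == 200 and bool(probe.json())

    return {
        "checked_at": datetime.datetime.utcnow().isoformat(),
        "attestation": attestation,
        "records_still_readable": records_still_readable,
    }
```

If the probe still returns data in 2025 for a contract that ended in 2021, the vendor has not been offboarded; the risk has simply stopped appearing on an invoice.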
You Can Outsource Functions, Not Responsibility
It is tempting to believe that liability clauses will protect your brand. They won’t. When a vendor loses your customer data, your organization pays the reputational price. Your users do not care which API failed, and in 2025, regulators rarely do either.
You can outsource your analytics, your infrastructure, and your speed. But you cannot outsource the accountability for your users’ privacy.
Laugh at the headline if you want. But understand the lesson: the next breach may not come through your front door; it will come through the “trusted” side door you forgot to lock years ago.
Cloudflare’s outage did not just take down a fifth of the Internet. It exposed a truth we often avoid in engineering: complex systems rarely fail because of bad code. They fail because of the invisible assumptions we build into them.
This piece cuts past the memes, the Rust blame game and the instant hot takes to explain what actually broke, why the outrage misfired and what this incident really tells us about the fragility of Internet-scale systems.
If you are building distributed, AI-driven or mission-critical platforms, the key takeaways here will reset how you think about reliability and help you avoid walking away with exactly the wrong lesson from one of the year’s most revealing outages.
1. Setting the Stage: When a Fifth of the Internet Slowed to a Crawl
On 18 November, Cloudflare experienced one of its most significant incidents in recent years. Large parts of the world observed outages or degraded performance across services that underpin global traffic. As always, the Internet reacted the way it knows best: outrage, memes, instant diagnosis delivered with absolute confidence.
Within minutes, social timelines flooded with:
“It must be DNS”
“Rust is unsafe after all”
“This is what happens when you rewrite everything”
“Even Downdetector is down because Cloudflare is down”
Screenshots of broken CSS on Cloudflare’s own status page
Accusations of over-engineering, under-engineering and everything in between
The world wanted a villain. Rust happened to be available. But the actual story is more nuanced and far more interesting. (For the record, I am still not convinced we should rewrite the Linux kernel in Rust!)
2. What Actually Happened: A Clear Summary of Cloudflare’s Report
Cloudflare’s own post-incident write-up is unusually thorough. If you have not read it, you should. In brief:
Cloudflare is in the middle of a major multi-year upgrade of its edge infrastructure, referred to internally as the 20 percent Internet upgrade.
The rollout included a new feature configuration file.
This file contained more than two hundred features for their FL2 component, crossing a size limit that had been assumed but never enforced through guardrails.
The oversized file triggered a panic in the Rust-based logic that validated these configurations.
That panic initiated a restart loop across a large portion of their global fleet.
Because the very nodes that needed to perform a rollback were themselves in a degraded state, Cloudflare could not recover the control plane easily.
This created a cascading, self-reinforcing failure.
Only isolated regions with lagged deployments remained unaffected.
The root cause was a logic-path issue interacting with operational constraints. It had nothing to do with memory safety and nothing to do with Rust’s guarantees.
In other words: the failure was architectural, not linguistic.
3.2 The “unwrap() Is Evil” Argument (I remember writing a blog titled Eval() is not Evil() ~2012)
One of the most widely circulated tweets framed the presence of an unwrap() as a ticking time bomb, casting it as proof that Rust developers “trust themselves too much”. This is a caricature of the real issue.
The error did not arise because of an unwrap(), nor because Rust encourages poor error handling. It arose because:
an unexpected input crossed a limit,
guards were missing,
and the resulting failure propagated in a tightly coupled system.
The same failure would have occurred in Go, Java, C++, Zig, or Python.
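To make that concrete, here is a minimal Python sketch of the failure class. This is emphatically not Cloudflare's code, and the 200-feature limit is only illustrative: a config loader that dies when an assumed limit is exceeded, next to the guard that would have degraded gracefully instead.

```python
# A language-agnostic sketch of the failure class (not Cloudflare's code).
import json

MAX_FEATURES = 200  # an assumed limit, illustrative only

def load_features_unguarded(raw: str) -> list:
    features = json.loads(raw)
    # Implicit assumption baked in: "the file never exceeds the limit."
    # When a routine data change violates that assumption, the process dies:
    # a panic in Rust, an uncaught exception here. Same class of failure.
    if len(features) > MAX_FEATURES:
        raise RuntimeError("feature file exceeds preallocated limit")
    return features

def load_features_guarded(raw: str, fallback: list) -> list:
    """Validate the input and fall back to the last known-good configuration."""
    try:
        features = json.loads(raw)
    except json.JSONDecodeError:
        return fallback
    if len(features) > MAX_FEATURES:
        # Reject the bad config, keep serving with the previous one,
        # and alert a human instead of crash-looping the fleet.
        return fallback
    return features
```

The difference between the two functions is not the language; it is whether the assumption is enforced as a guardrail with a safe fallback.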
3.3 Transparency Misinterpreted as Guilt
Cloudflare did something rare in our industry. They published the exact code that failed. This was interpreted by some as:
“Here is the guilty line. Rust did it.”
In reality, Cloudflare’s openness is an example of mature engineering culture. More on that later.
4. The Internet Rage Cycle: Humour, Oversimplification and Absolute Certainty
The memes and tweets around this outage are not just entertainment. They reveal how the broader industry processes complex failure.
4.1 The ‘Everything Balances on Open Source’ Meme
Images circulated showing stacks of infrastructure teetering on boxes labelled DNS, Linux Foundation and unpaid open source developers, with Big Tech perched precariously on top.
This exaggeration contains a real truth. We live in a dependency monoculture. A few layers of open source and a handful of service providers hold up everything else.
The meme became shorthand for Internet fragility.
4.2 The ‘It Was DNS’ Routine
The classic: “It is not DNS. It cannot be DNS. It was DNS.”
Except this time, it was not DNS.
Yet the joke resurfaces because DNS has become the folk villain for any outage. People default to the easiest mental shortcut.
4.3 The Rust Panic Narrative
Tweets claiming:
“Cloudflare rewrote in Rust, and half the Internet went down 53 days later.”
This inference is wrong, but emotionally satisfying. People conflate correlation with causation because it creates a simple story: rewrites are dangerous.
4.4 The Irony of Downdetector Being Down
The screenshot of Downdetector depending on Cloudflare and therefore failing is both funny and revealing. This outage demonstrated how deeply intertwined modern platforms are. It is an ecosystem issue, not a Cloudflare issue.
4.5 But There Were Also Good Takes
Kelly Sommers’ observation that Cloudflare published source code is a reminder that not everyone jumped to outrage.
There were pockets of maturity. Unfortunately, they were quieter than the noise.
5. The Real Lessons for Engineering Leaders
This is the part worth reading slowly if you build distributed systems.
Lesson 1: Reliability Is an Architecture Choice, Not a Language Choice
You can build fragile systems in safe languages and robust systems in unsafe languages. Language is orthogonal to architectural resilience.
Lesson 2: Guardrails Matter More Than Guarantees
Rust gives memory safety. It does not give correctness safety. It does not give assumption safety. It does not give rollout safety.
You cannot outsource judgment.
Lesson 3: Blast Radius Containment Is Everything
Uniform rollouts are dangerous.
Synchronous edge updates are dangerous.
Large global fleets need layered fault domains.
Cloudflare knows this. This incident will accelerate their work here.
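As a rough illustration of what staged rollouts across fault domains look like, here is a hedged sketch; the fleet model, wave sizes, and health check are placeholders, not any real deployment system.

```python
# A sketch of blast-radius containment: roll out in waves across fault domains
# and stop at the first sign of trouble. Nodes here are placeholder objects
# exposing apply() and rollback(); is_healthy() is a caller-supplied check.
import time

WAVES = [0.01, 0.05, 0.25, 1.00]   # fraction of the fleet per wave

def rollout(config, fleet, is_healthy, soak_seconds=300):
    deployed = []
    for fraction in WAVES:
        target = int(len(fleet) * fraction)
        for node in fleet[len(deployed):target]:
            node.apply(config)
            deployed.append(node)
        time.sleep(soak_seconds)  # let the change soak before widening
        if not all(is_healthy(node) for node in deployed):
            # Contain the blast radius: roll back only what has been touched.
            for node in deployed:
                node.rollback()
            raise RuntimeError("rollout halted: health regression detected")
    return deployed
```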
Lesson 4: Control Planes Must Be Resilient Under Their Worst Conditions
The control plane was unreachable when it was needed most. This is a classic distributed systems trap: the emergency mechanism relies on the unhealthy components.
Always test:
rollback unavailability
degraded network conditions
inconsistent state recovery
Lesson 5: Complexity Fails in Complex Ways
The system behaved exactly as designed. That is the problem. Emergent behaviour in large networks cannot be reasoned about purely through local correctness.
This is where most teams misjudge their risk.
6. Additional Lesson: Accountability and Transparency Are Strategic Advantages
This incident highlighted something deeper about Cloudflare’s culture.
They did not hide behind ambiguity. They did not release a PR-approved statement with vague phrasing.
They published:
the timeline
the diagnosis
the exact code
the root cause
the systemic contributors
the ongoing mitigation plan
This level of transparency is uncomfortable. It puts the organisation under a microscope. Yet it builds trust in a way no marketing claim can.
Transparency after failure is not just ethical. It is good engineering. Very few people highlighted this; my man Gergely Orosz was one of the few who did.
Most companies will never reach this level of accountability. Cloudflare raised the bar.
7. What This Outage Tells Us About the State of the Internet
This was not a Cloudflare problem; it is a reminder of our shared dependency.
Too much global traffic flows through too few choke points.
Too many systems assume perfect availability from upstream.
Too many platforms synchronise their rollouts.
Too many companies run on infrastructure they did not build and cannot control.
The memes were not wrong. They were simply incomplete.
8. Final Thoughts: Rust Did Not Fail. Our Assumptions Did.
Outages like this shape the future of engineering. The worst thing the industry can do is learn the wrong lesson.
This was not:
a Rust failure
a rewrite failure
an open source failure
a Cloudflare hubris story
This was a systems-thinking failure. A reminder that assumptions are the most fragile part of any distributed system. A demonstration of how tightly coupled global infrastructure has become. A case study in why architecture always wins over language debates.
Cloudflare’s transparency deserves respect. Their engineering culture deserves attention. And the outrage cycle deserves better scepticism.
Because the Internet did not go down because of Rust. It went down because the modern Internet is held together by coordination, trust, and layered assumptions that occasionally collide in surprising ways.
If we want a more resilient future, we need less blame and more understanding. Less certainty and more curiosity. Less language tribalism and more systems design thinking.
The Internet will fail again. The question is whether we learn or react.
Cloudflare learned. The rest of us should too!
What Is the Truth About the Viral MIT Study Claiming 95% of AI Deployments Are a Failure – A Critical Analysis
Introduction: The Statistic That Shook the AI World
When headlines screamed that 95% of AI projects fail, the internet erupted. Boards panicked, investors questioned their bets, and LinkedIn filled with hot takes. The claim, sourced from an MIT NANDA report, became a viral talking point across Fortune, Axios, Forbes, and Tom’s Hardware (some of my regular reads), as well as more than ten Substack newsletters that landed in my inbox, dissecting the story. But how much truth lies behind this statistic? Is the AI revolution really built on a 95% failure rate, or is the reality far more nuanced?
This article takes a grounded look at what MIT actually studied, where the media diverged from the facts, and what business leaders can learn to transform AI pilots into measurable success stories.
The Viral 95% Claim: Context and Origins
What the MIT Study Really Said
The report, released in mid‑2025, examined over 300 enterprise Generative AI pilot projects and found that only 5% achieved measurable profit and loss impact. Its focus was narrow, centred on enterprise Generative AI (GenAI) deployments that automated content creation, analytics, or decision‑making processes.
Narrow Scope, Broad Misinterpretation
The viral figure sounds alarming, yet it represents a limited dataset. The study assessed short‑term pilots and defined success purely in financial terms within twelve months. Many publications mistakenly generalised this to mean all AI initiatives fail, converting a specific cautionary finding into a sweeping headline.
Methodology and Media Amplification
What the Report Actually Measured
The MIT researchers used surveys, interviews, and case studies across industries. Their central finding was simple: technical success rarely equals business success. Many AI pilots met functional requirements but failed to integrate into core operations. Common causes included:
Poor integration with existing systems
Lack of process redesign and staff engagement
Weak governance and measurement frameworks
The few successes focused on back‑office automation, supply chain optimisation, and compliance efficiency rather than high‑visibility customer applications.
How the Media Oversimplified It
Writers such as Forbes contributor Andrea Hill and the Marketing AI Institute noted that the report defined “failure” narrowly, linking it to short‑term financial metrics. Yet outlets like Axios and TechRadar amplified the “95% fail” headline without context, feeding a viral narrative that misrepresented the nuance of the original findings.
Case Study 1: Retail Personalisation Gone Wrong
One of the case studies cited in Fortune and Tom’s Hardware involved a retail conglomerate that launched a GenAI‑driven personalisation engine across its e‑commerce sites. The goal was to revolutionise product recommendations using behavioural data and generative content. After six months, however, the project was halted due to three key issues:
Data Fragmentation: Customer information was inconsistent across regions and product categories.
Governance Oversight: The model generated content that breached brand guidelines and attracted regulatory scrutiny under UK GDPR.
Cultural Resistance: Marketing teams were sceptical of AI‑generated messaging and lacked confidence in its transparency.
MIT categorised the case as a “technical success but organisational failure”. The system worked, yet the surrounding structure did not evolve to support it. This demonstrates a classic case of AI readiness mismatch: advanced technology operating within an unprepared organisation.
Case Study 2: Financial Services Success through Governance
Conversely, the report highlighted a financial services firm that used GenAI to summarise compliance reports and automate elements of regulatory submissions. Initially regarded as a modest internal trial, it delivered measurable impact:
45% reduction in report generation time
30% fewer manual review errors
Faster auditor sign‑off and improved compliance accuracy
Unlike the retail example, this organisation approached AI as a governed augmentation tool, not a replacement for human judgement. They embedded explainability and traceability from the outset, incorporating human checkpoints at every stage. The project became one of the few examples of steady, measurable ROI—demonstrating that AI governance and cultural alignment are decisive success factors.
What the Press Got Right (and What It Missed)
Where the Media Was Accurate
Despite the sensational tone, several publications identified genuine lessons:
Integration and readiness are greater obstacles than algorithms.
Governance and process change remain undervalued.
KPIs for AI success are often poorly defined or absent.
What They Overlooked
Most commentary ignored the longer adoption cycle required for enterprise transformation:
Time Horizons: Real returns often appear after eighteen to twenty‑four months.
Hidden Gains: Productivity, compliance, and efficiency improvements often remain off the books.
Sector Differences: Regulated industries must balance caution with innovation.
The Real Takeaway: Why Most AI Pilots Struggle
It Is Not the AI, It Is the Organisation
The high failure rate underscores organisational weakness rather than technical flaws. Common pitfalls include:
Hype‑driven use cases disconnected from business outcomes
Weak change management and poor adoption
Fragmented data pipelines and ownership
Undefined accountability for AI outputs
The Governance Advantage
Success correlates directly with AI governance maturity. Frameworks such as ISO 42001, NIST AI RMF, and the EU AI Act provide the structure and accountability that bridge the gap between experimentation and operational success. Firms adopting these frameworks experience faster scaling and more reliable ROI.
Turning Pilots into Profit: The Enterprise Playbook
Start with Measurable Impact: Target compliance automation or internal processes where results are visible.
Design for Integration Early: Align AI outputs with established data and workflow systems.
Balance Build and Buy: Work with trusted partners for scalability while retaining data control.
Define Success Before You Deploy: Create clear metrics for success, review cycles, and responsible ownership.
Govern from Day One: Build explainability, ethics, and traceability into every layer of the system.
What This Means for Boards and Investors
The greatest risk for boards is not AI failure but premature scaling. Decision‑makers should:
Insist on readiness assessments before funding.
Link AI investments to ROI‑based performance indicators.
Recognise that AI maturity reflects governance maturity.
By treating governance as the foundation for value creation, organisations can avoid the pitfalls that led to the 95% failure headline.
Conclusion: Separating Signal from Noise
The viral MIT “95% failure” claim is only partly accurate. It exposes an uncomfortable truth: AI does not fail, organisations do. The underlying issue is not faulty algorithms but weak governance, unclear measurement, and over‑inflated expectations.
True AI success emerges when technology, people, and governance work together under measurable outcomes. Those who build responsibly and focus on integration, ethics, and transparency will ultimately rewrite the 95% narrative.
References and Further Reading
Axios. (2025, August 21). AI on Wall Street: Big Tech’s Reality Check.
Tom’s Hardware. (2025, August 21). 95 Percent of Generative AI Implementations in Enterprise Have No Measurable Impact on P&L.
Fortune. (2025, August 21). Why 95% of AI Pilots Fail and What This Means for Business Leaders.
Marketing AI Institute. (2025, August 22). Why the MIT Study on AI Pilots Should Be Read with Caution.
Forbes – Andrea Hill. (2025, August 21). Why 95% of AI Pilots Fail and What Business Leaders Should Do Instead.
TechRadar. (2025, August 21). American Companies Have Invested Billions in AI Initiatives but Have Basically Nothing to Show for It.
Investors.com. (2025, August 22). Why the MIT Study on Enterprise AI Is Pressuring AI Stocks.
It started innocently enough. Morning coffee, post-workout calm, a quick “Computer, drop in on my son.”
Instead of his sleepy grin, I got the polite but dreaded:
“There is an error. Please try again later.”
- Alexa (I call it “Computer” as a wannabe captain of the NCC-1701-E)
Moments later, I realised it wasn’t my internet or device. It was AWS again.
A Familiar Failure in a Familiar Region
If the cloud has a heartbeat, it beats somewhere beneath Northern Virginia.
That is the home of US-EAST-1, Amazon Web Services’ oldest and busiest region, and the digital crossroad through which a large share of the internet’s authentication, routing, and replication flows. It is also the same region that keeps reminding the world that redundancy and resilience are not the same thing.
In December 2022, a cascading power failure at US-EAST-1 set off a chain of interruptions that took down significant parts of the internet, including internal AWS management consoles. Engineers left that incident speaking of stronger isolation and better regional independence.
Three years later, the lesson has returned. The cause may differ, but the pattern feels the same.
The Current Outage
As of this afternoon, AWS continues to battle a widespread disruption in US-EAST-1. The issue began early on 20 October 2025, with elevated error rates across DynamoDB, Route 53, and related control-plane components.
The impact has spread globally.
Snapchat, Ring, and Duolingo have reported downtime.
Lloyds Bank and several UK financial platforms are seeing degraded service.
Even Alexa devices have stopped responding, producing the same polite message: “There is an error. Please try again later.”
For anyone who remembers 2022, it feels uncomfortably familiar. The more digital life concentrates in a handful of hyperscale regions, the more we all share the consequences when one of them fails.
The Pattern Beneath the Problem
Both the 2022 and 2025 US-EAST-1 events reveal the same architectural weakness: control-plane coupling.
Workloads may be distributed across regions, yet many still rely on US-EAST-1 for:
IAM token validation
DynamoDB global tables metadata
Route 53 DNS propagation
S3 replication management
When that single region falters, systems elsewhere cannot authenticate, replicate, or even resolve DNS. The problem is not the hardware; it is that so many systems rely on a single control layer.
What makes today’s event more concerning is how little has changed since the last one. The fragility is known, yet few businesses have redesigned their architectures to reduce the dependency.
How Zerberus Responded to the Lesson
When we began building Zerberus, we decided that no single region or provider should ever be critical to our uptime. That choice was not born from scepticism but from the experience of building two other platforms that served millions of users across four continents.
Our products, Trace-AI, ComplAI™, and ZSBOM, deliver compliance and security automation for organisations that cannot simply wait for the cloud to recover. We chose to design for failure as a permanent condition rather than a rare event.
Inside the Zerberus Architecture
Our production environment operates across five regions: London, Ireland, Frankfurt, Oregon, and Ohio. The setup follows an active-passive pattern with automatic failover.
Two additional warm standby sites receive limited live traffic through Cloudflare load balancers. When one of these approaches a defined load threshold, it scales up and joins the active pool without manual intervention.
Multi-Cloud Distribution
AWS runs the primary compute and SBOM scanning workloads.
Azure carries the secondary inference pipelines and compliance automation modules.
Digital Ocean maintains an independent warm standby, ensuring continuity even if both AWS and Azure suffer regional difficulties.
This diversity is not a marketing exercise. It separates operational risk, contractual dependence, and control-plane exposure across multiple vendors.
Network Edge and Traffic Management
At the edge, Cloudflare provides:
Global DNS resolution and traffic steering
Web application firewalling and DDoS protection
Health-based routing with zero-trust enforcement
By externalising DNS and routing logic from AWS, we avoid the single-plane dependency that is now affecting thousands of services.
Data Sovereignty and Isolation
All client data remains within each client’s own VPC. Zerberus only collects aggregated pass/fail summaries and compliance evidence metadata.
Databases replicate across multiple Availability Zones, and storage is separated by jurisdiction. UK data remains in the UK; EU data remains in the EU. This satisfies regulatory boundaries and limits any failure to its own region.
Observability and Auto-Recovery
Telemetry is centralised in Grafana, while Cloudflare health checks trigger regional routing changes automatically. If a scanning backend becomes unavailable, queued SBOM analysis tasks shift to a healthy region within seconds.
Even during an event such as the present AWS disruption, Zerberus continues to operate—perhaps with reduced throughput, but never completely offline.
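As a rough illustration (not our actual production code), the routing decision can be reduced to a small health-gated selection; the region names and health endpoints below are placeholders.

```python
# Illustrative sketch of health-based routing across regions. The region names
# and health endpoints are placeholders, not Zerberus's production configuration;
# in practice the decision is driven by Cloudflare health checks.
import requests

REGION_HEALTH = {
    "eu-west-2": "https://scan.eu-west-2.example.com/healthz",   # London
    "eu-west-1": "https://scan.eu-west-1.example.com/healthz",   # Ireland
    "us-west-2": "https://scan.us-west-2.example.com/healthz",   # Oregon
}

def healthy_regions() -> list[str]:
    """Return regions whose scanning backend currently answers its health check."""
    alive = []
    for region, url in REGION_HEALTH.items():
        try:
            if requests.get(url, timeout=2).status_code == 200:
                alive.append(region)
        except requests.RequestException:
            continue
    return alive

def choose_region(preferred: str = "eu-west-2") -> str:
    """Pick where a queued SBOM scan should run right now; the caller enqueues there."""
    candidates = healthy_regions()
    if not candidates:
        raise RuntimeError("no healthy region available; task stays queued")
    return preferred if preferred in candidates else candidates[0]
```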
Learning from 2022
The 2022 outage made clear that availability zones do not guarantee availability. The 2025 incident reinforces that message.
At Zerberus, we treat resilience as a practice, not a promise. We simulate network blackouts, DNS failures, and database unavailability. We measure recovery time not in theory but in behaviour. These tests are themselves automated and monitored, because the cost of complacency is always greater than the cost of preparation.
Regulation and Responsibility
Europe’s Cyber Resilience Act and NIS2 Directive are closing the gap between regulatory theory and engineering reality. Resilience is no longer an optional control; it is a legal expectation.
A multi-region, multi-cloud, data-sovereign architecture is now both a technical and regulatory necessity. If a hyperscaler outage can lead to non-compliance, the responsibility lies in design, not in the service-level agreement.
Designing for the Next Outage
US-EAST-1 will recover; it always does. The question is how many services will redesign themselves before the next event.
Every builder now faces a decision: continue to optimise for convenience or begin engineering for continuity.
The 2022 failure served as a warning. The 2025 outage confirms the lesson. By the next one, any excuse will sound outdated.
Final Thoughts
The cloud remains one of the greatest enablers of our age, but its weaknesses are equally shared. Each outage offers another chance to refine, distribute, and fortify what we build.
At Zerberus, we accept that the cloud will falter from time to time. Our task is to ensure that our systems, and those of our clients, do not falter with it.
(This article reflects an ongoing incident. For live updates, refer to the AWS Status Page and technology news outlets such as BBC Tech and The Independent.)
A Chink in Palantir and Anduril’s Armour? Govini and Others Are Unsheathing the Sword
When Silicon Valley Code Marches to War
A U.S. Army Chinook rises over Gyeonggi Province, carrying not only soldiers and equipment but streams of battlefield telemetry, encrypted packets of sight, sound and position. Below, sensors link to vehicles, commanders to drones, decisions to data. Yet a recent Army memo reveals a darker subtext: the very network binding these forces together has been declared “very high risk.”
The battlefield is now a software construct. And the architects of that code are not defence primes from the industrial era but Silicon Valley firms, Anduril and Palantir. For years, they have promised that agility, automation and machine intelligence could redefine combat efficiency. But when an internal memo brands their flagship platform “fundamentally insecure,” the question is no longer about innovation. It is about survival.
Just as the armour shows its first cracks, another company, Govini, crosses $100 million in annual recurring revenue, sharpening its own blade in the same theatre.
When velocity becomes virtue and verification an afterthought, the chink in the armour often starts in the code.
The Field Brief
A U.S. Army CTO memo calls Anduril–Palantir’s NGC2 communications platform “very high risk.”
Vulnerabilities: unrestricted access, missing logs, unvetted third-party apps, and hundreds of critical flaws.
Palantir’s stock drops 7%; Anduril dismisses the findings as outdated.
Meanwhile, Govini surpasses $100M ARR with $150M in funding from Bain Capital.
The new arms race is not hardware; it is assurance.
Silicon Valley’s March on the Pentagon
For over half a century, America’s defence economy was dominated by industrial giants, Lockheed Martin, Boeing, and Northrop Grumman. Their reign was measured in steel, thrust and tonnage. But the twenty-first century introduced a new class of combatant: code.
Palantir began as an analytics engine for intelligence agencies, translating oceans of data into patterns of threat. Anduril followed as the hardware-agnostic platform marrying drones, sensors and AI decision loops into one mesh of command. Both firms embodied the “move fast” ideology of Silicon Valley, speed as a substitute for bureaucracy.
The Pentagon, fatigued by procurement inertia, welcomed the disruption. Billions flowed to agile software vendors promising digital dominance. Yet agility without auditability breeds fragility. And that fragility surfaced in the Army’s own words.
Inside the Memo: The Code Beneath the Uniform
The leaked memo, authored by Army CTO Gabriele Chiulli, outlines fundamental failures in the Next-Generation Command and Control (NGC2) prototype, a joint effort by Anduril, Palantir, Microsoft and others.
“We cannot control who sees what, we cannot see what users are doing, and we cannot verify that the software itself is secure.”
-US Army Memo
The findings are stark: users at varying clearance levels could access all data; activity logging was absent; several embedded applications had not undergone Army security assessment; one revealed twenty-five high-severity vulnerabilities, while others exceeded two hundred.
Translated into security language, the platform lacks role-based access control, integrity monitoring, and cryptographic segregation of data domains. Strategically, this means command blindness: an adversary breaching one node could move laterally without a trace.
In the lexicon of cyber operations, that is not “high risk.” It is mission failure waiting for confirmation.
The Doctrine of Velocity
Anduril’s rebuttal was swift. The report, they claimed, represented “an outdated snapshot.” Palantir insisted that no vulnerabilities were found within its own platform.
Their responses echo a philosophy as old as the Valley itself: innovation first, audit later. The Army’s integration of Continuous Authority to Operate (cATO) sought to balance agility with accountability, allowing updates to roll out in days rather than months. Yet cATO is only as strong as the telemetry beneath it. Without continuous evidence, continuous authorisation becomes continuous exposure.
This is the paradox of modern defence tech: DevSecOps without DevGovernance. A battlefield network built for iteration risks treating soldiers as beta testers.
Govini’s Counteroffensive: Discipline over Demos
While Palantir’s valuation trembled, Govini’s ascended. The Arlington-based startup announced $100 million in annual recurring revenue and secured $150 million from Bain Capital. Its CEO, Tara Murphy Dougherty — herself a former Palantir executive — emphasised the company’s growth trajectory and its $900 million federal contract portfolio.
Govini’s software, Ark, is less glamorous than autonomous drones or digital fire-control systems. It maps the U.S. military’s supply chain, linking procurement, logistics and readiness. Where others promise speed, Govini preaches structure. It tracks materials, suppliers and vulnerabilities across lifecycle data — from the factory floor to the frontline.
If Anduril and Palantir forged the sword of rapid innovation, Govini is perfecting its edge. Precision, not pace, has become its competitive advantage. In a field addicted to disruption, Govini’s discipline feels almost radical.
Technical Reading: From Vulnerability to Vector
The NGC2 memo can be interpreted through a simple threat-modelling lens:
Privilege Creep → Data Exposure: excessive permissions allow information spillage across clearance levels.
Missing Telemetry → Undetected Movement: absent activity logging means lateral movement leaves no trail.
Unvetted Dependencies → Supply-Chain Vector: third-party applications that skipped Army security assessment import unknown risk into the command network.
These patterns mirror civilian software ecosystems: typosquatted dependencies on npm, poisoned PyPI packages, unpatched container images. The military variant merely amplifies consequences; a compromised package here could redirect an artillery feed, not a webpage.
Modern defence systems must therefore adopt commercial best practice at military scale: Software Bills of Materials (SBOMs), continuous vulnerability correlation, maintainer-anomaly detection, and cryptographic provenance tracking.
Metadata-only validation, verifying artefacts without exposing source, is emerging as the new battlefield armour. Security must become declarative, measurable, and independent of developer promises.
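A minimal sketch of what metadata-only validation can look like in practice, assuming a manifest of expected digests derived from an SBOM; the paths and manifest format are illustrative rather than any specific tool's output.

```python
# Sketch of metadata-only validation: compare deployed artefacts against a
# manifest of expected digests (e.g. derived from an SBOM) without shipping
# or exposing the source itself. Manifest format is illustrative.
import hashlib
import json
from pathlib import Path

def sha256(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_artifacts(manifest_file: str, root: str) -> list[str]:
    """Return the artefacts whose on-disk digest deviates from the manifest."""
    manifest = json.loads(Path(manifest_file).read_text())  # {"relative/path": "digest", ...}
    drift = []
    for rel_path, expected in manifest.items():
        actual_path = Path(root) / rel_path
        if not actual_path.exists() or sha256(actual_path) != expected:
            drift.append(rel_path)
    return drift
```

Any non-empty result is evidence the deployed system no longer matches what was attested at build time, which is precisely the signal cATO needs to remain meaningful.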
Procurement and Policy: When Compliance Becomes Combat
The implications extend far beyond Anduril and Palantir. Procurement frameworks themselves require reform. For decades, contracts rewarded milestones — prototypes delivered, demos staged, systems deployed. Very few tied payment to verified security outcomes.
Future defence contracts must integrate technical evidence: SBOMs, audit trails, and automated compliance proofs. Continuous monitoring should be a contractual clause, not an afterthought. The Department of Defense’s push towards Zero Trust and CMMC v2 compliance is a start, but implementation must reach code level.
Governments cannot afford to purchase vulnerabilities wrapped in innovation rhetoric. The next generation of military contracting must buy assurance as deliberately as it buys ammunition.
Market Implications: Valuation Meets Validation
The markets reacted predictably: Palantir’s shares slid 7.5%, while Govini’s valuation swelled with investor confidence. Yet beneath these fluctuations lies a structural shift.
Defence technology is transitioning from narrative-driven valuation to evidence-driven validation. The metric investors increasingly prize is not just recurring revenue but recurring reliability, the ability to prove resilience under audit.
Trust capital, once intangible, is becoming quantifiable. In the next wave of defence-tech funding, startups that embed assurance pipelines will attract the same enthusiasm once reserved for speed alone.
The Lessons of the Armour — Ten Principles for Digital Fortification
For practitioners like me (old school), here are the lessons learnt, viewed through the classic lens of Saltzer and Schroeder.
1. Command DevSecOps – Governance must be embedded, not appended. Every deployment decision is a command decision. (Saltzer & Schroeder: Economy of Mechanism.) In practice: keep security mechanisms simple, auditable, and centrally enforced across CI/CD and mission environments.
2. Segment by Mission – Separate environments and privileges by operational need. (Least Privilege.) Each actor, human or machine, receives the minimum access required for the mission window. Segmentation prevents lateral movement.
3. Log or Lose – No event should be untraceable. (Complete Mediation.) Every access request and data flow must be logged and verified in real time. Enforce tamper-evident telemetry to maintain operational integrity.
4. Vet Third-Party Code – Treat every dependency as a potential adversary. (Open Design.) Assume no obscurity. Transparency, reproducible builds and independent review are the only assurance that supply-chain code is safe.
5. Maintain Live SBOMs – Generate provenance at build and deployment. (Separation of Privilege.) Independent verification of artefacts through cryptographic attestation ensures multiple checks before code reaches production.
6. Embed Rollback Paths – Every deployment must have a controlled retreat. (Fail-Safe Defaults.) When uncertainty arises, systems must default to a known-safe state. Rollback or isolation preserves mission continuity.
7. Automate Anomaly Detection – Treat telemetry as perimeter. (Least Common Mechanism.) Shared services such as APIs or pipelines should minimise trust overlap. Automated detectors isolate abnormal behaviour before propagation.
8. Demand Provenance – Trust only what can be verified cryptographically. (Psychological Acceptability.) Verification should be effortless for operators. Provenance and signatures must integrate naturally into existing workflow tools.
9. Audit AI – Governance must evolve with autonomy. (Separation of Privilege and Economy of Mechanism.) Multiple models or oversight nodes should validate AI decisions. Explainability should enhance, not complicate, assurance.
10. Measure After Assurance – Performance metrics follow proof of security, never precede it. (Least Privilege and Fail-Safe Defaults.) Prioritise verifiable assurance before optimisation. Treat security evidence as a precondition for mission performance metrics.
The Sword and the Shield
The codebase has become the battlefield. Every unchecked commit, every unlogged transaction, carries kinetic consequence.
Anduril and Palantir forged the sword, algorithms that react faster than human cognition. But Govini, and others of its kind, remind us that the shield matters as much as the blade. In warfare, resilience is victory’s quiet architect.
The lesson is not that speed is dangerous, but that speed divorced from verification is indistinguishable from recklessness. The future of defence technology belongs to those who master both: the velocity to innovate and the discipline to ensure that innovation survives contact with reality.
In this new theatre of code and command, it is not the flash of the sword that defines power — it is the assurance of the armour that bears it.
References & Further Reading
Mike Stone, Reuters (3 Oct 2025) — “Anduril and Palantir battlefield communication system ‘very high risk,’ US Army memo says.”
Samantha Subin, CNBC (10 Oct 2025) — “Govini hits $100 M in annual recurring revenue with Bain Capital investment.”
NIST SP 800-218: Secure Software Development Framework (SSDF).
U.S. DoD Zero-Trust Strategy (2024).
MITRE ATT&CK for Defence Systems.
The Npm Breach: What It Reveals About Software Supply Chain Security
When a Single Phishing Click Becomes a Global Vulnerability – Meet the Supply Chain’s Weakest Link
1. Phishing-Driven Attack on npm Packages
On 8 September 2025, maintainer Qix fell victim to a highly convincing phishing email impersonating npm support, which led to an unauthorised password reset and the takeover of his account. Attackers injected malicious code into at least 18 widely used packages — including debug and chalk. These are foundational dependencies with around two billion combined weekly downloads. The injected malware intercepts cryptocurrency and Web3 transactions in users’ browsers, redirecting funds to attacker wallets without any visual cues.
2. “s1ngularity” Attack on Nx Build System
On 26 August 2025, attackers leveraged a compromised GitHub Actions workflow to publish malicious versions of Nx and its plugins to npm. These packages executed post-install scripts that scanned infected systems for SSH keys, GitHub/npm tokens, environment variables, cryptocurrency wallet files, and more. More disturbingly, attackers weaponised developer-facing AI command-line tools—including Claude, Gemini, and Amazon’s Q—using flags like --yolo and --trust-all-tools to recursively harvest sensitive data, then exfiltrated it to public GitHub repositories named s1ngularity-repository…. The breach is estimated to have exposed 1,000+ developers, 20,000 files, dozens of cloud credentials, and hundreds of valid GitHub tokens, all within just four hours. (Sources: TechRadar, Apiiro, Nx, Truesec, Dark Reading, InfoWorld)
What These Incidents Reveal
Phishing remains the most potent weapon, even with 2FA in place.
Malware now exploits developer trust and AI tools—weaponising familiar assistants as reconnaissance agents.
Supply chain attacks escalate rapidly, giving defenders little time to react.
Observability as a Defensive Priority
These events demonstrate that traditional vulnerability scanning alone is insufficient. The new frontier is observability — being able to see what packages and scripts are doing in real time.
Examples of Tools and Approaches
OX Security: provides SBOM (Software Bill of Materials) monitoring and CI/CD pipeline checks, helping detect suspicious post-install scripts and prevent compromised dependencies from flowing downstream. (OX Security)
Aikido Security: focuses on runtime observability and system behaviour monitoring. Its approach is designed to catch unauthorised resource access or hidden execution paths that could indicate an active supply chain compromise. (Aikido)
Academic and open research (OSCAR): demonstrated high accuracy (F1 ≈ 0.95) in detecting malicious npm packages through behavioural metadata analysis. (arXiv)
Trace-AI: complements the above approaches by using OpenTelemetry-powered tracing to monitor:
Package installations
Execution of post-install scripts
Abnormal system calls and network operations
Trace-AI, like other observability tools, brings runtime context to the supply chain puzzle, helping teams detect anomalies early. (Trace-AI)
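To illustrate the kind of signal install-time observability looks for, here is a deliberately simplified sketch that flags dependencies whose lifecycle scripts match patterns seen in these campaigns. Real tools correlate far more behavioural signals than a single regex; the heuristics below are assumptions for illustration only.

```python
# Simplified illustration of install-time observability: flag dependencies
# whose lifecycle scripts look like the patterns seen in the September 2025
# campaigns. The regex is a crude, illustrative heuristic.
import json
import re
from pathlib import Path

SUSPICIOUS = re.compile(
    r"(curl|wget|base64|eval\(|child_process|\.ssh|npm_token|--yolo|--trust-all-tools)",
    re.IGNORECASE,
)

def audit_node_modules(root: str = "node_modules") -> list[dict]:
    findings = []
    for pkg_json in Path(root).glob("**/package.json"):
        try:
            manifest = json.loads(pkg_json.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue
        scripts = manifest.get("scripts", {})
        for hook in ("preinstall", "install", "postinstall"):
            command = scripts.get(hook, "")
            if command and SUSPICIOUS.search(command):
                findings.append({
                    "package": manifest.get("name", str(pkg_json.parent)),
                    "hook": hook,
                    "command": command,
                })
    return findings

if __name__ == "__main__":
    for f in audit_node_modules():
        print(f"[!] {f['package']} runs a suspicious {f['hook']}: {f['command']}")
```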
Why Observability Matters
Without observability: compromise is discovered too late. With observability: behavioural anomalies are flagged in real time.
Without observability: malware executes silently. With observability: post-install scripts are tracked and analysed.
Without observability: AI tool misuse stays invisible. With observability: dangerous flags or recursive harvesting are detected.
Without observability: manual triage takes days. With observability: automated alerts shorten incident response.
Final Word
These npm breaches show us that trust in open source is no longer enough. Observability must become a primary defensive measure, not an afterthought.
Tools like OX Security, Aikido Security, Trace-AI, and academic advances such as OSCAR all point towards a more resilient future. The real challenge for security teams is to embed observability into everyday workflows before attackers exploit the next blind spot.
Opening Scene: The Unthinkable Inside Your Digital Fortress
Imagine standing before a vault that holds every secret of your organisation. It is solid, silent and built to withstand brute force. Yet, one day you discover someone walked straight in. No alarms. No credentials. No trace of a break-in. That is what the security community woke up to when researchers disclosed Vault Fault. A cluster of flaws in the very tools meant to guard our digital crown jewels.
Behind the Curtain: The Guardians of Our Secrets
Secrets management platforms such as HashiCorp Vault, CyberArk Conjur, and CyberArk Secrets Manager sit at the heart of modern identity infrastructure. They store API keys, service credentials, encryption keys and more. In DevSecOps pipelines and hybrid environments, they are the trusted custodians. If a vault is compromised, it is not one system at risk. It is every connected system.
Vault Fault Unveiled: A Perfect Storm of Logic Flaws
Security firm Cyata revealed fourteen vulnerabilities spread across CyberArk and HashiCorp’s vault products. These were not just minor configuration oversights. They included:
CyberArk Conjur: IAM authenticator bypass by manipulating how regions are parsed; privilege escalation by authenticating as a policy; and remote code execution by exploiting the ERB-based Policy Factory.
HashiCorp Vault: nine zero-day issues, including the first ever RCE in Vault; bypasses of multi-factor authentication and account lockout logic; user enumeration through subtle timing differences; and escalation by abusing how policies are normalised.
These were chains of logic flaws that could be combined to devastating effect. Attackers could impersonate identities, escalate privileges, execute arbitrary code and exfiltrate secrets without ever providing valid credentials.
The Fallout: When Silent Vaults Explode
Perhaps the most unnerving fact is the age of some vulnerabilities. Several had been present for up to nine years. Quiet, undetected and exploitable. Remote code execution against a secrets vault is the equivalent of giving an intruder the keys to every door in your company. Once inside, they can lock you out, leak sensitive information or weaponise access for extortion.
Response and Remedy: Patch, Shield, Reinvent
Both vendors have issued fixes:
CyberArk Secrets Manager and Self-Hosted versions 13.5.1 and 13.6.1.
CyberArk Conjur Open Source version 1.22.1.
HashiCorp Vault Community and Enterprise editions 1.20.2, 1.19.8, 1.18.13 and 1.16.24.
Cyata’s guidance is direct. Patch immediately. Restrict network exposure of vault instances. Audit and rotate secrets. Minimise secret lifetime and scope. Enable detailed audit logs and monitor for anomalies. CyberArk has also engaged directly with customers to support remediation efforts.
Broader Lessons: Beyond the Fault
The nature of these flaws should make us pause. They were not memory corruption or injection bugs. They were logic vulnerabilities hiding in plain sight. The kind that slip past automated scans and live through version after version.
It is like delegating your IaaS or PaaS to AWS or Azure. They may run the infrastructure, but you are still responsible for meeting your own uptime SLAs. In the same way, even if you store secrets such as credit card numbers, API tokens or encryption keys in a vault, you remain responsible for securing them. The liability for a breach still sits with you.
Startups are especially vulnerable. Many operate under relentless deadlines and tight budgets. They offload everything that is not seen as part of their “core” operations to third parties. This speeds up delivery but also widens the blast radius when those dependencies are compromised. When your vault provider fails, your customers will still hold you accountable.
This should push us to adopt more defensive architectures. Moving towards ephemeral credentials, context-aware access and reducing reliance on long-lived static secrets.
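A generic sketch of that direction, treating secrets as short-lived by construction. Here fetch_credential() stands in for whichever issuing backend you use (Vault, Conjur, a cloud STS); it is not a real client API, just an assumed callable.

```python
# Sketch of handling credentials as ephemeral by construction.
# fetch_credential() is a placeholder for the real issuing backend.
import time
from dataclasses import dataclass

@dataclass
class Credential:
    value: str
    issued_at: float
    ttl_seconds: int

    def expired(self, safety_margin: int = 30) -> bool:
        return time.time() > self.issued_at + self.ttl_seconds - safety_margin

class EphemeralSecret:
    """Never hold a credential longer than its TTL; re-issue instead of reusing."""

    def __init__(self, fetch_credential, ttl_seconds: int = 900):
        self._fetch = fetch_credential     # callable returning a fresh secret string
        self._ttl = ttl_seconds
        self._current: Credential | None = None

    def get(self) -> str:
        if self._current is None or self._current.expired():
            self._current = Credential(self._fetch(), time.time(), self._ttl)
        return self._current.value
```

The design choice is simple: a leaked fifteen-minute credential is a bad afternoon; a leaked nine-year-old static secret is Vault Fault.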
We also need a culture shift. Secrets vaults are not infallible. Their security must be tested continuously. This includes adversarial simulations, code audits and community scrutiny. Trust in security systems is not a one-time grant. It is a relationship that must be earned repeatedly.
Closing Reflection: Trust Must Earn Itself Again
Vault Fault is a reminder that even our most trusted systems can develop cracks. The breach is not in the brute force of an attacker but in the quiet oversight of logic and design. As defenders, we must assume nothing is beyond failure. We must watch the watchers, test the guards and challenge the fortresses we build. Because the next fault may already be there, waiting to be found.
For years, Palantir has been the enigma of Silicon Valley. Once known for its secretive, high-stakes data work with intelligence agencies, it evolved into a cultural force, the nucleus of what many call the “Palantir Mafia.” As explored in previous pieces like Inside the Palantir Mafia: Secrets to Succeeding in the Tech Industry and Startups That Are Quietly Shaping the Future, its alumni have gone on to shape the world in both visible and subterranean ways.
But in 2025, the story pivots. As detailed in Innovation Drain: Is Palantir Losing Its Edge in 2025?, the company’s commercial growth slowed, innovation seemed to stagnate, and cultural relevance dimmed. Yet paradoxically, its profits soared.
Why? Because the leviathan found a new host: the government.
II. Government as a Business Model
Palantir’s Q1 2025 numbers stunned even seasoned analysts: revenue surged to $884 million, putting the company on a nearly $3.6 billion annual run-rate, with a staggering net profit of $214 million. This profitability was driven almost entirely by the public sector. Over 45% of its revenue came from U.S. government contracts, including:
A landmark $10 billion Army enterprise deal that consolidates 75 separate contracts
Active deployments in the FAA, IRS, CDC, and ICE
International expansion through NHS UK and allied defence systems
What makes Palantir unique is not just what it builds, but how it embeds itself. Its software is no longer a tool; it is infrastructure. And once its proprietary data formats and analytical models are woven into an agency’s core processes, the technical debt and operational risk of removal become insurmountable.
III. The Ethical Cost of Indispensability
The WIRED exposé and a suite of supporting reports from The Washington Post and AINvest paint a troubling picture. Government agencies are increasingly outsourcing decision-making logic to Palantir’s opaque algorithms.
Whether it’s CDC pandemic modelling or ICE’s predictive analytics, the lack of transparency and public oversight is striking. The life-altering consequences of these opaque models—whether they influence a quarantine order or a deportation notice—are executed with the full force of the state, yet designed beyond the reach of public scrutiny. And when Business Insider revealed that Palantir executives like Shyam Sankar were simultaneously appointed as reserve officers in the U.S. Army while bidding on DoD contracts, the red flags couldn’t be clearer.
This is not just procurement, it’s institutional capture.
IV. AI, Warfare, and the New Lobbying Order
CEO Alex Karp has been unabashed about his mission. He frames Palantir as a “Pro-Western Values AI company,” standing in contrast to big tech firms that, he claims, are too appeasing of adversaries.
This narrative is powerful, and profitable. The GENIUS Act, signed in July 2025, provides regulatory clarity and budgetary guarantees for AI use in federal operations. Palantir has been instrumental in shaping its language.
Still, competitors are circling. Microsoft and OpenAI are quietly trialling AI models within federal programmes. But their frameworks, often open and research-driven, struggle to match Palantir’s closed, battle-tested ecosystem.
Proponents, including many within the DoD, argue this integration is a feature, not a bug. They contend that Palantir’s platform provides a level of speed and data fusion that legacy systems cannot match, a capability deemed essential for modern warfare. However, this argument sidesteps the fundamental question of whether battlefield efficiency justifies the outsourcing of public accountability.
V. Commercial Plateau vs. Government High
In Innovation Drain, we argued that Palantir had lost its edge. That may still be true, for the private sector. Commercial clients demand agility, flexibility, and measurable ROI. Palantir’s platform, by contrast, is tailored for slow-moving, high-budget bureaucracies.
What looks like stagnation in the tech world is actually peak performance in the defence-industrial complex.
VI. The Palantir Mafia’s Legacy Revisited
Ironically, while Palantir itself morphs into a government fixture, its alumni have diverged sharply. From Anduril’s autonomous defence platforms to Epirus’ directed-energy weapons and a slew of stealth analytics startups, the real innovation has escaped the mothership.
As discussed in Startups That Are Quietly Shaping the Future, these offshoots channel Palantir DNA (aggressive mission focus, stealth operations, and a disdain for Big Tech groupthink) into areas Palantir can no longer touch.
This raises a question: Is Palantir still part of the innovation ecosystem, or has it become a bureaucratic toolsmith for the surveillance state?
VII. Conclusion: Leviathan or Lighthouse?
Palantir is no longer just a company. It is an institution woven into the fabric of multiple governments, operating across domains where civilian oversight is minimal and ethical debate is muted.
Its rise is instructive, not just as a case of business success, but as a warning of what happens when tech monopolies gain state-like permanence. The question isn’t just whether Palantir is profitable. It’s whether we, as citizens, are comfortable with a private company wielding this much invisible power.
In 2025, the true Palantir story isn’t about code. It’s about control.
Build Smarter, Ship Faster: Engineering Efficiency and Security with Pre-Commit
In high-velocity engineering teams, the biggest bottlenecks aren’t always technical; they are organisational. Inconsistent code quality, wasted CI cycles, and preventable security leaks silently erode your delivery speed and reliability. This is where pre-commit transforms from a utility to a discipline.
This guide unpacks how to use pre-commit hooks to drastically improve engineering efficiency and development-time security, with practical tips, real-world case studies, and scalable templates.
The Problem
Time lost in CI failures that could have been caught locally
Onboarding delays due to inconsistent tooling
Pre-Commit to the Rescue
Automates formatting, linting, and static checks
Runs locally before Git commit or push
Ensures only clean code enters your repos
Best Practices for Engineering Velocity
Use lightweight, scoped hooks like black, isort, flake8, eslint, and ruff
Set stages: [pre-commit, pre-push] to optimise local speed
Enforce full project checks in CI with pre-commit run --all-files
Case Study: Engineering Efficiency in D2C SaaS (VC Due Diligence)
While consulting on behalf of a VC firm evaluating a fast-scaling D2C SaaS platform, we observed recurring issues: poor formatting hygiene, inconsistent PEP8 compliance, and prolonged PR cycles. My recommendation was to introduce pre-commit with a standardised configuration.
Within two sprints:
Developer velocity improved with 30% faster code merges
CI resource usage dropped 40% by avoiding trivial build failures
The platform was better positioned for future investment, thanks to a visibly stronger engineering discipline
Shift-Left Security: Prevent Leaks Before They Ship
The Problem
Secrets accidentally committed to Git history
Vulnerable code changes sneaking past reviews
Inconsistent security hygiene across teams
Pre-Commit as a Security Gate
Enforce secret scanning at commit time with tools like detect-secrets, gitleaks, and trufflehog
Standardise secure practices across microservices via shared config
Prevent common anti-patterns (e.g., print debugging, insecure dependencies)
Pre-Commit Security Toolkit
detect-secrets for credential scanning
bandit for Python security static analysis
Custom regex-based hooks for internal secrets
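As an example of that last item, here is a minimal custom hook written in Python. The patterns are illustrative and should be extended with your own internal key formats; the script is wired up as a local repo hook in .pre-commit-config.yaml, which passes the staged filenames as arguments.

```python
#!/usr/bin/env python3
# Minimal custom pre-commit hook: scan staged files for secret-like patterns.
# The patterns below are illustrative; add your organisation's own key formats.
import re
import sys

PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                              # AWS access key id
    re.compile(r"-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----"),  # private key blocks
    re.compile(r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{16,}"),
]

def scan(paths: list[str]) -> int:
    exit_code = 0
    for path in paths:                        # pre-commit passes staged filenames as argv
        try:
            text = open(path, "r", encoding="utf-8", errors="ignore").read()
        except OSError:
            continue
        for pattern in PATTERNS:
            if pattern.search(text):
                print(f"Potential secret in {path} (pattern: {pattern.pattern})")
                exit_code = 1                 # non-zero exit blocks the commit
    return exit_code

if __name__ == "__main__":
    sys.exit(scan(sys.argv[1:]))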
Case Study: Security Posture for HealthTech Startup
During a technical audit for a VC exploring investment in a HealthTech startup handling patient data, I discovered credentials hardcoded in multiple branches. We immediately introduced detect-secrets and bandit via pre-commit.
Impact over the next month:
100% of developers enforced local secret scanning
3 previously undetected vulnerabilities were caught before merging
Their security maturity score, used by the VC’s internal checklist, jumped significantly—securing the next funding round
“One mispatched server in the cloud can ignite a wildfire of trust collapse across 140,000 tenants.”
1. The Context: Why This Matters
In March 2025, a breach at Oracle Cloud shook the enterprise SaaS world. A few hours after Rahul from CloudSEK first flagged signs of a possible compromise, I published an initial analysis titled Is Oracle Cloud Safe? Data Breach Allegations and What You Need to Do Now. That piece was an urgent response to a fast-moving situation, but this article is the reflective follow-up. Here, I break down not just the facts of what happened, but the deeper problem it reveals: the fragility of transitive trust in modern cloud ecosystems.
Threat actor rose87168 leaked nearly 6 million records tied to Oracle’s login infrastructure, affecting over 140,000 tenants. The source? A misconfigured legacy server still running an unpatched version of Oracle Access Manager (OAM) vulnerable to CVE‑2021‑35587.
Initially dismissed by Oracle as isolated and obsolete, the breach was later confirmed via datasets and a tampered page on the login domain itself, captured in archived snapshots. This breach was not just an Oracle problem. It was a supply chain problem. The moment authentication breaks upstream, every SaaS product, platform, and identity provider depending on it inherits the risk, often unknowingly.
Welcome to the age of transitive trust.
2. Anatomy of the Attack
Attack Vector
Exploited: CVE-2021-35587, a critical RCE in Oracle Access Manager.
Payload: Malformed XML allowed unauthenticated remote code execution.
Exploited Asset
Legacy Oracle Cloud Gen1 login endpoints still active (e.g., login.us2.oraclecloud.com).
These endpoints were supposedly decommissioned but remained publicly accessible.
Proof & Exfiltration
Uploaded artefact visible in Wayback Machine snapshots.
7. Lessons Learned
When I sat down to write this, these statements felt too obvious to be called lessons. Of course authentication is production infrastructure, any practitioner would agree. But then why do so few treat it that way? Why don’t we build failovers for our SSO? Why is trust still assumed, rather than validated?
These aren’t revelations. They’re reminders; hard-earned ones.
Transitive trust is NOT NEUTRAL; it is a silent threat multiplier. It embeds risk invisibly into every integration.
Legacy infrastructure never retires itself. If it’s still reachable, it’s exploitable.
Authentication systems deserve production-level fault tolerance. Build them like you’d build your API or Payment Gateway.
Trust is not a diagram to revisit once a year; it must be observable, enforced, and continuously verified.
8. Making the Invisible Visible: Why We Built Zerberus
Transitive trust is invisible until it fails. Most teams don’t realise how many of their security guarantees hinge on external identity providers, third-party SaaS integrations, and cloud-native IAM misconfigurations.
At Zerberus, we set out to answer a hard question: What if you could see the trust relationships before they became a risk?
We map your entire trust graph, from identity providers and cloud resources to downstream tools and cross-SaaS entitlements.
We continuously verify the health and configuration of your identity and access layers (see the sketch after this list), including:
MFA enforcement
Secret expiration windows
IDP endpoint exposure
We bridge compliance and security by treating auth controls and access posture as observable artefacts, not static assumptions.
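As a simplified illustration of such checks (not the Zerberus implementation, and with a hypothetical inventory structure):

```python
# Illustrative posture checks over an identity inventory. The inventory
# structure is hypothetical; the checks mirror the controls listed above:
# MFA enforcement, secret expiry windows, and IdP endpoint exposure.
from datetime import datetime, timedelta

def audit_identity_posture(accounts, secrets, idp_endpoints, max_secret_age_days=90):
    findings = []

    for account in accounts:                 # [{"name": ..., "mfa_enabled": bool}, ...]
        if not account.get("mfa_enabled"):
            findings.append(f"MFA not enforced for {account['name']}")

    cutoff = datetime.utcnow() - timedelta(days=max_secret_age_days)
    for secret in secrets:                   # [{"id": ..., "created_at": datetime}, ...]
        if secret["created_at"] < cutoff:
            findings.append(
                f"Secret {secret['id']} exceeds the {max_secret_age_days}-day rotation window"
            )

    for endpoint in idp_endpoints:           # [{"url": ..., "publicly_reachable": bool}, ...]
        if endpoint.get("publicly_reachable"):
            findings.append(f"IdP endpoint exposed to the public internet: {endpoint['url']}")

    return findings
```

Checks like these are exactly what would have flagged a supposedly decommissioned OAM endpoint that was still publicly reachable.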
Your biggest security risk may not be inside your codebase, but outside your control plane. Zerberus is your lens into that blind spot.