CyberResilience - Nocturnalknight's Lair

Why One AWS Spot Still Crashes Sites In 2025?

By Ramkumar Sundarakalatharan | October 20, 2025 | Comments 0 Comment

It started innocently enough. Morning coffee, post-workout calm, a quick “Computer, drop in on my son.”

Instead of his sleepy grin, I got the polite but dreaded:

“There is an error. Please try again later.”
-Alexa (i call it “Computer” as a wannabe Capt of NCC1701E)

Moments later, I realised it wasn’t my internet or device. It was AWS again.

A Familiar Failure in a Familiar Region

If the cloud has a heartbeat, it beats somewhere beneath Northern Virginia.

That is the home of US-EAST-1, Amazon Web Services’ oldest and busiest region, and the digital crossroad through which a large share of the internet’s authentication, routing, and replication flows. It is also the same region that keeps reminding the world that redundancy and resilience are not the same thing.

In December 2022, a cascading power failure at US-EAST-1 set off a chain of interruptions that took down significant parts of the internet, including internal AWS management consoles. Engineers left that incident speaking of stronger isolation and better regional independence.

Three years later, the lesson has returned. The cause may differ, but the pattern feels the same.

The Current Outage

As of this afternoon, AWS continues to battle a widespread disruption in US-EAST-1. The issue began early on 20 October 2025, with elevated error rates across DynamoDB, Route 53, and related control-plane components.

The impact has spread globally.

Snapchat, Ring, and Duolingo have reported downtime.
Lloyds Bank and several UK financial platforms are seeing degraded service.
Even Alexa devices have stopped responding, producing the same polite message: “There is an error. Please try again later.”

For anyone who remembers 2022, it feels uncomfortably familiar. The more digital life concentrates in a handful of hyperscale regions, the more we all share the consequences when one of them fails.

The Pattern Beneath the Problem

Both the 2022 and 2025 US-EAST-1 events reveal the same architectural weakness: control-plane coupling.

Workloads may be distributed across regions, yet many still rely on US-EAST-1 for:

IAM token validation
DynamoDB global tables metadata
Route 53 DNS propagation
S3 replication management

When that single region falters, systems elsewhere cannot authenticate, replicate, or even resolve DNS. The problem is not the hardware; it is that so many systems rely on a single control layer.

What makes today’s event more concerning is how little has changed since the last one. The fragility is known, yet few businesses have redesigned their architectures to reduce the dependency.

How Zerberus Responded to the Lesson

When we began building Zerberus, we decided that no single region or provider should ever be critical to our uptime. That choice was not born from scepticism but from experience in building 2 other platforms that had millions of users across 4 continents.

Our products, Trace-AI, ComplAI™, and ZSBOM, deliver compliance and security automation for organisations that cannot simply wait for the cloud to recover. We chose to design for failure as a permanent condition rather than a rare event.

Inside the Zerberus Architecture

Our production environment operates across five regions: London, Ireland, Frankfurt, Oregon, and Ohio. The setup follows an active-passive pattern with automatic failover.

Two additional warm standby sites receive limited live traffic through Cloudflare load balancers. When one of these approaches a defined load threshold, it scales up and joins the active pool without manual intervention.

Multi-Cloud Distribution

AWS runs the primary compute and SBOM scanning workloads.
Azure carries the secondary inference pipelines and compliance automation modules.
Digital Ocean maintains an independent warm standby, ensuring continuity even if both AWS and Azure suffer regional difficulties.

This diversity is not a marketing exercise. It separates operational risk, contractual dependence, and control-plane exposure across multiple vendors.

Network Edge and Traffic Management

At the edge, Cloudflare provides:

Global DNS resolution and traffic steering
Web application firewalling and DDoS protection
Health-based routing with zero-trust enforcement

By externalising DNS and routing logic from AWS, we avoid the single-plane dependency that is now affecting thousands of services.

Data Sovereignty and Isolation

All client data remains within each client’s own VPC. Zerberus only collects aggregated pass/fail summaries and compliance evidence metadata.

Databases replicate across multiple Availability Zones, and storage is separated by jurisdiction. UK data remains in the UK; EU data remains in the EU. This satisfies regulatory boundaries and limits any failure to its own region.

Observability and Auto-Recovery

Telemetry is centralised in Grafana, while Cloudflare health checks trigger regional routing changes automatically.
If a scanning backend becomes unavailable, queued SBOM analysis tasks shift to a healthy region within seconds.

Even during an event such as the present AWS disruption, Zerberus continues to operate—perhaps with reduced throughput, but never completely offline.

Learning from 2022

The 2022 outage made clear that availability zones do not guarantee availability. The 2025 incident reinforces that message.

At Zerberus, we treat resilience as a practice, not a promise. We simulate network blackouts, DNS failures, and database unavailability. We measure recovery time not in theory but in behaviour. These tests are themselves automated(monitored), because the cost of complacency is always greater than the cost of preparation.

Regulation and Responsibility

Europe’s Cyber Resilience Act and NIS2 Directive are closing the gap between regulatory theory and engineering reality. Resilience is no longer an optional control; it is a legal expectation.

A multi-region, multi-cloud, data-sovereign architecture is now both a technical and regulatory necessity. If a hyperscaler outage can lead to non-compliance, the responsibility lies in design, not in the service-level agreement.

Designing for the Next Outage

US-EAST-1 will recover; it always does. The question is how many services will redesign themselves before the next event.

Every builder now faces a decision: continue to optimise for convenience or begin engineering for continuity.

The 2022 failure served as a warning. The 2025 outage confirms the lesson. By the next one, any excuse will sound outdated.

Final Thoughts

The cloud remains one of the greatest enablers of our age, but its weaknesses are equally shared. Each outage offers another chance to refine, distribute, and fortify what we build.

At Zerberus, we accept that the cloud will falter from time to time. Our task is to ensure that our systems, and those of our clients, do not falter with it.

🟩 Author: Ramkumar Sundarakalatharan
Founder & Chief Architect, Zerberus Technologies Ltd

(This article reflects an ongoing incident. For live updates, refer to the AWS Status Page and technology news outlets such as BBC Tech and The Independent.)

References:

https://www.bbc.co.uk/news/live/c5y8k7k6v1rt

https://www.independent.co.uk/tech/aws-amazon-internet-outage-latest-updates-b2848345.html

https://www.dailystar.co.uk/news/world-news/amazon-breaks-silence-outage-reason-36096705

InfoSec’s Big Problem: Too Much Hope in One Cyber Database

By Ramkumar Sundarakalatharan | April 16, 2025 | Comments 0 Comment

The Myth of a Single Cyber Superpower: Why Global Infosec Can’t Rely on One Nation’s Database

What the collapse of MITRE’s CVE funding reveals about fragility, sovereignty, and the silent geopolitics of vulnerability management

I. The Day the Coordination Engine Stalled

On April 16, 2025, MITRE’s CVE program—arguably the most critical coordination layer in global vulnerability management—lost its federal funding.

There was no press conference, no coordinated transition plan, no handover to an international body. Just a memo, and silence. As someone who’s worked in information security for two decades, I should have been surprised. I wasn’t. We’ve long been building on foundations we neither control nor fully understand.The CVE database isn’t just a spreadsheet of flaws. It is the lingua franca of cybersecurity. Without it, our systems don’t just become more vulnerable—they become incomparable.

II. From Backbone to Bottleneck

Since 1999, CVEs have given us a consistent, vendor-neutral way to identify and communicate about software vulnerabilities. Nearly every scanner, SBOM generator, security bulletin, bug bounty program, and regulatory framework references CVE IDs. The system enables prioritisation, automation, and coordinated disclosure.

But what happens when that language goes silent?

“We are flying blind in a threat-rich environment.”
— Jen Easterly, former Director of CISA (2025)

That threat blindness is not hypothetical. The National Vulnerability Database (NVD)—which depends on MITRE for CVE enumeration—has a backlog exceeding 10,000 unanalysed vulnerabilities. Some tools have begun timing out or flagging stale data. Security orchestration systems misclassify vulnerabilities or ignore them entirely because the CVE ID was never issued.

This is not a minor workflow inconvenience. It’s a collapse in shared context, and it hits software supply chains the hardest.

III. Three Moves That Signalled Systemic Retreat

While many are treating the CVE shutdown as an isolated budget cut, it is in fact the third move in a larger geopolitical shift:

January 2025: The Cyber Safety Review Board (CSRB) was disbanded—eliminating the U.S.’s central post-incident review mechanism.
March 2025: Offensive cyber operations against Russia were paused by the U.S. Department of Defense, halting active containment of APTs like Fancy Bear and Gamaredon.
April 2025: MITRE’s CVE funding expired—effectively unplugging the vulnerability coordination layer trusted worldwide.

This is not a partisan critique. These decisions were made under a democratically elected government. But their global consequences are disproportionate. And this is the crux of the issue: when the world depends on a single nation for its digital immune system, even routine political shifts create existential risks.

IV. Global Dependency and the Quiet Cost of Centralisation

MITRE’s CVE system was always open, but never shared. It was funded domestically, operated unilaterally, and yet adopted globally.

That arrangement worked well—until it didn’t.

There is a word for this in international relations: asymmetry. In tech, we often call it technical debt. Whatever we name it, the result is the same: everyone built around a single point of failure they didn’t own or influence.

“Integrate various sources of threat intelligence in addition to the various software vulnerability/weakness databases.”
— NSA, 2024

Even the NSA warned us not to over-index on CVE. But across industry, CVE/NVD remains hardcoded into compliance standards, vendor SLAs, and procurement language.

And as of this month, it’s… gone!

V. What Europe Sees That We Don’t Talk About

While the U.S. quietly pulled back, the European Union has been doing the opposite. Its Cyber Resilience Act (CRA) mandates that software vendors operating in the EU must maintain secure development practices, provide SBOMs, and handle vulnerability disclosures with rigour.

Unlike CVE, the CRA assumes no single vulnerability database will dominate. It emphasises process over platform, and mandates that organisations demonstrate control, not dependency.

This distinction matters.

If the CVE system was the shared fire alarm, the CRA is a fire drill—with decentralised protocols that work even if the main siren fails.

Europe, for all its bureaucratic delays, may have been right all along: resilience requires plurality.

VI. Lessons for the Infosec Community

At Zerberus, we anticipated this fracture. That’s why our ZSBOM™ platform was designed to pull vulnerability intelligence from multiple sources, including:

MITRE CVE/NVD (when available)
Google OSV
GitHub Security Advisories
Snyk and Sonatype databases
Internal threat feeds

This is not a plug; it’s a plea. Whether you use Zerberus or not, stop building your supply chain security around a single feed. Your tools, your teams, and your customers deserve more than monoculture.

VII. The Superpower Paradox

Here’s the uncomfortable truth:

When you’re the sole superpower, you don’t get to take a break.

The U.S. built the digital infrastructure the world relies on. CVE. DNS. NIST. Even the major cloud providers. But global dependency without shared governance leads to fragility.

And fragility, in cyberspace, gets exploited.

We must stop pretending that open-source equals open-governance, that centralisation equals efficiency, or that U.S. stability is guaranteed. The MITRE shutdown is not the end—but it should be a beginning.

A beginning of a post-unipolar cybersecurity infrastructure, where responsibility is distributed, resilience is engineered, and no single actor—however well-intentioned—is asked to carry the weight of the digital world.

References

Gatlan, S. (2025) ‘MITRE warns that funding for critical CVE program expires today’, BleepingComputer, 16 April. Available at: https://www.bleepingcomputer.com/news/security/mitre-warns-that-funding-for-critical-cve-program-expires-today/ (Accessed: 16 April 2025).
Easterly, J. (2025) ‘Statement on CVE defunding’, Vocal Media, 15 April. Available at: https://vocal.media/theSwamp/jen-easterly-on-cve-defunding (Accessed: 16 April 2025).
National Institute of Standards and Technology (NIST) (2025) NVD Dashboard. Available at: https://nvd.nist.gov/general/nvd-dashboard (Accessed: 16 April 2025).
The White House (2021) Executive Order on Improving the Nation’s Cybersecurity, 12 May. Available at: https://www.whitehouse.gov/briefing-room/presidential-actions/2021/05/12/executive-order-on-improving-the-nations-cybersecurity/ (Accessed: 16 April 2025).
U.S. National Security Agency (2024) Mitigating Software Supply Chain Risks. Available at: https://media.defense.gov/2024/Jan/30/2003370047/-1/-1/0/CSA-Mitigating-Software-Supply-Chain-Risks-2024.pdf (Accessed: 16 April 2025).
European Commission (2023) Proposal for a Regulation on Cyber Resilience Act. Available at: https://digital-strategy.ec.europa.eu/en/policies/cyber-resilience-act (Accessed: 16 April 2025).

Disbanding the CSRB: A Mistake for National Security

By Ramkumar Sundarakalatharan | January 25, 2025 | Comments 0 Comment

Why Ending the CSRB Puts America at Risk

Imagine dismantling your fire department just because you haven’t had a major fire recently. That’s effectively what the Trump administration has done by disbanding the Cyber Safety Review Board (CSRB), a critical entity within the Cybersecurity and Infrastructure Security Agency (CISA). In an era of escalating cyber threats—ranging from ransomware targeting hospitals to sophisticated state-sponsored attacks—this decision is a catastrophic misstep for national security.

While countries across the globe are doubling down on cybersecurity investments, the United States has chosen to retreat from a proactive posture. The CSRB’s closure sends a dangerous message: that short-term political optics can override the long-term need for resilience in the face of digital threats.

The Role of the CSRB: A Beacon of Cybersecurity Leadership

Established to investigate and recommend strategies following major cyber incidents, the CSRB functioned as a hybrid think tank and task force, capable of cutting through red tape to deliver actionable insights. Its role extended beyond the public-facing reports; the board was deeply involved in guiding responses to sensitive, behind-the-scenes threats, ensuring that risks were mitigated before they escalated into crises.

The CSRB’s disbandment leaves a dangerous void in this ecosystem, weakening not only national defenses but also the trust between public and private entities.

CSRB: Championing Accountability and Reform

One of the CSRB’s most significant contributions was its ability to hold even the most powerful corporations accountable, driving reforms that prioritized security over profit. Its achievements are best understood through the lens of its high-profile investigations:

Key Milestones

Why the CSRB’s Work Mattered

The CSRB’s ability to compel change from tech giants like Microsoft underscored its importance. Without such mechanisms, corporations are less likely to prioritise cybersecurity, leaving critical infrastructure vulnerable to attack. As cyber threats grow in complexity, dismantling accountability structures like the CSRB risks fostering an environment where profits take precedence over security—a dangerous proposition for national resilience.

Cybersecurity as Strategic Deterrence

To truly grasp the implications of the CSRB’s dissolution, one must consider the broader strategic value of cybersecurity. The European Leadership Network aptly draws parallels between cyber capabilities and nuclear deterrence. Both serve as powerful tools for preventing conflict, not through their use but through the strength of their existence.

By dismantling the CSRB, the U.S. has not only weakened its ability to deter cyber adversaries but also signalled a lack of commitment to proactive defence. This retreat emboldens adversaries, from state-sponsored actors like China’s STORM-0558 to decentralized hacking groups, and undermines the nation’s strategic posture.

Global Trends: A Stark Contrast

While the U.S. retreats, the rest of the world is surging ahead. Nations in the Indo-Pacific, as highlighted by the Royal United Services Institute, are investing heavily in cybersecurity to counter growing threats. India, Japan, and Australia are fostering regional collaborations to strengthen their collective resilience.

Similarly, the UK and continental Europe are prioritising cyber capabilities. The UK, for instance, is shifting its focus from traditional nuclear deterrence to building robust cyber defences, a move advocated by the European Leadership Network. The EU’s Cybersecurity Strategy exemplifies the importance of unified, cross-border approaches to digital security.

The U.S.’s decision to disband the CSRB stands in stark contrast to these efforts, risking not only its national security but also its leadership in global cybersecurity.

Isolationism’s Dangerous Consequences

This decision reflects a broader trend of isolationism within the Trump administration. Whether it’s withdrawing from the World Health Organization or sidelining international climate agreements, the U.S. has increasingly disengaged from global efforts. In cybersecurity, this isolationist approach is particularly perilous.

Global threats demand global solutions. Initiatives like the Five Eyes’ Secure Innovation program (Infosecurity Magazine) demonstrate the value of collaborative defence strategies. By withdrawing from structures like the CSRB, the U.S. not only risks alienating allies but also forfeits its role as a global leader in cybersecurity.

The Cost of Complacency

Cybersecurity is not a field that rewards complacency. As CSO Online warns, short-term thinking in this domain can lead to long-term vulnerabilities. The absence of the CSRB means fewer opportunities to learn from incidents, fewer recommendations for systemic improvements, and a diminished ability to adapt to evolving threats.

The cost of this decision will likely manifest in increased cyber incidents, weakened critical infrastructure, and a growing divide between the U.S. and its allies in terms of cybersecurity capabilities.

Conclusion

The disbanding of the CSRB is not just a bureaucratic reshuffle—it is a strategic blunder with far-reaching implications for national and global security. In an age where digital threats are as consequential as conventional warfare, dismantling a key pillar of cybersecurity leaves the United States exposed and isolated.

The CSRB’s legacy of transparency, accountability, and reform serves as a stark reminder of what’s at stake. Its dissolution not only weakens national defences but also risks emboldening adversaries and eroding trust among international partners. To safeguard its digital future, the U.S. must urgently rebuild mechanisms like the CSRB, reestablish its leadership in cybersecurity, and recommit to collaborative defence strategies.

References & Further Reading

TechCrunch. (2025). Trump administration fires members of cybersecurity review board in horribly shortsighted decision. Available at: TechCrunch
The Conversation. (2025). Trump has fired a major cybersecurity investigations body – it’s a risky move. Available at: The Conversation
TechDirt. (2025). Trump disbands cybersecurity board investigating massive Chinese phone system hack. Available at: TechDirt
European Leadership Network. (2024). Nuclear vs Cyber Deterrence: Why the UK Should Invest More in Its Cyber Capabilities and Less in Nuclear Deterrence. Available at: ELN
Royal United Services Institute. (2024). Cyber Capabilities in the Indo-Pacific: Shared Ambitions, Different Means. Available at: RUSI
Infosecurity Magazine. (2024). Five Eyes Agencies Launch Startup Security Initiative. Available at: Infosecurity Magazine
CSO Online. (2024). Project 2025 Could Escalate US Cybersecurity Risks, Endanger More Americans. Available at: CSO Online