The Cloudflare incident: How cside minimized customer impact

On November 18th, Cloudflare had an incident that impacted thousands of customers. This blog explores how we limited impact to our own customers.

Nov 21, 2025 • 4 min read

Simon Wijckmans Founder & CEO

On November 18th, Cloudflare had an incident that impacted thousands of customers, including customers using our service. Our proxy service is hosted on AWS in an ultra-high-availability architecture and was not affected (nor was it affected by the recent AWS outage). We also designed our system to be resilient against centralized failures, and to limit customer impact if a centralized system does go down.

The incident lasted approximately 5h 34m from the start of the outage to full resolution, although we saw recovery as early as 3 hours into the incident. You can see our incident timeline here.

We would like to discuss some interesting observations during the outage, as well as highlight some finer points about our architecture that limited the impact on our customers during this outage.

Internal Observations + How We Limited Customer Impact

Since our proxy and internal processing pipeline are hosted in AWS, there was no impact to any of our critical operations. However, our dashboard is hosted on Cloudflare, so it was affected, as many websites on the internet were. Because we use Cloudflare for hosting (Cloudflare Workers/Pages/etc.) and not just proxying, we could not simply re-route DNS to work around the outage. On top of that, many upstream services rely on Cloudflare in one way or another. When you need cheap asset distribution through a CDN, you naturally end up looking at Cloudflare.

Upstream Outages

From our vantage point, we could observe the outage through upstream servers: our proxy saw a high number of 5XX errors from servers impacted by the Cloudflare outage. We also received alerts about this, and the increase in errors lines up almost exactly with when the Cloudflare outage happened at 11:48 UTC.

[Screenshot: 5xx errors from Cloudflare-affected servers, detected by cside]

Since our proxy goes through AWS load balancers, and we return the same HTTP response as the upstream script sources, we get all of the metrics when outages like this happen. This is a benefit of having traffic routed through our system: we observe outages like this immediately and can notify our customers of the impact.

How we kept serving scripts during the outage

We cache responses for identical scripts where the caching policy (Cache-Control) allows, so in this case scripts hosted on Cloudflare were still accessible and would remain accessible until the cache was invalidated. This is a benefit of using the cside proxy.
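To illustrate what "where caching policy allows" can mean in practice, here is a minimal sketch of a Cache-Control check for a shared cache. It is not cside's actual implementation: the function name shared_cache_ttl is hypothetical, and the choice to treat no-cache as "do not cache" is a deliberately conservative assumption made for this example.

```rust
use std::time::Duration;

/// Hypothetical helper: decide whether a response may be kept in a shared
/// cache, and for how long, based only on its Cache-Control header value.
/// Returns None when the header forbids shared caching or gives no lifetime.
fn shared_cache_ttl(cache_control: &str) -> Option<Duration> {
    let mut max_age: Option<u64> = None;
    let mut s_maxage: Option<u64> = None;

    for directive in cache_control.split(',') {
        let directive = directive.trim().to_ascii_lowercase();
        match directive.as_str() {
            // no-store and private rule out shared caching. no-cache requires
            // revalidation; this sketch conservatively skips caching for it.
            "no-store" | "private" | "no-cache" => return None,
            _ => {}
        }
        if let Some(value) = directive.strip_prefix("s-maxage=") {
            s_maxage = value.parse().ok();
        } else if let Some(value) = directive.strip_prefix("max-age=") {
            max_age = value.parse().ok();
        }
    }

    // For shared caches, s-maxage takes precedence over max-age.
    match s_maxage.or(max_age) {
        Some(0) | None => None,
        Some(secs) => Some(Duration::from_secs(secs)),
    }
}
```

With this sketch, a header of "public, max-age=3600" yields a one-hour lifetime, while "private, max-age=3600" yields None. A script cached this way can keep being served for its lifetime even while its origin is returning 5XX errors.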

Here is a screenshot of our internal Grafana dashboard showing our script metrics during the outage time period.

[Dashboard: cside script delivery during the Cloudflare outage]

During the outage: The dashboard shows a 70.8% cache hit rate, which means that many scripts that might otherwise have been inaccessible were still being served.

Regular baseline: This percentage is close to normal for us. For example, on November 17th the average cache hit rate was 74%, meaning we were still serving roughly our usual share of requests from cache.

The total number of requests did go down, however.

cside is designed to handle widespread outages

These sorts of widespread outages are unavoidable due to the centralized nature of cloud providers, but we do our best to limit their impact with multi-region deployments of our proxy and a “fail-open” architecture, which means requests will still go through even if everything goes down.

It’s also important to point out that our edge services are designed to operate in an “isolated” mode if our centralized pipeline goes down. This means that even if we are unable to communicate with that system, our proxy will still be operational and can still receive requests for scripts and return responses. So by design, a centralized system going down cannot completely take down all of our edge nodes.
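The fail-open idea can be sketched as a single decision function. This is an illustrative model only, not cside's code: the Verdict, PipelineUnavailable, and Action types are hypothetical, and the assumption that the central pipeline can return a "block" verdict is made purely for the example.

```rust
/// Hypothetical answer from the central analysis pipeline about a script.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Verdict {
    Allow,
    Block,
}

/// Hypothetical error: the central pipeline could not be reached.
#[derive(Debug, PartialEq)]
struct PipelineUnavailable;

/// What the edge node finally does with a fetched script.
#[derive(Debug, PartialEq)]
enum Action {
    /// Return the script; `analyzed` records whether the pipeline saw it.
    Serve { analyzed: bool },
    Block,
}

/// Fail-open decision: a pipeline failure never blocks delivery.
fn decide(verdict: Result<Verdict, PipelineUnavailable>) -> Action {
    match verdict {
        Ok(Verdict::Allow) => Action::Serve { analyzed: true },
        Ok(Verdict::Block) => Action::Block,
        // Isolated mode: the pipeline is down, so the script is passed
        // through unanalyzed instead of failing the request.
        Err(PipelineUnavailable) => Action::Serve { analyzed: false },
    }
}
```

The design choice is that the error branch maps to "serve" rather than "reject". A fail-closed system would return an error to the browser in that branch, which would turn a pipeline outage into a customer-facing outage.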

You can read a breakdown of how our architecture prevents sites from going down here.

The Cloudflare blog post here goes into a lot of detail as well, which is worth reading through.

Sidenote about error handling:

Cloudflare’s outage happened to be related to a particular failure mode of Rust programs: calling .unwrap() on an error value makes the program panic, which is what caused the 500 errors we saw. We do not use this function at all in our proxy codebase, which is also written in Rust.
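As a general illustration of the alternative to .unwrap(), the sketch below handles a rejected configuration as an ordinary error value and keeps the last known good version. It is neither Cloudflare's code nor ours: MAX_ENTRIES, ConfigError, parse_config, and reload are hypothetical names, and the one-entry-per-line format is assumed only for the example.

```rust
/// Hypothetical upper bound on the number of entries in a config file.
const MAX_ENTRIES: usize = 200;

#[derive(Debug, PartialEq)]
enum ConfigError {
    TooManyEntries { found: usize, limit: usize },
}

/// Parse one entry per non-empty line. Oversized input is reported as an
/// error value for the caller to handle.
fn parse_config(raw: &str) -> Result<Vec<String>, ConfigError> {
    let entries: Vec<String> = raw
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(String::from)
        .collect();
    if entries.len() > MAX_ENTRIES {
        return Err(ConfigError::TooManyEntries {
            found: entries.len(),
            limit: MAX_ENTRIES,
        });
    }
    Ok(entries)
}

/// Apply a new config if it parses; otherwise keep the last known good one.
/// Writing `parse_config(raw).unwrap()` here would panic on bad input instead.
fn reload(current: Vec<String>, raw: &str) -> Vec<String> {
    match parse_config(raw) {
        Ok(new_config) => new_config,
        Err(err) => {
            eprintln!("config rejected, keeping previous version: {err:?}");
            current
        }
    }
}
```

In this sketch, an oversized file is logged and ignored, and the process keeps running with its previous configuration instead of crashing.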

cside is a team of seasoned distributed web engineers, ranging from core contributors to browsers like Servo to ex-Cloudflare engineers and early open source contributors to Tailwind and Bootstrap. We care about the web, and we treat our infrastructure and architecture as a piece of art. We applaud companies like Cloudflare for sharing deep details about incidents. We have learned from such write-ups throughout our careers and use them to prevent similar incidents wherever possible.

Founder & CEO Simon Wijckmans

Founder and CEO of c/side. Building better security against client-side executed attacks, and making solutions more accessible to smaller businesses. Web security is not an enterprise only problem.
