Why So Many Websites Went Down When Cloudflare Broke (Explained Simply)
On 18 November 2025, a lot of the internet suddenly stopped working.
People trying to open websites protected by Cloudflare started seeing error pages. Some couldn’t log in to their company tools; others couldn’t reach the dashboards or apps they use every day. (The Cloudflare Blog)
It looked scary. But it wasn’t a hack, and it wasn’t a massive DDoS attack.
According to Cloudflare’s official report, it was caused by a small internal change that had an unexpectedly huge effect. (The Cloudflare Blog)
Let’s unpack what happened in normal language.
First: what does Cloudflare actually do?
You can think of Cloudflare as a giant security and traffic company for the web.
Millions of websites sit behind Cloudflare.
When you visit those sites, your browser first talks to Cloudflare.
Cloudflare then decides:
Is this safe?
Is this a human or a bot?
Where should we send this request?
So if Cloudflare has a bad day, many websites have a bad day at the same time.
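If you’re curious what that gatekeeping looks like in code, here is a tiny, made-up sketch. It is nothing like Cloudflare’s real software (the names, types, and rules are all invented for illustration); it only shows the kind of per-request decision the rest of this article is about.

```rust
// Toy sketch of a reverse proxy's per-request decision: block the request,
// challenge it, or pass it through to the real website. Purely illustrative.

enum Verdict {
    PassToWebsite, // looks like a normal visitor
    Challenge,     // not sure: ask for a human check
    Block,         // looks like a malicious bot
}

struct Request {
    looks_automated: bool,
    looks_malicious: bool,
}

fn decide(req: &Request) -> Verdict {
    if req.looks_malicious {
        Verdict::Block
    } else if req.looks_automated {
        Verdict::Challenge
    } else {
        Verdict::PassToWebsite
    }
}

fn main() {
    let visitor = Request { looks_automated: false, looks_malicious: false };
    match decide(&visitor) {
        Verdict::PassToWebsite => println!("forward to the real website"),
        Verdict::Challenge => println!("show a human check first"),
        Verdict::Block => println!("return an error page"),
    }
}
```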
The short story: a bad “instruction file” crashed their traffic system
Cloudflare has a product called Bot Management. It helps decide if something visiting a website is:
a real person, or
an automated bot that should be blocked.
To make that decision, Cloudflare’s systems use a special instruction file called a “feature file”. You can imagine it like a checklist for a security guard:
“If you see these signs, it’s probably a bot. If you see those signs, it’s probably a human.”
This file is:
Created automatically every few minutes from Cloudflare’s internal database.
Sent out to computers all over their global network. (The Cloudflare Blog)
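To make the “checklist” idea concrete, here is a miniature, invented version of what such a feature file could conceptually contain and how a program might load it. The real file format, field names, and signals are Cloudflare-internal and not public at this level of detail, so treat this purely as an illustration.

```rust
// Made-up miniature of a "feature file": a list of named signals the bot
// detector looks at, refreshed every few minutes. Format invented here.

struct Feature {
    name: String, // e.g. "requests_per_minute" (invented example)
    weight: f64,  // how strongly this signal suggests "bot"
}

fn load_feature_file(text: &str) -> Vec<Feature> {
    // One feature per line: "<name> <weight>"
    text.lines()
        .filter_map(|line| {
            let mut parts = line.split_whitespace();
            let name = parts.next()?.to_string();
            let weight = parts.next()?.parse().ok()?;
            Some(Feature { name, weight })
        })
        .collect()
}

fn main() {
    let file = "requests_per_minute 0.8\nhas_real_browser_headers -0.5\n";
    for f in load_feature_file(file) {
        println!("signal {:<26} weight {}", f.name, f.weight);
    }
}
```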
On November 18, three things happened in sequence:
Cloudflare changed permissions on their internal database.
This was supposed to be a routine improvement in how they manage access. (The Cloudflare Blog)
Because of that change, the process that builds the instruction file accidentally pulled in duplicate information, so the file suddenly became much larger than normal. (The Cloudflare Blog)
The program that reads this file (inside Cloudflare’s main traffic system) had a built-in size limit. The new, oversized file exceeded that limit, the program couldn’t handle it properly, and it crashed, as sketched in the code example a little further down. (The Cloudflare Blog)
Because this program sits in the middle of Cloudflare’s network, its crash meant:
Many requests from users around the world could not be processed, and Cloudflare had to show error pages instead.
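Here is a generic sketch of that failure pattern: a program sets aside room for a fixed maximum number of entries, and an input that blows past the limit is treated as “this can never happen”, so the error becomes a crash instead of a graceful fallback. The limit value, names, and error handling below are invented for illustration, not taken from Cloudflare’s code.

```rust
// Generic sketch of "oversized input file meets a hard-coded limit and an
// unhandled error". All numbers and names are invented for illustration.

const MAX_FEATURES: usize = 200; // hypothetical built-in limit

fn load_features(count_in_file: usize) -> Result<usize, String> {
    if count_in_file > MAX_FEATURES {
        return Err(format!(
            "feature file has {count_in_file} entries, limit is {MAX_FEATURES}"
        ));
    }
    Ok(count_in_file)
}

fn main() {
    // Duplicate rows roughly doubled the file, pushing it past the limit...
    let oversized = 2 * 150;

    // ...and because the error case was assumed impossible, unwrap() turns it
    // into a panic that takes the whole process down.
    let loaded = load_features(oversized).unwrap();
    println!("loaded {loaded} features"); // never reached
}
```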
Why the internet seemed to go on and off
If you look at Cloudflare’s own graph, the errors didn’t just spike once and stay high. They went up, down, up, down. (The Cloudflare Blog)
That’s because:
The instruction file is recreated every five minutes.
Cloudflare’s database was being upgraded piece by piece.
Sometimes the file was generated from an old part of the system (good file).
Sometimes it was generated from an updated part (bad file).
So every few minutes, Cloudflare’s network would:
Work normally when the “good” file went out.
Break again when the “bad” file went out.
Only later, once every database server was producing the bad file, did the system settle into a consistently broken state that lasted until engineers fixed it. (The Cloudflare Blog)
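A toy simulation makes the flapping easier to picture. Assume (numbers invented) that the file is rebuilt every five minutes and that only some of the database machines have been updated yet; whether a given rebuild produces a good or a bad file depends on which machine happens to answer.

```rust
// Toy simulation of the on/off pattern: a file is rebuilt every 5 minutes,
// and only rebuilds that hit an already-updated database node come out bad.
// Node counts and the round-robin choice are invented for illustration.

fn main() {
    let updated_nodes = 3; // nodes already producing the oversized file
    let total_nodes = 10;

    for cycle in 0..12 {
        // Pretend the build job lands on database nodes in round-robin order.
        let node = cycle % total_nodes;
        let bad_file = node < updated_nodes;

        let minute = cycle * 5;
        if bad_file {
            println!("minute {minute:3}: bad file pushed  -> errors spike");
        } else {
            println!("minute {minute:3}: good file pushed -> traffic recovers");
        }
    }
}
```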
Which Cloudflare products were affected?
According to Cloudflare’s report, the outage hit several key services (The Cloudflare Blog):
Core CDN & Security – many websites behind Cloudflare showed generic 5xx error pages.
Turnstile – Cloudflare’s challenge / CAPTCHA product failed to load, so login pages protected by it stopped working.
Workers KV – a key-value storage service used by many apps saw a big spike in errors.
Dashboard & Access – people struggled to log in to the Cloudflare dashboard and to apps protected by Cloudflare Access.
So even though the original problem was “just” a bad file for the bot detection system, it rippled out into many other areas.
How Cloudflare fixed it
Once engineers figured out that the oversized instruction file was the cause, the recovery steps were clear:
Stop generating and sending the bad file.
Manually push an older, known-good file into the system.
Restart the core traffic software so every machine would reload the good file. (The Cloudflare Blog)
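The longer-term lesson behind those steps is a defensive one: software that consumes an automatically generated file can validate it first and keep the last known-good version if the new one looks wrong, instead of crashing. The sketch below shows that idea in miniature; it illustrates the principle, not Cloudflare’s actual fix.

```rust
// Illustrative "keep the last known-good file" pattern: validate a new file
// before switching to it, and fall back to the old one if it looks wrong.

const MAX_FEATURES: usize = 200; // hypothetical limit, as in the sketch above

fn looks_valid(file: &[String]) -> bool {
    !file.is_empty() && file.len() <= MAX_FEATURES
}

fn main() {
    let known_good: Vec<String> = vec!["requests_per_minute".to_string()];

    // Simulate an oversized, duplicated file arriving from the build job.
    let incoming: Vec<String> = vec!["duplicated_signal".to_string(); 400];

    let active = if looks_valid(&incoming) {
        &incoming
    } else {
        // Reject the bad update and keep serving with the old file.
        &known_good
    };

    println!("serving with {} features", active.len());
}
```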
Cloudflare says:
Core traffic was mostly back to normal by 14:30 UTC.
All systems were fully healthy again by 17:06 UTC. (The Cloudflare Blog)
What non-technical people can learn from this
A few important points, even if you’re not a developer:
1. This was not a hack
Cloudflare is very clear: this wasn’t an attack. It was an internal mistake – a small configuration change that had a much bigger impact than expected. (The Cloudflare Blog)
2. The modern internet has “single points of failure”
When so many websites, apps and companies all rely on the same providers (Cloudflare, AWS, Google Cloud, etc.), a problem in one place can feel like “the whole internet is down”.
It’s convenient to put everything behind one powerful service, but it also means:
One bug or misconfiguration can affect millions of users at once.
3. Tiny technical details can have huge business impact
From the outside, “a database permission change and a configuration file that got too big” sounds trivial.
But for companies:
It means lost sales.
Broken logins and dashboards.
Real money and trust on the line.
That’s why infrastructure teams obsess over safeguards, testing, and monitoring — and why they write public postmortems like this one.
In one sentence
On November 18, 2025, Cloudflare went down – and took many websites with it – because a routine database change accidentally created an oversized configuration file for their bot-detection system, which crashed the software that handles web traffic across their network until engineers stopped the bad file, pushed a good one, and restarted everything.