What went wrong at Cloudflare?

You may have noticed that yesterday ChatGPT, X and a number of other websites -- including BetaNews for a while -- were unavailable due to an issue with online security service Cloudflare. So what went wrong?
According to the company, the problem occurred after a configuration file designed to handle threat traffic did not work as intended and ‘triggered a crash’ in the software that handles traffic for its wider services.
“We apologize to our customers and the Internet in general for letting you down today,” the company says in an official statement. “Given the importance of Cloudflare’s services, any outage is unacceptable.”
This is another example of many different parts of the web being reliant on, and therefore vulnerable to, services from a single supplier. Just a month ago we saw a similar issue with an AWS outage taking out multiple services, and an Azure problem also affecting high-profile sites.
Fadl Mantash, CISO at Tribe Payments, says:
“Today’s Cloudflare outage shows how vulnerable the digital economy has become. When a single upstream provider experiences issues, the impact doesn’t stay contained; it cascades across industries, touching everything from social media platforms to e-commerce checkouts and backend payment services.
“Payments are particularly exposed. The infrastructure behind a single transaction relies on a chain of cloud platforms, processors, third-party APIs, authentication tools, and card schemes. When any link in that chain fails, the entire journey can break. It’s the same pattern we saw during last year’s CrowdStrike incident: the initial issue wasn’t in payments, yet payments were among the most visible casualties.”
The outage also highlights how dependent other parts of the world are on major US service providers, which means a problem at one of them has global implications.
“This isn’t just another technical setback. This is a perfect illustration of how dependent we are on just a few major service providers and the dangers this entails,” says Ramutė Varnelytė, CEO at IPXO, the IP resource management platform. “Both this incident and the recent AWS outage that occurred at the end of October should serve as yet another cautionary tale. Cloudflare currently helps manage and secure traffic for about 20 percent of the web. Naturally, the effects of the outage were felt worldwide. While it was fun to joke about how, at least for one day, we could be sure that ChatGPT didn’t write any of the content we’re reading, the key message we should take away from this incident is the importance of resilient infrastructure and the need for an actionable plan.”
How were you affected by Cloudflare’s problems? Let us know in the comments.