Fastly Outage Shows Both Sides of the Same Internet Coin
Companies spend heavily to make sure their websites never go down. They invest in expensive equipment, hire the best tech wizards and vet every service provider. So imagine their dismay when a major internet blackout, set off by a single Fastly customer changing a setting, recently knocked out some high-profile websites.
At about 11:00 BST on 8 June 2021, a Fastly customer changing settings triggered a software bug that caused about 85% of Fastly's network to return errors. The blackout affected some fairly high-profile sites, including those belonging to the New York Times and the Guardian. Even Amazon was affected.
The good news is that Fastly technicians had the problem resolved in under an hour and, although the outage was certainly very inconvenient, it wasn't devastating. No data was lost and security wasn't compromised. Still, there are many lessons to be learned. The outage showed us two sides of the same internet coin.
Consolidation Can Be Dangerous
One side of the coin reveals the dangers of consolidation. Think of it this way: only a small number of major players are responsible for operating and maintaining the majority of the internet's infrastructure.
Fastly is a cloud provider that specialises in edge computing, a way of extending cloud infrastructure to the edge of the delivery network so that data sits as close to customers as possible. Its worldwide network is active on five continents and processes some 13 Tb of data and nearly 11 million requests per second.
The thing is that Fastly isn't alone. Amazon Web Services (AWS) is another infrastructure behemoth. As of 2020, it commanded 33% of the global cloud market. There are a few others, too. The problem is that it only takes one of them going down to cause major disruptions. Imagine if both Fastly and AWS went down simultaneously – the results are almost unthinkable.
There is legitimate concern about internet consolidation and its implications for security and support. The fewer organisations involved, the easier it is for a single failure to bring down a large share of the internet.
Consolidation Can Mean More Uptime
The other side of the coin is that consolidation can mean more uptime. In fact, it almost always does. Fastly proved as much by having most of its network up and running again so quickly. It was able to do that thanks to economies of scale. Because its network is so large, it has the technical staff and physical resources to address problems immediately.
A smaller company might not have been able to respond as quickly. With fewer resources and a smaller staff, perhaps it would have taken twice as long to get its network back online. The longer the downtime, the more damage done.
Companies like Fastly and AWS also take advantage of their size to improve products and services much faster than their smaller competitors can. Because of that, they are quickly making the internet capable of things no one dreamed of even 10 years ago.
Blackouts Will Always Occur
If we step back and look at the Fastly outage pragmatically, it becomes abundantly clear that things were not as bad as they could have been. The incident also reminds us that blackouts will always occur and there is no way to prevent them entirely. The question then becomes, how do we want to handle them?
Consolidation undoubtedly increases the chances that one or two events could shut down much of the internet. But consolidation also makes it easier to fix things when they break. These are two sides of the same coin, and that isn't necessarily a bad thing.