On Tuesday of this week, many of the world’s largest websites were unavailable for around an hour. A company called Fastly provides technology that helps these websites load faster, by storing data on servers closer to users, and protects them against spikes in traffic. When a single user changed a setting, gov.uk went down, as did sites ranging from Amazon and the BBC to Spotify and many news providers. In all, around 85% of Fastly’s customers’ sites were affected.
You’d think this would cause a huge problem for Fastly, but it actually resulted in a rise in their share price.
How did Fastly come out of this so well?
The unexpected reaction is largely down to how Fastly have dealt with the issue:
- They detected the issue within a minute of it arising. (They constantly monitor their service.)
- They identified the cause. (A previously undiscovered software bug.)
- They acted – fast – to remove the issue. (They disabled the configuration.)
- Within 49 minutes, 95% of their network was back to operating as normal. (Remember, around 85% had been affected.)
- They’ve accepted total responsibility and are publicly apologising to their customers, as well as those who use their customers’ sites.
Finally, and perhaps most importantly, the widespread reports of the outage have highlighted the number and size of the businesses the company works with.
What can we all learn from this?
Whilst Fastly certainly won’t be congratulating themselves on the fact that the issue arose, the points I’ve listed above serve as a masterclass in how to deal with a problem in our own businesses.
We’re all human and mistakes WILL happen. It’s purely a matter of time before they do, so it’s a really smart move to make sure you’ve taken some pre-emptive steps to minimise the upheaval they could cause. Some thought before they happen can make a world of difference when they actually do.
1 Fastly were able to monitor their service and react as soon as performance dipped.
Do you have a way to monitor the sales and delivery process at every stage, and the service levels you’re providing?
Are you asking for feedback from customers so you can adjust your processes if there is anything they’re not happy with?
What checks and balances could you put in place to make sure you’re aware if you have a similar type of issue?
2 Fastly identified the cause of the problem and resolved it quickly.
Do your team understand how your processes bolt together well enough to be able to quickly and efficiently solve any problems when they arise?
Are your team empowered to make any decisions that need to be made, and do they know who to contact if an issue exceeds their authority?
Whilst Fastly’s issue was an undiscovered software bug, have you stress tested your systems and processes to identify potential weak spots? If you can’t remove the weakness entirely, is there a stop-gap solution that could be ready to swing into place should the worst happen?
Do your team know how to keep the business running under a range of different circumstances?
3 Fastly have apologised.
There’s no denying that Fastly failed to provide the service they were being paid for, and that could have had a dramatic effect on their customers’ sales. I’m sure some customers will invoke penalties, but by apologising, accepting liability and finding a fast solution, Fastly have taken as much heat out of the issue as possible.
They can now (hopefully) discuss the rest of the fallout and plans to avoid a repeat of any kind calmly and rationally with their customers. Without an admission or apology, the heat remains and you’re far more likely to lose business.
Have you trained your team in how you want them to respond to customers in the event of a complaint? If your customer feels they’ve been listened to, knows that you understand why they’re unhappy, and hears what you’re going to do to put it right, that’s a great starting point.
Do you have a Disaster Recovery Plan?
The Fastly issue reminds us all of the importance of a Disaster Recovery Plan, which would cover many of the points above. The term sounds complex, but in essence a Disaster Recovery Plan is simply a document that summarises the things that could go wrong in your business, and lists the solutions, contacts and plans that will keep things moving should they happen.
Over the last 15 months, we’ve all had to adjust and battle through. Now that some kind of normality is returning, it’s time to start thinking ahead again. If you don’t yet have a plan, or your team can’t put it into effect without you, putting one together would be a really sensible next step.
Fastly have proved this week that responding in the right way to a potential disaster can actually have a dramatically positive effect on a business.