On March 10th, 2021, a fire broke out at a data centre in Strasbourg owned by OVH Cloud, Europe's largest cloud services provider. It destroyed one of the four data centres on the site, and badly damaged a second. The electricity to the site was switched off for safety reasons, which shut down the other two data centres as well. This event caused significant disruption to OVH Cloud and its clients, including the French government.
The Uptime Institute pointed out in a recent blog post that data centre fires are infrequent. There have been fewer than one every two years, on average, since 1994. Indeed, data centre outages are as often caused by fire suppression equipment as by fires themselves. There are therefore few opportunities to learn from experience in this area, so what lessons can we draw from OVH Cloud's experience?
Backing up data is a necessity, not a luxury
There is a tendency among non-IT people to assume that holding data in the cloud is the same as backing it up. To a certain extent, that is true. If your data are held in the cloud and an on-site computer malfunctions, you will be able to get another machine up and running relatively quickly from the cloud. However, you need to ask questions about your cloud provider's built-in redundancy and data backup practices. Cloud providers also need to consider how they can ensure that their clients are protected in the event of a catastrophe. In the wake of the fire, OVH Cloud has announced that data backups will in future be free to all its customers, not a paid add-on.
On-site redundancy is good, but not sufficient
This lesson should have been learned from previous disasters such as Hurricane Sandy in the US. Then, backups kept on the other side of New York proved to be useless in recovering from the disaster. Only businesses whose cloud provider was out of state were able to get up and running again quickly. On-site redundancy is good, but it is not enough. Cloud providers need to build in redundancy across more than one site so that there is always a data backup. The OVH Cloud fire has reinforced the message that you cannot rely on being able to keep two out of four data centres on the site running at all times.
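For readers who want to check their own arrangements, the distinction between on-site, off-site and out-of-region copies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DataCopy` class, the `redundancy_level` function and the site and region labels are all hypothetical, and what counts as a separate "region" is a judgment for each business, not something taken from OVH Cloud's own systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of the data (hypothetical model for illustration)."""
    name: str
    site: str    # physical campus where the copy is stored
    region: str  # wider area that a single disaster could plausibly affect


def redundancy_level(primary: DataCopy, backups: list[DataCopy]) -> str:
    """Classify how far the best-placed backup is from the primary copy."""
    if any(b.region != primary.region for b in backups):
        return "off-region"    # survives a regional event such as a hurricane
    if any(b.site != primary.site for b in backups):
        return "off-site"      # survives the loss of one campus
    if backups:
        return "on-site only"  # lost if the whole site loses power or burns
    return "none"


# Hypothetical example: a backup held on the same campus as production.
primary = DataCopy("production", site="campus-a", region="region-1")
backups = [DataCopy("nightly", site="campus-a", region="region-1")]
print(redundancy_level(primary, backups))
```

In this made-up example the only backup shares a campus with production, so the result is "on-site only", which is the situation the Strasbourg fire showed to be insufficient.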
Having a disaster recovery plan is essential for all businesses, including cloud providers
OVH Cloud had a clear disaster recovery plan, and the company brought it into action straight away. OVH's founder, Octave Klaba, set out the priorities for action immediately, focusing on restoring services to customers via the unaffected data centres on the site where possible. However, one of those priorities was helping OVH Cloud customers to activate their own disaster recovery plans. In other words, the responsibility does not lie only with cloud providers. Those using cloud services also need their own plans to manage in a crisis. These might include, for example, a secondary service provider and separate data backup services.
Transparency and openness in disaster communications can take you a long way
OVH Cloud took immediate steps to publish information about the fire, and the actions it was taking to restore services to its customers. It was open about further disruption, including when smoke was detected in one of the data centres several days later, necessitating further investigation. OVH published daily updates on its own website. It also shared information via its social media channels, including Facebook, YouTube, Twitter and LinkedIn. Its founder also provided videos and information on his Twitter feed in both English and French. The sheer level of detail in the updates is impressive, and the general response from the press and industry commentators was positive.
No industry can afford to be complacent about risk
OVH Cloud's experience shows that it is unwise to be complacent about any potentially catastrophic risk, however infrequently it occurs. There are no common standards for suppressing fires in data centres around the world: they vary by jurisdiction. OVH has announced plans to establish a new fire-testing laboratory to explore how fires spread within different types of data centres, and how best to suppress them. Octave Klaba has also committed to sharing the lab's findings widely as a way to establish new industry standards on fire detection and suppression. The legacy of the OVH Cloud fire for the data centre industry, therefore, looks likely to be both long-lasting and positive.