Case study: How Azure downtime triggered a parametric insurance policy

A major Microsoft Azure downtime event on April 1, 2021 directly affected a Parametrix customer with mission-critical systems in Azure’s East US regions.

Microsoft Azure had a major downtime event on April 1, 2021, which affected its East US and Central US regions. Companies running their production systems on the affected services in these regions had no service availability for nearly two hours.

This extended outage directly affected a Parametrix customer with mission-critical systems on Azure’s East US regions. This technology company, which suffered financial, reputational and productivity damages, received the monetary payout triggered by the outage in just six (6) days!


Why did the Azure downtime happen?

On April 1, starting around 9 PM GMT and ending nearly two hours later, the Parametrix Monitoring System (PMS) indicated that Microsoft Azure was experiencing downtime due to DNS server errors, which affected the Azure VM (Virtual Machines) and Azure SQL services.

Azure’s DNS servers provide domain name resolution using the Microsoft Azure infrastructure. This service is provided natively as part of the cloud as a network protocol, and multiple Azure services rely on these DNS servers.
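
To illustrate why a DNS failure cascades to the services that depend on it, the sketch below checks whether a hostname resolves at all. If resolution fails, clients cannot reach the service behind that name, however healthy the service itself may be. This is a minimal Python sketch and not the Parametrix Monitoring System: the function name `can_resolve` and the injectable `resolver` parameter are assumptions made for illustration.

```python
import socket
from typing import Callable


def can_resolve(hostname: str, resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if the hostname resolves to at least one address.

    `resolver` defaults to the standard library's socket.getaddrinfo; it is
    injectable (a hypothetical design choice) so the check can be exercised
    without network access.
    """
    try:
        return len(resolver(hostname, 443)) > 0
    except socket.gaierror:
        # Name resolution failed: the service is unreachable by name.
        return False


if __name__ == "__main__":
    # Placeholder hostname; substitute an endpoint your systems depend on.
    print(can_resolve("example.com"))
```

A real monitor would run a check like this repeatedly from several vantage points, since a single failed lookup can also be a local network problem.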

Both the VM computational service and SQL database service were impacted by these errors. Azure customers running their production systems on one of these services in either the East US or Central US regions had no service availability for the duration of the downtime incident.

What did our monitoring system detect?

The Parametrix Monitoring System identified the outage as soon as it began and recorded the Azure VM and Azure SQL services that were affected in the East US and Central US regions. The deep granularity of the system allows it to capture exactly which services were down during the entire event, and to specify exactly what didn’t work on each service.

The system detected a "Service Unavailable" status on Azure VM instances in the East US region, which had a 100% error rate, and in the Central US region, which had a 50% error rate. The Azure SQL service outage only impacted management operations and not instance operations or running operations.
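
The per-region error rates above can be thought of as the share of failed checks out of all checks for a given service and region. The following Python sketch shows that arithmetic under stated assumptions: the `Probe` record and the `error_rates` function are hypothetical names, and the sample probes are invented to mirror the reported 100% and 50% figures. They are not actual PMS data, and this is not how the PMS is implemented.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Probe:
    """One health check result (hypothetical record shape)."""
    service: str
    region: str
    ok: bool


def error_rates(probes: Iterable[Probe]) -> Dict[Tuple[str, str], float]:
    """Return the fraction of failed probes per (service, region) pair."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for probe in probes:
        key = (probe.service, probe.region)
        totals[key] += 1
        if not probe.ok:
            failures[key] += 1
    return {key: failures[key] / totals[key] for key in totals}


if __name__ == "__main__":
    # Illustrative sample only: every East US probe fails, half of Central US fail.
    sample = (
        [Probe("Azure VM", "East US", False)] * 4
        + [Probe("Azure VM", "Central US", False)] * 2
        + [Probe("Azure VM", "Central US", True)] * 2
    )
    for (service, region), rate in error_rates(sample).items():
        print(f"{service} / {region}: {rate:.0%}")
```

Keying the counts by both service and region is what allows a report to separate a full outage in one region from a partial one in another.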

What did our customer experience?

Our customer validated the serious effects of the downtime incident on their operations and finances. Their mission-critical systems, which reside on Azure’s East US regions, went down for nearly two hours. They started receiving customer support tickets almost immediately, requiring them to mobilize many of their support and technical team members in order to address these issues.

The downtime incident was damaging not only to their employee productivity, but also to their reputation with customers whose operations were directly impacted. Just six (6!) days later, Parametrix covered their financial damage through the monetary payout which was triggered by the outage.
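
In general terms, a parametric policy pays out when a measured parameter crosses a pre-agreed threshold, rather than after a loss assessment, which is what makes a fast payout possible. The Python sketch below shows that idea for downtime duration. Everything specific in it is an assumption: the 60-minute threshold, the function name `is_triggered`, and the 110-minute duration standing in for "nearly two hours" are illustrative and do not reflect this customer's actual policy terms or the exact outage timestamps.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy term: minimum continuous downtime that triggers a payout.
ASSUMED_THRESHOLD = timedelta(minutes=60)


def is_triggered(
    outage_start: datetime,
    outage_end: datetime,
    threshold: timedelta = ASSUMED_THRESHOLD,
) -> bool:
    """Return True if the measured outage lasted at least `threshold`."""
    return (outage_end - outage_start) >= threshold


if __name__ == "__main__":
    # Start time follows the article (around 9 PM GMT on April 1, 2021).
    start = datetime(2021, 4, 1, 21, 0, tzinfo=timezone.utc)
    # Illustrative stand-in for "nearly two hours"; not a measured value.
    end = start + timedelta(minutes=110)
    print(is_triggered(start, end))
```

Because the decision depends only on monitored timestamps and a fixed threshold, it can be evaluated as soon as the outage data is confirmed.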

Downtime Happens.


Are you covered for damages?


Contact us to learn more.

Yonatan Hatzor
A successful entrepreneur, Yonatan co-founded Parametrix in 2018 based on his realization that cloud downtime was a growing, unaddressed risk for businesses.
Published: April 19, 2021
Category: Blog