Geo-HA Service Fabric Cluster
One of the biggest challenges that we face when we build an Internet-scale solution, is high availability across geographic locations (Geo-HA). Why is this important? Well, there can be a few different reasons. The most common reason, is to be able to survive data center outages. Another reason, is to bring services closer to end users so that we can provide good user experiences.
Geo-HA brings challenges to the table. For example, should we use an Active-Passive or Active-Active strategy for data across regions? Keeping in mind that Active-Active is difficult to get right, we need to take time to analyze and to make the correct choices. We need to consider our Disaster recovery (DR) plan, target RPO and RTO. Azure has a whole bunch of mechanisms for replication, backup and monitoring, so how do we decide what’s the right combination?
Today’s Internet-scale services are built using microservices. Service Fabric is a next-generation middleware platform used for building enterprise-class, Tier-1 services. This microservices platform allows us to build scalable, highly available, reliable, and easy to manage solutions. It addresses the significant challenges in developing and managing stateful services. The Reliable Actors API is one of two high-level frameworks provided by Service Fabric, and it is based on the Actor pattern. This API gives us an asynchronous, single-threaded programming model that simplifies our code while still providing the advantages of scalability and reliability guarantees offered by Service Fabric.
A Service Fabric cluster is HA within its geographic region by default. Thinking about our heritage of on premise data centers, we’ve poured thousands of man-hours to deploy Disaster Recovery sites in secondary physical locations, because we know that everything is possible. Over the past few years, we’ve experienced many interesting scenarios, for example, a cut cable, or a faulty DNS entry broke the Internet. So why should we do anything differently in the cloud? We must treat each region as we treat our own data centers and think about Geo-HA.
The rest of this post is about taking high availably to the next level by deploying a Geo-HA Service Fabric cluster. Continue Reading…