What Does Your Cloud Disaster Recovery Plan Look Like?

April 1, 2013 — 1 Comment

I previously wrote about how important lessons usually don’t come easy… and that replication is important. A few weeks ago the Windows Azure platform was going through a rough patch and all SQL Database instances went offline for a few hours.

At that moment, I recommended the use of SQL Data Sync to keep an on-premise backup of your data.

Since then I took a few steps back and read more on the subject. I also setup my first cloud to on-premise SQL Data Sync group and I must say that I’m impressed!

What’s interesting about SQL Data Sync, is that a backup isn’t limited to the cloud. It provides bidirectional synchronization between two or more databases, which can be located on-premise or in datacenters.

Configuring SQL Data Sync doesn’t require any code! Everything is configured through the Windows Azure Management Portal and on your on-premise servers.

There are many different ways to backup your SQL Database instances and they all come at a price. The available options do not cover all the possible disasters on their own so a mix and match may be the solution you are looking for.

The scope of this post will be limited to SQL Data Sync. Future posts will explore other ways to backup your databases. If you are curious about creating an instant copy of your production database within the same SQL Database Server see my posts about creating a staging database instance from a production instance on Windows Azure SQL Database.

SQL Data Sync has a huge potential, as can be observed from the following diagram. Databases on Windows Azure are valuable assets because you can access them from literally anywhere on and off the planet!


The Reasons to Setup an SQL Data Sync Group are Numerous and Include

  • Creating an off-cloud fallback database — Creating an on-premise, nearly up to date, replica of the production database for backup and or fall back purposes. Pointing your cloud services to an on-premises SQL Server Database instance will incur extra bandwidth costs and will create additional latency for the end users. Keep in mind that a slow system is better than none at all. Being at a stand still without the capacity to provide services to your customers is a horrible place to be!
  • Keeping an offsite backup of on-premise databases — Creating backups of on-premise databases on the cloud can allow your company to recover from data loss due to physical damage to your company’s headquarters.
  • Maintaining backups in multiple physical locations — A master database can have multiple backups in multiple locations. For example, a sales SQL Database instance could have a backup located in your company’s headquarters, in a datacenter located on a different continent, in Amazon datacenters and in your city’s underground datacenter. All these replicas could be kept in sync using an SQL Data Sync group.
  • Reducing on-premise network latency — Creating an on-premise replica of the database to reduce network latency. Working on a local copy of the data will yield greater performances and will result in a better end user experience. Having this local copy synchronize with the cloud, so that worker roles and web roles may work with the data, opens doors to interesting possibilities.
  • Migrating data to the cloud — Copying large amounts data to Windows Azure SQL Database can be challenging. SQL Data Sync can be used to take data from your on-premise database and copy it to Windows Azure SQL Database without being throttled by the service.
  • Quality Assurance — QA teams can greatly benefit from an on-premise replica of the production database. There is no better way to certify that new versions of your software is production ready, than to test it against the real data.
  • To diagnose production issues — In most systems it can be tedious and dangerous to diagnose production issues on a production server. Creating an on-premise replica of your Windows Azure SQL Database can ensure that teams, who are working hard to diagnose and fix ongoing issues, do not incur additional issues for the end users.
  • Aggregating data from multiple sources — Data aggregation from multiple sources is extremely useful in scenarios where you are required to create reports.

Steps to Start Syncing Your Data With SQL Data Sync

Please note, that to reduce costs associated with data transfers its important to create the Sync Hub in the datacenter which contains the majority of the Sync Group databases.

More on SQL Data Sync

SQL Data Sync is built using the Microsoft Sync Framework, which is a proven framework when it comes to synchronization. This also means that, if you feel adventurous, you could build your own module to sync data between databases, doing so will require custom code.

SQL Data Sync groups are composed of a single hub SQL Database instance located on Windows Azure and one or more member databases, which can reside on-premise or in the cloud. When configuring a sync group it’s imperative to add tables in the right order, because member databases do not have foreign keys. Doing so will ensure that data changes will applied in the correct order.

Data changes are stored in batch files and are downloaded over HTTPS by the SQL Data Sync service. On Windows Azure these batch files are stored on Windows Azure Blob Storage Service until they merged into the hub database.

Believe it or not, naming your sync group properly is crucial. At first, you may only have a single sync group configured which works out pretty well. But things get tricky when you add more than one sync group, be sure that the name of your sync group communicates intent. It will drastically help in the long run by providing insight into what is being synced and in what direction.

There are three different sync directions. SQL Data Sync uses Bi-Directional by default.

  • Bi-Directional — Data is synchronized in both directions between the sync group hub and members
  • Sync to the Hub — Data is synchronized from the sync group members to the hub.
  • Sync from the Hub — Data is synchronized from the sync group hub to its members.

Multiple Sync Groups Can be Leveraged in a Disaster Recovery Plan

For example, imagine that we have a hub database which collects sales data from a company’s stores. The data is synchronized using the Sync to the Hub configuration. This prevents other stores from receiving data which isn’t theirs. Then a second sync group is put in place to sync products and prices from the hub database to the stores. The data is synchronized using the Sync from the Hub configuration. Then a third sync group is created, this sync group is put in place to sync customers between all the stores and the hub database. The data is synchronized using the Bi-Directional configuration, ensuring that if headquarters modifies a customer’s address, it will be replicated to all of the company’s stores. Then comes a fourth sync group, this time its goal is to create a full backup of the hub database. The data is synchronized using the Sync from the Hub configuration from the hub database to an undisclosed off-site location. This will ensure that in the event of a disaster, a full backup is available on and off the cloud. To further protect against data loss, this fourth sync group can be configured to push data to many member database.

Based on the previous example, it quickly becomes apparent that good sync group names crucial! Having many sync groups with drastically different responsibilities can rapidly become overwhelming.

Its also interesting to note that SQL Data Sync groups are not all or nothing. Individual tables can be selected and data can be filtered using row filters applied to table columns.

When we talk about data synchronization we can’t omit conflict resolution. SQL Data Sync has two strategies that can be configured to deal with conflicts.

  • The first is Hub Wins, where the first write to the hub is propagated to all member databases and all subsequent writes are ignored.
  • The second is Client Wins, where every change is written to the hub database overriding prior changes.The last write is then propagated out to all member databases. 

Data synchronization occurs by default every 30 minutes. But don’t worry, this can be changed. Data can be synchronized from a minimum of every 5 minutes to every month! You can also disable automatic synchronization all together. In this case, you can trigger synchronization through the Windows Azure management portal.

Putting it All Together

SQL Data Sync is one approach to setting up a disaster recovery plan for your Windows Azure SQL Database. As we have observed in the past, from Amazon and from Windows Azure, even though data is in at least 3 different physical locations at all times, it isn’t 100% safe. To be fair, your data is probably safer in the cloud than on-premise, but it doesn’t hurt to have an on-premise backup just in case.

Having an on-premise backup has its benefits. As mentioned above, your teams will have extra options available to them when they need them the most. When diagnosing production issues, nothing is better than production data.

Consequently, creating an off-cloud backup also empowers the production team to point your cloud services to an
on-premise SQL Server Database instance if your Windows Azure SQL Database ever goes offline. It will effectively  create additional latency for your end users, but keep in mind that a slow system is better than none at all.


One response to What Does Your Cloud Disaster Recovery Plan Look Like?


    I want in on that “Remote office” :)


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.