Steps Towards Optimized Windows Azure SQL Databases
November 30, 2013 6 Comments
Windows Azure SQL Database is a high density multi-tenant database service, whose performance is not guaranteed due to its nature. Thus, it’s imperative that we pay attention to details. This post is all about reviewing the structure and performance of existing Windows Azure SQL Databases by giving you insights into what you should be looking for.
Prior to making any changes to your application, you should look into taking control of your database’s life cycle management by employing tools like the Microsoft SQL Server Data Tools. These tools allow you to target specific platforms like Windows Azure SQL Database, which is very handy because it allows you to identify unsupported features before you deploy. Furthermore, they provide IntelliSense and make it possible to manage database versions by storing scripts in source control solutions like Microsoft Team Foundation Server.
Once your database schema is secured and under control, it’s time to go through its tables to make sure that Clustered Indexes are defined properly. Failing to do so can cause headaches when it comes time for maintenance and modifications.
By default Windows Azure SQL Database creates Clustered Indexes on Primary Keys, which by design isn’t a bad thing since we usually access data by the primary key. On Windows Azure SQL Databases there are times when it can be more advantageous to define Clustered Indexes on non-Primary Key columns. For example, you may want to store your data in a particular order so that you don’t need to sort it when it is queried by your application. This aspect of Clustered Indexes is quite important because sort operations are quite costly.
Intentionally creating Clustered Indexes separate from Constraints and Primary Keys brings you the flexibility to change them without having to go through a complex drop and create dance.
By this time, your Clustered Indexes should be sorted out and it’s time to revise NonClustered Indexes. These indexes should be narrow, as they can be combined by SQL Database to satisfy your queries. You should also consider augmenting their selectivity by applying a where clause in order to exempt unnecessary rows. Paying attention to little details like filtering indexes also helps in cost reductions. It’s not uncommon to find that indexes make up more than 50% of the total database size.
By reducing the storage taken up by indexes, I was been able to go from a 40GB database ($105.90/month) to a 30GB database ($85.92/month) without sacrificing performance. This translates to saving $239.76/year. Keep in mind that these numbers are based on pricing available on November 30, 2013. Please see the Windows Azure Pricing Calculator for updated pricing.
NonClustered Indexes should be created for foreign key columns and for any column used in a where, a join, a merge or any clause that requires a lookup. I personally start by creating them for foreign keys. Then I observe how my application extracts data from the database schema and create indexes based on these observations.
Identifying missing indexes requires that your database is experiencing “normal” load. Otherwise your findings may be skewed unexpected queries. Furthermore, you should analyze and consider each of the missing indexes identified by Windows Azure SQL Database. Blindly creating these indexes may create overlap, resulting in indexes that require more space than is really need.
You should also consider rebuilding indexes on a regular basis. Fragmented indexes take up unnecessary space and can be quite slow. Setting up an automated job whose responsibility is to monitor and rebuild indexes is quite useful and can be done without too much work.
These are the first steps that I take when I need to optimize a Windows Azure SQL Database. Then I identify the top 20 slowest queries and try to convert table scan and index scan operations into index seeks. By now, if things are still gloomy, I start looking at denormalizing tables that are heavily read and normalizing tables that have a high volume of writes. Since some tables would ideally need to be highly normalized and denormalized at the same time, I have to ask the ultimate question “does this data really belong in an SQL Database?” In many circumstances, the answer is no. So I put my database on a diet and leverage an ecosystem of specialized cloud storage services to spread the load.
Data doesn’t absolutely need to be stored in Windows Azure SQL Database. Often moving the data out of Windows Azure SQL Database can dramatically improve the overall performance of your system. It can also allow for a greater level of parallelism by removing locks.
Have you optimized your Windows Azure SQL Database?
Looking for more resources?
- Take Control – Windows Azure SQL Database Application Life Cycle Management Made Easy!
- Building Clustered Indexes on Non-Primary Key Columns in Windows Azure SQL Database
- Creating NONCLUSTERED INDEXES on Massive Tables in Windows Azure SQL Database
- Identify Missing Indexes in Your Windows Azure SQL Database
- Don’t Forget About Index Maintenance on Windows Azure SQL Database
- Identifying the Top 20 Slowest Queries Being Executed Against a Windows Azure SQL Database
- Does Your Data Really Belong in Your SQL Database?
- Put Your Windows Azure SQL Database on a Diet!