Building Clustered Indexes on Non-Primary Key Columns in Azure SQL Database

space-star-cluster-1920x1200

Azure SQL Database doesn’t support Heap Tables. Thus creating tables in Azure SQL Database requires Clustered Indexes. By default Azure SQL Database creates the Clustered Index on the table’s Primary Key.

CREATE TABLE [dbo].[TestTable] (
    [Id]        UNIQUEIDENTIFIER NOT NULL,
    [FirstName] NVARCHAR (10)    NOT NULL,
    [LastName]  NVARCHAR (10)    NOT NULL,
    [Type]      INT              NOT NULL,
    [Timestamp] ROWVERSION       NOT NULL,
    PRIMARY KEY (Id));

Primary Keys often make good Clustered Indexes because data is regularly accessed through them. But there are times when the Primary Key just doesn’t cut it. Fortunately, you can define a Clustered Index on other columns.

CREATE TABLE [dbo].[TestTable] (
    [Id]        UNIQUEIDENTIFIER NOT NULL,
    [FirstName] NVARCHAR (10)    NOT NULL,
    [LastName]  NVARCHAR (10)    NOT NULL,
    [Type]      INT              NOT NULL,
    [Timestamp] ROWVERSION       NOT NULL,
    PRIMARY KEY NONCLUSTERED (Id));
CREATE CLUSTERED INDEX IX_TestTable on TestTable (FirstName, LastName);

Changing a clustered index in Azure SQL Database is possible but it comes with some constraints. Failing to satisfy the existing constraints enforce by the index will result in “The new index definition does not match the constraint being enforced by the existing index”.

CREATE CLUSTERED INDEX IX_TestTable
ON TestTable ([Type],[FirstName],[LastName])
WITH (DROP_EXISTING=ON)
ON [Primary]

Things to keep in mind when creating Indexes

Focus on building narrow clustered indexes by choosing columns and their data types wisely. For the best performance, keep the overall byte size of the index as small as possible.
Non Clustered Indexes use the clustered index key as a row key
Columns who get updated frequently shouldn’t be part of the Index.
The physical table data is sorted based on the clustered index columns.
Support the most commonly used access path to the data because there can only be one Clustered Index per table.

Tips and Tricks for Azure SQL Database

First of all, don’t forget about index maintenance on Azure SQL Database! Then think long and hard about your Clustered Indexes. Many people suggest that GUIDs (uniqueidentifiers) are bad candidates for clustering keys given they will not be ordered and cause page splits, causing higher latencies and fragmentation? This is not true Azure SQL Database! Well at least not to the degree that you experience in an on-premise SQL Server. Azure SQL Database dbs give you 3 replicas and that means the characteristics of writes are very different compared to a single SQL DB without High Availability (HA).

In Azure SQL Database the writes have to be confirmed by 2 out of the 3 copies thus are always network level writes… A network write today is much slower in latency compared to what a page split can cause. But you shouldn’t forget that there are impacts of doing more work with GUIDs (uniqueidentifiers) [More…] and that they probably don’t make the best Clustered Index keys either!

In a blog post “Uniqueidentifier and Clustered Indexes” the author points out that the easiest and best approach to creating a Clustered Index is to use a datetime2 column. This insures that the data for the new row is inserted at the end of the table data. Consequently, there is no need for rearranging of the cluster. Adding the data to the end ensures the best performance for inserts. So to sum things up, if you’re using Guids, as I often do for so many reasons, your Primary Key probably shouldn’t be your Clustered Index.

datetime2 has a resolution of 1/1000000 of a second

Find Out More About Indexes

6 responses to Building Clustered Indexes on Non-Primary Key Columns in Azure SQL Database

changsul October 1, 2013 at 1:18 AM

Reblogged this on ChangSu's tech blog.

LikeLike

Sameer Abdul February 5, 2015 at 5:09 PM

In this piece of code you have mentioned:
CREATE CLUSTERED INDEX IX_TestTable
ON TestTable ([Type],[FirstName],[LastName])
WITH (DROP_EXISTING=ON)
ON [Primary]

You have used “On [PRIMARY]”, but I believe this is not supported in SQL Azure. It’s not working for me. Is this not the case for you?

LikeLike

- Alexandre Brisebois February 6, 2015 at 11:57 AM
  
  This used to work, do you have a de mo script I could use to repro?
  
  LikeLike