Azure SQL Database doesn’t support Heap Tables. Thus creating tables in Azure SQL Database requires Clustered Indexes. By default Azure SQL Database creates the Clustered Index on the table’s Primary Key.
CREATE TABLE [dbo].[TestTable] ( [Id] UNIQUEIDENTIFIER NOT NULL, [FirstName] NVARCHAR (10) NOT NULL, [LastName] NVARCHAR (10) NOT NULL, [Type] INT NOT NULL, [Timestamp] ROWVERSION NOT NULL, PRIMARY KEY (Id));
Primary Keys often make good Clustered Indexes because data is regularly accessed through them. But there are times when the Primary Key just doesn’t cut it. Fortunately, you can define a Clustered Index on other columns.
CREATE TABLE [dbo].[TestTable] ( [Id] UNIQUEIDENTIFIER NOT NULL, [FirstName] NVARCHAR (10) NOT NULL, [LastName] NVARCHAR (10) NOT NULL, [Type] INT NOT NULL, [Timestamp] ROWVERSION NOT NULL, PRIMARY KEY NONCLUSTERED (Id)); CREATE CLUSTERED INDEX IX_TestTable on TestTable (FirstName, LastName);
Changing a clustered index in Azure SQL Database is possible but it comes with some constraints. Failing to satisfy the existing constraints enforce by the index will result in “The new index definition does not match the constraint being enforced by the existing index”.
CREATE CLUSTERED INDEX IX_TestTable ON TestTable ([Type],[FirstName],[LastName]) WITH (DROP_EXISTING=ON) ON [Primary]
Things to keep in mind when creating Indexes
- Focus on building narrow clustered indexes by choosing columns and their data types wisely. For the best performance, keep the overall byte size of the index as small as possible.
- Non Clustered Indexes use the clustered index key as a row key
- Columns who get updated frequently shouldn’t be part of the Index.
- The physical table data is sorted based on the clustered index columns.
- Support the most commonly used access path to the data because there can only be one Clustered Index per table.
Tips and Tricks for Azure SQL Database
First of all, don’t forget about index maintenance on Azure SQL Database! Then think long and hard about your Clustered Indexes. Many people suggest that GUIDs (uniqueidentifiers) are bad candidates for clustering keys given they will not be ordered and cause page splits, causing higher latencies and fragmentation? This is not true Azure SQL Database! Well at least not to the degree that you experience in an on-premise SQL Server. Azure SQL Database dbs give you 3 replicas and that means the characteristics of writes are very different compared to a single SQL DB without High Availability (HA).
In Azure SQL Database the writes have to be confirmed by 2 out of the 3 copies thus are always network level writes… A network write today is much slower in latency compared to what a page split can cause. But you shouldn’t forget that there are impacts of doing more work with GUIDs (uniqueidentifiers) [More…] and that they probably don’t make the best Clustered Index keys either!
In a blog post “Uniqueidentifier and Clustered Indexes” the author points out that the easiest and best approach to creating a Clustered Index is to use a datetime2 column. This insures that the data for the new row is inserted at the end of the table data. Consequently, there is no need for rearranging of the cluster. Adding the data to the end ensures the best performance for inserts. So to sum things up, if you’re using Guids, as I often do for so many reasons, your Primary Key probably shouldn’t be your Clustered Index.
datetime2 has a resolution of 1/1000000 of a second