In this section, I would like to list down the considerations and guidelines when creating a clustered index key.
- It is always better to define Clustered Index on a column that is static ( values that do not change). You need to design the index in such a way that the fragmentation can be minimized. Whenever the value of the Clustered Index key is updated, it will move the row to a different place in the index chain and also it needs to update the Clustered Index both in clustered index key and Non-Clustered indexes. Hence, this is an expensive operation and is the reason for the cause of fragmentation.
- Clustered Index key should be narrow (as small as possible as it can). Wide Index key should be avoided. Having a wide key, will make the Non-Clustered Index’s(if present) leaf level to increase and there by increasing its size and introducing additional reads when scanning.
- It is important to define the Clustered Index key to be unique when the column doesn’t have duplicates. This will avoid the overhead of extra bytes to be stored in the Index key.
- If the column has duplicates, then an identity column can be made as right most column in the index key to make it unique.
- Identity columns, Sequence columns are good choice for Clustered Index Keys. Columns that are defined as unique identifiers and whose values are generated using NEWID() function introduce random I/O operations which proves to be expensive operation and leads to index fragmentation. It is better to define such columns with NEWSEQUENTIALID() function which will make the index key ever increasing because the data is always inserted at the end of the index thereby reducing page splits and fragmentation also. There is a problem with ever increasing index keys. If the key value goes outside of the histogram data of the index statistic, then the estimated number of rows will be reportedly wrong which leads to sub-optimal execution plan.
- When a table has only Non-Clustered index(es), it is better to have a Clustered Index in place in order to avoid expensive Key-Lookup operations in case a big table ( at least for the most frequently executed queries on SQL Server) . Although having a covering Non-clustered index is desirable, adding a clustered index will be an ideal solution to optimize the performance of the queries.