This
article is the part of series Mastering Index, in this particular article
we will be exploring Index on computed
columns. Take a look at the content of this series –
Mastering
Index – Introduction
One of
the most interesting and talked about topic in database field is Index. I am
writing this Series of article to cover all the different types of Index
available in SQL Server. I will be writing this series in two categories Index
and Special Index. Please find the List of topics below, which I will covering
–
Mastering
Index
1.
Clustered Index
2.
Nonclustered Index
3.
Unique Index
4.
Index with included columns
5.
Index on computed columns
6.
Filtered Index
Special Index
1.
XML Index
2.
Full Text Index
3.
Spatial Index
4.
Columnstore Index
If you
have read this series from the beginning, I am sure you have good understanding
of Index, Index types, Index cost and benefits. As the name suggest, we will be
exploring the Index on computed columns. SQL Server provides the option to
create Index on a dynamically calculated column. Computed columns are not being
stored and when you query them SQL Server calculated the value and returns it.
You can use PERSISTED option to store the calculated column value and
in this case it acts as a normal column.
We can also
create index on a computed column which is deterministic but not precise. In
this case we need to declare the column as PERSISTED, while creating the table
or Alter Table statement. So that SQL Server stores the computed values and updates
those when a column data is altered that is being used in the calculation of
computed column. PERSISTED option can change the performance of your queries.
If you are using PERSISTED option in this case SQL Server will be calculating
the values, storing it and updating the Index, so our inserts will be slower
but reads will be faster. If you are not using the PERSISTED, your inserts will
be faster because there will be con calculations and storing the values of
computed columns but on the other hand reads will be slower because column
values will be calculated on the fly.
Before we
create any index on computed column, we need to make sure our computed column
conforms to following criteria –
1.
Ownership requirements
2.
Determinism requirements
3.
Precision requirements
4.
Data type requirements
5.
SET option requirements
Ownership Requirements
Table and the
functions used in computed column should have the same owner.
Determinism Requirements
All the function referenced
in computed column should be precise and deterministic. All the columns which
are used in computed column should belong to the same table and should not pull
data from multiple rows or aggregate functions. Computed column logic or
expression should not access the user data or system data.
Precision Requirements
Computed column
expression should not have real or float data types.
Data
Type Requirements
Computed column expression should
not return data type test, ntext or image. Computed column can be index if
return data type is allowable as index key.
SET
Option Requirements
The connection on which the index is created,
and all connections trying INSERT, UPDATE, or DELETE statements that will
change values in the index, must have six SET options set to ON and one option
set to OFF. The optimizer ignores an index on a computed column for any SELECT
statement executed by a connection that does not have these same option
settings.
o
The NUMERIC_ROUNDABORT option must be set to
OFF, and the following options must be set to ON:
o
ANSI_NULLS
o
ANSI_PADDING
o
ANSI_WARNINGS
o
ARITHABORT
o
CONCAT_NULL_YIELDS_NULL
o
QUOTED_IDENTIFIER
Setting
ANSI_WARNINGS to ON implicitly sets ARITHABORT to ON when the database
compatibility level is set to 90 or higher.
Now let’s
create a table with computed column, you can run the below scrip and can create
table. Scrip also has the code to insert some dummy data. Please remember, I am
using AdventureWorks2014 database. As you can see in the below script we are
creating a column Total_Leave_Hours,
which is the sum of VacationHours and SickLeaveHours.
Begin
set NoCount on
Declare @Count int
Declare @TotalRec int
If exists (SELECT OBJECT_ID(N'[AdventureWorks2014].[HumanResources].[Employee_Test]'))
Begin
Drop table
[AdventureWorks2014].[HumanResources].[Employee_Test]
End
CREATE TABLE
[HumanResources].[Employee_Test](
[BusinessEntityID]
[int] IDENTITY(1,1) NOT NULL Primary Key,
[NationalIDNumber]
[nvarchar](15) NOT NULL,
[LoginID]
[nvarchar](256)
NOT NULL,
[OrganizationNode]
[hierarchyid] NULL,
[OrganizationLevel]
[int] NULL,
[JobTitle]
[nvarchar](50) NOT NULL,
[BirthDate]
[date] NOT NULL,
[MaritalStatus]
[nchar](1) NOT NULL,
[Gender]
[nchar](1) NOT NULL,
[HireDate]
[date] NOT NULL,
[SalariedFlag]
[dbo].[Flag] NOT
NULL DEFAULT ((1)),
[VacationHours]
[smallint] NOT NULL
DEFAULT ((0)),
[SickLeaveHours]
[smallint] NOT NULL
DEFAULT ((0)),
[CurrentFlag]
[dbo].[Flag] NOT
NULL DEFAULT ((1)),
[rowguid]
[uniqueidentifier] ROWGUIDCOL NOT NULL DEFAULT (newid()),
[ModifiedDate]
[datetime] NOT NULL
DEFAULT (getdate()),
[Total_Leave_Hours]
as [VacationHours]+[SickLeaveHours]
Persisted
) ON [PRIMARY]
Set @Count=0
Set @TotalRec=1000
WHILE (@TotalRec>@Count)
BEGIN -- While Begin
INSERT INTO
[HumanResources].[Employee_Test]
([NationalIDNumber]
,[LoginID]
,[OrganizationNode]
,[OrganizationLevel]
,[JobTitle]
,[BirthDate]
,[MaritalStatus]
,[Gender]
,[HireDate]
,[SalariedFlag]
,[VacationHours]
,[SickLeaveHours]
,[CurrentFlag]
,[rowguid]
,[ModifiedDate])
Select
[NationalIDNumber]
,[LoginID]
,[OrganizationNode]
,[OrganizationLevel]
,[JobTitle]
,[BirthDate]
,[MaritalStatus]
,[Gender]
,[HireDate]
,[SalariedFlag]
,[VacationHours]
,[SickLeaveHours]
,[CurrentFlag]
,[rowguid]
,[ModifiedDate]
From [HumanResources].[Employee]
set @Count=@Count+1
END
End
Once the
table is ready you can create the index on computed column Total_Leave_Hours from GUI or
script.
USE [AdventureWorks2014]
GO
SET ARITHABORT ON
SET CONCAT_NULL_YIELDS_NULL ON
SET QUOTED_IDENTIFIER ON
SET ANSI_NULLS ON
SET ANSI_PADDING ON
SET ANSI_WARNINGS ON
SET NUMERIC_ROUNDABORT OFF
GO
/****** Object: Index
[NonClusteredIndex-20161102-154039]
Script Date: 11/2/2016 3:41:47 PM ******/
CREATE NONCLUSTERED INDEX [NonClusteredIndex-20161102-154039] ON [HumanResources].[Employee_Test]
(
[Total_Leave_Hours]
ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE =
OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
Here we
are not going to evaluate the performance impact of Index on computed columns
but it has all the Pros and Cons of an Index. Impact of index depends on the
many factors and kind of queries you are going to write.
Important DMVs
sys.indexes
sys.index_columns
sys.dm_db_index_physical_stats function
sys.dm_db_database_page_allocations function
sys.dm_db_missing_index_details
sys.dm_db_missing_index_groups
Interview Questions
Q. Explain PERSISTED property of computed column?
Q. What are the prerequisites to create an index on computed column?Q. Can we create a clustered index on a computed column?
Q. Explain a scenario where index on a computed index can impact the performance?
Recommended Links -