Friday, September 23, 2016

Mastering Index - Filtered Index

Mastering Index - Filtered Index

This article is the part of series Mastering Index, in this particular article we will be exploring Filtered Index. Take a look at the content of this series – 

Mastering Index – Introduction
One of the most interesting and talked about topic in database field is Index. I am writing this Series of article to cover all the different types of Index available in SQL Server. I will be writing this series in two categories Index and Special Index. Please find the List of topics below, which I will covering –

Mastering Index
1.       Clustered Index
2.       Nonclustered Index
3.       Unique Index
4.       Index with included columns
5.       Index on computed columns
6.       Filtered Index

Special Index
1.       XML Index
2.       Full Text Index
3.       Spatial Index

4.       Columnstore Index

Filtered index is a type of non clustered index introduced in Sql Server 2008. As the name suggest filtered index is an optimized non clustered index with a filter criteria. It uses filter criteria to index a portion of rows which qualifies for the given predicate. So in filtered index we create index on well defined subset of records. Hence the resultant index will be smaller in size compared to full index so it’s performs better and searches will be faster. Filtered index are well suited for the queries that select records from a well defined subset of records. A well designed filtered index can enhance the query performance and reduce the maintenance and storage cost of an index.

So we can summarize the advantages of filtered index as follows –
Reduced maintenance cost – as I mentioned above we are creating index on only a subset of records so maintenance cost will be less as compared to full non clustered index. We can say cost depends on the fraction of data which qualifies for the given filters. So maintenance cost is proportional to the number of records which qualifies for the give predicate.     
Less storage cost - As we know index store size depends on the number of records and columns size which are part of index definitions. So in case of filtered index lesser the qualified number records, index size  will be smaller.   
Improve query performance – If optimizer can use a filtered index to generate an execution plan, our query execution will be faster for the simple reason, sql server is searching records in a subset or a part of table instead of full table search. It is important to write a query in such a way, so that it can use the filtered index.
When to Use Filtered Index?

Use filtered index for a well defined category or a column which has a large portion of values as NULL.
Suppose you have a customer table with Name, email and contact number. Most of the customers just provide the email address because they don’t want to get called from us. So in most of the records contact number is NULL. Let’s assume that our data processing uses column contact number, in this case it is a good idea to create filtered index on this call. So we will be processing only Non NULL values.

Other scenario could be, let’s assume we have another column in our customer table Country. Most of our reports and data processing is country wise, so in this case filtered index can be helpful and can improve the performance.

Creating Filtered Index and understanding how it works with example

CREATE NONCLUSTERED INDEX FilteredIndexName 
ON Table_Name (CloumnName1, CloumnName1) 
WHERE CloumnName3 IS NOT NULL


Let’s take an example – 
I am creating two Indexes on DimCustomer table of database AdventureWorksDW2014.
Please find the code below, you can also create the index from Management Studio UI(find the steps at the end of article).

CREATE NONCLUSTERED INDEX [NonClusteredIndex_AllTotalChildren] ON [dbo].[DimCustomer]
(
       [TotalChildren] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO


This script creates a non clustered index for all the values of TotalChildren column.

CREATE NONCLUSTERED INDEX [FilteredNonClusteredIndex_TotalChildren] ON [dbo].[DimCustomer]
(
       [TotalChildren] ASC
)
Where [TotalChildren]>0
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO


This script creates a non clustered filtered index on TotalChildren column where values are more than 0.
You can check the index and their property by using below script, which clearly shows filtered index is using less memory space by examine the UsedPageCount.

SELECT I.name [IndexName],i.type_desc [IndexType],
 PS.row_count [NumberOfRows],
 PS.used_page_count [UsedPageCount],
 PS.reserved_page_count [ReservedPageCount]
FROM sys.indexes I
 INNER JOIN sys.dm_db_partition_stats ps
  ON I.object_id = ps.object_id AND I.index_id = ps.index_id
WHERE I.object_id = OBJECT_ID('DimCustomer')





Now let’s test the performance.

First let’s drop the filtered index which we have created and now we have only  ‘NonClusteredIndex_AllTotalChildren’ index which is full  non clustered index. Before we run the script let’s clearer buffer and cache by using below DBCC commands –

DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;

Now run the script, I got the following output. Here you notice that query took 2 seconds to run completely.


Now this time we create the filtered index and clear the cache and buffer. When I ran the script, I got the following result. Here it took 1 second to complete the execution. One second may not sound impressive here, but I have very less number of records and if you see the it’s two times faster with filtered index.




Here I am adding steps to create filtered index from Management Studio , it is directly from Microsoft website.
You can create filtered index from Management Studio also –

1.    In Object Explorer, click the plus sign to expand the database that contains the table on which you want to create a filtered index.
2.    Click the plus sign to expand the Tables folder.
3.    Click the plus sign to expand the table on which you want to create a filtered index.
4.    Right-click the Indexes folder, point to New Index, and select Non-Clustered Index….
5.    In the New Index dialog box, on the General page, enter the name of the new index in the Index name box.
6.    Under Index key columns, click Add….
7.    In the Select Columns from table_name dialog box, select the check box or check boxes of the table column or columns to be added to the unique index.
8.    Click OK.
9.    On the Filter page, under Filter Expression, enter SQL expression that you’ll use to create the filtered index.
10.  Click OK.

We have some restriction, I am listing them below and it’s directly from Microsoft website

Limitations and Restrictions

·         You cannot create a filtered index on a view. However, the query optimizer can benefit from a filtered index defined on a table that is referenced in a view. The query optimizer considers a filtered index for a query that selects from a view if the query results will be correct.
·         Filtered indexes have the following advantages over indexed views:
o    Reduced index maintenance costs. For example, the query processor uses fewer CPU resources to update a filtered index than an indexed view.
o    Improved plan quality. For example, during query compilation, the query optimizer considers using a filtered index in more situations than the equivalent indexed view.
o    Online index rebuilds. You can rebuild filtered indexes while they are available for queries. Online index rebuilds are not supported for indexed views. For more information, see the REBUILD option for ALTER INDEX (Transact-SQL).
o    Non-unique indexes. Filtered indexes can be non-unique, whereas indexed views must be unique.
·         Filtered indexes are defined on one table and only support simple comparison operators. If you need a filter expression that references multiple tables or has complex logic, you should create a view.
·         A column in the filtered index expression does not need to be a key or included column in the filtered index definition if the filtered index expression is equivalent to the query predicate and the query does not return the column in the filtered index expression with the query results.
·         A column in the filtered index expression should be a key or included column in the filtered index definition if the query predicate uses the column in a comparison that is not equivalent to the filtered index expression.
·         A column in the filtered index expression should be a key or included column in the filtered index definition if the column is in the query result set.
·         The clustered index key of the table does not need to be a key or included column in the filtered index definition. The clustered index key is automatically included in all nonclustered indexes, including filtered indexes.
·         If the comparison operator specified in the filtered index expression of the filtered index results in an implicit or explicit data conversion, an error will occur if the conversion occurs on the left side of a comparison operator. A solution is to write the filtered index expression with the data conversion operator (CAST or CONVERT) on the right side of the comparison operator.
·         Review the required SET options for filtered index creation in CREATE INDEX (Transact-SQL) syntax