I will be writing this series of articles for the Special Indexes in SQL Server. In this series we will cover following Index –
1. XML Index
2. Full Text Index
3. Spatial Index
4. Columnstore Index
Prerequisites
1. If you are reading this article, I am assuming that you know Index basics like type of Index, how it works and their benefits.
2. You are aware of XML data type in SQL Server.
You might have guess from the name, XML Indexes can be created on XML type columns which are stored as Large Binary Objects (BLOBs). When we create XML Index, it Index path, tags and values of XML column and improve query performance. When we do not have XML Index and our query is fetching data from XML data type, in such scenarios SQL Server parses the XML during execution, which results in poor query performance due to extra over head of parsing XML and applying the predicates. Usually XML values are large and retrieved values are comparatively small, if we create XML Index, it can skip the parsing xml data during execution and can improve performance by using Index for searching which results in better query performance. XML Index may sound different from traditional Indexes (clustered and Nonclustered Index) but this is implemented in similar way by using internal tables and B- Tree structures. We cannot query internal tables used for XML Index.
As other Indexes, XML Index also has maintains cost. So we need to consider this before creating any XML Index. We will be discussing this in details later in the article.
There are two kind of XML Index
1. Primary XML Index
2. Secondary XML Index
If a table has more than one XML type column, we can create one Primary XML Index and multiple Secondary XML Indexes for each XML column. We can create up to 249 XML Indexes and 999 Nonclustered Indexes.
Note – The first Index on an XML type column has to be Primary XML Index. Secondary XML Index can be created on Primary XML Index. Please remember that XML is case sensitive even though your SQL Server instance is not.
Primary XML Index
As mentioned above, XML Index on a XML type column are implemented by internal tables and B- Tree structures. These internal tables are called Node tables. Primary XML Index is a clustered Index on a node table, but for creating a Primary XML Index table must have a primary key which is implemented as clustered Index. This primary key can have up to 15 columns and used by SQL Server to lookup the records. If you want to see the columns in Node table for a given Primary XML Index, you can run the below script –
Let’s create our first Primary XML Index -
Here you provide the name for you Index and use add button to select the column on which you want to create the Index. In this case it is CatalogDescription.
PATH Secondary XML Index
VALUE Secondary XML Index
PROPERTY Secondary Index
As you can see in the output of below script, top row (marked in green), shows the internal table xml_Index_nodes_663673412_256000, which is created by SQL Server for our XML Index on table ProductModel_TestXMLIndex.
1. XML Index
2. Full Text Index
3. Spatial Index
4. Columnstore Index
Prerequisites
1. If you are reading this article, I am assuming that you know Index basics like type of Index, how it works and their benefits.
2. You are aware of XML data type in SQL Server.
You might have guess from the name, XML Indexes can be created on XML type columns which are stored as Large Binary Objects (BLOBs). When we create XML Index, it Index path, tags and values of XML column and improve query performance. When we do not have XML Index and our query is fetching data from XML data type, in such scenarios SQL Server parses the XML during execution, which results in poor query performance due to extra over head of parsing XML and applying the predicates. Usually XML values are large and retrieved values are comparatively small, if we create XML Index, it can skip the parsing xml data during execution and can improve performance by using Index for searching which results in better query performance. XML Index may sound different from traditional Indexes (clustered and Nonclustered Index) but this is implemented in similar way by using internal tables and B- Tree structures. We cannot query internal tables used for XML Index.
As other Indexes, XML Index also has maintains cost. So we need to consider this before creating any XML Index. We will be discussing this in details later in the article.
There are two kind of XML Index
1. Primary XML Index
2. Secondary XML Index
If a table has more than one XML type column, we can create one Primary XML Index and multiple Secondary XML Indexes for each XML column. We can create up to 249 XML Indexes and 999 Nonclustered Indexes.
Note – The first Index on an XML type column has to be Primary XML Index. Secondary XML Index can be created on Primary XML Index. Please remember that XML is case sensitive even though your SQL Server instance is not.
Primary XML Index
As mentioned above, XML Index on a XML type column are implemented by internal tables and B- Tree structures. These internal tables are called Node tables. Primary XML Index is a clustered Index on a node table, but for creating a Primary XML Index table must have a primary key which is implemented as clustered Index. This primary key can have up to 15 columns and used by SQL Server to lookup the records. If you want to see the columns in Node table for a given Primary XML Index, you can run the below script –
SELECT
I.name as 'IndexName',
c.name as 'NodeTableColumnName'
FROM sys.columns c
JOIN sys.Indexes i ON i.object_id = c.object_id
WHERE i.name = 'PXML_ProductModel_CatalogDescription_Test'
AND i.type = 1;
I got the following output for my SQL Server instance.
IndexName
|
NodeTableColumnName
|
PXML_ProductModel_CatalogDescription_Test
|
id
|
PXML_ProductModel_CatalogDescription_Test
|
nid
|
PXML_ProductModel_CatalogDescription_Test
|
tagname
|
PXML_ProductModel_CatalogDescription_Test
|
taguri
|
PXML_ProductModel_CatalogDescription_Test
|
tid
|
PXML_ProductModel_CatalogDescription_Test
|
value
|
PXML_ProductModel_CatalogDescription_Test
|
lvalue
|
PXML_ProductModel_CatalogDescription_Test
|
lvaluebin
|
PXML_ProductModel_CatalogDescription_Test
|
hid
|
PXML_ProductModel_CatalogDescription_Test
|
xsinil
|
PXML_ProductModel_CatalogDescription_Test
|
xsitype
|
PXML_ProductModel_CatalogDescription_Test
|
pk1
|
Let’s create our first Primary XML Index -
Here we will be creating a copy of [Production].[ProductModel] table from AdventureWorks2014 database. We are not working on original table because if we messed up, we have the original table and we can restore our test table from it. Let’s call it [Production].[ProductModel_TestXMLIndex], you can directly run the following script to create the table. This table has two XML type columns, CatalogDescription and Instructions.
-- rename all constraints and Index
CREATE TABLE [Production].[ProductModel_TestXMLIndex](
[ProductModelID]
[int] IDENTITY(1,1) NOT NULL,
[Name]
[dbo].[Name] NOT
NULL,
[CatalogDescription]
[xml](CONTENT [Production].[ProductDescriptionSchemaCollection]) NULL,
[Instructions]
[xml](CONTENT [Production].[ManuInstructionsSchemaCollection]) NULL,
[rowguid]
[uniqueidentifier] ROWGUIDCOL NOT NULL CONSTRAINT
[DF_ProductModel_rowguid_test] DEFAULT (newid()),
[ModifiedDate]
[datetime] NOT NULL
CONSTRAINT
[DF_ProductModel_ModifiedDate_test] DEFAULT (getdate()),
CONSTRAINT [PK_ProductModel_ProductModelID_test] PRIMARY KEY CLUSTERED
(
[ProductModelID]
ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE =
OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
Go
Once the table is ready, please make sure we do not have any XML Index on it. You can check it in Management Studio under table properties. Now we know there is no XML Index on our test table but before we run any query, we need data in our table. I am using our original table and loading data into our test table. Please run the same query multiple times so we have enough amounts of data to see any performance improvement. I know this is not the best way to create the data but here we are just concern about volume of data not the quality.
-- Run this script for 20 to 30 times
Insert into [Production].[ProductModel_TestXMLIndex]
Select
T.Name,
T.CatalogDescription,
T.Instructions,
T.rowguid,
T.ModifiedDate
from [Production].[ProductModel]
T
Go
Now our test table is ready with data. Let’s run a simple query and capture the query execution plan.
DBCC FREEPROCCACHE
Go
SELECT CatalogDescription.value('
declare namespace
PD="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelDescription";
(/PD:ProductDescription/@ProductModelID)[1]', 'int') AS Result
FROM [Production].[ ProductModel_TestXMLIndex]
WHERE CatalogDescription IS NOT NULL
ORDER BY Result DESC
Go
When I ran the above script in my environment, I got the following execution plan. As you can see in screen shot below, cost of Table Valued function (XML Reader with XPath filter) is 98%. You can cross check in blue rectangle that there is no XML Index, we just have a primary key.
Now let’s see how XML Index can improve the performance. Run the below script to create the XML Index or you can create it from Management studio GUI.
CREATE PRIMARY XML INDEX [PXML_ProductModel_CatalogDescription_Test]
ON [Production].[ProductModel_TestXMLIndex]
(
[CatalogDescription]
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE =
OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
Go to our test table then, right click the Index, there you can find the option of New Index. Select Primary XML Index. Then you get a pop up window.
Here you provide the name for you Index and use add button to select the column on which you want to create the Index. In this case it is CatalogDescription.
Now let’s run our same script. In my environment, I got the following output (please see the below screen shot). Now you can see that we have Primary XML Index PXML_ProductModel_CatalogDescription_Test
As you can see in red rectangle cost is now 58% because now query optimizer is using the Primary XML Index.
Secondary XML Indexes
Once we have created the Primary XML Index, we can create Secondary XML Index on XML column. Secondary XML Indexes are Nonclustered Index on node table or internal table and these are completely optional. There are three types of Secondary XML Index (Path, Value and Property) and we can create them in any order. These help in certain types of XQuery execution.
Let’s create a Secondary XML Index for Path (find the example below). In this case Path Secondary XML Index is a Nonclustered Index on the HID and Value column of the Node table. Similarly Value Index is Nonclustered Index on Value and HID column and Property Index is a Nonclustered Index on pk, HID and Value column of the Node table. You can find the example for creating the all three types of Secondary XML Index.
PATH Secondary XML Index
CREATE XML INDEX [SecondaryXmlIndexPath]
ON [Production].[ProductModel_TestXMLIndex]
(
[CatalogDescription]
)
USING XML INDEX [PXML_ProductModel_CatalogDescription_Test]
FOR PATH WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE =
OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
VALUE Secondary XML Index
CREATE XML INDEX [SecondaryXmlIndexValue]
ON [Production].[ProductModel_TestXMLIndex]
(
[CatalogDescription]
)
USING XML INDEX [PXML_ProductModel_CatalogDescription_Test]
FOR VALUE WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE
= OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
PROPERTY Secondary Index
CREATE XML INDEX [SecondaryXmlIndexProperty]
ON [Production].[ProductModel_TestXMLIndex]
(
[CatalogDescription]
)
USING XML INDEX [PXML_ProductModel_CatalogDescription_Test]
FOR PROPERTY WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE
= OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
We can run simple scripts against DMVs and can pull XML Index list in our
database.
select
O.name as 'TableName',
I.name as 'IndexName',
I.type_desc as 'IndexType'
from Sys.Indexes I
Inner Join Sys.Objects O on I.object_id=O.object_id
where I.type_desc like 'XML'
Or
select
O.name as 'TableName',
I.name as 'IndexName',
I.type_desc as 'IndexType'
from Sys.xml_Indexes I
Inner Join Sys.Objects O on I.object_id=O.object_id
TableName
|
IndexName
|
IndexType
|
ProductModel
|
PXML_ProductModel_CatalogDescription
|
XML
|
ProductModel
|
PXML_ProductModel_Instructions
|
XML
|
Store
|
PXML_Store_Demographics
|
XML
|
ProductModel_TestXMLIndex
|
PXML_ProductModel_CatalogDescription_Test
|
XML
|
ProductModel_TestXMLIndex
|
SecondaryXmlIndexPath
|
XML
|
ProductModel_TestXMLIndex
|
SecondaryXmlIndexProperty
|
XML
|
ProductModel_TestXMLIndex
|
SecondaryXmlIndexValue
|
XML
|
Person
|
PXML_Person_AddContact
|
XML
|
Person
|
PXML_Person_Demographics
|
XML
|
Person
|
XMLPATH_Person_Demographics
|
XML
|
Person
|
XMLPROPERTY_Person_Demographics
|
XML
|
Person
|
XMLVALUE_Person_Demographics
|
XML
|
As you can see in the output of below script, top row (marked in green), shows the internal table xml_Index_nodes_663673412_256000, which is created by SQL Server for our XML Index on table ProductModel_TestXMLIndex.
SELECT
T.name,
T.object_id,
T.schema_id,
T.type_desc,
T.create_date,
T.modify_date,
T.internal_type_desc,
T.is_ms_shipped
FROM sys.internal_tables T
Order by T.modify_date Desc
name
|
object_id
|
schema_id
|
type_desc
|
create_date
|
modify_date
|
internal_type_desc
|
is_ms_shipped
|
xml_Index_nodes_663673412_256000
|
791673868
|
4
|
INTERNAL_TABLE
|
9/28/16 15:31
|
9/28/16 15:39
|
XML_INDEX_NODES
|
0
|
ifts_comp_fragment_1589580701_5
|
535672956
|
4
|
INTERNAL_TABLE
|
7/17/14 16:11
|
7/17/14 16:11
|
FULLTEXT_COMP_FRAGMENT
|
0
|
ifts_comp_fragment_1077578877_3
|
519672899
|
4
|
INTERNAL_TABLE
|
7/17/14 16:11
|
7/17/14 16:11
|
FULLTEXT_COMP_FRAGMENT
|
0
|
ifts_comp_fragment_610101214_4
|
503672842
|
4
|
INTERNAL_TABLE
|
7/17/14 16:11
|
7/17/14 16:11
|
FULLTEXT_COMP_FRAGMENT
|
0
|
fulltext_docidfilter_1077578877
|
251147940
|
4
|
INTERNAL_TABLE
|
7/17/14 16:11
|
7/17/14 16:11
|
FULLTEXT_DOCID_FILTER
|
0
|
fulltext_Indexeddocid_1077578877
|
267147997
|
4
|
INTERNAL_TABLE
|
7/17/14 16:11
|
7/17/14 16:11
|
FULLTEXT_INDEXED_DOCID
|
0
|
fulltext_avdl_1077578877
|
283148054
|
4
|
INTERNAL_TABLE
|
7/17/14 16:11
|
7/17/14 16:11
|
FULLTEXT_AVDL
|
0
|
fulltext_Index_docidmap_1077578877
|
219147826
|
4
|
INTERNAL_TABLE
|
7/17/14 16:11
|
7/17/14 16:11
|
FULLTEXT_DOCID_MAP
|
0
|
fulltext_Index_docidstatus_1077578877
|
235147883
|
4
|
INTERNAL_TABLE
|
7/17/14 16:11
|
7/17/14 16:11
|
FULLTEXT_DOCID_STATUS
|
0
|
fulltext_Index_docidstatus_1589580701
|
155147598
|
4
|
INTERNAL_TABLE
|
7/17/14 16:11
|
7/17/14 16:11
|
FULLTEXT_DOCID_STATUS
|
0
|
fulltext_docidfilter_1589580701
|
171147655
|
4
|
INTERNAL_TABLE
|
7/17/14 16:11
|
7/17/14 16:11
|
FULLTEXT_DOCID_FILTER
|
0
|
fulltext_Indexeddocid_1589580701
|
187147712
|
4
|
INTERNAL_TABLE
|
7/17/14 16:11
|
7/17/14 16:11
|
FULLTEXT_INDEXED_DOCID
|
0
|
Below script shows the
list of column in node table for Primary XML Index.
-- list the columns in primary XML Index
in the internal node table
SELECT *
FROM sys.columns c
INNER JOIN sys.Indexes i ON i.object_id = c.object_id
WHERE i.name = 'PXML_ProductModel_CatalogDescription_Test'
AND i.type = 1;