Thursday, September 29, 2016

Mastering Index – Introduction

One of the most interesting and talked about topic in database field is Index. I am writing this Series of article to cover all the different types of index available in SQL Server. I will be writing this series in two categories Index and Special Index. Please find the List of topics below, which I will covering –

Mastering Index
1.     Clustered Index
2.     Non Clustered Index
3.     Unique Index
4.     Index with included columns
5.     Index on computed columns
6.     Filtered Index
Special Index
1.     XML Index
2.     Full Text Index
3.     Spatial Index
4.     Columnstore Index

Let’s start with the basic questions like what is Index, why we need Index etc. I am sure most of you are aware of these answers, so feel free to move to next topic.

We all are familiar with the books and how to find a particular topic or lesson in it. Yes we use the index. Suppose I bought a new book say ‘SQL Server Internals 2012’. Today I want to study about Query Optimizer Architecture, how I will find it in the book. Certainly not by reading the book from beginning until I find the Query Optimizer Architecture. Since we have a very useful tool called Contents or Index. I will go to contents, check the chapter names and chapter 11 The Query Optimizer Looks relevant. Now checking the contents of Chapter 11, I can see Optimizer Architecture has page number 624. Once I have the page number it’s pretty easy to find it. Since my book has 983 pages, so I will flip the half book roughly, if I am on page 520 it means 624 is on right side. I will be repeating the same process until I find the page 624. Yes, database Index works exactly the same ways.   

Index implementation in a database is slightly different from a book. SQL Server uses the B-tree data structure to store the index or being more precise it uses B+ tree. If you are not aware about B-tree, I would recommend you to read about it for better understanding of Index. Here I am adding a simple diagram to give you an idea about B-Tree.






Recommended Book for Data Structure - Data Structures (SIE)


Let’s get back to our book example. Assume that I am looking for Union All syntax. If I use the Contents/Index from the beginning of the book, it’s very difficult, simply because Index is created for chapters and topics not for the key words. Now checkout the last few pages of your book. You can see another Index, but it is different. This index is created on keywords. Now searching Union All is very easy here, because keywords are grouped alphabetically. You go to the U section then easy search the Union All because it is sorted alphabetically. In my book Union All is 555 to 559 pages. You can notice the difference between the Indexes, First on shows the physical order of the information in the book but the last one contains the location of the information. Congratulations, you have learned Clustered and Non Clustered Index.
When we create a clustered index on a table, it physically arranges the records as per the index. Since we can arrange information in only a single physical order, therefore we cannot create multiple clustered indexes. Good news is we can create many non clustered indexes, SQL Server allows you to create 999 on a single table.

Now we know, indexes help in finding information faster or in database language it makes queries faster. Certainly indexes help in faster execution of the queries but it doesn’t mean we can create many index blindly. Index consumes resources like disk space and CPU in case of data insertion, deletion or update.

Assume that our book has 1000 page and contains 700 pages of different indexes and only 300 pages of information. This is called over indexing and we need to be careful while creating index. Think about your data, kind of queries you will be running on them and index maintenance cost before creating index. Here are key points to remember about Index in SQL Server –
1.     Primary Key is by default is a Clustered Index
2.     We can create only one Clustered Index on a table
3.     Clustered Index physically arrange the data
4.     We can create 999 clustered index on a table (actual number depends on your SQL Server version too)
5.     Index have maintenance cost and consumes disk space and CPU
6.     Indexing improves the query performance but over indexing has its own flaws
7.     Table without any Index is called Heap

We will be discussing about individual Index type in details in my next articles.
Stay Tuned……
Thanks

Recommended Book for this series - Microsoft SQL Server 2012 Internals (Microsoft Press)

Related Articles-