Microsoft Business Intelligence (Data Tools)|Nonclustered Index structure in SQL Server

Wednesday, February 15, 2017

Nonclustered Index structure in SQL Server

We already know that all indexes are organized on B-trees except ColumnStore indexes and always have a single root node. SQL Server always uses this root node as the starting point for traversing an Index. In an Index tree, all the index nodes above the leaf node include the root node also are known as Non-leaf nodes. The leaf node is the bottom level of the index structure and contains the key value index entry either reference rows of the data pages or the complete rows of the data pages.

Nonclustered indexes have the same B-Tree structure as clustered indexes with one significant difference. The leaf node of the nonclustered index contains key values not the actual data. These key values map to pointer or clustering keys that locate rows in the data pages.

How does Nonclustered index implement?

The implementation of the nonclustered index depends on whether the data pages of a table are managed as a Heap or as a clustered index. If nonclustered index is built in a heap then SQL Server uses pointers in the leaf node index pages that point a row in the data pages.

If we have a table with clustered index, SQLServer builds a nonclustered index on the top of the clustered index structure and SQL Server uses clustering keys in the leaf node index page of the nonclustered index to point the cluster index.

How does Nonclustered index navigate on a Heap?

To understand this, we could take an alphabet example where all the data on the Nonclustered index is built in a heap as given below:

In the above Heap based nonclustered index page, leaf node pages contain the pointers that point a row in the data pages also.

How does SQL Server navigate a nonclustered index?

To understand this, we are taking the above example and we need to find out the alphabet ‘M’ from our data table where nonclustered index is built on the first name data field and we are using below script to find out the result as given below:

SELECT FirstName, LastName

FROM dbo.DataTable

WHERE FirstName='M'

Root node Navigation

SQL Server starts with Root node of the nonclustered index and evaluated that ‘M’ is greater than or equal to the key value actor ‘A’ and the compression result is to true. Now, SQL server moves to next key value ‘M’ on the root node and do the same compression evaluation which is to true again. Now, SQL server moves to next key value ‘R’ on the root node and do the same compression evaluation which is false. In this case, SQL Server uses the previous key value ‘M’ on the root node and moves to intermediate node.

Intermediate node Navigation

After moving to intermediate node, SQL Server evaluated that ‘M’ is greater than or equal to the key value actor ‘M’ where the compression result is to true. Now, SQL server moves to next key value ‘N’ on the intermediate node and compression evaluation result is false. In this case, SQL Server uses the previous key value ‘M’ on the intermediate node and moves to Leaf node.

Leaf node Navigation

SQL Server reads through the leaf node until it finds ‘M’ and then uses the point value that allocated with the data page.

Data Page Navigation

SQL Server reads through the data page until it finds the key value is equal to ‘M’. SQL Server returns this row to the query processor. The pages in the data chain and the rows in them are ordered on the value of the nonclustered index key.

Conclusion

The implementation of the nonclustered index depends on whether the data pages of a table are managed as a Heap or as a Clustered index. The leaf node of the nonclustered index contains the key value not the actual data. These key values map to the pointer or clustering keys that locate in the data pages.

Index structure in SQL Server

Clustered Index structure in SQL Server

How does SQL Server navigate a clustered index?

References: https://technet.microsoft.com/en-us/library/ms177443(v=sql.105).aspx

Mukesh Singh

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

1 comment:

UnknownNov 25, 2020, 3:13:00 AM
thanx, god
ReplyDelete
Replies

Add comment

Wednesday, February 15, 2017

Nonclustered Index structure in SQL Server

1 comment:

Popular Posts