We know that Indexing is the central key
for every types of the data management. They are easily applicable in relational
tables or even files in a tree of directories (a file system actually resembles
an index in many ways).
The XML indexes have been introduced with
SQL Server 2005 and they are using in the most of the organizations because of the
ubiquity and readability of the XML format. XML instances are stored in xml type
columns as the large binary objects (BLOBs) and they can be too large up to 2
GB.
Absence of an XML index, these binary
large objects is shredded at run time to evaluate a query. This shredding can
be time-consuming. This run-time shredding can be costly, depending on the size
and number of instances stored in the column.
As per the name indication, XML Indexes
are applicable on the columns of XML data type. XML Indexes will be applicable
on the specific table; the table must have a clustered
index on the primary key column.
Types
of XML Indexes
XML indexes fall into the following
categories:
- Primary XML index
- Secondary XML index
The first index on the XML type
column must be the primary XML index. Using the primary XML index, the
following types of secondary indexes are supported: PATH, VALUE, and PROPERTY.
Depending on the type of queries, these secondary indexes might help improve
query performance.
Primary
XML indexes
The primary XML
index is a shredded and persisted representation of the XML BLOBs in the xml data
type column. For each XML binary large object (BLOB) in the column, the index
creates several rows of data. The number of rows in the index is approximately
equal to the number of nodes in the XML binary large object.
Just like a table can only have one
clustered index, an XML column can have only one primary XML index. They index
all tags, values and paths over the XML instances in the column and benefit
query performance.
Queries within XML instances use the
primary XML index, and can return scalar values or XML subtrees by using the
index itself.
Each row stores the following node
information:
- Tag name such as an element or attribute name.
- Node value.
- Node type such as an element node, attribute node, or text node.
- Document order information, represented by an internal node identifier.
- Path from each node to the root of the XML tree. This column is searched for path expressions in the query.
- Primary key of the base table. The primary key of the base table is duplicated in the primary XML index for a back join with the base table, and the maximum number of columns in the primary key of the base table is limited to 15.
Creating a primary XML index is just as
straight-forward as creating a regular single-column index on a table.
USE [AdventureWorks2012]
GO
--- Create Primary XML Index
CREATE PRIMARY
XML INDEX IndXML_ProductModel_CatalogDescription_Path
ON [Production].[ProductModel]
(
[CatalogDescription]
)
GO
|
Secondary
XML indexes
To enhance search
performance, you can create secondary XML indexes. A primary XML index must
first exist before you can create secondary indexes.
The secondary XML index still covers the
same data as the underlying primary index (i.e. the entire XML object), but you
could say that it creates a more specific index, based on the primary index.
The T-SQL syntax
for setting up a secondary XML index is very similar to the primary index, with
the addition of the “USING XML INDEX” clause, which tells SQL Server that this
isn’t a primary index, but is based on one.
Syntax
of Secondary XML Index
Creating a Secondary XML index is just as
straight-forward as creating a regular single-column index on a table which
will be based on primary index.
USE [AdventureWorks2012]
GO
---- Create
Secondry XML Index
CREATE XML INDEX [IXML_ProductModel_CatalogDescription] ON [Production].[ProductModel]
(
[CatalogDescription]
)
USING XML INDEX IndXML_ProductModel_CatalogDescription_Path
FOR PATH;
GO
|
There are three different types of
secondary indexes (specified with the FOR keyword at the end
of the statement). The difference between them is what they index and how this
information is arranged in the database based what does it suit your query
needs.
PATH
secondary indexes
The PATH
secondary XML index is optimally designed for looking up paths, for instance
when you’re using the exist() method in your queries. The index key of this
type of index is the node pathof all nodes (elements, attributes,
values) in the XML document.
VALUE
secondary indexes
The index key of
the VALUE secondary XML index is the node value and the node
path, so you may want to consider this type if you’re looking for a
specific value, but don’t know the exact path.
PROPERTY
secondary indexes
The PROPERTY
secondary XML index’s key is the base table’s primary key, node
path and node value. This provides the means to perform
queries that search specific table records and specific paths in those records’
XML documents, all in a single index scan/seek.
Guidelines
for secondary indexes
- If your workload uses path expressions significantly on XML columns, the PATH secondary XML index is likely to speed up your workload. The most common case is the use of the exist() method on XML columns in the WHERE clause of Transact-SQL.
- If your workload retrieves multiple values from individual XML instances by using path expressions, clustering paths within each XML instance in the PROPERTY index may be helpful. This scenario typically occurs in a property bag scenario when properties of an object are fetched and its primary key value is known.
- If your workload involves querying for values within XML instances without knowing the element or attribute names that contain those values, you may want to create the VALUE index.
Type of Indexes in SQL
No comments:
Post a Comment