Friday, August 28, 2015

SQL – Wide Tables

As we know that tables are used to store all the data and stated as the database objects. They stored the data in a row-and-column format which is very similar to a spreadsheet. Each row represents a unique record, and each column represents a field in the record.

A standard user-defined table can have up to 1,024 columns. The number of rows in the table is limited only by the storage capacity of the server.
Wide Tables
This feature was introduced with SQL Server 2008 release on-wards. A wide table is just a table that uses column sets and sparse columns. It still follows the same width restrictions per row (8019 bytes). To make a table wide you just add a column set to its definition. They come into the picture whenever business needs to increase the total of columns more than 1024 and wide table can have to 30,000. 
The number of non-sparse columns is still 1024 and the max size of the row is still 8000 bytes.

Sparse columns are ordinary columns that have an optimized storage for null values. Sparse columns reduce the space requirements for null values at the cost of more overhead to retrieve non-null values. Consider using sparse columns when the space saved is at least 20 percent to 40 percent.
True fact about Sparse columns
  • Sparse columns are ordinary columns, with the addition of the SPARSE property. 
  • Sparse columns do not support primary keys, data compression, NOT NULL constraints, or default values.

A sparse column set gathers all sparse columns into a new column that is similar to a derived or computed column, but with additional functionality – its data can be updated and selected from directly.
True fact about column set
  • A column set may be included in a table definition at the time of creation, or added later, if no sparse columns exist yet. 
  • Only one column set may be used per table, and a column set cannot be added to a table that already contains sparse columns. 
  • Replication, distributed queries, and change data capture (CDC) do not support using column sets. 
  • Security permissions can be granted and revoked from the column set column, similar to any other column in the table.


Whenever designing the wide tables with sparse columns, keep in mind that an additional 2 bytes of overhead are required for each non-null sparse column in the table when a row is being updated. As a result of this additional memory requirement, updates can fail unexpectedly with error 576 when the total row size, including this memory overhead, exceeds 8019, and no columns can be pushed off the row.

A wide table has defined a column set, which is an untyped XML representation that combines all the sparse columns of a table into a structured output. The number of indexes and statistics is also increased to 1,000 and 30,000, respectively.

How to create Wide Table
However, by leveraging the column set functionality, we can create a table with greater than 1,024 columns.  The AdventureWorks2012 example from Books Online shows a nice example, where the column SpecialPurposeColumns is defined as the column set that returns the 4 sparse columns defined in the table definition:

USE AdventureWorks2012;
GO

-- create wide table with sparce and column set
CREATE TABLE DocumentStore
    (
DocID int PRIMARY KEY,
     Title varchar(200) NOT NULL,
     ProductionSpecification varchar(20) SPARSE NULL,
     ProductionLocation smallint SPARSE NULL,
     MarketingSurveyGroup varchar(20) SPARSE NULL,
     MarketingProgramID int SPARSE NULL,
    SpecialPurposeColumns XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
);
GO
-- Insert values
INSERT DocumentStore(DocID, Title, ProductionSpecification, ProductionLocation)
VALUES (1, 'Tire Spec 1', 'AXZZ217', 27);
GO

-- Insert Values
INSERT DocumentStore(DocID, Title, MarketingSurveyGroup)
VALUES (2, 'Survey 2142', 'Men 25 - 35');
GO

-- Fetch data from wide table
SELECT * FROM DocumentStore;
GO

DocID
Title
SpecialPurposeColumns
1
Tire Spec 1
<ProductionSpecification>AXZZ217</ProductionSpecification>
<ProductionLocation>27</ProductionLocation>
2
Survey 2142
<MarketingSurveyGroup>Men 25 - 35</MarketingSurveyGroup>

Limitation of Wide Tables
Wide tables have the following performance implications.
  • Wide tables can increase the cost to maintain indexes on the table. We recommend that the number of indexes on a wide table be limited to the indexes that are required by the business logic. 
  • As the number of indexes increases, so does the DML compile-time and memory requirement. Nonclustered indexes should be filtered indexes that are applied to data subsets. For more information, see Create Filtered Indexes.
  • Applications can dynamically add and remove columns from wide tables. When columns are added or removed, compiled query plans are also invalidated. 
  • When data is added and removed from a wide table, performance can be affected. Applications must be designed for the projected workload so these changes to the table data are minimized.
  • Limit the execution of DML statements on a wide table that update multiple rows of a clustering key. These statements can require significant memory resources to compile and execute.
  • Switch partition operations on wide tables can be slow and might require large amounts of memory to process. 
  • The performance and memory requirements are proportional to the total number of columns in both the source and target partitions.
  • Update cursors that update specific columns in a wide table should list the columns explicitly in the FOR UPDATE clause. This will help optimize performance when you use cursors.
  • There are size restrictions on the rows and it can be slower to retrieve data than if you use related tables (even those with one-to-one relationships).
  • Transnational replication supports sparse columns, but it does not support column sets, which can be used with sparse columns.
  • Merge replication does not support sparse columns or column sets. Change data capture supports sparse columns, but it does not support column sets.
Summary
To make a table into a wide table, you must create sparse columns and also add a column set. If you don't create a column set your table will still be limited to 1024 columns. Sparse columns and column sets in SQL Server provide a way of accommodating the less-structured data that has always proved to be tricky for the relational model. Wide tables can increase the cost to maintain indexes on the table and switching partition operations work very slow and require more memory to process. 

To know more on the different kinds of the tables in SQL Servers at
  1. SQL – Wide Tables
  2. SQL - Table Variables
  3. SQL - Temp Table (Local & Global Temp Tables)
  4. SQL - Common Table Expression
  5. SQL - Difference between Table Variable and Common Type Expression
  6. SQL - Difference between Temp Table and CTE
  7. SQL - Difference between Temp Table and Table Variable

References:
https://msdn.microsoft.com/en-us/library/ms189084(v=sql.110).aspx

No comments:

Post a Comment