Saturday, October 3, 2015

SQL - Merge Statement

This beautiful feature was introduced with SQL Server 2008 released and facilitates us to perform either an INSERT or UPDATE or DELETE statement using a single MERGE statement.  The MERGE statement supports inserting and updating rows in a table with a single operation.  In order to accomplish this MERGE statement requires both a Source and Target table.  The Source table is used to identify the rows that needed be inserted or update or deleted, and the Target table is the table that rows will be inserted or updated. 
Prior to the introduction of SQL Server 2008 if you wanted to write logic that inserted rows into a target table based on some conditions if they didn’t match, or updated them if they did exist then you needed a conditional series of “if then else” logic and needed to perform both the UPDATE and INSERT statement within while loop for more than one records. 

Facts about Merge Statements
  1. MERGE statements are always long and we have not many occasions to use it. But MERGE is very useful - when it's hard to write separate UPDATE/INSERT/DELETE, because after one step you lose information which rows to affect in next one.
  2. The MERGE operations are atomic and the ON values can change in the middle of the MERGE operation, depending on other, concurrent operations.
  3. The MERGE statement supports several clauses that facilitate the different types of data modifications.
You can use the MERGE statement to perform the following operations:
  • Conditionally insert or update rows in a target table - If the row exists in the target table, update one or more columns; otherwise, insert the data into a new row.
  • Synchronize two tables - Insert, update, or delete rows in a target table based on differences with the source data.
Primary Clauses for Merge Syntax - The MERGE syntax consists of five primary clauses:
  1. The MERGE clause specifies the table or view that is the target of the insert, update, or delete operations.
  2. The USING clause specifies the data source being joined with the target.
  3. The ON clause specifies the join conditions that determine where the target and source match.
  4. The WHEN clauses having WHEN MATCHED, WHEN NOT MATCHED BY TARGET, and WHEN NOT MATCHED BY SOURCE which will specify the actions to take based on the results of the ON clause and any additional search criteria specified in the WHEN clauses.
  5. The OUTPUT clause returns a row for each row in the target that is inserted, updated, or deleted.
Specifying Source and Target Search Conditions
It is important to understand how the source and target data are merged into a single input stream and how additional search criteria can be used to correctly filter out unneeded rows. Otherwise, we might specify the additional search criteria in a way that produces incorrect results.

To understand the functionality of the Merge syntax, we will use the following tables such as given below-
USE tempdb;
GO
---- Create target table
CREATE TABLE dbo.Target(ItemID int, ItemName varchar(20), Qty int,  CONSTRAINT Target_PK PRIMARY KEY(ItemID));
---- Create Source table    

CREATE TABLE dbo.Source(ItemID int, ItemName varchar(20),  Qty int,  CONSTRAINT Source_PK PRIMARY KEY(ItemID));
GO

---insert data into Target table
INSERT dbo.Target(ItemID, ItemName,  Qty) VALUES
(100, 'Mango', 10),
(101, 'Orange', 15),
(102, 'Apple', 25);
GO

---insert data into Source table
INSERT dbo.Source(ItemID, ItemName,  Qty) Values
(103, 'Banana', 14),
(104, 'Grapes', 16),
(101, 'Orange', 35)
GO

Select ItemID, ItemName, Qty FROM dbo.Target

ItemID
ItemName
Qty
100
Mango
10
101
Orange
15
102
Apple
25

Select ItemID, ItemName, Qty FROM dbo.Source
ItemID
ItemName
Qty
101
Orange
35
103
Banana
14
104
Grapes
16

In the above example, we are using the two tables such as Target table having the actual records and Source table having the new records or updatable records. Now, we need to insert the new records into the Target table as well as update the existing records based on the matched conditions such as –
-- MERGE statement with the join conditions specified correctly.
USE tempdb;
GO
MERGE Target AS T
USING Source AS S
ON (T.ItemID = S.ItemID)
WHEN MATCHED
    THEN UPDATE SET T.ItemName = S.ItemName, T.Qty = S.Qty
WHEN NOT MATCHED BY TARGET
    THEN INSERT(ItemID, ItemName, Qty)
    VALUES(S.ItemID, S.ItemName, S.Qty)
WHEN NOT MATCHED BY SOURCE AND T.ItemName LIKE 'S%'
    THEN DELETE
OUTPUT $action, inserted.*, deleted.*;

GO

Select ItemID, ItemName, Qty FROM dbo.Target
ItemID
ItemName
Qty
100
Mango
10
101
Orange
35
102
Apple
25
103
Banana
14
104
Grapes
16

In the above Merge statement shows that the input stream results depend on the combination of WHEN clauses. Now assume that we want to perform the following actions on the target table based on that input stream:
  1. Update rows from the source table when the Item ID does exist in the target table.
  2. Insert rows from the source table when the Item ID does not exist in the target table.
  3. Delete rows in the target table when the target Item name starts with 'S' and the Item ID does not exist in the source table.
To perform these actions, the following WHEN clauses are required:
  1. WHEN MATCHED BY SOURCE THEN UPDATE
  2. WHEN NOT MATCHED BY TARGET THEN INSERT
  3. WHEN NOT MATCHED BY SOURCE THEN DELETE
As described in the previous table, when both of the WHEN NOT MATCHED clauses are specified, the resulting input stream is a full outer join of the source and target tables. Now that the input stream results are known, consider how the insert, update, and delete actions will be applied to the input stream.

As stated earlier, the WHEN clauses specify the actions to take based on the results of the ON clause and any additional search criteria specified in the WHEN clauses. In many cases, the search conditions specified in the ON clause produces the required input stream. However, in the example scenario, the insert and delete actions require additional filtering to restrict the affected rows to those that have an employee name that starts with 'S'. In the following example, the filtering conditions are applied to the WHEN NOT MATCHED BY TARGET and the WHEN NOT MATCHED BY SOURCE. The output from the statement shows that the expected rows from the input stream are corrected, inserted, or deleted.

Using MERGE in Integration Services Packages
In SQL Server 2008 Integration Services (SSIS), the SQL statement in an Execute SQL task can contain a MERGE statement. This MERGE statement enables you to accomplish multiple INSERT, UPDATE, and DELETE operations in a single statement.

Typically, we use the MERGE statement when we want to apply changes that include inserts, updates, and deletions from one table to another table. Prior to SQL Server 2008, this process required both a Lookup transformation and multiple OLE DB Command transformations. The Lookup transformation performed a row-by-row lookup to determine whether each row was new or changed. The OLE DB Command transformations then performed the necessary INSERT, UPDATE, and DELETE operations. In SQL Server 2008, a single MERGE statement can replace both the Lookup transformation and the corresponding OLE DB Command transformations.

To use the MERGE statement in a package, follow these steps:
  1. Create a Data Flow task that loads, transforms, and saves the source data to a temporary or staging table.
  2. Create an Execute SQL task that contains the MERGE statement.
  3. Connect the Data Flow task to the Execute SQL task, and use the data in the staging table as the input for the MERGE statement.
Using MERGE in Other Scenarios
In the following scenarios, we can use the MERGE statement either outside or inside an Integration Services package. However, an Integration Services package is often required to load this data from multiple heterogeneous sources, and then to combine and cleanse the data. Therefore, you might consider using the MERGE statement in a package for convenience and ease of maintenance.
Tracking Buying Habits
Tracking Price History
Merging a Table with New Data Against the Old Table

Conclusion
A common scenario is updating one or more columns in a table if a matching row exists, or inserting the data as a new row if a matching row does not exist. This is usually done by passing parameters to a stored procedure that contains the appropriate UPDATE and INSERT statements. With the MERGE statement, we can replace the individual DML statements with a single statement. This can improve query performance because the operations are performed within a single statement, therefore, minimizing the number of times the data in the source and target tables are processed. However, performance gains depend on having correct indexes, joins, and other considerations in place. This topic provides best practice recommendations to help you achieve optimal performance when using the MERGE statement.
References 
https://technet.microsoft.com/en-us/library/bb522522(v=sql.105).aspx

4 comments:

  1. It might be worth mentioning that the syntax Microsoft uses extends the ANSI/ISO standard syntax for the same construct. If you have to port code you will need to know this.. You can only count on insertion and update in other products, not deletion. The flow of data in the ANSI/ISO standard is based on magnetic tape files (really!), so data flows from a source to the target and only the target can be changed.

    ReplyDelete

Popular Posts