Monday, November 19, 2018

SSIS - Redirect bad data by Conditional Split Transformation

In this tutorial, we are going to learn how does Conditional Split transformation work in SSIS package. The Conditional Split transformation can route data rows to different outputs depending on the content of the data. The implementation of the Conditional Split transformation is similar to a CASE decision structure in a programming language. 

We can use conditional split between our source flat file and the destination SQL Server table to redirect bad data from flat file source. If there's a bad record and we know that which column can have the bad data then based of the business rules, we can put some condition and going to write is bad, go to the new SQL dump table, otherwise, go to the "good" table.

As we already stated that the transformation evaluates expressions, and based on the results, directs the data row to the specified output. This transformation also provides a default output, so that if a row matches no expression it is directed to the default output.

Note - The Conditional Split transformation directs each input row to one output only. If we enter multiple conditions, the transformation sends each row to the first output for which the condition is true and disregards subsequent conditions for that row. If we need to evaluate several conditions successively, we may need to concatenate multiple Conditional Split transformations in the data flow.

Based on the functionality of  Conditional Split transformation, it ensures us that every every row has only one output. To understand the bad data handling, we can assume that we have a sales data where some rows have missing order date or some rows have missing ship date missing, then we can redirect these rows by using the following condition -

[Order Date] == "" || [Ship Date] == ""
To learn more on bad data handling, please visit my YouTube channel -


Friday, November 16, 2018

SSIS - Conditional Split Transformation

In this tutorial, we are going to learn how does Conditional Split transformation work in SSIS package. The Conditional Split transformation can route data rows to different outputs depending on the content of the data. The implementation of the Conditional Split transformation is similar to a CASE decision structure in a programming language. 
The transformation evaluates expressions, and based on the results, directs the data row to the specified output. This transformation also provides a default output, so that if a row matches no expression it is directed to the default output.

We can configure the Conditional Split transformation in the following ways:
* Provide an expression that evaluates to a Boolean for each condition you want the transformation to test.
* Specify the order in which the conditions are evaluated. Order is significant, because a row is sent to the output corresponding to the first condition that evaluates to true.
* Specify the default output for the transformation. The transformation requires that a default output be specified.

To live demo, please visit my YouTube Channel at -


Each input row can be sent to only one output, that being the output for the first condition that evaluates to true. 
The Conditional Split transformation includes the FriendlyExpression custom property. This property can be updated by a property expression when the package is loaded. 

Note - The Conditional Split transformation directs each input row to one output only. If you enter multiple conditions, the transformation sends each row to the first output for which the condition is true and disregards subsequent conditions for that row. If you need to evaluate several conditions successively, you may need to concatenate multiple Conditional Split transformations in the data flow

Wednesday, November 14, 2018

SSIS - Expression Task

In this tutorial, we are going to learn how does Expression Task work in SSIS package. The SSIS Expression task will allow you to set variable values at runtime without the need for a script task. This was a new feature when SSIS 2012 rolled out a few years ago. It’s a pretty easy task to use and also helps you avoid having to write any .net code as well. The expression language includes functions and operators.

Things to be Noted: 

  • You can only set 1 variable at a time with this task. If you need to do more you can add additional expression tasks or use one script task to set all of them at once.
  • You must have SSIS 2012 or higher. If you have SSIS 2008R2 and below you can still use this functionality but it’s slightly more complicated because you will have to create a script task to do this for you.
One common usage for the expression task in SSIS is to implement conditional flow of the constraints. In the below screenshot you can see that we implement an expression tasks which follows an error checking activity and evaluates if there were any errors , if yes the execution move to the right else to the left.

To watch the live demo, please visit my YouTube Channel -

Tuesday, November 13, 2018

SSIS - Foreach Loop Container

In this tutorial, we are going to learn how does For Loop Container work in SSIS package. The For Loop container defines a repeating control flow in a package. The loop implementation is similar to the For looping structure in programming languages. For Loop container groups set of tasks together and allows you to looping through the tasks in your package which is similar like for loop in any programming language. In case until a fixed count than you will use the “For Loop container”

Configuration editor has only two options For Loop and Expressions. 
  • InitExpression: This will initialize the loop with specified values.
  • EvalExpression: specifies an expression to stop the loop when the expression evaluates to false. This expression must evaluate to a Boolean. 
  • AssignExpression: specifies an expression that changes a condition in the same way each time the loop iterates.Which is optional one.
Drag and drop “For loop” container and configure as shown below



Difference between For Loop and Foreach Loop Container?
A For Loop container loops till a constant value. So for example if you want to loop until a value is 12 or less than 12 so for up-to fixed count you have to use For Loop container.
Whereas Foreach Loop container loops through collection so this collection can files in a folder or collection can be in record set.
Foreach loop - Foreach loop container actually loop through a collection. Now this collection row in a record set files in a folder.
Foreach Loop Container Task: When there is dependency of number of time task should execute on data such as files, variables we will use this.The Foreach Loop container defines a repeating control flow in a package. The loop implementation is similar to Foreach looping structure in programming languages. In a package, looping is enabled by using a Foreach enumerator. The Foreach Loop container repeats the control flow for each member of a specified enumerator.

In short, a For Loop Container uses a loop condition that is defined by expressions whereas Foreach Loop Container iterates through a collection of objects.

With a Foreach Loop Container, we can perform variable mapping as well, which would be useful for tasks within the loop.

Monday, November 12, 2018

SSIS - How does Aggregate Transformation work

In this tutorial, we are going to learn how does Aggregate Transformation work in SSIS package. The Aggregate transformation is used to perform aggregate operations/functions on groups in a dataset. The aggregate functions available are- Count, Count Distinct, Sum, Average, Minimum and Maximum. The Aggregate transformation has one input and one or more outputs.

The Aggregate transformation handles null values in the same way as the SQL Server relational database engine. The behaviour is defined in the SQL-92 standard. The following rules apply:
  • In a GROUP BY clause, nulls are treated like other column values. If the grouping column contains more than one null value, the null values are put into a single group.
  • In the COUNT (column name) and COUNT (DISTINCT column name) functions, nulls are ignored and the result excludes rows that contain null values in the named column.
  • In the COUNT (*) function, all rows are counted, including rows with null values.
Aggregate Transformation Configuration - You configure the Aggregate transformation at the transformation, output, and column levels. At the transformation level, you configure the Aggregate transformation for performance by specifying the following values:
  1. The number of groups that are expected to result from a Group by operation. They are mostly are string values.
  2. The number of distinct values that are expected to result from a Count distinct operation. These values could be string or numeric.
  3. The percentage by which memory can be extended during the aggregation.
  4. The Aggregate transformation can also be configured to generate a warning instead of failing when the value of a divisor is zero.

For more on Aggregate Transformation - Watch a Live demo here


ConclusionThe Aggregate transformation is asynchronous, which means that it does not consume and publish data row by row. Instead it consumes the whole rowset, performs its groupings and aggregations, and then publishes the results.