The Lookup
transformation performs lookups by joining data in input columns with columns
in a reference dataset. Lookup transformation takes input value and searches
for a row in dataset that contains this value in the specified field; once it
finds it we are able to extract values from different fields that belong to the
same. This means use the lookup to access additional information in a related
dataset which is based on values in other datasets common columns.
For better understanding, we can use two data tables such as Employee having employee details like as EmpCode, EmpName and DeptId and another table such as Department having DeptId and Department.
Tables
|
We can use
this transformation in many scenarios to meet the business requirements because
this is most necessary transformation component in building of a data
warehouse. During our data extract, we need to pull the source system key
(business key) that we want to find in our dimension table and replace with ID
(surrogate key) which will be used to load our Fact data table.
Here, I’m showing only matching and un-matching output with a very simple example as given below:
We’ll create a new package and add data flow task in source control as given below:
In data flow
control, add source item and set the data source connection also as given
below:
Now, we are using Employee table to extract all employee information in the OLE DB Source as given below:
We are able to view all the source
columns in the above image. Now, we need to use the lookup transformation to
pull the entire department from the department table which are based on the
common column called DeptId in both datasets. So, put a lookup transformation
in data flow control and link with the data source control as given below:
Configuring the General page of the Lookup Transformation Editor
Now, we need
to show all unmatched records and matched records with the help of lookup
transformation. In Lookup Transformation Editor Window’s General selection, we
configure the lookup transformation by using one option from three configuring cache
mode options which are defined as:
Full cache: The referenced dataset is
generated and loaded into cache before the Lookup transformation is executed.
Partial cache: The referenced dataset is
generated when the Lookup transformation is executed, and the dataset is loaded
into cache.
No cache: The referenced dataset is
generated when the Lookup transformation is executed, but no data is loaded
into cache.
Connection type section, which supports two options: Cache Connection Manager and OLE DB Connection Manager. In our tutorials, we are using OLE DB connection manager option because we are retrieving data from the database.
Type of lookup. At the
bottom there is a drop down box with four values specified. We need to pull
both matched and unmatched rows from lookup transformation, so we will have to choose
Redirect rows to no match output as
given below:
Configuring the Connection page of the Lookup
Transformation Editor
In
Connection page, set OLE DB connection manager which is the database that
contains both fields that we are looking for, specify department table and clicked
preview button so we can see values from the data table as given below:
Configuring the Columns page of the Lookup
Transformation editor
In Columns
page, we can see two datasets. The left side is input (comes from source item
in data flow) and the right side is what we look up against which we specified
in previous step (connection page section). Here, we need to drag DeptId input
field input lookup DeptId field which is the main part responsible for finding
the correct row and checked Department column from the lookup dataset also as
given below:
Note: Lookup match is based on case sensitive
algorithm. If we are matching records in lookup transformation then both should
have the same data types. If they are characters then both string should be in
lower or capital format.
Selecting an output for the Lookup transformation
We have configured
lookup transformation successfully. We need to show both matched and unmatched
outputs, for this just drop two Row Sampling Transformations as given below:
When we try
to connect lookup transformation then it will ask for Lookup Match Output or
Lookup No Match Output as given below:
Connect both Row Sampling transformation with Lookup Transformation such as one for Matched and another for No Match output. Now, we need to Enable Data Viewer for both outputs as given below:
Time
to run our SSIS Package
The final step, of course, is to run
the SSIS package in BIDS. When we run the package, we can monitor that how many
rows matched the lookup dataset and how many did not as given below:
Conclusion
No comments:
Post a Comment