Monday, May 30, 2016

SSIS - Load multiple files through Foreach Loop Container

We have a situation where we are importing a number of files with the same format, using the same SSIS package. In this situation, we need a counter to set the next file name within the data source connection string, means dynamic connection string and dynamic directory path where all the raw data files are stored. We can use Foreach Loop Container in SSIS to loop through files in a specified folder.


So, we need two variables here as -
inpDirectoryPath - This variable will be used to store the directory path for the files
inpFullFilePath -  This variable will be used to store the dynamic connecting string for the processing file.
Foreach Loop Container- This control will be used to set the dynamic connection string from the shared directory path.
Data Flow Task – This control will be used to make a relation between source and destination based on the dynamic connection string.
Source File Connection- This is the dynamic connection string which will be set in the Foreach Loop container by the help of inpFullFilePath variable.
Database Destination – This is the connection component to make a connection to the database.

There are few steps which are given below:
Create a table named dbo.SalesOrders as shown below -
USE [DEMO]
GO

/****** Object:  Table [dbo].[SalesOrders]    Script Date: 5/30/2016 3:44:58 PM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[SalesOrders]
(
                [OrdNo] [int] NULL,
                [CustNo] [int] NULL,
                [ItemNo] [int] NULL,
                [Qty] [int] NULL,
                [Price] [numeric](18, 6) NULL
) ON [PRIMARY]

GO

This example will populate this table by reading three different csv files having the same layout. The files have the same layout with the following data columns-
OrdNo
CustNo
ItemNo
Qty
Price
2001
5010
88601
22
230
2002
5010
88602
15
650
2003
5010
88603
20
1070
2004
5010
88604
35
1490
2005
5010
88605
40
1910
2006
5010
88606
65
2330
2007
5010
88607
10
2750
2008
5010
88608
12
3170
2009
5010
88609
8
3590
On the SSIS package, create two variables as shown below: 


Variable inpDirectoryPath will contain the path where the csv files are located;


inpFullFilePath should be configured to point to one valid csv file (this is required only during the initial configuration of the Flat connection manager).
After doing the above settings and configurations; right click on the Foreach Loop Container and go to the Foreach Loop Editor window as given below:

Dynamic Directory Path Settings-
In Foreach Loop Editor, set Foreach File Enumerator in the collection section and set the Directory path to inpDirectoryPath. Retrieve file name should be fully qualified as shown above. 

Dynamic File Path Settings-
1. To set the dynamic file path for the raw data source connection string, we need to set the variable mappings in the Foreach Loop Editor as given below:

Now, inpFullFilePath variable is map with 0 and click on the OK button to configure it. Let check the actual "Flat File Connection Manager". Go to the property window and set the following connection string as given below:

We have done the followings -
1)  Dynamic Directory Path Settings in Foreach Loop Container
2) Dynamic File Path Settings in Flat File Connection Manager


Now, just drop Data Flow Task in Foreach Loop Container and click on the edit. It will open in Data Flow where we are dropping one Flat File Source and one OLE DB Destination as given below:

Click on the Flat File Source and set the connection string and preview the data as given below:

Now, we have seen the output of the raw data file and can also view the columns in the raw data file as given below:

Click on OK button to close the Flat File Source Editor and link this OLE DB Destination. Now click on the OLE DB Destination to open  OLE DB Destination Editor and set the database connection string and map the destination data table also as given below:

Now, go to the Mapping and see the columns mappings as given below:

Wow!! we have completed all the steps to load the multiple files having the same data format and same destination table. Now, It's time to execute the package. Just execute the package and see the data movements as given below:
Control Flow:  

Data Flow:

SQL Server Database:


Conclusion
In this section, we have learnt the use of dynamic variables in multiple files uploads, set the dynamic connection string with the help of variables and Foreach Loop Container to process n numbers of files. It gives us a facility to load only the specific type of raw data files into the system.

No comments:

Post a Comment