Monday, May 30, 2016

SSIS - Load multiple files through Foreach Loop Container

We have a situation where we are importing a number of files with the same format, using the same SSIS package. In this situation, we need a counter to set the next file name within the data source connection string, means dynamic connection string and dynamic directory path where all the raw data files are stored. We can use Foreach Loop Container in SSIS to loop through files in a specified folder.


So, we need two variables here as -
inpDirectoryPath - This variable will be used to store the directory path for the files
inpFullFilePath -  This variable will be used to store the dynamic connecting string for the processing file.
Foreach Loop Container- This control will be used to set the dynamic connection string from the shared directory path.
Data Flow Task – This control will be used to make a relation between source and destination based on the dynamic connection string.
Source File Connection- This is the dynamic connection string which will be set in the Foreach Loop container by the help of inpFullFilePath variable.
Database Destination – This is the connection component to make a connection to the database.

There are few steps which are given below:
Create a table named dbo.SalesOrders as shown below -
USE [DEMO]
GO

/****** Object:  Table [dbo].[SalesOrders]    Script Date: 5/30/2016 3:44:58 PM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[SalesOrders]
(
                [OrdNo] [int] NULL,
                [CustNo] [int] NULL,
                [ItemNo] [int] NULL,
                [Qty] [int] NULL,
                [Price] [numeric](18, 6) NULL
) ON [PRIMARY]

GO

This example will populate this table by reading three different csv files having the same layout. The files have the same layout with the following data columns-
OrdNo
CustNo
ItemNo
Qty
Price
2001
5010
88601
22
230
2002
5010
88602
15
650
2003
5010
88603
20
1070
2004
5010
88604
35
1490
2005
5010
88605
40
1910
2006
5010
88606
65
2330
2007
5010
88607
10
2750
2008
5010
88608
12
3170
2009
5010
88609
8
3590
On the SSIS package, create two variables as shown below: 


Variable inpDirectoryPath will contain the path where the csv files are located;


inpFullFilePath should be configured to point to one valid csv file (this is required only during the initial configuration of the Flat connection manager).
After doing the above settings and configurations; right click on the Foreach Loop Container and go to the Foreach Loop Editor window as given below:

Dynamic Directory Path Settings-
In Foreach Loop Editor, set Foreach File Enumerator in the collection section and set the Directory path to inpDirectoryPath. Retrieve file name should be fully qualified as shown above. 

Dynamic File Path Settings-
1. To set the dynamic file path for the raw data source connection string, we need to set the variable mappings in the Foreach Loop Editor as given below:

Now, inpFullFilePath variable is map with 0 and click on the OK button to configure it. Let check the actual "Flat File Connection Manager". Go to the property window and set the following connection string as given below:

We have done the followings -
1)  Dynamic Directory Path Settings in Foreach Loop Container
2) Dynamic File Path Settings in Flat File Connection Manager


Now, just drop Data Flow Task in Foreach Loop Container and click on the edit. It will open in Data Flow where we are dropping one Flat File Source and one OLE DB Destination as given below:

Click on the Flat File Source and set the connection string and preview the data as given below:

Now, we have seen the output of the raw data file and can also view the columns in the raw data file as given below:

Click on OK button to close the Flat File Source Editor and link this OLE DB Destination. Now click on the OLE DB Destination to open  OLE DB Destination Editor and set the database connection string and map the destination data table also as given below:

Now, go to the Mapping and see the columns mappings as given below:

Wow!! we have completed all the steps to load the multiple files having the same data format and same destination table. Now, It's time to execute the package. Just execute the package and see the data movements as given below:
Control Flow:  

Data Flow:

SQL Server Database:


Conclusion
In this section, we have learnt the use of dynamic variables in multiple files uploads, set the dynamic connection string with the help of variables and Foreach Loop Container to process n numbers of files. It gives us a facility to load only the specific type of raw data files into the system.

Friday, May 27, 2016

SQL – Inline Queries

Most of the developers are very familiar with inline queries. Inline queries are basically known as sub queries or Inner Select statements which are always used inside a SELECT, INSERT, UPDATE, or DELETE statement, or inside another inline query.  They can be used anywhere in SQL scripts to encounter the conditional expressions. The SQL statement containing an Inline query is also known an outer query or outer select.
How can Inline Queries helpful?
Most of the T-SQL statements which include inline queries can be alternatively formulated as join to encounter the conditional expressions. Other questions can be posed only with inline queries. In T-SQL, there is usually no performance difference between a statement that includes an inline query and a semantically equivalent version that does not. However, in some cases where existence must be checked, a join yields better performance.

Components of an Inline Query
An inline query has the same features as a normal SQL query. They only persist in the SQL statements and could have the following components –
  1. A regular SELECT query including regular select list components from the main table.
  2. A regular FROM clause including one or more table, function or view names.
  3. An optional WHERE clause to encounter the conditional expressions.
  4. An optional GROUP BY clause if your query contains aggregation functions.
  5. An optional HAVING clause if your query contains aggregation functions.
Points to be remember 
The SELECT query of an Inline query is always enclosed in parentheses. It cannot include a COMPUTE or FOR BROWSE clause, and may only include an ORDER BY clause when a TOP clause is also specified. There are three basic types of inline queries-
  1. Operate on lists introduced with IN, or those that a comparison operator modified by ANY or ALL.
  2. Are introduced with an unmodified comparison operator and must return a single value.
  3. Are existence tests introduced with EXISTS.

To understand their functionality, we will create a table variable to work as the base table for the inline statements as given below- 
---- declare OrderMaster table variable body
DECLARE @OrderMaster TABLE
(
OrdId INT IDENTITY(100,1),
OrdDate VARCHAR(30),
CustName VARCHAR(20)
)
---- declare OrderDetails table variable body
DECLARE @OrderDetails TABLE
(
OrdId INT,
ItemId VARCHAR(30),
ItemName VARCHAR(20),
SellingPrice Decimal(12,4)
)
---- Insert Values
INSERT INTO @OrderMaster (OrdDate,CustName)
Values (Getdate(),'Ryan Arjun'),(Getdate(),'Bill Trade'),
(Getdate(),'Rosy White')

---- Pull Order Data
SELECT * FROM @OrderMaster
OrdId
OrdDate
CustName
100
May 28 2016 10:08AM
Ryan Arjun
101
May 28 2016 10:08AM
Bill Trade
102
May 28 2016 10:08AM
Rosy White
----Fill Order Details
Insert Into @OrderDetails (OrdId, ItemId, ItemName, SellingPrice)
Values (100, 201,'Apple',135.78),(100, 202,'Mango',235.78)
,(101, 203,'Banana',124.50),(101, 204,'Orange',321.15)
,(102, 205,'Banana',124.50),(102, 204,'Orange',321.15),(102, 201,'Apple',135.78)

--- Pull Order Details
select * from @OrderDetails
OrdId
ItemId
ItemName
SellingPrice
100
201
Apple
135.78
100
202
Mango
235.78
101
203
Banana
124.5
101
204
Orange
321.15
102
205
Banana
124.5
102
204
Orange
321.15
102
201
Apple
135.78


Now, we have the tables and want to pull the order and customer wise total sales.
Pull Single Value
There are many ways to do this but we are using inline query to accomplish this task. To pull the single value, we are using inline query within the main SQL statement as given below-
---- Use inline query in the select statement
select OrdId, CustName,
---- Inline query for single value
[Price] =(Select  
[Price]=sum(SellingPrice)
from @OrderDetails
where OrdId=  Om.OrdId
)
from @OrderMaster OM
OrdId
CustName
Price
100
Ryan Arjun
371.56
101
Bill Trade
445.65
102
Rosy White
581.43
Pull Multiple Values with Group By
If we want to pull more than one column then inline query should work as join with the main query as given below:
---- Inline Query as Join
SELECT Om.OrdId, Om.CustName,
OD.[Item-Qty], OD.Price
FROM @OrderMaster OM
JOIN
(
SELECT  OrdId,
[Item-Qty]=COUNT(ItemId),
[Price]=SUM(SellingPrice)
FROM @OrderDetails
GROUP BY ORDID
)OD
ON OM.ORDID=OD.ORDID
OrdId
CustName
Item-Qty
Price
100
Ryan Arjun
2
371.56
101
Bill Trade
2
445.65
102
Rosy White
3
581.43
Pull Value with Where Clause and Group By

If we want to pull more than one column based on some conditional expression then inline query should work as join with the main query as given below:
---- Inline Query as Join
SELECT Om.OrdId, Om.CustName,
OD.[Item-Qty], OD.Price
FROM @OrderMaster OM
JOIN
(SELECT  OrdId,
[Item-Qty]=COUNT(ItemId),
[Price]=SUM(SellingPrice)
FROM @OrderDetails
---- Conditional Expression
WHERE ItemName='Apple'
GROUP BY ORDID
)OD
ON OM.ORDID=OD.ORDID
OrdId
CustName
Item-Qty
Price
100
Ryan Arjun
1
135.78
102
Rosy White
1
135.78

Conclusion
It’s very beneficial concept in SQL and we can use them inside a SELECT, INSERT, UPDATE, or DELETE statement, or inside another inline query.  They are easily applicable in the function and stored procedure. These features are important in some Transact-SQL statements; the inline-query can be evaluated as if it were an independent query. Conceptually, the inline-query results are substituted into the outer query.