Wednesday, April 25, 2018

TSQL - GROUP BY with ROLLUP and CUBE


The ROLLUP operator is a very useful in generating reports that contain subtotals and totals which generates a result set that shows aggregates for a hierarchy of values in the selected columns. The super-aggregated column is represented by a NULL value and we can assign any value such as ‘ALL’, although the super-aggregate rows will always be added last. Multiple aggregates over different columns will be added if there are multiple GROUP BY columns.
WITH ROLLUP, as well as WITH CUBE, both are non-standard and deprecated.
To understand both operators, we will observe both the ROLLUP and CUBE operators with the help of a simple example as given below-
----- Create table variable
DECLARE @Item_List Table
(Product Varchar(20), Maker Varchar(20), Quantity Int)

----- Insert Data into table variable
INSERT INTO @Item_List(Product,Maker, Quantity )
VALUES ('Laptop', 'Dell', 300),
('Laptop', 'Lenovo', 150),
('Laptop', 'HP', 400),
('Desktop', 'Dell', 120),
('Desktop', 'Lenovo', 80),
('Desktop', 'HP', 250)

----- Pull the data from table variable
SELECT * FROM @Item_List;
Product
Maker
Quantity
Desktop
Dell
120
Desktop
HP
250
Desktop
Lenovo
80
Laptop
Dell
300
Laptop
HP
400
Laptop
Lenovo
150

----- Result by using WITH ROLLUP;
SELECT Product=ISNULL(Product,'ALL'), Maker=ISNULL(Maker,'ALL'), Quantity=Sum(Quantity)
FROM @Item_List
GROUP BY Product, Maker WITH ROLLUP;
Product
Maker
Quantity
Desktop
Dell
120
Desktop
HP
250
Desktop
Lenovo
80
Desktop
ALL
450
Laptop
Dell
300
Laptop
HP
400
Laptop
Lenovo
150
Laptop
ALL
850
ALL
ALL
1300
In addition to the subtotals generated by the ROLLUP extension, the WITH CUBE operator will generate subtotals for all combinations of the dimensions specified. In general, WITH CUBE generates a result set that shows aggregates for all combinations of values in the selected columns.
---- Result by using With Cube
SELECT Product=ISNULL(Product,'ALL'), Maker=ISNULL(Maker,'ALL'), Quantity=Sum(Quantity)
FROM @Item_List
GROUP BY Product , Maker WITH Cube ;
Product
Maker
Quantity
Desktop
Dell
120
Laptop
Dell
300
ALL
Dell
420
Desktop
HP
250
Laptop
HP
400
ALL
HP
650
Desktop
Lenovo
80
Laptop
Lenovo
150
ALL
Lenovo
230
ALL
ALL
1300
Desktop
ALL
450
Laptop
ALL
850
As the number of dimensions/columns increase, so do the combinations of subtotals that need to be calculated. The ROLLUP operator can also be used to calculate sub-totals for each column, based on the groupings within that column.
The Difference between ROLLUP and CUBE
There is only one major difference between the functionality of the ROLLUP operator and the CUBE operator. ROLLUP operator generates aggregated results for the selected columns in a hierarchical way whereas CUBE generates an aggregated result that contains all the possible combinations for the selected columns or dimensions.

Monday, April 23, 2018

SSIS - Data Flow


Microsoft provided SQL Server Integration Services, an amazing ETL tool to create packages to meet the business requirements. In SSIS, a package is a graphical design tool which may consist of organized collection of connections, control flow elements, data flow elements, event handlers, variables, parameters, and configurations.
After Control Flow, the Data Flow task is one of the most important and powerful components in SSIS which is responsible to retrieve data from heterogeneous  data sources, transform that data, and insert it into heterogeneous  destinations.

If you want to go Data Flow section then we must have to add Data Flow task in the control flow section tab as given below –


After adding Data Flow Task in Control Flow tab, double-click the Data Flow task in the control flow. This will move you to the Data Flow tab as given below –

How can we Configuring the Data Flow? 
As we already stated that SSIS is completely a graphical design tool and we can configure a Data Flow task by adding components to the Data Flow tab. In this area, SSIS tool supports three types of data flow components to meet the business:
  1. Sources: Extract data from external data sources, such as line-of-business systems, CRM systems, relational databases, web services, and SharePoint lists
  2. Transformations: This includes cleansing the data and converting it to an OLAP-friendly data model. The OLAP-friendly data model traditionally consists of dimension and fact tables in a star or snowflake schema, and closely maps SSAS’s dimensional model (SSAS stands for SQL Server Analysis Services)
  3. Destinations: Load the data in data warehouse/data marts, so that it can be quickly accessed by querying tools, such as reports. In practice, this implies processing SSAS cubes

Conclusion
Data Flow task will always start with a source and will usually end with a destination, but not always. We can also add as many transformations as necessary to prepare the data for the destination. For example, we can use the Derived Column transformation to add a computed column to the data flow, or we can use a Conditional Split transformation to split data into different destinations based on specified criteria. 


Popular Posts

Get Sponsored by Big Brands