Microsoft provides SQL Server Integration
Services (SSIS), a powerful ETL tool for building packages that meet business
requirements. In SSIS, a package is an organized collection of connections,
control flow elements, data flow elements, event handlers, variables,
parameters, and configurations, assembled in a graphical design tool.
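Although packages are normally designed and run from the SSIS tooling, they can also be loaded and executed programmatically. Below is a minimal C# sketch using the Microsoft.SqlServer.Dts.Runtime API; the package path is a placeholder, and the code assumes the SSIS client components are installed on the machine:

```csharp
using System;
using Microsoft.SqlServer.Dts.Runtime;

class RunPackage
{
    static void Main()
    {
        var app = new Application();

        // Load a package from the file system; the path is hypothetical.
        Package pkg = app.LoadPackage(@"C:\SSIS\LoadCustomers.dtsx", null);

        // Execute() runs the Control Flow, which in turn runs
        // any Data Flow tasks the package contains.
        DTSExecResult result = pkg.Execute();
        Console.WriteLine("Package result: " + result);
    }
}
```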
After the Control Flow, the Data Flow task
is one of the most important and powerful components in SSIS. It is
responsible for retrieving data from heterogeneous data sources, transforming
that data, and inserting it into heterogeneous destinations.
To work in the Data Flow section, we must
first add a Data Flow task to the Control Flow tab, as given below –
After adding the Data Flow task to the Control Flow tab,
double-click the Data Flow task in the control flow.
This will take you to the Data Flow tab, as given
below –
How Can We
Configure the Data Flow?
As stated earlier, SSIS is a completely graphical
design tool, and we configure a Data Flow task by adding
components to the Data Flow tab. SSIS supports
three types of data flow components to meet business requirements
(a stand-alone sketch of all three stages follows the list):
- Sources: Extract data from external data sources, such as line-of-business systems, CRM systems, relational databases, web services, and SharePoint lists
- Transformations: Cleanse the data and convert it to an OLAP-friendly data model. Such a model traditionally consists of dimension and fact tables in a star or snowflake schema, and closely maps to the dimensional model of SSAS (SQL Server Analysis Services)
- Destinations: Load the data into data warehouses/data marts, so that it can be quickly accessed by querying tools, such as reports. In practice, this often implies processing SSAS cubes
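To make these three stages concrete, here is a minimal stand-alone C# analogy (not SSIS itself) of extract, transform, and load: rows are read from a hypothetical source table, a column is computed in flight, and the result is bulk-loaded into a hypothetical destination table. All connection strings, tables, and column names are assumptions for illustration:

```csharp
using System.Data;
using System.Data.SqlClient;

class TinyEtl
{
    static void Main()
    {
        // An in-memory staging area for the transformed rows.
        var rows = new DataTable();
        rows.Columns.Add("OrderId", typeof(int));
        rows.Columns.Add("TotalPrice", typeof(decimal));   // derived column

        // Extract: read from a hypothetical source table.
        using (var src = new SqlConnection("Server=.;Database=SourceDb;Integrated Security=true"))
        {
            src.Open();
            var cmd = new SqlCommand("SELECT OrderId, Price, Quantity FROM dbo.Orders", src);
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // Transform: compute a new column from existing ones.
                    decimal total = reader.GetDecimal(1) * reader.GetInt32(2);
                    rows.Rows.Add(reader.GetInt32(0), total);
                }
            }
        }

        // Load: bulk insert into a hypothetical destination table.
        using (var dest = new SqlConnection("Server=.;Database=WarehouseDb;Integrated Security=true"))
        {
            dest.Open();
            using (var bulk = new SqlBulkCopy(dest) { DestinationTableName = "dbo.FactOrders" })
            {
                bulk.WriteToServer(rows);
            }
        }
    }
}
```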
Conclusion
A Data Flow task will always
start with a source and will usually, but not always, end with a destination. We
can also add as many transformations as necessary to prepare the data for the
destination. For example, we can use the Derived Column transformation
to add a computed column to the data flow, or the Conditional
Split transformation to route data to different destinations based
on specified criteria.
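For comparison, the same two ideas can be expressed in code inside an SSIS Script Component. The sketch below is hypothetical: Input0Buffer, the column names, and the two outputs (HighValue and LowValue, configured as synchronous outputs with an exclusion group) are all members SSIS generates from whatever you configure in the designer:

```csharp
// Runs once per row flowing through the component.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Derived Column analogue: compute a new output column
    // from existing input columns.
    Row.TotalPrice = Row.Price * Row.Quantity;

    // Conditional Split analogue: route the row to exactly one
    // of the two configured outputs based on a criterion.
    if (Row.TotalPrice >= 1000)
        Row.DirectRowToHighValue();
    else
        Row.DirectRowToLowValue();
}
```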
Unlike the Control Flow, the Data Flow can branch, split, and merge, providing parallel processing. Even a set of connected linear transformations is not necessarily a serial process, because in most cases the transformations run at the same time, handling subsets of the data in parallel and passing groups of data downstream.
Multiple components run at the same time because the Data Flow transformations work together in a coordinated streaming fashion: the data is transformed in groups (called buffers) as it is passed down from the source to the subsequent transformations.
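As a toy analogy (not SSIS code), the following C# sketch shows the idea of handing data downstream in fixed-size groups, so a downstream stage can start working before the source has finished reading:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class BufferDemo
{
    // Yield rows from the source in fixed-size groups ("buffers").
    static IEnumerable<int[]> ReadInBuffers(IEnumerable<int> source, int bufferSize)
    {
        var buffer = new List<int>(bufferSize);
        foreach (int row in source)
        {
            buffer.Add(row);
            if (buffer.Count == bufferSize)
            {
                yield return buffer.ToArray();   // hand a full buffer downstream
                buffer.Clear();
            }
        }
        if (buffer.Count > 0)
            yield return buffer.ToArray();       // flush the final partial buffer
    }

    static void Main()
    {
        // The downstream "transformation" processes each buffer as it arrives.
        foreach (int[] buffer in ReadInBuffers(Enumerable.Range(1, 10), 4))
            Console.WriteLine("Transforming a buffer of " + buffer.Length + " rows");
    }
}
```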