Microsoft Business Intelligence (Data Tools)|2018

Thursday, December 20, 2018

SSRS - Conditional Formatting By Switch Function

In this tutorial, we will learn How to use Switch function features in your chart report.

Switch is useful when there are three or more conditions to test because it's simpler to read than nested IIF statements. It's similar to a Case statement in T-SQL. A Switch will return the first expression that it finds true, so the order of the conditions is very important. Switch works like case statement and we can write multiple conditions. Switch evaluate them in sequencer manner and take the first expression that evaluated to true.

As with most things in SSRS you can do something in more than 1 way but today I wanted to introduce you to the Switch Function.

The SWITCH function allows you to have multiple conditions when evaluating an expression. It evaluates each condition within the function.

I would recommend you read the expression you have written very carefully because your logic is flawed and that is why sometimes, your SWITCH statement is not working as you believe it should.

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Wednesday, December 19, 2018

SSRS - Scale Break features in Chart

In this tutorial, we will learn How to use Scale Break features in your chart report. A scale break is a stripe drawn across the plotting area of a chart to denote a break in continuity between the high and low values on a value axis (usually the vertical, or y-axis). Use a scale break to display two distinct ranges in the same chart area.

You cannot specify where to place a scale break on your chart. The chart uses its own calculations based on the values in your dataset to determine whether there is sufficient separation between data ranges to draw a scale break on the value axis (y-axis) at run time.

We will show you a lesser well known feature within SSRS that only takes 1 click to implement.

We can enable this feature by right clicking on the vertical axis and selecting the (Vertical Axis Properties). Then after that just click the check box for (Enable Scale Breaks.)

To change the style of the scale break

Open the Properties pane.
On the design surface, right-click on the y-axis of the chart. The properties for the y-axis object (named Chart Axis by default) are displayed in the Properties pane.
In the Scale section, expand the ScaleBreakStyle property.
Change the values for ScaleBreakStyle properties, such as BreakLineType and Spacing.

The MaxNumberOfBreaks I think is self-explanatory. When you click on (Enable Scale Breaks) SSRS will automatically determine how, when and where to place the scale breaks. So if you only want 1 you can change that setting here.

Another property you can change is the ‘CollapsibleSpaceThreshold.’ You can change this setting to help control how aggressive the scale break is. This can be handy if you have a scale break that appears in the report and it probably shouldn’t or maybe it should. You can
adjust this up or down to help eliminate this issue.

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Thursday, December 13, 2018

SSRS - How to Add Bookmarks Actions in Report

SSRS bookmarks are essentially customized navigational links in the report. The bookmark can be any string, but it must be unique in the report. If the bookmark is not unique, a link to the bookmark finds the first matching bookmark.

For example, if your report displays a table grouped by color, you would add a bookmark based on the group expression to the group header. Then you would add a table with a single text box at the beginning of the report that displayed the color values, and set the bookmark link on that text box. When you click the color, the report jumps to the page that displays the group header row for that color.

In this tutorial, we are going to learn how to Add bookmarks and bookmark links to a report. Add bookmarks and bookmark links to a report when you want to provide a customized table of contents or to provide customized internal navigation links in the report. Typically, you add bookmarks to locations in the report to which you want to direct users, such as to each table or chart or to the unique group values displayed in a table or matrix. We can create your own strings to use as bookmarks, or, for groups, you can set the bookmark to the group expression.

To add a bookmark link - In report design view, select the text box, image, chart, or other report item to which you want to add a bookmark. The properties for the selected item appear in the Properties pane. For example, you could type BikePhoto as the bookmark for an image in your report.

In Design view, right-click the text box, image, chart, to which you want to add a link and then click Properties.
In The Properties dialog box for that report item, click Action.
Select Go to bookmark. An additional section appears in the dialog box for this option.
In the Select bookmark box, type or select a bookmark or an expression that evaluates to a bookmark. Using the previous example, type BikePhoto to create a link to the image in your report.
Click OK.
(Optional) The text is not automatically formatted like a link. For text, it is helpful to change the color and effect of the text to indicate that the text is a link. For example, change the color to blue and the effect to underline in the Font section in the Home tab of the Ribbon.
To test the link, click Run to preview the report, and then click the report item that you set this link on

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Wednesday, December 12, 2018

SSRS - How to Create Recursive Report

In Reporting Services paginated reports, a recursive hierarchy group organises data from a single report dataset that includes multiple hierarchical levels, such as the report-to structure for manager-employee relationships in an organisational hierarchy.

In this tutorial, we are going to learn how to use SSRS to report on recursive data. If you have a self referencing table or even a Parent-Child hierarchy setup in an SSAS multidimensional cube you could find these techniques useful. In our example below we want to display the data within a multi-level hierarchy. We will be working with HR data and we want to see the CEO at the top of hierarchy and then see who reports to who after that.

To create a recursive hierarchy group - There are very simple steps to create a recursive hierarchical report as given below:

In Design view, add a table, and drag the dataset fields to display. Typically, the field that you want to show as a hierarchy is in the first column.
Right-click anywhere in the table to select it. The Grouping pane displays the details group for the selected table. In the Row Groups pane, right-click Details, and then click Edit Group. The Group Properties dialog box opens.
In Group expressions, click Add. A new row appears in the grid.
In the Group on list, type or select the field to group.
Click Advanced.
In the Recursive Parent list, enter or select the field to group on.
Click OK.
Run the report. The report displays the recursive hierarchy group, although there is no indent to show the hierarchy

For demo of this report, you can visit our YouTube channel as well -

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Monday, November 19, 2018

SSIS - Redirect bad data by Conditional Split Transformation

In this tutorial, we are going to learn how does Conditional Split transformation work in SSIS package. The Conditional Split transformation can route data rows to different outputs depending on the content of the data. The implementation of the Conditional Split transformation is similar to a CASE decision structure in a programming language.

We can use conditional split between our source flat file and the destination SQL Server table to redirect bad data from flat file source. If there's a bad record and we know that which column can have the bad data then based of the business rules, we can put some condition and going to write is bad, go to the new SQL dump table, otherwise, go to the "good" table.

As we already stated that the transformation evaluates expressions, and based on the results, directs the data row to the specified output. This transformation also provides a default output, so that if a row matches no expression it is directed to the default output.

Note - The Conditional Split transformation directs each input row to one output only. If we enter multiple conditions, the transformation sends each row to the first output for which the condition is true and disregards subsequent conditions for that row. If we need to evaluate several conditions successively, we may need to concatenate multiple Conditional Split transformations in the data flow.

Based on the functionality of Conditional Split transformation, it ensures us that every every row has only one output. To understand the bad data handling, we can assume that we have a sales data where some rows have missing order date or some rows have missing ship date missing, then we can redirect these rows by using the following condition -

[Order Date] == "" || [Ship Date] == ""

To learn more on bad data handling, please visit my YouTube channel -

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Friday, November 16, 2018

SSIS - Conditional Split Transformation

In this tutorial, we are going to learn how does Conditional Split transformation work in SSIS package. The Conditional Split transformation can route data rows to different outputs depending on the content of the data. The implementation of the Conditional Split transformation is similar to a CASE decision structure in a programming language.

The transformation evaluates expressions, and based on the results, directs the data row to the specified output. This transformation also provides a default output, so that if a row matches no expression it is directed to the default output.

We can configure the Conditional Split transformation in the following ways:

* Provide an expression that evaluates to a Boolean for each condition you want the transformation to test.

* Specify the order in which the conditions are evaluated. Order is significant, because a row is sent to the output corresponding to the first condition that evaluates to true.

* Specify the default output for the transformation. The transformation requires that a default output be specified.

To live demo, please visit my YouTube Channel at -

Each input row can be sent to only one output, that being the output for the first condition that evaluates to true.

The Conditional Split transformation includes the FriendlyExpression custom property. This property can be updated by a property expression when the package is loaded.

Note - The Conditional Split transformation directs each input row to one output only. If you enter multiple conditions, the transformation sends each row to the first output for which the condition is true and disregards subsequent conditions for that row. If you need to evaluate several conditions successively, you may need to concatenate multiple Conditional Split transformations in the data flow

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Wednesday, November 14, 2018

SSIS - Expression Task

In this tutorial, we are going to learn how does Expression Task work in SSIS package. The SSIS Expression task will allow you to set variable values at runtime without the need for a script task. This was a new feature when SSIS 2012 rolled out a few years ago. It’s a pretty easy task to use and also helps you avoid having to write any .net code as well. The expression language includes functions and operators.

Things to be Noted:

You can only set 1 variable at a time with this task. If you need to do more you can add additional expression tasks or use one script task to set all of them at once.
You must have SSIS 2012 or higher. If you have SSIS 2008R2 and below you can still use this functionality but it’s slightly more complicated because you will have to create a script task to do this for you.

One common usage for the expression task in SSIS is to implement conditional flow of the constraints. In the below screenshot you can see that we implement an expression tasks which follows an error checking activity and evaluates if there were any errors , if yes the execution move to the right else to the left.

To watch the live demo, please visit my YouTube Channel -

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Tuesday, November 13, 2018

SSIS - Foreach Loop Container

In this tutorial, we are going to learn how does For Loop Container work in SSIS package. The For Loop container defines a repeating control flow in a package. The loop implementation is similar to the For looping structure in programming languages. For Loop container groups set of tasks together and allows you to looping through the tasks in your package which is similar like for loop in any programming language. In case until a fixed count than you will use the “For Loop container”

Configuration editor has only two options For Loop and Expressions.

InitExpression: This will initialize the loop with specified values.
EvalExpression: specifies an expression to stop the loop when the expression evaluates to false. This expression must evaluate to a Boolean.
AssignExpression: specifies an expression that changes a condition in the same way each time the loop iterates.Which is optional one.

Drag and drop “For loop” container and configure as shown below

Difference between For Loop and Foreach Loop Container?

A For Loop container loops till a constant value. So for example if you want to loop until a value is 12 or less than 12 so for up-to fixed count you have to use For Loop container.

Whereas Foreach Loop container loops through collection so this collection can files in a folder or collection can be in record set.

Foreach loop - Foreach loop container actually loop through a collection. Now this collection row in a record set files in a folder.

Foreach Loop Container Task: When there is dependency of number of time task should execute on data such as files, variables we will use this.The Foreach Loop container defines a repeating control flow in a package. The loop implementation is similar to Foreach looping structure in programming languages. In a package, looping is enabled by using a Foreach enumerator. The Foreach Loop container repeats the control flow for each member of a specified enumerator.

In short, a For Loop Container uses a loop condition that is defined by expressions whereas Foreach Loop Container iterates through a collection of objects.

With a Foreach Loop Container, we can perform variable mapping as well, which would be useful for tasks within the loop.

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Monday, November 12, 2018

SSIS - How does Aggregate Transformation work

In this tutorial, we are going to learn how does Aggregate Transformation work in SSIS package. The Aggregate transformation is used to perform aggregate operations/functions on groups in a dataset. The aggregate functions available are- Count, Count Distinct, Sum, Average, Minimum and Maximum. The Aggregate transformation has one input and one or more outputs.

The Aggregate transformation handles null values in the same way as the SQL Server relational database engine. The behaviour is defined in the SQL-92 standard. The following rules apply:

In a GROUP BY clause, nulls are treated like other column values. If the grouping column contains more than one null value, the null values are put into a single group.
In the COUNT (column name) and COUNT (DISTINCT column name) functions, nulls are ignored and the result excludes rows that contain null values in the named column.
In the COUNT (*) function, all rows are counted, including rows with null values.

Aggregate Transformation Configuration - You configure the Aggregate transformation at the transformation, output, and column levels. At the transformation level, you configure the Aggregate transformation for performance by specifying the following values:

The number of groups that are expected to result from a Group by operation. They are mostly are string values.
The number of distinct values that are expected to result from a Count distinct operation. These values could be string or numeric.
The percentage by which memory can be extended during the aggregation.
The Aggregate transformation can also be configured to generate a warning instead of failing when the value of a divisor is zero.

For more on Aggregate Transformation - Watch a Live demo here

Conclusion - The Aggregate transformation is asynchronous, which means that it does not consume and publish data row by row. Instead it consumes the whole rowset, performs its groupings and aggregations, and then publishes the results.

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Sunday, September 30, 2018

Build and rebuild clustered columnstore indexes online in SQL Server 2019

Microsoft development team works around the SQL Server to make it as a central part of the Microsoft data platform and we cannot deny the truth about SQL Server that is an industry leader in operational database management systems (ODBMS). Microsoft has been introduced SQL Server 2019 preview which builds on previous releases to grow SQL Server as a platform that gives us choices of development languages, data types, on-premises or cloud, and operating systems.

If we are talking for optimizing database performance then the indexes are one of the most powerful tools for it. This is the biggest fact for the indexes that they can become more fragmented and less effective in case we did any insert, update, and delete operations on the data table. This is a pain for DBAs to take care of indexes and on the regularly basis, index rebuild operations is required.

With the growing sizes of databases, index rebuilds can take a very long time. Build and rebuild clustered columnstore indexes online is the feature of SQL Server 2019 preview where creating clustered columnstore indexes (CCI) was an offline process in the previous versions of SQL Server - requiring all changes stop while the CCI is created.

With SQL Server 2019 preview and Azure SQL Database you can create or re-create CCI online. Workload will not be blocked and all changes made on the underlying data are transparently added into the target columnstore table.

Examples of new Transact-SQL statements that can be used are:

---- Create Clustered Columnstore Index

CREATE CLUSTERED COLUMNSTORE INDEX cci

ON <tableName>

WITH (ONLINE = ON);

---- Altered Clustered Columnstore Index

ALTER INDEX cci

ON <tableName>

REBUILD WITH (ONLINE = ON);

Combine that with the business needs for our applications to be always available and performant and this can be an issue. Big OLTP environments with busy workloads often have very short maintenance windows with some too short to execute large index rebuild operations.

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Wednesday, September 26, 2018

Azure Data Studio

Microsoft Development team is going to be introduced Azure Data Studio in SQL Server 2019 which was released under the preview name SQL Operations Studio in previous version. Azure Data Studio is a lightweight, modern, open source, cross-platform database tool for data professionals using the Microsoft family of on-premises and cloud data platforms on Windows, MacOS, and Linux.

Azure Data Studio offers a modern editor experience with Intellisense, code snippets, source control integration, and an integrated terminal. It is engineered with the data platform user in mind, with built in charting of query result sets and customizable dashboards.

With the help of this tool, now that SQL Server can run on all the major platforms, the Mac and Linux users were having to use third party tools to manage SQL server. Currently you can edit whole table, but not limited data that you need to run query, and edit only the results of that query.

Edit and run queries in a modern development environment with lightning fast Intellisense, code snippets, and source control integration.
Quickly visualize data with built-in charting of your result sets.
Create custom dashboards for your servers and databases using customizable widgets.
Easily manage your broader environment with the built-in terminal.
Analyze data in an integrated notebook experience built on Jupyter.
Enhance your experience with custom theming and extensions.
And explore your Azure resources with a built-in subscription and resource browser.
Supports scenarios using SQL Server Big Data Cluster.

Those of us who don't want to run Microsoft's OS in a VM are really appreciative of the efforts they're making with things like this and VS code in making them cross-platform. Keep in mind, this is in preview, and provides an open source extension platform, so if you want that functionality, you can likely build it if it ends up lacking that in the final release.
Reference:https://docs.microsoft.com

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Monday, September 3, 2018

SSRS - Filling blank cells in matrix

In Reporting Services, tables, matrices, and lists are data regions that display paginated report data in cells that are organized into rows and columns. The cells typically contain text data such as text, dates, and numbers but they can also contain gauges, charts, or report items such as images. Collectively, tables, matrices, and lists are frequently referred to as tablix data regions.

Sometimes, when we have created a matrix/tablix/list with data in it, but there are quite a few blank cells where there is no data. In this case, we don't want to show the blank values on the report and we have to replace the blanks with zeroes.

There are some conditional expressions which can be helpful to full-fill this requirements-

Just right click on the text cells and go to the expression
In expression window, check if the cell contains Nothing then set 0 value else fill the actual value as given below-

=IIF(Fields!MyColumn.Value IS NOTHING , 0 , Fields!MyColumn.Value)

OR

=IIf(IsNothing(Fields!MyColumn.Value),0,Fields!MyColumn.Value)

If the field is empty or blank then it puts 0, otherwise it uses the actual data from the dataset.

Please watch a live demo at our YouTube Channel-

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Friday, August 31, 2018

SSRS - Export SSRS report into multiple excel sheets name

SSRS is the most amazing data visualisation tool which provides many amazing features. One of the best feature of this tools is to export data into multiple sheets in excel and provides Name the worksheets according to group value. We can assign dynamically sheet name / tab name. When SSRS report is exported into excel format then the report data will be divided into multiple sheets and each sheet must have Tablix Header.

To create multi tabbed report, we have to follow the following steps -

Add Row Group to the report and choose Parent Group
In Group Properties window, choose Page Break and checked Between each instance of a group
In Row Group section, click on Group details and go to property window
In property window, open expression window for PageName and assign the group name from the report dataset
To fix tablix header in every new sheet/tab, Click on Advance mode form Row and Column Group and set true property of KeepTogether and RepeatOnNewPage
Preview of the report ---- Done

For actual implementation, please watch the below video -

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Tuesday, August 28, 2018

SQL Server - Import raw data file into SQL Server database

In SQL Server Management Studio, SQL Server Import and Export Wizard option is helpful to into your raw data into SQL Server database. Import Flat File provides us to create simple packages that import and export data between many popular data formats including databases, spreadsheets, and text files. The import and export wizard also create the destination database and the tables into which the data is inserted.

Import Text/CSV file into SQL Server follow below mention steps:

In SQL Server Management Studio, right click on the database and choose Import Flat File from tasks options
Specify Data Source as Flat File Source and Text/CSV file Location
Specify Export location Details
Mapping Source and Destination tables (Leave this option as it is)
Summary page to display our current configuration
RUN THE PACKAGE

To watch a demo here-

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

SSRS - How to creating Dynamic Tabs in SSRS Report

In this video of SSRS Tutorial, we will learn that how to create a SSRS Report with dynamic Tabs and pull the data according to selected tab.

As you know that there is no straight forward way to include tabs in SSRS report builder. We have to achieve this by using different tricks. Here is what we will learn while creating this report by using some normal steps and some logical tricks.

Few steps to follow-

Create Empty report in SSRS in SSRS Project
Add Report Header so can display the Report Title on each of the Page
Add dataset for dynamic tabs
Add Matrix and assign dataset as dataset name property
Drop tabs value from dataset into matrix column header
Add parameter for dynamic tab to filter the data
Set colour formatting for tabs based on the condition as per in the below video
Add action to matrix column and provide the parameter from dataset column
Hide the report parameter
Add another dataset to pull the data from the database and filter the data by passing the value from the parameter
Add table to generate the tablix report and set the dataset name
Drop columns from the report dataset
Do needful formatting to the report
Preview the report

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Thursday, July 19, 2018

Preparation for a successful Data Lake in the cloud

A data lake is conceptual data architecture which is not based on any specific technology. So, the technical implementation can vary technology to technology, which means different types of storage can be utilized, which translates into varying features.

The pillars of a data lake also include scalable and durable storage of data, mechanisms to collect and organise that data, and tools to process and analyze the data and share the findings.
If we are talking about architectural point of views for a well-developed cloud-based data lake then it must be capable to serve many corporate audiences, including IT applications, infrastructure, and operations teams, data scientists and even line of business groups.

If you are planning to develop a successful data lake then you should have to consider the cloud service providers which allow organizations to avoid the cost and hassle of managing an on-premises data center by moving storage, compute, and networking to hosted solutions. Cloud services also offer many other advantages such as ease of provisioning, elasticity, scalability, and reduced administration.

Apart from this, we should have to consider the following things:

Type of storage: Data Lake storage does matter for any organisation because it is directly link with cost and efforts.

The most common data lake implementations utilize:

HDFS (Hadoop Distributed File System)

Proprietary distributed file systems with HDFS compatibility (ex: Azure Data Lake Store)

Object storage (ex: Azure Blob Storage or Amazon S3)

The following options for a data lake are less commonly used due to greatly reduced flexibility:

Relational databases (ex: SQL Server, Azure SQL Database, Azure SQL Data Warehouse)

NoSQL databases (ex: Azure Cosmos DB)

Security capabilities- We have already stated that Data Lake is not based any specific technology. So, implementation of securities, privacy, and governance must be differed to technology to technology.

For example, service such as Azure Data Lake Store implements hierarchical security based on access control lists, whereas Azure Blob Storage implements key-based security. These capabilities are continually evolving in the cloud, so be sure to verify on a frequent basis.

Other hand, AWS has a number of ready-to-roll services here, including AWS Identity and Access Management (IAM) for roles, AWS Key Management Service (KMS) to create and control the encryption keys used to encrypt our data.

Data management services – Data is the most important component for any organisation which is used in different platforms. The data lake analogy is conceived to help bring a common and visual understanding to the benefits of distributed computing systems able to handle multiple types of data, in their native formats, with a high degree of flexibility and scalability.

With the right data captured from a variety of sources, we should be capable to expose that information to data professionals and business decision makers without an oppressive amount of red tape, or bureaucracy from IT.

For example, AWS is introducing AWS Glue as an ETL engine to easily understand data sources, prepare the data, and load it reliably to data stores. Azure Data Lake (ADL) integrations, developers who are required to manage information in those services can use Data Lake Explorer within ADL Tools for Visual Studio Code to get a better and quicker grasp of their cloud-based big data environments.

Data Efficiency and Business Execution - One of the most powerful features of cloud-based deployments is elasticity, which refers to scaling resources up or down depending on demand. Data lakes should be made easily accessible to a wide range of users, and their efforts in implementing and supporting core applications, for any line of business or function and business users are able to utilize this internal data efficiency to help perform core activities more effectively

Disaster recovery - The most critical data from a disaster recovery standpoint is our raw data. The ability to recover our data after a damaging weather event, system error, or human error is crucial.

Azure Data Lake Store provides locally-redundant storage (LRS). Hence, the data in our Azure Data Lake Store account is resilient to transient hardware failures within a region through automated replicas. This ensures durability and high availability, meeting the Azure Data Lake Store SLA.

AWS offers all the tools and capabilities we need to transfer data into the cloud and build comprehensive backup & restore solutions that are compatible with your IT environment.

Please visit us to learn more on -

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Wednesday, July 18, 2018

Why does cloud make Data Lakes better?

A data lake is conceptual data architecture which is not based on any specific technology. So, the technical implementation can vary technology to technology, which means different types of storage can be utilized, which translates into varying features.

The main focus of a data lake is that it is not going to replace a company’s existing investment in its data warehouse/data marts. In fact, they complement each other very nicely. With a modern data architecture, organizations can continue to leverage their existing investments, begin collecting data they have been ignoring or discarding, and ultimately enable analysts to obtain insights faster. Employing cloud technologies translates costs to a subscription-based model which requires much less up-front investment for both cost and effort.

The most of the organizations are enthusiastically considering cloud for functions like Hadoop, Spark, data bases, data warehouses, and analytics applications. This makes sense to build their data lake in the cloud for a number of reasons such as infinite resources for scale-out performance, and a wide selection of configurations for memory, processors, and storage. Some of the key benefits include:

Pervasive security - A cloud service provider incorporates all the aggregated knowledge and best practices of thousands of organizations, learning from each customer’s requirements.
Performance and scalability - Cloud providers offer practically infinite resources for scale-out performance, and a wide selection of configurations for memory, processors, and storage.
Reliability and availability - Cloud providers have developed many layers of redundancy throughout the entire technology stack, and perfected processes to avoid any interruption of service, even spanning geographic zones.
Economics - Cloud providers enjoy massive economies of scale, and can offer resources and management of the same data for far less than most businesses could do on their own.
Integration - Cloud providers have worked hard to offer and link together a wide range of services around analytics and applications, and made these often “one-click” compatible.
Agility - Cloud users are unhampered by the burdens of procurement and management of resources that face a typical enterprise, and can adapt quickly to changing demands and enhancements.

Advantages of a Cloud Data Lake – it is already proved that a data lake is a powerful architectural approach to finding insights from untapped data, which brings new agility to the business. The ability to harness more data from more sources in less time will directly lead to a smarter organization making better business decisions, faster. The newfound capabilities to collect, store, process, analyze, and visualize high volumes of a wide variety of data, drive value in many ways. Some of the advantages of cloud data lake is given below –

Better security and availability than you could guarantee on-premises
Faster time to value for new projects
Data sources and applications already cloud-based
Faster time to deploy for new projects
More frequent feature/functionality updates
More elasticity (i.e., scaling up and down resources)
Geographic coverage
Avoid systems integration effort and risk of building infrastructure/platform
Pay-as-you-go (i.e., OpEx vs. CapEx)

A basic premise of the data lake is adaptability to a wide range of analytics and analytics-oriented applications and users, and clearly AWS has an enormous range of services to match any. Many engines are available for many specific analytics and data platform functions. And all the additional enterprise needs are covered with services like security, access control, and compliance frameworks and utilities.

Please visit us to learn more on -

With over 17 years of experience in the Data Engineering stack across a variety of cloud and on-premises systems, I have successfully delivered more than ten complete business product solutions. My expertise lies in building robust infrastructure and architecture to support data engineering, data analytics, and machine learning processes. These solutions have significantly improved collaboration among cross-functional teams, including data scientists, business analysts, software engineers, and stakeholders. Key Contributions Data Modelling and Integration • Data Modeling: Developed various data models to produce suitable data for business users, data analytics, data science, and data visualization teams. • Legacy Systems and Cloud Technologies: Integrated legacy systems with modern cloud-based technologies (AWS, Azure, GCP), data lakes, and data warehouses. • Streamlined Data Pipelines: Built efficient data pipelines, data warehouses, BI reports, and dashboards to streamline data access and insights.

Subscribe to: Posts (Atom)