Data Warehouse

The term of Data Warehouse was introduced by Bill Inmon in 1990. According to Inmon, A data warehouse (OLAP) is a database, which is kept separate from the organization's operational databases (OLTP) and having subject oriented, integrated, time-variant, and non-volatile collection of data. The primary goal of a data warehouse is to provide a generalized and consolidated data in multidimensional view to enable business users to make better decisions.
In simple words, A Data warehouse contains the data from all/many segments of the business and more organized specifically to "facilitate Reporting and Analysis.


Important facts about Data warehouse:
  1. A data warehouse (OLAP) as a collection of data-marts and each data-mart consists of one to many (OLTP) databases where the database is specific to a specific problem set.
  2. Data Warehouse provides a high performance for reporting and analytical queries which is used by executives to organize, understand, and use their data to take strategic decisions.
  3. Dimensional Modeling techniques are used for the Data Warehouse design.
  4. Data is not updated frequently in data warehouse and possesses consolidated historical data.

Data Warehouse Features
As we know that Data warehousing has a clear set of objectives such as data persistence, single easy to navigate data model, fast query performance, etc. and have the following key features -
Subject Oriented - A data warehouse is subject oriented because it provides information around a subject (product, customers, suppliers, sales, revenue and inventory etc) rather than the organization's ongoing operations. A data warehouse focuses on modelling and analysis of data for decision making.
Integrated - A data warehouse is constructed by integrating data from heterogeneous sources such as relational databases, flat files, etc. This integration enhances the effective analysis of data.
Time-Variant - The data collected in a data warehouse is identified with a particular time period. The data in a data warehouse provides information from the historical point of view. The data warehouse clearly must account for changes in the source system.

Non-Volatile - Once data is in the data warehouse, it will never be changed means the previous data is not erased when new data is added to it. A data warehouse is kept separate from the operational database and therefore frequent changes in operational database is not reflected in the data warehouse.

Why Put Separate Data Warehouse?
There are following reasons to put separate data warehouse from the operational databases-
To maintain the high performance for both systems:
  • Operational Database is tuned for OLTP activities such as access methods, indexing, concurrency control, recovery
  • Data Warehouse is tuned for OLAP activities such as complex OLAP queries, multidimensional view, and consolidation.
Having different functions and different data:
  • Missing data: Business decision support requires historical data which operational Databases do not typically maintain the missing data.
  • Data consolidation: Data Warehouse requires consolidation (aggregation, summarization) of data from heterogeneous sources.
  • Data quality: different sources typically use inconsistent data representations, codes and formats which have to be reconciled.
To know more on OLTP vs OLAP

OLAP is a powerful analysis tool for forecasting, statistical computations, aggregations and involves more than just the multidimensional display of information. OLAP tools also must be able to extract and summarise requested data according to the needs of an end user, and there are two approaches for this data extraction that need to be discussed.
Needs of Data Warehouses?
A typical organization generates lots of data during their day to day business activities. To take the remarkable decision to get success in the business, information assets (data) are seriously valuable to any enterprise, and because of this, these assets must be properly stored and readily accessible when they are needed. This information usually contains historical data which derived from transaction data and can include data from other sources. After stored the data in the data warehouse, it is ready to focusing on the modeling and analysis of data for decision makers, not on daily operations or transaction processing.


In this way, it provides a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process. They are designed to accommodate ad hoc queries and updated on a regular basis by the ETL process (run nightly or weekly) using bulk data modification techniques.
We need to clean and process our operational data which comes from various source systems before putting it into the warehouse and business users directly access data derived from several source systems through the data warehouse.  
To know more, click on 
  1. Data Warehouse Architecture and Multidimensional Model
  2. Data Warehouse - Dimension tables.
  3. Data Warehouse - Fact tables.
  4. Data Warehouse - Conceptual Modeling.
  5. Data Warehouse - Star schema.
  6. Data Warehouse - Snowflake schema.
  7. Data Warehouse - Fact constellations
  8. Collaboration of OLTP and OLAP systems
  9. Major differences between OLTP and OLAP
  10. Data Warehouse - Multidimensional Cube

In these days, Data warehouses are quickly growing in popularity as a way to manage the “hybrid cloud” situation and many companies now have to deal with as their data sources increasingly are located both in the Cloud and on-premises. They are using the both options. A real data warehouse, as opposed to cubes with disparately organized data, is needed to consolidate all of the islands of information into a central business user friendly repository to create the foundation for modern BI to take place.

5 comments:

  1. Good recommendable job running by you thanks for sharing valuable knowledge about SQL Server. and MSBI......I also very much passionate about sql. Please keep continue.........

    ReplyDelete
  2. Hi, the information you provided here is very informative and useful for me. If anyone interested to learn AngularJS Training reach GangBoard. MSBI Training | Hadoop Training

    ReplyDelete
  3. This is a very impressive and interesting blog!
    If you like, you can check out my data warehouse blog too.

    https://dwhblog.org

    ReplyDelete

  4. This is a great pressure information for me. I would be really like that our marvelous posted. Thankful this effortless post.
    Selenium Training in Chennai | Best Selenium Testing Training Institute in Chennai

    ReplyDelete
  5. Great post! I am actually getting ready to across this information, It's very helpful for this blog.Also Selenium Training in Chennai great with all of the valuable information
    you have to Keep up the good work you are doing well.

    ReplyDelete

Popular Posts