Friday, September 11, 2015

DW - OLAP Servers

OLAP is a powerful analysis tool for forecasting, statistical computations, aggregations and involves more than just the multidimensional display of information. OLAP tools also must be able to extract and summarise requested data according to the needs of an end user, and there are two approaches for this data extraction that need to be discussed.
As we know about that data in a data warehouse is organized to support analysis rather than to process real-time transactions as in online transaction processing systems (OLTP).  OLAP technology enables data warehouses to be used effectively for online analysis, providing rapid responses to iterative complex analytical queries.

OLAP is stand for Online Analytical Processing and OLTP Server is the chief component which stays between a client and a database management systems (DBMS).

In very simple words, OLAP servers present business users with multidimensional data from data warehouse or data marts, without concerns regarding how or where the data are stored. The OLAP servers are key points to understand that how data is organized in the database and has special functions for analyzing the data.

OLAP's multidimensional data model and data aggregation techniques organise and summarise large amounts of data so it can be evaluated quickly using online analysis and graphical tools. The answer to a query into historical data often leads to subsequent queries as the analyst searches for answers or explores possibilities. OLAP systems provide the speed and flexibility to support the analyst in real time.
Types of OLAP Servers
Cubes in a data warehouse are stored in three different modes and we can have four types of OLAP servers which are given below:
Relational OLAP (ROLAP) Servers: These are the intermediate servers that stand in between a relational back-end server and client front-end tools. They use a relational or extended-relational DBMS to store and manage warehouse data, and OLAP middle-ware to support missing pieces. ROLAP servers include optimization for each DBMS back end, implementation of aggregation navigation logic, and additional tools and services. ROLAP technology tends to have greater scalability than MOLAP technology. 
The ROLAP storage mode causes the aggregations of the partition to be stored in indexed views in the relational database that was specified in the partition's data source.
Advantages of ROLAP
  1. ROLAP is considered to be more scalable in handling large data volumes, especially models with dimensions with very high cardinality (i.e., millions of members).
  2. With a variety of data loading tools available, and the ability to fine-tune the ETL code to the particular data model, load times are generally much shorter than with the automated MOLAP loads.
  3. The data are stored in a standard relational database and can be accessed by any SQL reporting tool (the tool does not have to be an OLAP tool).
  4. ROLAP tools are better at handling non-aggregable facts (e.g., textual descriptions). MOLAP tools tend to suffer from slow performance when querying these elements.
  5. By decoupling the data storage from the multi-dimensional model, it is possible to successfully model data that would not otherwise fit into a strict dimensional model.
  6. The ROLAP approach can leverage database authorization controls such as row-level security, whereby the query results are filtered depending on preset criteria applied, for example, to a given user or group of users (SQL WHERE clause).
Disadvantages of ROLAP
  1. There is a consensus in the industry that ROLAP tools have slower performance than MOLAP tools. However, see the discussion below about ROLAP performance.
  2. The loading of aggregate tables must be managed by custom ETL code. The ROLAP tools do not help with this task. This means additional development time and more code to support.
  3. When the step of creating aggregate tables is skipped, the query performance then suffers because the larger detailed tables must be queried. This can be partially remedied by adding additional aggregate tables, however it is still not practical to create aggregate tables for all combinations of dimensions/attributes.
  4. ROLAP relies on the general purpose database for querying and caching, and therefore several special techniques employed by MOLAP tools are not available (such as special hierarchical indexing). However, modern ROLAP tools take advantage of latest improvements in SQL language such as CUBE and ROLLUP operators, DB2 Cube Views, as well as other SQL OLAP extensions. These SQL improvements can mitigate the benefits of the MOLAP tools.
  5. Since ROLAP tools rely on SQL for all of the computations, they are not suitable when the model is heavy on calculations which don't translate well into SQL. Examples of such models include budgeting, allocations, financial reporting and other scenarios.
Multidimensional OLAP (MOLAP) servers: These servers support multidimensional views of data through array-based multidimensional storage engines. They map multidimensional views directly to data cube array structures. The advantage of using a data cube is that it allows fast indexing to pre-computed summarized data.
This is the more traditional way of OLAP analysis. In MOLAP, data is stored in a multidimensional cube. The storage is not in the relational database, but in proprietary formats. Most MOLAP solutions store these data in optimized multidimensional array storage, rather than in a relational database.

Advantages of MOLAP
  1. Fast query performance due to optimized storage, multidimensional indexing and caching.
  2. Smaller on-disk size of data compared to data stored in relational database due to compression techniques.
  3. Automated computation of higher level aggregates of the data.
  4. It is very compact for low dimension data sets.
  5. Array models provide natural indexing.
  6. Effective data extraction achieved through the pre-structuring of aggregated data.
  7. Cube technology are often proprietary and do not already exist in the organization. Therefore, to adopt MOLAP technology, chances are additional investments in human and capital resources are needed.
Disadvantages of MOLAP
  1. Within some MOLAP Solutions the processing step (data load) can be quite lengthy, especially on large data volumes. This is usually remedied by doing only incremental processing, i.e., processing only the data which have changed (usually new data) instead of reprocessing the entire data set.
  2. Some MOLAP methodologies introduce data redundancy.
Hybrid OLAP Servers: They are combination of ROLAP (Relational OLAP) and MOLAP (Multidimensional OLAP) which are other possible implementations of OLAP. HOLAP allows storing part of the data in a MOLAP store and another part of the data in a ROLAP store, allowing a tradeoff of the advantages of each. The degree of control that the cube designer has over this partitioning varies from product to product.

HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For summary-type information, HOLAP leverages cube technology for faster performance. When detail information is needed, HOLAP can "drill through" from the cube into the underlying relational data. For example, a HOLAP server may allow large volumes of detail data to be stored in a relational database, while aggregations are kept in a separate MOLAP store. The Microsoft SQL Server 7.0 OLAP Services supports a hybrid OLAP server.


Specialized SQL Servers: Specialized SQL servers provide advanced query language and query processing support for SQL queries over star and snowflake schemas in a read-only environment.

References:
http://social.technet.microsoft.com/wiki/contents/articles/19898.aspx

2 comments:

  1. Please correct your starting statement "OLTP (should be OLAP) is stand for Online Analytical Processing and OLTP Server is the chief component which stays between a client and a database management systems (DBMS).

    ReplyDelete
    Replies
    1. Thank you so much Syed. I have updated this one.

      Delete