Business Intelligence and Data Warehousing - Series 3

As we all know, BI is extension of data warehousing. The data warehouse is basis for BI solutions. First Questions comes in our mind why do we need a data warehouse ?

Following are the key reason for need of the data warehouse:

  • All information in one place
  • All history available
  • Easy to understand
  • Clear and uniform definitions
  • Standardized data
  • Quick access

However, Strict rules to define the architect is not exists for data warehouse but over a few years common architects are grown and help us to define for best our purpose.

  1. Independent Data Marts : Each data mart is built and loaded individually without sharing the metadata
  2. Data mart bus : It is the solutions with confirmed dimensions.
  3. Hub and Spoke : Centralized data warehouse and dependent data marts.
  4. Centralized Data Warehousing : Similar to Hub and Spoke but without spokes so that end user directly access the data warehouse data.
  5. Federated : An architecture where multiple data marts already exists and integrated afterwards. Such approach suitable for virtual data warehouse.

Data warehousing challenges :

  • Data Quality

As per the recent analysis  there is not any company in the world that does not have the data problem in the source application. Data Quality comes from variety of forms like

  1. Duplicate Data
  2. Incomplete Data
  3. Incorrect Data
  4. Conflicting data
  5. Unclear metadata
  6. Missing data
  7. Null Value
  • Data Volume and Performance

The biggest challenges to analyze these amount of data to achieve the acceptable amount data is to achieve an acceptable query performance for end users.  Several techniques in RDBMS to address the performance.

  1. Indexing
  2. Partitioning
  3. Aggregations
  4. Materialized views
  5. Window Functions
  6. Archiving

We will discussing  more about Change Data Capture and Data Modeling for data warehouse in next blogs.

At SPEC India , we follow the following steps for successful execution of Data warehousing implementations.

  • First step to Identify the business problem to solve
  • Second step to design the relational data warehouse to solve the business problem. This is alsp known as dimensional modeling.
  • The next step to identify the data sources and verify the data exists or not.
  • The next step is ETL (Extractions, Transformation and Loading)
  • The next step is to build one or more cubes from the data in the relational DW.

Tags: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,

Comments are closed.