Data warehousing & Business Intelligence

Get Started. It's Free
or sign up with your email address
Data warehousing & Business Intelligence by Mind Map: Data warehousing & Business Intelligence

1. Project Planning

1.1. Project Team

1.1.1. Business Business Sponsor Justifies and sponsors the Project Ensures Commitment from end users Handles investment decision Top level or SBU head level executive Understands business pain points Understands DW Concepts Business Analyst Subject matter Expert Understands Business Conducts and documents business requirements Skills Business Champion Represents interest and concerns of business user community Strong understanding of business pain points Strong understanding of business process Good communicator

1.1.2. Project Management Program / Project Manager has overall responsibility for the delivery works with the project sponsor, user community, and development team Develops and manages the project plan Manages project scope, timeline and budget Resolves budget issues Communicates project progress to Project Sponsor(s) Responsible for delivering on the commitments

1.1.3. Technical DBA management and maintenance of the physical tables of the Data Warehouse, Staging Area Implements Backup and recovery procedures Data Warehouse Architect / Data Modeller responsible for developing the logical data models documenting the models in Entity-Relationship Diagrams (ERD) must work with the Business Analysts and ETL Developers to translate the business requirements into a logical model develops the overall Data Warehouse logical architecture e.g. configuration of the data warehouse, data marts, and an operational data store or staging area OLAP Modeler Responsible for developing OLAP models ETL Developer Expert in ETL tools like Informatica, SSIS, Talend Develops ETL routines Report developer Develops canned reports Skills: Crystal reports, Cognos report studio, Jasper reports

1.1.4. Training Trainers

1.1.5. Testing QA Project Manager Testing Analyst UAT Analyst Tester

1.2. Peroject Plan

1.3. Documentation

2. Testing Framework

3. Business Intelligence

3.1. Pyramid of Decision-making

3.1.1. Strategic

3.1.2. Tactical

3.1.3. Operational

3.2. Definition

3.2.1. Business intelligence (BI) is an umbrella term that includes the applications, infrastructure and tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance.

3.3. Benefits

3.3.1. improves decision making

3.3.2. identify cost-cutting ideas

3.3.3. uncover business opportunities

3.3.4. improves business operations

3.4. OLAP


3.4.2. Operations Drill-up Drill-down Drill thru Slice & Dice Pivot

3.4.3. Structure Dimensions Hierarchies Levels Members Measures Cube

3.4.4. Demonstration

3.4.5. MDX

4. Data Warehousing

4.1. database that is maintained separately from organization's operational database(s)

4.2. Key personalities

4.2.1. Bill Inmon (father of data warehousing, Coined the term in 1970) Data warehouse is a part of the overall business intelligence system An enterprise has one data warehouse, and several data marts source their information from the data warehouse Information is stored in 3NF Definition subject-oriented integrated time-variant non-volatile collection of data in support of mgmt. decision-making process

4.2.2. Ralph Kimball conglomerate of all data marts within the enterprise Information is always stored in the dimensional model

4.2.3. Dan Linstedt (Data Vault)

4.3. Kimball vs Inmon

4.3.1. Requirements Strategic (I) Tactical (K)

4.3.2. Data integration Enterprise wide (I) Departmental (K)

4.3.3. Data Structure Meets multiple and varied information requirements including non-metric data (I) Primarily metric data, business performance, scorecards, multiple dimensions (K)

4.3.4. Skill set requirements Bigger team of specialists (I) Relatively smaller team of generalists (K)

4.3.5. Time to market High (I) Low, incremental

4.3.6. Start-up cost High (I) Low (K)

4.4. Technology

4.4.1. Hardware considerations Scalability Parallel Processing Support CPU Speed Disk Capacity Memory Capacity (RAM) I/O Capacity

4.4.2. Software Databases Oracle Microsoft SQL Server IBM DB2 Teradata Sybase MYSQL Infobright HP Vertica SAP Hana IBM Dash DB ETL SQL Server Integration Services Informatica Ab Initio IBM DataStage Talend Pentaho Kettle Reporting Cognos SAP Business Objects Microstrategy Crystal Reports SQL Server Reporting Services OBIEE Pentaho Jaspersoft Spago BI OLAP Cognos Business Objects Microstrategy Pentaho Mondrian SQL Server Analysis Services In-memory Analytics Tableau Qlikview Predictive Analytics / Data Mining SAS R SSAS Metadata ERWIN DW Life cycle management / Automation Kalido BI Ready WhereScape Open Source BI Jaspersoft Pentaho Spago BI

4.4.3. DW Appliance Netezza Teradata DATAllegro

4.4.4. Standards Common Warehouse Metamodel (CWM)

4.5. OLTP vs Data Warehousing

4.6. Datamart

4.6.1. A data mart is a subset or an aggregation of the data stored to a primary data warehouse. It includes a set of information pieces relevant to a specific business area, corporate department, or category of users.

4.6.2. Dependent

4.6.3. Independent

4.7. Architecture

4.7.1. Layers Data Source layer Data Extraction layer Staging Layer Data Transformation and Load Data storage (DW layer) Aggregate layer Presentation Layer Metadata Layer Business Metadata Technical Metadata Operational Metadata Operation layer

4.8. Critical Success Factors

4.8.1. Sponsorship and Involvement

4.8.2. Business Requirements

4.8.3. Enterprise Architecture

4.8.4. DW Architecture and Design

4.8.5. DW Technology

4.8.6. Information / Data Quality

4.8.7. Development Environment

4.9. Dimensional Modelling

4.9.1. Overview set of techniques and concepts used in data warehouse design Does not necessarily involve a relational database and can be used with RDBMS Column Stores Multi-dimensional Databases or even a flat files json intended to support end user queries oriented around understandability & performance always use concept of facts and dimensions

4.9.2. 4 Step dimensional design process select a business process business processes are the operational activities performed by an organization declare grain examples grain is level of details we want to capture in datawarehouse declaring grain is the pivotal step in dimensional modelling grain must be declared before choosing dimensions or facts because every candidate dimension or fact must be consistent with the grain identify dimensions Dimension provides "who, what, where, when, why, and how" context surrounding a business process event Dimension provides descriptive attributes used by BI systems. Used for filtering and grouping of facts A disproportionate amount of effort is put into the modelling of dimensions because they are the drivers of the user’s BI experience identify facts Facts are the measurements that result from a business process event and are almost always numeric

4.9.3. Bus Architecture key element of incremental DW development approach it is technology and database independent it decomposes DW planning process into manageable pieces by focusing on organization's core processes

4.9.4. Schemas Star Schema Snowflake Schema Fact Constellation

4.9.5. Development Approaches Top Down Driven by user requirements Bottom Up Driven by existing source systems Hybrid Combination of Top Down / Bottom Up

4.9.6. Techniques Dimension Tables Structure Surrogate Keys Multiple hierarchies Role playing dimensions Slowly Changing Dimensions Conformed Dimensions Junk Dimension Degenerated dimensions Time Dimension Dimension Hierarchies Regular (fixed depth) hierarchy Ragged hierarchy Fact Tables Structure Types of meaasures Types of Fact Tables

5. Other Related fields

5.1. Master data Management

5.2. Corporate Performance Management

5.3. Complex Event Processing