Data
Warehousing |
I.Data
Warehousing Overview
Overview
Typical uses
2.Definition,
Architecture And Concepts
Enterprise Data Model
Operational vs. historical data
Extract Transform Load (ETL)
Metadata
Data warehouse vs. data mart
Data mining
OLAP vs. OLTP
Massive size implementation
Logical design vs. physical design
Normalization vs. denormalization
Referential constraints
3.Data Modelling Options
Entity model
Star schema
Snowflake schema
4.Dimensional Modelling
Development Life Cycle
Requirements analysis
Requirements gathering
Requirements validation
Requirements modelling
Schema design
Project definition
Warehouse design
Implementation
Follow-up and review
5. Dimensional Modelling Design
Overview
Metadata properties
Star schema
Snowflake schema
Cubes
Measures and facts
Attributes and relationships
Dimension
Hierarchies
Role-playing dimensions
Joins
Summary tables and aggregation
Exercises
6. Implementation Options
Overview
Top down
Bottom up
Sizing
Cleaning
Populating the data warehouse
7. Extract, Transform, Load (Etl)
Terms And Concepts
Options
Extraction options
Transformation options
Loading options
Change Data Capture and publishing
Staging areas
8. Extracting
Logical-to-physical data
mapping
Disparate (heterogeneous) data sources
Extracting changes data ?delta or other
9. Data Cleaning And Conforming
Data quality criteria
Design methods and alternatives
Cleaning deliverables
Conforming dimension tables
Conforming fact tables
10. Dimension Table Delivery
Dimension table structure
Surrogate key generation
Dimension table grain
Flat (denormalized) or snowflake?
Data and time dimensions
態ig?vs. 憇mall?dimensions
Dimensional roles
Dimensions as subdimensions
Degenerate dimensions
11. Slowly Changing Dimensions
And Multivalued Dimensions
Type 1
Type 2
Type 3
Hybrid
Late arrivals
Definition
Bridge tables
12. Fact Table Delivery
Fact table structure
Referential integrity (RI)
Surrogate key derivation and flow
Fundamental grain
Transaction fact tables
Factless fact tables
Periodic snapshots
Accumulating snapshots
13. fact table load
considerations
Index management
Partition management
Updates, deletes and inserts
Recovery
Summary tables
Parallelism
14. Data Warehouse Performance
Design
Materialized views
Large concurrent reports
Short running queries
Long running queries
Random queries
Occasional updates
On-line utilities
Index options
Partitioning and parallelism (e.g., LOADs)
15. Introduction To Statistics,
Analytic And Olap Sql Queries
AVG
CORRELATION
COUNT
COUNT_BIG
CONVARIANCE
MAX
MIN
RAND
STDDEV
SUM
VARIANCE
Regression function
GROUPING, ROLLUP AND CUBE
16. Physical Design
Considerations
Denormalization
Index choices
Data placement
Free space
Summary tables
Data compression
|
|
|
|
OCOT
Advantages
100
%Instructor-Led Class
State-of-the-Art Facilities
Unlimited Lab Time
Labs Open 7-days a Week
Free Repeat
Free Job Placement
Financial Aid Possible
Resume Writing
Interview Skills |
|
© 2008 Ontario College of
Technology
|