Advanced Data Warehouse Data Design Topics
Thomas Haughey
President
InfoModel LLC
March 4, 2007
3:30 PM - 6:45 PM
Level: Advanced

This presentation is for experienced data warehouse architects and database designers. It describes the most challenging data design problems that data warehousing has faced. Among them are handling aggregation, heterogeneous product and transaction types, time and history, changing dimensions, and late-arriving data; supporting data with different rates of change and stability; and supporting large-scale database environments such as MPP (massively parallel processing). Designing a data warehouse requires different roles and uses of data, a different use of normalization, and new modeling constructs. Its key special requirements focus on the time, location, and dimensional aspects of data. These requirements are among the reasons that analytical data modeling demands different skills, perspectives, and techniques.

Topics include:

  • Data warehouse architectures
  • New view of dimensional modeling
  • Required snowflakes
  • Conforming facts and dimensions
  • Heterogeneous dimensions and facts
  • Changing dimensions and facts
  • Mixed changes
  • Modeling for different types of time changes
  • Fact to fact joins
  • Do all facts have counts and amounts? Are all dimensions without them?
  • Factless facts
  • Fact or dimension
  • Design for parallel
  • Multiple roles
  • Use of surrogate keys
  • Handling multi-valued dimensions
  • Dimensions with varying characteristics
  • Handling complex dimensions, such as hierarchical, ragged, multiple dimensions
  • Handling time and history
  • Surrogate keys
  • Name value pairs
  • What changed?
  • Detecting change data
  • Problems with flattening T1 and T2 dimensions
  • Designing aggregates
  • Aggregates vs. on-the-fly
  • Supporting restatement of aggregates
  • Predicate analysis for star joins
  • Designing for trickle load
  • Exercises

Tom Haughey is considered one of the originators of Information Engineering in America. He is currently President of InfoModel LLC, a training and consulting company specializing in practical, rapid development methods. His courses on data management, data warehousing, Information Engineering, and software development have been delivered to Fortune 1000 companies around the world. He worked on the development of seven different CASE tools, over 40,000 copies of which have been sold to date. Tom was formerly Chief Technology Officer for the Pepsi Bottling Group and Enterprise Director of Data Warehousing for PepsiCo. He was also Vice President of Technology for Computer Systems Advisers, who market the CASE tools POSE and SILVERRUN. He wrote his own CASE tool in 1984. He worked for IBM for 17 years as a Senior Project Manager. Tom is the author of many articles on Data Management, Information Engineering, and Data Warehousing.