When working on a data warehouse project, there are two well-known methodologies for data warehouse system development: the Corporate Information Factory (CIF) and the Business Dimensional Lifecycle (BDL). Which one is better for the business? The following summary reviews the advantages and disadvantages of each approach.
Corporate Information Factory Definition and Main Principles
This approach, defined by Bill Inmon, is top-down: data is normalized to third normal form (3NF) in an enterprise data warehouse, which in turn feeds the data marts. The enterprise data warehouse is a single repository of enterprise data and provides a framework for Decision Support Systems (DSS). For this top-down approach, the data integration requirements are enterprise-wide.
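As a rough illustration of the top-down flow, the sketch below builds a tiny normalized warehouse with Python's standard sqlite3 module and then derives a data mart from it. The table names, columns, and sample rows are hypothetical and chosen only for this example; a real enterprise data warehouse would be far larger and would not normally live in SQLite.

```python
import sqlite3


def build_edw(conn):
    """Create a tiny normalized warehouse: each fact is stored once, linked by keys."""
    conn.executescript("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            region      TEXT NOT NULL
        );
        CREATE TABLE product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE sales_order (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            product_id  INTEGER NOT NULL REFERENCES product(product_id),
            amount      REAL NOT NULL
        );
        -- Hypothetical sample data.
        INSERT INTO customer VALUES (1, 'Acme', 'East'), (2, 'Globex', 'West');
        INSERT INTO product VALUES (10, 'Widget'), (11, 'Gadget');
        INSERT INTO sales_order VALUES
            (100, 1, 10, 250.0),
            (101, 1, 11, 100.0),
            (102, 2, 10, 75.0);
    """)


def build_sales_mart(conn):
    """Derive a departmental data mart from the warehouse (top-down flow)."""
    conn.execute("""
        CREATE TABLE mart_sales_by_region AS
        SELECT c.region AS region, SUM(o.amount) AS total_amount
        FROM sales_order AS o
        JOIN customer AS c ON c.customer_id = o.customer_id
        GROUP BY c.region
    """)


def sales_by_region(conn):
    """Read the mart back as a {region: total_amount} dictionary."""
    rows = conn.execute(
        "SELECT region, total_amount FROM mart_sales_by_region ORDER BY region"
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
build_edw(conn)
build_sales_mart(conn)
print(sales_by_region(conn))
```

The point of the sketch is the direction of dependency: the mart is built from the integrated warehouse tables, not directly from source systems.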
Corporate Information Factory Pros and Cons
Pros:
- Maintenance is fairly easy
- Subsequent project costs are lower
Cons:
- Building the data warehouse can be time-consuming
- There can be a high initial cost
- Longer time for start-up
- Specialist team required
Business Dimensional Lifecycle Definition and Main Principles
This approach, defined by Ralph Kimball, is bottom-up: data marts are created first to provide reporting. The data architecture is a collection of conformed dimensions and conformed facts that are shared between two or more data marts. The data integration requirements for this bottom-up approach are scoped to individual business areas.
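To make the idea of a shared dimension concrete, here is a minimal sketch, again using Python's sqlite3 module, in which two data marts (sales and inventory) reference the same product dimension. All table names, columns, and rows are assumptions made for illustration. Because both fact tables use the same dimension, their results can be combined on a common attribute, a pattern often called drilling across.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Shared dimension: one definition of "product" used by every mart.
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        category    TEXT NOT NULL
    );
    -- Sales data mart fact table.
    CREATE TABLE fact_sales (
        product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
        units_sold  INTEGER NOT NULL
    );
    -- Inventory data mart fact table.
    CREATE TABLE fact_inventory (
        product_key   INTEGER NOT NULL REFERENCES dim_product(product_key),
        units_on_hand INTEGER NOT NULL
    );
    -- Hypothetical sample data.
    INSERT INTO dim_product VALUES
        (1, 'Widget', 'Hardware'),
        (2, 'Gadget', 'Hardware'),
        (3, 'Manual', 'Books');
    INSERT INTO fact_sales VALUES (1, 5), (2, 3), (3, 7), (1, 2);
    INSERT INTO fact_inventory VALUES (1, 40), (2, 10), (3, 25);
""")


def drill_across(conn):
    """Aggregate each mart separately, then join on the shared dimension attribute."""
    query = """
        WITH sales AS (
            SELECT d.category AS category, SUM(f.units_sold) AS units_sold
            FROM fact_sales AS f
            JOIN dim_product AS d ON d.product_key = f.product_key
            GROUP BY d.category
        ),
        inventory AS (
            SELECT d.category AS category, SUM(f.units_on_hand) AS units_on_hand
            FROM fact_inventory AS f
            JOIN dim_product AS d ON d.product_key = f.product_key
            GROUP BY d.category
        )
        SELECT s.category, s.units_sold, i.units_on_hand
        FROM sales AS s
        JOIN inventory AS i ON i.category = s.category
        ORDER BY s.category
    """
    return conn.execute(query).fetchall()


for row in drill_across(conn):
    print(row)
```

Each mart is aggregated on its own before the join, which avoids double counting when the two fact tables have different numbers of rows per product.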
Business Dimensional Lifecycle Pros and Cons
Pros:
- Takes less time to build the data warehouse
- Low initial cost with fairly predictable subsequent costs
- Fast initial setup
- Only a generalist team is required
Cons:
- Maintenance can be difficult, redundant and subject to revisions
The top-down approach is characterized by an enterprise data warehouse, relational tools, a normalized data model, greater design complexity, and a discrete time frame. The bottom-up approach, by contrast, is characterized by dimensional tools, a process orientation, and a slowly changing time frame.
Both CIF and BDL use Extract, Transform and Load (ETL) to load the data warehouse. However, how the data is modeled, loaded and stored differs. These architectural differences affect the delivery time of the data warehouse and its ability to accommodate changes in ETL design.
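The contrast can be sketched in plain Python. In the hypothetical pipeline below, the extract and transform steps are shared, and only the load step changes: one loader splits the data into normalized entities linked by keys, the other writes a denormalized dimension plus a fact table. The function names, field names, and sample records are assumptions for illustration and do not come from either methodology's reference material.

```python
def extract():
    """Stand-in for reading rows from a source system (hypothetical data)."""
    return [
        {"customer": " acme ", "region": "east", "amount": "250.0"},
        {"customer": "Globex", "region": "West", "amount": "75.5"},
        {"customer": "ACME", "region": "East", "amount": "100"},
    ]


def transform(rows):
    """Cleansing shared by both approaches: trim, normalize case, cast types."""
    return [
        {
            "customer": r["customer"].strip().title(),
            "region": r["region"].strip().title(),
            "amount": float(r["amount"]),
        }
        for r in rows
    ]


def _key(lookup, value):
    """Return the existing key for a value, or assign the next integer key."""
    return lookup.setdefault(value, len(lookup) + 1)


def load_normalized(rows):
    """Normalized load: region and customer are separate entities linked by keys."""
    region_ids, customer_ids = {}, {}
    customer_rows, order_rows = {}, []
    for r in rows:
        region_id = _key(region_ids, r["region"])
        customer_id = _key(customer_ids, r["customer"])
        customer_rows[customer_id] = {"name": r["customer"], "region_id": region_id}
        order_rows.append({"customer_id": customer_id, "amount": r["amount"]})
    return {"region": region_ids, "customer": customer_rows, "sales_order": order_rows}


def load_dimensional(rows):
    """Dimensional load: region is folded into the customer dimension row."""
    customer_keys, dim_customer, fact_sales = {}, {}, []
    for r in rows:
        key = _key(customer_keys, r["customer"])
        dim_customer[key] = {"name": r["customer"], "region": r["region"]}
        fact_sales.append({"customer_key": key, "amount": r["amount"]})
    return {"dim_customer": dim_customer, "fact_sales": fact_sales}


clean_rows = transform(extract())
normalized = load_normalized(clean_rows)
dimensional = load_dimensional(clean_rows)
print(sorted(normalized))
print(sorted(dimensional))
```

In this toy version, adding a new attribute to the source would touch a different set of target structures in each loader, which hints at why ETL changes ripple differently in the two architectures.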