Article on Integrating XML Operational Data into a Data Warehouse
August 24, 2009
I have previously covered how DB2 9.7 supports native XML data with hash partitioning (database partitioning), range partitioning (table partitioning), and multi-dimensional clustering. These new features make it feasible to analyze information in native XML format, side-by-side with relational data, in a data warehouse. Working with native XML data in such scenarios also offers many efficiencies and advantages.
In reality, many data warehouse projects involve pulling different types of information from disparate data sources around an organization. My colleagues have written a two-part series of articles that provides step-by-step instructions for integrating information from such sources into a data warehouse. The first article is now available: "IBM InfoSphere DataStage and DB2 pureXML, Part 1: Integrate XML operational data into a data warehouse."
Part 1 explains how to use IBM® InfoSphere™ DataStage to extract and transform XML data managed by DB2® pureXML®. It also explores how DataStage can load this data into two kinds of target tables: one that uses only traditional SQL data types, and one that has both relational and XML columns. The article includes sample scripts and data that you can download.
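To make the second kind of target table more concrete, here is a minimal sketch in plain Python of splitting one XML document into relational values plus an XML fragment that is kept intact. It is not DataStage and not taken from the article; the element names, the sample record, and the function shred_customer are all hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical operational record; element and attribute names are
# illustrative assumptions, not taken from the article's sample data.
SAMPLE = """<customer id="1001">
  <name>Ann Lee</name>
  <city>Toronto</city>
  <orders>
    <order sku="A1" qty="2"/>
    <order sku="B7" qty="1"/>
  </orders>
</customer>"""


def shred_customer(xml_text):
    """Split one XML document into relational values and an XML fragment."""
    root = ET.fromstring(xml_text)

    # Values that would map to traditional SQL columns.
    relational = {
        "id": int(root.get("id")),
        "name": root.findtext("name"),
        "city": root.findtext("city"),
    }

    # The nested <orders> subtree stays as XML, as it would in an XML column.
    orders = root.find("orders")
    orders_xml = None
    if orders is not None:
        orders_xml = ET.tostring(orders, encoding="unicode").strip()

    return relational, orders_xml


relational, orders_xml = shred_customer(SAMPLE)
```

Here the scalar values become ordinary columns while the repeating order data remains XML, which mirrors the idea of a table with both relational and XML columns.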
The second part of this article series will explore another important scenario: using DataStage to read information from a flat file, convert the data into an XML format, and load this XML data into a data warehouse that contains a table with a DB2 pureXML column.
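As a rough illustration of that flat-file-to-XML conversion, the following sketch uses only the Python standard library. It is not how DataStage performs the step; the column names, the sample rows, and the function rows_to_xml are assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical comma-delimited flat file content (header row plus two records).
FLAT = "id,name,city\n1001,Ann Lee,Toronto\n1002,Raj Patel,Austin\n"


def rows_to_xml(flat_text, row_tag="customer"):
    """Convert each flat-file row into one small XML document (as a string)."""
    docs = []
    for row in csv.DictReader(io.StringIO(flat_text)):
        elem = ET.Element(row_tag)
        # One child element per column, named after the column header.
        for column, value in row.items():
            ET.SubElement(elem, column).text = value
        docs.append(ET.tostring(elem, encoding="unicode"))
    return docs


docs = rows_to_xml(FLAT)
```

Each resulting string is a well-formed XML document that could then be loaded into an XML column by whatever loader is in use.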