XML in Oracle 11g
September 30, 2008
Here are some observations about the XML storage capabilities in Oracle 11g. These observations were deduced from public sources. Please be aware that I work for IBM who competes directly with Oracle in this regard.
Oracle provides three options for storing XML data:
- Unstructured, which is essentially Character Large OBject (CLOB) storage. Like any CLOB implementation, you will need to retrieve and parse the XML data before executing XPath and XQuery statements, which has a query-time performance impact.
- XML-Object-Relational, which shreds the XML data into object-relational tables. There are multiple storage options to choose from (which I will cover in a moment). Oracle recommends this option for data-centric use cases. Of course, with this option, retrieving the original XML data will incur a performance hit as the data is re-composed.
- XML-Binary, which stores the XML data as a token stream in a Binary Large OBject (BLOB). Oracle recommends this option for document-centric use cases.
If you use XML-Object-Relational, which is also known as Structured or Schema-based storage, you have five different options to choose from for storing repeating elements:
- Store in CLOBs
- Store in VARRAY as LOBs
- Store in VARRAY as nested tables
- Store in VARRAY as XMLType pointers to BLOBs
- Store in VARRAY as XMLType pointers to nested tables
If your head is spinning with the different options, I don’t blame you.
Oracle provides a special index called
XMLindex, which indexes the internal structure of XML data. Actually, this index is a table. There is an interesting post on the Oracle Discussion Forum. In this post, an Oracle user describes their experiments with XBRL data in Oracle 10g and 11g. In their experiments, case 3 runs on Oracle 11g without XMLIndex in 9.6 seconds, while case 4 runs with XMLIndex and takes 574 seconds. So, in this user’s experience, running with the XMLIndex results in approximately a 50x slowdown.
Note that if you want to create an index for numeric or date values in Oracle 11g, you must use stored procedures, which create separate indexes that you must asynchronously maintain.
Oracle now has an XML update function. It is not compliant with the W3C XQuery Update Facility. Although, Oracle did recently announce the XQuilla XQuery engine, which they claim “will (maybe) free the way for the W3C XQuery Update Facility 1.0 candidate specification / implementation, which is embedded in the XQuilla XQuery engine, for other Oracle products”. So it is possible that a future release of Oracle may support the W3C XQuery Update Facility.
The compendium of XML storage options in Oracle 11g are, in my opinion, essentially based upon existing relational infrastructure. This is very different than the approach that IBM has taken, where they have truly built native support for XML data into their database from the ground up.
Please note that these are solely my personal opinions and not necessarily those of my employer IBM.