Many people come here while searching for the best native XML database. Being employed by a vendor with a leading native XML database, I cannot make impartial judgements in this regard, so I’m not going to try. I can, however, make sure you know how easy it is to evaluate my employer’s native XML capabilities, so you can make up your own mind.

Not only does IBM offer a production-quality native XML database at no charge, it also provides publication-quality books about it for free. This makes it very easy for you to get started with XML databases.

DB2 9 Express-C has no data storage limits. You can store as much data as you like in the database. DB2 9 Express-C has no evaluation time limit. You can use it as long as you like. The only limits are that you use a server with a maximum of 2 cores and 2GB of RAM. Download it from the DB2 Express-C Web page.

IBM has published two books about native XML storage. These books are available for purchase, but you can also view them freely as HTML or download the PDF versions for offline viewing and printing. The two books are:

  • DB2 9 pureXML Guide
  • DB2 9: pureXML Overview and Fast Start

Good luck with your evaluations.

= = = = = = = = = = = = = = = = = = = = = = = = = = =

!! Additional comment, December 2012:

NOTE: The redbook “DB2 pureXML Guide” was a great resource for DB2 9.1. However, DB2 versions 9.5, 9.7, and 10.1 have added many XML features and enhancements that are not covered in this early redbook.

Hence, this redbook is now outdated.

For more up-to-date information on DB2 pureXML, see the DB2 pureXML Cookbook, the redbook “Extremely pureXML in DB2 10 for z/OS”, or the articles listed on this page:
http://www.ibm.com/developerworks/wikis/display/db2xml/Technical+Papers+and+Articles

 

Some of you have been asking for more information about XQuery versus SQL/XML. In particular, it appears that you are interested in understanding the levels of support for common operations. I’ll take a few moments to compare XQuery and SQL/XML in IBM DB2 9. However, please note that not all vendors provide the same levels of support. For instance, not all vendors support sub-document update, and those that do may not implement the XQuery standard. So please, before making any decisions, verify the levels of support provided by your vendor.

| Operation | XQuery | SQL/XML | Comments |
|---|---|---|---|
| Inserting an XML document | No | Yes | You use SQL to insert an entire XML document. |
| Retrieving an XML document | Yes | Yes | |
| Retrieving part of an XML document | Yes | Yes | |
| Using predicates with relational data | Yes | Yes | XQuery does not support relational predicates. However, IBM DB2 supports SQL in XQuery, allowing predicates with relational data. |
| Using predicates with XML data | Yes | Yes | |
| Deleting an XML document | No | Yes | You use SQL to delete an entire XML document. |
| Updating an XML document | Yes | Yes | |
| Updating part of an XML document | Yes | Yes | |
| Joining XML data | Yes | Yes | Using XQuery is the easier approach. Using SQL/XML is typically difficult to code. |
| Joining XML with relational data | Yes | Yes | XQuery does not support joins to relational data. However, IBM DB2 supports SQL in XQuery, allowing joins to relational data. |
| Transforming XML | Yes | Yes | Using XQuery is the easier approach. Using SQL/XML is typically difficult to code. |
| Aggregating XML data | Yes | Yes | Using SQL/XML is the easier approach. Using XQuery is possible with embedded SQL, but is typically difficult to code. |
| Calling external functions | No | Yes | |
| Passing parameter markers | No | Yes | |

At first glance, it may appear that SQL/XML has more extensive support. However, this is in part because, logically speaking, certain tasks do not belong in XQuery. Also note that some tasks are easier to code with XQuery. This ease of coding can make a significant difference in some environments.
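
As a rough illustration of the comparison above, here is the same request written both ways, held as plain strings in a small Python script. Treat it as a sketch under assumptions: the CUSTOMER table, its XML column INFO, the relational column CID, and the customerinfo document shape are hypothetical and do not come from the table above. XMLQUERY, XMLEXISTS, db2-fn:xmlcolumn, and db2-fn:sqlquery are the DB2 pureXML functions I have in mind, but please check the exact syntax against your own DB2 version before relying on it.

```python
# Hypothetical schema: table CUSTOMER with relational column CID and XML column INFO.
# The statements are only built as text here; nothing is sent to a database.

# SQL/XML: SQL is the outer language, with XQuery embedded in XMLQUERY and XMLEXISTS.
SQLXML_NAME_BY_CITY = (
    "SELECT XMLQUERY('$d/customerinfo/name' PASSING INFO AS \"d\") "
    "FROM CUSTOMER "
    "WHERE XMLEXISTS('$d/customerinfo[addr/city=\"Toronto\"]' PASSING INFO AS \"d\")"
)

# XQuery: the path expression itself is the outer language.
XQUERY_NAME_BY_CITY = (
    "xquery db2-fn:xmlcolumn('CUSTOMER.INFO')"
    "/customerinfo[addr/city=\"Toronto\"]/name"
)

# XQuery with embedded SQL: the relational predicate (CID = 1000) is applied
# by the SQL statement passed to db2-fn:sqlquery.
XQUERY_WITH_SQL = (
    "xquery db2-fn:sqlquery('SELECT INFO FROM CUSTOMER WHERE CID = 1000')"
    "/customerinfo/name"
)

if __name__ == "__main__":
    statements = [
        ("SQL/XML", SQLXML_NAME_BY_CITY),
        ("XQuery", XQUERY_NAME_BY_CITY),
        ("XQuery with SQL", XQUERY_WITH_SQL),
    ]
    for label, statement in statements:
        print(f"{label}:\n  {statement}\n")
```

The first two statements ask for the same thing (customer names in one city), which is where the "easier to code" remarks in the table start to matter. The third shows how a relational predicate reaches XQuery through embedded SQL.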

Are you evaluating XML database vendors? If so, here is a list of questions that can help you when you evaluate vendors. Of course, some questions may not apply to your situation. For instance, update capabilities may not be necessary in audit and logging systems. You can weed out the questions that do not apply to you.

Performance

  • Ask about the performance when inserting XML data into the repository. I came across one customer who unfortunately went with a vendor that could not keep up with their database ingest needs, and the customer had to switch to IBM. XML query performance alone is often not a sufficient measure. In some systems you may have to use heavy indexing for good query performance, but these indexes then lead to significant overhead for insert, update, and delete. In fact, you should probably also ask about the overhead incurred when working with indexes.
  • Ask about the query performance. Different types of queries have vastly different characteristics, so make sure that the performance proof points they give match your situation. For instance, are the performance proof points using the same kind of data you use, are they at the same granularity as your typical queries, and are they working across a data set similar to yours?
  • Ask if their XML performance proof points are publicly available with sufficient detail about the test data, workload, hardware used, etc., to verify their claims.
  • Ask about any restrictions there are on their XML indexes.  Can all data types be indexed?
  • Ask about the scalability limits for the database.
  • Ask for proof points on the reliability of the database.
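
One way to act on the ingest and index-overhead questions above is to time the same batch of inserts with and without an index. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere. It is not an XML database and it says nothing about any vendor's numbers. The docs table, the city column, and the helper names are made up for illustration. Against a real candidate you would swap in that vendor's driver, your own documents, and your own XML indexes.

```python
import sqlite3
import time


def make_rows(n):
    """Generate n small sample rows: (id, city, XML text)."""
    return [
        (i, f"city{i % 100}", f"<customerinfo><name>n{i}</name></customerinfo>")
        for i in range(n)
    ]


def time_inserts(rows, with_index):
    """Insert rows into a fresh in-memory table and return (seconds, row count)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, city TEXT, body TEXT)")
    if with_index:
        # The extra index must be maintained on every insert.
        conn.execute("CREATE INDEX idx_city ON docs (city)")
    start = time.perf_counter()
    with conn:  # commits the batch when the block exits
        conn.executemany(
            "INSERT INTO docs (id, city, body) VALUES (?, ?, ?)", rows
        )
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
    conn.close()
    return elapsed, count


if __name__ == "__main__":
    rows = make_rows(50_000)
    for with_index in (False, True):
        elapsed, count = time_inserts(rows, with_index)
        print(f"index={with_index}: {count} rows in {elapsed:.3f}s")
```

The point is the shape of the test rather than the timings: same data, same batch, one variable changed. Run it several times and on your own hardware before drawing any conclusion.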

Schema Support

  • If you work with multiple XML schemas, ask about their schema handling capabilities. For instance, do you need it to support multiple schemas? Or do you need it to support different versions of the same schema in the same column? It is important to clarify these requirements up-front.
  • Schema updates are inevitable. Ask what is involved when you have new versions of a schema. Is it a seamless experience, or does it require a significant migration effort?
  • In fact, I would recommend validating that the vendor can work with your schema up-front, especially if it is complex.  I have heard of situations where people have had issues with other vendors in this regard.
  • Or perhaps you have schema-less documents. If so, do all of their features support such documents?

Language Support

  • If you plan to use SQL, make sure that their SQL/XML functions can meet your needs.
  • Similarly, if you plan to use XQuery, make sure their XQuery implementation can meet your needs.
  • Confirm whether XQuery is embedded in SQL, or whether XQuery can be used standalone and via an API.  This may be important to you.
  • Are XML updates important to you?  If so, ask about their support for update capabilities.  And ask if there are any limitations in their update support.  I understand that certain vendors have limitations in this regard. In particular, you will want to ask if they support the XQuery Update Facility that is being standardized by the W3C.

Miscellaneous

  • Ask if there is a no-charge version of the database for pilot projects. XML features can look wonderful on paper but may be more difficult to use than you expect. Limitations in functionality and usability are not obvious from documentation and white papers, but are revealed when you start doing hands-on work with your own XML data.
  • If you need your reports and applications to also work with legacy relational data, ask how easy it is to work with XML and legacy relational data.
  • Do you need to work with digital signatures? If so, confirm that the digital signatures can be validated against retrieved documents.
  • Ask to what extent they are standards-compliant.
  • If high availability is important, ask if they offer high availability features.
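
For the digital-signature question above, a quick hands-on check is whether a document comes back from the database byte-for-byte as it went in. The sketch below is only an approximation of that concern. It compares SHA-256 digests, and it uses Python's xml.etree.ElementTree re-serialization as a stand-in for a store that parses documents and rebuilds the text on retrieval. Real XML signatures canonicalize the document before hashing, so they tolerate some of these differences. The helper names and the sample document are made up.

```python
import hashlib
import xml.etree.ElementTree as ET


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the raw bytes."""
    return hashlib.sha256(data).hexdigest()


def reserialize(data: bytes) -> bytes:
    """Stand-in for a store that parses the XML and rebuilds the text on retrieval."""
    return ET.tostring(ET.fromstring(data))


# Sample document with formatting details (double space, <item/>) that a
# parse-and-rebuild round trip is free to change.
original = b'<order  id="1"><item/></order>'

byte_exact = bytes(original)     # what a store that keeps the exact bytes returns
rebuilt = reserialize(original)  # same information, possibly different bytes

if __name__ == "__main__":
    print("byte-exact copy matches:", digest(byte_exact) == digest(original))
    print("re-serialized copy matches:", digest(rebuilt) == digest(original))
```

If a vendor's retrieved documents behave like the rebuilt copy, ask how their signature validation accounts for it.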

Thanks to Matthias Nicola for his help in compiling this list. And good luck with your selection process. Cheers!

And now, the third installment. I hope you enjoy…