XML in SQL Server 2005
September 19, 2008
I’d like to make one thing perfectly clear before I begin this post… the following are solely my personal opinions and not necessarily those of my employer IBM.
In the past, I have mentioned that each vendor has a very different implementation of “native XML storage”. Here are some observations about the XML storage capabilities in Microsoft SQL Server 2005. Of course, Microsoft released SQL Server 2008 in August. I will review the new release in a later post.
You should be aware that I have no inside information about SQL Server and that all of these observations were deduced from public sources. You should also be aware that I work for IBM who competes directly with Microsoft in this regard. But, nonetheless, I think you will find the following information useful.
SQL Server parses XML data upon insertion and transforms it into a binary token string, which is then stored in a BLOB. This is parsed storage, but it is a stream rather than in tree format. The stream contains information about the hierarchical relationships between the the elements.
SQL Server provides a primary and secondary index to optimize query performance. The “primary XML index” in SQL Server is actually a table. You could conceivably consider this implementation to be a clever form of shredding. The primary index is a table on which the secondary indexes are defined.
When it comes to issuing queries, SQL Server supports the two industry-standard query languages for XML data: SQL/XML and XQuery. However, you should be aware that:
- The SQL Server implementation of SQL/XML includes non-standard extensions.
- SQL Server does not support standalone XQuery. And, in fact, SQL Server translates XQuery commands into SQL before execution.
Also, if you will have queries based on path expressions or queries against large XML documents, you should very carefully evaluate SQL Server query performance.
When it comes to support for XML schemas, SQL Server does not allow multiple versions of a schema in the same schema collection. Also, SQL Server does not allow you to alter a schema. You should be aware that these schema flexibility and schema evolution restrictions can create headaches as you work with real-world XML schemas.
Finally a few more words about indexing. Did you know that SQL Server does not support indexing individual elements and attributes? It always indexes all elements and all attributes. This leads me to recommend that you carefully evaluate the performance and logging overhead of creating and updating indexes in SQL Server for your particular use case.
The following sources were consulted when compiling this post:
- XML Best Practices for Microsoft SQL Server, Microsoft Software Developer Network paper, April 2004.
- XML Indexes in SQL Server 2005, Microsoft Software Developer Network paper, August 2005
- XML Support in Microsoft SQL Server 2005, Microsoft Software Developer Network paper, December 2005
- Documentation for MS SQL Server 2005, beta 2