XML… in the Cloud !

April 16, 2017

Over the past couple of years I have focused a lot on Cloud Data Services and less on XML as a specialty topic.  But sure enough, all of the great XML processing capabilities that you know from DB2 pureXML are also available in cloud database services such as dashDB and DB2 on Cloud.

DB2 on Cloud is a so-called hosted service, which means that clients can rent a DB2 server in a IBM cloud data center where IBM manages the hardware, storage, and networking while the customer has full control (and full responsibility) over the administration of the software stack, including OS and DB2. For example, applying fixpacks as well as overall database administration including backup/restore, database configuration, etc. is all in the hands of the customer. This gives you a lot of control and flexibility, but also requires a certain degree of effort and skill.

In contrast, dashDB is a fully managed service in the cloud. dashDB is also based on DB2 software but the entire software and database administration is done by an IBM operations team. This service includes software patches and upgrades, database backup/restore, setting database configuration parameters, defining table spaces and bufferpools, and so on. This takes a lot of the everyday database admin efforts off your shoulders so you can focus directly on your application use case: creating tables, loading data, running queries, setting permissions for your users, connecting applications, and so on.

Both dashDB and DB2 on Cloud provide key XML capabilities, such as:

Note that there are two flavors of dashDB, known as dashDB for Analytics and dashDB for Transactions. The latter is the more suitable choice for most XML processing applications.

The overall benefit of such a cloud services is faster time to value.  You don’t need to spend time on setting up a server and installing software, and you have no up-front investment. You pay for the service only for as a long as you use it, like a subscription.

The next time you need relational or XML database capabilities, consider dashDB and DB2 on Cloud as simple and convenient options! Both are available in an IBM data center near you!




Encapsulating database operations in stored procedures has a variety of potential benefits. For example, stored procedures allow you to code more complex data manipulation logic than what can be done with an individual SQL statement.  Also, combining multiple operations in a single stored procedure can improve performance because the operations are executed close to the data (in the database engine) with a single invocation from the application, reducing the number of API calls to the database.

For these reasons it is a quite common practice to implement pieces of XML processing logic in database stored procedures. Some applications want the database to validate XML documents against an XML Schema, which is easy to do as part of insert, update, or load operations, or even in database queries. But, how to validation XML in a stored procedure?

It’s easy. Let’s look at the following procedure as a simple example. It takes an XML document “doc” as an input parameter, and returns a flag “isvalid” which is 1 if the document is valid and 0 if the document is not valid. Of course, you could return more detailed information if needed, but 0 and 1 should be sufficient as an example.

The procedure declares a condition handle for SQLSTATE 2200M, which is raised when an XML document is not well-formed or not valid in an attempted schema validation. If that state should occur during the execution of the stored procedure, then the corresponding exit handler gets executed. In this example, the exit handler simply sets “isvalid” to zero.

  DECLARE invalid_document CONDITION FOR '2200M';
  DECLARE EXIT HANDLER FOR invalid_document 
       SET isvalid = 0; -- indicate that doc failed to validate
      -- indicate that the document is valid:
      SET isvalid = 1;
      -- then process the XML as intended:
    END IF;
END # -- end of procedure


After these declarations, the first operation in the stored procedure is the document validation with the XMLVALIDATE function. The XMLVALIDATE function specifies the name (ID) of the XML Schema that is being used for validation. In this example, the schema name is hardcoded as “db2admin.myschema”, but it could also be another parameter into the stored procedure.

If the validation is successful (i.e. IS VALIDATED is true) then we set the “isvalid” flag to 1 and proceed with whatever processing we want to perform on the XML document. If the validation is not successful, then the procedure executes the exit handler and terminates.

This example provides a simple skeleton for XML stored procedures, to ensure that the XML processing in the body of the procedure is performed only if validation is successful.

I spent this week in Philadelphia at the annual DB2 user group conference. The week was packed with excellent technical sessions, panel discussions, customer presentations, hands-on labs, seminars, and lots of networking opportunities.

As you would expect from such a big event, a very large variety of DB2 and application development topics were on the grid, including DB2 BLU, high availability, DB2 11 for z/OS, query optimization, DB2 in the Cloud, and many others.

And, almost 9 years since the introduction of pureXML in DB2 9.1, XML was still a popular topic at this year’s conference, with some excellent sessions from DB2 users and from IBMers:

E03 – XML – Essentials for the DB2 z/OS DBA
Thanikachalam “Billy” Sundarrajan, Fifth Third Bank

E05 – XML and JSON in Examples
Jan Marek, CA Technologies

E17 – XML Data Processing Performance
Aleskey Slutch and Alexander Veremyev, Raiffeisenbank Russia

Hands-on Lab: Learning Exciting JSON and XML in IBM DB2 for z/OS
Jane Man, IBM Corporation




December 24, 2013

As you probably noticed, DB2 10.5 features a new technology called BLU Acceleration which provides significant gains in performance, reduced storage consumption, and simplicity for analytical database workloads. To this end, BLU combines column-organized storage with novel compression techniques, hardware-specific optimizations, data skipping, new techniques for join processing and aggregations, and other innovations.

Since the release of BLU in June 2013, I was asked several times whether BLU can be used with XML or how BLU relates to XML in general.

The short answer is that a BLU table, i.e. a column-organized table, cannot contain columns of type BLOB, CLOB, or XML at this point in time. If you use BLU in a DB2 10.5 database, your XML documents can be stored in the same database but need to reside in traditional tables which are not column-organized.

Fundamentally, XML and BLU follow two very different approaches to managing data.

The concept of XML documents is to keep all attributes for a given business object (e.g. an order, a product, a contract) physically together as one entity so that you can easily retrieve or insert all the relevant information for an object in one shot. This is particularly useful for complex objects whose attributes would otherwise be spread over many relational tables. XML is also useful for objects whose set of attributes can evolve over time, since XML also offers schema flexibility. The main focus of using XML is on the business object as the unit of interest.

In contrast, BLU is designed to store all attributes separately from each other, purposefully breaking up the objects, even more so than traditional storage formats for relational tables. The focus of BLU is on accessing individual attributes across all or many objects as efficiently as possible. The column-organized storage purposefully exploits the fact that many analytical questions never need to see whole objects but only a (small) subset of the attributes. The unit of interest is the individual attribute, not the entire objects.

As such, XML and BLU serve different purposes and access patterns, both valid in their own right.

Analytical queries that need to scan, join, filter, aggregate, etc. on individual attributes across a large number of objects will perform much better with BLU than with XML.

Applications that often need to read or write entire objects, esp. complex objects, perform much better with XML than with BLU. Similarly, queries that typically access most or all columns of a relational table perform better on traditional row-organized tables than on column-organized tables.

Now what if you have subset of attributes in your XML documents that you want to run analytical queries on, ideally across all or many of the documents? In this case you can consider extracting some of these attributes for redundant storage in a BLU table. This can be done with triggers, stored procedures, or application logic.

In a previous post I described how to write a query that produces a list of all elements and attributes that occur in a document or a set of documents. 

The core idea of that technique was to use the XPath expression $doc//(*, @*) in the XMLTABLE function. In this expression, $doc references the XML document or XML column that is used as input. This expression also uses the so-called comma operator that combines two squences into one. These two sequences are are all elements (*) and all attributes (@*).

For this technique to work in DB2 for z/OS, you need to code the XPath expression in a slightly different way:  ($doc//*,  $doc//@*)  , which lists all elements followed by all attributes.

It is useful to sort the result by the parent of each node in the document so that the elements and attributes that belong to the same parent appear in consecutive order in the result set.

The following listing shows two queries that produce the same result set, and the second of the two queries is the recommendad notation for DB2 for z/OS:

FROM purchaseorder p,
     XMLTABLE('$doc//(*, @*)' passing p.porder as "doc"
       node   VARCHAR(20) PATH 'name(.)',
       parent VARCHAR(20) PATH '../name(.)') AS T
WHERE poid = 5000
ORDER BY parent;

FROM purchaseorder p,
     XMLTABLE('($doc//*, $doc//@*)' passing p.porder as "doc"
       node   VARCHAR(20) PATH 'name(.)',
       parent VARCHAR(20) PATH '../name(.)') AS T
WHERE poid = 5000
ORDER BY parent;

 And then you can ertainly extend such queries to also list the node values or an indication whether the node is an element or an attribute, as previously described here.

Similar considerations apply to queries that list all paths in a document.




This blog on XML database technology and DB2 pureXML has been active for almost 5 years now. My former colleage Conor O’Mahony ran this blog from early 2008 to December 2009, and I took over on Dec 15, 2009. We have received a lot of feedback and many of the blog posts were inspired by questions from XML database users.

I’m not sure yet how much time I will have for blogging in 2013 and beyond. In any case, I liked the idea of producing a summary page that provides a quick overview of all the topics and posts in this blog. But, how to produce such a summary most efficiently?

Well, true to the topic of this blog I decided to load this blog as a piece of XML into DB2 pureXML and run queries on it!

First I downloaded the complete blog as an HTML document. But, HTML is not necessarily wellformed XML because HTML allows elements with missing end tags and other goofy things. Luckily there are free tools available that convert HTML into XHTML. XHTML is a form of HTML that complies with the rules for well-formed XML documents.

After converting the entire blog into an XML (XHTML) document, it took only three steps to create a list of blog post titles, dates, and URLs:

CREATE TABLE nativexmldatabase(doc XML);

LOAD FROM blog.del OF DEL REPLACE INTO nativexmldatabase(doc);

SELECT date, title, url
FROM nativexmldatabase,
     seq     FOR ORDINALITY,
     date    VARCHAR(20)   PATH 'h3',
     title   VARCHAR(105)  PATH 'h2/a',
     url     VARCHAR(145)  PATH 'h2/a/@href') AS T;

Of course, a quick look into the XML document is necessary to discover the required tag names and XPath expressions, such as //div[@class=”entrytitle”] to iterate over all the individual posts, or h2/a and h2/a/@href to retrieve the title and URL of each post.

This query produces a nice list of all blog post titles, dates, and URLs. Instead of posting this listing as-is, I decided to roughly group the blog posts along the following topic areas:

* Design
* Performance
* Migration
* Relational to XML or XML to relational
* XML Queries
* XML Storage and Indexing
* XML Schemas
* Application Development
* Use Cases and Case Studies
* ETL and Warehousing
* DB2 pureXML Product News
* Other Products and Tools
* Publications and Resources
* XML Coverage at IDUG and IOD Conferences
* News, Announcements, Miscellaneous

And here is the result, a table of contents for nativexmldatabase.com from 2008 through 2012. I hope this can serve as a quick reference to topics of interest.

Date Design
Dec 4, 2012 Good and bad XML design
Jan 21, 2012 Business Records in the 21st Century
Jan 8, 2012 Data Normalization Reconsidered
Jan 21, 2011 Name/Value Pairs – A pretty bad idea in XML as in Relational !
Jan 8, 2011 XML Manipulation in Application code vs. a Native XML Database
Sep 28, 2010 5 Reasons for Storing XML in a Database
Sep 26, 2008 Flexible Schemas: When to Persist Data in XML Instead of Relational
Aug 4, 2008 When to Store in XML Format
Mar 21, 2008 When to use a Native XML Database
Date Performance
Oct 23, 2012 Overcoming Performance Obstacles in XML Encryption
Nov 15, 2011 TPoX 2.1 has been released!
May 9, 2011 News Flash: Intel publishes TPoX Benchmark results on new 10-core Westmere-EX CPUs
Mar 4, 2011 New TPoX Benchmark Results Available
Jan 8, 2011 XML Manipulation in Application code vs. a Native XML Database
Aug 22, 2010 XML versus Relational Database Performance
July 14, 2010 A 10TB XML Data Warehouse Benchmark
Nov 3, 2008 Benchmark for 1TB Transactional XML System
Mar 20, 2008 Native XML versus CLOB and Shredding
Date Migration
June 29, 2012 How to migrate from the XML Extender to DB2 pureXML
Oct 3, 2010 How to migrate XML from LOB to XML columns
Date Relational to XML or XML to relational
Aug 30, 2010 XML Construction in Views and User-Defined Functions
Aug 24, 2011 How to generate XML from relational data – with line breaks!
June 8, 2011 How to quickly produce XML from relational tables
June 5, 2011 How to insert relational data into existing XML documents (Part 2)
May 21, 2011 How to add relational data into existing XML documents (Part 1)
Date XML Queries
Sep 13, 2012 How to Validate XML Documents in Database Queries
Aug 14, 2012 Result set cardinalities when querying repeating XML elements
Aug 10, 2012 Using the XMLTABLE function with MERGE statements in DB2 for z/OS
July 30, 2012 Using the XMLTABLE function in UPDATE and MERGE statements
May 24, 2012 How to query CDATA sections in XML
Feb 4, 2012 How to list the paths of all elements in an XML document?
Nov 30, 2011 XQuery support in DB2 10 for z/OS
Oct 9, 2011 Advanced SQL/XML: Joins and FLWOR Expressions in the XMLTABLE Function
Aug 29, 2011 Quantified expressions in XQuery: When ‘some’ and ‘every’ satisfy!
Mar 18, 2011 XQuery and SQL/XML: How to convert a date that is not a date into a date?
Aug 11, 2010 How to order query results based on XML values
May 26, 2010 XML Profiling – How to get a list of all elements and attributes
May 9, 2010 How to deal with dirty data
July 24, 2011 XQuery and Other XML Manipulation in SQL Stored Procedures
April 6, 2011 Returning data from one path based on predicates in another (part 2/2)
April 3, 2011 Intra-document references: Returning data from one path depending on predicates in another path
June 17, 2010 An XQuery Cheat Sheet
April 12, 2010 The XMLTABLE function and a case where a left outer join can help
Feb 21, 2010 Get Started with XML Full-Text Search in DB2
Jan 25, 2010 tokenize it !
Jan 11, 2010 Watch your step! Positional predicates in XPath.
Oct 14, 2008 XQuery: Introduction, Tips, and Future Directions
July 25, 2008 XQuery versus SQL/XML – Part Deux
June 26, 2008 XQuery versus SQL/XML
Date XML Storage and Indexing
July 11, 2012 XML Storage in DB2: To inline or not to inline?
June 8, 2012 XML Indexing in DB2 9.x and DB2 10 for Linux, UNIX, and Windows
July 11, 2011 How to index XML documents in the presence of schema diversity?
July 19, 2010 Structure-agnostic XML indexes in DB2
Dec 26, 2010 How to insert XML documents from the file system?
Feb 21, 2010 Get Started with XML Full-Text Search in DB2
May 6, 2009 DB2 Compresses XML Data by 60% to 80%
May 3, 2009 Multi-Dimensional Clustering for Tables with Native XML Data
May 1, 2009 Use Hash Partitioning for Fast Analysis of XML Data
Oct 16, 2008 DBA Concerns about Native XML Storage
April 8, 2008 Native XML Reduces Storage Costs
Date XML Schemas
Sep 13, 2012 How to Validate XML Documents in Database Queries
Feb 29, 2012 What is an XML Schema and why should I care?
Aug 22, 2008 Schema Evolution
Aug 21, 2008 Schema Flexibility
Date Application Development
Sep 29, 2011 At your service: REST with DB2 pureXML!
Aug 8, 2011 Using XQuery across Application Server and Database Server
July 30, 2011 Beyond the Database: Native XML and XQuery in the Application Server
Feb 25, 2011 CICS meets XML: Using CICS Web Servives with DB2 pureXML
Jan 8, 2011 XML Manipulation in Application code vs. a Native XML Database
Oct 8, 2010 Buidling XML Solutions with Rational Software Architect (RSA) and Rational Application Dev
Mar 1, 2010 XML Support in JDBC 4.0: The SQLXML Interface
Nov 9, 2009 Building a JSON and DB2 pureXML Application
Aug 5, 2008 Native XML Storage Reduces Development Costs
Nov 18, 2008 Referential Integrity and XML Data
Date XML Use Cases and Case Studies
Sep 12, 2011 DB2 pureXML – Rich in Proteins!
June 24, 2011 A true story about using XML forms and DB2 to process millions of tax returns
May 12, 2010 Webcast on XML Use Cases (May 18)
Feb 7, 2010 BJC HealthCare Improves Clinical Research with DB2 pureXML
Nov 2, 2009 Electronic Forms Using Adobe PDF and IBM DB2 pureXML
Sep 8, 2009 The Patient-Centered Medical Home
Aug 24, 2009 Electronic Health Records for Smarter Healthcare
July 17, 2009 Article about XML for Healthcare
June 16, 2009 Building an XML-Based Electronic Forms Solution
June 5, 2009 Short Video Highlighting Early Experiences with DB2 9.7
Mar 30, 2009 GUS Group uses DB2 pureXML for ERP System
Mar 13, 2009 UCLA Health System uses XML and SOA to Optimize Data Integration and Migration
Mar 2, 2009 Storing and Retrieving ACORD Data for Insurance
Feb 26, 2009 Speedy Communications and Collaboration in XML-based Environments
Dec 30, 2008 XForms and Native XML Storage
Mar 27, 2008 How to Choose Solution Areas
Mar 25, 2008 More information about Storebrand
Mar 23, 2008 Productivity Improvements at Storebrand
Date ETL and Warehousing
April 13, 2011 XML ETL with DataStage 8.5
July 14, 2010 A 10TB XML Data Warehouse Benchmark
June 23, 2010 More on Data Warehousing and XML
April 5, 2010 More on XML ETL
Mar 26, 2010 XML ETL
Jan 31, 2010 New Article on XML in the Data Warehouse
May 28, 2009 XML in the Data Warehouse
April 22, 2009 Fast Performance for Native XML Data in the Warehouse
Aug 24, 2009 Article on Integrating XML Operational Data into a Data Warehouse
Date DB2 pureXML product news
April 19, 2012 XML Enhancements in DB2 10 for Linux, UNIX, and Windows
Nov 30, 2010 Node-level XML Updates in DB2 10 for z/OS
Nov 18, 2010 New XML Features in DB2 10 for z/OS !
June 8, 2010 Updated version of a free XML database: DB2 Express-C 9.7.2
Aug 7, 2009 SOA Projects: IBM DB2 versus Oracle Database
Feb 11, 2009 XML Database in the Cloud
Feb 10, 2009 pureXML Moves into Core DB2 Product
Date Other Products and Tools
Dec 5, 2010 XML Tools for DB2 pureXML
Mar 8, 2010 Integration of Altova XMLSpy with DB2 pureXML
Jan 18, 2010 XPath and XQuery Functions in SQL Server vs. DB2
Jan 4, 2010 Shredding and constructing XML in mySQL
Dec 23, 2009 XML Support in mySQL
Dec 17, 2009 Oracle vs. mySQL: The Full Monty
Aug 7, 2009 MyCareTeam and IBM Collaborate to Improve Continuing Care for Diabetes
July 8, 2009 Why Won’t Oracle Publish XML Benchmark Results for TPoX?
Sep 30, 2008 XML in Oracle 11g
Sep 24, 2008 XML in SQL Server 2008
Sep 19, 2008 XML in SQL Server 2005
July 11, 2008 Questions for XML Database Vendors
May 13, 2008 XML-only Databases
May 12, 2008 All Native XML Databases are not Equal
Date Publications and Resources
Feb 13, 2011 Extremely pureXML in DB2 10 for z/OS
Sep 14, 2010 DB2 pureXML Cookbook – Errata
Aug 13, 2009 DB2 pureXML Cookbook – 45% Discount
June 3, 2009 DB2 pureXML for Dummies . Get Your Copy!
April 29, 2010 Get off to a fast start with DB2 pureXML
April 10, 2009 DB2 pureXML Cookbook
Feb 25, 2009 Intel Insights Magazine covers Native XML Storage
Feb 5, 2009 Henrik’s Blog
Feb 5, 2009 DB2 pureXML leads SOA World Magazine Reader’s Choice Award
Jan 21, 2009 Podcast Series about DB2 pureXML on the Mainframe
Jan 20, 2009 Reports for Information in Native XML Format
Dec 30, 2008 Anant Jhingran on Improving XML Applications
Sep 1, 2008 DB2 pureXML Online Communities
July 27, 2008 Getting Started with Native XML Databases
Date XML Coverage at IDUG and IOD Conferences
Oct 21, 2012 XML from Vegas to Berlin!
Oct 18, 2011 XML in Las Vegas !
April 25, 2011 XML sessions at IDUG North America 2011
Oct 21, 2010 Next week in Vegas
April 18, 2010 DB2 for z/OS pureXML Hands-on Lab @ IDUG North America
Oct 23, 2009 DB2 pureXML at the IOD Conference
May 28, 2009 IBM IOD Conference Attendees get a Poken for Networking
May 13, 2009 Short Video from the IDUG North America Confernece
Nov 11, 2008 XML at IBM Information on Demand Conference
Sep 11, 2008 DB2 pureXML at the IOD Conference
Sep 4, 2008 Meet Native XML Databases Users
Date News, Announcements, Miscellaneous
July 30, 2010 DB2 Express-C Light – A reduced footprint XML database
Mar 15, 2010 More DB2 pureXML Bootcamps
Feb 14, 2010 DB2 pureXML Webinars and Bootcamps
Dec 15, 2009 Hi there here is the new guy!
Dec 13, 2009 Welcome Matthias Nicola!
June 9, 2009 Flirting with Poken
April 22, 2009 The XML Rap Superstar
April 22, 2009 IDUG announces winners of XML Challenge
April 20, 2009 Teaser for Upcoming DB2 pureXML Features
Mar 12, 2009 DB2 and Cloud Computing – Chat with the Labs
Feb 5, 2009 Free Online Conference – Data in Action
Jan 20, 2009 DB2 Technology Sandbox
Dec 26, 2008 Native XML Storage for the Mac
Dec 3, 2008 Update on the XML Challenge
Nov 5, 2008 Why IBM DB2 is ideal for Transactional XML
Oct 9, 2008 10 Reasons why DBAs Should Understand Native XML
Oct 2, 2008 XML Challenge Web Site is Live!
Sep 28, 2008 1st Oct is Online Community Action Day
Aug 7, 2008 Webcast: A primer for storing and retrieving XML data
July 1, 2008 Viral Video – Wednesday
June 20, 2008 Viral Video – Tuesday
June 16, 2008 Viral Video – Monday
June 13, 2008 Learn to Use XML with Databases and win Prizes!
Mar 22, 2008 How to Market New Technology
Mar 19, 2008 Welcome

Happy browsing!

Good and bad XML design

December 4, 2012

Whenever people design things –such as houses, software systems, database schemas, or even XML structures– there are good and not so good design options.  Roughly speaking, a design is usually considered “good” if it allows for easy and (cost-)efficient use and implementation of whatever is being built.

For XML documents we have seen various cases of “good” and “bad” designs over the years. Previously I wrote about name/value pair design for XML documents, which often is not a good choice.

Recently I received a question regarding another design choice, which I would like to share here in a simplified form.

The question was about an XML message format to transmit a list of order numbers and the corresponding order system identifiers. Two alternative XML structures were proposed:

Option A:


Option B:


The motivation for option B might be a smaller message size. One might argue that XML elements are by definition ordered, and if we know that the message consists of pairs of order number and order system, then XML format B is as easy to interpret as format A. Or, is it not?

The problem that I see with option B is that the absence of the <Order> elements gives up the explicit structure of the data and instead requires implicit knowledge, i.e. one must know that the OrderNumber at position i belongs to the OrderSystem at position i+1.

In my view, this design is against the principles of XML. In XML, the tags and their nesting should describe the data explicitly. XML is meant to be a hierarchical data format so that data items that belong together are grouped under the same parent (as in option A). After all, that is the benefit of XML over a flat file format.

I also think that using the explicit structure of option A is less error-prone than using option B which, requires implicit knowledge based on element positions.

How about queryability?

It turns out that XML format A is considerably easier to query than option B. Let’s assume you need to code a SQL/XML query to list the order numbers and systems. For that exercise we assume that the messages are stored in a table “mytable” with XML column “doc”.

The query for XML option A is very simple, using the XMLTABLE function in its most basic form. The row-generating XPath expression $DOC/OrderList/Order iterates over the orders, and the column expressions extract the order number and system.

-- Query for XML option A:

SELECT OrderNumber, OrderSystem
FROM mytable,
     XMLTABLE ('$DOC/OrderList/Order'
         OrderSystem VARCHAR(20) PATH 'OrderSystem',
         OrderNumber VARCHAR(20) PATH 'OrderNumber' );

-------------------- --------------------
A0000001             ABC
B0000001             XP1
C0000001             Q-9

3 record(s) selected.

How can you produce the same result set from XML format B? This is remarkably tricky! You need to iterate over the child elements and keep track of which element is at position i versus i+1 to produce the proper pairs in the output.

There are likely several ways to code the query for XML option B, and you see one solution below. It seems hard to query the elements by name alone because the important structural information is by position. Hence, this query iterates over the positions 1 to N, where N is the number of elements. For each even position 2, 4, 6, etc. the query produces the pair of the current ($pos) and the previous ($pos – 1) element to construct the proper output.

-- Query for XML option B:

SELECT OrderNumber, OrderSystem
FROM mytable,
     XMLTABLE('for $pos in (1 to count($DOC/OrderList/*))
               where $pos mod 2 = 0
               return  <pair>
                         {$DOC/OrderList/*[$pos - 1]}
         OrderNumber VARCHAR(20) PATH 'OrderNumber',
         OrderSystem VARCHAR(20) PATH 'OrderSystem' );

OrderSYSTEM          OrderNUMBER
-------------------- --------------------
ABC                  A0000001
XP1                  B0000001
Q-9                  C0000001

3 record(s) selected.

This query is not only more complex but also potentially less efficient. Note that the return clause of the FLOWR expresion constructs the explicit pairing that it missing in XML format B.

Let me know if you can think of a simpler query to produce the same result from XML format B.

Next, imagine a message that doesn’t have just 2 fields per order but maybe 10 or 100 fields per order. Then option B becomes increasingly more ugly.

Or worse, what if there is a schema change that allows one order to have a different number of fields than the next? Then the query that we coded for XML format B will no longer work because it was based on a specific structural assumption that is no longer true. In contrast, our query for XML format A will work even if you add additional fields to some of the orders in the message.

In summary, the benefits for XML option A include the following:

  • Explicit structure rather than implicit positional format
  • Easier and more efficient to query (query by name rather query by postion)
  • More resilient to schema changes

I think that these benefits far outweigh the concern that XML format A is larger. Also, you could chose to send or store the XML in a compressed format to greatly alleviate space concerns.