How to add relational data into existing XML documents (Part 1)

May 21, 2011

The beauty of a “hybrid” database system is that it allows you to manage both XML and relational data at the same time, even side-by-side in the same table if you want. You can choose which data to store in XML and which in relational format. Best practices for determining the “right” mix of XML and relational in a database schema is an interesting topic, but I’ll reserve that for a separate blog post in the near future.

Today, let’s look at how you can insert information from relational columns into an XML document. Let’s start with a simple table and a couple of rows. Each row holds an XML document in an XML column as well as a document_id and the date when the document was created:

CREATE TABLE documents(document_id INTEGER, created DATE, document XML);

INSERT INTO documents
VALUES(1, '2011-05-20',
'<mydoc>
   <header>
     <title>How to skin a cat</title>
     <author id="17">John Doe</author>
   </header>
   <body>
     <info>There are many ways...</info>
   </body>
 </mydoc>');

INSERT INTO documents
VALUES(2, '2011-05-21',
'<mydoc>
   <header>
     <title>Introduction to XML</title>
     <author id="23">Peter Pan</author>
     <author id="49">Susie Swing</author>
   </header>
   <body>
     <info>What is XML?</info>
   </body>
 </mydoc>');

Now let’s assume we want to retrieve these documents and include the “created” date from the relational column as an XML element <created> within the <header> element. This is not hard. You can simply include an XQuery update expression in your query that retrieves the documents:

SELECT XMLQUERY('copy $new := $doc
                 modify do insert <created>{$crdate}</created>
                           into $new/mydoc/header
                 return $new'
                 PASSING documents.document as "doc",
                         documents.created  as "crdate")
FROM documents;

The XMLQUERY function contains the XQuery update expression (in single quotes) and the PASSING clause that passes column information to the XQuery expression. Here we are passing the XML columns “document” and the DATE column “created”, so that they can be referenced as XQuery variables $doc and $crdate.

DB2 for Linux, UNIX, and Windows also allows an abbreviated syntax, i.e. you can omit the PASSING clause and refeence the column names directly (in upper case), as in the following equiavalent query:

SELECT XMLQUERY('copy $new := $DOCUMENT
                 modify do insert <created>{$CREATED}</created>
                           into $new/mydoc/header
                 return $new' )
FROM documents;

The XQuery update expression assigns the XML from the XML column ($DOCUMENT) to the variable $new and then it modifies and returns $new. The “modify” clause inserts the new XML element <created> into the /mydoc/header element, using the value from the relational DATE column that is referenced here as $CREATED.

As a result, both of the queries above return the following output, with the desired <created> elements:

<mydoc>
  <header>
    <title>How to skin a cat</title>
    <author id="17">John Doe</author>
    <created>2011-05-20</created>
  </header>
  <body>
    <info>There are many ways...</info>
  </body>
</mydoc>

<mydoc>
  <header>
    <title>Introduction to XML</title>
    <author id="23">Peter Pan</author>
    <author id="49">Susie Swing</author>
    <created>2011-05-21</created>
  </header>
  <body>
    <info>What is XML?</info>
  </body>
</mydoc>

It’s that simple!

Note that these queries do not change the documents that are stored in the “documents” column. They only modify the documents on the fly while reading them out. If you want to permanently change the documents, you would use the exact same XQuery update expression in an SQL UPDATE statement rather than a SELECT statement, like this:

UPDATE documents
SET document =  XMLQUERY('copy $new := $DOCUMENT
                          modify do insert <created>{$CREATED}</created>
                                    into $new/mydoc/header
                          return $new' );

But, let’s stick with queries for now and extend the XQuery update expression to also merge the relational document_id into the XML. Specifically, assume we need to include the document_id as an attribute “docid” in the “mydoc” element. We can simply add another “insert” operation in the modify clause, which is easy because the modify clause can have a comma-separated list of operations, including insert, replace, and delete operations.

To add an attribute you have to specify “insert attribute” followed by the desired attribute name (docid) and the attribute value in curly brackets. The curly brackets can either contain a literal value in double quotes or -as in our case here- a reference, such as our reference to the relational DOCUMENT_ID column:

SELECT XMLQUERY('copy $new := $DOCUMENT
                 modify (do insert <created>{$CREATED}</created>
                            into $new/mydoc/header,
                         do insert attribute docid {$DOCUMENT_ID}
                            into $new/mydoc )
                 return $new' )
FROM documents;

And the result is as desired:

<mydoc docid="1">
  <header>
    <title>How to skin a cat</title>
    <author id="17">John Doe</author>
    <created>2011-05-20</created>
  </header>
  <body>
    <info>There are many ways...</info>
  </body>
</mydoc>

<mydoc docid="2">
  <header>
    <title>Introduction to XML</title>
    <author id="23">Peter Pan</author>
    <author id="49">Susie Swing</author>
    <created>2011-05-21</created>
  </header>
  <body>
    <info>What is XML?</info>
  </body>
</mydoc>

Going one step further, you might wonder how to insert data from relational columns in *other* tables into these XML documents. Well, you can do a join so that the desired values from the other table line up in the same row as your XML document, and then apply the same techniques as above.

It get’s slightly more interesting if you want add information from *multiple* rows into a single document, based on a one-to-many relationship. I can show you how to do this, but I’m about to board a flight to go  vacation, so you’ll find a couple of examples in my next blog post when I’m back, or many more examples in the DB2 pureXML Cookbook!

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 52 other followers

%d bloggers like this: