How to insert relational data into existing XML documents (Part 2)

June 5, 2011

In my previous blog post I promised that I would provide some examples of how to insert information from multiple relational tables and multiple rows into an XML document.

So, let’s continue the example from my previous post where we had a table with relational INTEGER and DATE columns as well as an XML column. For simplicity, we now use only one XML document in this XML column, but it could be many more:

DROP TABLE documents;
CREATE TABLE documents(document_id INTEGER, created DATE, document XML);

INSERT INTO documentsVALUES(2, '2011-05-21',
'<mydoc>
   <header>
     <title>Introduction to XML</title>
     <author id="23">Peter Pan</author>
     <author id="49">Susie Swing</author>
   </header>
   <body>
     <info>What is XML?</info>
   </body>
 </mydoc>');

Let’s add a second table “document_reviews” that contains a rating and review for each document. Documents and reviews are related to each other and can be joined based on the document_id column:


CREATE TABLE document_reviews(document_id INTEGER,
                              rating CHAR(2),
                              review VARCHAR(32000) );

INSERT INTO document_reviews VALUES(2, 'A+', 'This books is...');

Now we want to to read the XML document from the “documents” table and insert the corresponding rating information from the “document_reviews” table into the XML. This task is quite easy, using the techniques described in my previous post and simply using a join between the two tables:

SELECT XMLQUERY('copy $new := $DOCUMENT
                 modify do insert <rating>{$RATING}</rating>
                           into $new/mydoc/header
                 return $new' )
FROM documents, document_reviews
WHERE documents.document_id = document_reviews.document_id;

Note that the XQuery variable $RATING refers to the “rating” column in the document_reviews table. And the result of this query is the original document with the desired <rating> element inserted into the <header>:

<mydoc>
  <header>
    <title>Introduction to XML</title>
    <author id="23">Peter Pan</author>
    <author id="49">Susie Swing</author>
    <rating>A+</rating>
  </header>
  <body>
    <info>What is XML?</info>
  </body>
</mydoc>

1 record(s) selected.

Now it gets more interesting. What would the query above return if there were *multiple* review rows for this one document? Well, let’s find out! Let’s insert a second review for the same document_id and run the query again.

INSERT INTO document_reviews VALUES(2, ‘C’, ‘I did not like…’);

After inserting this additional review, our query above returns the XML document twice, once for each of the two reviews:

<mydoc>
  <header>
    <title>Introduction to XML</title>
    <author id="23">Peter Pan</author>
    <author id="49">Susie Swing</author>
    <rating>A+</rating>
  </header>
  <body>
    <info>What is XML?</info>
  </body>
</mydoc>

<mydoc>
  <header>
    <title>Introduction to XML</title>
    <author id="23">Peter Pan</author>
    <author id="49">Susie Swing</author>
    <rating>C</rating>
  </header>
  <body>
    <info>What is XML?</info>
  </body>
</mydoc>

2 record(s) selected.

The first document in the result set contains a <rating> element with the rating from the first review, and the second document contains a <rating> element according to the second review. The original XML document is duplicated for each rating, due to the nature of the relational join between the two tables.

Although this form of output might be useful in some cases, chances are that you would like all ratings for this document to be included in a single copy of the document. This can be achieved if we convert the join into a correlated subquery that produces a sequence of <rating> elements that are all inserted into a single copy of the document:

SELECT XMLQUERY('copy $new := $DOCUMENT
                 modify do insert db2-fn:sqlquery("
                             SELECT XMLELEMENT(NAME ""rating"", rating)
                             FROM document_reviews
                             WHERE document_id = parameter(1)" ,
                             $docid            )
                           into $new/mydoc/header
                 return $new'
                 PASSING documents.document_id as "docid" )
FROM documents;

In this query, the modify clause contains the function db2-fn:sqlquery that takes an SQL query and optional parameters as input. This SQL query reads the rating column from the document_reviews tables and constructs a sequence of <rating> elements.

The WHERE clause of this subquery is parameterized, using $docid as the parameter value (line 5). The $docid is passed into the XQuery context from the documents.document_id column (line 9), and ensures that the subquery only reads ratings for correct document. As a result, a single document with two <rating> elements is returned:

<mydoc>
  <header>
    <title>Introduction to XML</title>
    <author id="23">Peter Pan</author>
    <author id="49">Susie Swing</author>
    <rating>A+</rating>
    <rating>C</rating>
  </header>
  <body>
    <info>What is XML?</info>
  </body>
</mydoc>

1 record(s) selected.

I think these example provided a good idea of the powerful capabilities that exist to combine XML and relational data.

About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 50 other followers

%d bloggers like this: