Node-level XML Updates in DB2 10 for z/OS

November 30, 2010

In my previous blog post I wrote about the new XML enhancements in DB2 10 for z/OS, the new release of IBM’s relational database for the mainframe. One of the exciting new XML features are node-level XML updates.

While DB2 9 for z/OS only supported full-document replacement, DB2 10 for z/OS allows you to insert, delete, or modify individual XML elements or attributes (“nodes”) within an XML document. This is based on the XQuery Update Facility, which defines the syntax and semantic of node-level XML updates.

In DB2 10 for z/OS, such XML updates are always embedded in an SQL Update statement, such as the following statement that updates the XML column “info” in the table “customer”:

UPDATE customer
SET info =  XMLMODIFY( <xquery update expression>)
WHERE ….

The WHERE clause of the UPDATE statement can contain relational and/or XML predicates to select zero, one, or multiple rows (documents) for update. For each qualifying row the XMLMODIFY function applies the XQuery Update expression to the respective XML document.

The XMLMODIFY function is new in DB2 10 for z/OS and can only be used on the right-hand side of a SET clause in an UPDATE statement.

The following update expressions are supported in the XMLMODIFY function:

Let’s look at some examples for each of those update expressions.

The following UPDATE statement modifies the document with the relational id “123″ and changes the value of the <city> element to “San Jose”:

UPDATE customer
SET info =  XMLMODIFY(‘replace value of node /customer/addr/city with “San Jose” ‘ )
WHERE customer.id = 123

If you want to update an element or attribute that occurs multiple times per document then you need to select exactly one of its occurrences. In other words, the target of the update expression must be a single node per document. It cannot be zero or multiple nodes per documents. Otherwise an error is returned.

Assuming that a customer document can have multiple <phone> elements, the following UPDATE statement changes the value of the phone element whose @type attribute has the value “home”:

UPDATE customer
SET info =  XMLMODIFY(‘replace value of node /customer/phone[@type = "home"]  with “408-463-4963″  ‘ )
WHERE customer.id = 123

The new value can also be provided via a parameter marker or host variable. For example, the next UPDATE provides the new phone number via the host variable :newnum. The second argument of the XMLMODIFY function contains the host variable and assigns an XQuery variable name (“new”) that can be referenced with a $-sign in the update expression itself. In this way the variable $new refers to the host variable:

UPDATE customer
SET info =  XMLMODIFY(‘replace value of node /customer/phone[@type = "home"]  with $new ‘, :newnum as “new )
WHERE customer.id = :cid

Next, let’s assume we need to replace the home phone number with a cell phone number. This can be done with a node replacement. The following UPDATE provides a new <phone> element (shown in green). It uses the XMLPARSE function to convert the text representation of the new element to data type XML,  and assigns the new element to the variable $new. This variable is used in the “replace node” expression to replace the selected existing phone element:

UPDATE customer
SET info =  XMLMODIFY(‘replace node /customer/phone[@type = "home"]  with $new ‘ ,
XMLPARSE(document ‘<phone type=”cell”>408-123-4567</phone>‘)  as “new”
)
WHERE customer.id = 123

Again, the new element could also have been provided  by a host variable.
You can also add and remove nodes from a document. The next UPDATE deletes the phone element with the cell phone number from the document:

UPDATE customer
SET info =  XMLMODIFY(‘delete nodes /customer/phone[@type = "cell"] )
WHERE customer.id = 123

Note: the “delete” expression is the only update expression that is allowed to affect zero, one, or multiple nodes per document. While the “replace” expression can only be applied to one node per document, the “delete” expression can remove multiple nodes at once!  The following UPDATE deletes all <phone> elements that appear on the path /customer/phone:

UPDATE customer
SET info =  XMLMODIFY(‘delete nodes /customer/phone )
WHERE customer.id = 123

To insert a new element, attribute, or document fragment you need to provide the new node and specify a target location where the new node should be placed in the document. For example, the next UPDATE statement inserts the new <phone> element (in green) into the document so that it appears after the existing  <email> element:

UPDATE customer
SET info =  XMLMODIFY(‘insert node $new  after /customer/email ,
XMLPARSE(document ‘<phone type=”cell”>408-123-4567</phone>‘)  as “new”)
WHERE customer.id = 123

Instead of using the keyword “after” you can also specify that a new node should be inserted “before” an existing element or as a child “into” an existing element. When you use the keyword “into” to add a node as a new child to an existing element, then you can also specify “as first into” or “as last into” to explicitly make it the first or the last child node.

The XPath the specifies the target location of the new node, such as /customer/email in the example above, must identify exactly one node, and must node lead to zero or multiple nodes.

And finally, if your XML data contains namespaces, you must declare the namespaces in the XMLMODIFY function just like you would in an XMLEXISTS or XMLQUERY function. This next update declares a default element namespace that applies to all elements in the XPath /customer/addr/city:

UPDATE customer
SET info =  XMLMODIFY(”declare default element namespace http://www.example.com&#8221;;  replace value of node /customer/addr/city with “San Jose” ‘ )
WHERE customer.id = 123

From here on you should try it out for yourself, or let me know if you have any questions!

 

 


About these ads

21 Responses to “Node-level XML Updates in DB2 10 for z/OS”

  1. Sandy Davis Says:

    Why is the syntax to do node level updates different in DB2 Z than in DB2 LUW? In LUW we use TRANSFORM. Will this syntax be coming to LUW soon?


    • Hi Sandy,

      I think the difference in syntax between LUW and z/OS are smaller than what they seem. The biggest difference is really the fact that DB2 z/OS allows XQuery Update expressions only in the new XMLMODIY function which can only be used in the SET clause of an UPDATE statement. DB2 LUW allows XML update expressions “everywhere”, e.g. in the XMLQUERY function, XMLEXISTS, and the XMLTABLE function, and those can be used not only in UPDATEs but also in INSERTs and queries. This allows documents to be modified as they are inserted, or modified as they are queried without changing them in the underlying table. Such capabilities are sometimes useful, but most of the time XML update expressions are used in UPDATE statements, as supported in DB2 z/OS. Hence, the DB2 z/OS support for XML updates is a very streamlined and efficient implementation for the predominant use case.

      The other difference that you have noticed is about the update expressions themselves. First, note that the “transform” keyword in LUW is optional and has been dropped from XQuery Update Facility standard. So, lets’ compare a typical update between z/OS and LUW:

      z/OS:

      UPDATE customer
      SET info = XMLMODIFY(‘
      replace value of node /customer/addr/city with “San Jose” ‘ )
      WHERE customer.id = 123

      LUW:

      UPDATE customer
      SET info = XMLQUERY(
      ‘copy $new := $INFO modify do
      replace value of /customer/addr/city with “San Jose”
      return $new‘)
      WHERE customer.id = 123;

      You see that LUW uses the additional “copy” clause, the “modify do” keywords, and the “return” clause. Those are omitted in DB2 for z/OS, which has no impact on the most common types of updates. The actual “replace”, “delete”, and “insert” expressions, including the “insert after”, “insert before”, “insert into”, etc. are the same across DB2 LUW and DB2 z/OS, based on the XQuery Update standard.

      The same, well, except for the keyword “node” in the DB2 z/OS syntax. What’s up with that? The reason is that the XQuery Update standard has added the keyword “node” and removed the keywords “modify do” after the support for XML updates was implemented in DB2 9.5 for LUW. That was very unfortunate. Luckily, the keywords “node”, “modify do”, and “transform” are just syntactic sugar but don’t change the meaning or behaviour of these updates at all. Some people argue that they are just noise. In the future we may allow the keywords “modify do” to be omitted in LUW and the keyword “node” to be added, for the sake of platform compatibility and without compromising backwards compatibility for existing LUW users.

      As for the “copy” clause and the “return” clause in LUW: Update 26 through 29 in the article “Update XML in DB2 9.5” show some examples where these clause might be useful. However, in the most common update cases these clause do not add much benefit. Hence, their omission in DB2 z/OS is -in my opinion- not a big deal and allows for simpler and shorter syntax.

      Matthias

  2. Dave Scheffer Says:

    Do we have performance figures and best practice recommendations on these new features? I don’t want to prohibit my community from iterative XML operations yet at the same time need to understand capacity planning and load issues.


    • Hi Dave,
      let me give you a quick answer now and I’ll try to find some more information in the next few days.

      First, the new XML updates provide a big performance benefit. In DB2 9 for z/OS, XML updates require full document replacement, which typically means that a full document is retrieved from the database to the application, parsed and modified in the application, and shipped back to DB2 where it is fully parsed and logged again, even if only 1 value in the document has changed. In DB2 10 for z/OS, a node-level update can be submitted to the database and is executed without XML parsing and without round-tripping the document over the network. Additionally, if the XML document is large then DB2 10 for z/OS does not log the entire document but only the section in which the update happened. I don’t have a number right now, but all this a BIG performance gain, not hard to see.

      The binary XML (which I mentioned in my previous post) allows XML to be transmitted between client and server in an optimized format. This format is about 17% to 46% smaller in size than traditional textual XML. It also saves DB2 between 9% and 30% of CPU cost during XML insert. If you also count the reduced latency, you may see the end-to-end response time for XML inserts reduced by 10% to 50%. Similar benefits are possible when you retrieve XML in binary format into a Java application. These numbers have been found in some internal tests. Obviously, the performance gain may vary depending on your hardware, network, and on the exact nature of your applications and XML documents.

      As for best practices, there will be a Redbook coming out in a few months, titled “Extremely pureXML in DB2 10 for z/OS”. Let me see what else we can make available until then.

      Thanks,

      Matthias

  3. Venu Says:

    Hi Matt,
    I am working on Xquries. When I searched in GOOGLE and I found your examples and they are very helpful to write the Xqueries.

    I am trying to use the host variable concept in where caluse but it did not work for me.
    Can I get any syntax for that.

    If I use the static value(ex: “12121212″) in XMLEXISTS it is working fine but it is not taking the host variable :HV-INT-ID in XMLEXISTS.

    Here is my SQL
    EXEC SQL
    DECLARE COMFRM_CSR CURSOR FOR
    SELECT X.ID_10
    ,X.CONF_NMBR
    FROM COMFIL.FORM as Y,
    XMLTABLE
    (
    ‘$a/Filing/FilingKeys’ PASSING FORM_DATA_XML
    AS “a” COLUMNS
    ID_10 VARCHAR(11) PATH ‘INT_ID’,
    CONF_NMBR VARCHAR(17) PATH ‘CON_NMBR’
    )
    AS X
    WHERE Y.INT_ID = :HV-INT-ID
    AND Y.DCMT_NMBR = :HV-DCMT-NMBR
    AND XMLEXISTS(
    ‘$FORM_DATA_XML/Filing/FilingKeys[INT_ID/text()= $d]‘
    passing :HV-INT-ID as “d”)
    FOR FETCH ONLY
    END-EXEC


  4. Hi Venu,

    what was the error msg that you received and which version of DB2 are you using?

    In any case, I think that SQL requires you to cast the host variable to the desired data type that you want t0 use in the comparison, like this:

    XMLEXISTS(‘$FORM_DATA_XML/Filing/FilingKeys[INT_ID/text()= $d]‘
    passing CAST(:HV-INT-ID as INTEGER) as “d”)

    If you are using DB2 for z/OS then I think you also need to pass the XML column name explicitly like you did in the XMLTABLE function:

    XMLEXISTS(‘$a/Filing/FilingKeys[INT_ID/text()= $d]‘
    passing FORM_DATA_XML AS “a”, CAST(:HV-INT-ID AS INTEGER) AS “d”)

    Does this solve your problem?

    Thx,

    Matthias

  5. Venu Says:

    Hi Matt,
    I am having an issue in XQuery while using the XMLEXISTS in where clause.
    Could you please verify the syntax is correct and guide me to resolve this.
    Here is my query:

    SELECT ,INT_DCMT_ID
    ,XMLCAST (XMLQUERY(
    ‘for $x in $p/ReturnCompositionBO/composition/forms/*/@intDcmtSeqNmbr return $x’ passing DTF.FILING.FILING_XML as “p”)

    INTO ,:HV-INT-DCMT-ID
    ,:HV-INT-DCMT-SEQ-NMBR
    FROM TABLE
    WHERE INT_DCMT_ID = :HV-IN-INT-DCMT-ID
    AND XMLEXISTS(‘$b/ReturnCompositionBO/composition/forms/*/@intDcmtSeqNmbr = $d’ passing FILING_XML as “b”, :HV-VAR as “d”);

    SQL is executed successfully but it not filtering the vlaues. It is retrieving more than one row. As per the data it has only ONE row in table with intDcmtSeqNmbr = “2″. I am passing value 2 to HV-VAR. But it is returning other rows also.
    Is the syntax is correct?

    Thanks in advance.
    Regards,
    Venu.


    • Hi Venu,

      I suspect the reason is that you forgot the square brackets that are required for XPath predicates. Try it like this:

      …XMLEXISTS(‘$b/ReturnCompositionBO/composition/forms/*[ @intDcmtSeqNmbr = $d ]’ passing FILING_XML as “b”, :HV-VAR as “d”);

      Does that help?

      Thanks,

      - Matthias

      • Venu Says:

        Hi Matt,
        Thank you very much for your response.
        I tried with square brackets also. Did not work.
        When I use the brackets I am getting bind error with SQLCODE = -16002.

        Thank you very much for your time.

        Regards,
        Venu.


      • Hmm, the error 16002 means that there is still a syntax error (unexpected token) in your query. I’m quite certain that you do need the square brackets in your XPath expression. You just need to get the syntax right. The error msg should indicate what the unexpected token was. You could also try to first get the query to work correctly on the DB2 command line. Let me know if you need more help with that.

        Matthias

  6. Kayla Says:

    Hi Matthias,

    Is it possible to rename a node based on the actual value.

    For instance I have something like the following…

    ABC
    anotherproduct
    someproduct

    well this is the short form.

    What if I need to go through and say “replace all products with the value ‘ABC’ with the new value ‘BBB’.

    So the new xml would be…
    BBB
    anotherproduct
    someproduct

    Rather than just replacing where “cid=3000″. As in, instead of using an attribute as my where, I need to only replace where a value of a node is something in particular.

    Any idea?
    Thanks

    • Kayla Says:

      darn, i figured it would leave out my tags. basically each of those ABC, anotherproduct, someproduct, are in a “product” tag which is in a “root” tag. so I need to look at all the product tags with a value of ABC and replace with a value of BBB.

  7. Frano Says:

    If I have pureXML column in DB2 table that contains xml like this:

    123456
    2012-09-22

    987654
    2013-12-22

    is it possible to replace the value of node data/item/value but only for the node data/item that has changedate = ’2013-12-22′, from ’987654′ to ’235689′?

    Thanks
    Frano

  8. Frano Says:

    Thank you for your answer. I am not sure that “Update 20″ is similar with what I was asking. In this example you can put ‘type = …’ in path but I need to refer on value of one tag to change value of another. I have two tags with path ‘data/item’ and inside one of the ‘item’ tags I like to change value of the path ‘data/item/value’, but only for the one that has ‘data/item/changedate’ = 2013-22-22.

    It is like you have xml with five customers with their addresses and account numbers in one xml column in one row of BANK DB2 table. Each of them has path ‘data/customer’ and inside customer tag there are tags for name, surname, address and account number. Now I like to change address for the customer with account number 12345679 and name John. And other four I like to keep intact.

    If the structure is in relational DB2 table we can use SQL update:

    Update CUSTOMER set ADDRESS = ‘Long street 25′ where accnumber = 12345679 and NAME = ‘John’;

    I need xquery update expression that can do the same inside the xml saved in purexml column.


    • Here is an example:

      create table mytable(id INTEGER, xmlcol XML);
      insert into mytable values (1, your-sample-document-here );

      update mytable
      set xmlcol = xmlquery( 'copy $new := $XMLCOL
      modify do replace value of $new/data/item[changedate = "2013-12-22"]/value
      with 999999
      return $new' )
      where id = 1;

      Does this help?

      And one more example:

      update customer
      set xmlcol = xmlquery( 'copy $new := $XMLCOL
      modify do replace value of $new/data/customer[accnumber = 12345679 and name = "John"]/address
      with "Long Street 25"
      return $new' )
      where ... ;

      • Frano Says:

        Thank you very much. I am really new in this stuff. Until now I never used pureXML columns in my databases and this is very helpful.

  9. Frano Says:

    so I have xml like this:

    #data#
    #item#
    #value#123456#/value#
    #changedate#2012-09-22#/changedate#
    #/item#
    #item#
    #value#987654#/value#
    #changedate#2013-12-22#/changedate#
    #/item#
    #/data#

    more and less signs are both repleced with #


  10. Yes, this is possible. The XPath that identifies the item that you want to update can contain a predicate on the element “changedate”.

    There are several examples (for DB2 LUW) in this article:
    https://www.ibm.com/developerworks/data/library/techarticle/dm-0710nicola/
    In this article, look for the section “Modify repeating nodes”. Example “Update 20″ is similar to what you are asking.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 50 other followers

%d bloggers like this: