XQuery and SQL/XML: How to convert a date that is not a date into a date?

March 18, 2011

Recently I received an interesting question about a date that is not a really date. Take the following XML element as an exanple:

<StartTime>Monday Mar 23 2009 10:20:24.19</StartTime>

This element contains a value that describes a date and a time. The problem is that the string “Monday Mar 23 2009 10:20:24.19” is really only a string and not a valid representation of a date or a timestamp in the world of XML.

The XML Schema specification defines a specific data type for timestamps: xs:dateTime. According to that spec, literal values must have a specific format if they want to be of type xs:dateTime, or better: if you want to process them as xs:dateTime with a language such as XPath or XQuery. For example, the value 2002-10-10T12:00:00 is a valid xs:dateTime value.

Unless you deal with negative dates, xs:dateTime values must be of the form yyyy-mm-dd’T’hh.mm.ss, plus optional fractional seconds and an optional timezone which I am omitting here for simplicity. The character ‘T’ is required as a delimiter between the date and the time portion of the literal.

So, how can you convert the original string value into a valid xs:dateTime value? Also, how can can you convert that string into a proper SQL timestamp?

Let’s start with an XQuery solution to convert the string above to xs:dateTime, and then we look at a SQL/XML solution to produce a SQL timestamp. Both approaches assume, for simplicity, that the input data is a small document in an XML column:

create table mytest(doc XML);
insert into mytest values ('<root><StartTime>Monday Mar 23 2009 10:20:24.19</StartTime></root>');

XQuery

The main trick is to break the string into its pieces for year, months, day, time, and so on. This can be done with the fn:tokenize function that I wrote about earlier. Go back to my post about fn:tokenize if you need to refresh your memory on that function.

Here is a first XQuery that is not quite perfect yet, but a good start:

xquery 
for $t in db2-fn:xmlcolumn('MYTEST.DOC')/root/StartTime
let $pieces := fn:tokenize($t, " ")
let $timestamp := fn:concat($pieces[4],"-",$pieces[2],"-",$pieces[3],"T",$pieces[5])
return $timestamp;
----------------------------------
2009-Mar-23T10:20:24.19
 1 record(s) selected.

In the “for” clause,  the StartTime element gets assigned to the variable $t. In the first “let” clause, the function fn:tokenize splits the string value of StartTime along the spaces in the string. The second parameter of the fn:tokenize function specifies that the blank character is a delimiter in the string, and the string should be tokenized along that delimiter. The variable $piece now contains a sequence of tokens.

In the second “let” clause we use the function fn:concat to concatenate the tokens in a different order and with different delimiters. We start with the 4th token from the input string (year), append a dash, then append the 2nd token (month), then another dash, then the 3rd token (day), and so on.

The result is almost what we want. The remaining task is to convert the month “Mar” into a “03″. And of course we need to be able to perform this conversion for any of the 12 months that can possibly occur. For example, you could think of a 12-way if-then-else expression, but that would be rather ugly.

A more elegant approach is to build simple “month map” from which we pick the right month. Look at the two additional “let” clauses in bold below. The “let $map” defines a constant piece of XML that maps the three-letter month names to the numbers 01, 02, 03, and so on. Then the “let $month” performs a lookup in that map to pick the element whose name matches $piece[2], which is “Mar” in our example, and the /text() value of the element <Mar> is “03″. This “03″ is then used in the concat of the final result:

xquery 
for $t in db2-fn:xmlcolumn('MYTEST.DOC')/root/StartTime
let $pieces := fn:tokenize($t, " ")
let $map := <map><Jan>01</Jan><Feb>02</Feb><Mar>03</Mar><Apr>04</Apr>. . . </map>
let $month := $map/*[name()=$pieces[2]]/text()
let $timestamp := fn:concat($pieces[4],"-", $month,"-",$pieces[3],"T",$pieces[5])
return xs:dateTime($timestamp);
----------------------------------
2009-03-23T10:20:24.19 
 1 record(s) selected.

Also n0te that the cast to xs:dateTime in the “return” clause proves that we have produced a valid xs:dateTime value.

SQL/XML

Now that we have the XQuery solution above, it is trival to plug this into an SQL/XML query to produce a SQL timestamp:

SELECT XMLCAST ( XMLQUERY('for $t in $DOC/root/StartTime
   let $pieces := fn:tokenize($t, " ")
   let $map := <map><Jan>01</Jan><Feb>02</Feb><Mar>03</Mar><Apr>04</Apr> </map>
   let $month := $map/*[name()=$pieces[2]]/text()
   let $timestamp := fn:concat($pieces[4],"-", $month,"-",$pieces[3],"T",$pieces[5])
   return $timestamp')  AS TIMESTAMP)
FROM mytest;
--------------------------
2009-03-23-10.20.24.190000

 1 record(s) selected.

The XMLQUERY function contains the same XQuery expression as before to produce an xs:dateTime value, which the XMLCAST function can than cast to the SQL type TIMESTAMP. Simple as that.

As always, there are multiple solutions to every problem, so here is another one. This query use fn:tokenize to split the string into its five pieces and each piece is returned by the XMLTABLE function as a separate relational column. The expression in the SELECT clause concatenates the pieces in the right order, and uses a good old SQL CASE expression to map the three-latter month name to the equivalent two-digit number:

SELECT year || '-' || 
   (CASE
      WHEN month = 'Jan' THEN '01'
      WHEN month = 'Feb' THEN '02'
      WHEN month = 'Mar' THEN '03'
      WHEN month = 'Apr' THEN '04'
      WHEN month = 'May' THEN '05'
      WHEN month = 'Jun' THEN '06'
      WHEN month = 'Jul' THEN '07'
      WHEN month = 'Aug' THEN '08'
      WHEN month = 'Sep' THEN '09'
      WHEN month = 'Oct' THEN '10'
      WHEN month = 'Nov' THEN '11'
      WHEN month = 'Dec' THEN '12'        
      ELSE 'unknown'
    END)      
  || '-' || day  || '-' || time
FROM mytest, XMLTABLE('$DOC/root' 
 COLUMNS
   dayOfweek  VARCHAR(20)  PATH 'fn:tokenize(StartTime, " ")[1]',
   month      CHAR(3)      PATH 'fn:tokenize(StartTime, " ")[2]',
   day        INTEGER      PATH 'fn:tokenize(StartTime, " ")[3]',
   year       INTEGER      PATH 'fn:tokenize(StartTime, " ")[4]',
   time       VARCHAR(32)  PATH 'fn:tokenize(StartTime, " ")[5]'          
 ) T;

----------------------------------------------------------------
2009-03-23-10:20:24.19

 1 record(s) selected.

Which of these solutions do you like best?

About these ads

8 Responses to “XQuery and SQL/XML: How to convert a date that is not a date into a date?”


  1. You forgot to include the case where the day is less than 10. You’ll want to prefix that with a ’0′ in such a case.

    How about this:

    xquery
    for $t in db2-fn:xmlcolumn(‘MYTEST.DOC’)/root/StartTime
    let $pieces := fn:tokenize($t, ” “)
    let $map := 01020304. . .
    let $month := $map/*[name()=$pieces[2]]/text()
    let $day := xs:integer($pieces[3]
    let $day := if ($day lt 10) then concat(’0′,fn:string($day)) else fn:string($day)
    let $timestamp := fn:concat($pieces[4],”-”, $month,”-”,$day,”T”,$pieces[5])
    return xs:dateTime($timestamp);

    Please verify that:

    Monday Mar 3 2009 10:20:24.19

    results in:

    2009-03-03T10:20:24.19


  2. [Sorry about the repost. No preview button and I didn't realize that the brackets would be eaten.]

    You forgot to include the case where the day is less than 10. You’ll want to prefix that with a ’0′ in such a case.

    How about this:

    xquery
    for $t in db2-fn:xmlcolumn(‘MYTEST.DOC’)/root/StartTime
    let $pieces := fn:tokenize($t, ” “)
    let $map := <map><Jan>01</Jan><Feb>02</Feb><Mar>03</Mar><Apr>04</Apr>. . . </map>
    let $month := $map/*[name()=$pieces[2]]/text()
    let $day := xs:integer($pieces[3]
    let $day := if ($day lt 10) then concat(’0′,fn:string($day)) else fn:string($day)
    let $timestamp := fn:concat($pieces[4],”-”, $month,”-”,$day,”T”,$pieces[5])
    return xs:dateTime($timestamp);

    Please verify that:

    <StartTime>Monday Mar 3 2009 10:20:24.19</StartTime>

    results in:

    <StartTime>2009-03-03T10:20:24.19</StartTime>


  3. Nicola-san,

    Well, things have been a little hectic in the Tokyo area the past week, but are starting to settle down. Thanks.

    I am glad to see so much improvement in DB2′s XQuery implementation as compared to the first beta versions of 9.0. It has really come a long way. Congratulations to the whole team.

  4. Nagesh Says:

    Good solution !

    Of course, the above code would have to be adjusted for locale implications. For example, the date separator could be a slash or a period. Similarly, the positions of day, month could be different.

    Personally, I was hoping to see a db2-fn: where the argument is a xsd:dateTime and the result a proper DB2 TIMESTAMP value formatted according to the CURRENT LOCALE. This would be FANTASTIC !

  5. Nagesh Says:

    Adding on to my earlier comment, a supplementary ‘reverse’ db2-fn: function would be even better. That is, input provided is a valid DB2 TIMESTAMP value and the result is a valid xsd:dateTime value.


    • Hi Nagesh,

      thanks for your comments. You’re right, the solution above is for one particular non-standard date format only. It’s not a general-purpose solution for all possible non-standard formats. (As an aside, this raises the question of how and why the non-standard date format was produced in the first place and not converted to the standard XML dateTime format when the XML was originally generated. But yes, I understand that it is not a perfect world… )

      You say that it would be nice to encapsulate the date format conversion in a function. Yes, it can be done in a user-defined function which can then be used in SQL or XQuery statements. And it is certainly also possible to create an inverse function that performs the conversion in the other direction. Maybe I will blog some examples in the future (when I have time), but if you have a very specific need for such functions, please contact me at mnicola@us.ibm.com.

      Matthias


  6. Hi Michael,

    good to hear from you. Hope all is well.
    Yes, good catch! If the 3rd of March comes in as <StartTime>Monday Mar 3 2009 10:20:24.19</StartTime> rather than <StartTime>Monday Mar 03 2009 10:20:24.19</StartTime>, then my solution above is insufficient.

    The enhanced solution that you suggested works fine, like this:

    xquery
    for $t in db2-fn:xmlcolumn('MYTEST.DOC')/root/StartTime
    let $pieces := fn:tokenize($t, " ")
    let $map := <map><Jan>01</Jan><Feb>02</Feb><Mar>03</Mar><Apr>04</Apr>. . . </map>
    let $month := $map/*[name()=$pieces[2]]/text()
    let $day_int := xs:integer($pieces[3])
    let $day := if ($day_int lt 10) then concat('0',fn:string($day_int)) else fn:string($day_int)
    let $timestamp := fn:concat($pieces[4],"-", $month,"-",$day,"T",$pieces[5])
    return xs:dateTime($timestamp);

    -----------------------------------------
    2009-03-23T10:20:24.19
    2009-03-03T10:20:24.19

    2 record(s) selected.

    Thanks!

    Matthias


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 52 other followers

%d bloggers like this: