Update on the XML Challenge
December 3, 2008
I just found out that more than 55,000 people have entered the XML challenge so far. Remember that some contests in the challenge are run on a monthly basis, so there is still an opportunity to win prizes. And I understand that there are still more than $50,000 in prizes that have yet to be won in the US and Canada. I also heard that the organizers still have t-shirts to give away for people who simply enter the contest in the US. But these are running out fast, so don’t waste any time with your entry.
The two programming-oriented contests in the challenge actually opened on the 1st of December in the US (there are separate contests in different countries). Entries can be sent in until the 31st of January. Good luck with your entries…
Referential Integrity and XML Data
November 18, 2008
A few people have recently asked about the ability to ensure referential integrity in XML data. In this case, they are referring to the referential integrity feature that is common to relational databases. I thought I’d take a few moments to share one possible answer with you.
Referential integrity is a relational database feature that ensures consistency is maintained between items that reference one another. For instance, if you have a database table with order information that refers to a database table with product information, referential integrity ensures that each order refers to a valid product. In technical terms, referential integrity ensures that each product ID specified in the order table (where the product ID is a foreign key in the order table) exists in the product table.
Over the years, DBAs have found referential integrity to be a valuable feature in their relational databases. And now, they are asking how to ensure referential integrity with XML data.
The XML standard does not include mechanisms for ensuring referential integrity amongst XML elements. You can use schemas to place constraints on the XML data, but these constraints are not really enforced by the database (except insofar as the database is used to validate via a schema). And the ability to place constraints on XML data doesn’t quite add up to the ability to ensure the same level of consistency checking offered by referential integrity in a relational database.
However, there is a possible answer. This is one of the cases where using a hybrid relational/XML database proves to be very useful. You extract values from certain XML elements and store those values in relational columns in the same table, and then place constraints on those relational columns to ensure referential integrity. There is a cost associated with this approach, due to increased storage and additional programming logic. However, for many, this increased cost is justified by the ability to ensure referential integrity.
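To make the hybrid approach concrete, here is a minimal sketch in Python. It uses SQLite purely as a stand-in for a hybrid relational/XML database, and the table names, column names, and the `productId` element are assumptions made up for this illustration. The product ID is extracted from each order document and stored in a relational column beside the document, and a foreign key on that column does the integrity checking.

```python
import sqlite3
import xml.etree.ElementTree as ET

# SQLite stands in for the database here; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE product (product_id TEXT PRIMARY KEY, name TEXT)")
conn.execute(
    """CREATE TABLE orders (
           order_id   INTEGER PRIMARY KEY,
           product_id TEXT NOT NULL REFERENCES product(product_id),
           order_doc  TEXT NOT NULL
       )"""
)
conn.execute("INSERT INTO product VALUES ('P100', 'Widget')")


def insert_order(order_xml):
    """Extract the product ID from the XML and store it beside the document."""
    product_id = ET.fromstring(order_xml).findtext("productId")
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders (product_id, order_doc) VALUES (?, ?)",
            (product_id, order_xml),
        )


# This order refers to a product that exists, so it is accepted.
insert_order("<order><productId>P100</productId><qty>2</qty></order>")

# This order refers to an unknown product, so the foreign key rejects it.
try:
    insert_order("<order><productId>P999</productId><qty>1</qty></order>")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("second order rejected:", rejected)
```

The extra column and the extraction step are the storage and programming costs mentioned above. In return, the database refuses any order document that points at a product that does not exist.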
Benchmark for 1TB Transactional XML System
November 3, 2008
IBM continues to openly benchmark its DB2 pureXML capabilities. Last week at the Information On Demand Conference in Las Vegas, Intel and IBM released details of their latest joint benchmark. The goal of the benchmark is to show the performance levels that you can expect for inserts, updates, deletes, and queries on transactional XML data. This latest benchmark is with 1TB of XML data using the FIXML standard industry format. The results are very interesting. Thanks to the way that DB2 stores XML data, the XML data occupies less than 450GB of disk space when stored in DB2. If you compare the results of this benchmark against previous benchmarks, you can see that the addition of 50% more processing cores provides 48% more throughput (at lower CPU utilization rates), indicating a nice scale-out story. Intel actually measured a 4-CPU server, with 6 cores per CPU, processing more than 6700 XML-based TPoX transactions per second. One of the key aspects of the benchmark is the minimal amount of database tuning needed to obtain these results. For more details, including hardware specifications and database configuration, see the presentation from the conference.
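If you want to sanity-check the scaling and storage figures quoted above, the arithmetic is simple. The sketch below uses only the numbers from this post, and it assumes decimal units (1TB = 1000GB), which the benchmark material may or may not use.

```python
# Figures quoted in the post above.
extra_cores = 0.50        # 50% more processing cores
extra_throughput = 0.48   # 48% more throughput
raw_gb = 1000             # 1TB of raw XML, assuming 1TB = 1000GB
stored_gb = 450           # upper bound: "less than 450GB" on disk

# Fraction of the added capacity that showed up as added throughput.
scaling_efficiency = extra_throughput / extra_cores

# Throughput per core, relative to the earlier benchmark.
per_core_ratio = (1 + extra_throughput) / (1 + extra_cores)

# Upper bound on the stored size as a fraction of the raw size.
storage_ratio = stored_gb / raw_gb

print(f"scaling efficiency: {scaling_efficiency:.0%}")
print(f"per-core throughput vs. before: {per_core_ratio:.1%}")
print(f"stored size vs. raw size: under {storage_ratio:.0%}")
```

In other words, about 96% of the added core capacity turned into added throughput, and the stored data takes up less than half of its raw size.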
10 Reasons why DBAs Should Understand Native XML
October 9, 2008
Here are 10 reasons why relational database administrators need to understand and possibly use native XML storage…
XML Challenge Web Site is Live!
October 2, 2008
I recently blogged about the Search for an XML Superstar contest. I’m happy to say that the Web site is now live at www.xmlchallenge.com.
The XML challenge actually consists of five separate contests, so there is a good chance you will find a contest you can enter, regardless of your level of technical ability:
- Video contest
- Gadget contest
- Query contest
- Ported application contest
- XML contest
The contest is open to both students and professionals. You can enter as many of the contests as you like. In the US, prizes include Laptops, Wii consoles, Zune players, iPod Touches, iPod Nanos, USB keys, and T-shirts. Good luck with your entries…
XML in Oracle 11g
September 30, 2008
Here are some observations about the XML storage capabilities in Oracle 11g. These observations were deduced from public sources. Please be aware that I work for IBM, which competes directly with Oracle in this regard.
Oracle provides three options for storing XML data:
- Unstructured, which is essentially Character Large OBject (CLOB) storage. Like any CLOB implementation, you will need to retrieve and parse the XML data before executing XPath and XQuery statements, which has a query-time performance impact.
- XML-Object-Relational, which shreds the XML data into object-relational tables. There are multiple storage options to choose from (which I will cover in a moment). Oracle recommends this option for data-centric use cases. Of course, with this option, retrieving the original XML data will incur a performance hit as the data is re-composed.
- XML-Binary, which stores the XML data as a token stream in a Binary Large OBject (BLOB). Oracle recommends this option for document-centric use cases.
If you use XML-Object-Relational, which is also known as Structured or Schema-based storage, you have five different options to choose from for storing repeating elements:
- Store in CLOBs
- Store in VARRAY as LOBs
- Store in VARRAY as nested tables
- Store in VARRAY as XMLType pointers to BLOBs
- Store in VARRAY as XMLType pointers to nested tables
If your head is spinning with the different options, I don’t blame you.
Oracle provides a special index called XMLIndex, which indexes the internal structure of XML data. Actually, this index is a table. There is an interesting post on the Oracle Discussion Forum. In this post, an Oracle user describes their experiments with XBRL data in Oracle 10g and 11g. In their experiments, case 3 runs on Oracle 11g without XMLIndex in 9.6 seconds, while case 4 runs with XMLIndex and takes 574 seconds. So, in this user’s experience, running with the XMLIndex results in approximately a 60x slowdown.
Note that if you want to create an index for numeric or date values in Oracle 11g, you must use stored procedures, which create separate indexes that you must asynchronously maintain.
Oracle now has an XML update function. It is not compliant with the W3C XQuery Update Facility. However, Oracle did recently announce the XQuilla XQuery engine, which they claim “will (maybe) free the way for the W3C XQuery Update Facility 1.0 candidate specification / implementation, which is embedded in the XQuilla XQuery engine, for other Oracle products”. So it is possible that a future release of Oracle may support the W3C XQuery Update Facility.
The compendium of XML storage options in Oracle 11g is, in my opinion, essentially based upon existing relational infrastructure. This is very different from the approach that IBM has taken, which was to build native support for XML data into its database from the ground up.
Please note that these are solely my personal opinions and not necessarily those of my employer IBM.
1st October is Online Community Action Day
September 28, 2008
Adam Gartenberg is a colleague of mine at IBM. He is a big advocate of online communities, and wants to encourage online communities to be as active as possible. After all, the more we participate in online communities, the greater the benefit for everyone.
As such, Adam is dubbing October 1st to be Online Community Action Day. The idea is that, on this day, you will make your best effort to contribute in some way to an online community… any online community. For example, you could:
- If you agree or disagree with a blog post, add a comment.
- If there is a blog post that made a difference in your job, leave a comment saying “thank you.”
- If you have a handy tip worth sharing, add a post to a discussion forum.
- Sign up for an online community like ChannelDB2.
By being active, you will help make online communities better!
Flexible Schemas: When to Persist Data in XML Instead of Relational
September 26, 2008
One of the great benefits of XML as a format for persisting data is that it is relatively easy to update the schema. XML was designed to be inherently flexible in nature. Adding, removing, and updating elements or attributes in the data and schema are relatively straightforward operations.
In the past, when persisting XML data, many organizations mapped XML data into relational tables and ended up pulling their hair out when later updating schemas. Increasingly, organizations are choosing to store this data natively in XML format.
Of course, you could store XML data in a Character Large OBject (CLOB) or a Binary Large OBject (BLOB) in a relational table and still enjoy the benefits of easier schema management. However, the overhead involved in retrieving data from a CLOB or a BLOB often makes such situations unworkable. For instance, to work with the XML data, you need to retrieve the entire CLOB or BLOB, and then you need to process the contents of the CLOB or BLOB with an XML parser, before being ready to work with the XML data. This is an awkward and inefficient architecture, incurring a significant overhead for each data read operation. Many organizations are turning to data management solutions like DB2 pureXML that allow them to natively store and process their XML data.
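To see where the overhead comes from, here is a minimal sketch of the CLOB-style workflow in Python. SQLite and a plain text column stand in for CLOB storage, and the table, the element names, and the sample documents are all made up for this illustration. Because the database sees only opaque text, even a simple question about one element means fetching every document whole and parsing it in the application.

```python
import sqlite3
import xml.etree.ElementTree as ET

# A text column stands in for a CLOB; all names and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_doc TEXT)")
docs = [
    "<order><customer>Ann</customer><total>120.50</total></order>",
    "<order><customer>Bob</customer><total>15.00</total></order>",
    "<order><customer>Cy</customer><total>300.00</total></order>",
]
conn.executemany("INSERT INTO orders (order_doc) VALUES (?)", [(d,) for d in docs])


def customers_with_total_over(threshold):
    """Find customers whose order total exceeds threshold, CLOB-style."""
    matches = []
    parsed = 0
    # The database cannot look inside the text, so every document is fetched whole...
    for (order_doc,) in conn.execute("SELECT order_doc FROM orders ORDER BY order_id"):
        root = ET.fromstring(order_doc)  # ...and parsed in the application
        parsed += 1
        if float(root.findtext("total")) > threshold:
            matches.append(root.findtext("customer"))
    return matches, parsed


matches, parsed = customers_with_total_over(100)
print(matches, "found after parsing", parsed, "documents")
```

Every read pays the full retrieve-and-parse cost for every document, whether or not that document ends up matching. That per-read cost is the overhead that native XML storage is meant to avoid.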
Here at IBM, we have come across several instances where organizations have embraced the flexibility of XML schemas and chosen to persist their data in a native XML format:
- One of the world’s leading telecommunications companies recently overhauled its order entry systems. Designing an order entry system that caters for many thousands of products and services in a variety of geographies is a tremendous challenge, especially when designing a schema that can cater for current and future offerings. It was for this reason that the telecommunications giant decided to store their order data in XML format. Thanks to the flexible nature of XML schemas, they can cater for their existing complex needs, while minimizing the impact of later introducing innovative new products and services.
- Taxation authorities are faced with taxation rules and taxation forms that change on a yearly basis. Therefore, their data schemas must change each year. Some years, the changes consist of relatively straightforward field additions. However, some years the changes consist of larger reorganizations, which are much more difficult to manage. For taxation authorities, being able to easily manage schema changes from year-to-year is a compelling reason to move to persisting data in an XML format. IBM is currently working with multiple taxation authorities around the world to improve their tax collection systems. If you want to read about one such taxation authority’s experiences, check out New York State tax agency uses pureXML to simplify filing of more than 2 million returns already.
- A Japanese software company developed a system that stores and manages diverse information for educational establishments. Because each educational institution has different storage needs, and because those storage needs evolve over time, this company discovered that using the relational model was both cumbersome and expensive. You can read more about this company and see some great quotes about their switch from relational to XML at Software Research Associates Tohoku chooses IBM DB2 9 with pureXML for UniVision+EV system.
- A leading Chinese energy and utilities corporation developed a flexible data analysis and reporting system that could handle data from extremely diverse facilities across its more than 100 constituent companies. Their approach is to store common information in relational format and to store information from diverse sources–that have different schemas–in XML format. By using such a hybrid relational/XML approach, they are able to take advantage of flexible XML schemas to easily accommodate additions, updates, and changes. For more information about this situation, see China Huadian Corporation chooses IBM DB2 9 with pureXML to integrate and analyze corporate property information.
As you can see, all these companies take advantage of the flexible nature of XML schemas and native XML storage to make systems easier to manage and update, often implementing solutions that were previously difficult or impossible to do.
XML in SQL Server 2008
September 24, 2008
This is a follow-up to my look at XML in SQL Server 2005.
With the recent release of SQL Server 2008, Microsoft made updates to the XML support in SQL Server. In particular, they made improvements to the XML Schema Definitions (XSD) that they support, they added support for the let clause in XQuery FLWOR expressions, and they added support for certain XML data manipulation insertions.
It is good to see support for the XQuery let clause, although I actually removed a reference to their lack of support for the let clause from my previous post because I felt that it was not a big deal. The expanded XML schema and XML manipulation support will prove useful for users. However, as far as I know (and I am relying on the Microsoft documentation here), the primary issues remain:
- SQL Server does not allow multiple versions of a schema in the same schema collection.
- SQL Server does not support indexing individual elements and attributes.
- The SQL/XML implementation includes non-standard extensions.
- SQL Server does not support standalone XQuery.
And you still need to carefully evaluate the performance of the following for your environment:
- Queries based on path expressions.
- Queries against large XML documents.
- Creating and updating indexes.
The following sources were consulted when compiling this post:
- What’s New for XML in SQL Server 2008, SQL Server Technical Article, August 2008.
XML in SQL Server 2005
September 19, 2008
I’d like to make one thing perfectly clear before I begin this post… the following are solely my personal opinions and not necessarily those of my employer IBM.
In the past, I have mentioned that each vendor has a very different implementation of “native XML storage”. Here are some observations about the XML storage capabilities in Microsoft SQL Server 2005. Of course, Microsoft released SQL Server 2008 in August. I will review the new release in a later post.
You should be aware that I have no inside information about SQL Server and that all of these observations were deduced from public sources. You should also be aware that I work for IBM, which competes directly with Microsoft in this regard. But, nonetheless, I think you will find the following information useful.
SQL Server parses XML data upon insertion and transforms it into a binary token string, which is then stored in a BLOB. This is parsed storage, but it is in stream format rather than tree format. The stream contains information about the hierarchical relationships between the elements.
SQL Server provides a primary and secondary index to optimize query performance. The “primary XML index” in SQL Server is actually a table. You could conceivably consider this implementation to be a clever form of shredding. The primary index is a table on which the secondary indexes are defined.
When it comes to issuing queries, SQL Server supports the two industry-standard query languages for XML data: SQL/XML and XQuery. However, you should be aware that:
- The SQL Server implementation of SQL/XML includes non-standard extensions.
- SQL Server does not support standalone XQuery. And, in fact, SQL Server translates XQuery commands into SQL before execution.
Also, if you will have queries based on path expressions or queries against large XML documents, you should very carefully evaluate SQL Server query performance.
When it comes to support for XML schemas, SQL Server does not allow multiple versions of a schema in the same schema collection. Also, SQL Server does not allow you to alter a schema. You should be aware that these schema flexibility and schema evolution restrictions can create headaches as you work with real-world XML schemas.
Finally, a few more words about indexing. Did you know that SQL Server does not support indexing individual elements and attributes? It always indexes all elements and all attributes. This leads me to recommend that you carefully evaluate the performance and logging overhead of creating and updating indexes in SQL Server for your particular use case.
The following sources were consulted when compiling this post:
- XML Best Practices for Microsoft SQL Server, Microsoft Software Developer Network paper, April 2004.
- XML Indexes in SQL Server 2005, Microsoft Software Developer Network paper, August 2005.
- XML Support in Microsoft SQL Server 2005, Microsoft Software Developer Network paper, December 2005.
- Documentation for MS SQL Server 2005, beta 2.
DB2 pureXML at the IOD Conference
September 11, 2008
In my previous post, I mentioned the great sessions at the upcoming Information on Demand conference. I thought you might be interested in some details. Here is a small selection of the session titles and abstracts, as they appear on the conference Web site, and in no particular order:
1712 Introduction to XML and DB2 pureXML for Dummies
- Is XML in your future? Come to this friendly and informative session and learn the basics of XML, and learn why you need to know about IBM DB2® pureXML™. DB2 pureXML makes your XML projects easier and improves your application performance. All attendees will receive a free “DB2 pureXML for Dummies” booklet.
1438 Learn how Verizon Streamlined their Order System
- Verizon Business delivers advanced IP, data, voice and wireless solutions in 75 countries. Processing and tracking of orders is critical. Until recently, Verizon had multiple order entry systems, with no common place to store order information. Learn how Verizon created a single change-resistant and cost-effective order management system for all orders, regardless of order entry. Also learn how they designed and implemented the system to allow new products and services to be added on the fly, thus improving business agility and reducing time to market for new offerings.
1659 Implementing an Effective Electronic Government Solution - NY State Tax
- Hear how NY State Department of Taxation and Finance implemented a streamlined process for electronic submission of taxes. Learn about the agency’s conversion to electronic forms for tax submission and the use of IBM DB2® pureXML to efficiently store and manage the electronic form data. Find out about the realized benefits from the first year of processing and hear about the lessons learned and what future enhancements are being considered. Finally, hear a discussion of how what was learned can be applied to other government agencies.
1660 Using XML for Effective Cross-agency Shared Services in Public Safety
- Learn how effective inter-agency sharing can be accomplished through XML. The session will introduce the value of IBM DB2® pureXML for leveraging shared information in public security. It will review how Shandong Public Security takes advantage of DB2 pureXML technology to help business users (policemen) access this data. It will also demonstrate how they combine the advantage of both DB2 pureXML and IBM Info 2.0 to help business users discover direct and indirect links in data, and deliver the valuable intelligence to policemen’s daily activities.
1661 Streamline Government Processing Through Electronic Forms and DB2 pureXML
- Forms are everywhere; learn how converting to electronic forms can improve citizen access, streamline processes and provide better record-keeping. Hear about the value of IBM DB2® pureXML for storing, managing and analyzing the data created by electronic forms tools such as IBM Lotus® Forms. See a demo of a simple eForms application implemented with Lotus Forms connected to DB2 pureXML and the resulting simplicity of the database and query infrastructure required.
1677 Improving Health Care in China with a DB2 pureXML EMR Solution
- This session will articulate the value of IBM DB2® pureXML in an electronic medical record (EMR) solution, and how it improves health care in China, with a population of 1.4 billion. The IBM specialist, joined by an EMR specialist, will share their experiences with both DB2 pureXML and the medical specialist Cache Database. They’ll compare the two technologies, explain why pureXML was chosen, and describe how they migrated their Cache-based solution to DB2 pureXML. The speakers will discuss the business value of their new EMR solution and its business implications for the emerging health-care markets.
1197 DB2 pureXML Production Experiences at UCLA
- Last year we worked with IBM Watson Lab to prototype the IBM DB2® V9 pureXML™ for our Patient Oriented Document System (PODS), the UCLA enterprise-wide patient record repository. In 2007 IOD we shared the prototyping approaches and the potential benefits of pureXML for managing the metadata. In this presentation we will share our production experiences using pureXML to manage metadata for the new PODS4 as well as the actual benefits realized for the upgrade from XML Extender to pureXML. To appreciate the significance of pureXML’s impact on the PODS repository, we will briefly show PODS as a critical component of the UCLA Document Management System in the overall extended service oriented architecture (xSOA) that’s in progress.
1622 Top 10 Best Practices for DB2 pureXML
- IBM DB2® software offers IBM pureXML™ support with efficient XML storage, XML indexing, and query languages SQL/XML and XQuery. These are powerful but novel concepts in DB2. Although many existing guidelines for managing data in DB2 are also valid for XML, additional considerations can help you ensure that XML data is managed easily and efficiently. Based on experience with DB2 pureXML customer projects, this session presents the top 10 best practices for flexible and high-performance XML management with DB2 pureXML.
1678 DB2 pureXML Customers - Trends and Successes
- Come hear a series of real customer success stories involving IBM DB2® pureXML™ technology. Industries represented include financial, health, retail, telecom and government. For each customer, we will discuss its problem, the business value of DB2 pureXML and the technical solution.
Don’t forget there are also meet-the-expert sessions, birds-of-a-feather sessions, and demo-til-you-drop sessions.
Meet Native XML Databases Users
September 4, 2008
Do you want to hear native XML database users speak about their experiences? Do you want to ask them questions? If so, there is an event that may interest you.
IBM’s Information on Demand conference is being held during the last week of October in Las Vegas. This conference covers all aspects of information management, which of course is much broader than the management of XML data.
There are 29 sessions dedicated to native XML data management, including:
- 8 sessions where DB2 pureXML customers or consultants talk about their experiences
- 8 technical sessions, where you learn about the technical details of DB2 pureXML
- 8 hands-on lab sessions, where you get to use the software to complete mini projects
And you will be particularly interested in:
- 2 Birds-of-a-Feather sessions, where DB2 pureXML users get together to talk about their experiences
- Meet-the-Expert sessions, where you schedule 1×1 time with a DB2 pureXML expert to ask them whatever questions you want
And this is only the native XML data management part of the conference. The conference has much more to offer, including great opportunities to learn about all aspects of information management, to network with fellow professionals, and of course to be entertained.
One last thing… if you do go, keep an eye out for a free “DB2 pureXML for Dummies” booklet that will be handed out at the conference. It will be handed out at the DB2 booth on the Expo floor and at my speaking session.
DB2 pureXML Online Communities
September 1, 2008
A quick note to let you know about a couple of online communities for DB2 pureXML:
- LinkedIn Group
Are you a member of LinkedIn? If you are, there is a LinkedIn group dedicated to DB2 pureXML. This is a great opportunity to network with other users from around the world on the IBM DB2 pureXML Network.
- User Forum
Do you have technical questions about DB2 pureXML? If you do, then there is a new forum on IBM developerWorks. Make sure to go to the DB2 pureXML Forum to ask your questions, post your tips, and read answers to other people’s questions.
Schema Evolution
August 22, 2008
Yesterday, I briefly discussed schema flexibility. Today, I’m going to talk briefly about schema evolution. Schema evolution refers to the ability to easily move to a new version of an XML schema. When evaluating vendors, make sure the database management system you choose meets your current and future schema evolution needs.
Consider a situation where an organization stores messages that adhere to one of the major XML standards, like HL7 or FpML. Industry standards are evolving, with new versions of those standards being made available over time. Moving to a new version of an XML standard usually means also moving to a new–and hopefully compatible–XML schema.
If the new version of the schema is compatible with the old version of the schema, you want to make sure that your database management system moves to the new schema with a minimal amount of disruption. At the very least, it should support the ability to move to this new XML schema without:
- Needing to re-validate all of your existing XML documents [... which would be a pain!]
- Needing to change your existing XML documents [... which would be an even more painful experience]
And if the new version of the schema is not compatible with the old version of the schema, you want to make sure that your database management system supports schema flexibility to handle the situation. This situation is sometimes referred to as incompatible schema evolution.
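The difference between compatible and incompatible schema evolution can be sketched with a toy model in Python. This is not XML Schema validation. Each "schema" here is just a hypothetical pair of required and optional child element names, and the `patient` documents are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy "schemas": sets of required and optional child element names.
V1 = {"required": {"id", "name"}, "optional": set()}
V2 = {"required": {"id", "name"}, "optional": {"email"}}        # adds an optional element
V3 = {"required": {"id", "name", "email"}, "optional": set()}   # makes that element mandatory


def is_valid(doc_xml, schema):
    """Return True if the document's child elements satisfy the toy schema."""
    children = {child.tag for child in ET.fromstring(doc_xml)}
    allowed = schema["required"] | schema["optional"]
    return schema["required"] <= children and children <= allowed


old_doc = "<patient><id>7</id><name>Ann</name></patient>"
new_doc = "<patient><id>8</id><name>Bob</name><email>bob@example.com</email></patient>"

# V1 -> V2 is compatible: documents written against V1 are still valid under V2.
print("old document under V2:", is_valid(old_doc, V2))

# V2 -> V3 is incompatible: documents written against V1 no longer validate.
print("old document under V3:", is_valid(old_doc, V3))
```

In the compatible case, existing documents need neither changes nor re-validation. In the incompatible case, the database has to be able to keep documents from both versions side by side, which is where schema flexibility comes in.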
Again, this is a topic that is often overlooked when evaluating XML storage needs, and one that has proven to be quite troublesome when overlooked.
Quick Start for Persisting XML Standards-Compliant Data
August 18, 2008
XML-based standards have emerged in many industries. For instance, there is ACORD in insurance, FIXML in financial services, NIEM in government, and so on.
Are you evaluating options for persisting standards-compliant XML data? If so, you should know about a great resource. As you know, you can freely download IBM DB2, which is a data server for both relational and XML data. Well, IBM has also made available working demos for a number of XML standards, including ACORD, FIXML, FpML, MISMO, NIEM, OTA, TAX1120, TWIST, UNIFI, and more. The demos show end-to-end XML data exchange, together with data retrieval via RESTful Web services, Atom feeds, and XForms.
You can see the demos for yourself at:
http://services.alphaworks.ibm.com/DB2pureXMLDemo/Demo.html
And you can download sample data and demo scripts at:
http://www.alphaworks.ibm.com/tech/purexml/download
Questions for XML Database Vendors
July 11, 2008
Are you evaluating XML database vendors? If so, here is a list of questions that can help you when you evaluate vendors. Of course, some questions may not apply to your situation. For instance, update capabilities may not be necessary in audit and logging systems. You can weed questions out that do not apply to you.
Performance:
- Ask about the performance when inserting XML data into the repository. I came across one customer who unfortunately went with a vendor that could not keep up with their database ingest needs and had to switch to IBM. XML query performance alone is often not a sufficient measure. In some systems you may have to use heavy indexing for good query performance, but these indexes then lead to significant overhead for insert, update, and delete. In fact, you should also probably ask about the overhead incurred when working with indexes.
- Ask about the query performance. Different types of queries have vastly different characteristics, so make sure that the performance proof points they give match your situation. For instance, are the performance proof points using the same kind of data you use, are they at the same granularity as your typical queries, and are they working across a similar data set to yours?
- Ask if their XML performance proof points are publicly available with sufficient detail about the test data, workload, hardware used, etc. to verify their claims.
- Ask about any restrictions there are on their XML indexes. Can all data types be indexed?
- Ask about the scalability limits for the database.
- Ask for proof points on the reliability of the database.
Schema Support
- If you work with multiple XML schemas, ask about their schema handling capabilities. For instance, do you need it to support multiple schemas? Or do you need it to support different versions of the same schema in the same column? It is important to clarify these requirements up-front.
- Schema updates are inevitable. Ask what is involved when you have new versions of a schema. Is it a seamless experience, or does it require a significant migration effort?
- In fact, I would recommend validating that the vendor can work with your schema up-front, especially if it is complex. I have heard of situations where people have had issues with other vendors in this regard.
- Or perhaps you have schema-less documents. If so, do all of their features support such documents?
Language Support
- If you plan to use SQL, make sure that their SQL/XML function can meet your needs.
- Similarly, if you plan to use XQuery, make sure their XQuery implementation can meet your needs.
- Confirm whether XQuery is embedded in SQL, or whether XQuery can be used standalone and via an API. This may be important to you.
- Are XML updates important to you? If so, ask about their support for update capabilities. And ask if there are any limitations in their update support. I understand that certain vendors have limitations in this regard. In particular, you will want to ask if they support the XQuery Update Facility that is being standardized by the W3C.
Miscellaneous
- Ask if there is a no-charge version of the database for pilot projects. XML features can look wonderful on paper but may be more difficult to use than you expect. Limitations in functionality and usability are not obvious from documentation and white papers, but are revealed when you start doing hands-on work with your own XML data.
- If you need your reports and applications to also work with legacy relational data, ask how easy it is to work with XML and legacy relational data.
- Do you need to work with digital signatures? If so, confirm that the digital signatures can be validated against retrieved documents.
- Ask to what extent they are standards-compliant.
- If high availability is important, ask if they offer high availability features.
Thanks to Matthias Nicola for his help in compiling this list. And good luck with your selection process. Cheers!
Viral Video - Wednesday
July 1, 2008
And now, the third installment. I hope you enjoy…
XQuery versus SQL/XML
June 26, 2008
XQuery and SQL/XML are two standards-based languages for retrieving information from XML. Many XML storage vendors support both standards. However, as is typical for standards implementation, those vendors have varying degrees of support for the standards.
Recently, some people asked me whether XQuery or SQL/XML will win the XML retrieval wars. This question surprised me. You see, I believe there is a valid need for products to support both.
XQuery is a W3C Recommendation. It is supported by vendors like IBM, Oracle, and Microsoft. The language includes features like variables, data types, operators, conditional expressions, and functions. It uses XPath expressions to select information from XML. So XQuery is, in essence, a new language for many people to learn.
SQL/XML, on the other hand, is a set of extensions to SQL. It consists of the XML data type, a collection of XML publishing functions, conversion functions, schema validation functions, and more. SQL/XML was developed by INCITS H2.3, with participation from Oracle, IBM, Microsoft, and others. So SQL/XML is, in essence, an extension to an existing language.
There is a valid need for both.
Some developers are already comfortable with XML development. These developers will easily adapt to XQuery. Also, in many circumstances, XQuery offers developers a strong combination of programming power and ease of use. Finally, XQuery offers strong performance for many XML tasks. However, each implementation of XQuery and its accompanying database is different, so please verify your vendor’s performance in this regard.
SQL/XML, on the other hand, is ideal for environments where developers are comfortable with SQL programming. When you consider the maturity of the SQL language, together with the strong API support and domain knowledge for query optimization, you realize that SQL/XML is ideal for certain environments. It may take more effort to code certain XML tasks in SQL/XML, but that may be acceptable in some environments.
So I believe that certain environments will favor one language over the other, that there is no reason why both languages can’t exist, and that it is good to allow people to choose the language that suits them best. What do you think?
Learn to Use XML with Databases and win Prizes!
June 13, 2008
The International DB2 Users Group (IDUG) recently announced the Search for the XML Superstar contest. IDUG is an independent, not-for-profit, user-run organization that promotes the effective use of the DB2 family of products. Naturally, the focus of the contest is to teach people about databases and the storage and retrieval of XML data.
The contest consists of education, followed by a series of quizzes to test your understanding. It then proceeds to a full-blown programming contest. There is also a video contest. Cool prizes include laptops, Nintendo Wii, and the Segway i2 personal transport system!
If nothing else, this is a great opportunity to learn about XML and get some goodies at the same time. Not all details of the contest appear to be available as I write this post. The organizers say they will post them to the Channel DB2 Web site soon.
My name is Conor O'Mahony. I lead XML product strategy for Data Management at IBM. These postings are my opinions and do not necessarily represent IBM’s positions, strategies, or opinions.