Building a JSON and DB2 pureXML Application
November 9, 2009
JavaScript Object Notation (JSON) is a popular way for servers and clients to exchange information. It uses serialized text to represent objects (or data structures). If you are interested in using JSON when you work with DB2 pureXML, there is a new series of articles you should check out.
These articles teach you how DB2 pureXML can store, manage, and query JSON. The articles include downloadable sample code and step-by-step instructions. They also describe the benefits of using pureXML to store JSON.
- Part 1: Store and Query JSON with DB2 pureXML
- Part 2: Create Universal Services for pureXML that Expose JSON
- Part 3: Coming soon!
Part 3 will focus on the creation of the presentation layer with Open-Social Gadgets that rely on the JSONx Universal Services as a back-end.
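To give a feel for what representing JSON as XML can look like, here is a minimal Python sketch. It is not the JSONx format or the sample code from the articles; the element names (object, array, string, number, boolean, null) and the sample customer data are assumptions chosen purely for illustration.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(value, name=None):
    """Convert a parsed JSON value into an XML element tagged by its JSON type.

    The tag names used here are illustrative assumptions, not a published schema.
    """
    if isinstance(value, dict):
        elem = ET.Element("object")
        for key, child in value.items():
            elem.append(json_to_xml(child, key))
    elif isinstance(value, list):
        elem = ET.Element("array")
        for child in value:
            elem.append(json_to_xml(child))
    elif isinstance(value, bool):
        # Test bool before numbers: in Python, bool is a subclass of int.
        elem = ET.Element("boolean")
        elem.text = "true" if value else "false"
    elif value is None:
        elem = ET.Element("null")
    elif isinstance(value, (int, float)):
        elem = ET.Element("number")
        elem.text = str(value)
    else:
        elem = ET.Element("string")
        elem.text = str(value)
    if name is not None:
        # Object members keep their JSON key as a "name" attribute.
        elem.set("name", name)
    return elem


# Hypothetical JSON payload, as a client might send it.
doc = json.loads('{"customer": "Ann", "orders": [101, 102], "active": true}')
xml_text = ET.tostring(json_to_xml(doc), encoding="unicode")
print(xml_text)
```

Once JSON is in an XML form along these lines, it can be stored in an XML column and queried like any other XML document, which is the general idea the articles explore.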
IBM Donates Insurance Models to ACORD
November 5, 2009
The adoption of XML-based standards in many industries has had a significant impact on native XML database adoption. After all, the processing efficiencies offered by a native XML database like DB2 pureXML are vital as systems scale up and out. Readers of the blog will also know that IBM has made it easy for DB2 users to work with many XML-based industry standards by freely distributing working sample implementations of these standards.
Of course, IBM supports such standards by making its industry experts available for participation in many of the standards committees. This week IBM further demonstrated its support for the Association for Cooperative Operations Research and Development (ACORD) and its insurance and financial services standards by donating technology assets including its Insurance Application Architecture (IAA) Business Object/Data Model and the IAA Product Specification Diagram (PSD) to ACORD.
This represents a significant contribution of intellectual property associated with IBM’s insurance industry models. The donated models are from IBM’s Insurance Application Architecture (IAA), the company’s insurance business and IT architecture framework. IAA is continually developed by IBM in collaboration with 100 leading insurance companies around the world, and the IAA model has been licensed by more than 200 insurers worldwide. IAA Business Object/Data Model, coupled with the existing ACORD Information Model, will form the building blocks for the next version of the ACORD Information Model.
This announcement builds on efforts to encourage collaboration and drive innovation in the insurance industry. The adoption of standardized business processes and models will make it easier for insurers to work with agents, brokers and other data partners.
You can read more about this announcement at IBM Contributes Technology Assets to Help Drive Standards in Insurance Industry. You can read more about implementing ACORD standards in DB2 pureXML at Storing and Retrieving ACORD Data for Insurance.
Electronic Forms Using Adobe PDF and IBM DB2 pureXML
November 2, 2009
Bryan Patterson has written an excellent step-by-step article about creating an electronic forms solution on Adobe Developer Connection. The article guides you through creating a simple yet powerful eForms solution based on Adobe LiveCycle Designer ES and IBM DB2 pureXML. The article includes links to download all software needed to automate:
- The collection of user data using electronic forms
- The transmission of user data using XML, and
- The storage of user data using a database.
Because all parts of the solution use XML, there is no need for complex data mapping or conversion steps between components. The XML data format used in this example implementation is a very simple structure but you can easily expand the format to meet specific needs or even base the format on one of the many XML-based industry standards for data exchange such as NIEM for government, ACORD for insurance, or FIXML for financial markets. You can read more at Creating an XML electronic forms solution with an Adobe PDF form and IBM DB2 pureXML.
DB2 pureXML at the IOD Conference
October 23, 2009
The IBM Information on Demand conference is upon us again. It is being held next week in Las Vegas. Once again, if you are interested in native XML databases, there is an exciting line-up of activities and sessions. Here are some of the highlights:
- Ask The Experts
- You can freely schedule 1×1 sessions with DB2 pureXML experts from IBM.
- Business Track
- How Verizon Business Streamlined their Order Management System, featuring Andrew Washburn (Verizon)
- How BJC HealthCare is Using IBM DB2 pureXML to Improve Medical Research, featuring Tom Holdener (BJC Healthcare)
- Technical Track
- Developing DB2 pureXML Applications with COBOL, Java, and .NET: Techniques and a Use Case, featuring Kal Mirza (Barclays Wealth)
- Implementing an Enterprise Order Database with DB2 pureXML at Verizon, featuring Andrew Washburn (Verizon)
- Taking XML to the Data Warehouse with Intel and DB2, featuring Agustin Gonzalez (Intel)
- Querying Large Medical Data Sets using IBM DB2 pureXML, featuring Tom Holdener (BJC Healthcare)
- XML and DB2 pureXML for Beginners
- SOA and IBM DB2 pureXML: The Role of DB2 in an Innovative Architecture
- IBM DB2 pureXML in DB2 for z/OS: Exciting Enhancements and New Features
- Customer Experiences and Case Studies of IBM DB2 pureXML in DB2 9 for z/OS
- Hands on Labs
- Introduction to IBM DB2 pureXML
- New and Advanced Features of IBM DB2 pureXML
The Patient-Centered Medical Home
September 8, 2009
I am enjoying several “email conversations” after last week’s Electronic Health Records for Smarter Healthcare Webinar. In one of those threads, Paul Grundy is sharing his experiences with the patient-centered medical home movement. (You can find Paul blogging at Healthnex.) The patient-centered medical home involves leveraging technology to improve the efficiency and quality of patient care. It is a great example of transforming healthcare from the ground-up, and something I would love to see in my community because I genuinely believe that it would improve the quality of healthcare my family receives. Of course, XML standard-based formats and technologies make this movement possible. You can see great videos about patient-centered medical home at:
The organization promoting this concept is the Patient-Centered Primary Care Collaborative. It is a coalition of more than 600 major employers, consumer groups, organizations representing primary care physicians, and others who have joined to advance the concept of a patient-centered medical home. I genuinely hope that their efforts bear fruit.
Electronic Health Records for Smarter Healthcare
August 24, 2009
Electronic Health Records are a hot topic at the moment. The US federal government has set aside $19 billion in an economic stimulus package to create an electronic health record for every American by 2014. The government is not only using incentives to encourage adoption; they are also using penalties. Between using the carrot and the stick, the US federal government is determined to bring this wave of technology into mass adoption in the healthcare industry.
Next week, I will join Robert Abate to deliver an ‘Espresso Webcast’ about the advantages of implementing standards-based infrastructure for Electronic Health Records (EHRs) and Electronic Medical Records (EMRs). We will also discuss the considerations you need to be aware of as you work with the infrastructure for electronic health systems. Espresso Webcasts are slightly shorter than typical Webcasts, lasting about 35 minutes or so.
The Webcast will be on Tuesday 01 September at 12pm ET. You can register at Electronic Health Records for Smarter Healthcare.
I have previously covered how DB2 9.7 supports native XML data with hash partitioning (database partitioning), range partitioning (table partitioning), and multi-dimensional clustering. These new features make it feasible to analyze information in native XML format, side-by-side with relational data, in a data warehouse. And, of course, being able to work with native XML data in such scenarios offers many efficiencies and advantages.
In reality, many data warehouse projects involve pulling different types of information from disparate data sources around an organization. My colleagues have published the first in a series of two articles that provide step-by-step instructions for integrating information from such disparate sources into a data warehouse. The first of those articles is now available at IBM InfoSphere DataStage and DB2 pureXML, Part 1: Integrate XML operational data into a data warehouse.
This article tells you how to use IBM® InfoSphere™ DataStage to extract and transform XML data managed by DB2® pureXML®. It also explores how DataStage can load this data into a table with traditional SQL data types, and a table with both relational and XML columns. The article includes sample scripts and data that you can download.
The second part of this article series will explore another important scenario: using DataStage to read information from a flat file, convert the data into an XML format, and load this XML data into a data warehouse that contains a table with a DB2 pureXML column.
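As a rough illustration of the flat-file-to-XML step described above, here is a small sketch in plain Python. It does not use DataStage, and the column names, the order element layout, and the sample rows are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical flat-file content; a real job would read this from disk.
FLAT_FILE = "id,name,amount\n1,Ann,25.50\n2,Bob,13.00\n"


def rows_to_xml(text):
    """Turn each delimited row into one small XML document (as a string)."""
    docs = []
    for row in csv.DictReader(io.StringIO(text)):
        order = ET.Element("order", id=row["id"])
        ET.SubElement(order, "name").text = row["name"]
        ET.SubElement(order, "amount").text = row["amount"]
        docs.append(ET.tostring(order, encoding="unicode"))
    return docs


xml_docs = rows_to_xml(FLAT_FILE)
for doc in xml_docs:
    print(doc)
```

Each resulting string could then be inserted into a table that has an XML column, which is the loading step the second article is expected to cover.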
DB2 pureXML Cookbook – 45% Discount
August 13, 2009
In the past, I discussed the DB2 pureXML Cookbook. This book is very valuable for all DB2 pureXML users, from novice through expert.
If you are interested in buying this book, please be aware that International DB2 User Group (IDUG) members get a 45% discount on IBM Press books. IDUG membership is free. For information about how to get the discount, visit the following Web page: http://www.idug.org/public-spotlights/45-book-discount.html
Also, as a special promotion, IDUG have a competition where they are giving away 3 copies of this book as prizes. For information about entering to win a free copy of the book, visit the following Web page: http://www.idug.org/public-spotlights/free-db2-book.html
SOA Projects: IBM DB2 versus Oracle Database
August 7, 2009
If you are implementing a SOA environment, Solitaire has a very interesting finding for you. Solitaire authored a whitepaper where they analyze database operations at more than 4,100 production systems. As part of their analysis of database operations on IBM System p, they looked at the correlation between the success rate of SOA projects and the choice of database software.
To classify a SOA project as successful, they asked the organization if they now enjoy a 25% or more increase in resource utilization and a 30% or more increase in the speed of provisioning. Here is a chart that shows the relative success rates for SOA projects that involve IBM DB2 and Oracle Database. Solitaire do not say why DB2 does so much better. Perhaps DB2’s superior native XML storage is a factor?
You can read the full Solitaire Report at Whitepaper: DB2 Performance on IBM System p® and System x®.
Almost 24 million people in the US are diagnosed with diabetes. If you know someone with diabetes, you know about the hassles that constant monitoring imposes. MyCareTeam and IBM have collaborated to improve continuous monitoring in such situations, with a solution that both reduces costs and improves the quality of healthcare. I am particularly interested in this collaboration because it involves the use of XML data. IBM and MyCareTeam have written a great paper that covers a number of topics that will be of interest to those in the diabetes and healthcare technologies fields. For instance, there is information about the use of technologies like XML storage and Web services in the context of continuing care. There is also information about related initiatives such as the Continua Health Alliance’s role in selecting appropriate standards. You can read more at Healthcare in the Home: Continuing Care for Diabetes with Collaborative Technologies.
Article about XML for Healthcare
July 17, 2009
The latest issue of IBM Database Magazine has an interesting article titled Healthcare’s XML Heartbeat. In this article, Ken North describes the rise and rise of XML in the healthcare industry. He talks about the key role that XML is playing in the emergence of electronic medical records, the efficient exchange of information, and increasing levels of interoperability. The article gives great insight into the XML-based electronic medical records environment at UCLA Health System.
Why won’t Oracle publish results for the Transaction Processing over XML (TPoX) benchmark?
We know that Oracle has implemented TPoX demonstration and test systems. Oracle has demonstrated TPoX systems at their conferences. Also, Oracle has included TPoX tests and data in their research efforts and as part of their X-Files demonstration. So we know that Oracle has used TPoX. Why won’t they publish benchmark results?
Oracle claims that the TPoX benchmark is narrowly scoped and that it doesn’t handle the diverse use cases of XML. They are correct in that TPoX does not model multiple scenarios. It models only one scenario… a security trading scenario that uses a real-world XML Schema (FIXML). Such a scenario involves a high volume of relatively small XML documents. The benchmark takes into account write, update, delete, indexing, XML schema, logging, concurrency, and other database considerations. While the TPoX benchmark does indeed model only one scenario, it makes sure to incorporate a real-world mix of XML-related operations for that scenario.
Database benchmarks are always focused on a specific usage scenario, and TPoX is no exception. Relational database benchmarks have always taken the same approach: TPC-C focuses on OLTP systems, TPC-W on web-based transaction systems, TPC-H on ad-hoc decision support systems, TPC-R on decision support systems with precomputed and materialized views. There are database benchmarks that focus on SAP workloads, and so on. The reason for this approach is that combining all these diverse use cases into a single benchmark would lead to a test scenario that does not represent anything in the real world. In the same spirit, TPoX focuses on just one of various common XML use cases. Other XML benchmarks that focus on other use cases, such as XML content and full-text search, are also desirable but yet to be defined.
TPoX is entirely open-source (with major contributions from Intel and IBM). In TPoX 1.3, contributors from the University of Furtwangen in Germany have added initial support for Oracle Database and Microsoft SQL Server. In particular, they adjusted the TPoX queries to support Oracle Database and SQL Server syntax, and they have extended the TPoX workload driver so it connects to Oracle Database and Microsoft SQL Server. Anybody, including Oracle, is welcome to enhance, revise, or modify the TPoX benchmark as they deem appropriate for meaningful benchmarking.
The TPoX benchmark is a useful measuring stick for the many organizations who have transactional systems with small XML documents. I am amused that Oracle, on the one hand, continually highlights the need for separately handling the diverse XML use cases, and then, on the other hand, complains that TPoX handles only one use case and not a diverse range of use cases. Don’t they realize that they are contradicting themselves?
Oracle also claims that TPoX attempts to follow the Transaction Processing Performance Council (TPC) approach, and that the TPC approach deviates from production system workloads. It is true that many people, including myself, consider some of the TPC benchmarks to have flaws. However, they still serve a purpose for people who are evaluating database options. Although the benchmarks are not a direct indication of performance in an end user’s environment, they are still a useful tool for indicating relative performance.
I am not aware of any alternative XML benchmarks proposed by Oracle. If Oracle has an XML benchmark that they believe is better, it would be great for everyone in the industry if they would bring it forward. Everyone would benefit from such a move, especially the people who are trying to evaluate their XML storage options. I am curious to know why Oracle hasn’t published any benchmark results for its XML capabilities, and instead focuses its efforts on debates that are difficult to resolve. IBM has published both TPoX benchmark results and internal benchmark results. When are Oracle going to step up to the plate?
Oracle has long claimed that the fact that Oracle Database has multiple different ways to store XML data is an advantage. At last count, I think they have something like seven different options:
- Unstructured
- XML-Object-Relational, where you store repeating elements in CLOBs
- XML-Object-Relational, where you store repeating elements in VARRAY as LOBs
- XML-Object-Relational, where you store repeating elements in VARRAY as nested tables
- XML-Object-Relational, where you store repeating elements in VARRAY as XMLType pointers to BLOBs
- XML-Object-Relational, where you store repeating elements in VARRAY as XMLType pointers to nested tables
- XML-Binary
Their argument is that XML has diverse use cases and you need different storage methods to handle those diverse use cases. I don’t know about you, but I find this list to be a little bewildering. How do you decide among the options? And what happens if you change your mind and want to change storage method?
But back to my original question… Why don’t Oracle publish results for the TPoX benchmark? Perhaps it is because Oracle are still trying to figure out which of their seven storage options is best to use?
Building an XML-Based Electronic Forms Solution
June 16, 2009
Many organizations are putting their forms “online”. If you are working on an electronic forms project, I’d like to let you know about a couple of useful resources that my colleague Bryan Patterson has been busy creating:
- A step-by-step tutorial that shows you how to create an electronic forms-based solution. You don’t need to purchase any software to get this demo working in your environment. It uses the trial version of Lotus® Forms to create and manage the online forms; it uses the no-charge version of DB2® Express-C to receive and store the XML data; it uses the no-charge version of IBM Data Studio Developer to create a simple Web service; and it uses the no-charge WebSphere Application Server Community Edition. To see the tutorial, go to Build an intelligent eForms solution based on DB2 pureXML, Lotus Forms, and Web services.
- A video that provides an overview for the above solution and walks through the step-by-step tutorial. To see the video, go to Create an electronic form solution with DB2 pureXML and Lotus Forms.
Make sure to check out the list of resources for both of these… they contain some useful links.
Flirting with Poken
June 9, 2009
It seems like every conference I go to this year has successively stronger ties to virtual networking. There are increasing levels of Twitter activity being shown on giant displays. It’s interesting to watch conference attendees routinely ignore giant displays until they realize that one is showing real-time tweets about the conference, and then stop in their tracks to take in the Twitter stream.
The advent of blogs dedicated to individual sessions is also interesting. These blogs provide an unintimidating venue for people who are not comfortable asking questions in a large room. They also allow a conversation to continue after the conference. However, the traffic to them is quite limited. I imagine that we are all struggling to keep up with the bandwidth demands of these new networking tools.
But my most interesting recent experience was with the Poken given to all conference attendees at the recent IBM Information on Demand conference in Berlin. The Poken allows you to exchange a “digital handshake” with other conference attendees. By touching Pokens, you exchange contact details, including information about your accounts on popular social networking sites like LinkedIn, Twitter, and Facebook. You can also exchange details with a special Poken to get the presentation for the session you are attending.
The first thing I must say is that I was very happy to see a Poken help desk after registering. I did need a little help because initial attempts to exchange information were not succeeding. I was not giving my Poken enough time to exchange details.
The second thing I must say is that, after using the Poken for the entire conference, I still want to give out business cards. You see, after I get a business card, I find a few minutes to write a few notes on that business card. This way I have some additional context when I return from a conference with another stack of business cards. Thankfully I continued to do this at the conference because all I get with the Poken is a sequential list of new Poken friends. I can see their profile, including a photo if they uploaded one. But it does not give me enough context for a follow-up (unless they are especially memorable).
So, in my opinion, the Poken is a useful addition to the business card, but it will not replace it. You may argue that I could simply write some notes elsewhere and keep track of them. But there is something very easy about writing a few key words on the back of a business card. For me the best solution would be to use my smartphone to easily “poke” people. Then, if I could add notes to newly acquired “business cards” on my smartphone, I could truly consider replacing the business card.
DB2 pureXML for Dummies—Get Your Copy!
June 3, 2009
If you want to learn more about native XML databases and DB2 pureXML, this eBook uses the fun and easy-to-understand “for Dummies” format to do just that. It introduces these topics, while guiding you to create your first native XML database. You don’t need to purchase any software to get started creating native XML databases—you simply use the freely available version of DB2. To get your copy, download DB2 pureXML for Dummies. Make sure to download it today as there are a limited number of free downloads available.
Matthias Nicola on XML in the Data Warehouse
May 28, 2009
Here is a short video showing Matthias Nicola speaking about XML in the data warehouse at the IDUG conference. He talks about the new features in DB2 that support native XML data in data warehouse environments. Apologies for the choppy nature of the video. It was taken by hand with my inexpensive pocket camcorder. You can click on the HQ button in the YouTube viewer to see the higher quality version.
Attendees of the upcoming Information on Demand conference in Berlin will get a Poken with their conference badge. Poken is a new tool that offers a smart way to network and share data.
When you insert the Poken into the USB port of a computer, you are connected to the Poken website where you fill in your personal data and create a profile, including links to your profiles on social networking sites like Linked-In, Facebook, Twitter, Myspace, and so on. This will then be stored on your Poken.
To network with someone at the conference, simply hold your Poken up to theirs and exchange IDs, creating an electronic handshake. The next time you connect your Poken to a computer and go to your profile page, you will see all the profiles of exchanged IDs together with their Social Networking sites. This enables you to easily stay in contact post-conference. No more dog-eared business cards surfacing weeks later!
Not only that, but you can also use your Poken to facilitate information download. For example, you can easily obtain session information and session presentations. This is my virgin Poken experience at a conference, so I am really curious to see how it works out…
Short Video from the IDUG North America Conference
May 13, 2009
Greetings from the IDUG North America Conference in Denver, Colorado. IDUG is the International DB2 Users Group—an independent, not-for-profit organization for DB2 users by DB2 users. If you are a DB2 user, IDUG provide an invaluable resource. Here is a video showing a few short glimpses from Day 1 of the conference:
DB2 Compresses XML Data by 60% to 80%
May 6, 2009
This continues my series of posts about the new features for working with native XML data in the IBM DB2 database software.
Compression reduces the amount of storage space needed for data. Data storage costs money, so minimizing this cost is very important for many organizations, especially when storage costs can be reduced by 60% to 80%. Storage-related costs include the actual storage devices themselves, the power consumed by those storage devices, and the time spent maintaining these devices.
Another benefit of data compression is that it often improves database performance. Because the data requires less disk space, you typically have reduced levels of disk I/O activity, which can improve database performance. Also, because more data is being cached, you may also enjoy improved buffer pool hit ratios. In many cases, the performance gain due to reduced I/O and better memory utilization outweighs the extra CPU cycles required to compress and decompress the data.
When storing XML data, DB2 typically places the XML data in a location called the XML Data Area (XDA). However, if the XML data is less than 32KB in size, it can be stored with the relational data (this is called inlining).
With DB2 9.5, you can compress XML data that is inlined, allowing you to reduce storage for XML data. For instance, the XML transactions in the TPoX benchmark are typically smaller than 32KB, allowing them to be inlined and compressed. In the most recent TPoX benchmark, one terabyte of raw XML data is stored in 390 gigabytes of storage, a storage saving of 61%.
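The 61% figure follows directly from the two sizes quoted above. Here is the arithmetic as a short Python sketch; it assumes that one terabyte is counted as 1,000 gigabytes, and the function name is just for illustration.

```python
def storage_savings(raw_gb, stored_gb):
    """Return the fraction of storage saved by compression."""
    return 1 - stored_gb / raw_gb


raw_gb = 1000    # one terabyte of raw XML, assuming 1 TB = 1,000 GB
stored_gb = 390  # space used after compression, as quoted above

savings = storage_savings(raw_gb, stored_gb)
print(f"Storage savings: {savings:.0%}")
```

In other words, the compressed data occupies 39% of its original size, so 61% of the space is saved.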
DB2 9.7 extends compression to all XML data, regardless of whether it is in the XDA or inlined. In other words, DB2 9.7 can compress XML data, regardless of size. (The maximum size of an individual piece of XML data that can be stored in DB2 is 2 gigabytes.)
The degree to which XML data can be compressed depends on the nature of the XML data. IBM has tested the new data compression features with six different data sets. Three of these data sets were supplied by IBM clients, and represent real world client usage. The other three data sets represent XML data sets available in the public domain. The data sets include XML documents that range in size from 2KB to 100MB. The following diagram shows the storage savings that have been achieved (this diagram is from Cindy Saracco and Matthias Nicola’s article titled Enhance business insight and scalability of XML data with new DB2 V9.7 pureXML features).

As you can see, compressing XML data typically results in 60 to 80 percent disk space savings with DB2 9.7.
Finally, I’d also like to mention that if you compress XML data, you can also compress any indexes for that XML data. Compressed indexes also reduce physical I/O and increase buffer pool hit ratios, which often leads to a net performance gain.


My name is Conor O'Mahony. I am Program Director, DB2 Product Marketing at IBM. These posts are my opinions and do not necessarily represent IBM’s positions, strategies, or opinions.