September 29, 2011
Undeniably, XML is the message format of choice for many service-oriented architectures and application integration efforts. Also, many SOA and web service implementaions use REST as the protocol for accessing URL-addressable resources and services. REST stands for Representational State Transfer and is built on top of HTTP, which acts as the underlying transport layer.
With a new set user-defined functions (UDFs) in DB2, it has become very simple to issue REST requests directly from your SQL statements or DB2 stored procedures. This enables DB2 to easily interact with REST-based services and integrate information from the web or URL-based resources into the database.
The new REST UDFs allow you to receive and provide information in binary format (BLOB) or textual format (CLOB), which includes but is not limited to XML. The basic REST UDFs are scalar functions that perform the simple HTTP operations GET, POST, PUT, and DELETE. These UDFs are:
Each of these functions take a URL and, optionally, an HTTP header as input paramaters. Additionally, the POST and PUT functions have a third parameter for the BLOB or CLOB data that you want to upload.
The REST UDFs are implemented in Java and support HTTP as well as HTTPS with SSL encryption.
When you use these UDFs in SQL statements to receive information, you can:
- insert the information into DB2 tables
- use the information as a parameter to search for related data in DB2
- join the information with existing data in DB2
- process the information in a stored procedure
- or use the information in any other way that SQL provides
In particular, if the information is in XML format you can apply any DB2 pureXML functions to it, such as:
- the XMLQUERY function to extract pieces from the XML message
- the XMLTABLE function to split the XML into smaller XML pieces
- the XMLTABLE function to shred the XML into relational tables
- the XMLVALIDATE function to validate the XML against an XML Schema
If you are interested and want to try this for yourself, read the following article:
“Accessing HTTP and RESTful services from DB2: Introducing the REST user-defined functions for DB2”
This article describes the REST functions for DB2 in more detail and presents several concrete usage examples. The UDFs themselves are available for download at the bottom of this article.
Enjoy your REST !
September 12, 2011
How much protein is in a steak? Well, a 6-ounce steak can contain about 38 grams worth of protein.
How much protein does salmon contain? Approximately 34 gram of protein in a 6 ounce piece of salmon.
How much protein can you find in lentils? A cup of cooked lentils has 18 rams of protein.
And how much protein can you find in DB2 pureXML? Tons! Hundreds of Gigabytes and soon over a 1TB of proteins!
It’s true, DB2 pureXML is an excellent source for proteins. DB2 can store, compress, and index the proteins so that you always find the right ones quickly when you need some.
What distinguishes DB2 from a steak is that DB2 stores all proteins in XML or in a hybrid XML/relational format! This enables members of the biological research community to analyze and compare the structure and composition of protein molecules. The findings can help them explain diseases, develop new drugs, or understand the interactions between different proteins.
The protein data is publicly available from the Protein Data Bank (PDB) which is world-wide repository of structural protein data. To facilitate data exchange and flexibility, the data is available in XML.
Typically one XML document describes a single protein molecule. Such a document includes detailed information about all the atoms that the protein consists of, which can be hundreds or sometimes thousands of atoms. And to represent the structure of the protein, the 3-dimensional spatial coordinates of each atom are included as well, resulting in XML documents that can be hundreds of MB in size to describe a single protein.
Searching and analyzing such amounts of complex information is a challenge. To tackle this challenge, researchers in Germany decided to harness the hybrid XML/relational capabilities of DB2 to efficiently store and query the protein data. I was happy to assist them to get the most out of DB2 pureXML.
The article “Managing the Protein Data Bank with DB2 pureXML” describes the database design and optimization that facilitate protein data analysis in a scalable manner. Even if you’re not a biologist (I’m not!), the article is an interesting cases study in how a real-world data management problem has been mapped to the hybrid features of the DB2 database system.