Why Won’t Oracle Publish XML Benchmark Results for TPoX?

July 8, 2009

Why won’t Oracle publish results for the Transaction Processing over XML (TPoX) benchmark?

We know that Oracle has implemented TPoX demonstration and test systems. Oracle has demonstrated TPoX systems at their conferences. Also, Oracle has included TPoX tests and data in their research efforts and as part of their X-Files demonstration. So we know that Oracle has used TPoX. Why won’t they publish benchmark results?

Oracle claims that the TPoX benchmark is narrowly scoped and that it doesn’t handle the diverse use cases of XML. They are correct in that TPoX does not model multiple scenarios. It models only one scenario… a security trading scenario that uses a real-world XML Schema (FIXML). Such a scenario involves a high volume of relatively small XML documents. The benchmark takes into account write, update, delete, indexing, XML schema, logging, concurrency, and other database considerations. While the TPoX benchmark does indeed model only one scenario, it makes sure to incorporate a real-world mix of XML-related operations for that scenario.
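To make the idea of a "mix of operations" concrete, here is a minimal sketch of how a workload driver might pick operations according to fixed weights. This is not the TPoX workload driver, and the operation names and percentages below are hypothetical values chosen purely for illustration. They are not the official TPoX transaction mix.

```python
import random
from collections import Counter

# Hypothetical operation weights (percent). These are illustrative
# assumptions, NOT the official TPoX transaction mix.
HYPOTHETICAL_MIX = {
    "insert_order": 30,
    "update_account": 20,
    "delete_order": 10,
    "query_security": 40,
}


def pick_operations(mix, count, seed=0):
    """Draw `count` operation names with probability proportional to weight."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    names = list(mix)
    weights = [mix[name] for name in names]
    return rng.choices(names, weights=weights, k=count)


def summarize(operations):
    """Return the observed share of each operation as a fraction of the total."""
    counts = Counter(operations)
    total = len(operations)
    return {name: counts[name] / total for name in counts}


if __name__ == "__main__":
    ops = pick_operations(HYPOTHETICAL_MIX, 10_000)
    for name, share in sorted(summarize(ops).items()):
        print(f"{name}: {share:.1%}")
```

Over a long enough run, the observed shares approach the configured weights. That is what lets a single-scenario benchmark still exercise inserts, updates, deletes, and queries in realistic proportions.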

Database benchmarks are always focused on a specific usage scenario, and TPoX is no exception. Relational database benchmarks have always taken the same approach: TPC-C focuses on OLTP systems, TPC-W on web-based transaction systems, TPC-H on ad-hoc decision support systems, TPC-R on decision support systems with precomputed and materialized views. There are database benchmarks that focus on SAP workloads, and so on. The reason for this approach is that combining all these diverse use cases into a single benchmark would lead to a test scenario that does not represent anything in the real world. In the same spirit, TPoX focuses on just one of various common XML use cases. Other XML benchmarks that focus on other use cases, such as XML content and full-text search, are also desirable but yet to be defined.

TPoX is entirely open-source (with major contributions from Intel and IBM). In TPoX 1.3, contributors from the University of Furtwangen in Germany added initial support for Oracle Database and Microsoft SQL Server. In particular, they adjusted the TPoX queries to support Oracle Database and SQL Server syntax, and they extended the TPoX workload driver so that it connects to Oracle Database and Microsoft SQL Server. Anybody, including Oracle, is welcome to enhance, revise, or modify the TPoX benchmark as they deem appropriate for meaningful benchmarking.

The TPoX benchmark is a useful measuring stick for the many organizations that have transactional systems with small XML documents. I am amused that Oracle, on the one hand, continually highlights the need for separately handling the diverse XML use cases, and then, on the other hand, complains that TPoX handles only one use case and not a diverse range of use cases. Don’t they realize that they are contradicting themselves? :-)

Oracle also claims that TPoX attempts to follow the Transaction Processing Performance Council (TPC) approach, and that the TPC approach deviates from production system workloads. It is true that many people, including myself, consider some of the TPC benchmarks to have flaws. However, they still serve a purpose for people who are evaluating database options. Although the benchmarks are not a direct indication of performance in an end user’s environment, they are still a useful tool for indicating relative performance.

I am not aware of any alternative XML benchmarks proposed by Oracle. If Oracle has an XML benchmark that they believe is better, it would be great for everyone in the industry if they would bring it forward. Everyone would benefit from such a move, especially the people who are trying to evaluate their XML storage options. I am curious to know why Oracle hasn’t published any benchmark results for its XML capabilities, and instead focuses its efforts on debates that are difficult to resolve. IBM has published both TPoX benchmark results and internal benchmark results. When is Oracle going to step up to the plate?

Oracle has long claimed that the fact that Oracle Database has multiple different ways to store XML data is an advantage. At last count, I think they have something like seven different options:

  • Unstructured
  • XML-Object-Relational, where you store repeating elements in CLOBs
  • XML-Object-Relational, where you store repeating elements in VARRAY as LOBs
  • XML-Object-Relational, where you store repeating elements in VARRAY as nested tables
  • XML-Object-Relational, where you store repeating elements in VARRAY as XMLType pointers to BLOBs
  • XML-Object-Relational, where you store repeating elements in VARRAY as XMLType pointers to nested tables
  • XML-Binary

Their argument is that XML has diverse use cases and you need different storage methods to handle those diverse use cases. I don’t know about you, but I find this list to be a little bewildering. How do you decide among the options? And what happens if you change your mind and want to change storage method?

But back to my original question… Why won’t Oracle publish results for the TPoX benchmark? Perhaps it is because Oracle is still trying to figure out which of its seven storage options is best to use :-)

3 Responses to “Why Won’t Oracle Publish XML Benchmark Results for TPoX?”

  1. Steve OHearn Says:

    I’m curious – have you ever tried to run TPoX on Oracle and see what the results are? I for one would love to see you all do this.


  2. Hi Steve, as far as I know Oracle has license restrictions that would prevent anyone from publishing performance results for Oracle. It would have to be Oracle themselves to publish TPoX results. It seems that Oracle has been experimenting with TPoX, as you can see in section 7.2 of this Oracle paper:
    http://www.vldb.org/pvldb/1/1454177.pdf

    There the authors write: “In practice, we have seen that a more realistic data centric XML use-case is that of a large collection of moderately sized XML documents. TPOX [60] models such XML use-cases.”

    I agree with this statement, as it underlines that TPoX is a useful benchmark for a certain class of data-centric XML applications.


  3. [...] this post from IBM for more Oracle-poking on the complexity of storage options available. Excerpt: Oracle has [...]

