Thursday, February 02, 2012

Atomikos distributed transactions scalability

At the moment I'm working on a project where we use distributed transactions. The number of XA resources in our case is not big at all, but I was curious how Atomikos scales when the number of XA resources is large. That is why you are reading this blog post. I've decided to push this pet project to GitHub so that you can test it, play with it, or join the effort: https://github.com/daoway/AtomikosDistributedScalability

What do I have? Atomikos 3.7.1, Spring 3.1 (if you are still using 3.0, consider migrating to 3.1; it's better) and Hibernate. In this particular case I've decided to use embedded in-memory H2 databases to eliminate possible network overhead and to simplify the setup of the underlying farm of XA resources. This is possible because H2 ships an XA-compatible data source implementation. BTW, it can also be used in unit/integration tests instead of an Oracle XA data source, for example to test transactional behavior.

So, I'm playing with an H2 XA army. I have a lot of in-memory H2 database instances, one simple domain object and Hibernate. I want to persist this object across all nodes in a single XA transaction and measure how long it takes. Then I increase the number of XA resources by some delta and repeat the procedure until I get an OutOfMemoryError. Just kidding :) All this stuff is in memory and the amount of RAM is limited, of course; the largest number of XA resources I've played with was 2000.
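The measuring loop described above can be sketched roughly like this. It is only an illustration, not the code from the GitHub project: the class name XaScalingBenchmark, the measure method and the placeholder workload are all made up for this example. In the real test the workload would persist the domain object into n H2 instances inside one XA transaction.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntConsumer;

// Hypothetical harness: names and structure are assumptions, not the project's code.
class XaScalingBenchmark {

    /**
     * Runs the workload for start, start + delta, ... up to max resources
     * and records the wall-clock time of each run in milliseconds.
     */
    static Map<Integer, Long> measure(int start, int delta, int max, IntConsumer workload) {
        if (start <= 0 || delta <= 0 || max < start) {
            throw new IllegalArgumentException("expected 0 < start <= max and delta > 0");
        }
        // LinkedHashMap keeps the results in the order the runs were executed.
        Map<Integer, Long> timings = new LinkedHashMap<>();
        for (int n = start; n <= max; n += delta) {
            long begin = System.nanoTime();
            workload.accept(n); // stand-in for "persist across n XA resources"
            long elapsedMs = (System.nanoTime() - begin) / 1_000_000;
            timings.put(n, elapsedMs);
        }
        return timings;
    }

    public static void main(String[] args) {
        // Placeholder workload: a busy loop instead of a real XA transaction.
        IntConsumer placeholder = n -> {
            long sink = 0;
            for (int i = 0; i < n; i++) {
                sink += i;
            }
            if (sink < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        };
        measure(100, 100, 2000, placeholder)
                .forEach((n, ms) -> System.out.println(n + " resources: " + ms + " ms"));
    }
}
```

Keeping the workload behind an IntConsumer means the same loop can time any variant of the transaction without changes. For serious numbers you would probably also want warm-up runs and several repetitions per resource count.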

Here is the result of my experiment (click on the link to see an Adobe Flex chart with the test results):


As you can see, replication time grows linearly with the number of XA resources participating in the "distributed" transaction. So, at least in this in-memory setup with up to 2000 resources, we can conclude that Atomikos scales linearly.
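If you want to check a "grows linearly" claim against your own measurements instead of eyeballing a chart, an ordinary least-squares fit is one simple option: an R² value close to 1 means a straight line describes the data well. Below is a minimal sketch. The LinearFit class is hypothetical, and the numbers in main are synthetic, not the measurements from this post.

```java
// Hypothetical helper: fits y = slope * x + intercept by ordinary least squares.
final class LinearFit {
    final double slope;
    final double intercept;
    final double rSquared;

    private LinearFit(double slope, double intercept, double rSquared) {
        this.slope = slope;
        this.intercept = intercept;
        this.rSquared = rSquared;
    }

    static LinearFit of(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two arrays of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of squared deviations and of cross products.
        double sxx = 0, syy = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            syy += dy * dy;
            sxy += dx * dy;
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        // For a perfectly flat y the fit is exact, so report 1.0 instead of 0/0.
        double rSquared = (syy == 0) ? 1.0 : (sxy * sxy) / (sxx * syy);
        return new LinearFit(slope, intercept, rSquared);
    }

    public static void main(String[] args) {
        // Synthetic example data (x = number of resources, y = time), made up for illustration.
        double[] resources = {1, 2, 3, 4};
        double[] timeMs = {3, 5, 7, 9};
        LinearFit fit = LinearFit.of(resources, timeMs);
        System.out.println("slope=" + fit.slope
                + " intercept=" + fit.intercept
                + " r2=" + fit.rSquared);
    }
}
```

The slope is then the extra time each additional XA resource costs, and the intercept is the fixed overhead per transaction.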

Thanks for reading. Any comments/suggestions are welcome.

3 comments:

Anonymous said...

Thanks for your post. Seeing how Atomikos deals with a growing number of transactional resources was interesting.
But on a daily basis it might be more interesting to see how Atomikos deals with many concurrent users. This would put the focus on connection pooling. When using XA transaction, database connection can be used in a stateless way by virtue of the XID. So, in theory XA connections might scale better than "plain" connections.

Stas Ostapenko said...

Thanks for your comment. Good idea for the next blog post. Atomikos has its own internal DB connection pool; it would also be interesting to work out how to determine the right pool size for a particular workload.

Anonymous said...

Hi,

Quoting: "When using XA transaction, database connection can be used in a stateless way by virtue of the XID. So, in theory XA connections might scale better than "plain" connections."

This is true in theory, but wrong in practice. Most XA drivers (JDBC and/or JMS) don't work in that stateless kind of way so the intended advantage is pure fiction, alas...

Best,
Guy