Tuesday, 23 February 2010

Infinispan 4.0.0.Final has landed!

It is with great pleasure that I announce the availability of the final release of Infinispan 4.0.0. Infinispan is an open source, Java-based data grid platform that I first announced last April, and since then the codebase has been through a series of alpha and beta releases, and most recently 4 release candidates, which generated a lot of community feedback.

It has been a long and wild ride, and the very active community has been critical to this release. A big thank you to everyone involved, you all know who you are.

Benchmarks
I recently published an article about running Infinispan in local mode - as a standalone cache - comparing it with JBoss Cache and EHCache. The article took readers through the ease of configuration and the simple API, and then demonstrated some performance benchmarks using the recently-announced Cache Benchmarking Framework. We've made further use of this benchmarking framework in recent weeks, extensively testing Infinispan on a large cluster.
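
If you haven't seen the API yet, here is a minimal sketch of Infinispan running in local mode, using the 4.0 API. It relies on default configuration throughout; see the documentation for the full set of options.

    import org.infinispan.Cache;
    import org.infinispan.manager.CacheManager;
    import org.infinispan.manager.DefaultCacheManager;

    public class LocalModeExample {
       public static void main(String[] args) {
          // With no arguments, DefaultCacheManager starts with sensible local-mode defaults
          CacheManager manager = new DefaultCacheManager();
          Cache<String, String> cache = manager.getCache();

          // Cache extends ConcurrentMap, so the API is immediately familiar
          cache.put("hello", "world");
          assert "world".equals(cache.get("hello"));

          manager.stop();
       }
    }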

Here are some simple charts, generated using the framework. The first set compares Infinispan against the latest and greatest JBoss Cache release (3.2.2.GA at this time), using both synchronous and asynchronous replication. But first, a little bit about our test lab, which comprises a large number of nodes, each with the following configuration:
  • 2 x Intel Xeon E5530 2.40 GHz quad core, hyperthreaded processors (= 16 hardware threads per node)
  • 12GB memory per node, although the JVM heaps are limited at 2GB
  • RHEL 5.4 with Sun 64-bit JDK 1.6.0_18
  • InfiniBand connectivity between nodes
And a little bit about the way the benchmark framework was configured:
  • Run from 2 to 12 nodes in increments of 2
  • 25 worker threads per node
  • Writing 1KB of state (randomly generated Strings) each time, with a 20% write percentage

[Charts: read and write performance under synchronous and asynchronous replication, Infinispan vs. JBoss Cache 3.2.2.GA.]
As you can see, Infinispan significantly outperforms JBoss Cache, even in replicated mode. The large gain in read performance, as well as asynchronous write performance, demonstrates the benefits of Infinispan's minimally-locking data container and new marshalling techniques. But you will also notice that with synchronous writes, performance starts to degrade as the cluster size increases. This is a characteristic of replicated caches: you always have fast reads and all state available on each and every node, at the expense of ultimate scalability.
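
For reference, switching between the two replicated setups is essentially a one-line change. Here is a rough sketch using the programmatic Configuration API from 4.0; the timeout value shown is illustrative only.

    import org.infinispan.config.Configuration;

    public class ReplicationModes {
       public static Configuration syncReplication() {
          // Synchronous replication: a put() blocks until all nodes acknowledge the write
          Configuration c = new Configuration();
          c.setCacheMode(Configuration.CacheMode.REPL_SYNC);
          c.setSyncReplTimeout(15000); // illustrative timeout, in milliseconds
          return c;
       }

       public static Configuration asyncReplication() {
          // Asynchronous replication: a put() returns immediately and replicates in the background
          Configuration c = new Configuration();
          c.setCacheMode(Configuration.CacheMode.REPL_ASYNC);
          return c;
       }
    }

You would register one of these with a CacheManager via defineConfiguration(), or express the same choice in the XML configuration file.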

Enter Infinispan's distributed mode. The goal of data distribution is to maintain enough copies of state in the cluster to be durable and fault tolerant, but not so many copies that Infinispan cannot scale, with linear scalability being the ultimate prize. In the following runs, we benchmark Infinispan's synchronous, distributed mode, comparing 2 different Infinispan configurations (a configuration sketch follows the list below). The framework was configured with:
  • Run from 4 to 48 nodes, in increments of 4 (to better demonstrate linear scalability)
  • 25 worker threads per node
  • Writing 1KB of state (randomly generated Strings) each time, with a 20% write percentage
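
As promised, here is a rough sketch of a synchronous distribution setup using the 4.0 Configuration API; numOwners controls how many copies of each entry the cluster keeps, and 2 is the default.

    import org.infinispan.config.Configuration;

    public class DistributionMode {
       public static Configuration distSync() {
          Configuration c = new Configuration();
          // Each entry lives on a fixed number of nodes chosen by consistent
          // hashing, rather than on every node in the cluster.
          c.setCacheMode(Configuration.CacheMode.DIST_SYNC);
          // Two copies per entry (the default): enough for fault tolerance,
          // few enough that adding nodes adds capacity.
          c.setNumOwners(2);
          return c;
       }
    }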

[Charts: read and write performance under synchronous distribution, 4 to 48 nodes.]
As you can see, Infinispan scales linearly as the node count increases. Of the different configurations tested, "lazy" stands for enabling lazy unmarshalling, which allows state to be stored in Infinispan as byte arrays rather than deserialized objects. This has advantages for certain access patterns, for example where remote lookups are very common and local lookups are rare.
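
Enabling the "lazy" variant is a single flag. Programmatically it looks something like the sketch below; the setter name is my best recollection of the 4.0 Configuration API, so treat it as an assumption and check the configuration reference.

    import org.infinispan.config.Configuration;

    public class LazyUnmarshalling {
       public static Configuration distSyncLazy() {
          Configuration c = new Configuration();
          c.setCacheMode(Configuration.CacheMode.DIST_SYNC);
          // Keep entries as byte arrays and only deserialize on access; this is
          // the "lazy" variant benchmarked above. Setter name assumed from the
          // 4.0 Configuration API (XML equivalent: the lazyDeserialization element).
          c.setUseLazyDeserialization(true);
          return c;
       }
    }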


How does Infinispan compare against ${POPULAR_PROPRIETARY_DATAGRID_PRODUCT}?
Due to licensing restrictions on publishing benchmarks of such products, we are unfortunately not at liberty to make such comparisons public - although we are very pleased with how Infinispan compares against popular commercial offerings, and plan to push the performance envelope even further in 4.1.

And just because we cannot publish such results does not mean that you cannot run such comparisons yourself. The Cache Benchmark Framework has support for different data grid products, including Oracle Coherence, and more can be added easily.

Aren't statistics just lies?
We strongly recommend running the benchmarks yourself. Not only does this let you verify the results first-hand, it also allows you to benchmark behaviour on your specific hardware infrastructure, using the specific configurations you'd use in real life, and with your specific access patterns.

So where do I get it?
Infinispan is available on the Infinispan downloads page. Please use the user forums to communicate with us about the release. A full change log of features in this release is on JIRA, and documentation is on our newly re-organised wiki. We have put together several articles, chapters and examples; feel free to suggest new sections for this user guide - topics you may find interesting, or bits you feel we've left out or not addressed fully enough.

What's next?
We're busy hacking away on Infinispan 4.1 features. Expect an announcement soon on this, including an early alpha release for folks to try out. If you're looking for Infinispan's roadmap for the future, look here.

Cheers, and enjoy!
Manik

6 comments:

  1. NB: The labels on the charts suggest an Apple JVM on Mac OS X. This is just the JVM used to generate the charts, not the JVM used to run the benchmarks. Thanks to Sanne Grinovero for pointing this out.

  2. InfiniBand interconnects? My customers won't be able to procure anything faster than 1Gb Ethernet. Can you give us a feeling for how much these benchmarks would suffer with the slower cluster communication speed?

  3. @Jeff that would depend on the cluster size. We've done pretty extensive testing on a small (8-node) cluster with 1Gb Ethernet as well.

    At the end of the day though, your best bet is to download the benchmarking framework and try it out on your own cluster.

  4. Why do read operations in the synchronous distributed cache plummet so significantly when the cluster size is >8? Could you explain please?

  5. @greg for a four-node cluster, 50% of the gets will be local, with no roundtrip involved: there is a 25% chance that the node is the owner of the key (according to the consistent hashing function), plus a 25% chance that it is a backup node. So 50% of the calls are local, and much faster than remote gets. For larger cluster sizes, the percentage of local gets (out of the total number of gets) is 2/(cluster size); i.e. 25% for an 8-node cluster, which is significantly smaller.

  6. This comment has been removed by a blog administrator.
