Tuesday 16 February 2010

Benchmarking Infinispan and other Data Grid software

Why benchmarking?
Benchmarking is an important aspect for us: we want to monitor our performance improvements between releases and compare ourselves with other products as well. Benchmarking a data grid product such as Infinispan is not a trivial task: one needs to start multiple processes over multiple machines, coordinate between them to make sure everything runs at once and centralize reports. Then there is the question of what access patterns the benchmark should stress.

Introducing the cache benchmarking framework (CBF)
What we've come up with is a tool to help us run our benchmarks and generate reports and charts. And more:
- simple to configure (see config sample bellow)
- simple to run. We supply a set of .sh scripts that connect to remote nodes and start cluster instances for you.
- open source. Everybody can download it, read the code and run the benchmarks by themselves. Published results can be easily verified and validated.
- extensible. It's easy to extend the framework in order to benchmark against additional products. It's also easy to write different data access patterns to be tested.
- scalable. At this moment we've used CBF for benchmarking up to 62 nodes.
- users can test products, configurations, and access patterns on their own hardware and network. This is crucial, since it means educated decisions can be made based on relevant and use-case specific statistics and measurements. Further, the benchmark can even be used to compare performance of different configurations and tuning parameters of a single data grid product, to help users choose a configuration that works best for them

Below is a sample configuration file and generated report.

<bench-config>

<master bindAddress="${127.0.0.1:master.address}" port="${2103:master.port}"/>

<benchmark initSize="2" maxSize="${4:slaves}" increment="1">
<DestroyWrapper runOnAllSlaves="true"/>
<StartCluster/>
<ClusterValidation partialReplication="false"/>
<Warmup operationCount="1000"/>
<WebSessionBenchmark numberOfRequests="2500" numOfThreads="2"/>
<CsvReportGeneration/>
</benchmark>

<products>
<jbosscache3>
<config name="mvcc/mvcc-repl-sync.xml"/>
</jbosscache3>
<infinispan4>
<config name="repl-sync.xml"/>
<config name="dist-sync.xml"/>
<config name="dist-sync-l1.xml"/>
</infinispan4>
</products>

<reports>
<report name="Replicated">
<item product="infinispan4" config="repl-sync.xml"/>
<item product="jbosscache3" config="mvcc/mvcc-repl-sync.xml"/>
</report>
<report name="Distributed">
<item product="infinispan4" config="dist-*"/>
</report>
<report name="All" includeAll="true"/>
</reports>

</bench-config>


And this is what a generated charts look like:



Where can you find CBF?
CBF can be found here. For a quick way of getting up to speed with it we recommend the 5 minutes tutorial.

Enjoy!

Mircea



2 comments:

  1. Hi,
    Could you please explain more what the benchmark is doing ?
    I have not found enough information into the 5mn tutorial.
    Thanks.

    ReplyDelete
  2. Where we can find results of benchmarks for different hardware? Maybe it is good idea to create page with people results?

    ReplyDelete