Thursday, 22 December 2011

Startup performance

One of the things I've done recently was to benchmark how quickly Infinispan starts up.  Specifically looking at LOCAL mode (where you don't have the delays of opening sockets and discovery protocols you see in clustered mode), I wrote up a very simple test to start up 2000 caches in a loop, using the same cache manager.

This is a pretty valid use case, since when used as a non-clustered 2nd level cache in Hibernate, a separate cache instance is created per entity type, and in the past this has become somewhat of a bottleneck.

In this test, I compared Infinispan 5.0.1.Final, 5.1.0.CR1 and 5.1.0.CR2.  5.1.0 is significantly quicker, but I used this test (and subsequent profiling) to commit a couple of interesting changes in 5.1.0.CR2, which have improved things even more - in terms of both CPU performance and memory footprint.

Essentially, 5.1.0.CR1 made use of Jandex to perform annotation scanning of internal components at build-time, to prevent expensive reflection calls to determine component dependencies and lifecycle at runtime.  5.1.0.CR2 takes this concept a step further - now we don't just cache annotation lookups at build-time, but entire dependency graphs.  Lifecycle methods are also determined and ordered at build-time, again making startup significantly quicker while offering a much tighter memory footprint.

Enough talk.  Here is the test used, and here are the performance numbers, as per my laptop, a 2010 MacBook Pro with an i5 CPU.


Multiverse:InfinispanStartupBenchmark manik [master]$ ./bench.sh 
---- Starting benchmark ---


  Please standby ... 


Using Infinispan 5.0.1.FINAL (JMX enabled? false) 
   Created 2000 caches in 10.9 seconds and consumed 172.32 Mb of memory.


Using Infinispan 5.0.1.FINAL (JMX enabled? true) 
   Created 2000 caches in 56.18 seconds and consumed 315.21 Mb of memory.


Using Infinispan 5.1.0.CR1 (JMX enabled? false) 
   Created 2000 caches in 7.13 seconds and consumed 157.5 Mb of memory.


Using Infinispan 5.1.0.CR1 (JMX enabled? true) 
   Created 2000 caches in 34.9 seconds and consumed 243.33 Mb of memory.


Using Infinispan 5.1.0.CR2(JMX enabled? false) 
   Created 2000 caches in 3.18 seconds and consumed 142.2 Mb of memory.


Using Infinispan 5.1.0.CR2(JMX enabled? true) 
   Created 2000 caches in 17.62 seconds and consumed 176.13 Mb of memory.


A whopping 3.5 times faster, and significantly more memory-efficient especially when enabling JMX reporting.  :-)
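For anyone who wants to check the ratios against the output above, here is a small sketch that recomputes them. The class and method names are made up for illustration; only the figures come from the benchmark run.

```java
public class StartupSpeedup {
    /** Ratio of an older release's startup time to a newer release's. */
    static double speedup(double oldSeconds, double newSeconds) {
        return oldSeconds / newSeconds;
    }

    /** Memory saved, as a percentage of the older release's footprint. */
    static double memorySavedPercent(double oldMb, double newMb) {
        return 100.0 * (oldMb - newMb) / oldMb;
    }

    public static void main(String[] args) {
        // Figures copied from the benchmark output above (5.0.1.FINAL vs 5.1.0.CR2).
        System.out.printf("JMX off: %.2fx faster%n", speedup(10.9, 3.18));
        System.out.printf("JMX on:  %.2fx faster%n", speedup(56.18, 17.62));
        System.out.printf("JMX on:  %.1f%% less memory%n",
                memorySavedPercent(315.21, 176.13));
    }
}
```

On these numbers that works out to roughly 3.4x without JMX and 3.2x with JMX, with the JMX-enabled footprint shrinking by a bit over 40%.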


Enjoy!
Manik

Wednesday, 21 December 2011

Infinispan 5.1.0.CR2 is out in time for Xmas!

Infinispan 'Brahma' 5.1.0.CR2 is out now with a load of fixes and a few internal changes, such as the move to a StAX-based XML parser instead of relying on JAXB, which did not make it into CR1. The new parser is a lot faster, has less overhead and does not require any changes from a user perspective.

We've also worked on improving startup time by indexing annotation metadata at build time and reading it at runtime. From an Infinispan user perspective, there have been some changes to how Infinispan is extended, in particular related to custom command implementations, where we now use the JDK's ServiceLoader to load them.
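For readers who haven't used ServiceLoader before, here is a minimal sketch of the lookup mechanism using only the JDK. The CustomCommandFactory interface is a hypothetical stand-in invented for this example; it is not the interface Infinispan actually asks extensions to implement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class CommandExtensionLookup {
    /** Hypothetical extension point, used here only to demonstrate the lookup. */
    public interface CustomCommandFactory {
        String describe();
    }

    /** Collects every implementation that ServiceLoader can find on the classpath. */
    static <T> List<T> loadAll(Class<T> service) {
        List<T> found = new ArrayList<>();
        // ServiceLoader reads META-INF/services/<binary name of the interface>
        // from each jar and instantiates the classes listed there.
        for (T impl : ServiceLoader.load(service)) {
            found.add(impl);
        }
        return found;
    }

    public static void main(String[] args) {
        List<CustomCommandFactory> factories = loadAll(CustomCommandFactory.class);
        System.out.println("Discovered " + factories.size() + " custom command factories");
    }
}
```

With no provider file on the classpath the list is simply empty, so dropping an extension jar in (or taking it out) needs no code change in the host application.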

As usual, downloads are in the usual place; please use the forums to provide feedback and report any issues.

Cheers, Merry Christmas and a Happy New Year to all the Infinispan community! :)
Galder

Tuesday, 6 December 2011

First Infinispan 5.1.0 'Brahma' candidate release is out!

We're getting close to releasing Infinispan 5.1 'Brahma', and today I have the pleasure of announcing Infinispan 5.1.0.CR1, our first release candidate. So, what's in it?
  • Ahead of future eventual consistency support, Infinispan now supports versioned cache entries which means that existing write skew checks on REPEATABLE_READ caches can be done more accurately.
  • Support for CDI injection of remote caches! Thanks to the excellent work of community contributor Kevin Pollet, you can now inject remote caches as well as embedded caches using CDI. Detailed documentation on how to use it is available here. In the meantime, check out the CDI integration module in the Infinispan source code for examples.
Finally, we've introduced considerable new functionality in the previous Infinispan 5.1 alpha/beta releases, and so this first release candidate contains some important fixes for the newly introduced functionality, so if you're using any previous alpha/beta releases, please upgrade asap and provide us some feedback!

Cheers,
Galder

Infinispan coming to the French Alps!

Remember that on the 15th of December, I'll be speaking at the Alpes JUG in Grenoble about Infinispan. This is a great opportunity for anyone interested in topics such as data caching and data grids to come and learn about Infinispan and its ecosystem, including Hibernate second level cache, Hibernate OGM...etc.

Looking forward to meeting Emmanuel and the rest of the Alpes JUG gang :)

Cheers,
Galder

Friday, 25 November 2011

Infinispan @Devoxx



Compared with the previous editions, this year's Devoxx was not that different: well organised, packed with interesting presentations and full rooms. And plenty of Belgian beer :)
Pete Muir, Sanne Grinovero and I were also given the chance to speak. And we did take our time, with a three-hour deep dive into the Infinispan ecosystem, plenty of live demos and good discussions.
If you couldn't make it and you can't wait for the video to be published, don't worry: the demo is available online here. Give it a spin and let us know what you think!

Cheers,
Mircea

Wednesday, 23 November 2011

More on transaction performance: use1PcForAutoCommitTransactions

What's use1PcForAutoCommitTransactions all about?



Don't be scared by the name: use1PcForAutoCommitTransactions is a new feature (5.1.CR1) that does quite a cool thing: it increases your transactions' performance.
Let me explain.
Before Infinispan 5.1 you could access the cache both transactionally and non-transactionally. Naturally, non-transactional access is faster and offers fewer consistency guarantees. But we don't support mixed access in Infinispan 5.1, so what's to be done when you need the speed of non-transactional access and you are ready to trade some consistency guarantees for it?
Well, this is where use1PcForAutoCommitTransactions helps you. What it does is force an induced (autoCommit=true) transaction to commit in a single phase. So only 1 RPC instead of the 2 RPCs needed for a full two-phase commit (2PC).

At what cost?


You might end up with inconsistent data if multiple transactions modify the same key concurrently. But if you know that's not the case, or you can live with it then use1PcForAutoCommitTransactions will help your performance considerably.

An example


Let's say you do a simple put operation outside the scope of a transaction:


Now let's see how this would behave if the cache has several different transaction configurations:

Not using 1PC...



The put will happen in two RPCs/steps: a prepare message is sent around and then a commit.

Using 1PC...



The put happens in one RPC as the prepare and the commit are merged. Better performance.
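To make the difference between the two cases concrete, here is a toy model of the messages sent for one auto-committed put. It does not touch Infinispan at all; the class name and the message labels are my own, chosen only to mirror the description above.

```java
import java.util.Arrays;
import java.util.List;

public class AutoCommitRpcs {
    /** RPC messages sent for one auto-committed put, in order. */
    static List<String> rpcsForPut(boolean use1Pc) {
        if (use1Pc) {
            // Prepare and commit travel together in a single message.
            return Arrays.asList("PREPARE+COMMIT");
        }
        // Full two-phase commit: one round to prepare, one to commit.
        return Arrays.asList("PREPARE", "COMMIT");
    }

    public static void main(String[] args) {
        System.out.println("2PC: " + rpcsForPut(false));
        System.out.println("1PC: " + rpcsForPut(true));
    }
}
```

Halving the number of rounds is where the saving comes from; the price is the weaker consistency described under "At what cost?".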

Not using autoCommit



An exception is thrown, as this is a transactional cache and invocations must happen within the scope of a transaction.

Enjoy!
Mircea

Tuesday, 22 November 2011

Infinispan 5.1.0.BETA5 is out!

Infinispan 5.1.0.BETA5 has just been released with a few interesting additions and important fixes:
  • Locks acquired within a transaction are now reordered in order to avoid deadlocks. There's no new configuration required to take advantage of this feature. More information on how lock reordering works can be found here.
  • One of the aims of the Infinispan 5.1 'Brahma' series is to move away from JAXB and instead use StAX-based XML parsing. Ahead of that, a new configuration API based on builders has been developed. Expect to hear more about it and examples on using the API in the next few days.
Amongst the fixes included in this release, it's worth mentioning:
  • The demo paths that were broken in 5.1.0.BETA4 have now been fixed.
  • Some of the Infinispan jars in 5.1.0.BETA4 were showing duplicate classes. This was the result of an OSGI bundle generation bug, and so to avoid the issue 5.1.0.BETA5 OSGI bundle generation has been disabled. This functionality will be re-enabled once the issue has been fixed by the Maven Felix plugin.
As always, please keep the feedback coming. You can download the release from here and you get further details on the issues addressed in the changelog.

Cheers,
Galder

Friday, 11 November 2011

Some noteworthy improvements for pessimistic transactions

Pessimistic transactions were added in 5.1 and are the "rebranding" of eager transactions from previous Infinispan releases. But besides the rebranding, the code also brought some noteworthy performance optimisations:
  • a single RPC happens for acquiring the lock on a key, regardless of the number of invocations. So if you call cache.put(k,v) in a loop, within the scope of the same transaction, there is only one remote call to the owner of k.
  • if the key you want to lock/write maps to the local node then no remote locks are acquired. In other words there won't be any RPCs for writing to a key that maps locally. This can be very powerful when used in conjunction with the KeyAffinityService, as it allows you to control the locality of your keys.
  • during the two phase commit (2PC), the prepare phase doesn't perform any RPCs: this optimisation is based on the fact that locks are already acquired on each write. This means that the number of RPCs during a transaction's lifespan is reduced by 1.
  • for some writes to the cache (e.g. cache.put(k,v)) two RPCs were performed: one to acquire the remote lock and one to fetch the previous value. The obvious optimisation in this case was to make a single RPC for both operations - which we do starting with 5.1.
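The first two points can be sketched with a simple counter. This is an illustration of the bookkeeping only, not Infinispan's code: the class is hypothetical, and the caller passes in the owner of each key by hand, where the real thing would look it up.

```java
import java.util.HashSet;
import java.util.Set;

public class PessimisticTxSketch {
    private final Set<String> lockedKeys = new HashSet<>();
    private final String localNode;
    private int lockRpcs = 0;

    PessimisticTxSketch(String localNode) {
        this.localNode = localNode;
    }

    /** Records a write; ownerOfKey is the node the key maps to. */
    void put(String key, String ownerOfKey) {
        // Set.add returns false when this transaction already holds the lock.
        boolean firstTouch = lockedKeys.add(key);
        if (firstTouch && !ownerOfKey.equals(localNode)) {
            lockRpcs++; // one remote lock request; keys that map locally need none
        }
    }

    int lockRpcs() {
        return lockRpcs;
    }

    public static void main(String[] args) {
        PessimisticTxSketch tx = new PessimisticTxSketch("A");
        for (int i = 0; i < 100; i++) {
            tx.put("k", "B");      // remote key written in a loop
        }
        tx.put("localKey", "A");   // key owned by the local node
        System.out.println("Lock RPCs: " + tx.lockRpcs());
    }
}
```

However many times the loop writes to k, the counter only moves on the first write, and the locally owned key never moves it at all.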
Enjoy!
Mircea

Thursday, 10 November 2011

Fewer deadlocks, higher throughput

Here's the problem: the first transaction (T1) writes to keys a and b, in this order. The second transaction (T2) writes to keys b and a - again, order is relevant. Now with some "right timing" T1 manages to acquire the lock on a and T2 acquires the lock on b. And then each waits for the other to release its lock so that it can progress. This is what is called a deadlock and is really bad for your system throughput - but I won't insist on this aspect, as I've mentioned it a lot in my previous posts.

What I want to talk about though is a way to solve this problem. Quite a simple way - just force an order on your transaction writes and you're guaranteed not to deadlock: if both T1 and T2 write to a then b (lexicographical order) there won't be any deadlock. Ever.
But there's a catch. It's not always possible to define this order, simply because you can't or because you don't know all your keys at the very beginning of the transaction.

Now here's the good news: Infinispan orders the keys touched in a transaction for you. And it even defines an order so that you won't have to do that. Actually you don't have to do anything, not even enable this feature, as it is already enabled for you by default.
Does it sound too good to be true? That's because it's only partially true. That is, lock reordering only works if you're using optimistic locking. For pessimistic locking you still have to do it the old way - order your locks (that's of course if you can).
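The idea of forcing an order is easy to sketch. The example below sorts the keys a transaction touched before any locks are taken; using the keys' natural ordering is an assumption made for the example, and the order Infinispan actually applies may well be different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

public class LockOrdering {
    /** Returns the keys in the single global order in which locks should be taken. */
    static <K extends Comparable<K>> List<K> acquisitionOrder(Collection<K> touchedKeys) {
        // TreeSet sorts by natural ordering and drops duplicate keys.
        return new ArrayList<>(new TreeSet<>(touchedKeys));
    }

    public static void main(String[] args) {
        // T1 wrote a then b, T2 wrote b then a; both end up locking in the same order.
        System.out.println("T1 locks: " + acquisitionOrder(Arrays.asList("a", "b")));
        System.out.println("T2 locks: " + acquisitionOrder(Arrays.asList("b", "a")));
    }
}
```

This only works when all the keys are known before locking starts, which is why it fits optimistic transactions (locks taken at prepare time) and not pessimistic ones (locks taken on each write).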

Wanna know more about it? Read this.

Expect and enjoy this feature in our next release 5.1.0.BETA5.

Stay tuned!
Mircea

Wednesday, 9 November 2011

Single lock owner: an important step forward

The single lock owner is a highly requested Infinispan improvement. The basic idea behind it is that, when writing to a key, locks are no longer acquired on all the nodes that own that key, but only on a single designated node (named "main owner").

How does it help me?


Short version: if you use transactions that concurrently write to the same keys, this improvement significantly increases your system's throughput.


Long version: If you're using Infinispan with transactions that modify the same key(s) concurrently then you can easily end up in a deadlock. A deadlock can also occur if two transactions modify the same key at the same time - which is both inefficient and counter-intuitive. Such a deadlock means that at least one transaction (or both) eventually rolls back, but also that the lock on the key is held for the duration of the lockAcquisitionTimeout config option (defaults to 10 seconds). These deadlocks reduce the throughput significantly, as transaction threads are held inactive during deadlock time. On top of that, other transactions that want to operate on that key are also delayed, potentially resulting in a cascade effect.
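Here is a toy simulation of why locking on every owner can deadlock on a single key while locking on one node cannot. It is a sketch under stated assumptions: two transactions, a key owned by nodes A and B, lock requests that arrive in the order given, and A playing the role of main owner. None of the names come from Infinispan.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SingleLockOwnerSketch {
    /** True if T1 and T2, locking the key on the given nodes in order, get stuck. */
    static boolean deadlocks(List<String> t1Nodes, List<String> t2Nodes) {
        Map<String, String> holder = new HashMap<>(); // node -> tx holding the key's lock there
        List<List<String>> plans = Arrays.asList(t1Nodes, t2Nodes);
        int[] next = new int[2];            // next node each transaction must lock
        boolean[] done = new boolean[2];
        while (!(done[0] && done[1])) {
            boolean progressed = false;
            for (int tx = 0; tx < 2; tx++) {
                if (done[tx]) continue;
                String node = plans.get(tx).get(next[tx]);
                String name = "T" + (tx + 1);
                if (holder.putIfAbsent(node, name) == null) { // the lock was free
                    next[tx]++;
                    progressed = true;
                    if (next[tx] == plans.get(tx).size()) {
                        // All locks held: commit and release them.
                        holder.values().removeIf(name::equals);
                        done[tx] = true;
                    }
                }
            }
            if (!progressed) return true; // both are waiting on each other
        }
        return false;
    }

    public static void main(String[] args) {
        // Locks on all owners: T1 reaches A first, T2 reaches B first.
        System.out.println(deadlocks(Arrays.asList("A", "B"), Arrays.asList("B", "A")));
        // Single lock owner: both transactions only lock on A.
        System.out.println(deadlocks(Arrays.asList("A"), Arrays.asList("A")));
    }
}
```

In the second case T2 simply waits its turn behind T1 on node A, which is the queuing behaviour the single lock owner is after.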

What's the added performance penalty?


The only encountered performance penalty is during cluster topology changes. At that point the cluster needs to perform some additional computation (no RPC involved) to fail-over the acquired locks from previous to new owners.
Another noticeable aspect is that locks are now being released asynchronously, after the transaction commits. This doesn't add any burden to the transaction duration, but it means that locks are being held slightly longer. That's not something to be concerned about if you're not using transactions that compete for same locks though.
We plan to benchmark this feature using the Radargun benchmark tool - we'll report back!

Want to know more?


You can read the single lock owner design wiki and/or follow the JIRA discussions.

More locking improvements in Infinispan 5.1.0.BETA4

The latest beta in the Infinispan 5.1 "Brahma" series is out. So, what's in Infinispan 5.1.0.BETA4? Here are the highlights:
  • A hugely important lock acquisition improvement has been implemented that results in locks being acquired in only a single node in the cluster. This means that deadlocks as a result of multiple nodes updating the same key are no longer possible. Concurrent updates on a single key will now be queued in the node that 'owns' that key. For more info, please check the design wiki and keep an eye on this blog because Mircea Markus, who's the author of this enhancement, will be explaining it in more detail very shortly. Please note that you don't need to make any configuration or code changes to take advantage of this improvement.

  • A bunch of classes and interfaces in the core/ module have been migrated to an api/ and commons/ module in order to reduce the size of the dependencies that the Hot Rod Java client had. As a result, there's been a change in the hierarchy of Cache and CacheContainer classes, with the introduction of BasicCache and BasicCacheContainer, which are parent classes of the existing Cache and CacheContainer classes respectively. What's important is that Hot Rod clients must now code against BasicCache and BasicCacheContainer rather than Cache and CacheContainer. So previous code that was written like this will no longer compile:
    import org.infinispan.Cache;
    import org.infinispan.manager.CacheContainer;
    import org.infinispan.client.hotrod.RemoteCacheManager;
    ...
    CacheContainer cacheContainer = new RemoteCacheManager();
    Cache cache = cacheContainer.getCache();
    
    Instead, if Hot Rod clients want to continue using interfaces higher up the hierarchy from the remote cache/container classes, they'll have to write:
    import org.infinispan.BasicCache;
    import org.infinispan.manager.BasicCacheContainer;
    import org.infinispan.client.hotrod.RemoteCacheManager;
    ...
    BasicCacheContainer cacheContainer = new RemoteCacheManager();
    BasicCache cache = cacheContainer.getCache();
    
    Previous code that interacted against the RemoteCache and RemoteCacheManager should work as it used to:
    import org.infinispan.client.hotrod.RemoteCache;
    import org.infinispan.client.hotrod.RemoteCacheManager;
    ...
    RemoteCacheManager cacheContainer = new RemoteCacheManager();
    RemoteCache cache = cacheContainer.getCache();
    
    We apologise for any inconvenience caused, but we think that the Hot Rod clients will hugely benefit from this vastly reducing the number of dependencies they need.

  • Finally, a few words about the ZIP distribution file. In BETA4 we've added some cache store implementations that were missing from previous releases, such as the RemoteCacheStore that talks to Hot Rod servers, and we've added a brand new demo application that implements a near-caching pattern using JMS. Please be aware that this demo is just a simple prototype of how near caches could be built using Infinispan and HornetQ.

As always, please keep the feedback coming. You can download the release from here and you get further details on the issues addressed in the changelog.

Cheers,
Galder

Thursday, 27 October 2011

Infinispan 5.1.0.BETA3 is out with Atomic Map and Hot Rod improvements

I'm very proud to announce yet another beta release in the 5.1 'Brahma' series. This time it is the turn of Infinispan 5.1.0.BETA3 which, apart from containing many small fixes, comes with two major improvements:

Fine-grained Atomic Maps

Atomic Maps are special constructs that users can use to bundle data into the value side of a key/value pair. What's special about them is that when the map changes, only the changes or deltas of that map are transferred, which makes Atomic Maps very efficient from a replication perspective when individual elements are modified.

Up until Infinispan 5.1.0.BETA2, the other interesting characteristic of these Atomic Maps was the fact that Atomic Map locking and isolation happened at the level of the entire Atomic Map. So, if a single key/value pair in the Atomic Map was modified, the entire map was locked.

Starting with Infinispan 5.1.0.BETA3, thanks to Vladimir Blagojevic, Atomic Maps supporting fine-grained locking are available as well. What this means is that an Atomic Map's key/value pairs can be modified in parallel thanks to the ability to lock individual map entries as opposed to the entire map.
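One way to picture the difference is to look at what gets locked in each mode. The sketch below is purely illustrative: the string-based lock ids and the class name are invented for the example and say nothing about how Infinispan represents its locks.

```java
public class AtomicMapLocking {
    /** What a writer must lock in order to modify one entry of an Atomic Map. */
    static String lockId(String mapKey, String entryKey, boolean fineGrained) {
        // Coarse: the whole map. Fine-grained: just the entry inside the map.
        return fineGrained ? mapKey + "/" + entryKey : mapKey;
    }

    /** Two writers block each other only if they need the same lock. */
    static boolean conflict(String mapKey, String entryA, String entryB, boolean fineGrained) {
        return lockId(mapKey, entryA, fineGrained).equals(lockId(mapKey, entryB, fineGrained));
    }

    public static void main(String[] args) {
        System.out.println("coarse: " + conflict("session42", "user", "cart", false));
        System.out.println("fine:   " + conflict("session42", "user", "cart", true));
    }
}
```

Writers to different entries of the same map collide in the coarse mode and proceed in parallel in the fine-grained one; writers to the same entry still collide either way.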

This will be particularly handy for heavy Atomic Map users such as JBoss Application Server 7 which uses Atomic Maps for maintaining HTTP sessions, and Hibernate OGM which decomposes entities into Atomic Maps.

Hot Rod server topology improvements

When we originally designed Hot Rod protocol version 1.0, we decided that whenever a distributed cache wanted to send information about the topology of the backend servers to the clients, we'd send the hash ids of each of these nodes. At the time, this seemed like a good idea, until virtual nodes were implemented...

With virtual nodes, each physical Hot Rod server can potentially represent tens, hundreds or even thousands of different virtual nodes. If we stuck with the original protocol, that would mean that we'd have to send each virtual node's hash id back to the client. So, for a cluster of 8 nodes, and 1000 virtual nodes, that'd be at least 80kb of hash ids being transferred back to the client, on top of tons of redundant information about a node's host and port, which is very inefficient.

So, after having some discussions, we decided to evolve the Hot Rod protocol to version 1.1 in order to address this issue. The end result is that now it's the responsibility of the Hot Rod client to generate the hash ids of each of the physical nodes. We do that by sticking to a general formula to generate a Hot Rod server's hash id which both the Hot Rod server and clients can implement.

This improvement has also led to a significant decrease in memory consumption of the Hot Rod server because it does not need to cache those hash ids anymore.

So, if you are using Infinispan Hot Rod servers and in particular you're configuring virtual nodes, you definitely should be upgrading your Hot Rod server and client libraries. From a client code perspective, no changes are necessary because starting with 5.1.0.BETA3, Hot Rod clients talk to servers using this latest protocol.

Finally, remember to use user forums to report back, grab the release here, enjoy and keep the feedback coming!!

Cheers,
Galder

Wednesday, 19 October 2011

Infinispan 5.1.0.BETA2 is out and asymmetric clusters are here!

The latest beta of the 5.1 'Brahma' series, 5.1.0.BETA2, is out now and thanks to Dan Berindei, it comes with support for asymmetric clusters which has been highly demanded.

Before asymmetric clusters were supported, it was required that all Infinispan caches that client code interacted against were defined and running in all nodes in the cluster, otherwise, cluster wide messages for a cache that did not exist in a node would fail. So, imagine this scenario where c1 and c2 are user defined caches configured with replication:

Node A [c1]
Node B [c1, c2] 

Without asymmetric clusters, whenever c2 cache was modified in Node B, a replication message would be sent to Node A, but the replication would fail indicating that c2 was not defined in Node A. This failure would get propagated back to Node B which would result in the modification failing. This kind of problem can particularly affect managed environments such as the JBoss Application Server, because often, deployments will be made in a subset of the cluster, so it could well happen that not all nodes have a particular cache started.

So, what Infinispan 5.1.0.BETA2 'Brahma' brings is support for this type of scenario by maintaining a view correlating nodes and started caches, hence allowing any node to know which other nodes have a particular cache started. This means that in the above case, Node B would not have sent a replication message to Node A because it would know that c2 was only started in Node B.
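The view described above can be sketched as a map from node to started caches. This is a minimal illustration with made-up names, not the data structure Infinispan uses internally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class CacheTopologyView {
    // node name -> caches started on that node
    private final Map<String, Set<String>> cachesByNode = new TreeMap<>();

    void cacheStarted(String node, String cache) {
        cachesByNode.computeIfAbsent(node, n -> new TreeSet<>()).add(cache);
    }

    /** Nodes, other than the sender, that have the given cache started. */
    List<String> replicationTargets(String sender, String cache) {
        List<String> targets = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : cachesByNode.entrySet()) {
            if (!e.getKey().equals(sender) && e.getValue().contains(cache)) {
                targets.add(e.getKey());
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        CacheTopologyView view = new CacheTopologyView();
        view.cacheStarted("A", "c1");
        view.cacheStarted("B", "c1");
        view.cacheStarted("B", "c2");
        System.out.println("c1 from B goes to: " + view.replicationTargets("B", "c1"));
        System.out.println("c2 from B goes to: " + view.replicationTargets("B", "c2"));
    }
}
```

For the Node A / Node B scenario, a change to c2 on Node B finds no targets, so no message is sent and nothing fails.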

The lack of support for asymmetric clusters is what forced Infinispan servers to only accept invocations for predefined caches, because these predefined caches could be started when the servers were started, hence avoiding the asymmetric cluster problem. Now that asymmetric clusters are supported, it's likely that this limitation will go away, but the timeline is yet to be defined.

This release also includes a bunch of other fixes and as always, please use the user forums to report back, grab the release here, enjoy and keep the feedback coming.

Cheers,
Galder

Monday, 17 October 2011

An understudy for Devoxx 2011

I won't be able to make it to Devoxx this year, but worry not, the University talk and hands-on deep-dive on Infinispan will still go on.  Pete and Mircea will be joined by Sanne Grinovero - maintainer of Infinispan's querying capabilities, Lucene and Hibernate hacker, and committer on Hibernate OGM.  Now I wish I was attending, as an audience member!  :-)

Enjoy
Manik

Tuesday, 4 October 2011

5.1.0.BETA1 "Brahma" is out with reworked transaction handling

It's been a frantic couple of weeks at chez Infinispan with loads of hacking, presentations preparation, team meetings...etc and we're now proud to release Infinispan 5.1.0.BETA1 "Brahma".

For this first beta release, the transaction layer has been redesigned as explained by Mircea in this blog post. This is a very important step in the process of implementing some key locking improvements, so we're very excited about this! Thanks Mircea :)

There's a bunch of other little improvements, such as avoiding the use of thread locals for cache operations with flags. As a result, optimisations like the following are now viable:
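import org.infinispan.AdvancedCache;
import org.infinispan.Cache;
import org.infinispan.context.Flag;
...
AdvancedCache<String, String> cache = ...
// withFlags() returns a cache that applies the flag to every call made through it
Cache<String, String> forceWLCache = cache.withFlags(Flag.FORCE_WRITE_LOCK);
forceWLCache.get("voo");
forceWLCache.put("voo", "doo");
...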
Previously each cache invocation would have required withFlags() to be called, but now you only need to do it once and you can cache the "flagged" cache and reuse it.

Another interesting little improvement is available for JDBC cache store users. Basically, database tables can now be discovered within an implicit schema. So, if each user has a different schema, the tables will be created within their own space. This makes it easier to manage environments where the JDBC cache store is used by multiple caches at the same time because management is limited to adding a user per application, as opposed to adding a user plus prefixing table names. Thanks to Nicolas Filotto for bringing this up.

Please keep the feedback coming, and as always, you can download the release from here and you get further details on the issues addressed in the changelog.

Cheers,
Galder

Monday, 3 October 2011

Transaction remake in Infinispan 5.1

If you have ever used Infinispan in a transactional way you might be very interested in this article, as it describes some very significant improvements in version 5.1 "Brahma" (released with 5.1.Beta1):
  • starting with this release an Infinispan cache can be accessed either transactionally or non-transactionally. The mixed access mode is no longer supported (backward compatibility is still maintained, see below). There are several reasons for going down this path, but one of the most important results of this decision is cleaner semantics for how concurrency is managed between multiple requestors for the same cache entry.

  • starting with 5.1 the supported transaction models are optimistic and pessimistic. The optimistic model is an improvement over the existing default transaction model, as it completely defers lock acquisition to transaction prepare time. That reduces lock acquisition duration and increases throughput; it also avoids deadlocks. With the pessimistic model, cluster-wide locks are acquired on each write and only released after the transaction completes (see below).


Transactional or non transactional cache?


It's up to you as a user to decide whether you want to define a cache as transactional or not. By default, Infinispan caches are non-transactional. A cache can be made transactional by changing the transactionMode attribute:

transactionMode can only take two values: TRANSACTIONAL and NON_TRANSACTIONAL. The same thing can also be achieved programmatically:

Important: for transactional caches it is required to configure a TransactionManagerLookup.

Backward compatibility


The autoCommit attribute was added in order to assure backward compatibility. If a cache is transactional and autoCommit is enabled (defaults to true) then any call that is performed outside of a transaction's scope is transparently wrapped within a transaction. In other words Infinispan adds the logic for starting a transaction before the call and committing it after the call.
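The wrapping logic is simple enough to sketch in a few lines. This is a toy model with invented names that just records what happens; the exception type is also my own choice and not necessarily what Infinispan throws.

```java
import java.util.ArrayList;
import java.util.List;

public class AutoCommitSketch {
    private final List<String> log = new ArrayList<>();
    private final boolean autoCommit;
    private boolean inTransaction = false;

    AutoCommitSketch(boolean autoCommit) {
        this.autoCommit = autoCommit;
    }

    void begin() {
        inTransaction = true;
        log.add("begin");
    }

    void commit() {
        inTransaction = false;
        log.add("commit");
    }

    void put(String key, String value) {
        if (inTransaction) {
            log.add("put " + key);   // caller manages the transaction
            return;
        }
        if (!autoCommit) {
            throw new IllegalStateException("transactional cache called outside a transaction");
        }
        begin();                      // injected on the caller's behalf
        log.add("put " + key);
        commit();
    }

    List<String> log() {
        return log;
    }

    public static void main(String[] args) {
        AutoCommitSketch cache = new AutoCommitSketch(true);
        cache.put("k", "v");          // no explicit transaction around this call
        System.out.println(cache.log());
    }
}
```

The caller writes a plain put, and the log shows it ran between a begin and a commit it never asked for.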

So if your code accesses a cache both transactionally and non-transactionally, all you have to do when migrating to Infinispan 5.1 is mark the cache as transactional and enable autoCommit (that's actually enabled by default, so just don't disable it :)

The autoCommit feature can be managed through configuration:

or programmatically:


Optimistic Transactions


With optimistic transactions, locks are acquired at transaction prepare time and are only held up to the point the transaction commits (or rolls back). This is different from the 5.0 default locking model, where local locks are acquired on writes and cluster locks are acquired during prepare time.

Optimistic transactions can be enabled in the configuration file:

or programmatically:

By default, a transactional cache is optimistic.

Pessimistic Transactions


From a lock acquisition perspective, pessimistic transactions obtain locks on keys at the time the key is written. E.g.

When cache.put(k1,v1) returns, k1 is locked and no other transaction running anywhere in the cluster can write to it. Reading k1 is still possible. The lock on k1 is released when the transaction completes (commits or rolls back).

Pessimistic transactions can be enabled in the configuration file:

or programmatically:


What do I need - pessimistic or optimistic transactions?


From a use case perspective, optimistic transactions should be used when there's not a lot of contention between multiple transactions running at the same time. That is because optimistic transactions roll back if data has changed between the time it was read and the time it was committed (writeSkewCheck).

On the other hand, pessimistic transactions might be a better fit when there is high contention on the keys and transaction rollbacks are less desirable. Pessimistic transactions are more costly by their nature: each write operation potentially involves an RPC for lock acquisition.
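To see why contention hurts optimistic transactions, here is a small sketch of a write skew check. It assumes each key carries a simple version counter, which is an illustration only; the class and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class WriteSkewSketch {
    // Committed version per key; keys never written are at version 0.
    private final Map<String, Long> committedVersions = new HashMap<>();

    long versionOf(String key) {
        return committedVersions.getOrDefault(key, 0L);
    }

    /** Commits if the key is unchanged since it was read; false means roll back. */
    boolean commit(String key, long versionSeenAtRead) {
        if (versionOf(key) != versionSeenAtRead) {
            return false; // someone else committed in between
        }
        committedVersions.put(key, versionSeenAtRead + 1);
        return true;
    }

    public static void main(String[] args) {
        WriteSkewSketch store = new WriteSkewSketch();
        long seenByT1 = store.versionOf("k");
        long seenByT2 = store.versionOf("k");   // both read before either commits
        System.out.println("T1 commits: " + store.commit("k", seenByT1));
        System.out.println("T2 commits: " + store.commit("k", seenByT2));
    }
}
```

The more transactions overlap on the same keys, the more often the second committer loses and has to retry, which is the cost that pessimistic locking avoids by paying for the lock up front.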

The path ahead


This major transaction rework has opened the way for several other transaction related improvements:

  • Single node locking model is a major step forward in avoiding deadlocks and increasing throughput by only acquiring locks on a single node in the cluster, regardless of the number of redundant copies (numOwners) on which data is replicated

  • Lock acquisition reordering is a deadlock avoidance technique that will be used for optimistic transactions

  • Incremental locking is another technique for minimising deadlocks.




Stay tuned!
Mircea

Tuesday, 27 September 2011

Catch me if you can... at Soft Shake or JUDCon!

As you're probably well aware, the Infinispan team is delivering talks all around the world. If you're in the US and you want to find out more about Infinispan/EDG, Manik will be speaking at JavaOne, but if like me you're staying out in Europe, why not come to Soft Shake in Geneva?

Soft Shake is an IT conference being held in Geneva on October 3rd and 4th and I'll be speaking about data grids and data caching with Infinispan on the 3rd. In fact, I'll be speaking twice! At 1pm you'll see me doing an introduction to data grids and data caching, and at 5pm I'll be delving into the data grid vs database debate.

So, if any of these topics interest you, come and join us at Soft Shake! It's gonna be fun :)

And that is not all! I'll be one of the Infinispan team members speaking at JUDCon London at the end of October. The agenda is now live and you'll see me talking about near caching on the 31st of October. Don't miss it!

Cheers,
Galder

Wednesday, 21 September 2011

Next Infinispan 5.1.0 alpha hits the streets!

Infinispan 5.1.0.ALPHA2 "Brahma" is out now containing a consolidated push-based approach for both state transfer in replicated caches and rehashing in distributed ones. The new changes don't have a great impact on distributed cache users, but for those that relied on state transfer, it's definitely good news :). State transfer now works in such a way that when a node joins, all nodes in the cluster push state to it, rather than the new node getting it from the cluster coordinator. As a result of this, the task of providing the state is parallelized, reducing the load on state providers.

On top of that, this Infinispan release is the first one to integrate JGroups 3.0, which brings plenty of API changes that simplify a lot of the Infinispan/JGroups interaction. If you want to find out more about the new JGroups version, make sure you check Bela's blog and the brand new JGroups manual.

Please keep the feedback coming, and as always, you can download the release from here and you get further details on the issues addressed in the changelog.

Cheers,
Galder

When Infinispan meets CDI

Since version 5.0 (Pagoa) Infinispan has a new module. This module is a portable CDI extension which integrates Infinispan with the CDI programming model. Here are some highlights of what is provided:
  • full integration with Java EE 6
  • typesafe cache (and cache manager) injection
  • JCACHE annotations support
Please note that this module is a technology preview and its API can still change. Next let's discuss some of its details.

Typesafe injection and configuration of cache

The first feature you can use out of the box is the typesafe injection of the Cache and the CacheManager. Without any configuration you can inject the default cache, as well as the cache manager used by the extension. This injection can be performed in any bean managed by Java EE like EJB, Servlet and CDI beans. The only thing to do is to use the @Inject annotation:




Please note that the cache injection is typed. In this case, only String-typed objects can be added as keys and values.

It's also possible to configure the injected cache using CDI. The first step is to create a CDI qualifier, and then create the cache configuration producer, annotated with @ConfigureCache. The qualifier is used to qualify the injection point and the cache configuration producer:



In the same way, a cache can be defined with the default configuration of the cache manager in use, using a producer field:



One advantage of this approach is that all cache configurations of the entire application can be gathered together into a single Configuration class.

The Infinispan CDI extension provides a cache manager with a default configuration (and it is used by default). You can also override the default configuration (used by the default cache manager), as well as the default cache manager. You can find more information here.

JCache annotations support

JCache (aka JSR-107) is famous as the oldest open JSR. However, this JSR has recently seen extensive progress, and is a candidate for inclusion in Java EE 7 (JSR-342).

This specification defines a standard caching API to work with a standalone cache, as well as a distributed cache. An interesting part of the specification is the set of annotations, which are designed to solve common caching use cases. Some of the annotations defined by this specification are:
  • @CacheResult caches the result of a method call
  • @CachePut caches a method parameter
  • @CacheRemoveEntry removes an entry from a cache
  • @CacheRemoveAll removes all entries from a cache
The following example illustrates the use of these annotations:



The Infinispan CDI extension adds support for these annotations. The only thing to do is to enable the CDI interceptors in your application beans.xml - you can find more information here.
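
To make the semantics of these annotations concrete, here is a plain-Java sketch of roughly what an interceptor does on your behalf. It deliberately uses no JSR-107 or Infinispan classes: CachedGreeter and its methods are hypothetical names for illustration, and a ConcurrentHashMap stands in for the cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration only: a hand-written equivalent of the caching annotations.
class CachedGreeter {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Counts how often the "expensive" method body actually runs
    final AtomicInteger computations = new AtomicInteger();

    // Roughly what @CacheResult does: return the cached value for the
    // parameter if there is one, otherwise run the method and cache its result.
    String greet(String user) {
        return cache.computeIfAbsent(user, this::expensiveGreet);
    }

    // Roughly what @CacheRemoveEntry does: evict the entry keyed by the parameter.
    void forget(String user) {
        cache.remove(user);
    }

    // Roughly what @CacheRemoveAll does: evict every entry.
    void forgetAll() {
        cache.clear();
    }

    private String expensiveGreet(String user) {
        computations.incrementAndGet();
        return "Hello " + user;
    }
}
```

Calling greet("Pete") twice runs expensiveGreet only once; after forget("Pete") the next call computes the greeting again.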

Infinispan CDI and JBoss AS 7

With JBoss AS 7, you can set up an Infinispan cache manager in the server configuration file. This allows you to externalize your Infinispan configuration and also to look up the cache manager from JNDI, normally with the @Resource annotation. This post has more details on the subject.

As we mentioned earlier, you can override the default cache manager used by the Infinispan CDI extension. To use a JBoss AS 7 configured cache, you need to use the cache manager defined in JBoss AS 7. You only need to annotate the default cache manager producer with @Resource. Simple!



Now, you can inject the cache defined in JBoss AS 7 as we described earlier.

What's next?

Here is a highlight of the features you will see soon.
  • support for all JSR 107 annotations - @CachePut, @CacheDefaults
  • support for remote cache
  • ability to listen to Infinispan events with CDI observers
  • and more - let us know what you want ;-)
As usual you can open issues and feature requests on the Infinispan JIRA (component CDI Integration).

Feel free to open a topic in the Infinispan forum if you need help.

The Infinispan CDI documentation is here.

To see the Infinispan CDI extension in action you can browse and run the quickstart application here or watch this screencast.

Enjoy!


About the author
 
Kevin Pollet is a software engineer at SERLI, a consulting and software engineering company based in France. He's an Open Source advocate and contributes to many projects such as Infinispan and Hibernate Validator, both at SERLI and at home. He is also involved in the Poitou-Charentes JUG and has spoken at many JUG events. He enjoys attending Java events like JUDCon, JBoss World and Devoxx.

Tuesday, 13 September 2011

Infinispan 5.1.0.ALPHA1 released: Distributed Queries are here!

Having released Infinispan 5.0.1.FINAL yesterday, today is the turn of releasing Infinispan 5.1.0.ALPHA1 "Brahma". This is the first in a series of alpha releases which will give Infinispan users the chance to play with the newest features. Here're some highlights of what's included in this release:

  • Thanks to Israel Lacerra, Infinispan now supports fully distributed queries, which allows queries to be parallelised across all nodes. Creating a distributed query is very easy: simply call SearchManager.getClusteredQuery. Please note that this feature is experimental and the API is likely to change before the final release.
  • The Infinispan Query module now uses Hibernate Search 4.
  • In Infinispan 5.0, we introduced the possibility of executing operations against a cache in such a way that class loading could occur with a user-provided classloader. In this new release, we've extended the use case to allow data that's stored in binary format to be unmarshalled with the given classloader. This is particularly handy for those users that are deploying Infinispan in modular, or OSGi-like, environments. For more information, check the AdvancedCache.with(ClassLoader) API.
Please keep the feedback coming, and as always, you can download the release from here and you get further details on the issues addressed in the changelog.

By the way, remember that members of the Infinispan team will be speaking at events across the globe, such as JavaOne, SoftShake, JUDCon London and Devoxx. Don't miss them!

Cheers,
Galder

Monday, 12 September 2011

Infinispan 5.0.1.FINAL is out!

Thanks to everyone that downloaded Infinispan 5.0.0.FINAL in the last month. We've had tremendous feedback and as a result of that we've just released Infinispan 5.0.1.FINAL to address some of the most important issues reported. These issues are primarily around distribution clustering mode and rehashing. So, if you're using any of these features, I'd strongly recommend that you upgrade as soon as possible.

Please keep the feedback coming, and as always, you can download the release from here and you get further details on the issues addressed in the changelog.

Finally, we now have a documentation space fully dedicated to Infinispan 5.0. Make sure you check it out!

Cheers,
Galder

Thursday, 1 September 2011

JavaOne 2011 and Devoxx 2011

I never got around to blogging about this when my talks were accepted for JavaOne this year, but it's about time.

I have a conference session titled "A Tale About Caching (JSR 107) and Data Grids (JSR 347) in Enterprise Java" and a BoF session focused on JSR 347 titled "Making Java EE Cloud-Friendly: JSR 347, Data Grids for the Java Platform", which I will be delivering with fellow Infinispan developer, JBoss rockstar and overall nice guy Pete Muir.

Later on in the year, I will also be running a University talk at Devoxx in Antwerp, titled "A real-world deep-dive into Infinispan".  This too will be with Pete and Mircea Markus, another core Infinispan developer.

This will be a great chance to learn more about Infinispan, data grids, JSR 107 and JSR 347, so if you are attending these conferences, make sure you add these talks to your agenda!  :-)

Cheers
Manik

Wednesday, 10 August 2011

Transactions enhancements in 5.0

Besides other cool features such as map/reduce and distributed executors, Infinispan 5.0.0 "Pagoa" brings some significant improvements around transactional functionality:
  • transaction recovery is now supported, with a set of tools that allow state reconciliation in case the transaction fails during 2nd phase of 2PC. This is especially useful in the case of transactions spreading over Infinispan and another resource manager, e.g. a database (distributed transactions). You can find out more on how to enable and use transaction recovery here.
  • Synchronization enlistment is another important feature in this release. This allows Infinispan to enlist in a transaction as a Synchronization rather than an XAResource. This enlistment allows the TransactionManager to optimize 2PC with a 1PC where only one other resource is enlisted with that transaction (last resource commit optimization). This is particularly important when using Infinispan as a 2nd level cache in Hibernate. You can read more about this feature here.
  • besides that several bugs were fixed particularly when it comes to the integration with a transaction manager - BIG thanks to the community for reporting and testing them!
To summarise, Infinispan can participate in a transaction in 3 ways:
  1. as a fully fledged XAResource that supports recovery
  2. as an XAResource, but without recovery. This is the default configuration
  3. and as a Synchronization
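
Why Synchronization enlistment helps can be shown with a toy model. The sketch below is not the JTA API: ToyTransaction and its methods are hypothetical, and it only captures the rule described above, namely that a Synchronization does not count as a resource that must be prepared, so a transaction with one remaining XAResource can commit in a single phase.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, not javax.transaction.
class ToyTransaction {
    private final List<String> xaResources = new ArrayList<>();
    private final List<String> synchronizations = new ArrayList<>();

    // Participants enlisted as XAResources take part in prepare/commit.
    void enlistResource(String name) {
        xaResources.add(name);
    }

    // Synchronizations are only notified around completion; they do not vote.
    void registerSynchronization(String name) {
        synchronizations.add(name);
    }

    int synchronizationCount() {
        return synchronizations.size();
    }

    // With at most one XAResource there is nothing to coordinate, so the
    // prepare phase can be skipped and the commit done in one phase.
    String commitProtocol() {
        return xaResources.size() <= 1 ? "1PC" : "2PC";
    }
}
```

With a database enlisted as the only XAResource and Infinispan registered as a Synchronization, commitProtocol() returns "1PC"; enlisting both as XAResources returns "2PC".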
In order to analyze the performance of running Infinispan in different transactional modes, I've enhanced and used Radargun. The diagram below shows a performance comparison between running Infinispan in all 3 modes described. The fourth plot in the chart shows the performance of running Infinispan without transactions - this gives an idea of the cost of using transactions vs. raw operations.



The benchmark was run on this Radargun configuration, using Infinispan 5.0.0.CR5 configured as shown here. JBossTS 4.15.0.FINAL was used as the TransactionManager, configured with a VolatileStore as shown here. Each node was a 4-core Intel(R) Xeon(R) CPU E5640 @ 2.67GHz, with 4GB RAM.
Each transaction spanned only one put operation. The chart shows the following:
  • a non-transactional put is about 40% faster than a transactional one
  • Synchronization-enlisted transactions outperform an XAResource enlisted one by about 20%
  • A recoverable cache has about the same performance as a non-recoverable cache when it comes to transactions.
And that's not all! During Infinispan 5.0.0 development we've been thinking a lot about how we can improve transactional throughput, especially in scenarios in which multiple transactions are writing on the same key. As a result we've come up with some improvement suggestions summarised here: please feel free to take a look and comment!

Cheers,
Mircea

Friday, 5 August 2011

Infinispan 5.0.0.FINAL has hit the streets!

So here we have it - Infinispan 5.0 Pagoa has been released.  This is a big, big release over 4.2.x, with over 45 new features (including the much more robust PUSH-based rehashing, XA recovery, smart L1 invalidation and virtual nodes) and over 30 bugs squashed, including several critical performance and stability related ones.  Major new programming models are supported too - from Spring and CDI through to OSGi, map/reduce and distributed code execution.

Pagoa has gone through over six months of development, the first alpha being made publicly available in December 2010, and 8 whole release candidates since the end of April this year.  This is the most stable, fastest, feature-rich version of Infinispan to date.  Pagoa has been integrated in other products, projects, frameworks and services - including the lightning-fast JBoss AS 7 - and we expect to see much, much more in this regard.

Pagoa really is a community-centric release.  I've seen loads of participation, from users, system integrators, extension-authors, researchers and academics, framework authors, and PaaS providers.  This participation has taken the form of providing feedback and bug reports through to profiler analysis; from helping with documentation and demos through to contributing major new features; from suggesting ideas and improvements to participating in detailed design meetings.  It is this participation that really helps Infinispan grow and mature, and at the same time innovate, taking us one step closer to becoming the best damn data grid out there.

So, a big thank you to everyone who participated, this really is your release.

As usual, download the release, provide feedback, read through the detailed changelog.  And check out our brand-new documentation site too!!  :-)

Finally, in other news, I recently blogged about Brahma, the codename for Infinispan 5.1. Yes, work has already started here, expect Brahma to be a real firecracker.  Check out the post, vote for your most desired features.  Brahma will also form the basis of Red Hat's Enterprise Data Grid product, which was announced in May.  You'll finally have a fully supported open source data grid!

Enjoy!
Manik

Wednesday, 27 July 2011

Infinispan in JBoss AS7

A couple weeks ago saw the final release of JBoss AS 7.0. Like AS6 before it, AS7 uses Infinispan as the distributed caching solution behind its clustering functionality. So what do you need to know about using Infinispan in AS7?

Configuration


Unlike previous releases of JBoss AS, AS7 centralizes all server configuration into one location. This includes Infinispan cache configurations, which are defined by the Infinispan subsystem, within domain.xml or standalone.xml:

The complete schema for the Infinispan subsystem is included in the AS7 binary distribution:
If you are familiar with Infinispan's native configuration file format or the corresponding configuration file from AS6, you'll notice some obvious similarities, but some noteworthy differences.

While a native Infinispan configuration file contains cache configurations for a single cache container, like AS6, the Infinispan subsystem configuration defines multiple cache containers, each identified by a name. As with AS6, cache containers can have 1 or more aliases.

Being concise


The Infinispan subsystem's configuration schema attempts to be more concise than the equivalent configuration in AS6. This is a direct result of the following changes:

Where is <global/>?

Much of the global configuration contains references to other AS services. In AS7, these services are auto-injected behind the scenes. This includes things like thread pools (described below), the JGroups transport (also described below), and the mbean server.

Configuration default values

AS7 supplies a set of custom default values for various configuration properties. These defaults differ depending on the cache mode. The complete set of default values can be found here:

File-based cache store

Because clustering services use the file-based cache store frequently, we've simplified its definition. First, by using a distinctive element, you no longer need to specify the class name. The location of the store is defined by 2 attributes:
<file-store relative-to="..." path="..."/>
The relative-to attribute defines a named path, and defaults to the server's data directory; whereas the path attribute specifies the directory within relative-to, and defaults to the cache container name.
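
The defaulting rules for the two attributes can be sketched in a few lines of Java. This is an illustration rather than AS7 code: FileStoreLocation is a hypothetical helper, and it assumes the named path given by relative-to has already been resolved to a directory.

```java
import java.nio.file.Path;

// Hypothetical helper illustrating the <file-store/> defaults; not AS7 code.
class FileStoreLocation {
    // relativeTo and path may be null, meaning the attribute was not specified.
    static Path resolve(Path serverDataDir, String containerName,
                        Path relativeTo, String path) {
        // relative-to defaults to the server's data directory
        Path base = (relativeTo != null) ? relativeTo : serverDataDir;
        // path defaults to the cache container name
        String dir = (path != null) ? path : containerName;
        return base.resolve(dir);
    }
}
```

So a bare <file-store/> in the "web" cache container would, under these assumptions, end up in a "web" directory inside the server's data directory.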

Specifying cache mode

Instead of defining the cache mode via a separate <clustering mode="..."/> attribute, each cache mode uses its own element, the child elements of which are specific to that cache mode. For example, rehashing properties are only available within the <distributed-cache/> element.

Where is <default/>?

The semantics of the default cache of a cache container are different in AS7 than in native Infinispan. In native Infinispan, the configuration within <default/> defines the cache returned by calls to CacheContainer.getCache(), while <namedCache/> entries inherit the configuration from the default cache.
In AS7, all caches defined in the Infinispan subsystem are named caches. The default-cache attribute identifies which named cache should be returned by calls to CacheContainer.getCache(). This lets you easily modify the default cache of a cache container, without having to worry about rearranging configuration property inheritance.

Specifying a transport


The Infinispan subsystem works with the JGroups subsystem to provide its JGroups channel. By default, cache containers use the default-stack as defined by the JGroups subsystem.
<subsystem xmlns="urn:jboss:domain:jgroups:1.0" default-stack="udp">
  <stack name="udp">
    <!-- ... -->
  </stack>
  <stack name="tcp">
    <!-- ... -->
  </stack>
</subsystem>
Changing the default stack for all clustering services is as simple as changing the default-stack attribute defined in the JGroups subsystem. An individual cache-container can opt to use a particular stack by specifying a stack attribute within its transport element.
e.g.
<cache-container name="web" default-cache="repl">
  <transport stack="tcp"/>
  <replicated-cache name="repl" mode="ASYNC" batching="true">
    <locking isolation="REPEATABLE_READ"/>
    <file-store/>
  </replicated-cache>
</cache-container>
JGroups channels are named using the cache container name.

Defining thread pools


Cache containers defined by the Infinispan subsystem can reference thread pools defined by the threading subsystem. Externalizing thread pools in this way has the additional advantage of letting you manage them via native JBoss AS management mechanisms, and allows you to share thread pools across cache containers.
e.g.
<cache-container name="web" default-cache="repl" listener-executor="infinispan-listener" eviction-executor="infinispan-eviction" replication-queue-executor="infinispan-repl-queue">
  <transport executor="infinispan-transport"/>
  <replicated-cache name="repl" mode="ASYNC" batching="true">
    <locking isolation="REPEATABLE_READ"/>
    <file-store/>
  </replicated-cache>
</cache-container>

<subsystem xmlns="urn:jboss:domain:threads:1.0">
  <thread-factory name="infinispan-factory" priority="1"/>
  <bounded-queue-thread-pool name="infinispan-transport">
    <core-threads count="1"/>
    <queue-length count="100000"/>
    <max-threads count="25"/>
    <thread-factory name="infinispan-factory"/>
  </bounded-queue-thread-pool>
  <bounded-queue-thread-pool name="infinispan-listener">
    <core-threads count="1"/>
    <queue-length count="100000"/>
    <max-threads count="1"/>
    <thread-factory name="infinispan-factory"/>
  </bounded-queue-thread-pool>
  <scheduled-thread-pool name="infinispan-eviction">
    <max-threads count="1"/>
    <thread-factory name="infinispan-factory"/>
  </scheduled-thread-pool>
  <scheduled-thread-pool name="infinispan-repl-queue">
    <max-threads count="1"/>
    <thread-factory name="infinispan-factory"/>
  </scheduled-thread-pool>
</subsystem>

Cache container lifecycle


During AS6 server startup, the CacheContainerRegistry service would create and start all cache containers defined within its infinispan-configs.xml file. Individual caches were started and stopped as needed. Lifecycle control of a cache was the complete responsibility of the application or service that used it.
Instead of a separate CacheContainerRegistry, AS7 uses the generic ServiceRegistry from the jboss-msc project (i.e. JBoss Modular Service Container). When AS7 starts, it creates on-demand services for each cache and cache container defined in the Infinispan subsystem. A service or deployment that needs to use a given cache or cache container simply adds a dependency on the relevant service name. When the service or deployment stops, dependent services are stopped as well, provided they are not still demanded by some other service or deployment. In this way, AS7 handles cache and cache container lifecycle for you.
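
The on-demand behaviour described above boils down to tracking who currently demands a service. The following is a simplified model in plain Java, not the jboss-msc API; OnDemandService and its method names are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified model of an on-demand service; hypothetical, not jboss-msc.
class OnDemandService {
    private final Set<String> dependents = new HashSet<>();
    private boolean started;

    // A service or deployment adds a dependency: start on first demand.
    void addDependent(String dependent) {
        dependents.add(dependent);
        started = true;
    }

    // A dependent stops: the service stops only when nobody demands it any more.
    void removeDependent(String dependent) {
        dependents.remove(dependent);
        if (dependents.isEmpty()) {
            started = false;
        }
    }

    boolean isStarted() {
        return started;
    }
}
```

If two deployments depend on the same cache, undeploying the first leaves the cache running; it stops only once the second is undeployed as well.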

There may be an occasion where you'd like a cache to start eagerly when the server starts, without requiring a dependency from some service or deployment. This can be achieved by using the start attribute of a cache.
e.g.
<cache-container name="cluster" default-cache="default">
  <alias>ha-partition</alias>
  <replicated-cache name="default" mode="SYNC" batching="true" start="EAGER">
    <locking isolation="REPEATABLE_READ"/>
  </replicated-cache>
</cache-container>

Using an Infinispan cache directly


AS7 adds the ability to inject an Infinispan cache into your application using standard JEE mechanisms. This is perhaps best explained by an example:
@ManagedBean
public class MyBean<K, V> {
  @Resource(lookup="java:jboss/infinispan/my-container-name")
  private org.infinispan.manager.CacheContainer container;
  private org.infinispan.Cache<K, V> cache;

  @PostConstruct
  public void start() {
    this.cache = this.container.getCache();
  }
}
That's it! No JBoss specific classes required - only standard JEE annotations. Pretty neat, no?

There's only one catch - due to the AS's use of modular classloading, Infinispan classes are not available to deployments by default. You need to explicitly tell the AS to import the Infinispan API into your application. This is most easily done by adding the following line to your application's META-INF/MANIFEST.MF:
Dependencies: org.infinispan export
So, how does it all work? If you recall, during server startup, the AS creates and registers an on-demand service for every Infinispan cache container defined in the Infinispan subsystem. For every cache container, the Infinispan subsystem also creates and registers a JNDI binding service that depends on the associated cache container service. When the AS deployer encounters the @Resource(lookup) annotation, it automatically adds a dependency to the application on the JNDI binding service associated with the specified JNDI name. In the case of the Infinispan JNDI binding, the binding itself already depends on the relevant Infinispan cache container service. The net effect is, your application will include a dependency on the requested cache container. Consequently, the cache container will automatically start on deploy, and stop (including all caches) on undeploy.

Sounds great! Where do I get it?


You can download the JBoss AS 7.0.0 Final release here:

User documentation can be found here:

And direct any questions to the user forums:

Keep a look out for the 7.0.1 release expected in the coming weeks, which contains a number of clustering fixes identified since the initial final release.

How can I contribute?


Here's the best place to start:

Wednesday, 20 July 2011

One last release candidate for Infinispan 5.0

Magic 8, baby!  Let's make this release candidate count.  5.0.0.CR8 is out, download it, provide feedback, we're moving fast towards a final release on Pagoa.

A bunch of bugs fixed over and above CR7, we'd love to hear what you have to say about this release.

Cheers
Manik

Tuesday, 19 July 2011

Infinispan 5.1 has a codename

The polls are in, and Infinispan 5.1, following the tradition of naming releases after quality beers around the world, will be codenamed Brahma, from Brazil.

Beer aside, Brahma is a continuation of the work started with Infinispan 5.0 Pagoa.  Some of the key features of Brahma include:
In addition to these big features, a number of smaller enhancements and improvements are also planned, including:
So as we come ever closer to releasing Pagoa in its final form, contributors have already started hacking on code for Brahma.  Expect to see alphas of Brahma hit the interwebs very soon!

Enjoy
Manik

Wednesday, 6 July 2011

On datagrids at JFS

Hi,

I'll be presenting on Infinispan and general datagrid concepts tomorrow, 7 July, at Java Forum Stuttgart. If you are around and interested in the subject, or just want to brainstorm about your datagrid use case, just come and say hi!

Cheers,
Mircea

Thursday, 30 June 2011

Another release candidate for 5.0

I've just released 5.0.0.CR7. Hopefully we won't see any more release candidates before 5.0.0.Final, so give this a good, hearty test and let us know what you think.

A number of important bugs are fixed, details are in JIRA, and you can download the release in the usual place.  Please provide feedback on the user forums.

Enjoy, and onward to 5.0.0.Final!
Manik

Friday, 24 June 2011

Infinispan @ jazoon

I've just returned from Jazoon, where I spoke about in-memory data grids/Infinispan and how they can be used to complement or even replace databases. There was a good and enthusiastic crowd, and the discussions ended late in the night - of course cooled down by cold Swiss beer :)
Infinispan was also present in Hardy Ferentschik's presentation about Hibernate OGM: the brand new JBoss project which exposes the grid (read: Infinispan) through Hibernate's API.

Thanks to the Jazoon organizers for an excellent conference and the chance to meet other enthusiasts from all over the world!

Cheers,
Mircea

Monday, 20 June 2011

Another week, another release candidate

Infinispan 5.0.0 codenamed Pagoa has yet another release candidate out for you to play with.  CR6 was released earlier today, please switch any tests you have on the 5.0 series to this latest release candidate and provide as much feedback as possible, as we get ever closer to 5.0.0.Final.

In addition to Pete's new grouping API, we also have some changes to the way class loading works, as well as the closure of a host of bugs reported against previous release candidates.

Please provide feedback using the forums, grab the release in the usual place, and report issues on JIRA.

Enjoy
Manik

The Grouping API

Infinispan 5 CR4 (and above) includes a new Grouping API. You can read more in the documentation, but I'll introduce it quickly for you here.

In some cases you may wish to co-locate a group of entries onto a particular node. In this case, the group API will be useful for you.

How does it work?

Infinispan allocates each node a portion of the total hash space. Normally, when you store an entry, Infinispan will take a hash of the key, and store the entry on the node which owns that portion of the hash space. Infinispan always uses an algorithm to locate a key in the hash space, never allowing the node on which the entry is stored to be specified manually. This scheme allows any node to know which node owns a key, without having to distribute such ownership information. This reduces the overhead of Infinispan, but more importantly improves redundancy as there is no need to replicate the ownership information in case of node failure.

If you use the grouping API, then Infinispan will ignore the hash of the key when deciding which node to store the entry on, and instead use a hash of the group. Infinispan still uses the hash of the key to store the entry on a node. When the group API is in use, it is important that every node can still compute, using an algorithm, the owner of every key. For this reason, the group cannot be specified manually. The group can either be intrinsic to the entry (generated by the key class) or extrinsic (generated by an external function).
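
As a rough illustration of the idea, the toy model below picks an owner by hashing the group when there is one, and the key otherwise. It is only a sketch: ToyOwnerLocator is a hypothetical class, and a simple modulo over the node count stands in for Infinispan's real consistent hash.

```java
import java.util.Objects;

// Toy model of owner selection; not Infinispan's actual hashing algorithm.
class ToyOwnerLocator {
    private final int numNodes;

    ToyOwnerLocator(int numNodes) {
        this.numNodes = numNodes;
    }

    // group == null means grouping is not in use for this key.
    int ownerOf(Object key, String group) {
        // With a group, the group's hash decides the owner, not the key's.
        Object hashed = (group != null) ? group : key;
        return Math.floorMod(Objects.hashCode(hashed), numNodes);
    }
}
```

Because the result depends only on the group, any node can compute that ownerOf("alice", "London") and ownerOf("bob", "London") are the same node, without exchanging ownership information.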

How can I use it?

If you can alter the key class, and the determination of the group is not an orthogonal concern to the key class, then you can simply annotate a method on the key class that will provide the group. For example:



class User {

    ...
    String office;
    ...

    public int hashCode() {
        // Defines the hash for the key, normally used to determine location
        ...
    }

    // Override the location by specifying a group, all keys in the same
    // group end up with the same owner
    @Group
    String getOffice() {
        return office;
    }

}



Of course, you need to make sure your algorithm for computing the group is consistent, and always returns the same group for a key!

Alternatively, if you can't modify the key class, or determination of the group is an orthogonal concern, you can externalise computation of the group to an "interceptor style" class, called a "Grouper". Let's take a look at an example of a Grouper:




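import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.infinispan.distribution.group.Grouper;

class KXGrouper implements Grouper<String> {

    // A pattern that can extract from a "kX" (e.g. k1, k2) style key
    static Pattern kPattern = Pattern.compile("(^k)(\\d)$");

    public String computeGroup(String key, String group) {
        Matcher matcher = kPattern.matcher(key);
        if (matcher.matches()) {
            // Group by the parity of the digit: "0" for even, "1" for odd
            String g = Integer.parseInt(matcher.group(2)) % 2 + "";
            return g;
        } else
            return null;
    }

    public Class<String> getKeyType() {
        return String.class;
    }

}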


Here, we've had to use a grouper, as we cannot modify the key class (String). Our group is still based upon the key, and established by extracting a part of the key.
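
To see which groups this produces, here is a stand-alone copy of the same grouping logic that runs without Infinispan on the classpath. KXGroups and groupOf are hypothetical names used only for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-alone version of the grouping logic, without the Grouper interface.
class KXGroups {
    // Matches "k" followed by exactly one digit, e.g. k1 or k2
    private static final Pattern K_PATTERN = Pattern.compile("(^k)(\\d)$");

    // Returns "0" or "1" depending on the digit's parity, or null if the
    // key does not follow the kX pattern.
    static String groupOf(String key) {
        Matcher matcher = K_PATTERN.matcher(key);
        if (matcher.matches()) {
            return String.valueOf(Integer.parseInt(matcher.group(2)) % 2);
        }
        return null;
    }
}
```

Keys k1 and k3 share group "1", while k2 and k4 share group "0". A key such as k10 yields no group, because the pattern only accepts a single digit.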

Of course, you need to enable grouping support in Infinispan, and configure any groupers. The reference documentation will help you here.

Friday, 17 June 2011

So you want JPA-like access to Infinispan?

Back in the early days of Infinispan (since our first public announcement, in fact) we always had it in mind to expose a JPA-like layer to Infinispan.  Initially this was as a replacement to the fine-grained replication that JBoss Cache's POJO Cache variant offered, but it grew beyond just a technique to do fine-grained replication on complex object graphs.  The fact that it offered a familiar data storage API to Java developers was big.  Huge, in fact.

So we realised JPA-on-Infinispan was firmly on the roadmap.  The original plan was to implement the entire set of JPA APIs from scratch, but this was a daunting and Herculean task.  After much discussion with core Hibernate architects and Infinispan contributors Emmanuel Bernard and Sanne Grinovero, we came to a decision that rather than implementing all this from scratch, it served both Infinispan and the community better to fork Hibernate's core ORM engine, and replace the relational database mappings with key/value store mappings.  And we get to reuse the mature codebase of Hibernate's session and transaction management, object graph dehydration code, proxies, etc.

And Hibernate OGM (Object-Grid Mapping) was born.  After initial experiments and even a large-scale public demo at the JBoss World 2011 Keynote, Emmanuel has officially blogged about the launch of Hibernate OGM.  Very exciting times, Infinispan now has a JPA-like layer.  :-)

To reiterate a key point from Emmanuel's blog, Hibernate OGM is still in its infancy.  It needs community participation to help it grow up and mature.  This is where the Infinispan community should step in; consider Hibernate OGM as Infinispan's JPA-like layer and get involved.  For more details, please read Emmanuel's announcement.

Enjoy!
Manik

Thursday, 9 June 2011

Keynote of the decade: behind the scenes, an Infinispan perspective

JBoss World 2011's much talked about keynote ended with a big bang - a live demo that many thought we were insane to even try and pull off - and has sparked off a lot of interest, many claiming JBoss has got its mojo back.  One of the things people keep asking is, what actually went on?  How did we build such a demo?  How can we do the same?

Firstly, if you did not attend the keynote or did not watch it online, I recommend that you stop reading this now, and go and watch the keynote. A recording is available online (the demo starts at about minute 35).

Ok, now that you've been primed, let's talk about the role Infinispan played in that demo.  The demo involved reading mass volumes of real-time data off a Twitter stream, and storing these tweets in an Infinispan grid. This primary grid (known as Grid-A) ran off 3 large rack-mount servers. The Infinispan nodes were standalone, bootstrapped off a simple Main class, and formed a cluster, running in asynchronous distributed mode with 2 data owners.

Andrew Sacamano did an excellent job of building an HTML 5-based webapp to visualise what goes on in such a grid, making use of cache listeners pushing events to browsers and browsers rendering the "spinning spheres" using HTML 5's canvas tag.  So now we could visualise data and data movement within a grid of Infinispan nodes.

As Twitter data started to populate the grid, we fired up a second grid (Grid-B) consisting of 8 nodes. Again, these nodes were configured using asynchronous distribution and 2 data owners, but this time these nodes were running on very small and cheap plugtop computers.  These plugtops - GuruPlugs - are constrained devices with 512MB of RAM and a 1GHz ARM processor.

Yes, your iPhone has more grunt :-) And yes, these sub-iPhone devices were running a real data grid!

The purpose of this was to demonstrate the extremely low footprint and overhead Infinispan imposes on your hardware (we even had to run the zero assembly port of OpenJDK, an interpreted-mode JVM, since the processor only had a 16-bit bus!). We also had a server running JBossAS running Andrew's cool visualisation webapp rendering the contents of Grid-B, so people could "see" the data in both grids.

We then fired up Drools to have it mine the contents of Grid-A and send it to Grid-B applying some rules to select the interesting tweets, namely the ones having the hashtag #JBW. With this in place, we then invited the audience to participate - by tweeting with hashtag #JBW, as well as the hashtag of your favourite JBoss project - e.g., #infinispan :-)  People were allowed to vote for more than one project, and the most prolific tweeter was to win a prize.  This started a frenzy of tweeting, and was reflected in the two grid visualisations.

Infinispan wasn't the only quick component here: needless to say, Drools was sending the tweets from Grid-A to Grid-B using HornetQ, the fastest JMS implementation on the planet.

Jay Balunas of RichFaces built a TwitterStream app with live updates of these tweets for various devices, including iPhones, iPads, Android phones and tablets, and of course, desktop web browsers, grabbing data off Grid-B.  Christian Sadilek and Mike Brock from the Errai team also built an application visualising popular tags as a tag cloud, again off Grid-B, making use of Errai to push events to the browser.

After staging an attempt by Mark Proctor to cheat the system with a script, we could recover the correct votes: we cleared Grid-B, updated the Drools rules to discard the cheat tweets, and had a cleaned-up stream of tweets flow to Grid-B.

All applications, including Drools and the visualizations, were using a JPA interface to store or load the tweets: it was powered by an early preview of Hibernate OGM, which aims to abstract any NoSQL store as a JPA persistence store while still providing some level of consistency. As Hibernate OGM is not feature complete, it was using Hibernate Search to provide query capabilities via a Lucene index, and using the Infinispan integration of Hibernate Search to distribute the index on Infinispan.

We then demonstrated failover, as we invited the winner to come up on stage to choose and brutally un-plug one of the plugtops from Grid-B - this plugtop became his prize. Importantly, the webapps running off the grid were never at risk of losing any data, Drools was still able to continue pulling stuff off Grid-A onto Grid-B, and the Lucene index could still be updated and queried by the remaining nodes.
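
Why does pulling one plug lose nothing? With 2 data owners, every entry lives on two different nodes, so any single node can vanish and a copy remains. The toy model below sketches that reasoning in plain Java. It is emphatically not Infinispan's real consistent hash: the class `TwoOwnerSketch`, the node names and the "hash modulo, then next node around the ring" placement are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Infinispan's actual consistent-hash implementation.
final class TwoOwnerSketch {

    // Pick numOwners distinct nodes for a key: a primary chosen by hash,
    // then the following nodes around the ring as backups.
    static List<String> ownersOf(String key, List<String> nodes, int numOwners) {
        int count = Math.min(numOwners, nodes.size());
        int start = Math.floorMod(key.hashCode(), nodes.size());
        List<String> owners = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            owners.add(nodes.get((start + i) % nodes.size()));
        }
        return owners;
    }

    // True if at least one owner of the key is a node other than the failed one.
    static boolean survivesLossOf(String failedNode, String key,
                                  List<String> nodes, int numOwners) {
        for (String owner : ownersOf(key, nodes, numOwners)) {
            if (!owner.equals(failedNode)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical names for the 8 plugtops of Grid-B.
        List<String> gridB = new ArrayList<>();
        for (int i = 1; i <= 8; i++) {
            gridB.add("plug-" + i);
        }
        int lost = 0;
        for (int t = 0; t < 10_000; t++) {
            if (!survivesLossOf("plug-3", "tweet-" + t, gridB, 2)) {
                lost++;
            }
        }
        System.out.println("Entries lost after unplugging plug-3: " + lost);
    }
}
```

Because the two owners of a key are always distinct nodes, the count of lost entries is zero whichever plugtop is chosen. Setting the owner count to 1 in the same sketch shows the contrast: every entry whose only owner was the unplugged node would be gone.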

From an Infinispan perspective, what did this demo make use of?
So a fairly simple setup, using simple embeddable components, cheap hardware, to build a fairly complex application with excellent failover and scalability properties.

So we were depending on wi-fi connectivity, internet access, a live tweet stream, technology previews and people's cooperation!

To make things more interesting, the day before the demo one of the servers died of a hardware failure: it didn't survive the trip. A second server, meant to serve the UI webapps, started reporting failures on all network interfaces just before the demo started: it could not figure out the hardware addresses of its cluster peers, and we had no time to replace it, since its backup was already dead. Interestingly enough, we could tap into some advanced parameters of the JGroups configuration to work around this issue.

Nothing was pre-recorded! Actually, the backup plan was to have Mark Little do a tap dance; next year we will try to stretch our demo even more, so you might see that dance!

So here you can see the recording of the event: http://www.jboss.org/jbw2011keynote or listen to the behind-the-scenes podcast.

After the demo, we did hear of a large commercial application using Infinispan and Drools in precisely this manner - except instead of Twitter, the large data stream was flight seat pricing, changing dynamically and constantly, and eventually rendered to web pages of various travel sites - oh, and they weren't running on plugtops in case you were thinking ;)  So, the example isn't completely artificial.

How do you use Infinispan?  We'd love for you to share stories with us.

Cheers
Manik and Sanne