Thursday 27 October 2011

Infinispan 5.1.0.BETA3 is out with Atomic Map and Hot Rod improvements

I'm very proud to announce yet another beta release in the 5.1 'Brahma' series. This time it's the turn of Infinispan 5.1.0.BETA3, which, apart from containing many small fixes, comes with two major improvements:

Fine-grained Atomic Maps

Atomic Maps are special constructs that users can use to bundle data into the value side of a key/value pair. What's special about them is that when the map changes, only the changes or deltas of that map are transferred, which makes Atomic Maps very efficient from a replication perspective when individual elements are modified.

Up until Infinispan 5.1.0.BETA2, the other interesting characteristic of these Atomic Maps was that locking and isolation happened at the level of the entire Atomic Map. So, if a single key/value pair in the Atomic Map was modified, the entire map was locked.

Starting with Infinispan 5.1.0.BETA3, thanks to Vladimir Blagojevic, Atomic Maps supporting fine-grained locking are available as well. What this means is that an Atomic Map's key/value pairs can be modified in parallel thanks to the ability to lock individual map entries as opposed to the entire map.
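
To give an idea of the API, here's a minimal sketch. It assumes the AtomicMapLookup helper in org.infinispan.atomic exposes a fine-grained variant alongside the existing getAtomicMap() method, and that cache is an already running, transactional Cache; treat the exact names as assumptions rather than confirmed API:

import org.infinispan.Cache;
import org.infinispan.atomic.AtomicMapLookup;
import org.infinispan.atomic.FineGrainedAtomicMap;

// cache is assumed to be a running, transactional Cache<String, Object>
// Obtain (or create) a fine-grained Atomic Map stored under the key "session-123"
FineGrainedAtomicMap<String, String> sessionData =
      AtomicMapLookup.getFineGrainedAtomicMap(cache, "session-123");

// Concurrent transactions can now update different entries of the same Atomic Map
// without blocking each other, because locks are acquired per map entry
sessionData.put("lastAccessed", "2011-10-27");
sessionData.put("cartSize", "3");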

This will be particularly handy for heavy Atomic Map users such as JBoss Application Server 7, which uses Atomic Maps for maintaining HTTP sessions, and Hibernate OGM, which decomposes entities into Atomic Maps.

Hot Rod server topology improvements

When we originally designed Hot Rod protocol version 1.0, we decided that whenever a distributed cache wanted to send information about the topology of the backend servers to the clients, we'd send the hash ids of each of these nodes. At the time, this seemed like a good idea, until virtual nodes were implemented...

With virtual nodes, each physical Hot Rod server can potentially represent tens, hundreds or even thousands of different virtual nodes. If we had stuck with the original protocol, we'd have had to send each virtual node's hash id back to the client. So, for a cluster of 8 nodes and 1000 virtual nodes, that'd be at least 80kb of hash ids being transferred back to the client, on top of tons of redundant information about each node's host and port, which is very inefficient.

So, after having some discussions, we decided to evolve the Hot Rod protocol to version 1.1 in order to address this issue. The end result is that it's now the responsibility of the Hot Rod client to generate the hash ids of each of the physical nodes. We do that by sticking to a general formula for generating a Hot Rod server's hash id, which both the Hot Rod server and clients can implement.

This improvement has also led to a significant decrease in the memory consumption of the Hot Rod server because it no longer needs to cache those hash ids.

So, if you are using Infinispan Hot Rod servers, and in particular if you're configuring virtual nodes, you should definitely upgrade your Hot Rod server and client libraries. From a client code perspective, no changes are necessary because, starting with 5.1.0.BETA3, Hot Rod clients talk to servers using this latest protocol.

Finally, remember to use user forums to report back, grab the release here, enjoy and keep the feedback coming!!

Cheers,
Galder

Wednesday 19 October 2011

Infinispan 5.1.0.BETA2 is out and asymmetric clusters are here!

The latest beta of the 5.1 'Brahma' series, 5.1.0.BETA2, is out now and thanks to Dan Berindei, it comes with support for asymmetric clusters which has been highly demanded.

Before asymmetric clusters were supported, all Infinispan caches that client code interacted with had to be defined and running in all nodes in the cluster; otherwise, cluster-wide messages for a cache that did not exist in a node would fail. So, imagine this scenario where c1 and c2 are user-defined caches configured with replication:

Node A [c1]
Node B [c1, c2] 

Without asymmetric clusters, whenever the c2 cache was modified in Node B, a replication message would be sent to Node A, but the replication would fail indicating that c2 was not defined in Node A. This failure would get propagated back to Node B, which would result in the modification failing. This kind of problem can particularly affect managed environments such as the JBoss Application Server because, often, deployments are made on a subset of the cluster, so it could well happen that not all nodes have a particular cache started.

So, what Infinispan 5.1.0.BETA2 'Brahma' brings is support for this type of scenario by maintaining a view correlating nodes and started caches, hence allowing any node to know which other nodes have a particular cache started. This means that in the above case, Node B would not send a replication message to Node A because it would know that c2 was only started in Node B.

The lack of support for asymmetric clusters is what forced Infinispan servers to only accept invocations for predefined caches, because these predefined caches could be started when the servers were started, hence avoiding the asymmetric cluster problem. Now that asymmetric clusters are supported, it's likely that this limitation will go away, but the timeline is yet to be defined.

This release also includes a bunch of other fixes and as always, please use the user forums to report back, grab the release here, enjoy and keep the feedback coming.

Cheers,
Galder

Monday 17 October 2011

An understudy for Devoxx 2011

I won't be able to make it to Devoxx this year, but worry not, the University talk and hands-on deep-dive on Infinispan will still go on.  Pete and Mircea will be joined by Sanne Grinovero - maintainer of Infinispan's querying capabilities, Lucene and Hibernate hacker, and committer on Hibernate OGM.  Now I wish I was attending, as an audience member!  :-)

Enjoy
Manik

Tuesday 4 October 2011

5.1.0.BETA1 "Brahma" is out with reworked transaction handling

It's been a frantic couple of weeks at chez Infinispan with loads of hacking, presentation preparations, team meetings, etc., and we're now proud to release Infinispan 5.1.0.BETA1 "Brahma".

For this first beta release, the transaction layer has been redesigned as explained by Mircea in this blog post. This is a very important step in the process of implementing some key locking improvements, so we're very excited about this! Thanks Mircea :)

There's a bunch of other little improvements, such as avoiding the use of thread locals for cache operations with flags. As a result, optimisations like the following are now viable:
AdvancedCache cache = ...
Cache forceWLCache = cache.withFlags(Flag.FORCE_WRITE_LOCK);
forceWLCache.get("voo");
forceWLCache.put("voo", "doo");
...
Previously each cache invocation would have required withFlags() to be called, but now you only need to do it once and you can cache the "flagged" cache and reuse it.

Another interesting little improvement is available for JDBC cache store users. Basically, database tables can now be discovered within an implicit schema. So, if each user has a different schema, the tables will be created within their own space. This makes it easier to manage environments where the JDBC cache store is used by multiple caches at the same time because management is limited to adding a user per application, as opposed to adding a user plus prefixing table names. Thanks to Nicolas Filotto for bringing this up.

Please keep the feedback coming, and as always, you can download the release from here and you get further details on the issues addressed in the changelog.

Cheers,
Galder

Monday 3 October 2011

Transaction remake in Infinispan 5.1

If you ever used Infinispan in a transactional way you might be very interested in this article, as it describes some very significant improvements in version 5.1 "Brahma" (released with 5.1.0.BETA1):
  • starting with this release an Infinispan cache can be accessed either transactionally or non-transactionally. The mixed access mode is no longer supported (backward compatibility is still maintained, see below). There are several reasons for going down this path, but one of the most important results of this decision is cleaner semantics for how concurrency is managed between multiple requestors for the same cache entry.

  • starting with 5.1 the supported transaction models are optimistic and pessimistic. The optimistic model improves on the existing default transaction model by completely deferring lock acquisition to transaction prepare time. That reduces the time for which locks are held, increases throughput and also helps avoid deadlocks. With the pessimistic model, cluster-wide locks are acquired on each write and only released after the transaction has completed (see below).


Transactional or non transactional cache?


It's up to you as a user to decide whether you want to define a cache as transactional or not. By default, Infinispan caches are non-transactional. A cache can be made transactional by changing the transactionMode attribute in its configuration.

transactionMode can only take two values: TRANSACTIONAL and NON_TRANSACTIONAL. The same thing can also be achieved programmatically:
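(A sketch assuming the fluent ConfigurationBuilder API from the 5.1 series; the exact builder and method names may differ slightly in this beta.)
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.transaction.TransactionMode;
import org.infinispan.transaction.lookup.GenericTransactionManagerLookup;

ConfigurationBuilder builder = new ConfigurationBuilder();
builder.transaction()
   .transactionMode(TransactionMode.TRANSACTIONAL)
   // a TransactionManagerLookup is required for transactional caches (see below)
   .transactionManagerLookup(new GenericTransactionManagerLookup());
Configuration config = builder.build();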

Important: for transactional caches it is required to configure a TransactionManagerLookup.

Backward compatibility


The autoCommit attribute was added in order to ensure backward compatibility. If a cache is transactional and autoCommit is enabled (it defaults to true), then any call that is performed outside of a transaction's scope is transparently wrapped within a transaction. In other words, Infinispan adds the logic for starting a transaction before the call and committing it after the call.

So if your code accesses a cache both transactionally and non-transactionally, all you have to do when migrating to Infinispan 5.1 is mark the cache as transactional and enable autoCommit (that's actually enabled by default, so just don't disable it :)

The autoCommit feature can be managed through the configuration file or programmatically:
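(Again a sketch with the assumed ConfigurationBuilder API; autoCommit defaults to true, so this only shows how to switch it off explicitly.)
// builder is the ConfigurationBuilder from the earlier snippet
builder.transaction()
   .transactionMode(TransactionMode.TRANSACTIONAL)
   .autoCommit(false); // disable the transparent transaction wrapping described above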


Optimistic Transactions


With optimistic transactions, locks are acquired at transaction prepare time and are only held up to the point the transaction commits (or rolls back). This differs from the 5.0 default locking model, where local locks are acquired on writes and cluster-wide locks are acquired at prepare time.

Optimistic transactions can be enabled in the configuration file or programmatically:
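(Sketched with the same assumed builder API; LockingMode is assumed to be the enum that selects between the two transaction models.)
import org.infinispan.transaction.LockingMode;

// builder is the ConfigurationBuilder from the earlier snippets
builder.transaction()
   .transactionMode(TransactionMode.TRANSACTIONAL)
   .lockingMode(LockingMode.OPTIMISTIC);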

By default, a transactional cache is optimistic.

Pessimistic Transactions


From a lock acquisition perspective, pessimistic transactions obtain locks on keys at the time the key is written. E.g.
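(A rough sketch; it assumes the JTA TransactionManager is obtained from the cache's AdvancedCache, and k1/v1 stand in for an arbitrary key and value.)
import javax.transaction.TransactionManager;

TransactionManager tm = cache.getAdvancedCache().getTransactionManager();
tm.begin();
cache.put(k1, v1); // k1 is locked cluster-wide as soon as this call returns
// ... further operations within the same transaction ...
tm.commit();       // the lock on k1 is released here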

When cache.put(k1,v1) returns, k1 is locked and no other transaction running anywhere in the cluster can write to it. Reading k1 is still possible. The lock on k1 is released when the transaction completes (commits or rolls back).

Pessimistic transactions can be enabled in the configuration file or programmatically:
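(Same assumed builder API as above.)
// builder is the ConfigurationBuilder from the earlier snippets
builder.transaction()
   .transactionMode(TransactionMode.TRANSACTIONAL)
   .lockingMode(LockingMode.PESSIMISTIC);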


What do I need - pessimistic or optimistic transactions?


From a use case perspective, optimistic transactions should be used when there's not a lot of contention between multiple transactions running at the same time. That is because optimistic transactions roll back if data has changed between the time it was read and the time it was committed (writeSkewCheck).

On the other hand, pessimistic transactions might be a better fit when there is high contention on the keys and transaction rollbacks are less desirable. Pessimistic transactions are more costly by their nature: each write operation potentially involves an RPC for lock acquisition.
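
As an aside, the writeSkewCheck mentioned above is configured on the locking side; a hedged sketch of enabling it with the same assumed builder API (it requires REPEATABLE_READ isolation) could look like this:
import org.infinispan.util.concurrent.IsolationLevel;

// builder is the ConfigurationBuilder from the earlier snippets
builder.locking()
   .isolationLevel(IsolationLevel.REPEATABLE_READ)
   .writeSkewCheck(true); // roll back optimistic transactions whose read values changed before commit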

The path ahead


This major transaction rework has opened the way for several other transaction related improvements:

  • The single node locking model is a major step forward in avoiding deadlocks and increasing throughput by only acquiring locks on a single node in the cluster, regardless of the number of redundant copies (numOwners) on which data is replicated

  • Lock acquisition reordering is a deadlock avoidance technique that will be used for optimistic transactions

  • Incremental locking is another technique for minimising deadlocks.




Stay tuned!
Mircea