Thursday, 22 December 2011
Startup performance
This is a perfectly valid use case: when Infinispan is used as a non-clustered 2nd level cache in Hibernate, a separate cache instance is created per entity type, and in the past this has become somewhat of a bottleneck.
In this test, I compared Infinispan 5.0.1.Final, 5.1.0.CR1 and 5.1.0.CR2. 5.1.0 is significantly quicker, but I used this test (and subsequent profiling) to commit a couple of interesting changes in 5.1.0.CR2, which has improved things even more - both in terms of CPU performance as well as memory footprint.
Essentially, 5.1.0.CR1 made use of Jandex to perform annotation scanning of internal components at build-time, to prevent expensive reflection calls to determine component dependencies and lifecycle at runtime. 5.1.0.CR2 takes this concept a step further: now we don't just cache annotation lookups at build-time, but entire dependency graphs. Determining and ordering lifecycle methods is done at build-time too, again making startup times significantly quicker while offering a much tighter memory footprint.
Enough talk. Here is the test used, and here are the performance numbers, run on my laptop, a 2010 MacBook Pro with an i5 CPU.
Multiverse:InfinispanStartupBenchmark manik [master]$ ./bench.sh
---- Starting benchmark ---
Please standby ...
Using Infinispan 5.0.1.FINAL (JMX enabled? false)
Created 2000 caches in 10.9 seconds and consumed 172.32 Mb of memory.
Using Infinispan 5.0.1.FINAL (JMX enabled? true)
Created 2000 caches in 56.18 seconds and consumed 315.21 Mb of memory.
Using Infinispan 5.1.0.CR1 (JMX enabled? false)
Created 2000 caches in 7.13 seconds and consumed 157.5 Mb of memory.
Using Infinispan 5.1.0.CR1 (JMX enabled? true)
Created 2000 caches in 34.9 seconds and consumed 243.33 Mb of memory.
Using Infinispan 5.1.0.CR2 (JMX enabled? false)
Created 2000 caches in 3.18 seconds and consumed 142.2 Mb of memory.
Using Infinispan 5.1.0.CR2 (JMX enabled? true)
Created 2000 caches in 17.62 seconds and consumed 176.13 Mb of memory.
A whopping 3.5 times faster, and significantly more memory-efficient especially when enabling JMX reporting. :-)
Enjoy!
Manik
Wednesday, 21 December 2011
Infinispan 5.1.0.CR2 is out in time for Xmas!
We've also worked on improving startup time by indexing annotation metadata at build time and reading it at runtime. From an Infinispan user perspective, there have been some changes to how Infinispan is extended, in particular related to custom command implementations, where we now use the JDK's ServiceLoader to load them.
As per usual, downloads are in the usual place, use the forums to provide feedback and report any issues.
Cheers, Merry Christmas and a Happy New Year to all the Infinispan community! :)
Galder
Tuesday, 6 December 2011
First Infinispan 5.1.0 'Brahma' candidate release is out!
- Ahead of future eventual consistency support, Infinispan now supports versioned cache entries which means that existing write skew checks on REPEATABLE_READ caches can be done more accurately.
- Support for CDI injection of remote caches! Thanks to the excellent work of community contributor Kevin Pollet, you can now inject remote caches as well as embedded caches using CDI. Detailed documentation on how to use it is available here. In the meantime, check out the CDI integration module in the Infinispan source code for examples.
Infinispan coming to the French Alps!
Looking forward to meeting Emmanuel and the rest of the Alpes JUG gang :)
Cheers,
Galder
Friday, 25 November 2011
Infinispan @Devoxx
Pete Muir, Sanne Grinovero and myself were also given the chance to speak. And we did take our time, with a three-hour deep dive into the Infinispan ecosystem, plenty of live demos and good discussions.
If you couldn't make it and you can't wait for the video to be published, don't worry: the demo is available online here. Give it a spin and let us know what you think!
Cheers,
Mircea
Wednesday, 23 November 2011
More on transaction performance: use1PcForAutoCommitTransactions
What's use1PcForAutoCommitTransactions all about?
Don't be scared by the name. use1PcForAutoCommitTransactions is a new feature (5.1.CR1) that does quite a cool thing: it increases your transactions' performance.
Let me explain.
Before Infinispan 5.1 you could access the cache both transactionally and non-transactionally. Naturally, non-transactional access is faster and offers fewer consistency guarantees. But we don't support mixed access in Infinispan 5.1, so what's to be done when you need the speed of non-transactional access and you are ready to trade some consistency guarantees for it?
Well, here is where use1PcForAutoCommitTransactions helps you. What it does is force an induced (autoCommit=true) transaction to commit in a single phase: only one RPC instead of two, as in the case of a full two-phase commit (2PC).
At what cost?
You might end up with inconsistent data if multiple transactions modify the same key concurrently. But if you know that's not the case, or you can live with it then use1PcForAutoCommitTransactions will help your performance considerably.
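As a sketch of how the flag can be enabled programmatically (assuming the 5.1 fluent ConfigurationBuilder API; treat this as illustrative, not a drop-in config):

```java
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.transaction.TransactionMode;

// Sketch: a transactional cache whose induced (auto-commit) transactions
// commit in a single phase instead of a full 2PC.
ConfigurationBuilder builder = new ConfigurationBuilder();
builder.transaction()
       .transactionMode(TransactionMode.TRANSACTIONAL)
       .autoCommit(true)
       .use1PcForAutoCommitTransactions(true);
```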
An example
Let's say you do a simple put operation outside the scope of a transaction:
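A minimal fragment (cache is assumed to be a transactional cache; no transaction is started by the caller):

```java
// With autoCommit enabled, Infinispan transparently wraps this call
// in an induced transaction - this is the case the sections below discuss.
cache.put("k", "v");
```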
Now let's see how this would behave if the cache has several different transaction configurations:
Not using 1PC...
The put will happen in two RPCs/steps: a prepare message is sent around and then a commit.
Using 1PC...
The put happens in one RPC as the prepare and the commit are merged. Better performance.
Not using autoCommit
An exception is thrown, as this is a transactional cache and invocations must happen within the scope of a transaction.
Enjoy!
Mircea
Tuesday, 22 November 2011
Infinispan 5.1.0.BETA5 is out!
- Locks acquired within a transaction are now reordered in order to avoid deadlocks. There's no new configuration required to take advantage of this feature. More information on how lock reordering works can be found here.
- One of the aims of Infinispan 5.1 'Brahma' series is to move away from JAXB and instead use Stax based XML parsing. Ahead of that, a new configuration API based on builders has been developed. Expect to hear more about it and examples on using the API in the next few days.
- The demo paths that were broken in 5.1.0.BETA4 have now been fixed.
- Some of the Infinispan jars in 5.1.0.BETA4 were showing duplicate classes. This was the result of an OSGI bundle generation bug, and so to avoid the issue 5.1.0.BETA5 OSGI bundle generation has been disabled. This functionality will be re-enabled once the issue has been fixed by the Maven Felix plugin.
Friday, 11 November 2011
Some worth mentioning improvements for pessimistic transactions
- a single RPC happens for acquiring the lock on a key, regardless of the number of invocations. So if you call cache.put(k,v) in a loop within the scope of the same transaction, there is only one remote call to the owner of k.
- if the key you want to lock/write maps to the local node then no remote locks are acquired. In other words, there won't be any RPCs for writing to a key that maps locally. This can be very powerful when used in conjunction with the KeyAffinityService, as it allows you to control the locality of your keys.
- during the two-phase commit (2PC), the prepare phase doesn't perform any RPCs: this optimisation is based on the fact that locks are already acquired on each write. This means that the number of RPCs during a transaction's lifespan is reduced by one.
- for some writes to the cache (e.g. cache.put(k,v)) two RPCs were performed: one to acquire the remote lock and one to fetch the previous value. The obvious optimisation in this case was to make a single RPC for both operations - which we do starting with 5.1.
Thursday, 10 November 2011
Fewer deadlocks, higher throughput
What I want to talk about, though, is a way to solve this problem. Quite a simple way: just force an order on your transaction writes and you're guaranteed not to deadlock. If both T1 and T2 write to a then b (lexicographical order), there won't be any deadlock. Ever.
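The idea can be sketched in plain Java (illustrative only - the hypothetical helper below just ensures every transaction visits its keys in the same natural order before writing):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class OrderedWrites {
    // Returns the keys in their natural (e.g. lexicographical) order, so any
    // two transactions writing an overlapping key set acquire locks in the
    // same sequence - the circular wait needed for a deadlock cannot form.
    public static <K extends Comparable<K>> List<K> lockOrder(List<K> keys) {
        List<K> ordered = new ArrayList<K>(keys);
        Collections.sort(ordered);
        return ordered;
    }
}
```

A transaction would then iterate lockOrder(keysToWrite) and perform its puts in that sequence.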
Wednesday, 9 November 2011
Single lock owner: an important step forward
How does it help me?
Short version: if you use transactions that concurrently write to the same keys, this improvement significantly increases your system's throughput.
Long version: if you're using Infinispan with transactions that modify the same key(s) concurrently, you can easily end up in a deadlock. A deadlock can also occur if two transactions modify the same key at the same time - which is both inefficient and counter-intuitive. Such a deadlock means that one transaction (or both) will eventually roll back, but also that the lock on the key is held for the duration of the lockAcquisitionTimeout config option (defaults to 10 seconds). These deadlocks reduce throughput significantly, as transaction threads are held inactive during the deadlock. On top of that, other transactions that want to operate on that key are also delayed, potentially resulting in a cascade effect.
What's the added performance penalty?
The only encountered performance penalty is during cluster topology changes. At that point the cluster needs to perform some additional computation (no RPC involved) to fail-over the acquired locks from previous to new owners.
Another noticeable aspect is that locks are now released asynchronously, after the transaction commits. This doesn't add any burden to the transaction duration, but it means that locks are held slightly longer. That's not something to be concerned about if you're not using transactions that compete for the same locks, though.
We plan to benchmark this feature using the Radargun benchmarking tool - we'll report back!
Want to know more?
You can read the single lock owner design wiki and/or follow the JIRA discussions.
More locking improvements in Infinispan 5.1.0.BETA4
- A hugely important lock acquisition improvement has been implemented that results in locks being acquired in only a single node in the cluster. This means that deadlocks as a result of multiple nodes updating the same key are no longer possible. Concurrent updates on a single key will now be queued in the node that 'owns' that key. For more info, please check the design wiki and keep an eye on this blog because Mircea Markus, who's the author of this enhancement, will be explaining it in more detail very shortly. Please note that you don't need to make any configuration or code changes to take advantage of this improvement.
- A bunch of classes and interfaces in the core/ module have been migrated to an api/ and commons/ module in order to reduce the size of the dependencies that the Hot Rod Java client had. As a result, there's been a change in the hierarchy of Cache and CacheContainer classes, with the introduction of BasicCache and BasicCacheContainer, which are parent classes of the existing Cache and CacheContainer classes respectively. What's important is that Hot Rod clients must now code against BasicCache and BasicCacheContainer rather than Cache and CacheContainer. So previous code that was written like this will no longer compile:
import org.infinispan.Cache;
import org.infinispan.manager.CacheContainer;
import org.infinispan.client.hotrod.RemoteCacheManager;
...
CacheContainer cacheContainer = new RemoteCacheManager();
Cache cache = cacheContainer.getCache();
Instead, if Hot Rod clients want to continue using interfaces higher up the hierarchy from the remote cache/container classes, they'll have to write:
import org.infinispan.BasicCache;
import org.infinispan.manager.BasicCacheContainer;
import org.infinispan.client.hotrod.RemoteCacheManager;
...
BasicCacheContainer cacheContainer = new RemoteCacheManager();
BasicCache cache = cacheContainer.getCache();
Previous code that interacted against the RemoteCache and RemoteCacheManager should work as it used to:
import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
...
RemoteCacheManager cacheContainer = new RemoteCacheManager();
RemoteCache cache = cacheContainer.getCache();
We apologise for any inconvenience caused, but we think that Hot Rod clients will hugely benefit from this vastly reduced set of dependencies.
Finally, a few words about the ZIP distribution file. In BETA4 we've added some cache store implementations that were missing from previous releases, such as the RemoteCacheStore that talks to Hot Rod servers, and we've added a brand new demo application that implements a near-caching pattern using JMS. Please be aware that this demo is just a simple prototype of how near caches could be built using Infinispan and HornetQ.
As always, please keep the feedback coming. You can download the release from here and you get further details on the issues addressed in the changelog.
Cheers,
Galder
Thursday, 27 October 2011
Infinispan 5.1.0.BETA3 is out with Atomic Map and Hot Rod improvements
Fine-grained Atomic Maps
Atomic Maps are special constructs that users can use to bundle data into the value side of a key/value pair. What's special about them is that when the map changes, only the changes or deltas of that map are transfered, which makes Atomic Maps very efficient from a replication perspective when individual elements are modified.
Up until Infinispan 5.1.0.BETA2, the other interesting characteristic of these Atomic Maps was the fact that Atomic Map locking and isolation happened at the level of the entire Atomic Map. So, if a single key/value pair in the Atomic Map was modified, the entire map was locked.
Starting with Infinispan 5.1.0.BETA3, thanks to Vladimir Blagojevic, Atomic Maps supporting fine-grained locking are available as well. What this means is that an Atomic Map's key/value pairs can be modified in parallel thanks to the ability to lock individual map entries as opposed to the entire map.
This will be particularly handy for heavy Atomic Map users such as JBoss Application Server 7 which uses Atomic Maps for maintaining HTTP sessions, and Hibernate OGM which decomposes entities into Atomic Maps.
Hot Rod server topology improvements
When we originally designed Hot Rod protocol version 1.0, we decided that whenever a distributed cache wanted to send information about the topology of the backend servers to the clients, we'd send the hash ids of each of these nodes. At the time, this seemed like a good idea, until virtual nodes were implemented...
With virtual nodes, each physical Hot Rod server can potentially represent tens, hundreds or even thousands of different virtual nodes. If we stuck with the original protocol, that would mean that we'd have to send each virtual node's hash id back to the client. So, for a cluster of 8 nodes, and 1000 virtual nodes, that'd be at least 80kb of hash ids being transfered back to the client, on top of tons of redundant information about a node's host and port, which is very inefficient.
So, after having some discussions, we decided to evolve the Hot Rod protocol to version 1.1 in order to address this issue. The end result is that now it's the responsibility of the Hot Rod client to generate the hash ids of each of the physical nodes. We do that by sticking to a general formula to generate a Hot Rod server's hash id which both the Hot Rod server and clients can implement.
This improvement has also lead to the significant decrease in memory consumption of the Hot Rod server because it does not need to cache those hash ids anymore.
So, if you are using Infinispan Hot Rod servers and in particular you're configuring virtual nodes, you should definitely upgrade your Hot Rod server and client libraries. From a client code perspective, no changes are necessary because starting with 5.1.0.BETA3, Hot Rod clients talk to servers using this latest protocol.
Finally, remember to use user forums to report back, grab the release here, enjoy and keep the feedback coming!!
Cheers,
Galder
Wednesday, 19 October 2011
Infinispan 5.1.0.BETA2 is out and asymmetric clusters are here!
Before asymmetric clusters were supported, it was required that all Infinispan caches that client code interacted with were defined and running on all nodes in the cluster; otherwise, cluster-wide messages for a cache that did not exist on a node would fail. So, imagine this scenario where c1 and c2 are user defined caches configured with replication:
Node A [c1]
Node B [c1, c2]
Without asymmetric clusters, whenever the c2 cache was modified in Node B, a replication message would be sent to Node A, but the replication would fail indicating that c2 was not defined in Node A. This failure would get propagated back to Node B, which would result in the modification failing. This kind of problem can particularly affect managed environments such as the JBoss Application Server because, often, deployments will be made in a subset of the cluster, so it could well happen that not all nodes have a particular cache started.
So, what Infinispan 5.1.0.BETA2 'Brahma' brings is support for this type of scenarios by maintaining a view correlating nodes and started caches, hence allowing any node to know which other nodes have a particular cache started. This means that in the above case, Node B would not have sent a replication message to Node A because it would know that c2 was only started in Node B.
The lack of support for asymmetric clusters is what forced Infinispan servers to only accept invocations for predefined caches, because these predefined caches could be started when the servers were started, hence avoiding the asymmetric cluster problem. Now that asymmetric clusters are supported, it's likely that this limitation will go away, but the timeline is yet to be defined.
This release also includes a bunch of other fixes and as always, please use the user forums to report back, grab the release here, enjoy and keep the feedback coming.
Cheers,
Galder
Monday, 17 October 2011
An understudy for Devoxx 2011
Enjoy
Manik
Tuesday, 4 October 2011
5.1.0.BETA1 "Brahma" is out with reworked transaction handling
For this first beta release, the transaction layer has been redesigned as explained by Mircea in this blog post. This is a very important step in the process of implementing some key locking improvements, so we're very excited about this! Thanks Mircea :)
There's a bunch of other little improvements, such as avoiding the use of thread locals for cache operations with flags. As a result, optimisations like the following are now viable:
AdvancedCache cache = ...
Cache forceWLCache = cache.withFlags(Flag.FORCE_WRITE_LOCK);
forceWLCache.get("voo");
forceWLCache.put("voo", "doo");
...
Previously each cache invocation would have required withFlags() to be called, but now you only need to do it once, and you can cache the "flagged" cache and reuse it.
Another interesting little improvement is available for JDBC cache store users. Basically, database tables can now be discovered within an implicit schema. So, if each user has a different schema, the tables will be created within their own space. This makes it easier to manage environments where the JDBC cache store is used by multiple caches at the same time because management is limited to adding a user per application, as opposed to adding a user plus prefixing table names. Thanks to Nicolas Filotto for bringing this up.
Please keep the feedback coming, and as always, you can download the release from here and you get further details on the issues addressed in the changelog.
Cheers,
Galder
Monday, 3 October 2011
Transaction remake in Infinispan 5.1
- starting with this release an Infinispan cache can be accessed either transactionally or non-transactionally. The mixed access mode is no longer supported (backward compatibility is still maintained, see below). There are several reasons for going down this path, but one of the most important results of this decision is cleaner semantics for how concurrency is managed between multiple requestors for the same cache entry.
- starting with 5.1 the supported transaction models are optimistic and pessimistic. The optimistic model is an improvement over the existing default transaction model: it completely defers lock acquisition to transaction prepare time, which reduces lock acquisition duration, increases throughput and also avoids deadlocks. With the pessimistic model, cluster-wide locks are acquired on each write and only released after the transaction has completed (see below).
Transactional or non transactional cache?
It's up to you as a user to decide whether you want to define a cache as transactional or not. By default, Infinispan caches are non-transactional. A cache can be made transactional by changing the transactionMode attribute:
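A sketch of what this looks like in the XML configuration (attribute names as per the 5.1 configuration schema; the enclosing namedCache element and its name are illustrative):

```xml
<namedCache name="transactionalCache">
   <transaction transactionMode="TRANSACTIONAL"/>
</namedCache>
```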
transactionMode can only take two values: TRANSACTIONAL and NON_TRANSACTIONAL. The same thing can also be achieved programmatically:
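For example, a minimal sketch assuming the 5.1 fluent ConfigurationBuilder API:

```java
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.transaction.TransactionMode;

// Sketch: declare the cache transactional programmatically.
ConfigurationBuilder builder = new ConfigurationBuilder();
builder.transaction().transactionMode(TransactionMode.TRANSACTIONAL);
```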
Important: for transactional caches it is required to configure a TransactionManagerLookup.
Backward compatibility
The autoCommit attribute was added in order to ensure backward compatibility. If a cache is transactional and autoCommit is enabled (defaults to true), then any call that is performed outside of a transaction's scope is transparently wrapped within a transaction. In other words, Infinispan adds the logic for starting a transaction before the call and committing it after the call.
So if your code accesses a cache both transactionally and non-transactionally, all you have to do when migrating to Infinispan 5.1 is mark the cache as transactional and enable autoCommit (that's actually enabled by default, so just don't disable it :)
The autoCommit feature can be managed through configuration:
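A configuration-file sketch (attribute names as per the 5.1 schema):

```xml
<transaction transactionMode="TRANSACTIONAL" autoCommit="true"/>
```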
or programmatically:
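A sketch, again assuming the 5.1 fluent ConfigurationBuilder API:

```java
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.transaction.TransactionMode;

// Sketch: transactional cache with auto-commit of non-transactional calls.
ConfigurationBuilder builder = new ConfigurationBuilder();
builder.transaction()
       .transactionMode(TransactionMode.TRANSACTIONAL)
       .autoCommit(true);
```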
Optimistic Transactions
With optimistic transactions, locks are acquired at transaction prepare time and are only held up to the point the transaction commits (or rolls back). This is different from the 5.0 default locking model, where local locks are acquired on writes and cluster locks are acquired during prepare time.
Optimistic transactions can be enabled in the configuration file:
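A configuration-file sketch (attribute names as per the 5.1 schema):

```xml
<transaction transactionMode="TRANSACTIONAL" lockingMode="OPTIMISTIC"/>
```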
or programmatically:
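A sketch assuming the 5.1 fluent ConfigurationBuilder API:

```java
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.transaction.LockingMode;
import org.infinispan.transaction.TransactionMode;

// Sketch: explicitly select the optimistic locking model.
ConfigurationBuilder builder = new ConfigurationBuilder();
builder.transaction()
       .transactionMode(TransactionMode.TRANSACTIONAL)
       .lockingMode(LockingMode.OPTIMISTIC);
```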
By default, a transactional cache is optimistic.
Pessimistic Transactions
From a lock acquisition perspective, pessimistic transactions obtain locks on keys at the time the key is written, e.g.:
When cache.put(k1,v1) returns, k1 is locked and no other transaction running anywhere in the cluster can write to it. Reading k1 is still possible. The lock on k1 is released when the transaction completes (commits or rolls back).
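A minimal fragment (tm is assumed to be the JTA TransactionManager in use, and k1, k2 are keys in a pessimistic transactional cache):

```java
tm.begin();
cache.put(k1, v1);   // cluster-wide lock on k1 is acquired here
cache.put(k2, v2);   // ...and on k2 here
tm.commit();         // all locks released once the transaction completes
```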
Pessimistic transactions can be enabled in the configuration file:
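A configuration-file sketch (attribute names as per the 5.1 schema):

```xml
<transaction transactionMode="TRANSACTIONAL" lockingMode="PESSIMISTIC"/>
```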
or programmatically:
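A sketch assuming the 5.1 fluent ConfigurationBuilder API:

```java
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.transaction.LockingMode;
import org.infinispan.transaction.TransactionMode;

// Sketch: explicitly select the pessimistic locking model.
ConfigurationBuilder builder = new ConfigurationBuilder();
builder.transaction()
       .transactionMode(TransactionMode.TRANSACTIONAL)
       .lockingMode(LockingMode.PESSIMISTIC);
```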
What do I need - pessimistic or optimistic transactions?
From a use case perspective, optimistic transactions should be used when there's not a lot of contention between multiple transactions running at the same time. That's because optimistic transactions roll back if data has changed between the time it was read and the time it was committed (writeSkewCheck).
On the other hand, pessimistic transactions might be a better fit when there is high contention on the keys and transaction rollbacks are less desirable. Pessimistic transactions are more costly by their nature: each write operation potentially involves an RPC for lock acquisition.
The path ahead
This major transaction rework has opened the way for several other transaction related improvements:
- Single node locking model is a major step forward in avoiding deadlocks and increasing throughput by only acquiring locks on a single node in the cluster, regardless of the number of redundant copies (numOwners) on which data is replicated
- Lock acquisition reordering is a deadlock avoidance technique that will be used for optimistic transactions
- Incremental locking is another technique for minimising deadlocks.
Stay tuned!
Mircea
Tuesday, 27 September 2011
Catch me if you can... at Soft Shake or JUDCon!
Soft Shake is an IT conference being held in Geneva on October 3rd and 4th and I'll be speaking about data grids and data caching with Infinispan on the 3rd. In fact, I'll be speaking twice! At 1pm you'll see me doing an introduction to data grids and data caching, and at 5pm I'll be delving into the data grid vs database debate.
So, if any of these topics interest you, come and join us at Soft Shake! It's gonna be fun :)
And that is not all! I'll be one of the Infinispan team members speaking in JUDCon London at the end of October. The agenda is now live and you'll see me talking about near caching on the 31st of October. Don't miss it!
Cheers,
Galder
Wednesday, 21 September 2011
Next Infinispan 5.1.0 alpha hits the streets!
On top of that, this Infinispan release is the first one to integrate JGroups 3.0, which brings plenty of API changes that simplify a lot of the Infinispan/JGroups interaction. If you want to find out more about the new JGroups version, make sure you check Bela's blog and the brand new JGroups manual.
Please keep the feedback coming, and as always, you can download the release from here and you get further details on the issues addressed in the changelog.
Cheers,
Galder
When Infinispan meets CDI
- full integration with Java EE 6
- typesafe cache (and cache manager) injection
- JCACHE annotations support
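As a taste of the typesafe injection, here's a minimal sketch (the class and the default cache's key/value types are illustrative assumptions):

```java
import javax.inject.Inject;
import org.infinispan.Cache;

public class GreetingService {

   // The default cache is injected by the Infinispan CDI extension.
   @Inject
   private Cache<String, String> cache;

   public String greet(String user) {
      String greeting = cache.get(user);
      if (greeting == null) {
         greeting = "Hello " + user;
         cache.put(user, greeting);
      }
      return greeting;
   }
}
```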
Please note that the cache injection is typed. In this case, only String-typed Java objects can be added as keys and values.
It's also possible to configure the injected cache using CDI. The first step is to create a CDI qualifier, and then create the cache configuration producer, annotated with @ConfigureCache. The qualifier is used to qualify the injection point and the cache configuration producer:
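A sketch of both pieces, modelled on the extension's documented usage (the qualifier and cache names are illustrative; @ConfigureCache comes from the Infinispan CDI extension, and the fluent Configuration API shown is the 5.x legacy one):

```java
// SmallCache.java - the CDI qualifier used at the injection point.
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import javax.inject.Qualifier;

@Qualifier
@Retention(RetentionPolicy.RUNTIME)
@Documented
public @interface SmallCache {}

// Config.java - the cache configuration producer, qualified with @SmallCache.
import javax.enterprise.inject.Produces;
import org.infinispan.cdi.ConfigureCache;
import org.infinispan.config.Configuration;

public class Config {
   @ConfigureCache("small-cache") // the name of the cache to configure
   @SmallCache
   @Produces
   public Configuration smallCacheConfiguration() {
      return new Configuration().fluent()
            .eviction().maxEntries(10)
            .build();
   }
}
```

An injection point can then be written as @Inject @SmallCache Cache<String, String> cache.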
In the same way, a cache can be defined with the default configuration of the cache manager in use, using a producer field:
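A sketch under the same assumptions (@LargeCache is another hypothetical qualifier; leaving the producer field null tells the extension to use the cache manager's default configuration for the named cache):

```java
import javax.enterprise.inject.Produces;
import org.infinispan.cdi.ConfigureCache;
import org.infinispan.config.Configuration;

public class Config {
   // No value assigned: the cache named "large-cache" is created with the
   // default configuration of the cache manager in use.
   @ConfigureCache("large-cache")
   @LargeCache
   @Produces
   public Configuration largeCacheConfiguration;
}
```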
One advantage of this approach is that all cache configurations of the entire application can be gathered together into a single Configuration class.
The Infinispan CDI extension provides a cache manager with a default configuration (and it is used by default). You can also override the default configuration (used by the default cache manager), as well as the default cache manager. You can find more information here.
JCache annotations support
JCache (aka JSR-107) is famous for being the oldest open JSR. However, this JSR has recently seen extensive progress, and is a candidate for inclusion in Java EE 7 (JSR-342).
This specification defines a standard caching API to work with a standalone cache, as well as a distributed cache. An interesting part of the specification are the annotations which are designed to solve common caching use cases. Some of the annotations defined by this specification are:
- @CacheResult caches the result of a method call
- @CachePut caches a method parameter
- @CacheRemoveEntry removes an entry from a cache
- @CacheRemoveAll removes all entries from a cache
The Infinispan CDI extension adds support for these annotations. The only thing to do is to enable the CDI interceptors in your application beans.xml - you can find more information here.
Infinispan CDI and JBoss AS 7
With JBoss AS 7, you can setup an Infinispan cache manager in the server configuration file. This allows you to externalize your Infinispan configuration and also to lookup the cache manager from JNDI, normally with the @Resource annotation. This post has more details on the subject.
As we mentioned earlier, you can override the default cache manager used by the Infinispan CDI extension. To use a JBoss AS 7 configured cache, you need to use the cache manager defined in JBoss AS 7. You only need to annotate the default cache manager producer with @Resource. Simple!
Now, you can inject the cache defined in JBoss AS 7 as we described earlier.
What's next?
Here is a highlight of the features you will see soon.
- support for all JSR 107 annotations - @CachePut, @CacheDefaults
- support for remote cache
- ability to listen to Infinispan events with CDI observers
- and more - let us know what you want ;-)
Feel free to open a topic in the Infinispan forum if you need help.
The Infinispan CDI documentation is here.
To see the Infinispan CDI extension in action you can browse and run the quickstart application here or watch this screencast.
Enjoy!
About the author
Kevin Pollet is a software engineer at SERLI a Consulting & Software Engineering company based in France. He's an Open Source advocate and contributes on many projects such as Infinispan and Hibernate Validator, both at SERLI and at home. He is also involved in the Poitou-Charentes JUG and has spoken in many JUG events. He enjoys attending Java events like JUDCon, JBoss World and Devoxx.
Tuesday, 13 September 2011
Infinispan 5.1.0.ALPHA1 released: Distributed Queries are here!
- Thanks to Israel Lacerra, Infinispan now supports fully distributed queries which allows queries to be parallelised across all nodes. Creating a distributed query is very easy, simply call SearchManager.getClusteredQuery. Please note that this feature is experimental and the API is likely to change before the final release.
- The Infinispan Query module now uses Hibernate Search 4.
- In Infinispan 5.0, we introduced the possibility of executing operations against a cache in such way that class loading could occur with a user-provided classloader. In this new release, we've extended the use case to allow data that's stored in binary format to be unmarshalled with the given classloader. This is particularly handy for those users that are deploying Infinispan in modular, or OSGI-like environments. For more information, check AdvancedCache.with(ClassLoader) API.
Monday, 12 September 2011
Infinispan 5.0.1.FINAL is out!
Please keep the feedback coming, and as always, you can download the release from here and you get further details on the issues addressed in the changelog.
Finally, we now have a documentation space fully dedicated to Infinispan 5.0. Make sure you check it out!
Cheers,
Galder
Thursday, 1 September 2011
JavaOne 2011 and Devoxx 2011
I have a conference session titled "A Tale About Caching (JSR 107) and Data Grids (JSR 347) in Enterprise Java" and a BoF session focused on JSR 347 titled "Making Java EE Cloud-Friendly: JSR 347, Data Grids for the Java Platform", which I will be delivering with fellow Infinispan developer, JBoss rockstar and overall nice guy Pete Muir.
Later on in the year, I will also be running a University talk at Devoxx in Antwerp, titled "A real-world deep-dive into Infinispan". This too will be with Pete and Mircea Markus, another core Infinispan developer.
This will be a great chance to learn more about Infinispan, data grids, JSR 107 and JSR 347, so if you are attending these conferences, make sure you add these talks to your agenda! :-)
Cheers
Manik
Wednesday, 10 August 2011
Transactions enhancements in 5.0
- transaction recovery is now supported, with a set of tools that allow state reconciliation in case the transaction fails during the 2nd phase of 2PC. This is especially useful in the case of transactions spanning Infinispan and another resource manager, e.g. a database (distributed transactions). You can find out more on how to enable and use transaction recovery here.
- Synchronization enlistment is another important feature in this release. This allows Infinispan to enlist in a transaction as a Synchronization rather than an XAResource. This enlistment allows the TransactionManager to optimize 2PC to a 1PC when only one other resource is enlisted with that transaction (last resource commit optimization). This is particularly important when using Infinispan as a 2nd level cache in Hibernate. You can read more about this feature here.
- besides that several bugs were fixed particularly when it comes to the integration with a transaction manager - BIG thanks to the community for reporting and testing them!
- as a fully fledged XAResource that supports recovery
- as an XAResource, but without recovery. This is the default configuration
- and as a Synchronization
- a non-transactional put is about 40% faster than a transactional one
- Synchronization-enlisted transactions outperform XAResource-enlisted ones by about 20%
- A recoverable cache has about the same performance as a non-recoverable cache when it comes to transactions.
Friday, 5 August 2011
Infinispan 5.0.0.FINAL has hit the streets!
Wednesday, 27 July 2011
Infinispan in JBoss AS7
Configuration
Unlike previous releases of JBoss AS, AS7 centralizes all server configuration into one location. This includes Infinispan cache configurations, which are defined by the Infinispan subsystem, within domain.xml or standalone.xml:
The complete schema for the Infinispan subsystem is included in the AS7 binary distribution:
If you are familiar with Infinispan's native configuration file format or the corresponding configuration file from AS6, you'll notice some obvious similarities, but some noteworthy differences.
While a native Infinispan configuration file contains the cache configurations for a single cache container, the Infinispan subsystem configuration, like AS6's, defines multiple cache containers, each identified by a name. As with AS6, cache containers can have one or more aliases.
Being concise
The Infinispan subsystem's configuration schema attempts to be more concise than the equivalent configuration in AS6. This is a direct result of the following changes:
Where is <global/>?
Much of the global configuration contains references to other AS services. In AS7, these services are auto-injected behind the scenes. This includes things like thread pools (described below), the JGroups transport (also described below), and the MBean server.
Configuration default values
AS7 supplies a set of custom default values for various configuration properties. These defaults differ depending on the cache mode. The complete set of default values can be found here:
File-based cache store
Because clustering services use the file-based cache store frequently, we've simplified its definition. First, by using a distinctive element, you no longer need to specify the class name. The location of the store is defined by two attributes:
<file-store relative-to="..." path="..."/>
Specifying cache mode
Instead of defining the cache mode via a separate <clustering mode="..."/> attribute, each cache mode uses its own element, the child elements of which are specific to that cache mode. For example, rehashing properties are only available within the <distributed-cache/> element.
Where is <default/>?
The semantics of the default cache of a cache container are different in AS7 than in native Infinispan. In native Infinispan, the configuration within <default/> defines the cache returned by calls to CacheContainer.getCache(), while <namedCache/> entries inherit the configuration from the default cache.
In AS7, all caches defined in the Infinispan subsystem are named caches. The default-cache attribute identifies which named cache should be returned by calls to CacheContainer.getCache(). This lets you easily modify the default cache of a cache container, without having to worry about rearranging configuration property inheritance.
Specifying a transport
The Infinispan subsystem relies on the JGroups subsystem to provide its JGroups channel. By default, cache containers use the default-stack as defined by the JGroups subsystem.
Changing the default stack for all clustering services is as simple as changing the default-stack attribute defined in the JGroups subsystem. An individual cache-container can opt to use a particular stack by specifying a stack attribute within its transport element.
<subsystem xmlns="urn:jboss:domain:jgroups:1.0" default-stack="udp">
    <stack name="udp"><!-- ... --></stack>
    <stack name="tcp"><!-- ... --></stack>
</subsystem>
JGroups channels are named using the cache container name.
<cache-container name="web" default-cache="repl">
    <transport stack="tcp"/>
    <replicated-cache name="repl" mode="ASYNC" batching="true">
        <locking isolation="REPEATABLE_READ"/>
        <file-store/>
    </replicated-cache>
</cache-container>
Defining thread pools
Cache containers defined by the Infinispan subsystem can reference thread pools defined by the threading subsystem. Externalizing thread pools in this way has the added advantage of letting you manage them via the native JBoss AS management mechanisms, and allows you to share thread pools across cache containers.
<cache-container name="web" default-cache="repl" listener-executor="infinispan-listener" eviction-executor="infinispan-eviction" replication-queue-executor="infinispan-repl-queue">
    <transport executor="infinispan-transport"/>
    <replicated-cache name="repl" mode="ASYNC" batching="true">
        <locking isolation="REPEATABLE_READ"/>
        <file-store/>
    </replicated-cache>
</cache-container>
<subsystem xmlns="urn:jboss:domain:threads:1.0">
    <thread-factory name="infinispan-factory" priority="1"/>
    <bounded-queue-thread-pool name="infinispan-transport">
        <core-threads count="1"/>
        <queue-length count="100000"/>
        <max-threads count="25"/>
        <thread-factory name="infinispan-factory"/>
    </bounded-queue-thread-pool>
    <bounded-queue-thread-pool name="infinispan-listener">
        <core-threads count="1"/>
        <queue-length count="100000"/>
        <max-threads count="1"/>
        <thread-factory name="infinispan-factory"/>
    </bounded-queue-thread-pool>
    <scheduled-thread-pool name="infinispan-eviction">
        <max-threads count="1"/>
        <thread-factory name="infinispan-factory"/>
    </scheduled-thread-pool>
    <scheduled-thread-pool name="infinispan-repl-queue">
        <max-threads count="1"/>
        <thread-factory name="infinispan-factory"/>
    </scheduled-thread-pool>
</subsystem>
Cache container lifecycle
During AS6 server startup, the CacheContainerRegistry service would create and start all cache containers defined within its infinispan-configs.xml file. Individual caches were started and stopped as needed. Lifecycle control of a cache was the complete responsibility of the application or service that used it.
Instead of a separate CacheContainerRegistry, AS7 uses the generic ServiceRegistry from the jboss-msc project (i.e. JBoss Modular Service Container). When AS7 starts, it creates on-demand services for each cache and cache container defined in the Infinispan subsystem. A service or deployment that needs to use a given cache or cache container simply adds a dependency on the relevant service name. When the service or deployment stops, dependent services are stopped as well, provided they are not still demanded by some other service or deployment. In this way, AS7 handles cache and cache container lifecycle for you.
There may be occasions where you'd like a cache to start eagerly when the server starts, without requiring a dependency from some service or deployment. This can be achieved by using the start attribute of a cache.
<cache-container name="cluster" default-cache="default">
    <alias>ha-partition</alias>
    <replicated-cache name="default" mode="SYNC" batching="true" start="EAGER">
        <locking isolation="REPEATABLE_READ"/>
    </replicated-cache>
</cache-container>
Using an Infinispan cache directly
AS7 adds the ability to inject an Infinispan cache into your application using standard JEE mechanisms. This is perhaps best explained by an example:
@ManagedBean
public class MyBean<K, V> {
    @Resource(lookup = "java:jboss/infinispan/my-container-name")
    private org.infinispan.manager.CacheContainer container;
    private org.infinispan.Cache<K, V> cache;

    @PostConstruct
    public void start() {
        this.cache = this.container.getCache();
    }
}
That's it! No JBoss specific classes required - only standard JEE annotations. Pretty neat, no?
There's only one catch - due to the AS's use of modular classloading, Infinispan classes are not available to deployments by default. You need to explicitly tell the AS to import the Infinispan API into your application. This is most easily done by adding the following line to your application's META-INF/MANIFEST.MF:
Dependencies: org.infinispan export
Sounds great! Where do I get it?
You can download the JBoss AS 7.0.0 Final release here:
User documentation can be found here:
And direct any questions to the user forums:
Keep a look out for the 7.0.1 release expected in the coming weeks, which contains a number of clustering fixes identified since the initial final release.
How can I contribute?
Here's the best place to start:
Wednesday, 20 July 2011
One last release candidate for Infinispan 5.0
A bunch of bugs have been fixed over and above CR7; we'd love to hear what you have to say about this release.
Cheers
Manik
Tuesday, 19 July 2011
Infinispan 5.1 has a codename
Beer aside, Brahma is a continuation of the work started with Infinispan 5.0 Pagoa. Some of the key features of Brahma include:
- Overhaul rehashing and state transfer. This codebase will be consolidated and significantly improved, starting with the PUSH based rehashing introduced in Pagoa. Chunking and parallel transfers will also be supported, which will improve the performance and robustness of rehashing/state transfer.
- Improved locking and JTA interactions, including deadlock-minimising reordering and true optimistic and pessimistic modes.
- Versioned entries and an eventually consistent mode and API. Infinispan has always leaned towards consistency in the CAP triangle at the expense of partition tolerance, in line with most Java Data Grids. However, we can very easily also support eventual consistency with partition tolerance, and in Brahma we intend to introduce the versioned API to support this.
- Distributed querying based on parallelising a query task across all nodes in the cluster should also make an appearance, an additional query mode to add to the Lucene index-based querying supported in Pagoa.
- Fine-grained AtomicHashMaps. Anyone using AtomicHashMaps - including Hibernate OGM - will love this!
- Top-level support for JSON documents, including fine-grained replication for deltas in JSON documents.
- Moving off JAXB for configuration parsing, as JAXB has proven too slow and cumbersome to deal with.
Wednesday, 6 July 2011
On datagrids at JFS
Thursday, 30 June 2011
Another release candidate for 5.0
A number of important bugs are fixed, details are in JIRA, and you can download the release in the usual place. Please provide feedback on the user forums.
Enjoy, and onward to 5.0.0.Final!
Manik
Friday, 24 June 2011
Infinispan @ jazoon
Monday, 20 June 2011
Another week, another release candidate
In addition to Pete's new grouping API, we also have some changes to the way class loading works, as well as the closure of a host of bugs reported against previous release candidates.
Please provide feedback using the forums, grab the release in the usual place, and report issues on JIRA.
Enjoy
Manik
The Grouping API
Infinispan 5 CR4 (and above) includes a new Grouping API. You can read more in the documentation, but I'll introduce it quickly for you here.
In some cases you may wish to co-locate a group of entries on a particular node; the grouping API makes this possible.
How does it work?
Infinispan allocates each node a portion of the total hash space. Normally, when you store an entry, Infinispan will take a hash of the key, and store the entry on the node which owns that portion of the hash space. Infinispan always uses an algorithm to locate a key in the hash space, never allowing the node on which the entry is stored to be specified manually. This scheme allows any node to know which node owns a key, without having to distribute such ownership information. This reduces the overhead of Infinispan, but more importantly improves redundancy as there is no need to replicate the ownership information in case of node failure.
If you use the grouping API, Infinispan ignores the hash of the key when deciding which node stores the entry, and instead uses a hash of the group; the hash of the key is still used everywhere else. When the group API is in use, it is important that every node can still compute, using an algorithm, the owner of every key. For this reason, the group cannot be specified manually. The group can either be intrinsic to the entry (generated by the key class) or extrinsic (generated by an external function).
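To make the idea concrete, here is a minimal, self-contained sketch - not Infinispan code - of owner selection by group. The pickOwner helper and node names are hypothetical; the point is simply that hashing the group (when present) instead of the key guarantees co-location:

```java
import java.util.List;

public class GroupHashDemo {

    // Hypothetical helper (not Infinispan API): picks an owner node by
    // hashing the group when one is supplied, otherwise the key itself.
    static String pickOwner(List<String> nodes, String key, String group) {
        String basis = (group != null) ? group : key;
        return nodes.get(Math.floorMod(basis.hashCode(), nodes.size()));
    }

    public static void main(String[] args) {
        List<String> nodes = List.of("nodeA", "nodeB", "nodeC");
        // Two different keys sharing the group "london" always co-locate,
        // even though their own hash codes differ.
        String owner1 = pickOwner(nodes, "user:1", "london");
        String owner2 = pickOwner(nodes, "user:2", "london");
        System.out.println(owner1.equals(owner2)); // prints "true"
    }
}
```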
How can I use it?
If you can alter the key class, and the determination of the group is not an orthogonal concern, you can simply annotate a method on the key class that will provide the group. For example:
class User {
...
String office;
...
int hashCode() {
// Defines the hash for the key, normally used to determine location
...
}
// Override the location by specifying a group, all keys in the same
// group end up with the same owner
@Group
String getOffice() {
return office;
}
}
Of course, you need to make sure your algorithm for computing the group is deterministic, always returning the same group for a given key!
Alternatively, if you can't modify the key class, or determination of the group is an orthogonal concern, you can externalise computation of the group to an "interceptor style" class called a Grouper. Let's take a look at an example of a Grouper:
class KXGrouper implements Grouper<String> {

   // A pattern that can extract from a "kX" (e.g. k1, k2) style key
   static Pattern kPattern = Pattern.compile("(^k)(\\d)$");

   public String computeGroup(String key, String group) {
      Matcher matcher = kPattern.matcher(key);
      if (matcher.matches()) {
         String g = Integer.parseInt(matcher.group(2)) % 2 + "";
         return g;
      } else {
         return null;
      }
   }

   public Class<String> getKeyType() {
      return String.class;
   }
}
Here, we've had to use a grouper, as we cannot modify the key class (String). Our group is still based upon the key, and established by extracting a part of the key.
Of course, you need to enable grouping support in Infinispan, and configure any groupers. The reference documentation will help you here.
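For illustration, a configuration sketch along these lines enables grouping and registers a grouper. The element names are assumptions based on the 5.x schema (and KXGrouper's package is hypothetical), so verify them against the reference documentation:

```xml
<!-- Sketch only: enable grouping and register a custom Grouper -->
<clustering mode="distribution">
   <hash>
      <groups enabled="true">
         <grouper class="com.acme.KXGrouper"/>
      </groups>
   </hash>
</clustering>
```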
Friday, 17 June 2011
So you want JPA-like access to Infinispan?
So we realised JPA-on-Infinispan was firmly on the roadmap. The original plan was to implement the entire set of JPA APIs from scratch, but this was a daunting and Herculean task. After much discussion with core Hibernate architects and Infinispan contributors Emmanuel Bernard and Sanne Grinovero, we decided that rather than implementing all of this from scratch, it would serve both Infinispan and the community better to fork Hibernate's core ORM engine and replace the relational database mappings with key/value store mappings. This way we get to reuse the mature codebase of Hibernate's session and transaction management, object graph dehydration code, proxies, etc.
And Hibernate OGM (Object-Grid Mapping) was born. After initial experiments and even a large-scale public demo at the JBoss World 2011 Keynote, Emmanuel has officially blogged about the launch of Hibernate OGM. Very exciting times, Infinispan now has a JPA-like layer. :-)
To reiterate a key point from Emmanuel's blog, Hibernate OGM is still in its infancy. It needs community participation to help it grow up and mature. This is where the Infinispan community should step in; consider Hibernate OGM as Infinispan's JPA-like layer and get involved. For more details, please read Emmanuel's announcement.
Enjoy!
Manik
Thursday, 9 June 2011
Keynote of the decade: behind the scenes, an Infinispan perspective
Yes, your iPhone has more grunt :-) And yes, these sub-iPhone devices were running a real data grid!
The purpose of this was to demonstrate the extremely low footprint and overhead Infinispan imposes on your hardware (we even had to run the Zero assembler port of OpenJDK, an interpreted-mode JVM, since the processor only had a 16-bit bus!). We also had a server running JBossAS running Andrew's cool visualisation webapp rendering the contents of Grid-B, so people could "see" the data in both grids.
- Data distribution, numOwners = 2
- Async network communication via JGroups
- JTA integration with JBossTS
- Cache listeners to notify applications of changes in data and topology
- The Infinispan Lucene Directory distributing the Lucene index on the grid
So here you can see the recording of the event: http://www.jboss.org/jbw2011keynote or listen to the behind the scenes podcast.