So a lot of folks have asked me for a downloadable slide deck from my recent JUG presentations on Infinispan. I've gone a step further and have recorded a short 5 minute intro to data grids and Infinispan as a podcast.
Enjoy!
Manik
Friday, 28 August 2009
Tuesday, 25 August 2009
First beta now available!
So today I've finally cut the much-awaited Infinispan 4.0.0.BETA1. Codenamed Starobrno - after the Czech beer that was omnipresent during early planning sessions of Infinispan - Beta1 is finally feature-complete. This is also the first release where distribution is complete, with rehashing on joins and leaves implemented as well. In addition, a number of bugs reported on previous alpha releases have been tended to.
This is a hugely important release for Infinispan. No more features will be added to 4.0.0, and all efforts will now focus on stability, performance and squashing bugs. And for this we need your help! Download, try out, feedback. And you will be rewarded with a rock-solid, lightning-fast final release that you can depend on.
Some things that have changed since the alphas include a better mechanism of naming caches and overriding configurations, and a new configuration XML reference guide. Don't forget the 5-minute guide for the impatient, and the interactive tutorial to get you started as well.
There are a lot of folk to thank - way too many to list here, but you all know who you are. For a full set of release notes, visit this JIRA page.
Download it, try it out, and send us as much feedback as possible. We'd love to hear from you!
Enjoy
Manik
Friday, 21 August 2009
Distribution instead of Buddy Replication
People have often commented on Buddy Replication (from JBoss Cache) not being available in Infinispan, and have asked how Infinispan's far superior distribution mode works. I've decided to write this article to discuss the main differences at a high level. For deeper technical details, please visit the Infinispan wiki.
Scalability versus high availability
These two concepts are often at odds with one another, even though they are commonly lumped together. What is usually good for scalability isn't always good for high availability, and vice versa. When it comes to clustering servers, high availability often means simply maintaining more copies, so that if nodes fail - and with commodity hardware, this is expected - state is not lost. An extreme case of this is replicated mode, available in both JBoss Cache and Infinispan, where each node is a clone of its neighbour. This provides very high availability, but unfortunately, this does not scale well. Assume you have 2GB per node. Discounting overhead, with replicated mode, you can only address 2GB of space, regardless of how large the cluster is. Even if you had 100 nodes - seemingly 200GB of space! - you'd still only be able to address 2GB since each node maintains a redundant copy. Further, since every node needs a copy, a lot of network traffic is generated as the cluster size grows.
Enter Buddy Replication
Buddy Replication (BR) was originally devised as a solution to this scalability problem. BR does not replicate state to every other node in the cluster. Instead, it chooses a fixed number of 'backup' nodes and only replicates to these backups. The number of backups is configurable but, crucially, it remains fixed and does not grow with the cluster. BR improved scalability significantly and showed near-linear scalability with increasing cluster size. This means that as more nodes are added to a cluster, the available space grows linearly, as does the available computing power if measured in transactions per second.
But Buddy Replication doesn't help everybody!
BR was specifically designed around the HTTP session caching use-case for the JBoss Application Server, and heavily optimised accordingly. As a result, session affinity is mandated, and applications that do not use session affinity can be prone to a lot of data gravitation and 'thrashing' - data is moved back and forth across a cluster as different nodes attempt to claim 'ownership' of state. Of course this is not a problem with JBoss AS and HTTP session caching: session affinity is recommended, is available on most load-balancer hardware and software, is generally taken for granted, and is a well-understood and widely employed paradigm for web-based applications.
So we had to get better
Just solving the HTTP session caching use-case wasn't enough. A well-performing data grid needs to do better, and crucially, session affinity cannot be taken for granted. This was the primary reason for not porting BR to Infinispan: it is simply too restrictive. As such, Infinispan does not and will not support BR.
Distribution
Distribution is a new cache mode in Infinispan. It is also the default clustered mode - as opposed to replication, which isn't scalable. Distribution makes use of familiar data grid concepts such as consistent hashing, call proxying and local caching of remote lookups. The result is a design that scales well - a fixed number of replicas for each cache entry, just like BR - but with no requirement for session affinity.
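As an illustration only - a minimal sketch assuming the clustering, hash and l1 element names from the 4.0 configuration schema, and a made-up cache name - enabling distribution looks something like this:

<?xml version="1.0" encoding="UTF-8"?>
<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:infinispan:config:4.0">
   <namedCache name="distributedCache">
      <!-- Each entry is kept on a fixed number of owner nodes, regardless of cluster size -->
      <clustering mode="distribution">
         <hash numOwners="2"/>
         <!-- L1 caches remote lookups locally so repeated reads don't hit the network -->
         <l1 enabled="true" lifespan="600000"/>
      </clustering>
   </namedCache>
</infinispan>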
What about co-locating state?
Co-location of state - moving entries around as a single block - was automatic and implicit with BR. Since each node always picked a backup node for all of its state, one could visualize all of the state on a given node as a single block. Colocation was thus trivial and automatic: whatever you put on Node1 always stays together, even if Node1 eventually dies and the state is accessed on Node2. However, this also meant that state could not be evenly balanced across a cluster, since the data blocks are very coarse-grained.
With distribution, colocation is not implicit. Partly because consistent hashing determines where each cached entry resides, and partly because of Infinispan's finer-grained cache structure - key/value pairs rather than a tree structure - individual entries become the granularity of state blocks. This means state can be far better balanced across a cluster. However, it also means that certain optimizations which rely on co-location - such as keeping related entries close together - are a little trickier.
One approach to co-locating state is to use containers as values: for example, put all entries that should be colocated together into a HashMap, then store the HashMap in the cache. But this is a coarse-grained and ugly approach, and it means the entire HashMap has to be locked and serialized as a single atomic unit, which can be expensive if the map is large.
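As a rough sketch (the cache and key names here are made up purely for illustration):

// Coarse-grained co-location: related fields live inside a single HashMap value.
Map<String, String> address = new HashMap<String, String>();
address.put("street", "1 Main Street");
address.put("city", "London");
cache.put("person-123-address", address);

// Updating one field means re-reading and re-writing the whole map,
// which is locked and serialized as a single atomic unit.
Map<String, String> stored = (Map<String, String>) cache.get("person-123-address");
stored.put("city", "Paris");
cache.put("person-123-address", stored);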
Another approach is to use Infinispan's AtomicMap API. This powerful API lets you group entries together so they are always colocated and locked together, while replication remains much finer-grained, with only deltas to the map being replicated. That makes replication fast and performant, but everything is still locked as a single atomic unit. While this is necessary for certain applications, it isn't always desirable.
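A minimal sketch of this approach, assuming the AtomicMapLookup helper from the org.infinispan.atomic package and the same made-up key as above:

import org.infinispan.atomic.AtomicMap;
import org.infinispan.atomic.AtomicMapLookup;

// Retrieves (or creates) an AtomicMap stored under the given cache key.
// All of its entries are colocated and locked together.
AtomicMap<String, String> address = AtomicMapLookup.getAtomicMap(cache, "person-123-address");

// Only the delta (this one changed entry) is replicated, not the whole map.
address.put("city", "Paris");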
One more solution is to implement your own ConsistentHash algorithm - perhaps extending DefaultConsistentHash. This implementation would have knowledge of your object model and would hash related instances so that they are located together in the hash space. It is by far the most complex mechanism, but if performance and co-location really are hard requirements, you cannot do better than this approach.
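The following is only a sketch of the idea - the exact ConsistentHash SPI is version-dependent, and the locate(key, replCount) signature shown here is an assumption - but it illustrates deriving the hash position from a 'group' portion of the key so that related keys map to the same owners:

import java.util.List;
import org.infinispan.distribution.DefaultConsistentHash;
import org.infinispan.remoting.transport.Address;

// Hypothetical example: keys of the form "orderId:lineItemId" are routed by their
// orderId prefix, so all line items of an order end up on the same owner nodes.
public class OrderGroupingConsistentHash extends DefaultConsistentHash {

   @Override
   public List<Address> locate(Object key, int replCount) {
      // locateAll(...) would need the same treatment in a real implementation.
      return super.locate(groupOf(key), replCount);
   }

   // Extracts the grouping portion of the key, falling back to the key itself.
   private Object groupOf(Object key) {
      if (key instanceof String && ((String) key).indexOf(':') > 0) {
         return ((String) key).substring(0, ((String) key).indexOf(':'));
      }
      return key;
   }
}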
In summary:
Buddy Replication
- Near-linear scalability
- Session affinity mandatory
- Co-location automatic
- Applicable to a specific set of use cases due to the session affinity requirement
Distribution
- Near-linear scalability
- No session affinity needed
- Co-location requires special treatment, ranging in complexity based on performance and locking requirements. By default, no co-location is provided
- Applicable to a far wider range of use cases, and hence the default highly scalable clustered mode in Infinispan
Cheers
Manik
Labels:
buddy replication,
distribution,
partitioning
Defining cache configurations via CacheManager in Beta1
Infinispan's first beta release is just around the corner and, in preparation, I'd like to introduce to Infinispan users an important API change in the org.infinispan.manager.CacheManager class that will be part of this beta release.
As a result of the development of the Infinispan second level cache provider for Hibernate, we have discovered that the CacheManager API for definition and retrieval of Configuration instances was a bit limited. So, for this coming release, the following method has been deleted:
void defineCache(String cacheName, Configuration configurationOverride)
And instead, the following two methods have been added:
Configuration defineConfiguration(String cacheName, Configuration configurationOverride);
Configuration defineConfiguration(String cacheName, String templateCacheName,
Configuration configurationOverride);
The primary driver for this change has been the development of the Infinispan cache provider, where we wanted to enable users to configure or override the most commonly modified Infinispan parameters via the Hibernate configuration file. This avoids users having to modify different files for the most commonly tweaked parameters, improving the usability of the Infinispan cache provider. However, to implement this, CacheManager's API needed to be enhanced so that:
- Existing defined cache configurations could be overridden. This enables use cases like the following: the sample Infinispan cache provider configuration contains a generic cache definition to be used for entities, and via the Hibernate configuration file, users can redefine the maximum number of entries allowed before eviction kicks in for all entities. The code looks something like this:
// Assume that 'cache-provider-configs.xml' contains
// a named cache for entities called 'entity'
CacheManager cacheManager = new DefaultCacheManager(
"/home/me/infinispan/cache-provider-configs.xml");
Configuration overridingConfiguration = new Configuration();
overridingConfiguration.setEvictionMaxEntries(20000); // max entries set to 20,000
// Override the existing 'entity' configuration so that eviction max entries is 20,000.
cacheManager.defineConfiguration("entity", overridingConfiguration);
- New cache configurations could be defined based on the configuration of a given cache instance, optionally applying some overrides. This enables use cases like the following: a user wants to define an eviction wake-up interval for a specific entity that is different from the wake-up interval used for the rest of the entities.
// Assume that 'cache-provider-configs.xml' contains
// a named cache for entities called 'entity'
CacheManager cacheManager = new DefaultCacheManager(
"/home/me/infinispan/cache-provider-configs.xml");
Configuration overridingConfiguration = new Configuration();
// set wake up interval to 240 seconds
overridingConfiguration.setEvictionWakeUpInterval(240000L);
// Create a new cache configuration for com.acme.Person entity
// based on 'entity' configuration, overriding the wake up interval to be 240 seconds
cacheManager.defineConfiguration("com.acme.Person", "entity", overridingConfiguration);
Another limitation of the previous API, also solved by this change, is that retrieving a cache's Configuration used to require the cache to be started, because the only way to get the Configuration instance was via the Cache API. With this change, a cache's Configuration instance can now be retrieved via the CacheManager API. Example:
// Assume that 'cache-provider-configs.xml' contains
// a named cache for entities called 'entity'
CacheManager cacheManager = new DefaultCacheManager(
"/home/me/infinispan/cache-provider-configs.xml");
// Pass a brand new Configuration instance without overrides
// and it will return the given cache name's Configuration
Configuration entityConfiguration = cacheManager.defineConfiguration("entity",
new Configuration());
If you would like to provide any feedback on this post, either respond to this blog entry or go to Infinispan's user forums.
Wednesday, 12 August 2009
Coalesced Asynchronous Cache Store
As we prepare for Infinispan's beta release, let me introduce one of the recent enhancements we've implemented, which improves the way the asynchronous (or write-behind) cache store works.
Until now, the asynchronous cache store simply queued modifications while a set of threads applied them. However, if the queue contained N put operations on the same key, those threads would apply each and every modification one after the other, which is not very efficient.
Thanks to the excellent feedback from the Infinispan community, we've now improved the asynchronous cache store so that it coalesces changes and only applies the latest modification on a key. So, if N put operations on the same key are queued, only the last modification will be applied to the cache store.
Internally, the concurrent queueing mechanism performs in O(1) by keeping a map with the latest value for each key. This map acts as the queue; there is no need for an actual queue, because all we care about is that the latest values are stored, so ordering is not important.
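To illustrate the principle (this is just a sketch of the coalescing idea, not the actual Infinispan code, and the store call is hypothetical), the pending changes can be held in a concurrent map keyed by cache key, so repeated puts simply overwrite the pending value in O(1):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// The "queue" is just a map of key -> latest pending value.
ConcurrentHashMap<Object, Object> pending = new ConcurrentHashMap<Object, Object>();

// N puts on the same key collapse into a single pending entry...
pending.put("user:42", "v1");
pending.put("user:42", "v2");
pending.put("user:42", "v3");

// ...so the writer threads only apply the latest modification to the store.
for (Map.Entry<Object, Object> e : pending.entrySet()) {
   // store.write(e.getKey(), e.getValue());   // hypothetical cache store write
   pending.remove(e.getKey(), e.getValue());   // only removed if not updated again meanwhile
}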
Note that the threads applying these modifications start working as soon as any changes are available, so to see changes coalesced, the system needs to be relatively busy, or a lot of changes to the same key need to happen within a relatively short period of time. We could have made these threads work periodically, i.e. every X seconds, but doing so would let modifications pile up and would increase the delay between cache operations and cache store updates, increasing the chance that the cache store is out of date.
Finally, no configuration changes are required to get the asynchronous cache store to work in this coalesced way; it works like this out of the box. Example:
<?xml version="1.0" encoding="UTF-8"?>
<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:infinispan:config:4.0">
<namedCache name="persistentCache">
<loaders passivation="false" shared="false" preload="true">
<loader class="org.infinispan.loaders.file.FileCacheStore" fetchPersistentState="true" ignoreModifications="false" purgeOnStartup="false">
<properties>
<property name="location" value="/tmp"/>
</properties>
<async enabled="true" threadPoolSize="10"/>
</loader>
</loaders>
</namedCache>
</infinispan>