Thursday, 28 April 2011

Infinispan 5.0.0.CR1 "Pagoa" is out!

The first candidate release of the Infinispan 5.0 "Pagoa" series is out now. The final features added include:
  • Lock identifiers are now available to Infinispan users, making it possible to reorder lock acquisitions and so reduce the possibility of deadlocks.
  • Infinispan now has the groundwork in place for internationalization of messages, as per the rules here. Translated messages are not yet available, but the integration of JBoss Logging into Infinispan makes it easy to add them.
  • In the last BETA we renamed the lazyDeserialization XML element to storeAsBinary, but this resulted in previous XML configurations no longer being valid. So, we've reinstated the lazyDeserialization element, but if you use it, you'll see a WARN message indicating that you should replace it with storeAsBinary.
  • A very important change in this release is the new default value for the lock striping configuration. In previous releases, lock striping was enabled by default but, as hinted in its documentation, this can cause deadlocks. So, after some debate, we've decided to disable it by default.
  • New EC2 demo available in the all zip distribution with examples of distributed executors and map/reduce! Make sure you try it out :)
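To make the two locking items above a bit more concrete, here is a minimal sketch in plain Java. It is not Infinispan's API: LockingSketch, StripedLocks, lockInOrder and unlockInReverse are hypothetical names made up for illustration. The first part shows why a striped lock pool lets unrelated keys contend for the same lock, and the second shows the general idea of acquiring locks in a fixed order based on an identifier.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; none of these names are Infinispan API.
class LockingSketch {

    // Lock striping: a fixed pool of locks shared by every key.
    static final class StripedLocks {
        private final ReentrantLock[] stripes;

        StripedLocks(int stripeCount) {
            stripes = new ReentrantLock[stripeCount];
            for (int i = 0; i < stripes.length; i++) {
                stripes[i] = new ReentrantLock();
            }
        }

        // Index of the stripe guarding this key (sign bit masked off).
        int stripeIndex(Object key) {
            return (key.hashCode() & Integer.MAX_VALUE) % stripes.length;
        }

        // Two different keys may map to the same lock instance.
        ReentrantLock lockFor(Object key) {
            return stripes[stripeIndex(key)];
        }
    }

    // Ordered acquisition: always take the lock with the lower identifier
    // first, so two callers can never hold the pair in opposite orders.
    static void lockInOrder(ReentrantLock[] locksById, int idA, int idB) {
        int first = Math.min(idA, idB);
        int second = Math.max(idA, idB);
        locksById[first].lock();
        if (second != first) {
            locksById[second].lock();
        }
    }

    // Release in the reverse of the acquisition order.
    static void unlockInReverse(ReentrantLock[] locksById, int idA, int idB) {
        int first = Math.min(idA, idB);
        int second = Math.max(idA, idB);
        if (second != first) {
            locksById[second].unlock();
        }
        locksById[first].unlock();
    }
}
```

With four stripes, the Integer keys 1 and 5 end up behind the same lock even though they are unrelated, which is the kind of hidden sharing that can lead to surprising contention or deadlocks. With one lock per key and ordered acquisition, that sharing goes away.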
There are some other minor fixes, as shown in the release notes. As always, please use the user forums to report back, grab the release here, enjoy and keep the feedback coming. The Final version is not far away, so make sure you test this CR! :)

Un saludo,
Galder

Tuesday, 19 April 2011

5.0.0.BETA2 released with better distribution!

A brand new Infinispan 5.0 "Pagoa" beta is out now: 5.0.0.BETA2, bringing even more goodies for Infinispan users:
  • An initial implementation of virtual nodes for distribution based on the consistent hash algorithm is included. This means that each Infinispan node can now occupy multiple positions (virtual nodes) on the hash wheel, reducing the standard deviation and so improving the distribution of data. This is configured via the numVirtualNodes attribute of the hash element.
  • The externalizer configuration has been revamped in order to make it more user-friendly! You only need the @SerializeWith annotation and an Externalizer implementation in its most basic form, but more advanced externalizer configuration is still available for particular use cases. The wiki on plugging externalizers has been rewritten to show these changes.
  • lazyDeserialization XML element has been renamed to storeAsBinary in order to better represent its function. The previous programmatic configuration for this option has been deprecated to help ease migration but your XML will need changing.
  • All references to JOPR, including the maven module name, have been renamed to RHQ. So make sure you plug your RHQ server with infinispan-rhq-plugin.jar instead of infinispan-jopr-plugin.jar.
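The virtual nodes idea from the first item above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not Infinispan's consistent hash implementation: VirtualNodeRing, spread and ownerOf are hypothetical names, and the bit mixer merely stands in for a real hash function. Only the name numVirtualNodes comes from the configuration attribute mentioned above.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not Infinispan's consistent hash code.
class VirtualNodeRing {
    // Position on the hash wheel -> physical node that owns that position.
    private final TreeMap<Integer, String> wheel = new TreeMap<>();

    VirtualNodeRing(List<String> nodes, int numVirtualNodes) {
        if (nodes.isEmpty() || numVirtualNodes < 1) {
            throw new IllegalArgumentException("need at least one node and one virtual node");
        }
        for (String node : nodes) {
            // Each physical node is placed on the wheel several times.
            for (int i = 0; i < numVirtualNodes; i++) {
                wheel.put(spread((node + "#" + i).hashCode()), node);
            }
        }
    }

    // Simple bit mixer (an assumption; a stand-in for a proper hash function).
    static int spread(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // The owner is the first position at or after the key's hash,
    // wrapping around to the start of the wheel if there is none.
    String ownerOf(Object key) {
        Map.Entry<Integer, String> entry = wheel.ceilingEntry(spread(key.hashCode()));
        return (entry != null ? entry : wheel.firstEntry()).getValue();
    }
}
```

With numVirtualNodes set to 1, each node owns a single arc of the wheel, and those arcs can differ a lot in size. Raising the value splits each node's share into many smaller arcs, which tends to even out how many keys each node owns.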
There are some other minor API changes and fixes, as shown in the release notes. As always, please use the user forums to report back, grab the release here, enjoy and keep the feedback coming.

Cheers,
Galder

Thursday, 14 April 2011

In response to PCWorld...

PCWorld has published an article on the recent data grid JSR that I have submitted.  As a follow-up to PCWorld's article, I would like to make a few comments to clarify a few things.

I don't quite understand what is meant by Red Hat's approach not being the best solution.  Do people take issue with having a standard in the first place?  Or is it the standards body used in this particular case (the JCP)?  If it is the details of the standard itself, one should keep in mind that this has yet to be defined by an expert group!

It is unfortunate that the "others" mentioned in the article - who feel that Red Hat's approach is not the best - were not able to provide any details about their objections. I would love to hear these objections and make sure that the JSR addresses them.

The importance of a standard, to remove vendor lock-in, etc., is pretty well understood, so I won't go into too much detail here.  But with that in mind, I find Pandey's comment regarding a "self-beneficial move" an odd one.  A standard makes it easier for people to switch between products (which may explain why no one else may have stepped up to the plate to propose such a standard thus far).  Proposing a standard makes it easier for end-users to move away from Infinispan.  Yes, it may help with awareness of Infinispan, but it also means Red Hat, just like other data grid vendors, will need to work really hard to make sure their products are up to scratch.  The only real beneficiary here is the end-user.  In fact, I'd like to invite Terracotta to participate in this JSR, as participation can only make it stronger, more relevant and eventually even more useful to end-users.

With regard to JSR-107, I believe Pandey has misunderstood the intention in proposing a data grid JSR.  I have proposed extending and building on top of JSR-107 - not throwing it away - and I have expressed this on the JSR-107 expert group mailing list, of which Terracotta's Greg Luck is a member.  In fact, without Pandey's actually seeing my data grid proposal blog post - PCWorld's article was written before I published details of the JSR submission, based on a high-level Red Hat press release - one has to wonder where such strong words come from!  :-)

Cheers
Manik

A new data grid JSR

Following up on my previous response to Antonio Goncalves' blog post, I have submitted a JSR to the JCP on a data grid standard, titled "Java Data Grids".  It has yet to be assigned a number by the JCP, but I thought I'd talk about it a little here anyway.

Here is the description of the JSR that I have submitted:
This specification proposes to provide an API for accessing, storing, and managing data in a distributed data grid.
The primary API will build upon and extend the JSR-107 (JCACHE) API. In addition to its genericized Map-like API to access a Cache, JSR-107 defines SPIs for spooling in-memory data to persistent storage, an API for obtaining a named Cache from a CacheManager and an API to register event listeners.
Above and beyond JSR-107, this JSR will define characteristics and expectations from eviction, replication and distribution, and transactions (via the JTA specification). Further, it would define an asynchronous, non-blocking API as an alternative to JSR-107’s primary API, as non-blocking access to data becomes a concern when an implementation needs to perform remote calls, as in the case of a data grid.
This specification builds upon JSR-107, which is not yet complete. We intend to work with the JSR-107 EG to ensure that their schedule is compatible with the schedule for this JSR. If JSR-107 is unable to complete, we propose merging the last available draft into this specification.
Data grids are gaining prominence and importance in enterprise Java, particularly as cloud-style deployments gain popularity:

  • Characteristics such as high availability, along with removal of single points of failure become increasingly important, since cloud infrastructure is inherently unreliable and can be re-provisioned with minimal notice; applications deployed on cloud need to be resilient to this.  
  • Further, one of the major benefits of cloud-style deployments is elasticity: the ability to scale out (and back in) quickly and easily.  Again, data grids have a role to play here.  
  • Finally, with scalable middleware comes additional stress on the data tier (traditionally an RDBMS), as middleware nodes scale out to cope with load.  Data grids - used as a distributed cache - can help with mitigating database bottlenecks.

With one of Java EE 7's stated goals being "cloud-friendliness", the above are powerful arguments for the inclusion of a distributed data grid standard in Java EE 7.

What about JSR-107?  JSR-107 - the temporary caching API proposed in 2001 - certainly has a role to play in Java EE too.  Temporary caches are an important part of enterprise middleware, yet a standard has been sadly missing from a Java EE umbrella specification for far too long.  Spring, having identified the need as well, has a temporary caching abstraction in their current development versions.  Several other non-Java frameworks define temporary caching APIs too (Ruby on Rails, Django for Python, .NET).  There is no denying JSR-107 is necessary, and necessary as a part of Java EE.

But JSR-107 isn't a data grid.  JSR-107 falls short as a standard for data grids, specifically as it doesn't take into account characteristics of distribution and replication of data, and doesn't define a contract that implementations would have to adhere to when it comes to moving data around a cluster.  Crucial things for a data grid that, if not baked into a specification, will hinder portability and render the standard itself useless and impotent.

Further, with remote capabilities in mind, a data grid should also expose a non-blocking API, since network calls can be a limiting factor.  Invoking methods that involve remote calls should be able to be done in an asynchronous fashion.  Stuff that is irrelevant to a temporary caching API like JSR-107.
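To illustrate what such a non-blocking API might look like, here is a minimal sketch in plain Java. It is purely hypothetical and not taken from JSR-107, the proposed JSR or Infinispan: AsyncCacheSketch, putAsync and getAsync are names invented for this example, and a thread pool in front of a local map stands in for the remote calls a real data grid would make.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration only; not part of any JSR or of Infinispan.
class AsyncCacheSketch<K, V> {
    private final ConcurrentMap<K, V> store = new ConcurrentHashMap<>();
    // Stand-in for the network: operations run on these threads, not the caller's.
    private final ExecutorService remoteCalls = Executors.newFixedThreadPool(2);

    // Returns immediately; the future completes with the previous value (or null).
    CompletableFuture<V> putAsync(K key, V value) {
        return CompletableFuture.supplyAsync(() -> store.put(key, value), remoteCalls);
    }

    // Returns immediately; the future completes with the current value (or null).
    CompletableFuture<V> getAsync(K key) {
        return CompletableFuture.supplyAsync(() -> store.get(key), remoteCalls);
    }

    // Stops the worker threads so the JVM can exit.
    void shutdown() {
        remoteCalls.shutdown();
    }
}
```

A caller can chain operations without blocking, for example cache.putAsync("k", "v").thenCompose(previous -> cache.getAsync("k")), and only wait on the resulting future when the value is actually needed.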

So with all that in mind, I'd love to hear your thoughts on the data grid JSR.  In addition to Red Hat, the JSR is currently backed by a major Java EE and data grid vendor which cannot be named at this stage, along with independent JCP members with relevant interest and background.

Cheers
Manik

Monday, 11 April 2011

Infinispan at JBoss World/Red Hat Summit/JUDCon in Boston, May 2011

It's that time of year again, and Red Hat Summit/JBoss World 2011 looms ever larger on the horizon.  Back in Boston again this year, it will be co-located with another JUDCon, the JBoss Users and Developers Conference.

Red Hat Summit/JBoss World
Infinispan is well-represented again at Summit/JBW, as it was last year.  I will be speaking on using Infinispan to solve a number of scalability and availability issues, including reducing database bottlenecks.

Emmanuel Bernard and Sanne Grinovero will be speaking on Hibernate OGM - an awesomely cool project to provide JPA-like access to Infinispan (details of the session available here).

Craig Bomba and Shane Johnson will talk about optimising Infinispan for performance and consistency based on lessons learned using Infinispan at the Chicago Board Options Exchange, the world's largest options exchange.

Further, for the first time, we intend to run a hands-on lab on Infinispan, coordinated by Jim Tyrrell.  An awesome chance to learn about the practical aspects of working with Infinispan, with core Infinispan engineers on hand to answer questions, etc.  This sort of opportunity doesn't come about every day!  :-)

JUDCon
JUDCon attendees too will get their fill of your favourite data grid with a talk titled Infinispan for Ninja Developers and one on Infinispan's new Map/Reduce capabilities.  Fun, hands-on, developer-focused deep dives.

UPDATE: I forgot to mention, Sanne Grinovero will also be talking about advanced querying on Infinispan.  More cool stuff.

For a summary of what happened at JUDCon and Summit/JBW last year, check out this blog entry.  It's still not too late to get tickets; you may even qualify for early-bird discounts if you hurry!

Hope to see you there.
Manik



Wednesday, 6 April 2011

First 5.0 beta now available!

We've just released our first beta version of Infinispan 5.0 "Pagoa". The main highlight in this release is the introduction of our brand new Map/Reduce API to complement the existing distributed executor API released in a previous alpha version. We had countless discussions around these APIs and the result can be seen in our javadocs and this wiki which explains the distributed executor and Map/Reduce framework. Please download the 5.0.0.BETA1 distribution and play with it! Your feedback is invaluable at this stage as we start to aim towards CR and Final release.
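For readers new to the Map/Reduce model, here is a minimal single-JVM sketch of the idea in plain Java, using a word count. It does not use the Infinispan API: MapReduceSketch and its Mapper and Reducer interfaces are hypothetical stand-ins defined just for this example, and nothing here is distributed. The javadocs and wiki mentioned above remain the place to look for the real interfaces.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these are not the Infinispan interfaces.
class MapReduceSketch {

    // Map phase: emit intermediate values, grouped by output key.
    interface Mapper<K, V, KOut, VOut> {
        void map(K key, V value, Map<KOut, List<VOut>> collector);
    }

    // Reduce phase: fold all values emitted for one output key.
    interface Reducer<KOut, VOut> {
        VOut reduce(KOut key, List<VOut> values);
    }

    static <K, V, KOut, VOut> Map<KOut, VOut> run(Map<K, V> data,
                                                  Mapper<K, V, KOut, VOut> mapper,
                                                  Reducer<KOut, VOut> reducer) {
        Map<KOut, List<VOut>> intermediate = new HashMap<>();
        for (Map.Entry<K, V> entry : data.entrySet()) {
            mapper.map(entry.getKey(), entry.getValue(), intermediate);
        }
        Map<KOut, VOut> result = new HashMap<>();
        for (Map.Entry<KOut, List<VOut>> entry : intermediate.entrySet()) {
            result.put(entry.getKey(), reducer.reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    // Word count: the mapper emits 1 per word, the reducer sums the ones.
    static Map<String, Integer> wordCount(Map<String, String> documents) {
        return MapReduceSketch.<String, String, String, Integer>run(
            documents,
            (docId, text, collector) -> {
                for (String word : text.toLowerCase().split("\\s+")) {
                    if (!word.isEmpty()) {
                        collector.computeIfAbsent(word, w -> new ArrayList<>()).add(1);
                    }
                }
            },
            (word, ones) -> ones.stream().mapToInt(Integer::intValue).sum());
    }
}
```

In a data grid the map phase would run on the nodes that hold the data and only the intermediate results would travel over the network. This sketch keeps everything in one loop to show the shape of the two phases.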

In other news, the query API has received a major overhaul and, apart from providing a more intuitive and powerful API, it hooks into Hibernate Search's new SPI, making it easier to maintain. Infinispan users now get access to Faceting via Hibernate Search as well. A revision of the Infinispan Querying wiki is on its way, but the javadocs available here are full of pointers to get you going.

Finally, here are a few notes on other interesting improvements:
  • Distribution now uses the MurmurHash3 hash function, which is more performant and provides a more even spread.
  • EmbeddedCacheManager has been enhanced with some new methods that allow caches to be removed altogether. That is, remove the contents of a cache cluster wide and in the persistent store. A couple of other methods to go along with this have been added, such as cacheExists() and a conditional getCache() that can inform the user when the cache has been completely removed from the system.
  • Increased performance of Hot Rod client/server architectures as a result of fixing a couple of issues, one in the Hot Rod client and another in the server, so if you're a Hot Rod user, make sure you upgrade!
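For the curious, the MurmurHash3 item above refers to a publicly documented hash function. Below is a compact sketch of its 32-bit variant in plain Java. This is an assumption for illustration only: it is not Infinispan's implementation, it may not be the variant Infinispan uses, and Murmur3Sketch and hash32 are hypothetical names.

```java
// Hypothetical illustration only; a sketch of the public 32-bit MurmurHash3
// algorithm, not the code shipped in Infinispan.
class Murmur3Sketch {

    static int hash32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h1 = seed;

        // Body: mix the input four bytes (little-endian) at a time.
        int blockCount = data.length / 4;
        for (int i = 0; i < blockCount; i++) {
            int offset = i * 4;
            int k1 = (data[offset] & 0xff)
                   | ((data[offset + 1] & 0xff) << 8)
                   | ((data[offset + 2] & 0xff) << 16)
                   | ((data[offset + 3] & 0xff) << 24);
            k1 *= c1;
            k1 = Integer.rotateLeft(k1, 15);
            k1 *= c2;
            h1 ^= k1;
            h1 = Integer.rotateLeft(h1, 13);
            h1 = h1 * 5 + 0xe6546b64;
        }

        // Tail: mix in the remaining one to three bytes, if any.
        int tail = blockCount * 4;
        int k1 = 0;
        switch (data.length & 3) {
            case 3:
                k1 ^= (data[tail + 2] & 0xff) << 16;
                // fall through
            case 2:
                k1 ^= (data[tail + 1] & 0xff) << 8;
                // fall through
            case 1:
                k1 ^= (data[tail] & 0xff);
                k1 *= c1;
                k1 = Integer.rotateLeft(k1, 15);
                k1 *= c2;
                h1 ^= k1;
                break;
            default:
                break;
        }

        // Finalization: force all bits of the hash to avalanche.
        h1 ^= data.length;
        h1 ^= h1 >>> 16;
        h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13;
        h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }
}
```

The final mixing step is what gives the function its even spread: flipping a single input bit tends to change about half of the output bits, so similar keys land far apart on the hash wheel.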
There are some other minor API changes and fixes, as shown in the release notes. As always, please use the user forums to report back, grab the release here, enjoy and keep the feedback coming.

Cheers,
Galder