Friday, 28 February 2014

Infinispan 7.0.0.Alpha1 release

Dear Infinispan community,

We're proud to announce the first Alpha release of Infinispan 7.0.0.

This release adds several new features:
  • Support for clustered listeners. One of the limitations of Infinispan's distributed mode used to be that listeners could only receive events for cache modifications on their own node. That's no longer the case, and it paves the way for a long-requested feature: HotRod listeners. A minimal sketch of a clustered listener follows this list.
  • Map/Reduce tasks can now execute the mapper/combiner/reducer on multiple threads. Stay tuned for more Map/Reduce improvements in the near future.
  • The first essential component of cache security has been added, which will be the building block for remote protocol authentication and authorization.
  • Improved OSGi support in the HotRod Java client. The core components are also getting into shape for OSGi; expect more on this front in the next release.
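
Here is a minimal sketch of what a clustered listener might look like, assuming a clustered attribute on the @Listener annotation; since this is an Alpha, the exact attribute name and event API may still change:

import org.infinispan.Cache;
import org.infinispan.notifications.Listener;
import org.infinispan.notifications.cachelistener.annotation.CacheEntryCreated;
import org.infinispan.notifications.cachelistener.event.CacheEntryCreatedEvent;

// A clustered listener receives events for modifications made on any node in
// the cluster, not just the local one (assumes the clustered attribute shown).
@Listener(clustered = true)
public class ClusterWideListener {

   @CacheEntryCreated
   public void entryCreated(CacheEntryCreatedEvent<String, String> event) {
      System.out.printf("Key %s was created somewhere in the cluster%n", event.getKey());
   }
}

// Registration is the same as for a local listener:
// cache.addListener(new ClusterWideListener());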

As you can see, many of the new features are stepping stones for bigger things yet to come. Feel free to join us and shape the future releases on our forums, our mailing lists or our #infinispan IRC channel.


For a complete list of features and bug fixes included in this release, please refer to the release notes.
Visit our downloads section to find the latest release.

Thanks to everyone for their involvement and contribution!
Happy hacking!


Monday, 24 February 2014

Map/Reduce parallel execution


Since the Infinispan 5.2 release, MapReduceTask has executed both the map and reduce phases in a fully distributed fashion. For the map phase, MapReduceTask hashes the task's input keys, groups them by the execution node N they hash to, and sends the map function along with the input keys to each node N. On node N, the map function is invoked for each input key and its locally loaded value. However, until recently the map function on node N was invoked on a single thread regardless of the number of key/value pairs, so invoking the map function on many key/value pairs would sooner rather than later grind to a halt.
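As a reminder of what the map phase runs, here is a minimal word-count mapper against Infinispan's Map/Reduce API; the class name and key/value types are illustrative only:

import org.infinispan.distexec.mapreduce.Collector;
import org.infinispan.distexec.mapreduce.Mapper;

// Illustrative mapper: for each cached sentence, emit a (word, 1) pair.
// This map function is what now runs on multiple threads on each node.
public class WordCountMapper implements Mapper<String, String, String, Integer> {

   @Override
   public void map(String key, String value, Collector<String, Integer> collector) {
      for (String word : value.split("\\s+")) {
         collector.emit(word, 1);
      }
   }
}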

Similarly, to complete the reduce phase, MapReduceTask groups the intermediate KOut keys by the execution node N they hash to. Once the intermediate phase is completed, MapReduceTask sends a reduce command to each node N where KOut keys are hashed. When the reduce command arrives on the target execution node, it looks up the temporary cache belonging to that MapReduceTask and, for each KOut key, grabs the list of VOut values, wraps it in an Iterator and invokes the reduce function on it. Until recently, the reduce function was also invoked on a single thread. Although, due to the nature of the map/reduce paradigm, the reduce phase entails significantly fewer function invocations than the map phase, a single-threaded execution model still does not help speed things up.
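The corresponding reduce function, again as an illustrative word-count sketch, simply folds the VOut values handed to it through the Iterator described above:

import java.util.Iterator;

import org.infinispan.distexec.mapreduce.Reducer;

// Illustrative reducer: sums the per-word counts emitted by the mapper.
// This is the function invoked with the Iterator of VOut values for each KOut key.
public class WordCountReducer implements Reducer<String, Integer> {

   @Override
   public Integer reduce(String reducedKey, Iterator<Integer> iter) {
      int sum = 0;
      while (iter.hasNext()) {
         sum += iter.next();
      }
      return sum;
   }
}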

Starting with the Infinispan 7.0.0.Alpha1 community release, the map and reduce phases are executed in parallel. If eviction is not configured for the cache holding the key/value pairs involved in the map phase, MapReduceTask uses a fork/join work-stealing technique for the parallel execution of the map and reduce functions; otherwise, parallel execution falls back to a standard thread executor framework. The reduce phase is always executed using the fork/join work-stealing algorithm. Either way, we hope that users' large map/reduce tasks will see a significant execution speedup. We are currently running our own performance tests and will share the results soon. Stay tuned.
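Assuming the existing MapReduceTask API from previous releases, a task is assembled and executed as before and simply benefits from the multi-threaded execution on each owning node; the word-count classes are the illustrative ones sketched above:

import java.util.Map;

import org.infinispan.Cache;
import org.infinispan.distexec.mapreduce.MapReduceTask;

public class WordCountExample {

   // Builds and runs a word-count task over the given cache; the map and
   // reduce functions now execute on multiple threads on each node that owns
   // part of the input keys.
   public static Map<String, Integer> countWords(Cache<String, String> cache) {
      MapReduceTask<String, String, String, Integer> task =
            new MapReduceTask<String, String, String, Integer>(cache);
      return task.mappedWith(new WordCountMapper())
                 .reducedWith(new WordCountReducer())
                 .execute();
   }
}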

Cheers,
Vladimir