Infinispan: March 2017

Friday, 31 March 2017

Infinispan 9

Infinispan 9 is the culmination of nearly a year of work. It is codenamed "Ruppaner" in honor of the city of Konstanz, where we designed many of the improvements we've made. Prost!

Performance

We decided it was time to revisit Infinispan's performance and scalability. So we went back to our internals design and we made a number of improvements. Infinispan 9.0 is faster than any previous release by quite a sizeable margin in a number of key aspects:

distributed writes, thanks to a new algorithm which reduces the number of RPCs required to write to the owners
distributed reads, which scale much better under load
replicated writes, also with better scalability under load
eviction, thanks to a new in-memory container
internal marshalling, which was completely rewritten

We will have a post dedicated to benchmarks detailing the difference against previous versions and in various scenarios.

Marshalling

We've made several improvements in the cluster and persistent storage marshalling layer which has resulted in increased performance and smaller payloads. Also, the new marshaller layer makes JBoss Marshalling an optional component, which is only used when no Infinispan Externalizers (or AdvancedExternalizers) are available for a given type, hence relying on standard JDK Serializable/Externalizable capabilities to be marshalled.

Remote Hot Rod Clients

We now ship alternate marshallers for remote clients based on Kryo and ProtoStuff.

Additionally, the Hot Rod protocol now supports streaming operations for dealing with large objects.

Off-Heap and data-container changes

An In-Memory Data Grid likes to eat through your memory (because you want it to be fast!), but in the world of the JVM that is not ideal: that huge chunk of data gives Garbage Collectors a hard time when the heap goes into double-digit gigabyte territory. Long GC pauses can make individual nodes unresponsive, compromising the stability of your cluster.

Infinispan 9 introduces an improved data container which can optionally store entries off-heap.

Additionally, our bounded container has been replaced with Ben Manes' excellent Caffeine which provides much better performance. Check out Ben's benchmarks where he compares, among other things, against Infinispan's old bounded container.

Configuration-wise, the previously separate concepts of eviction, store-as-binary and data-container have been merged into a single 'memory' configuration element.

Persistence

The JDBC cache store received quite an overhaul:

The internal connection pool is now based on HikariCP, for improved performance
Writes will now use database-specific upsert functionality when available
Transactional writes to the cache translate to transactional writes to the database
The JdbcBinaryStore and JdbcMixedStore have been removed as detailed here

We have also replaced the LevelDB cache store with the better-maintained and faster RocksDB cache store.

Ickle, our new query language

We decided it was time for Infinispan to have a proper query language, which would take full advantage of our query capabilities. We have therefore grafted Lucene's full-text operators on top of a subset of JP-QL to obtain Ickle. We have already started describing Ickle in a recent blog post. For a taste of Ickle, the following query shows how to combine a traditional boolean clause with a full-text term query:

select transactionId, amount, description from com.acme.Transaction
where amount > 10 and description : "coffee"

Cloud integrations

Infinispan continues to play nicely in cloud environments thanks to a number of improvements that have been made to discovery (such as KUBE_PING for Kubernetes/OpenShift), health probes and our pre-built Docker images.

Multi-tenant server and SNI support

Infinispan Server is now capable of exposing multiple cache containers through a single Hot Rod or REST endpoint. The selection of the container is performed via SNI. This allows you to have a single cluster serve all your applications while maintaining each one's data isolated.

Administration Console

The adminstration console has been completely rewritten in a more modular fashion using TypeScript to allow for greater extensibility and ease of maintanence. In addition to this refactor, the console now supports the following:

Stateless views
HTTP Digest Authentication
Management of individual and clustered Standalone server instances
Internet Explorer

Documentation overhaul

Our documentation has been completely overhauled with entire chapters being added or rewritten for readability and consistency.

What's coming

We will be blogging in more detail about some of the things above, so watch out for more content coming soon !

We've already started working on Infinispan 9.1 which will bring a number of new features and improvements, such as clustered counters, consistency checker with merge policies, a new distributed cache for even better write performance, and more.

Get it now !

Head over to our download page to get binaries, sources, clients, etc.

Please join us to let us know what you think about this release.

The Infinispan team

Wednesday, 22 March 2017

Infinispan 9.0.0.CR4

Dear Infinispan users, we thought CR3 was going to be the last candidate release before Final... but we were mistaken!

The reason for yet another CR is that we decided to make some changes which affect some default behaviours:

enabling optimistic transactions with repeatable read now turns on write-skew by default
retrieving an already configured cache by passing in a template doesn't redefine that cache's configuration

Other important changes:

big improvements to the client/server rolling upgrade process
allow indexes to be stored in off-heap caches
lots of bug fixes

For the full list of changes check the release notes, download the 9.0.0.CR4 release and let us know if you have any questions or suggestions.

Cheers,
The Infinispan team

Monday, 20 March 2017

Memory and CPU constraints inside a Docker Container

In one of the previous blog posts we wrote about different configuration options for our Docker image. Now we did another step adding auto-configuration steps for memory and CPU constraints.

Before we dig in...

Setting memory and CPU constraints to containers is very popular technique especially for public cloud offerings (such as OpenShift). Behind the scenes everything works based on adding additional Docker settings to the containers. There are two very popular switches: --memory (which is responsible for setting the amount of available memory) and --cpu-quota (which throttles CPU usage).

Now here comes the best part... JDK has no idea about those settings! We will probably need to wait until JDK9 for getting full CGroups support.

What can we do about it?

The answer is very simple, we need to tell JDK what is the available memory (at least by setting Xmx) and available number of CPUs (by setting XX:ParallelGCThreads, XX:ConcGCThreads and Djava.util.concurrent.ForkJoinPool.common.parallelism).

And we have some very good news! We already did it for you!

Let's test it out!

At first you need to pull our latest Docker image:

Then run it with CPU and memory limits using the following command:

Note that JAVA_OPTS variable was overridden. Let's have a look what had happened:

-Xms64m -Xmx350m - it is always a good idea to set Xmn inside a Docker container. Next we set Xmx to 70% of available memory.
-XX:ParallelGCThreads=6 -XX:ConcGCThreads=6 -Djava.util.concurrent.ForkJoinPool.common.parallelism=6 - The next thing is setting CPU throttling as I explained above.

There might be some cases where you wouldn't like to set those properties automatically. In that case, just pass -n switch to the starter script:

Friday, 17 March 2017

Event Listener with C++

Dear Infinispan community,

as announced in a previous post, starting from version 8.1.0 also the C++/C# clients can receive and process Infinispan events.

Here's an example of usage of C++ event listeners that, with a good dose of imagination, pretends to be a customer behavior tracking system for our store chain (don't take this too seriously, we're just trying to add some fiction).

As a first requirement our tracking system will record every single purchase made in our stores. How many stores we have? 1, 100, millions? It doesn't matter: we're backed with an Infinispan data grid.
This is version 0.x and hence the checker must use the keyboard to enter all the needed information.

As you can see our entry key is a concatenation of the product name and the timestamp and the entry value is an unstructured string, maybe too simply but it could work for now.
Seems we are at a good point: we have the data and we can do analytics on it, so far so good but now our boss makes a new request: he wants a runtime monitor on how's the sales performance. That's a perfect request to be fulfilled with event listener: the monitor application will be an Hotrod C++ client that registers a client listener on the server and receives and show on the boss's laptop the data flow.
A client listener, once registered on the server, can receive events related to: creation, modification, deletion, expiration of cache entries; in our example only the creation and expiration events are processed (expired events can be useful to do some moving average statistics?). Following a snip of code that creates and registers a listener that writes the events key on the stdout.

You can git this quickstart here [1]. On startup a multiple choice menu is shown with all the available operations. Running several instances you can act as the checker (data entry) or the boss (installing the listener and seeing the events flow).

Filters

Again so far so good, but then the marketing department ask support to do targeted advertising like: soliciting customers that bought product Y to buy product X.
Let's suppose that X="harmonica" and Y="hiking boots" (it's a well known fact of life that in the high mountains you feel the desire to play an harmonica).

To do that we register on the server another listener, but this time we're not interested in the whole flow of purchase data: to run our marketing campaign, we only interested in cache entries having the key starting with "hiking". The Infinispan server can filter out events for us, if we pass in the AddClientListener operation the name of the wanted filter along with any configuration arguments.

Filter are java classes that must be deployed into the Infinispan server (more here [2])

and converters

Predefined events contain very few information: basically the event type and the entry key, this to prevent to flood the network spreading around very long entry values. Users can override this limitation using a converter, that is a java class deployed into the server, that can create custom events containing every data needed by the application.
As in the previous case, we pass into the add operation the name of the converter and the configuration arguments, any.

That's all guys, let us know your feedback: do you like it? Could be better? Tell us how it can be improved creating an issue [3], or fork and improve it yourself [4]!

Thanks for reading and enjoy!
The Infinispan Team
[1] https://github.com/rigazilla/infinispan-simple-tutorials/tree/new_event_tutorial/c%2B%2B/events
[2] http://blog.infinispan.org/2014/08/hot-rod-remote-events-1-getting-started.html
[2] http://blog.infinispan.org/2014/08/hot-rod-remote-events-2-filtering-events.html
[2] http://blog.infinispan.org/2014/09/hot-rod-remote-events-3-customizing.html
[3] https://issues.jboss.org/projects/HRCPP/issues
[4] https://github.com/infinispan/cpp-client

Wednesday, 15 March 2017

Spring Boot Starter 1.0.0.CR1 released!

I'm happy to announce a new release (the first feature-complete!) of Infinispan Spring Boot Starters.

We finally added new properties for managing Hot Rod client mode in application.properties as well Spring Cache automatic support. Finally, we fixed a couple of smaller issues.

For complete changelog, please refer to the release page.

The artifacts should be available in Maven Central as soon as the sync completes. In the meantime grab them from JBoss Repository.

KUBE_PING 0.9.2 released!

I'm happy to announce a new release of KUBE_PING JGroups protocol.

Since this is a minor maintenance release, there are no ground breaking changes but we fixed a couple of issues that prevented our users from using JGroups 3.6.x and KUBE_PING 0.9.1.

Have a look at the release page to learn more details.

The artifacts should be available in Maven Central as soon as the sync completes. In the meantime grab them from JBoss Repository.

Hotrod clients C++/C# 8.1.0.CR2 released!

Dears,

we're pleased to announce that 8.1.0.CR2 release for C++/C# clients is out!

Check the release notes, focus was on bug fixes this round so you have the opportunity to download the cleanest code so far!

Spring cleaning will continue in the next release iteration, stay tuned and, if you like, take part signalling new issues here!

Enjoy!

The Infinispan Team

Friday, 10 March 2017

JDBC Migrator or: How I Learned to Stop Worrying About Buckets and Utilise the JdbcStringBasedStore!

Infinispan 9 has introduced many improvements to its marshalling codebase in order to improve performance and allow for greater flexibility. Consequently, data marshalled and persisted by Infinispan 8.x is no longer compatible with Infinispan 9.x. Furthermore, as part of our ongoing efforts to improve the cache stores provided by Infinispan, we have removed both the JdbcBinaryStore and JdbcMixedStore in Infinispan 9.0.

To assist users migrating from Infinispan 8.x, we have created the JDBC Migrator that enables existing JDBC stores to be migrated to Infinispan 9's JdbcStringBasedStore.

No More Binary Keyed Stores!

The original intention of the JdbcBinaryStore was to provide greater flexibility over the JdbcStringBasedStore as it did not require a Key2StringMapper implementation. This was achieved by utilising the hashcode of an entries key for a table's ID column entry. However, due to the possibility of hash collisions all entries had to be placed inside a Bucket object which was then serialised and inserted into the underlying table. Utilising buckets in this manner was far from optimal as each read/write to the underlying table required an existing bucket for a given hash to be retrieved, deserialised, updated, serialised and then re-inserted back into the db.

Introducing JDBC Migrator

The JDBCMigrator is a standalone application that takes a single argument, the path to a .properties file which must contain the configuration properties for both the source and target stores. To use the migrator you need the infinispan-tools-9.x.jar, as well as the jdbc drivers required by your source and target databases, on your classpath.

An example maven pom that launches the migrator via mvn exec:java is presented below:

Migration Examples

Below are several example .properties files used for migrating various stores, however an exhaustive list of all available properties can be found in the Infinispan user guide.

Before attempting to migrate your existing stores please ensure you have backed up your database!

8.x JdbcBinaryStore -> 9.x JdbcStringBasedStore

The most important property to set in this example is "source.marshaller.type=LEGACY" as this instructs the migrator to utilise the Infinispan 8.x marshaller to unmarshall data stored in your existing DB tables.

If you specified custom AdvancedExternalizer implementations in your Infinispan 8.x configuration, then it is necessary for you to specify these in the migrator configuration and ensure that they are available on the migrators classpath. To Specify the AdvancedExternalizers to load, it is necessary to define the "source.marshaller.externalizers" property with a comma-separated list of class names. If an ID was explicitly set for your externalizer, then it is possible to prepend the externalizers class name with "<id>:" to ensure the IDs is respected by the marshaller.

TwoWayKey2StringMapper Migration

As well as using the JDBC Migrator to migrate from Infinispan 8.x, it is also possible to utilise it to migrate from one DB dialect to another or to migrate from one TwoWayKey2StringMapper implementation to another.

Summary

Infinispan 9 stores are no longer compatible with Infinispan 8.x stores due to internal marshalling changes. Furthermore, the JdbcBinary and JdbcMixed stores have been removed due to their poor performance characteristics. To aid users in their transition from Infinispan 8.x we have created the JDBC Migrator to enable users to migrate their existing JDBC stores.

If you're a user of the JDBC stores and have any feedback on the latest changes, let us know via the forum, issue tracker or the #infinispan channel on Freenode.

Wednesday, 1 March 2017

Checking Infinispan cluster health and Kubernetes/OpenShift

Modern applications and microservices often need to expose their health status. A common example is Spring Actuator but there are also many different ways of doing that.

Starting from Infinispan 9.0.0.Beta2 we introduced the HealthCheck API. It is accessible in both Embedded and Client/Server mode.

Cluster Health and Embedded Mode

The HealthCheck API might be obtained directly from EmbeddedCacheManager and it looks like this:

The nice thing about the API is that it is exposed in JMX by default:

More information about using HealthCheck API in Embedded Mode might be found here:

http://infinispan.org/docs/dev/user_guide/user_guide.html#monitoring_cluster_health

Cluster Health and Server Mode

Since Infinispan is based on Wildfly, we decided to use CLI as well as built-in Management REST interface.

Here's an example of checking the status of a running server:

Querying the HealthCheck API using the Management REST is also very simple:

Note that for the REST endpoint, you have to use proper credentials.

More information about the HealthCheckA API in Server Mode might be found here:

http://infinispan.org/docs/dev/server_guide/server_guide.html#health_monitoring

Cluster Health and Kubernetes/OpenShift

Monitoring cluster health is crucial for Clouds Platforms such as Kubernetes and OpenShift. Those Clouds use a concept of immutable Pods. This means that every time you need change anything in your application (changing configuration for the instance), you need to replace the old instances with new ones. There are several ways of doing that but we highly recommend using Rolling Updates. We also recommend to tune the configuration and instruct Kubernetes/OpenShift to replace Pods one by one (I will show you an example in a moment).

Our goal is to configure Kubernetes/OpenShift in such a way, that each time a new Pod is joining or leaving the cluster a State Transfer is triggered. When data is being transferred between the nodes, the Readiness Probe needs to report failures and prevent Kubernetes/OpenShift from doing progress in Rolling Update procedure. Once the cluster is back in stable state, Kubernetes/OpenShift can replace another node. This process loops until all nodes are replaced.

Luckily, we introduced two scripts in our Docker image, which can be used out of the box for Liveness and Readiness Probes:

At this point we are ready to put all the things together and assemble DeploymentConfig:

Interesting parts of the configuration:

lines 13 and 14: We allocate additional capacity for the Rolling Update and allow one Pod to be down. This ensures Kubernetes/OpenShift replaces nodes one by one.
line 44: Sometimes shutting a Pod down takes a little while. It is always better to wait until it terminates gracefully than taking the risk of losing data.
lines 45 - 53: The Liveness Probe definition. Note that when a node is transferring the data it might highly occupied. It is wise to set higher value of 'failureThreshold'.
lines 54 - 62: The same rule as the above. The bigger the cluster is, the higher the value of 'successThreshold' as well as 'failureThreshold'.

Feel free to checkout other articles about deploying Infinispan on Kubernetes/OpenShift: