Infinispan: persistence

Wednesday, 3 February 2016

The return of the Cassandra CacheStore

Ever since we spruced up our Cache Store SPI in Infinispan 6.0, some of our "extra" cache stores have lain in a state of semi-abandonment, waiting for a kind soul with time and determination to bring them back to life.
I'm glad to announce that such a kind soul, in the form of Jakub Markos, had the necessary qualities to accomplish the resurrection of the Cassandra Cache Store.

Apache Cassandra is a database with a distributed architecture which can be used to provide a virtually unlimited, horizontally scalable persistent store for Infinispan's caches. The new Cassandra Cache Store leverages the Datastax Cassandra client driver instead of the old Thrift client approach, which makes it much more robust and reliable.

Configuration

In order to use this cache store you need to add the following dependency to your project:
You will also need to create an appropriate keyspace on your Cassandra database, or configure the auto-create-keyspace option to create it automatically.
The following CQL commands show how to configure the keyspace manually (using cqlsh for example):

You then need to add an appropriate cache declaration to your `infinispan.xml`
(or whichever file you use to configure Infinispan):

It is important that the shared property on the cassandra-store element is set to true
because all the Infinispan nodes will share the same Cassandra cluster.

Limitations

The cache store uses Cassandra's own expiration mechanisms (time to live = TTL) to handle expiration of entries. Since TTL is specified in seconds, expiration lifespan and maxIdle values are handled only with seconds-precision.
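
To make the seconds-precision point concrete, here is a minimal sketch, not the store's actual code. It assumes lifespan values arrive in milliseconds and that any sub-second remainder is rounded up so an entry never expires early; the real store may round differently. The class and method names are hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Infinispan or the Cassandra driver.
final class TtlPrecisionSketch {

    // Converts a lifespan in milliseconds to whole seconds, rounding up.
    // Rounding up (rather than truncating) is an assumption of this sketch.
    static int toTtlSeconds(long lifespanMillis) {
        if (lifespanMillis <= 0) {
            return 0; // treated here as "no expiration"
        }
        long seconds = TimeUnit.MILLISECONDS.toSeconds(lifespanMillis);
        if (TimeUnit.SECONDS.toMillis(seconds) < lifespanMillis) {
            seconds++; // a sub-second remainder costs a full extra second
        }
        return (int) Math.min(seconds, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        // 1500 ms and 2000 ms both end up with a 2-second TTL.
        System.out.println(toTtlSeconds(1500));
        System.out.println(toTtlSeconds(2000));
    }
}
```

In other words, two entries whose lifespans differ by less than a second can end up with the same TTL once they reach the store.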

In addition to this, when both lifespan and maxIdle are used, entries in the cache store effectively behave as if their lifespan = maxIdle, due to an existing bug https://issues.jboss.org/browse/ISPN-3202.

So, try it out and let us know about your experience!

Tuesday, 19 November 2013

Infinispan 6.0.0.Final is out!

Dear Infinispan community,

We're pleased to announce the final release of Infinispan 6.0 "Infinium". As announced, this is the first Infinispan stable version to be released under the terms of Apache License v2.0.

This release brings some highly demanded features besides many stability enhancements and bug fixes:

Support for remote query. It is now possible for the HotRod clients to query an Infinispan grid using a new expressive query DSL. This querying functionality is built on top of Apache Lucene and Google Protobuf and lays the foundation for storing information and querying an Infinispan server in a language-neutral manner. The Java HotRod client has already been enhanced to support this; the soon-to-be-announced C++ HotRod client will also contain this functionality (initially for write/read, then full-blown querying).
C++ HotRod client. Allows C++ applications to read and write information from an Infinispan server. This is a fully fledged HotRod client that is topology (level 2) and consistent hash aware (level 3) and will be released in the following days. Some features (such as Remote Query and SSL support) will be developed during the next iteration so that it maintains feature parity with its Java counterpart.
Better persistence integration. We’ve revisited the entire cache loader API and we’re quite pleased with the result: the new Persistence API brought by Infinispan 6.0 supports parallel iteration of the stored entries, reduces the overall serialization overhead and is also aligned with the JSR-107 specification, which makes implementations more portable.

A more efficient FileCacheStore implementation. This file store is built with efficiency in mind: it outperforms the existing file store by up to two orders of magnitude. This comes at a cost though, as keys need to be kept in memory. Thanks to Karsten Blees for contributing this!
Support for heterogeneous clusters. Up to this release, every member of the cluster owned an equal share of the cluster’s data. This doesn’t work well if one machine is more powerful than the other cluster participants. This functionality allows specifying the amount of data, compared with the average, held by a particular machine.
A new set of usage and performance statistics developed within the scope of the CloudTM project.
JCache (JSR-107) implementation upgrade. First released in Infinispan 5.3.0, the standard caching support is now upgraded to version 1.0.0-PFD.
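
One way to picture the heterogeneous-cluster support listed above: if each node carries a relative weight (1.0 meaning an average share), its expected fraction of the data is roughly its weight divided by the sum of all weights. The sketch below illustrates only that arithmetic. The class, method and node names are hypothetical, and the distribution Infinispan actually achieves depends on its consistent hash implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of proportional data shares; not Infinispan code.
final class CapacityShareSketch {

    // Returns each node's expected fraction of the data, proportional to its weight.
    static Map<String, Double> expectedShares(Map<String, Double> weights) {
        double total = 0;
        for (double w : weights.values()) {
            total += w;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("weights must sum to a positive value");
        }
        Map<String, Double> shares = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            shares.put(e.getKey(), e.getValue() / total);
        }
        return shares;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("nodeA", 1.0);
        weights.put("nodeB", 1.0);
        weights.put("nodeC", 2.0); // a machine with twice the average capacity
        System.out.println(expectedShares(weights)); // {nodeA=0.25, nodeB=0.25, nodeC=0.5}
    }
}
```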

For a complete list of features included in this release please refer to the release notes.

The user documentation for this release has been revamped and migrated to the new website - we think it looks much better and hope you’ll like it too!

This release took five months: a sustained effort from the core development team, QE team and our growing community - a BIG thanks to everybody involved! Please visit our downloads section to find the latest release. Also if you have any questions please check our forums, our mailing lists or ping us directly on IRC.

Cheers,

Adrian

Monday, 16 September 2013

New persistence API in Infinispan 6.0.0.Alpha4

The existing CacheLoader/CacheStore API has been around since Infinispan 4.0. In this release of Infinispan we've taken a major step forward in both simplifying the integration with persistence and opening the door for some pretty significant performance improvements.

What's new

So here's what the new persistence integration brings to the table:

alignment with JSR-107: we now have CacheWriter and CacheLoader interfaces similar to the loader and writer in JSR-107, which should considerably help writing portable stores across JCache-compliant vendors
simplified transaction integration: all the locking is now handled within the Infinispan layer, so implementors don't have to be concerned with coordinating concurrent access to the store (the old LockSupportCacheStore is dropped for that reason)
parallel iteration: it is now possible to iterate over entries in the store with multiple threads in parallel. Map/Reduce tasks immediately benefit from this, as they now run in parallel both across the nodes in the cluster and within the same node (multiple threads)
reduced serialization (which translates into less CPU usage): the new API allows exposing the stored entries in serialized format. If an entry is fetched from persistent storage for the sole purpose of being sent remotely, we no longer need to deserialize it (when reading from the store) and then serialize it back (when writing to the wire). We can now write the serialized format to the wire exactly as it was read from storage
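
The parallel iteration idea from the list above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Infinispan API: the store is modelled as an ordinary map, and `processAll` is a hypothetical helper that splits the entries across a fixed number of worker threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BiConsumer;

// Hypothetical sketch of visiting store entries with several threads.
final class ParallelIterationSketch {

    // Visits every entry exactly once, spreading the work over `threads` workers.
    static <K, V> void processAll(Map<K, V> store, int threads, BiConsumer<K, V> task) {
        List<Map.Entry<K, V>> entries = new ArrayList<>(store.entrySet());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int offset = t;
                // Worker t handles entries offset, offset + threads, offset + 2 * threads, ...
                pending.add(pool.submit(() -> {
                    for (int i = offset; i < entries.size(); i += threads) {
                        Map.Entry<K, V> e = entries.get(i);
                        task.accept(e.getKey(), e.getValue());
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for completion and surface any failure
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> store = new ConcurrentHashMap<>();
        for (int i = 1; i <= 100; i++) {
            store.put("k" + i, i);
        }
        AtomicLong sum = new AtomicLong();
        processAll(store, 4, (k, v) -> sum.addAndGet(v));
        System.out.println(sum.get()); // 5050
    }
}
```

The task passed in must be thread-safe, since several workers call it concurrently; the example uses an AtomicLong for that reason.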

API

Now let's take a look at the API in more detail:

The diagram above shows the main classes in the API:

ByteBuffer - abstracts the serialized form of an object
MarshalledEntry - abstracts the information held within a persistent store corresponding to a key-value pair added to the cache. Provides methods for reading this information both in serialized (ByteBuffer) and deserialized (Object) format. Normally data read from the store is kept in serialized format and lazily deserialized on demand, within the MarshalledEntry implementation
CacheWriter and CacheLoader provide basic methods for reading and writing to a store
AdvancedCacheLoader and AdvancedCacheWriter provide operations to manipulate the underlying storage in bulk: parallel iteration and purging of expired entries, clear and size.
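
The lazy deserialization described for MarshalledEntry can be illustrated with a small stand-in. The sketch below is not the Infinispan interface: the class name and methods are hypothetical, and plain Java serialization is used in place of Infinispan's marshaller. It only shows the idea of keeping the bytes as read and deserializing on first access.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical stand-in for the idea behind MarshalledEntry.
final class LazyEntrySketch {
    private final byte[] valueBytes; // serialized form, as read from the store
    private Object value;            // filled in on first access only
    private boolean loaded;
    private int deserializations;

    LazyEntrySketch(byte[] valueBytes) {
        this.valueBytes = valueBytes;
    }

    // Serialized form: can be sent to a remote node without deserializing.
    byte[] getValueBytes() {
        return valueBytes;
    }

    // Deserialized form: computed lazily and cached.
    synchronized Object getValue() {
        if (!loaded) {
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(valueBytes))) {
                value = in.readObject();
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
            loaded = true;
            deserializations++;
        }
        return value;
    }

    synchronized int deserializationCount() {
        return deserializations;
    }

    // Helper used only to produce sample bytes for the demo.
    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        LazyEntrySketch entry = new LazyEntrySketch(serialize("hello"));
        System.out.println(entry.deserializationCount()); // 0: nothing deserialized yet
        System.out.println(entry.getValue());             // hello
        entry.getValue();
        System.out.println(entry.deserializationCount()); // 1: second call reused the object
    }
}
```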

A provider might choose to only implement a subset of these interfaces:

Not implementing the AdvancedCacheWriter makes the given writer not usable for purging expired entries or clear
Not implementing the AdvancedCacheLoader makes the information stored in the given loader not used for preloading, nor for the map/reduce iteration

If you're migrating your existing store to the new API, the SingleFileStore is a good source of inspiration.

Configuration

And finally, the way the stores are configured has changed:

the 5.x loaders element is now replaced with persistence
both the loaders and writers are configured through a single store element (vs loader and store, as allowed in 5.x)
the preload and shared attributes are configured at each individual store, giving more flexibility when it comes to configuring multiple chained stores

Cheers,

Mircea

Thursday, 18 July 2013

Faster file cache store (no extra dependencies!) in 6.0.0.Alpha1

As announced yesterday by Adrian, the brand new Infinispan 6.0.0.Alpha1 release contains a new file-based cache store which needs no extra dependencies. This is essentially a replacement of the existing FileCacheStore which didn't perform as expected, and caused major issues due to the number of files it created.

The new cache store, contributed by Karsten Blees (who also contributed an improved asynchronous cache store), is called SingleFileCacheStore and it keeps all data in a single file. The way it looks up data is by keeping an in-memory index of keys and the positions of their values in this file. This design outperforms the existing FileCacheStore and even the LevelDB-based JNI cache store.
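
The design described here, one data file plus an in-memory index of keys and value positions, can be sketched as follows. This is a simplified illustration and not the SingleFileCacheStore code: the class name is hypothetical, values are plain strings, and details such as reclaiming the space of overwritten values, expiration and rebuilding the index on restart are left out.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one data file, with key -> position kept in memory.
final class SingleFileIndexSketch implements AutoCloseable {
    private final RandomAccessFile file;
    // In-memory index: key -> {offset of the value in the file, length in bytes}.
    private final Map<String, long[]> index = new HashMap<>();

    SingleFileIndexSketch(File path) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
    }

    // Appends the value to the end of the file and records where it landed.
    // An overwritten value simply leaves dead space behind in this sketch.
    synchronized void put(String key, String value) throws IOException {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        long offset = file.length();
        file.seek(offset);
        file.write(bytes);
        index.put(key, new long[] {offset, bytes.length});
    }

    // One index lookup plus one positioned read; the file is never scanned.
    synchronized String get(String key) throws IOException {
        long[] pos = index.get(key);
        if (pos == null) {
            return null;
        }
        byte[] bytes = new byte[(int) pos[1]];
        file.seek(pos[0]);
        file.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    @Override
    public synchronized void close() throws IOException {
        file.close();
    }

    public static void main(String[] args) throws IOException {
        File data = File.createTempFile("single-file-sketch", ".dat");
        data.deleteOnExit();
        try (SingleFileIndexSketch store = new SingleFileIndexSketch(data)) {
            store.put("a", "first");
            store.put("b", "second");
            store.put("a", "updated");
            System.out.println(store.get("a")); // updated
            System.out.println(store.get("b")); // second
        }
    }
}
```

Every key lives in the map for as long as the store is open, so memory use grows with the number and size of the keys.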

The classic case for a file-based cache store is when you want a cache with a locally available cache store that holds data which has overflowed from memory, having exceeded size and/or time restrictions. We ran some performance tests to verify how fast different cache store implementations could deal with reading and writing overflowed data, and these are the results we got (in thousands of operations per second):

FileCacheStore: 0.75k reads/s, 0.285k writes/s
LevelDB-JNI impl: 46k reads/s, 15.2k writes/s
SingleFileCacheStore: 458k reads/s, 137k writes/s

The difference is quite astonishing but as already hinted, this performance increase comes at a cost. Having to maintain an index of keys and positions in the file in memory has a cost in terms of extra memory required, and potential impact on GC. That's why the SingleFileCacheStore is not recommended for use cases where the keys are too big.

In order to help tame this memory consumption issue, the size of the cache store can optionally be limited by providing a maximum number of entries to store in it. However, setting this parameter only works in use cases where Infinispan is used as a cache. When used as a cache, data not present in Infinispan can be recomputed or re-retrieved from the authoritative data store and stored in the Infinispan cache. The reason for this limitation is that once the maximum number of entries is reached, older data in the cache store is removed, so if Infinispan were used as an authoritative data store, this would lead to data loss, which is not good.
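
The effect of capping the number of entries can be shown with a bounded map. The sketch below is an illustration only, not the SingleFileCacheStore implementation: it uses a hypothetical class built on LinkedHashMap and assumes the oldest written entry is the one dropped, which may not match the real store's eviction order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded store: once full, the oldest written entry is dropped.
final class BoundedStoreSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedStoreSketch(int maxEntries) {
        super(16, 0.75f, false); // insertion order: the eldest entry is the oldest write
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; true means "drop the eldest".
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedStoreSketch<String, String> store = new BoundedStoreSketch<>(2);
        store.put("a", "1");
        store.put("b", "2");
        store.put("c", "3"); // exceeds the limit, so "a" is silently removed
        System.out.println(store.containsKey("a")); // false
        System.out.println(store.keySet());         // [b, c]
    }
}
```

The entry for "a" is gone without any error, which is harmless for a cache that can reload it from elsewhere but amounts to data loss if this were the only copy.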

Existing FileCacheStore users might wonder: what is going to happen to the existing FileCacheStore? We're not 100% sure yet what we're going to do with it, but we're looking into some ways to migrate data from the FileCacheStore to the SingleFileCacheStore. Some interesting ideas have already been submitted which we'll investigate in upcoming Infinispan 6.0 pre-releases.

So, if you're a FileCacheStore user, give the new SingleFileCacheStore a go and let us know how it goes! Switching from one to the other is easy :)

Cheers,

Galder

Wednesday, 17 July 2013

Infinispan 6.0.0.Alpha1 is out!

Dear Infinispan community,

We're proud to announce the first Alpha release of Infinispan 6.0.0. Starting with this release, Infinispan's license is moving to the terms of the Apache Software License, version 2.0.

Besides increased stability (about 30 bug fixes) this release also brings several new features:

A more efficient FileCacheStore implementation (courtesy of Karsten Blees)
A new set of usage and performance statistics developed within the scope of the CloudTM project
A new (experimental) marshaller for Hot Rod based on protobuf, which will be primarily used by the upcoming remote querying feature. Since this has reuse potential in other projects it was promoted to an independent project named protostream under the Infinispan umbrella

For a complete list of features and fixes included in this release please refer to the release notes.

Visit our downloads section to find the latest release and if you have any questions please check our forums, our mailing lists or ping us directly on IRC.

Thanks to everyone for their involvement and contribution!

Cheers,

Adrian
