Website: https://www.instaclustr.com/
Documentation: https://www.instaclustr.com/support/documentation/
An EverywhereStrategy implementation for Apache Cassandra.
This is useful for performing DSE Cassandra → Apache Cassandra migrations.
The remainder of this README refers to DSE Cassandra as simply DSE, and Apache Cassandra as Cassandra.
Simply install the JAR into the classpath on all Cassandra nodes. The JAR contains an implementation of EverywhereStrategy that is compatible with Cassandra.
We offer a packaged version of instaclustr-everywhere-strategy for systems where Cassandra has been installed via the official Apache.org Debian or RPM package.
This package will automatically install instaclustr-everywhere-strategy into an appropriate location for the Cassandra package install (i.e. $CASSANDRA_HOME/lib, which at present is /usr/share/cassandra/lib).
Note: These packages have a hard dependency on a cassandra package.
If Cassandra hasn't been installed via your distribution's package manager, installing instaclustr-everywhere-strategy may force the Cassandra package to be installed. This may conflict with a tarball install.
See Cassandra Tarball Installs below on how to install instaclustr-everywhere-strategy for tarball installs of Cassandra.
(Debian, Ubuntu, et al.)
- Add the instaclustr/debian repository.

      echo "deb https://dl.bintray.com/instaclustr/debian stable main" > \
          /etc/apt/sources.list.d/instaclustr.sources.list

- Run apt-get update to fetch the contents of the new package repository.
- Run apt-get install instaclustr-everywhere-strategy to install the package.
(RHEL, Fedora, CentOS, et al.)
- Add the instaclustr/rpm repository.

      wget -O - https://bintray.com/instaclustr/rpm/rpm | \
          sudo tee /etc/yum.repos.d/instaclustr.repo

- Run dnf install instaclustr-everywhere-strategy to install the package.

  Hint: For YUM-based distributions the command is yum install instaclustr-everywhere-strategy.
- Download the latest instaclustr-everywhere-strategy JAR from the releases page.
- Install the instaclustr-everywhere-strategy JAR into the Cassandra classpath.

  Typically the best location is $CASSANDRA_HOME/lib.
- Restart Cassandra.
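The copy step for a tarball install can be sketched as a small shell helper. This is only an illustration: `install_strategy_jar` is a hypothetical function name, not something shipped with this project, and the JAR path passed to it is whatever file you downloaded from the releases page.

```shell
# Hypothetical helper: copy a downloaded JAR into the Cassandra lib directory.
# Usage: install_strategy_jar /path/to/downloaded.jar [lib_dir]
# If lib_dir is omitted, $CASSANDRA_HOME/lib is used.
install_strategy_jar() {
    jar_path=${1:-}
    lib_dir=${2:-}

    if [ -z "$lib_dir" ]; then
        # Refuse to guess a location when CASSANDRA_HOME is not set.
        if [ -z "${CASSANDRA_HOME:-}" ]; then
            echo "CASSANDRA_HOME is not set" >&2
            return 1
        fi
        lib_dir=$CASSANDRA_HOME/lib
    fi

    if [ ! -f "$jar_path" ]; then
        echo "JAR not found: $jar_path" >&2
        return 1
    fi
    if [ ! -d "$lib_dir" ]; then
        echo "lib directory not found: $lib_dir" >&2
        return 1
    fi

    cp "$jar_path" "$lib_dir/" || return 1
    echo "Installed $(basename "$jar_path") into $lib_dir"
}
```

The helper only copies the file. Cassandra still has to be restarted on each node afterwards, as described in the last step above.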
Some automated tests leveraging Cassandra Cluster Manager (CCM) exist in the test/ directory.
The basic gist of testing is as follows:
- Create a new keyspace using EverywhereStrategy:

      CREATE KEYSPACE example WITH replication = {'class': 'EverywhereStrategy'};

  The strategy is installed correctly if the keyspace is created successfully.

- Create a table under the new keyspace, and insert some data:

      CREATE TABLE example.demo (a text PRIMARY KEY);
      INSERT INTO example.demo (a) VALUES ('a');
      INSERT INTO example.demo (a) VALUES ('b');
      INSERT INTO example.demo (a) VALUES ('c');
      INSERT INTO example.demo (a) VALUES ('d');

  The strategy is functioning correctly if the data is replicated to all nodes.

- Run nodetool flush on every node.
- Run nodetool compact on every node.
- Run sstabledump on the table SSTables from each node.
- Compare the JSON output from each node and confirm that the data in each dump is identical.
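The final comparison step can be sketched in shell. The sketch assumes the sstabledump output from each node has already been saved to a local file, one file per node. `compare_dumps` and the file names in the usage line are hypothetical, and the comparison is a plain byte-for-byte check rather than a JSON-aware diff.

```shell
# Hypothetical helper: check that every dump file is identical to the first one.
# Usage: compare_dumps node1.json node2.json node3.json
# Returns 0 if all files match, 1 if any differ, 2 on bad usage.
compare_dumps() {
    if [ "$#" -lt 2 ]; then
        echo "need at least two dump files" >&2
        return 2
    fi

    reference=$1
    shift
    status=0

    for dump in "$@"; do
        # cmp -s is silent and exits non-zero when the files differ.
        if ! cmp -s "$reference" "$dump"; then
            echo "MISMATCH: $dump differs from $reference" >&2
            status=1
        fi
    done

    return "$status"
}
```

For example, `compare_dumps node1.json node2.json node3.json` exits with status 0 only when all three dumps are identical.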
See DSE → Cassandra Migration Test Results for the results of running the DSE → Cassandra end-to-end tests.
DSE uses an internal EverywhereStrategy implementation for various dse_* keyspaces.
When joining a Cassandra node to a DSE cluster these keyspaces will cause ClassNotFound exceptions to be thrown on the Cassandra node.
These exceptions result in a schema disagreement.

In the system.log for a Cassandra node:

    ERROR [InternalResponseStage:1] MigrationTask.java:95 - Configuration exception merging remote schema
    org.apache.cassandra.exceptions.ConfigurationException: Unable to find replication strategy class 'org.apache.cassandra.locator.EverywhereStrategy'
    <stacktrace snipped>

and nodetool describecluster:

    Cluster Information:
        Name: test-cluster
        Snitch: org.apache.cassandra.locator.GossipingPropertyFileSnitch
        DynamicEndPointSnitch: enabled
        Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
        Schema versions:
            850859e7-fcca-3516-9d8c-e9a9a205c974: [127.0.0.1, 127.0.0.2, 127.0.0.3]
            e84b6a60-24cf-30ca-9b58-452d92911703: [127.0.1.1, 127.0.1.2, 127.0.1.3]

In the output above, IPs 127.0.0.* are DSE nodes, 127.0.1.* are Cassandra nodes.
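One way to spot this condition from a script is to count the schema versions that nodetool describecluster reports. The following is a minimal sketch that assumes the output has the shape shown above. `count_schema_versions` is a hypothetical helper name, and the awk pattern is a guess at what a schema version line looks like rather than a documented format.

```shell
# Hypothetical helper: read `nodetool describecluster` output on stdin and
# print the number of schema version entries listed under "Schema versions:".
count_schema_versions() {
    awk '
        # Start counting only after the "Schema versions:" header.
        /Schema versions:/ { in_versions = 1; next }
        # A version entry is a UUID-like token, a colon, then a bracketed host list.
        # Lines that do not start with such a token are ignored.
        in_versions && /^[ \t]*[0-9a-fA-F-]+: *\[/ { count++ }
        END { print count + 0 }
    '
}
```

Running `nodetool describecluster | count_schema_versions` against the sample output above would print 2. Any value greater than 1 suggests the nodes disagree on the schema.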
One common solution is to ALTER the dse_* keyspaces to use NetworkTopologyStrategy before joining Cassandra nodes to the cluster.
While this works, it's also dangerous.
DSE nodes reset the replication strategy back to EverywhereStrategy on startup.
As a result, if any DSE nodes restart while Cassandra nodes are present in the cluster then schema disagreement will again occur.
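For reference, the ALTER workaround described above amounts to statements like the ones this sketch prints. It is shown only to make the workaround concrete, and it remains subject to the restart problem just described. `print_alter_statements` is a hypothetical helper, and the data center name, replication factor, and keyspace name in the usage line are placeholder assumptions to be replaced with values from your own cluster.

```shell
# Hypothetical helper: print one ALTER KEYSPACE statement per keyspace.
# Usage: print_alter_statements DATACENTER REPLICATION_FACTOR KEYSPACE...
# The statements are only printed; nothing is executed against a cluster.
print_alter_statements() {
    if [ "$#" -lt 3 ]; then
        echo "usage: print_alter_statements DC RF KEYSPACE..." >&2
        return 2
    fi

    dc=$1
    rf=$2
    shift 2

    for ks in "$@"; do
        printf "ALTER KEYSPACE %s WITH replication = {'class': 'NetworkTopologyStrategy', '%s': %s};\n" \
            "$ks" "$dc" "$rf"
    done
}
```

For example, `print_alter_statements dc1 3 dse_system` prints a single statement that could be reviewed and then run through cqlsh.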
Our EverywhereStrategy implementation extends NetworkTopologyStrategy.
This is required because various core components inside Cassandra (e.g. ConsistencyLevel) perform instanceof NetworkTopologyStrategy checks when they need to be data center aware.
Yet, NetworkTopologyStrategy wasn't designed to be extended.
A number of its fields are private, final, and immutable, including datacenters, which is the DC→RF mapping.
So we resort to reflection to fix this. Yuck! But, it works…
Cassandra Version | Status |
---|---|
4.x | Supported |
3.11.x | Supported |
3.0.x | Supported |
2.2.x | Supported |
2.1.x | Supported |
2.0.x | Supported |
For Cassandra 2.1.x and 2.0.x, use version 2.2.x, which is compatible with both.
This project is licensed under the Apache License, version 2.0. See LICENSE for details.
Please see our Open Source Project Status page for details on Instaclustr's support status of this project.