Apache Cassandra Migration, Last Episode: 3.x to 4.x Migration

In the previous blog posts of this series, we covered the basics and the X-DR setup in Cassandra. Finally, we have arrived at the last phase: migrating Cassandra to a newer version.

The last two years have been great for our database journey. We searched for the best database for our application and ended up with Cassandra. It proved to be quite fast, exactly what we needed for our use case. We started with Cassandra 3.1.2, which was fine, but after running it for a year we realized it was time to upgrade: by then, v3.1.2 had accumulated a number of known vulnerabilities.

Obviously, experimenting on production was risky, so we decided to start with our non-production environment first. It was running the same version, 3.1.2, with the same X-DR setup. I explained the X-DR setup in the previous blog, which you may want to refer to for context.

We began by reviewing all the documentation related to Cassandra to understand its upgrade process.

Before doing any upgrade on a database, the first thing to do is secure the data, because the data matters most. Therefore, we started with a data backup.


SSTables are where all of your data is stored in Cassandra. We first made a copy of our data by attaching an additional disk and mounting it on each node. We also took a snapshot by running the command below on each node.
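A snapshot can be taken with `nodetool snapshot`; the tag name and the data path in the comment below are illustrative, not from our actual setup:

```shell
# Flush memtables and take a point-in-time snapshot of all keyspaces.
# The tag "pre-4x-upgrade" is just an example label.
nodetool snapshot -t pre-4x-upgrade

# Snapshots land under each table's data directory, for example:
#   /var/lib/cassandra/data/<keyspace>/<table>-<id>/snapshots/pre-4x-upgrade/
```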

Now our data is secured, which means that if something goes wrong we have a rollback plan: shift back to the older version, point the data directory at the backup disk, and start the Cassandra service.
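As a rough sketch, the rollback would look like this (the package version, device name, and mount point here are assumptions for illustration, not our exact values):

```shell
# Hypothetical rollback sketch: revert to the old packages and backed-up data.
service cassandra stop
rpm -e cassandra             # remove the newer packages
yum install -y cassandra-3.1.2  # reinstall the previous version

# Point Cassandra's data directory at the backup disk (paths are illustrative).
mount /dev/sdb1 /var/lib/cassandra/data

service cassandra start
```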

IT’S SHOW TIME!

We wanted an in-place upgrade, and our cluster runs as an X-DR setup. So we decided to upgrade one cluster at a time; the application points to the other cluster in the meantime, which reduces the chances of application downtime.

Here are the procedures we adhered to for all the nodes in the clusters:

  1. We first took a backup of the configuration files cassandra.yaml and cassandra-rackdc.properties. Both files are necessary to run Cassandra; I explained them in the previous blog on the X-DR setup.
  2. List all Cassandra packages currently installed: rpm -qa | grep cassandra
  3. Drain the node: nodetool drain. This makes the node stop accepting writes and flushes all memtables to SSTables on disk.
  4. Stop the service: service cassandra stop
  5. Remove the node from the cluster (run this from another live node, since this one is now stopped): nodetool removenode <<HOSTID>>
  6. After successfully removing the node, you are halfway through.
  7. Remove the old Cassandra packages: rpm -e cassandra*
  8. Install the new Cassandra version: yum install cassandra-4.1.2
  9. After installation, copy the backed-up Cassandra configuration files back to /etc/cassandra/conf.
  10. Start the Cassandra service: service cassandra start
  11. Watch the logs: /var/log/cassandra/debug.log
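Taken together, the per-node steps above can be sketched as a shell script. The backup path, package version, and host ID placeholder are assumptions from our environment, and the removenode step must be run on a different, live node:

```shell
#!/bin/sh
# Per-node upgrade sketch; run as root on the node being upgraded.
set -e

# 1. Back up the configuration files (backup path is illustrative).
cp /etc/cassandra/conf/cassandra.yaml /backup/
cp /etc/cassandra/conf/cassandra-rackdc.properties /backup/

# 2-4. Record installed packages, drain, and stop the node.
rpm -qa | grep cassandra
nodetool drain
service cassandra stop

# 5. On ANOTHER live node, remove this node from the cluster:
#    nodetool removenode <HOSTID>

# 7-8. Swap the packages.
rpm -e 'cassandra*'
yum install -y cassandra-4.1.2

# 9-10. Restore the configuration (after reviewing new 4.x parameters) and start.
cp /backup/cassandra.yaml /backup/cassandra-rackdc.properties /etc/cassandra/conf/
service cassandra start

# 11. Watch the logs.
tail -f /var/log/cassandra/debug.log
```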

NOTE: cassandra.yaml has changes to several configuration parameters between 3.x and 4.x that need to be reviewed before executing step 9. In the previous blog, I posted the yaml.

Now Cassandra is up and running on the new version, which can be verified using the nodetool describecluster command.
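For example (the exact output fields vary by version):

```shell
# Shows cluster name, snitch, partitioner, and schema versions; after the
# upgrade, all live nodes should agree on a single schema version.
nodetool describecluster

# The release version of the local node can also be checked directly:
nodetool version
```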


Now your Cassandra is on 4.x.x, but the SSTables are still in the old on-disk format. To upgrade them, run the command below on each node.
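The standard command for this is nodetool upgradesstables, run node by node; the keyspace and table names in the comment are placeholders:

```shell
# Rewrite all SSTables that are not already on the current format version.
nodetool upgradesstables

# The rewrite can also be limited to a single keyspace or table, for example:
#   nodetool upgradesstables <keyspace> <table>
```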

In our case, upgrading the SSTables took almost an hour per node. After that, you have successfully upgraded the cluster to 4.x.

Congratulations, we have done it!

NOTE: Upgrading SSTables is necessary because the on-disk SSTable format changes between major versions (and occasionally between minor ones); running upgradesstables rewrites the data files in the new format and avoids version incompatibility issues.

Wrapping Up

We have reached the end of the blog; this is the last post of our Cassandra migration series. Do pay attention to the tips (and notes) to avoid errors, and share your opinions or questions in the comments.

Blog Pundits: Pankaj Kumar and Sandeep Rawat

OpsTree is an End-to-End DevOps Solution Provider.

