Install Apache Cassandra on Ubuntu 20.04 in a few Easy Steps

Introduction

Before we look at how to install Apache Cassandra on Ubuntu 20.04, let's briefly cover what Apache Cassandra is.

Apache Cassandra is a powerful open-source distributed database that excels in handling massive amounts of data across multiple servers. It offers exceptional scalability, fault tolerance, and high availability, making it ideal for big data applications.

Cassandra's flexible data model allows for easy horizontal scaling, ensuring efficient storage and retrieval of information. With its distributed architecture and linear scalability, Apache Cassandra empowers businesses to manage large-scale data with speed and reliability, enabling them to make informed decisions based on real-time insights.

This tutorial explains how to install Apache Cassandra on Ubuntu 20.04. We will also address a few FAQs about the installation.

Advantages of Apache Cassandra

  1. Scalability: Apache Cassandra scales effortlessly to handle large data volumes and increasing workloads.
  2. Fault Tolerance: It ensures data integrity and availability, even in the face of hardware or network failures.
  3. High Performance: With its distributed architecture, Cassandra offers low-latency data access and fast write speeds.
  4. Flexibility: It provides a flexible data model, allowing for easy schema changes and accommodating diverse data types.
  5. Linear Scalability: Cassandra's linear scalability allows for seamless expansion by adding more servers, without sacrificing performance.

Installing Java

This tutorial installs Apache Cassandra from the 3.11 release series (the 311x repository used below), which requires Java 8. We will use OpenJDK 8.

To install OpenJDK 8, run the following commands as root or as a user with sudo privileges:

sudo apt update
sudo apt install openjdk-8-jdk

Verify the Java installation by printing the Java version:

java -version

The output should look something like this:

Output

openjdk version "1.8.0_265"
OpenJDK Runtime Environment (build 1.8.0_265-8u265-b01-0ubuntu2~20.04-b01)
OpenJDK 64-Bit Server VM (build 25.265-b01, mixed mode)
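
If you want to check the Java version from a script rather than by eye, the following is a minimal sketch. The function name java_major_version is hypothetical (it is not part of Java or Cassandra), and the sketch assumes the banner has the `version "..."` shape shown above.

```shell
# Extract the major Java version from the first line of `java -version` output.
# Handles the legacy "1.8.0_265" scheme (Java 8) and the newer "11.0.2" scheme.
java_major_version() {
    # $1: a banner line such as: openjdk version "1.8.0_265"
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$version" in
        1.*) printf '%s\n' "$version" | cut -d. -f2 ;;  # 1.8.x  -> 8
        *)   printf '%s\n' "$version" | cut -d. -f1 ;;  # 11.0.x -> 11
    esac
}

# Usage (java prints its version banner to stderr, hence 2>&1):
#   java_major_version "$(java -version 2>&1 | head -n 1)"
```

For the Cassandra release installed in this tutorial, the result should be 8.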

Installing Apache Cassandra

Install the dependencies necessary to add a new repository over HTTPS:

sudo apt install apt-transport-https

Import the repository’s GPG key and add the Cassandra repository to the system:

wget -q -O - https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -

echo "deb http://www.apache.org/dist/cassandra/debian 311x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
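
The 311x component in the repository line selects the Cassandra 3.11 release series. As a small sketch, the line can be generated from a series name. The helper cassandra_repo_line is a hypothetical name, and the base URL is simply the one used in this tutorial.

```shell
# Build the APT source line for a given Cassandra release series.
cassandra_repo_line() {
    series=$1   # e.g. 311x for the 3.11 series
    printf 'deb http://www.apache.org/dist/cassandra/debian %s main\n' "$series"
}

# Usage: writing with plain `tee` (no -a) overwrites the file, so re-running
# the command does not leave duplicate entries behind:
#   cassandra_repo_line 311x | sudo tee /etc/apt/sources.list.d/cassandra.sources.list
```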

Update the packages list and install the most recent version of Apache Cassandra after the repository has been enabled:

sudo apt update
sudo apt install cassandra

The Apache Cassandra service starts automatically after installation. Check the service status, restart the service if it is not running, and then check the node state with nodetool:

sudo service cassandra status
sudo service cassandra restart

nodetool status

You should see something similar to this:

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load    Tokens  Owns (effective)  Host ID                               Rack
UN  127.0.0.1  70 KiB  256     100.0%            2eaab399-be32-49c8-80d1-780dcbab694f  rack1
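
In the output above, UN means the node is Up and in the Normal state. The sketch below counts the nodes that are in any other state. The function name count_unhealthy_nodes is hypothetical, and the sketch assumes node rows begin with a two-letter status/state code, as in the sample output.

```shell
# Count nodes that are not in the "UN" (Up/Normal) state.
# Reads `nodetool status` output on stdin. Header lines such as "--" or
# "Datacenter:" do not match the two-letter code pattern and are ignored.
count_unhealthy_nodes() {
    awk '$1 ~ /^[UD][NLJM]$/ && $1 != "UN" { bad++ } END { print bad + 0 }'
}

# Usage:
#   nodetool status | count_unhealthy_nodes
```

A result of 0 means every listed node is up and in the normal state.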

That’s it. At this point, you have Apache Cassandra installed on your Ubuntu server.

Configuring Apache Cassandra

Apache Cassandra stores its data in the /var/lib/cassandra directory and its configuration files in /etc/cassandra, whereas the /etc/default/cassandra file allows you to customize Java startup settings.

Cassandra is set up by default to only listen on localhost. You don't need to modify the default configuration file if the client that is connecting to the database is also operating on the same host.

The command-line program cqlsh that comes with the Cassandra package can be used to communicate with Cassandra using CQL (the Cassandra Query Language).

cqlsh

Output

Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.7 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh>
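
cqlsh can also run a statement non-interactively with its -e option. As an illustration, the sketch below prints a CQL statement that creates a keyspace. The function name keyspace_cql and the keyspace name demo are hypothetical, and SimpleStrategy with a replication factor of 1 is only a reasonable choice for a single-node test setup.

```shell
# Print a CQL statement that creates a keyspace with SimpleStrategy replication.
# $1: keyspace name, $2: replication factor (both illustrative parameters).
keyspace_cql() {
    name=$1
    rf=$2
    printf "CREATE KEYSPACE IF NOT EXISTS %s WITH replication = {'class': 'SimpleStrategy', 'replication_factor': %s};\n" "$name" "$rf"
}

# Usage:
#   cqlsh -e "$(keyspace_cql demo 1)"
```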

Renaming Apache Cassandra Cluster

The default Cassandra cluster is named “Test Cluster”. If you want to change the cluster name, perform the steps below:

Log in to the Cassandra CQL terminal with cqlsh:

cqlsh

Run the following command to change the cluster name to “PeerXP Cluster”:

UPDATE system.local SET cluster_name = 'PeerXP Cluster' WHERE KEY = 'local';

Replace “PeerXP Cluster” with your desired name.

Once done, type exit to exit the console.

Open the cassandra.yaml configuration file and enter your new cluster name.

sudo vim /etc/cassandra/cassandra.yaml

Set the cluster_name entry to the same name you used in the UPDATE statement above:

cluster_name: 'PeerXP Cluster'

Save and close the file.
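
If you prefer not to edit the file by hand, the same change can be sketched with sed. The function name set_cluster_name is hypothetical. The sketch assumes GNU sed (as shipped with Ubuntu), a file with exactly one line starting with cluster_name:, and a new name that contains no |, &, backslash, or single-quote characters.

```shell
# Replace the cluster_name line in a cassandra.yaml-style file, in place.
# $1: path to the file, $2: new cluster name.
set_cluster_name() {
    file=$1
    name=$2
    sed -i "s|^cluster_name:.*|cluster_name: '${name}'|" "$file"
}

# Usage (needs write access to the file, for example from a root shell):
#   set_cluster_name /etc/cassandra/cassandra.yaml 'PeerXP Cluster'
```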

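If this is a fresh installation with no data you need to keep, you can reset the node so that it starts cleanly under the new name. Warning: the rm command below permanently deletes all Cassandra data and logs on this host, so skip this step on any node that holds data you want to keep. Stop the service first so that Cassandra is not running while its files are removed:

sudo service cassandra stop
sudo rm -rf /var/lib/cassandra/* /var/log/cassandra/*
sudo service cassandra start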

Flush the system keyspace to disk so that the change is persisted:

nodetool flush system

Restart the Cassandra service:

sudo systemctl restart cassandra
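
After a restart, Cassandra needs some time before it accepts client connections, so a cqlsh call made immediately afterwards may fail. One way to handle this in a script is a small retry loop. The function name retry is hypothetical, and the attempt count and delay in the usage line are arbitrary values.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# $1: maximum attempts, $2: seconds to wait between attempts, rest: the command.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Only wait if another attempt will follow.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage: try for up to about a minute until cqlsh can connect.
#   retry 30 2 cqlsh -e 'SELECT cluster_name FROM system.local;'
```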

Finally, verify that the cluster name has changed. The new name appears in the "Connected to" line that cqlsh prints on startup:

cqlsh
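
The name can also be checked from a script with nodetool describecluster. The sketch below assumes that its output contains a line of the form "Name: <cluster name>". The function name cluster_name_from_describe is hypothetical.

```shell
# Extract the cluster name from `nodetool describecluster` output on stdin.
# Leading whitespace before "Name:" is ignored; only the first match is printed.
cluster_name_from_describe() {
    sed -n 's/^[[:space:]]*Name:[[:space:]]*//p' | head -n 1
}

# Usage:
#   [ "$(nodetool describecluster | cluster_name_from_describe)" = "PeerXP Cluster" ] && echo "renamed"
```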

FAQs to Install Apache Cassandra on Ubuntu 20.04

What are the system requirements for installing Apache Cassandra on Ubuntu 20.04?

For production use, plan for at least 8 GB of RAM and a multi-core processor on a 64-bit operating system, with a minimum of 50 GB of free disk space. A single-node test installation can run on less.
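
To compare a host against these figures, a rough sketch like the following reports total RAM and free disk space. The function names kib_to_gib and report_resources are hypothetical, and the sketch assumes a Linux host where /proc/meminfo is available.

```shell
# Convert kibibytes to whole gibibytes (rounded down).
kib_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Report total RAM and the free space on the filesystem holding the given
# path (defaults to /). Reads /proc/meminfo and the output of `df -Pk`.
report_resources() {
    mem_kib=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    disk_kib=$(df -Pk "${1:-/}" | awk 'NR == 2 { print $4 }')
    echo "RAM: $(kib_to_gib "$mem_kib") GiB, free disk: $(kib_to_gib "$disk_kib") GiB"
}

# Usage:
#   report_resources /var/lib
```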

How do I start and stop Apache Cassandra on Ubuntu 20.04?

To start Apache Cassandra, use the command: sudo service cassandra start. To stop it, use: sudo service cassandra stop.

Where is the configuration file for Apache Cassandra located on Ubuntu 20.04?

The configuration file for Apache Cassandra is located at /etc/cassandra/cassandra.yaml.

How can I check the status of Apache Cassandra on Ubuntu 20.04?

You can check the status of Apache Cassandra by running the command: nodetool status.

How do I enable remote access to Apache Cassandra on Ubuntu 20.04?

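To enable remote access, edit the Cassandra configuration file (/etc/cassandra/cassandra.yaml): set listen_address to the IP address of your server and set rpc_address to 0.0.0.0. When rpc_address is 0.0.0.0, you must also set broadcast_rpc_address to an address that clients can reach, such as the server's IP address. Restart Cassandra for the changes to take effect, and make sure your firewall allows the CQL port (9042 by default).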
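
As a sketch, these settings can be applied with a small helper instead of a text editor. The function name set_yaml_key is hypothetical and 192.0.2.10 is a placeholder address. The sketch assumes GNU sed and that each key appears at most once at the start of a line, either active or commented out with "#". The third call in the usage example sets broadcast_rpc_address, which Cassandra expects to be a reachable address when rpc_address is 0.0.0.0.

```shell
# Set "key: value" in a YAML file. An existing top-level entry (active or
# commented out) is replaced; if none is found, the entry is appended.
set_yaml_key() {
    file=$1
    key=$2
    value=$3
    if grep -Eq "^(# ?)?${key}:" "$file"; then
        sed -i -E "s|^(# ?)?${key}:.*|${key}: ${value}|" "$file"
    else
        printf '%s: %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (needs write access to the file; restart Cassandra afterwards):
#   set_yaml_key /etc/cassandra/cassandra.yaml listen_address 192.0.2.10
#   set_yaml_key /etc/cassandra/cassandra.yaml rpc_address 0.0.0.0
#   set_yaml_key /etc/cassandra/cassandra.yaml broadcast_rpc_address 192.0.2.10
```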

How can I access the Apache Cassandra command-line interface (CLI) on Ubuntu 20.04?

You can access the Apache Cassandra CLI by running the command: cqlsh.

How do I uninstall Apache Cassandra from Ubuntu 20.04?

To uninstall Apache Cassandra, use the command: sudo apt purge cassandra. Additionally, you can remove any remaining configuration files using: sudo rm -rf /etc/cassandra.

Conclusion

In this tutorial, you installed Apache Cassandra on Ubuntu 20.04 LTS.

If you have any queries, please leave a comment below and we’ll be happy to respond to them.