Implementation of a Cloud HA infrastructure with Keepalived

Introduction

This tutorial describes how to set up a Cloud HA cluster using keepalived and Floating IPs. The tutorials uses two servers as an example. If the first (original) server fails, the Floating IP is automatically assigned to the second server. As soon as the original server is available again, the Floating IP is automatically assigned back to the first server. This ensures that the web application is still available even if the server fails.

It should be noted that this form of high availability is not supported by all applications. This should be checked in the first place.

Prerequisites

Min. 2 cloud servers
One Floating IP

Recommended

A separate cloud project should be created for the HA service, since critical access data must be stored in clear text on the servers. With a separate project, the damage can be minimized in the event of a possible compromise.

Step 1 - Setting up automatic IP rerouting

This step is about automatic failover, where the Floating IP is automatically assigned to the other server, so that the service can operate under the same address.

Step 1.1 - Creation of a Cloud API token

This token is required to control the Floating IP assignment from the server. In the Cloud Console, a read/write API token must be created under "Security" » "API tokens".

Note: Since the token can only be read once, it is recommended to store it temporarily.

Step 1.2 - Create a Cloud Network

For the VRRP Heartbeat, a private network channel is needed. For this, Cloud Networks are used.

You can create a Cloud Network in the Cloud Console under "Networks".

There, you can create a Network and assign it to the servers.

Step 1.3 - Installing the IP Failover Software

In this step, the software responsible for provisioning the Floating IP is installed on both servers.

This is the open-source software hcloud-ip. To install the software, you can either use the prebuilt binaries or compile the software yourself.

For an installation with the prebuilt binaries, the following steps are necessary

wget -O /opt/hcloud-ip https://github.com/FootprintDev/hcloud-ip/releases/download/v0.0.1/hcloud-ip-linux64
chmod +x /opt/hcloud-ip

To compile the software yourself, the following steps are necessary

Installing the dependencies

Ubuntu / Debian:

apt install git wget

CentOS / RHEL:

yum install git wget

Fedora:

dnf instal git wget

openSUSE / SLES:

zypper install git wget

Golang installation

wget https://go.dev/dl/go1.21.4.linux-amd64.tar.gz
tar xfvz go1.21.4.linux-amd64.tar.gz
export PATH=$(pwd)/go/bin:$PATH

Cloning the repository

git clone https://github.com/FootprintDev/hcloud-ip /tmp/hcloud-ip && cd /tmp/hcloud-ip

Build the Project
```
go build
```

Now there should be a program named hcloud-ip in the current folder. You can make it executable and save it respectively.

chmod +x hcloud-ip
mv hcloud-ip /opt

Step 1.4 - Configuring the Floating IP

In order for the Floating IP to work on all servers in the event of a failover, you need to include it in the network configuration. You can find instructions for this at docs.hetzner.com.

Step 2 - Setting up Keepalived

Keepalived is a Linux daemon that monitors systems or services and triggers a failover in case of an error. You need to install it on both servers.

Step 2.1 - Installing Keepalived

Ubuntu / Debian:

apt install keepalived

CentOS / RHEL:

yum install keepalived

Fedora:

dnf instal keepalived

openSUSE / SLES:

zypper install keepalived

Step 2.2 - Enable Keepalived autostart

Systemd based systems:

systemctl enable keepalived

CentOS / RHEL

chkconfig keepalived on

Step 2.3 - Configuration of Keepalived

The configuration shown here corresponds to an example using an HA web server (nginx).

Configuration file on the master server

/etc/keepalived/keepalived.conf

vrrp_script chk_nginx {
    script "/usr/bin/pgrep nginx"
    interval 2
    weight -101
}

vrrp_instance VI_1 {
    interface [cloud_private_network_adapter]
    state MASTER
    priority 200

    virtual_router_id 30
    unicast_src_ip [master_private_IP]
    unicast_peer {
        [backup_private_IP]
    }

    authentication {
        auth_type PASS
        auth_pass [random_password]
    }

    track_script {
        chk_nginx
    }

    notify_master /etc/keepalived/failover.sh
}

Note: The outlined values [] are to be replaced by your own specifications. For auth_pass, you can pick any random password. However, make sure to use the same password for the master and the backup.

Configuration file on the backup server

/etc/keepalived/keepalived.conf

vrrp_script chk_nginx {
    script "/usr/bin/pgrep nginx"
    interval 2
}

vrrp_instance VI_1 {
    interface [cloud_private_network_adapter]
    state BACKUP
    priority 100

    virtual_router_id 30
    unicast_src_ip [backup_private_IP]
    unicast_peer {
        [master_private_IP]
    }

    authentication {
        auth_type PASS
        auth_pass [random_password]
    }

    track_script {
        chk_nginx
    }

    notify_master /etc/keepalived/failover.sh
}

Note: The outlined values [] are to be replaced by your own specifications. For auth_pass, you can pick any random password. However, make sure to use the same password for the master and the backup.

Contents of failover.sh on both servers

/etc/keepalived/failover.sh

The script contains the actions that are to be executed in the event of a failover.

#!/bin/bash
IP='[Floating-IP-Name]'
TOKEN='[CloudToken]'

n=0
while [ $n -lt 10 ]
do
    if [ "$(/opt/hcloud-ip -ip "$IP" -key $TOKEN)" == "Server called $HOSTNAME was found" ]; then
        break
    fi
    n=$((n+1))
    sleep 3
done

Note: The outlined values [] are to be replaced by your own specifications.

Make the failover.sh file executable:

chmod 755 /etc/keepalived/failover.sh

On systemd based systems, you can now start keepalived:

systemctl start keepalived

Step 3 - Testing the configuration

In normal operation, the master web server handles all requests.

As soon as this fails, there is a failover to the backup web server.

As soon as the master web server is reachable again, it is switched back to.

Conclusion

You should now have a High Availability Cloud environment.

Implementation of a Cloud HA infrastructure with Keepalived

Author

Published

Time to read

Table of Contents