Get Rewarded! We will reward you with up to €50 credit on your account for every tutorial that you write and we publish!

Install a Nomad cluster with Consul on cloud servers

profile picture
Author
Dominik Langer
Published
2024-11-18
Time to read
13 minutes reading time

Introduction

In this tutorial we will set up a HashiCorp Nomad cluster with Consul for service and node discovery. With 3 server nodes and a custom amount of client nodes this should serve as a basis for growing projects. We will also create a Hetzner Cloud Snapshot for our clients that enables us to add further clients without any manual setup. The cluster will run on a private network between the servers and supports all out of the box features of Nomad and Consul, like service discovery and volumes.

This tutorial follows in part the recommended steps from the official Consul deployment guide and Nomad deployment guide.

Prerequisites

  • A Hetzner Cloud account
  • Basic knowledge of Linux commands and the shell
  • The ability to connect to a server with ssh

This tutorial was tested on Ubuntu 24.04 Hetzner Cloud servers with Nomad 1.9.3 and Consul 1.20.1

Terminology and Notation

Commands

local$ <command>  # This command must be executed on your local machine
server$ <command> # This command must be executed on the server as root

Step 1 - Create the base image

In this step the following resource will be used

  • 1 Hetzner cloud CX22 server

We will start by setting up a Consul / Nomad server on a new CX22 cloud server. The resulting Snapshot will serve as the base image for all cluster servers and clients in the following steps.

This guide will show the setup for 3 servers. This will make the cluster highly-available, but not too expensive. Single server setups are not recommended, but possible. The tutorial will include comments on what to change for only 1 or more than 3 servers in the respective steps.

Go to the Hetzner cloud web interface and create a new CX22 server with Ubuntu 24.04.

Step 1.1 - Install Consul

Install Consul

For more information about available versions, see the official website.

server$ wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
server$ echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | tee /etc/apt/sources.list.d/hashicorp.list
server$ apt update && apt install consul

We can now add autocomplete functionality for Consul (optional)

server$ consul -autocomplete-install

Prepare the TLS certificates for Consul

server$ consul tls ca create
server$ consul tls cert create -server -dc dc1
server$ consul tls cert create -server -dc dc1
server$ consul tls cert create -server -dc dc1
server$ consul tls cert create -client -dc dc1

If you want to run more or less than 3 Consul cluster servers, adapt the repetitions of consul tls cert create -server -dc dc1. For one cluster server, you only need to create one, for 5 servers you need 5 certificates.

This creates 3 server certificate and one client certificate. The /root/ folder should now contain at least the following files

server$ ls
consul-agent-ca.pem
consul-agent-ca-key.pem
dc1-server-consul-0.pem
dc1-server-consul-0-key.pem
dc1-server-consul-1.pem
dc1-server-consul-1-key.pem
dc1-server-consul-2.pem
dc1-server-consul-2-key.pem
dc1-client-consul-0.pem
dc1-client-consul-0-key.pem

Step 1.2 - Install the Nomad binary

Install nomad

For more information about available versions, see the official website.

server$ wget -O- https://apt.releases.hashicorp.com/gpg | gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
server$ echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | tee /etc/apt/sources.list.d/hashicorp.list
server$ apt update && apt install nomad

Add autocomplete functionality to nomad (optional)

server$ nomad -autocomplete-install

Add the following configuration in the file /etc/nomad.d/nomad.hcl

datacenter = "dc1"
data_dir = "/opt/nomad"

Step 1.3 - Prepare the systemd services

Consul and Nomad should start automatically after boot. To enable this, create a systemd service for both of them.

First, set all permissions:

server$ chown consul:consul dc1-server-consul*
server$ chown consul:consul dc1-client-consul*
server$ chown -R consul:consul /opt/consul
server$ chown -R nomad:nomad /opt/nomad
server$ mkdir -p /opt/alloc_mounts && chown -R nomad:nomad /opt/alloc_mounts

Add the configuration file /etc/systemd/system/consul.service with the following content:

[Unit]
Description="HashiCorp Consul - A service mesh solution"
Documentation=https://www.consul.io/
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/etc/consul.d/consul.hcl

[Service]
EnvironmentFile=-/etc/consul.d/consul.env
User=consul
Group=consul
ExecStart=/usr/bin/consul agent -config-dir=/etc/consul.d/
ExecReload=/bin/kill --signal HUP $MAINPID
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

And the configuration file /etc/systemd/system/nomad.service with the content:

[Unit]
Description=Nomad
Documentation=https://www.nomadproject.io/docs/
Wants=network-online.target
After=network-online.target

[Service]

User=nomad
Group=nomad

ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/bin/nomad agent -config /etc/nomad.d
KillMode=process
KillSignal=SIGINT
LimitNOFILE=65536
LimitNPROC=infinity
Restart=on-failure
RestartSec=2
OOMScoreAdjust=-1000
TasksMax=infinity

[Install]
WantedBy=multi-user.target

Don't enable these services yet, since the setup is not complete.

Step 1.5 - Create the base Snapshot

Finally, stop the server in the Hetzner Cloud Console and create a Snapshot. This Snapshot will be used as a foundation for the server and client setup of the cluster.

After the Snapshot creation was successful, delete the CX22 instance of this step.

Step 2 - Set up the cluster servers

In this step you will create 3 cluster servers from the base image created in Step 1. These servers are the foundation of your cluster and will dynamically elect a cluster leader.

The cloud servers used in this step will be part of the final cluster, so make sure to schedule them in the right project and choose good names, since changing the name later is not easy.

We will use the following resources

  • 1 Hetzner Cloud Network
  • 3 Hetzner Cloud CX22 server

In the Hetzner Cloud Console, create 3 CX22 servers from the Snapshot created in Step 1 and a common Cloud Network attached. This tutorial will use a 10.0.0.0/8 network, but smaller networks work as well.

In the following steps, the tutorial will refer to the servers by their internal IP addresses 10.0.0.2, 10.0.0.3 and 10.0.0.4. If your servers have different internal addresses, make sure to replace them in the following steps.

If you plan to only run a single cluster server, one cloud server is enough in this step. You still need the Cloud Network, since it will be used by the clients as well.

Step 2.1 - Create the symmetric encryption key

First, create a symmetric encryption key, which will be shared between all servers. Save this key in a secure location, we will need it in the next steps.

On one of the servers run

server$ consul keygen

Step 2.2 - Distribute the certificates

Now we can copy the right certificates from step 1 to the Consul configuration directory. Run the following command on all servers:

server$ cp consul-agent-ca.pem /etc/consul.d/

The 3 certificates created in step 1 need to be distributed, so that every server gets a unique certificate with matching key. This tutorial will copy dc1-server-consul-0* on the server 10.0.0.2 into the folder /etc/consul.d/. Similar, copy dc1-server-consul-1* on 10.0.0.3 and dc1-server-consul-2* on 10.0.0.4.

[10.0.0.2] server$ cp -a dc1-server-consul-0* /etc/consul.d/
[10.0.0.3] server$ cp -a dc1-server-consul-1* /etc/consul.d/
[10.0.0.4] server$ cp -a dc1-server-consul-2* /etc/consul.d/

To check if you have all the files you need, you should get a similar output for your servers (X being the respective certificate number)

server$ ls /etc/consul.d/
consul-agent-ca.pem
consul.hcl
dc1-server-consul-X-key.pem
dc1-server-consul-X.pem

Finally, delete all cert files and keys in your root directory on every server

server$ cd /root/
server$ rm *.pem

Step 2.3 - Edit Consul base configuration

On all servers, edit the configuration file /etc/consul.d/consul.hcl and add the content

datacenter = "dc1"
data_dir = "/opt/consul"
encrypt = "your-symmetric-encryption-key"
tls {
  defaults {
    ca_file = "/etc/consul.d/consul-agent-ca.pem"
    cert_file = "/etc/consul.d/dc1-server-consul-0.pem"
    key_file = "/etc/consul.d/dc1-server-consul-0-key.pem"
    verify_incoming = true
    verify_outgoing = true
  },
  internal_rpc {
    verify_server_hostname = true
  }
}
retry_join = ["10.0.0.2"]
bind_addr = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}"

acl = {
  enabled = true
  default_policy = "allow"
  enable_token_persistence = true
}

performance {
  raft_multiplier = 1
}

Replace the encrypt parameter with your previously generated key on all servers and replace cert_file and key_file with the respective file names on each server.

retry_join should contain at least one of your servers internal IP adresses. If you don't use 10.0.0.0/8 as your private network, replace this part in the bind_addr parameter as well.

Check the configuration with the command

server$ consul validate /etc/consul.d/consul.hcl

Step 2.4 - Create Consul server configuration

On all servers, create the configuration file /etc/consul.d/server.hcl with the following content

server = true
bootstrap_expect = 3

Change the bootstrap_expect value to match the amount of servers you want to run. This value should be the same on all servers.

Step 2.5 - Set up the Nomad servers

In this tutorial every Consul server will also act as a Nomad server.

Create the configuration file /etc/nomad.d/server.hcl with the content

server {
  enabled = true
  bootstrap_expect = 3
}

acl {
  enabled = true
}

Change the bootstrap_expect value to match the amount of servers you want to run. This value should again be the same on all servers.

Step 2.6 - Start the cluster

If everything was configured correctly, you can now start the cluster.

Enable both services on all servers

server$ systemctl enable consul
server$ systemctl enable nomad

And start the services

server$ systemctl start consul
server$ systemctl start nomad

And check the status

If you run into any issues, you can add log_level = "debug" in /etc/consul.d/consul.hcl or /etc/nomad.d/nomad.hcl, restart the service, and run journalctl -u consul or journalctl -u nomad.

server$ systemctl status consul
server$ systemctl status nomad

If everything went well, you should now have your cluster up and running. Wait up to 30 seconds for the servers to sync and elect a leader.

To check the cluster, run the following command on one of your servers

server$ consul members
...

This should give you a list of all 3 servers.

Since we use ACLs (Access Control Lists) on Nomad, we have to get the bootstrap token first, before checking the status here as well.

Display the bootstrap token on one of the servers

server$ nomad acl bootstrap

This command only works once. Make sure to save the acl token in a secure location.

This gives you the Secret ID used for all Nomad requests. With that copied, create a variable with your secret ID:

server$ export NOMAD_TOKEN="your-secret-id"

And check the status of the Nomad cluster:

server$ nomad server members

Similar to Consul, you should get a list of all servers.

Step 3 - Create the client image

In this step you will set up a client. The Snapshot will serve as an image for easy deployment of multiple clients.

We will use the following resources

  • 1 Hetzner Cloud CX22 server

In the Hetzner Cloud Console, create 1 CX22 server from the Snapshot created in step 1. The cloud server name is not important in this step, since we will only prepare the image on it.

Step 3.1 - Configure Consul

Similar to the server configuration, we have to copy the certificate to the /etc/consul.d/ folder.

Run the following commands on the cloud server

server$ cd /root/
server$ cp consul-agent-ca.pem /etc/consul.d/
server$ cp -a dc1-client-consul-0* /etc/consul.d/
server$ rm *.pem

Open the configuration file /etc/consul.d/consul.hcl and add the content

datacenter = "dc1"
data_dir = "/opt/consul"
encrypt = "your-symmetric-encryption-key"
tls {
  defaults {
    ca_file = "/etc/consul.d/consul-agent-ca.pem"
    cert_file = "/etc/consul.d/dc1-client-consul-0.pem"
    key_file = "/etc/consul.d/dc1-client-consul-0-key.pem"
    verify_incoming = true
    verify_outgoing = true
  },
  internal_rpc {
    verify_server_hostname = true
  }
}
retry_join = ["10.0.0.2"]
bind_addr = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}"

check_update_interval = "0s"

acl = {
  enabled = true
  default_policy = "allow"
  enable_token_persistence = true
}

performance {
  raft_multiplier = 1
}

Make sure to replace the encrypt parameter with your encryption key. If you don't use 10.0.0.0/8 as your private network, replace this part in the bind_addr parameter as well.

Step 3.2 - Configure Nomad

Nomad has to be configured as well. For that, add the configuration file /etc/nomad.d/client.hcl with the content

client {
  enabled = true

  network_interface = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"name\" }}"
}

acl {
  enabled = true
}

If you don't use 10.0.0.0/8 as your private network, replace this part in the network_interface parameter.

Step 3.3 - Enable systemd services for Consul and Nomad

To make the Snapshot as small as possible, we will only enable the services, but won't start them yet.

server$ systemctl enable consul
server$ systemctl enable nomad

Step 3.4 - Create the Snapshot

Go to the Hetzner Cloud Console, stop the server and create a Snapshot. This Snapshot will be used for all clients, so choose a meaningful name. You can delete the cloud server after the Snapshot creation was successful.

If you want to update this image later, create a cloud server from the Snapshot, but without any Cloud Network attached. This way both Consul and Nomad will fail and won't sync any data, resulting in a smaller image.

Step 4 - Adding clients

To add a client, start a cloud server from the Snapshot created in step 3. Make sure to attach the cloud server to the private Cloud Network also used by the servers from step 2. Keep in mind that the cloud server name will be used in the cluster.

Startup may take up to a minute for the client to join the network. You can check the status on any node in the cluster with

server$ consul members

Conclusion

Congratulations to your own, highly-available Nomad cluster on Hetzner cloud servers. You are ready to deploy your applications or add more functionality like reverse proxies, auto-scaling or HashiCorp Vault to store your secrets in. Make sure to also properly secure the cluster with firewalls and custom ACL tokens.

License: MIT
Want to contribute?

Get Rewarded: Get up to €50 in credit! Be a part of the community and contribute. Do it for the money. Do it for the bragging rights. And do it to teach others!

Report Issue
Try Hetzner Cloud

Get €20/$20 free credit!

Valid until: 31 December 2024 Valid for: 3 months and only for new customers
Get started
Want to contribute?

Get Rewarded: Get up to €50 credit on your account for every tutorial you write and we publish!

Find out more