Introduction
We all know how much of a headache installing Kubernetes manually is, especially scaling it as your workload grows. Yet, in the realm of modern containerized applications, Kubernetes remains the undisputed champion in orchestrating workloads efficiently. Fortunately, there's a light at the end of the tunnel. In this article, we're diving headfirst into the process of building your own scalable Kubernetes cluster on Hetzner Cloud with Terraform.
This article covers how to create, scale, access, and destroy your own cluster, and how to calculate the costs of running the cluster.
Prerequisites
- Hetzner Cloud API token (you can create one in the Cloud Console)
- SSH keys
  - Your own SSH key to access the nodes
  - A new SSH key for the worker nodes (you can use one SSH key for all worker nodes)
- Basic knowledge of Docker and Kubernetes
- Terraform installed on your local machine
Terminology
- Hetzner Cloud: `hcloud`
- Hetzner Cloud API token: `<your_api_token>`
- `node` is a fancy word for a server
- Your own SSH key
  - Private: `<your_private_ssh_key>`
  - Public: `<your_public_ssh_key>`
- New SSH key for worker nodes
  - Private: `<worker_private_ssh_key>`
  - Public: `<worker_public_ssh_key>`
Step 1 - Create the project
Create a new folder on your local machine. You can name it however you want. Here we'll go with `kubernetes-cluster`.

```bash
mkdir kubernetes-cluster && cd kubernetes-cluster
```

In that folder, create two files named `main.tf` and `.tfvars`.

```bash
touch main.tf && touch .tfvars
```
`main.tf` will be used to declare hcloud Terraform resources. `.tfvars` will be used to configure the hcloud provider with your API token while keeping the token out of your code.
Step 1.1 - Setup the hcloud provider
After creating the two files, add the following to the `main.tf` file:

You can view the latest version of `hetznercloud/hcloud` in the Terraform Registry.
```hcl
# Tell Terraform to include the hcloud provider
terraform {
  required_providers {
    hcloud = {
      source = "hetznercloud/hcloud"
      # Here we use version 1.48.0, this may change in the future
      version = "1.48.0"
    }
  }
}

# Declare the hcloud_token variable from .tfvars
variable "hcloud_token" {
  sensitive = true # Requires terraform >= 0.14
}

# Configure the Hetzner Cloud Provider with your token
provider "hcloud" {
  token = var.hcloud_token
}
```
Right now, this will fail to initialize the hcloud provider because the variable `hcloud_token` has not yet been assigned a value in `.tfvars`. To fix this, add the following line to the `.tfvars` file:

Replace `<your_api_token>` with your actual Hetzner Cloud API token.

```hcl
hcloud_token = "<your_api_token>"
```
Just like a `.env` file, you can use the `.tfvars` file to initialize secrets and global variables that you don't want in your code.
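Because `.tfvars` holds your API token, you'll want to keep it out of version control. A minimal example, assuming the project is tracked with Git:

```bash
# Keep the file containing the API token out of the repository
echo ".tfvars" >> .gitignore
```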
Step 1.2 - Initialize the project
Lastly, run the following command to initialize the project and download the hcloud provider. Before you run the command, make sure Terraform is installed as mentioned in the prerequisites.
```bash
terraform init
```
The output log should be similar to this:

```
Initializing the backend...

Initializing provider plugins...
- Finding hetznercloud/hcloud versions matching "1.48.0"...
- Installing hetznercloud/hcloud v1.48.0...
- Installed hetznercloud/hcloud v1.48.0 (signed by a HashiCorp partner, key ID 5219EACB3A77198B)

Terraform has been successfully initialized!
```
Step 2 - Setup your master node
Now that Terraform is ready for hcloud, we need to create the most important part of the cluster: the master node. The master node controls all of the worker nodes; it's the brain of the cluster.
Step 2.1 - Create a private network for the cluster
Later on, you'll want to add more nodes — worker nodes. The master node and the worker nodes need to communicate without being exposed to the public. For this, they need to be in a private network.
Add the following code to your `main.tf` file:
resource "hcloud_network" "private_network" {
name = "kubernetes-cluster"
ip_range = "10.0.0.0/16"
}
resource "hcloud_network_subnet" "private_network_subnet" {
type = "cloud"
network_id = hcloud_network.private_network.id
network_zone = "eu-central"
ip_range = "10.0.1.0/24"
}
This will create a private network named `kubernetes-cluster` that will be used by the nodes to communicate with each other.
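Once you deploy in "Step 5", you can double-check that the network exists. A quick way, assuming you have the optional `hcloud` CLI installed and configured:

```bash
# List all networks in the project; "kubernetes-cluster" should appear
hcloud network list
```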
Step 2.2 - Create the master node
When creating a node on hcloud, you can add what is called a `cloud-init` script to it. This script is executed when the node is created, and you can use it to install software, configure the node, and more. In this case, we'll use it to install Kubernetes.

Create `cloud-init.yaml` and add the following content:
Replace `<your_public_ssh_key>` with the public SSH key of your local machine and `<worker_public_ssh_key>` with the public SSH key for your worker node(s).
```yaml
#cloud-config
packages:
  - curl
users:
  - name: cluster
    ssh-authorized-keys:
      - <your_public_ssh_key>
      - <worker_public_ssh_key>
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
runcmd:
  - apt-get update -y
  - curl https://get.k3s.io | INSTALL_K3S_EXEC="--disable traefik" sh -
  - chown cluster:cluster /etc/rancher/k3s/k3s.yaml
  - chown cluster:cluster /var/lib/rancher/k3s/server/node-token
```
You need to add the public SSH key for the worker node so that the worker node can copy the Kubernetes token from the master node.
You might have noticed that instead of installing k8s the traditional way, we're using `k3s`. This is because `k3s` is a lightweight k8s distribution that makes it way easier to install and manage k8s clusters. This tutorial would be way too long if we were to install k8s the traditional way.
Also, we're disabling `traefik` because we'll be using `nginx` as an ingress controller later on.
Before we continue, you must choose a plan for your master node. You can find the available plans here. Because k3s is optimized for Arm64 architecture, we'll go with the `cax11` plan from hcloud's Arm64 instances.

| Plan  | CPU | Memory | Disk |
|-------|-----|--------|------|
| cax11 | 2   | 4GB    | 40GB |
Add the following code to your `main.tf` file:

Replace `cax11` with the plan you would like to use.
resource "hcloud_server" "master-node" {
name = "master-node"
image = "ubuntu-24.04"
server_type = "cax11"
location = "fsn1"
public_net {
ipv4_enabled = true
ipv6_enabled = true
}
network {
network_id = hcloud_network.private_network.id
# IP Used by the master node, needs to be static
# Here the worker nodes will use 10.0.1.1 to communicate with the master node
ip = "10.0.1.1"
}
user_data = file("${path.module}/cloud-init.yaml")
# If we don't specify this, Terraform will create the resources in parallel
# We want this node to be created after the private network is created
depends_on = [hcloud_network_subnet.private_network_subnet]
}
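Once the cluster is deployed in "Step 5", you can verify that k3s came up on the master node. A quick sketch, using the placeholders from the "Terminology" section:

```bash
# SSH into the master node and ask k3s for the list of cluster nodes
ssh -i <your_private_ssh_key> cluster@<master_node_public_ip> \
  sudo k3s kubectl get nodes
```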
Step 3 - Setup your worker nodes
Now that the master node is ready, we need to give it some more juice. While you can scale the master node by giving it more CPUs and memory, eventually you'll hit a limit (this is called vertical scaling). This is where worker nodes come in. You can have an unlimited number of worker nodes (this is called horizontal scaling).
The `cloud-init-worker.yaml` file will be similar to the `cloud-init.yaml` file, but it will not install k3s the same way. Instead, it will join the worker node to the master node with a command similar to this:

```bash
curl -sfL https://get.k3s.io | K3S_URL=https://10.0.1.1:6443 K3S_TOKEN=<will_be_different_every_time> sh -
```
Create `cloud-init-worker.yaml` and add the following content:

Replace `<your_public_ssh_key>` with the public SSH key of your local machine and `<worker_private_ssh_key>` with the private SSH key for your worker node(s).
```yaml
#cloud-config
packages:
  - curl
users:
  - name: cluster
    ssh-authorized-keys:
      - <your_public_ssh_key>
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
write_files:
  - path: /root/.ssh/id_rsa
    content: |
      -----BEGIN OPENSSH PRIVATE KEY-----
      <worker_private_ssh_key>
      -----END OPENSSH PRIVATE KEY-----
    permissions: "0600"
runcmd:
  - apt-get update -y
  # Wait for the master node to be ready by trying to connect to it
  - until curl -k https://10.0.1.1:6443; do sleep 5; done
  # Copy the token from the master node
  - REMOTE_TOKEN=$(ssh -o StrictHostKeyChecking=accept-new cluster@10.0.1.1 sudo cat /var/lib/rancher/k3s/server/node-token)
  # Install the k3s worker and join it to the master node
  - curl -sfL https://get.k3s.io | K3S_URL=https://10.0.1.1:6443 K3S_TOKEN=$REMOTE_TOKEN sh -
```
You need to add the private SSH key for the worker node so that the worker node can copy the Kubernetes token from the master node.
Next, edit `main.tf` to create the worker nodes with the cloud-init script above.

You can put resources in a loop in Terraform, so you can create as many worker nodes as you want. The example below will create 3 worker nodes, but you can change it to any other number. If you set the count to `1`, it will create only one worker node. Add the following code to your `main.tf` file:
Replace `cax11` with the plan you would like to use. Replace `count = 3` with however many worker nodes you want to create.
resource "hcloud_server" "worker-nodes" {
count = 3
# The name will be worker-node-0, worker-node-1, worker-node-2...
name = "worker-node-${count.index}"
image = "ubuntu-24.04"
server_type = "cax11"
location = "fsn1"
public_net {
ipv4_enabled = true
ipv6_enabled = true
}
network {
network_id = hcloud_network.private_network.id
}
user_data = file("${path.module}/cloud-init-worker.yaml")
depends_on = [hcloud_network_subnet.private_network_subnet, hcloud_server.master-node]
}
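Scaling out later is just a matter of raising the count and re-applying. For example, after changing `count = 3` to `count = 5` in `main.tf`:

```bash
# Terraform only creates the two additional servers; existing nodes are untouched
terraform apply -var-file .tfvars
```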
Step 4 - Setup the Load Balancer (Optional)
Optionally, you can also add a Load Balancer and use `nginx` as an ingress controller as suggested above.

Add the following code to your `main.tf` file to create a new Load Balancer, add the Load Balancer to the private network, and add the new servers via their private IPs as targets:

Replace `lb11` with the plan you would like to use.
resource "hcloud_load_balancer" "load_balancer" {
name = "kubernetes-load-balancer"
load_balancer_type = "lb11"
location = "fsn1"
delete_protection = true
depends_on = [hcloud_network_subnet.private_network_subnet, hcloud_server.master-node]
}
resource "hcloud_load_balancer_service" "load_balancer_service" {
load_balancer_id = hcloud_load_balancer.load_balancer.id
protocol = "http"
depends_on = [hcloud_load_balancer.load_balancer]
}
resource "hcloud_load_balancer_network" "lb_network" {
load_balancer_id = hcloud_load_balancer.load_balancer.id
network_id = hcloud_network.private_network.id
depends_on = [hcloud_network_subnet.private_network_subnet, hcloud_load_balancer.load_balancer]
}
resource "hcloud_load_balancer_target" "load_balancer_target_worker" {
for_each = { for idx, server in hcloud_server.worker-nodes : idx => server }
type = "server"
load_balancer_id = hcloud_load_balancer.load_balancer.id
server_id = each.value.id
use_private_ip = true
depends_on = [hcloud_network_subnet.private_network_subnet, hcloud_load_balancer_network.lb_network, hcloud_server.worker-nodes]
}
Step 5 - Deploy your cluster
NOTE: Always make sure to calculate the costs of running your cluster before deploying it. You can figure out how in "Step 9".
Now that you have your master and worker nodes, you can simply run `terraform apply` to create the cluster.
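If you want to review what will be created before committing to anything, you can do a dry run first:

```bash
# Prints the execution plan without creating any resources
terraform plan -var-file .tfvars
```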
Note that running the command below will create new servers in Hetzner Cloud that will be charged.

```bash
terraform apply -var-file .tfvars
```
The output log should be similar to this:

```
Plan: 4 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

hcloud_network.private_network: Creating...
hcloud_network.private_network: Creation complete after 1s [id=REDACTED]
hcloud_network_subnet.private_network_subnet: Creating...
hcloud_network_subnet.private_network_subnet: Creation complete after 1s [id=REDACTED-10.0.1.0/24]
hcloud_server.master-node: Creating...
hcloud_server.master-node: Creation complete after 2m46s [id=REDACTED]
hcloud_server.worker-node-1: Creating...
hcloud_server.worker-node-1: Creation complete after 5m35s [id=REDACTED]

Apply complete! Resources: 4 added, 0 changed, 0 destroyed.
```
Step 6 - Access your cluster (Optional)
To access your cluster from your local machine, you need to have the `kubectl` command-line tool.

- On macOS, you can install it with Homebrew:

  ```bash
  brew install kubectl
  ```

- On Ubuntu, you can install it with `snap`:

  ```bash
  snap install kubectl --classic
  ```
After installing `kubectl`, you need to copy the kubeconfig file from the master node to your local machine. You can do this by running the following command on your local machine:

```bash
scp -i <your_private_ssh_key> cluster@<master_node_public_ip>:/etc/rancher/k3s/k3s.yaml ~/.kube/config
```

Make sure that the IP in `~/.kube/config` is the public IP of the master node, not `127.0.0.1`.
For example:

```yaml
apiVersion: v1
clusters:
  - cluster:
      certificate-authority-data: ...
      # NOT https://127.0.0.1:6443
      server: https://<master_node_public_ip>:6443
    name: default
contexts:
  - context:
      cluster: default
      user: default
    name: default
current-context: default
kind: Config
preferences: {}
users:
  - name: default
    user:
      client-certificate-data: ...
      client-key-data: ...
```
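If the copied file still points at `127.0.0.1`, one way to fix it is an in-place substitution. The example below assumes GNU sed on Linux (on macOS, use `sed -i ''`); substitute the real IP for the placeholder:

```bash
# Point kubectl at the master node instead of localhost
sed -i 's/127.0.0.1/<master_node_public_ip>/' ~/.kube/config
```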
After that, check if your cluster is up and running:

```bash
kubectl get nodes
```

This command should return something like this:

```
NAME            STATUS   ROLES                  AGE     VERSION
worker-node-1   Ready    <none>                 93s     v1.28.7+k3s1
master-node     Ready    control-plane,master   6m24s   v1.28.7+k3s1
```
Step 7 - Setup the Load Balancer
If you followed the optional step "Step 4" and created a Load Balancer, you should now deploy a `Service` of type `LoadBalancer` (see the Hetzner Cloud Controller Manager documentation: Reference existing Load Balancers).
First, install some prerequisites:

- Nginx Ingress Controller as a DaemonSet:

  ```bash
  wget -q https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.5.1/deploy/static/provider/cloud/deploy.yaml -O /tmp/nginx.yaml
  sed -i -e "s/kind: Deployment/kind: DaemonSet/g" /tmp/nginx.yaml
  sed -i -e '/^kind: ConfigMap.*/i \ \ compute-full-forwarded-for: \"true\"\n \ use-forwarded-headers: \"true\"\n \ use-proxy-protocol: \"true\"' /tmp/nginx.yaml
  kubectl apply -f /tmp/nginx.yaml
  ```

- Hetzner Cloud Controller and API token:

  Replace `<your_api_token>` with your own Hetzner API token.

  ```bash
  kubectl -n kube-system apply -f https://github.com/hetznercloud/hcloud-cloud-controller-manager/releases/download/v1.13.2/ccm-networks.yaml
  kubectl -n kube-system create secret generic hcloud --from-literal=token=<your_api_token>
  ```
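You can watch the controller come up before proceeding. This assumes the default manifest above, where the controller runs in `kube-system` with "hcloud" in its pod name:

```bash
# The controller pod should reach the Running state once the secret exists
kubectl get pods -n kube-system | grep hcloud
```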
Now create the service:

```bash
nano example-service.yaml
```

Add this content:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-web
  namespace: ingress-nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ingress-nginx
  template:
    metadata:
      labels:
        app: ingress-nginx
    spec:
      containers:
        - name: simple-web
          image: httpd:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
  annotations:
    load-balancer.hetzner.cloud/name: "kubernetes-load-balancer"
    load-balancer.hetzner.cloud/location: "fsn1"
spec:
  type: LoadBalancer
  selector:
    app: ingress-nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
```
Next, apply the file and view the pods and the deployment:

```bash
kubectl apply -f example-service.yaml
kubectl get pods -n ingress-nginx
kubectl get deployments -n ingress-nginx
```
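To confirm the Service was bound to the existing Hetzner Load Balancer, you can also inspect it; the EXTERNAL-IP column should eventually show the Load Balancer's IP:

```bash
# EXTERNAL-IP stays <pending> until the cloud controller attaches the Load Balancer
kubectl get service ingress-nginx-controller -n ingress-nginx
```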
To stop the deployment, you can run:

```bash
kubectl delete deployment simple-web -n ingress-nginx
```
Step 8 - Destroy your cluster (Optional)
When you're done with your cluster, don't forget to destroy it to avoid being billed for resources you're not using.
Run the following command in the `kubernetes-cluster` directory on your local machine:

If you have a Load Balancer, note that you cannot delete it unless you first disable its delete protection (we set `delete_protection = true` in `main.tf`).

```bash
terraform destroy -var-file .tfvars
```
The output log should be similar to this:

```
Plan: 0 to add, 0 to change, 4 to destroy.

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

hcloud_server.worker-node-1: Destroying... [id=REDACTED]
hcloud_server.worker-node-1: Destruction complete after 1m0s
hcloud_server.master-node: Destroying... [id=REDACTED]
hcloud_server.master-node: Destruction complete after 1m0s
hcloud_network_subnet.private_network_subnet: Destroying... [id=REDACTED-10.0.1.0/24]
hcloud_network_subnet.private_network_subnet: Destruction complete after 1s
hcloud_network.private_network: Destroying... [id=REDACTED]
hcloud_network.private_network: Destruction complete after 1s
```
Step 9 - Calculate the costs of running the cluster
Before deploying, it's important to think about the cost of the cluster. This will help you avoid any surprises when you get your bill at the end of the month. The cost will depend on the following factors:

- The number of worker nodes
- The plan of the master and worker nodes
- The Load Balancer plan, if you created the optional Load Balancer
You can find a full list of Arm64 plans offered by Hetzner here.
You would usually want to pick the same plan for the master and worker nodes. This is because the master node also acts as a worker.
For example, let's pick the `cax11` plan for the master and worker nodes. At the time of writing, this plan costs 0.006€/hour or 3.95€/month.

If you have 1 master node and 3 worker nodes, you would calculate it like this:

- The hourly cost would be: `0.006 + (0.006 * 3) = 0.024€ / hour`
- The monthly cost would be: `3.95 + (3.95 * 3) = 15.80€ / month`

You can replace the numbers with variables:

`price_master + (price_worker * number_of_workers) = total_price`
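The same formula as a quick shell calculation, using the example values above (plug in your own plan prices and worker count):

```bash
# 1 master + 3 workers on cax11: prints the monthly cost in EUR (15.8)
awk 'BEGIN { price_master=3.95; price_worker=3.95; workers=3; print price_master + price_worker * workers }'
```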
Conclusion
Congratulations! You've just created your own scalable Kubernetes cluster on Hetzner Cloud. You can now deploy your applications and scale them as you see fit. If you want to learn more about Kubernetes, you can check out the official documentation.