Introduction
Managing Kubernetes clusters can be a daunting task, especially at scale. The Cluster API (CAPI) is an official Kubernetes SIG project that aims to simplify the provisioning, upgrading, and operating multiple clusters in a declarative way.
This approach offers several benefits over Infrastructure as Code tools, such as Terraform and Ansible, since it manages the entire lifecycle of your cluster. This includes automated creation, scaling, upgrades and self-healing, unlike IaC tools that run a predefined workflow only when triggered.
These benefits can be better understood by comparing tools in the same scenario: If someone accidentally deletes or changes a virtual machine or a load balancer after the initial provisioning with IaC tools, your infrastructure will remain broken until the next time you make a change, or until you see the mistake by chance (or worse, when your customers start reporting issues). With Cluster API, the state of your cluster is continuously reconciled to match the desired state, automatically fixing configuration drift.
The Cluster API Provider Hetzner (CAPH) is an open-source project (maintained by Syself and the community; not a Hetzner project) that allows you to leverage the capabilities of Cluster API to manage highly-available Kubernetes clusters on both Hetzner baremetal servers (Robot) and Hetzner cloud instances.
This tutorial covers the process of setting up a highly-available Kubernetes cluster on Hetzner Cloud using CAPH.
Prerequisites
- Docker, for running containers
- Kind, to create a local Kubernetes cluster
- kubectl and clusterctl, to access and manage your clusters
- A Hetzner Cloud account
- An SSH key
- Basic knowledge of Kubernetes
Step 1 - Prepare your Hetzner Account
Create a new project in the Hetzner Cloud Console, go to the "Security" tab and create an API token with read and write access. Note it down.
Next, add your public SSH key to the project.
Step 2 - Create a management cluster
A Kubernetes cluster is needed to run the Cluster API and CAPH controllers. It will act as a management cluster, allowing you to manage workload clusters with Kubernetes objects. In this way, the controllers will handle the entire lifecycle of the machines and infrastructure.
We will start with a local Kind cluster to serve as a temporary bootstrap cluster. Later, we will be able to run the controllers on the new workload cluster in Hetzner Cloud, and move our resources there. If you already have a running Kubernetes cluster, feel free to use it instead.
Create a local Kind (Kubernetes in Docker) cluster:
# Create a cluster with Kind
kind create cluster --name caph-mgt-cluster
# Initialize it
clusterctl init --core cluster-api --bootstrap kubeadm --control-plane kubeadm --infrastructure hetzner
Now, create a secret to enable CAPH to communicate with the Hetzner API:
# Replace <YOUR_HCLOUD_TOKEN> with the API token you generated in the previous step
kubectl create secret generic hetzner --from-literal=hcloud=<YOUR_HCLOUD_TOKEN>
Step 3 - Create your workload cluster
Define your cluster variables:
export HCLOUD_SSH_KEY="<ssh-key-name>" \
export HCLOUD_REGION="fsn1" \
export CONTROL_PLANE_MACHINE_COUNT=3 \
export WORKER_MACHINE_COUNT=3 \
export KUBERNETES_VERSION=1.29.4 \
export HCLOUD_CONTROL_PLANE_MACHINE_TYPE=cpx31 \
export HCLOUD_WORKER_MACHINE_TYPE=cpx31
And create your cluster:
# Generate the manifests defining a workload cluster, and apply them to the bootstrap cluster
clusterctl generate cluster --infrastructure hetzner:v1.0.0-beta.35 hetzner-cluster | kubectl apply -f -
# Get the kubeconfig for this new cluster
clusterctl get kubeconfig hetzner-cluster > hetzner-cluster-kubeconfig.yaml
Every component and configuration of this workload cluster can be defined declaratively in the management cluster. If you run the clusterctl generate
command again, you will see the actual manifests that were applied to it. This means you can scale, delete, and modify clusters only by interacting with Kubernetes resources.
Before you use Cluster API Provider Hetzner in a production scenario, you should read through the CAPH and CAPI documentations, and familiarize yourself with the main resources you'll be interacting with like Clusters, Machines, Machine Deployments, etc.
Step 4 - Install components in your cluster
Your newly created cluster needs a few key components before you can host your workloads in it. These are a Container Network Interface (CNI), responsible for networking capabilities, and a Cloud Controller Manager (CCM), which allows you to properly use Hetzner resources such as Load Balancers.
export KUBECONFIG=hetzner-cluster-kubeconfig.yaml
# Install Hetzner CCM
kubectl apply -f https://github.com/hetznercloud/hcloud-cloud-controller-manager/releases/latest/download/ccm.yaml
# Install Flannel CNI - You can use your preferred CNI instead, e.g. Cilium
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
Now edit the deployment hcloud-cloud-controller-manager
:
kubectl edit deployment hcloud-cloud-controller-manager -n kube-system
Hit i
to enter "insert mode". Then, move down to "HCLOUD_TOKEN" and set the key to hcloud
, and the name to hetzner
:
- name: HCLOUD_TOKEN
valueFrom:
secretKeyRef:
key: hcloud
name: hetzner
When you're done, hit esc
to exit "insert mode" and enter :wq
to save your changes and exit.
And that's it! You now have a working Kubernetes cluster in Hetzner Cloud.
If you want to delete the cluster, you can run the command below:
kubectl delete cluster hetzner-cluster
This will delete the cluster and all resources created for it, like machines.
Step 5 - Move your management cluster to the created cluster on Hetzner (Optional)
You can use your new cluster on Hetzner as a management cluster, moving away from your temporary bootstrap cluster.
Run the clusterctl init
command to deploy CAPI and CAPH controllers to your new cluster:
KUBECONFIG=hetzner-cluster-kubeconfig.yaml clusterctl init --core cluster-api --bootstrap kubeadm --control-plane kubeadm --infrastructure hetzner
And, back on your local kind cluster, use clusterctl
to move your resources:
# This will make the secret automatically move to the target cluster
kubectl patch secret hetzner -p '{"metadata":{"labels":{"clusterctl.cluster.x-k8s.io/move":""}}}'
# Move the cluster definitions to the new cluster, you can omit the namespace to use default
clusterctl move --to-kubeconfig="hetzner-cluster-kubeconfig.yaml --namespace=<target-namespace>"
After the move, you can safely delete the local Kind cluster:
kind delete cluster --name caph-mgt-cluster
Next steps
Your workload cluster was created with the default kubeadm bootstrap and controlplane providers. For production use, you may want to add additional layers to this configuration and create your own node images, as the default configuration provides only the basics to have a running cluster.
For more information on which aspects are handled by CAPH, you can check the project's GitHub readme.
Baremetal
In the introduction to this article, it was stated that CAPH fully supports the use of Hetzner baremetal servers (Hetzner Robot). A second guide focusing on this feature is in the works, but if it hasn't been published by the time you read this, you can visit the CAPH docs if you're interested in managing Hetzner baremetal servers with Cluster API.
Conclusion
With Cluster API Provider Hetzner, you can create and manage highly available Kubernetes clusters in Hetzner, in a declarative and cloud-native way. This enables you to operate and scale your clusters seamlessly. A single Cluster API management cluster can handle approximately one hundred clusters, depending on the number of nodes.
In this tutorial, you created your own highly available Kubernetes cluster on Hetzner, with a fully managed lifecycle. As you continue to work with Kubernetes and the Cluster API Provider Hetzner, you can explore additional features and configuration options to optimize your cluster management, like:
- Implementing custom node images and configurations tailored to your specific workloads
- Integrating with other Kubernetes tools and add-ons such as a CNI, metric-server, konnectivity, etc
- Increase the reliability of your cluster with backups, monitoring and alerting