Portal:Cloud VPS/Admin/Magnum setup
Intro
Magnum relies on a host of moving parts to launch a Kubernetes cluster. As of 2025, there are two back-end implementations. For the last few years we've been using the Heat backend, but it is being deprecated by the upstream developers in favor of cluster-api backends.
We are in the process of moving from the Heat backend to the capi helm backend. This document is about understanding and setting up Magnum to work with capi helm drivers.
Openstack Components
When creating a cluster, Magnum creates a service account and credentials in Keystone and then delegates the actual cluster creation to the cluster-api service. Cluster-api uses those Magnum-managed credentials to call back into other openstack services to create and configure the components of the cluster (e.g. calling nova to create a worker node), and then applies helm charts to the nodes to set up the moving parts.
Magnum-api
Magnum-api provides the REST api for managing kubernetes clusters and cluster templates. It is a straightforward python/uwsgi service that runs on cloudcontrol nodes and is managed by puppet just like other openstack services.
The magnum-api logs can be read via journalctl, logstash, or by looking at /var/log/magnum/magnum-api.log. There is seldom anything of interest in the logs but they may contain useful messages if there are fundamental integration issues between the different magnum components.
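For example, to skim recent logs on a cloudcontrol (the journald unit name here is an assumption; adjust to match the host):

```shell
# Recent magnum-api log lines via journald (unit name may differ per host)
sudo journalctl -u magnum-api --since today --no-pager | tail -n 50

# Or read the flat log file directly
sudo tail -n 50 /var/log/magnum/magnum-api.log
```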
Typically a user will interact with magnum-api using opentofu or the openstack cli. In the cli, magnum features live under the 'coe' (container orchestration engine) subcommand, for example:
# openstack coe cluster list
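Other coe subcommands follow the same pattern; for instance, listing templates and creating a cluster (the template and cluster names below are placeholders):

```shell
# List available cluster templates
openstack coe cluster template list

# Create a two-node cluster from a template (names are placeholders)
openstack coe cluster create --cluster-template my-template --node-count 2 my-cluster

# Check cluster status while cluster-api does its work
openstack coe cluster show my-cluster
```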
Magnum-conductor
Magnum-conductor schedules and implements the various steps in cluster creation. It is installed and managed by puppet and runs on cloudcontrol nodes.
Most of the real work of creating a kubernetes cluster is delegated to cluster api, so magnum-conductor is mostly limited to maintaining and reporting cluster status.
Magnum-api and magnum-conductor are configured in /etc/magnum/magnum.conf on a cloudcontrol.
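A quick way to review the active (non-default) configuration on a cloudcontrol:

```shell
# Show non-comment, non-blank lines of the magnum configuration
sudo grep -Ev '^\s*(#|$)' /etc/magnum/magnum.conf
```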
Octavia
Cluster-api creates an Octavia loadbalancer as a front end to the created k8s cluster. As of the 'epoxy' release, it insists on creating a floating IP as the entry point to that loadbalancer. That means that a project must have floating IP quota available in order to create a cluster.
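To check (and, as an admin, raise) a project's floating IP quota before cluster creation — the project name below is a placeholder:

```shell
# Inspect the project's current quota
openstack quota show my-project | grep -i floating

# Grant floating IP quota if needed (admin only)
openstack quota set --floating-ips 2 my-project
```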
Neutron
Typically we create clusters on an existing VM network (e.g. VXLAN/IPv6-dualstack).
Nova
The actual k8s worker nodes managed by magnum will appear as regular nova VMs within the project of the magnum user. They typically have names like "<cluster-name-worker-1-2>-<id>-default-worker-<id>"
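As an admin you can list those VMs directly; for example (the project name is a placeholder):

```shell
# List the nova VMs backing magnum-created clusters in a project
openstack server list --project my-project --long
```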
Cluster-api (capi) Components
capi service cluster
Cluster api runs on a kubernetes cluster. In order to avoid chicken/egg concerns, the cluster that hosts capi is NOT managed by magnum. Instead, it runs on an easily-recreated, single-node k3s cluster described by a helm chart. This cluster can be created and configured with the puppet class 'profile::openstack::capi'.
To build a new capi worker:
1. Create a new debian VM in the 'magnum' project named 'capi-worker-xxx'. One core and 2GB of ram seems sufficient.
2. Apply the 'profile::openstack::capi' class to that VM (there should already be a prefix config that does this), and allow puppet to stabilize (it will take two or three puppet runs). Puppet will create a k3s cluster and then apply a helm chart hosted on our chartmuseum repo to set up capi.
3. Apply security group rules that allow cloudcontrol access to port 6443, the kubernetes API port (typically via a pre-existing security group like 'cloudcontrol-to-kubernetes')
4. Confirm that the worker is set up properly:
labtestandrew@capi-worker-1:~$ export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
labtestandrew@capi-worker-1:~$ sudo kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
capi-addon-system cluster-api-addon-provider-756bfd798f-5gpnk 1/1 Running 0 44h
capi-janitor-system cluster-api-janitor-openstack-55fb64777c-d9bgb 1/1 Running 0 44h
capi-kubeadm-bootstrap-system capi-kubeadm-bootstrap-controller-manager-5b959f764c-cl79q 1/1 Running 6 (41h ago) 44h
capi-kubeadm-control-plane-system capi-kubeadm-control-plane-controller-manager-7556869f8-ctl5d 1/1 Running 3 (43h ago) 44h
capi-system capi-controller-manager-5b67d4fc7-c94md 1/1 Running 6 (41h ago) 44h
capo-system capo-controller-manager-6f899b5b7b-spslb 1/1 Running 3 (41h ago) 44h
cert-manager cert-manager-7d67448f59-t5vlb 1/1 Running 0 44h
cert-manager cert-manager-cainjector-666b8b6b66-mmt5k 1/1 Running 0 44h
cert-manager cert-manager-webhook-78cb4cf989-6xwfm 1/1 Running 2 (43h ago) 44h
kube-system coredns-5688667fd4-9hsqt 1/1 Running 2 (43h ago) 45h
kube-system local-path-provisioner-774c6665dc-g5fld 1/1 Running 0 45h
kube-system metrics-server-6f4c6675d5-8fw6r 1/1 Running 2 (43h ago) 45h
magnum-admin test-cluster-capi-worker-1-2-edez5vbbrppf-autoscaler-6bff5mbbnr 1/1 Running 5 (42m ago) 47m
5. Once that worker is up and running, you need to direct magnum to use the new cluster. As created by k3s, /etc/rancher/k3s/k3s.yaml contains everything that Magnum will need to know with the exception of the host IP. Make a copy of k3s.yaml but replace 127.0.0.1 in the 'server' line with the actual IP of the worker.
6. Add the edited k3s.yaml contents to private puppet, as modules/secret/secrets/openstack/<codfw1dev/eqiad1>/magnum/capiservicek3s.yaml -- it will then be installed on cloudcontrols.
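Once the secret is in place, you can sanity-check connectivity from a cloudcontrol. The kubeconfig path below is an assumption; check puppet for where the secret is actually installed:

```shell
# Confirm the capi service cluster is reachable from a cloudcontrol
# (path to the installed kubeconfig is an assumption)
sudo kubectl --kubeconfig /etc/magnum/capiservicek3s.yaml get nodes
```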
Other necessary services
image repo
We try not to build our control plane out of random upstream packages; rather, we mirror particular upstream versions in an internal repo.
The images used for creating the capi worker cluster (as well as the k3s images themselves) are stored on a VM in the cloudinfra project named 'docker-registry-XX', with the images themselves stored on a mounted cinder volume. These packages can be upgraded from various upstream repos; when a version string changes, the associated version numbers must also be updated in hiera with settings like 'profile::openstack::capi::cluster_api_version'.
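Such a hiera pin looks something like this (the version value here is a placeholder, not a recommendation):

```yaml
# hieradata sketch: version pin consumed by profile::openstack::capi
profile::openstack::capi::cluster_api_version: 'v1.6.0'  # placeholder version
```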
helm chart repo
Similarly, we try to host our own helm charts for management of the cluster-api service cluster as well as for user-created magnum clusters.
Those charts are hosted on a simple chartmuseum instance in 'cloudinfra' named chartmuseum-X.
All charts are exact copies of upstream charts, with one important exception: we need to override the network ranges in the upstream magnum cluster chart in order to avoid conflict with our existing setup. That chart is (currently) called openstack-cluster.
Upstream charts can be checked out from the azimuth github repo. There's also a working checkout of that repo in the /srv volume of the chartmuseum VM in cloudinfra.
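To start from a fresh upstream checkout instead, clone the charts repo (the URL below is an assumption — the repo has moved between github orgs over time, so verify it first):

```shell
# Clone the upstream capi helm charts (URL is an assumption; verify before use)
git clone https://github.com/azimuth-cloud/capi-helm-charts.git
cd capi-helm-charts/charts/openstack-cluster
```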
If starting with a raw upstream checkout, first apply this patch to avoid network conflicts:
diff --git a/charts/openstack-cluster/values.yaml b/charts/openstack-cluster/values.yaml
index 6bad53b..7890d91 100644
--- a/charts/openstack-cluster/values.yaml
+++ b/charts/openstack-cluster/values.yaml
@@ -62,12 +62,16 @@ kubeNetwork:
# By default, use the private network range 172.16.0.0/12 for the cluster network
# We split it into two equally-sized blocks for pods and services
# This gives ~500,000 addresses in each block
+ #
+ # WMF HACK: Upstream uses 172.16.0.0/12 but we need to use 10.16.0.0/12 to avoid
+ # clashing with other bits of our Neutron setup
+ #
pods:
cidrBlocks:
- - 172.16.0.0/13
+ - 10.16.0.0/13
services:
cidrBlocks:
- - 172.24.0.0/13
+ - 10.24.0.0/13
serviceDomain: cluster.local
# Settings for the OpenStack networking for the cluster
@@ -187,7 +191,7 @@ apiServer:
# - 10.10.0.0/16 # IPv4 Internal Network
# - 123.123.123.123 # some other IPs
# Indicates whether to associate a floating IP with the API server
- associateFloatingIP: true
+ associateFloatingIP: false
# The specific floating IP to associate with the API server
# If not given, a new IP will be allocated if required
floatingIP:
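The replacement ranges in the patch can be sanity-checked before packaging; this sketch verifies that the two /13 blocks are exactly the halves of 10.16.0.0/12 and that neither overlaps upstream's 172.16.0.0/12 default:

```shell
# Verify the WMF pod/service CIDR split using python's ipaddress module
python3 - <<'EOF'
import ipaddress

parent = ipaddress.ip_network('10.16.0.0/12')
pods = ipaddress.ip_network('10.16.0.0/13')       # kubeNetwork.pods
services = ipaddress.ip_network('10.24.0.0/13')   # kubeNetwork.services

# The two /13s are exactly the two halves of the /12
assert list(parent.subnets(prefixlen_diff=1)) == [pods, services]

# Neither clashes with upstream's default 172.16.0.0/12 range
upstream = ipaddress.ip_network('172.16.0.0/12')
assert not pods.overlaps(upstream) and not services.overlaps(upstream)

print('CIDR layout OK')
EOF
```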
After fetching, rebasing, and (optionally) modifying, you can package and release the new chart like this:
root@chartmuseum-2:/srv/capi-helm-charts/charts/openstack-cluster# helm dependency build
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "chartmuseum" chart repository
...Successfully got an update from the "chartmuseum.wmcloud.org" chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 1 charts
Deleting outdated charts
root@chartmuseum-2:/srv/capi-helm-charts/charts/openstack-cluster# cd ..
root@chartmuseum-2:/srv/capi-helm-charts/charts# helm package ./openstack-cluster --version 0.16.1-wmf
Successfully packaged chart and saved it to: /srv/capi-helm-charts/charts/openstack-cluster-0.16.1-wmf.tgz
root@chartmuseum-2:/srv/capi-helm-charts/charts# cp openstack-cluster-0.16.1-wmf.tgz /srv/chartmuseum/
To check the latest available chart, you can pull it back down from chartmuseum in a separate directory:
$ cd checkthecharts
$ helm repo add chartmuseum https://chartmuseum.wmcloud.org
$ helm pull chartmuseum/openstack-cluster
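You can then confirm that the served chart is the expected WMF build:

```shell
# Show the name/version of the chart served by chartmuseum
helm show chart chartmuseum/openstack-cluster | grep -E '^(name|version):'
```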