Kubernetes/Kubernetes Workshop/Building a production ready cluster

From Wikitech


At the end of this module, you should be able to:

  • Use kubespray to set up a cluster.
  • Set up a cluster on Google Kubernetes Engine (GKE).

Step 1: Creating a cluster with kubespray

Thus far, you have used Minikube as your test cluster. In this module, you will first set up a production-ready Kubernetes cluster. There are several tools for setting up Kubernetes clusters, including kubeadm and Rancher.

There is also a manual method (Kubernetes the hard way), with many other methods in between. In this module, you will use kubespray, which uses Ansible to configure a set of machines into a Kubernetes cluster.

The machines can be bare metal or VMs. Using your WMF developer account, create 3 VMs, each with 2 CPUs and 4 GB RAM. You can also use VMs from DigitalOcean.

For running simple tests, a smaller VM with 1 CPU and 2 GB RAM is sufficient. You will use kubespray mainly as a black box to create a cluster, and for this workshop, you will not need to know kubespray's advanced use cases.

Launch the VMs

1. Log into your WMF horizon account.

2. In your selected project, launch 3 VMs. Fill out the prompts as required with the values below:

  • Name: node1, node2 and node3. (Use these exact names; otherwise you will get puppet errors, because kubespray changes the hostnames to node1, node2, and node3.)
  • Source: debian.buster
  • Flavor: g3.cores2.ram4.disk20
  • Security group default: we will need SSH access
  • Server groups: none

Once launched, you must be able to SSH into each VM by its IP address (this is a kubespray requirement). The most straightforward method is to do this from a fourth Horizon VM. You can also code along to this module on your local machine. In my case, to make it work from my laptop, I had to add 172.16.* to the wmflabs stanza in .ssh/config on my computer, as shown below:

Host 172.16.* *.wmflabs *.wikimedia.cloud
   User <Username>
   ProxyJump bastion.wmcloud.org:22
   IdentityFile ~/.ssh/labs.key

Log into your Instances

Note: Create a new SSH key for an instance, then log into that instance (do this for each instance you create). You can use this wiki page as a guide. You might have to wait a while before logging into your new instances. From your terminal, SSH into your instances:

$ ssh <node1-ip>
$ ssh <node2-ip> 
$ ssh <node3-ip>
Linux node2 4.19.0-17-cloud-amd64 #1 SMP Debian 4.19.194-3 (2021-07-18) x86_64
Debian GNU/Linux 10 (buster)
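Because kubespray depends on SSH access to every node by IP address, it can help to check all three logins in one go before continuing. The following is a minimal sketch, assuming key-based login already works; the function name check_ssh_nodes is hypothetical and not part of kubespray.

```shell
# check_ssh_nodes: try a non-interactive SSH login on each given address.
# Prints one "ok"/"FAILED" line per node; the exit status is non-zero
# if at least one login failed.
check_ssh_nodes() (
  failed=0
  for ip in "$@"; do
    # BatchMode=yes makes ssh fail instead of prompting for a password;
    # -n keeps ssh from reading standard input.
    if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$ip" true; then
      echo "ok      $ip"
    else
      echo "FAILED  $ip"
      failed=1
    fi
  done
  [ "$failed" -eq 0 ]
)

# Usage (same placeholders as in the commands above):
#   check_ssh_nodes <node1-ip> <node2-ip> <node3-ip>
```

A FAILED line usually points at a missing key or a missing ProxyJump entry for that address.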

3. In a new terminal, clone kubespray to your original instance or your local machine and install the required software:

$ git clone https://github.com/kubernetes-sigs/kubespray.git
$ cd kubespray
$ pip3 install -r requirements.txt

4. Add your hosts to the mycluster inventory configuration by running the inventory.py script:

$ cp -r inventory/sample inventory/mycluster
$ declare -a IPS=(node-1-ip node-2-ip node-3-ip)
$ CONFIG_FILE=inventory/mycluster/hosts.yml python3 contrib/inventory_builder/inventory.py ${IPS[@]}


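The script writes the node entries to inventory/mycluster/hosts.yml. The host section of the generated file looks like this:

all:
  hosts:
    node1:
      ansible_host: <node1-ip>
      ip: <node1-ip>
      access_ip: <node1-ip>
    node2:
      ansible_host: <node2-ip>
      ip: <node2-ip>
      access_ip: <node2-ip>
    node3:
      ansible_host: <node3-ip>
      ip: <node3-ip>
      access_ip: <node3-ip>

The file also defines the host groups that kubespray uses to assign roles to the nodes.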
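Before running the playbook, you may want to confirm that every node IP actually landed in the generated inventory. A small sketch, assuming the file contains ansible_host: entries as shown above; inventory_missing_ips is a hypothetical helper name.

```shell
# inventory_missing_ips: print every given IP that does not appear as an
# "ansible_host:" value in the inventory file.
# Succeeds only if no IP is missing.
inventory_missing_ips() (
  file=$1
  shift
  missing=0
  for ip in "$@"; do
    # awk exits with status 0 only if one ansible_host line carries this IP.
    if ! awk -v ip="$ip" '$1 == "ansible_host:" && $2 == ip { found = 1 } END { exit !found }' "$file"; then
      echo "$ip"
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
)

# Usage:
#   inventory_missing_ips inventory/mycluster/hosts.yml <node1-ip> <node2-ip> <node3-ip>
```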

5. Add timeout=60 to your /etc/ansible/ansible.cfg file. Then use Ansible to log into the VMs you created and install Kubernetes by running the cluster.yml playbook:

$ ansible-playbook -i inventory/mycluster/hosts.yml -u <shell-username> -b cluster.yml

Note: You might have to run the above command three times. On the third run, the Ansible playbook cluster.yml installs Kubernetes on all three instances. The Kubernetes installation will run for a while.
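The timeout setting mentioned in step 5 can be added by hand or with a small script. The sketch below assumes the setting belongs in the [defaults] section of ansible.cfg, which the text above does not state explicitly; ensure_ansible_timeout is a hypothetical helper name.

```shell
# ensure_ansible_timeout: make sure "timeout = <seconds>" is set in an
# ansible.cfg file (assumption: the setting lives under [defaults]).
# Running it twice does not duplicate the line.
ensure_ansible_timeout() (
  cfg=$1
  seconds=${2:-60}
  [ -f "$cfg" ] || : > "$cfg"
  if grep -Eq '^[[:space:]]*timeout[[:space:]]*=' "$cfg"; then
    # A timeout line already exists: rewrite its value.
    awk -v s="$seconds" '/^[ \t]*timeout[ \t]*=/ { print "timeout = " s; next } { print }' "$cfg" > "$cfg.tmp"
  elif grep -q '^\[defaults\]' "$cfg"; then
    # Insert the setting right after the [defaults] header.
    awk -v s="$seconds" '{ print } /^\[defaults\]/ { print "timeout = " s }' "$cfg" > "$cfg.tmp"
  else
    # No [defaults] section yet: append one.
    { cat "$cfg"; printf '[defaults]\ntimeout = %s\n' "$seconds"; } > "$cfg.tmp"
  fi
  mv "$cfg.tmp" "$cfg"
)

# Usage (needs root for the system-wide file):
#   ensure_ansible_timeout /etc/ansible/ansible.cfg 60
```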

6. Log into one of the VMs and check that the cluster is up:

$ ssh <node-ip>
$ sudo su -
$ kubectl get nodes

The output of the above commands shows that kubespray has built a Kubernetes cluster. This cluster has three nodes: two control plane nodes and one worker node. The two control plane nodes run the etcd database, the API server, the scheduler, and the controller-manager, while the worker nodes run the kubelet and the kube-proxy.

Note: The control plane nodes are also worker nodes in this limited configuration, i.e., Kubernetes can also schedule pods to run on the control plane nodes.
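To confirm that all three nodes joined the cluster, you can count the nodes that report the Ready status. A minimal sketch that parses the output of kubectl get nodes --no-headers; count_ready_nodes is a hypothetical helper name.

```shell
# count_ready_nodes: read "kubectl get nodes --no-headers" output on stdin
# and print how many nodes have exactly "Ready" in the STATUS column.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage (as root on node1):
#   kubectl get nodes --no-headers | count_ready_nodes
```

For the cluster built in this module, the expected count is 3.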

Joe Breda gave an overview of the Kubernetes architecture above.

Note: You can verify what is running on the individual nodes by logging into them and checking for the following processes: etcd, apiserver, scheduler, controller-manager, kubelet, and kube-proxy. You can run this check on all three nodes.
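The process check from the note above can be scripted. The sketch below reads a process listing and reports which components it finds. It assumes the usual binary names (kube-apiserver, kube-scheduler, kube-controller-manager); list_k8s_components is a hypothetical helper name.

```shell
# list_k8s_components: read "ps -eo args" output on stdin and report, for each
# Kubernetes component, whether a process with that binary name is present.
list_k8s_components() (
  procs=$(cat)
  for name in etcd kube-apiserver kube-scheduler kube-controller-manager kubelet kube-proxy; do
    # Match the name as a command or as the last element of a path,
    # so that flags such as --etcd-servers do not count as "etcd".
    if printf '%s\n' "$procs" | grep -Eq "(^|/)${name}( |\$)"; then
      echo "$name: running"
    else
      echo "$name: not found"
    fi
  done
)

# Usage (on any node):
#   ps -eo args | list_k8s_components
```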

Step 2: Running a simple workload

You can run a simple workload on the Kubernetes cluster from node1. Try this out using the cronjob application you built in module 3. You can build the cronjob based on the pywchksumbot Docker image from module 1 or any Docker image of your choice. When you are done, stop the cronjob:

$ kubectl delete cronjob --all

Note: To start the cronjob, run the kubectl create -f cron_mywpchksumbot.yaml command as the root user.

Step 3: Running a simple service workload

In this step, you will rerun a simple service.

1. As the root user on node1, create a YAML file named baseapache.yaml to run the simple service:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: baseapache
  labels:
    app: baseapache
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: baseapache
  template:
    metadata:
      labels:
        app: baseapache
    spec:
      containers:
      - name: baseapache
        image: <userid>/baseapache:latest
        imagePullPolicy: Always

2. Create the deployment and check its pod:

$ kubectl create -f baseapache.yaml
$ kubectl get pods -o wide

3. Next, create the corresponding service in a file named baseapacheservice.yaml:


kind: Service
apiVersion: v1
metadata:
  name: baseapache
spec:
  selector:
    app: baseapache
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80

4. Create the service:

$ kubectl create -f baseapacheservice.yaml

5. On node3, use the curl command to get the page's contents:

$ curl <ip_address>

Note: The IP address used above comes from the output of the kubectl get svc command, run on node1.
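Instead of reading the address from the table that kubectl get svc prints, you can ask kubectl for just that field. A minimal sketch using kubectl's JSONPath output; service_cluster_ip is a hypothetical helper name.

```shell
# service_cluster_ip: print only the cluster IP of the named service.
service_cluster_ip() {
  kubectl get svc "$1" -o jsonpath='{.spec.clusterIP}'
}

# Usage (as root on node1):
#   service_cluster_ip baseapache
# Then, on node3, pass the printed address to curl as in step 5.
```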

6. SSH into node1 and run the following command to get more information about the pod:

$ ps -ef | grep apache2


  • Identify the parent process of apache2.
  • In the parent process, there is a workdir on the command line. Take a look at that.

Docker uses the overlay filesystem to represent the layers in a Docker image. For example, FROM ubuntu in your baseapache Docker image means that you base your Docker image on the Ubuntu image, and any changes you make are captured in one or more additional layers. You can see your image's layers with the docker image inspect <imagename> command. You can run this command on any node.

$ docker image inspect <userid>/baseapache

7. You can use the file system IDs in the baseapache docker image to identify the filesystem the pod uses.

$ docker image inspect <userid>/baseapache

8. Copy the ID of the merged directory, which is the name of its parent directory under /var/lib/docker/overlay2/. Then run the grep command in the /var/lib/docker/overlay2/l directory to look up that ID. The mapped string is the new ID:

$ ls -l | grep <id>

9. Find the mounted filesystem; this is the file system used by the container:

$ mount -l | grep <new id>
$ cat /var/lib/docker/overlay2/<hex string found>/merged/var/www/html/index.html
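As an alternative to the manual lookup in steps 7 to 9, you can ask Docker for the merged directory of the running container directly. This is a different route from the one described above, sketched under the assumption that Docker uses the overlay2 storage driver with its data under /var/lib/docker. Both function names are hypothetical.

```shell
# container_merged_dir: print the merged overlay directory of a container.
# The argument is a container name or ID as listed by "docker ps".
container_merged_dir() {
  docker inspect --format '{{ .GraphDriver.Data.MergedDir }}' "$1"
}

# overlay_id_from_path: extract <id> from /var/lib/docker/overlay2/<id>/...
# Prints nothing if the path does not have that shape.
overlay_id_from_path() {
  printf '%s\n' "$1" | sed -n 's|^/var/lib/docker/overlay2/\([^/]*\)/.*$|\1|p'
}

# Usage (as root on the node that runs the pod):
#   dir=$(container_merged_dir <container-id>)
#   overlay_id_from_path "$dir"
#   cat "$dir/var/www/html/index.html"
```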

10. You can now use the filesystem to view the apache logs:

$ tail -f /var/lib/docker/overlay2/<file_system_id>/merged/var/log/apache2/access.log

Note: If you edit the index.html file this way, you change the running container's filesystem in an uncontrolled manner. This is not good practice.

Step 4: Using kubectl locally

You might want to use kubectl on your local workstation to create and delete deployments. You have to change the config file in ~/.kube/config and point it to the new cluster. Your local workstation then needs to access the API server on the cluster. Since the cluster runs in the WMF VPS cloud behind a bastion host, setting up an SSH tunnel is the working solution so far.

Take the following steps:

1. Make a backup copy of the local config file.

2. Copy the config file from node1.

3. Set up the SSH tunnel:

$ ssh -L 6443:localhost:6443 <ip_for_node1>

4. In the config file, set the value of server so that it points to the local end of the tunnel (port 6443 on your workstation).

5. Verify it works:

$ kubectl get nodes
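The server change from step 4 can also be scripted. The sketch below rewrites the server: line of a kubeconfig copy in place. The address https://127.0.0.1:6443 in the usage line is an assumption about how the tunnel is set up, not a value taken from this page; set_kubeconfig_server is a hypothetical helper name.

```shell
# set_kubeconfig_server: replace the value of every "server:" line in a
# kubeconfig file, keeping the original indentation.
set_kubeconfig_server() (
  file=$1
  url=$2
  sed "s|^\([[:space:]]*server:\).*|\1 ${url}|" "$file" > "$file.tmp" &&
    mv "$file.tmp" "$file"
)

# Usage (after backing up ~/.kube/config; the URL is an assumed tunnel endpoint):
#   set_kubeconfig_server ~/.kube/config https://127.0.0.1:6443
```

The kubectl config set-cluster command with its --server option achieves the same result without editing the file as text.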

6. On your DigitalOcean VM, this can work directly, without any SSH forwarding. Copy the config file as detailed above, but use node1's IP address as the server address in the config file, and make sure port 6443 is accessible in the associated firewall.

7. Now run your database deployment according to Module 5 on the machine: bookdb (deployment and service), bookapp (configmap, secrets, deployment, and service), and confirm that everything works as expected.

Using kubectl Locally with Floating IPs

You can forward Horizon's floating IPs to a local IP. If you forward a floating IP address to one of the API servers and open the firewall to allow traffic on port 6443, you can access the K8s cluster directly. Copy the ~/.kube/config file from node1 and use it locally, substituting the server address with the floating IP address. Forwarding floating IPs to your servers can be helpful to test another application's access to the cluster, for example, GitLab runners.

Note: Floating IPs have to be requested and take some time to approve.

Hands-on Demo: Granting browser access to our service

When using a WMF Cloud Service Account, you can define web proxies to allow external access to a service. Two configurations need to be created and adjusted:

A. The web proxy: this is under DNS in the menu option.

B. A security group: this is under Network in the menu option.


1. Get the port mapped for our service:

$ kubectl get svc

2. In this example, the service is mapped to port 31818. Make a curl request to the URL to get a response from your service:

$ curl http://<ip_address>:31818
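The mapped port differs from cluster to cluster, so it is safer to read it from the service than to copy 31818. A minimal sketch, assuming the service is exposed as a NodePort service (which the port number above suggests); service_node_port is a hypothetical helper name.

```shell
# service_node_port: print the node port of the first port entry of a service.
# Prints nothing if the service has no node port assigned.
service_node_port() {
  kubectl get svc "$1" -o jsonpath='{.spec.ports[0].nodePort}'
}

# Usage (as root on node1):
#   curl "http://<ip_address>:$(service_node_port baseapache)"
```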

3. Define a web proxy for that port:

Note: The name baseapache might not be available for your use. In such a scenario, you will get a Duplicate Recordset error message.

4. The firewall for the project needs to allow port 31818. Add a rule to your security group. Your security group might have the name Default Group if you did not select anything specific when creating the VMs.

5. Now you should be able to access the service from any browser. If you type in the address http://baseapache.wmcloud.org in your browser, you should get a reply. You can add more web proxies for other services that might run on the Kubernetes cluster by following the same sequence.

Step 5: Google Kubernetes Engine (GKE)

This step introduces you to Google Kubernetes Engine (GKE). Take the following steps to use this service:

1. Log in to the Google Cloud console.

2. Click on Kubernetes Engine/Clusters.

3. Click on Create Cluster (you can leave the default values for name, zone, and version as they are).

4. Click on the newly created cluster and then create a node pool. You can change the default node type to an N1 machine to minimize costs (delete the cluster after this walkthrough).

5. Click connect.

6. In your terminal, install the gcloud CLI. This step is compulsory because the cloud shell will not work in this scenario. You can follow this guide to install the gcloud CLI. You can install gcloud on a local VM, a VM in the cloud (Horizon etc), or your local machine.

7. Run the following:

$ gcloud auth login 
$ gcloud container clusters get-credentials <cluster-name> --zone <zone-name> --project <project-name>

8. View your available nodes:

$ kubectl get nodes
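After get-credentials, kubectl switches to a new context. Since you now have several clusters configured, it can be worth checking that the active context is the GKE one before creating or deleting anything. The sketch relies on the gke_<project>_<zone>_<cluster> naming that gcloud uses for the contexts it creates; is_gke_context is a hypothetical helper name.

```shell
# is_gke_context: succeed if the given context name has the
# gke_<project>_<zone>_<cluster> shape.
is_gke_context() {
  case "$1" in
    gke_*_*_*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   is_gke_context "$(kubectl config current-context)" && echo "kubectl points at GKE"
```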

9. Verify if the IRC bot works.

$ kubectl create -f cron_pywpchksumbot.yaml

10. Check the IRC channel.

11. Run your database deployments: bookdb (deployment and service), bookapp (configmap, secrets, deployment, and service), and check that they function properly.

12. Delete the cluster.


  • GKE has networking integrated with Kubernetes. Although we have not configured this ourselves, bookdbapp gets an external address and is accessible through the internet.
  • Authenticate to gcloud using your Google Cloud Platform (GCP) credentials.

Next Module

Module 7: Kubernetes Package Manager

Previous Module