News/2020 Kubernetes cluster migration

This page details information about the historical migration to the 2020 Toolforge Kubernetes cluster from the legacy (2016) Kubernetes cluster. This migration was completed on 2020-03-01. All Toolforge tools using Kubernetes are now running on the 2020 cluster.

What is changing?

  • New Kubernetes version (v1.15.6)
  • New Docker version (19.03.4)
  • New per-tool resource quota system
  • New k8s-status tool to view workloads and configuration of the Kubernetes cluster
  • Improved ability for Toolforge admins to upgrade and extend the Kubernetes cluster

Timeline

  • 2020-01-09: 2020 Kubernetes cluster available for beta testers on an opt-in basis (announcement email)
  • 2020-01-24: 2020 Kubernetes cluster general availability for migration on an opt-in basis (announcement email)
  • 2020-02-21: Automatic migration of remaining workloads from 2016 cluster to 2020 cluster by Toolforge admins (announcement email)
  • 2020-03-01: Forced migration complete!

What should I do?

Manually migrate a webservice to the new cluster

  1. Log into a Toolforge bastion (login.tools.wmflabs.org or dev.tools.wmflabs.org)
  2. become $yourtool
  3. webservice migrate
  4. Configure your shell to use a newer version of kubectl:
    • alias kubectl=/usr/bin/kubectl
    • echo "alias kubectl=/usr/bin/kubectl" >> $HOME/.profile
  5. Check that things launched successfully at your usual web location (https://tools.wmflabs.org/$yourtool).
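
If you want to confirm which cluster your tool is now pointed at, kubectl can report the active context. A minimal check, assuming the newer kubectl from step 4 is in use:

kubectl config current-context    # "toolforge" = 2020 cluster, "default" = legacy cluster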

Manually migrate Kubernetes continuous jobs or other custom Deployments

Migrating a custom Deployment is largely the same process as migrating a webservice. At a minimum you will need to update the image you are requesting. Additional configuration changes may also be needed, depending on what your Deployment asks the Kubernetes cluster to do.

  1. Log into a Toolforge bastion (login.tools.wmflabs.org or dev.tools.wmflabs.org)
  2. become $yourtool
  3. Remind yourself what your Deployment is named: kubectl get deployment
  4. Shut down your Deployment on the legacy cluster: kubectl delete deployment [NAME]
    • Check for and shut down any Service object you may be using as well. A Service object is not typical for a continuous job, but if you leave one running on the legacy cluster it will keep the ingress routing from directing traffic to a similar service on the 2020 cluster.
  5. Switch your Kubernetes "context" to the new 2020 cluster: kubectl config use-context toolforge
  6. Configure your shell to use a newer version of kubectl:
    • alias kubectl=/usr/bin/kubectl
    • echo "alias kubectl=/usr/bin/kubectl" >> $HOME/.profile
  7. Update your Deployment YAML file, at minimum changing the image to one built for the new cluster (see the example after this list).
  8. Launch your Deployment on the new cluster: kubectl create --validate=true -f [YOUR YAML FILE]
    • The --validate=true argument will help you debug your YAML configuration by checking its syntax and conformance to the APIs in the 2020 Kubernetes cluster before actually submitting the configuration to the cluster.
  9. Check that the deployment started and that the pods are ready: kubectl get deployment,rs,pods,services
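
As a sketch of what step 7 involves, here is a minimal Deployment for the 2020 cluster. The Deployment name, labels, image, and command below are hypothetical placeholders rather than values from any real tool; the important parts are an apiVersion the new cluster accepts (apps/v1) and an image published for the new cluster in the Toolforge Docker registry.

apiVersion: apps/v1                 # the legacy extensions/v1beta1 is not accepted
kind: Deployment
metadata:
  name: mytool.bot                  # hypothetical Deployment name
  labels:
    name: mytool.bot
spec:
  replicas: 1
  selector:                         # required by apps/v1
    matchLabels:
      name: mytool.bot
  template:
    metadata:
      labels:
        name: mytool.bot
    spec:
      containers:
        - name: bot
          # hypothetical image name; use one built for the 2020 cluster
          image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
          command: ["/usr/bin/python3", "/data/project/mytool/bot.py"]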

What are the primary changes with moving to the new cluster?

Lower default resource limits for webservice

On the new Kubernetes cluster, webservices run with CPU and RAM limits which are lower by default than the limits used on the legacy cluster. Defaults are set at 0.5 CPU and 512Mi of memory. Users can adjust these up to the highest level allowed in the legacy cluster (1 CPU and 4GiB of memory) with command-line arguments to the webservice command (--cpu and --mem), or with properly formatted Kubernetes YAML specifications for advanced users.

The Toolforge admin team encourages you to try running your webservice with the new lower defaults before deciding that you need more resources. We believe that most PHP and Python3 webservices will work as expected with the lower values. Java webservices will almost certainly need higher limits due to the nature of running a JVM.
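
For example, a JVM webservice that needs more headroom could be started with explicit limits. The values here are illustrative, and flag placement may vary slightly between webservice versions; anything up to 1 CPU and 4Gi of memory can be requested:

webservice --backend=kubernetes --cpu 1 --mem 3Gi [runtime] start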

Namespace quota system

The new cluster places quota limits on each tool's entire namespace. These limits determine how many pods can be used, how many service ports can be exposed, total memory, total CPU, and more. The default hard limits for a tool's namespace are:

requests.cpu: 2           # Soft limit on CPU usage
requests.memory: "6Gi"    # Soft limit on memory usage
limits.cpu: 2             # Hard limit on CPU usage
limits.memory: "8Gi"      # Hard limit on memory usage
pods: 4
services: 1
services.nodeport: 0      # Nodeport services are not allowed
replicationcontrollers: 1
secrets: 10
configmaps: 10
persistentvolumeclaims: 3

These limits govern everything a user can create in the new cluster. You will notice that some of them, such as pods, are a bit higher than seems immediately useful; that is to make room for future services. We anticipate establishing a process soon for requesting raised limits from the Toolforge administrators for tools that need a bit more to get their work done.
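
To see how much of the quota your tool is currently consuming, kubectl can describe the ResourceQuota object in your namespace. From a bastion, after become $yourtool:

kubectl describe resourcequotas    # shows used vs. hard values for each limit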

Improved isolation between namespaces

w:WP:BEANS applies to the details of this change until we have fully decommissioned the legacy cluster, but the new Kubernetes cluster has many improvements in both the upstream Kubernetes software and the configuration of the Toolforge deployment which provide better isolation between pods running in namespaces. The Toolforge admin team expects to publish a detailed threat assessment report on the configuration of the new Kubernetes cluster in the coming weeks.

New LDAP integration mechanism

The 2020 Kubernetes cluster uses a different software stack to integrate the Developer account LDAP directory with NSS, both on the Kubernetes nodes themselves and inside the Docker containers which run webservices and other workloads. The new integration is done with SSSD. SSSD has been tested extensively on Toolforge's bastions and grid engine hosts, and we have found it to be both faster and more stable than the nslcd software which was used on our Debian Jessie and Ubuntu hosts in the past. We hope that this change will fix a long-standing bug affecting $HOME and uid lookups inside our Kubernetes containers.
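
One way to check the lookups for yourself is to open an interactive shell inside one of your tool's containers (assuming your runtime supports webservice shell) and inspect what NSS reports:

webservice shell            # start a shell inside a Kubernetes container
id                          # uid/gid should match your tool account
getent passwd "$(id -un)"   # $HOME should resolve to your tool's /data/project directory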

New ingress routing layer

The 2020 Kubernetes cluster uses a new mechanism for ingress routing. We have deployed ingress-nginx, a community-developed FOSS solution built on nginx and Lua. This ingress layer replaces a custom solution called 'kube2proxy', which was written to monitor the state of the legacy Kubernetes cluster and automatically add rules to the custom 'dynamicproxy' software that manages both https://tools.wmflabs.org and the rest of the *.wmflabs.org HTTPS services for Cloud VPS projects. Using a supported, community-developed ingress layer should make routing HTTP traffic to Toolforge webservers more reliable than our past implementation.

Ingress-nginx provides some new features in the form of Ingress object annotations that we will be exploring in the future. We have already discovered a convenient way to redirect all traffic from one tool account to another URL, which can be used to replace prior Lighttpd-based solutions for tools which have been renamed or have graduated from Toolforge to their own Cloud VPS project.
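
As an illustration, the redirect trick uses the ingress-nginx permanent-redirect annotation. Everything else in this sketch (the object name, host, path, backend service, and target URL) is a hypothetical placeholder, and the exact spec your tool needs may differ:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: mytool-redirect                 # hypothetical Ingress name
  annotations:
    # answer every matching request with a 301 to the new home
    nginx.ingress.kubernetes.io/permanent-redirect: https://mytool.example.org
spec:
  rules:
    - host: tools.wmflabs.org
      http:
        paths:
          - path: /mytool
            backend:                    # v1beta1 requires a backend stanza
              serviceName: mytool       # hypothetical; unused once the redirect answers
              servicePort: 8000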

Solutions to common problems

Having trouble with the new cluster? If the answer to your problem isn't here, ask for help in the #wikimedia-cloud IRC channel or file a bug in Phabricator.

webservice restart fails

This is a known issue that is being actively worked on. When webservice restart fails with a traceback like this:

Traceback (most recent call last):
  File "/usr/local/bin/webservice", line 319, in <module>
    start(job, "Restarting webservice")
  File "/usr/local/bin/webservice", line 134, in start
    job.request_start()
  File "/usr/lib/python2.7/dist-packages/toollabs/webservice/backends/kubernetesbackend.py", line 658, in request_start
    pykube.Deployment(self.api, self._get_deployment()).create()
  File "/usr/lib/python2.7/dist-packages/pykube/objects.py", line 76, in create
    self.api.raise_for_status(r)
  File "/usr/lib/python2.7/dist-packages/pykube/http.py", line 104, in raise_for_status
    raise HTTPError(payload["message"])
pykube.exceptions.HTTPError: object is being deleted: deployments.extensions "$tool" already exists

you can generally wait a few moments and then run webservice start; everything will start with whatever settings you last used to start the service. This works because landing in this state normally does not blank out your service.manifest file. You can verify that by running grep kubernetes service.manifest before you run webservice start, just to be sure.
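
Put together, the recovery looks something like this; the grep is just a sanity check that the manifest still names the kubernetes backend:

grep kubernetes service.manifest    # confirm your saved settings survived
webservice start                    # relaunch with the last-used settings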

Switch back to the legacy Kubernetes cluster

Until the legacy cluster is discontinued, it is entirely possible to migrate back to it if you discover an issue. The steps mirror the process of migrating to the new cluster, as the main change in either case is setting your tool's active "context".

  1. Log into a Toolforge bastion (login.tools.wmflabs.org or dev.tools.wmflabs.org)
  2. become $yourtool
  3. Remind yourself which webservice runtime you are using: webservice status
  4. Shutdown your webservice on the 2020 cluster: webservice stop
  5. Switch your Kubernetes "context" to the legacy cluster: kubectl config use-context default
  6. Configure your shell to use an older version of kubectl:
    • unalias kubectl
    • Edit $HOME/.profile and remove any alias kubectl=... line.
  7. Launch your web service on the legacy cluster: webservice --backend=kubernetes [runtime] start
  8. Check that things launched successfully at your usual web location (https://tools.wmflabs.org/$yourtool).

kubectl errors: "cannot create resource … in API group … in the namespace"

You may not have updated your Deployment YAML file for the APIs available in the 2020 cluster; see the migration steps above and the sketch below.
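
The most common cause is a Deployment that still declares a pre-2020 apiVersion. A minimal before/after sketch of the header change, assuming your object is a Deployment:

# 2016 cluster style, rejected by the 2020 cluster:
apiVersion: extensions/v1beta1
kind: Deployment

# 2020 cluster style:
apiVersion: apps/v1
kind: Deployment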

See also