Help:Toolforge/Raw Kubernetes jobs

From Wikitech
Using the Toolforge jobs framework is recommended over direct use of the Kubernetes CronJob API. The Kubernetes API internals may change at any time, which can break tools that use this method!

This page contains information on running raw Kubernetes jobs in Toolforge. In this context, raw means direct interaction with the Kubernetes API.

One time jobs

If you need to run a job only once, you can use a pod, which is the smallest deployable unit in Kubernetes. To deploy a pod, create a YAML file like the example below.

apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    toolforge: tool
spec:
  containers:
  - name: main
    workingDir: /data/project/mytool
    image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
    command: ['/bin/bash', '-c', 'source venv3/bin/activate; ./myapp.py']
  restartPolicy: Never

Change the name "example" to the name you want for your pod, change the workingDir to the directory where your application is, change the image to the image you need, change the command to call your app, and save the YAML file. You can create the pod with the command kubectl apply -f <path-to-yaml-file>.

You can see if the pod is running with kubectl get pods and see the pod output with kubectl logs <pod-name>. Note that two pods cannot have the same name: you need to delete the old pod with kubectl delete pod <pod-name> before creating a new one with the same name.

You can change "restartPolicy: Never" to "restartPolicy: OnFailure" to make the pod restart the container when it exits with an error. However, for a continuous job it is recommended to use a "deployment" workload type, as described in a section below: when the Kubernetes node where the pod is running fails, the deployment recreates the pod on another node, which does not happen with a standalone pod.

Cron jobs

It is possible to run cron jobs on Kubernetes (see upstream documentation for a full description).

Example cronjob.yaml

Wikiloveslove is a Python 3.7 bot that runs as a Kubernetes CronJob. The cronjobs.yaml file that it uses to tell Kubernetes how to start and schedule the bot is reproduced below.
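
For orientation, a minimal CronJob definition might look like the sketch below. This is not the actual Wikiloveslove file. The name my-cronjob (standing in for CRONJOB-NAME), the schedule, the working directory and the command are hypothetical placeholders modelled on the Pod example above, and the batch/v1 API version is an assumption that depends on the cluster's Kubernetes version.

```yaml
# Hypothetical cronjobs.yaml sketch; adjust every value to your own tool.
apiVersion: batch/v1           # assumption: older clusters may need batch/v1beta1
kind: CronJob
metadata:
  name: my-cronjob             # placeholder for CRONJOB-NAME
  labels:
    toolforge: tool
spec:
  schedule: "0 * * * *"        # hypothetical schedule: minute 0 of every hour
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            toolforge: tool    # same label as in the Pod example
        spec:
          containers:
          - name: bot
            workingDir: /data/project/mytool
            image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
            command: ['/bin/bash', '-c', 'source venv3/bin/activate; ./myapp.py']
          restartPolicy: OnFailure
```

The schedule field uses the standard five-field cron syntax. The pod template under jobTemplate has the same shape as the spec of the one-time Pod shown earlier.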

Create the CronJob object in your tool's Kubernetes namespace using kubectl:

$ kubectl apply --validate=true -f $HOME/cronjobs.yaml
cronjob.batch/CRONJOB-NAME configured

After creating the cronjob, you can trigger it immediately by creating a test job with kubectl create job --from=cronjob/CRONJOB-NAME test, and then follow the logs as usual with kubectl logs job/test -f to debug.

If that doesn't give you any useful output, try kubectl describe job/test to see what's going on: it might be a misconfigured limit, for instance.

If you do not want the application to restart on failure, change "restartPolicy: OnFailure" to "restartPolicy: Never" and add "backoffLimit: 0" to the jobTemplate spec (at the same indentation as "template:").
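
As a sketch of where these two settings go, the fragment below shows only the relevant part of a CronJob definition. The rest of the file is assumed to stay unchanged, and the comment marks where your existing containers section belongs.

```yaml
# Fragment of a CronJob spec; surrounding fields omitted.
  jobTemplate:
    spec:
      backoffLimit: 0          # do not retry the job after a failure
      template:
        spec:
          # containers: keep your existing container definition here
          restartPolicy: Never # do not restart the container in place
```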

Continuous jobs

The basic unit of managing execution on a Kubernetes cluster is called a "deployment". Each deployment is described with a YAML configuration file which describes the container images to be started ("pods" in the Kubernetes terminology) and commands to be run inside them after the container is initialized. A deployment also specifies where the pods run and what external resources are connected to them. The upstream documentation is comprehensive.

Example deployment.yaml

Stashbot is a Python 3.7 IRC bot that runs in a Kubernetes deployment. The deployment.yaml file that it uses to tell Kubernetes how to start the bot is reproduced below. This deployment is launched using a stashbot.sh wrapper script which runs kubectl create --validate=true -f /data/project/stashbot/etc/deployment.yaml.

This deployment:

  • Uses the 'tool-stashbot' namespace that the tool is authorized to control.
  • Creates a container using the 'latest' version of the 'docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base' Docker image.
  • Runs the command /data/project/stashbot/bin/stashbot.sh run inside the container to start the bot itself.
  • Mounts the /data/project/stashbot/ NFS directory as /data/project/stashbot/ inside the container.

The stashbot.sh script assumes that a Python 3.7 virtual environment has been manually created and populated with library dependencies for the project. See Help:Toolforge/Web/Python#Virtual Environments and Packages for more information about how to create a virtual environment. Make sure you call your venv python interpreter and not /usr/bin/python.
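
A deployment matching the description above could be sketched roughly as follows. This is an illustrative reconstruction, not Stashbot's actual deployment.yaml. The object name, the name label and the replica count are hypothetical. It also assumes that the toolforge: tool label, as used in the Pod example on this page, is what causes the tool's NFS directory to be mounted.

```yaml
# Hypothetical deployment.yaml sketch based on the bullet points above.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stashbot               # hypothetical object name
  namespace: tool-stashbot
  labels:
    name: stashbot
    toolforge: tool
spec:
  replicas: 1                  # assumption: a single bot instance
  selector:
    matchLabels:
      name: stashbot           # must match the pod template labels below
      toolforge: tool
  template:
    metadata:
      labels:
        name: stashbot
        toolforge: tool        # assumed to trigger the NFS mounts
    spec:
      containers:
      - name: bot
        image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
        command: ['/data/project/stashbot/bin/stashbot.sh', 'run']
        workingDir: /data/project/stashbot
```

In an apps/v1 Deployment the selector is required and has to match the labels in the pod template, which is why the labels appear three times.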

Monitoring your jobs

You can see which jobs you have running with kubectl get pods. Using the name of the pod, you can see the logs with kubectl logs <pod-name>.

To restart a failing pod, use kubectl delete pod <pod-name>; the deployment will then create a replacement pod. If you need to kill it entirely, find the deployment name with kubectl get deployment, and delete it with kubectl delete deployment <deployment-name>.

Virtualenv and pywikibot

See Help:Toolforge/Running Pywikibot scripts (advanced) for a supported way of doing this.

Some Python applications need a virtualenv to use packages that are not included in the Python image. Pywikibot, for example, needs at least the requests package to work, and that package is not in the python3 image. Below are the steps to create a virtualenv and install the requests package using Python 3.9.

First, we create an interactive shell inside a Kubernetes container using the python3.9 image.

tools.mytool@tools-sgebastion-10:~$ kubectl run -it shell --image=docker-registry.tools.wmflabs.org/toolforge-python39-sssd-base:latest --restart=Never --rm=true --labels="toolforge=tool" --env="HOME=$HOME" -- sh -c 'cd $HOME ; bash'

Then we create the virtualenv, activate it, install the requests package and exit the container.

tools.mytool@shell:~$ python3 -m venv venv
tools.mytool@shell:~$ source venv/bin/activate
(venv) tools.mytool@shell:~$ pip install requests
(venv) tools.mytool@shell:~$ exit

The example below is the containers section of the YAML described in the sections above. You can use it with a single job, a cronjob or a continuous job. This example activates the virtualenv and runs a pywikibot application.

...
  containers:
  - name: bot
    workingDir: /data/project/mytool
    image: docker-registry.tools.wmflabs.org/toolforge-python39-sssd-base:latest
    command: ['/bin/bash', '-c', 'source venv/bin/activate; ./myapp.py']
    env:
    - name: PYTHONPATH
      value: /data/project/shared/pywikibot/stable

To use the virtualenv, you can activate it directly in the container command, as in the example, or you can create a wrapper shell script and call that script.

To use pywikibot, we added the environment variable PYTHONPATH=/data/project/shared/pywikibot/stable, which allows Python to import the shared pywikibot package in Toolforge.
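
The wrapper-script alternative mentioned above might look like the following containers section. The script name run.sh is hypothetical. It is assumed to be an executable file in the tool's directory that activates the virtualenv and then starts the application, as outlined in the comments.

```yaml
# Fragment: containers section calling a hypothetical wrapper script.
  containers:
  - name: bot
    workingDir: /data/project/mytool
    image: docker-registry.tools.wmflabs.org/toolforge-python39-sssd-base:latest
    # run.sh (hypothetical) would contain something like:
    #   #!/bin/bash
    #   source venv/bin/activate
    #   ./myapp.py
    command: ['/data/project/mytool/run.sh']
    env:
    - name: PYTHONPATH
      value: /data/project/shared/pywikibot/stable
```

Keeping the activation logic in a script means the YAML does not need to change when the start-up steps do.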