Help:Toolforge/Raw Kubernetes jobs
This page contains information on running raw Kubernetes jobs in Toolforge. In this context, raw means direct interaction with the Kubernetes API.
One time jobs
If you need to run a job only once, you can use a pod, which is the smallest deployable unit in Kubernetes. To deploy a pod, create a YAML file like the example below.
apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    toolforge: tool
spec:
  containers:
    - name: main
      workingDir: /data/project/mytool
      image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
      command: ['/bin/bash', '-c', 'source venv3/bin/activate; ./myapp.py']
  restartPolicy: Never
Change the name "example" to the name you want for your pod, set workingDir to the directory where your application lives, set the image to the one you need, and change the command so that it calls your application. Then save the YAML file and create the pod with the command kubectl apply -f <path-to-yaml-file>.
You can check whether the pod is running with kubectl get pods and see its output with kubectl logs <pod-name>. Two pods cannot have the same name, so delete the old pod with kubectl delete pod <pod-name> before creating a new one with the same name.
You can change "restartPolicy: Never" to "restartPolicy: OnFailure" to make the pod restart the container when it exits with an error. However, if you want a continuous job, it is recommended to use a "deployment" workload type as described in the Continuous jobs section below: when the Kubernetes node where the pod is running fails, the deployment recreates the pod on another node, which does not happen when you create a simple pod.
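As a minimal sketch of the restart-on-error variant, the manifest below is the one-time pod example with only the restartPolicy line changed. The pod name, working directory, image, and command are the same placeholders used above and need to be adapted to your tool.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example
  labels:
    toolforge: tool
spec:
  containers:
    - name: main
      workingDir: /data/project/mytool
      image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
      command: ['/bin/bash', '-c', 'source venv3/bin/activate; ./myapp.py']
  # Restart the container only when it exits with a non-zero status
  restartPolicy: OnFailure
```

Note that restartPolicy belongs to the pod spec, so it is indented at the same level as "containers:" rather than inside the container entry.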
Cron jobs
It is possible to run cron jobs on Kubernetes (see upstream documentation for a full description).
Example cronjob.yaml
Wikiloveslove is a Python 3.7 bot that runs as a Kubernetes cron job. The cronjobs.yaml file that it uses to tell Kubernetes how to start and schedule the bot is reproduced below.
/data/project/wikiloveslove/cronjobs.yaml (copied 2020-02-01)
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: list-images
  labels:
    name: wikiloveslove.listimages
    # The toolforge=tool label will cause $HOME and other paths to be mounted from Toolforge
    toolforge: tool
spec:
  schedule: "28 * * 2 *"
  startingDeadlineSeconds: 30
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            toolforge: tool
        spec:
          containers:
            - name: bot
              workingDir: /data/project/wikiloveslove
              image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
              args:
                - /bin/sh
                - -c
                - /data/project/wikiloveslove/list_images.sh
              env:
                - name: PYWIKIBOT_DIR
                  value: /data/project/wikiloveslove
                - name: HOME
                  value: /data/project/wikiloveslove
          restartPolicy: OnFailure
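The schedule value uses the standard five-field cron syntax. The annotated fragment below is only a reading aid for the value used in this example, not an additional setting; adjust the fields to the timing your own job needs.

```yaml
spec:
  # Fields, left to right: minute, hour, day of month, month, day of week.
  # "28 * * 2 *" therefore means: at minute 28 of every hour,
  # on every day, during February only.
  schedule: "28 * * 2 *"
```

For comparison, a value of "0 3 * * *" would run the job once a day at 03:00.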
Create the CronJob object in your tool's Kubernetes namespace using kubectl:
$ kubectl apply --validate=true -f $HOME/cronjobs.yaml
cronjob.batch/CRONJOB-NAME configured
After creating the cronjob, you can trigger it immediately by creating a test job with kubectl create job --from=cronjob/CRONJOB-NAME test, and then follow the logs as usual with kubectl logs job/test -f to debug.
If that doesn't give you any useful output, try kubectl describe job/test to see what's going on: it might be a misconfigured limit, for instance.
If you do not want the application to restart on failure, change "restartPolicy: OnFailure" to "restartPolicy: Never" and add "backoffLimit: 0" to the jobTemplate spec (at the same indentation as "template:").
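The placement of these two settings is easy to get wrong, so here is a sketch of the relevant part of the CronJob spec, based on the Wikiloveslove example. Settings that do not change are elided with comments.

```yaml
spec:
  schedule: "28 * * 2 *"
  jobTemplate:
    spec:
      # Job-level setting: do not retry the job after a failure
      backoffLimit: 0
      template:
        metadata:
          labels:
            toolforge: tool
        spec:
          containers:
            - name: bot
              # ... remaining container settings unchanged ...
          # Pod-level setting: do not restart the container in place
          restartPolicy: Never
```

Both settings are needed because restartPolicy only controls restarts inside a pod, while backoffLimit controls how many times the job retries after a failure.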
Continuous jobs
The basic unit of managing execution on a Kubernetes cluster is called a "deployment". Each deployment is described with a YAML configuration file which describes the container images to be started ("pods" in the Kubernetes terminology) and commands to be run inside them after the container is initialized. A deployment also specifies where the pods run and what external resources are connected to them. The upstream documentation is comprehensive.
Example deployment.yaml
Stashbot is a Python 3.7 IRC bot that runs in a Kubernetes deployment. The deployment.yaml file that it uses to tell Kubernetes how to start the bot is reproduced below. This deployment is launched using a stashbot.sh wrapper script which runs kubectl create --validate=true -f /data/project/stashbot/etc/deployment.yaml.
/data/project/stashbot/etc/deployment.yaml (copied 2020-01-03)
---
# NOTE: this deployment works with the "toolforge" Kubernetes cluster, and not the legacy "default" cluster.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stashbot.bot
  namespace: tool-stashbot
  labels:
    name: stashbot.bot
    # The toolforge=tool label will cause $HOME and other paths to be mounted from Toolforge
    toolforge: tool
spec:
  replicas: 1
  selector:
    matchLabels:
      name: stashbot.bot
      toolforge: tool
  template:
    metadata:
      labels:
        name: stashbot.bot
        toolforge: tool
    spec:
      containers:
        - name: bot
          image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
          command: [ "/data/project/stashbot/bin/stashbot.sh", "run" ]
          workingDir: /data/project/stashbot
          env:
            - name: HOME
              value: /data/project/stashbot
          imagePullPolicy: Always
This deployment:
- Uses the 'tool-stashbot' namespace that the tool is authorized to control.
- Creates a container using the 'latest' version of the 'docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base' Docker image.
- Runs the command /data/project/stashbot/bin/stashbot.sh run inside the container to start the bot itself.
- Mounts the /data/project/stashbot/ NFS directory as /data/project/stashbot/ inside the container.
Monitoring your jobs
You can see which jobs you have running with kubectl get pods. Using the name of the pod, you can see the logs with kubectl logs <pod-name>.
To restart a failing pod, use kubectl delete pod <pod-name>. If you need to kill it entirely, find the deployment name with kubectl get deployment, and delete it with kubectl delete deployment <deployment-name>.
Virtualenv and pywikibot
Some Python applications need a virtualenv to use packages that are not included in the Python image. Pywikibot, for example, needs at least the requests package to work, which is not in the python3 image. Below are the steps to create a virtualenv and install the requests package using Python 3.9.
First, we create an interactive shell inside a Kubernetes container using the python3.9 image.
tools.mytool@tools-sgebastion-10:~$ kubectl run -it shell --image=docker-registry.tools.wmflabs.org/toolforge-python39-sssd-base:latest --restart=Never --rm=true --labels="toolforge=tool" --env="HOME=$HOME" -- sh -c 'cd $HOME ; bash'
Then we create the virtualenv, activate it, install the requests package and exit the container.
tools.mytool@shell:~$ python3 -m venv venv
tools.mytool@shell:~$ source venv/bin/activate
(venv) tools.mytool@shell:~$ pip install requests
(venv) tools.mytool@shell:~$ exit
The example below is the containers section of the YAML described in the sections above; you can use it with a one-time job, a cron job, or a continuous job. It activates the virtualenv and runs a pywikibot application.
...
containers:
  - name: bot
    workingDir: /data/project/mytool
    image: docker-registry.tools.wmflabs.org/toolforge-python39-sssd-base:latest
    command: ['/bin/bash', '-c', 'source venv/bin/activate; ./myapp.py']
    env:
      - name: PYTHONPATH
        value: /data/project/shared/pywikibot/stable
To use the virtualenv, you can either activate it directly in the container command, as in the example, or create a wrapper shell script and call that script instead.
To use pywikibot, we added the environment variable PYTHONPATH=/data/project/shared/pywikibot/stable, which allows Python to import the shared pywikibot package in Toolforge.
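For the wrapper script option, the containers section might look like the sketch below. The script name run.sh and its contents are hypothetical and shown only for illustration; the script is assumed to be executable and to live in the tool's home directory.

```yaml
containers:
  - name: bot
    workingDir: /data/project/mytool
    image: docker-registry.tools.wmflabs.org/toolforge-python39-sssd-base:latest
    # run.sh is a hypothetical wrapper script, assumed to contain something like:
    #   #!/bin/bash
    #   source venv/bin/activate
    #   exec ./myapp.py
    command: ['/data/project/mytool/run.sh']
    env:
      - name: PYTHONPATH
        value: /data/project/shared/pywikibot/stable
```

Keeping the activation step in a script keeps the YAML short and lets you change how the application starts without editing the manifest.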