PAWS is a Jupyterhub deployment that runs in its own Kubernetes cluster (separate from the Toolforge Kubernetes cluster) in the tools project on Wikimedia Cloud VPS. It is a public service that users authenticate to via Wikimedia OAuth. More end-user info is at PAWS/Tools.

Kubernetes cluster


The PAWS Kubernetes cluster is deployed using custom scripts (the kubeadm-bootstrap repository) that use kubeadm, a tool that helps bootstrap a Kubernetes cluster on bare-metal machines.

We have the kubeadm-bootstrap repo cloned at /home/yuvipanda on tools-paws-master-01 (and since this is on NFS, it is accessible everywhere else on tools). The cluster-specific config/secrets are at ~/kubeadm-bootstrap/data.
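
The worker scripts below source data/config.bash and data/secrets.bash. As a rough sketch (the variable names are the ones used by init-worker.bash; the values are placeholders, not the real ones), those files look something like:

# data/config.bash -- non-secret, cluster-wide settings
KUBE_MASTER_IP="x.x.x.x"                   # IP of tools-paws-master-01

# data/secrets.bash -- secret material; never paste the real values on-wiki
KUBEADM_TOKEN="abcdef.0123456789abcdef"    # format produced by 'kubeadm token generate'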

The README in the kubeadm-bootstrap repo has good info on how to set up the cluster. Our cluster has small variations to accommodate installing on Debian (rather than Ubuntu Xenial, which the current kubeadm-bootstrap code is set up for).

# install-kubeadm.bash looks slightly different

yuvipanda@tools-paws-master-01:~/kubeadm-bootstrap$ git diff install-kubeadm.bash
diff --git a/install-kubeadm.bash b/install-kubeadm.bash
index c6ce3bb..3aae948 100755
--- a/install-kubeadm.bash
+++ b/install-kubeadm.bash
@@ -1,24 +1,29 @@
-apt-get update
+apt-get update
 apt-get install -y apt-transport-https
 curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
+curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
 cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
 deb http://apt.kubernetes.io/ kubernetes-xenial main
+deb [arch=amd64] https://download.docker.com/linux/ubuntu xenial stable
 EOF
 apt-get update

 # Install docker if you don't have it already.

-apt-get install -y docker-engine
+systemctl stop docker
+apt-get purge -y docker-engine
+rm -rf /var/lib/docker/*
+apt-get install -y docker-ce

-# Make sure you're using the overlay driver!
-# Note that this gives us docker 1.11, which does *not* support overlay2

 systemctl stop docker
 modprobe overlay
-echo '{"storage-driver": "overlay"}' > /etc/docker/daemon.json
 rm -rf /var/lib/docker/*
+echo '{"storage-driver": "overlay2"}' > /etc/docker/daemon.json
 systemctl start docker

 # Install kubernetes components!
 apt-get install -y kubelet kubeadm kubernetes-cni
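
After the patched install-kubeadm.bash has run, it is worth a quick sanity check that Docker really is using the overlay2 storage driver (not part of the scripts themselves, just a verification step):

# should print "overlay2"; if not, re-check /etc/docker/daemon.json and restart docker
docker info --format '{{.Driver}}'
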
# init-worker.bash looks slightly different too:

yuvipanda@tools-paws-master-01:~/kubeadm-bootstrap$ git diff init-worker.bash
diff --git a/init-worker.bash b/init-worker.bash
index 579e9fa..36557ee 100755
--- a/init-worker.bash
+++ b/init-worker.bash
@@ -5,4 +5,4 @@ set -e
 source data/config.bash
 source data/secrets.bash

-kubeadm join --token "${KUBEADM_TOKEN}"  "${KUBE_MASTER_IP}":6443
+kubeadm join --skip-preflight-checks --token "${KUBEADM_TOKEN}"  "${KUBE_MASTER_IP}":6443
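
Putting the two scripts together, bringing an additional worker node into the cluster roughly looks like this (a sketch based on the kubeadm-bootstrap README; run from the repo checkout on the new instance, as root or via sudo):

# on the new worker instance
bash install-kubeadm.bash    # docker-ce, kubelet, kubeadm etc., as patched above
bash init-worker.bash        # sources data/config.bash + data/secrets.bash, then kubeadm join

# back on tools-paws-master-01, the node should appear shortly
kubectl get node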

We also have a change-docker.bash script:

yuvipanda@tools-paws-master-01:~/kubeadm-bootstrap$ cat change-docker.bash
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
sudo add-apt-repository \
       "deb [arch=amd64] https://download.docker.com/linux/debian \
          $(lsb_release -cs) \
          stable"
sudo apt-get update

All the above changes should be tracked persistently in git somewhere; they are logged here until that is the case.


[Coming Soon]

Current Setup

  • Currently, admins need to sudo to the yuvipanda user to run kubectl on this k8s cluster.
  • The k8s master is tools-paws-master-01.
  • For a summary of the k8s nodes, run kubectl get node:
    yuvipanda@tools-paws-master-01:~$ kubectl get node
    NAME                     STATUS                     AGE       VERSION
    tools-paws-master-01     Ready                      113d      v1.7.3
    tools-paws-worker-1001   Ready                      35d       v1.8.0
    tools-paws-worker-1002   Ready                      98d       v1.7.3
    tools-paws-worker-1003   Ready                      98d       v1.7.3
    tools-paws-worker-1005   Ready                      98d       v1.7.3
    tools-paws-worker-1006   Ready                      98d       v1.7.3
    tools-paws-worker-1007   Ready                      98d       v1.7.3
    tools-paws-worker-1010   Ready                      98d       v1.7.3
    tools-paws-worker-1013   Ready                      98d       v1.7.3
    tools-paws-worker-1016   Ready                      98d       v1.7.3
    tools-paws-worker-1017   Ready,SchedulingDisabled   98d       v1.7.3
    tools-paws-worker-1019   Ready                      35d       v1.8.0
  • Helm is used to deploy Kubernetes applications on the cluster. It is installed during the cluster bootstrap process, and is in turn used to install the nginx-ingress and kube-lego add-ons. Helm has two parts: a client (helm) and a server (tiller); see the example commands after this list.
  • To see the status of the k8s control plane pods (kube-dns, kube-proxy, flannel, etcd, kube-apiserver, kube-controller-manager, tiller), run kubectl --namespace=kube-system get pod -o wide.
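
Some quick commands for poking at Helm and tiller (nothing PAWS-specific, just the standard Helm 2 / kubectl invocations):

helm version        # prints both the client (helm) and server (tiller) versions
helm list           # releases installed via helm, e.g. the nginx-ingress and kube-lego add-ons
kubectl --namespace=kube-system get deploy tiller-deploy    # tiller runs as an ordinary deployment in kube-system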

Jupyterhub deployment

Jupyterhub & PAWS Components

Jupyterhub is a set of systems deployed together that provide per-user Jupyter notebook servers. The three main subsystems are the Hub, the Proxy, and the Single-User Notebook Server. A good overview of these systems is available in the upstream JupyterHub documentation.
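
To see how those subsystems map onto this cluster, the hub, proxy, and per-user notebook pods can be listed with kubectl; something along these lines works without needing to remember which namespace the release is installed into (per-user pods are named jupyter-<username> by the spawner):

kubectl get pod --all-namespaces | grep -E 'hub|proxy|jupyter'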

PAWS is a Jupyterhub deployment (Hub, Proxy, Single-User Notebook Server) with some added bells and whistles. The additional PAWS-specific parts of our deployment are:

  • db-proxy: a mysql-proxy plugin script that performs simple authentication to the Wiki Replicas.
  • query-killer: uses pt-kill to kill MySQL queries running longer than 30 minutes (see the sketch after this list).
  • nbserve and render: nbserve is the nginx proxy that handles URL rewriting for paws-public URLs, and render handles the actual rendering of an ipynb notebook as a static page. Together they make paws-public possible.
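
As an illustration of what the query-killer does (not its exact configuration, which lives in the deployment itself), a pt-kill invocation along these lines kills queries that have been busy for more than 30 minutes:

# hypothetical invocation; <replica-host> and <user> are placeholders, not the real credentials
pt-kill --host <replica-host> --user <user> --ask-pass \
        --match-command Query --busy-time 1800 \
        --interval 60 --kill --print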