Portal:Toolforge/Admin/Build Service
Documentation of components and common admin procedures for Build Service.
Components
- Builds client (source code): main entrypoint for users
- Builds API (source code): main entrypoint for clients (users reach it through the CLI)
- Builds builder (source code): underlying building system
- Builds admission controller (source code): validates that the Tekton TaskRuns we create meet our criteria (builder image used, parameters passed, ...)
- Harbor (puppet code): Hosts the images the users create
- Tools: https://tools-harbor.wmcloud.org
- Toolsbeta: https://toolsbeta-harbor.wmcloud.org
Alerts
- From the cloud UI: https://prometheus-alerts.wmcloud.org/?q=%40state%3Dactive&q=project%3D~^%28tools%7Ctoolsbeta%29
- From the prod UI: https://alerts.wikimedia.org/?q=team%3Dwmcs&q=project%3D~%28tools%7Ctoolsbeta%29
Dashboards
You can find all the current dashboards here: https://grafana-rw.wmcloud.org/d/m9V1RQs4k/harbor-overview?orgId=1&search=open&query=folder:current&tag=buildservice
Main phabricator board
https://phabricator.wikimedia.org/project/board/5596/
Ongoing project
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Ongoing_Efforts/Toolforge_Build_Service
Administrative tasks
Starting a service
Harbor
SSH to the Harbor instance (e.g. toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud):
dcaro@vulcanus$ wm-ssh toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud
...
dcaro@toolsbeta-harbor-1:~$ sudo -i
root@toolsbeta-harbor-1:~# cd /srv/ops/harbor/
root@toolsbeta-harbor-1:/srv/ops/harbor# docker-compose up -d # will start the containers that are down if any
harbor-log is up-to-date
registry is up-to-date
redis is up-to-date
harbor-portal is up-to-date
registryctl is up-to-date
harbor-core is up-to-date
harbor-jobservice is up-to-date
nginx is up-to-date
harbor-exporter is up-to-date
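Once the containers are up, one way to confirm that Harbor is actually answering requests is to query its health endpoint. The following is a minimal sketch, assuming the Harbor v2 API serves `/api/v2.0/health` on the Toolsbeta URL listed under Components; `harbor_health_ok` is a hypothetical helper name, and it does a plain-text check on the response rather than parsing the JSON.

```shell
# Sketch: decide whether a Harbor health response reports everything healthy.
# Reads the response body on stdin; succeeds only if the body mentions
# "healthy" and never mentions "unhealthy".
harbor_health_ok() {
    case "$(cat)" in
        *unhealthy*) return 1 ;;    # at least one component is unhealthy
        *'"healthy"'*) return 0 ;;  # healthy, and nothing unhealthy was seen
        *) return 1 ;;              # empty or unexpected response
    esac
}

# Usage (the URL path is an assumption about the Harbor v2 API):
#   curl -fsS https://toolsbeta-harbor.wmcloud.org/api/v2.0/health | harbor_health_ok && echo OK
```

The "unhealthy" pattern is tested first because the word also contains "healthy".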
Buildservice API
This lives in Kubernetes, behind the API gateway. To start it, try redeploying it by following Portal:Toolforge/Admin/Kubernetes/Components#Deploy (the component is toolforge-builds-api).
You can monitor if it's coming up with the usual k8s commands:
root@toolsbeta-test-k8s-control-4:~# kubectl get all -n builds-api
NAME READY STATUS RESTARTS AGE
pod/builds-api-5bffd6b58f-9zg4s 2/2 Running 0 29h
pod/builds-api-5bffd6b58f-jk6sf 2/2 Running 0 29h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/builds-api ClusterIP 10.97.55.43 <none> 8443/TCP 18d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/builds-api 2/2 2 2 18d
NAME DESIRED CURRENT READY AGE
replicaset.apps/builds-api-5bffd6b58f 2 2 2 29h
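Instead of re-running `kubectl get all` by hand, you can block until the rollout finishes. This is a small sketch around `kubectl rollout status`; the function name `wait_for_rollout` and the 120s default timeout are arbitrary choices made for illustration.

```shell
# Sketch: block until a deployment finishes rolling out, or fail on timeout.
# Arguments: namespace, deployment name, optional timeout (default 120s).
wait_for_rollout() {
    kubectl rollout status "deployment/$2" -n "$1" --timeout="${3:-120s}"
}

# Usage:
#   wait_for_rollout builds-api builds-api
```

The same helper should work for the other k8s components by changing the namespace and deployment name.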
Tekton
Like the builds API, Tekton is a k8s component. You can try redeploying it too by following Portal:Toolforge/Admin/Kubernetes/Components#Deploy (the component is buildservice).
You can monitor if it's coming up with the usual k8s commands:
root@toolsbeta-test-k8s-control-4:~# kubectl get all -n tekton-pipelines
NAME READY STATUS RESTARTS AGE
pod/tekton-pipelines-controller-5c78ddd49b-dj4hz 1/1 Running 0 57d
pod/tekton-pipelines-webhook-5d899cc8c-zwf7p 1/1 Running 0 57d
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/tekton-pipelines-controller ClusterIP 10.96.176.235 <none> 9090/TCP,8008/TCP,8080/TCP 447d
service/tekton-pipelines-webhook ClusterIP 10.101.163.215 <none> 9090/TCP,8008/TCP,443/TCP,8080/TCP 447d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/tekton-pipelines-controller 1/1 1 1 87d
deployment.apps/tekton-pipelines-webhook 1/1 1 1 87d
NAME DESIRED CURRENT READY AGE
replicaset.apps/tekton-pipelines-controller-5c78ddd49b 1 1 1 87d
replicaset.apps/tekton-pipelines-webhook-5d899cc8c 1 1 1 87d
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
horizontalpodautoscaler.autoscaling/tekton-pipelines-webhook Deployment/tekton-pipelines-webhook 4%/100% 1 5 1 447d
Buildpack admission controller
Also a k8s component, follow Portal:Toolforge/Admin/Kubernetes/Components#Deploy (the component is buildpack-admission-controller).
root@toolsbeta-test-k8s-control-4:~# kubectl get all -n buildpack-admission
NAME READY STATUS RESTARTS AGE
pod/buildpack-admission-5c87f7664f-59rg8 1/1 Running 0 7d1h
pod/buildpack-admission-5c87f7664f-796td 1/1 Running 0 7d1h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/buildpack-admission ClusterIP 10.109.169.190 <none> 443/TCP 79d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/buildpack-admission 2/2 2 2 79d
NAME DESIRED CURRENT READY AGE
replicaset.apps/buildpack-admission-56d8989bdd 0 0 0 9d
replicaset.apps/buildpack-admission-5c87f7664f 2 2 2 7d1h
replicaset.apps/buildpack-admission-5fb584f788 0 0 0 77d
replicaset.apps/buildpack-admission-64b4cbfb5f 0 0 0 39d
replicaset.apps/buildpack-admission-67d574f979 0 0 0 79d
replicaset.apps/buildpack-admission-6b8db4f7b8 0 0 0 31d
replicaset.apps/buildpack-admission-7cf857d878 0 0 0 79d
replicaset.apps/buildpack-admission-c77b6b447 0 0 0 63d
Stopping a service
Harbor
SSH to the Harbor instance (e.g. toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud):
dcaro@vulcanus$ wm-ssh toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud
...
dcaro@toolsbeta-harbor-1:~$ sudo -i
root@toolsbeta-harbor-1:~# cd /srv/ops/harbor/
root@toolsbeta-harbor-1:/srv/ops/harbor# docker-compose stop
Stopping harbor-jobservice ... done
Stopping nginx ... done
Stopping harbor-exporter ... done
Stopping harbor-core ... done
Stopping registry ... done
Stopping harbor-portal ... done
Stopping redis ... done
Stopping registryctl ... done
Stopping harbor-log ... done
See also Portal:Toolforge/Admin/Harbor.
Buildservice API
Since this is a k8s deployment, the quickest way to stop it is probably to delete the deployment itself (starting it again will require a redeploy).
root@toolsbeta-test-k8s-control-4:~# kubectl get deployment -n builds-api builds-api -o yaml > backup.yaml # in case you want to restore later with kubectl apply -f backup.yaml
root@toolsbeta-test-k8s-control-4:~# kubectl delete deployment -n builds-api builds-api
For a full removal (CAREFUL! Only if you know what you are doing) you can use helm:
root@toolsbeta-test-k8s-control-4:~# helm uninstall -n builds-api builds-api
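The backup and delete steps above can be combined so that the deployment is only deleted when a backup was actually written. This is a minimal sketch, not an established procedure; `backup_and_delete_deployment` is a hypothetical helper name.

```shell
# Sketch: save a deployment manifest, and delete the deployment only if
# the backup succeeded and the backup file is non-empty.
# Arguments: namespace, deployment name, backup file path.
backup_and_delete_deployment() {
    if ! kubectl get deployment -n "$1" "$2" -o yaml > "$3"; then
        echo "backup failed, not deleting $2" >&2
        return 1
    fi
    if [ ! -s "$3" ]; then
        echo "backup file $3 is empty, not deleting $2" >&2
        return 1
    fi
    kubectl delete deployment -n "$1" "$2"
}

# Usage:
#   backup_and_delete_deployment builds-api builds-api backup.yaml
```

Restoring afterwards is still `kubectl apply -f backup.yaml`, as noted in the manual commands.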
Tekton
This one is a bit trickier: stopping Tekton means removing the Tekton controller itself (the one that handles the PipelineRun and TaskRun resources).
root@toolsbeta-test-k8s-control-4:~# kubectl get deployment -n tekton-pipelines tekton-pipelines-controller -o yaml > backup.yaml # in case you want to restore later with kubectl apply -f backup.yaml
root@toolsbeta-test-k8s-control-4:~# kubectl delete deployment -n tekton-pipelines tekton-pipelines-controller
NOTE: Tekton does not yet have a helm deployment associated with it.
Buildpack admission controller
Similar to the builds API:
root@toolsbeta-test-k8s-control-4:~# kubectl get deployment -n buildpack-admission buildpack-admission -o yaml > backup.yaml # in case you want to restore later with kubectl apply -f backup.yaml
root@toolsbeta-test-k8s-control-4:~# kubectl delete deployment -n buildpack-admission buildpack-admission
For a full removal (CAREFUL! Only if you know what you are doing) you can use helm:
root@toolsbeta-test-k8s-control-4:~# helm uninstall -n buildpack-admission buildpack-admission
Checking all components are alive
We don't have a unified dashboard yet, but for now you can check each component individually.
You can check the dashboards as a starting point. For the rest keep reading:
Harbor
SSH to the Harbor instance (e.g. toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud):
dcaro@vulcanus$ wm-ssh toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud
...
dcaro@toolsbeta-harbor-1:~$ sudo -i
root@toolsbeta-harbor-1:~# cd /srv/ops/harbor/
root@toolsbeta-harbor-1:/srv/ops/harbor# docker-compose ps
Name Command State Ports
----------------------------------------------------------------------------------------------------------------
harbor-core /harbor/entrypoint.sh Up (healthy)
harbor-exporter /harbor/entrypoint.sh Up
harbor-jobservice /harbor/entrypoint.sh Up (healthy)
harbor-log /bin/sh -c /usr/local/bin/ ... Up (healthy) 127.0.0.1:1514->10514/tcp
harbor-portal nginx -g daemon off; Up (healthy)
nginx nginx -g daemon off; Up (healthy) 0.0.0.0:80->8080/tcp, 0.0.0.0:9090->9090/tcp
redis redis-server /etc/redis.conf Up (healthy)
registry /home/harbor/entrypoint.sh Up (healthy)
registryctl /home/harbor/start.sh Up (healthy)
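To avoid scanning that table by eye, the output can be filtered for containers that are not healthy. This is a minimal sketch that assumes the two-line header of the table format shown above; `harbor_down_services` is a hypothetical helper name.

```shell
# Sketch: read `docker-compose ps` output on stdin and print the name of
# every container that is not "Up", or that is "Up (unhealthy)".
# The first two lines (column titles and dashes) are skipped.
harbor_down_services() {
    awk 'NR > 2 && NF > 0 && ($0 !~ / Up( |$)/ || /unhealthy/) { print $1 }'
}

# Usage, from /srv/ops/harbor:
#   docker-compose ps | harbor_down_services
```

No output means every container is reported as up.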
Buildservice API
You can check its state with the usual k8s commands:
root@toolsbeta-test-k8s-control-4:~# kubectl get all -n builds-api
NAME READY STATUS RESTARTS AGE
pod/builds-api-5bffd6b58f-9zg4s 2/2 Running 0 29h
pod/builds-api-5bffd6b58f-jk6sf 2/2 Running 0 29h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/builds-api ClusterIP 10.97.55.43 <none> 8443/TCP 18d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/builds-api 2/2 2 2 18d
NAME DESIRED CURRENT READY AGE
replicaset.apps/builds-api-5bffd6b58f 2 2 2 29h
Tekton
Same as before, different namespace:
root@toolsbeta-test-k8s-control-4:~# kubectl get all -n tekton-pipelines
NAME READY STATUS RESTARTS AGE
pod/tekton-pipelines-controller-5c78ddd49b-dj4hz 1/1 Running 0 57d
pod/tekton-pipelines-webhook-5d899cc8c-zwf7p 1/1 Running 0 57d
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/tekton-pipelines-controller ClusterIP 10.96.176.235 <none> 9090/TCP,8008/TCP,8080/TCP 447d
service/tekton-pipelines-webhook ClusterIP 10.101.163.215 <none> 9090/TCP,8008/TCP,443/TCP,8080/TCP 447d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/tekton-pipelines-controller 1/1 1 1 87d
deployment.apps/tekton-pipelines-webhook 1/1 1 1 87d
NAME DESIRED CURRENT READY AGE
replicaset.apps/tekton-pipelines-controller-5c78ddd49b 1 1 1 87d
replicaset.apps/tekton-pipelines-webhook-5d899cc8c 1 1 1 87d
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
horizontalpodautoscaler.autoscaling/tekton-pipelines-webhook Deployment/tekton-pipelines-webhook 4%/100% 1 5 1 447d
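When a namespace has many pods, it can help to print only the ones that need attention. This is a minimal sketch, assuming the default column order of `kubectl get pods --no-headers` (NAME READY STATUS ...); `unready_pods` is a hypothetical helper name. Note that it would also flag pods in Completed state, which is probably fine for the namespaces on this page.

```shell
# Sketch: read `kubectl get pods --no-headers` output on stdin and print
# every pod that is not Running with all of its containers ready.
unready_pods() {
    awk '{
        split($2, ready, "/")   # READY column, e.g. "1/1"
        if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
    }'
}

# Usage:
#   kubectl get pods -n tekton-pipelines --no-headers | unready_pods
```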
Buildpack admission controller
Also a k8s component, different namespace:
root@toolsbeta-test-k8s-control-4:~# kubectl get all -n buildpack-admission
NAME READY STATUS RESTARTS AGE
pod/buildpack-admission-5c87f7664f-59rg8 1/1 Running 0 7d1h
pod/buildpack-admission-5c87f7664f-796td 1/1 Running 0 7d1h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/buildpack-admission ClusterIP 10.109.169.190 <none> 443/TCP 79d
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/buildpack-admission 2/2 2 2 79d
NAME DESIRED CURRENT READY AGE
replicaset.apps/buildpack-admission-56d8989bdd 0 0 0 9d
replicaset.apps/buildpack-admission-5c87f7664f 2 2 2 7d1h
replicaset.apps/buildpack-admission-5fb584f788 0 0 0 77d
replicaset.apps/buildpack-admission-64b4cbfb5f 0 0 0 39d
replicaset.apps/buildpack-admission-67d574f979 0 0 0 79d
replicaset.apps/buildpack-admission-6b8db4f7b8 0 0 0 31d
replicaset.apps/buildpack-admission-7cf857d878 0 0 0 79d
replicaset.apps/buildpack-admission-c77b6b447 0 0 0 63d
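The three k8s checks can be run in one pass. The sketch below loops over the namespaces used on this page and reports any deployment whose READY column is not N/N. It is only an illustration: `check_buildservice_deployments` is a hypothetical helper name, and it assumes the default column order of `kubectl get deployments --no-headers`.

```shell
# Sketch: check the deployments in the three Build Service namespaces and
# print any that are not fully ready, as "namespace/name READY".
# Returns non-zero if at least one deployment is not fully ready.
check_buildservice_deployments() {
    not_ready="$(
        for ns in builds-api tekton-pipelines buildpack-admission; do
            kubectl get deployments -n "$ns" --no-headers |
                awk -v ns="$ns" '{
                    split($2, ready, "/")
                    if (ready[1] != ready[2]) print ns "/" $1, $2
                }'
        done
    )"
    if [ -n "$not_ready" ]; then
        echo "$not_ready"
        return 1
    fi
}

# Usage:
#   check_buildservice_deployments && echo "all deployments ready"
```

Harbor is not covered by this loop, since it runs under docker-compose rather than in the cluster.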