Portal:Toolforge/Admin/Build Service

Documentation of components and common admin procedures for Build Service.

Components

Alerts

Dashboards

You can find all the current dashboards here: https://grafana-rw.wmcloud.org/d/m9V1RQs4k/harbor-overview?orgId=1&search=open&query=folder:current&tag=buildservice


Main phabricator board

https://phabricator.wikimedia.org/project/board/5596/


Ongoing project

https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Ongoing_Efforts/Toolforge_Build_Service


Administrative tasks

Starting a service

Harbor

SSH to the Harbor instance (e.g. toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud):


dcaro@vulcanus$ wm-ssh toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud 
...
dcaro@toolsbeta-harbor-1:~$ sudo -i
root@toolsbeta-harbor-1:~# cd /srv/ops/harbor/

root@toolsbeta-harbor-1:/srv/ops/harbor# docker-compose up -d  # will start the containers that are down if any
harbor-log is up-to-date
registry is up-to-date
redis is up-to-date
harbor-portal is up-to-date
registryctl is up-to-date
harbor-core is up-to-date
harbor-jobservice is up-to-date
nginx is up-to-date
harbor-exporter is up-to-date

Buildservice API

This lives in Kubernetes, behind the API gateway. To start it, try redeploying it by following Portal:Toolforge/Admin/Kubernetes/Components#Deploy (the component is toolforge-builds-api).

You can check whether it is coming up with the usual k8s commands:


root@toolsbeta-test-k8s-control-4:~# kubectl get all -n builds-api
NAME                              READY   STATUS    RESTARTS   AGE
pod/builds-api-5bffd6b58f-9zg4s   2/2     Running   0          29h
pod/builds-api-5bffd6b58f-jk6sf   2/2     Running   0          29h

NAME                 TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
service/builds-api   ClusterIP   10.97.55.43   <none>        8443/TCP   18d

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/builds-api   2/2     2            2           18d

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/builds-api-5bffd6b58f   2         2         2       29h
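
Instead of re-running kubectl get all by hand, the wait can be scripted. The following is a minimal sketch, assuming kubectl is already configured on the control node. The helper name wait_for_rollout and the default timeout of 120s are hypothetical choices, not part of the existing tooling. It relies only on the standard kubectl rollout status subcommand.

```shell
#!/bin/sh
# Sketch: block until a deployment has finished rolling out, or time out.
# wait_for_rollout is a hypothetical helper name (assumption).
wait_for_rollout() {
    namespace="$1"
    deployment="$2"
    timeout="${3:-120s}"   # assumed default, adjust as needed

    # Exits non-zero if the rollout does not complete within the timeout.
    kubectl rollout status "deployment/${deployment}" \
        --namespace "${namespace}" "--timeout=${timeout}"
}

# Example, run on a control node:
#   wait_for_rollout builds-api builds-api
```

The same helper should work for the other k8s components by passing their namespace and deployment name.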

Tekton

Similar to the builds API, Tekton is a k8s component. You can try redeploying it too, following Portal:Toolforge/Admin/Kubernetes/Components#Deploy (the component is buildservice).

You can check whether it is coming up with the usual k8s commands:


root@toolsbeta-test-k8s-control-4:~# kubectl get all -n tekton-pipelines
NAME                                               READY   STATUS    RESTARTS   AGE
pod/tekton-pipelines-controller-5c78ddd49b-dj4hz   1/1     Running   0          57d
pod/tekton-pipelines-webhook-5d899cc8c-zwf7p       1/1     Running   0          57d

NAME                                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                              AGE
service/tekton-pipelines-controller   ClusterIP   10.96.176.235    <none>        9090/TCP,8008/TCP,8080/TCP           447d
service/tekton-pipelines-webhook      ClusterIP   10.101.163.215   <none>        9090/TCP,8008/TCP,443/TCP,8080/TCP   447d

NAME                                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/tekton-pipelines-controller   1/1     1            1           87d
deployment.apps/tekton-pipelines-webhook      1/1     1            1           87d

NAME                                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/tekton-pipelines-controller-5c78ddd49b   1         1         1       87d
replicaset.apps/tekton-pipelines-webhook-5d899cc8c       1         1         1       87d

NAME                                                           REFERENCE                             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/tekton-pipelines-webhook   Deployment/tekton-pipelines-webhook   4%/100%   1         5         1          447d

Buildpack admission controller

Also a k8s component. To start it, follow Portal:Toolforge/Admin/Kubernetes/Components#Deploy (the component is buildpack-admission-controller).


root@toolsbeta-test-k8s-control-4:~# kubectl get all -n buildpack-admission
NAME                                       READY   STATUS    RESTARTS   AGE
pod/buildpack-admission-5c87f7664f-59rg8   1/1     Running   0          7d1h
pod/buildpack-admission-5c87f7664f-796td   1/1     Running   0          7d1h

NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/buildpack-admission   ClusterIP   10.109.169.190   <none>        443/TCP   79d

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/buildpack-admission   2/2     2            2           79d

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/buildpack-admission-56d8989bdd   0         0         0       9d
replicaset.apps/buildpack-admission-5c87f7664f   2         2         2       7d1h
replicaset.apps/buildpack-admission-5fb584f788   0         0         0       77d
replicaset.apps/buildpack-admission-64b4cbfb5f   0         0         0       39d
replicaset.apps/buildpack-admission-67d574f979   0         0         0       79d
replicaset.apps/buildpack-admission-6b8db4f7b8   0         0         0       31d
replicaset.apps/buildpack-admission-7cf857d878   0         0         0       79d
replicaset.apps/buildpack-admission-c77b6b447    0         0         0       63d

Stopping a service

Harbor

SSH to the Harbor instance (e.g. toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud):


dcaro@vulcanus$ wm-ssh toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud 
...
dcaro@toolsbeta-harbor-1:~$ sudo -i
root@toolsbeta-harbor-1:~# cd /srv/ops/harbor/

root@toolsbeta-harbor-1:/srv/ops/harbor# docker-compose stop
Stopping harbor-jobservice ... done
Stopping nginx             ... done
Stopping harbor-exporter   ... done
Stopping harbor-core       ... done
Stopping registry          ... done
Stopping harbor-portal     ... done
Stopping redis             ... done
Stopping registryctl       ... done
Stopping harbor-log        ... done

See also Portal:Toolforge/Admin/Harbor.

Buildservice API

Since this is a k8s deployment, the quickest way to stop it is probably to delete the deployment itself (you will need to redeploy it to start it again).


root@toolsbeta-test-k8s-control-4:~# kubectl get deployment -n builds-api builds-api -o yaml > backup.yaml  # in case you want to restore later with kubectl apply -f backup.yaml

root@toolsbeta-test-k8s-control-4:~# kubectl delete deployment -n builds-api builds-api
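
The backup commands in this section all write to the same backup.yaml, so stopping several components from the same directory would overwrite earlier backups. The following is a minimal sketch of a wrapper that uses one file per deployment and refuses to delete when the backup is empty. The helper name backup_and_delete and the default file naming are hypothetical; the kubectl invocations are the same ones shown above.

```shell
#!/bin/sh
# Sketch: save a deployment manifest, then delete the deployment only if the
# backup file is non-empty. backup_and_delete is a hypothetical helper name.
backup_and_delete() {
    namespace="$1"
    deployment="$2"
    # Assumed naming scheme: one backup file per namespace/deployment pair.
    backup_file="${3:-${namespace}-${deployment}-backup.yaml}"

    kubectl get deployment -n "${namespace}" "${deployment}" -o yaml \
        > "${backup_file}" || return 1

    # Refuse to delete when nothing was saved.
    if [ ! -s "${backup_file}" ]; then
        echo "backup ${backup_file} is empty, not deleting" >&2
        return 1
    fi

    kubectl delete deployment -n "${namespace}" "${deployment}"
}

# Example, run on a control node:
#   backup_and_delete builds-api builds-api
```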

For a full removal (CAREFUL! Only if you know what you are doing) you can use helm:


root@toolsbeta-test-k8s-control-4:~# helm uninstall -n builds-api builds-api

Tekton

This one is a bit trickier: stopping it means removing the Tekton controller itself (the component that handles the PipelineRun and TaskRun resources).


root@toolsbeta-test-k8s-control-4:~# kubectl get deployment -n tekton-pipelines tekton-pipelines-controller -o yaml > backup.yaml  # in case you want to restore later with kubectl apply -f backup.yaml

root@toolsbeta-test-k8s-control-4:~# kubectl delete deployment -n tekton-pipelines tekton-pipelines-controller

NOTE: Tekton does not yet have a helm deployment associated with it.

Buildpack admission controller

Similar to the builds API:


root@toolsbeta-test-k8s-control-4:~# kubectl get deployment -n buildpack-admission buildpack-admission -o yaml > backup.yaml  # in case you want to restore later with kubectl apply -f backup.yaml

root@toolsbeta-test-k8s-control-4:~# kubectl delete deployment -n buildpack-admission buildpack-admission

For a full removal (CAREFUL! Only if you know what you are doing) you can use helm:


root@toolsbeta-test-k8s-control-4:~# helm uninstall -n buildpack-admission buildpack-admission

Checking all components are alive

We don't have a unified dashboard yet, so for now check each component individually.

The dashboards are a good starting point. For the rest, keep reading:


Harbor

SSH to the Harbor instance (e.g. toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud):


dcaro@vulcanus$ wm-ssh toolsbeta-harbor-1.toolsbeta.eqiad1.wikimedia.cloud 
...
dcaro@toolsbeta-harbor-1:~$ sudo -i
root@toolsbeta-harbor-1:~# cd /srv/ops/harbor/

root@toolsbeta-harbor-1:/srv/ops/harbor# docker-compose ps
      Name                     Command                  State                          Ports                    
----------------------------------------------------------------------------------------------------------------
harbor-core         /harbor/entrypoint.sh            Up (healthy)                                               
harbor-exporter     /harbor/entrypoint.sh            Up                                                         
harbor-jobservice   /harbor/entrypoint.sh            Up (healthy)                                               
harbor-log          /bin/sh -c /usr/local/bin/ ...   Up (healthy)   127.0.0.1:1514->10514/tcp                   
harbor-portal       nginx -g daemon off;             Up (healthy)                                               
nginx               nginx -g daemon off;             Up (healthy)   0.0.0.0:80->8080/tcp, 0.0.0.0:9090->9090/tcp
redis               redis-server /etc/redis.conf     Up (healthy)                                               
registry            /home/harbor/entrypoint.sh       Up (healthy)                                               
registryctl         /home/harbor/start.sh            Up (healthy)
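
To spot problem containers without reading the whole table, the output can be filtered. The following is a minimal sketch, assuming the docker-compose v1 table layout shown above (a header line, a dashed line, then one row per container). The helper name unhealthy_containers is hypothetical.

```shell
#!/bin/sh
# Sketch: read `docker-compose ps` output on stdin and print the name of every
# container that is not "Up", or that reports "(unhealthy)".
# Assumes the two-line header of the docker-compose v1 table format.
unhealthy_containers() {
    awk 'NR > 2 && NF > 0 && ($0 !~ / Up( |$)/ || $0 ~ /\(unhealthy\)/) { print $1 }'
}

# Example, run from /srv/ops/harbor:
#   docker-compose ps | unhealthy_containers
```

No output means every container row reports an Up state.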

Buildservice API

You can check its state with the usual k8s commands:


root@toolsbeta-test-k8s-control-4:~# kubectl get all -n builds-api
NAME                              READY   STATUS    RESTARTS   AGE
pod/builds-api-5bffd6b58f-9zg4s   2/2     Running   0          29h
pod/builds-api-5bffd6b58f-jk6sf   2/2     Running   0          29h

NAME                 TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
service/builds-api   ClusterIP   10.97.55.43   <none>        8443/TCP   18d

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/builds-api   2/2     2            2           18d

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/builds-api-5bffd6b58f   2         2         2       29h

Tekton

Same as before, different namespace:


root@toolsbeta-test-k8s-control-4:~# kubectl get all -n tekton-pipelines
NAME                                               READY   STATUS    RESTARTS   AGE
pod/tekton-pipelines-controller-5c78ddd49b-dj4hz   1/1     Running   0          57d
pod/tekton-pipelines-webhook-5d899cc8c-zwf7p       1/1     Running   0          57d

NAME                                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                              AGE
service/tekton-pipelines-controller   ClusterIP   10.96.176.235    <none>        9090/TCP,8008/TCP,8080/TCP           447d
service/tekton-pipelines-webhook      ClusterIP   10.101.163.215   <none>        9090/TCP,8008/TCP,443/TCP,8080/TCP   447d

NAME                                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/tekton-pipelines-controller   1/1     1            1           87d
deployment.apps/tekton-pipelines-webhook      1/1     1            1           87d

NAME                                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/tekton-pipelines-controller-5c78ddd49b   1         1         1       87d
replicaset.apps/tekton-pipelines-webhook-5d899cc8c       1         1         1       87d

NAME                                                           REFERENCE                             TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/tekton-pipelines-webhook   Deployment/tekton-pipelines-webhook   4%/100%   1         5         1          447d

Buildpack admission controller

Also a k8s component, different namespace:


root@toolsbeta-test-k8s-control-4:~# kubectl get all -n buildpack-admission
NAME                                       READY   STATUS    RESTARTS   AGE
pod/buildpack-admission-5c87f7664f-59rg8   1/1     Running   0          7d1h
pod/buildpack-admission-5c87f7664f-796td   1/1     Running   0          7d1h

NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/buildpack-admission   ClusterIP   10.109.169.190   <none>        443/TCP   79d

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/buildpack-admission   2/2     2            2           79d

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/buildpack-admission-56d8989bdd   0         0         0       9d
replicaset.apps/buildpack-admission-5c87f7664f   2         2         2       7d1h
replicaset.apps/buildpack-admission-5fb584f788   0         0         0       77d
replicaset.apps/buildpack-admission-64b4cbfb5f   0         0         0       39d
replicaset.apps/buildpack-admission-67d574f979   0         0         0       79d
replicaset.apps/buildpack-admission-6b8db4f7b8   0         0         0       31d
replicaset.apps/buildpack-admission-7cf857d878   0         0         0       79d
replicaset.apps/buildpack-admission-c77b6b447    0         0         0       63d
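
The three k8s checks above can be combined into a single pass. The following is a minimal sketch, assuming kubectl is configured on the control node and that the namespaces are the ones used in this page. The variable BUILD_SERVICE_NAMESPACES and the helper not_ready_deployments are hypothetical names.

```shell
#!/bin/sh
# Namespaces taken from the sections above.
BUILD_SERVICE_NAMESPACES="builds-api tekton-pipelines buildpack-admission"

# Sketch: print "namespace/deployment ready/desired" for every deployment
# whose READY column shows fewer ready replicas than desired.
not_ready_deployments() {
    for namespace in ${BUILD_SERVICE_NAMESPACES}; do
        kubectl get deployments -n "${namespace}" --no-headers |
            awk -v ns="${namespace}" '{
                # $2 is the READY column, e.g. "2/2"
                split($2, ready, "/")
                if (ready[1] != ready[2]) print ns "/" $1 " " $2
            }'
    done
}

# Example, run on a control node:
#   not_ready_deployments
```

No output means every deployment in those namespaces has all its replicas ready. Harbor is not covered, since it runs under docker-compose on its own instance.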