Portal:Toolforge/Admin/Kubernetes/Docker-registry
This page describes the Docker registry setup in the Toolforge platform.
Deployment components and architecture
Information on how the setup is deployed, and the different components.
Servers
The setup is generally composed of 2 types of servers:
- docker builder - to build docker images which are then uploaded to the registry. Usually named tools-docker-builder-XX.tools.eqiad1.wikimedia.cloud.
- docker registry - to hold and serve the docker registry itself. Usually named tools-docker-registry-XX.tools.eqiad1.wikimedia.cloud.
One builder server is enough for the workload we handle.
The docker registry usually runs on a single server, with a cinder volume attached to store the actual registry data.
DNS, addressing and SSL
In order for Toolforge kubernetes users to be able to use the registry, there is a public FQDN called docker-registry.tools.wmflabs.org.
This FQDN resolves to a single IPv4 public address. This IPv4 address is a floating IP from CloudVPS, which is allocated and assigned to the docker registry server.
The DNS setup can be fully managed using Horizon (tools.wmflabs.org zone).
The registry servers use an SSL certificate for *.tools.wmflabs.org (managed by acme-chief), which is served by the nginx daemon.
Also, please note that registry servers require an OpenStack security group that allows them to work as web servers.
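A minimal sketch for checking the DNS record and the certificate served by nginx, from any host that can reach the public address. The function names are hypothetical helpers, not part of any Toolforge tooling, and the sketch assumes `getent` and `openssl` are installed; only the FQDN comes from this page.

```shell
# Hypothetical helpers; only the FQDN below is taken from this page.
REGISTRY_FQDN="docker-registry.tools.wmflabs.org"

# Print the IPv4 address(es) the registry FQDN resolves to.
registry_ipv4() {
    getent ahostsv4 "${1:-$REGISTRY_FQDN}" | awk '{ print $1 }' | sort -u
}

# Strip the "notAfter=" prefix from `openssl x509 -noout -enddate` output.
parse_enddate() {
    sed -n 's/^notAfter=//p'
}

# Print the expiry date of the certificate served on port 443.
registry_cert_enddate() {
    local host="${1:-$REGISTRY_FQDN}"
    openssl s_client -connect "${host}:443" -servername "$host" </dev/null 2>/dev/null \
        | openssl x509 -noout -enddate \
        | parse_enddate
}
```

Running `registry_ipv4` should print the single floating IP, and `registry_cert_enddate` the date after which acme-chief must have renewed the certificate.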
Registry content
The content of the registry is generated from the operations/docker-images/toollabs-images.git repository. There should be a checkout of this at /srv/images/toollabs-images on the docker builder server.
Information on how to generate the images can be read in the "building toolforge specific images" page. TODO: put that content here?
Once the docker images are in the registry, they are served immediately to users by means of the docker-registry daemon/service and nginx. Registry data is physically stored on the local disk of the server at /srv/registry, which is a cinder volume mount (usually named tools-docker-registry-data).
At the time of writing, the stored docker registry data is about 70GB in size.
Puppet
We use the role role::wmcs::toolforge::docker::registry and especially profile::toolforge::docker::registry.
The docker::registry module, and related code, is in use by this setup.
There are a couple of important hiera keys related to this deployment, which should be set in Horizon (prefix and/or project):
docker::registry_url: https://tools-docker-registry.wmflabs.org/v2/
docker::registry: docker-registry.tools.wmflabs.org
docker::builder_host: tools-docker-builder-06.tools.eqiad1.wikimedia.cloud
System administration
Information on maintenance and administration of the setup.
Health
We don't have any specific monitoring or alerting setups for these hosts. TODO: this could be improved.
A basic health check consists of the following steps:
- check the registry catalog URL: https://docker-registry.tools.wmflabs.org/v2/_catalog It should show a list of available images in the registry.
- check disk usage on the servers:
aborrero@tools-docker-registry-03:/srv/registry$ df -h /srv
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vd-second--local--disk 138G 38G 93G 29% /srv
- check status of services/daemons in the registry servers:
Command examples:
aborrero@tools-docker-registry-03:/srv/registry$ sudo systemctl status docker-registry.service
● docker-registry.service - the Docker toolset to pack, ship, store, and deliver content
Loaded: loaded (/lib/systemd/system/docker-registry.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2019-01-15 14:17:00 UTC; 1 day 3h ago
Main PID: 11218 (docker-registry)
Tasks: 16 (limit: 4915)
CGroup: /system.slice/docker-registry.service
└─11218 /usr/bin/docker-registry serve /etc/docker/registry/config.yml
Jan 16 16:36:04 tools-docker-registry-03 docker-registry[11218]: time="2019-01-16T16:36:04.076483358Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.tools.wmflabs.org
Jan 16 16:36:04 tools-docker-registry-03 docker-registry[11218]: 127.0.0.1 - - [16/Jan/2019:16:36:04 +0000] "GET /v2/toollabs-php-base/blobs/sha256:f054dcacf5320ad30ba01fc4b6e8155255bb79e506957d415e8b398e7316f5c
Jan 16 16:36:04 tools-docker-registry-03 docker-registry[11218]: time="2019-01-16T16:36:04.704470881Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.tools.wmflabs.org
Jan 16 16:36:04 tools-docker-registry-03 docker-registry[11218]: 127.0.0.1 - - [16/Jan/2019:16:36:04 +0000] "GET /v2/toollabs-php-base/blobs/sha256:03fcb53c4afdf0f7dec4aa6982f9bf334dc13e51449424cc1f91ca217466bca
Jan 16 16:36:05 tools-docker-registry-03 docker-registry[11218]: time="2019-01-16T16:36:05.766415368Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.tools.wmflabs.org
Jan 16 16:36:05 tools-docker-registry-03 docker-registry[11218]: 127.0.0.1 - - [16/Jan/2019:16:36:04 +0000] "GET /v2/toollabs-php-base/blobs/sha256:f9ac56ca2c38bfef9c6b2ea11ef611ff87e43458a42005c6df79fd0de664264
Jan 16 16:36:08 tools-docker-registry-03 docker-registry[11218]: time="2019-01-16T16:36:08.120073611Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.tools.wmflabs.org
Jan 16 16:36:08 tools-docker-registry-03 docker-registry[11218]: 127.0.0.1 - - [16/Jan/2019:16:36:03 +0000] "GET /v2/toollabs-php-base/blobs/sha256:935379b435eb391149091a293dc5f497cb6d6c76a2d74af78ac44a3c3c0e8a0
Jan 16 16:54:12 tools-docker-registry-03 docker-registry[11218]: time="2019-01-16T16:54:12.414173447Z" level=info msg="response completed" go.version=go1.11.4 http.request.host=docker-registry.tools.wmflabs.org
Jan 16 16:54:12 tools-docker-registry-03 docker-registry[11218]: 127.0.0.1 - - [16/Jan/2019:16:54:12 +0000] "GET /v2/_catalog HTTP/1.1" 200 868 "" "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox
aborrero@tools-docker-registry-03:/srv/registry$ sudo systemctl status rsync.service
● rsync.service - fast remote file copy program daemon
Loaded: loaded (/lib/systemd/system/rsync.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2019-01-16 12:27:16 UTC; 5h 6min ago
Main PID: 3820 (rsync)
Tasks: 1 (limit: 4915)
CGroup: /system.slice/rsync.service
└─3820 /usr/bin/rsync --daemon --no-detach
Jan 16 17:00:01 tools-docker-registry-03 rsyncd[12107]: building file list
Jan 16 17:10:01 tools-docker-registry-03 rsyncd[12227]: connect from tools-docker-registry-04.tools.eqiad1.wikimedia.cloud (172.16.7.217)
Jan 16 17:10:01 tools-docker-registry-03 rsyncd[12227]: rsync on docker-registry-sync/ from tools-docker-registry-04.tools.eqiad1.wikimedia.cloud (172.16.7.217)
Jan 16 17:10:01 tools-docker-registry-03 rsyncd[12227]: building file list
Jan 16 17:20:01 tools-docker-registry-03 rsyncd[13185]: connect from tools-docker-registry-04.tools.eqiad1.wikimedia.cloud (172.16.7.217)
Jan 16 17:20:01 tools-docker-registry-03 rsyncd[13185]: rsync on docker-registry-sync/ from tools-docker-registry-04.tools.eqiad1.wikimedia.cloud (172.16.7.217)
Jan 16 17:20:01 tools-docker-registry-03 rsyncd[13185]: building file list
Jan 16 17:30:02 tools-docker-registry-03 rsyncd[13272]: connect from tools-docker-registry-04.tools.eqiad1.wikimedia.cloud (172.16.7.217)
Jan 16 17:30:02 tools-docker-registry-03 rsyncd[13272]: rsync on docker-registry-sync/ from tools-docker-registry-04.tools.eqiad1.wikimedia.cloud (172.16.7.217)
Jan 16 17:30:02 tools-docker-registry-03 rsyncd[13272]: building file list
aborrero@tools-docker-registry-03:/srv/registry$ sudo systemctl status nginx.service
● nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2019-01-15 14:37:11 UTC; 1 day 2h ago
Docs: man:nginx(8)
Main PID: 14557 (nginx)
Tasks: 9 (limit: 4915)
CGroup: /system.slice/nginx.service
├─14557 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
├─14558 nginx: worker process
├─14559 nginx: worker process
├─14560 nginx: worker process
├─14561 nginx: worker process
├─14562 nginx: worker process
├─14563 nginx: worker process
├─14564 nginx: worker process
└─14565 nginx: worker process
Jan 15 14:37:11 tools-docker-registry-03 systemd[1]: Starting A high performance web server and a reverse proxy server...
Jan 15 14:37:11 tools-docker-registry-03 systemd[1]: Started A high performance web server and a reverse proxy server.
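The three checks above can be collected into a few shell helpers. This is a minimal sketch, not an existing Toolforge script: the function names are hypothetical, the JSON parsing assumes the flat `{"repositories":[...]}` document returned by the catalog URL, and the unit names are the ones shown in the command examples.

```shell
# Print repository names, one per line, from a /v2/_catalog JSON document on stdin.
# Assumes the flat {"repositories":[...]} shape; not a general JSON parser.
catalog_repos() {
    tr -d '\n' \
        | sed -e 's/.*"repositories"[[:space:]]*:[[:space:]]*\[//' -e 's/].*//' \
        | tr ',' '\n' \
        | sed -e 's/^[[:space:]]*"//' -e 's/"[[:space:]]*$//' \
        | awk 'NF'
}

# Print the usage percentage (without the % sign) of the filesystem holding $1.
disk_usage_pct() {
    df -P "${1:-/srv}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report whether each service used by the registry is active.
check_services() {
    local unit
    for unit in docker-registry.service nginx.service rsync.service; do
        if systemctl is-active --quiet "$unit"; then
            echo "OK   $unit"
        else
            echo "FAIL $unit"
        fi
    done
}
```

For example, `curl -fsS https://docker-registry.tools.wmflabs.org/v2/_catalog | catalog_repos` lists the images, while `disk_usage_pct /srv` and `check_services` are meant to be run on the registry server itself.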
Registry failover
We don't have a specific failover mechanism other than pointing the DNS name to another registry server.
Care should be taken not to lose registry data, since generating it from scratch can take some time. That's why there is a cinder volume that we can re-attach to other VMs.
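As a rough sketch of what a failover could involve, the helper below only prints OpenStack CLI commands for review; it does not run them. Everything here is an assumption rather than a documented procedure: the function name is hypothetical, the same steps can be done in Horizon, and moving the floating IP is shown as an alternative to editing the DNS record. The registry service should be stopped and the volume unmounted on the old server before detaching it.

```shell
# Print (do not run) the commands to move the data volume and the
# floating IP from the old registry VM to the new one.
# Arguments: old server, new server, volume name, floating IP.
failover_plan() {
    local old="$1" new="$2" volume="$3" ip="$4"
    echo "openstack server remove volume ${old} ${volume}"
    echo "openstack server add volume ${new} ${volume}"
    echo "openstack server remove floating ip ${old} ${ip}"
    echo "openstack server add floating ip ${new} ${ip}"
}
```

Printing the plan first makes it easy to double-check the server and volume names before any data is moved.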
Updating SSL certificate
Managed by acme-chief.
Other operations
Other operations that you may want to perform on these servers.
Uploading custom docker images from an external registry
Other than Toolforge base images, you may want to copy some other upstream docker images into the registry. To do this, follow these steps.
Automated procedure:
user@cloudcumin1001:~$ sudo cookbook wmcs.toolforge.k8s.image.copy_to_registry [...]
Manual procedure:
- pull the docker image into a tools-docker-builder server.
- tag the downloaded image with the new registry name.
- push the docker image into the new registry.
In this example, the nginx-ingress-controller image is copied to the internal registry.
user@tools-docker-imagebuilder-01:~$ sudo docker pull quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1
0.25.1: Pulling from kubernetes-ingress-controller/nginx-ingress-controller
6f3771171c5a: Pull complete
c0f4bf67891e: Download complete
d800a815826b: Download complete
a5f64c9a48d8: Downloading [=================================> ] 87.78 MB/131.4 MB
c0f4bf67891e: Pull complete
d800a815826b: Pull complete
a5f64c9a48d8: Pull complete
85eb3bab5b30: Pull complete
3b5969ca8957: Pull complete
817ea478bbe8: Pull complete
d805ea7d2e5b: Pull complete
52a9aaa24508: Pull complete
b88dd7367a51: Pull complete
1863722e8548: Pull complete
4c628964bc02: Pull complete
Digest: sha256:0c4941fa8c812dd44297b5f4900e3b26c3e6a8a42940e48fe9a1a585fe8f7e25
Status: Downloaded newer image for quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1
user@tools-docker-imagebuilder-01:~$ sudo docker images | grep nginx
quay.io/kubernetes-ingress-controller/nginx-ingress-controller 0.25.1 0439eb3e11f1 10 weeks ago 510.7 MB
user@tools-docker-imagebuilder-01:~$ sudo docker tag 0439eb3e11f1 docker-registry.tools.wmflabs.org/nginx-ingress-controller:0.25.1
root@tools-docker-imagebuilder-01:~# docker push docker-registry.tools.wmflabs.org/nginx-ingress-controller:0.25.1
The push refers to a repository [docker-registry.tools.wmflabs.org/nginx-ingress-controller]
2fcc832a6182: Pushed
1a919be260e2: Pushed
144091cf2f8e: Pushed
8477ce4e751b: Pushed
5d3f0729b342: Pushed
095e01840c9e: Pushed
3e65873fb5c2: Pushed
72b39ef0b6f4: Pushed
b3c716deb2a6: Pushed
68fdb45b95ce: Pushed
3ccb3fedf4ab: Pushed
2e83df525186: Pushed
latest: digest: sha256:b9cd638b8849f25210740b075d27ef2e55ffd2861488ead98276aa70b8a859ab size: 2838
user@tools-docker-imagebuilder-01:~$ sudo docker images | grep nginx
docker-registry.tools.wmflabs.org/nginx-ingress-controller 0.25.1 0439eb3e11f1 10 weeks ago 510.7 MB
quay.io/kubernetes-ingress-controller/nginx-ingress-controller 0.25.1 0439eb3e11f1 10 weeks ago 510.7 MB
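The manual procedure can also be sketched as a helper that derives the target name and prints the three commands for review. The function names are hypothetical, and unlike the transcript above the sketch tags by image name rather than by image ID, which `docker tag` also accepts. The push line mirrors the `sudo -i docker push` invocation used elsewhere on this page.

```shell
TOOLS_REGISTRY="docker-registry.tools.wmflabs.org"

# Map an upstream image reference to its name in the Toolforge registry,
# keeping only the last path component and the tag.
target_ref() {
    echo "${TOOLS_REGISTRY}/${1##*/}"
}

# Print the pull/tag/push commands for an upstream image reference.
copy_plan() {
    local src="$1" dst
    dst="$(target_ref "$src")"
    echo "sudo docker pull ${src}"
    echo "sudo docker tag ${src} ${dst}"
    echo "sudo -i docker push ${dst}"
}
```

Calling `copy_plan quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1` prints commands equivalent to the ones in the example.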
Uploading custom docker images from a git repository
Other than Toolforge base images, you may want to build the source code from a git repository into a docker image and push it to the registry.
See Portal:Toolforge/Admin/Kubernetes/Custom_components.
Old instructions:
- git clone the git repository into the tools docker-builder VM (even if you are building for toolsbeta, e.g. tools-docker-imagebuilder-01.tools.eqiad1.wikimedia.cloud).
- build the docker image.
- tag the built image with the new registry name.
- push the docker image into the new registry.
In this example, the toolforge-jobs-framework-api image is built and pushed to the internal registry.
user@tools-docker-imagebuilder-01:~$ git clone "https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api"
user@tools-docker-imagebuilder-01:~$ cd jobs-framework-api
user@tools-docker-imagebuilder-01:jobs-framework-api/$ sudo docker build --tag toolforge-jobs-framework-api .
[..]
user@tools-docker-imagebuilder-01:~$ sudo docker images | grep toolforge-jobs-framework-api | grep latest
docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api latest 6562c3204603 2 weeks ago 132MB
toolforge-jobs-framework-api latest fa19a234f284 2 weeks ago 132MB
^^^^ id, the one without registry in the name
user@tools-docker-imagebuilder-01:~$ ID=fa19a234f284
user@tools-docker-imagebuilder-01:~$ sudo docker tag $ID docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest # note that we always use tools registry, no matter the project you are in
user@tools-docker-imagebuilder-01:~$ sudo -i docker push docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest
[..]
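After a push, the registry HTTP API can confirm that the tag is present. The sketch below assumes the standard `/v2/<name>/tags/list` endpoint of the registry API, which returns a document of the form `{"name":"...","tags":[...]}`; the function names are hypothetical and the parsing is not a general JSON parser.

```shell
# Build the tags/list URL for an image name in the tools registry.
tags_url() {
    echo "https://docker-registry.tools.wmflabs.org/v2/${1}/tags/list"
}

# Succeed if the tags/list JSON document on stdin contains the tag $1.
has_tag() {
    tr -d '\n' \
        | sed -e 's/.*"tags"[[:space:]]*:[[:space:]]*\[//' -e 's/].*//' \
        | tr ',' '\n' \
        | tr -d '" ' \
        | grep -qxF -- "$1"
}
```

For example, `curl -fsS "$(tags_url toolforge-jobs-framework-api)" | has_tag latest && echo pushed` prints `pushed` only when the `latest` tag is listed.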
Delete old images
Deleting old things seems to be a bit of a dark art in Docker's reference registry, although web searches turn up some help. On 2020-07-24, BryanDavis cleaned up the old "toollabs-*" images which were left around from the initial Toolforge Kubernetes cluster using roughly the process described in this blog post with the help of docker_reg_tool:
$ sudo -i $(pwd)/docker_reg_tool https://docker-registry.tools.wmflabs.org list | grep toollabs- > old-images.txt
$ for img in $(cat old-images.txt); do
for tag in $(sudo -i $(pwd)/docker_reg_tool https://docker-registry.tools.wmflabs.org list $img); do
sudo -i $(pwd)/docker_reg_tool https://docker-registry.tools.wmflabs.org delete $img $tag
done
done
: # After this ran, there were no tags shown for the toollabs-* collections when browsing the registry, but the collections themselves were still there
$ sudo -i -- sudo -u docker-registry docker-registry garbage-collect /etc/docker/registry/config.yml
: # This printed a lot of stuff but did not actually seem to remove anything from the file system. It did look like it removed things from internal cache structures.
$ cd /srv/registry/docker/registry/v2/repositories
$ sudo rm -r toollabs-*
: # This removed the <repo>/_layers/sha256/<sha256>/link files that tell docker which blobs to pull
$ sudo -u docker-registry docker-registry garbage-collect /etc/docker/registry/config.yml
: # This seemed to actually remove things from /srv/registry/docker/registry/v2/blobs/ that were no longer referenced by link files
On 2023-07-10 Taavi cleaned up images without any tags using this command:
user@tools-docker-registry-05:~$ sudo -u docker-registry docker-registry garbage-collect /etc/docker/registry/config.yml --delete-untagged
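For reference, deleting a tag through the registry HTTP API takes two steps: look up the manifest digest, then send a DELETE for that digest. The sketch below shows those calls under a few assumptions: it is not how docker_reg_tool is known to be implemented, the function names are hypothetical, and the registry only accepts the DELETE when deletion is enabled in its configuration. The helper prints the DELETE command instead of running it.

```shell
REGISTRY_URL="https://docker-registry.tools.wmflabs.org"

# Extract the manifest digest from HTTP response headers on stdin.
digest_from_headers() {
    tr -d '\r' | awk 'tolower($1) == "docker-content-digest:" { print $2; exit }'
}

# Look up the manifest digest for image $1 at tag $2 with a HEAD request.
manifest_digest() {
    curl -fsSI -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
        "${REGISTRY_URL}/v2/${1}/manifests/${2}" | digest_from_headers
}

# Print (do not run) the DELETE request for image $1 at tag $2.
delete_plan() {
    local digest
    digest="$(manifest_digest "$1" "$2")"
    [ -n "$digest" ] || return 1
    echo "curl -fsS -X DELETE ${REGISTRY_URL}/v2/${1}/manifests/${digest}"
}
```

Deleting a manifest only removes the reference; the blobs are freed by the garbage-collect runs shown above. Upstream documents a `--dry-run` flag for garbage-collect, but whether the packaged docker-registry binary supports it is worth confirming with `--help` first.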