
This page describes the Docker registry setup in the Toolforge platform.

Deployment components and architecture

Information on how the setup is deployed, and the different components.


The setup is generally composed of 2 types of servers:

  • docker builder - to build docker images which are then uploaded to the registry. Usually named tools-docker-builder-XX.tools.eqiad1.wikimedia.cloud.
  • docker registry - to hold and serve the docker registry itself. Usually named tools-docker-registry-XX.tools.eqiad1.wikimedia.cloud.

One builder server is enough for the workload we handle.

The docker registry servers are usually a pair of cold-standby servers. One server runs the registry (active) and the other is ready to take over in case of disaster or maintenance (standby).

DNS, addressing and SSL

In order for Toolforge kubernetes users to be able to use the registry, there is a public FQDN called docker-registry.tools.wmflabs.org.
This FQDN resolves to a single IPv4 public address. This IPv4 address is a floating IP from CloudVPS, which is allocated and assigned to docker registry servers (each server has its own floating IP).

The DNS setup can be fully managed using Horizon (tools.wmflabs.org zone).

The nginx daemon on the registry servers uses an SSL certificate for *.tools.wmflabs.org (aka star.tools.wmflabs.org).

Also, please note that the registry servers require an OpenStack security group that allows them to work as web servers.

Registry content

The content of the registry is generated from the operations/docker-images/toollabs-images.git repository. There should be a checkout of this at /srv/images/toollabs-images in the docker builder server.
Information on how to generate the images can be read in the "building toolforge specific images" page. TODO: put that content here?

Once the docker images are in the registry, they are served immediately to users by means of the docker-registry daemon/service and nginx. Registry data is physically stored on the local disk of the server at /srv/registry.
Please note that registry images are pushed to the active docker registry node only. In order to improve robustness of the system when dealing with disasters or maintenance, the standby registry server will incrementally copy the data by means of rsync every 10 minutes. See phabricator ticket T213695 for more details and possible improvements for this setup.
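
As an illustration of the copy described above, the following sketch prints the kind of rsync command the standby node could run to pull data from the active node. It is not the real job, which is managed by Puppet and may differ. The rsync options (--archive, --delete) and the SSH-style host:path syntax are assumptions, and registry_sync_cmd is a hypothetical helper name. Only the host name and the /srv/registry path come from this page.

```shell
#!/bin/sh
# Sketch only: the real sync job is managed by Puppet and may differ.
# Assumed here: a pull over an SSH-style remote path, with --archive
# and --delete so the standby copy mirrors the active node.

ACTIVE_NODE="tools-docker-registry-03.tools.eqiad1.wikimedia.cloud"
REGISTRY_DIR="/srv/registry"

# Print the rsync command the standby node would run to pull registry
# data from the active node. Printing (instead of executing) keeps the
# sketch safe to run anywhere.
registry_sync_cmd() {
    active_node="$1"
    registry_dir="$2"
    printf 'rsync --archive --delete %s:%s/ %s/\n' \
        "$active_node" "$registry_dir" "$registry_dir"
}

registry_sync_cmd "$ACTIVE_NODE" "$REGISTRY_DIR"
```

A cron entry that fires every 10 minutes would use the schedule `*/10 * * * *`.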

At the time of writing, the stored docker registry data is about 50GB in size.


We use the role role::wmcs::toolforge::docker::registry and especially profile::toolforge::docker::registry.

The docker::registry module, and related code, is in use by this setup.

There are a couple of important hiera keys related to this deployment, which should be set in Horizon (prefix and/or project):

docker::registry_url: https://tools-docker-registry.wmflabs.org/v2/
docker::registry: docker-registry.tools.wmflabs.org
docker::builder_host: tools-docker-builder-06.tools.eqiad1.wikimedia.cloud
profile::toolforge::docker::registry::active_node: tools-docker-registry-03.tools.eqiad1.wikimedia.cloud
profile::toolforge::docker::registry::standby_node: tools-docker-registry-04.tools.eqiad1.wikimedia.cloud

System administration

Information on maintenance and administration of the setup.


We don't have any specific monitoring or alerting setups for these hosts. TODO: this could be improved.

A basic health check consists of a few manual checks:

  • check disk usage in the registry servers:

aborrero@tools-docker-registry-03:/srv/registry$ df -h /srv
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/vd-second--local--disk  138G   38G   93G  29% /srv
  • check status of services/daemons in the registry servers:
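
A minimal sketch of both checks (disk usage and service status) is shown here. The systemd unit names docker-registry and nginx are assumptions based on the daemons mentioned on this page, and the helper names disk_use_percent and check_services are hypothetical.

```shell
#!/bin/sh
# Sketch only: unit names are assumed, helper names are hypothetical.

# Print the "Use%" value (without the % sign) of the filesystem holding $1.
# df -P guarantees one output line per filesystem after the header.
disk_use_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report whether each given systemd unit is active.
# Returns non-zero if at least one unit is not active.
check_services() {
    rc=0
    for svc in "$@"; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
            rc=1
        fi
    done
    return "$rc"
}

# Example, on a registry server:
#   disk_use_percent /srv
#   check_services docker-registry nginx
```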

Registry failover

We don't have a specific failover mechanism other than pointing the DNS name to the standby registry server and updating hiera accordingly.
Care should be taken not to lose registry data, since regenerating it from scratch can take some time. That's why there is an rsync job to sync data between the two servers.
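
After repointing the DNS name, it can be useful to confirm that the public FQDN resolves to the expected floating IP. The sketch below assumes the dig utility is installed. The helper name resolves_to is hypothetical, and 192.0.2.10 is a placeholder address, not the real floating IP.

```shell
#!/bin/sh
# Sketch only: resolves_to is a hypothetical helper and the IP address
# in the example is a placeholder from the documentation range.

# Succeed if the A records for FQDN $1 include the IPv4 address $2.
resolves_to() {
    dig +short A "$1" | grep -Fxq -- "$2"
}

# Example, with 192.0.2.10 standing in for the standby server's floating IP:
#   resolves_to docker-registry.tools.wmflabs.org 192.0.2.10 && echo "DNS updated"
```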

Updating SSL certificate

Follow the instructions in the main Toolforge admin docs.

Other operations

Other operations that you may want to perform in these servers.

Uploading custom docker images

Other than Toolforge base images, you may want to copy some other upstream docker images into the registry. To do this, follow these steps.

  • pull the docker image into a tools-docker-builder server.
  • tag the downloaded image with the new registry name.
  • push the docker image into the new registry.

In this example, the nginx-ingress-controller image is copied to the internal registry.

user@tools-docker-imagebuilder-01:~$ sudo docker pull quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1
0.25.1: Pulling from kubernetes-ingress-controller/nginx-ingress-controller

6f3771171c5a: Pull complete
c0f4bf67891e: Pull complete
d800a815826b: Pull complete
a5f64c9a48d8: Pull complete
85eb3bab5b30: Pull complete 
3b5969ca8957: Pull complete 
817ea478bbe8: Pull complete 
d805ea7d2e5b: Pull complete 
52a9aaa24508: Pull complete 
b88dd7367a51: Pull complete 
1863722e8548: Pull complete 
4c628964bc02: Pull complete 
Digest: sha256:0c4941fa8c812dd44297b5f4900e3b26c3e6a8a42940e48fe9a1a585fe8f7e25
Status: Downloaded newer image for quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1

user@tools-docker-imagebuilder-01:~$ sudo docker images | grep nginx
quay.io/kubernetes-ingress-controller/nginx-ingress-controller   0.25.1              0439eb3e11f1        10 weeks ago        510.7 MB

user@tools-docker-imagebuilder-01:~$ sudo docker tag 0439eb3e11f1 docker-registry.tools.wmflabs.org/nginx-ingress-controller:0.25.1
root@tools-docker-imagebuilder-01:~# docker push docker-registry.tools.wmflabs.org/nginx-ingress-controller:0.25.1
The push refers to a repository [docker-registry.tools.wmflabs.org/nginx-ingress-controller]
2fcc832a6182: Pushed 
1a919be260e2: Pushed 
144091cf2f8e: Pushed 
8477ce4e751b: Pushed 
5d3f0729b342: Pushed 
095e01840c9e: Pushed 
3e65873fb5c2: Pushed 
72b39ef0b6f4: Pushed 
b3c716deb2a6: Pushed 
68fdb45b95ce: Pushed 
3ccb3fedf4ab: Pushed 
2e83df525186: Pushed 
latest: digest: sha256:b9cd638b8849f25210740b075d27ef2e55ffd2861488ead98276aa70b8a859ab size: 2838

user@tools-docker-imagebuilder-01:~$ sudo docker images | grep nginx
docker-registry.tools.wmflabs.org/nginx-ingress-controller       0.25.1              0439eb3e11f1        10 weeks ago        510.7 MB
quay.io/kubernetes-ingress-controller/nginx-ingress-controller   0.25.1              0439eb3e11f1        10 weeks ago        510.7 MB
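
The pull, tag and push steps shown in this example could be wrapped in a small helper, sketched below. mirror_image, run and DRY_RUN are hypothetical names, not existing tooling. By default the sketch only prints the docker commands. Setting DRY_RUN=0 would execute them and, as in the transcript above, would likely need root privileges.

```shell
#!/bin/sh
# Sketch only: mirror_image, run and DRY_RUN are hypothetical names.

REGISTRY="docker-registry.tools.wmflabs.org"

# Run a command, or just print it when DRY_RUN=1 (the default here).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Copy an upstream image into the Toolforge registry.
#   $1 = upstream image reference, e.g. quay.io/org/name:tag
#   $2 = name:tag to use inside the registry
mirror_image() {
    src="$1"
    dest="$REGISTRY/$2"
    run docker pull "$src" &&
    run docker tag "$src" "$dest" &&
    run docker push "$dest"
}

mirror_image \
    quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.25.1 \
    nginx-ingress-controller:0.25.1
```

Tagging by image name, as done here, is equivalent to tagging by image ID as in the transcript above.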

Delete old images

Deleting old things seems to be a bit of a dark art in Docker's reference registry, although web searches will turn up some help. On 2020-07-24, BryanDavis cleaned up the old "toollabs-*" images, which were left around from the initial Toolforge Kubernetes cluster, using roughly the process described in this blog post with the help of docker_reg_tool:

$ sudo -i $(pwd)/docker_reg_tool https://docker-registry.tools.wmflabs.org list | grep toollabs- > old-images.txt
$ for img in $(cat old-images.txt); do
  for tag in $(sudo -i $(pwd)/docker_reg_tool https://docker-registry.tools.wmflabs.org list $img); do
    sudo -i $(pwd)/docker_reg_tool https://docker-registry.tools.wmflabs.org delete $img $tag
  done
done
: # After this ran, there were no tags shown for the toollabs-* collections when browsing the registry, but the collections themselves were still there
$ sudo -i -- sudo -u docker-registry docker-registry garbage-collect /etc/docker/registry/config.yml
: # This printed a lot of stuff but did not actually seem to remove anything from the file system. It did look like it removed things from internal cache structures.
$ cd /srv/registry/docker/registry/v2/repositories
$ sudo rm -r toollabs-*
: # This removed the <repo>/_layers/sha256/<sha256>/link files that tell docker which blobs to pull
$ sudo -u docker-registry docker-registry garbage-collect /etc/docker/registry/config.yml
: # This seemed to actually remove things from /srv/registry/docker/registry/v2/blobs/ that were no longer referenced by link files
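
Since it was not obvious which garbage-collect run actually freed space, measuring the blobs directory before and after each run may help. The following sketch is one way to do that. dir_size_kb and reclaimed_kb are hypothetical helper names, and reading the directory may require elevated privileges on the registry server.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

BLOBS_DIR="/srv/registry/docker/registry/v2/blobs"

# Print the disk usage of directory $1 in kilobytes.
dir_size_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# Print how many kilobytes were freed, given sizes before ($1) and after ($2).
reclaimed_kb() {
    echo "$(( $1 - $2 ))"
}

# Example, around a garbage-collect run:
#   before=$(dir_size_kb "$BLOBS_DIR")
#   sudo -u docker-registry docker-registry garbage-collect /etc/docker/registry/config.yml
#   after=$(dir_size_kb "$BLOBS_DIR")
#   echo "reclaimed $(reclaimed_kb "$before" "$after") KB"
```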

See also