Portal:Toolforge/Admin/Docker-registry

From Wikitech

This page describes the Docker registry setup in the Toolforge platform.

Deployment components and architecture

Information on how the setup is deployed, and the different components.

Servers

The setup is generally composed of 2 types of servers:

  • docker builder - to build docker images which are then uploaded to the registry. Usually named tools-docker-builder-XX.tools.eqiad.wmflabs.
  • docker registry - to hold and serve the docker registry itself. Usually named tools-docker-registry-XX.tools.eqiad.wmflabs.

One builder server is enough for the workload we handle.

The docker registry servers are usually a pair of cold-standby servers. One server serves the registry (active) and the other is ready to take over in case of disaster or maintenance (standby).

DNS, addressing and SSL

In order for Toolforge kubernetes users to be able to use the registry, there is a public FQDN called docker-registry.tools.wmflabs.org.
This FQDN resolves to a single IPv4 public address. This IPv4 address is a floating IP from CloudVPS, which is allocated and assigned to docker registry servers (each server has its own floating IP).

The DNS setup can be fully managed using Horizon (tools.wmflabs.org zone).

The nginx daemon on the registry servers uses an SSL certificate for *.tools.wmflabs.org (aka star.tools.wmflabs.org).

Also, please note that the registry servers require an OpenStack security group that allows them to work as web servers.

Registry content

The content of the registry is generated from the operations/docker-images/toollabs-images.git repository. There should be a checkout of this at /srv/images/toollabs-images in the docker builder server.
Information on how to generate the images can be read in the "building toolforge specific images" page. TODO: put that content here?

Once the docker images are in the registry, they are served immediately to users by means of the docker-registry daemon/service and nginx. Registry data is physically stored on the local disk of the server at /srv/registry.
Please note that registry images are pushed to the active docker registry node only. In order to improve robustness of the system when dealing with disasters or maintenance, the standby registry server will incrementally copy the data by means of rsync every 10 minutes. See phabricator ticket T213695 for more details and possible improvements for this setup.
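
A minimal sketch of what such a periodic sync could look like when run from the standby node. The exact rsync invocation used in production is not given on this page, so the flags, the pull direction over a remote shell, and the helper name build_sync_cmd are assumptions made for illustration; the real job is defined in puppet.

```shell
#!/bin/bash
# Sketch only: the rsync flags and the transport are assumptions,
# not the production values.
ACTIVE_NODE="${ACTIVE_NODE:-tools-docker-registry-03.tools.eqiad.wmflabs}"
REGISTRY_DIR="/srv/registry"

# Print the rsync command that would copy registry data from the active
# node to the local (standby) disk. An optional extra flag such as
# --dry-run can be passed as the first argument.
build_sync_cmd() {
    local extra="${1:-}"
    # -a keeps permissions and timestamps; --delete drops files that no
    # longer exist on the active node.
    echo "rsync -a --delete ${extra:+$extra }${ACTIVE_NODE}:${REGISTRY_DIR}/ ${REGISTRY_DIR}/"
}

# Example usage: preview what would change without copying anything.
#   $(build_sync_cmd --dry-run)
```

Running the dry-run form first is a cheap way to see how far behind the standby is before relying on it.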

At the time of this writing, stored docker registry data is about 50GB in size.

Puppet

In the old days, the docker registry was using the deprecated role::toollabs::docker::registry puppet role.
Nowadays we use the more modern setup by means of role::wmcs::toolforge::docker::registry and especially profile::toolforge::docker::registry.

The docker::registry module, and related code, is in use by this setup.

There are a couple of important hiera keys related to this deployment, which should be set in Horizon (prefix and/or project) or in Hiera:Tools, depending on the key:

docker::registry_url: https://tools-docker-registry.wmflabs.org/v2/
docker::registry: docker-registry.tools.wmflabs.org
docker::builder_host: tools-docker-builder-06.tools.eqiad.wmflabs
profile::toolforge::docker::registry::active_node: tools-docker-registry-03.tools.eqiad.wmflabs
profile::toolforge::docker::registry::standby_node: tools-docker-registry-04.tools.eqiad.wmflabs

System administration

Information on maintenance and administration of the setup.

Health

We don't have any specific monitoring or alerting setups for these hosts. TODO: this could be improved.

A basic health check consists of the following:

  • check disk usage on the registry servers:

aborrero@tools-docker-registry-03:/srv/registry$ df -h /srv
Filesystem                          Size  Used Avail Use% Mounted on
/dev/mapper/vd-second--local--disk  138G   38G   93G  29% /srv
  • check status of services/daemons in the registry servers:
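
A minimal sketch of how these checks could be scripted. The systemd unit names (docker-registry, nginx) are inferred from the daemon names mentioned on this page, and the 80% disk threshold and all function names are assumptions for illustration, not an existing tool.

```shell
#!/bin/bash
# Sketch only: the unit names and the 80% threshold are assumptions.

# Read `df -P` style output on stdin and print the Use% of the first
# filesystem as a bare number (e.g. "29").
parse_use_percent() {
    awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Succeed if the given usage percentage is below the threshold (default 80).
usage_below() {
    [ "$1" -lt "${2:-80}" ]
}

# Succeed only if every listed systemd unit is active.
services_active() {
    local unit
    for unit in "$@"; do
        systemctl is-active --quiet "$unit" || return 1
    done
}

# Succeed if the registry API root answers. The Docker Registry HTTP API v2
# returns 200 on /v2/, or 401 when authentication is required.
registry_api_up() {
    local code
    code="$(curl -s -o /dev/null -w '%{http_code}' "$1")"
    [ "$code" = "200" ] || [ "$code" = "401" ]
}

# Example usage on a registry server:
#   usage_below "$(df -P /srv | parse_use_percent)" 80
#   services_active docker-registry nginx
#   registry_api_up https://docker-registry.tools.wmflabs.org/v2/
```

Each helper returns a plain exit status, so the checks can be chained with && or wired into whatever alerting is added later.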

Registry failover

We don't have a specific failover mechanism other than pointing the DNS name to the standby registry server and updating hiera accordingly.
Care should be taken not to lose registry data, since regenerating it from scratch can take some time. That's why there is an rsync job to sync data between the two servers.
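
After repointing the DNS name, it can be worth confirming that the public FQDN now resolves to the standby server's floating IP. The helper below is a small sketch of that check; the function name resolves_to and the STANDBY_FLOATING_IP variable are hypothetical, and the real address must be looked up in Horizon.

```shell
#!/bin/bash
# Sketch only: the addresses passed to this helper are placeholders.

# Read resolved addresses on stdin (one per line) and succeed if the
# expected address is among them (exact, whole-line match).
resolves_to() {
    grep -qxF "$1"
}

# Example usage after updating the DNS record:
#   dig +short docker-registry.tools.wmflabs.org | resolves_to "$STANDBY_FLOATING_IP"
```

Keep in mind that cached DNS answers may still point at the old server until the record's TTL expires.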

Updating SSL certificate

Follow the instructions in the main Toolforge admin docs.

See also