Kubernetes/Images

Image restriction strategy

We only allow running images from the production Docker registry, aka docker-registry.wikimedia.org (aka docker-registry.discovery.wmnet inside production networks), which is publicly available. This is currently enforced at the network level. Note that all of this applies specifically to what we call the production realm; WMCS/Toolforge aren't restricted by this.
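
For example, pulling a base image from the production registry looks like this (the image name and tag are illustrative; check the registry for what is actually published):

docker pull docker-registry.wikimedia.org/wikimedia-buster:latest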

Our reasoning is explained below:

  1. Make it easy to do security updates when necessary (just rebuild all the containers & redeploy)
    As further explanation, we do not want to depend on a multitude of authors/maintainers of upstream images for upgrades/updates, as this is not sustainable long term. We have experience with severe vulnerabilities (e.g. Heartbleed) that affected very low-level libraries, where it took months for the entire ecosystem to catch up.
  2. Faster deploys, since the registry is on the same network (vs. Docker Hub, which is reached over the Internet)
    Fetching internally means that we don't stress others' infrastructures, we don't incur costs from our usage of them, and we get the best possible speed without burdening our external-facing networks.
  3. Access control is provided entirely by us, with less dependence on Docker Hub
    We are able to fully decide what we want to expose to the public: almost all images are public, yet some are kept private, e.g. for embargoed security patches.
  4. We control what's in our images.
    We have built tooling, e.g. DebMonitor, that gives us an audit trail of what gets installed in a container, which images need upgrading, and so on. We can also enforce that the content of our images adheres to Our values and principles.

Operating System

While we don't shut the door on other operating systems being used in our OCI images, we are going to focus exclusively on the Debian Linux operating system, for a number of reasons that we explain below.

  • We have tooling that evolved organically, e.g. DebMonitor, which is focused on tracking what is inside a container image
  • Through the sheer homogeneity of using the same base images everywhere, we reap substantial benefits in deployment times, as in almost all cases the images already exist on all Kubernetes nodes.
  • We have extensive experience with Debian and know well how to operate it, including its intricacies and quirks. Deviating from this would require rebuilding that experience and knowledge for a new operating system family.

While we don't forbid other Linux distributions in our OCI images, we want to see a concretely justified use case for diverging from the above.

As an exception to the above, we are aware of distroless images. We'd like to experiment with this newer approach, which in some cases allows the image to contain almost nothing aside from the running binary. This is possible with Golang applications as well as some GraalVM/Quarkus-compilable applications. We believe that the reservations/reasons listed above won't apply in the case of distroless container images.
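
As a minimal sketch of the distroless idea (this is not our current build process; all names and paths below are hypothetical), a statically linked Go binary can be packaged into an image that contains nothing else:

# build a statically linked binary, then copy it into an otherwise empty image
CGO_ENABLED=0 go build -o myservice ./cmd/myservice
cat > Dockerfile.distroless <<'EOF'
FROM scratch
COPY myservice /myservice
ENTRYPOINT ["/myservice"]
EOF
docker build -f Dockerfile.distroless -t myservice:test .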

Image building

Images are separated into 3 "generations":

  • base images
  • production images
  • service images

that are dependent on each other in the above order. A service image (e.g. Mathoid) will depend on a production image (e.g. nodejs16), which will depend on a base image (e.g. wikimedia-buster), forming a tree. Base images never get deployed, but both service images and production images are deployed: the former to power a service, the latter to provide infrastructure functionality (e.g. metrics collection, TLS termination, etc.).
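
To illustrate, the FROM chain for that example would look roughly like this (image paths and tags are illustrative), and docker history on a pulled image shows the layers inherited from the images below it:

# Mathoid (service)         FROM docker-registry.wikimedia.org/nodejs16
# nodejs16 (production)     FROM docker-registry.wikimedia.org/wikimedia-buster
# wikimedia-buster (base)   built from scratch with debuerreotype
docker pull docker-registry.wikimedia.org/nodejs16
docker history docker-registry.wikimedia.org/nodejs16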

Base images

Those are the first layer of the tree. They are built on the designated production builder host (look at manifests/site.pp to figure out which host has that role) using:

set_proxy
build-base-images

This code uses debuerreotype to build the images and push them to the registry; the script we use is a modified version of the process used for the "official" Debian images on Docker Hub. Note that you need to be root to build / push Docker containers. We suggest using sudo -i for this, since Docker looks for credentials in the user's home directory, and they are only present in root's home directory. It's a very simplistic approach, but it works well for this use case.
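
Putting that together, a typical session on the builder host looks like:

sudo -i
set_proxy
build-base-images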

Production images

The code building those has been written from scratch. It's a tool called docker-pkg. It is still run on the same builder host as above, but it automatically infers versions, dependencies, and what does or does not need to be rebuilt. The repo containing the definitions of those images is operations/docker-images/production-images, and the command /usr/local/bin/build-production-images is used to build them. Again, we suggest using sudo -i, since Docker looks for credentials in the user's home directory, and they are only present in root's home directory.

sudo -i
# cd /srv/images/production-images/
# git pull
# build-production-images

Note: The production-images repo does not have CI at present and changes need to be manually merged.

Services images

Those are built by the Deployment pipeline using Blubber. They are created automatically on every merged git commit of each piece of software.
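
If you want to try a Blubber configuration on your own machine (assuming the blubber CLI is installed; the config path, variant name and tag below are just examples), you can pipe the generated Dockerfile straight into docker build:

blubber .pipeline/blubber.yaml production | docker build --tag my-service:test --file - .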

Local builds

It is sometimes convenient to test builds on your local workstation. To this end, you need docker-pkg and docker installed, and a checkout of production-images.git.
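
One way to set this up (the install method and URL are suggestions, adapt to your environment):

# docker-pkg is a Python tool; install it with pip or from its source repository
pip3 install --user docker-pkg
# get the image definitions
git clone https://gerrit.wikimedia.org/r/operations/docker-images/production-images
cd production-images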

From the git checkout you can test rebuilding images:

# rebuild all images as needed (no upload to registry)
docker-pkg -c config.yaml build images
# only a subset of images
docker-pkg -c config.yaml build images --select '*myimage*'