Kubernetes/CRE/criteria

From Wikitech

Intro

Docker.io as a Container Runtime (CR) is no longer supported in Kubernetes past version 1.23 (the version production workloads are now at). Since we:

  • Still use Docker as our Container Runtime Engine (CRE)
  • Need to continue the upgrade process for all our production Kubernetes clusters

we need to decide how to select and roll out a replacement for the Container Runtime Engine (CRE) we currently use.

Scope

The scope of this document is to define the selection criteria, evaluate each proposed solution against them, and pick one. The environment is the Wikimedia production Kubernetes clusters, as defined on Wikitech.

Audience

The audience for this task is SREs and developers with a good understanding of Container Runtime Engines who want to help choose the best one for production Wikimedia workloads.

Glossary

  • Container Runtime/Container (Runtime) Engine: A software component that can run containers on a host operating system. From now on, shortened as CRE.
  • OCI: Open Container Initiative, a set of specifications regarding running, distributing and building/bundling container images.
  • OCI Runtime: A CRE compatible with the OCI specification. From this point on, CRE and OCI CRE are considered identical in this document.
  • OCI Image: Also known as a Docker image or container image. A bundle that can be run by an OCI Runtime.
  • OCI distribution: A specification on how to distribute OCI Images.
  • Low-level runtime: A CRE that is used to perform creation, execution, stopping of containers. Typical examples are runC, crun, containerd. For most of this document, it helps to think of them as OCI Runtime implementations.
  • High-level runtime: These utilize low-level runtimes and offer extra functionality like fetching/extracting/building/tagging container images, executing processes inside containers, setting up networking/mounts (irrelevant to Kubernetes, but listed for clarity). For most of this doc, it helps to assume that they implement OCI distribution and OCI image functionalities. Prominent examples include Docker, CRI-O, containerd (yes it implements both low and high level runtimes)
  • Container Runtime Interface: Abbreviated as CRI, it is a specification regarding what functionality a Kubernetes compatible Runtime Engine should provide. The following are the primary requirements:
    • The runtime needs to be capable of starting/stopping pods
    • The runtime must deal with all container operations within pods—start, pause, stop, delete, kill
    • The runtime should handle images and be able to retrieve them from an image registry
    • The runtime should provide helper and utility functions around metrics collection and logs
  • Sandboxed/Virtualized runtimes: Runtimes that either emulate or virtualize a kernel. This document only touches on them briefly, as we aren’t using any. Examples include gVisor, nabla-containers, kata-containers, firecracker and others. The field has not stabilized much yet.
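To make the Container Runtime Interface entry in the glossary more concrete, here is a minimal Python sketch that expresses its primary requirements as an abstract interface, plus a toy in-memory implementation. The real CRI is a gRPC API; the class names, method names and signatures below (`CriLikeRuntime`, `InMemoryRuntime`, `container_op`, and so on) are hypothetical and chosen only for illustration. Metrics collection is left out for brevity.

```python
from typing import Dict, List, Protocol, runtime_checkable

# Container operations named in the CRI requirements list of the glossary.
CONTAINER_OPS = ("start", "pause", "stop", "delete", "kill")


@runtime_checkable
class CriLikeRuntime(Protocol):
    """Hypothetical interface; the real CRI is a gRPC API with other names."""

    def start_pod(self, pod: str) -> None: ...
    def stop_pod(self, pod: str) -> None: ...
    def container_op(self, pod: str, container: str, op: str) -> None: ...
    def pull_image(self, reference: str) -> None: ...
    def logs(self, pod: str, container: str) -> List[str]: ...


class InMemoryRuntime:
    """Toy implementation that only records what it was asked to do."""

    def __init__(self) -> None:
        self.pods: Dict[str, bool] = {}  # pod name -> currently running?
        self.images: List[str] = []      # image references pulled so far
        self.events: List[str] = []      # one entry per container operation

    def start_pod(self, pod: str) -> None:
        self.pods[pod] = True

    def stop_pod(self, pod: str) -> None:
        self.pods[pod] = False

    def container_op(self, pod: str, container: str, op: str) -> None:
        if op not in CONTAINER_OPS:
            raise ValueError(f"unsupported container operation: {op}")
        self.events.append(f"{pod}/{container}: {op}")

    def pull_image(self, reference: str) -> None:
        self.images.append(reference)

    def logs(self, pod: str, container: str) -> List[str]:
        # Return only the recorded events that belong to this container.
        prefix = f"{pod}/{container}: "
        return [event for event in self.events if event.startswith(prefix)]
```

Any candidate CRE would have to cover at least this surface (in its real gRPC form) to be usable by Kubernetes; the toy class only shows the shape of the contract, not how a runtime implements it.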

Problem statement

Per the Dockershim deprecation FAQ, Docker isn’t going to be a viable solution post Kubernetes 1.23. The upstream maintainers are no longer willing to maintain the shim that allows Docker to function as a container runtime engine for Kubernetes.

Mirantis has adopted and maintains the shim in https://github.com/Mirantis/cri-dockerd.

From Updated: Dockershim Removal FAQ | Kubernetes: “Mirantis and Docker have committed to maintaining a replacement adapter for Docker Engine, and to maintain that adapter even after the in-tree dockershim is removed from Kubernetes. The replacement adapter is named cri-dockerd.”

The project's README says “For Mirantis customers, that means that Docker Engine’s commercially supported version, Mirantis Container Runtime (MCR), will be CRI compliant”. However, we aren’t Mirantis customers and do not use MCR (it is a commercial product, not an open source one), so unfortunately this isn’t an option.

Solution

The obvious solution is to stop using Docker and move to a container runtime engine that is compatible with Kubernetes. Below, we outline some selection criteria, split into categories according to their perceived criticality.

Selection criteria

4 categories exist: Essential (MUST be met by any chosen solution), Important (SHOULD be met by any chosen solution), Desirable (MAY be met by any chosen solution) and Wishlist (would be cool to have).

Essential

  • Handling of OCI images
    • Fetching
    • Extracting
  • OCI containers
    • Start
    • Stop
    • Create
    • Delete
    • Kill
  • runC support (we don’t want to have to re-implement a low-level engine; keeping runC, our current one, is non-negotiable)
  • Compatibility with Dragonfly (we already use it; if we end up with a container runtime engine incompatible with it, we’ll be severely set back in MediaWiki on Kubernetes)
  • CRI compatibility with v1 of the CRI API (a CRE that doesn’t have CRI compatibility is useless for this project). Note: starting with v1.26, Kubernetes only works with v1 of the CRI API.

Important

  • AppArmor support (already used in wikifunctions)
  • Availability as a Debian package (greatly reduces the toil of packaging and distribution)
  • Upstream support (we don’t want the upstream to disappear, become unresponsive or, worse, be toxic in its responses)
    • Number of committers
    • Size of overall community
    • Responsiveness on issues/tasks
    • Amount/Gravity of open/lingering issues/tasks
  • CLI interface (it’s important to have a CLI interface to the runtime that allows us to debug problems in our infrastructure). The closer it feels to the current knowledge (Docker) the better

Desirable

  • gVisor support (May be used with wikifunctions in the future)
  • crun support (it’s a way faster low-level CRE, could be useful for performance reasons in the future)

Wishlist

  • Libkrun
  • Kata-containers
  • nabla-containers
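One way to read the four categories is as a gate plus a ranking: Essential criteria filter candidates out, while the remaining categories only influence the score. The sketch below encodes that reading in Python. The criterion names come from the lists above; the `Category` enum, the `is_viable` helper and the rule that a missing or zero score means "not met" are assumptions made for illustration, since this document does not define a threshold.

```python
from enum import Enum
from typing import Dict


class Category(Enum):
    ESSENTIAL = "MUST"
    IMPORTANT = "SHOULD"
    DESIRABLE = "MAY"
    WISHLIST = "would be cool to have"


# Criterion name -> category, as listed in the selection criteria.
CRITERIA: Dict[str, Category] = {
    "Handling of OCI images": Category.ESSENTIAL,
    "OCI containers": Category.ESSENTIAL,
    "runC support": Category.ESSENTIAL,
    "compatibility with Dragonfly": Category.ESSENTIAL,
    "CRI compatibility with v1 of the CRI API": Category.ESSENTIAL,
    "Apparmor support": Category.IMPORTANT,
    "Availability as a Debian package": Category.IMPORTANT,
    "upstream support": Category.IMPORTANT,
    "CLI interface": Category.IMPORTANT,
    "gVisor support": Category.DESIRABLE,
    "crun support": Category.DESIRABLE,
    "Libkrun": Category.WISHLIST,
    "Kata-containers": Category.WISHLIST,
    "nabla-containers": Category.WISHLIST,
}


def is_viable(scores: Dict[str, int]) -> bool:
    """Assumption: an Essential criterion that is missing or scored 0 is unmet."""
    return all(
        scores.get(name, 0) > 0
        for name, category in CRITERIA.items()
        if category is Category.ESSENTIAL
    )
```

With this reading, a candidate that scores well on every Important and Desirable criterion but lacks, say, runC support is rejected before any totals are compared.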

Candidates

There are effectively 2 main candidates for replacing Docker as our CRE.

  • containerd
  • CRI-O

Note that some container engines that aren’t natively compatible with Kubernetes are not being evaluated: podman, systemd-nspawn, cri-dockerd (which is just another layer between Kubernetes and containerd/runc), lxc (which needs an extra shim, similar to cri-dockerd) and singularity (again, an extra shim). containerd also used to have a shim, named containerd-cri, which was merged into the main project. The reasoning is either the utter lack of CRI compatibility, or the need to install and maintain long term an extra component that is not as well battle-tested in the wider industry.

It is also worth mentioning that docker does use containerd under the hood, so effectively we’re already running containerd as container runtime with docker as container runtime engine, making the migration possibly a bit easier.

The above criteria are evaluated in Container Runtime Engine selection criteria calculations.

Evaluation matrix

| Criteria/CRE | CRI-O | Containerd | Notes |
|---|---|---|---|
| Essential | | | |
| Handling of OCI images | 10 | 10 | |
| OCI containers | 10 | 10 | |
| runC support | 10 | 10 | |
| CRI compatibility with v1 of the CRI API | 10 | 10 | |
| compatibility with Dragonfly | 5 | 5 | Both have upstream docs with Dragonfly 2.x, should work with Dragonfly 1 but needs testing |
| Important | | | |
| Apparmor | 10 | 10 | |
| Debian Package | 5 | 10 | CRI-O has a Debian RFP at: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=979702. CRI-O has upstream packages at: https://github.com/cri-o/cri-o/blob/main/install.md#apt-based-operating-systems-1 |
| upstream behavior/support | 7 | 10 | From ossinsight it seems as if containerd has a bigger community and code commits are less centered around Red Hat (~20% of PRs created in cri-o) |
| CLI interface | 5 | 10 | Both ship in-tree CLI tools which are supposed to be for debugging only. Both can be managed with crictl. Containerd provides a Docker-compatible UI with nerdctl |
| Desirable | | | |
| gVisor | 7 | 10 | CRI-O: No upstream docs but: https://github.com/google/gvisor/issues/3283, https://devopstales.github.io/kubernetes/gvisor-cri-o/. containerd: Documented upstream: https://gvisor.dev/docs/user_guide/containerd/quick_start/ |
| crun | 10 | 10 | AIUI this should be transparent as crun implements the OCI Container runtime spec |
| Wishlist | | | |
| libkrun | | | crun can use libkrun - if that counts |
| kata-containers | 10 | 10 | |
| nabla-containers | | | The nabla runtimes (runnc) GitHub repo has been archived since March 2023 |
| Total | 99 | 115 | |
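The totals in the matrix can be re-derived with a few lines of Python. This is only a sketch: it assumes the total is a plain unweighted sum over all categories and that cells left blank in the matrix (libkrun, nabla-containers) count as 0. Under those assumptions it reproduces the Total row. The names `MATRIX` and `totals` are illustrative.

```python
from typing import Dict, Optional, Tuple

# (CRI-O score, containerd score); None marks a cell left blank in the matrix.
MATRIX: Dict[str, Tuple[Optional[int], Optional[int]]] = {
    "Handling of OCI images": (10, 10),
    "OCI containers": (10, 10),
    "runC support": (10, 10),
    "CRI compatibility with v1 of the CRI API": (10, 10),
    "compatibility with Dragonfly": (5, 5),
    "Apparmor": (10, 10),
    "Debian Package": (5, 10),
    "upstream behavior/support": (7, 10),
    "CLI interface": (5, 10),
    "gVisor": (7, 10),
    "crun": (10, 10),
    "libkrun": (None, None),
    "kata-containers": (10, 10),
    "nabla-containers": (None, None),
}


def totals(matrix: Dict[str, Tuple[Optional[int], Optional[int]]]) -> Tuple[int, int]:
    """Sum each column, treating blank cells as 0 (an assumption)."""
    crio = sum(a or 0 for a, _ in matrix.values())
    containerd = sum(b or 0 for _, b in matrix.values())
    return crio, containerd


print(totals(MATRIX))  # (99, 115)
```

The 16-point gap comes entirely from four rows: Debian Package (5), upstream behavior/support (3), CLI interface (5) and gVisor (3). Both candidates score identically on every Essential criterion.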