Calico

Calico is a virtual network infrastructure that we use to manage Kubernetes networking.

It provides IPAM for Kubernetes workloads and pods. It also manages iptables rules, routing tables, and BGP peering for Kubernetes nodes.

IPAM

We configure IP pools per cluster (via the calico helm-chart) that Calico splits up in blocks (CRD resource: ipamblocks.crd.projectcalico.org). One node can have zero or more IPAM blocks assigned (the first one will be assigned as soon as the first Pod is scheduled on a node).

On the nodes, the network of assigned IPAM blocks get blackholed and specific (/32) routes are added for every Pod running on the node:

kubestage1003:~# ip route
default via 10.64.16.1 dev eno1 onlink
10.64.16.0/22 dev eno1 proto kernel scope link src 10.64.16.55
10.64.75.64 dev caliabad5f15937 scope link
blackhole 10.64.75.64/26 proto bird
10.64.75.65 dev cali13b43f910f6 scope link
10.64.75.66 dev cali8bc45095644 scope link
...

This way, the nodes will be authoritative for and announce the assigned IPAM blocks networks to their BGP peers. The IPAM blocks, affinities are stored in Kubernetes CRD objects and can be viewed and modified using the Kubernetes API, kubectl or calicoctl:

calicoctl ipam show --show-blocks
kubectl get ipamblocks.crd.projectcalico.org,blockaffinities.crd.projectcalico.org

Calico IPAM also supports a concept of borrowing IPs from IP blocks of foreign nodes in case a node has used up all it's attached IP blocks and can't get another one from the IP pool. We disable this feature by configuring calico IPAM with StrictAffinity (see task T296303) as it only works in a node-to-node mesh configuration.

Operations

Calico should be running via a Daemonset on every node of a Kubernetes cluster, establishing a BGP peering with the core routers (see IP and AS allocations#Private AS).

Unfortunately, Calico currently does not set the NetworkUnavailable condition to true on nodes where it is not running or failing, although that will ultimately render the node unusable. Therefore a Prometheus alert will fire in case if fails to scrape Calico metrics from a node.

If you are reading this page because you've seen such an alert:

Check the nodes state with: kubectl describe node <node fqdn>
Take a look at the latest events in the cluster: https://logstash.wikimedia.org/app/dashboards#/view/d43f9bf0-17b5-11eb-b848-090a7444f26c
Check the logs of calico components (use the component filter at the right): https://logstash.wikimedia.org/app/dashboards#/view/f6a5b090-0020-11ec-81e9-e1226573bad4
On the node itself: sudo calicoctl node status

Typha

Calico Typha can be considered a "smart proxy" between calico-node (Felix) and the Kubernetes API. It's purpose is to maintain a single connection to the Kubernetes API while serving multiple instances of calico-node with relevant data and (filtered) events. In large clusters this reduces the load on the Kubernetes API as well as on the calico-node instances (which don't have to deal with all events as they only get the relevant ones, filtered by Typha). Unfortunately this makes Typha a hard dependency which, when not available, will bring down the whole cluster networking.

There usually are 3 replica per cluster (1 for small clusters), running in the kube-system namespace.

Check the state with: kubectl -n kube-system get po -l k8s-app=calico-typha
Grafana dashboard: https://grafana-rw.wikimedia.org/d/p8RgaNXGk
For logs, see the generic logstash link above

Kube Controllers

The Calico Kubernetes Controllers are a couple of different control loops (all in one container/binary) that monitor objects in the Kubernetes API (like network policies, endpoints, nodes etc.) and perform necessary actions.

There usually is one replica per cluster (a maximum of one can be active at any given time anyways), running in the kube-system namespace.

Check the state with: kubectl -n kube-system get po -l k8s-app=calico-kube-controllers
Grafana dashboard: https://grafana-rw.wikimedia.org/d/-OQgQZOSk
For logs, see the generic logstash link above

Resource Usage

We have had multiple incidents in the past that originated from calico components reaching their resource limits (and being OOM killed or throttled).

Calico resource usage does increase organically due to events like:

Pods being added (new deployments with high number of replicas or the like)
Nodes being added
Network Policies being added

The above could be verified via the "etcd object" panels on the Kubernetes API Grafana dashboard.

Packaging

<dist> below stands for one of the Debian distribution's codenames, e.g. jessie, stretch, buster, bullseye. Make sure you use the one you target

We don't actually build calico but package it's components from upstream binary releases.

Because of that, you will need to set HTTP proxy variables for internet access on the build host.

The general process to follow is:

Check out operations/debs/calico on your workstation
Decide if you want to package a new master (production) or future (potential next production) version
Create a patch to bump the debian changelog

export NEW_VERSION=3.16.5 # Calico version you want to package
dch -v ${NEW_VERSION}-1 -D unstable "Update to v${NEW_VERSION}"
git commit debian/changelog

# Make sure to submit the patch to the correct version branch
git review vX.Y

Merge
Check out operations/debs/calico on the build host
Build the packages:

# If you want to build a specific version
git checkout vX.Y

# Ensure you allow networking in pbuilder
# This option needs to be in the file, an environment variable will *not* work!
echo "USENETWORK=yes" >> ~/.pbuilderrc

# Build the package
https_proxy=http://webproxy.$(hostname -d):8080 DIST=<dist> pdebuild

Updating helm charts

There are two helm charts that might need updating, depending on the changes in a newly packaged calico version:

Publishing

The Debian Packages

# On apt1001, copy the packages from the build host
rsync -vaz build2001.codfw.wmnet::pbuilder-result/<dist>-amd64/calico*<PACKAGE VERSION>* .

# If you want to import a new production version, import to component main
sudo -i reprepro -C main --ignore=wrongdistribution include <dist>-wikimedia /path/to/<PACKAGE>.changes

# If you want to import a test/pre-production version, import to component calico-future
sudo -i reprepro -C component/calicoXY --ignore=wrongdistribution include <dist>-wikimedia /path/to/<PACKAGE>.changes

The Docker Images

Calico also includes a bunch of docker images which need to be published into our docker registry. To simplify the process, the packaging generates a debian package named "calico-images" that includes the images as well as a script to publish them:

# On the build host, extract the calico-images debian package
tmpd=$(mktemp -d)
dpkg -x /var/cache/pbuilder/result/<dist>-amd64/calico-images_<PACKAGE_VERSION>_amd64.deb $tmpd

# Load and push the images
sudo -i CALICO_IMAGE_DIR=${tmpd}/usr/share/calico ${tmpd}/usr/share/calico/push-calico-images.sh
rm -rf $tmpd

Updating

Update debian packages calicoctl and calico-cni on kubernetes nodes using Debdeploy
Update image.tag version in helmfile.d/admin_ng/values/<Cluster>/calico-values.yaml
- Deploy to the cluster(s) that you want updated

External links

Official website: Calico