Portal:Toolforge/Admin/Upgrading Kubernetes

From Wikitech

This document only applies to a kubeadm-managed cluster deployed as described in Portal:Toolforge/Admin/Deploying_k8s.

Before the upgrade

A few things to check before you perform the upgrade itself.

  • To begin, check your version with kubectl version so you know where you are starting from.
  • Make sure all the related Debian packages (kubeadm, kubectl, kubelet) are available at the desired versions in reprepro. This may involve Puppet patches and repository updates.
  • Are you also upgrading Calico? If so, is it a patch release or a minor/major release? For a patch release, you can probably just update profile::toolforge::k8s::calico_version in Hiera, adjust profile::toolforge::k8s::calicoctl_sha to match the new file in the release bundle, and then use the file changed by Puppet with the kubectl apply command below to upgrade. For a minor or major release, check the new release YAML file and make sure the Puppet template in modules/toolforge/templates/k8s is updated, if needed, before proceeding. When reading the upstream Calico documentation, keep in mind that we use the Kubernetes API datastore and use Calico for both policy and networking.
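As a sketch of the first check, you can compare the running version against the target version before bothering with an upgrade plan. The version strings here are hypothetical examples; in practice CURRENT comes from kubectl version and TARGET from the package available in reprepro.

```shell
# Sketch: confirm TARGET is actually newer than CURRENT before planning
# an upgrade. Both version strings are hypothetical examples.
CURRENT=v1.15.0
TARGET=v1.15.1
# sort -V orders version strings numerically; the lower of the pair
# must be CURRENT for this to be a real upgrade.
if [ "$(printf '%s\n%s\n' "$CURRENT" "$TARGET" | sort -V | head -n1)" = "$CURRENT" ] \
   && [ "$CURRENT" != "$TARGET" ]; then
    echo "upgrade $CURRENT -> $TARGET"
else
    echo "nothing to do"
fi
# prints: upgrade v1.15.0 -> v1.15.1
```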

Upgrade

Check what an upgrade will entail and build an upgrade plan. The command for this is fairly straightforward:

root@controlplanenode # kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.15.0
[upgrade/versions] kubeadm version: v1.15.0
[upgrade/versions] Latest stable version: v1.15.1
[upgrade/versions] Latest version in the v1.15 series: v1.15.1

External components that should be upgraded manually before you upgrade the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT   AVAILABLE
Etcd        3.2.26    3.3.10

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT       AVAILABLE
Kubelet     5 x v1.15.0   v1.15.1

Upgrade to the latest version in the v1.15 series:

COMPONENT            CURRENT   AVAILABLE
API Server           v1.15.0   v1.15.1
Controller Manager   v1.15.0   v1.15.1
Scheduler            v1.15.0   v1.15.1
Kube Proxy           v1.15.0   v1.15.1
CoreDNS              1.3.1     1.3.1

You can now apply the upgrade by executing the following command:

        kubeadm upgrade apply v1.15.1

Note: Before you can perform this upgrade, you have to update kubeadm to v1.15.1.

Some important things to note here:

  • Etcd is external, so upgrades there need to involve the packaged versions. Make sure that the version we are using (or that can be upgraded to) is acceptable to the new version of Kubernetes before trying anything.
  • kubeadm and kubelet are deployed from Debian packages, which must be upgraded in order to finish a cluster upgrade.
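Since kubeadm itself must be at the new version before kubeadm upgrade apply can run it, the package step on the first control plane node looks roughly like the following. DRYRUN=echo prints the commands instead of executing them; drop it to run for real. The package version string is a hypothetical example.

```shell
# Sketch: upgrade the kubeadm package before `kubeadm upgrade apply`.
# DRYRUN=echo makes this a dry run; the version is a hypothetical example.
DRYRUN=echo
VERSION=1.15.1-00
$DRYRUN apt-get update
$DRYRUN apt-get install -y kubeadm="$VERSION"
# Confirm the binary now reports the new version
$DRYRUN kubeadm version
```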

Now you can proceed with:

root@controlplanenode # kubeadm upgrade apply v1.15.1

This assumes you are upgrading to v1.15.1; substitute the version you are actually targeting (at the time of this writing we are on v1.15.5, so hopefully you aren't using that number). The command will produce a fair bit of output. Do check it for errors.
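Beyond reading the apply output, a few follow-up commands give a quick health check. This is a sketch, not an exhaustive verification; DRYRUN=echo prints the commands instead of running them against the cluster.

```shell
# Sketch: sanity checks after `kubeadm upgrade apply`.
# DRYRUN=echo makes this a dry run; drop it to query the cluster for real.
DRYRUN=echo
# Control plane pods should all be Running, with the new image versions
$DRYRUN kubectl -n kube-system get pods -o wide
# The API server should report the new version
$DRYRUN kubectl version
# All nodes should still be Ready
$DRYRUN kubectl get nodes
```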

If you are also upgrading Calico, do so with kubectl apply -f /etc/kubernetes/calico.yaml once Puppet has updated the file.


For the remaining control plane nodes, you do not need to run kubeadm upgrade plan, and instead of kubeadm upgrade apply you run kubeadm upgrade node. kubectl apply is idempotent, so re-applying the Calico manifest on a node that needs no changes does nothing, and is perfectly fine to run.

Upgrade the kubelet and kubeadm packages on all control plane nodes, and restart kubelet if the package upgrade has not already done so.
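On each control plane node, the package and restart steps amount to something like this sketch. DRYRUN=echo prints the commands rather than executing them, and the version string is a hypothetical example.

```shell
# Sketch: per-control-plane-node package upgrade and kubelet restart.
# DRYRUN=echo makes this a dry run; the version is a hypothetical example.
DRYRUN=echo
VERSION=1.15.1-00
$DRYRUN apt-get install -y kubelet="$VERSION" kubeadm="$VERSION"
$DRYRUN systemctl restart kubelet
# Verify the service came back up
$DRYRUN systemctl is-active kubelet
```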

When this is done, we upgrade the workers. For each worker:

  1. Drain it
    root@controlplanenode # kubectl drain $NODE --ignore-daemonsets
    
  2. On the node, upgrade its kubelet config
    root@workernode # kubeadm upgrade node
    
  3. Upgrade kubectl and kubelet packages
  4. Restart kubelet
  5. Run puppet in case there's any config we have that isn't captured by kubeadm.
  6. Uncordon
    root@controlplanenode # kubectl uncordon $NODE
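The per-worker steps above can be sketched as a loop. The node names are hypothetical, and DRYRUN=echo prints the commands instead of executing them; note that the drain and uncordon run from a control plane node, while the kubeadm, package, kubelet, and Puppet steps run on the worker itself (shown here via ssh, which is an assumption about how you reach the node).

```shell
# Sketch of the per-worker upgrade loop. DRYRUN=echo makes this a dry run;
# node names and the ssh transport are assumptions for illustration.
DRYRUN=echo
for NODE in tools-k8s-worker-1 tools-k8s-worker-2; do
    $DRYRUN kubectl drain "$NODE" --ignore-daemonsets        # 1. drain
    $DRYRUN ssh "$NODE" kubeadm upgrade node                 # 2. kubelet config
    $DRYRUN ssh "$NODE" apt-get install -y kubelet kubectl   # 3. packages
    $DRYRUN ssh "$NODE" systemctl restart kubelet            # 4. restart
    $DRYRUN ssh "$NODE" puppet agent --test                  # 5. puppet
    $DRYRUN kubectl uncordon "$NODE"                         # 6. uncordon
done
```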
    

NOTE: the k8s API is behind the FQDN k8s.tools.eqiad1.wikimedia.cloud, so some commands may vary in their output or results depending on which backend HAproxy is reaching. This can be prevented by disabling backends by hand during the upgrade window.
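Alternatively, you can pin kubectl to a single apiserver for the duration of the upgrade so its answers do not depend on which backend HAproxy picks. This is a sketch: the host name and port below are assumptions, so check the actual control plane node name and apiserver port first. DRYRUN=echo prints the command instead of running it.

```shell
# Sketch: bypass the HAproxy FQDN and talk to one apiserver directly.
# Host name and port are hypothetical; verify them before use.
DRYRUN=echo
$DRYRUN kubectl --server=https://tools-k8s-control-1:6443 get nodes
```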