Wikimedia Cloud Services team/EnhancementProposals/Decision record T363683 kubernetes upgrade workgroup
Origin task: phab:T363683
Date of the decision: 2024-06-06
People in the decision meeting (alphabetical order): There was no meeting; the decision was made in the task
Decision taken
Option 2 was chosen with some comments:
- Create an overview wiki page for the coming upgrades
Starting workgroup members (alphabetical order)
- User:Arturo_Borrero_Gonzalez
- User:David_Caro
- Raymond Ndibe
- Slavina Stefanova
- User:FNegri
Rationale
We want to catch up with upstream Kubernetes versions. Because we can't skip versions yet, we have to put in extra effort to speed up until we catch up, and then keep pace once we are there.
Problem
We are several years behind on Kubernetes upgrades. To catch up, we need to upgrade faster than upstream publishes new releases for some time.
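As a rough illustration of what "faster than upstream" means in practice, the sketch below estimates how long catching up would take. It assumes the monthly cadence proposed in Option 2 and the roughly three upstream releases per year mentioned there, and that each upgrade moves exactly one minor version. The function name and the example gap of 6 versions are hypothetical and not taken from the task.

```python
import math


def months_to_catch_up(versions_behind, upgrades_per_year=12, upstream_releases_per_year=3):
    """Estimate the months needed to close the version gap.

    Assumes one minor version per upgrade (no version skipping) and a
    steady cadence on both sides; real upgrades may take longer.
    """
    # Net minor versions gained on upstream per year.
    net_gain_per_year = upgrades_per_year - upstream_releases_per_year
    if net_gain_per_year <= 0:
        raise ValueError("upgrade cadence must exceed the upstream release cadence")
    return math.ceil(versions_behind * 12 / net_gain_per_year)


# Hypothetical example: 6 minor versions behind, monthly upgrades.
print(months_to_catch_up(6))  # 8
```

With upgrades only as frequent as upstream releases, the gap never closes, which is the "Do nothing" outcome described in Option 1.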
Constraints and risks
- All the problems of running old software (security, bugs, stability, ...)
Extra info
- Current upgrade process documentation - https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes/Upgrading_Kubernetes
- Latest upgrade task - phab:T362869 (the rest are subtasks of it)
- Upstream releases - https://kubernetes.io/releases/
Decision record
In progress
Options
Option 1
Do nothing
Pros:
- No extra effort needed
Cons:
- We never catch up
Option 2
Create a dedicated opt-in workgroup to focus on monthly Kubernetes upgrades until we catch up (monthly is the aim; some upgrades might take longer), and continue with regular updates thereafter.
Pros:
- K8s upgrade progress greatly improves
- We spread upgrade knowledge in the team
- We set up a working group that can then take over the regular updates (3/year)
- Automation improvement and refinement
Cons:
- Considerable effort at times, when API deprecations affect us
- Are monthly updates compatible with other work streams?