Kubernetes/Clusters/Add or remove nodes
Intro
This is a guide for adding or removing nodes from existing Kubernetes clusters.
sre.k8s.roll-reimage-nodes cookbook.
Adding a node
Adding a node is a four-step process. First we add the node to BGP via our network configuration manager, Homer. Then we create three puppet patches, which we merge one by one.
Step 0: DNS
Make sure that the node's DNS is properly configured both for IPv4 and IPv6.
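A quick way to check this from a shell is sketched below. It assumes dig is installed where you run it, and the hostname in the usage comment is a placeholder. It only looks at forward records (A and AAAA), not reverse DNS.

```shell
# Check that a host has both an A (IPv4) and an AAAA (IPv6) record.
# Prints what is missing and returns non-zero if either lookup is empty.
check_node_dns() {
    host="$1"
    rc=0
    for rtype in A AAAA; do
        # dig +short prints one answer per line, or nothing if there is no record
        if [ -z "$(dig +short "$host" "$rtype")" ]; then
            echo "missing $rtype record for $host"
            rc=1
        fi
    done
    return "$rc"
}

# Usage (hypothetical hostname):
#   check_node_dns foo-node1001.eqiad.wmnet && echo "DNS looks fine"
```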
Step 1: Add node to BGP
We have a calico setup, so nodes need to be able to establish BGP with their top of rack switch or the core routers.
To do so:
- In Netbox, set the server's BGP custom field to True
- If needed, add the ToR switch to helmfile.d/admin_ng/values/common-bgp.yaml
  - the IPs are the gateways for the per-rack VLANs: search for something like private1-f4-codfw in Netbox
- On a Cumin host, run homer. The exact target depends on the node's location:
  - If it's a VM or if it's connected to eqiad/codfw rows A-D, target the core routers (cr*eqiad* or cr*codfw*)
  - If it's a physical server in eqiad row E/F, target its top of rack switch (eg. lsw1-e1-eqiad)
Expect BGP Status alerts between the homer commit and the reimage. They are not a big deal, but be aware and either keep going with the reimage as soon as possible, or do your homer commit after the reimage is done.
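The choice of homer target in this step can be sketched as a small shell helper. The function name and its arguments are made up for illustration. The ToR naming pattern lsw1-<rack>-<site> is an assumption extrapolated from the single example lsw1-e1-eqiad, so check the real switch name in Netbox before committing. Combinations not covered by the rules above are rejected rather than guessed.

```shell
# Print the homer target for a node.
# Arguments: site (eqiad|codfw), rack (e.g. a3, e1), kind (vm|physical).
homer_target() {
    site="$1"; rack="$2"; kind="$3"
    row="$(printf '%s' "$rack" | cut -c1)"
    case "$site" in
        eqiad|codfw) ;;
        *) echo "site not covered by this guide: $site" >&2; return 1 ;;
    esac
    # VMs always peer with the core routers.
    if [ "$kind" = "vm" ]; then
        echo "cr*${site}*"
        return 0
    fi
    case "$row" in
        a|b|c|d)
            # Rows A-D peer with the core routers.
            echo "cr*${site}*"
            ;;
        e|f)
            if [ "$site" = "eqiad" ]; then
                # ASSUMPTION: ToR switches are named lsw1-<rack>-<site>.
                echo "lsw1-${rack}-${site}"
            else
                echo "row $row in $site is not covered by this guide" >&2
                return 1
            fi
            ;;
        *)
            echo "unknown row: $row" >&2
            return 1
            ;;
    esac
}

# Usage on a Cumin host (task ID is a placeholder):
#   sudo homer "$(homer_target eqiad e1 physical)" commit 'YOUR TASK'
```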
Step 2: Node installation
- Prepare a patch to assign the kubernetes worker role to the nodes and add them to the list of kubernetes workers: https://gerrit.wikimedia.org/r/c/operations/puppet/+/958487
- Stop puppet on all new nodes
- Merge the patch
- Run puppet on the docker-registry hosts
- Run puppet on the new nodes, or reimage them if needed
- Command help:
sudo cumin 'A:docker-registry' 'run-puppet-agent -q'
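The order of operations around the merge can be sketched as two shell helpers, one to run before merging and one after. The function names are hypothetical. The sketch assumes the disable-puppet and enable-puppet helper scripts are available on the hosts and that the same reason string is passed to both. If a node needs a reimage instead, skip the second helper for that node and reimage it.

```shell
# Before merging the patch: stop puppet on the new nodes.
# ASSUMPTION: a disable-puppet helper script exists on the target hosts.
stop_puppet_on_new_nodes() {
    nodes="$1"; reason="$2"
    sudo cumin "$nodes" "disable-puppet '$reason'"
}

# After merging the patch: docker-registry first, then the new nodes.
# ASSUMPTION: an enable-puppet helper script exists on the target hosts.
run_puppet_after_merge() {
    nodes="$1"; reason="$2"
    sudo cumin 'A:docker-registry' 'run-puppet-agent -q' || return 1
    sudo cumin "$nodes" "enable-puppet '$reason'" || return 1
    sudo cumin "$nodes" 'run-puppet-agent -q'
}

# Usage on a Cumin host (hostnames and task ID are placeholders):
#   stop_puppet_on_new_nodes 'foo-node100[1-2].eqiad.wmnet' 'adding k8s workers - YOUR-TASK'
#   ... merge the patch ...
#   run_puppet_after_merge 'foo-node100[1-2].eqiad.wmnet' 'adding k8s workers - YOUR-TASK'
```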
Add node specific hiera data
If the node has Kubernetes-related special features, you can add them via hiera.
This can be done by creating the file hieradata/hosts/foo-node1001.yaml:
profile::kubernetes::node::kubelet_node_labels:
- label-bar/foo=value1
- label-foo/bar=value2
Note: In the past, we used this to populate region (datacentre) and zone (rack row). This is no longer needed; it is now done automatically.
Step 3: Add to conftool/LVS
If the Kubernetes cluster is exposing services via LVS, you need to add the node's FQDN to the cluster in conftool-data as well. For eqiad, edit conftool-data/node/eqiad.yaml like this:
eqiad:
  foo:
    [...]
    foo_node1001.eqiad.wmnet: [kubesvc]
# example: https://gerrit.wikimedia.org/r/c/operations/puppet/+/894701
Merge the change, and run puppet on the datacentre's LVS hosts.
Then, pool your nodes using conftool (check the weight of your cluster's nodes first):
sudo confctl select 'name=foo_node1001.eqiad.wmnet,cluster=kubernetes,service=kubesvc' set/weight=10
sudo confctl select 'name=foo_node1001.eqiad.wmnet,cluster=kubernetes,service=kubesvc' set/pooled=yes
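When several nodes are added at once, the two confctl calls can be wrapped in a helper. This is a minimal sketch: the function name is hypothetical, and the cluster and service values are copied from the commands in this step, so adjust them if your cluster uses different ones.

```shell
# Pool one node behind LVS: set its weight first, then mark it pooled.
pool_node() {
    fqdn="$1"; weight="$2"
    selector="name=${fqdn},cluster=kubernetes,service=kubesvc"
    sudo confctl select "$selector" "set/weight=${weight}" || return 1
    sudo confctl select "$selector" set/pooled=yes
}

# Usage (hostnames are placeholders):
#   # look at the weights of the nodes already in the cluster first
#   sudo confctl select 'cluster=kubernetes,service=kubesvc' get
#   for n in foo_node1001.eqiad.wmnet foo_node1002.eqiad.wmnet; do
#       pool_node "$n" 10 || break
#   done
```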
Done! You made it!
Please ensure you've followed all necessary steps from Server_Lifecycle#Staged_->_Active
Your node should now join the cluster and have workload scheduled automatically (like calico daemonsets). You can log in to a deploy server and check the status:
kubectl get nodes
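Instead of polling the node list by hand, you can block until a specific node reports Ready. The helper below is a sketch: the function name and the 600s default timeout are arbitrary choices, and it assumes your kubectl context already points at the right cluster.

```shell
# Wait until a node reports the Ready condition, then print its details.
wait_for_node() {
    node="$1"; timeout="${2:-600s}"
    kubectl wait --for=condition=Ready "node/${node}" --timeout="${timeout}" || return 1
    kubectl get node "${node}" -o wide
}

# Usage (hostname is a placeholder):
#   wait_for_node foo-node1001.eqiad.wmnet
```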
Removing a node
Drain workload
The first step in removing a node is to drain the workload from it and depool it. Use the pool-depool-node cookbook for that:
sudo cookbook sre.k8s.pool-depool-node --k8s-cluster wikikube-eqiad -r decom -t YOUR-TASK depool wikikube-worker[1002-1005].eqiad.wmnet
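After the cookbook finishes, you may want to confirm what is still running on a node before decommissioning it. The sketch below lists the pods bound to one node; the function name is hypothetical. Pods managed by daemonsets are typically still listed after a drain, so an empty result is not expected.

```shell
# List all pods, in any namespace, that are scheduled on the given node.
pods_on_node() {
    node="$1"
    kubectl get pods --all-namespaces --field-selector "spec.nodeName=${node}" -o wide
}

# Usage from a deploy server (hostname taken from the cookbook example above):
#   pods_on_node wikikube-worker1002.eqiad.wmnet
```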
Decommission
You can now follow the steps outlined in Server_Lifecycle#Active_->_Decommissioned
Make sure to:
- Run homer for the core routers of the DC you're decommissioning from, after the decom cookbook has completed (like homer 'cr*eqiad*' commit 'YOUR TASK').
- Remove any node specific hiera data (from Kubernetes/Clusters/Add_or_remove_nodes#Add node specific hiera data)
- Remove the node from the cluster_nodes list of workers (from Kubernetes/Clusters/Add or remove nodes#Step 2: Node installation)
Delete the node from Kubernetes API
The last remaining step is to delete the node from Kubernetes:
kubectl delete node foo-node1001.datacenter.wmnet
