Portal:Toolforge/Admin/Runbooks/IngressPodMisplaced
The procedures in this runbook require admin permissions to complete.
The IngressPodMisplaced alert fires when a Toolforge ingress-nginx pod is running on a non-ingress worker.
This generally happens when the ingress-nginx pods have to be replaced. The ingress pods are sized so that only one of them fits on a single worker, so a replacement pod may end up scheduled on a non-ingress worker instead.
Debugging
Check where the pods are running:
user@tools-bastion-NN:~ $ kubectl get pod -n ingress-nginx-gen2 -o wide
NAME                                             READY   STATUS    RESTARTS   AGE   IP                NODE                   NOMINATED NODE   READINESS GATES
ingress-nginx-gen2-controller-6967c4b878-9h9wv   1/1     Running   0          15d   192.168.166.63    tools-k8s-ingress-8    <none>           <none>
ingress-nginx-gen2-controller-6967c4b878-hc4zv   1/1     Running   0          15d   192.168.36.120    tools-k8s-worker-105   <none>           <none>
ingress-nginx-gen2-controller-6967c4b878-z5wtz   1/1     Running   0          15d   192.168.254.210   tools-k8s-ingress-9    <none>           <none>
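In the output above, the pod on tools-k8s-worker-105 is the misplaced one. To avoid reading the NODE column by eye, the listing can be filtered. The following is a minimal sketch, not an established part of this runbook: the helper name misplaced_ingress_pods is hypothetical, and it assumes that ingress workers are always named tools-k8s-ingress-N, as in the sample output.

```shell
# Hypothetical helper: reads "POD NODE" pairs on stdin and prints only the
# pods whose node name does not start with "tools-k8s-ingress-".
# Assumption: the ingress worker naming scheme matches the sample output.
misplaced_ingress_pods() {
  awk 'NF >= 2 && $2 !~ /^tools-k8s-ingress-/ { print $1, $2 }'
}
```

It could be fed from kubectl like this; empty output means every pod is on an ingress worker:

user@tools-bastion-NN:~ $ kubectl get pod -n ingress-nginx-gen2 --no-headers -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName | misplaced_ingress_pods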
Common issues
The simple fix is to delete the pod running on a non-ingress worker, at which point Kubernetes should re-create it on the correct node:
user@tools-bastion-NN:~ $ kubectl sudo delete pod -n ingress-nginx-gen2 ingress-nginx-gen2-controller-6967c4b878-hc4zv
If the replacement pod also gets scheduled on an incorrect node, you need to investigate further.
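As a starting point for that investigation, the commands below are one possible first pass, offered as a sketch rather than a documented procedure. The function name inspect_ingress_scheduling is hypothetical. It only wraps standard kubectl subcommands, and depending on your permissions some of them may need to be run through kubectl sudo.

```shell
# Hypothetical helper: gathers basic scheduling information for one ingress pod.
# Usage: inspect_ingress_scheduling <pod-name>
inspect_ingress_scheduling() {
  if [ -z "$1" ]; then
    echo "usage: inspect_ingress_scheduling <pod-name>" >&2
    return 1
  fi
  # Shows the pod's node selectors, tolerations and recent events,
  # including the scheduler's assignment.
  kubectl describe pod -n ingress-nginx-gen2 "$1"
  # Shows whether the ingress workers are Ready and not cordoned
  # (a cordoned node is listed as SchedulingDisabled).
  kubectl get nodes
}
```

If an ingress worker is NotReady or cordoned, or still hosts another ingress pod, the replacement cannot fit there, which would be consistent with the one-pod-per-worker sizing described above.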
