Portal:Toolforge/Admin/Runbooks/IstioGatewayPodMisplaced
The procedures in this runbook require admin permissions to complete.
The IstioGatewayPodMisplaced alert fires when a Toolforge Istio gateway pod is running on a non-gateway worker.
This usually happens when the gateway pods have to be replaced: the pods are sized so that only one of them fits on a single worker, so a replacement pod can end up on a non-gateway worker.
The related IngressPodMisplaced alert fires for the old ingress-nginx deployment for the same reasons, and will keep doing so until that deployment is decommissioned.
Debugging
Check where the pods are running:
user@tools-bastion-NN:~ $ kubectl get pod -n istio-gateway -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP                NODE                           NOMINATED NODE   READINESS GATES
toolforge-istio-85f7bff487-54fjg   1/1     Running   0          35m   192.168.88.132    toolsbeta-test-k8s-gateway-1   <none>           <none>
toolforge-istio-85f7bff487-cdm96   1/1     Running   0          36m   192.168.179.3     toolsbeta-test-k8s-worker-12   <none>           <none>
toolforge-istio-85f7bff487-hbbf2   1/1     Running   0          35m   192.168.234.197   toolsbeta-test-k8s-gateway-2   <none>           <none>
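With more than a handful of pods, the misplaced one can be hard to spot by eye. The following is a minimal sketch, not part of the official tooling, of a filter that prints only the pods sitting on a non-gateway node. The function name misplaced_pods is hypothetical, and the filter assumes that gateway workers can be recognised by "-gateway-" in the node name, as in the sample output above.

```shell
# Print the names of pods that are scheduled on a non-gateway node.
# Reads "POD NODE" pairs, one per line, on standard input.
# ASSUMPTION: gateway workers have "-gateway-" in their node name, as in the
# sample output above; adjust the pattern if the naming scheme differs.
# Pods that are not scheduled yet (node shown as <none>) are skipped.
misplaced_pods() {
    awk 'NF >= 2 && $2 != "<none>" && $2 !~ /-gateway-/ { print $1 }'
}

# Usage on a bastion (needs cluster access, so it is left commented out here):
#   kubectl get pod -n istio-gateway --no-headers \
#     -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName | misplaced_pods
```

For the sample output above this would print toolforge-istio-85f7bff487-cdm96. Using custom-columns rather than parsing the -o wide output avoids depending on column positions, which shift when the RESTARTS column contains a value such as "1 (5m ago)".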
Common issues
The simple fix is to delete the pod running on a non-gateway worker, at which point Kubernetes should re-create it on the correct node:
user@tools-bastion-NN:~ $ kubectl sudo delete pod -n istio-gateway toolforge-istio-85f7bff487-cdm96
If the replacement pod also gets scheduled on an incorrect node, you need to investigate further.
