Portal:Toolforge/Admin/Runbooks/k8s-haproxy

From Wikitech

These tests are defined here: https://gerrit.wikimedia.org/g/operations/puppet/+/c85d36363c69c1d01092726af556812138475857/modules/profile/manifests/toolforge/k8s/haproxy.pp#56

These checks verify the behavior of the toolforge web proxy. There are two different endpoints tested:

https://admin.toolforge.org/healthz

This tests a known, minimal k8s webservice that when working properly returns 'ok' at the healthz route.

When this alert fires it is probably because the tool has crashed. Restart on a toolforge bastion with:

$ sudo su -
# become admin
# webservice stop  # and now wait a minute, just doing a quick restart seems insufficient
# webservice start
# tail -f error.log # should be fairly quiet

https://this-tool-does-not-exist.toolforge.org/

As explained by the endpoint, this is a check of the proxy behavior when accessing a non-existent tool. It ought fall through to the 404 handler and report a useful "this tool does not exist" message.

As of 2022-09-02 this test is known to be flaky due to the 404 handler being a bit broken. A (apparently insufficient) attempt to fix this is https://gerrit.wikimedia.org/r/c/operations/puppet/+/826779/