Portal:Data Services/Admin/Runbooks/Depool wikireplicas
Wikireplicas sometimes need maintenance, fail, or otherwise need attention. This page details how to depool each stage of the path to a wikireplica database.
Overview
In order to reach a wikireplica server from Wikimedia Cloud Services, traffic crosses three proxy layers. That provides several opportunities for depooling and reshuffling parts of the system for maintenance and downtime. Operations take place at the proxy layers or, potentially, in DNS.
VM Proxy
Hostnames: clouddb-wikireplicas-proxy-*.clouddb-services.eqiad1.wikimedia.cloud
The VM proxies hold 8 IPs each so that they can route requests coming into port 3306 (mysql) according to DNS. The mysql wire protocol does not allow a client to identify a hostname or database before authenticating to the server, partly because the server talks first, so you have to hit the right server for the handshake. For users to be able to select their database via the set of DNS CNAMEs that point back to one of the 8 section (mariadb instance) names, s1 through s8, each proxy needs 8 IPs. This layer of haproxy translates the incoming IP into a port on the next layer (since each section corresponds to a non-standard port number).
Each IP is mapped to a DNS CNAME:
# clouddb-wikireplicas-proxy-1 s[1-8].web.db.svc.wikimedia.cloud
# clouddb-wikireplicas-proxy-2 s[1-8].analytics.db.svc.wikimedia.cloud
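To illustrate how clients end up on the right section, a Toolforge or Cloud VPS user connects to one of these CNAMEs directly. A minimal sketch (the replica.my.cnf credentials file and the enwiki_p database are the usual user-side examples, not something defined by this runbook):
mysql --defaults-file=$HOME/replica.my.cnf -h s1.web.db.svc.wikimedia.cloud enwiki_p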
At this time, you can depool a VM proxy quite simply, without changing anything in OpenStack, by editing modules/openstack/files/util/wikireplica_dns.yaml so that the analytics and web IPs for the various sections all match the VM proxy you are not depooling.
Next, as root on a cloudcontrol host, run:
run-puppet-agent
to get your puppet change, and then run:
wmcs-wikireplica-dns
This directs all traffic at the remaining proxy (as soon as DNS caches expire and all existing connections are done), which can take a fair while. If you use this method, waiting an hour once the change is in place and you have run wmcs-wikireplica-dns is not a bad idea. After you have waited, you can be confident that any remaining connections will be cut when you stop haproxy on the depooled node.
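To confirm the DNS side of the change, you can spot-check a couple of section names from a Cloud VPS host. A minimal sketch (any of the s1 through s8 names will do, and the answers should point at the proxy you kept pooled):
host s1.web.db.svc.wikimedia.cloud
host s1.analytics.db.svc.wikimedia.cloud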
Repool by reverting your puppet patch and doing the same steps above!
LVS servers
Hostnames: lvs*.eqiad.wmnet
WMCS shouldn't be concerned with the LVS servers themselves. The main thing to know here is that LVS may trigger an alert when an upstream dbproxy server goes down, so a basic understanding of our LVS setup is still useful. The short summary is that the LVS servers decide how to route packets received from the VM proxies (see previous section) to the hardware proxies (see next section). The VM proxies send packets to either the wikireplicas-a.wikimedia.org IP or the wikireplicas-b.wikimedia.org IP. To modify how those IPs are routed, you can use the confctl CLI:
root@cumin1001:~# confctl select "name=dbproxy1018.eqiad.wmnet" get
{"dbproxy1018.eqiad.wmnet": {"weight": 0, "pooled": "yes"}, "tags": "dc=eqiad,cluster=wikireplicas-a,service=wikireplicas-a"}
{"dbproxy1018.eqiad.wmnet": {"weight": 0, "pooled": "no"}, "tags": "dc=eqiad,cluster=wikireplicas-b,service=wikireplicas-b"}
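To change the pooled status (for example, depooling dbproxy1018 from the wikireplicas-a service), something along these lines should work. This is a sketch using the standard confctl select/set syntax, so verify the selector against the get output above before running it:
root@cumin1001:~# confctl select "service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet" set/pooled=no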
Hardware proxies
Hostnames: dbproxy*.eqiad.wmnet
These servers (currently dbproxy1018.eqiad.wmnet and dbproxy1019.eqiad.wmnet) are the last layer before you actually hit a database server. They each have 2 LVS public IPs; if you ssh to one and run ip a show dev lo you'll see those IPs. LVS decides which of the 2 hosts receives traffic for a given IP (see the previous section). Those IPs are also mapped to 2 DNS names: wikireplicas-a.wikimedia.org and wikireplicas-b.wikimedia.org.
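To see which public IP belongs to which name (useful when matching IPs in the Hiera config described below), a quick lookup is enough. A minimal sketch using dig, though host or any other resolver works too:
dig +short wikireplicas-a.wikimedia.org
dig +short wikireplicas-b.wikimedia.org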
You'll also see that those IPs are repeated 8 times under profile::wmcs::db::wikireplicas::section_backends in the Prefix Puppet Hiera Config for clouddb-wikireplicas-proxy-1 and clouddb-wikireplicas-proxy-2 (make sure you are in the clouddb-services Horizon project to see those). These values are applied to the HAProxy config on the VM proxies to route traffic to one dbproxy or the other in normal operation. To drop one of the dbproxies from operation, find the Hiera Config that contains the public IP mapped to the dbproxy you want to depool, and swap out that IP for the one that is not being depooled. To find the current mapping, you can use confctl as explained in the previous section.
When the Hiera Config is saved, HAProxy will reload on the next puppet run. You can run puppet yourself, just to be sure, on the clouddb-wikireplicas-proxy-? server whose Hiera config you changed. This directs all new connections to the other hardware proxy. It will not cut off existing connections unless you restart haproxy, which is likely fine for most cases, since connections should not last longer than an hour or so even on the analytics proxy. Connections that are still active after one hour can be forcefully terminated.
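For example, on the affected clouddb-wikireplicas-proxy-? VM, a minimal sketch (run-puppet-agent is the same wrapper used on the cloudcontrol host earlier; the restart is only needed if you must cut existing connections immediately):
sudo run-puppet-agent
sudo systemctl restart haproxy   # optional: cuts existing connections right away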
Before restarting or shutting down the depooled dbproxy, remember to modify its pooled status with confctl (as shown in the LVS section above); otherwise LVS will trigger an alert.
Wikireplica database servers
Finally, wikireplica database servers are generally depooled using Puppet host hiera applied in operations/puppet to the dbproxy hardware server in question. Remember that most database servers run 2 mariadb instances (sections) each. The configuration of the hardware proxy layer is what allows depooling of database servers (and other manipulation of traffic).
The basic, default configuration is pulled from PuppetDB based on puppetized settings of the database servers themselves. In the default config, each analytics database instance is the standby for the web service of the same section number, and each web database instance is the standby for the analytics instance of the same section number, using some reduce and merge trickery in Puppet. Depooling is accomplished by overriding what the reduce functions produce for the affected sections in the resulting hash, using the profile::mariadb::proxy::multiinstance_replicas::section_overrides key in host hiera, which is merged at the end of the process.
Instructions and examples are provided in the comments of the host hiera yaml; for instance, see dbproxy1018.yaml or dbproxy1019.yaml in operations/puppet. The big difference here is that you need to manually reload haproxy yourself once puppet runs.
At any time, you can see the effective, in-memory configuration and status of the haproxy server by running the following as root:
root@dbproxy1018:~# echo "show stat" | socat /run/haproxy/haproxy.sock stdio
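If you only want a quick overview of pooled and standby state rather than the full CSV, the output can be trimmed. A sketch assuming HAProxy's standard stats CSV layout (columns 1, 2, and 18 are pxname, svname, and status):
root@dbproxy1018:~# echo "show stat" | socat /run/haproxy/haproxy.sock stdio | cut -d, -f1,2,18 | column -s, -t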
Coordinate with the Data Persistence team to make sure there are no ongoing operations you could interfere with, of course.
An example of doing this is provided in the comments of each proxy server's host hiera file, both to make it clearer and to allow quicker depooling. Once the appropriate hiera file is changed, you will need to do the same reloading process as for the legacy replicas for it to take effect.
As a basic example, if you wanted to depool clouddb1015.eqiad.wmnet specifically, and it is OK to leave it as the secondary on dbproxy1018 (which it normally would be), then you'd add (or uncomment) the following in hieradata/hosts/dbproxy1019.yaml and reload haproxy on that proxy. This does the job by simply overwriting the s4 and s6 keys:
profile::mariadb::proxy::multiinstance_replicas::section_overrides:
  s4:
    clouddb1019.eqiad.wmnet:
      ipaddress: 10.64.48.9
  s6:
    clouddb1019.eqiad.wmnet:
      ipaddress: 10.64.48.9
After you merge your puppet patch, log into the target dbproxy101x server and run:
sudo -i puppet agent -t
sudo systemctl reload haproxy
Again, you can check that everything did what you expected with the socat command above, run as root.
Sidenote on the overrides hiera
Besides ipaddress, other valid keys are weight, which only really matters if you have more than one host under the section key; the boolean standby, which adds the host entry to haproxy as a standby under that section; and depooled, which expressly removes the host entry from the resulting file. The last one is unlikely to be necessary, since adding an uncommented section key (e.g. s2:) will overwrite the automatically generated one anyway. It could be useful one day, possibly if we start doing deep merges for the overrides; at this time, it is a shallow merge.
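To illustrate the other keys, a hypothetical override keeping two hosts under one section, one weighted and one as standby, might look like the sketch below. The hostnames and IP addresses are placeholders only; the key names come from the list above:
profile::mariadb::proxy::multiinstance_replicas::section_overrides:
  s1:
    clouddb1013.eqiad.wmnet:
      ipaddress: 10.64.0.10   # placeholder, use the host's real address
      weight: 100
    clouddb1017.eqiad.wmnet:
      ipaddress: 10.64.0.11   # placeholder, use the host's real address
      standby: true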
Support contacts
If you are following this, you are probably already a part of the WMCS or Data Persistence team. Perhaps you can ask the team you are not on if you need more help?