Server Admin Log/Archive 76

From Wikitech


2024-02-29

  • 22:37 foks: removing 4 files for legal compliance
  • 22:02 jdrewniak@deploy2002: Finished scap: Backport for Default to day mode (T358811) (duration: 10m 40s)
  • 21:57 mutante: phabricator - added STran to WMF-NDA (group 61) - T355388
  • 21:54 jdrewniak@deploy2002: jdlrobson and jdrewniak: Continuing with sync
  • 21:52 jdrewniak@deploy2002: jdlrobson and jdrewniak: Backport for Default to day mode (T358811) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:51 jdrewniak@deploy2002: Started scap: Backport for Default to day mode (T358811)
  • 21:50 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 08m 23s)
  • 21:42 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 08m 40s)
  • 21:31 mutante: phabricator - added Fring to WMF-NDA (group 61) - T358578
  • 21:29 mutante: phabricator - added Ifeatu_Nnaobi_WMDE to WMF-NDA (group 61) - T358578
  • 21:27 eileen: civicrm upgraded from aeffaf88 to dd378ea1
  • 21:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2114 (T352010)', diff saved to https://phabricator.wikimedia.org/P58268 and previous config saved to /var/cache/conftool/dbconfig/20240229-212602-ladsgroup.json
  • 21:25 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 21:25 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 21:24 mutante: LDAP - added uid ifeatunnaobiwmde (46162) to groups nda and wmde (T358091)
  • 21:23 mutante: LDAP - added uid member: uid=ifeatunnaobiwmde,ou=people,dc=wikimedia,dc=org
  • 21:13 jdrewniak@deploy2002: Finished scap: Backport for Performance Impact Assessment for Night Mode Style Correction (T358240) (duration: 09m 28s)
  • 21:05 jdrewniak@deploy2002: mabualruz and jdrewniak: Continuing with sync
  • 21:05 jdrewniak@deploy2002: mabualruz and jdrewniak: Backport for Performance Impact Assessment for Night Mode Style Correction (T358240) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:04 jdrewniak@deploy2002: Started scap: Backport for Performance Impact Assessment for Night Mode Style Correction (T358240)
  • 20:52 mutante: LDAP - added uid frri (43019) to groups nda and wmde (T358584)
  • 20:47 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2012.codfw.wmnet with OS bullseye
  • 20:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P58267 and previous config saved to /var/cache/conftool/dbconfig/20240229-202158-root.json
  • 20:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P58266 and previous config saved to /var/cache/conftool/dbconfig/20240229-200653-root.json
  • 19:58 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in cloudelastic
  • 19:58 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in cloudelastic
  • 19:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P58265 and previous config saved to /var/cache/conftool/dbconfig/20240229-195148-root.json
  • 19:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P58264 and previous config saved to /var/cache/conftool/dbconfig/20240229-193643-root.json
  • 19:35 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.20 refs T354438
  • 19:35 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
  • 19:32 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: host reimage
  • 19:14 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host lvs2012.codfw.wmnet with OS bullseye
  • 19:07 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:07 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for lvs2012 - cmooney@cumin1002"
  • 19:06 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) lvs2012.codfw.wmnet on all recursors
  • 19:06 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache lvs2012.codfw.wmnet on all recursors
  • 19:06 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for lvs2012 - cmooney@cumin1002"
  • 19:03 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 18:59 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org,service=(recdns|ntp)
  • 18:58 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org,service=recdns
  • 18:57 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org,service=ntp
  • 18:40 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr[1-2]-codfw with reason: lvs moves to per-rack vlans
  • 18:40 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr[1-2]-codfw with reason: lvs moves to per-rack vlans
  • 18:37 topranks: disabling PyBal on lvs2012 to move traffic to lvs2014 ahead of reimage T352918
  • 18:13 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: Moving lvs2012 primary interface from private1-b-codfw to private1-b2-codfw
  • 18:13 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 18:12 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2012.codfw.wmnet with reason: Moving lvs2012 primary interface from private1-b-codfw to private1-b2-codfw
  • 18:12 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 18:12 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 18:12 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 18:11 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 18:11 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 18:10 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 'sretest1001.eqiad.wmnet$' on ulsfo recursors
  • 18:10 volans@cumin1002: START - Cookbook sre.dns.wipe-cache 'sretest1001.eqiad.wmnet$' on ulsfo recursors
  • 18:06 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest1001.eqiad.wmnet sretest1002.eqiad.wmnet on all recursors
  • 18:05 volans@cumin1002: START - Cookbook sre.dns.wipe-cache sretest1001.eqiad.wmnet sretest1002.eqiad.wmnet on all recursors
  • 16:53 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-worker1173.eqiad.wmnet with reason: Investigating disk errors
  • 16:53 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-worker1173.eqiad.wmnet with reason: Investigating disk errors
  • 16:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply
  • 16:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
  • 16:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 16:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 16:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 16:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 16:42 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 16:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 16:40 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
  • 16:40 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
  • 16:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 100%: Exercise over', diff saved to https://phabricator.wikimedia.org/P58262 and previous config saved to /var/cache/conftool/dbconfig/20240229-163459-root.json
  • 16:27 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
  • 16:26 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
  • 16:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 75%: Exercise over', diff saved to https://phabricator.wikimedia.org/P58261 and previous config saved to /var/cache/conftool/dbconfig/20240229-161954-root.json
  • 16:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 100%: After recloning', diff saved to https://phabricator.wikimedia.org/P58260 and previous config saved to /var/cache/conftool/dbconfig/20240229-161629-root.json
  • 16:12 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 16:12 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 16:10 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
  • 16:09 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
  • 16:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 16:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 16:05 topranks: Commencing network maintenance migrating servers to new switch codfw rack B7 T355872
  • 16:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 25%: Exercise over', diff saved to https://phabricator.wikimedia.org/P58259 and previous config saved to /var/cache/conftool/dbconfig/20240229-160449-root.json
  • 16:02 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 9 hosts with reason: Migrating servers in codfw rack B7 to lsw1-b7-codfw
  • 16:02 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 9 hosts with reason: Migrating servers in codfw rack B7 to lsw1-b7-codfw
  • 16:01 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b7-codfw with reason: prepping for server uplink migration codfw rack b7
  • 16:01 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b7-codfw with reason: prepping for server uplink migration codfw rack b7
  • 16:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 75%: After recloning', diff saved to https://phabricator.wikimedia.org/P58258 and previous config saved to /var/cache/conftool/dbconfig/20240229-160124-root.json
  • 15:59 topranks: configuring lsw1-b7-codfw in advance of server migration T355872
  • 15:52 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
  • 15:52 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
  • 15:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 10%: Exercise over', diff saved to https://phabricator.wikimedia.org/P58257 and previous config saved to /var/cache/conftool/dbconfig/20240229-154944-root.json
  • 15:48 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 15:48 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
  • 15:46 marostegui@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 50%: After recloning', diff saved to https://phabricator.wikimedia.org/P58256 and previous config saved to /var/cache/conftool/dbconfig/20240229-154619-root.json
  • 15:46 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
  • 15:43 moritzm: installing tar security updates
  • 15:40 swfrench@cumin2002: dbctl commit (dc=all): 'Depooling db1213 for exercise', diff saved to https://phabricator.wikimedia.org/P58255 and previous config saved to /var/cache/conftool/dbconfig/20240229-154005-swfrench.json
  • 15:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T354015)', diff saved to https://phabricator.wikimedia.org/P58254 and previous config saved to /var/cache/conftool/dbconfig/20240229-153658-marostegui.json
  • 15:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 15:36 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 15:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354015)', diff saved to https://phabricator.wikimedia.org/P58253 and previous config saved to /var/cache/conftool/dbconfig/20240229-153646-marostegui.json
  • 15:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 25%: After recloning', diff saved to https://phabricator.wikimedia.org/P58252 and previous config saved to /var/cache/conftool/dbconfig/20240229-153115-root.json
  • 15:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P58251 and previous config saved to /var/cache/conftool/dbconfig/20240229-152139-marostegui.json
  • 15:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 10%: After recloning', diff saved to https://phabricator.wikimedia.org/P58250 and previous config saved to /var/cache/conftool/dbconfig/20240229-151610-root.json
  • 15:15 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1213.eqiad.wmnet with reason: Maint test
  • 15:15 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db1213.eqiad.wmnet with reason: Maint test
  • 15:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P58248 and previous config saved to /var/cache/conftool/dbconfig/20240229-150632-marostegui.json
  • 15:02 Daimona: T357007 Running mwscript CampaignEvents:GenerateInvitationList --wiki=metawiki --listfile=/home/daimona/list.txt
  • 15:01 marostegui@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 5%: After recloning', diff saved to https://phabricator.wikimedia.org/P58247 and previous config saved to /var/cache/conftool/dbconfig/20240229-150105-root.json
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354015)', diff saved to https://phabricator.wikimedia.org/P58246 and previous config saved to /var/cache/conftool/dbconfig/20240229-145125-marostegui.json
  • 14:44 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:44 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Remove unused Phan suppression, Bump special-new-lexeme, fix redirect without temp user (T358754) (duration: 10m 08s)
  • 14:41 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
  • 14:41 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
  • 14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Continuing with sync
  • 14:36 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde: Backport for Remove unused Phan suppression, Bump special-new-lexeme, fix redirect without temp user (T358754) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:34 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Remove unused Phan suppression, Bump special-new-lexeme, fix redirect without temp user (T358754)
  • 14:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 backport Canceled
  • 14:12 urbanecm@deploy2002: Finished scap: Backport for cswiki, commonswiki, enwiki: Lift IP cap for Women in Science Editathon (T358755) (duration: 09m 42s)
  • 14:04 urbanecm@deploy2002: anzx and urbanecm: Continuing with sync
  • 14:04 urbanecm@deploy2002: anzx and urbanecm: Backport for cswiki, commonswiki, enwiki: Lift IP cap for Women in Science Editathon (T358755) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:03 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 14:03 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 14:02 urbanecm@deploy2002: Started scap: Backport for cswiki, commonswiki, enwiki: Lift IP cap for Women in Science Editathon (T358755)
  • 14:02 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jfishback out of all services on: 8 hosts
  • 14:02 root@cumin2002: START - Cookbook sre.idm.logout Logging Jfishback out of all services on: 8 hosts
  • 14:02 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
  • 14:02 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
  • 14:01 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
  • 14:01 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
  • 12:57 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 12:57 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 12:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T352010)', diff saved to https://phabricator.wikimedia.org/P58245 and previous config saved to /var/cache/conftool/dbconfig/20240229-125723-ladsgroup.json
  • 12:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P58244 and previous config saved to /var/cache/conftool/dbconfig/20240229-124215-ladsgroup.json
  • 12:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P58242 and previous config saved to /var/cache/conftool/dbconfig/20240229-122709-ladsgroup.json
  • 12:16 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
  • 12:14 cmooney@cumin1002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
  • 12:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T352010)', diff saved to https://phabricator.wikimedia.org/P58240 and previous config saved to /var/cache/conftool/dbconfig/20240229-121202-ladsgroup.json
  • 12:04 kharlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 12:03 kharlan@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 12:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T357189)', diff saved to https://phabricator.wikimedia.org/P58239 and previous config saved to /var/cache/conftool/dbconfig/20240229-120335-arnaudb.json
  • 12:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 12:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 12:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T357189)', diff saved to https://phabricator.wikimedia.org/P58238 and previous config saved to /var/cache/conftool/dbconfig/20240229-120312-arnaudb.json
  • 12:02 kharlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 12:01 kharlan@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 12:00 kharlan@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 12:00 kharlan@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 11:55 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-mcrouter: apply
  • 11:55 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-mcrouter: apply
  • 11:55 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
  • 11:55 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
  • 11:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 11:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 11:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P58236 and previous config saved to /var/cache/conftool/dbconfig/20240229-114806-arnaudb.json
  • 11:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 11:47 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 11:36 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
  • 11:36 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
  • 11:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 11:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 11:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P58235 and previous config saved to /var/cache/conftool/dbconfig/20240229-113259-arnaudb.json
  • 11:27 jiji@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-mcrouter: apply
  • 11:27 jiji@deploy2002: helmfile [staging] START helmfile.d/services/mw-mcrouter: apply
  • 11:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T357189)', diff saved to https://phabricator.wikimedia.org/P58234 and previous config saved to /var/cache/conftool/dbconfig/20240229-111753-arnaudb.json
  • 11:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T357189)', diff saved to https://phabricator.wikimedia.org/P58233 and previous config saved to /var/cache/conftool/dbconfig/20240229-111247-arnaudb.json
  • 11:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 11:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 11:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T357189)', diff saved to https://phabricator.wikimedia.org/P58232 and previous config saved to /var/cache/conftool/dbconfig/20240229-111215-arnaudb.json
  • 11:11 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2156.codfw.wmnet onto db2190.codfw.wmnet
  • 10:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P58231 and previous config saved to /var/cache/conftool/dbconfig/20240229-105708-arnaudb.json
  • 10:44 marostegui@cumin1002: dbctl commit (dc=all): 'es2034 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58230 and previous config saved to /var/cache/conftool/dbconfig/20240229-104437-root.json
  • 10:42 arnaudb@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58229 and previous config saved to /var/cache/conftool/dbconfig/20240229-104223-arnaudb.json
  • 10:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P58228 and previous config saved to /var/cache/conftool/dbconfig/20240229-104202-arnaudb.json
  • 10:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58227 and previous config saved to /var/cache/conftool/dbconfig/20240229-103431-arnaudb.json
  • 10:29 marostegui@cumin1002: dbctl commit (dc=all): 'es2034 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58226 and previous config saved to /var/cache/conftool/dbconfig/20240229-102932-root.json
  • 10:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58225 and previous config saved to /var/cache/conftool/dbconfig/20240229-102719-arnaudb.json
  • 10:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T357189)', diff saved to https://phabricator.wikimedia.org/P58224 and previous config saved to /var/cache/conftool/dbconfig/20240229-102656-arnaudb.json
  • 10:26 brouberol@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-tool1005.eqiad.wmnet
  • 10:26 brouberol@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:26 brouberol@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-tool1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1002"
  • 10:24 brouberol@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-tool1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1002"
  • 10:24 claime: Cordoning kubernetes2023.codfw.wmnet for vlan change cookbook tests - T350152
  • 10:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T357189)', diff saved to https://phabricator.wikimedia.org/P58223 and previous config saved to /var/cache/conftool/dbconfig/20240229-102143-arnaudb.json
  • 10:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 10:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 10:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 10:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 10:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T357189)', diff saved to https://phabricator.wikimedia.org/P58222 and previous config saved to /var/cache/conftool/dbconfig/20240229-102102-arnaudb.json
  • 10:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58221 and previous config saved to /var/cache/conftool/dbconfig/20240229-101926-arnaudb.json
  • 10:17 joal@deploy2002: Finished deploy [analytics/refinery@6e8f25b] (hadoop-test): Additional analytics weekly train - TEST [analytics/refinery@6e8f25b3] (duration: 03m 41s)
  • 10:14 marostegui@cumin1002: dbctl commit (dc=all): 'es2034 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58220 and previous config saved to /var/cache/conftool/dbconfig/20240229-101427-root.json
  • 10:13 joal@deploy2002: Started deploy [analytics/refinery@6e8f25b] (hadoop-test): Additional analytics weekly train - TEST [analytics/refinery@6e8f25b3]
  • 10:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58219 and previous config saved to /var/cache/conftool/dbconfig/20240229-101214-arnaudb.json
  • 10:11 joal@deploy2002: Finished deploy [analytics/refinery@6e8f25b] (thin): Additional analytics weekly train - THIN [analytics/refinery@6e8f25b3] (duration: 00m 05s)
  • 10:11 joal@deploy2002: Started deploy [analytics/refinery@6e8f25b] (thin): Additional analytics weekly train - THIN [analytics/refinery@6e8f25b3]
  • 10:11 joal@deploy2002: Finished deploy [analytics/refinery@6e8f25b]: Additional analytics weekly train [analytics/refinery@6e8f25b3] (duration: 11m 39s)
  • 10:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P58218 and previous config saved to /var/cache/conftool/dbconfig/20240229-100556-arnaudb.json
  • 10:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58217 and previous config saved to /var/cache/conftool/dbconfig/20240229-100421-arnaudb.json
  • 09:59 marostegui@cumin1002: dbctl commit (dc=all): 'es2034 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58216 and previous config saved to /var/cache/conftool/dbconfig/20240229-095923-root.json
  • 09:59 joal@deploy2002: Started deploy [analytics/refinery@6e8f25b]: Additional analytics weekly train [analytics/refinery@6e8f25b3]
  • 09:59 arnaudb@cumin1002: dbctl commit (dc=all): 'T356240 ', diff saved to https://phabricator.wikimedia.org/P58215 and previous config saved to /var/cache/conftool/dbconfig/20240229-095918-arnaudb.json
  • 09:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db2117.codfw.wmnet with reason: Silence for maintenance T356240
  • 09:57 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db2117.codfw.wmnet with reason: Silence for maintenance T356240
  • 09:57 arnaudb@cumin1002: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58214 and previous config saved to /var/cache/conftool/dbconfig/20240229-095709-arnaudb.json
  • 09:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1224.eqiad.wmnet
  • 09:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2218 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P58213 and previous config saved to /var/cache/conftool/dbconfig/20240229-095425-root.json
  • 09:51 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1224.eqiad.wmnet
  • 09:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db1224.eqiad.wmnet with reason: Silence for maintenance T356240
  • 09:51 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db1224.eqiad.wmnet with reason: Silence for maintenance T356240
  • 09:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P58212 and previous config saved to /var/cache/conftool/dbconfig/20240229-095049-arnaudb.json
  • 09:49 arnaudb@cumin1002: dbctl commit (dc=all): 'T356240 reboot', diff saved to https://phabricator.wikimedia.org/P58211 and previous config saved to /var/cache/conftool/dbconfig/20240229-094945-arnaudb.json
  • 09:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db1201 (re)pooling @ 25%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58210 and previous config saved to /var/cache/conftool/dbconfig/20240229-094915-arnaudb.json
  • 09:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T352010)', diff saved to https://phabricator.wikimedia.org/P58209 and previous config saved to /var/cache/conftool/dbconfig/20240229-094429-ladsgroup.json
  • 09:44 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 09:44 marostegui@cumin1002: dbctl commit (dc=all): 'es2034 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58208 and previous config saved to /var/cache/conftool/dbconfig/20240229-094418-root.json
  • 09:44 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 09:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1201.eqiad.wmnet
  • 09:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2218 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P58207 and previous config saved to /var/cache/conftool/dbconfig/20240229-093921-root.json
  • 09:36 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1201.eqiad.wmnet
  • 09:35 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T357189)', diff saved to https://phabricator.wikimedia.org/P58206 and previous config saved to /var/cache/conftool/dbconfig/20240229-093543-arnaudb.json
  • 09:34 brouberol@cumin1002: START - Cookbook sre.dns.netbox
  • 09:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T357189)', diff saved to https://phabricator.wikimedia.org/P58205 and previous config saved to /var/cache/conftool/dbconfig/20240229-093025-arnaudb.json
  • 09:30 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 09:30 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 09:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T357189)', diff saved to https://phabricator.wikimedia.org/P58204 and previous config saved to /var/cache/conftool/dbconfig/20240229-093003-arnaudb.json
  • 09:29 marostegui@cumin1002: dbctl commit (dc=all): 'Promote back es2034 to es3 codfw master T358180', diff saved to https://phabricator.wikimedia.org/P58203 and previous config saved to /var/cache/conftool/dbconfig/20240229-092929-marostegui.json
  • 09:29 marostegui@cumin1002: dbctl commit (dc=all): 'es2034 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58202 and previous config saved to /var/cache/conftool/dbconfig/20240229-092913-root.json
  • 09:28 arnaudb@cumin1002: dbctl commit (dc=all): 'depooling for maintenance - reboot', diff saved to https://phabricator.wikimedia.org/P58201 and previous config saved to /var/cache/conftool/dbconfig/20240229-092853-arnaudb.json
  • 09:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2034.codfw.wmnet with OS bookworm
  • 09:26 brouberol@cumin1002: START - Cookbook sre.hosts.decommission for hosts an-tool1005.eqiad.wmnet
  • 09:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db2144.codfw.wmnet,db1201.eqiad.wmnet with reason: Silence for maintenance T356240
  • 09:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db2144.codfw.wmnet,db1201.eqiad.wmnet with reason: Silence for maintenance T356240
  • 09:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2218 (re)pooling @ 50%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P58200 and previous config saved to /var/cache/conftool/dbconfig/20240229-092416-root.json
  • 09:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P58199 and previous config saved to /var/cache/conftool/dbconfig/20240229-091457-arnaudb.json
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2218 (re)pooling @ 25%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P58198 and previous config saved to /var/cache/conftool/dbconfig/20240229-090911-root.json
  • 09:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2034.codfw.wmnet with reason: host reimage
  • 09:07 ladsgroup@cumin1002: START - Cookbook sre.mysql.clone of db2156.codfw.wmnet onto db2190.codfw.wmnet
  • 09:05 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2034.codfw.wmnet with reason: host reimage
  • 08:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P58197 and previous config saved to /var/cache/conftool/dbconfig/20240229-085951-arnaudb.json
  • 08:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2218 (re)pooling @ 10%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P58196 and previous config saved to /var/cache/conftool/dbconfig/20240229-085406-root.json
  • 08:52 kartik@deploy2002: Finished scap: Backport for Section Translation: Add 'nb' in target language code (T353734) (duration: 12m 45s)
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'es1034 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58195 and previous config saved to /var/cache/conftool/dbconfig/20240229-085021-root.json
  • 08:47 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2034.codfw.wmnet with OS bookworm
  • 08:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2034 T358180', diff saved to https://phabricator.wikimedia.org/P58194 and previous config saved to /var/cache/conftool/dbconfig/20240229-084541-root.json
  • 08:45 marostegui@cumin1002: dbctl commit (dc=all): 'Promote back es2029 to es3 codfw master T358180', diff saved to https://phabricator.wikimedia.org/P58193 and previous config saved to /var/cache/conftool/dbconfig/20240229-084502-marostegui.json
  • 08:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T357189)', diff saved to https://phabricator.wikimedia.org/P58192 and previous config saved to /var/cache/conftool/dbconfig/20240229-084444-arnaudb.json
  • 08:44 kartik@deploy2002: kartik: Continuing with sync
  • 08:40 kartik@deploy2002: kartik: Backport for Section Translation: Add 'nb' in target language code (T353734) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T357189)', diff saved to https://phabricator.wikimedia.org/P58191 and previous config saved to /var/cache/conftool/dbconfig/20240229-083928-arnaudb.json
  • 08:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 08:39 kartik@deploy2002: Started scap: Backport for Section Translation: Add 'nb' in target language code (T353734)
  • 08:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 08:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2218 (re)pooling @ 5%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P58190 and previous config saved to /var/cache/conftool/dbconfig/20240229-083901-root.json
  • 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'es1034 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58189 and previous config saved to /var/cache/conftool/dbconfig/20240229-083517-root.json
  • 08:26 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T354015)', diff saved to https://phabricator.wikimedia.org/P58188 and previous config saved to /var/cache/conftool/dbconfig/20240229-082602-marostegui.json
  • 08:25 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 08:25 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'es1034 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58185 and previous config saved to /var/cache/conftool/dbconfig/20240229-080507-root.json
  • 08:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58184 and previous config saved to /var/cache/conftool/dbconfig/20240229-080449-root.json
  • 08:04 kartik@deploy2002: kartik: Backport for Enable Section translation on Wikipedias with Content Translation available as default (T351882) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:03 kartik@deploy2002: Started scap: Backport for Enable Section translation on Wikipedias with Content Translation available as default (T351882)
  • 07:51 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host idp-test1003.wikimedia.org with OS bookworm
  • 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'es1034 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58183 and previous config saved to /var/cache/conftool/dbconfig/20240229-075002-root.json
  • 07:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58182 and previous config saved to /var/cache/conftool/dbconfig/20240229-074944-root.json
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'Promote back es1034 to es3 eqiad master T358180', diff saved to https://phabricator.wikimedia.org/P58181 and previous config saved to /var/cache/conftool/dbconfig/20240229-073523-marostegui.json
  • 07:35 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage
  • 07:34 marostegui@cumin1002: dbctl commit (dc=all): 'es1034 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58180 and previous config saved to /var/cache/conftool/dbconfig/20240229-073457-root.json
  • 07:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58179 and previous config saved to /var/cache/conftool/dbconfig/20240229-073440-root.json
  • 07:32 slyngshede@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage
  • 07:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58178 and previous config saved to /var/cache/conftool/dbconfig/20240229-071935-root.json
  • 07:19 slyngshede@cumin1002: START - Cookbook sre.hosts.reimage for host idp-test1003.wikimedia.org with OS bookworm
  • 07:15 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2003.codfw.wmnet with reason: sretest
  • 07:14 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2003.codfw.wmnet with reason: sretest
  • 07:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1034.eqiad.wmnet with OS bookworm
  • 07:04 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58177 and previous config saved to /var/cache/conftool/dbconfig/20240229-070430-root.json
  • 06:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1034.eqiad.wmnet with reason: host reimage
  • 06:50 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1034.eqiad.wmnet with reason: host reimage
  • 06:49 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 5%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58176 and previous config saved to /var/cache/conftool/dbconfig/20240229-064925-root.json
  • 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'Pool db2218 with 5% weight only', diff saved to https://phabricator.wikimedia.org/P58175 and previous config saved to /var/cache/conftool/dbconfig/20240229-064402-marostegui.json
  • 06:37 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1034.eqiad.wmnet with OS bookworm
  • 06:36 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on es1034.eqiad.wmnet with reason: Reimage
  • 06:35 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 4:00:00 on es1034.eqiad.wmnet with reason: Reimage
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1034 T358180', diff saved to https://phabricator.wikimedia.org/P58174 and previous config saved to /var/cache/conftool/dbconfig/20240229-063502-root.json
  • 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'db1169 (re)pooling @ 1%: After optimizing revision table', diff saved to https://phabricator.wikimedia.org/P58173 and previous config saved to /var/cache/conftool/dbconfig/20240229-063420-root.json
  • 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2118 from dbctl', diff saved to https://phabricator.wikimedia.org/P58172 and previous config saved to /var/cache/conftool/dbconfig/20240229-063412-marostegui.json
  • 06:26 marostegui@cumin1002: dbctl commit (dc=all): 'Pool db2218 with 1% weight only T358421 T355422', diff saved to https://phabricator.wikimedia.org/P58171 and previous config saved to /var/cache/conftool/dbconfig/20240229-062601-marostegui.json
  • 06:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 06:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 06:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P58170 and previous config saved to /var/cache/conftool/dbconfig/20240229-060721-ladsgroup.json
  • 05:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P58169 and previous config saved to /var/cache/conftool/dbconfig/20240229-055215-ladsgroup.json
  • 05:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P58168 and previous config saved to /var/cache/conftool/dbconfig/20240229-053708-ladsgroup.json
  • 05:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P58167 and previous config saved to /var/cache/conftool/dbconfig/20240229-052202-ladsgroup.json
  • 04:30 TimStarling: on mwmaint2002 running migrateBlocks.php on all wikis
  • 04:19 tstarling@deploy2002: Synchronized wmf-config/CommonSettings.php: Switch block schema to read-old/write-both mode T355034 (duration: 08m 47s)
  • 03:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1224 (T352010)', diff saved to https://phabricator.wikimedia.org/P58165 and previous config saved to /var/cache/conftool/dbconfig/20240229-030309-ladsgroup.json
  • 03:03 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 03:02 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1224.eqiad.wmnet with reason: Maintenance
  • 03:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P58164 and previous config saved to /var/cache/conftool/dbconfig/20240229-030247-ladsgroup.json
  • 02:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P58163 and previous config saved to /var/cache/conftool/dbconfig/20240229-024741-ladsgroup.json
  • 02:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P58162 and previous config saved to /var/cache/conftool/dbconfig/20240229-023234-ladsgroup.json
  • 02:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P58161 and previous config saved to /var/cache/conftool/dbconfig/20240229-021728-ladsgroup.json
  • 01:50 Krinkle: ruwiktionary `UPDATE page SET page_namespace=1,page_title=CONCAT('Broken/NS2303:',page_title) WHERE page_id=2469241 AND page_namespace=2303;` T31272
  • 01:49 Krinkle: ruwiktionary `UPDATE page SET page_namespace=1,page_title=CONCAT('Broken/NS2301:',page_title) WHERE page_id=2469240 AND page_namespace=2301` T31272
  • 00:59 ayounsi@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host testvm2006.codfw.wmnet
  • 00:59 ayounsi@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host testvm2006.codfw.wmnet with OS bookworm

2024-02-28

  • 23:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T352010)', diff saved to https://phabricator.wikimedia.org/P58159 and previous config saved to /var/cache/conftool/dbconfig/20240228-232800-ladsgroup.json
  • 23:27 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 23:27 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 23:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T357189)', diff saved to https://phabricator.wikimedia.org/P58158 and previous config saved to /var/cache/conftool/dbconfig/20240228-230015-arnaudb.json
  • 22:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P58157 and previous config saved to /var/cache/conftool/dbconfig/20240228-224508-arnaudb.json
  • 22:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P58156 and previous config saved to /var/cache/conftool/dbconfig/20240228-223002-arnaudb.json
  • 22:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T357189)', diff saved to https://phabricator.wikimedia.org/P58155 and previous config saved to /var/cache/conftool/dbconfig/20240228-221456-arnaudb.json
  • 22:14 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic2043*,2044*,2079*,2080* for switch maintenance - bking@cumin2002 - T355872
  • 22:13 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic2043*,2044*,2079*,2080* for switch maintenance - bking@cumin2002 - T355872
  • 21:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T357189)', diff saved to https://phabricator.wikimedia.org/P58154 and previous config saved to /var/cache/conftool/dbconfig/20240228-211823-arnaudb.json
  • 21:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 21:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 21:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T357189)', diff saved to https://phabricator.wikimedia.org/P58153 and previous config saved to /var/cache/conftool/dbconfig/20240228-211801-arnaudb.json
  • 21:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P58152 and previous config saved to /var/cache/conftool/dbconfig/20240228-210254-arnaudb.json
  • 20:53 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 20:53 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 20:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P58151 and previous config saved to /var/cache/conftool/dbconfig/20240228-205308-ladsgroup.json
  • 20:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P58150 and previous config saved to /var/cache/conftool/dbconfig/20240228-204748-arnaudb.json
  • 20:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P58149 and previous config saved to /var/cache/conftool/dbconfig/20240228-203802-ladsgroup.json
  • 20:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T357189)', diff saved to https://phabricator.wikimedia.org/P58148 and previous config saved to /var/cache/conftool/dbconfig/20240228-203241-arnaudb.json
  • 20:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T357189)', diff saved to https://phabricator.wikimedia.org/P58147 and previous config saved to /var/cache/conftool/dbconfig/20240228-202435-arnaudb.json
  • 20:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 20:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 20:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T357189)', diff saved to https://phabricator.wikimedia.org/P58146 and previous config saved to /var/cache/conftool/dbconfig/20240228-202413-arnaudb.json
  • 20:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P58145 and previous config saved to /var/cache/conftool/dbconfig/20240228-202256-ladsgroup.json
  • 20:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P58144 and previous config saved to /var/cache/conftool/dbconfig/20240228-200906-arnaudb.json
  • 20:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P58143 and previous config saved to /var/cache/conftool/dbconfig/20240228-200748-ladsgroup.json
  • 19:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P58142 and previous config saved to /var/cache/conftool/dbconfig/20240228-195400-arnaudb.json
  • 19:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T357189)', diff saved to https://phabricator.wikimedia.org/P58141 and previous config saved to /var/cache/conftool/dbconfig/20240228-193854-arnaudb.json
  • 19:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T357189)', diff saved to https://phabricator.wikimedia.org/P58140 and previous config saved to /var/cache/conftool/dbconfig/20240228-193133-arnaudb.json
  • 19:31 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 19:31 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 19:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138 (T357189)', diff saved to https://phabricator.wikimedia.org/P58139 and previous config saved to /var/cache/conftool/dbconfig/20240228-193111-arnaudb.json
  • 19:22 dduvall@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.20 refs T354438 (duration: 08m 37s)
  • 19:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P58138 and previous config saved to /var/cache/conftool/dbconfig/20240228-191605-arnaudb.json
  • 19:14 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2118.codfw.wmnet onto db2218.codfw.wmnet
  • 19:14 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.20 refs T354438
  • 19:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138', diff saved to https://phabricator.wikimedia.org/P58136 and previous config saved to /var/cache/conftool/dbconfig/20240228-190059-arnaudb.json
  • 18:50 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 18:49 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 18:49 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 18:49 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 18:49 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 18:48 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 18:46 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov1006.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138 (T357189)', diff saved to https://phabricator.wikimedia.org/P58135 and previous config saved to /var/cache/conftool/dbconfig/20240228-184552-arnaudb.json
  • 18:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2138 (T357189)', diff saved to https://phabricator.wikimedia.org/P58134 and previous config saved to /var/cache/conftool/dbconfig/20240228-183915-arnaudb.json
  • 18:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 18:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 18:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T357189)', diff saved to https://phabricator.wikimedia.org/P58133 and previous config saved to /var/cache/conftool/dbconfig/20240228-183853-arnaudb.json
  • 18:34 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov1005.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P58132 and previous config saved to /var/cache/conftool/dbconfig/20240228-182347-arnaudb.json
  • 18:14 vriley@cumin1002: START - Cookbook sre.hosts.provision for host dbprov1006.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:14 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:13 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt dbprov1006 - vriley@cumin1002"
  • 18:13 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt dbprov1006 - vriley@cumin1002"
  • 18:10 sbailey@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 18:10 sbailey@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 18:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P58131 and previous config saved to /var/cache/conftool/dbconfig/20240228-180840-arnaudb.json
  • 18:08 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 17:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T357189)', diff saved to https://phabricator.wikimedia.org/P58130 and previous config saved to /var/cache/conftool/dbconfig/20240228-175333-arnaudb.json
  • 17:52 sbailey@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
  • 17:52 sbailey@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
  • 17:49 vriley@cumin1002: START - Cookbook sre.hosts.provision for host dbprov1005.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:48 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:48 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt dbprov1005 - vriley@cumin1002"
  • 17:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2126 (T357189)', diff saved to https://phabricator.wikimedia.org/P58129 and previous config saved to /var/cache/conftool/dbconfig/20240228-174759-arnaudb.json
  • 17:47 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt dbprov1005 - vriley@cumin1002"
  • 17:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 17:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 17:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 17:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2126.codfw.wmnet with reason: Maintenance
  • 17:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T357189)', diff saved to https://phabricator.wikimedia.org/P58128 and previous config saved to /var/cache/conftool/dbconfig/20240228-174720-arnaudb.json
  • 17:46 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 17:38 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['an-redacteddb1001']
  • 17:38 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-redacteddb1001']
  • 17:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P58127 and previous config saved to /var/cache/conftool/dbconfig/20240228-173214-arnaudb.json
  • 17:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P58126 and previous config saved to /var/cache/conftool/dbconfig/20240228-171707-arnaudb.json
  • 17:16 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db2118.codfw.wmnet onto db2218.codfw.wmnet
  • 17:16 marostegui@cumin1002: dbctl commit (dc=all): 'Add db2218 depooled T355422', diff saved to https://phabricator.wikimedia.org/P58125 and previous config saved to /var/cache/conftool/dbconfig/20240228-171633-marostegui.json
  • 17:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T352010)', diff saved to https://phabricator.wikimedia.org/P58124 and previous config saved to /var/cache/conftool/dbconfig/20240228-171157-ladsgroup.json
  • 17:11 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 17:11 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 17:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P58123 and previous config saved to /var/cache/conftool/dbconfig/20240228-171136-ladsgroup.json
  • 17:03 sukhe: running dummy authdns-update
  • 17:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T357189)', diff saved to https://phabricator.wikimedia.org/P58122 and previous config saved to /var/cache/conftool/dbconfig/20240228-170201-arnaudb.json
  • 17:01 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1028 to es3 eqiad master T358180', diff saved to https://phabricator.wikimedia.org/P58121 and previous config saved to /var/cache/conftool/dbconfig/20240228-170134-marostegui.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58120 and previous config saved to /var/cache/conftool/dbconfig/20240228-165841-arnaudb.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58119 and previous config saved to /var/cache/conftool/dbconfig/20240228-165832-arnaudb.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2096 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58118 and previous config saved to /var/cache/conftool/dbconfig/20240228-165832-arnaudb.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58117 and previous config saved to /var/cache/conftool/dbconfig/20240228-165823-arnaudb.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2111 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58116 and previous config saved to /var/cache/conftool/dbconfig/20240228-165815-arnaudb.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2110 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58115 and previous config saved to /var/cache/conftool/dbconfig/20240228-165806-arnaudb.json
  • 16:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P58114 and previous config saved to /var/cache/conftool/dbconfig/20240228-165629-ladsgroup.json
  • 16:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2125 (T357189)', diff saved to https://phabricator.wikimedia.org/P58113 and previous config saved to /var/cache/conftool/dbconfig/20240228-165315-arnaudb.json
  • 16:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 16:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2125.codfw.wmnet with reason: Maintenance
  • 16:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T357189)', diff saved to https://phabricator.wikimedia.org/P58112 and previous config saved to /var/cache/conftool/dbconfig/20240228-165253-arnaudb.json
  • 16:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Optimize revision table T354015
  • 16:45 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1169.eqiad.wmnet with reason: Optimize revision table T354015
  • 16:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1169 T354015', diff saved to https://phabricator.wikimedia.org/P58111 and previous config saved to /var/cache/conftool/dbconfig/20240228-164451-root.json
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58110 and previous config saved to /var/cache/conftool/dbconfig/20240228-164337-arnaudb.json
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58109 and previous config saved to /var/cache/conftool/dbconfig/20240228-164327-arnaudb.json
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2096 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58108 and previous config saved to /var/cache/conftool/dbconfig/20240228-164321-arnaudb.json
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58107 and previous config saved to /var/cache/conftool/dbconfig/20240228-164312-arnaudb.json
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2111 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58106 and previous config saved to /var/cache/conftool/dbconfig/20240228-164310-arnaudb.json
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2110 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58105 and previous config saved to /var/cache/conftool/dbconfig/20240228-164301-arnaudb.json
  • 16:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P58104 and previous config saved to /var/cache/conftool/dbconfig/20240228-164123-ladsgroup.json
  • 16:40 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:39 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P58103 and previous config saved to /var/cache/conftool/dbconfig/20240228-163747-arnaudb.json
  • 16:31 jayme@cumin1002: conftool action : set/pooled=yes; selector: name=mw23(2[5-9]|3[0-4]).codfw.wmnet
  • 16:28 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 16:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58102 and previous config saved to /var/cache/conftool/dbconfig/20240228-162832-arnaudb.json
  • 16:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58101 and previous config saved to /var/cache/conftool/dbconfig/20240228-162823-arnaudb.json
  • 16:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2096 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58100 and previous config saved to /var/cache/conftool/dbconfig/20240228-162816-arnaudb.json
  • 16:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58099 and previous config saved to /var/cache/conftool/dbconfig/20240228-162807-arnaudb.json
  • 16:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2111 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58098 and previous config saved to /var/cache/conftool/dbconfig/20240228-162806-arnaudb.json
  • 16:27 arnaudb@cumin1002: dbctl commit (dc=all): 'db2110 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58097 and previous config saved to /var/cache/conftool/dbconfig/20240228-162756-arnaudb.json
  • 16:27 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:27 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P58096 and previous config saved to /var/cache/conftool/dbconfig/20240228-162616-ladsgroup.json
  • 16:25 topranks: Disabling IPv6 RAs for private1-b-codfw vlan on codfw CR routers, moving GW to lsw/ssw T355544
  • 16:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P58095 and previous config saved to /var/cache/conftool/dbconfig/20240228-162240-arnaudb.json
  • 16:21 dancy@deploy2002: Finished scap: testing new scap release (duration: 09m 12s)
  • 16:18 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 16:17 sukhe: sudo cumin 'A:dns-rec' "run-puppet-agent --enable 'merging CR 1006955'"
  • 16:17 moritzm: import cas 6.6.12+wmf12u3 to bookworm-wikimedia T357748
  • 16:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2161 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58094 and previous config saved to /var/cache/conftool/dbconfig/20240228-161327-arnaudb.json
  • 16:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2162 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58093 and previous config saved to /var/cache/conftool/dbconfig/20240228-161318-arnaudb.json
  • 16:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2096 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58092 and previous config saved to /var/cache/conftool/dbconfig/20240228-161312-arnaudb.json
  • 16:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2124 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58091 and previous config saved to /var/cache/conftool/dbconfig/20240228-161303-arnaudb.json
  • 16:13 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps2009.codfw.wmnet
  • 16:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2111 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58090 and previous config saved to /var/cache/conftool/dbconfig/20240228-161254-arnaudb.json
  • 16:12 arnaudb@cumin1002: dbctl commit (dc=all): 'db2110 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58089 and previous config saved to /var/cache/conftool/dbconfig/20240228-161251-arnaudb.json
  • 16:12 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org,service=authdns-update
  • 16:12 dancy@deploy2002: Started scap: testing new scap release
  • 16:12 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns6001.wikimedia.org,service=authdns-update
  • 16:11 dancy@deploy2002: Installation of scap version "4.67.0" completed for 445 hosts
  • 16:11 dancy@deploy2002: Installing scap version "4.67.0" for 445 hosts
  • 16:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T357189)', diff saved to https://phabricator.wikimedia.org/P58088 and previous config saved to /var/cache/conftool/dbconfig/20240228-160734-arnaudb.json
  • 16:06 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 16:06 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 16:04 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 37 hosts with reason: Migrating servers in codfw rack B6 to lsw1-b6-codfw
  • 16:04 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 37 hosts with reason: Migrating servers in codfw rack B6 to lsw1-b6-codfw
  • 16:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2104 (T357189)', diff saved to https://phabricator.wikimedia.org/P58087 and previous config saved to /var/cache/conftool/dbconfig/20240228-160202-arnaudb.json
  • 16:01 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 16:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2104.codfw.wmnet with reason: Maintenance
  • 15:59 sukhe: sudo cumin "A:dns-rec" "disable-puppet 'merging CR 1006955'"
  • 15:57 samtar@deploy2002: Finished scap: Backport for InitialiseSettings: Enable Edit Recovery on arwiki (T355548) (duration: 10m 10s)
  • 15:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 15:55 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2198.codfw.wmnet with OS bookworm
  • 15:55 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 15:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 15:55 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 15:51 topranks: configuring lsw1-b6-codfw in advance of server migration T355871
  • 15:51 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 15:51 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 15:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T357189)', diff saved to https://phabricator.wikimedia.org/P58086 and previous config saved to /var/cache/conftool/dbconfig/20240228-155113-arnaudb.json
  • 15:49 samtar@deploy2002: samtar: Continuing with sync
  • 15:49 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b6-codfw.mgmt with reason: prepping for server uplink migration codfw rack b6
  • 15:48 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b6-codfw.mgmt with reason: prepping for server uplink migration codfw rack b6
  • 15:48 samtar@deploy2002: samtar: Backport for InitialiseSettings: Enable Edit Recovery on arwiki (T355548) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 15:48 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2009.codfw.wmnet
  • 15:46 samtar@deploy2002: Started scap: Backport for InitialiseSettings: Enable Edit Recovery on arwiki (T355548)
  • 15:45 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2197.codfw.wmnet with OS bookworm
  • 15:45 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'T355871 - depooling db2110 db2111 db2124 db2134 db2096 db2161 db2162', diff saved to https://phabricator.wikimedia.org/P58085 and previous config saved to /var/cache/conftool/dbconfig/20240228-154043-arnaudb.json
  • 15:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2198.codfw.wmnet with reason: host reimage
  • 15:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on 7 hosts with reason: Silence for maintenance T355871
  • 15:40 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:40:00 on 7 hosts with reason: Silence for maintenance T355871
  • 15:37 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2198.codfw.wmnet with reason: host reimage
  • 15:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P58084 and previous config saved to /var/cache/conftool/dbconfig/20240228-153607-arnaudb.json
  • 15:35 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 15:35 jayme@cumin1002: conftool action : set/pooled=inactive; selector: name=mw23(2[5-9]|3[0-4]).codfw.wmnet
  • 15:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2215.codfw.wmnet with OS bookworm
  • 15:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 15:31 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 15:30 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:28 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:28 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:25 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:25 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:23 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246', diff saved to https://phabricator.wikimedia.org/P58083 and previous config saved to /var/cache/conftool/dbconfig/20240228-152101-arnaudb.json
  • 15:20 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2197.codfw.wmnet with reason: host reimage
  • 15:18 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:17 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
  • 15:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2198.codfw.wmnet with OS bookworm
  • 15:17 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:17 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:15 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:15 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2197.codfw.wmnet with reason: host reimage
  • 15:15 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:14 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:14 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2215.codfw.wmnet with reason: host reimage
  • 15:11 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 15:10 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 15:10 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 15:09 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 15:08 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 15:08 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 15:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2198.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246 (T357189)', diff saved to https://phabricator.wikimedia.org/P58082 and previous config saved to /var/cache/conftool/dbconfig/20240228-150554-arnaudb.json
  • 15:04 fab@deploy2002: Finished deploy [airflow-dags/research@4bed377]: (no justification provided) (duration: 00m 42s)
  • 15:03 fab@deploy2002: Started deploy [airflow-dags/research@4bed377]: (no justification provided)
  • 15:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1246 (T357189)', diff saved to https://phabricator.wikimedia.org/P58081 and previous config saved to /var/cache/conftool/dbconfig/20240228-145958-arnaudb.json
  • 14:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 14:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1246.eqiad.wmnet with reason: Maintenance
  • 14:56 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2198.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 14:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 14:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2197.codfw.wmnet with OS bookworm
  • 14:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T357189)', diff saved to https://phabricator.wikimedia.org/P58080 and previous config saved to /var/cache/conftool/dbconfig/20240228-145457-arnaudb.json
  • 14:53 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2215.codfw.wmnet with OS bookworm
  • 14:40 jiji@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 14:39 jiji@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 14:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P58079 and previous config saved to /var/cache/conftool/dbconfig/20240228-143951-arnaudb.json
  • 14:39 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:39 jiji@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 14:38 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:38 daniel@deploy2002: Finished scap: Backport for Configure parser cache filters for parsoid-pcache (T346765 T355375) (duration: 14m 56s)
  • 14:37 jiji@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 14:32 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:30 daniel@deploy2002: daniel: Continuing with sync
  • 14:29 jiji@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 14:27 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on wdqs2008.codfw.wmnet with reason: T355617
  • 14:27 bking@cumin2002: START - Cookbook sre.hosts.downtime for 6:00:00 on wdqs2008.codfw.wmnet with reason: T355617
  • 14:25 jiji@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:25 daniel@deploy2002: daniel: Backport for Configure parser cache filters for parsoid-pcache (T346765 T355375) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P58077 and previous config saved to /var/cache/conftool/dbconfig/20240228-142445-arnaudb.json
  • 14:24 jiji@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:24 jiji@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:23 daniel@deploy2002: Started scap: Backport for Configure parser cache filters for parsoid-pcache (T346765 T355375)
  • 14:23 jiji@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:22 jiji@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:22 jiji@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 14:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T352010)', diff saved to https://phabricator.wikimedia.org/P58076 and previous config saved to /var/cache/conftool/dbconfig/20240228-141413-ladsgroup.json
  • 14:14 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 14:13 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 14:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T357189)', diff saved to https://phabricator.wikimedia.org/P58075 and previous config saved to /var/cache/conftool/dbconfig/20240228-140938-arnaudb.json
  • 14:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 100%: After running optimize', diff saved to https://phabricator.wikimedia.org/P58074 and previous config saved to /var/cache/conftool/dbconfig/20240228-140626-root.json
  • 14:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T357189)', diff saved to https://phabricator.wikimedia.org/P58073 and previous config saved to /var/cache/conftool/dbconfig/20240228-140346-arnaudb.json
  • 14:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 14:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 14:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T357189)', diff saved to https://phabricator.wikimedia.org/P58072 and previous config saved to /var/cache/conftool/dbconfig/20240228-140323-arnaudb.json
  • 13:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:53 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2003 - ayounsi@cumin1002"
  • 13:52 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2003 - ayounsi@cumin1002"
  • 13:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 75%: After running optimize', diff saved to https://phabricator.wikimedia.org/P58071 and previous config saved to /var/cache/conftool/dbconfig/20240228-135121-root.json
  • 13:49 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 13:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P58070 and previous config saved to /var/cache/conftool/dbconfig/20240228-134817-arnaudb.json
  • 13:41 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.move-vlan (exit_code=99) for host <spicerack.netbox.NetboxServer object at 0x7f3aaebfffa0>
  • 13:41 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 6.0.0.0.0.1.0.0.2.9.1.0.0.1.0.0.b.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:41 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache 6.0.0.0.0.1.0.0.2.9.1.0.0.1.0.0.b.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:41 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 6.10.192.10.in-addr.arpa on all recursors
  • 13:41 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache 6.10.192.10.in-addr.arpa on all recursors
  • 13:41 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2003.codfw.wmnet on all recursors
  • 13:41 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2003.codfw.wmnet on all recursors
  • 13:40 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:40 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1007.eqiad.wmnet with OS bookworm
  • 13:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T352010)', diff saved to https://phabricator.wikimedia.org/P58069 and previous config saved to /var/cache/conftool/dbconfig/20240228-133959-ladsgroup.json
  • 13:39 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 13:39 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 13:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P58068 and previous config saved to /var/cache/conftool/dbconfig/20240228-133937-ladsgroup.json
  • 13:39 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 13:39 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 5.5.2.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:39 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache 5.5.2.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
  • 13:39 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 255.0.192.10.in-addr.arpa on all recursors
  • 13:39 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache 255.0.192.10.in-addr.arpa on all recursors
  • 13:39 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2003.codfw.wmnet on all recursors
  • 13:39 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2003.codfw.wmnet on all recursors
  • 13:39 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:39 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host sretest2003 - ayounsi@cumin1002"
  • 13:38 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host sretest2003 - ayounsi@cumin1002"
  • 13:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 50%: After running optimize', diff saved to https://phabricator.wikimedia.org/P58066 and previous config saved to /var/cache/conftool/dbconfig/20240228-133616-root.json
  • 13:36 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 13:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P58065 and previous config saved to /var/cache/conftool/dbconfig/20240228-133311-arnaudb.json
  • 13:33 ayounsi@cumin1002: START - Cookbook sre.hosts.move-vlan for host <spicerack.netbox.NetboxServer object at 0x7f3aaebfffa0>
  • 13:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P58064 and previous config saved to /var/cache/conftool/dbconfig/20240228-132431-ladsgroup.json
  • 13:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 25%: After running optimize', diff saved to https://phabricator.wikimedia.org/P58063 and previous config saved to /var/cache/conftool/dbconfig/20240228-132111-root.json
  • 13:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P58062 and previous config saved to /var/cache/conftool/dbconfig/20240228-132002-root.json
  • 13:18 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1007.eqiad.wmnet with reason: host reimage
  • 13:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T357189)', diff saved to https://phabricator.wikimedia.org/P58061 and previous config saved to /var/cache/conftool/dbconfig/20240228-131804-arnaudb.json
  • 13:16 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1007.eqiad.wmnet with reason: host reimage
  • 13:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T357189)', diff saved to https://phabricator.wikimedia.org/P58060 and previous config saved to /var/cache/conftool/dbconfig/20240228-131318-arnaudb.json
  • 13:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 13:13 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
  • 13:13 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:13 ayounsi@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
  • 13:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 13:12 ayounsi@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2006.codfw.wmnet decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1002"
  • 13:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 255.0.192.10.in-addr.arpa on codfw recursors
  • 13:11 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache 255.0.192.10.in-addr.arpa on codfw recursors
  • 13:11 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 10.192.0.229 on codfw recursors
  • 13:11 ayounsi@cumin1002: START - Cookbook sre.dns.wipe-cache 10.192.0.229 on codfw recursors
  • 13:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P58059 and previous config saved to /var/cache/conftool/dbconfig/20240228-130925-ladsgroup.json
  • 13:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 13:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 13:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T357189)', diff saved to https://phabricator.wikimedia.org/P58058 and previous config saved to /var/cache/conftool/dbconfig/20240228-130811-arnaudb.json
  • 13:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 10%: After running optimize', diff saved to https://phabricator.wikimedia.org/P58057 and previous config saved to /var/cache/conftool/dbconfig/20240228-130606-root.json
  • 13:05 moritzm: installing bind9 security updates
  • 13:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P58056 and previous config saved to /var/cache/conftool/dbconfig/20240228-130457-root.json
  • 13:03 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host dbstore1007.eqiad.wmnet with OS bookworm
  • 13:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply
  • 13:01 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
  • 13:01 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 12:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 12:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 12:57 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
  • 12:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P58055 and previous config saved to /var/cache/conftool/dbconfig/20240228-125418-ladsgroup.json
  • 12:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P58054 and previous config saved to /var/cache/conftool/dbconfig/20240228-125305-arnaudb.json
  • 12:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1232 (re)pooling @ 5%: After running optimize', diff saved to https://phabricator.wikimedia.org/P58053 and previous config saved to /var/cache/conftool/dbconfig/20240228-125102-root.json
  • 12:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P58052 and previous config saved to /var/cache/conftool/dbconfig/20240228-124953-root.json
  • 12:47 moritzm: import cas 6.6.12+wmf12u2 to bookworm-wikimedia T357748
  • 12:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P58050 and previous config saved to /var/cache/conftool/dbconfig/20240228-123759-arnaudb.json
  • 12:37 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1007.eqiad.wmnet with OS bullseye
  • 12:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2177 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P58049 and previous config saved to /var/cache/conftool/dbconfig/20240228-123448-root.json
  • 12:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T357189)', diff saved to https://phabricator.wikimedia.org/P58048 and previous config saved to /var/cache/conftool/dbconfig/20240228-122252-arnaudb.json
  • 12:16 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1007.eqiad.wmnet with reason: host reimage
  • 12:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1222 (T357189)', diff saved to https://phabricator.wikimedia.org/P58047 and previous config saved to /var/cache/conftool/dbconfig/20240228-121603-arnaudb.json
  • 12:15 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 12:15 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1222.eqiad.wmnet with reason: Maintenance
  • 12:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T357189)', diff saved to https://phabricator.wikimedia.org/P58046 and previous config saved to /var/cache/conftool/dbconfig/20240228-121541-arnaudb.json
  • 12:14 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1007.eqiad.wmnet with reason: host reimage
  • 12:01 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host dbstore1007.eqiad.wmnet with OS bullseye
  • 12:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P58045 and previous config saved to /var/cache/conftool/dbconfig/20240228-120035-arnaudb.json
  • 11:57 jiji@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:54 jiji@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:52 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbstore1007.eqiad.wmnet with OS bookworm
  • 11:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P58044 and previous config saved to /var/cache/conftool/dbconfig/20240228-114529-arnaudb.json
  • 11:44 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2156.codfw.wmnet onto db2177.codfw.wmnet
  • 11:31 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1007.eqiad.wmnet with reason: host reimage
  • 11:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T357189)', diff saved to https://phabricator.wikimedia.org/P58043 and previous config saved to /var/cache/conftool/dbconfig/20240228-113022-arnaudb.json
  • 11:27 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbstore1007.eqiad.wmnet with reason: host reimage
  • 11:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T357189)', diff saved to https://phabricator.wikimedia.org/P58042 and previous config saved to /var/cache/conftool/dbconfig/20240228-112523-arnaudb.json
  • 11:25 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 11:25 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 11:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T357189)', diff saved to https://phabricator.wikimedia.org/P58041 and previous config saved to /var/cache/conftool/dbconfig/20240228-112501-arnaudb.json
  • 11:24 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 11:23 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 11:22 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 11:22 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 11:19 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 11:18 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 11:14 moritzm: import cas 6.6.12+wmf12u1 to bookworm-wikimedia T357748
  • 11:13 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host dbstore1007.eqiad.wmnet with OS bookworm
  • 11:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P58039 and previous config saved to /var/cache/conftool/dbconfig/20240228-110955-arnaudb.json
  • 11:03 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 11:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 10:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P58038 and previous config saved to /var/cache/conftool/dbconfig/20240228-105449-arnaudb.json
  • 10:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T357189)', diff saved to https://phabricator.wikimedia.org/P58037 and previous config saved to /var/cache/conftool/dbconfig/20240228-103942-arnaudb.json
  • 10:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T357189)', diff saved to https://phabricator.wikimedia.org/P58036 and previous config saved to /var/cache/conftool/dbconfig/20240228-103442-arnaudb.json
  • 10:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 10:34 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 10:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T357189)', diff saved to https://phabricator.wikimedia.org/P58035 and previous config saved to /var/cache/conftool/dbconfig/20240228-103419-arnaudb.json
  • 10:32 claime: Lowered the weight of small disk videoscalers
  • 10:31 cgoubert@cumin2002: conftool action : set/weight=15; selector: name=mw(2259|226[3-6]|2278|2279|2281).codfw.wmnet,cluster=videoscaler
  • 10:31 moritzm: copy cas from bullseye-wikimedia to bookworm-wikimedia T357748
  • 10:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P58034 and previous config saved to /var/cache/conftool/dbconfig/20240228-101913-arnaudb.json
  • 10:18 volans: installed spicerack 8.4.0 on cumin1002
  • 10:12 claime: clearing up leftover boxedcommand media files on mw2281 - sudo find . -type f \( -name '*.wav' -o -name '*.ogg' -o -name '*.webm' -o -name '*.mov' -o -name '*.mp4' \) -mmin +1200 -exec sh -c "lsof {} || rm {}" \;
  • 10:12 ladsgroup@cumin1002: START - Cookbook sre.mysql.clone of db2156.codfw.wmnet onto db2177.codfw.wmnet
  • 10:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T352010)', diff saved to https://phabricator.wikimedia.org/P58033 and previous config saved to /var/cache/conftool/dbconfig/20240228-100720-ladsgroup.json
  • 10:07 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:07 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:06 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 10:06 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 10:04 claime: clearing up leftover boxedcommand media files on mw2278 - sudo find . -type f \( -name '*.wav' -o -name '*.ogg' -o -name '*.webm' -o -name '*.mov' -o -name '*.mp4' \) -mmin +1200 -exec sh -c "lsof {} || rm {}" \;
  • 10:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P58032 and previous config saved to /var/cache/conftool/dbconfig/20240228-100406-arnaudb.json
  • 10:03 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
  • 10:00 ayounsi@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
  • 09:54 ladsgroup@deploy2002: Finished scap: Backport for Set three more wikis to read new on pagelinks migration (T351237) (duration: 10m 03s)
  • 09:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T357189)', diff saved to https://phabricator.wikimedia.org/P58030 and previous config saved to /var/cache/conftool/dbconfig/20240228-094900-arnaudb.json
  • 09:46 ayounsi@cumin2002: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
  • 09:46 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 09:46 joal@deploy2002: Finished deploy [analytics/refinery@dba67fd] (hadoop-test): Additional analytics weekly train - TEST [analytics/refinery@dba67fd6] (duration: 03m 33s)
  • 09:46 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin2002"
  • 09:45 ladsgroup@deploy2002: ladsgroup: Backport for Set three more wikis to read new on pagelinks migration (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:45 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2006.codfw.wmnet - ayounsi@cumin2002"
  • 09:45 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2006.codfw.wmnet on all recursors
  • 09:44 ayounsi@cumin2002: START - Cookbook sre.dns.wipe-cache testvm2006.codfw.wmnet on all recursors
  • 09:44 ayounsi@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 09:44 ayounsi@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin2002"
  • 09:44 ladsgroup@deploy2002: Started scap: Backport for Set three more wikis to read new on pagelinks migration (T351237)
  • 09:42 ayounsi@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2006.codfw.wmnet - ayounsi@cumin2002"
  • 09:42 joal@deploy2002: Started deploy [analytics/refinery@dba67fd] (hadoop-test): Additional analytics weekly train - TEST [analytics/refinery@dba67fd6]
  • 09:42 joal@deploy2002: Finished deploy [analytics/refinery@dba67fd] (thin): Additional analytics weekly train - THIN [analytics/refinery@dba67fd6] (duration: 00m 05s)
  • 09:42 joal@deploy2002: Started deploy [analytics/refinery@dba67fd] (thin): Additional analytics weekly train - THIN [analytics/refinery@dba67fd6]
  • 09:41 joal@deploy2002: Finished deploy [analytics/refinery@dba67fd]: Additional analytics weekly train [analytics/refinery@dba67fd6] (duration: 13m 16s)
  • 09:41 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 09:41 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 09:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T357189)', diff saved to https://phabricator.wikimedia.org/P58029 and previous config saved to /var/cache/conftool/dbconfig/20240228-094103-arnaudb.json
  • 09:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 09:41 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 09:41 ayounsi@cumin2002: START - Cookbook sre.dns.netbox
  • 09:41 ayounsi@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
  • 09:40 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 09:40 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 09:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T357189)', diff saved to https://phabricator.wikimedia.org/P58028 and previous config saved to /var/cache/conftool/dbconfig/20240228-094041-arnaudb.json
  • 09:40 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 09:39 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 09:34 moritzm: installing monitoring-plugins bugfix updates from Bookworm point update
  • 09:28 joal@deploy2002: Started deploy [analytics/refinery@dba67fd]: Additional analytics weekly train [analytics/refinery@dba67fd6]
  • 09:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P58027 and previous config saved to /var/cache/conftool/dbconfig/20240228-092535-arnaudb.json
  • 09:25 volans: installed spicerack 8.4.0 on cumin2002
  • 09:23 slyngshede@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host idp-test2003.wikimedia.org
  • 09:23 slyngshede@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host idp-test2003.wikimedia.org with OS bookworm
  • 09:15 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on idp-test2003.wikimedia.org with reason: host reimage
  • 09:14 moritzm: installing perl security updates on bullseye
  • 09:13 volans: temporarily disabling puppet on cumin1002
  • 09:12 slyngshede@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on idp-test2003.wikimedia.org with reason: host reimage
  • 09:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P58026 and previous config saved to /var/cache/conftool/dbconfig/20240228-091029-arnaudb.json
  • 08:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T357189)', diff saved to https://phabricator.wikimedia.org/P58025 and previous config saved to /var/cache/conftool/dbconfig/20240228-085523-arnaudb.json
  • 08:55 slyngshede@cumin1002: START - Cookbook sre.hosts.reimage for host idp-test2003.wikimedia.org with OS bookworm
  • 08:52 slyngshede@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM idp-test2003.wikimedia.org - slyngshede@cumin1002"
  • 08:51 slyngshede@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM idp-test2003.wikimedia.org - slyngshede@cumin1002"
  • 08:51 slyngshede@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) idp-test2003.wikimedia.org on all recursors
  • 08:51 slyngshede@cumin1002: START - Cookbook sre.dns.wipe-cache idp-test2003.wikimedia.org on all recursors
  • 08:51 slyngshede@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:51 slyngshede@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM idp-test2003.wikimedia.org - slyngshede@cumin1002"
  • 08:50 slyngshede@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM idp-test2003.wikimedia.org - slyngshede@cumin1002"
  • 08:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T357189)', diff saved to https://phabricator.wikimedia.org/P58024 and previous config saved to /var/cache/conftool/dbconfig/20240228-084731-arnaudb.json
  • 08:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 08:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
  • 08:43 marostegui@cumin1002: dbctl commit (dc=all): 'es2027 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58023 and previous config saved to /var/cache/conftool/dbconfig/20240228-084322-root.json
  • 08:28 kartik@deploy2002: Finished scap: Backport for Enable Section Translation on newly created Wikipedias by default (T298235), Enable SectionTranslation for Wikipedias where ContentTranslation is in beta (T353734) (duration: 12m 59s)
  • 08:28 marostegui@cumin1002: dbctl commit (dc=all): 'es2027 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58022 and previous config saved to /var/cache/conftool/dbconfig/20240228-082817-root.json
  • 08:02 slyngshede@cumin1002: START - Cookbook sre.dns.netbox
  • 08:02 slyngshede@cumin1002: START - Cookbook sre.ganeti.makevm for new host idp-test2003.wikimedia.org
  • 07:58 marostegui@cumin1002: dbctl commit (dc=all): 'es2027 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58020 and previous config saved to /var/cache/conftool/dbconfig/20240228-075807-root.json
  • 07:51 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2187.codfw.wmnet with OS bookworm
  • 07:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2186.codfw.wmnet with OS bookworm
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'es2027 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58018 and previous config saved to /var/cache/conftool/dbconfig/20240228-074302-root.json
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2156 T358640', diff saved to https://phabricator.wikimedia.org/P58017 and previous config saved to /var/cache/conftool/dbconfig/20240228-074259-root.json
  • 07:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
  • 07:27 marostegui@cumin1002: dbctl commit (dc=all): 'es2027 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P58016 and previous config saved to /var/cache/conftool/dbconfig/20240228-072757-root.json
  • 07:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2027.codfw.wmnet with OS bookworm
  • 07:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2187.codfw.wmnet with reason: host reimage
  • 07:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
  • 07:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2186.codfw.wmnet with reason: host reimage
  • 07:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2027.codfw.wmnet with reason: host reimage
  • 07:09 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2187.codfw.wmnet with OS bookworm
  • 07:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2027.codfw.wmnet with reason: host reimage
  • 06:58 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2186.codfw.wmnet with OS bookworm
  • 06:51 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s1
  • 06:51 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
  • 06:51 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2027.codfw.wmnet with OS bookworm
  • 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2027 T358180', diff saved to https://phabricator.wikimedia.org/P58015 and previous config saved to /var/cache/conftool/dbconfig/20240228-064731-root.json
  • 06:44 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s1
  • 06:44 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1232 - optimizing revision table T354015', diff saved to https://phabricator.wikimedia.org/P58014 and previous config saved to /var/cache/conftool/dbconfig/20240228-064210-root.json
  • 03:13 slyngshede@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host idp-test1003.wikimedia.org
  • 03:12 slyngshede@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host idp-test1003.wikimedia.org with OS bookworm
  • 03:05 swfrench@cumin2002: conftool action : set/pooled=yes; selector: dc=codfw,cluster=appserver,service=nginx,name=mw2268.codfw.wmnet
  • 03:03 swfrench@cumin2002: conftool action : set/pooled=no; selector: dc=codfw,cluster=appserver,service=nginx,name=mw2268.codfw.wmnet
  • 02:52 swfrench-wmf: Running 'sudo systemctl start etcdmirror-conftool-eqiad-wmnet.service' on conf2005
  • 02:50 swfrench-wmf: Correction: Actually running 'curl https://conf2005.codfw.wmnet:2379/v2/keys/__replication/conftool -XPUT -d "value=3021126"' on conf2005 in an attempt to unwedge replication
  • 02:47 swfrench-wmf: Running 'curl https://conf2005.codfw.wmnet:2379/v2/keys/__replication -XPUT -d "value=3021126"' on conf2005 in an attempt to unwedge replication
  • 02:06 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2213.codfw.wmnet with OS bookworm
  • 02:06 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 02:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 02:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2218.codfw.wmnet with OS bookworm
  • 02:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 02:00 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2217.codfw.wmnet with OS bookworm
  • 01:58 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:55 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2220.codfw.wmnet with OS bookworm
  • 01:55 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:53 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:53 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2210.codfw.wmnet with OS bookworm
  • 01:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:51 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2212.codfw.wmnet with OS bookworm
  • 01:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2213.codfw.wmnet with reason: host reimage
  • 01:49 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:48 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:48 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2211.codfw.wmnet with OS bookworm
  • 01:48 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:47 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:46 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2219.codfw.wmnet with OS bookworm
  • 01:46 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:45 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
  • 01:44 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2214.codfw.wmnet with OS bookworm
  • 01:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2217.codfw.wmnet with reason: host reimage
  • 01:40 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
  • 01:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2216.codfw.wmnet with OS bookworm
  • 01:39 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:38 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:37 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2209.codfw.wmnet with OS bookworm
  • 01:37 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2210.codfw.wmnet with reason: host reimage
  • 01:36 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:34 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
  • 01:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2211.codfw.wmnet with reason: host reimage
  • 01:29 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2219.codfw.wmnet with reason: host reimage
  • 01:27 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2214.codfw.wmnet with reason: host reimage
  • 01:26 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2219.codfw.wmnet with reason: host reimage
  • 01:24 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
  • 01:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
  • 01:20 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2212.codfw.wmnet with reason: host reimage
  • 01:20 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2217.codfw.wmnet with reason: host reimage
  • 01:20 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2213.codfw.wmnet with reason: host reimage
  • 01:19 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2218.codfw.wmnet with reason: host reimage
  • 01:19 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2210.codfw.wmnet with reason: host reimage
  • 01:19 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2214.codfw.wmnet with reason: host reimage
  • 01:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2206.codfw.wmnet with OS bookworm
  • 01:19 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2216.codfw.wmnet with reason: host reimage
  • 01:19 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:19 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2220.codfw.wmnet with reason: host reimage
  • 01:19 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2211.codfw.wmnet with reason: host reimage
  • 01:19 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2209.codfw.wmnet with reason: host reimage
  • 01:18 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:16 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2208.codfw.wmnet with OS bookworm
  • 01:16 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:15 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2207.codfw.wmnet with OS bookworm
  • 01:14 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:12 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:12 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2204.codfw.wmnet with OS bookworm
  • 01:12 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:11 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2205.codfw.wmnet with OS bookworm
  • 01:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:08 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2203.codfw.wmnet with OS bookworm
  • 01:07 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:06 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 01:04 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2206.codfw.wmnet with reason: host reimage
  • 01:01 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
  • 00:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2220.codfw.wmnet with OS bookworm
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2219.codfw.wmnet with OS bookworm
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2218.codfw.wmnet with OS bookworm
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2217.codfw.wmnet with OS bookworm
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2216.codfw.wmnet with OS bookworm
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2215.codfw.wmnet with OS bookworm
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2214.codfw.wmnet with OS bookworm
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2213.codfw.wmnet with OS bookworm
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2212.codfw.wmnet with OS bookworm
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2211.codfw.wmnet with OS bookworm
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2210.codfw.wmnet with OS bookworm
  • 00:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2209.codfw.wmnet with OS bookworm
  • 00:56 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2204.codfw.wmnet with reason: host reimage
  • 00:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2205.codfw.wmnet with reason: host reimage
  • 00:53 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2204.codfw.wmnet with reason: host reimage
  • 00:53 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2208.codfw.wmnet with reason: host reimage
  • 00:51 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2206.codfw.wmnet with reason: host reimage
  • 00:51 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2207.codfw.wmnet with reason: host reimage
  • 00:51 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2205.codfw.wmnet with reason: host reimage
  • 00:50 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
  • 00:47 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2203.codfw.wmnet with reason: host reimage
  • 00:30 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2198.codfw.wmnet with OS bookworm
  • 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2208.codfw.wmnet with OS bookworm
  • 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2207.codfw.wmnet with OS bookworm
  • 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2206.codfw.wmnet with OS bookworm
  • 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2205.codfw.wmnet with OS bookworm
  • 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2204.codfw.wmnet with OS bookworm
  • 00:28 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2203.codfw.wmnet with OS bookworm
  • 00:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 00:10 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 00:08 rzl@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 00:08 rzl@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 00:08 rzl@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 00:07 rzl@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 00:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 15:00:00 on wdqs1011.eqiad.wmnet with reason: T355617
  • 00:06 bking@cumin2002: START - Cookbook sre.hosts.downtime for 15:00:00 on wdqs1011.eqiad.wmnet with reason: T355617
  • 00:02 dzahn@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host contint1003.eqiad.wmnet
  • 00:02 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint1003.eqiad.wmnet with OS bullseye
  • 00:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2199.codfw.wmnet with OS bookworm
  • 00:01 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 00:00 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2202.codfw.wmnet with OS bookworm
  • 00:00 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"

2024-02-27

  • 23:57 mutante: T358237 - manually went through "fix forward"-steps from T349619 (install puppet-agent package, delete old key material, create new CSR, sign on puppetserver, node clean on puppetmaster) to fix puppet failures while makevm cookbook still running (which couldn't find a successful puppet run)
  • 23:54 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 23:54 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2201.codfw.wmnet with OS bookworm
  • 23:54 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 23:52 mutante: T358237 - creating VM with cookbook fails because puppet runs have certificate issue, applied role is already migrated to puppet 7 though
  • 23:50 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 23:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2200.codfw.wmnet with OS bookworm
  • 23:49 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 23:45 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 23:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2199.codfw.wmnet with reason: host reimage
  • 23:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
  • 23:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2201.codfw.wmnet with reason: host reimage
  • 23:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2199.codfw.wmnet with reason: host reimage
  • 23:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2202.codfw.wmnet with reason: host reimage
  • 23:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2200.codfw.wmnet with reason: host reimage
  • 23:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2201.codfw.wmnet with reason: host reimage
  • 23:30 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2200.codfw.wmnet with reason: host reimage
  • 23:10 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2202.codfw.wmnet with OS bookworm
  • 23:10 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2201.codfw.wmnet with OS bookworm
  • 23:10 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2200.codfw.wmnet with OS bookworm
  • 23:10 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2199.codfw.wmnet with OS bookworm
  • 23:10 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2198.codfw.wmnet with OS bookworm
  • 23:09 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2197.codfw.wmnet with OS bookworm
  • 22:47 mutante: DNS - added new project language "bew" - Betawi, also known as Betawi Malay, Jakartan Malay, or Batavian Malay is the spoken language of the Betawi people in Jakarta, Indonesia with an estimated 5 million native speakers. T357866
  • 22:44 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint1003.eqiad.wmnet with reason: host reimage
  • 22:41 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint1003.eqiad.wmnet with reason: host reimage
  • 22:32 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host contint1003.eqiad.wmnet with OS bullseye
  • 22:31 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM contint1003.eqiad.wmnet - dzahn@cumin1002"
  • 22:30 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM contint1003.eqiad.wmnet - dzahn@cumin1002"
  • 22:30 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) contint1003.eqiad.wmnet on all recursors
  • 22:30 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache contint1003.eqiad.wmnet on all recursors
  • 22:30 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:30 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM contint1003.eqiad.wmnet - dzahn@cumin1002"
  • 22:29 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM contint1003.eqiad.wmnet - dzahn@cumin1002"
  • 22:24 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 22:24 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host contint1003.eqiad.wmnet
  • 20:51 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: sync
  • 20:51 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: sync
  • 20:50 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:50 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:48 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: sync
  • 20:48 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: sync
  • 20:48 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: sync
  • 20:47 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:47 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:45 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:45 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:43 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:41 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 20:40 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-eqiad-services/jaeger: apply
  • 19:47 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T347624, testing 961878 patch) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T352010)', diff saved to https://phabricator.wikimedia.org/P58012 and previous config saved to /var/cache/conftool/dbconfig/20240227-194021-ladsgroup.json
  • 19:40 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 19:40 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 19:36 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, testing 961878 patch) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:26 dduvall@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.20 refs T354438
  • 18:57 tchin: finished deploying refinery successfully
  • 18:53 tchin@deploy2002: Finished deploy [analytics/refinery@ac9fd7b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ac9fd7b4] (duration: 03m 42s)
  • 18:50 tchin@deploy2002: Started deploy [analytics/refinery@ac9fd7b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@ac9fd7b4]
  • 18:50 tchin@deploy2002: Finished deploy [analytics/refinery@ac9fd7b] (thin): Regular analytics weekly train THIN [analytics/refinery@ac9fd7b4] (duration: 00m 06s)
  • 18:49 tchin@deploy2002: Started deploy [analytics/refinery@ac9fd7b] (thin): Regular analytics weekly train THIN [analytics/refinery@ac9fd7b4]
  • 18:49 tchin@deploy2002: Finished deploy [analytics/refinery@ac9fd7b]: Regular analytics weekly train [analytics/refinery@ac9fd7b4] (duration: 00m 18s)
  • 18:49 tchin@deploy2002: Started deploy [analytics/refinery@ac9fd7b]: Regular analytics weekly train [analytics/refinery@ac9fd7b4]
  • 18:48 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-hd1003.eqiad.wmnet with OS bookworm
  • 18:48 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 18:48 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host logging-hd1001.eqiad.wmnet with OS bookworm
  • 18:48 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host logging-hd1001.eqiad.wmnet with OS bookworm
  • 18:46 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 18:46 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-hd1001.eqiad.wmnet with OS bookworm
  • 18:46 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 18:44 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 18:38 tchin: rolled back refinery deployment, failed on stat1010 and stat1011
  • 18:37 tchin@deploy2002: Finished deploy [analytics/refinery@ac9fd7b]: Regular analytics weekly train [analytics/refinery@ac9fd7b4] (duration: 09m 51s)
  • 18:27 tchin@deploy2002: Started deploy [analytics/refinery@ac9fd7b]: Regular analytics weekly train [analytics/refinery@ac9fd7b4]
  • 18:25 tchin: deploying refinery
  • 18:25 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logging-hd1002.eqiad.wmnet with OS bookworm
  • 18:25 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 18:24 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-hd1003.eqiad.wmnet with reason: host reimage
  • 18:23 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 18:22 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-hd1001.eqiad.wmnet with reason: host reimage
  • 18:22 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
  • 18:21 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
  • 18:19 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-hd1003.eqiad.wmnet with reason: host reimage
  • 18:19 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-hd1001.eqiad.wmnet with reason: host reimage
  • 18:18 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
  • 18:17 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
  • 18:15 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
  • 18:15 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
  • 18:01 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logging-hd1002.eqiad.wmnet with reason: host reimage
  • 17:56 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on logging-hd1002.eqiad.wmnet with reason: host reimage
  • 17:54 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host logging-hd1001.eqiad.wmnet with OS bookworm
  • 17:53 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host logging-hd1003.eqiad.wmnet with OS bookworm
  • 17:31 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host logging-hd1002.eqiad.wmnet with OS bookworm
  • 17:23 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts contint1004.eqiad.wmnet
  • 17:23 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:22 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: contint1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1002"
  • 17:19 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: contint1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1002"
  • 17:14 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 17:09 dzahn@cumin1002: START - Cookbook sre.hosts.decommission for hosts contint1004.eqiad.wmnet
  • 17:08 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts contint1003.eqiad.wmnet
  • 17:08 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:08 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: contint1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1002"
  • 17:07 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: contint1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1002"
  • 17:05 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 17:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2123 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58011 and previous config saved to /var/cache/conftool/dbconfig/20240227-170342-arnaudb.json
  • 17:03 arnaudb@cumin1002: dbctl commit (dc=all): 'db2108 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58010 and previous config saved to /var/cache/conftool/dbconfig/20240227-170330-arnaudb.json
  • 17:03 arnaudb@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58009 and previous config saved to /var/cache/conftool/dbconfig/20240227-170312-arnaudb.json
  • 17:01 effie: pool citoid eqiad back
  • 17:01 dzahn@cumin1002: START - Cookbook sre.hosts.decommission for hosts contint1003.eqiad.wmnet
  • 17:01 jiji@cumin1002: conftool action : set/pooled=true; selector: dnsdisc=citoid,name=eqiad
  • 16:51 claime: Repooling mw2324.codfw.wmnet,mw2323.codfw.wmnet,mw2259.codfw.wmnet,mw2261.codfw.wmnet,mw2262.codfw.wmnet,mw2263.codfw.wmnet,mw2264.codfw.wmnet,mw2265.codfw.wmnet,mw2266.codfw.wmnet,mw2268.codfw.wmnet,mw2269.codfw.wmnet,mw2270.codfw.wmnet,mw2314.codfw.wmnet,mw2315.codfw.wmnet,mw2316.codfw.wmnet,mw2320.codfw.wmnet,mw2321.codfw.wmnet,mw2322.codfw.wmnet for T355870
  • 16:49 claime: Uncordoning mw2260.codfw.wmnet mw2267.codfw.wmnet mw2310.codfw.wmnet mw2311.codfw.wmnet mw2312.codfw.wmnet mw2313.codfw.wmnet mw2317.codfw.wmnet mw2318.codfw.wmnet mw2319.codfw.wmnet kubernetes2030.codfw.wmnet kubernetes2029.codfw.wmnet kubernetes2057.codfw.wmnet for T355870
  • 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2123 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58008 and previous config saved to /var/cache/conftool/dbconfig/20240227-164837-arnaudb.json
  • 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2108 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58007 and previous config saved to /var/cache/conftool/dbconfig/20240227-164825-arnaudb.json
  • 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58006 and previous config saved to /var/cache/conftool/dbconfig/20240227-164808-arnaudb.json
  • 16:47 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host logging-hd1002.eqiad.wmnet with OS bookworm
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2123 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58005 and previous config saved to /var/cache/conftool/dbconfig/20240227-163332-arnaudb.json
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'db2108 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58004 and previous config saved to /var/cache/conftool/dbconfig/20240227-163320-arnaudb.json
  • 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2196.codfw.wmnet with OS bookworm
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58003 and previous config saved to /var/cache/conftool/dbconfig/20240227-163303-arnaudb.json
  • 16:33 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 16:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host logging-hd1002.eqiad.wmnet with OS bookworm
  • 16:30 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
  • 16:30 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host logging-hd1001.eqiad.wmnet with OS bookworm
  • 16:23 fabfur: restarting pybal on lvs2014,lvs2011,lvs2012 and lvs2013 for T355544
  • 16:23 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1039.eqiad.wmnet with OS bookworm
  • 16:23 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:22 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2123 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58002 and previous config saved to /var/cache/conftool/dbconfig/20240227-161827-arnaudb.json
  • 16:18 arnaudb@cumin1002: dbctl commit (dc=all): 'db2108 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58001 and previous config saved to /var/cache/conftool/dbconfig/20240227-161815-arnaudb.json
  • 16:17 arnaudb@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P58000 and previous config saved to /var/cache/conftool/dbconfig/20240227-161758-arnaudb.json
  • 16:16 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
  • 16:13 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
  • 16:07 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1039.eqiad.wmnet with reason: host reimage
  • 16:04 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1039.eqiad.wmnet with reason: host reimage
  • 15:58 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2196.codfw.wmnet with OS bookworm
  • 15:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2196.codfw.wmnet with OS bookworm
  • 15:57 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2196.codfw.wmnet with OS bookworm
  • 15:56 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 36 hosts with reason: Migrating servers in codfw rack B3 to lsw1-b3-codfw
  • 15:56 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 36 hosts with reason: Migrating servers in codfw rack B3 to lsw1-b3-codfw
  • 15:55 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b3-codfw.mgmt with reason: prepping for server uplink migration codfw rack b3
  • 15:55 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b3-codfw.mgmt with reason: prepping for server uplink migration codfw rack b3
  • 15:51 jiji@cumin1002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM urldownloader1003.wikimedia.org
  • 15:46 jiji@cumin1002: START - Cookbook sre.ganeti.reboot-vm for VM urldownloader1003.wikimedia.org
  • 15:45 effie: reboot urldownloader1003 - T358597
  • 15:41 topranks: configuring lsw1-b3-codfw in advance of server migration T355870
  • 15:39 arnaudb@cumin1002: dbctl commit (dc=all): 'T355870 - depooling es2021 db2108 db2123', diff saved to https://phabricator.wikimedia.org/P57999 and previous config saved to /var/cache/conftool/dbconfig/20240227-153951-arnaudb.json
  • 15:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on db[2108,2123].codfw.wmnet,es2021.codfw.wmnet with reason: Silence for network maintenance T355870
  • 15:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:40:00 on db[2108,2123].codfw.wmnet,es2021.codfw.wmnet with reason: Silence for network maintenance T355870
  • 15:24 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2001.codfw.wmnet
  • 15:24 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:24 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cmooney@cumin1002"
  • 15:23 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cmooney@cumin1002"
  • 15:22 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1038.eqiad.wmnet with OS bookworm
  • 15:22 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:22 moritzm: copy prometheus-mcrouter-exporter from bullseye-wikimedia to bookworm-wikimedia T357748
  • 15:21 claime: Extending vg-root on remaining small disk codfw jobrunners
  • 15:21 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:20 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) es1040.eqiad.wmnet on all recursors
  • 15:20 volans@cumin1002: START - Cookbook sre.dns.wipe-cache es1040.eqiad.wmnet on all recursors
  • 15:20 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) es1039.eqiad.wmnet on all recursors
  • 15:20 volans@cumin1002: START - Cookbook sre.dns.wipe-cache es1039.eqiad.wmnet on all recursors
  • 15:20 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) es1038.eqiad.wmnet on all recursors
  • 15:20 volans@cumin1002: START - Cookbook sre.dns.wipe-cache es1038.eqiad.wmnet on all recursors
  • 15:20 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) es1037.eqiad.wmnet on all recursors
  • 15:20 volans@cumin1002: START - Cookbook sre.dns.wipe-cache es1037.eqiad.wmnet on all recursors
  • 15:20 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) es1036.eqiad.wmnet on all recursors
  • 15:20 volans@cumin1002: START - Cookbook sre.dns.wipe-cache es1036.eqiad.wmnet on all recursors
  • 15:20 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) es1035.eqiad.wmnet on all recursors
  • 15:20 volans@cumin1002: START - Cookbook sre.dns.wipe-cache es1035.eqiad.wmnet on all recursors
  • 15:20 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1037.eqiad.wmnet with OS bookworm
  • 15:19 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2220.codfw.wmnet on all recursors
  • 15:19 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:19 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2220.codfw.wmnet on all recursors
  • 15:19 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2219.codfw.wmnet on all recursors
  • 15:19 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2219.codfw.wmnet on all recursors
  • 15:19 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2218.codfw.wmnet on all recursors
  • 15:19 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2218.codfw.wmnet on all recursors
  • 15:19 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2217.codfw.wmnet on all recursors
  • 15:19 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2217.codfw.wmnet on all recursors
  • 15:19 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2216.codfw.wmnet on all recursors
  • 15:19 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2216.codfw.wmnet on all recursors
  • 15:19 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2215.codfw.wmnet on all recursors
  • 15:19 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 15:19 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2215.codfw.wmnet on all recursors
  • 15:19 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2214.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2214.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2213.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2213.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2212.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2212.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2211.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2211.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2210.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2210.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2209.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2209.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2208.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2208.codfw.wmnet on all recursors
  • 15:18 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2207.codfw.wmnet on all recursors
  • 15:17 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2207.codfw.wmnet on all recursors
  • 15:17 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:17 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2206.codfw.wmnet on all recursors
  • 15:17 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2206.codfw.wmnet on all recursors
  • 15:17 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2205.codfw.wmnet on all recursors
  • 15:17 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2205.codfw.wmnet on all recursors
  • 15:17 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2204.codfw.wmnet on all recursors
  • 15:17 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2204.codfw.wmnet on all recursors
  • 15:17 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2203.codfw.wmnet on all recursors
  • 15:17 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2203.codfw.wmnet on all recursors
  • 15:17 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2202.codfw.wmnet on all recursors
  • 15:17 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2202.codfw.wmnet on all recursors
  • 15:17 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2201.codfw.wmnet on all recursors
  • 15:16 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2201.codfw.wmnet on all recursors
  • 15:16 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2200.codfw.wmnet on all recursors
  • 15:16 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2200.codfw.wmnet on all recursors
  • 15:16 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2199.codfw.wmnet on all recursors
  • 15:16 claime: Cleaning up old tmp media files on codfw jobrunners
  • 15:16 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2199.codfw.wmnet on all recursors
  • 15:16 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2198.codfw.wmnet on all recursors
  • 15:16 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2198.codfw.wmnet on all recursors
  • 15:16 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2197.codfw.wmnet on all recursors
  • 15:16 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2197.codfw.wmnet on all recursors
  • 15:16 volans@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) db2196.codfw.wmnet on all recursors
  • 15:16 volans@cumin1002: START - Cookbook sre.dns.wipe-cache db2196.codfw.wmnet on all recursors
  • 15:13 cmooney@cumin1002: START - Cookbook sre.hosts.decommission for hosts testvm2001.codfw.wmnet
  • 15:11 volans@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:11 volans@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Deleted AAAA records from new DBs - volans@cumin1002"
  • 15:11 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1039.eqiad.wmnet with OS bookworm
  • 15:10 volans@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Deleted AAAA records from new DBs - volans@cumin1002"
  • 15:10 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host es1039.eqiad.wmnet with OS bookworm
  • 15:08 volans@cumin1002: START - Cookbook sre.dns.netbox
  • 15:06 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
  • 15:03 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1038.eqiad.wmnet with reason: host reimage
  • 15:02 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
  • 15:00 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts sretest2004.codfw.wmnet
  • 15:00 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:00 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cmooney@cumin1002"
  • 14:59 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sretest2004.codfw.wmnet decommissioned, removing all IPs except the asset tag one - cmooney@cumin1002"
  • 14:57 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1037.eqiad.wmnet with reason: host reimage
  • 14:57 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:56 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1039.eqiad.wmnet with OS bookworm
  • 14:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moscovium.eqiad.wmnet
  • 14:52 claime: Draining mw2260.codfw.wmnet mw2267.codfw.wmnet mw2310.codfw.wmnet mw2311.codfw.wmnet mw2312.codfw.wmnet mw2313.codfw.wmnet mw2317.codfw.wmnet mw2318.codfw.wmnet mw2319.codfw.wmnet kubernetes2030.codfw.wmnet kubernetes2029.codfw.wmnet kubernetes2057.codfw.wmnet for T355870
  • 14:52 cmooney@cumin1002: START - Cookbook sre.hosts.decommission for hosts sretest2004.codfw.wmnet
  • 14:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host moscovium.eqiad.wmnet
  • 14:50 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS bookworm
  • 14:50 cmooney@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host sretest2004.codfw.wmnet with OS bookworm
  • 14:47 claime: Depooling mw2324.codfw.wmnet,mw2323.codfw.wmnet,mw2259.codfw.wmnet,mw2261.codfw.wmnet,mw2262.codfw.wmnet,mw2263.codfw.wmnet,mw2264.codfw.wmnet,mw2265.codfw.wmnet,mw2266.codfw.wmnet,mw2268.codfw.wmnet,mw2269.codfw.wmnet,mw2270.codfw.wmnet,mw2314.codfw.wmnet,mw2315.codfw.wmnet,mw2316.codfw.wmnet,mw2320.codfw.wmnet,mw2321.codfw.wmnet,mw2322.codfw.wmnet for T355870
  • 14:45 claime: disregard previous depooling message for T355544
  • 14:44 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS bookworm
  • 14:41 volans: uploaded spicerack_8.4.0 to apt.wikimedia.org bullseye-wikimedia
  • 14:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 14:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 14:39 claime: depooling mw2325.codfw.wmnet,mw2326.codfw.wmnet,mw2327.codfw.wmnet,mw2328.codfw.wmnet,mw2329.codfw.wmnet,mw2330.codfw.wmnet,mw2331.codfw.wmnet,mw2332.codfw.wmnet,mw2333.codfw.wmnet,mw2334.codfw.wmnet for T355544
  • 14:36 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1035.eqiad.wmnet with OS bookworm
  • 14:36 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 14:35 claime: Adding 20G to root lv on mw2279
  • 14:33 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2004.codfw.wmnet with OS bookworm
  • 14:32 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2004.codfw.wmnet on all recursors
  • 14:32 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2004.codfw.wmnet on all recursors
  • 14:32 fabfur: restarting pybal on lvs2014,lvs2011,lvs2012 and lvs2013 for T355544
  • 14:29 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:29 jiji@cumin1002: conftool action : set/pooled=false; selector: dnsdisc=citoid,name=eqiad
  • 14:28 effie: depool citoid eqiad
  • 14:28 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:27 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2004.wikimedia.org on all recursors
  • 14:27 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2004.wikimedia.org on all recursors
  • 14:27 cmooney@cumin1002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
  • 14:27 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:24 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/zotero: sync
  • 14:24 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/zotero: sync
  • 14:23 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
  • 14:23 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/zotero: apply
  • 14:22 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:22 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for sretest2004 - cmooney@cumin1002"
  • 14:20 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for sretest2004 - cmooney@cumin1002"
  • 14:19 cmooney@cumin1002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2001.codfw.wmnet
  • 14:19 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2001.codfw.wmnet with OS bookworm
  • 14:18 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:11 herron: pyrra upgraded to 0.7.4-2 T351111
  • 14:10 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply
  • 14:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
  • 14:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 14:09 effie: force restarted all citoid pods in eqiad
  • 14:08 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: sync
  • 14:08 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: sync
  • 14:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 14:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 14:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 14:07 effie: force restarted all zotero pods in eqiad
  • 14:06 jiji@deploy2002: helmfile [eqiad] DONE helmfile.d/services/zotero: sync
  • 14:06 jiji@deploy2002: helmfile [eqiad] START helmfile.d/services/zotero: sync
  • 14:05 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2001.codfw.wmnet with reason: host reimage
  • 14:02 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2001.codfw.wmnet with reason: host reimage
  • 13:50 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2001.codfw.wmnet with OS bookworm
  • 13:19 XioNoX: remove unused 208.80.154.143/32 - 208.80.153.47/32 - 208.80.153.50/32 from Netbox
  • 13:17 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2001.codfw.wmnet - cmooney@cumin1002"
  • 13:16 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM testvm2001.codfw.wmnet - cmooney@cumin1002"
  • 13:16 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) testvm2001.codfw.wmnet on all recursors
  • 13:16 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache testvm2001.codfw.wmnet on all recursors
  • 13:16 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:16 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2001.codfw.wmnet - cmooney@cumin1002"
  • 13:15 taavi@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s2
  • 13:14 taavi@cumin1002: conftool action : set/pooled=inactive; selector: name=clouddb1018.eqiad.wmnet,service=s2
  • 13:14 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM testvm2001.codfw.wmnet - cmooney@cumin1002"
  • 13:12 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 13:12 cmooney@cumin1002: START - Cookbook sre.ganeti.makevm for new host testvm2001.codfw.wmnet
  • 12:48 claime: restarting apache2 on mw2278
  • 12:42 claime: restarting apache2 on mw2281
  • 12:40 cgoubert@cumin2002: conftool action : set/weight=25; selector: cluster=jobrunner,dc=codfw,name=mw22(59|63|64|65|66|78|79|81).*
  • 12:39 cgoubert@cumin2002: conftool action : set/weight=25; selector: cluster=videoscaler,dc=codfw,name=mw22(59|63|64|65|66|78|79|81).*
  • 12:39 claime: rebalancing videoscaler cluster: all E5-2650 to weight 25
  • 12:31 claime: Lowered weight and restarted apache on mw2281.codfw.wmnet
  • 12:30 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=mw2281.codfw.wmnet,cluster=videoscaler,dc=codfw
  • 12:30 cgoubert@cumin2002: conftool action : set/pooled=no; selector: name=mw2281.codfw.wmnet,cluster=videoscaler,dc=codfw
  • 12:29 cgoubert@cumin2002: conftool action : set/weight=20; selector: name=mw2281.codfw.wmnet,cluster=videoscaler,dc=codfw
  • 12:28 moritzm: installing perl security updates on bullseye
  • 12:23 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
  • 12:23 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
  • 12:23 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 12:22 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 12:22 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
  • 12:22 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
  • 12:21 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
  • 12:20 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
  • 12:18 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
  • 12:17 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
  • 12:17 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 12:15 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 12:15 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
  • 12:14 slyngshede@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage
  • 12:14 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
  • 12:14 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
  • 12:13 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
  • 12:11 slyngshede@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on idp-test1003.wikimedia.org with reason: host reimage
  • 12:10 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
  • 12:10 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
  • 12:09 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
  • 12:09 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
  • 12:09 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
  • 12:08 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
  • 12:08 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
  • 12:08 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
  • 12:05 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host es1025.eqiad.wmnet
  • 12:05 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
  • 12:04 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
  • 12:02 slyngshede@cumin1002: START - Cookbook sre.hosts.reimage for host idp-test1003.wikimedia.org with OS bookworm
  • 12:01 slyngshede@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM idp-test1003.wikimedia.org - slyngshede@cumin1002"
  • 12:01 slyngshede@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM idp-test1003.wikimedia.org - slyngshede@cumin1002"
  • 12:00 slyngshede@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) idp-test1003.wikimedia.org on all recursors
  • 12:00 slyngshede@cumin1002: START - Cookbook sre.dns.wipe-cache idp-test1003.wikimedia.org on all recursors
  • 12:00 slyngshede@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:00 slyngshede@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM idp-test1003.wikimedia.org - slyngshede@cumin1002"
  • 11:59 slyngshede@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM idp-test1003.wikimedia.org - slyngshede@cumin1002"
  • 11:58 claime: Expanding root lv on mw2281,mw2278 by 20G
  • 11:57 slyngshede@cumin1002: START - Cookbook sre.dns.netbox
  • 11:57 slyngshede@cumin1002: START - Cookbook sre.ganeti.makevm for new host idp-test1003.wikimedia.org
  • 11:52 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
  • 11:51 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
  • 11:47 akosiaris@deploy2002: Synchronized tests/src/ClusterConfigTest.php: (no justification provided) (duration: 09m 36s)
  • 11:46 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host es1025.eqiad.wmnet
  • 11:44 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
  • 11:43 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
  • 11:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset: apply
  • 11:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset: apply
  • 11:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 11:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2186.codfw.wmnet
  • 11:09 jynus@cumin1002: dbctl commit (dc=all): 'Repool db2117', diff saved to https://phabricator.wikimedia.org/P57997 and previous config saved to /var/cache/conftool/dbconfig/20240227-110952-jynus.json
  • 11:08 jynus@cumin1002: dbctl commit (dc=all): 'Depool db2117', diff saved to https://phabricator.wikimedia.org/P57996 and previous config saved to /var/cache/conftool/dbconfig/20240227-110828-jynus.json
  • 10:54 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2186.codfw.wmnet
  • 10:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2185.codfw.wmnet
  • 10:41 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2185.codfw.wmnet
  • 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2173.codfw.wmnet
  • 10:20 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es1037.eqiad.wmnet with OS bookworm
  • 10:20 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es1039.eqiad.wmnet with OS bookworm
  • 10:17 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es1038.eqiad.wmnet with OS bookworm
  • 10:16 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1039.eqiad.wmnet with OS bookworm
  • 10:15 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es1039.eqiad.wmnet with OS bookworm
  • 10:14 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS bookworm
  • 10:13 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es1037.eqiad.wmnet with OS bookworm
  • 10:11 klausman@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 10:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1040.eqiad.wmnet with OS bookworm
  • 10:11 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - marostegui@cumin1002"
  • 10:10 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2173.codfw.wmnet
  • 10:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS bookworm
  • 10:10 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - marostegui@cumin1002"
  • 10:09 marostegui@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host es1038.eqiad.wmnet with OS bookworm
  • 10:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1151.eqiad.wmnet
  • 10:06 klausman@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 10:03 jnuche@deploy2002: Finished scap: Backport for In RequestContext::setUser() also reset $this->skinName (T336504) (duration: 10m 12s)
  • 10:02 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1036.eqiad.wmnet with OS bookworm
  • 10:02 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 10:02 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 10:01 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 10:01 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 10:01 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 10:01 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 10:01 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 10:00 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 09:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1040.eqiad.wmnet with reason: host reimage
  • 09:55 jnuche@deploy2002: jnuche and tstarling: Continuing with sync
  • 09:54 jnuche@deploy2002: jnuche and tstarling: Backport for In RequestContext::setUser() also reset $this->skinName (T336504) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 09:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1036.eqiad.wmnet with OS bookworm
  • 09:53 marostegui@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - marostegui@cumin1002"
  • 09:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1040.eqiad.wmnet with reason: host reimage
  • 09:53 jnuche@deploy2002: Started scap: Backport for In RequestContext::setUser() also reset $this->skinName (T336504)
  • 09:44 marostegui@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - marostegui@cumin1002"
  • 09:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1040.eqiad.wmnet with OS bookworm
  • 09:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1039.eqiad.wmnet with OS bookworm
  • 09:38 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS bookworm
  • 09:37 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS bookworm
  • 09:31 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
  • 09:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
  • 09:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1035.eqiad.wmnet with OS bookworm
  • 09:15 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db1151.eqiad.wmnet
  • 09:14 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS bookworm
  • 09:12 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1035.eqiad.wmnet with reason: host reimage
  • 09:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1035.eqiad.wmnet with reason: host reimage
  • 09:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2134.codfw.wmnet
  • 08:56 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1035.eqiad.wmnet with OS bookworm
  • 08:52 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2134.codfw.wmnet
  • 08:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2029 (re)pooling @ 100%: After migration to 10.6 T358180', diff saved to https://phabricator.wikimedia.org/P57995 and previous config saved to /var/cache/conftool/dbconfig/20240227-085113-root.json
  • 08:47 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2132.codfw.wmnet
  • 08:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2029 (re)pooling @ 75%: After migration to 10.6 T358180', diff saved to https://phabricator.wikimedia.org/P57994 and previous config saved to /var/cache/conftool/dbconfig/20240227-083608-root.json
  • 08:36 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host db2132.codfw.wmnet
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2029 (re)pooling @ 25%: After migration to 10.6 T358180', diff saved to https://phabricator.wikimedia.org/P57992 and previous config saved to /var/cache/conftool/dbconfig/20240227-080559-root.json
  • 08:05 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet,service=s7
  • 08:05 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet,service=s2
  • 08:00 jmm@cumin2002: START - Cookbook sre.puppet.migrate-host for host dbproxy2001.codfw.wmnet
  • 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'es2029 (re)pooling @ 10%: After migration to 10.6 T358180', diff saved to https://phabricator.wikimedia.org/P57991 and previous config saved to /var/cache/conftool/dbconfig/20240227-075054-root.json
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'es2029 (re)pooling @ 5%: After migration to 10.6 T358180', diff saved to https://phabricator.wikimedia.org/P57990 and previous config saved to /var/cache/conftool/dbconfig/20240227-073549-root.json
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'es2029 (re)pooling @ 1%: After migration to 10.6 T358180', diff saved to https://phabricator.wikimedia.org/P57989 and previous config saved to /var/cache/conftool/dbconfig/20240227-072044-root.json
  • 07:20 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2029.codfw.wmnet with OS bookworm
  • 06:54 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2118.codfw.wmnet with reason: Maintenance
  • 06:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2118.codfw.wmnet with reason: Maintenance
  • 06:42 XioNoX: Netbox: set ENFORCE_GLOBAL_UNIQUE to True - T336275
  • 06:41 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2029.codfw.wmnet with OS bookworm
  • 06:37 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2029 T358180', diff saved to https://phabricator.wikimedia.org/P57988 and previous config saved to /var/cache/conftool/dbconfig/20240227-063707-root.json
  • 06:35 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1014.eqiad.wmnet,service=s2
  • 06:35 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1014.eqiad.wmnet,service=s7
  • 06:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Master upgrade x2 T353499
  • 06:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Master upgrade x2 T353499
  • 06:06 kart_: cxserver: Removed dictionary support
  • 05:49 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 05:48 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 05:46 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 05:46 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 05:41 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 05:41 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 04:56 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.20 refs T354438 (duration: 52m 18s)
  • 04:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T357189)', diff saved to https://phabricator.wikimedia.org/P57987 and previous config saved to /var/cache/conftool/dbconfig/20240227-042703-arnaudb.json
  • 04:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P57986 and previous config saved to /var/cache/conftool/dbconfig/20240227-041156-arnaudb.json
  • 04:04 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.20 refs T354438
  • 04:02 mwpresync@deploy2002: Pruned MediaWiki: 1.42.0-wmf.17 (duration: 02m 00s)
  • 03:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P57985 and previous config saved to /var/cache/conftool/dbconfig/20240227-035650-arnaudb.json
  • 03:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T357189)', diff saved to https://phabricator.wikimedia.org/P57984 and previous config saved to /var/cache/conftool/dbconfig/20240227-034144-arnaudb.json
  • 03:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2194 (T357189)', diff saved to https://phabricator.wikimedia.org/P57983 and previous config saved to /var/cache/conftool/dbconfig/20240227-032037-arnaudb.json
  • 03:20 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 03:20 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 03:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T357189)', diff saved to https://phabricator.wikimedia.org/P57982 and previous config saved to /var/cache/conftool/dbconfig/20240227-032015-arnaudb.json
  • 03:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P57981 and previous config saved to /var/cache/conftool/dbconfig/20240227-030508-arnaudb.json
  • 02:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P57980 and previous config saved to /var/cache/conftool/dbconfig/20240227-025002-arnaudb.json
  • 02:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T357189)', diff saved to https://phabricator.wikimedia.org/P57979 and previous config saved to /var/cache/conftool/dbconfig/20240227-023456-arnaudb.json
  • 02:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T357189)', diff saved to https://phabricator.wikimedia.org/P57978 and previous config saved to /var/cache/conftool/dbconfig/20240227-021357-arnaudb.json
  • 02:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 02:13 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 02:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T357189)', diff saved to https://phabricator.wikimedia.org/P57977 and previous config saved to /var/cache/conftool/dbconfig/20240227-021333-arnaudb.json
  • 01:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P57976 and previous config saved to /var/cache/conftool/dbconfig/20240227-015827-arnaudb.json
  • 01:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P57975 and previous config saved to /var/cache/conftool/dbconfig/20240227-014321-arnaudb.json
  • 01:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T357189)', diff saved to https://phabricator.wikimedia.org/P57974 and previous config saved to /var/cache/conftool/dbconfig/20240227-012814-arnaudb.json
  • 01:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T357189)', diff saved to https://phabricator.wikimedia.org/P57973 and previous config saved to /var/cache/conftool/dbconfig/20240227-010344-arnaudb.json
  • 01:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 01:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 01:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T357189)', diff saved to https://phabricator.wikimedia.org/P57972 and previous config saved to /var/cache/conftool/dbconfig/20240227-010321-arnaudb.json
  • 00:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P57971 and previous config saved to /var/cache/conftool/dbconfig/20240227-004815-arnaudb.json
  • 00:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P57970 and previous config saved to /var/cache/conftool/dbconfig/20240227-003309-arnaudb.json
  • 00:30 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 00:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T357189)', diff saved to https://phabricator.wikimedia.org/P57969 and previous config saved to /var/cache/conftool/dbconfig/20240227-001802-arnaudb.json
  • 00:16 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1035.eqiad.wmnet with reason: host reimage
  • 00:13 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1035.eqiad.wmnet with reason: host reimage

2024-02-26

  • 23:59 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1035.eqiad.wmnet with OS bookworm
  • 23:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T357189)', diff saved to https://phabricator.wikimedia.org/P57968 and previous config saved to /var/cache/conftool/dbconfig/20240226-235539-arnaudb.json
  • 23:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 23:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 23:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 23:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 23:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T357189)', diff saved to https://phabricator.wikimedia.org/P57967 and previous config saved to /var/cache/conftool/dbconfig/20240226-235500-arnaudb.json
  • 23:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P57966 and previous config saved to /var/cache/conftool/dbconfig/20240226-233953-arnaudb.json
  • 23:26 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host an-redacteddb1001.eqiad.wmnet with OS bookworm
  • 23:26 btullis@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 23:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P57965 and previous config saved to /var/cache/conftool/dbconfig/20240226-232443-arnaudb.json
  • 23:11 btullis@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1002"
  • 23:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T357189)', diff saved to https://phabricator.wikimedia.org/P57964 and previous config saved to /var/cache/conftool/dbconfig/20240226-230934-arnaudb.json
  • 23:06 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic plugin upgrade - ryankemper@cumin2002 - T356651
  • 23:00 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1040.eqiad.wmnet with reason: host reimage
  • 22:57 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage
  • 22:55 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1040.eqiad.wmnet with reason: host reimage
  • 22:54 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage
  • 22:46 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic plugin upgrade - ryankemper@cumin2002 - T356651
  • 22:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T357189)', diff saved to https://phabricator.wikimedia.org/P57963 and previous config saved to /var/cache/conftool/dbconfig/20240226-224557-arnaudb.json
  • 22:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 22:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 22:45 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
  • 22:42 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1036.eqiad.wmnet with reason: host reimage
  • 22:42 TimStarling: on snapshot1010 killed PHP processes left over from kill -9 of python parents T358458
  • 22:42 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bookworm
  • 22:41 btullis@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-redacteddb1001.eqiad.wmnet with OS bookworm
  • 22:38 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1040.eqiad.wmnet with OS bookworm
  • 22:29 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: cloudelastic restart
  • 22:28 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: cloudelastic restart
  • 22:27 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1035.eqiad.wmnet with reason: host reimage
  • 22:25 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 22:24 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 22:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T357189)', diff saved to https://phabricator.wikimedia.org/P57962 and previous config saved to /var/cache/conftool/dbconfig/20240226-222435-arnaudb.json
  • 22:24 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1035.eqiad.wmnet with reason: host reimage
  • 22:20 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS bookworm
  • 22:18 ryankemper@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.UPGRADE (2 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic plugin upgrade - ryankemper@cumin2002 - T356651
  • 22:15 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es1036.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:14 jclark@cumin1002: START - Cookbook sre.hosts.provision for host es1036.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P57961 and previous config saved to /var/cache/conftool/dbconfig/20240226-220928-arnaudb.json
  • 22:06 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (2 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic plugin upgrade - ryankemper@cumin2002 - T356651
  • 22:02 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 08m 37s)
  • 21:56 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1035.eqiad.wmnet with OS bookworm
  • 21:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P57960 and previous config saved to /var/cache/conftool/dbconfig/20240226-215422-arnaudb.json
  • 21:54 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 08m 26s)
  • 21:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T357189)', diff saved to https://phabricator.wikimedia.org/P57959 and previous config saved to /var/cache/conftool/dbconfig/20240226-213916-arnaudb.json
  • 21:38 cjming@deploy2002: Finished scap: Backport for Fix regression in WebM transcodes breaking audio (T358342) (duration: 11m 14s)
  • 21:30 cjming@deploy2002: cjming and bvibber: Continuing with sync
  • 21:29 cjming@deploy2002: cjming and bvibber: Backport for Fix regression in WebM transcodes breaking audio (T358342) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:27 cjming@deploy2002: Started scap: Backport for Fix regression in WebM transcodes breaking audio (T358342)
  • 21:22 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint1004.eqiad.wmnet with OS bullseye
  • 21:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2109 (T357189)', diff saved to https://phabricator.wikimedia.org/P57958 and previous config saved to /var/cache/conftool/dbconfig/20240226-211619-arnaudb.json
  • 21:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 21:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 21:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T357189)', diff saved to https://phabricator.wikimedia.org/P57957 and previous config saved to /var/cache/conftool/dbconfig/20240226-211557-arnaudb.json
  • 21:10 dzahn@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint1004.eqiad.wmnet with reason: host reimage
  • 21:07 dzahn@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint1004.eqiad.wmnet with reason: host reimage
  • 21:02 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:02 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P57956 and previous config saved to /var/cache/conftool/dbconfig/20240226-210050-arnaudb.json
  • 20:58 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host contint1004.eqiad.wmnet with OS bullseye
  • 20:58 dzahn@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host contint1004.eqiad.wmnet
  • 20:57 dzahn@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host contint1004.eqiad.wmnet with OS bullseye
  • 20:52 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1040.eqiad.wmnet with OS bookworm
  • 20:52 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1039.eqiad.wmnet with OS bookworm
  • 20:52 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1038.eqiad.wmnet with OS bookworm
  • 20:51 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1036.eqiad.wmnet with OS bookworm
  • 20:46 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1037.eqiad.wmnet with OS bookworm
  • 20:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P57955 and previous config saved to /var/cache/conftool/dbconfig/20240226-204544-arnaudb.json
  • 20:44 mutante: T358237 used the next hostname number, 1004, to avoid the duplicate IP issue. makevm cookbook is at attempt 103/240 to detect a reboot of the VM and uptime just keeps going up. used the "gnt-instance console --show-cmd" trick to get a console despite https://phabricator.wikimedia.org/T309724 - was missing partman config
  • 20:41 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1035.eqiad.wmnet with OS bookworm
  • 20:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T357189)', diff saved to https://phabricator.wikimedia.org/P57954 and previous config saved to /var/cache/conftool/dbconfig/20240226-203038-arnaudb.json
  • 20:19 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host contint1004.eqiad.wmnet with OS bullseye
  • 20:18 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2003.codfw.wmnet with OS bookworm
  • 20:18 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM contint1004.eqiad.wmnet - dzahn@cumin1002"
  • 20:17 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM contint1004.eqiad.wmnet - dzahn@cumin1002"
  • 20:17 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) contint1004.eqiad.wmnet on all recursors
  • 20:17 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache contint1004.eqiad.wmnet on all recursors
  • 20:17 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:17 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM contint1004.eqiad.wmnet - dzahn@cumin1002"
  • 20:16 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM contint1004.eqiad.wmnet - dzahn@cumin1002"
  • 20:14 sukhe: running dummy authdns-update
  • 20:12 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 20:12 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host contint1004.eqiad.wmnet
  • 20:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2105 (T357189)', diff saved to https://phabricator.wikimedia.org/P57953 and previous config saved to /var/cache/conftool/dbconfig/20240226-200734-arnaudb.json
  • 20:07 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 20:07 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
  • 20:07 bblack@cumin1002: conftool action : set/pooled=no; selector: cluster=dnsbox,service=authdns-update,name=dns3001.wikimedia.org
  • 20:03 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2003.codfw.wmnet with reason: host reimage
  • 20:02 bblack@cumin1002: conftool action : set/pooled=yes; selector: cluster=dnsbox,service=authdns-update,name=dns3003.wikimedia.org
  • 20:01 bblack@cumin1002: conftool action : set/pooled=no; selector: cluster=dnsbox,service=authdns-update,name=dns3003.wikimedia.org
  • 20:00 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2003.codfw.wmnet with reason: host reimage
  • 20:00 bblack@cumin1002: conftool action : set/pooled=no; selector: cluster=dnsbox,service=authdns-update,name=dns3001.wikimedia.org
  • 19:59 bblack@cumin1002: conftool action : set/pooled=yes; selector: cluster=dnsbox,service=authdns-update,name=dns6002.wikimedia.org
  • 19:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 19:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 19:45 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2003.codfw.wmnet with OS bookworm
  • 19:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 19:44 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 19:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T357189)', diff saved to https://phabricator.wikimedia.org/P57952 and previous config saved to /var/cache/conftool/dbconfig/20240226-194427-arnaudb.json
  • 19:43 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=dns6001.wikimedia.org,service=authdns-update
  • 19:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1040.eqiad.wmnet with OS bookworm
  • 19:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1039.eqiad.wmnet with OS bookworm
  • 19:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS bookworm
  • 19:31 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS bookworm
  • 19:30 cmooney@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host sretest2004.wikimedia.org with OS bookworm
  • 19:30 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2003.codfw.wmnet with OS bookworm
  • 19:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P57951 and previous config saved to /var/cache/conftool/dbconfig/20240226-192920-arnaudb.json
  • 19:26 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS bookworm
  • 19:21 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1035.eqiad.wmnet with OS bookworm
  • 19:15 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=dns6002.wikimedia.org,service=authdns-update
  • 19:15 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2003.codfw.wmnet with reason: host reimage
  • 19:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223', diff saved to https://phabricator.wikimedia.org/P57950 and previous config saved to /var/cache/conftool/dbconfig/20240226-191414-arnaudb.json
  • 19:13 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:12 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2003.codfw.wmnet with reason: host reimage
  • 19:11 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 19:10 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync - dzahn@cumin1002"
  • 19:09 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync - dzahn@cumin1002"
  • 19:09 mutante: decom cookbook finishes with 0 but does not remove DNS record of virtual machine T358237
  • 19:06 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:06 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:05 dzahn@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host contint1003.eqiad.wmnet
  • 19:04 dzahn@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 19:04 mutante: T358237 - makevm cookbook was interrupted by accident. re-running it would create a second IP with the same DNS name; running the decom cookbook also fails, so currently stuck
  • 19:02 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 19:02 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host contint1003.eqiad.wmnet
  • 19:02 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es1040
  • 19:02 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host es1040
  • 19:02 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es1039
  • 19:01 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es1038
  • 19:01 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host es1039
  • 19:01 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host es1038
  • 19:01 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es1037
  • 19:01 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es1036
  • 19:01 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host es1037
  • 19:00 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host es1036
  • 19:00 jclark@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host es1035
  • 19:00 jclark@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host es1035
  • 18:59 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:59 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt for es1036-40 - jclark@cumin1002"
  • 18:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1223 (T357189)', diff saved to https://phabricator.wikimedia.org/P57949 and previous config saved to /var/cache/conftool/dbconfig/20240226-185907-arnaudb.json
  • 18:56 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt for es1036-40 - jclark@cumin1002"
  • 18:55 dzahn@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.eqiad.wmnet with OS bullseye
  • 18:54 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 18:51 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2003.codfw.wmnet with OS bookworm
  • 18:49 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2004.wikimedia.org with OS bookworm
  • 18:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1223 (T357189)', diff saved to https://phabricator.wikimedia.org/P57948 and previous config saved to /var/cache/conftool/dbconfig/20240226-184903-arnaudb.json
  • 18:48 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1223.eqiad.wmnet with reason: Maintenance
  • 18:48 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1223.eqiad.wmnet with reason: Maintenance
  • 18:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T357189)', diff saved to https://phabricator.wikimedia.org/P57947 and previous config saved to /var/cache/conftool/dbconfig/20240226-184841-arnaudb.json
  • 18:48 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host sretest2004.wikimedia.org
  • 18:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host es1038.eqiad.wmnet with OS bookworm
  • 18:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host es1039.eqiad.wmnet with OS bookworm
  • 18:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host es1040.eqiad.wmnet with OS bookworm
  • 18:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host es1037.eqiad.wmnet with OS bookworm
  • 18:42 cmooney@cumin1002: START - Cookbook sre.hosts.dhcp for host sretest2004.wikimedia.org
  • 18:42 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host es1035.eqiad.wmnet with OS bookworm
  • 18:41 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host es1036.eqiad.wmnet with OS bookworm
  • 18:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P57946 and previous config saved to /var/cache/conftool/dbconfig/20240226-183334-arnaudb.json
  • 18:29 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:29 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:28 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2004.wikimedia.org with OS bookworm
  • 18:19 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:18 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P57945 and previous config saved to /var/cache/conftool/dbconfig/20240226-181827-arnaudb.json
  • 18:16 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 18:16 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 18:14 Daimona: T357007 Running mwscript CampaignEvents:GenerateInvitationList --wiki=metawiki --listfile=/home/daimona/list.txt
  • 18:13 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host contint1003.eqiad.wmnet with OS bullseye
  • 18:11 dzahn@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host contint1003.eqiad.wmnet
  • 18:11 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) contint1003.eqiad.wmnet on all recursors
  • 18:11 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache contint1003.eqiad.wmnet on all recursors
  • 18:11 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:09 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:09 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) contint1003.eqiad.wmnet on all recursors
  • 18:09 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache contint1003.eqiad.wmnet on all recursors
  • 18:09 dzahn@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 18:07 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:07 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host contint1003.eqiad.wmnet
  • 18:07 dzahn@cumin1002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host contint1003.eqiad.wmnet
  • 18:07 dzahn@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host contint1003.eqiad.wmnet with OS bullseye
  • 18:06 dzahn@cumin1002: START - Cookbook sre.hosts.reimage for host contint1003.eqiad.wmnet with OS bullseye
  • 18:06 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM contint1003.eqiad.wmnet - dzahn@cumin1002"
  • 18:06 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM contint1003.eqiad.wmnet - dzahn@cumin1002"
  • 18:06 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) contint1003.eqiad.wmnet on all recursors
  • 18:05 dzahn@cumin1002: START - Cookbook sre.dns.wipe-cache contint1003.eqiad.wmnet on all recursors
  • 18:05 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:05 dzahn@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM contint1003.eqiad.wmnet - dzahn@cumin1002"
  • 18:04 dzahn@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM contint1003.eqiad.wmnet - dzahn@cumin1002"
  • 18:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T357189)', diff saved to https://phabricator.wikimedia.org/P57944 and previous config saved to /var/cache/conftool/dbconfig/20240226-180321-arnaudb.json
  • 18:01 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 18:00 dzahn@cumin1002: START - Cookbook sre.ganeti.makevm for new host contint1003.eqiad.wmnet
  • 17:59 dzahn@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 17:58 dzahn@cumin1002: START - Cookbook sre.dns.netbox
  • 17:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T357189)', diff saved to https://phabricator.wikimedia.org/P57943 and previous config saved to /var/cache/conftool/dbconfig/20240226-175315-arnaudb.json
  • 17:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 17:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 17:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 17:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 17:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T357189)', diff saved to https://phabricator.wikimedia.org/P57942 and previous config saved to /var/cache/conftool/dbconfig/20240226-175231-arnaudb.json
  • 17:51 sukhe: running dummy authdns-update to confirm working ferm rules
  • 17:41 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-hd1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:38 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 17:38 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P57941 and previous config saved to /var/cache/conftool/dbconfig/20240226-173725-arnaudb.json
  • 17:35 denisse: Enabled meta-monitoring for alert1001 - T333615
  • 17:33 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1040.eqiad.wmnet with OS bookworm
  • 17:33 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1039.eqiad.wmnet with OS bookworm
  • 17:33 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1038.eqiad.wmnet with OS bookworm
  • 17:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1037.eqiad.wmnet with OS bookworm
  • 17:32 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1036.eqiad.wmnet with OS bookworm
  • 17:31 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host es1035.eqiad.wmnet with OS bookworm
  • 17:22 denisse@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on alert2001.wikimedia.org with reason: host reimage
  • 17:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P57940 and previous config saved to /var/cache/conftool/dbconfig/20240226-172218-arnaudb.json
  • 17:18 denisse@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on alert2001.wikimedia.org with reason: host reimage
  • 17:16 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-hd1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T357189)', diff saved to https://phabricator.wikimedia.org/P57939 and previous config saved to /var/cache/conftool/dbconfig/20240226-170712-arnaudb.json
  • 17:05 vriley@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['logging-hd1001']
  • 17:04 vriley@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logging-hd1001']
  • 17:04 vriley@cumin1002: START - Cookbook sre.hosts.provision for host logging-hd1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:03 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logging-hd1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 17:00 denisse@cumin2002: START - Cookbook sre.hosts.reimage for host alert2001.wikimedia.org with OS bookworm
  • 16:59 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1036.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T357189)', diff saved to https://phabricator.wikimedia.org/P57938 and previous config saved to /var/cache/conftool/dbconfig/20240226-165730-arnaudb.json
  • 16:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 16:57 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 16:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T357189)', diff saved to https://phabricator.wikimedia.org/P57937 and previous config saved to /var/cache/conftool/dbconfig/20240226-165707-arnaudb.json
  • 16:55 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logging-hd1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:55 sukhe: sudo cumin 'A:dns-rec and not P{dns6001*}' "run-puppet-agent --enable 'merging CR'"
  • 16:54 sukhe: re-enable Puppet on A:dns-rec and run agent
  • 16:48 vriley@cumin1002: START - Cookbook sre.hosts.provision for host logging-hd1003.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:47 sukhe: disable puppet on A:dns-rec to merge CR 1006532
  • 16:46 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:46 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt logging-hd1003 - vriley@cumin1002"
  • 16:46 sukhe@puppetmaster1001: conftool action : set/weight=100; selector: cluster=dnsbox
  • 16:46 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt logging-hd1003 - vriley@cumin1002"
  • 16:45 vriley@cumin1002: START - Cookbook sre.hosts.provision for host logging-hd1002.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:43 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 16:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P57936 and previous config saved to /var/cache/conftool/dbconfig/20240226-164201-arnaudb.json
  • 16:39 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1038.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:39 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:38 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 16:37 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:37 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt logging-hd1002 - vriley@cumin1002"
  • 16:36 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt logging-hd1002 - vriley@cumin1002"
  • 16:34 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 16:30 denisse@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host alert2001.wikimedia.org with OS bookworm
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P57935 and previous config saved to /var/cache/conftool/dbconfig/20240226-162655-arnaudb.json
  • 16:23 vriley@cumin1002: START - Cookbook sre.hosts.provision for host logging-hd1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:22 sukhe: etcd: purging /conftool/v1/dnsbox: old schema, deprecated: T347054
  • 16:20 vriley@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 16:19 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 16:18 jclark@cumin1002: START - Cookbook sre.hosts.provision for host es1036.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:13 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es1037']
  • 16:13 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es1037']
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T357189)', diff saved to https://phabricator.wikimedia.org/P57933 and previous config saved to /var/cache/conftool/dbconfig/20240226-161148-arnaudb.json
  • 16:09 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2004.wikimedia.org with OS bookworm
  • 16:05 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es1035']
  • 16:05 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es1035']
  • 16:04 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es1036']
  • 16:04 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es1036']
  • 16:04 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es1037']
  • 16:03 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es1035']
  • 16:03 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['es1036']
  • 16:03 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es1037']
  • 16:03 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es1036']
  • 16:02 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['es1035']
  • 16:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T357189)', diff saved to https://phabricator.wikimedia.org/P57932 and previous config saved to /var/cache/conftool/dbconfig/20240226-160206-arnaudb.json
  • 16:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 16:02 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 16:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T357189)', diff saved to https://phabricator.wikimedia.org/P57931 and previous config saved to /var/cache/conftool/dbconfig/20240226-160143-arnaudb.json
  • 15:59 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1035.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:59 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es1036.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:58 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1040.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:58 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1037.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:58 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1039.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P57930 and previous config saved to /var/cache/conftool/dbconfig/20240226-154637-arnaudb.json
  • 15:45 jclark@cumin1002: START - Cookbook sre.hosts.provision for host es1038.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:44 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es1038.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:41 jclark@cumin1002: START - Cookbook sre.hosts.provision for host es1038.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:41 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host es1038.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P57929 and previous config saved to /var/cache/conftool/dbconfig/20240226-153131-arnaudb.json
  • 15:28 klausman@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 15:27 klausman@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 15:25 klausman@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 15:25 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster - update 3 (duration: 00m 12s)
  • 15:25 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster - update 3
  • 15:23 klausman@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 15:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host es1040.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:21 jclark@cumin1002: START - Cookbook sre.hosts.provision for host es1039.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:20 jclark@cumin1002: START - Cookbook sre.hosts.provision for host es1038.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:20 jclark@cumin1002: START - Cookbook sre.hosts.provision for host es1037.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:20 jclark@cumin1002: START - Cookbook sre.hosts.provision for host es1036.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:18 jclark@cumin1002: START - Cookbook sre.hosts.provision for host es1035.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:17 klausman@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 15:16 klausman@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 15:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T357189)', diff saved to https://phabricator.wikimedia.org/P57928 and previous config saved to /var/cache/conftool/dbconfig/20240226-151624-arnaudb.json
  • 15:16 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:16 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt for es1036-40 - jclark@cumin1002"
  • 15:15 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster - update 2 (duration: 00m 12s)
  • 15:15 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster - update 2
  • 15:15 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: added network and mgmt for es1036-40 - jclark@cumin1002"
  • 15:13 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 15:13 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 15:12 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 15:11 denisse@cumin2002: START - Cookbook sre.hosts.reimage for host alert2001.wikimedia.org with OS bookworm
  • 15:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T357189)', diff saved to https://phabricator.wikimedia.org/P57927 and previous config saved to /var/cache/conftool/dbconfig/20240226-150639-arnaudb.json
  • 15:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 15:06 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 15:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T357189)', diff saved to https://phabricator.wikimedia.org/P57926 and previous config saved to /var/cache/conftool/dbconfig/20240226-150606-arnaudb.json
  • 15:03 denisse: Disabling meta-monitoring for the alert hosts - T333615
  • 15:02 denisse: Disabling meta-monitoring for the alert hosts - T333615
  • 14:52 kevinbazira@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 14:52 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
  • 14:51 fabfur: repooled and reactivated puppet on cp4037 to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1006489 (T358105)
  • 14:51 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:51 fabfur@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet,service=(cdn|ats-be)
  • 14:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P57925 and previous config saved to /var/cache/conftool/dbconfig/20240226-145059-arnaudb.json
  • 14:48 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:48 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 14:48 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 14:48 cgoubert@deploy2002: Finished scap: Backport for Enable $wgLocalHTTPProxy on all wikis (T298265) (duration: 13m 24s)
  • 14:47 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 14:47 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:46 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 14:46 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:46 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 14:42 fabfur: depooled and deactivated puppet on cp4037 to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1006489 (T358105)
  • 14:39 cgoubert@deploy2002: cgoubert: Continuing with sync
  • 14:36 cgoubert@deploy2002: cgoubert: Backport for Enable $wgLocalHTTPProxy on all wikis (T298265) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:35 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P57924 and previous config saved to /var/cache/conftool/dbconfig/20240226-143553-arnaudb.json
  • 14:34 cgoubert@deploy2002: Started scap: Backport for Enable $wgLocalHTTPProxy on all wikis (T298265)
  • 14:26 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Remove the Collection extension from wikisource (T358437) (duration: 11m 49s)
  • 14:23 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host sretest2004.wikimedia.org with OS bookworm
  • 14:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T357189)', diff saved to https://phabricator.wikimedia.org/P57923 and previous config saved to /var/cache/conftool/dbconfig/20240226-142046-arnaudb.json
  • 14:17 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and soda: Continuing with sync
  • 14:15 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and soda: Backport for Remove the Collection extension from wikisource (T358437) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for Remove the Collection extension from wikisource (T358437)
  • 14:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T357189)', diff saved to https://phabricator.wikimedia.org/P57922 and previous config saved to /var/cache/conftool/dbconfig/20240226-141107-arnaudb.json
  • 14:11 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 14:10 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 13:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 13:09 claime: trafficserver: move 50% of traffic to mw on k8s - T357507
  • 13:06 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 13:06 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2004.wikimedia.org with OS bookworm
  • 13:06 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 13:06 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) sretest2004.wikimedia.org on all recursors
  • 13:05 cmooney@cumin1002: START - Cookbook sre.dns.wipe-cache sretest2004.wikimedia.org on all recursors
  • 13:04 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 13:04 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 13:04 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 13:04 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 13:04 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 13:04 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:04 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for sretest2004 - cmooney@cumin1002"
  • 13:03 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 13:03 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for sretest2004 - cmooney@cumin1002"
  • 13:00 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 12:55 Dreamy_Jazz: Restarting MediaModeration scanning maintenance script - See https://wikitech.wikimedia.org/wiki/MediaModeration
  • 12:07 btullis@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage
  • 12:04 btullis@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on an-redacteddb1001.eqiad.wmnet with reason: host reimage
  • 11:44 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bookworm
  • 11:42 btullis@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-redacteddb1001.eqiad.wmnet with OS bookworm
  • 11:41 claime: Restarting failed mediawiki_job_generatecaptcha
  • 11:20 Lucas_WMDE: STOP persistRevisionThreadItems on viwiki for T315510 again, tons of errors (didn’t even respond to Ctrl+C so I `sudo -u www-data kill`’ed it)
  • 11:18 fabfur: enabled puppet on 'A:cp' to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1005548 (T358105, T358107)
  • 11:18 btullis@cumin1002: END (ERROR) - Cookbook sre.presto.roll-restart-workers (exit_code=97) for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 11:13 btullis@cumin1002: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons.
  • 11:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 11:07 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 10:47 btullis@cumin1002: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bookworm
  • 10:36 fabfur: enabled puppet on 'A:cp-ulsfo' to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1005548 (T358105, T358107)
  • 10:29 jayme@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw2442.codfw.wmnet
  • 10:27 taavi: upgrading wikitech-static to mediawiki 1.41 T357880
  • 10:07 moritzm: installing perl security updates
  • 10:04 fabfur: disabled puppet on all cp hosts to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1005548 (T358105, T358107)
  • 09:23 Emperor: unmute the outbound port utilisation over 80% alert T358455
  • 09:12 jayme@cumin1002: START - Cookbook sre.hosts.reboot-single for host mw2442.codfw.wmnet
  • 09:10 jayme@cumin1002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host mw2442.codfw.wmnet
  • 09:10 jayme@cumin1002: START - Cookbook sre.hosts.reboot-single for host mw2442.codfw.wmnet
  • 09:00 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on etherpad1003.eqiad.wmnet with reason: Upgrade etherpad and switch to bookworm
  • 09:00 jelto@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on etherpad1003.eqiad.wmnet with reason: Upgrade etherpad and switch to bookworm
  • 08:58 slyngs: IDP switchover to idp2002
  • 08:51 XioNoX: deploy "facebookexternalhit" varnish 403 - T358455

2024-02-25

  • 22:47 Emperor: mute the outbound port utilisation over 80% alert
  • 00:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T357189)', diff saved to https://phabricator.wikimedia.org/P57920 and previous config saved to /var/cache/conftool/dbconfig/20240225-005423-arnaudb.json
  • 00:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P57919 and previous config saved to /var/cache/conftool/dbconfig/20240225-003916-arnaudb.json
  • 00:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P57918 and previous config saved to /var/cache/conftool/dbconfig/20240225-002410-arnaudb.json
  • 00:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T357189)', diff saved to https://phabricator.wikimedia.org/P57917 and previous config saved to /var/cache/conftool/dbconfig/20240225-000904-arnaudb.json

2024-02-24

  • 23:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T357189)', diff saved to https://phabricator.wikimedia.org/P57916 and previous config saved to /var/cache/conftool/dbconfig/20240224-230912-arnaudb.json
  • 23:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 23:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 23:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T357189)', diff saved to https://phabricator.wikimedia.org/P57915 and previous config saved to /var/cache/conftool/dbconfig/20240224-230850-arnaudb.json
  • 22:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P57914 and previous config saved to /var/cache/conftool/dbconfig/20240224-225343-arnaudb.json
  • 22:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P57913 and previous config saved to /var/cache/conftool/dbconfig/20240224-223837-arnaudb.json
  • 22:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T357189)', diff saved to https://phabricator.wikimedia.org/P57912 and previous config saved to /var/cache/conftool/dbconfig/20240224-222331-arnaudb.json
  • 21:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T357189)', diff saved to https://phabricator.wikimedia.org/P57911 and previous config saved to /var/cache/conftool/dbconfig/20240224-212414-arnaudb.json
  • 21:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 21:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 21:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 21:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 21:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T357189)', diff saved to https://phabricator.wikimedia.org/P57910 and previous config saved to /var/cache/conftool/dbconfig/20240224-212336-arnaudb.json
  • 21:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P57909 and previous config saved to /var/cache/conftool/dbconfig/20240224-210830-arnaudb.json
  • 20:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P57908 and previous config saved to /var/cache/conftool/dbconfig/20240224-205323-arnaudb.json
  • 20:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T357189)', diff saved to https://phabricator.wikimedia.org/P57907 and previous config saved to /var/cache/conftool/dbconfig/20240224-203816-arnaudb.json
  • 19:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T357189)', diff saved to https://phabricator.wikimedia.org/P57906 and previous config saved to /var/cache/conftool/dbconfig/20240224-193712-arnaudb.json
  • 19:37 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 19:36 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 19:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T357189)', diff saved to https://phabricator.wikimedia.org/P57905 and previous config saved to /var/cache/conftool/dbconfig/20240224-193651-arnaudb.json
  • 19:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P57904 and previous config saved to /var/cache/conftool/dbconfig/20240224-192144-arnaudb.json
  • 19:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P57903 and previous config saved to /var/cache/conftool/dbconfig/20240224-190638-arnaudb.json
  • 18:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T357189)', diff saved to https://phabricator.wikimedia.org/P57902 and previous config saved to /var/cache/conftool/dbconfig/20240224-185132-arnaudb.json
  • 17:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2140 (T357189)', diff saved to https://phabricator.wikimedia.org/P57901 and previous config saved to /var/cache/conftool/dbconfig/20240224-174941-arnaudb.json
  • 17:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 17:49 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 16:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 16:56 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T357189)', diff saved to https://phabricator.wikimedia.org/P57900 and previous config saved to /var/cache/conftool/dbconfig/20240224-165636-arnaudb.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P57899 and previous config saved to /var/cache/conftool/dbconfig/20240224-164129-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137', diff saved to https://phabricator.wikimedia.org/P57898 and previous config saved to /var/cache/conftool/dbconfig/20240224-162623-arnaudb.json
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137 (T357189)', diff saved to https://phabricator.wikimedia.org/P57897 and previous config saved to /var/cache/conftool/dbconfig/20240224-161117-arnaudb.json
  • 15:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2137 (T357189)', diff saved to https://phabricator.wikimedia.org/P57896 and previous config saved to /var/cache/conftool/dbconfig/20240224-151234-arnaudb.json
  • 15:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 15:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 15:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T357189)', diff saved to https://phabricator.wikimedia.org/P57895 and previous config saved to /var/cache/conftool/dbconfig/20240224-151212-arnaudb.json
  • 14:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P57894 and previous config saved to /var/cache/conftool/dbconfig/20240224-145706-arnaudb.json
  • 14:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P57893 and previous config saved to /var/cache/conftool/dbconfig/20240224-144200-arnaudb.json
  • 14:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T357189)', diff saved to https://phabricator.wikimedia.org/P57892 and previous config saved to /var/cache/conftool/dbconfig/20240224-142653-arnaudb.json
  • 12:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T357189)', diff saved to https://phabricator.wikimedia.org/P57891 and previous config saved to /var/cache/conftool/dbconfig/20240224-124741-arnaudb.json
  • 12:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 12:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 12:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T357189)', diff saved to https://phabricator.wikimedia.org/P57890 and previous config saved to /var/cache/conftool/dbconfig/20240224-124709-arnaudb.json
  • 12:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P57889 and previous config saved to /var/cache/conftool/dbconfig/20240224-123203-arnaudb.json
  • 12:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P57888 and previous config saved to /var/cache/conftool/dbconfig/20240224-121657-arnaudb.json
  • 12:05 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2196.codfw.wmnet with OS bookworm
  • 12:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T357189)', diff saved to https://phabricator.wikimedia.org/P57887 and previous config saved to /var/cache/conftool/dbconfig/20240224-120150-arnaudb.json
  • 10:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2119 (T357189)', diff saved to https://phabricator.wikimedia.org/P57886 and previous config saved to /var/cache/conftool/dbconfig/20240224-105413-arnaudb.json
  • 10:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 10:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 10:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T357189)', diff saved to https://phabricator.wikimedia.org/P57885 and previous config saved to /var/cache/conftool/dbconfig/20240224-105351-arnaudb.json
  • 10:48 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2121 from API', diff saved to https://phabricator.wikimedia.org/P57884 and previous config saved to /var/cache/conftool/dbconfig/20240224-104824-marostegui.json
  • 10:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2118 T358423', diff saved to https://phabricator.wikimedia.org/P57883 and previous config saved to /var/cache/conftool/dbconfig/20240224-104617-root.json
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'Promote db2121 to s7 primary and set section read-write T358423', diff saved to https://phabricator.wikimedia.org/P57882 and previous config saved to /var/cache/conftool/dbconfig/20240224-104522-marostegui.json
  • 10:44 marostegui@cumin1002: dbctl commit (dc=all): 'Set s7 codfw as read-only for maintenance - T358423', diff saved to https://phabricator.wikimedia.org/P57881 and previous config saved to /var/cache/conftool/dbconfig/20240224-104440-marostegui.json
  • 10:44 marostegui: Starting s7 codfw emergency failover from db2118 to db2121 - T358423
  • 10:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P57880 and previous config saved to /var/cache/conftool/dbconfig/20240224-103845-arnaudb.json
  • 10:24 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s7 T358423
  • 10:24 marostegui@cumin1002: dbctl commit (dc=all): 'Set db2121 with weight 0 T358423', diff saved to https://phabricator.wikimedia.org/P57879 and previous config saved to /var/cache/conftool/dbconfig/20240224-102401-root.json
  • 10:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P57878 and previous config saved to /var/cache/conftool/dbconfig/20240224-102338-arnaudb.json
  • 10:23 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s7 T358423
  • 10:10 taavi: powercycle db2118
  • 10:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T357189)', diff saved to https://phabricator.wikimedia.org/P57877 and previous config saved to /var/cache/conftool/dbconfig/20240224-100832-arnaudb.json
  • 09:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2110 (T357189)', diff saved to https://phabricator.wikimedia.org/P57876 and previous config saved to /var/cache/conftool/dbconfig/20240224-090212-arnaudb.json
  • 09:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 09:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 09:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T357189)', diff saved to https://phabricator.wikimedia.org/P57875 and previous config saved to /var/cache/conftool/dbconfig/20240224-090150-arnaudb.json
  • 08:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P57874 and previous config saved to /var/cache/conftool/dbconfig/20240224-084644-arnaudb.json
  • 08:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P57873 and previous config saved to /var/cache/conftool/dbconfig/20240224-083138-arnaudb.json
  • 07:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2106 (T357189)', diff saved to https://phabricator.wikimedia.org/P57871 and previous config saved to /var/cache/conftool/dbconfig/20240224-071221-arnaudb.json
  • 07:12 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 07:12 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 06:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 06:17 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 05:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 05:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 05:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T357189)', diff saved to https://phabricator.wikimedia.org/P57870 and previous config saved to /var/cache/conftool/dbconfig/20240224-052320-arnaudb.json
  • 05:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P57869 and previous config saved to /var/cache/conftool/dbconfig/20240224-050814-arnaudb.json
  • 04:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P57868 and previous config saved to /var/cache/conftool/dbconfig/20240224-045307-arnaudb.json
  • 04:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T357189)', diff saved to https://phabricator.wikimedia.org/P57867 and previous config saved to /var/cache/conftool/dbconfig/20240224-043801-arnaudb.json
  • 03:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T357189)', diff saved to https://phabricator.wikimedia.org/P57866 and previous config saved to /var/cache/conftool/dbconfig/20240224-033304-arnaudb.json
  • 03:32 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 03:32 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 03:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T357189)', diff saved to https://phabricator.wikimedia.org/P57865 and previous config saved to /var/cache/conftool/dbconfig/20240224-033241-arnaudb.json
  • 03:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P57864 and previous config saved to /var/cache/conftool/dbconfig/20240224-031735-arnaudb.json
  • 03:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P57863 and previous config saved to /var/cache/conftool/dbconfig/20240224-030228-arnaudb.json
  • 02:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T357189)', diff saved to https://phabricator.wikimedia.org/P57862 and previous config saved to /var/cache/conftool/dbconfig/20240224-024722-arnaudb.json
  • 01:47 brett: Upload ncmonitor 0.0.3 to bookworm-wikimedia
  • 01:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T357189)', diff saved to https://phabricator.wikimedia.org/P57861 and previous config saved to /var/cache/conftool/dbconfig/20240224-014734-arnaudb.json
  • 01:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 01:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 01:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T357189)', diff saved to https://phabricator.wikimedia.org/P57860 and previous config saved to /var/cache/conftool/dbconfig/20240224-014711-arnaudb.json
  • 01:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P57859 and previous config saved to /var/cache/conftool/dbconfig/20240224-013205-arnaudb.json
  • 01:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P57858 and previous config saved to /var/cache/conftool/dbconfig/20240224-011658-arnaudb.json
  • 01:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T357189)', diff saved to https://phabricator.wikimedia.org/P57857 and previous config saved to /var/cache/conftool/dbconfig/20240224-010152-arnaudb.json

2024-02-23

  • 23:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T357189)', diff saved to https://phabricator.wikimedia.org/P57856 and previous config saved to /var/cache/conftool/dbconfig/20240223-235919-arnaudb.json
  • 23:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 23:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 23:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 23:14 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 23:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T357189)', diff saved to https://phabricator.wikimedia.org/P57855 and previous config saved to /var/cache/conftool/dbconfig/20240223-231440-arnaudb.json
  • 22:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P57854 and previous config saved to /var/cache/conftool/dbconfig/20240223-225933-arnaudb.json
  • 22:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P57853 and previous config saved to /var/cache/conftool/dbconfig/20240223-224427-arnaudb.json
  • 22:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1244 (T357189)', diff saved to https://phabricator.wikimedia.org/P57852 and previous config saved to /var/cache/conftool/dbconfig/20240223-222920-arnaudb.json
  • 21:49 sbassett: Deployed updated security mitigation for T336027
  • 21:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1244 (T357189)', diff saved to https://phabricator.wikimedia.org/P57850 and previous config saved to /var/cache/conftool/dbconfig/20240223-214211-arnaudb.json
  • 21:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 21:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance
  • 21:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T357189)', diff saved to https://phabricator.wikimedia.org/P57848 and previous config saved to /var/cache/conftool/dbconfig/20240223-214149-arnaudb.json
  • 21:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P57847 and previous config saved to /var/cache/conftool/dbconfig/20240223-212643-arnaudb.json
  • 21:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P57846 and previous config saved to /var/cache/conftool/dbconfig/20240223-211136-arnaudb.json
  • 21:07 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
  • 21:04 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2196.codfw.wmnet with reason: host reimage
  • 20:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1243 (T357189)', diff saved to https://phabricator.wikimedia.org/P57845 and previous config saved to /var/cache/conftool/dbconfig/20240223-205630-arnaudb.json
  • 20:48 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host db2196.codfw.wmnet with OS bookworm
  • 20:42 isaranto@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
  • 20:23 RhinosF1: [relog due to stashbot errors] jhancock@cumin2002 ran cookbook sre.hardware.upgrade-firmware for hosts db2201/db2204/db2197/db2198/db2202/db2203/db2205; all finished with END (PASS)
  • 20:00 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2205']
  • 20:00 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2204']
  • 19:59 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2203']
  • 19:59 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2202']
  • 19:59 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2201']
  • 19:59 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2199']
  • 19:59 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2198']
  • 19:59 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2197']
  • 19:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1243 (T357189)', diff saved to https://phabricator.wikimedia.org/P57844 and previous config saved to /var/cache/conftool/dbconfig/20240223-195835-arnaudb.json
  • 19:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 19:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1243.eqiad.wmnet with reason: Maintenance
  • 19:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T357189)', diff saved to https://phabricator.wikimedia.org/P57843 and previous config saved to /var/cache/conftool/dbconfig/20240223-195802-arnaudb.json
  • 19:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P57841 and previous config saved to /var/cache/conftool/dbconfig/20240223-192749-arnaudb.json
  • 19:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1242 (T357189)', diff saved to https://phabricator.wikimedia.org/P57840 and previous config saved to /var/cache/conftool/dbconfig/20240223-191243-arnaudb.json
  • 19:04 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1036.eqiad.wmnet with reason: Bootstrapping — T354560
  • 19:04 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1036.eqiad.wmnet with reason: Bootstrapping — T354560
  • 19:03 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2220']
  • 19:03 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2220']
  • 19:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2203.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:03 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2220']
  • 19:03 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2220']
  • 19:03 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2219']
  • 19:02 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2219']
  • 19:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2218']
  • 19:02 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2218']
  • 19:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2217']
  • 19:02 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2217']
  • 19:02 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2216']
  • 19:01 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2216']
  • 19:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2215']
  • 19:01 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2215']
  • 18:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2200']
  • 18:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2214']
  • 18:57 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2214']
  • 18:57 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2213']
  • 18:56 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2213']
  • 18:56 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2212']
  • 18:56 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2212']
  • 18:56 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2211']
  • 18:55 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2211']
  • 18:55 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2210']
  • 18:55 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2210']
  • 18:55 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2209']
  • 18:55 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2209']
  • 18:54 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2208']
  • 18:54 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2208']
  • 18:53 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2207']
  • 18:53 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2200']
  • 18:52 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2207']
  • 18:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db2207']
  • 18:52 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2207']
  • 18:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2206']
  • 18:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2196']
  • 18:51 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2206']
  • 18:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2205']
  • 18:51 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2205']
  • 18:50 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2204']
  • 18:50 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2204']
  • 18:50 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2202']
  • 18:50 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2202']
  • 18:50 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2201']
  • 18:49 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2201']
  • 18:49 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2200']
  • 18:49 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2200']
  • 18:48 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2199']
  • 18:48 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2199']
  • 18:48 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2198']
  • 18:47 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2198']
  • 18:47 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2197']
  • 18:47 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2197']
  • 18:46 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2196']
  • 18:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2196']
  • 18:45 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2196']
  • 18:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2203.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:41 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 18:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2203.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2203.mgmt.codfw.wmnet with reboot policy FORCED
  • 18:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1242 (T357189)', diff saved to https://phabricator.wikimedia.org/P57839 and previous config saved to /var/cache/conftool/dbconfig/20240223-181437-arnaudb.json
  • 18:14 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 18:14 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1242.eqiad.wmnet with reason: Maintenance
  • 18:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T357189)', diff saved to https://phabricator.wikimedia.org/P57838 and previous config saved to /var/cache/conftool/dbconfig/20240223-181416-arnaudb.json
  • 17:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P57835 and previous config saved to /var/cache/conftool/dbconfig/20240223-175909-arnaudb.json
  • 17:55 Daimona: T357007 Running mwscript CampaignEvents:GenerateInvitationList --wiki=metawiki --listfile=/home/daimona/list.txt
  • 17:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P57834 and previous config saved to /var/cache/conftool/dbconfig/20240223-174403-arnaudb.json
  • 17:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1241 (T357189)', diff saved to https://phabricator.wikimedia.org/P57833 and previous config saved to /var/cache/conftool/dbconfig/20240223-172856-arnaudb.json
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1241 (T357189)', diff saved to https://phabricator.wikimedia.org/P57832 and previous config saved to /var/cache/conftool/dbconfig/20240223-162426-arnaudb.json
  • 16:24 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 16:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1241.eqiad.wmnet with reason: Maintenance
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T357189)', diff saved to https://phabricator.wikimedia.org/P57831 and previous config saved to /var/cache/conftool/dbconfig/20240223-162351-arnaudb.json
  • 16:09 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet,service=(cdn|ats-be)
  • 16:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P57830 and previous config saved to /var/cache/conftool/dbconfig/20240223-160845-arnaudb.json
  • 15:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P57829 and previous config saved to /var/cache/conftool/dbconfig/20240223-155338-arnaudb.json
  • 15:44 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 15:43 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 15:40 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 15:39 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 15:39 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 15:38 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 15:38 claime: Deploying 1005974 to eventgate-main - T249745
  • 15:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1221 (T357189)', diff saved to https://phabricator.wikimedia.org/P57828 and previous config saved to /var/cache/conftool/dbconfig/20240223-153832-arnaudb.json
  • 15:33 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2196.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:27 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2196.mgmt.codfw.wmnet with reboot policy FORCED
  • 15:14 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2203.mgmt.codfw.wmnet with reboot policy FORCED
  • 14:48 hnowlan@cumin2002: conftool action : set/pooled=yes:weight=10; selector: name=(mw2351.codfw.wmnet|mw2353.codfw.wmnet|mw2382.codfw.wmnet|mw2394.codfw.wmnet|mw2419.codfw.wmnet|mw2426.codfw.wmnet|mw2428.codfw.wmnet|mw2444.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 14:43 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster - update (duration: 00m 12s)
  • 14:43 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster - update
  • 14:42 hnowlan: running `homer 'cr*codfw*' commit 'T354791'` for reclaimed codfw jobrunners moving to k8s workers
  • 14:37 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2002.codfw.wmnet with OS bullseye
  • 14:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1221 (T357189)', diff saved to https://phabricator.wikimedia.org/P57827 and previous config saved to /var/cache/conftool/dbconfig/20240223-143337-arnaudb.json
  • 14:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 14:33 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 14:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 14:32 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1221.eqiad.wmnet with reason: Maintenance
  • 14:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T357189)', diff saved to https://phabricator.wikimedia.org/P57826 and previous config saved to /var/cache/conftool/dbconfig/20240223-143246-arnaudb.json
  • 14:24 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 14:22 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 14:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P57825 and previous config saved to /var/cache/conftool/dbconfig/20240223-141740-arnaudb.json
  • 14:12 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS bullseye
  • 14:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P57824 and previous config saved to /var/cache/conftool/dbconfig/20240223-140233-arnaudb.json
  • 13:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T357189)', diff saved to https://phabricator.wikimedia.org/P57823 and previous config saved to /var/cache/conftool/dbconfig/20240223-134727-arnaudb.json
  • 13:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cumin1001.eqiad.wmnet
  • 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cumin1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cumin1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
  • 13:19 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2394.codfw.wmnet with OS bullseye
  • 13:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:11 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts cumin1001.eqiad.wmnet
  • 13:09 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2419.codfw.wmnet with OS bullseye
  • 13:08 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2428.codfw.wmnet with OS bullseye
  • 13:07 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2426.codfw.wmnet with OS bullseye
  • 13:04 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2382.codfw.wmnet with OS bullseye
  • 12:57 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2351.codfw.wmnet with OS bullseye
  • 12:55 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2444.codfw.wmnet with OS bullseye
  • 12:53 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2394.codfw.wmnet with reason: host reimage
  • 12:52 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2353.codfw.wmnet with OS bullseye
  • 12:49 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2428.codfw.wmnet with reason: host reimage
  • 12:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1199 (T357189)', diff saved to https://phabricator.wikimedia.org/P57822 and previous config saved to /var/cache/conftool/dbconfig/20240223-124710-arnaudb.json
  • 12:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 12:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1199.eqiad.wmnet with reason: Maintenance
  • 12:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T357189)', diff saved to https://phabricator.wikimedia.org/P57821 and previous config saved to /var/cache/conftool/dbconfig/20240223-124648-arnaudb.json
  • 12:46 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2419.codfw.wmnet with reason: host reimage
  • 12:43 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2426.codfw.wmnet with reason: host reimage
  • 12:41 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2382.codfw.wmnet with reason: host reimage
  • 12:39 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2351.codfw.wmnet with reason: host reimage
  • 12:36 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2444.codfw.wmnet with reason: host reimage
  • 12:34 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2353.codfw.wmnet with reason: host reimage
  • 12:34 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2428.codfw.wmnet with reason: host reimage
  • 12:32 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2394.codfw.wmnet with reason: host reimage
  • 12:32 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2444.codfw.wmnet with reason: host reimage
  • 12:32 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2419.codfw.wmnet with reason: host reimage
  • 12:32 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2426.codfw.wmnet with reason: host reimage
  • 12:32 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2382.codfw.wmnet with reason: host reimage
  • 12:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P57820 and previous config saved to /var/cache/conftool/dbconfig/20240223-123141-arnaudb.json
  • 12:31 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2351.codfw.wmnet with reason: host reimage
  • 12:31 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2353.codfw.wmnet with reason: host reimage
  • 12:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P57819 and previous config saved to /var/cache/conftool/dbconfig/20240223-121635-arnaudb.json
  • 12:16 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2444.codfw.wmnet with OS bullseye
  • 12:16 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2428.codfw.wmnet with OS bullseye
  • 12:15 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2426.codfw.wmnet with OS bullseye
  • 12:15 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2419.codfw.wmnet with OS bullseye
  • 12:15 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2394.codfw.wmnet with OS bullseye
  • 12:15 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2382.codfw.wmnet with OS bullseye
  • 12:15 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2353.codfw.wmnet with OS bullseye
  • 12:15 hnowlan@cumin2002: START - Cookbook sre.hosts.reimage for host mw2351.codfw.wmnet with OS bullseye
  • 12:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T357189)', diff saved to https://phabricator.wikimedia.org/P57818 and previous config saved to /var/cache/conftool/dbconfig/20240223-120129-arnaudb.json
  • 11:58 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["75194261"]' | tee -a ~/T315510-enwiki-2 # in tmux
  • 11:52 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki viwiki --current --all --touched-after=20230613000000 --start '["7939741"]' 2>&1 | tee ~/T315510-viwiki # in tmux
  • 11:49 Lucas_WMDE: STOP persistRevisionThreadItems on viwiki for T315510, had been throwing tons of errors since at least Wednesday
  • 11:32 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2384.codfw.wmnet|mw2385.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 11:07 hnowlan: running `homer 'cr*codfw*' commit 'T351074'` for two more appservers becoming k8s workers
  • 11:01 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2369.codfw.wmnet|mw2367.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 10:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1190 (T357189)', diff saved to https://phabricator.wikimedia.org/P57816 and previous config saved to /var/cache/conftool/dbconfig/20240223-105929-arnaudb.json
  • 10:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 10:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
  • 10:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T357189)', diff saved to https://phabricator.wikimedia.org/P57815 and previous config saved to /var/cache/conftool/dbconfig/20240223-105907-arnaudb.json
  • 10:52 hnowlan: running `homer 'cr*codfw*' commit 'T351074'` for new appservers being migrated to k8s workers
  • 10:49 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1458.eqiad.wmnet|mw1467.eqiad.wmnet|mw1468.eqiad.wmnet|mw1483.eqiad.wmnet|mw1484.eqiad.wmnet|mw1485.eqiad.wmnet|mw1494.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 10:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P57814 and previous config saved to /var/cache/conftool/dbconfig/20240223-104401-arnaudb.json
  • 10:41 hnowlan: running `homer 'cr*eqiad*' commit 'T351074' && homer 'lsw1-f2-eqiad*' commit 'T351074'` for jobrunners being migrated to k8s workers
  • 10:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P57813 and previous config saved to /var/cache/conftool/dbconfig/20240223-102854-arnaudb.json
  • 10:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 10:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 10:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T357189)', diff saved to https://phabricator.wikimedia.org/P57811 and previous config saved to /var/cache/conftool/dbconfig/20240223-101348-arnaudb.json
  • 09:35 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57810 and previous config saved to /var/cache/conftool/dbconfig/20240223-093559-root.json
  • 09:20 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57809 and previous config saved to /var/cache/conftool/dbconfig/20240223-092053-root.json
  • 09:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1160 (T357189)', diff saved to https://phabricator.wikimedia.org/P57808 and previous config saved to /var/cache/conftool/dbconfig/20240223-090913-arnaudb.json
  • 09:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 09:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
  • 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57807 and previous config saved to /var/cache/conftool/dbconfig/20240223-090549-root.json
  • 08:54 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging GoranSMilovanovic out of all services on: 8 hosts
  • 08:53 root@cumin2002: START - Cookbook sre.idm.logout Logging GoranSMilovanovic out of all services on: 8 hosts
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57806 and previous config saved to /var/cache/conftool/dbconfig/20240223-085043-root.json
  • 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57805 and previous config saved to /var/cache/conftool/dbconfig/20240223-083538-root.json
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57804 and previous config saved to /var/cache/conftool/dbconfig/20240223-082033-root.json
  • 08:20 godog: rollout prometheus-rsyslog-exporter new version to remaining hosts, caching sites - T357616
  • 08:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 08:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'es1031 (re)pooling @ 1%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57803 and previous config saved to /var/cache/conftool/dbconfig/20240223-080528-root.json
  • 08:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1031.eqiad.wmnet with OS bookworm
  • 07:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1031.eqiad.wmnet with reason: host reimage
  • 07:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1031.eqiad.wmnet with reason: host reimage
  • 07:40 marostegui: Install 10.6.17 on pc1014 T357089
  • 07:28 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1031.eqiad.wmnet with OS bookworm
  • 07:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1031 T358180', diff saved to https://phabricator.wikimedia.org/P57802 and previous config saved to /var/cache/conftool/dbconfig/20240223-071952-root.json
  • 01:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T357189)', diff saved to https://phabricator.wikimedia.org/P57801 and previous config saved to /var/cache/conftool/dbconfig/20240223-015907-arnaudb.json
  • 01:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P57800 and previous config saved to /var/cache/conftool/dbconfig/20240223-014400-arnaudb.json
  • 01:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P57799 and previous config saved to /var/cache/conftool/dbconfig/20240223-012853-arnaudb.json
  • 01:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2195 (T357189)', diff saved to https://phabricator.wikimedia.org/P57798 and previous config saved to /var/cache/conftool/dbconfig/20240223-011347-arnaudb.json
  • 01:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2195 (T357189)', diff saved to https://phabricator.wikimedia.org/P57797 and previous config saved to /var/cache/conftool/dbconfig/20240223-011128-arnaudb.json
  • 01:11 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 01:11 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2195.codfw.wmnet with reason: Maintenance
  • 01:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T357189)', diff saved to https://phabricator.wikimedia.org/P57796 and previous config saved to /var/cache/conftool/dbconfig/20240223-011107-arnaudb.json
  • 00:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P57795 and previous config saved to /var/cache/conftool/dbconfig/20240223-005601-arnaudb.json
  • 00:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P57794 and previous config saved to /var/cache/conftool/dbconfig/20240223-004054-arnaudb.json
  • 00:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T357189)', diff saved to https://phabricator.wikimedia.org/P57793 and previous config saved to /var/cache/conftool/dbconfig/20240223-002547-arnaudb.json
  • 00:14 zabe@deploy2002: Finished scap: Backport for block: Pass wikiId to DatabaseBlock::getId in DatabaseBlockStore (T358208) (duration: 11m 02s)
  • 00:12 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user="Grandmaster Huon" . # T358022
  • 00:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2181 (T357189)', diff saved to https://phabricator.wikimedia.org/P57791 and previous config saved to /var/cache/conftool/dbconfig/20240223-000920-arnaudb.json
  • 00:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 00:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
  • 00:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57790 and previous config saved to /var/cache/conftool/dbconfig/20240223-000858-arnaudb.json
  • 00:06 zabe@deploy2002: zabe: Continuing with sync
  • 00:04 zabe@deploy2002: zabe: Backport for block: Pass wikiId to DatabaseBlock::getId in DatabaseBlockStore (T358208) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 00:03 zabe@deploy2002: Started scap: Backport for block: Pass wikiId to DatabaseBlock::getId in DatabaseBlockStore (T358208)

2024-02-22

  • 23:59 tstarling@deploy2002: Finished scap: (no justification provided) (duration: 09m 40s)
  • 23:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P57789 and previous config saved to /var/cache/conftool/dbconfig/20240222-235351-arnaudb.json
  • 23:49 tstarling@deploy2002: Started scap: (no justification provided)
  • 23:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P57788 and previous config saved to /var/cache/conftool/dbconfig/20240222-233845-arnaudb.json
  • 23:35 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1035.eqiad.wmnet with reason: Bootstrapping — T354560
  • 23:35 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1035.eqiad.wmnet with reason: Bootstrapping — T354560
  • 23:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57787 and previous config saved to /var/cache/conftool/dbconfig/20240222-232338-arnaudb.json
  • 23:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57786 and previous config saved to /var/cache/conftool/dbconfig/20240222-232118-arnaudb.json
  • 23:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 23:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 23:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T357189)', diff saved to https://phabricator.wikimedia.org/P57785 and previous config saved to /var/cache/conftool/dbconfig/20240222-232056-arnaudb.json
  • 23:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P57784 and previous config saved to /var/cache/conftool/dbconfig/20240222-230549-arnaudb.json
  • 22:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P57783 and previous config saved to /var/cache/conftool/dbconfig/20240222-225042-arnaudb.json
  • 22:41 cjming: end of UTC late backport window
  • 22:40 cjming@deploy2002: Finished scap: Backport for Improve chunked upload jobs and abort assemble job if already in progress (T200820) (duration: 09m 46s)
  • 22:35 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T357189)', diff saved to https://phabricator.wikimedia.org/P57782 and previous config saved to /var/cache/conftool/dbconfig/20240222-223536-arnaudb.json
  • 22:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2166 (T357189)', diff saved to https://phabricator.wikimedia.org/P57781 and previous config saved to /var/cache/conftool/dbconfig/20240222-223314-arnaudb.json
  • 22:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 22:32 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
  • 22:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T357189)', diff saved to https://phabricator.wikimedia.org/P57780 and previous config saved to /var/cache/conftool/dbconfig/20240222-223251-arnaudb.json
  • 22:32 cjming@deploy2002: bawolff and cjming: Continuing with sync
  • 22:32 cjming@deploy2002: bawolff and cjming: Backport for Improve chunked upload jobs and abort assemble job if already in progress (T200820) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:30 cjming@deploy2002: Started scap: Backport for Improve chunked upload jobs and abort assemble job if already in progress (T200820)
  • 22:30 cjming@deploy2002: Finished scap: Backport for testwiki: Allow modifying email in account vanishing contact form. (T343536) (duration: 09m 58s)
  • 22:22 cjming@deploy2002: cjming and dbrant: Continuing with sync
  • 22:21 cjming@deploy2002: cjming and dbrant: Backport for testwiki: Allow modifying email in account vanishing contact form. (T343536) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:20 cjming@deploy2002: Started scap: Backport for testwiki: Allow modifying email in account vanishing contact form. (T343536)
  • 22:18 cjming@deploy2002: Finished scap: Backport for Add verbiage for Account Vanishing contact page. (T343536) (duration: 27m 47s)
  • 22:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P57779 and previous config saved to /var/cache/conftool/dbconfig/20240222-221745-arnaudb.json
  • 22:06 cjming@deploy2002: dbrant and cjming: Continuing with sync
  • 22:05 cjming@deploy2002: dbrant and cjming: Backport for Add verbiage for Account Vanishing contact page. (T343536) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 22:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P57778 and previous config saved to /var/cache/conftool/dbconfig/20240222-220238-arnaudb.json
  • 21:51 cjming@deploy2002: Started scap: Backport for Add verbiage for Account Vanishing contact page. (T343536)
  • 21:50 cjming@deploy2002: Finished scap: Backport for Change font-size "Small" label to "Standard" (T358074) (duration: 29m 07s)
  • 21:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T357189)', diff saved to https://phabricator.wikimedia.org/P57777 and previous config saved to /var/cache/conftool/dbconfig/20240222-214732-arnaudb.json
  • 21:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2164 (T357189)', diff saved to https://phabricator.wikimedia.org/P57776 and previous config saved to /var/cache/conftool/dbconfig/20240222-214310-arnaudb.json
  • 21:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 21:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 21:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 21:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
  • 21:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T357189)', diff saved to https://phabricator.wikimedia.org/P57775 and previous config saved to /var/cache/conftool/dbconfig/20240222-214221-arnaudb.json
  • 21:39 cjming@deploy2002: cjming and jdlrobson: Continuing with sync
  • 21:35 cjming@deploy2002: cjming and jdlrobson: Backport for Change font-size "Small" label to "Standard" (T358074) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P57774 and previous config saved to /var/cache/conftool/dbconfig/20240222-212715-arnaudb.json
  • 21:21 cjming@deploy2002: Started scap: Backport for Change font-size "Small" label to "Standard" (T358074)
  • 21:12 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2002.codfw.wmnet with OS bullseye
  • 21:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P57773 and previous config saved to /var/cache/conftool/dbconfig/20240222-211208-arnaudb.json
  • 21:01 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 20:57 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 20:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T357189)', diff saved to https://phabricator.wikimedia.org/P57772 and previous config saved to /var/cache/conftool/dbconfig/20240222-205701-arnaudb.json
  • 20:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2163 (T357189)', diff saved to https://phabricator.wikimedia.org/P57771 and previous config saved to /var/cache/conftool/dbconfig/20240222-205440-arnaudb.json
  • 20:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 20:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
  • 20:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T357189)', diff saved to https://phabricator.wikimedia.org/P57770 and previous config saved to /var/cache/conftool/dbconfig/20240222-205417-arnaudb.json
  • 20:45 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS bullseye
  • 20:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P57769 and previous config saved to /var/cache/conftool/dbconfig/20240222-203911-arnaudb.json
  • 20:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P57768 and previous config saved to /var/cache/conftool/dbconfig/20240222-202404-arnaudb.json
  • 20:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T357189)', diff saved to https://phabricator.wikimedia.org/P57767 and previous config saved to /var/cache/conftool/dbconfig/20240222-200858-arnaudb.json
  • 20:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2162 (T357189)', diff saved to https://phabricator.wikimedia.org/P57766 and previous config saved to /var/cache/conftool/dbconfig/20240222-200636-arnaudb.json
  • 20:06 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 20:06 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host testvm2002.codfw.wmnet with OS bullseye
  • 20:06 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
  • 20:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57765 and previous config saved to /var/cache/conftool/dbconfig/20240222-200614-arnaudb.json
  • 20:00 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 19:58 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 19:58 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 19:57 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 19:56 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 19:56 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 19:55 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 19:53 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 19:52 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 19:52 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 19:52 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 19:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P57764 and previous config saved to /var/cache/conftool/dbconfig/20240222-195108-arnaudb.json
  • 19:50 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 19:50 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 19:49 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 19:40 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS bullseye
  • 19:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P57763 and previous config saved to /var/cache/conftool/dbconfig/20240222-193601-arnaudb.json
  • 19:30 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:30 robh@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cleanup incorrect asset tags - robh@cumin2002"
  • 19:29 robh@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cleanup incorrect asset tags - robh@cumin2002"
  • 19:27 robh@cumin2002: START - Cookbook sre.dns.netbox
  • 19:23 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 19:22 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 19:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57762 and previous config saved to /var/cache/conftool/dbconfig/20240222-192055-arnaudb.json
  • 19:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57761 and previous config saved to /var/cache/conftool/dbconfig/20240222-191834-arnaudb.json
  • 19:18 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.19 refs T354437
  • 19:18 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 19:18 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
  • 19:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T357189)', diff saved to https://phabricator.wikimedia.org/P57760 and previous config saved to /var/cache/conftool/dbconfig/20240222-191810-arnaudb.json
  • 19:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2385.codfw.wmnet with OS bullseye
  • 19:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P57759 and previous config saved to /var/cache/conftool/dbconfig/20240222-190304-arnaudb.json
  • 18:49 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2385.codfw.wmnet with reason: host reimage
  • 18:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P57758 and previous config saved to /var/cache/conftool/dbconfig/20240222-184757-arnaudb.json
  • 18:46 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2385.codfw.wmnet with reason: host reimage
  • 18:44 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2384.codfw.wmnet with OS bullseye
  • 18:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T357189)', diff saved to https://phabricator.wikimedia.org/P57757 and previous config saved to /var/cache/conftool/dbconfig/20240222-183251-arnaudb.json
  • 18:31 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2385.codfw.wmnet with OS bullseye
  • 18:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2154 (T357189)', diff saved to https://phabricator.wikimedia.org/P57756 and previous config saved to /var/cache/conftool/dbconfig/20240222-183030-arnaudb.json
  • 18:30 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 18:30 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
  • 18:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T357189)', diff saved to https://phabricator.wikimedia.org/P57755 and previous config saved to /var/cache/conftool/dbconfig/20240222-183009-arnaudb.json
  • 18:28 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1485.eqiad.wmnet with OS bullseye
  • 18:25 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1467.eqiad.wmnet with OS bullseye
  • 18:24 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1494.eqiad.wmnet with OS bullseye
  • 18:22 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 18:22 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 18:22 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1484.eqiad.wmnet with OS bullseye
  • 18:21 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2384.codfw.wmnet with reason: host reimage
  • 18:18 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2384.codfw.wmnet with reason: host reimage
  • 18:17 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1468.eqiad.wmnet with OS bullseye
  • 18:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P57753 and previous config saved to /var/cache/conftool/dbconfig/20240222-181502-arnaudb.json
  • 18:14 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1483.eqiad.wmnet with OS bullseye
  • 18:12 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1458.eqiad.wmnet with OS bullseye
  • 18:11 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1485.eqiad.wmnet with reason: host reimage
  • 18:07 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1467.eqiad.wmnet with reason: host reimage
  • 18:04 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1484.eqiad.wmnet with reason: host reimage
  • 18:04 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 18:04 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 18:04 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 18:03 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2384.codfw.wmnet with OS bullseye
  • 18:03 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 18:03 hnowlan@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host mw2384.codfw.wmnet with OS bullseye
  • 18:03 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 18:02 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 18:01 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1494.eqiad.wmnet with reason: host reimage
  • 17:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P57752 and previous config saved to /var/cache/conftool/dbconfig/20240222-175956-arnaudb.json
  • 17:59 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1468.eqiad.wmnet with reason: host reimage
  • 17:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1483.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1494.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1483.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1458.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1484.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1485.eqiad.wmnet with reason: host reimage
  • 17:54 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1468.eqiad.wmnet with reason: host reimage
  • 17:52 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1467.eqiad.wmnet with reason: host reimage
  • 17:52 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1458.eqiad.wmnet with reason: host reimage
  • 17:51 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2384.codfw.wmnet with OS bullseye
  • 17:45 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:44 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T357189)', diff saved to https://phabricator.wikimedia.org/P57751 and previous config saved to /var/cache/conftool/dbconfig/20240222-174449-arnaudb.json
  • 17:44 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:43 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:43 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:43 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 17:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2152 (T357189)', diff saved to https://phabricator.wikimedia.org/P57750 and previous config saved to /var/cache/conftool/dbconfig/20240222-174328-arnaudb.json
  • 17:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 17:43 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 17:43 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
  • 17:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 17:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 17:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 17:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 17:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:42 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1494.eqiad.wmnet with OS bullseye
  • 17:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 17:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T357189)', diff saved to https://phabricator.wikimedia.org/P57749 and previous config saved to /var/cache/conftool/dbconfig/20240222-174138-arnaudb.json
  • 17:41 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1485.eqiad.wmnet with OS bullseye
  • 17:41 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1484.eqiad.wmnet with OS bullseye
  • 17:41 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1483.eqiad.wmnet with OS bullseye
  • 17:41 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1468.eqiad.wmnet with OS bullseye
  • 17:40 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 17:39 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1467.eqiad.wmnet with OS bullseye
  • 17:39 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1458.eqiad.wmnet with OS bullseye
  • 17:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 17:36 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 17:35 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host testvm2002.codfw.wmnet with OS bullseye
  • 17:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P57748 and previous config saved to /var/cache/conftool/dbconfig/20240222-172632-arnaudb.json
  • 17:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P57747 and previous config saved to /var/cache/conftool/dbconfig/20240222-171125-arnaudb.json
  • 17:05 topranks: disabling IPv6 RAs for private1-a-codfw vlan on codfw core routers T355544
  • 16:58 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Remove legacy codfw vc switches from synced hiera data after netbox status change - cmooney@cumin1002 - T355544"
  • 16:57 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Remove legacy codfw vc switches from synced hiera data after netbox status change - cmooney@cumin1002 - T355544"
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T357189)', diff saved to https://phabricator.wikimedia.org/P57746 and previous config saved to /var/cache/conftool/dbconfig/20240222-165619-arnaudb.json
  • 16:56 topranks: disabling link from asw-a-codfw vc to ssw1-a1-codfw and ssw1-a8-codfw T355544
  • 16:54 dancy@deploy2002: Finished scap: testing T357402 again (duration: 08m 58s)
  • 16:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T357189)', diff saved to https://phabricator.wikimedia.org/P57745 and previous config saved to /var/cache/conftool/dbconfig/20240222-165401-arnaudb.json
  • 16:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 16:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 16:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 16:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 16:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T357189)', diff saved to https://phabricator.wikimedia.org/P57744 and previous config saved to /var/cache/conftool/dbconfig/20240222-165312-arnaudb.json
  • 16:45 dancy@deploy2002: Started scap: testing T357402 again
  • 16:43 dancy@deploy2002: sync-world aborted: testing T357402 (duration: 14m 57s)
  • 16:42 akosiaris@cumin1002: conftool action : set/pooled=inactive; selector: service=parsoid-php,name=kubernetes.*
  • 16:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P57743 and previous config saved to /var/cache/conftool/dbconfig/20240222-163806-arnaudb.json
  • 16:36 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 16:36 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 16:30 fabfur@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2032.codfw.wmnet,service=(cdn|ats-be)
  • 16:30 fabfur@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2031.codfw.wmnet,service=(cdn|ats-be)
  • 16:28 fabfur@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp[2031-2032].codfw.wmnet
  • 16:28 fabfur@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp[2031-2032].codfw.wmnet
  • 16:28 dancy@deploy2002: Started scap: testing T357402
  • 16:26 dancy@deploy2002: Installation of scap version "4.66.0" completed for 458 hosts
  • 16:25 dancy@deploy2002: Installing scap version "4.66.0" for 458 hosts
  • 16:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P57742 and previous config saved to /var/cache/conftool/dbconfig/20240222-162300-arnaudb.json
  • 16:22 volans@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
  • 16:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 100%: After recloning', diff saved to https://phabricator.wikimedia.org/P57741 and previous config saved to /var/cache/conftool/dbconfig/20240222-162151-root.json
  • 16:19 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS bullseye
  • 16:16 volans@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
  • 16:11 mvernon@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=swift,name=codfw
  • 16:11 Emperor: repool codfs-mw T355868
  • 16:10 Emperor: repool thanos-fe2002 T355868
  • 16:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T357189)', diff saved to https://phabricator.wikimedia.org/P57740 and previous config saved to /var/cache/conftool/dbconfig/20240222-160753-arnaudb.json
  • 16:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 75%: After recloning', diff saved to https://phabricator.wikimedia.org/P57739 and previous config saved to /var/cache/conftool/dbconfig/20240222-160646-root.json
  • 16:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T357189)', diff saved to https://phabricator.wikimedia.org/P57738 and previous config saved to /var/cache/conftool/dbconfig/20240222-160534-arnaudb.json
  • 16:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 16:05 volans@cumin1002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 16:05 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 16:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T357189)', diff saved to https://phabricator.wikimedia.org/P57737 and previous config saved to /var/cache/conftool/dbconfig/20240222-160512-arnaudb.json
  • 16:04 volans@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 16:00 topranks: Commencing network maintenance migrating servers to new switch codfw rack B2 T355868
  • 15:58 cmooney@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host testvm2002.codfw.wmnet with OS bullseye
  • 15:57 hnowlan: depooling mw[1458,1467-1468,1483-1485,1494].eqiad.wmnet in advance of reimaging
  • 15:56 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 25 hosts with reason: Migrating servers in codfw rack B2 to lsw1-b2-codfw
  • 15:55 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 25 hosts with reason: Migrating servers in codfw rack B2 to lsw1-b2-codfw
  • 15:54 mvernon@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=swift,name=codfw
  • 15:54 Emperor: depool codfs-mw T355868
  • 15:53 Emperor: depool thanos-fe2002 T355868
  • 15:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 50%: After recloning', diff saved to https://phabricator.wikimedia.org/P57736 and previous config saved to /var/cache/conftool/dbconfig/20240222-155141-root.json
  • 15:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P57735 and previous config saved to /var/cache/conftool/dbconfig/20240222-155005-arnaudb.json
  • 15:48 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b2-codfw.mgmt with reason: prepping for server uplink migration codfw rack b2
  • 15:48 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b2-codfw.mgmt with reason: prepping for server uplink migration codfw rack b2
  • 15:46 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp[2031-2032].codfw.wmnet with reason: T355868
  • 15:46 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on cp[2031-2032].codfw.wmnet with reason: T355868
  • 15:39 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster (duration: 00m 16s)
  • 15:39 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster
  • 15:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 25%: After recloning', diff saved to https://phabricator.wikimedia.org/P57734 and previous config saved to /var/cache/conftool/dbconfig/20240222-153636-root.json
  • 15:35 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P57733 and previous config saved to /var/cache/conftool/dbconfig/20240222-153459-arnaudb.json
  • 15:32 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 15:27 moritzm: installing glib2.0 security updates on bullseye
  • 15:27 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
  • 15:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 10%: After recloning', diff saved to https://phabricator.wikimedia.org/P57732 and previous config saved to /var/cache/conftool/dbconfig/20240222-152131-root.json
  • 15:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T357189)', diff saved to https://phabricator.wikimedia.org/P57731 and previous config saved to /var/cache/conftool/dbconfig/20240222-151952-arnaudb.json
  • 15:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1211 (T357189)', diff saved to https://phabricator.wikimedia.org/P57730 and previous config saved to /var/cache/conftool/dbconfig/20240222-151733-arnaudb.json
  • 15:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 15:17 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance
  • 15:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T357189)', diff saved to https://phabricator.wikimedia.org/P57729 and previous config saved to /var/cache/conftool/dbconfig/20240222-151701-arnaudb.json
  • 15:15 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS bullseye
  • 15:15 akosiaris@cumin1002: conftool action : set/pooled=yes; selector: service=parsoid-php,name=kubernetes.*
  • 15:15 akosiaris: T357392 pool 46 kubernetes hosts of parsoid-php with a weight of 1. Since the 42 parse hosts are at weight 110, that means roughly 1% goes to the mw-parsoid deployment, aka mw-on-k8s
  • 15:13 akosiaris@cumin1002: conftool action : set/weight=1; selector: service=parsoid-php,name=kubernetes.*
  • 15:12 akosiaris@cumin1002: conftool action : set/weight=110; selector: service=parsoid-php,name=(pars.*|mw.*)
  • 15:12 akosiaris: Bump weight of old parsoid hosts from 10 to 110. This is a no-op right now but will make the calculations spelled out in T357392 possible later.
  • 14:55 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply
  • 14:55 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/mw-parsoid: apply
  • 14:55 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 14:55 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2149 (re)pooling @ 1%: After recloning', diff saved to https://phabricator.wikimedia.org/P57726 and previous config saved to /var/cache/conftool/dbconfig/20240222-145120-root.json
  • 14:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P57725 and previous config saved to /var/cache/conftool/dbconfig/20240222-144648-arnaudb.json
  • 14:45 cgoubert@deploy2002: Finished scap: Backport for Enable $wgLocalHTTPProxy on group1 wikis (T298265) (duration: 17m 46s)
  • 14:44 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-redacteddb1001.eqiad.wmnet with OS bullseye
  • 14:44 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bullseye
  • 14:37 cgoubert@deploy2002: cgoubert: Continuing with sync
  • 14:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T357189)', diff saved to https://phabricator.wikimedia.org/P57724 and previous config saved to /var/cache/conftool/dbconfig/20240222-143141-arnaudb.json
  • 14:29 cgoubert@deploy2002: cgoubert: Backport for Enable $wgLocalHTTPProxy on group1 wikis (T298265) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T357189)', diff saved to https://phabricator.wikimedia.org/P57723 and previous config saved to /var/cache/conftool/dbconfig/20240222-142921-arnaudb.json
  • 14:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 14:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 14:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57722 and previous config saved to /var/cache/conftool/dbconfig/20240222-142859-arnaudb.json
  • 14:28 cgoubert@deploy2002: Started scap: Backport for Enable $wgLocalHTTPProxy on group1 wikis (T298265)
  • 14:15 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57721 and previous config saved to /var/cache/conftool/dbconfig/20240222-141508-root.json
  • 14:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P57720 and previous config saved to /var/cache/conftool/dbconfig/20240222-141353-arnaudb.json
  • 14:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 14:00 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57719 and previous config saved to /var/cache/conftool/dbconfig/20240222-140003-root.json
  • 13:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P57718 and previous config saved to /var/cache/conftool/dbconfig/20240222-135846-arnaudb.json
  • 13:53 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:46 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1034.eqiad.wmnet with OS bookworm
  • 13:46 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:45 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:45 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 13:45 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57717 and previous config saved to /var/cache/conftool/dbconfig/20240222-134458-root.json
  • 13:44 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57716 and previous config saved to /var/cache/conftool/dbconfig/20240222-134340-arnaudb.json
  • 13:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57715 and previous config saved to /var/cache/conftool/dbconfig/20240222-134120-arnaudb.json
  • 13:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 13:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance
  • 13:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57714 and previous config saved to /var/cache/conftool/dbconfig/20240222-134059-arnaudb.json
  • 13:40 aborrero@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1034
  • 13:40 aborrero@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1034
  • 13:34 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:29 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57713 and previous config saved to /var/cache/conftool/dbconfig/20240222-132953-root.json
  • 13:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P57712 and previous config saved to /var/cache/conftool/dbconfig/20240222-132551-arnaudb.json
  • 13:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 13:20 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1034.eqiad.wmnet with reason: host reimage
  • 13:18 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1034.eqiad.wmnet with reason: host reimage
  • 13:14 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57711 and previous config saved to /var/cache/conftool/dbconfig/20240222-131448-root.json
  • 13:13 godog: bounce grafana to apply new datasources
  • 13:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P57710 and previous config saved to /var/cache/conftool/dbconfig/20240222-131045-arnaudb.json
  • 13:05 Emperor: ms-codfw set ACL {"read-only":["mw:backup"]} T269108
  • 13:03 Emperor: ms-eqiad set ACL {"read-only":["mw:backup"]} T269108
  • 13:02 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Upgrading gitlab
  • 13:01 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1034.eqiad.wmnet with OS bookworm
  • 12:59 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57709 and previous config saved to /var/cache/conftool/dbconfig/20240222-125943-root.json
  • 12:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57708 and previous config saved to /var/cache/conftool/dbconfig/20240222-125538-arnaudb.json
  • 12:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57707 and previous config saved to /var/cache/conftool/dbconfig/20240222-125319-arnaudb.json
  • 12:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 12:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 12:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57706 and previous config saved to /var/cache/conftool/dbconfig/20240222-125257-arnaudb.json
  • 12:52 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Upgrading gitlab
  • 12:45 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrading gitlab
  • 12:44 marostegui@cumin1002: dbctl commit (dc=all): 'es1028 (re)pooling @ 1%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57705 and previous config saved to /var/cache/conftool/dbconfig/20240222-124438-root.json
  • 12:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P57704 and previous config saved to /var/cache/conftool/dbconfig/20240222-123750-arnaudb.json
  • 12:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P57703 and previous config saved to /var/cache/conftool/dbconfig/20240222-122244-arnaudb.json
  • 12:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57702 and previous config saved to /var/cache/conftool/dbconfig/20240222-120737-arnaudb.json
  • 12:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57701 and previous config saved to /var/cache/conftool/dbconfig/20240222-120518-arnaudb.json
  • 12:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 12:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 12:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T357189)', diff saved to https://phabricator.wikimedia.org/P57700 and previous config saved to /var/cache/conftool/dbconfig/20240222-120445-arnaudb.json
  • 12:02 eoghan@cumin1002: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrading gitlab
  • 11:55 eoghan@cumin1002: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrading gitlab
  • 11:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 11:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 11:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/spark-history: apply
  • 11:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/spark-history: apply
  • 11:50 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1028.eqiad.wmnet with OS bookworm
  • 11:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P57699 and previous config saved to /var/cache/conftool/dbconfig/20240222-114938-arnaudb.json
  • 11:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P57698 and previous config saved to /var/cache/conftool/dbconfig/20240222-113432-arnaudb.json
  • 11:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1028.eqiad.wmnet with reason: host reimage
  • 11:26 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1028.eqiad.wmnet with reason: host reimage
  • 11:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T357189)', diff saved to https://phabricator.wikimedia.org/P57697 and previous config saved to /var/cache/conftool/dbconfig/20240222-111925-arnaudb.json
  • 11:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T357189)', diff saved to https://phabricator.wikimedia.org/P57696 and previous config saved to /var/cache/conftool/dbconfig/20240222-111706-arnaudb.json
  • 11:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 11:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 11:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T357189)', diff saved to https://phabricator.wikimedia.org/P57695 and previous config saved to /var/cache/conftool/dbconfig/20240222-111644-arnaudb.json
  • 11:12 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1028.eqiad.wmnet with OS bookworm
  • 11:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1028 T358180', diff saved to https://phabricator.wikimedia.org/P57694 and previous config saved to /var/cache/conftool/dbconfig/20240222-110914-root.json
  • 11:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P57693 and previous config saved to /var/cache/conftool/dbconfig/20240222-110138-arnaudb.json
  • 10:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P57692 and previous config saved to /var/cache/conftool/dbconfig/20240222-104632-arnaudb.json
  • 10:35 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet,service=s5
  • 10:35 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet,service=s8
  • 10:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T357189)', diff saved to https://phabricator.wikimedia.org/P57690 and previous config saved to /var/cache/conftool/dbconfig/20240222-103125-arnaudb.json
  • 10:31 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=s8
  • 10:31 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=s5
  • 10:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T357189)', diff saved to https://phabricator.wikimedia.org/P57689 and previous config saved to /var/cache/conftool/dbconfig/20240222-102906-arnaudb.json
  • 10:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 10:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 10:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 10:28 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 10:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57688 and previous config saved to /var/cache/conftool/dbconfig/20240222-102817-arnaudb.json
  • 10:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P57687 and previous config saved to /var/cache/conftool/dbconfig/20240222-101310-arnaudb.json
  • 10:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57686 and previous config saved to /var/cache/conftool/dbconfig/20240222-101123-arnaudb.json
  • 10:10 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57685 and previous config saved to /var/cache/conftool/dbconfig/20240222-101018-arnaudb.json
  • 10:01 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57684 and previous config saved to /var/cache/conftool/dbconfig/20240222-100140-root.json
  • 09:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P57683 and previous config saved to /var/cache/conftool/dbconfig/20240222-095804-arnaudb.json
  • 09:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57682 and previous config saved to /var/cache/conftool/dbconfig/20240222-095619-arnaudb.json
  • 09:55 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57681 and previous config saved to /var/cache/conftool/dbconfig/20240222-095513-arnaudb.json
  • 09:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57680 and previous config saved to /var/cache/conftool/dbconfig/20240222-094635-root.json
  • 09:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57679 and previous config saved to /var/cache/conftool/dbconfig/20240222-094257-arnaudb.json
  • 09:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57678 and previous config saved to /var/cache/conftool/dbconfig/20240222-094114-arnaudb.json
  • 09:40 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57677 and previous config saved to /var/cache/conftool/dbconfig/20240222-094008-arnaudb.json
  • 09:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57675 and previous config saved to /var/cache/conftool/dbconfig/20240222-093130-root.json
  • 09:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2195 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57674 and previous config saved to /var/cache/conftool/dbconfig/20240222-092609-arnaudb.json
  • 09:25 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57673 and previous config saved to /var/cache/conftool/dbconfig/20240222-092503-arnaudb.json
  • 09:16 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57672 and previous config saved to /var/cache/conftool/dbconfig/20240222-091626-root.json
  • 09:03 jayme: restart prometheus@k8s in eqiad - T343529
  • 09:01 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57671 and previous config saved to /var/cache/conftool/dbconfig/20240222-090121-root.json
  • 09:01 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2143.codfw.wmnet
  • 09:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2195.codfw.wmnet
  • 08:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1180.eqiad.wmnet
  • 08:58 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 100%: After migration', diff saved to https://phabricator.wikimedia.org/P57670 and previous config saved to /var/cache/conftool/dbconfig/20240222-085800-root.json
  • 08:56 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2195.codfw.wmnet
  • 08:55 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2143.codfw.wmnet
  • 08:55 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1180.eqiad.wmnet
  • 08:55 arnaudb@cumin1002: dbctl commit (dc=all): 'T356240 - depooling db1187 db2143 db2195', diff saved to https://phabricator.wikimedia.org/P57669 and previous config saved to /var/cache/conftool/dbconfig/20240222-085521-arnaudb.json
  • 08:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db[2143,2195].codfw.wmnet,db1187.eqiad.wmnet with reason: Silence for reboot T356240
  • 08:52 jayme: rolling out prometheus-rsyslog-exporter 1.0.0+git20221110-1 to wikikube nodes - T357616
  • 08:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db[2143,2195].codfw.wmnet,db1187.eqiad.wmnet with reason: Silence for reboot T356240
  • 08:46 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57668 and previous config saved to /var/cache/conftool/dbconfig/20240222-084616-root.json
  • 08:44 jelto@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetmaster1002.eqiad.wmnet
  • 08:42 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 75%: After migration', diff saved to https://phabricator.wikimedia.org/P57667 and previous config saved to /var/cache/conftool/dbconfig/20240222-084255-root.json
  • 08:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T357189)', diff saved to https://phabricator.wikimedia.org/P57666 and previous config saved to /var/cache/conftool/dbconfig/20240222-084235-arnaudb.json
  • 08:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:42 jelto@cumin1002: START - Cookbook sre.hosts.reboot-single for host puppetmaster1002.eqiad.wmnet
  • 08:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 08:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'es2033 (re)pooling @ 1%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57665 and previous config saved to /var/cache/conftool/dbconfig/20240222-083111-root.json
  • 08:30 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2033.codfw.wmnet with OS bookworm
  • 08:29 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 18779
  • 08:28 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 18779
  • 08:27 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 50%: After migration', diff saved to https://phabricator.wikimedia.org/P57664 and previous config saved to /var/cache/conftool/dbconfig/20240222-082750-root.json
  • 08:25 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 138997
  • 08:24 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'configure' for AS: 138997
  • 08:24 ayounsi@cumin1002: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 138997
  • 08:23 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 138997
  • 08:21 hoo@deploy2002: Finished scap: Backport for Migrate to virtual domain mapping (T348526), Migrate to virtual domain mapping (T348526) (duration: 14m 44s)
  • 08:20 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
  • 08:20 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s3
  • 08:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2033.codfw.wmnet with reason: host reimage
  • 08:13 hoo@deploy2002: hoo: Continuing with sync
  • 08:12 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 25%: After migration', diff saved to https://phabricator.wikimedia.org/P57663 and previous config saved to /var/cache/conftool/dbconfig/20240222-081243-root.json
  • 08:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2033.codfw.wmnet with reason: host reimage
  • 08:08 hoo@deploy2002: hoo: Backport for Migrate to virtual domain mapping (T348526), Migrate to virtual domain mapping (T348526) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:06 hoo@deploy2002: Started scap: Backport for Migrate to virtual domain mapping (T348526), Migrate to virtual domain mapping (T348526)
  • 07:58 taavi: taavi@puppetmaster1002 ~ $ sudo systemctl restart apache2 # lots of 'Error 500 on SERVER: Server Error: undefined method `content' for nil:NilClass' in the logs, seems to have helped
  • 07:57 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 10%: After migration', diff saved to https://phabricator.wikimedia.org/P57662 and previous config saved to /var/cache/conftool/dbconfig/20240222-075738-root.json
  • 07:54 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2033.codfw.wmnet with OS bookworm
  • 07:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57661 and previous config saved to /var/cache/conftool/dbconfig/20240222-075448-root.json
  • 07:42 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 5%: After migration', diff saved to https://phabricator.wikimedia.org/P57660 and previous config saved to /var/cache/conftool/dbconfig/20240222-074233-root.json
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2033 T358080', diff saved to https://phabricator.wikimedia.org/P57659 and previous config saved to /var/cache/conftool/dbconfig/20240222-074042-root.json
  • 07:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57658 and previous config saved to /var/cache/conftool/dbconfig/20240222-073943-root.json
  • 07:30 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2026 as es2 codfw master T358080', diff saved to https://phabricator.wikimedia.org/P57657 and previous config saved to /var/cache/conftool/dbconfig/20240222-073017-marostegui.json
  • 07:27 marostegui@cumin1002: dbctl commit (dc=all): 'es1033 (re)pooling @ 1%: After migration', diff saved to https://phabricator.wikimedia.org/P57656 and previous config saved to /var/cache/conftool/dbconfig/20240222-072729-root.json
  • 07:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57655 and previous config saved to /var/cache/conftool/dbconfig/20240222-072438-root.json
  • 07:19 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1033.eqiad.wmnet with OS bookworm
  • 07:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57654 and previous config saved to /var/cache/conftool/dbconfig/20240222-070933-root.json
  • 06:58 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on es1033.eqiad.wmnet with reason: host reimage
  • 06:57 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1033.eqiad.wmnet with reason: host reimage
  • 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57653 and previous config saved to /var/cache/conftool/dbconfig/20240222-065428-root.json
  • 06:48 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
  • 06:48 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s1
  • 06:48 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
  • 06:47 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s3
  • 06:47 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s1
  • 06:46 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s1
  • 06:46 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s3
  • 06:44 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1033.eqiad.wmnet with OS bookworm
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1033 T358080', diff saved to https://phabricator.wikimedia.org/P57652 and previous config saved to /var/cache/conftool/dbconfig/20240222-064253-root.json
  • 06:42 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1030 as es2 master T358080', diff saved to https://phabricator.wikimedia.org/P57651 and previous config saved to /var/cache/conftool/dbconfig/20240222-064205-marostegui.json
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57650 and previous config saved to /var/cache/conftool/dbconfig/20240222-063923-root.json
  • 01:29 eileen: config revision changed from 5bdfab7a to b221a95a
  • 01:28 eileen: config revision changed from 5bdfab7a to b221a95a
  • 01:27 eileen: civicrm upgraded from cd839468 to c50fcae3
  • 00:43 rzl: rzl@lists1001:~$ sudo systemctl restart mailman3 # T358020
  • 00:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T357189)', diff saved to https://phabricator.wikimedia.org/P57649 and previous config saved to /var/cache/conftool/dbconfig/20240222-001210-arnaudb.json
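
Note on the 15:12-15:15 parsoid-php entries above (T357392): the "1%" figure can be checked with a short calculation. This is a minimal sketch that assumes traffic is split in proportion to each host's weight; the function name `traffic_share` is hypothetical and not part of conftool.

```python
def traffic_share(hosts_a: int, weight_a: int, hosts_b: int, weight_b: int) -> float:
    """Fraction of requests reaching group A, assuming traffic is
    distributed proportionally to per-host weight (an assumption here)."""
    total = hosts_a * weight_a + hosts_b * weight_b
    return hosts_a * weight_a / total


# Figures from the log: 46 kubernetes hosts at weight 1, 42 parse hosts at weight 110.
k8s_share = traffic_share(46, 1, 42, 110)
print(f"weight 110: {k8s_share:.2%}")

# Same pool sizes with the old parse-host weight of 10, for comparison.
old_share = traffic_share(46, 1, 42, 10)
print(f"weight 10:  {old_share:.2%}")
```

Under that assumption the kubernetes hosts receive 46 / (46 + 42 × 110) = 46 / 4666 of requests, just under 1%. With the old weight of 10, the smallest integer weight of 1 would already send about 10% to kubernetes, which is presumably why the weights were bumped first.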

2024-02-21

  • 23:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P57648 and previous config saved to /var/cache/conftool/dbconfig/20240221-235703-arnaudb.json
  • 23:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P57647 and previous config saved to /var/cache/conftool/dbconfig/20240221-234156-arnaudb.json
  • 23:37 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:37 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:28 eileen: config revision changed from c6fc16bb to 5bdfab7a
  • 23:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T357189)', diff saved to https://phabricator.wikimedia.org/P57646 and previous config saved to /var/cache/conftool/dbconfig/20240221-232649-arnaudb.json
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T357189)', diff saved to https://phabricator.wikimedia.org/P57645 and previous config saved to /var/cache/conftool/dbconfig/20240221-225350-arnaudb.json
  • 22:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 22:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 22:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57644 and previous config saved to /var/cache/conftool/dbconfig/20240221-225326-arnaudb.json
  • 22:51 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:50 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P57643 and previous config saved to /var/cache/conftool/dbconfig/20240221-223819-arnaudb.json
  • 22:29 ryankemper@deploy2002: Finished deploy [wdqs/wdqs@8a290df]: new allowlisted endpoints for wdqs (duration: 11m 59s)
  • 22:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:25 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P57642 and previous config saved to /var/cache/conftool/dbconfig/20240221-222313-arnaudb.json
  • 22:20 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 22:20 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 22:17 ryankemper@deploy2002: Started deploy [wdqs/wdqs@8a290df]: new allowlisted endpoints for wdqs
  • 22:12 Dreamy_Jazz: Evening UTC backport window done
  • 22:10 ryankemper: [WDQS] T355868 Depooling `wdqs2024`, `wdqs2014`, `wdqs2010` in anticipation of row maintenance
  • 22:08 dreamyjazz@deploy2002: Finished scap: Backport for Pin wgGlobalBlockingAllowGlobalAccountBlocks as false on WMF wikis (T356923 T356924) (duration: 10m 16s)
  • 22:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57641 and previous config saved to /var/cache/conftool/dbconfig/20240221-220807-arnaudb.json
  • 22:02 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic2041*,elastic2042*,elastic2057*,elastic2063*,elastic2064*,elastic2077*,elastic2078*,elastic2092*,elastic2093*,elastic2094* for switch maintenance - bking@cumin2002 - T355860
  • 22:02 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic2041*,elastic2042*,elastic2057*,elastic2063*,elastic2064*,elastic2077*,elastic2078*,elastic2092*,elastic2093*,elastic2094* for switch maintenance - bking@cumin2002 - T355860
  • 22:00 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
  • 22:00 dreamyjazz@deploy2002: dreamyjazz: Backport for Pin wgGlobalBlockingAllowGlobalAccountBlocks as false on WMF wikis (T356923 T356924) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:58 dreamyjazz@deploy2002: Started scap: Backport for Pin wgGlobalBlockingAllowGlobalAccountBlocks as false on WMF wikis (T356923 T356924)
  • 21:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57640 and previous config saved to /var/cache/conftool/dbconfig/20240221-215620-arnaudb.json
  • 21:56 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 21:56 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 21:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T357189)', diff saved to https://phabricator.wikimedia.org/P57639 and previous config saved to /var/cache/conftool/dbconfig/20240221-215558-arnaudb.json
  • 21:54 jhuneidi@deploy2002: Finished scap: Backport for cswiki, commonswiki, enwiki: fix IP cap date and IP for WikiGap Editathon (T357978) (duration: 10m 47s)
  • 21:52 eevans@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1034.eqiad.wmnet with reason: Bootstrapping — T354560
  • 21:52 eevans@cumin1002: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase1034.eqiad.wmnet with reason: Bootstrapping — T354560
  • 21:51 urandom: bootstrapping Cassandra, restbase1034-{a,b,c} - T354560
  • 21:46 jhuneidi@deploy2002: anzx and jhuneidi: Continuing with sync
  • 21:45 jhuneidi@deploy2002: anzx and jhuneidi: Backport for cswiki, commonswiki, enwiki: fix IP cap date and IP for WikiGap Editathon (T357978) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 21:43 jhuneidi@deploy2002: Started scap: Backport for cswiki, commonswiki, enwiki: fix IP cap date and IP for WikiGap Editathon (T357978)
  • 21:42 jhuneidi@deploy2002: Finished scap: Backport for Remove Japanese Wikipedia from projects sharing user scripts (T301212), Enable night mode on beta cluster (T357759) (duration: 15m 25s)
  • 21:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P57638 and previous config saved to /var/cache/conftool/dbconfig/20240221-214052-arnaudb.json
  • 21:34 jhuneidi@deploy2002: jdlrobson and jhuneidi: Continuing with sync
  • 21:32 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 21:31 rzl@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 21:31 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncmonitor1001.eqiad.wmnet with reason: host reimage
  • 21:29 jhuneidi@deploy2002: jdlrobson and jhuneidi: Backport for Remove Japanese Wikipedia from projects sharing user scripts (T301212), Enable night mode on beta cluster (T357759) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:27 jhuneidi@deploy2002: Started scap: Backport for Remove Japanese Wikipedia from projects sharing user scripts (T301212), Enable night mode on beta cluster (T357759)
  • 21:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncmonitor1001.eqiad.wmnet with reason: host reimage
  • 21:26 jhuneidi@deploy2002: Finished scap: Backport for Turn on Parsoid read views by default on officewiki (T355566) (duration: 15m 19s)
  • 21:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P57637 and previous config saved to /var/cache/conftool/dbconfig/20240221-212546-arnaudb.json
  • 21:24 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 21:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 21:19 rzl@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 21:18 rzl@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 21:18 jhuneidi@deploy2002: cscott and jhuneidi: Continuing with sync
  • 21:17 rzl@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 21:17 brett@cumin2002: START - Cookbook sre.hosts.reimage for host ncmonitor1001.eqiad.wmnet with OS bookworm
  • 21:17 rzl@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 21:12 jhuneidi@deploy2002: cscott and jhuneidi: Backport for Turn on Parsoid read views by default on officewiki (T355566) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:11 jhuneidi@deploy2002: Started scap: Backport for Turn on Parsoid read views by default on officewiki (T355566)
  • 21:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T357189)', diff saved to https://phabricator.wikimedia.org/P57636 and previous config saved to /var/cache/conftool/dbconfig/20240221-211039-arnaudb.json
  • 21:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T357189)', diff saved to https://phabricator.wikimedia.org/P57635 and previous config saved to /var/cache/conftool/dbconfig/20240221-210001-arnaudb.json
  • 20:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T357189)', diff saved to https://phabricator.wikimedia.org/P57634 and previous config saved to /var/cache/conftool/dbconfig/20240221-205922-arnaudb.json
  • 20:54 jhuneidi@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.19 refs T354437 (duration: 08m 35s)
  • 20:46 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.19 refs T354437
  • 20:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P57633 and previous config saved to /var/cache/conftool/dbconfig/20240221-204415-arnaudb.json
  • 20:39 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:39 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 ejegg: turned off nightly recurring charge job for Autorescue deployment
  • 20:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P57632 and previous config saved to /var/cache/conftool/dbconfig/20240221-202906-arnaudb.json
  • 20:16 jhuneidi@deploy2002: scap failed: average error rate on 4/4 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
  • 20:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T357189)', diff saved to https://phabricator.wikimedia.org/P57631 and previous config saved to /var/cache/conftool/dbconfig/20240221-201400-arnaudb.json
  • 20:11 jhuneidi@deploy2002: Finished scap: Backport for CentralAuthHooks::onGetUserBlock: Only run for reg. users (T358112) (duration: 14m 09s)
  • 20:03 jhuneidi@deploy2002: jhuneidi and matmarex: Continuing with sync
  • 20:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T357189)', diff saved to https://phabricator.wikimedia.org/P57630 and previous config saved to /var/cache/conftool/dbconfig/20240221-200209-arnaudb.json
  • 20:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 20:02 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 20:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T357189)', diff saved to https://phabricator.wikimedia.org/P57629 and previous config saved to /var/cache/conftool/dbconfig/20240221-200148-arnaudb.json
  • 19:58 jhuneidi@deploy2002: jhuneidi and matmarex: Backport for CentralAuthHooks::onGetUserBlock: Only run for reg. users (T358112) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:57 jhuneidi@deploy2002: Started scap: Backport for CentralAuthHooks::onGetUserBlock: Only run for reg. users (T358112)
  • 19:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T355609)', diff saved to https://phabricator.wikimedia.org/P57628 and previous config saved to /var/cache/conftool/dbconfig/20240221-195157-marostegui.json
  • 19:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P57627 and previous config saved to /var/cache/conftool/dbconfig/20240221-194641-arnaudb.json
  • 19:38 inflatador: bking@deploy2002 deleting old flink data from thanos-swift T348685
  • 19:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P57626 and previous config saved to /var/cache/conftool/dbconfig/20240221-193650-marostegui.json
  • 19:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P57625 and previous config saved to /var/cache/conftool/dbconfig/20240221-193135-arnaudb.json
  • 19:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P57624 and previous config saved to /var/cache/conftool/dbconfig/20240221-192144-marostegui.json
  • 19:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T357189)', diff saved to https://phabricator.wikimedia.org/P57623 and previous config saved to /var/cache/conftool/dbconfig/20240221-191628-arnaudb.json
  • 19:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2188 (T355609)', diff saved to https://phabricator.wikimedia.org/P57622 and previous config saved to /var/cache/conftool/dbconfig/20240221-190637-marostegui.json
  • 19:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2122 (T357189)', diff saved to https://phabricator.wikimedia.org/P57621 and previous config saved to /var/cache/conftool/dbconfig/20240221-190311-arnaudb.json
  • 19:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 19:02 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
  • 19:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T357189)', diff saved to https://phabricator.wikimedia.org/P57620 and previous config saved to /var/cache/conftool/dbconfig/20240221-190249-arnaudb.json
  • 18:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P57619 and previous config saved to /var/cache/conftool/dbconfig/20240221-184743-arnaudb.json
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2188 (T355609)', diff saved to https://phabricator.wikimedia.org/P57618 and previous config saved to /var/cache/conftool/dbconfig/20240221-184144-marostegui.json
  • 18:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 18:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2188.codfw.wmnet with reason: Maintenance
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T355609)', diff saved to https://phabricator.wikimedia.org/P57617 and previous config saved to /var/cache/conftool/dbconfig/20240221-184120-marostegui.json
  • 18:32 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P57616 and previous config saved to /var/cache/conftool/dbconfig/20240221-183236-arnaudb.json
  • 18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P57615 and previous config saved to /var/cache/conftool/dbconfig/20240221-182614-marostegui.json
  • 18:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T357189)', diff saved to https://phabricator.wikimedia.org/P57614 and previous config saved to /var/cache/conftool/dbconfig/20240221-181729-arnaudb.json
  • 18:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P57613 and previous config saved to /var/cache/conftool/dbconfig/20240221-181107-marostegui.json
  • 18:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2121 (T357189)', diff saved to https://phabricator.wikimedia.org/P57612 and previous config saved to /var/cache/conftool/dbconfig/20240221-180103-arnaudb.json
  • 18:00 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 18:00 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
  • 18:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T357189)', diff saved to https://phabricator.wikimedia.org/P57611 and previous config saved to /var/cache/conftool/dbconfig/20240221-180041-arnaudb.json
  • 17:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T355609)', diff saved to https://phabricator.wikimedia.org/P57610 and previous config saved to /var/cache/conftool/dbconfig/20240221-175601-marostegui.json
  • 17:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P57609 and previous config saved to /var/cache/conftool/dbconfig/20240221-174534-arnaudb.json
  • 17:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P57608 and previous config saved to /var/cache/conftool/dbconfig/20240221-173028-arnaudb.json
  • 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2176 (T355609)', diff saved to https://phabricator.wikimedia.org/P57607 and previous config saved to /var/cache/conftool/dbconfig/20240221-172731-marostegui.json
  • 17:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 17:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2176.codfw.wmnet with reason: Maintenance
  • 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T355609)', diff saved to https://phabricator.wikimedia.org/P57606 and previous config saved to /var/cache/conftool/dbconfig/20240221-172709-marostegui.json
  • 17:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T357189)', diff saved to https://phabricator.wikimedia.org/P57605 and previous config saved to /var/cache/conftool/dbconfig/20240221-171521-arnaudb.json
  • 17:12 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P57604 and previous config saved to /var/cache/conftool/dbconfig/20240221-171203-marostegui.json
  • 17:09 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bullseye
  • 17:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2120 (T357189)', diff saved to https://phabricator.wikimedia.org/P57603 and previous config saved to /var/cache/conftool/dbconfig/20240221-170157-arnaudb.json
  • 17:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 17:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
  • 17:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T357189)', diff saved to https://phabricator.wikimedia.org/P57602 and previous config saved to /var/cache/conftool/dbconfig/20240221-170134-arnaudb.json
  • 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P57601 and previous config saved to /var/cache/conftool/dbconfig/20240221-165657-marostegui.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57600 and previous config saved to /var/cache/conftool/dbconfig/20240221-165651-arnaudb.json
  • 16:56 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57599 and previous config saved to /var/cache/conftool/dbconfig/20240221-165644-arnaudb.json
  • 16:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P57598 and previous config saved to /var/cache/conftool/dbconfig/20240221-164628-arnaudb.json
  • 16:42 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T355609)', diff saved to https://phabricator.wikimedia.org/P57597 and previous config saved to /var/cache/conftool/dbconfig/20240221-164150-marostegui.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57596 and previous config saved to /var/cache/conftool/dbconfig/20240221-164146-arnaudb.json
  • 16:41 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57595 and previous config saved to /var/cache/conftool/dbconfig/20240221-164140-arnaudb.json
  • 16:34 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57594 and previous config saved to /var/cache/conftool/dbconfig/20240221-163433-root.json
  • 16:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P57593 and previous config saved to /var/cache/conftool/dbconfig/20240221-163122-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57592 and previous config saved to /var/cache/conftool/dbconfig/20240221-162641-arnaudb.json
  • 16:26 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57591 and previous config saved to /var/cache/conftool/dbconfig/20240221-162635-arnaudb.json
  • 16:25 claime: Uncordoning kubernetes2025.codfw.wmnet kubernetes2026.codfw.wmnet following codfw A8 network migration - T355874
  • 16:24 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=parse200(4|5).*
  • 16:24 claime: Repooling parse2004.codfw.wmnet parse2005.codfw.wmnet following codfw A8 network migration - T355874
  • 16:19 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57590 and previous config saved to /var/cache/conftool/dbconfig/20240221-161928-root.json
  • 16:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T357189)', diff saved to https://phabricator.wikimedia.org/P57589 and previous config saved to /var/cache/conftool/dbconfig/20240221-161615-arnaudb.json
  • 16:14 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2174 (T355609)', diff saved to https://phabricator.wikimedia.org/P57588 and previous config saved to /var/cache/conftool/dbconfig/20240221-161407-marostegui.json
  • 16:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 16:14 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2174.codfw.wmnet with reason: Maintenance
  • 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T355609)', diff saved to https://phabricator.wikimedia.org/P57587 and previous config saved to /var/cache/conftool/dbconfig/20240221-161345-marostegui.json
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2106 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57586 and previous config saved to /var/cache/conftool/dbconfig/20240221-161136-arnaudb.json
  • 16:11 arnaudb@cumin1002: dbctl commit (dc=all): 'db2146 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57585 and previous config saved to /var/cache/conftool/dbconfig/20240221-161129-arnaudb.json
  • 16:09 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2137.codfw.wmnet with OS bookworm
  • 16:06 jayme: imported prometheus-rsyslog-exporter 1.0.0+git20221110-1 to buster,bullseye,bookworm - T357616
  • 16:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2108 (T357189)', diff saved to https://phabricator.wikimedia.org/P57584 and previous config saved to /var/cache/conftool/dbconfig/20240221-160511-arnaudb.json
  • 16:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 16:05 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
  • 16:04 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 16:04 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 16:04 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57583 and previous config saved to /var/cache/conftool/dbconfig/20240221-160423-root.json
  • 16:03 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 16:03 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 16:02 topranks: Commencing network maintenance migrating servers to new switch codfw rack A8 T355874
  • 15:59 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: Migrating servers in codfw rack A7 to lsw1-a7-codfw
  • 15:58 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 6 hosts with reason: Migrating servers in codfw rack A7 to lsw1-a7-codfw
  • 15:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P57582 and previous config saved to /var/cache/conftool/dbconfig/20240221-155839-marostegui.json
  • 15:58 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a8-codfw.mgmt with reason: prepping for server uplink migration codfw rack a8
  • 15:57 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a8-codfw.mgmt with reason: prepping for server uplink migration codfw rack a8
  • 15:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 15:55 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
  • 15:55 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
  • 15:54 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
  • 15:52 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
  • 15:51 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
  • 15:49 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57581 and previous config saved to /var/cache/conftool/dbconfig/20240221-154918-root.json
  • 15:47 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2137.codfw.wmnet with reason: host reimage
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
  • 15:44 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2137.codfw.wmnet with reason: host reimage
  • 15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P57580 and previous config saved to /var/cache/conftool/dbconfig/20240221-154333-marostegui.json
  • 15:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:25:00 on db2106.codfw.wmnet with reason: T355874 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:25:00 on db2106.codfw.wmnet with reason: T355874 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:25:00 on db2146.codfw.wmnet with reason: T355874 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:25:00 on db2146.codfw.wmnet with reason: T355874 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:40 arnaudb@cumin1002: dbctl commit (dc=all): 'T355874 - depooling db2146 db2106', diff saved to https://phabricator.wikimedia.org/P57579 and previous config saved to /var/cache/conftool/dbconfig/20240221-154056-arnaudb.json
  • 15:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 15:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 15:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T357189)', diff saved to https://phabricator.wikimedia.org/P57578 and previous config saved to /var/cache/conftool/dbconfig/20240221-153926-arnaudb.json
  • 15:34 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57577 and previous config saved to /var/cache/conftool/dbconfig/20240221-153414-root.json
  • 15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T355609)', diff saved to https://phabricator.wikimedia.org/P57576 and previous config saved to /var/cache/conftool/dbconfig/20240221-152826-marostegui.json
  • 15:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P57575 and previous config saved to /var/cache/conftool/dbconfig/20240221-152420-arnaudb.json
  • 15:21 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host db2137.codfw.wmnet with OS bookworm
  • 15:19 marostegui@cumin1002: dbctl commit (dc=all): 'es2026 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57574 and previous config saved to /var/cache/conftool/dbconfig/20240221-151909-root.json
  • 15:12 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 15:12 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 15:10 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:10 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for private1-b-codfw - cmooney@cumin1002"
  • 14:55 TheresNoTime: UTC afternoon backport window done
  • 14:54 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:54 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:54 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for private1-a-codfw - cmooney@cumin1002"
  • 14:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1236 (T357189)', diff saved to https://phabricator.wikimedia.org/P57570 and previous config saved to /var/cache/conftool/dbconfig/20240221-145407-arnaudb.json
  • 14:53 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for private1-a-codfw - cmooney@cumin1002"
  • 14:53 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 14:52 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:49 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 14:48 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:47 TheresNoTime: [samtar@mwmaint2002 ~]$ mwscript namespaceDupes.php --wiki hewikinews --fix #T349581
  • 14:47 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1236 (T357189)', diff saved to https://phabricator.wikimedia.org/P57569 and previous config saved to /var/cache/conftool/dbconfig/20240221-144702-arnaudb.json
  • 14:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 14:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1236.eqiad.wmnet with reason: Maintenance
  • 14:46 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T357189)', diff saved to https://phabricator.wikimedia.org/P57568 and previous config saved to /var/cache/conftool/dbconfig/20240221-144641-arnaudb.json
  • 14:46 samtar@deploy2002: Finished scap: Backport for cswiki, commonswiki, enwiki: Lift IP cap for WikiGap Editathon, mywiki: create portal and draft namespace (T352424) (duration: 20m 23s)
  • 14:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P57567 and previous config saved to /var/cache/conftool/dbconfig/20240221-144536-marostegui.json
  • 14:44 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2026.codfw.wmnet with reason: host reimage
  • 14:44 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:43 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 14:42 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2026.codfw.wmnet with reason: host reimage
  • 14:40 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1033.eqiad.wmnet with OS bookworm
  • 14:38 cmooney@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:38 cmooney@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for private1-a-codfw - cmooney@cumin1002"
  • 14:37 cmooney@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns entries for private1-a-codfw - cmooney@cumin1002"
  • 14:37 samtar@deploy2002: samtar and anzx: Continuing with sync
  • 14:34 cmooney@cumin1002: START - Cookbook sre.dns.netbox
  • 14:33 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:33 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 14:33 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 14:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57566 and previous config saved to /var/cache/conftool/dbconfig/20240221-143239-arnaudb.json
  • 14:31 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P57565 and previous config saved to /var/cache/conftool/dbconfig/20240221-143133-arnaudb.json
  • 14:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57564 and previous config saved to /var/cache/conftool/dbconfig/20240221-143120-arnaudb.json
  • 14:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P57563 and previous config saved to /var/cache/conftool/dbconfig/20240221-143030-marostegui.json
  • 14:27 samtar@deploy2002: samtar and anzx: Backport for cswiki, commonswiki, enwiki: Lift IP cap for WikiGap Editathon, mywiki: create portal and draft namespace (T352424) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:26 samtar@deploy2002: Started scap: Backport for cswiki, commonswiki, enwiki: Lift IP cap for WikiGap Editathon, mywiki: create portal and draft namespace (T352424)
  • 14:24 cmooney@cumin1002: START - Cookbook sre.hosts.reimage for host es2026.codfw.wmnet with OS bookworm
  • 14:23 samtar@deploy2002: Finished scap: Backport for zhwiki: Create group ipblock-exempt-grantor (T357991) (duration: 11m 05s)
  • 14:22 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "new apt server in codfw - jmm@cumin2002 - T331613"
  • 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "new apt server in codfw - jmm@cumin2002 - T331613"
  • 14:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57562 and previous config saved to /var/cache/conftool/dbconfig/20240221-141734-arnaudb.json
  • 14:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P57561 and previous config saved to /var/cache/conftool/dbconfig/20240221-141627-arnaudb.json
  • 14:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57560 and previous config saved to /var/cache/conftool/dbconfig/20240221-141615-arnaudb.json
  • 14:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2170 (T355609)', diff saved to https://phabricator.wikimedia.org/P57559 and previous config saved to /var/cache/conftool/dbconfig/20240221-141523-marostegui.json
  • 14:15 samtar@deploy2002: stang and samtar: Continuing with sync
  • 14:13 samtar@deploy2002: stang and samtar: Backport for zhwiki: Create group ipblock-exempt-grantor (T357991) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:12 samtar@deploy2002: Started scap: Backport for zhwiki: Create group ipblock-exempt-grantor (T357991)
  • 14:10 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1033.eqiad.wmnet with reason: host reimage
  • 14:08 claime: restarted ferm.service on kubernetes2055.codfw.wmnet mw2440.codfw.wmnet mw2297.codfw.wmnet kubernetes2016.codfw.wmnet - T354855
  • 14:07 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1033.eqiad.wmnet with reason: host reimage
  • 14:05 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host apt2002.wikimedia.org
  • 14:05 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host apt2002.wikimedia.org with OS bookworm
  • 14:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57558 and previous config saved to /var/cache/conftool/dbconfig/20240221-140229-arnaudb.json
  • 14:01 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T357189)', diff saved to https://phabricator.wikimedia.org/P57557 and previous config saved to /var/cache/conftool/dbconfig/20240221-140120-arnaudb.json
  • 14:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57556 and previous config saved to /var/cache/conftool/dbconfig/20240221-140110-arnaudb.json
  • 13:59 topranks: adding IRB anycast interface on private1-a-codfw vlan to lsw1-a4-codfw
  • 13:50 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1033.eqiad.wmnet with OS bookworm
  • 13:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T357189)', diff saved to https://phabricator.wikimedia.org/P57555 and previous config saved to /var/cache/conftool/dbconfig/20240221-135031-arnaudb.json
  • 13:50 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 13:50 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 13:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T357189)', diff saved to https://phabricator.wikimedia.org/P57554 and previous config saved to /var/cache/conftool/dbconfig/20240221-135009-arnaudb.json
  • 13:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db1180 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57553 and previous config saved to /var/cache/conftool/dbconfig/20240221-134724-arnaudb.json
  • 13:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2142.codfw.wmnet
  • 13:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57552 and previous config saved to /var/cache/conftool/dbconfig/20240221-134605-arnaudb.json
  • 13:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1180.eqiad.wmnet
  • 13:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1213.eqiad.wmnet
  • 13:41 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2142.codfw.wmnet
  • 13:41 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1213.eqiad.wmnet
  • 13:40 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1180.eqiad.wmnet
  • 13:40 Dreamy_Jazz: Re-started MediaModeration scanning script using `mwscript extensions/MediaModeration/maintenance/scanFilesInScanTable.php --wiki=commonswiki --use-jobqueue --sleep 30 --verbose 2>&1 | tee ~/scan-files-in-scan-table-commonswiki-sleep-30-no-render-now.txt` - See T351400
  • 13:40 arnaudb@cumin1002: dbctl commit (dc=all): 'T356240 - depooling db1180 db1213 db2142', diff saved to https://phabricator.wikimedia.org/P57551 and previous config saved to /var/cache/conftool/dbconfig/20240221-134015-arnaudb.json
  • 13:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db2142.codfw.wmnet,db[1180,1213].eqiad.wmnet with reason: Silence for reboot T356240
  • 13:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db2142.codfw.wmnet,db[1180,1213].eqiad.wmnet with reason: Silence for reboot T356240
  • 13:35 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P57550 and previous config saved to /var/cache/conftool/dbconfig/20240221-133503-arnaudb.json
  • 13:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on apt2002.wikimedia.org with reason: host reimage
  • 13:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on apt2002.wikimedia.org with reason: host reimage
  • 13:22 cmooney@cumin1002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
  • 13:22 cmooney@cumin1002: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
  • 13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2170 (T355609)', diff saved to https://phabricator.wikimedia.org/P57549 and previous config saved to /var/cache/conftool/dbconfig/20240221-132156-marostegui.json
  • 13:21 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 13:21 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 13:21 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T355609)', diff saved to https://phabricator.wikimedia.org/P57548 and previous config saved to /var/cache/conftool/dbconfig/20240221-132134-marostegui.json
  • 13:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P57547 and previous config saved to /var/cache/conftool/dbconfig/20240221-131957-arnaudb.json
  • 13:18 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host apt2002.wikimedia.org with OS bookworm
  • 13:16 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM apt2002.wikimedia.org - jmm@cumin2002"
  • 13:15 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM apt2002.wikimedia.org - jmm@cumin2002"
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) apt2002.wikimedia.org on all recursors
  • 13:14 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache apt2002.wikimedia.org on all recursors
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:14 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM apt2002.wikimedia.org - jmm@cumin2002"
  • 13:13 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM apt2002.wikimedia.org - jmm@cumin2002"
  • 13:11 jmm@cumin2002: START - Cookbook sre.dns.netbox
  • 13:11 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host apt2002.wikimedia.org
  • 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.resource-report (exit_code=0)
  • 13:11 jmm@cumin2002: START - Cookbook sre.ganeti.resource-report
  • 13:08 samtar@deploy2002: Finished scap: Backport for InitialiseSettings: Enable Edit Recovery on 3 projects (T355548) (duration: 14m 36s)
  • 13:06 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P57546 and previous config saved to /var/cache/conftool/dbconfig/20240221-130628-marostegui.json
  • 13:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T357189)', diff saved to https://phabricator.wikimedia.org/P57545 and previous config saved to /var/cache/conftool/dbconfig/20240221-130450-arnaudb.json
  • 13:03 aborrero@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "cloudvirt1033 - aborrero@cumin1002"
  • 13:02 aborrero@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "cloudvirt1033 - aborrero@cumin1002"
  • 13:00 samtar@deploy2002: samtar: Continuing with sync
  • 12:57 Daimona: T357007 Running mwscript /home/daimona/GenerateInvitationList.php --wiki=metawiki --listfile=/home/daimona/list.txt (same as current master)
  • 12:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T357189)', diff saved to https://phabricator.wikimedia.org/P57544 and previous config saved to /var/cache/conftool/dbconfig/20240221-125711-arnaudb.json
  • 12:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 12:56 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 12:56 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T357189)', diff saved to https://phabricator.wikimedia.org/P57543 and previous config saved to /var/cache/conftool/dbconfig/20240221-125648-arnaudb.json
  • 12:55 samtar@deploy2002: samtar: Backport for InitialiseSettings: Enable Edit Recovery on 3 projects (T355548) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:53 samtar@deploy2002: Started scap: Backport for InitialiseSettings: Enable Edit Recovery on 3 projects (T355548)
  • 12:52 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1033.eqiad.wmnet with OS bookworm
  • 12:51 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P57542 and previous config saved to /var/cache/conftool/dbconfig/20240221-125121-marostegui.json
  • 12:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P57541 and previous config saved to /var/cache/conftool/dbconfig/20240221-124142-arnaudb.json
  • 12:36 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T355609)', diff saved to https://phabricator.wikimedia.org/P57540 and previous config saved to /var/cache/conftool/dbconfig/20240221-123615-marostegui.json
  • 12:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57539 and previous config saved to /var/cache/conftool/dbconfig/20240221-123439-arnaudb.json
  • 12:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57538 and previous config saved to /var/cache/conftool/dbconfig/20240221-123423-arnaudb.json
  • 12:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2191 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57537 and previous config saved to /var/cache/conftool/dbconfig/20240221-123410-arnaudb.json
  • 12:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P57536 and previous config saved to /var/cache/conftool/dbconfig/20240221-122636-arnaudb.json
  • 12:24 akosiaris@cumin1002: conftool action : set/pooled=true; selector: dnsdisc=mw-parsoid,name=codfw
  • 12:24 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1033.eqiad.wmnet with reason: host reimage
  • 12:22 kart_: Updated cxserver to 2024-02-21-112101-production (T357769)
  • 12:21 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1033.eqiad.wmnet with reason: host reimage
  • 12:21 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2026.codfw.wmnet with OS bookworm
  • 12:20 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 12:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:20 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 12:20 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:20 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 12:20 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 12:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57535 and previous config saved to /var/cache/conftool/dbconfig/20240221-121934-arnaudb.json
  • 12:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57534 and previous config saved to /var/cache/conftool/dbconfig/20240221-121918-arnaudb.json
  • 12:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2191 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57533 and previous config saved to /var/cache/conftool/dbconfig/20240221-121906-arnaudb.json
  • 12:18 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 12:18 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 12:15 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:15 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:15 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2026.codfw.wmnet with OS bookworm
  • 12:15 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2026.codfw.wmnet with OS bookworm
  • 12:14 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 12:14 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 12:13 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:13 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
  • 12:12 claime: mw-page-content-change-enrich: Switch to mw-api-int-async - T357785
  • 12:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T357189)', diff saved to https://phabricator.wikimedia.org/P57532 and previous config saved to /var/cache/conftool/dbconfig/20240221-121129-arnaudb.json
  • 12:10 akosiaris: restart pybal on lvs2013, lvs1019 to pick up mw-parsoid service. T357392
  • 12:09 aborrero@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1033
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2153 (T355609)', diff saved to https://phabricator.wikimedia.org/P57531 and previous config saved to /var/cache/conftool/dbconfig/20240221-120949-marostegui.json
  • 12:09 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 12:09 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 12:09 aborrero@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1033
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T355609)', diff saved to https://phabricator.wikimedia.org/P57530 and previous config saved to /var/cache/conftool/dbconfig/20240221-120927-marostegui.json
  • 12:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57529 and previous config saved to /var/cache/conftool/dbconfig/20240221-120429-arnaudb.json
  • 12:05 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1033.eqiad.wmnet with OS bookworm
  • 12:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57528 and previous config saved to /var/cache/conftool/dbconfig/20240221-120414-arnaudb.json
  • 12:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2191 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57527 and previous config saved to /var/cache/conftool/dbconfig/20240221-120401-arnaudb.json
  • 12:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T357189)', diff saved to https://phabricator.wikimedia.org/P57526 and previous config saved to /var/cache/conftool/dbconfig/20240221-120345-arnaudb.json
  • 12:04 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2026.codfw.wmnet with OS bookworm
  • 12:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 12:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 12:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T357189)', diff saved to https://phabricator.wikimedia.org/P57525 and previous config saved to /var/cache/conftool/dbconfig/20240221-120324-arnaudb.json
  • 12:02 akosiaris: restart pybal on lvs2014 to pick up mw-parsoid service. T357392
  • 12:02 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2026 T358080', diff saved to https://phabricator.wikimedia.org/P57524 and previous config saved to /var/cache/conftool/dbconfig/20240221-120202-root.json
  • 12:01 akosiaris: restart pybal on lvs1020 to pick up mw-parsoid service. T357392
  • 12:00 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57523 and previous config saved to /var/cache/conftool/dbconfig/20240221-120051-root.json
  • 11:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P57522 and previous config saved to /var/cache/conftool/dbconfig/20240221-115421-marostegui.json
  • 11:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2193 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57521 and previous config saved to /var/cache/conftool/dbconfig/20240221-114925-arnaudb.json
  • 11:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2192 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57520 and previous config saved to /var/cache/conftool/dbconfig/20240221-114909-arnaudb.json
  • 11:48 arnaudb@cumin1002: dbctl commit (dc=all): 'db2191 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57519 and previous config saved to /var/cache/conftool/dbconfig/20240221-114856-arnaudb.json
  • 11:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P57518 and previous config saved to /var/cache/conftool/dbconfig/20240221-114817-arnaudb.json
  • 11:45 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57517 and previous config saved to /var/cache/conftool/dbconfig/20240221-114546-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P57516 and previous config saved to /var/cache/conftool/dbconfig/20240221-113914-marostegui.json
  • 11:36 volans@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:36 volans@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Added cassandra IPs for restbase10[34-42] - volans@cumin1002"
  • 11:35 volans@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Added cassandra IPs for restbase10[34-42] - volans@cumin1002"
  • 11:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P57515 and previous config saved to /var/cache/conftool/dbconfig/20240221-113311-arnaudb.json
  • 11:32 volans@cumin1002: START - Cookbook sre.dns.netbox
  • 11:32 volans@cumin1002: END (ERROR) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=97) generate netbox hiera data: "Added cassandra IPs for restbase10[34-42] - volans@cumin1002"
  • 11:32 volans@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Added cassandra IPs for restbase10[34-42] - volans@cumin1002"
  • 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57514 and previous config saved to /var/cache/conftool/dbconfig/20240221-113041-root.json
  • 11:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T355609)', diff saved to https://phabricator.wikimedia.org/P57513 and previous config saved to /var/cache/conftool/dbconfig/20240221-112408-marostegui.json
  • 11:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T357189)', diff saved to https://phabricator.wikimedia.org/P57512 and previous config saved to /var/cache/conftool/dbconfig/20240221-111805-arnaudb.json
  • 11:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1151.eqiad.wmnet
  • 11:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2191.codfw.wmnet
  • 11:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2192.codfw.wmnet
  • 11:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2193.codfw.wmnet
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57511 and previous config saved to /var/cache/conftool/dbconfig/20240221-111536-root.json
  • 11:13 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57510 and previous config saved to /var/cache/conftool/dbconfig/20240221-111348-root.json
  • 11:13 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2192.codfw.wmnet
  • 11:12 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2193.codfw.wmnet
  • 11:12 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1151.eqiad.wmnet
  • 11:12 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2191.codfw.wmnet
  • 11:12 arnaudb@cumin1002: dbctl commit (dc=all): 'T356240 - depooling db2191 db2192 db2193 db1151', diff saved to https://phabricator.wikimedia.org/P57508 and previous config saved to /var/cache/conftool/dbconfig/20240221-111023-arnaudb.json
  • 11:11 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db[2191-2193].codfw.wmnet,db1151.eqiad.wmnet with reason: Silence for reboot T356240
  • 11:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T357189)', diff saved to https://phabricator.wikimedia.org/P57507 and previous config saved to /var/cache/conftool/dbconfig/20240221-111012-arnaudb.json
  • 11:11 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 11:11 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db[2191-2193].codfw.wmnet,db1151.eqiad.wmnet with reason: Silence for reboot T356240
  • 11:10 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 11:10 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 11:10 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 11:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T357189)', diff saved to https://phabricator.wikimedia.org/P57506 and previous config saved to /var/cache/conftool/dbconfig/20240221-110951-arnaudb.json
  • 11:09 stran@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
  • 11:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver1001.eqiad.wmnet
  • 11:08 stran@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
  • 11:08 stran@deploy2002: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
  • 11:07 stran@deploy2002: helmfile [eqiad] START helmfile.d/services/ipoid: apply
  • 11:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver1001.eqiad.wmnet
  • 11:05 stran@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
  • 11:05 stran@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
  • 11:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetserver2002.codfw.wmnet
  • 11:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetserver2002.codfw.wmnet
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57505 and previous config saved to /var/cache/conftool/dbconfig/20240221-110031-root.json
  • 10:58 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57504 and previous config saved to /var/cache/conftool/dbconfig/20240221-105844-root.json
  • 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2146 (T355609)', diff saved to https://phabricator.wikimedia.org/P57503 and previous config saved to /var/cache/conftool/dbconfig/20240221-105654-marostegui.json
  • 10:56 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 10:56 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 10:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T355609)', diff saved to https://phabricator.wikimedia.org/P57502 and previous config saved to /var/cache/conftool/dbconfig/20240221-105630-marostegui.json
  • 10:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P57501 and previous config saved to /var/cache/conftool/dbconfig/20240221-105445-arnaudb.json
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'es2031 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57500 and previous config saved to /var/cache/conftool/dbconfig/20240221-104526-root.json
  • 10:43 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57499 and previous config saved to /var/cache/conftool/dbconfig/20240221-104339-root.json
  • 10:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P57498 and previous config saved to /var/cache/conftool/dbconfig/20240221-104124-marostegui.json
  • 10:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P57497 and previous config saved to /var/cache/conftool/dbconfig/20240221-103938-arnaudb.json
  • 10:37 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 10:36 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 10:36 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 10:35 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 10:35 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2031.codfw.wmnet with OS bookworm
  • 10:34 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 10:34 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 10:32 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 10:32 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply
  • 10:28 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57496 and previous config saved to /var/cache/conftool/dbconfig/20240221-102833-root.json
  • 10:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P57495 and previous config saved to /var/cache/conftool/dbconfig/20240221-102618-marostegui.json
  • 10:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T357189)', diff saved to https://phabricator.wikimedia.org/P57494 and previous config saved to /var/cache/conftool/dbconfig/20240221-102432-arnaudb.json
  • 10:16 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T357189)', diff saved to https://phabricator.wikimedia.org/P57493 and previous config saved to /var/cache/conftool/dbconfig/20240221-101646-arnaudb.json
  • 10:16 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 10:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 10:14 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2031.codfw.wmnet with reason: host reimage
  • 10:13 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57492 and previous config saved to /var/cache/conftool/dbconfig/20240221-101328-root.json
  • 10:12 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es2031.codfw.wmnet with reason: host reimage
  • 10:12 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1003.eqiad.wmnet with OS bookworm
  • 10:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T355609)', diff saved to https://phabricator.wikimedia.org/P57491 and previous config saved to /var/cache/conftool/dbconfig/20240221-101111-marostegui.json
  • 10:08 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 10:08 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 10:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T357189)', diff saved to https://phabricator.wikimedia.org/P57490 and previous config saved to /var/cache/conftool/dbconfig/20240221-100815-arnaudb.json
  • 09:58 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57489 and previous config saved to /var/cache/conftool/dbconfig/20240221-095823-root.json
  • 09:56 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1003.eqiad.wmnet with reason: host reimage
  • 09:53 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1003.eqiad.wmnet with reason: host reimage
  • 09:53 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es2031.codfw.wmnet with OS bookworm
  • 09:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P57488 and previous config saved to /var/cache/conftool/dbconfig/20240221-095309-arnaudb.json
  • 09:52 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2031 T358080', diff saved to https://phabricator.wikimedia.org/P57487 and previous config saved to /var/cache/conftool/dbconfig/20240221-095205-root.json
  • 09:45 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2145 (T355609)', diff saved to https://phabricator.wikimedia.org/P57486 and previous config saved to /var/cache/conftool/dbconfig/20240221-094516-marostegui.json
  • 09:45 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 09:44 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'es1030 (re)pooling @ 1%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57485 and previous config saved to /var/cache/conftool/dbconfig/20240221-094319-root.json
  • 09:42 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1030.eqiad.wmnet with OS bookworm
  • 09:40 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest1003.eqiad.wmnet with OS bookworm
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P57484 and previous config saved to /var/cache/conftool/dbconfig/20240221-093802-arnaudb.json
  • 09:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1030.eqiad.wmnet with reason: host reimage
  • 09:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1030.eqiad.wmnet with reason: host reimage
  • 09:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 09:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T357189)', diff saved to https://phabricator.wikimedia.org/P57482 and previous config saved to /var/cache/conftool/dbconfig/20240221-092256-arnaudb.json
  • 09:22 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 09:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T355609)', diff saved to https://phabricator.wikimedia.org/P57481 and previous config saved to /var/cache/conftool/dbconfig/20240221-092251-marostegui.json
  • 09:15 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57480 and previous config saved to /var/cache/conftool/dbconfig/20240221-091531-arnaudb.json
  • 09:15 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57479 and previous config saved to /var/cache/conftool/dbconfig/20240221-091521-arnaudb.json
  • 09:15 arnaudb@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57478 and previous config saved to /var/cache/conftool/dbconfig/20240221-091509-arnaudb.json
  • 09:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57477 and previous config saved to /var/cache/conftool/dbconfig/20240221-091449-arnaudb.json
  • 09:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T357189)', diff saved to https://phabricator.wikimedia.org/P57476 and previous config saved to /var/cache/conftool/dbconfig/20240221-091358-arnaudb.json
  • 09:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 09:13 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 09:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57475 and previous config saved to /var/cache/conftool/dbconfig/20240221-091337-arnaudb.json
  • 09:10 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1030.eqiad.wmnet with OS bookworm
  • 09:09 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1030 T358080', diff saved to https://phabricator.wikimedia.org/P57474 and previous config saved to /var/cache/conftool/dbconfig/20240221-090957-root.json
  • 09:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P57473 and previous config saved to /var/cache/conftool/dbconfig/20240221-090744-marostegui.json
  • 09:06 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s2
  • 09:06 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet,service=s7
  • 09:00 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57472 and previous config saved to /var/cache/conftool/dbconfig/20240221-090026-arnaudb.json
  • 09:00 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57471 and previous config saved to /var/cache/conftool/dbconfig/20240221-090016-arnaudb.json
  • 09:00 hashar: Restarted CI Jenkins on contint2002 to update the timestamper plugin
  • 09:00 arnaudb@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57470 and previous config saved to /var/cache/conftool/dbconfig/20240221-090004-arnaudb.json
  • 08:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57469 and previous config saved to /var/cache/conftool/dbconfig/20240221-085944-arnaudb.json
  • 08:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P57468 and previous config saved to /var/cache/conftool/dbconfig/20240221-085830-arnaudb.json
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P57467 and previous config saved to /var/cache/conftool/dbconfig/20240221-085238-marostegui.json
  • 08:45 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57466 and previous config saved to /var/cache/conftool/dbconfig/20240221-084521-arnaudb.json
  • 08:45 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57465 and previous config saved to /var/cache/conftool/dbconfig/20240221-084511-arnaudb.json
  • 08:45 arnaudb@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57464 and previous config saved to /var/cache/conftool/dbconfig/20240221-084459-arnaudb.json
  • 08:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57463 and previous config saved to /var/cache/conftool/dbconfig/20240221-084440-arnaudb.json
  • 08:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P57462 and previous config saved to /var/cache/conftool/dbconfig/20240221-084325-arnaudb.json
  • 08:43 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts sretest2005.codfw.wmnet
  • 08:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:41 ayounsi@cumin1002: START - Cookbook sre.dns.netbox
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T355609)', diff saved to https://phabricator.wikimedia.org/P57461 and previous config saved to /var/cache/conftool/dbconfig/20240221-083731-marostegui.json
  • 08:36 ayounsi@cumin1002: START - Cookbook sre.hosts.decommission for hosts sretest2005.codfw.wmnet
  • 08:30 arnaudb@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57460 and previous config saved to /var/cache/conftool/dbconfig/20240221-083016-arnaudb.json
  • 08:30 arnaudb@cumin1002: dbctl commit (dc=all): 'db2189 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57459 and previous config saved to /var/cache/conftool/dbconfig/20240221-083006-arnaudb.json
  • 08:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2188 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57458 and previous config saved to /var/cache/conftool/dbconfig/20240221-082955-arnaudb.json
  • 08:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57457 and previous config saved to /var/cache/conftool/dbconfig/20240221-082935-arnaudb.json
  • 08:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2180.codfw.wmnet
  • 08:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2190.codfw.wmnet
  • 08:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57456 and previous config saved to /var/cache/conftool/dbconfig/20240221-082818-arnaudb.json
  • 08:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2189.codfw.wmnet
  • 08:28 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2188.codfw.wmnet
  • 08:23 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2189.codfw.wmnet
  • 08:23 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2190.codfw.wmnet
  • 08:23 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2188.codfw.wmnet
  • 08:23 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db2180.codfw.wmnet
  • 08:22 arnaudb@cumin1002: dbctl commit (dc=all): 'db2180 db2188 db2189 db2190 depool for T356240', diff saved to https://phabricator.wikimedia.org/P57455 and previous config saved to /var/cache/conftool/dbconfig/20240221-082219-arnaudb.json
  • 08:21 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db[2180,2188-2190].codfw.wmnet with reason: Silence for reboot T356240
  • 08:21 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db[2180,2188-2190].codfw.wmnet with reason: Silence for reboot T356240
  • 08:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57454 and previous config saved to /var/cache/conftool/dbconfig/20240221-082029-arnaudb.json
  • 08:20 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:20 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:20 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 08:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2130 (T355609)', diff saved to https://phabricator.wikimedia.org/P57452 and previous config saved to /var/cache/conftool/dbconfig/20240221-080836-marostegui.json
  • 08:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 08:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T355609)', diff saved to https://phabricator.wikimedia.org/P57451 and previous config saved to /var/cache/conftool/dbconfig/20240221-080814-marostegui.json
  • 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P57450 and previous config saved to /var/cache/conftool/dbconfig/20240221-075307-marostegui.json
  • 07:44 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57449 and previous config saved to /var/cache/conftool/dbconfig/20240221-074452-root.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P57448 and previous config saved to /var/cache/conftool/dbconfig/20240221-073801-marostegui.json
  • 07:29 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57447 and previous config saved to /var/cache/conftool/dbconfig/20240221-072948-root.json
  • 07:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T355609)', diff saved to https://phabricator.wikimedia.org/P57446 and previous config saved to /var/cache/conftool/dbconfig/20240221-072255-marostegui.json
  • 07:14 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57445 and previous config saved to /var/cache/conftool/dbconfig/20240221-071443-root.json
  • 06:59 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57444 and previous config saved to /var/cache/conftool/dbconfig/20240221-065938-root.json
  • 06:55 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2116 (T355609)', diff saved to https://phabricator.wikimedia.org/P57443 and previous config saved to /var/cache/conftool/dbconfig/20240221-065508-marostegui.json
  • 06:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 06:54 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T355609)', diff saved to https://phabricator.wikimedia.org/P57442 and previous config saved to /var/cache/conftool/dbconfig/20240221-065447-marostegui.json
  • 06:44 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57441 and previous config saved to /var/cache/conftool/dbconfig/20240221-064433-root.json
  • 06:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P57440 and previous config saved to /var/cache/conftool/dbconfig/20240221-063940-marostegui.json
  • 06:29 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57439 and previous config saved to /var/cache/conftool/dbconfig/20240221-062928-root.json
  • 06:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P57438 and previous config saved to /var/cache/conftool/dbconfig/20240221-062434-marostegui.json
  • 06:13 marostegui@cumin1002: dbctl commit (dc=all): 'es1026 (re)pooling @ 1%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57437 and previous config saved to /var/cache/conftool/dbconfig/20240221-061325-root.json
  • 06:11 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1026.eqiad.wmnet with OS bookworm
  • 06:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T355609)', diff saved to https://phabricator.wikimedia.org/P57436 and previous config saved to /var/cache/conftool/dbconfig/20240221-060928-marostegui.json
  • 05:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1026.eqiad.wmnet with reason: host reimage
  • 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on es1026.eqiad.wmnet with reason: host reimage
  • 05:45 kart_: Updated MinT to 2024-02-20-062448-production (T333969, T354666)
  • 05:42 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
  • 05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db2103 (T355609)', diff saved to https://phabricator.wikimedia.org/P57435 and previous config saved to /var/cache/conftool/dbconfig/20240221-054136-marostegui.json
  • 05:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 05:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
  • 05:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host es1026.eqiad.wmnet with OS bookworm
  • 05:38 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1026 T358080', diff saved to https://phabricator.wikimedia.org/P57434 and previous config saved to /var/cache/conftool/dbconfig/20240221-053822-root.json
  • 05:33 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
  • 05:21 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s7
  • 05:21 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet,service=s2
  • 05:21 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
  • 05:14 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
  • 05:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 05:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 05:13 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
  • 05:09 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
  • 05:02 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2220.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:58 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2219.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2217.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2218.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:43 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2216.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:42 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2220.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:41 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:41 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2220 to codfw - jhancock@cumin2002"
  • 04:41 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2220 to codfw - jhancock@cumin2002"
  • 04:39 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:36 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2215.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:36 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2219.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:35 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:35 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2219 to codfw - jhancock@cumin2002"
  • 04:34 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2219 to codfw - jhancock@cumin2002"
  • 04:32 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:31 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 04:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2218.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:30 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:30 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2218 to codfw - jhancock@cumin2002"
  • 04:30 rzl@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 04:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2218 to codfw - jhancock@cumin2002"
  • 04:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:25 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2217.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:25 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:25 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2217 to codfw - jhancock@cumin2002"
  • 04:24 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2217 to codfw - jhancock@cumin2002"
  • 04:23 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2214.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:22 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:21 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2216.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2213.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:20 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:20 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2216 to codfw - jhancock@cumin2002"
  • 04:19 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2216 to codfw - jhancock@cumin2002"
  • 04:18 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2212.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:18 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:15 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2215.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:15 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:15 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2215 to codfw - jhancock@cumin2002"
  • 04:14 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2215 to codfw - jhancock@cumin2002"
  • 04:14 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2211.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:12 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:10 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2214.mgmt.codfw.wmnet with reboot policy FORCED
  • 04:09 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:09 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2214 to codfw - jhancock@cumin2002"
  • 04:08 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2214 to codfw - jhancock@cumin2002"
  • 04:06 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 04:00 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 04:00 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2213 to codfw - jhancock@cumin2002"
  • 03:59 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2213 to codfw - jhancock@cumin2002"
  • 03:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2212.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:57 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 03:56 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 03:56 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2212 to codfw - jhancock@cumin2002"
  • 03:55 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2212 to codfw - jhancock@cumin2002"
  • 03:55 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 03:54 rzl@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 03:53 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 03:52 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2209.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2211.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:51 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2210.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 03:42 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2211 to codfw - jhancock@cumin2002"
  • 03:41 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2211 to codfw - jhancock@cumin2002"
  • 03:39 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 03:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2210.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:36 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 03:36 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2210 to codfw - jhancock@cumin2002"
  • 03:35 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2210 to codfw - jhancock@cumin2002"
  • 03:33 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 03:31 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2209.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:30 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 03:30 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2209 to codfw - jhancock@cumin2002"
  • 03:29 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2209 to codfw - jhancock@cumin2002"
  • 03:29 rzl@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 03:28 rzl@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 03:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 03:26 rzl@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 03:26 rzl@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 03:25 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2208.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2206.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:15 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2207.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2208.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:01 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2207.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2208.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2207.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:00 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2208.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:00 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2207.mgmt.codfw.wmnet with reboot policy FORCED
  • 03:00 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2206.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:59 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 02:59 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2206 to codfw - jhancock@cumin2002"
  • 02:58 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2206 to codfw - jhancock@cumin2002"
  • 02:56 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 02:49 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:40 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:29 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 02:23 rzl@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 02:22 rzl@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 02:20 rzl@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 02:20 rzl@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 02:11 rzl@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 02:10 rzl@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 00:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 00:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 00:00 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1240.eqiad.wmnet with reason: Maintenance

2024-02-20

  • 23:59 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 23:38 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 23:38 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 23:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T355609)', diff saved to https://phabricator.wikimedia.org/P57433 and previous config saved to /var/cache/conftool/dbconfig/20240220-233832-marostegui.json
  • 23:25 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:24 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P57432 and previous config saved to /var/cache/conftool/dbconfig/20240220-232326-marostegui.json
  • 23:23 logmsgbot: @deploy2002 helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:23 logmsgbot: @deploy2002 helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P57431 and previous config saved to /var/cache/conftool/dbconfig/20240220-230817-marostegui.json
  • 22:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T355609)', diff saved to https://phabricator.wikimedia.org/P57430 and previous config saved to /var/cache/conftool/dbconfig/20240220-225311-marostegui.json
  • 22:52 sfaci: Deployed refinery using scap, then deployed onto hdfs
  • 22:39 sfaci@deploy2002: Finished deploy [analytics/refinery@d078656] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d0786561] (duration: 03m 29s)
  • 22:36 sfaci@deploy2002: Started deploy [analytics/refinery@d078656] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d0786561]
  • 22:36 sfaci@deploy2002: Finished deploy [analytics/refinery@d078656] (thin): Regular analytics weekly train THIN [analytics/refinery@d0786561] (duration: 00m 05s)
  • 22:35 sfaci@deploy2002: Started deploy [analytics/refinery@d078656] (thin): Regular analytics weekly train THIN [analytics/refinery@d0786561]
  • 22:35 sfaci@deploy2002: Finished deploy [analytics/refinery@d078656]: Regular analytics weekly train [analytics/refinery@d0786561] (duration: 00m 21s)
  • 22:35 sfaci@deploy2002: Started deploy [analytics/refinery@d078656]: Regular analytics weekly train [analytics/refinery@d0786561]
  • 22:34 sfaci@deploy2002: Finished deploy [analytics/refinery@d078656]: Regular analytics weekly train [analytics/refinery@d0786561] (duration: 13m 19s)
  • 22:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1235 (T355609)', diff saved to https://phabricator.wikimedia.org/P57429 and previous config saved to /var/cache/conftool/dbconfig/20240220-222445-marostegui.json
  • 22:24 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 22:24 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 22:24 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T355609)', diff saved to https://phabricator.wikimedia.org/P57428 and previous config saved to /var/cache/conftool/dbconfig/20240220-222423-marostegui.json
  • 22:20 sfaci@deploy2002: Started deploy [analytics/refinery@d078656]: Regular analytics weekly train [analytics/refinery@d0786561]
  • 22:18 sfaci: Starting refinery deployment
  • 22:09 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P57427 and previous config saved to /var/cache/conftool/dbconfig/20240220-220917-marostegui.json
  • 22:00 cjming: end of UTC late backport window
  • 21:58 cjming@deploy2002: Finished scap: Backport for Fix for regression in audio track suppression logic (T357942), Fix for regression in audio track suppression logic (T357942) (duration: 09m 24s)
  • 21:56 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:56 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:54 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P57426 and previous config saved to /var/cache/conftool/dbconfig/20240220-215410-marostegui.json
  • 21:51 cjming@deploy2002: brion and cjming: Continuing with sync
  • 21:50 cjming@deploy2002: brion and cjming: Backport for Fix for regression in audio track suppression logic (T357942), Fix for regression in audio track suppression logic (T357942) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:49 cjming@deploy2002: Started scap: Backport for Fix for regression in audio track suppression logic (T357942), Fix for regression in audio track suppression logic (T357942)
  • 21:48 cjming@deploy2002: Finished scap: Backport for Enable night mode on mobile test servers (T357759) (duration: 11m 01s)
  • 21:48 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:48 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:47 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:47 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:47 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:47 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:42 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:42 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 21:40 cjming@deploy2002: cjming and jdlrobson: Continuing with sync
  • 21:39 cjming@deploy2002: cjming and jdlrobson: Backport for Enable night mode on mobile test servers (T357759) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:39 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T355609)', diff saved to https://phabricator.wikimedia.org/P57424 and previous config saved to /var/cache/conftool/dbconfig/20240220-213904-marostegui.json
  • 21:37 cjming@deploy2002: Started scap: Backport for Enable night mode on mobile test servers (T357759)
  • 21:35 cjming@deploy2002: Finished scap: Backport for Enable desktop diff for anonymous users on enwiki (T350181) (duration: 13m 19s)
  • 21:30 ryankemper@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 21:28 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:28 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:27 cjming@deploy2002: jdlrobson and cjming: Continuing with sync
  • 21:24 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:24 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 21:23 cjming@deploy2002: jdlrobson and cjming: Backport for Enable desktop diff for anonymous users on enwiki (T350181) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:22 cjming@deploy2002: Started scap: Backport for Enable desktop diff for anonymous users on enwiki (T350181)
  • 21:20 cjming@deploy2002: Finished scap: Backport for Correctly turn on Parsoid read views by default on wikitech Talk pages (duration: 12m 53s)
  • 21:11 cjming@deploy2002: cscott and cjming: Continuing with sync
  • 21:08 cjming@deploy2002: cscott and cjming: Backport for Correctly turn on Parsoid read views by default on wikitech Talk pages synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T355609)', diff saved to https://phabricator.wikimedia.org/P57423 and previous config saved to /var/cache/conftool/dbconfig/20240220-210840-marostegui.json
  • 21:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 21:08 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 21:08 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T355609)', diff saved to https://phabricator.wikimedia.org/P57422 and previous config saved to /var/cache/conftool/dbconfig/20240220-210819-marostegui.json
  • 21:07 cjming@deploy2002: Started scap: Backport for Correctly turn on Parsoid read views by default on wikitech Talk pages
  • 21:04 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 21:04 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:56 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 20:56 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 20:56 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 20:55 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 20:55 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 20:55 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 20:53 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P57421 and previous config saved to /var/cache/conftool/dbconfig/20240220-205312-marostegui.json
  • 20:38 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P57420 and previous config saved to /var/cache/conftool/dbconfig/20240220-203806-marostegui.json
  • 20:35 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 20:35 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 20:32 ryankemper@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudelastic[1001-1004].wikimedia.org
  • 20:32 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:32 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic[1001-1004].wikimedia.org decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
  • 20:31 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:31 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:30 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic[1001-1004].wikimedia.org decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
  • 20:27 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
  • 20:23 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T355609)', diff saved to https://phabricator.wikimedia.org/P57419 and previous config saved to /var/cache/conftool/dbconfig/20240220-202300-marostegui.json
  • 20:01 ryankemper@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic[1001-1004].wikimedia.org
  • 19:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T355609)', diff saved to https://phabricator.wikimedia.org/P57417 and previous config saved to /var/cache/conftool/dbconfig/20240220-195303-marostegui.json
  • 19:52 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 19:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 19:52 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T355609)', diff saved to https://phabricator.wikimedia.org/P57416 and previous config saved to /var/cache/conftool/dbconfig/20240220-195242-marostegui.json
  • 19:48 ryankemper@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T347624, testing 961878 patch) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:48 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T347624, testing 961878 patch) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet w/ force delete existing files, repooling source-only afterwards
  • 19:43 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
  • 19:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57415 and previous config saved to /var/cache/conftool/dbconfig/20240220-193842-arnaudb.json
  • 19:37 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P57414 and previous config saved to /var/cache/conftool/dbconfig/20240220-193735-marostegui.json
  • 19:36 cdanis@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 19:35 cdanis@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 19:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P57413 and previous config saved to /var/cache/conftool/dbconfig/20240220-192335-arnaudb.json
  • 19:22 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228', diff saved to https://phabricator.wikimedia.org/P57412 and previous config saved to /var/cache/conftool/dbconfig/20240220-192229-marostegui.json
  • 19:12 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.42.0-wmf.19 refs T354437
  • 19:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P57411 and previous config saved to /var/cache/conftool/dbconfig/20240220-190829-arnaudb.json
  • 19:07 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1228 (T355609)', diff saved to https://phabricator.wikimedia.org/P57410 and previous config saved to /var/cache/conftool/dbconfig/20240220-190722-marostegui.json
  • 18:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57409 and previous config saved to /var/cache/conftool/dbconfig/20240220-185322-arnaudb.json
  • 18:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T357189)', diff saved to https://phabricator.wikimedia.org/P57408 and previous config saved to /var/cache/conftool/dbconfig/20240220-184925-arnaudb.json
  • 18:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 18:49 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 18:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57407 and previous config saved to /var/cache/conftool/dbconfig/20240220-184903-arnaudb.json
  • 18:42 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1228 (T355609)', diff saved to https://phabricator.wikimedia.org/P57406 and previous config saved to /var/cache/conftool/dbconfig/20240220-184157-marostegui.json
  • 18:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 18:41 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1228.eqiad.wmnet with reason: Maintenance
  • 18:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T355609)', diff saved to https://phabricator.wikimedia.org/P57405 and previous config saved to /var/cache/conftool/dbconfig/20240220-184124-marostegui.json
  • 18:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P57404 and previous config saved to /var/cache/conftool/dbconfig/20240220-183356-arnaudb.json
  • 18:31 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4052.ulsfo.wmnet,service=(cdn|ats-be)
  • 18:31 sukhe: pool cp4052: bookworm cp host with haproxy 2.6 built against OpenSSL 1.1.1: T352744
  • 18:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P57403 and previous config saved to /var/cache/conftool/dbconfig/20240220-182617-marostegui.json
  • 18:22 sukhe: reprepro -C component/haproxy26 include bookworm-wikimedia haproxy_2.6.16-1~bpo12+1_amd64.changes: T352744
  • 18:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P57402 and previous config saved to /var/cache/conftool/dbconfig/20240220-181850-arnaudb.json
  • 18:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P57401 and previous config saved to /var/cache/conftool/dbconfig/20240220-181111-marostegui.json
  • 18:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57400 and previous config saved to /var/cache/conftool/dbconfig/20240220-180342-arnaudb.json
  • 17:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T357189)', diff saved to https://phabricator.wikimedia.org/P57399 and previous config saved to /var/cache/conftool/dbconfig/20240220-175938-arnaudb.json
  • 17:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 17:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 17:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T357189)', diff saved to https://phabricator.wikimedia.org/P57398 and previous config saved to /var/cache/conftool/dbconfig/20240220-175917-arnaudb.json
  • 17:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T355609)', diff saved to https://phabricator.wikimedia.org/P57397 and previous config saved to /var/cache/conftool/dbconfig/20240220-175605-marostegui.json
  • 17:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P57396 and previous config saved to /var/cache/conftool/dbconfig/20240220-174411-arnaudb.json
  • 17:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P57395 and previous config saved to /var/cache/conftool/dbconfig/20240220-172904-arnaudb.json
  • 17:27 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T355609)', diff saved to https://phabricator.wikimedia.org/P57394 and previous config saved to /var/cache/conftool/dbconfig/20240220-172716-marostegui.json
  • 17:27 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 17:27 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 17:26 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T355609)', diff saved to https://phabricator.wikimedia.org/P57393 and previous config saved to /var/cache/conftool/dbconfig/20240220-172653-marostegui.json
  • 17:18 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS bookworm
  • 17:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T357189)', diff saved to https://phabricator.wikimedia.org/P57392 and previous config saved to /var/cache/conftool/dbconfig/20240220-171358-arnaudb.json
  • 17:11 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P57391 and previous config saved to /var/cache/conftool/dbconfig/20240220-171147-marostegui.json
  • 17:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2171 (T357189)', diff saved to https://phabricator.wikimedia.org/P57390 and previous config saved to /var/cache/conftool/dbconfig/20240220-170949-arnaudb.json
  • 17:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 17:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 17:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T357189)', diff saved to https://phabricator.wikimedia.org/P57389 and previous config saved to /var/cache/conftool/dbconfig/20240220-170928-arnaudb.json
  • 16:57 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
  • 16:56 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P57388 and previous config saved to /var/cache/conftool/dbconfig/20240220-165641-marostegui.json
  • 16:55 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
  • 16:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P57387 and previous config saved to /var/cache/conftool/dbconfig/20240220-165421-arnaudb.json
  • 16:43 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2005.codfw.wmnet with OS bookworm
  • 16:43 brett@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,name=cp20(29|30).codfw.wmnet
  • 16:42 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp[2029-2030].codfw.wmnet
  • 16:42 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp[2029-2030].codfw.wmnet
  • 16:41 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T355609)', diff saved to https://phabricator.wikimedia.org/P57386 and previous config saved to /var/cache/conftool/dbconfig/20240220-164134-marostegui.json
  • 16:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P57385 and previous config saved to /var/cache/conftool/dbconfig/20240220-163915-arnaudb.json
  • 16:35 reedy@deploy2002: Synchronized php-1.42.0-wmf.19/extensions/AntiSpoof/: T357995 (duration: 11m 02s)
  • 16:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db1233 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57384 and previous config saved to /var/cache/conftool/dbconfig/20240220-163451-arnaudb.json
  • 16:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57383 and previous config saved to /var/cache/conftool/dbconfig/20240220-163447-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57382 and previous config saved to /var/cache/conftool/dbconfig/20240220-163447-arnaudb.json
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 100%: maintenance done', diff saved to https://phabricator.wikimedia.org/P57381 and previous config saved to /var/cache/conftool/dbconfig/20240220-163442-arnaudb.json
  • 16:30 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
  • 16:29 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS bookworm
  • 16:27 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4052.ulsfo.wmnet with OS bookworm
  • 16:24 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
  • 16:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T357189)', diff saved to https://phabricator.wikimedia.org/P57380 and previous config saved to /var/cache/conftool/dbconfig/20240220-162408-arnaudb.json
  • 16:21 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T357189)', diff saved to https://phabricator.wikimedia.org/P57379 and previous config saved to /var/cache/conftool/dbconfig/20240220-161953-arnaudb.json
  • 16:20 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db1233 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57378 and previous config saved to /var/cache/conftool/dbconfig/20240220-161946-arnaudb.json
  • 16:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57377 and previous config saved to /var/cache/conftool/dbconfig/20240220-161942-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57376 and previous config saved to /var/cache/conftool/dbconfig/20240220-161942-arnaudb.json
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 75%: maintenance done', diff saved to https://phabricator.wikimedia.org/P57375 and previous config saved to /var/cache/conftool/dbconfig/20240220-161937-arnaudb.json
  • 16:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T357189)', diff saved to https://phabricator.wikimedia.org/P57374 and previous config saved to /var/cache/conftool/dbconfig/20240220-161931-arnaudb.json
  • 16:18 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS bookworm
  • 16:14 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
  • 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T355609)', diff saved to https://phabricator.wikimedia.org/P57373 and previous config saved to /var/cache/conftool/dbconfig/20240220-161348-marostegui.json
  • 16:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 16:13 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 16:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T355609)', diff saved to https://phabricator.wikimedia.org/P57372 and previous config saved to /var/cache/conftool/dbconfig/20240220-161326-marostegui.json
  • 16:12 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
  • 16:11 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in search_codfw
  • 16:11 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in search_codfw
  • 16:09 hnowlan@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2312.codfw.wmnet|mw2313.codfw.wmnet|mw2367.codfw.wmnet|mw2369.codfw.wmnet)
  • 16:07 topranks: Commencing network maintenance migrating servers to new switch codfw rack A7 T355867
  • 16:06 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 22 hosts with reason: Migrating servers in codfw rack A7 to lsw1-a7-codfw
  • 16:06 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 22 hosts with reason: Migrating servers in codfw rack A7 to lsw1-a7-codfw
  • 16:05 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
  • 16:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57371 and previous config saved to /var/cache/conftool/dbconfig/20240220-160438-arnaudb.json
  • 16:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57370 and previous config saved to /var/cache/conftool/dbconfig/20240220-160437-arnaudb.json
  • 16:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db1226 (re)pooling @ 50%: maintenance done', diff saved to https://phabricator.wikimedia.org/P57369 and previous config saved to /var/cache/conftool/dbconfig/20240220-160432-arnaudb.json
  • 16:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db1233 (re)pooling @ 50%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57368 and previous config saved to /var/cache/conftool/dbconfig/20240220-160429-arnaudb.json
  • 16:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P57367 and previous config saved to /var/cache/conftool/dbconfig/20240220-160423-arnaudb.json
  • 16:02 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a7-codfw.mgmt with reason: prepping for server uplink migration codfw rack a7
  • 16:02 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a7-codfw.mgmt with reason: prepping for server uplink migration codfw rack a7
  • 16:02 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2005.codfw.wmnet with reason: host reimage
  • 16:00 hnowlan: running `homer 'cr*codfw*' commit 'T351074'` for new k8s workers
  • 16:00 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: elastic2089*,elastic2062*,elastic2061* for switch maintenance - bking@cumin2002 - T355860
  • 16:00 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: elastic2089*,elastic2062*,elastic2061* for switch maintenance - bking@cumin2002 - T355860
  • 15:59 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2005.codfw.wmnet with reason: host reimage
  • 15:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P57366 and previous config saved to /var/cache/conftool/dbconfig/20240220-155820-marostegui.json
  • 15:55 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@b115452]: (no justification provided) (duration: 00m 34s)
  • 15:55 Emperor: import ceph-reef packages to apt1001 T279621
  • 15:55 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@b115452]: (no justification provided)
  • 15:54 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 15:53 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 15:53 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 15:50 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 15:50 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 15:49 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db1233 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57365 and previous config saved to /var/cache/conftool/dbconfig/20240220-154924-arnaudb.json
  • 15:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db1210 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57364 and previous config saved to /var/cache/conftool/dbconfig/20240220-154920-arnaudb.json
  • 15:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db1168 (re)pooling @ 20%: Maintenance done', diff saved to https://phabricator.wikimedia.org/P57363 and previous config saved to /var/cache/conftool/dbconfig/20240220-154920-arnaudb.json
  • 15:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P57362 and previous config saved to /var/cache/conftool/dbconfig/20240220-154917-arnaudb.json
  • 15:46 denisse: When doing the alert hosts upgrade we encountered some issues that prevented us from properly reimaging the hosts to proceed with the upgrade. We're investigating this issue and will inform of the new alert hosts upgrade date ASAP. - T333615
  • 15:46 godog: re-enable meta-monitoring on wikitech-static.w.o - T333615
  • 15:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P57361 and previous config saved to /var/cache/conftool/dbconfig/20240220-154313-marostegui.json
  • 15:42 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1233.eqiad.wmnet
  • 15:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1168.eqiad.wmnet
  • 15:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1226.eqiad.wmnet
  • 15:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1210.eqiad.wmnet
  • 15:37 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1233.eqiad.wmnet
  • 15:37 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1226.eqiad.wmnet
  • 15:37 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1210.eqiad.wmnet
  • 15:36 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1168.eqiad.wmnet
  • 15:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db1168 db1210 db1226 db1233 depool for T356240', diff saved to https://phabricator.wikimedia.org/P57359 and previous config saved to /var/cache/conftool/dbconfig/20240220-153557-arnaudb.json
  • 15:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T357189)', diff saved to https://phabricator.wikimedia.org/P57358 and previous config saved to /var/cache/conftool/dbconfig/20240220-153410-arnaudb.json
  • 15:33 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db[1168,1210,1226,1233].eqiad.wmnet with reason: Silence for reboot T356240
  • 15:33 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on db[1168,1210,1226,1233].eqiad.wmnet with reason: Silence for reboot T356240
  • 15:32 godog: temp disable meta-monitoring on wikitech-static.w.o - T333615
  • 15:30 Emperor: import ceph-reef packages to apt1001 T279621
  • 15:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2128 (T357189)', diff saved to https://phabricator.wikimedia.org/P57357 and previous config saved to /var/cache/conftool/dbconfig/20240220-153000-arnaudb.json
  • 15:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 15:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 15:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 15:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
  • 15:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T357189)', diff saved to https://phabricator.wikimedia.org/P57356 and previous config saved to /var/cache/conftool/dbconfig/20240220-152933-arnaudb.json
  • 15:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T355609)', diff saved to https://phabricator.wikimedia.org/P57355 and previous config saved to /var/cache/conftool/dbconfig/20240220-152807-marostegui.json
  • 15:25 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 15:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 100%: After migration', diff saved to https://phabricator.wikimedia.org/P57354 and previous config saved to /var/cache/conftool/dbconfig/20240220-151812-root.json
  • 15:16 dcausse: depooled wdqs2009 & wdqs2020 (T355867)
  • 15:16 denisse_: starting the Alert hosts upgrade to Bookworm - T333615
  • 15:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P57353 and previous config saved to /var/cache/conftool/dbconfig/20240220-151426-arnaudb.json
  • 15:13 ayounsi@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2005.codfw.wmnet with OS bookworm
  • 15:13 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db[2146,2151].codfw.wmnet
  • 14:55 bking@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:55 bking@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57346 and previous config saved to /var/cache/conftool/dbconfig/20240220-145124-root.json
  • 14:50 bking@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:49 sukhe: disable puppet on A:cp to merge CR 1004126
  • 14:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp[2029-2030].codfw.wmnet with reason: T355867
  • 14:49 bking@deploy2002: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:49 brett@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on cp[2029-2030].codfw.wmnet with reason: T355867
  • 14:48 arnaudb@cumin1002: END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1231.eqiad.wmnet
  • 14:48 bking@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:48 bking@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:48 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 50%: After migration', diff saved to https://phabricator.wikimedia.org/P57345 and previous config saved to /var/cache/conftool/dbconfig/20240220-144803-root.json
  • 14:48 bking@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 14:48 brett@puppetmaster1001: conftool action : set/pooled=no; selector: dc=codfw,name=cp20(29|30).codfw.wmnet
  • 14:48 bking@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 14:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57344 and previous config saved to /var/cache/conftool/dbconfig/20240220-144753-root.json
  • 14:46 sukhe: updating pdns-recursor to 4.8.6-1 on dns*
  • 14:45 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P57343 and previous config saved to /var/cache/conftool/dbconfig/20240220-144539-marostegui.json
  • 14:44 arnaudb@cumin1002: START - Cookbook sre.mysql.upgrade for db1231.eqiad.wmnet
  • 14:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T357189)', diff saved to https://phabricator.wikimedia.org/P57342 and previous config saved to /var/cache/conftool/dbconfig/20240220-144414-arnaudb.json
  • 14:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2123 (T357189)', diff saved to https://phabricator.wikimedia.org/P57341 and previous config saved to /var/cache/conftool/dbconfig/20240220-144001-arnaudb.json
  • 14:40 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 14:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
  • 14:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T357189)', diff saved to https://phabricator.wikimedia.org/P57340 and previous config saved to /var/cache/conftool/dbconfig/20240220-143939-arnaudb.json
  • 14:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57339 and previous config saved to /var/cache/conftool/dbconfig/20240220-143619-root.json
  • 14:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 25%: After migration', diff saved to https://phabricator.wikimedia.org/P57338 and previous config saved to /var/cache/conftool/dbconfig/20240220-143258-root.json
  • 14:32 marostegui@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57337 and previous config saved to /var/cache/conftool/dbconfig/20240220-143249-root.json
  • 14:30 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P57336 and previous config saved to /var/cache/conftool/dbconfig/20240220-143032-marostegui.json
  • 14:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P57334 and previous config saved to /var/cache/conftool/dbconfig/20240220-142433-arnaudb.json
  • 14:21 claime: launching build-production-images - T342346
  • 14:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57333 and previous config saved to /var/cache/conftool/dbconfig/20240220-142114-root.json
  • 14:20 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 14:19 ayounsi@cumin1002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest2005.codfw.wmnet with OS bookworm
  • 14:18 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 10%: After migration', diff saved to https://phabricator.wikimedia.org/P57332 and previous config saved to /var/cache/conftool/dbconfig/20240220-141752-root.json
  • 14:17 marostegui@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57331 and previous config saved to /var/cache/conftool/dbconfig/20240220-141744-root.json
  • 14:15 claime: Uncordoning mw2379
  • 14:15 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T355609)', diff saved to https://phabricator.wikimedia.org/P57330 and previous config saved to /var/cache/conftool/dbconfig/20240220-141525-marostegui.json
  • 14:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P57329 and previous config saved to /var/cache/conftool/dbconfig/20240220-140926-arnaudb.json
  • 14:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57328 and previous config saved to /var/cache/conftool/dbconfig/20240220-140609-root.json
  • 14:05 ayounsi@cumin1002: START - Cookbook sre.hosts.reimage for host sretest2005.codfw.wmnet with OS bookworm
  • 14:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 5%: After migration', diff saved to https://phabricator.wikimedia.org/P57327 and previous config saved to /var/cache/conftool/dbconfig/20240220-140247-root.json
  • 14:02 marostegui@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57326 and previous config saved to /var/cache/conftool/dbconfig/20240220-140239-root.json
  • 13:55 ayounsi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2005.codfw.wmnet with reason: sretest
  • 13:55 ayounsi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2005.codfw.wmnet with reason: sretest
  • 13:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T357189)', diff saved to https://phabricator.wikimedia.org/P57325 and previous config saved to /var/cache/conftool/dbconfig/20240220-135420-arnaudb.json
  • 13:54 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet,service=s5
  • 13:54 marostegui@cumin1002: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet,service=s8
  • 13:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2194 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57324 and previous config saved to /var/cache/conftool/dbconfig/20240220-135104-root.json
  • 13:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2111 (T357189)', diff saved to https://phabricator.wikimedia.org/P57323 and previous config saved to /var/cache/conftool/dbconfig/20240220-134958-arnaudb.json
  • 13:49 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 13:49 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
  • 13:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 1%: After migration', diff saved to https://phabricator.wikimedia.org/P57322 and previous config saved to /var/cache/conftool/dbconfig/20240220-134742-root.json
  • 13:47 marostegui@cumin1002: dbctl commit (dc=all): 'db2190 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57321 and previous config saved to /var/cache/conftool/dbconfig/20240220-134734-root.json
  • 13:47 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 13:47 jynus: setting up mariadb instances at db2097
  • 13:47 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
  • 13:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:44 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 13:44 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T355609)', diff saved to https://phabricator.wikimedia.org/P57320 and previous config saved to /var/cache/conftool/dbconfig/20240220-134403-marostegui.json
  • 13:44 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 13:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T357189)', diff saved to https://phabricator.wikimedia.org/P57319 and previous config saved to /var/cache/conftool/dbconfig/20240220-134354-arnaudb.json
  • 13:43 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 13:43 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 13:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T355609)', diff saved to https://phabricator.wikimedia.org/P57318 and previous config saved to /var/cache/conftool/dbconfig/20240220-134334-marostegui.json
  • 13:28 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P57317 and previous config saved to /var/cache/conftool/dbconfig/20240220-132848-arnaudb.json
  • 13:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P57316 and previous config saved to /var/cache/conftool/dbconfig/20240220-132827-marostegui.json
  • 13:13 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P57315 and previous config saved to /var/cache/conftool/dbconfig/20240220-131341-arnaudb.json
  • 13:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P57314 and previous config saved to /var/cache/conftool/dbconfig/20240220-131320-marostegui.json
  • 13:08 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2190.codfw.wmnet onto db2194.codfw.wmnet
  • 12:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T357189)', diff saved to https://phabricator.wikimedia.org/P57313 and previous config saved to /var/cache/conftool/dbconfig/20240220-125835-arnaudb.json
  • 12:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T355609)', diff saved to https://phabricator.wikimedia.org/P57312 and previous config saved to /var/cache/conftool/dbconfig/20240220-125814-marostegui.json
  • 12:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1230 (T357189)', diff saved to https://phabricator.wikimedia.org/P57311 and previous config saved to /var/cache/conftool/dbconfig/20240220-125516-arnaudb.json
  • 12:55 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 12:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 12:53 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 12:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 12:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T357189)', diff saved to https://phabricator.wikimedia.org/P57310 and previous config saved to /var/cache/conftool/dbconfig/20240220-125311-arnaudb.json
  • 12:48 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=s8
  • 12:48 marostegui@cumin1002: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=s5
  • 12:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P57309 and previous config saved to /var/cache/conftool/dbconfig/20240220-123804-arnaudb.json
  • 12:29 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T355609)', diff saved to https://phabricator.wikimedia.org/P57308 and previous config saved to /var/cache/conftool/dbconfig/20240220-122947-marostegui.json
  • 12:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 12:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 12:29 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 12:29 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T355609)', diff saved to https://phabricator.wikimedia.org/P57307 and previous config saved to /var/cache/conftool/dbconfig/20240220-122907-marostegui.json
  • 12:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213', diff saved to https://phabricator.wikimedia.org/P57306 and previous config saved to /var/cache/conftool/dbconfig/20240220-122258-arnaudb.json
  • 12:18 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2384.codfw.wmnet with OS bullseye
  • 12:18 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2385.codfw.wmnet with OS bullseye
  • 12:16 claime: Draining mw2379
  • 12:14 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P57305 and previous config saved to /var/cache/conftool/dbconfig/20240220-121402-marostegui.json
  • 12:07 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1213 (T357189)', diff saved to https://phabricator.wikimedia.org/P57304 and previous config saved to /var/cache/conftool/dbconfig/20240220-120752-arnaudb.json
  • 12:05 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1213 (T357189)', diff saved to https://phabricator.wikimedia.org/P57303 and previous config saved to /var/cache/conftool/dbconfig/20240220-120434-arnaudb.json
  • 12:05 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 12:04 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1213.eqiad.wmnet with reason: Maintenance
  • 12:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T357189)', diff saved to https://phabricator.wikimedia.org/P57302 and previous config saved to /var/cache/conftool/dbconfig/20240220-120412-arnaudb.json
  • 12:04 kart_: cxserver: Update to 2024-02-15-085232-production + Bump mesh.configuration to 1.7 (T333969, T352747, T355686, T255568)
  • 12:03 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2385.codfw.wmnet with OS bullseye
  • 12:03 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2385.codfw.wmnet with OS bullseye
  • 12:02 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2384.codfw.wmnet with OS bullseye
  • 12:02 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2384.codfw.wmnet with OS bullseye
  • 12:01 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2369.codfw.wmnet with OS bullseye
  • 12:00 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57301 and previous config saved to /var/cache/conftool/dbconfig/20240220-120031-root.json
  • 12:00 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
  • 11:59 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
  • 11:58 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P57300 and previous config saved to /var/cache/conftool/dbconfig/20240220-115855-marostegui.json
  • 11:57 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2367.codfw.wmnet with OS bullseye
  • 11:55 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2313.codfw.wmnet with OS bullseye
  • 11:55 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
  • 11:54 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
  • 11:51 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2312.codfw.wmnet with OS bullseye
  • 11:51 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
  • 11:50 sukhe: updating pdns-recursor to 4.8.6-1 on doh* hosts
  • 11:50 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
  • 11:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P57299 and previous config saved to /var/cache/conftool/dbconfig/20240220-114906-arnaudb.json
  • 11:45 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57298 and previous config saved to /var/cache/conftool/dbconfig/20240220-114526-root.json
  • 11:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T355609)', diff saved to https://phabricator.wikimedia.org/P57297 and previous config saved to /var/cache/conftool/dbconfig/20240220-114349-marostegui.json
  • 11:42 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2369.codfw.wmnet with reason: host reimage
  • 11:39 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2367.codfw.wmnet with reason: host reimage
  • 11:37 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2313.codfw.wmnet with reason: host reimage
  • 11:35 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2367.codfw.wmnet with reason: host reimage
  • 11:35 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2369.codfw.wmnet with reason: host reimage
  • 11:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P57296 and previous config saved to /var/cache/conftool/dbconfig/20240220-113401-arnaudb.json
  • 11:33 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2312.codfw.wmnet with reason: host reimage
  • 11:33 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db2190.codfw.wmnet onto db2194.codfw.wmnet
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2312.codfw.wmnet with reason: host reimage
  • 11:30 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57295 and previous config saved to /var/cache/conftool/dbconfig/20240220-113021-root.json
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2385.codfw.wmnet with OS bullseye
  • 11:30 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2384.codfw.wmnet with OS bullseye
  • 11:29 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2385.codfw.wmnet with OS bullseye
  • 11:29 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw2384.codfw.wmnet with OS bullseye
  • 11:29 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2194.codfw.wmnet with OS bookworm
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2385.codfw.wmnet with OS bullseye
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2384.codfw.wmnet with OS bullseye
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2369.codfw.wmnet with OS bullseye
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2367.codfw.wmnet with OS bullseye
  • 11:19 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2313.codfw.wmnet with OS bullseye
  • 11:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T357189)', diff saved to https://phabricator.wikimedia.org/P57294 and previous config saved to /var/cache/conftool/dbconfig/20240220-111854-arnaudb.json
  • 11:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T355609)', diff saved to https://phabricator.wikimedia.org/P57293 and previous config saved to /var/cache/conftool/dbconfig/20240220-111722-marostegui.json
  • 11:17 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 11:17 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 11:17 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T355609)', diff saved to https://phabricator.wikimedia.org/P57292 and previous config saved to /var/cache/conftool/dbconfig/20240220-111700-marostegui.json
  • 11:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1210 (T357189)', diff saved to https://phabricator.wikimedia.org/P57291 and previous config saved to /var/cache/conftool/dbconfig/20240220-111531-arnaudb.json
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57290 and previous config saved to /var/cache/conftool/dbconfig/20240220-111525-root.json
  • 11:15 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 11:15 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57289 and previous config saved to /var/cache/conftool/dbconfig/20240220-111516-root.json
  • 11:15 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1210.eqiad.wmnet with reason: Maintenance
  • 11:15 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T357189)', diff saved to https://phabricator.wikimedia.org/P57288 and previous config saved to /var/cache/conftool/dbconfig/20240220-111510-arnaudb.json
  • 11:14 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw2312.codfw.wmnet with OS bullseye
  • 11:08 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2194.codfw.wmnet with reason: host reimage
  • 11:06 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2194.codfw.wmnet with reason: host reimage
  • 11:04 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2194 in s3 depooled T354826', diff saved to https://phabricator.wikimedia.org/P57287 and previous config saved to /var/cache/conftool/dbconfig/20240220-110444-marostegui.json
  • 11:01 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P57286 and previous config saved to /var/cache/conftool/dbconfig/20240220-110154-marostegui.json
  • 11:01 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2190', diff saved to https://phabricator.wikimedia.org/P57285 and previous config saved to /var/cache/conftool/dbconfig/20240220-110020-root.json
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57284 and previous config saved to /var/cache/conftool/dbconfig/20240220-110011-root.json
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57283 and previous config saved to /var/cache/conftool/dbconfig/20240220-110008-root.json
  • 11:00 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P57282 and previous config saved to /var/cache/conftool/dbconfig/20240220-110004-arnaudb.json
  • 11:00 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2194 multi instance', diff saved to https://phabricator.wikimedia.org/P57281 and previous config saved to /var/cache/conftool/dbconfig/20240220-105959-marostegui.json
  • 10:56 slyngs: Import CAS 6.6.12+wmf11u2 in apt-repo
  • 10:50 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on cloudvirt1032.eqiad.wmnet with reason: nova-compute registration
  • 10:50 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on cloudvirt1032.eqiad.wmnet with reason: nova-compute registration
  • 10:48 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2194.codfw.wmnet with OS bookworm
  • 10:46 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P57280 and previous config saved to /var/cache/conftool/dbconfig/20240220-104647-marostegui.json
  • 10:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2194', diff saved to https://phabricator.wikimedia.org/P57279 and previous config saved to /var/cache/conftool/dbconfig/20240220-104633-root.json
  • 10:42 marostegui@cumin1002: dbctl commit (dc=all): 'db2169 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57278 and previous config saved to /var/cache/conftool/dbconfig/20240220-104231-root.json
  • 10:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P57277 and previous config saved to /var/cache/conftool/dbconfig/20240220-104209-arnaudb.json
  • 10:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2169.codfw.wmnet with OS bookworm
  • 10:38 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57276 and previous config saved to /var/cache/conftool/dbconfig/20240220-103842-root.json
  • 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on cumin1001.eqiad.wmnet with reason: being taken down
  • 10:34 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on cumin1001.eqiad.wmnet with reason: being taken down
  • 10:31 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T355609)', diff saved to https://phabricator.wikimedia.org/P57275 and previous config saved to /var/cache/conftool/dbconfig/20240220-103141-marostegui.json
  • 10:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T357189)', diff saved to https://phabricator.wikimedia.org/P57274 and previous config saved to /var/cache/conftool/dbconfig/20240220-102703-arnaudb.json
  • 10:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T357189)', diff saved to https://phabricator.wikimedia.org/P57273 and previous config saved to /var/cache/conftool/dbconfig/20240220-102344-arnaudb.json
  • 10:23 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 10:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57272 and previous config saved to /var/cache/conftool/dbconfig/20240220-102337-root.json
  • 10:23 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 10:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T357189)', diff saved to https://phabricator.wikimedia.org/P57271 and previous config saved to /var/cache/conftool/dbconfig/20240220-102322-arnaudb.json
  • 10:18 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
  • 10:16 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2169.codfw.wmnet with reason: host reimage
  • 10:12 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57270 and previous config saved to /var/cache/conftool/dbconfig/20240220-101206-root.json
  • 10:10 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 10:10 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 10:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57269 and previous config saved to /var/cache/conftool/dbconfig/20240220-100832-root.json
  • 10:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P57268 and previous config saved to /var/cache/conftool/dbconfig/20240220-100816-arnaudb.json
  • 10:06 marostegui@cumin1002: dbctl commit (dc=all): 'Add db2169 to s6 depooled', diff saved to https://phabricator.wikimedia.org/P57267 and previous config saved to /var/cache/conftool/dbconfig/20240220-100623-marostegui.json
  • 10:05 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T355609)', diff saved to https://phabricator.wikimedia.org/P57266 and previous config saved to /var/cache/conftool/dbconfig/20240220-100511-marostegui.json
  • 10:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 10:04 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T355609)', diff saved to https://phabricator.wikimedia.org/P57265 and previous config saved to /var/cache/conftool/dbconfig/20240220-100449-marostegui.json
  • 10:04 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2169 multiinstance', diff saved to https://phabricator.wikimedia.org/P57264 and previous config saved to /var/cache/conftool/dbconfig/20240220-100444-marostegui.json
  • 10:00 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply
  • 09:57 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57263 and previous config saved to /var/cache/conftool/dbconfig/20240220-095701-root.json
  • 09:56 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2169.codfw.wmnet with OS bookworm
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2169', diff saved to https://phabricator.wikimedia.org/P57262 and previous config saved to /var/cache/conftool/dbconfig/20240220-095353-root.json
  • 09:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57261 and previous config saved to /var/cache/conftool/dbconfig/20240220-095327-root.json
  • 09:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P57260 and previous config saved to /var/cache/conftool/dbconfig/20240220-095310-arnaudb.json
  • 09:49 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply
  • 09:46 filippo@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 09:46 filippo@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aus-k8s-eqiad-services/jaeger: apply
  • 09:43 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P57259 and previous config saved to /var/cache/conftool/dbconfig/20240220-094334-marostegui.json
  • 09:41 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57258 and previous config saved to /var/cache/conftool/dbconfig/20240220-094156-root.json
  • 09:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T357189)', diff saved to https://phabricator.wikimedia.org/P57257 and previous config saved to /var/cache/conftool/dbconfig/20240220-093803-arnaudb.json
  • 09:36 moritzm: installing imagemagick security updates
  • 09:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57256 and previous config saved to /var/cache/conftool/dbconfig/20240220-093607-root.json
  • 09:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T357189)', diff saved to https://phabricator.wikimedia.org/P57255 and previous config saved to /var/cache/conftool/dbconfig/20240220-093442-arnaudb.json
  • 09:34 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 09:34 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 09:34 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57254 and previous config saved to /var/cache/conftool/dbconfig/20240220-093420-arnaudb.json
  • 09:28 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P57253 and previous config saved to /var/cache/conftool/dbconfig/20240220-092827-marostegui.json
  • 09:26 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57252 and previous config saved to /var/cache/conftool/dbconfig/20240220-092651-root.json
  • 09:26 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2167.codfw.wmnet with OS bookworm
  • 09:23 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 09:22 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 09:21 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 09:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57251 and previous config saved to /var/cache/conftool/dbconfig/20240220-092102-root.json
  • 09:21 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 09:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P57250 and previous config saved to /var/cache/conftool/dbconfig/20240220-091914-arnaudb.json
  • 09:16 akosiaris@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 09:16 dcausse@deploy2002: Finished deploy [airflow-dags/search@088b013]: search: wdqs updater set proper start date (duration: 00m 26s)
  • 09:16 dcausse@deploy2002: Started deploy [airflow-dags/search@088b013]: search: wdqs updater set proper start date
  • 09:15 akosiaris@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 09:13 marostegui@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T355609)', diff saved to https://phabricator.wikimedia.org/P57249 and previous config saved to /var/cache/conftool/dbconfig/20240220-091321-marostegui.json
  • 09:11 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57248 and previous config saved to /var/cache/conftool/dbconfig/20240220-091146-root.json
  • 09:09 akosiaris@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 09:08 akosiaris@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 09:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57247 and previous config saved to /var/cache/conftool/dbconfig/20240220-090557-root.json
  • 09:05 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
  • 09:04 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P57246 and previous config saved to /var/cache/conftool/dbconfig/20240220-090408-arnaudb.json
  • 09:02 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2167.codfw.wmnet with reason: host reimage
  • 09:01 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2138.codfw.wmnet with OS bookworm
  • 08:57 dcausse@deploy2002: Finished deploy [airflow-dags/search@a6356d2]: search: wdqs-updater reconcile, do not create the dag dynamically (duration: 00m 28s)
  • 08:56 dcausse@deploy2002: Started deploy [airflow-dags/search@a6356d2]: search: wdqs-updater reconcile, do not create the dag dynamically
  • 08:56 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57245 and previous config saved to /var/cache/conftool/dbconfig/20240220-085641-root.json
  • 08:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57244 and previous config saved to /var/cache/conftool/dbconfig/20240220-085222-root.json
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57243 and previous config saved to /var/cache/conftool/dbconfig/20240220-085052-root.json
  • 08:49 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57242 and previous config saved to /var/cache/conftool/dbconfig/20240220-084901-arnaudb.json
  • 08:46 marostegui@cumin1002: dbctl commit (dc=all): 'Depooling db1163 (T355609)', diff saved to https://phabricator.wikimedia.org/P57241 and previous config saved to /var/cache/conftool/dbconfig/20240220-084637-marostegui.json
  • 08:46 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 08:46 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 08:45 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T357189)', diff saved to https://phabricator.wikimedia.org/P57240 and previous config saved to /var/cache/conftool/dbconfig/20240220-084530-arnaudb.json
  • 08:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 08:44 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 08:43 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2167.codfw.wmnet with OS bookworm
  • 08:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2167', diff saved to https://phabricator.wikimedia.org/P57239 and previous config saved to /var/cache/conftool/dbconfig/20240220-084136-root.json
  • 08:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2138.codfw.wmnet with reason: host reimage
  • 08:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2138.codfw.wmnet with reason: host reimage
  • 08:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57238 and previous config saved to /var/cache/conftool/dbconfig/20240220-083718-root.json
  • 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57237 and previous config saved to /var/cache/conftool/dbconfig/20240220-083547-root.json
  • 08:31 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 5%: After migration', diff saved to https://phabricator.wikimedia.org/P57236 and previous config saved to /var/cache/conftool/dbconfig/20240220-083132-root.json
  • 08:25 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57235 and previous config saved to /var/cache/conftool/dbconfig/20240220-082515-root.json
  • 08:23 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2168.codfw.wmnet with OS bookworm
  • 08:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57234 and previous config saved to /var/cache/conftool/dbconfig/20240220-082213-root.json
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57233 and previous config saved to /var/cache/conftool/dbconfig/20240220-082043-root.json
  • 08:19 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2138.codfw.wmnet with OS bookworm
  • 08:17 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2138', diff saved to https://phabricator.wikimedia.org/P57232 and previous config saved to /var/cache/conftool/dbconfig/20240220-081740-root.json
  • 08:16 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 1%: After migration', diff saved to https://phabricator.wikimedia.org/P57231 and previous config saved to /var/cache/conftool/dbconfig/20240220-081627-root.json
  • 08:16 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2170.codfw.wmnet with OS bookworm
  • 08:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57230 and previous config saved to /var/cache/conftool/dbconfig/20240220-081353-root.json
  • 08:10 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57229 and previous config saved to /var/cache/conftool/dbconfig/20240220-081010-root.json
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57228 and previous config saved to /var/cache/conftool/dbconfig/20240220-080708-root.json
  • 08:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
  • 08:00 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2168.codfw.wmnet with reason: host reimage
  • 07:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57227 and previous config saved to /var/cache/conftool/dbconfig/20240220-075848-root.json
  • 07:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
  • 07:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57226 and previous config saved to /var/cache/conftool/dbconfig/20240220-075505-root.json
  • 07:52 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2170.codfw.wmnet with reason: host reimage
  • 07:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57225 and previous config saved to /var/cache/conftool/dbconfig/20240220-075203-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 100%: After migration', diff saved to https://phabricator.wikimedia.org/P57224 and previous config saved to /var/cache/conftool/dbconfig/20240220-075128-root.json
  • 07:43 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57223 and previous config saved to /var/cache/conftool/dbconfig/20240220-074343-root.json
  • 07:40 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2168.codfw.wmnet with OS bookworm
  • 07:40 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57222 and previous config saved to /var/cache/conftool/dbconfig/20240220-074000-root.json
  • 07:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2168', diff saved to https://phabricator.wikimedia.org/P57221 and previous config saved to /var/cache/conftool/dbconfig/20240220-073912-root.json
  • 07:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2028.codfw.wmnet
  • 07:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2171 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57220 and previous config saved to /var/cache/conftool/dbconfig/20240220-073658-root.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 75%: After migration', diff saved to https://phabricator.wikimedia.org/P57219 and previous config saved to /var/cache/conftool/dbconfig/20240220-073623-root.json
  • 07:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2028.codfw.wmnet
  • 07:33 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57218 and previous config saved to /var/cache/conftool/dbconfig/20240220-073313-root.json
  • 07:32 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2170.codfw.wmnet with OS bookworm
  • 07:31 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2170', diff saved to https://phabricator.wikimedia.org/P57217 and previous config saved to /var/cache/conftool/dbconfig/20240220-073139-root.json
  • 07:28 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57216 and previous config saved to /var/cache/conftool/dbconfig/20240220-072838-root.json
  • 07:28 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2171.codfw.wmnet with OS bookworm
  • 07:27 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 56286
  • 07:27 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 56286
  • 07:27 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 60501
  • 07:26 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 60501
  • 07:26 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 18779
  • 07:26 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 18779
  • 07:26 ayounsi@cumin1002: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 26554
  • 07:25 ayounsi@cumin1002: START - Cookbook sre.network.peering with action 'email' for AS: 26554
  • 07:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57215 and previous config saved to /var/cache/conftool/dbconfig/20240220-072455-root.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 50%: After migration', diff saved to https://phabricator.wikimedia.org/P57214 and previous config saved to /var/cache/conftool/dbconfig/20240220-072118-root.json
  • 07:18 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57213 and previous config saved to /var/cache/conftool/dbconfig/20240220-071808-root.json
  • 07:13 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57212 and previous config saved to /var/cache/conftool/dbconfig/20240220-071333-root.json
  • 07:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57211 and previous config saved to /var/cache/conftool/dbconfig/20240220-070948-root.json
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 25%: After migration', diff saved to https://phabricator.wikimedia.org/P57210 and previous config saved to /var/cache/conftool/dbconfig/20240220-070613-root.json
  • 07:04 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1246.eqiad.wmnet with OS bookworm
  • 07:03 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57209 and previous config saved to /var/cache/conftool/dbconfig/20240220-070303-root.json
  • 07:02 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
  • 06:59 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1244.eqiad.wmnet with OS bookworm
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57208 and previous config saved to /var/cache/conftool/dbconfig/20240220-065828-root.json
  • 06:58 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2171.codfw.wmnet with reason: host reimage
  • 06:51 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 10%: After migration', diff saved to https://phabricator.wikimedia.org/P57207 and previous config saved to /var/cache/conftool/dbconfig/20240220-065108-root.json
  • 06:47 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57206 and previous config saved to /var/cache/conftool/dbconfig/20240220-064758-root.json
  • 06:44 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1246.eqiad.wmnet with reason: host reimage
  • 06:42 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1246.eqiad.wmnet with reason: host reimage
  • 06:41 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2171 in s5 depooled T354826', diff saved to https://phabricator.wikimedia.org/P57205 and previous config saved to /var/cache/conftool/dbconfig/20240220-064152-marostegui.json
  • 06:40 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2171 multi-instance', diff saved to https://phabricator.wikimedia.org/P57204 and previous config saved to /var/cache/conftool/dbconfig/20240220-064014-marostegui.json
  • 06:39 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1244.eqiad.wmnet with reason: host reimage
  • 06:39 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2171.codfw.wmnet with OS bookworm
  • 06:37 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1244.eqiad.wmnet with reason: host reimage
  • 06:36 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 5%: After migration', diff saved to https://phabricator.wikimedia.org/P57203 and previous config saved to /var/cache/conftool/dbconfig/20240220-063603-root.json
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2171 T354826', diff saved to https://phabricator.wikimedia.org/P57202 and previous config saved to /var/cache/conftool/dbconfig/20240220-063521-marostegui.json
  • 06:32 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57201 and previous config saved to /var/cache/conftool/dbconfig/20240220-063254-root.json
  • 06:29 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1246.eqiad.wmnet with OS bookworm
  • 06:28 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1246', diff saved to https://phabricator.wikimedia.org/P57200 and previous config saved to /var/cache/conftool/dbconfig/20240220-062759-root.json
  • 06:24 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1244.eqiad.wmnet with OS bookworm
  • 06:22 marostegui@deploy2002: Finished scap: Backport for Revert "db-production.php: Disable writes on es4" (duration: 09m 32s)
  • 06:20 marostegui@cumin1002: dbctl commit (dc=all): 'es2021 (re)pooling @ 1%: After migration', diff saved to https://phabricator.wikimedia.org/P57199 and previous config saved to /var/cache/conftool/dbconfig/20240220-062058-root.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1244', diff saved to https://phabricator.wikimedia.org/P57198 and previous config saved to /var/cache/conftool/dbconfig/20240220-061932-root.json
  • 06:17 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57197 and previous config saved to /var/cache/conftool/dbconfig/20240220-061749-root.json
  • 06:14 marostegui@deploy2002: marostegui: Continuing with sync
  • 06:14 marostegui@deploy2002: marostegui: Backport for Revert "db-production.php: Disable writes on es4" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 06:13 marostegui@deploy2002: Started scap: Backport for Revert "db-production.php: Disable writes on es4"
  • 06:13 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS bookworm
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'Add weight to es2020', diff saved to https://phabricator.wikimedia.org/P57196 and previous config saved to /var/cache/conftool/dbconfig/20240220-061049-root.json
  • 06:10 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es2021 T356372', diff saved to https://phabricator.wikimedia.org/P57195 and previous config saved to /var/cache/conftool/dbconfig/20240220-061025-marostegui.json
  • 06:08 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es2020 to es4 primary T356372', diff saved to https://phabricator.wikimedia.org/P57194 and previous config saved to /var/cache/conftool/dbconfig/20240220-060852-marostegui.json
  • 06:08 marostegui: Starting es4 codfw failover from es2021 to es2020 - T356372
  • 06:04 marostegui@cumin1002: dbctl commit (dc=all): 'Set es2020 with weight 0 T356372', diff saved to https://phabricator.wikimedia.org/P57193 and previous config saved to /var/cache/conftool/dbconfig/20240220-060404-marostegui.json
  • 06:03 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356372
  • 06:03 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T356372
  • 06:01 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2137.codfw.wmnet with OS bookworm
  • 06:00 marostegui@deploy2002: Finished scap: Backport for db-production.php: Disable writes on es4 (T356372) (duration: 09m 36s)
  • 05:55 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
  • 05:55 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2137.codfw.wmnet with OS bookworm
  • 05:54 marostegui@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2137.codfw.wmnet with OS bookworm
  • 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1170.eqiad.wmnet with reason: host reimage
  • 05:52 marostegui@deploy2002: marostegui: Continuing with sync
  • 05:52 marostegui@deploy2002: marostegui: Backport for db-production.php: Disable writes on es4 (T356372) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 05:50 marostegui@deploy2002: Started scap: Backport for db-production.php: Disable writes on es4 (T356372)
  • 05:45 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db2137.codfw.wmnet with OS bookworm
  • 05:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2137 for reimage', diff saved to https://phabricator.wikimedia.org/P57192 and previous config saved to /var/cache/conftool/dbconfig/20240220-054156-marostegui.json
  • 05:41 marostegui@cumin1002: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS bookworm
  • 05:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1170 for reimage', diff saved to https://phabricator.wikimedia.org/P57191 and previous config saved to /var/cache/conftool/dbconfig/20240220-053920-marostegui.json
  • 04:56 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.42.0-wmf.19 refs T354437 (duration: 52m 09s)
  • 04:04 mwpresync@deploy2002: Started scap: testwikis wikis to 1.42.0-wmf.19 refs T354437
  • 04:02 mwpresync@deploy2002: Pruned MediaWiki: 1.42.0-wmf.16 (duration: 01m 57s)
  • 02:15 tstarling@deploy2002: Synchronized wmf-config/CommonSettings.php: Set $wgLoginNotifyUseCheckUser = false T346989 (duration: 08m 13s)

2024-02-19

  • 23:43 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 23:42 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 23:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57190 and previous config saved to /var/cache/conftool/dbconfig/20240219-234251-arnaudb.json
  • 23:27 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P57189 and previous config saved to /var/cache/conftool/dbconfig/20240219-232745-arnaudb.json
  • 23:12 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193', diff saved to https://phabricator.wikimedia.org/P57188 and previous config saved to /var/cache/conftool/dbconfig/20240219-231238-arnaudb.json
  • 22:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57187 and previous config saved to /var/cache/conftool/dbconfig/20240219-225732-arnaudb.json
  • 22:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57186 and previous config saved to /var/cache/conftool/dbconfig/20240219-224117-arnaudb.json
  • 22:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 22:40 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2193.codfw.wmnet with reason: Maintenance
  • 22:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57185 and previous config saved to /var/cache/conftool/dbconfig/20240219-224054-arnaudb.json
  • 22:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P57184 and previous config saved to /var/cache/conftool/dbconfig/20240219-222547-arnaudb.json
  • 22:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P57183 and previous config saved to /var/cache/conftool/dbconfig/20240219-221239-ladsgroup.json
  • 22:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P57182 and previous config saved to /var/cache/conftool/dbconfig/20240219-221041-arnaudb.json
  • 21:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P57181 and previous config saved to /var/cache/conftool/dbconfig/20240219-215733-ladsgroup.json
  • 21:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57180 and previous config saved to /var/cache/conftool/dbconfig/20240219-215534-arnaudb.json
  • 21:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57179 and previous config saved to /var/cache/conftool/dbconfig/20240219-215217-arnaudb.json
  • 21:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 21:52 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 21:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T357189)', diff saved to https://phabricator.wikimedia.org/P57178 and previous config saved to /var/cache/conftool/dbconfig/20240219-215155-arnaudb.json
  • 21:42 zabe@deploy2002: Finished scap: Backport for EditAttemptStep: log buckets for the edit check test (T342930), Enrollment for the edit check a/b test (T342930), Launch the Visual Editor edit check a/b test (T342930 T352127), Default VE on mobile for other wikis (T352127) (duration: 17m 25s)
  • 21:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P57177 and previous config saved to /var/cache/conftool/dbconfig/20240219-214227-ladsgroup.json
  • 21:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P57176 and previous config saved to /var/cache/conftool/dbconfig/20240219-213648-arnaudb.json
  • 21:35 zabe@deploy2002: kemayo and zabe: Continuing with sync
  • 21:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P57175 and previous config saved to /var/cache/conftool/dbconfig/20240219-212720-ladsgroup.json
  • 21:26 zabe@deploy2002: kemayo and zabe: Backport for EditAttemptStep: log buckets for the edit check test (T342930), Enrollment for the edit check a/b test (T342930), Launch the Visual Editor edit check a/b test (T342930 T352127), Default VE on mobile for other wikis (T352127) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:25 zabe@deploy2002: Started scap: Backport for EditAttemptStep: log buckets for the edit check test (T342930), Enrollment for the edit check a/b test (T342930), Launch the Visual Editor edit check a/b test (T342930 T352127), Default VE on mobile for other wikis (T352127)
  • 21:21 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P57174 and previous config saved to /var/cache/conftool/dbconfig/20240219-212141-arnaudb.json
  • 21:06 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T357189)', diff saved to https://phabricator.wikimedia.org/P57173 and previous config saved to /var/cache/conftool/dbconfig/20240219-210635-arnaudb.json
  • 21:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2171:3316 (T357189)', diff saved to https://phabricator.wikimedia.org/P57172 and previous config saved to /var/cache/conftool/dbconfig/20240219-210228-arnaudb.json
  • 21:02 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 21:01 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 20:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57171 and previous config saved to /var/cache/conftool/dbconfig/20240219-205935-arnaudb.json
  • 20:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 100%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57169 and previous config saved to /var/cache/conftool/dbconfig/20240219-205047-arnaudb.json
  • 20:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P57168 and previous config saved to /var/cache/conftool/dbconfig/20240219-204429-arnaudb.json
  • 20:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 75%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57167 and previous config saved to /var/cache/conftool/dbconfig/20240219-203542-arnaudb.json
  • 20:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P57166 and previous config saved to /var/cache/conftool/dbconfig/20240219-202923-arnaudb.json
  • 20:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P57165 and previous config saved to /var/cache/conftool/dbconfig/20240219-202648-ladsgroup.json
  • 20:26 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 20:26 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 20:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P57164 and previous config saved to /var/cache/conftool/dbconfig/20240219-202615-ladsgroup.json
  • 20:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 50%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57163 and previous config saved to /var/cache/conftool/dbconfig/20240219-202037-arnaudb.json
  • 20:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57162 and previous config saved to /var/cache/conftool/dbconfig/20240219-201416-arnaudb.json
  • 20:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P57161 and previous config saved to /var/cache/conftool/dbconfig/20240219-201353-root.json
  • 20:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P57160 and previous config saved to /var/cache/conftool/dbconfig/20240219-201109-ladsgroup.json
  • 20:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2158 (T357189)', diff saved to https://phabricator.wikimedia.org/P57159 and previous config saved to /var/cache/conftool/dbconfig/20240219-200914-arnaudb.json
  • 20:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 20:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 20:09 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 20:09 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 20:08 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T357189)', diff saved to https://phabricator.wikimedia.org/P57158 and previous config saved to /var/cache/conftool/dbconfig/20240219-200847-arnaudb.json
  • 20:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 40%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57157 and previous config saved to /var/cache/conftool/dbconfig/20240219-200533-arnaudb.json
  • 20:05 zabe@deploy2002: Finished scap: Backport for Remove reviewer group from testwiki (T356012) (duration: 09m 16s)
  • 19:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P57156 and previous config saved to /var/cache/conftool/dbconfig/20240219-195848-root.json
  • 19:57 zabe@deploy2002: zabe: Continuing with sync
  • 19:57 zabe@deploy2002: zabe: Backport for Remove reviewer group from testwiki (T356012) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 19:56 zabe@deploy2002: Started scap: Backport for Remove reviewer group from testwiki (T356012)
  • 19:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P57155 and previous config saved to /var/cache/conftool/dbconfig/20240219-195603-ladsgroup.json
  • 19:53 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P57154 and previous config saved to /var/cache/conftool/dbconfig/20240219-195341-arnaudb.json
  • 19:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 30%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57153 and previous config saved to /var/cache/conftool/dbconfig/20240219-195028-arnaudb.json
  • 19:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P57152 and previous config saved to /var/cache/conftool/dbconfig/20240219-194343-root.json
  • 19:42 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript emptyUserGroup.php --wiki=testwiki reviewer # T356012
  • 19:41 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --user="Yann" --overwrite . # T357218
  • 19:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P57151 and previous config saved to /var/cache/conftool/dbconfig/20240219-194056-ladsgroup.json
  • 19:38 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P57150 and previous config saved to /var/cache/conftool/dbconfig/20240219-193834-arnaudb.json
  • 19:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 20%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57149 and previous config saved to /var/cache/conftool/dbconfig/20240219-193522-arnaudb.json
  • 19:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P57148 and previous config saved to /var/cache/conftool/dbconfig/20240219-192838-root.json
  • 19:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2149.codfw.wmnet onto db2156.codfw.wmnet
  • 19:23 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T357189)', diff saved to https://phabricator.wikimedia.org/P57147 and previous config saved to /var/cache/conftool/dbconfig/20240219-192327-arnaudb.json
  • 19:20 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 10%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57146 and previous config saved to /var/cache/conftool/dbconfig/20240219-192018-arnaudb.json
  • 19:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2151 (T357189)', diff saved to https://phabricator.wikimedia.org/P57145 and previous config saved to /var/cache/conftool/dbconfig/20240219-191923-arnaudb.json
  • 19:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 19:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2151.codfw.wmnet with reason: Maintenance
  • 19:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T357189)', diff saved to https://phabricator.wikimedia.org/P57144 and previous config saved to /var/cache/conftool/dbconfig/20240219-191901-arnaudb.json
  • 19:14 zabe: zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user="Yann" . # T357297
  • 19:05 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 8%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57143 and previous config saved to /var/cache/conftool/dbconfig/20240219-190513-arnaudb.json
  • 19:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P57142 and previous config saved to /var/cache/conftool/dbconfig/20240219-190354-arnaudb.json
  • 18:50 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 4%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57141 and previous config saved to /var/cache/conftool/dbconfig/20240219-185008-arnaudb.json
  • 18:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P57140 and previous config saved to /var/cache/conftool/dbconfig/20240219-184848-arnaudb.json
  • 18:35 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 2%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57139 and previous config saved to /var/cache/conftool/dbconfig/20240219-183503-arnaudb.json
  • 18:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T357189)', diff saved to https://phabricator.wikimedia.org/P57138 and previous config saved to /var/cache/conftool/dbconfig/20240219-183341-arnaudb.json
  • 18:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2124 (T357189)', diff saved to https://phabricator.wikimedia.org/P57137 and previous config saved to /var/cache/conftool/dbconfig/20240219-182929-arnaudb.json
  • 18:29 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 18:29 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 18:29 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T357189)', diff saved to https://phabricator.wikimedia.org/P57136 and previous config saved to /var/cache/conftool/dbconfig/20240219-182905-arnaudb.json
  • 18:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3316 (re)pooling @ 1%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57135 and previous config saved to /var/cache/conftool/dbconfig/20240219-181958-arnaudb.json
  • 18:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 100%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57134 and previous config saved to /var/cache/conftool/dbconfig/20240219-181953-arnaudb.json
  • 18:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P57133 and previous config saved to /var/cache/conftool/dbconfig/20240219-181359-arnaudb.json
  • 18:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 75%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57132 and previous config saved to /var/cache/conftool/dbconfig/20240219-180448-arnaudb.json
  • 17:58 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P57131 and previous config saved to /var/cache/conftool/dbconfig/20240219-175853-arnaudb.json
  • 17:56 ladsgroup@cumin1002: START - Cookbook sre.mysql.clone of db2149.codfw.wmnet onto db2156.codfw.wmnet
  • 17:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 50%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57130 and previous config saved to /var/cache/conftool/dbconfig/20240219-174943-arnaudb.json
  • 17:43 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T357189)', diff saved to https://phabricator.wikimedia.org/P57129 and previous config saved to /var/cache/conftool/dbconfig/20240219-174347-arnaudb.json
  • 17:43 hnowlan: running `decommission` for mw2312.codfw.wmnet,mw2313.codfw.wmnet,mw2367.codfw.wmnet,mw2369.codfw.wmnet,mw2384.codfw.wmnet,mw2385.codfw.wmnet before reimaging to k8s workers
  • 17:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2117 (T357189)', diff saved to https://phabricator.wikimedia.org/P57128 and previous config saved to /var/cache/conftool/dbconfig/20240219-173941-arnaudb.json
  • 17:39 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 17:39 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 17:39 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T357189)', diff saved to https://phabricator.wikimedia.org/P57127 and previous config saved to /var/cache/conftool/dbconfig/20240219-173919-arnaudb.json
  • 17:38 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: recloning db2156 (T352010)
  • 17:38 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: recloning db2156 (T352010)
  • 17:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 40%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57126 and previous config saved to /var/cache/conftool/dbconfig/20240219-173438-arnaudb.json
  • 17:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'db2149 for maint', diff saved to https://phabricator.wikimedia.org/P57125 and previous config saved to /var/cache/conftool/dbconfig/20240219-173411-ladsgroup.json
  • 17:24 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P57124 and previous config saved to /var/cache/conftool/dbconfig/20240219-172412-arnaudb.json
  • 17:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 30%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57123 and previous config saved to /var/cache/conftool/dbconfig/20240219-171933-arnaudb.json
  • 17:09 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P57122 and previous config saved to /var/cache/conftool/dbconfig/20240219-170906-arnaudb.json
  • 17:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 20%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57121 and previous config saved to /var/cache/conftool/dbconfig/20240219-170428-arnaudb.json
  • 16:55 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57120 and previous config saved to /var/cache/conftool/dbconfig/20240219-165503-root.json
  • 16:54 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T357189)', diff saved to https://phabricator.wikimedia.org/P57119 and previous config saved to /var/cache/conftool/dbconfig/20240219-165400-arnaudb.json
  • 16:50 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db2114 (T357189)', diff saved to https://phabricator.wikimedia.org/P57118 and previous config saved to /var/cache/conftool/dbconfig/20240219-165032-arnaudb.json
  • 16:50 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 16:50 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 16:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 10%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57117 and previous config saved to /var/cache/conftool/dbconfig/20240219-164924-arnaudb.json
  • 16:48 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 16:48 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 16:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T357189)', diff saved to https://phabricator.wikimedia.org/P57116 and previous config saved to /var/cache/conftool/dbconfig/20240219-164809-arnaudb.json
  • 16:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 75%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57115 and previous config saved to /var/cache/conftool/dbconfig/20240219-163958-root.json
  • 16:38 jgiannelos@deploy2002: Finished deploy [restbase/deploy@7e5e720]: Disable parsoid storage on restbase[1031:1033] (duration: 01m 55s)
  • 16:36 jgiannelos@deploy2002: Started deploy [restbase/deploy@7e5e720]: Disable parsoid storage on restbase[1031:1033]
  • 16:35 jgiannelos@deploy2002: Finished deploy [restbase/deploy@7e5e720]: Disable parsoid storage on restbase[2033:2035] (duration: 01m 19s)
  • 16:34 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 8%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57114 and previous config saved to /var/cache/conftool/dbconfig/20240219-163419-arnaudb.json
  • 16:33 jgiannelos@deploy2002: Started deploy [restbase/deploy@7e5e720]: Disable parsoid storage on restbase[2033:2035]
  • 16:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P57113 and previous config saved to /var/cache/conftool/dbconfig/20240219-163303-arnaudb.json
  • 16:32 jgiannelos@deploy2002: deploy aborted: Disable parsoid storage on all nodes (duration: 01m 57s)
  • 16:30 jgiannelos@deploy2002: Started deploy [restbase/deploy@7e5e720]: Disable parsoid storage on all nodes
  • 16:30 jgiannelos@deploy2002: Finished deploy [restbase/deploy@7e5e720]: Disable parsoid storage on all nodes (duration: 00m 07s)
  • 16:30 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 16:30 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
  • 16:30 jgiannelos@deploy2002: Started deploy [restbase/deploy@7e5e720]: Disable parsoid storage on all nodes
  • 16:29 jgiannelos@deploy2002: deploy aborted: Disable parsoid storage on all nodes (duration: 00m 08s)
  • 16:29 jgiannelos@deploy2002: Started deploy [restbase/deploy@7e5e720]: Disable parsoid storage on all nodes
  • 16:29 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 16:29 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 16:24 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57112 and previous config saved to /var/cache/conftool/dbconfig/20240219-162453-root.json
  • 16:21 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 00m 04s)
  • 16:21 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 16:19 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 4%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57111 and previous config saved to /var/cache/conftool/dbconfig/20240219-161914-arnaudb.json
  • 16:19 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 00m 07s)
  • 16:18 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 16:17 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P57110 and previous config saved to /var/cache/conftool/dbconfig/20240219-161756-arnaudb.json
  • 16:17 jgiannelos@deploy2002: deploy aborted: Deploy latest restbase config in all nodes (duration: 00m 04s)
  • 16:16 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Deploy latest restbase config in all nodes
  • 16:14 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 00m 08s)
  • 16:14 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 16:09 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57109 and previous config saved to /var/cache/conftool/dbconfig/20240219-160948-root.json
  • 16:04 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 00m 23s)
  • 16:04 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 16:04 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 2%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57108 and previous config saved to /var/cache/conftool/dbconfig/20240219-160409-arnaudb.json
  • 16:02 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T357189)', diff saved to https://phabricator.wikimedia.org/P57107 and previous config saved to /var/cache/conftool/dbconfig/20240219-160249-arnaudb.json
  • 16:02 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 100%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57106 and previous config saved to /var/cache/conftool/dbconfig/20240219-160221-arnaudb.json
  • 15:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T357189)', diff saved to https://phabricator.wikimedia.org/P57105 and previous config saved to /var/cache/conftool/dbconfig/20240219-155936-arnaudb.json
  • 15:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 15:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 15:59 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[2029:2032] (duration: 02m 56s)
  • 15:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 15:57 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 15:57 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T357189)', diff saved to https://phabricator.wikimedia.org/P57104 and previous config saved to /var/cache/conftool/dbconfig/20240219-155702-arnaudb.json
  • 15:56 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[2029:2032]
  • 15:55 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[1027:1030] (duration: 04m 11s)
  • 15:54 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57103 and previous config saved to /var/cache/conftool/dbconfig/20240219-155443-root.json
  • 15:51 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[1027:1030]
  • 15:49 arnaudb@cumin1002: dbctl commit (dc=all): 'db2194:3317 (re)pooling @ 1%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57102 and previous config saved to /var/cache/conftool/dbconfig/20240219-154904-arnaudb.json
  • 15:47 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 75%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57101 and previous config saved to /var/cache/conftool/dbconfig/20240219-154716-arnaudb.json
  • 15:42 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P57100 and previous config saved to /var/cache/conftool/dbconfig/20240219-154154-arnaudb.json
  • 15:41 arnaudb@cumin1002: dbctl commit (dc=all): 'T343674 - db2194 missing config', diff saved to https://phabricator.wikimedia.org/P57099 and previous config saved to /var/cache/conftool/dbconfig/20240219-154148-arnaudb.json
  • 15:41 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1213.eqiad.wmnet with OS bookworm
  • 15:39 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57098 and previous config saved to /var/cache/conftool/dbconfig/20240219-153938-root.json
  • 15:37 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[2025:2028] (duration: 01m 28s)
  • 15:36 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[2025:2028]
  • 15:35 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase1026 (duration: 01m 55s)
  • 15:33 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase1026
  • 15:33 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[1023:1025] (duration: 01m 57s)
  • 15:32 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 50%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57097 and previous config saved to /var/cache/conftool/dbconfig/20240219-153211-arnaudb.json
  • 15:31 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase[1023:1025]
  • 15:26 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224', diff saved to https://phabricator.wikimedia.org/P57096 and previous config saved to /var/cache/conftool/dbconfig/20240219-152634-arnaudb.json
  • 15:24 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase2024 (duration: 01m 24s)
  • 15:23 Lucas_WMDE: UTC afternoon backport+config window done
  • 15:22 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: Disable parsoid storage on restbase2024
  • 15:22 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for Increase move rate limit for extendedmovers in arwiki to 16/60 (T357229) (duration: 24m 34s)
  • 15:22 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
  • 15:22 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 30s)
  • 15:20 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 15:19 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1213.eqiad.wmnet with reason: host reimage
  • 15:19 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 54s)
  • 15:17 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 15:17 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 40%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57095 and previous config saved to /var/cache/conftool/dbconfig/20240219-151706-arnaudb.json
  • 15:15 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 55s)
  • 15:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 15:14 logmsgbot: lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and gergesshamon: Continuing with sync
  • 15:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 15:13 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 15:13 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 28s)
  • 15:12 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57094 and previous config saved to /var/cache/conftool/dbconfig/20240219-151246-root.json
  • 15:11 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 15:11 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 55s)
  • 15:11 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1224 (T357189)', diff saved to https://phabricator.wikimedia.org/P57093 and previous config saved to /var/cache/conftool/dbconfig/20240219-151127-arnaudb.json
  • 15:10 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 15:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 15:09 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 14:53 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 24s)
  • 14:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P57087 and previous config saved to /var/cache/conftool/dbconfig/20240219-145251-arnaudb.json
  • 14:51 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 14:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P57086 and previous config saved to /var/cache/conftool/dbconfig/20240219-145119-ladsgroup.json
  • 14:51 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 14:51 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 14:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P57085 and previous config saved to /var/cache/conftool/dbconfig/20240219-145057-ladsgroup.json
  • 14:49 jgiannelos@deploy2002: Finished deploy [restbase/deploy@e5ed8d0]: (no justification provided) (duration: 01m 51s)
  • 14:48 jgiannelos@deploy2002: Started deploy [restbase/deploy@e5ed8d0]: (no justification provided)
  • 14:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 20%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57084 and previous config saved to /var/cache/conftool/dbconfig/20240219-144655-arnaudb.json
  • 14:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57083 and previous config saved to /var/cache/conftool/dbconfig/20240219-144422-root.json
  • 14:42 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 50%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57082 and previous config saved to /var/cache/conftool/dbconfig/20240219-144237-root.json
  • 14:37 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P57081 and previous config saved to /var/cache/conftool/dbconfig/20240219-143744-arnaudb.json
  • 14:37 reedy@deploy2002: Finished scap: Fix casing of MediaWiki (duration: 09m 11s)
  • 14:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P57080 and previous config saved to /var/cache/conftool/dbconfig/20240219-143550-ladsgroup.json
  • 14:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3316 (re)pooling @ 10%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57079 and previous config saved to /var/cache/conftool/dbconfig/20240219-143150-arnaudb.json
  • 14:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 100%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57078 and previous config saved to /var/cache/conftool/dbconfig/20240219-143145-arnaudb.json
  • 14:29 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57077 and previous config saved to /var/cache/conftool/dbconfig/20240219-142917-root.json
  • 14:28 reedy@deploy2002: Started scap: Fix casing of MediaWiki
  • 14:27 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 25%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57076 and previous config saved to /var/cache/conftool/dbconfig/20240219-142732-root.json
  • 14:22 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T357189)', diff saved to https://phabricator.wikimedia.org/P57075 and previous config saved to /var/cache/conftool/dbconfig/20240219-142238-arnaudb.json
  • 14:20 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P57074 and previous config saved to /var/cache/conftool/dbconfig/20240219-142044-ladsgroup.json
  • 14:19 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:19 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:19 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1201 (T357189)', diff saved to https://phabricator.wikimedia.org/P57073 and previous config saved to /var/cache/conftool/dbconfig/20240219-141919-arnaudb.json
  • 14:19 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 14:19 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1201.eqiad.wmnet with reason: Maintenance
  • 14:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T357189)', diff saved to https://phabricator.wikimedia.org/P57072 and previous config saved to /var/cache/conftool/dbconfig/20240219-141858-arnaudb.json
  • 14:18 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:18 akosiaris@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:18 akosiaris@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 75%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57071 and previous config saved to /var/cache/conftool/dbconfig/20240219-141640-arnaudb.json
  • 14:14 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57070 and previous config saved to /var/cache/conftool/dbconfig/20240219-141412-root.json
  • 14:12 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 10%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57069 and previous config saved to /var/cache/conftool/dbconfig/20240219-141227-root.json
  • 14:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P57068 and previous config saved to /var/cache/conftool/dbconfig/20240219-140538-ladsgroup.json
  • 14:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P57067 and previous config saved to /var/cache/conftool/dbconfig/20240219-140351-arnaudb.json
  • 14:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 50%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57066 and previous config saved to /var/cache/conftool/dbconfig/20240219-140135-arnaudb.json
  • 13:59 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57065 and previous config saved to /var/cache/conftool/dbconfig/20240219-135907-root.json
  • 13:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1027.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1027.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1026.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1026.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:58 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1024.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:58 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1024.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:57 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1023.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:57 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1023.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:57 marostegui@cumin1002: dbctl commit (dc=all): 'es1020 (re)pooling @ 5%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57064 and previous config saved to /var/cache/conftool/dbconfig/20240219-135722-root.json
  • 13:48 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P57063 and previous config saved to /var/cache/conftool/dbconfig/20240219-134845-arnaudb.json
  • 13:48 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1020', diff saved to https://phabricator.wikimedia.org/P57062 and previous config saved to /var/cache/conftool/dbconfig/20240219-134804-root.json
  • 13:46 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 40%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57061 and previous config saved to /var/cache/conftool/dbconfig/20240219-134630-arnaudb.json
  • 13:45 marostegui@cumin1002: dbctl commit (dc=all): 'Promote es1021 to es4 primary ', diff saved to https://phabricator.wikimedia.org/P57060 and previous config saved to /var/cache/conftool/dbconfig/20240219-134551-root.json
  • 13:44 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57059 and previous config saved to /var/cache/conftool/dbconfig/20240219-134402-root.json
  • 13:43 marostegui: Starting es4 eqiad failover from es1020 to es1021 - T357904
  • 13:42 marostegui@cumin1002: dbctl commit (dc=all): 'Change weight of es1021', diff saved to https://phabricator.wikimedia.org/P57058 and previous config saved to /var/cache/conftool/dbconfig/20240219-134205-root.json
  • 13:40 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: es4 switchover T357904
  • 13:40 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: es4 switchover T357904
  • 13:38 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1021.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:37 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1021.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:36 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1020.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:35 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1020.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:33 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T357189)', diff saved to https://phabricator.wikimedia.org/P57057 and previous config saved to /var/cache/conftool/dbconfig/20240219-133339-arnaudb.json
  • 13:32 marostegui@cumin1002: dbctl commit (dc=all): 'Depool es1021', diff saved to https://phabricator.wikimedia.org/P57056 and previous config saved to /var/cache/conftool/dbconfig/20240219-133245-root.json
  • 13:31 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 30%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57055 and previous config saved to /var/cache/conftool/dbconfig/20240219-133125-arnaudb.json
  • 13:30 moritzm: installing runc security updates on buster
  • 13:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T357189)', diff saved to https://phabricator.wikimedia.org/P57054 and previous config saved to /var/cache/conftool/dbconfig/20240219-133019-arnaudb.json
  • 13:30 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 13:30 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 13:30 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57053 and previous config saved to /var/cache/conftool/dbconfig/20240219-132958-arnaudb.json
  • 13:28 marostegui@cumin1002: dbctl commit (dc=all): 'db2170 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57052 and previous config saved to /var/cache/conftool/dbconfig/20240219-132858-root.json
  • 13:26 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1025.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:26 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1025.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:17 marostegui@cumin1002: dbctl commit (dc=all): 'Add db2170 depooled', diff saved to https://phabricator.wikimedia.org/P57051 and previous config saved to /var/cache/conftool/dbconfig/20240219-131729-marostegui.json
  • 13:17 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on dbproxy1022.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:16 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on dbproxy1022.eqiad.wmnet with reason: Silence for reboot T356240
  • 13:16 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 20%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57050 and previous config saved to /var/cache/conftool/dbconfig/20240219-131620-arnaudb.json
  • 13:16 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db21170 multi-instance', diff saved to https://phabricator.wikimedia.org/P57049 and previous config saved to /var/cache/conftool/dbconfig/20240219-131609-marostegui.json
  • 13:14 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P57048 and previous config saved to /var/cache/conftool/dbconfig/20240219-131452-arnaudb.json
  • 13:12 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2170 T354826', diff saved to https://phabricator.wikimedia.org/P57047 and previous config saved to /var/cache/conftool/dbconfig/20240219-131245-root.json
  • 13:01 arnaudb@cumin1002: dbctl commit (dc=all): 'db2169:3317 (re)pooling @ 10%: Cloning to db2194 done', diff saved to https://phabricator.wikimedia.org/P57046 and previous config saved to /var/cache/conftool/dbconfig/20240219-130116-arnaudb.json
  • 12:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P57045 and previous config saved to /var/cache/conftool/dbconfig/20240219-125945-arnaudb.json
  • 12:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57044 and previous config saved to /var/cache/conftool/dbconfig/20240219-125456-root.json
  • 12:44 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57043 and previous config saved to /var/cache/conftool/dbconfig/20240219-124439-arnaudb.json
  • 12:44 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:43 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:43 hnowlan: migrating refreshLinks to k8s jobrunners
  • 12:42 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply
  • 12:42 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply
  • 12:41 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T357189)', diff saved to https://phabricator.wikimedia.org/P57042 and previous config saved to /var/cache/conftool/dbconfig/20240219-124115-arnaudb.json
  • 12:41 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 12:41 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 12:40 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57041 and previous config saved to /var/cache/conftool/dbconfig/20240219-124054-arnaudb.json
  • 12:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57040 and previous config saved to /var/cache/conftool/dbconfig/20240219-123951-root.json
  • 12:37 aborrero@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1032
  • 12:37 aborrero@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1032
  • 12:36 hnowlan@deploy2002: helmfile [codfw] [canary] DONE helmfile.d/services/mw-jobrunner : sync
  • 12:36 hnowlan@deploy2002: helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 12:36 hnowlan@deploy2002: helmfile [codfw] [canary] START helmfile.d/services/mw-jobrunner : sync
  • 12:36 hnowlan@deploy2002: helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync
  • 12:35 aborrero@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudvirt1032
  • 12:35 aborrero@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudvirt1032
  • 12:35 hnowlan@deploy2002: helmfile [eqiad] [canary] DONE helmfile.d/services/mw-jobrunner : sync
  • 12:35 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 12:35 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 12:35 hnowlan@deploy2002: helmfile [eqiad] [canary] START helmfile.d/services/mw-jobrunner : sync
  • 12:32 samtar@deploy2002: Finished scap: Backport for IS/CS: Add wmgEditRecoveryDefaultUserOptions (T350653) (duration: 10m 21s)
  • 12:25 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P57039 and previous config saved to /var/cache/conftool/dbconfig/20240219-122547-arnaudb.json
  • 12:24 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57038 and previous config saved to /var/cache/conftool/dbconfig/20240219-122446-root.json
  • 12:24 samtar@deploy2002: samtar: Continuing with sync
  • 12:23 samtar@deploy2002: samtar: Backport for IS/CS: Add wmgEditRecoveryDefaultUserOptions (T350653) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 12:21 samtar@deploy2002: Started scap: Backport for IS/CS: Add wmgEditRecoveryDefaultUserOptions (T350653)
  • 12:21 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57037 and previous config saved to /var/cache/conftool/dbconfig/20240219-122142-root.json
  • 12:19 samtar@deploy2002: backport Cancelled
  • 12:18 samtar@deploy2002: backport Cancelled
  • 12:10 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P57035 and previous config saved to /var/cache/conftool/dbconfig/20240219-121040-arnaudb.json
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57034 and previous config saved to /var/cache/conftool/dbconfig/20240219-120951-root.json
  • 12:09 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57033 and previous config saved to /var/cache/conftool/dbconfig/20240219-120941-root.json
  • 12:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57032 and previous config saved to /var/cache/conftool/dbconfig/20240219-120637-root.json
  • 12:03 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1032.eqiad.wmnet with OS bookworm
  • 11:55 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57031 and previous config saved to /var/cache/conftool/dbconfig/20240219-115534-arnaudb.json
  • 11:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57030 and previous config saved to /var/cache/conftool/dbconfig/20240219-115439-root.json
  • 11:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57029 and previous config saved to /var/cache/conftool/dbconfig/20240219-115436-root.json
  • 11:54 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57028 and previous config saved to /var/cache/conftool/dbconfig/20240219-115435-root.json
  • 11:52 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T357189)', diff saved to https://phabricator.wikimedia.org/P57027 and previous config saved to /var/cache/conftool/dbconfig/20240219-115210-arnaudb.json
  • 11:52 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 11:51 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 11:51 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T357189)', diff saved to https://phabricator.wikimedia.org/P57026 and previous config saved to /var/cache/conftool/dbconfig/20240219-115138-arnaudb.json
  • 11:51 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57025 and previous config saved to /var/cache/conftool/dbconfig/20240219-115132-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57024 and previous config saved to /var/cache/conftool/dbconfig/20240219-113934-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2138 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57023 and previous config saved to /var/cache/conftool/dbconfig/20240219-113931-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57022 and previous config saved to /var/cache/conftool/dbconfig/20240219-113931-root.json
  • 11:39 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2138 in s2 T354826', diff saved to https://phabricator.wikimedia.org/P57021 and previous config saved to /var/cache/conftool/dbconfig/20240219-113926-marostegui.json
  • 11:37 ariel@deploy2002: Finished deploy [dumps/dumps@0d1f9be]: improvements to page content history backfill script (duration: 00m 04s)
  • 11:37 ariel@deploy2002: Started deploy [dumps/dumps@0d1f9be]: improvements to page content history backfill script
  • 11:36 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P57020 and previous config saved to /var/cache/conftool/dbconfig/20240219-113632-arnaudb.json
  • 11:36 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57019 and previous config saved to /var/cache/conftool/dbconfig/20240219-113627-root.json
  • 11:36 marostegui@cumin1002: dbctl commit (dc=all): 'place db2138 in s2', diff saved to https://phabricator.wikimedia.org/P57018 and previous config saved to /var/cache/conftool/dbconfig/20240219-113622-marostegui.json
  • 11:34 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1032.eqiad.wmnet with reason: host reimage
  • 11:28 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1032.eqiad.wmnet with reason: host reimage
  • 11:24 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2138 T354826', diff saved to https://phabricator.wikimedia.org/P57017 and previous config saved to /var/cache/conftool/dbconfig/20240219-112405-root.json
  • 11:23 taavi: update cr*-codfw firewall policy for puppetmaster2003 -> puppetserver2003 rename
  • 11:23 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57016 and previous config saved to /var/cache/conftool/dbconfig/20240219-112311-root.json
  • 11:22 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57015 and previous config saved to /var/cache/conftool/dbconfig/20240219-112256-root.json
  • 11:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57014 and previous config saved to /var/cache/conftool/dbconfig/20240219-112030-root.json
  • 11:18 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P57013 and previous config saved to /var/cache/conftool/dbconfig/20240219-111819-arnaudb.json
  • 11:11 aborrero@cumin1002: START - Cookbook sre.hosts.reimage for host cloudvirt1032.eqiad.wmnet with OS bookworm
  • 11:10 claime: sudo cumin -b 20 -p 95 '*' 'run-puppet-agent -q --failed-only'
  • 11:09 claime: Running puppet on failed nodes
  • 11:08 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57012 and previous config saved to /var/cache/conftool/dbconfig/20240219-110806-root.json
  • 11:08 claime: puppetserver roll-restart done
  • 11:07 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57011 and previous config saved to /var/cache/conftool/dbconfig/20240219-110751-root.json
  • 11:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2166 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57010 and previous config saved to /var/cache/conftool/dbconfig/20240219-110525-root.json
  • 11:03 arnaudb@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T357189)', diff saved to https://phabricator.wikimedia.org/P57009 and previous config saved to /var/cache/conftool/dbconfig/20240219-110312-arnaudb.json
  • 11:00 claime: sudo cumin -s 10 -b 1 A:puppetserver 'systemctl restart puppetserver.service'
  • 11:00 claime: roll-restarting puppetserver
  • 10:59 arnaudb@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T357189)', diff saved to https://phabricator.wikimedia.org/P57008 and previous config saved to /var/cache/conftool/dbconfig/20240219-105949-arnaudb.json
  • 10:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 10:59 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 10:59 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 10:56 claime: restarting puppetserver on puppetserver1001
  • 10:54 godog: bounce thanos-query on titan1* - T356788
  • 10:53 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57007 and previous config saved to /var/cache/conftool/dbconfig/20240219-105302-root.json
  • 10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57006 and previous config saved to /var/cache/conftool/dbconfig/20240219-105246-root.json
  • 10:52 marostegui@cumin1002: dbctl commit (dc=all): 'db2137 into s4, depooled', diff saved to https://phabricator.wikimedia.org/P57005 and previous config saved to /var/cache/conftool/dbconfig/20240219-105211-marostegui.json
  • 10:48 godog: bounce thanos-query on titan2* - T356788
  • 10:45 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2137 in s4 T354826', diff saved to https://phabricator.wikimedia.org/P57004 and previous config saved to /var/cache/conftool/dbconfig/20240219-104556-marostegui.json
  • 10:39 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2137 T354826', diff saved to https://phabricator.wikimedia.org/P57002 and previous config saved to /var/cache/conftool/dbconfig/20240219-103939-root.json
  • 10:37 marostegui@cumin1002: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db2166.codfw.wmnet onto db2167.codfw.wmnet
  • 10:37 marostegui@cumin1002: dbctl commit (dc=all): 'db2167 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57001 and previous config saved to /var/cache/conftool/dbconfig/20240219-103741-root.json
  • 10:33 cgoubert@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=thanos-query,name=eqiad
  • 10:33 claime: repooling thanos-query eqiad - T356788
  • 10:26 claime: restarting thanos-query.service - titan1001 - T356788
  • 10:22 claime: restarting thanos-query.service - titan1002 - T356788
  • 10:22 cgoubert@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=thanos-query,name=eqiad
  • 10:22 claime: depooling thanos-query eqiad - T356788
  • 10:11 aborrero@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cloudvirt1032.eqiad.wmnet with reason: reimage
  • 10:11 aborrero@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cloudvirt1032.eqiad.wmnet with reason: reimage
  • 10:10 taavi@cumin1002: END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: wmcs::openstack::eqiad1::cloudweb
  • 10:10 claime: restarting thanos-query.service - titan1002 - T356788
  • 10:05 claime: restarting thanos-query.service - titan1001 - T356788
  • 10:04 claime: restarting thanos-query.service - titan1001
  • 10:02 taavi@cumin1002: START - Cookbook sre.puppet.migrate-role for role: wmcs::openstack::eqiad1::cloudweb
  • 09:59 taavi@cumin1002: conftool action : set/pooled=yes; selector: name=cloudweb1004.wikimedia.org
  • 09:55 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudweb1004.wikimedia.org with OS bullseye
  • 09:49 claime: Draining mw2442 - failed RAID - T357380
  • 09:27 taavi@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudweb1004.wikimedia.org with reason: host reimage
  • 09:24 taavi@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudweb1004.wikimedia.org with reason: host reimage
  • 09:12 taavi@cumin1002: START - Cookbook sre.hosts.reimage for host cloudweb1004.wikimedia.org with OS bullseye
  • 09:10 moritzm: installing gnutls28 security updates on bookworm
  • 09:06 taavi@cumin1002: conftool action : set/pooled=inactive; selector: name=cloudweb1004.wikimedia.org
  • 09:06 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P57000 and previous config saved to /var/cache/conftool/dbconfig/20240219-090600-root.json
  • 09:01 ladsgroup@deploy2002: Finished scap: Backport for Set fawiki to read new in pagelinks (T351237) (duration: 09m 43s)
  • 08:54 ladsgroup@deploy2002: ladsgroup: Continuing with sync
  • 08:53 ladsgroup@deploy2002: ladsgroup: Backport for Set fawiki to read new in pagelinks (T351237) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 08:51 ladsgroup@deploy2002: Started scap: Backport for Set fawiki to read new in pagelinks (T351237)
  • 08:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56999 and previous config saved to /var/cache/conftool/dbconfig/20240219-085055-root.json
  • 08:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56998 and previous config saved to /var/cache/conftool/dbconfig/20240219-083840-root.json
  • 08:35 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56997 and previous config saved to /var/cache/conftool/dbconfig/20240219-083550-root.json
  • 08:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/superset-next: apply
  • 08:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/superset-next: apply
  • 08:25 marostegui@cumin1002: START - Cookbook sre.mysql.clone of db2166.codfw.wmnet onto db2167.codfw.wmnet
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56996 and previous config saved to /var/cache/conftool/dbconfig/20240219-082336-root.json
  • 08:23 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2166 T354826', diff saved to https://phabricator.wikimedia.org/P56995 and previous config saved to /var/cache/conftool/dbconfig/20240219-082321-root.json
  • 08:23 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 08:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 08:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 08:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 08:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56994 and previous config saved to /var/cache/conftool/dbconfig/20240219-082121-root.json
  • 08:20 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56993 and previous config saved to /var/cache/conftool/dbconfig/20240219-082045-root.json
  • 08:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P56992 and previous config saved to /var/cache/conftool/dbconfig/20240219-081920-ladsgroup.json
  • 08:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 08:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 08:16 moritzm: installing runc security updates on buster
  • 08:11 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2167 in s8 T354826', diff saved to https://phabricator.wikimedia.org/P56991 and previous config saved to /var/cache/conftool/dbconfig/20240219-081132-marostegui.json
  • 08:08 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56990 and previous config saved to /var/cache/conftool/dbconfig/20240219-080831-root.json
  • 08:07 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2167 multiinstance', diff saved to https://phabricator.wikimedia.org/P56989 and previous config saved to /var/cache/conftool/dbconfig/20240219-080744-marostegui.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56988 and previous config saved to /var/cache/conftool/dbconfig/20240219-080616-root.json
  • 08:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56987 and previous config saved to /var/cache/conftool/dbconfig/20240219-080612-root.json
  • 08:05 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56986 and previous config saved to /var/cache/conftool/dbconfig/20240219-080540-root.json
  • 08:03 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2167 T354826', diff saved to https://phabricator.wikimedia.org/P56985 and previous config saved to /var/cache/conftool/dbconfig/20240219-080322-root.json
  • 07:53 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56984 and previous config saved to /var/cache/conftool/dbconfig/20240219-075325-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56983 and previous config saved to /var/cache/conftool/dbconfig/20240219-075111-root.json
  • 07:51 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56982 and previous config saved to /var/cache/conftool/dbconfig/20240219-075107-root.json
  • 07:50 marostegui@cumin1002: dbctl commit (dc=all): 'db2168 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56981 and previous config saved to /var/cache/conftool/dbconfig/20240219-075035-root.json
  • 07:46 marostegui@cumin1002: dbctl commit (dc=all): 'Place db2168 in s7 T354826', diff saved to https://phabricator.wikimedia.org/P56980 and previous config saved to /var/cache/conftool/dbconfig/20240219-074609-marostegui.json
  • 07:44 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db2168 multiinstance', diff saved to https://phabricator.wikimedia.org/P56979 and previous config saved to /var/cache/conftool/dbconfig/20240219-074450-marostegui.json
  • 07:41 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db2168 T354826', diff saved to https://phabricator.wikimedia.org/P56978 and previous config saved to /var/cache/conftool/dbconfig/20240219-074148-root.json
  • 07:38 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56977 and previous config saved to /var/cache/conftool/dbconfig/20240219-073820-root.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56976 and previous config saved to /var/cache/conftool/dbconfig/20240219-073606-root.json
  • 07:36 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56975 and previous config saved to /var/cache/conftool/dbconfig/20240219-073602-root.json
  • 07:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 100%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56974 and previous config saved to /var/cache/conftool/dbconfig/20240219-073521-root.json
  • 07:23 marostegui@cumin1002: dbctl commit (dc=all): 'db1213 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56973 and previous config saved to /var/cache/conftool/dbconfig/20240219-072315-root.json
  • 07:21 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56972 and previous config saved to /var/cache/conftool/dbconfig/20240219-072101-root.json
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56971 and previous config saved to /var/cache/conftool/dbconfig/20240219-072057-root.json
  • 07:20 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 75%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56970 and previous config saved to /var/cache/conftool/dbconfig/20240219-072016-root.json
  • 07:17 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1213 in s5 T354826', diff saved to https://phabricator.wikimedia.org/P56969 and previous config saved to /var/cache/conftool/dbconfig/20240219-071658-marostegui.json
  • 07:16 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1213 multiinstance', diff saved to https://phabricator.wikimedia.org/P56968 and previous config saved to /var/cache/conftool/dbconfig/20240219-071604-marostegui.json
  • 07:08 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1213 T354826', diff saved to https://phabricator.wikimedia.org/P56967 and previous config saved to /var/cache/conftool/dbconfig/20240219-070815-root.json
  • 07:06 marostegui@cumin1002: dbctl commit (dc=all): 'db1246 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56966 and previous config saved to /var/cache/conftool/dbconfig/20240219-070556-root.json
  • 07:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56965 and previous config saved to /var/cache/conftool/dbconfig/20240219-070552-root.json
  • 07:05 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 50%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56964 and previous config saved to /var/cache/conftool/dbconfig/20240219-070511-root.json
  • 07:02 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1246 in s2 T354826', diff saved to https://phabricator.wikimedia.org/P56963 and previous config saved to /var/cache/conftool/dbconfig/20240219-070212-marostegui.json
  • 06:58 marostegui@cumin1002: dbctl commit (dc=all): 'Remove db1246 multiinstance', diff saved to https://phabricator.wikimedia.org/P56962 and previous config saved to /var/cache/conftool/dbconfig/20240219-065848-marostegui.json
  • 06:54 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1246 T354826', diff saved to https://phabricator.wikimedia.org/P56961 and previous config saved to /var/cache/conftool/dbconfig/20240219-065456-root.json
  • 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1244 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56960 and previous config saved to /var/cache/conftool/dbconfig/20240219-065048-root.json
  • 06:50 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 25%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56959 and previous config saved to /var/cache/conftool/dbconfig/20240219-065007-root.json
  • 06:43 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1244 in s4 T354826', diff saved to https://phabricator.wikimedia.org/P56958 and previous config saved to /var/cache/conftool/dbconfig/20240219-064350-marostegui.json
  • 06:41 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1244 in s4 T354826', diff saved to https://phabricator.wikimedia.org/P56957 and previous config saved to /var/cache/conftool/dbconfig/20240219-064157-marostegui.json
  • 06:35 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 10%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56956 and previous config saved to /var/cache/conftool/dbconfig/20240219-063502-root.json
  • 06:34 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1244 T354826', diff saved to https://phabricator.wikimedia.org/P56955 and previous config saved to /var/cache/conftool/dbconfig/20240219-063457-root.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'db1170 (re)pooling @ 5%: After rearraging sections T354826', diff saved to https://phabricator.wikimedia.org/P56954 and previous config saved to /var/cache/conftool/dbconfig/20240219-061957-root.json
  • 06:19 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1170 in s7 T354826', diff saved to https://phabricator.wikimedia.org/P56953 and previous config saved to /var/cache/conftool/dbconfig/20240219-061919-marostegui.json
  • 06:17 marostegui@deploy2002: Finished scap: Backport for Revert "ProductionServices.php: Promote pc2014 to pc1 master" (duration: 19m 02s)
  • 06:15 marostegui@cumin1002: dbctl commit (dc=all): 'Place db1170 in s7 T354826', diff saved to https://phabricator.wikimedia.org/P56952 and previous config saved to /var/cache/conftool/dbconfig/20240219-061548-marostegui.json
  • 06:11 marostegui@cumin1002: dbctl commit (dc=all): 'Depool db1170 T354826', diff saved to https://phabricator.wikimedia.org/P56951 and previous config saved to /var/cache/conftool/dbconfig/20240219-061121-root.json
  • 06:08 marostegui@deploy2002: marostegui: Continuing with sync
  • 06:08 marostegui@deploy2002: marostegui: Backport for Revert "ProductionServices.php: Promote pc2014 to pc1 master" synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 05:58 marostegui@deploy2002: Started scap: Backport for Revert "ProductionServices.php: Promote pc2014 to pc1 master"
  • 05:53 marostegui@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc[2011,2014].codfw.wmnet,pc[1011,1014].eqiad.wmnet with reason: Primary switchover pc1 T356371
  • 05:53 marostegui@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on pc[2011,2014].codfw.wmnet,pc[1011,1014].eqiad.wmnet with reason: Primary switchover pc1 T356371

2024-02-18

  • 23:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P56950 and previous config saved to /var/cache/conftool/dbconfig/20240218-231102-ladsgroup.json
  • 22:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P56949 and previous config saved to /var/cache/conftool/dbconfig/20240218-225556-ladsgroup.json
  • 22:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P56948 and previous config saved to /var/cache/conftool/dbconfig/20240218-224049-ladsgroup.json
  • 22:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P56947 and previous config saved to /var/cache/conftool/dbconfig/20240218-222543-ladsgroup.json
  • 21:10 eileen: civicrm upgraded from 45a0138c to 5af300d4
  • 17:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2172 (T352010)', diff saved to https://phabricator.wikimedia.org/P56945 and previous config saved to /var/cache/conftool/dbconfig/20240218-171526-ladsgroup.json
  • 17:15 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 17:15 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 17:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P56944 and previous config saved to /var/cache/conftool/dbconfig/20240218-171502-ladsgroup.json
  • 16:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P56943 and previous config saved to /var/cache/conftool/dbconfig/20240218-165955-ladsgroup.json
  • 16:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P56942 and previous config saved to /var/cache/conftool/dbconfig/20240218-164448-ladsgroup.json
  • 16:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P56941 and previous config saved to /var/cache/conftool/dbconfig/20240218-162942-ladsgroup.json
  • 11:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2155 (T352010)', diff saved to https://phabricator.wikimedia.org/P56940 and previous config saved to /var/cache/conftool/dbconfig/20240218-111954-ladsgroup.json
  • 11:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
  • 11:19 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 11:19 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 11:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P56939 and previous config saved to /var/cache/conftool/dbconfig/20240218-111915-ladsgroup.json
  • 11:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P56938 and previous config saved to /var/cache/conftool/dbconfig/20240218-110408-ladsgroup.json
  • 10:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P56937 and previous config saved to /var/cache/conftool/dbconfig/20240218-104901-ladsgroup.json
  • 10:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P56936 and previous config saved to /var/cache/conftool/dbconfig/20240218-103355-ladsgroup.json
  • 09:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2147 (T352010)', diff saved to https://phabricator.wikimedia.org/P56935 and previous config saved to /var/cache/conftool/dbconfig/20240218-093323-ladsgroup.json
  • 09:33 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 09:33 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 09:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T352010)', diff saved to https://phabricator.wikimedia.org/P56934 and previous config saved to /var/cache/conftool/dbconfig/20240218-093301-ladsgroup.json
  • 09:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P56933 and previous config saved to /var/cache/conftool/dbconfig/20240218-091754-ladsgroup.json
  • 09:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140', diff saved to https://phabricator.wikimedia.org/P56932 and previous config saved to /var/cache/conftool/dbconfig/20240218-090248-ladsgroup.json
  • 08:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T352010)', diff saved to https://phabricator.wikimedia.org/P56931 and previous config saved to /var/cache/conftool/dbconfig/20240218-084741-ladsgroup.json
  • 03:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2140 (T352010)', diff saved to https://phabricator.wikimedia.org/P56930 and previous config saved to /var/cache/conftool/dbconfig/20240218-035542-ladsgroup.json
  • 03:55 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 03:55 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance

2024-02-17

  • 23:42 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 23:42 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 23:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56929 and previous config saved to /var/cache/conftool/dbconfig/20240217-234216-ladsgroup.json
  • 23:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P56928 and previous config saved to /var/cache/conftool/dbconfig/20240217-232709-ladsgroup.json
  • 23:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P56927 and previous config saved to /var/cache/conftool/dbconfig/20240217-231203-ladsgroup.json
  • 22:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56926 and previous config saved to /var/cache/conftool/dbconfig/20240217-225656-ladsgroup.json
  • 17:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2138:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56925 and previous config saved to /var/cache/conftool/dbconfig/20240217-175100-ladsgroup.json
  • 17:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 17:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 17:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56924 and previous config saved to /var/cache/conftool/dbconfig/20240217-175038-ladsgroup.json
  • 17:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P56923 and previous config saved to /var/cache/conftool/dbconfig/20240217-173531-ladsgroup.json
  • 17:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P56922 and previous config saved to /var/cache/conftool/dbconfig/20240217-172024-ladsgroup.json
  • 17:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56921 and previous config saved to /var/cache/conftool/dbconfig/20240217-170518-ladsgroup.json
  • 11:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2137:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56920 and previous config saved to /var/cache/conftool/dbconfig/20240217-115446-ladsgroup.json
  • 11:54 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 11:54 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 11:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P56919 and previous config saved to /var/cache/conftool/dbconfig/20240217-115422-ladsgroup.json
  • 11:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P56918 and previous config saved to /var/cache/conftool/dbconfig/20240217-113916-ladsgroup.json
  • 11:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P56917 and previous config saved to /var/cache/conftool/dbconfig/20240217-112409-ladsgroup.json
  • 11:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P56916 and previous config saved to /var/cache/conftool/dbconfig/20240217-110903-ladsgroup.json
  • 10:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2136 (T352010)', diff saved to https://phabricator.wikimedia.org/P56915 and previous config saved to /var/cache/conftool/dbconfig/20240217-100830-ladsgroup.json
  • 10:08 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 10:08 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 10:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P56914 and previous config saved to /var/cache/conftool/dbconfig/20240217-100809-ladsgroup.json
  • 09:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P56913 and previous config saved to /var/cache/conftool/dbconfig/20240217-095302-ladsgroup.json
  • 09:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P56912 and previous config saved to /var/cache/conftool/dbconfig/20240217-093755-ladsgroup.json
  • 09:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P56911 and previous config saved to /var/cache/conftool/dbconfig/20240217-092249-ladsgroup.json
  • 08:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2119 (T352010)', diff saved to https://phabricator.wikimedia.org/P56910 and previous config saved to /var/cache/conftool/dbconfig/20240217-082217-ladsgroup.json
  • 08:22 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 08:22 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 08:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P56909 and previous config saved to /var/cache/conftool/dbconfig/20240217-082155-ladsgroup.json
  • 08:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P56908 and previous config saved to /var/cache/conftool/dbconfig/20240217-080649-ladsgroup.json
  • 07:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P56907 and previous config saved to /var/cache/conftool/dbconfig/20240217-075142-ladsgroup.json
  • 07:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P56906 and previous config saved to /var/cache/conftool/dbconfig/20240217-073636-ladsgroup.json
  • 02:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2110 (T352010)', diff saved to https://phabricator.wikimedia.org/P56905 and previous config saved to /var/cache/conftool/dbconfig/20240217-022159-ladsgroup.json
  • 02:21 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 02:21 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
  • 02:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P56904 and previous config saved to /var/cache/conftool/dbconfig/20240217-022137-ladsgroup.json
  • 02:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P56903 and previous config saved to /var/cache/conftool/dbconfig/20240217-020630-ladsgroup.json
  • 01:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P56902 and previous config saved to /var/cache/conftool/dbconfig/20240217-015123-ladsgroup.json
  • 01:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P56901 and previous config saved to /var/cache/conftool/dbconfig/20240217-013617-ladsgroup.json

2024-02-16

  • 21:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:43 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:43 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:42 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:41 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 21:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:41 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:40 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2203.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:40 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2205.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:40 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2204.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:40 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2203.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:39 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2203 to codfw - jhancock@cumin2002"
  • 21:38 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2203 to codfw - jhancock@cumin2002"
  • 21:35 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 21:32 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:21 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2202.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2202.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 21:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:58 tzatziki: removing 2 files for legal compliance
  • 20:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T352010)', diff saved to https://phabricator.wikimedia.org/P56900 and previous config saved to /var/cache/conftool/dbconfig/20240216-204746-ladsgroup.json
  • 20:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 20:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P56899 and previous config saved to /var/cache/conftool/dbconfig/20240216-204709-ladsgroup.json
  • 20:38 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:38 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:35 hashar@deploy2002: Finished deploy [integration/docroot@7a9d46f]: build: Upgrade mediawiki/mediawiki-codesniffer to v43.0.0 (duration: 00m 07s)
  • 20:35 hashar@deploy2002: Started deploy [integration/docroot@7a9d46f]: build: Upgrade mediawiki/mediawiki-codesniffer to v43.0.0
  • 20:33 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 20:33 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 20:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P56898 and previous config saved to /var/cache/conftool/dbconfig/20240216-203202-ladsgroup.json
  • 20:28 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:28 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:23 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:22 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 20:17 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:17 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P56897 and previous config saved to /var/cache/conftool/dbconfig/20240216-201656-ladsgroup.json
  • 20:16 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2202.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:16 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 20:15 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 20:12 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:12 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 20:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2106 (T352010)', diff saved to https://phabricator.wikimedia.org/P56896 and previous config saved to /var/cache/conftool/dbconfig/20240216-201239-ladsgroup.json
  • 20:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 20:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 20:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P56895 and previous config saved to /var/cache/conftool/dbconfig/20240216-200149-ladsgroup.json
  • 19:56 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:55 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 19:52 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:51 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:51 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:50 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 19:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:46 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:45 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2202.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2201.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:45 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2200.mgmt.codfw.wmnet with reboot policy FORCED
  • 19:44 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:44 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2200 to codfw - jhancock@cumin2002"
  • 19:43 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2200 to codfw - jhancock@cumin2002"
  • 19:41 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 19:08 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 19:07 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 17:50 jdrewniak@deploy2002: Finished scap: Backport for dd elements should have no margin (T357742) (duration: 14m 04s)
  • 17:43 jdrewniak@deploy2002: jdrewniak and kemayo: Continuing with sync
  • 17:37 jdrewniak@deploy2002: jdrewniak and kemayo: Backport for dd elements should have no margin (T357742) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 17:36 jdrewniak@deploy2002: Started scap: Backport for dd elements should have no margin (T357742)
  • 17:09 jdrewniak@deploy2002: Finished scap: Backport for Mitigates font size issues (T357724) (duration: 10m 04s)
  • 17:02 jdrewniak@deploy2002: jdrewniak and jdlrobson: Continuing with sync
  • 17:02 jdrewniak@deploy2002: jdrewniak and jdlrobson: Backport for Mitigates font size issues (T357724) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:59 jdrewniak@deploy2002: Started scap: Backport for Mitigates font size issues (T357724)
  • 16:53 bking@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 16:53 bking@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 16:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2199.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:39 mvernon@cumin1002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[1044-1050].eqiad.wmnet
  • 16:39 mvernon@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:39 mvernon@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[1044-1050].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin1002"
  • 16:36 mvernon@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[1044-1050].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin1002"
  • 16:34 mvernon@cumin1002: START - Cookbook sre.dns.netbox
  • 16:18 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host db2199.mgmt.codfw.wmnet with reboot policy FORCED
  • 16:18 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:18 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2199 to codfw - jhancock@cumin2002"
  • 16:17 bking@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
  • 16:17 bking@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
  • 16:16 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2199 to codfw - jhancock@cumin2002"
  • 16:12 jhancock@cumin2002: START - Cookbook sre.dns.netbox
  • 16:05 mvernon@cumin1002: START - Cookbook sre.hosts.decommission for hosts ms-be[1044-1050].eqiad.wmnet
  • 16:04 ejegg: fundraising civicrm upgraded from 84ba0ccf to 45a0138c
  • 16:01 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1047
  • 15:53 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic100[1-4]* for decom hosts - bking@cumin2002 - T357780
  • 15:53 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic100[1-4]* for decom hosts - bking@cumin2002 - T357780
  • 15:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1047.eqiad.wmnet
  • 15:35 hnowlan@cumin1002: conftool action : set/weight=10:pooled=yes; selector: name=(mw1349.eqiad.wmnet|mw1367.eqiad.wmnet|mw1476.eqiad.wmnet|mw1477.eqiad.wmnet),cluster=kubernetes,service=kubesvc
  • 15:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1047.eqiad.wmnet
  • 15:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be1047.eqiad.wmnet
  • 15:28 hnowlan: running `homer 'cr*eqiad*' commit 'T351074'` for new k8s workers
  • 15:20 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be1047.eqiad.wmnet
  • 15:20 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1047
  • 15:14 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1046
  • 15:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 15:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 14:49 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1046
  • 14:47 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1046.eqiad.wmnet
  • 14:42 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1046.eqiad.wmnet
  • 14:34 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["73436010"]' | tee -a ~/T315510-enwiki
  • 14:33 Lucas_WMDE: STOP persistRevisionThreadItems.php on enwiki for T315510 again, I forgot to adjust the --start >.<
  • 14:33 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.convert-disks (exit_code=99) for host ms-be1046
  • 14:32 Lucas_WMDE: START lucaswerkmeister-wmde@mwmaint2002:~$ mwscript extensions/DiscussionTools/maintenance/persistRevisionThreadItems.php --wiki enwiki --current --all --start '["67578461"]' | tee -a ~/T315510-enwiki
  • 14:32 Lucas_WMDE: STOP persistRevisionThreadItems on enwiki for T315510 – for restart on wmf.18; last output: --start '["73436010"]'
  • 14:19 bking@cumin2002: conftool action : set/pooled=yes; selector: name=cloudelastic1005.eqiad.wmnet
  • 14:19 bking@cumin2002: conftool action : set/weight=10; selector: name=cloudelastic1005.eqiad.wmnet
  • 14:18 mvernon@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-be1046.eqiad.wmnet
  • 14:08 mvernon@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-be1046.eqiad.wmnet
  • 14:08 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 14:07 mvernon@cumin2002: START - Cookbook sre.swift.convert-disks for host ms-be1046
  • 14:07 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 14:07 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 14:06 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 14:06 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 14:06 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 13:47 hashar@deploy2002: Finished scap: Backport for Revert "Avoid creating a MWReferenceModel if not needed" (T357745) (duration: 13m 24s)
  • 13:39 hashar@deploy2002: matmarex and hashar: Continuing with sync
  • 13:37 hashar@deploy2002: matmarex and hashar: Backport for Revert "Avoid creating a MWReferenceModel if not needed" (T357745) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 13:37 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1477.eqiad.wmnet with OS bullseye
  • 13:34 hashar@deploy2002: Started scap: Backport for Revert "Avoid creating a MWReferenceModel if not needed" (T357745)
  • 13:26 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1349.eqiad.wmnet with OS bullseye
  • 13:23 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1367.eqiad.wmnet with OS bullseye
  • 13:20 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1476.eqiad.wmnet with OS bullseye
  • 13:20 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1477.eqiad.wmnet with reason: host reimage
  • 13:17 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1477.eqiad.wmnet with reason: host reimage
  • 13:07 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1349.eqiad.wmnet with reason: host reimage
  • 13:05 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1477.eqiad.wmnet with OS bullseye
  • 13:04 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1367.eqiad.wmnet with reason: host reimage
  • 13:02 hnowlan@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1476.eqiad.wmnet with reason: host reimage
  • 13:00 hnowlan@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1477.eqiad.wmnet with OS bullseye
  • 13:00 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1349.eqiad.wmnet with reason: host reimage
  • 13:00 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1367.eqiad.wmnet with reason: host reimage
  • 12:59 hnowlan@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1476.eqiad.wmnet with reason: host reimage
  • 12:47 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1477.eqiad.wmnet with OS bullseye
  • 12:46 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1476.eqiad.wmnet with OS bullseye
  • 12:46 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1367.eqiad.wmnet with OS bullseye
  • 12:46 taavi: publish docker-registry.discovery.wmnet/python3-bookworm:0.0.1
  • 12:46 hnowlan@cumin1002: START - Cookbook sre.hosts.reimage for host mw1349.eqiad.wmnet with OS bullseye
  • 12:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T352010)', diff saved to https://phabricator.wikimedia.org/P56892 and previous config saved to /var/cache/conftool/dbconfig/20240216-121416-ladsgroup.json
  • 12:14 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 12:13 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 10:58 moritzm: update bullseye/bookworm netboot images on the Puppet 7 volatile environment to the latest point releases (to bring in sync with volatile for Puppet 5) T341056
  • 10:50 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 10:50 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 10:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P56891 and previous config saved to /var/cache/conftool/dbconfig/20240216-105041-ladsgroup.json
  • 10:44 volans@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 10:44 volans@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 10:43 volans@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 10:42 volans@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1001.eqiad.wmnet
  • 10:41 hnowlan@cumin2002: conftool action : set/pooled=yes; selector: name=mw2379.codfw.wmnet
  • 10:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P56890 and previous config saved to /var/cache/conftool/dbconfig/20240216-103535-ladsgroup.json
  • 10:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P56889 and previous config saved to /var/cache/conftool/dbconfig/20240216-102028-ladsgroup.json
  • 10:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P56888 and previous config saved to /var/cache/conftool/dbconfig/20240216-100521-ladsgroup.json
  • 10:03 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2194.codfw.wmnet with reason: Silence for WE
  • 10:03 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2194.codfw.wmnet with reason: Silence for WE
  • 09:07 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1036.eqiad.wmnet with OS bullseye
  • 09:07 jclark@cumin1002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 09:06 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 08:38 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bullseye
  • 08:07 jclark@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['an-redacteddb1001']
  • 08:07 jclark@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-redacteddb1001']
  • 06:04 apergos: manually generating 7z files in parallel for wikidata full history dumps run, in screen session, owned by ariel, on snapshot1009
  • 05:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1249 (T352010)', diff saved to https://phabricator.wikimedia.org/P56887 and previous config saved to /var/cache/conftool/dbconfig/20240216-052044-ladsgroup.json
  • 05:20 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 05:20 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1249.eqiad.wmnet with reason: Maintenance
  • 05:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P56886 and previous config saved to /var/cache/conftool/dbconfig/20240216-052021-ladsgroup.json
  • 05:05 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 05:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P56885 and previous config saved to /var/cache/conftool/dbconfig/20240216-050514-ladsgroup.json
  • 05:05 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 05:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T352010)', diff saved to https://phabricator.wikimedia.org/P56884 and previous config saved to /var/cache/conftool/dbconfig/20240216-050458-ladsgroup.json
  • 04:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P56883 and previous config saved to /var/cache/conftool/dbconfig/20240216-045008-ladsgroup.json
  • 04:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P56882 and previous config saved to /var/cache/conftool/dbconfig/20240216-044952-ladsgroup.json
  • 04:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P56881 and previous config saved to /var/cache/conftool/dbconfig/20240216-043501-ladsgroup.json
  • 04:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P56880 and previous config saved to /var/cache/conftool/dbconfig/20240216-043445-ladsgroup.json
  • 04:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T352010)', diff saved to https://phabricator.wikimedia.org/P56879 and previous config saved to /var/cache/conftool/dbconfig/20240216-041938-ladsgroup.json
  • 01:26 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:ml-cache: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 01:08 htriedman@deploy2002: Finished deploy [airflow-dags/platform_eng@d93828e]: (no justification provided) (duration: 00m 28s)
  • 01:07 htriedman@deploy2002: Started deploy [airflow-dags/platform_eng@d93828e]: (no justification provided)
  • 00:49 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:ml-cache: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 00:28 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 00:27 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad plugin upgrade - ryankemper@cumin2002 - T356651
  • 00:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1248 (T352010)', diff saved to https://phabricator.wikimedia.org/P56877 and previous config saved to /var/cache/conftool/dbconfig/20240216-001636-ladsgroup.json
  • 00:16 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 00:16 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1248.eqiad.wmnet with reason: Maintenance
  • 00:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P56876 and previous config saved to /var/cache/conftool/dbconfig/20240216-001612-ladsgroup.json
  • 00:06 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 00:02 thcipriani@deploy2002: Finished scap: Backport for Connection: Correct read-only detection (T354793 T356526) (duration: 10m 28s)
  • 00:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P56875 and previous config saved to /var/cache/conftool/dbconfig/20240216-000106-ladsgroup.json

2024-02-15

  • 23:55 thcipriani@deploy2002: ebernhardson and thcipriani: Continuing with sync
  • 23:53 thcipriani@deploy2002: ebernhardson and thcipriani: Backport for Connection: Correct read-only detection (T354793 T356526) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 23:52 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 23:52 thcipriani@deploy2002: Started scap: Backport for Connection: Correct read-only detection (T354793 T356526)
  • 23:50 thcipriani@deploy2002: Finished scap: Backport for Add border-collapse to wikitable (T357589) (duration: 11m 31s)
  • 23:46 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 23:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P56874 and previous config saved to /var/cache/conftool/dbconfig/20240215-234600-ladsgroup.json
  • 23:42 thcipriani@deploy2002: thcipriani and jdlrobson: Continuing with sync
  • 23:40 thcipriani@deploy2002: thcipriani and jdlrobson: Backport for Add border-collapse to wikitable (T357589) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 23:38 thcipriani@deploy2002: Started scap: Backport for Add border-collapse to wikitable (T357589)
  • 23:33 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:33 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 ebernhardson@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
  • 23:32 ebernhardson@deploy2002: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
  • 23:31 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1005.eqiad.wmnet with reason: host reimage
  • 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P56873 and previous config saved to /var/cache/conftool/dbconfig/20240215-233053-ladsgroup.json
  • 23:28 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1005.eqiad.wmnet with reason: host reimage
  • 23:26 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 23:18 tzatziki: removing 2 files for legal compliance
  • 23:13 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 23:09 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 23:02 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 23:02 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 22:47 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 22:40 vriley@cumin1002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['an-redacteddb1001']
  • 22:40 vriley@cumin1002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['an-redacteddb1001']
  • 22:40 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1005.eqiad.wmnet with OS bullseye
  • 22:38 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host an-redacteddb1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:34 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1005
  • 22:34 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "c_f"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 22:33 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1005
  • 22:30 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:30 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1005 to private IPs - bking@cumin2002"
  • 22:29 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1005 to private IPs - bking@cumin2002"
  • 22:27 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 22:25 wfan: payments-wiki upgraded from 29eb0fff to 709d89bf
  • 22:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudelastic1005.wikimedia.org
  • 22:19 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 22:19 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1005.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 22:16 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1005.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 22:12 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 22:08 vriley@cumin1002: START - Cookbook sre.hosts.provision for host an-redacteddb1001.mgmt.eqiad.wmnet with reboot policy FORCED
  • 22:05 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic1005.wikimedia.org
  • 22:05 brennen@deploy2002: Finished scap: Backport for Filter out null external link attributes (T357668) (duration: 11m 40s)
  • 22:03 bking@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=cloudelastic1006\.eqiad\.wmnet
  • 22:00 vriley@cumin1002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 21:59 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "c_f"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 21:59 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "b_e"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 21:59 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad plugin upgrade - ryankemper@cumin2002 - T356651
  • 21:58 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 21:57 brennen@deploy2002: brennen: Continuing with sync
  • 21:56 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw plugin upgrade - ryankemper@cumin2002 - T356651
  • 21:54 brennen@deploy2002: brennen: Backport for Filter out null external link attributes (T357668) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 21:53 brennen@deploy2002: Started scap: Backport for Filter out null external link attributes (T357668)
  • 21:52 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic1005* for IP migration - bking@cumin2002 - T355617
  • 21:52 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1005* for IP migration - bking@cumin2002 - T355617
  • 21:51 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Unbanning all hosts in cloudelastic
  • 21:51 bking@cumin2002: START - Cookbook sre.elasticsearch.ban Unbanning all hosts in cloudelastic
  • 21:28 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudelastic1006.eqiad.wmnet with OS bullseye
  • 21:28 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 21:26 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "b_e"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 21:21 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.18 refs T354436
  • 21:20 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "a_c"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:47 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "a_c"} and A:aqs and A:codfw: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:46 brennen@deploy2002: Synchronized php: group1 wikis to 1.42.0-wmf.18 refs T354436 (duration: 08m 05s)
  • 20:41 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "rack3"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:38 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.42.0-wmf.18 refs T354436
  • 20:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2109 (T352010)', diff saved to https://phabricator.wikimedia.org/P56870 and previous config saved to /var/cache/conftool/dbconfig/20240215-202036-ladsgroup.json
  • 20:20 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 20:20 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
  • 20:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T352010)', diff saved to https://phabricator.wikimedia.org/P56869 and previous config saved to /var/cache/conftool/dbconfig/20240215-202014-ladsgroup.json
  • 20:08 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "rack3"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:06 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "rack2"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 20:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P56868 and previous config saved to /var/cache/conftool/dbconfig/20240215-200507-ladsgroup.json
  • 20:00 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 100%: T355866 - Post migration repool of es2024', diff saved to https://phabricator.wikimedia.org/P56867 and previous config saved to /var/cache/conftool/dbconfig/20240215-200015-arnaudb.json
  • 19:58 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - bking@cumin2002"
  • 19:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P56866 and previous config saved to /var/cache/conftool/dbconfig/20240215-195001-ladsgroup.json
  • 19:48 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw plugin upgrade - ryankemper@cumin2002 - T356651
  • 19:45 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 75%: T355866 - Post migration repool of es2024', diff saved to https://phabricator.wikimedia.org/P56865 and previous config saved to /var/cache/conftool/dbconfig/20240215-194510-arnaudb.json
  • 19:43 apergos: manually generating checksums in parallel for wikidata full history dumps run, in screen session, owned by ariel, on snapshot1009
  • 19:42 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudelastic1006.eqiad.wmnet with reason: host reimage
  • 19:39 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudelastic1006.eqiad.wmnet with reason: host reimage
  • 19:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2105 (T352010)', diff saved to https://phabricator.wikimedia.org/P56864 and previous config saved to /var/cache/conftool/dbconfig/20240215-193455-ladsgroup.json
  • 19:31 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "rack2"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 19:30 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 50%: T355866 - Post migration repool of es2024', diff saved to https://phabricator.wikimedia.org/P56863 and previous config saved to /var/cache/conftool/dbconfig/20240215-193005-arnaudb.json
  • 19:24 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1006.eqiad.wmnet with OS bullseye
  • 19:22 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.42.0-wmf.18 refs T354436
  • 19:15 arnaudb@cumin1002: dbctl commit (dc=all): 'es2024 (re)pooling @ 25%: T355866 - Post migration repool of es2024', diff saved to https://phabricator.wikimedia.org/P56862 and previous config saved to /var/cache/conftool/dbconfig/20240215-191500-arnaudb.json
  • 19:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 100%: T355866 - Post migration repool of db2122', diff saved to https://phabricator.wikimedia.org/P56861 and previous config saved to /var/cache/conftool/dbconfig/20240215-191454-arnaudb.json
  • 19:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1247 (T352010)', diff saved to https://phabricator.wikimedia.org/P56860 and previous config saved to /var/cache/conftool/dbconfig/20240215-191226-ladsgroup.json
  • 19:12 ladsgroup@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 19:12 ladsgroup@cumin1002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1247.eqiad.wmnet with reason: Maintenance
  • 19:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56859 and previous config saved to /var/cache/conftool/dbconfig/20240215-191203-ladsgroup.json
  • 19:11 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudelastic1006.eqiad.wmnet with OS bullseye
  • 19:04 brennen: train 1.42.0-wmf.18 (T354436): no current blockers, rolling to all wikis.
  • 18:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 75%: T355866 - Post migration repool of db2122', diff saved to https://phabricator.wikimedia.org/P56858 and previous config saved to /var/cache/conftool/dbconfig/20240215-185949-arnaudb.json
  • 18:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246:3314', diff saved to https://phabricator.wikimedia.org/P56857 and previous config saved to /var/cache/conftool/dbconfig/20240215-185657-ladsgroup.json
  • 18:50 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1006.eqiad.wmnet
  • 18:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 50%: T355866 - Post migration repool of db2122', diff saved to https://phabricator.wikimedia.org/P56856 and previous config saved to /var/cache/conftool/dbconfig/20240215-184444-arnaudb.json
  • 18:42 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudservices1006.eqiad.wmnet
  • 18:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246:3314', diff saved to https://phabricator.wikimedia.org/P56855 and previous config saved to /var/cache/conftool/dbconfig/20240215-184150-ladsgroup.json
  • 18:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2122 (re)pooling @ 25%: T355866 - Post migration repool of db2122', diff saved to https://phabricator.wikimedia.org/P56853 and previous config saved to /var/cache/conftool/dbconfig/20240215-182939-arnaudb.json
  • 18:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: T355866 - Post migration repool of db2105', diff saved to https://phabricator.wikimedia.org/P56852 and previous config saved to /var/cache/conftool/dbconfig/20240215-182934-arnaudb.json
  • 18:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1246:3314 (T352010)', diff saved to https://phabricator.wikimedia.org/P56850 and previous config saved to /var/cache/conftool/dbconfig/20240215-182644-ladsgroup.json
  • 18:23 bking@cumin2002: START - Cookbook sre.hosts.reimage for host cloudelastic1006.eqiad.wmnet with OS bullseye
  • 18:23 bking@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudelastic1006
  • 18:21 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
  • 18:21 bking@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudelastic1006
  • 18:21 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:20 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1006 to private IPs - bking@cumin2002"
  • 18:20 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: migrate cloudelastic1006 to private IPs - bking@cumin2002"
  • 18:18 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/toolhub: apply
  • 18:18 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 18:17 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
  • 18:17 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/toolhub: apply
  • 18:16 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
  • 18:15 bd808@deploy2002: helmfile [staging] START helmfile.d/services/toolhub: apply
  • 18:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: T355866 - Post migration repool of db2105', diff saved to https://phabricator.wikimedia.org/P56849 and previous config saved to /var/cache/conftool/dbconfig/20240215-181429-arnaudb.json
  • 18:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudelastic1006.wikimedia.org
  • 18:12 bking@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:12 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1006.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 18:11 bking@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudelastic1006.wikimedia.org decommissioned, removing all IPs except the asset tag one - bking@cumin2002"
  • 18:09 bking@cumin2002: START - Cookbook sre.dns.netbox
  • 18:02 bking@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudelastic1006.wikimedia.org
  • 17:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 50%: T355866 - Post migration repool of db2105', diff saved to https://phabricator.wikimedia.org/P56848 and previous config saved to /var/cache/conftool/dbconfig/20240215-175924-arnaudb.json
  • 17:54 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1005.eqiad.wmnet
  • 17:48 swfrench-wmf: reenabled puppet on mediawiki::webserver hosts after deploying for T357436
  • 17:47 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudservices1005.eqiad.wmnet
  • 17:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: T355866 - Post migration repool of db2105', diff saved to https://phabricator.wikimedia.org/P56847 and previous config saved to /var/cache/conftool/dbconfig/20240215-174419-arnaudb.json
  • 17:44 arnaudb@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 100%: T355866 - Post migration repool of db2156', diff saved to https://phabricator.wikimedia.org/P56846 and previous config saved to /var/cache/conftool/dbconfig/20240215-174414-arnaudb.json
  • 17:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-wikifunctions: apply
  • 17:38 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-wikifunctions: apply
  • 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-wikifunctions: apply
  • 17:37 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-wikifunctions: apply
  • 17:37 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-misc: apply
  • 17:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-misc: apply
  • 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-misc: apply
  • 17:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-misc: apply
  • 17:36 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
  • 17:35 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
  • 17:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
  • 17:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
  • 17:34 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
  • 17:33 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
  • 17:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
  • 17:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
  • 17:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
  • 17:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
  • 17:31 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
  • 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
  • 17:29 arnaudb@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 75%: T355866 - Post migration repool of db2156', diff saved to https://phabricator.wikimedia.org/P56844 and previous config saved to /var/cache/conftool/dbconfig/20240215-172909-arnaudb.json
  • 17:28 fnegri@cumin1002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet1005.eqiad.wmnet
  • 17:24 aokoth@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on vrts1002.eqiad.wmnet with reason: Migration Ongoing
  • 17:24 eevans@cumin1002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching P{P:cassandra%rack = "rack1"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 17:24 aokoth@cumin1002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on vrts1002.eqiad.wmnet with reason: Migration Ongoing
  • 17:23 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
  • 17:23 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
  • 17:22 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 17:21 fnegri@cumin1002: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
  • 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
  • 17:14 arnaudb@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 50%: T355866 - Post migration repool of db2156', diff saved to https://phabricator.wikimedia.org/P56843 and previous config saved to /var/cache/conftool/dbconfig/20240215-171403-arnaudb.json
  • 17:05 swfrench-wmf: disabling puppet shortly on mediawiki::webserver hosts to deploy T357436
  • 16:59 arnaudb@cumin1002: dbctl commit (dc=all): 'db2156 (re)pooling @ 25%: T355866 - Post migration repool of db2156', diff saved to https://phabricator.wikimedia.org/P56842 and previous config saved to /var/cache/conftool/dbconfig/20240215-165858-arnaudb.json
  • 16:58 arnaudb@cumin1002: dbctl commit (dc=all): 'db2155 (re)pooling @ 100%: T355866 - Post migration repool of db2155', diff saved to https://phabricator.wikimedia.org/P56841 and previous config saved to /var/cache/conftool/dbconfig/20240215-165853-arnaudb.json
  • 16:53 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 16:53 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 16:53 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 16:53 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 16:53 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 16:53 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 16:52 dani@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 16:51 eevans@cumin1002: START - Cookbook sre.cassandra.roll-restart for nodes matching P{P:cassandra%rack = "rack1"} and A:aqs and A:eqiad: Restart to pickup logging jars — T353550 - eevans@cumin1002
  • 16:46 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@162f72f] (sessionstore): Deploying to updated target list — T353550 (duration: 00m 15s)
  • 16:46 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@162f72f] (sessionstore): Deploying to updated target list — T353550
  • 16:46 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@162f72f] (ml-cache): Deploying to updated target list — T353550 (duration: 00m 15s)
  • 16:46 hnowlan@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw2379.codfw.wmnet with reason: BGP issues - uncordoned, needs investigation
  • 16:45 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@162f72f] (ml-cache): Deploying to updated target list — T353550
  • 16:45 hnowlan@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw2379.codfw.wmnet with reason: BGP issues - uncordoned, needs investigation
  • 16:45 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@162f72f] (cassandra-dev): Deploying to updated target list — T353550 (duration: 00m 15s)
  • 16:45 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@162f72f] (cassandra-dev): Deploying to updated target list — T353550
  • 16:43 arnaudb@cumin1002: dbctl commit (dc=all): 'db2155 (re)pooling @ 75%: T355866 - Post migration repool of db2155', diff saved to https://phabricator.wikimedia.org/P56840 and previous config saved to /var/cache/conftool/dbconfig/20240215-164348-arnaudb.json
  • 16:43 eevans@deploy2002: Finished deploy [cassandra/logstash-logback-encoder@162f72f] (aqs): Deploying to updated target list — T353550 (duration: 00m 37s)
  • 16:43 eevans@deploy2002: Started deploy [cassandra/logstash-logback-encoder@162f72f] (aqs): Deploying to updated target list — T353550
  • 16:40 dani@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 16:40 dani@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 16:40 hnowlan@cumin2002: conftool action : set/pooled=no; selector: name=mw2379.codfw.wmnet
  • 16:40 dani@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 16:40 dani@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 16:40 dani@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 16:38 dancy@deploy2002: Finished scap: Backport for Load WikimediaCampaignEvents if CampaignEvents is loaded (T347909) (duration: 13m 36s)
  • 16:30 dancy@deploy2002: mhorsey and dancy: Continuing with sync
  • 16:29 hnowlan: kubectl cordon mw2379.codfw.wmnet - bgp issues
  • 16:28 arnaudb@cumin1002: dbctl commit (dc=all): 'db2155 (re)pooling @ 50%: T355866 - Post migration repool of db2155', diff saved to https://phabricator.wikimedia.org/P56839 and previous config saved to /var/cache/conftool/dbconfig/20240215-162843-arnaudb.json
  • 16:26 dancy@deploy2002: mhorsey and dancy: Backport for Load WikimediaCampaignEvents if CampaignEvents is loaded (T347909) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 16:24 dancy@deploy2002: Started scap: Backport for Load WikimediaCampaignEvents if CampaignEvents is loaded (T347909)
  • 16:16 hnowlan@cumin2002: conftool action : set/weight=10:pooled=yes; selector: name=(mw2311.codfw.wmnet|mw2335.codfw.wmnet|mw2379.codfw.wmnet|mw2380.codfw.wmnet|mw2383.codfw.wmnet),cluster=kubernetes,service=kubesvc
  • 16:14 Daimona: Creating new DB table for the WikimediaCampaignEvents extension in x1.testwiki, x1.test2wiki, x1.officewiki, and x1.wikishared # T347909
  • 16:13 cgoubert@cumin2002: conftool action : set/pooled=yes; selector: name=(mw2302|mw2303|mw2304|mw2305|mw2306|mw2307|mw2308|mw2309|mw2426).*
  • 16:13 arnaudb@cumin1002: dbctl commit (dc=all): 'db2155 (re)pooling @ 25%: T355866 - Post migration repool of db2155', diff saved to https://phabricator.wikimedia.org/P56838 and previous config saved to /var/cache/conftool/dbconfig/20240215-161338-arnaudb.json
  • 16:13 claime: Repooling mw2302|mw2303|mw2304|mw2305|mw2306|mw2307|mw2308|mw2309|mw2426 - T355866
  • 16:13 claime: Uncordoning kubernetes2059.codfw.wmnet kubernetes2028.codfw.wmnet kubernetes2027.codfw.wmnet kubernetes2060.codfw.wmnet kubernetes2008.codfw.wmnet kubernetes2007.codfw.wmnet kubernetes2055.codfw.wmnet mw2301.codfw.wmnet mw2424.codfw.wmnet mw2425.codfw.wmnet mw2427.codfw.wmnet - T355866
  • 16:13 hnowlan@deploy2002: helmfile [eqiad] [canary] DONE helmfile.d/services/mw-jobrunner : sync
  • 16:12 hnowlan@deploy2002: helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync
  • 16:12 hnowlan@deploy2002: helmfile [eqiad] [canary] START helmfile.d/services/mw-jobrunner : sync
  • 16:12 hnowlan@deploy2002: helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync
  • 16:00 topranks: commencing move of server uplinks codfw row A6 T355866
  • 15:57 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 38 hosts with reason: Migrating servers in codfw rack A6 to lsw1-a6-codfw
  • 15:56 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on 38 hosts with reason: Migrating servers in codfw rack A6 to lsw1-a6-codfw
  • 15:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on es2028.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:54 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on es2028.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:54 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on es2027.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:53 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 0:30:00 on es2027.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:49 cmooney@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a6-codfw.mgmt with reason: prepping for server uplink migration codfw rack a6
  • 15:49 cmooney@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on asw-a-codfw,cr[1-2]-codfw,lsw1-a6-codfw.mgmt with reason: prepping for server uplink migration codfw rack a6
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on es2024.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on es2024.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2133.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2133.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2122.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2122.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:46 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2105.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2105.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2156.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2156.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2155.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: START - Cookbook sre.hosts.downtime for 1:00:00 on db2155.codfw.wmnet with reason: T355866 - Migrate servers in codfw rack A6 from asw-a6-codfw to lsw1-a6-codfw
  • 15:45 arnaudb@cumin1002: dbctl commit (dc=all): 'T355866 - db2155 db2156 db2105 db2122 db2133 es2024', diff saved to https://phabricator.wikimedia.org/P56837 and previous config saved to /var/cache/conftool/dbconfig/20240215-154520-arnaudb.json
  • 15:24 moritzm: imported openssl11 1.1.1w-0+deb11u1+wmf2 to component/haproxy26 T352744 (with fix for libssl11-dev file contents)
  • 15:15 cgoubert@cumin2002: conftool action : set/pooled=inactive; selector: name=(mw2302|mw2303|mw2304|mw2305|mw2306|mw2307|mw2308|mw2309|mw2426).*
  • 15:15 claime: Depooling mw2302|mw2303|mw2304|mw2305|mw2306|mw2307|mw2308|mw2309|mw2426 - T355866
  • 15:14 claime: Draining kubernetes2059.codfw.wmnet kubernetes2028.codfw.wmnet kubernetes2027.codfw.wmnet kubernetes2060.codfw.wmnet kubernetes2008.codfw.wmnet kubernetes2007.codfw.wmnet kubernetes2055.codfw.wmnet mw2301.codfw.wmnet mw2424.codfw.wmnet mw2425.codfw.wmnet mw2427.codfw.wmnet - T355866
  • 15:12 Lucas_WMDE: UTC afternoon backport+config window done
  • 14:47 logmsgbot: lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [ruwikiquote] Add 'suppressredirect' right to editors (T357241) (duration: 09m 26s)
  • 14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Continuing with sync
  • 14:40 logmsgbot: lucaswerkmeister-wmde@deploy2002 superpes and lucaswerkmeister-wmde: Backport for [ruwikiquote] Add 'suppressredirect' right to editors (T357241) synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
  • 14:38 logmsgbot: lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [ruwikiquote] Add 'suppressredirect' right to editors (T357241)
  • 14:37 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changepr