22:00 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase1030.eqiad.wmnet: Restart to apply new certificates (T316697) - eevans@cumin1001
21:59 mutante: etherpad (etherpad1003) - rebooting for maintenance
21:52 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase1029.eqiad.wmnet: Restart to apply new certificates (T316697) - eevans@cumin1001
21:48 eevans@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase1030.eqiad.wmnet: Restart to apply new certificates (T316697) - eevans@cumin1001
21:42 eevans@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase1029.eqiad.wmnet: Restart to apply new certificates (T316697) - eevans@cumin1001
21:42 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase1028.eqiad.wmnet: Restart to apply new certificates (T316697) - eevans@cumin1001
21:42 ebernhardson: run search index creation for pcmwiki
21:41 ebernhardson: run search index creation for bjnwiktionary
21:40 ebernhardson: run search index creation for guwwiktionary
21:32 eevans@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase1028.eqiad.wmnet: Restart to apply new certificates (T316697) - eevans@cumin1001
21:30 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on etherpad1003.eqiad.wmnet with reason: kernel upgrade
21:30 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on etherpad1003.eqiad.wmnet with reason: kernel upgrade
21:30 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on etherpad1003.eqiad.wmnet with reason: kernel upgrade
21:29 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on etherpad1003.eqiad.wmnet with reason: kernel upgrade
19:56 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab2003.wikimedia.org with OS bullseye
19:41 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
19:37 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
19:30 ryankemper: T316719 Rolling upgrade operation complete; all of elastic codfw is now on `7.10.2`. Next week our related cirrus changes will go out with the mediawiki deploy train in `1.39.0-wmf.28`
19:21 ryankemper@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719
19:21 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
19:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T314041)', diff saved to https://phabricator.wikimedia.org/P33725 and previous config saved to /var/cache/conftool/dbconfig/20220831-192120-ladsgroup.json
19:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
19:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
19:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
19:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 (T314041)', diff saved to https://phabricator.wikimedia.org/P33724 and previous config saved to /var/cache/conftool/dbconfig/20220831-192032-ladsgroup.json
19:18 dzahn@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host gitlab2003.wikimedia.org with OS bullseye
19:16 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
19:15 mutante: gitlab: reimaging gitlab2003 with cookbook after reverting partman change and comment on gerrit:827578 T274463
19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P33723 and previous config saved to /var/cache/conftool/dbconfig/20220831-190526-ladsgroup.json
18:56 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719
18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P33722 and previous config saved to /var/cache/conftool/dbconfig/20220831-185020-ladsgroup.json
18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 (T314041)', diff saved to https://phabricator.wikimedia.org/P33721 and previous config saved to /var/cache/conftool/dbconfig/20220831-183513-ladsgroup.json
16:00 volans: uploaded spicerack_3.2.1 to apt.wikimedia.org bullseye-wikimedia
15:57 _joe_: updated php 7.4 in all of production T316691
15:55 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6016.drmrs.wmnet with OS buster
15:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on ganeti2015.codfw.wmnet with reason: Remove node for eventual reimage, T311686
15:54 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on ganeti2015.codfw.wmnet with reason: Remove node for eventual reimage, T311686
15:31 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
15:27 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
15:07 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp6016.drmrs.wmnet with OS buster
15:01 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-serve-ctrl2001.codfw.wmnet
15:01 klausman@cumin1001: START - Cookbook sre.hosts.remove-downtime for ml-serve-ctrl2001.codfw.wmnet
15:00 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6008.drmrs.wmnet with OS buster
14:56 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ml-serve-ctrl2001.codfw.wmnet with reason: Reboot to pick up kernel 5.10.136 (T316185)
14:56 klausman@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on ml-serve-ctrl2001.codfw.wmnet with reason: Reboot to pick up kernel 5.10.136 (T316185)
14:56 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-serve-ctrl2002.codfw.wmnet
14:56 klausman@cumin1001: START - Cookbook sre.hosts.remove-downtime for ml-serve-ctrl2002.codfw.wmnet
14:50 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ml-serve-ctrl2002.codfw.wmnet with reason: Reboot to pick up kernel 5.10.136 (T316185)
14:50 klausman@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on ml-serve-ctrl2002.codfw.wmnet with reason: Reboot to pick up kernel 5.10.136 (T316185)
13:07 klausman@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
13:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3002.esams.wmnet
12:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3002.esams.wmnet
12:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2002.codfw.wmnet
12:57 vgutierrez: test trafficserver: Hide non session cookies during cache lookup in drmrs - T316338
12:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2002.codfw.wmnet
12:39 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-staging-ctrl2002.codfw.wmnet
12:39 klausman@cumin1001: START - Cookbook sre.hosts.remove-downtime for ml-staging-ctrl2002.codfw.wmnet
12:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
12:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
12:34 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ml-staging-ctrl2002.codfw.wmnet with reason: Reboot to pick up kernel 5.10.136 (T316185)
12:34 klausman@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on ml-staging-ctrl2002.codfw.wmnet with reason: Reboot to pick up kernel 5.10.136 (T316185)
12:33 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-staging-ctrl2001.codfw.wmnet
12:33 klausman@cumin1001: START - Cookbook sre.hosts.remove-downtime for ml-staging-ctrl2001.codfw.wmnet
12:28 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ml-staging-ctrl2001.codfw.wmnet with reason: Reboot to pick up kernel 5.10.136 (T316185)
12:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1002.eqiad.wmnet
12:28 klausman@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on ml-staging-ctrl2001.codfw.wmnet with reason: Reboot to pick up kernel 5.10.136 (T316185)
12:27 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2002.codfw.wmnet
12:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1002.eqiad.wmnet
12:17 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2002.codfw.wmnet
12:16 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
12:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2002.codfw.wmnet
12:06 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
12:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2002.codfw.wmnet
11:58 marostegui: Reboot sanitarium hosts, lag will appear on clouddb* hosts
11:49 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host webperf1004.eqiad.wmnet
11:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1004.eqiad.wmnet
11:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet
11:27 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on gitlab1004.wikimedia.org with reason: upgrade gitlab1004 to new version
11:27 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 0:20:00 on gitlab1004.wikimedia.org with reason: upgrade gitlab1004 to new version
11:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet
11:22 moritzm: draining ganeti2015 for eventual reimage T311686
11:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2004.codfw.wmnet
11:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf2004.codfw.wmnet
11:04 vgutierrez: test trafficserver: Hide non session cookies during cache lookup in cp6016 - T316338
11:00 _joe_: updating php 7.4 packages in wikimedia/buster T316601
10:42 _joe_: updating php 7.4 on mwdebug1002 to test the new patched packages T316601
10:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33715 and previous config saved to /var/cache/conftool/dbconfig/20220831-100853-root.json
10:06 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host parse1002.eqiad.wmnet with OS buster
09:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33714 and previous config saved to /var/cache/conftool/dbconfig/20220831-095348-root.json
09:51 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe2003.codfw.wmnet
09:44 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-fe2003.codfw.wmnet
09:44 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe2002.codfw.wmnet
09:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33713 and previous config saved to /var/cache/conftool/dbconfig/20220831-093844-root.json
09:37 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-fe2002.codfw.wmnet
09:34 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-fe2001.codfw.wmnet
09:33 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse1002.eqiad.wmnet with reason: host reimage
09:29 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on parse1002.eqiad.wmnet with reason: host reimage
09:27 moritzm: installing docker.io bugfix updates from Bullseye point release
09:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33712 and previous config saved to /var/cache/conftool/dbconfig/20220831-092339-root.json
09:22 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-fe2001.codfw.wmnet
09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet
09:17 cgoubert@cumin1001: START - Cookbook sre.hosts.reimage for host parse1002.eqiad.wmnet with OS buster
09:17 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe1003.eqiad.wmnet
09:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet
09:11 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-fe1003.eqiad.wmnet
09:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33711 and previous config saved to /var/cache/conftool/dbconfig/20220831-090834-root.json
08:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33710 and previous config saved to /var/cache/conftool/dbconfig/20220831-085329-root.json
08:51 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe1002.eqiad.wmnet
08:43 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-fe1002.eqiad.wmnet
08:39 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe1001.eqiad.wmnet
08:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33709 and previous config saved to /var/cache/conftool/dbconfig/20220831-083824-root.json
08:32 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-fe1001.eqiad.wmnet
08:28 moritzm: upgrading ganeti2016/ganeti2018 to 3.0.2 T312637
08:28 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 24 hosts with reason: Downtiming php7.4 parsoid servers until they are ready to pool
08:27 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 24 hosts with reason: Downtiming php7.4 parsoid servers until they are ready to pool
08:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33708 and previous config saved to /var/cache/conftool/dbconfig/20220831-082319-root.json
08:20 vgutierrez: end test trafficserver: Hide non session cookies during cache lookup in cp6016 - T316338
08:12 vgutierrez: test trafficserver: Hide non session cookies during cache lookup in cp6016 - T316338
08:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33707 and previous config saved to /var/cache/conftool/dbconfig/20220831-080815-root.json
07:54 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host prometheus2006.codfw.wmnet
07:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33706 and previous config saved to /var/cache/conftool/dbconfig/20220831-075310-root.json
07:51 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2022.codfw.wmnet to cluster codfw and group B
07:50 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet
07:50 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2022.codfw.wmnet to cluster codfw and group B
07:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1120 for upgrade', diff saved to https://phabricator.wikimedia.org/P33705 and previous config saved to /var/cache/conftool/dbconfig/20220831-074748-root.json
07:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2022.codfw.wmnet
07:40 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet
07:39 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host prometheus2006.codfw.wmnet
07:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2022.codfw.wmnet
07:15 godog: bounce thanos-compact on thanos-fe2001
05:00 marostegui: Failover m3 from db1183 to db1159 - T316506
04:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2132,2160].codfw.wmnet,db[1117,1195].eqiad.wmnet with reason: switchover m1 T316506
04:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2132,2160].codfw.wmnet,db[1117,1195].eqiad.wmnet with reason: switchover m1 T316506
03:23 ryankemper@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719
03:23 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719
03:17 ryankemper@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719
02:50 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719
02:49 ryankemper@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719
00:15 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719
00:14 ryankemper: T316719 First elastic host upgraded properly. Cancelling cookbook to kick off a new rolling upgrade that will go 3 nodes at a time (first run was just one host as a sanity check)
00:14 ryankemper@cumin2002: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719
00:08 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719
16:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T316186)', diff saved to https://phabricator.wikimedia.org/P33701 and previous config saved to /var/cache/conftool/dbconfig/20220830-163619-ladsgroup.json
16:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2022.codfw.wmnet with OS bullseye
16:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1010.eqiad.wmnet
16:29 dduvall@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.27 refs T314188
16:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host stat1010.eqiad.wmnet
16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P33700 and previous config saved to /var/cache/conftool/dbconfig/20220830-162113-ladsgroup.json
16:20 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1005.eqiad.wmnet
16:12 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1005.eqiad.wmnet
16:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P33699 and previous config saved to /var/cache/conftool/dbconfig/20220830-160607-ladsgroup.json
15:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T316186)', diff saved to https://phabricator.wikimedia.org/P33698 and previous config saved to /var/cache/conftool/dbconfig/20220830-155101-ladsgroup.json
15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 (T316186)', diff saved to https://phabricator.wikimedia.org/P33697 and previous config saved to /var/cache/conftool/dbconfig/20220830-154337-ladsgroup.json
15:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
15:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T316186)', diff saved to https://phabricator.wikimedia.org/P33696 and previous config saved to /var/cache/conftool/dbconfig/20220830-154314-ladsgroup.json
15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P33695 and previous config saved to /var/cache/conftool/dbconfig/20220830-152807-ladsgroup.json
15:25 vgutierrez: restarting ats in cp6008
15:25 vgutierrez: restarting ats in cp6007
15:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stat1009.eqiad.wmnet
15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on ganeti2022.codfw.wmnet with reason: Remove node for eventual reimage, T311686
15:17 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on ganeti2022.codfw.wmnet with reason: Remove node for eventual reimage, T311686
15:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host stat1009.eqiad.wmnet
15:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2002.codfw.wmnet
15:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P33694 and previous config saved to /var/cache/conftool/dbconfig/20220830-151301-ladsgroup.json
15:10 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
15:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2002.codfw.wmnet
15:09 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
15:07 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
15:06 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
15:06 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
15:05 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
14:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T316186)', diff saved to https://phabricator.wikimedia.org/P33693 and previous config saved to /var/cache/conftool/dbconfig/20220830-145755-ladsgroup.json
14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T316186)', diff saved to https://phabricator.wikimedia.org/P33691 and previous config saved to /var/cache/conftool/dbconfig/20220830-145035-ladsgroup.json
14:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1147.eqiad.wmnet with reason: Maintenance
14:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1147.eqiad.wmnet with reason: Maintenance
14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T316186)', diff saved to https://phabricator.wikimedia.org/P33690 and previous config saved to /var/cache/conftool/dbconfig/20220830-145011-ladsgroup.json
14:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P33689 and previous config saved to /var/cache/conftool/dbconfig/20220830-143505-ladsgroup.json
14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P33688 and previous config saved to /var/cache/conftool/dbconfig/20220830-141959-ladsgroup.json
14:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T316186)', diff saved to https://phabricator.wikimedia.org/P33687 and previous config saved to /var/cache/conftool/dbconfig/20220830-140452-ladsgroup.json
13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T316186)', diff saved to https://phabricator.wikimedia.org/P33686 and previous config saved to /var/cache/conftool/dbconfig/20220830-135733-ladsgroup.json
13:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
13:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
13:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T316186)', diff saved to https://phabricator.wikimedia.org/P33685 and previous config saved to /var/cache/conftool/dbconfig/20220830-135658-ladsgroup.json
13:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P33684 and previous config saved to /var/cache/conftool/dbconfig/20220830-134152-ladsgroup.json
13:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P33683 and previous config saved to /var/cache/conftool/dbconfig/20220830-132646-ladsgroup.json
13:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T316186)', diff saved to https://phabricator.wikimedia.org/P33682 and previous config saved to /var/cache/conftool/dbconfig/20220830-131140-ladsgroup.json
13:11 moritzm: installing libxslt security updates for stretch
13:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T316186)', diff saved to https://phabricator.wikimedia.org/P33681 and previous config saved to /var/cache/conftool/dbconfig/20220830-130521-ladsgroup.json
13:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1142.eqiad.wmnet with reason: Maintenance
13:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1142.eqiad.wmnet with reason: Maintenance
13:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T316186)', diff saved to https://phabricator.wikimedia.org/P33680 and previous config saved to /var/cache/conftool/dbconfig/20220830-130457-ladsgroup.json
12:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P33679 and previous config saved to /var/cache/conftool/dbconfig/20220830-124951-ladsgroup.json
12:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P33678 and previous config saved to /var/cache/conftool/dbconfig/20220830-123445-ladsgroup.json
12:31 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host graphite1004.eqiad.wmnet
12:20 godog: rollback and reboot graphite1004 with linux-image-5.10.0-16-amd64
12:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T316186)', diff saved to https://phabricator.wikimedia.org/P33677 and previous config saved to /var/cache/conftool/dbconfig/20220830-121938-ladsgroup.json
12:19 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host graphite1004.eqiad.wmnet
12:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T316186)', diff saved to https://phabricator.wikimedia.org/P33676 and previous config saved to /var/cache/conftool/dbconfig/20220830-121421-ladsgroup.json
12:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: Maintenance
12:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: Maintenance
12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T316186)', diff saved to https://phabricator.wikimedia.org/P33675 and previous config saved to /var/cache/conftool/dbconfig/20220830-121357-ladsgroup.json
12:04 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host graphite1004.eqiad.wmnet
12:01 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet
11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P33674 and previous config saved to /var/cache/conftool/dbconfig/20220830-115851-ladsgroup.json
11:53 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
11:52 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host prometheus2005.codfw.wmnet
11:52 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host graphite1004.eqiad.wmnet
11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P33673 and previous config saved to /var/cache/conftool/dbconfig/20220830-114345-ladsgroup.json
11:43 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
11:36 moritzm: uploaded libxslt 1.1.29-2.1+deb9u2+wmf1 to apt.wikimedia.org
11:32 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host idp1002.wikimedia.org
11:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T316186)', diff saved to https://phabricator.wikimedia.org/P33672 and previous config saved to /var/cache/conftool/dbconfig/20220830-112838-ladsgroup.json
11:24 volans@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
11:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp1002.wikimedia.org
11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T316186)', diff saved to https://phabricator.wikimedia.org/P33671 and previous config saved to /var/cache/conftool/dbconfig/20220830-112117-ladsgroup.json
11:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
11:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
11:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 (T316186)', diff saved to https://phabricator.wikimedia.org/P33670 and previous config saved to /var/cache/conftool/dbconfig/20220830-112048-ladsgroup.json
11:07 volans@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
11:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P33669 and previous config saved to /var/cache/conftool/dbconfig/20220830-110542-ladsgroup.json
11:04 volans@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
10:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P33668 and previous config saved to /var/cache/conftool/dbconfig/20220830-105036-ladsgroup.json
10:50 volans@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
10:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling db2096 after maint', diff saved to https://phabricator.wikimedia.org/P33667 and previous config saved to /var/cache/conftool/dbconfig/20220830-104616-ladsgroup.json
10:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2096.codfw.wmnet with reason: Maintenance
10:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2096.codfw.wmnet with reason: Maintenance
10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 (T316186)', diff saved to https://phabricator.wikimedia.org/P33666 and previous config saved to /var/cache/conftool/dbconfig/20220830-103530-ladsgroup.json
10:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1138 (T316186)', diff saved to https://phabricator.wikimedia.org/P33665 and previous config saved to /var/cache/conftool/dbconfig/20220830-103012-ladsgroup.json
10:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1138.eqiad.wmnet with reason: Maintenance
10:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1138.eqiad.wmnet with reason: Maintenance
10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling db1167 (T316186)', diff saved to https://phabricator.wikimedia.org/P33664 and previous config saved to /var/cache/conftool/dbconfig/20220830-102342-ladsgroup.json
10:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T316186)', diff saved to https://phabricator.wikimedia.org/P33663 and previous config saved to /var/cache/conftool/dbconfig/20220830-102220-ladsgroup.json
10:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
10:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
10:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
10:21 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1005.eqiad.wmnet
10:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
10:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch instance to plain disks, T311686
10:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch instance to plain disks, T311686
10:11 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet
10:08 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite2003.codfw.wmnet
09:53 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host centrallog2002.codfw.wmnet
09:53 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host centrallog2002.codfw.wmnet
09:51 dcausse@deploy1002: Started deploy [wikimedia/discovery/analytics@ff76338]: Add sd-alerts notifications to image_suggestions_weekly
09:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch instance to DRBD, T311686
09:51 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch instance to DRBD, T311686
08:58 moritzm: upgrading ganeti2010,ganeti2012,ganeti2024 to 3.0.2 T312637
08:53 moritzm: failover Ganeti master in codfw to ganeti2020 T311686
08:49 marostegui@cumin1001: dbctl commit (dc=all): 'Give some weight to current x1 codfw master', diff saved to https://phabricator.wikimedia.org/P33661 and previous config saved to /var/cache/conftool/dbconfig/20220830-084945-root.json
08:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2096.codfw.wmnet with reason: Maintenance
08:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2096.codfw.wmnet with reason: Maintenance
08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2096 T316522', diff saved to https://phabricator.wikimedia.org/P33660 and previous config saved to /var/cache/conftool/dbconfig/20220830-083845-root.json
08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2115 to x1 codfw primary T316522', diff saved to https://phabricator.wikimedia.org/P33659 and previous config saved to /var/cache/conftool/dbconfig/20220830-083654-root.json
08:36 marostegui: Starting x1 codfw failover from db2096 to db2115 - T316522
08:31 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2115 with weight 0 T316522', diff saved to https://phabricator.wikimedia.org/P33658 and previous config saved to /var/cache/conftool/dbconfig/20220830-083103-root.json
08:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: DC switchover x1 T316522
08:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: DC switchover x1 T316522
06:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T316186)', diff saved to https://phabricator.wikimedia.org/P33657 and previous config saved to /var/cache/conftool/dbconfig/20220830-063332-ladsgroup.json
06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2145 (T316186)', diff saved to https://phabricator.wikimedia.org/P33656 and previous config saved to /var/cache/conftool/dbconfig/20220830-062613-ladsgroup.json
06:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
06:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
06:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T316186)', diff saved to https://phabricator.wikimedia.org/P33655 and previous config saved to /var/cache/conftool/dbconfig/20220830-062547-ladsgroup.json
06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2153 (T316186)', diff saved to https://phabricator.wikimedia.org/P33654 and previous config saved to /var/cache/conftool/dbconfig/20220830-061926-ladsgroup.json
06:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
06:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T316186)', diff saved to https://phabricator.wikimedia.org/P33653 and previous config saved to /var/cache/conftool/dbconfig/20220830-061901-ladsgroup.json
06:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1173.eqiad.wmnet with reason: Maintenance
06:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1173.eqiad.wmnet with reason: Maintenance
06:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2176 (T316186)', diff saved to https://phabricator.wikimedia.org/P33652 and previous config saved to /var/cache/conftool/dbconfig/20220830-061243-ladsgroup.json
06:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
06:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance
06:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2112 (T316186)', diff saved to https://phabricator.wikimedia.org/P33651 and previous config saved to /var/cache/conftool/dbconfig/20220830-061218-ladsgroup.json
06:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
06:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
06:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2112 (T316186)', diff saved to https://phabricator.wikimedia.org/P33650 and previous config saved to /var/cache/conftool/dbconfig/20220830-060554-ladsgroup.json
06:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2112.codfw.wmnet with reason: Maintenance
06:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2112.codfw.wmnet with reason: Maintenance
06:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T316186)', diff saved to https://phabricator.wikimedia.org/P33648 and previous config saved to /var/cache/conftool/dbconfig/20220830-060509-ladsgroup.json
06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1131 to s6 primary and set section read-write T316110', diff saved to https://phabricator.wikimedia.org/P33647 and previous config saved to /var/cache/conftool/dbconfig/20220830-060109-ladsgroup.json
06:00 Amir1: Starting s6 eqiad failover from db1173 to db1131 - T316110
05:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2116 (T316186)', diff saved to https://phabricator.wikimedia.org/P33645 and previous config saved to /var/cache/conftool/dbconfig/20220830-055948-ladsgroup.json
05:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
05:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
05:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
05:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T316186)', diff saved to https://phabricator.wikimedia.org/P33644 and previous config saved to /var/cache/conftool/dbconfig/20220830-055555-ladsgroup.json
05:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2130 (T316186)', diff saved to https://phabricator.wikimedia.org/P33643 and previous config saved to /var/cache/conftool/dbconfig/20220830-054924-ladsgroup.json
05:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
05:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
05:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T316186)', diff saved to https://phabricator.wikimedia.org/P33642 and previous config saved to /var/cache/conftool/dbconfig/20220830-054859-ladsgroup.json
05:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2174 (T316186)', diff saved to https://phabricator.wikimedia.org/P33641 and previous config saved to /var/cache/conftool/dbconfig/20220830-054242-ladsgroup.json
05:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
05:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance
05:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T316186)', diff saved to https://phabricator.wikimedia.org/P33640 and previous config saved to /var/cache/conftool/dbconfig/20220830-054217-ladsgroup.json
05:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2173 (T316186)', diff saved to https://phabricator.wikimedia.org/P33639 and previous config saved to /var/cache/conftool/dbconfig/20220830-053559-ladsgroup.json
05:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
05:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
05:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
05:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
05:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T316186)', diff saved to https://phabricator.wikimedia.org/P33638 and previous config saved to /var/cache/conftool/dbconfig/20220830-053529-ladsgroup.json
05:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1143.eqiad.wmnet with reason: Maintenance
05:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1143.eqiad.wmnet with reason: Maintenance
05:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T316186)', diff saved to https://phabricator.wikimedia.org/P33637 and previous config saved to /var/cache/conftool/dbconfig/20220830-052930-ladsgroup.json
05:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: Maintenance
05:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: Maintenance
05:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1131 with weight 0 T316110', diff saved to https://phabricator.wikimedia.org/P33636 and previous config saved to /var/cache/conftool/dbconfig/20220830-051106-ladsgroup.json
05:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s6 T316110
05:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 25 hosts with reason: Primary switchover s6 T316110
05:03 ryankemper: T306899 T316496 Deployed WCQS `0.3.115`. That should (hopefully) resolve these tickets.
05:01 ryankemper: [WCQS Deploy] Restarted `wcqs-updater` across all hosts: `sudo -E cumin 'A:wcqs-public' 'systemctl restart wcqs-updater'`
22:41 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@57fb704]: force re-deploy HEAD to attempt to get artifacts directory populated on an-airflow1001 (duration: 02m 01s)
22:39 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@57fb704]: force re-deploy HEAD to attempt to get artifacts directory populated on an-airflow1001
22:13 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@57fb704]: re-deploy HEAD to attempt to get artifacts directory populated on an-airflow1001 (duration: 00m 04s)
22:13 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@57fb704]: re-deploy HEAD to attempt to get artifacts directory populated on an-airflow1001
21:53 tgr@deploy1002: Synchronized static/images/project-logos: Config: Adjust width-height ratio of logos for bewikisource, euwikisource, cswikisource to fix display issue (T310961) (duration: 03m 59s)
21:48 tgr@deploy1002: Synchronized wmf-config/logos.php: Config: Adjust width-height ratio of logos for bewikisource, euwikisource, cswikisource to fix display issue (T310961) (duration: 03m 34s)
21:44 tgr@deploy1002: Synchronized logos/config.yaml: Config: Adjust width-height ratio of logos for bewikisource, euwikisource, cswikisource to fix display issue (T310961) (duration: 03m 45s)
20:04 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@57fb704]: Deploy mjolnir 1.1 for elasticsearch 7.x compatibility
19:34 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@5c0af35]: Update to work with elasticsearch 7.x (duration: 00m 54s)
19:33 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@5c0af35]: Update to work with elasticsearch 7.x
19:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P33634 and previous config saved to /var/cache/conftool/dbconfig/20220829-192608-ladsgroup.json
19:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P33633 and previous config saved to /var/cache/conftool/dbconfig/20220829-191950-ladsgroup.json
19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T316186)', diff saved to https://phabricator.wikimedia.org/P33632 and previous config saved to /var/cache/conftool/dbconfig/20220829-190444-ladsgroup.json
18:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1128 (T316186)', diff saved to https://phabricator.wikimedia.org/P33631 and previous config saved to /var/cache/conftool/dbconfig/20220829-185723-ladsgroup.json
18:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1128.eqiad.wmnet with reason: Maintenance
18:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1128.eqiad.wmnet with reason: Maintenance
18:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T316186)', diff saved to https://phabricator.wikimedia.org/P33630 and previous config saved to /var/cache/conftool/dbconfig/20220829-185659-ladsgroup.json
18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P33629 and previous config saved to /var/cache/conftool/dbconfig/20220829-184153-ladsgroup.json
18:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P33628 and previous config saved to /var/cache/conftool/dbconfig/20220829-182646-ladsgroup.json
18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T316186)', diff saved to https://phabricator.wikimedia.org/P33627 and previous config saved to /var/cache/conftool/dbconfig/20220829-181140-ladsgroup.json
18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T316186)', diff saved to https://phabricator.wikimedia.org/P33626 and previous config saved to /var/cache/conftool/dbconfig/20220829-180421-ladsgroup.json
18:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
18:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
18:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T316186)', diff saved to https://phabricator.wikimedia.org/P33625 and previous config saved to /var/cache/conftool/dbconfig/20220829-180358-ladsgroup.json
17:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P33624 and previous config saved to /var/cache/conftool/dbconfig/20220829-174851-ladsgroup.json
17:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P33623 and previous config saved to /var/cache/conftool/dbconfig/20220829-173345-ladsgroup.json
17:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T316186)', diff saved to https://phabricator.wikimedia.org/P33622 and previous config saved to /var/cache/conftool/dbconfig/20220829-171839-ladsgroup.json
17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T316186)', diff saved to https://phabricator.wikimedia.org/P33621 and previous config saved to /var/cache/conftool/dbconfig/20220829-171116-ladsgroup.json
17:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
17:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
17:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
17:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
17:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33620 and previous config saved to /var/cache/conftool/dbconfig/20220829-171035-ladsgroup.json
17:03 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on restbase[1031-1033].eqiad.wmnet with reason: New hosts - awaiting cassandra joins
17:03 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on restbase[1031-1033].eqiad.wmnet with reason: New hosts - awaiting cassandra joins
16:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P33619 and previous config saved to /var/cache/conftool/dbconfig/20220829-165529-ladsgroup.json
16:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P33618 and previous config saved to /var/cache/conftool/dbconfig/20220829-164022-ladsgroup.json
16:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33617 and previous config saved to /var/cache/conftool/dbconfig/20220829-162516-ladsgroup.json
16:24 claime: repooled wtp1034.eqiad.wmnet and depooled parse1001.eqiad.wmnet
16:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33616 and previous config saved to /var/cache/conftool/dbconfig/20220829-161959-ladsgroup.json
16:05 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1033.eqiad.wmnet with OS buster
16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P33615 and previous config saved to /var/cache/conftool/dbconfig/20220829-160452-ladsgroup.json
15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P33614 and previous config saved to /var/cache/conftool/dbconfig/20220829-154946-ladsgroup.json
15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33613 and previous config saved to /var/cache/conftool/dbconfig/20220829-153440-ladsgroup.json
15:31 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1033.eqiad.wmnet with reason: host reimage
15:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33612 and previous config saved to /var/cache/conftool/dbconfig/20220829-152741-ladsgroup.json
15:27 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1033.eqiad.wmnet with reason: host reimage
15:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33611 and previous config saved to /var/cache/conftool/dbconfig/20220829-152612-ladsgroup.json
15:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
15:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T316186)', diff saved to https://phabricator.wikimedia.org/P33610 and previous config saved to /var/cache/conftool/dbconfig/20220829-152549-ladsgroup.json
15:14 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1033.eqiad.wmnet with OS buster
15:13 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1032.eqiad.wmnet with OS buster
15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P33609 and previous config saved to /var/cache/conftool/dbconfig/20220829-151042-ladsgroup.json
14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P33608 and previous config saved to /var/cache/conftool/dbconfig/20220829-145536-ladsgroup.json
14:43 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1032.eqiad.wmnet with reason: host reimage
14:41 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on restbase1031.eqiad.wmnet with reason: New host
14:41 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on restbase1031.eqiad.wmnet with reason: New host
14:40 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1032.eqiad.wmnet with reason: host reimage
14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T316186)', diff saved to https://phabricator.wikimedia.org/P33607 and previous config saved to /var/cache/conftool/dbconfig/20220829-144030-ladsgroup.json
14:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T316186)', diff saved to https://phabricator.wikimedia.org/P33606 and previous config saved to /var/cache/conftool/dbconfig/20220829-143319-ladsgroup.json
14:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
14:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
14:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T316186)', diff saved to https://phabricator.wikimedia.org/P33605 and previous config saved to /var/cache/conftool/dbconfig/20220829-143255-ladsgroup.json
14:28 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1032.eqiad.wmnet with OS buster
14:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P33604 and previous config saved to /var/cache/conftool/dbconfig/20220829-141749-ladsgroup.json
14:06 Lucas_WMDE: UTC afternoon backport+config window done
14:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P33603 and previous config saved to /var/cache/conftool/dbconfig/20220829-140243-ladsgroup.json
13:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T316186)', diff saved to https://phabricator.wikimedia.org/P33602 and previous config saved to /var/cache/conftool/dbconfig/20220829-134736-ladsgroup.json
13:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1186 (T316186)', diff saved to https://phabricator.wikimedia.org/P33601 and previous config saved to /var/cache/conftool/dbconfig/20220829-134014-ladsgroup.json
13:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
13:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
13:12 vgutierrez: Increase roll-out of query-sorting to 75% - T314868
13:06 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
13:00 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
12:14 vgutierrez: rolling restart of ats-be fleet wide to apply "Hide non session cookies during cache lookup" - T316338 T316337
12:08 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host restbase1031.eqiad.wmnet with OS buster
12:03 hnowlan: joining restbase1031-a to cassandra cluster
12:03 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on restbase1031.eqiad.wmnet with reason: New host
12:02 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on restbase1031.eqiad.wmnet with reason: New host
11:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
11:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T316186)', diff saved to https://phabricator.wikimedia.org/P33600 and previous config saved to /var/cache/conftool/dbconfig/20220829-115107-ladsgroup.json
11:37 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1031.eqiad.wmnet with reason: host reimage
11:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P33599 and previous config saved to /var/cache/conftool/dbconfig/20220829-113600-ladsgroup.json
11:33 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1031.eqiad.wmnet with reason: host reimage
11:21 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1031.eqiad.wmnet with OS buster
11:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P33598 and previous config saved to /var/cache/conftool/dbconfig/20220829-112054-ladsgroup.json
11:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T316186)', diff saved to https://phabricator.wikimedia.org/P33597 and previous config saved to /var/cache/conftool/dbconfig/20220829-110548-ladsgroup.json
10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T316186)', diff saved to https://phabricator.wikimedia.org/P33596 and previous config saved to /var/cache/conftool/dbconfig/20220829-105928-ladsgroup.json
10:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
10:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33595 and previous config saved to /var/cache/conftool/dbconfig/20220829-105904-ladsgroup.json
10:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P33593 and previous config saved to /var/cache/conftool/dbconfig/20220829-104358-ladsgroup.json
10:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P33592 and previous config saved to /var/cache/conftool/dbconfig/20220829-102851-ladsgroup.json
10:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33591 and previous config saved to /var/cache/conftool/dbconfig/20220829-101345-ladsgroup.json
10:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33590 and previous config saved to /var/cache/conftool/dbconfig/20220829-101029-ladsgroup.json
10:09 vgutierrez: test trafficserver: Hide non session cookies during cache lookup in drmrs - T316338 T316337
09:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P33589 and previous config saved to /var/cache/conftool/dbconfig/20220829-095523-ladsgroup.json
09:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P33587 and previous config saved to /var/cache/conftool/dbconfig/20220829-094017-ladsgroup.json
09:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33586 and previous config saved to /var/cache/conftool/dbconfig/20220829-092511-ladsgroup.json
09:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33585 and previous config saved to /var/cache/conftool/dbconfig/20220829-092005-ladsgroup.json
09:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33584 and previous config saved to /var/cache/conftool/dbconfig/20220829-091840-ladsgroup.json
09:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
09:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
09:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33583 and previous config saved to /var/cache/conftool/dbconfig/20220829-091816-ladsgroup.json
09:16 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
09:10 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
09:10 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
09:03 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
09:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P33582 and previous config saved to /var/cache/conftool/dbconfig/20220829-090310-ladsgroup.json
08:55 vgutierrez: test trafficserver: Hide non session cookies during cache lookup in cp6016 - T316338 T316337
08:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P33581 and previous config saved to /var/cache/conftool/dbconfig/20220829-084804-ladsgroup.json
08:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33580 and previous config saved to /var/cache/conftool/dbconfig/20220829-083258-ladsgroup.json
08:31 marostegui: Failover m2 from db1159 to db1164 - T316202
08:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33579 and previous config saved to /var/cache/conftool/dbconfig/20220829-082643-ladsgroup.json
08:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P33578 and previous config saved to /var/cache/conftool/dbconfig/20220829-081136-ladsgroup.json
07:58 vgutierrez: Increase roll-out of query-sorting to 50% - T314868
07:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P33577 and previous config saved to /var/cache/conftool/dbconfig/20220829-075630-ladsgroup.json
07:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
07:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
07:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33576 and previous config saved to /var/cache/conftool/dbconfig/20220829-074124-ladsgroup.json
07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33575 and previous config saved to /var/cache/conftool/dbconfig/20220829-073516-ladsgroup.json
07:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33574 and previous config saved to /var/cache/conftool/dbconfig/20220829-073354-ladsgroup.json
07:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
07:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
07:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T316186)', diff saved to https://phabricator.wikimedia.org/P33573 and previous config saved to /var/cache/conftool/dbconfig/20220829-073330-ladsgroup.json
07:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P33572 and previous config saved to /var/cache/conftool/dbconfig/20220829-071824-ladsgroup.json
07:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2133,2160].codfw.wmnet,db[1117,1159,1164].eqiad.wmnet with reason: Switchover m2 T316202
07:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2133,2160].codfw.wmnet,db[1117,1159,1164].eqiad.wmnet with reason: Switchover m2 T316202
07:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P33571 and previous config saved to /var/cache/conftool/dbconfig/20220829-070318-ladsgroup.json
06:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T316186)', diff saved to https://phabricator.wikimedia.org/P33570 and previous config saved to /var/cache/conftool/dbconfig/20220829-064811-ladsgroup.json
06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T316186)', diff saved to https://phabricator.wikimedia.org/P33569 and previous config saved to /var/cache/conftool/dbconfig/20220829-064154-ladsgroup.json
06:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
06:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
06:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
06:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T316186)', diff saved to https://phabricator.wikimedia.org/P33568 and previous config saved to /var/cache/conftool/dbconfig/20220829-064113-ladsgroup.json
06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P33567 and previous config saved to /var/cache/conftool/dbconfig/20220829-062607-ladsgroup.json
06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P33566 and previous config saved to /var/cache/conftool/dbconfig/20220829-061100-ladsgroup.json
05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T316186)', diff saved to https://phabricator.wikimedia.org/P33565 and previous config saved to /var/cache/conftool/dbconfig/20220829-055554-ladsgroup.json
05:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T316186)', diff saved to https://phabricator.wikimedia.org/P33564 and previous config saved to /var/cache/conftool/dbconfig/20220829-054939-ladsgroup.json
05:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
05:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
05:44 hashar: Restarted Gerrit for 3.4.5 upgrade
05:40 hashar@deploy1002: Finished deploy [gerrit/gerrit@f1a820b]: Gerrit to 3.4.5 on gerrit1001 (duration: 00m 09s)
05:40 hashar@deploy1002: Started deploy [gerrit/gerrit@f1a820b]: Gerrit to 3.4.5 on gerrit1001
05:37 hashar@deploy1002: Finished deploy [gerrit/gerrit@f1a820b]: Gerrit to 3.4.5 on gerrit2002 (duration: 00m 11s)
05:36 hashar@deploy1002: Started deploy [gerrit/gerrit@f1a820b]: Gerrit to 3.4.5 on gerrit2002
05:12 marostegui@cumin1001: dbctl commit (dc=all): 'Adjust weights on s1 T316481', diff saved to https://phabricator.wikimedia.org/P33563 and previous config saved to /var/cache/conftool/dbconfig/20220829-051206-marostegui.json
05:10 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2103 as master in dbctl T316481', diff saved to https://phabricator.wikimedia.org/P33562 and previous config saved to /var/cache/conftool/dbconfig/20220829-051020-marostegui.json
2022-08-28
21:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P33561 and previous config saved to /var/cache/conftool/dbconfig/20220828-210336-ladsgroup.json
21:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P33560 and previous config saved to /var/cache/conftool/dbconfig/20220828-210235-ladsgroup.json
20:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P33559 and previous config saved to /var/cache/conftool/dbconfig/20220828-204729-ladsgroup.json
20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T316186)', diff saved to https://phabricator.wikimedia.org/P33558 and previous config saved to /var/cache/conftool/dbconfig/20220828-203223-ladsgroup.json
20:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T316186)', diff saved to https://phabricator.wikimedia.org/P33557 and previous config saved to /var/cache/conftool/dbconfig/20220828-202701-ladsgroup.json
20:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
20:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
20:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T316186)', diff saved to https://phabricator.wikimedia.org/P33556 and previous config saved to /var/cache/conftool/dbconfig/20220828-202638-ladsgroup.json
20:18 ori: mw1411, mw1413, mw1419, mw1429, mw1431, mw1433: set energy-performance preference to 0 via 'x86_energy_perf_policy --hwp-epp 0' T315398
20:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P33555 and previous config saved to /var/cache/conftool/dbconfig/20220828-201131-ladsgroup.json
19:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P33554 and previous config saved to /var/cache/conftool/dbconfig/20220828-195625-ladsgroup.json
19:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T316186)', diff saved to https://phabricator.wikimedia.org/P33553 and previous config saved to /var/cache/conftool/dbconfig/20220828-194119-ladsgroup.json
19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 (T316186)', diff saved to https://phabricator.wikimedia.org/P33552 and previous config saved to /var/cache/conftool/dbconfig/20220828-193500-ladsgroup.json
19:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1188.eqiad.wmnet with reason: Maintenance
19:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
19:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
19:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33551 and previous config saved to /var/cache/conftool/dbconfig/20220828-192705-ladsgroup.json
19:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33550 and previous config saved to /var/cache/conftool/dbconfig/20220828-192550-ladsgroup.json
19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33549 and previous config saved to /var/cache/conftool/dbconfig/20220828-192042-ladsgroup.json
19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33548 and previous config saved to /var/cache/conftool/dbconfig/20220828-192016-ladsgroup.json
19:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
19:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
19:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T316186)', diff saved to https://phabricator.wikimedia.org/P33547 and previous config saved to /var/cache/conftool/dbconfig/20220828-191951-ladsgroup.json
19:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 (T316186)', diff saved to https://phabricator.wikimedia.org/P33546 and previous config saved to /var/cache/conftool/dbconfig/20220828-191440-ladsgroup.json
19:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
19:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2175.codfw.wmnet with reason: Maintenance
19:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2107 (T316186)', diff saved to https://phabricator.wikimedia.org/P33545 and previous config saved to /var/cache/conftool/dbconfig/20220828-191414-ladsgroup.json
19:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2107 (T316186)', diff saved to https://phabricator.wikimedia.org/P33544 and previous config saved to /var/cache/conftool/dbconfig/20220828-190849-ladsgroup.json
19:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
19:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2107.codfw.wmnet with reason: Maintenance
19:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T316186)', diff saved to https://phabricator.wikimedia.org/P33543 and previous config saved to /var/cache/conftool/dbconfig/20220828-190824-ladsgroup.json
19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 (T316186)', diff saved to https://phabricator.wikimedia.org/P33542 and previous config saved to /var/cache/conftool/dbconfig/20220828-190303-ladsgroup.json
19:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
19:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2125.codfw.wmnet with reason: Maintenance
19:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T316186)', diff saved to https://phabricator.wikimedia.org/P33541 and previous config saved to /var/cache/conftool/dbconfig/20220828-190238-ladsgroup.json
18:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T316186)', diff saved to https://phabricator.wikimedia.org/P33540 and previous config saved to /var/cache/conftool/dbconfig/20220828-185606-ladsgroup.json
18:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
18:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
18:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
18:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
18:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T316186)', diff saved to https://phabricator.wikimedia.org/P33539 and previous config saved to /var/cache/conftool/dbconfig/20220828-185536-ladsgroup.json
18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 (T316186)', diff saved to https://phabricator.wikimedia.org/P33538 and previous config saved to /var/cache/conftool/dbconfig/20220828-185022-ladsgroup.json
18:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
18:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2148.codfw.wmnet with reason: Maintenance
18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T316186)', diff saved to https://phabricator.wikimedia.org/P33537 and previous config saved to /var/cache/conftool/dbconfig/20220828-184542-ladsgroup.json
18:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T316186)', diff saved to https://phabricator.wikimedia.org/P33536 and previous config saved to /var/cache/conftool/dbconfig/20220828-183915-ladsgroup.json
18:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
18:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
18:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T316186)', diff saved to https://phabricator.wikimedia.org/P33535 and previous config saved to /var/cache/conftool/dbconfig/20220828-183850-ladsgroup.json
18:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T316186)', diff saved to https://phabricator.wikimedia.org/P33534 and previous config saved to /var/cache/conftool/dbconfig/20220828-183226-ladsgroup.json
18:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
18:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
18:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
18:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T316186)', diff saved to https://phabricator.wikimedia.org/P33533 and previous config saved to /var/cache/conftool/dbconfig/20220828-183156-ladsgroup.json
18:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T316186)', diff saved to https://phabricator.wikimedia.org/P33532 and previous config saved to /var/cache/conftool/dbconfig/20220828-182630-ladsgroup.json
18:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
18:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
18:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33531 and previous config saved to /var/cache/conftool/dbconfig/20220828-182605-ladsgroup.json
18:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33530 and previous config saved to /var/cache/conftool/dbconfig/20220828-182350-ladsgroup.json
18:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33529 and previous config saved to /var/cache/conftool/dbconfig/20220828-181830-ladsgroup.json
18:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33528 and previous config saved to /var/cache/conftool/dbconfig/20220828-181805-ladsgroup.json
18:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
18:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
18:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
18:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T316186)', diff saved to https://phabricator.wikimedia.org/P33527 and previous config saved to /var/cache/conftool/dbconfig/20220828-181421-ladsgroup.json
18:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T316186)', diff saved to https://phabricator.wikimedia.org/P33526 and previous config saved to /var/cache/conftool/dbconfig/20220828-180751-ladsgroup.json
18:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
18:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
18:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T316186)', diff saved to https://phabricator.wikimedia.org/P33525 and previous config saved to /var/cache/conftool/dbconfig/20220828-180725-ladsgroup.json
18:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T316186)', diff saved to https://phabricator.wikimedia.org/P33524 and previous config saved to /var/cache/conftool/dbconfig/20220828-180108-ladsgroup.json
18:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
18:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
18:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2140 (T316186)', diff saved to https://phabricator.wikimedia.org/P33523 and previous config saved to /var/cache/conftool/dbconfig/20220828-180042-ladsgroup.json
17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2140 (T316186)', diff saved to https://phabricator.wikimedia.org/P33522 and previous config saved to /var/cache/conftool/dbconfig/20220828-175311-ladsgroup.json
17:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
17:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
17:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T316186)', diff saved to https://phabricator.wikimedia.org/P33521 and previous config saved to /var/cache/conftool/dbconfig/20220828-175246-ladsgroup.json
17:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T316186)', diff saved to https://phabricator.wikimedia.org/P33520 and previous config saved to /var/cache/conftool/dbconfig/20220828-174655-ladsgroup.json
17:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
17:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
17:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T316186)', diff saved to https://phabricator.wikimedia.org/P33519 and previous config saved to /var/cache/conftool/dbconfig/20220828-174630-ladsgroup.json
17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T316186)', diff saved to https://phabricator.wikimedia.org/P33518 and previous config saved to /var/cache/conftool/dbconfig/20220828-174059-ladsgroup.json
17:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
17:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
17:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling failed', diff saved to https://phabricator.wikimedia.org/P33517 and previous config saved to /var/cache/conftool/dbconfig/20220828-174002-ladsgroup.json
17:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T316186)', diff saved to https://phabricator.wikimedia.org/P33516 and previous config saved to /var/cache/conftool/dbconfig/20220828-173304-ladsgroup.json
17:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
17:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T316186)', diff saved to https://phabricator.wikimedia.org/P33515 and previous config saved to /var/cache/conftool/dbconfig/20220828-173241-ladsgroup.json
17:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P33514 and previous config saved to /var/cache/conftool/dbconfig/20220828-171734-ladsgroup.json
17:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P33513 and previous config saved to /var/cache/conftool/dbconfig/20220828-170228-ladsgroup.json
16:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T316186)', diff saved to https://phabricator.wikimedia.org/P33512 and previous config saved to /var/cache/conftool/dbconfig/20220828-164722-ladsgroup.json
16:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T316186)', diff saved to https://phabricator.wikimedia.org/P33511 and previous config saved to /var/cache/conftool/dbconfig/20220828-164211-ladsgroup.json
16:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
16:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
16:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance
16:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance
16:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T316186)', diff saved to https://phabricator.wikimedia.org/P33510 and previous config saved to /var/cache/conftool/dbconfig/20220828-164004-ladsgroup.json
16:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2152 (T316186)', diff saved to https://phabricator.wikimedia.org/P33509 and previous config saved to /var/cache/conftool/dbconfig/20220828-163447-ladsgroup.json
16:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
16:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
16:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T316186)', diff saved to https://phabricator.wikimedia.org/P33508 and previous config saved to /var/cache/conftool/dbconfig/20220828-163211-ladsgroup.json
16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T316186)', diff saved to https://phabricator.wikimedia.org/P33507 and previous config saved to /var/cache/conftool/dbconfig/20220828-162906-ladsgroup.json
16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2181 (T316186)', diff saved to https://phabricator.wikimedia.org/P33506 and previous config saved to /var/cache/conftool/dbconfig/20220828-162349-ladsgroup.json
16:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
16:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33505 and previous config saved to /var/cache/conftool/dbconfig/20220828-162324-ladsgroup.json
16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P33504 and previous config saved to /var/cache/conftool/dbconfig/20220828-160818-ladsgroup.json
15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P33503 and previous config saved to /var/cache/conftool/dbconfig/20220828-155312-ladsgroup.json
15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33502 and previous config saved to /var/cache/conftool/dbconfig/20220828-153806-ladsgroup.json
15:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33501 and previous config saved to /var/cache/conftool/dbconfig/20220828-153349-ladsgroup.json
15:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P33499 and previous config saved to /var/cache/conftool/dbconfig/20220828-150336-ladsgroup.json
14:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33498 and previous config saved to /var/cache/conftool/dbconfig/20220828-144830-ladsgroup.json
14:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33497 and previous config saved to /var/cache/conftool/dbconfig/20220828-144319-ladsgroup.json
14:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 (T316186)', diff saved to https://phabricator.wikimedia.org/P33496 and previous config saved to /var/cache/conftool/dbconfig/20220828-144257-ladsgroup.json
14:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
14:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
14:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33495 and previous config saved to /var/cache/conftool/dbconfig/20220828-144232-ladsgroup.json
14:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P33494 and previous config saved to /var/cache/conftool/dbconfig/20220828-142726-ladsgroup.json
14:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P33493 and previous config saved to /var/cache/conftool/dbconfig/20220828-141220-ladsgroup.json
13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33492 and previous config saved to /var/cache/conftool/dbconfig/20220828-135713-ladsgroup.json
13:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33491 and previous config saved to /var/cache/conftool/dbconfig/20220828-135158-ladsgroup.json
13:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
13:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
13:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T316186)', diff saved to https://phabricator.wikimedia.org/P33490 and previous config saved to /var/cache/conftool/dbconfig/20220828-135133-ladsgroup.json
13:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P33489 and previous config saved to /var/cache/conftool/dbconfig/20220828-133627-ladsgroup.json
13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P33488 and previous config saved to /var/cache/conftool/dbconfig/20220828-132120-ladsgroup.json
13:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T316186)', diff saved to https://phabricator.wikimedia.org/P33487 and previous config saved to /var/cache/conftool/dbconfig/20220828-130614-ladsgroup.json
13:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2166 (T316186)', diff saved to https://phabricator.wikimedia.org/P33486 and previous config saved to /var/cache/conftool/dbconfig/20220828-130059-ladsgroup.json
13:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
13:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
13:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T316186)', diff saved to https://phabricator.wikimedia.org/P33485 and previous config saved to /var/cache/conftool/dbconfig/20220828-130033-ladsgroup.json
12:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P33484 and previous config saved to /var/cache/conftool/dbconfig/20220828-124527-ladsgroup.json
12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P33483 and previous config saved to /var/cache/conftool/dbconfig/20220828-123021-ladsgroup.json
12:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T316186)', diff saved to https://phabricator.wikimedia.org/P33482 and previous config saved to /var/cache/conftool/dbconfig/20220828-121515-ladsgroup.json
12:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2164 (T316186)', diff saved to https://phabricator.wikimedia.org/P33481 and previous config saved to /var/cache/conftool/dbconfig/20220828-121000-ladsgroup.json
12:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
12:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
12:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
12:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
12:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T316186)', diff saved to https://phabricator.wikimedia.org/P33480 and previous config saved to /var/cache/conftool/dbconfig/20220828-120931-ladsgroup.json
11:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P33479 and previous config saved to /var/cache/conftool/dbconfig/20220828-115424-ladsgroup.json
11:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P33478 and previous config saved to /var/cache/conftool/dbconfig/20220828-113918-ladsgroup.json
11:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T316186)', diff saved to https://phabricator.wikimedia.org/P33477 and previous config saved to /var/cache/conftool/dbconfig/20220828-112412-ladsgroup.json
11:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2163 (T316186)', diff saved to https://phabricator.wikimedia.org/P33476 and previous config saved to /var/cache/conftool/dbconfig/20220828-111857-ladsgroup.json
11:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
11:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
11:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T316186)', diff saved to https://phabricator.wikimedia.org/P33475 and previous config saved to /var/cache/conftool/dbconfig/20220828-111832-ladsgroup.json
11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P33474 and previous config saved to /var/cache/conftool/dbconfig/20220828-110326-ladsgroup.json
10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P33473 and previous config saved to /var/cache/conftool/dbconfig/20220828-104820-ladsgroup.json
10:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T316186)', diff saved to https://phabricator.wikimedia.org/P33472 and previous config saved to /var/cache/conftool/dbconfig/20220828-103314-ladsgroup.json
10:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2162 (T316186)', diff saved to https://phabricator.wikimedia.org/P33471 and previous config saved to /var/cache/conftool/dbconfig/20220828-102800-ladsgroup.json
10:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
10:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
10:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
10:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
10:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T316186)', diff saved to https://phabricator.wikimedia.org/P33470 and previous config saved to /var/cache/conftool/dbconfig/20220828-102423-ladsgroup.json
10:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P33469 and previous config saved to /var/cache/conftool/dbconfig/20220828-100917-ladsgroup.json
09:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P33468 and previous config saved to /var/cache/conftool/dbconfig/20220828-095411-ladsgroup.json
09:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T316186)', diff saved to https://phabricator.wikimedia.org/P33467 and previous config saved to /var/cache/conftool/dbconfig/20220828-093904-ladsgroup.json
09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2154 (T316186)', diff saved to https://phabricator.wikimedia.org/P33466 and previous config saved to /var/cache/conftool/dbconfig/20220828-093346-ladsgroup.json
09:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
09:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
09:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
09:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
08:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33465 and previous config saved to /var/cache/conftool/dbconfig/20220828-082851-ladsgroup.json
08:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P33464 and previous config saved to /var/cache/conftool/dbconfig/20220828-081344-ladsgroup.json
07:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P33463 and previous config saved to /var/cache/conftool/dbconfig/20220828-075838-ladsgroup.json
07:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33462 and previous config saved to /var/cache/conftool/dbconfig/20220828-074332-ladsgroup.json
07:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33461 and previous config saved to /var/cache/conftool/dbconfig/20220828-074116-ladsgroup.json
07:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P33460 and previous config saved to /var/cache/conftool/dbconfig/20220828-072610-ladsgroup.json
07:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P33459 and previous config saved to /var/cache/conftool/dbconfig/20220828-071103-ladsgroup.json
06:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33458 and previous config saved to /var/cache/conftool/dbconfig/20220828-065557-ladsgroup.json
06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33457 and previous config saved to /var/cache/conftool/dbconfig/20220828-064952-ladsgroup.json
06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33456 and previous config saved to /var/cache/conftool/dbconfig/20220828-064920-ladsgroup.json
06:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
06:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
06:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2113 (T316186)', diff saved to https://phabricator.wikimedia.org/P33455 and previous config saved to /var/cache/conftool/dbconfig/20220828-064855-ladsgroup.json
06:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2113', diff saved to https://phabricator.wikimedia.org/P33454 and previous config saved to /var/cache/conftool/dbconfig/20220828-063348-ladsgroup.json
06:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2113', diff saved to https://phabricator.wikimedia.org/P33453 and previous config saved to /var/cache/conftool/dbconfig/20220828-061842-ladsgroup.json
06:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2113 (T316186)', diff saved to https://phabricator.wikimedia.org/P33452 and previous config saved to /var/cache/conftool/dbconfig/20220828-060336-ladsgroup.json
05:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2113 (T316186)', diff saved to https://phabricator.wikimedia.org/P33451 and previous config saved to /var/cache/conftool/dbconfig/20220828-055821-ladsgroup.json
05:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
05:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2113.codfw.wmnet with reason: Maintenance
05:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T316186)', diff saved to https://phabricator.wikimedia.org/P33450 and previous config saved to /var/cache/conftool/dbconfig/20220828-055756-ladsgroup.json
05:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P33449 and previous config saved to /var/cache/conftool/dbconfig/20220828-054249-ladsgroup.json
05:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P33448 and previous config saved to /var/cache/conftool/dbconfig/20220828-052743-ladsgroup.json
05:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T316186)', diff saved to https://phabricator.wikimedia.org/P33447 and previous config saved to /var/cache/conftool/dbconfig/20220828-051237-ladsgroup.json
05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2157 (T316186)', diff saved to https://phabricator.wikimedia.org/P33446 and previous config saved to /var/cache/conftool/dbconfig/20220828-050729-ladsgroup.json
05:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
05:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2157.codfw.wmnet with reason: Maintenance
05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T316186)', diff saved to https://phabricator.wikimedia.org/P33445 and previous config saved to /var/cache/conftool/dbconfig/20220828-050704-ladsgroup.json
04:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P33444 and previous config saved to /var/cache/conftool/dbconfig/20220828-045157-ladsgroup.json
04:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P33443 and previous config saved to /var/cache/conftool/dbconfig/20220828-043651-ladsgroup.json
04:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T316186)', diff saved to https://phabricator.wikimedia.org/P33442 and previous config saved to /var/cache/conftool/dbconfig/20220828-042145-ladsgroup.json
04:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2128 (T316186)', diff saved to https://phabricator.wikimedia.org/P33441 and previous config saved to /var/cache/conftool/dbconfig/20220828-041622-ladsgroup.json
04:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
04:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
04:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
04:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2128.codfw.wmnet with reason: Maintenance
04:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
04:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2101.codfw.wmnet with reason: Maintenance
04:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T316186)', diff saved to https://phabricator.wikimedia.org/P33440 and previous config saved to /var/cache/conftool/dbconfig/20220828-041231-ladsgroup.json
03:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P33439 and previous config saved to /var/cache/conftool/dbconfig/20220828-035725-ladsgroup.json
03:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P33438 and previous config saved to /var/cache/conftool/dbconfig/20220828-034219-ladsgroup.json
03:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T316186)', diff saved to https://phabricator.wikimedia.org/P33437 and previous config saved to /var/cache/conftool/dbconfig/20220828-032713-ladsgroup.json
03:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2178 (T316186)', diff saved to https://phabricator.wikimedia.org/P33436 and previous config saved to /var/cache/conftool/dbconfig/20220828-032202-ladsgroup.json
03:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
03:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2178.codfw.wmnet with reason: Maintenance
03:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T316186)', diff saved to https://phabricator.wikimedia.org/P33435 and previous config saved to /var/cache/conftool/dbconfig/20220828-032137-ladsgroup.json
03:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P33434 and previous config saved to /var/cache/conftool/dbconfig/20220828-030631-ladsgroup.json
02:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P33433 and previous config saved to /var/cache/conftool/dbconfig/20220828-025124-ladsgroup.json
02:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T316186)', diff saved to https://phabricator.wikimedia.org/P33432 and previous config saved to /var/cache/conftool/dbconfig/20220828-023618-ladsgroup.json
02:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2111 (T316186)', diff saved to https://phabricator.wikimedia.org/P33431 and previous config saved to /var/cache/conftool/dbconfig/20220828-023111-ladsgroup.json
02:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
02:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2111.codfw.wmnet with reason: Maintenance
02:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
02:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
02:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T316186)', diff saved to https://phabricator.wikimedia.org/P33430 and previous config saved to /var/cache/conftool/dbconfig/20220828-022620-ladsgroup.json
02:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P33429 and previous config saved to /var/cache/conftool/dbconfig/20220828-021114-ladsgroup.json
01:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P33428 and previous config saved to /var/cache/conftool/dbconfig/20220828-015608-ladsgroup.json
01:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T316186)', diff saved to https://phabricator.wikimedia.org/P33427 and previous config saved to /var/cache/conftool/dbconfig/20220828-014101-ladsgroup.json
01:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T316186)', diff saved to https://phabricator.wikimedia.org/P33426 and previous config saved to /var/cache/conftool/dbconfig/20220828-013558-ladsgroup.json
01:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
01:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
01:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T316186)', diff saved to https://phabricator.wikimedia.org/P33425 and previous config saved to /var/cache/conftool/dbconfig/20220828-013534-ladsgroup.json
01:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P33424 and previous config saved to /var/cache/conftool/dbconfig/20220828-012028-ladsgroup.json
01:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P33423 and previous config saved to /var/cache/conftool/dbconfig/20220828-010522-ladsgroup.json
00:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T316186)', diff saved to https://phabricator.wikimedia.org/P33422 and previous config saved to /var/cache/conftool/dbconfig/20220828-005015-ladsgroup.json
00:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T316186)', diff saved to https://phabricator.wikimedia.org/P33421 and previous config saved to /var/cache/conftool/dbconfig/20220828-004410-ladsgroup.json
00:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
00:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
00:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
00:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
00:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33420 and previous config saved to /var/cache/conftool/dbconfig/20220828-004329-ladsgroup.json
00:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P33419 and previous config saved to /var/cache/conftool/dbconfig/20220828-002823-ladsgroup.json
00:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P33418 and previous config saved to /var/cache/conftool/dbconfig/20220828-001317-ladsgroup.json
2022-08-27
23:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33417 and previous config saved to /var/cache/conftool/dbconfig/20220827-235810-ladsgroup.json
23:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33416 and previous config saved to /var/cache/conftool/dbconfig/20220827-235556-ladsgroup.json
23:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P33415 and previous config saved to /var/cache/conftool/dbconfig/20220827-234050-ladsgroup.json
23:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P33414 and previous config saved to /var/cache/conftool/dbconfig/20220827-232544-ladsgroup.json
23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33413 and previous config saved to /var/cache/conftool/dbconfig/20220827-231038-ladsgroup.json
23:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33412 and previous config saved to /var/cache/conftool/dbconfig/20220827-230339-ladsgroup.json
23:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T316186)', diff saved to https://phabricator.wikimedia.org/P33411 and previous config saved to /var/cache/conftool/dbconfig/20220827-230214-ladsgroup.json
23:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
23:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
23:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 (T316186)', diff saved to https://phabricator.wikimedia.org/P33410 and previous config saved to /var/cache/conftool/dbconfig/20220827-230150-ladsgroup.json
22:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130', diff saved to https://phabricator.wikimedia.org/P33408 and previous config saved to /var/cache/conftool/dbconfig/20220827-223137-ladsgroup.json
22:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149', diff saved to https://phabricator.wikimedia.org/P33407 and previous config saved to /var/cache/conftool/dbconfig/20220827-221749-ladsgroup.json
22:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2149.codfw.wmnet with reason: Sad disk
22:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2149.codfw.wmnet with reason: Sad disk
22:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 (T316186)', diff saved to https://phabricator.wikimedia.org/P33406 and previous config saved to /var/cache/conftool/dbconfig/20220827-221631-ladsgroup.json
22:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 (T316186)', diff saved to https://phabricator.wikimedia.org/P33405 and previous config saved to /var/cache/conftool/dbconfig/20220827-221118-ladsgroup.json
22:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1130.eqiad.wmnet with reason: Maintenance
22:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1130.eqiad.wmnet with reason: Maintenance
20:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T316186)', diff saved to https://phabricator.wikimedia.org/P33404 and previous config saved to /var/cache/conftool/dbconfig/20220827-205809-ladsgroup.json
20:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P33403 and previous config saved to /var/cache/conftool/dbconfig/20220827-204303-ladsgroup.json
20:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P33402 and previous config saved to /var/cache/conftool/dbconfig/20220827-202757-ladsgroup.json
20:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T316186)', diff saved to https://phabricator.wikimedia.org/P33401 and previous config saved to /var/cache/conftool/dbconfig/20220827-201250-ladsgroup.json
20:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2156 (T316186)', diff saved to https://phabricator.wikimedia.org/P33400 and previous config saved to /var/cache/conftool/dbconfig/20220827-200639-ladsgroup.json
20:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
20:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
20:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
20:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2156.codfw.wmnet with reason: Maintenance
20:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T316186)', diff saved to https://phabricator.wikimedia.org/P33399 and previous config saved to /var/cache/conftool/dbconfig/20220827-200559-ladsgroup.json
19:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P33398 and previous config saved to /var/cache/conftool/dbconfig/20220827-195053-ladsgroup.json
19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P33397 and previous config saved to /var/cache/conftool/dbconfig/20220827-193546-ladsgroup.json
19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T316186)', diff saved to https://phabricator.wikimedia.org/P33396 and previous config saved to /var/cache/conftool/dbconfig/20220827-192040-ladsgroup.json
19:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T316186)', diff saved to https://phabricator.wikimedia.org/P33395 and previous config saved to /var/cache/conftool/dbconfig/20220827-191515-ladsgroup.json
19:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
19:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2109.codfw.wmnet with reason: Maintenance
19:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T316186)', diff saved to https://phabricator.wikimedia.org/P33394 and previous config saved to /var/cache/conftool/dbconfig/20220827-191450-ladsgroup.json
18:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P33393 and previous config saved to /var/cache/conftool/dbconfig/20220827-185944-ladsgroup.json
18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P33392 and previous config saved to /var/cache/conftool/dbconfig/20220827-184438-ladsgroup.json
18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T316186)', diff saved to https://phabricator.wikimedia.org/P33391 and previous config saved to /var/cache/conftool/dbconfig/20220827-182931-ladsgroup.json
18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T316186)', diff saved to https://phabricator.wikimedia.org/P33390 and previous config saved to /var/cache/conftool/dbconfig/20220827-182408-ladsgroup.json
18:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
18:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2149.codfw.wmnet with reason: Maintenance
18:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T316186)', diff saved to https://phabricator.wikimedia.org/P33389 and previous config saved to /var/cache/conftool/dbconfig/20220827-182343-ladsgroup.json
18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P33388 and previous config saved to /var/cache/conftool/dbconfig/20220827-180836-ladsgroup.json
17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P33387 and previous config saved to /var/cache/conftool/dbconfig/20220827-175330-ladsgroup.json
17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T316186)', diff saved to https://phabricator.wikimedia.org/P33386 and previous config saved to /var/cache/conftool/dbconfig/20220827-173824-ladsgroup.json
17:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T316186)', diff saved to https://phabricator.wikimedia.org/P33385 and previous config saved to /var/cache/conftool/dbconfig/20220827-173305-ladsgroup.json
17:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
17:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2177.codfw.wmnet with reason: Maintenance
17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T316186)', diff saved to https://phabricator.wikimedia.org/P33384 and previous config saved to /var/cache/conftool/dbconfig/20220827-173240-ladsgroup.json
17:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P33383 and previous config saved to /var/cache/conftool/dbconfig/20220827-171734-ladsgroup.json
17:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P33382 and previous config saved to /var/cache/conftool/dbconfig/20220827-170227-ladsgroup.json
16:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T316186)', diff saved to https://phabricator.wikimedia.org/P33381 and previous config saved to /var/cache/conftool/dbconfig/20220827-164721-ladsgroup.json
16:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2127 (T316186)', diff saved to https://phabricator.wikimedia.org/P33380 and previous config saved to /var/cache/conftool/dbconfig/20220827-164156-ladsgroup.json
16:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: Maintenance
16:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2127.codfw.wmnet with reason: Maintenance
16:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
16:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
16:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T316186)', diff saved to https://phabricator.wikimedia.org/P33379 and previous config saved to /var/cache/conftool/dbconfig/20220827-163528-ladsgroup.json
16:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P33378 and previous config saved to /var/cache/conftool/dbconfig/20220827-162022-ladsgroup.json
16:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P33377 and previous config saved to /var/cache/conftool/dbconfig/20220827-160516-ladsgroup.json
15:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T316186)', diff saved to https://phabricator.wikimedia.org/P33376 and previous config saved to /var/cache/conftool/dbconfig/20220827-155010-ladsgroup.json
15:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T316186)', diff saved to https://phabricator.wikimedia.org/P33375 and previous config saved to /var/cache/conftool/dbconfig/20220827-154452-ladsgroup.json
15:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
15:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
15:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
15:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
15:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T316186)', diff saved to https://phabricator.wikimedia.org/P33374 and previous config saved to /var/cache/conftool/dbconfig/20220827-154410-ladsgroup.json
15:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P33373 and previous config saved to /var/cache/conftool/dbconfig/20220827-152903-ladsgroup.json
15:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P33372 and previous config saved to /var/cache/conftool/dbconfig/20220827-151357-ladsgroup.json
14:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T316186)', diff saved to https://phabricator.wikimedia.org/P33371 and previous config saved to /var/cache/conftool/dbconfig/20220827-145851-ladsgroup.json
14:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T316186)', diff saved to https://phabricator.wikimedia.org/P33370 and previous config saved to /var/cache/conftool/dbconfig/20220827-145224-ladsgroup.json
14:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
14:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
14:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T316186)', diff saved to https://phabricator.wikimedia.org/P33369 and previous config saved to /var/cache/conftool/dbconfig/20220827-145201-ladsgroup.json
14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P33368 and previous config saved to /var/cache/conftool/dbconfig/20220827-143654-ladsgroup.json
14:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P33367 and previous config saved to /var/cache/conftool/dbconfig/20220827-142148-ladsgroup.json
14:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T316186)', diff saved to https://phabricator.wikimedia.org/P33366 and previous config saved to /var/cache/conftool/dbconfig/20220827-140642-ladsgroup.json
13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T316186)', diff saved to https://phabricator.wikimedia.org/P33365 and previous config saved to /var/cache/conftool/dbconfig/20220827-135719-ladsgroup.json
13:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
13:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
13:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T316186)', diff saved to https://phabricator.wikimedia.org/P33364 and previous config saved to /var/cache/conftool/dbconfig/20220827-135655-ladsgroup.json
13:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P33363 and previous config saved to /var/cache/conftool/dbconfig/20220827-134149-ladsgroup.json
13:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P33362 and previous config saved to /var/cache/conftool/dbconfig/20220827-132643-ladsgroup.json
13:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T316186)', diff saved to https://phabricator.wikimedia.org/P33361 and previous config saved to /var/cache/conftool/dbconfig/20220827-131136-ladsgroup.json
12:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T316186)', diff saved to https://phabricator.wikimedia.org/P33360 and previous config saved to /var/cache/conftool/dbconfig/20220827-121121-ladsgroup.json
12:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
12:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
12:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
12:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
12:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
12:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
12:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T316186)', diff saved to https://phabricator.wikimedia.org/P33359 and previous config saved to /var/cache/conftool/dbconfig/20220827-120059-ladsgroup.json
11:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P33358 and previous config saved to /var/cache/conftool/dbconfig/20220827-114552-ladsgroup.json
11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P33357 and previous config saved to /var/cache/conftool/dbconfig/20220827-113046-ladsgroup.json
11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T316186)', diff saved to https://phabricator.wikimedia.org/P33356 and previous config saved to /var/cache/conftool/dbconfig/20220827-111540-ladsgroup.json
10:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T316186)', diff saved to https://phabricator.wikimedia.org/P33355 and previous config saved to /var/cache/conftool/dbconfig/20220827-101523-ladsgroup.json
10:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
10:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
10:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T316186)', diff saved to https://phabricator.wikimedia.org/P33354 and previous config saved to /var/cache/conftool/dbconfig/20220827-101459-ladsgroup.json
09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P33353 and previous config saved to /var/cache/conftool/dbconfig/20220827-095953-ladsgroup.json
09:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P33352 and previous config saved to /var/cache/conftool/dbconfig/20220827-094446-ladsgroup.json
09:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T316186)', diff saved to https://phabricator.wikimedia.org/P33351 and previous config saved to /var/cache/conftool/dbconfig/20220827-092940-ladsgroup.json
08:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 (T316186)', diff saved to https://phabricator.wikimedia.org/P33350 and previous config saved to /var/cache/conftool/dbconfig/20220827-082924-ladsgroup.json
08:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
08:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
08:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
08:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
01:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T316186)', diff saved to https://phabricator.wikimedia.org/P33349 and previous config saved to /var/cache/conftool/dbconfig/20220827-014831-ladsgroup.json
01:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P33348 and previous config saved to /var/cache/conftool/dbconfig/20220827-013325-ladsgroup.json
01:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P33347 and previous config saved to /var/cache/conftool/dbconfig/20220827-011819-ladsgroup.json
01:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T316186)', diff saved to https://phabricator.wikimedia.org/P33346 and previous config saved to /var/cache/conftool/dbconfig/20220827-010313-ladsgroup.json
00:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 (T316186)', diff saved to https://phabricator.wikimedia.org/P33345 and previous config saved to /var/cache/conftool/dbconfig/20220827-005555-ladsgroup.json
00:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
00:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
00:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
00:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance
00:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T316186)', diff saved to https://phabricator.wikimedia.org/P33344 and previous config saved to /var/cache/conftool/dbconfig/20220827-005525-ladsgroup.json
00:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P33343 and previous config saved to /var/cache/conftool/dbconfig/20220827-004019-ladsgroup.json
00:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P33342 and previous config saved to /var/cache/conftool/dbconfig/20220827-002513-ladsgroup.json
00:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T316186)', diff saved to https://phabricator.wikimedia.org/P33341 and previous config saved to /var/cache/conftool/dbconfig/20220827-001006-ladsgroup.json
00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 (T316186)', diff saved to https://phabricator.wikimedia.org/P33340 and previous config saved to /var/cache/conftool/dbconfig/20220827-000442-ladsgroup.json
00:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33339 and previous config saved to /var/cache/conftool/dbconfig/20220827-000415-ladsgroup.json
2022-08-26
23:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P33338 and previous config saved to /var/cache/conftool/dbconfig/20220826-234908-ladsgroup.json
23:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P33337 and previous config saved to /var/cache/conftool/dbconfig/20220826-233402-ladsgroup.json
23:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33336 and previous config saved to /var/cache/conftool/dbconfig/20220826-231856-ladsgroup.json
23:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33335 and previous config saved to /var/cache/conftool/dbconfig/20220826-231540-ladsgroup.json
23:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P33334 and previous config saved to /var/cache/conftool/dbconfig/20220826-230033-ladsgroup.json
22:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P33333 and previous config saved to /var/cache/conftool/dbconfig/20220826-224527-ladsgroup.json
22:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33332 and previous config saved to /var/cache/conftool/dbconfig/20220826-223021-ladsgroup.json
22:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33331 and previous config saved to /var/cache/conftool/dbconfig/20220826-222409-ladsgroup.json
22:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33330 and previous config saved to /var/cache/conftool/dbconfig/20220826-222345-ladsgroup.json
22:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
22:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
22:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2118 (T316186)', diff saved to https://phabricator.wikimedia.org/P33329 and previous config saved to /var/cache/conftool/dbconfig/20220826-222320-ladsgroup.json
22:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2118', diff saved to https://phabricator.wikimedia.org/P33328 and previous config saved to /var/cache/conftool/dbconfig/20220826-220814-ladsgroup.json
21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2118', diff saved to https://phabricator.wikimedia.org/P33327 and previous config saved to /var/cache/conftool/dbconfig/20220826-215307-ladsgroup.json
21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2118 (T316186)', diff saved to https://phabricator.wikimedia.org/P33326 and previous config saved to /var/cache/conftool/dbconfig/20220826-213801-ladsgroup.json
21:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2118 (T316186)', diff saved to https://phabricator.wikimedia.org/P33325 and previous config saved to /var/cache/conftool/dbconfig/20220826-213140-ladsgroup.json
21:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2118.codfw.wmnet with reason: Maintenance
21:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2118.codfw.wmnet with reason: Maintenance
21:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T316186)', diff saved to https://phabricator.wikimedia.org/P33324 and previous config saved to /var/cache/conftool/dbconfig/20220826-213115-ladsgroup.json
21:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P33323 and previous config saved to /var/cache/conftool/dbconfig/20220826-211608-ladsgroup.json
21:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P33322 and previous config saved to /var/cache/conftool/dbconfig/20220826-210102-ladsgroup.json
20:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T316186)', diff saved to https://phabricator.wikimedia.org/P33321 and previous config saved to /var/cache/conftool/dbconfig/20220826-204555-ladsgroup.json
20:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 (T316186)', diff saved to https://phabricator.wikimedia.org/P33320 and previous config saved to /var/cache/conftool/dbconfig/20220826-203935-ladsgroup.json
20:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
20:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
20:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T316186)', diff saved to https://phabricator.wikimedia.org/P33319 and previous config saved to /var/cache/conftool/dbconfig/20220826-203910-ladsgroup.json
20:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P33318 and previous config saved to /var/cache/conftool/dbconfig/20220826-202404-ladsgroup.json
20:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P33316 and previous config saved to /var/cache/conftool/dbconfig/20220826-200858-ladsgroup.json
19:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T316186)', diff saved to https://phabricator.wikimedia.org/P33315 and previous config saved to /var/cache/conftool/dbconfig/20220826-195351-ladsgroup.json
19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 (T316186)', diff saved to https://phabricator.wikimedia.org/P33314 and previous config saved to /var/cache/conftool/dbconfig/20220826-194734-ladsgroup.json
19:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
19:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T316186)', diff saved to https://phabricator.wikimedia.org/P33313 and previous config saved to /var/cache/conftool/dbconfig/20220826-194709-ladsgroup.json
19:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P33312 and previous config saved to /var/cache/conftool/dbconfig/20220826-193203-ladsgroup.json
19:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P33311 and previous config saved to /var/cache/conftool/dbconfig/20220826-191657-ladsgroup.json
19:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T316186)', diff saved to https://phabricator.wikimedia.org/P33310 and previous config saved to /var/cache/conftool/dbconfig/20220826-190151-ladsgroup.json
18:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 (T316186)', diff saved to https://phabricator.wikimedia.org/P33309 and previous config saved to /var/cache/conftool/dbconfig/20220826-185527-ladsgroup.json
18:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
18:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
18:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T316186)', diff saved to https://phabricator.wikimedia.org/P33308 and previous config saved to /var/cache/conftool/dbconfig/20220826-185502-ladsgroup.json
18:40 xcollazo@deploy1002: Started deploy [airflow-dags/analytics@c5f46a4]: (no justification provided)
18:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P33307 and previous config saved to /var/cache/conftool/dbconfig/20220826-183956-ladsgroup.json
18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P33306 and previous config saved to /var/cache/conftool/dbconfig/20220826-182450-ladsgroup.json
18:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T316186)', diff saved to https://phabricator.wikimedia.org/P33305 and previous config saved to /var/cache/conftool/dbconfig/20220826-180943-ladsgroup.json
18:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 (T316186)', diff saved to https://phabricator.wikimedia.org/P33304 and previous config saved to /var/cache/conftool/dbconfig/20220826-180223-ladsgroup.json
18:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
18:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance
18:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T316186)', diff saved to https://phabricator.wikimedia.org/P33303 and previous config saved to /var/cache/conftool/dbconfig/20220826-180157-ladsgroup.json
17:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P33302 and previous config saved to /var/cache/conftool/dbconfig/20220826-174651-ladsgroup.json
17:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P33301 and previous config saved to /var/cache/conftool/dbconfig/20220826-173144-ladsgroup.json
17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T316186)', diff saved to https://phabricator.wikimedia.org/P33300 and previous config saved to /var/cache/conftool/dbconfig/20220826-171638-ladsgroup.json
17:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T316186)', diff saved to https://phabricator.wikimedia.org/P33299 and previous config saved to /var/cache/conftool/dbconfig/20220826-170911-ladsgroup.json
17:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
17:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
17:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T316186)', diff saved to https://phabricator.wikimedia.org/P33298 and previous config saved to /var/cache/conftool/dbconfig/20220826-170538-ladsgroup.json
16:56 xcollazo@deploy1002: Finished deploy [airflow-dags/analytics@5d95fe5]: Add job for MediaWiki history dumps. (duration: 00m 13s)
16:56 xcollazo@deploy1002: Started deploy [airflow-dags/analytics@5d95fe5]: Add job for MediaWiki history dumps.
16:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P33297 and previous config saved to /var/cache/conftool/dbconfig/20220826-165032-ladsgroup.json
16:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P33296 and previous config saved to /var/cache/conftool/dbconfig/20220826-163525-ladsgroup.json
16:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T316186)', diff saved to https://phabricator.wikimedia.org/P33295 and previous config saved to /var/cache/conftool/dbconfig/20220826-162019-ladsgroup.json
15:50 jynus: rolling restart of ms-backup1001,2, ms-backup2001,2
15:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T316186)', diff saved to https://phabricator.wikimedia.org/P33293 and previous config saved to /var/cache/conftool/dbconfig/20220826-152003-ladsgroup.json
15:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
15:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
15:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
15:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
15:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33292 and previous config saved to /var/cache/conftool/dbconfig/20220826-151921-ladsgroup.json
15:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P33291 and previous config saved to /var/cache/conftool/dbconfig/20220826-150415-ladsgroup.json
14:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P33290 and previous config saved to /var/cache/conftool/dbconfig/20220826-144908-ladsgroup.json
14:38 jynus: rolling restart of backup1004-9, backup2004-9
14:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33289 and previous config saved to /var/cache/conftool/dbconfig/20220826-143402-ladsgroup.json
14:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33288 and previous config saved to /var/cache/conftool/dbconfig/20220826-142945-ladsgroup.json
14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P33286 and previous config saved to /var/cache/conftool/dbconfig/20220826-141438-ladsgroup.json
13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P33285 and previous config saved to /var/cache/conftool/dbconfig/20220826-135932-ladsgroup.json
13:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33284 and previous config saved to /var/cache/conftool/dbconfig/20220826-134426-ladsgroup.json
13:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T316186)', diff saved to https://phabricator.wikimedia.org/P33283 and previous config saved to /var/cache/conftool/dbconfig/20220826-133318-ladsgroup.json
13:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1194 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33281 and previous config saved to /var/cache/conftool/dbconfig/20220826-132817-root.json
13:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33280 and previous config saved to /var/cache/conftool/dbconfig/20220826-132751-ladsgroup.json
13:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1101.eqiad.wmnet with reason: Maintenance
13:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1101.eqiad.wmnet with reason: Maintenance
13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33279 and previous config saved to /var/cache/conftool/dbconfig/20220826-132304-ladsgroup.json
13:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1194 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33278 and previous config saved to /var/cache/conftool/dbconfig/20220826-131312-root.json
13:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P33277 and previous config saved to /var/cache/conftool/dbconfig/20220826-130756-ladsgroup.json
12:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1194 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33276 and previous config saved to /var/cache/conftool/dbconfig/20220826-125808-root.json
12:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P33275 and previous config saved to /var/cache/conftool/dbconfig/20220826-125250-ladsgroup.json
12:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1194 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33274 and previous config saved to /var/cache/conftool/dbconfig/20220826-124303-root.json
12:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33273 and previous config saved to /var/cache/conftool/dbconfig/20220826-123743-ladsgroup.json
12:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1194 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33272 and previous config saved to /var/cache/conftool/dbconfig/20220826-122758-root.json
12:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33271 and previous config saved to /var/cache/conftool/dbconfig/20220826-122527-ladsgroup.json
12:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1194 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33270 and previous config saved to /var/cache/conftool/dbconfig/20220826-121253-root.json
12:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P33269 and previous config saved to /var/cache/conftool/dbconfig/20220826-121021-ladsgroup.json
11:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1194 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33268 and previous config saved to /var/cache/conftool/dbconfig/20220826-115748-root.json
11:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P33267 and previous config saved to /var/cache/conftool/dbconfig/20220826-115514-ladsgroup.json
11:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1194 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33266 and previous config saved to /var/cache/conftool/dbconfig/20220826-114243-root.json
11:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33265 and previous config saved to /var/cache/conftool/dbconfig/20220826-114008-ladsgroup.json
11:37 moritzm: installing intel-microcode updates on stretch hosts
11:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33264 and previous config saved to /var/cache/conftool/dbconfig/20220826-113511-ladsgroup.json
11:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T316186)', diff saved to https://phabricator.wikimedia.org/P33263 and previous config saved to /var/cache/conftool/dbconfig/20220826-113347-ladsgroup.json
11:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
11:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
11:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
11:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
11:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T316186)', diff saved to https://phabricator.wikimedia.org/P33262 and previous config saved to /var/cache/conftool/dbconfig/20220826-112946-ladsgroup.json
11:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1194 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33261 and previous config saved to /var/cache/conftool/dbconfig/20220826-112739-root.json
11:19 moritzm: uploaded intel-microcode 3.20220510.1~wmf9u1 to apt.wikimedia.org
11:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P33260 and previous config saved to /var/cache/conftool/dbconfig/20220826-111440-ladsgroup.json
11:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1194 (re)pooling @ 1%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33259 and previous config saved to /var/cache/conftool/dbconfig/20220826-111234-root.json
10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P33258 and previous config saved to /var/cache/conftool/dbconfig/20220826-105934-ladsgroup.json
10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T316186)', diff saved to https://phabricator.wikimedia.org/P33257 and previous config saved to /var/cache/conftool/dbconfig/20220826-104427-ladsgroup.json
10:44 btullis@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
10:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 (T316186)', diff saved to https://phabricator.wikimedia.org/P33256 and previous config saved to /var/cache/conftool/dbconfig/20220826-103707-ladsgroup.json
10:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
10:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1191.eqiad.wmnet with reason: Maintenance
10:33 btullis@cumin1001: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-test-eqiad cluster: Roll restart of jvm daemons.
10:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
10:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
10:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
10:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
10:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33255 and previous config saved to /var/cache/conftool/dbconfig/20220826-102510-ladsgroup.json
10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33254 and previous config saved to /var/cache/conftool/dbconfig/20220826-102334-ladsgroup.json
10:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33253 and previous config saved to /var/cache/conftool/dbconfig/20220826-102117-ladsgroup.json
10:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P33252 and previous config saved to /var/cache/conftool/dbconfig/20220826-100611-ladsgroup.json
09:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P33251 and previous config saved to /var/cache/conftool/dbconfig/20220826-095104-ladsgroup.json
09:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33250 and previous config saved to /var/cache/conftool/dbconfig/20220826-093558-ladsgroup.json
09:33 vgutierrez: disable origin coalescing in cp6007 and cp6008 - T315911
09:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33249 and previous config saved to /var/cache/conftool/dbconfig/20220826-093051-ladsgroup.json
09:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33248 and previous config saved to /var/cache/conftool/dbconfig/20220826-093034-ladsgroup.json
09:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
09:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
09:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33247 and previous config saved to /var/cache/conftool/dbconfig/20220826-093000-ladsgroup.json
09:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P33246 and previous config saved to /var/cache/conftool/dbconfig/20220826-091454-ladsgroup.json
08:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P33245 and previous config saved to /var/cache/conftool/dbconfig/20220826-085947-ladsgroup.json
08:47 vgutierrez: Increase roll-out of query-sorting to 30% - T314868
08:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33244 and previous config saved to /var/cache/conftool/dbconfig/20220826-084441-ladsgroup.json
08:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33243 and previous config saved to /var/cache/conftool/dbconfig/20220826-083424-ladsgroup.json
08:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2025.codfw.wmnet to cluster codfw and group D
08:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2025.codfw.wmnet to cluster codfw and group D
08:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
08:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P33242 and previous config saved to /var/cache/conftool/dbconfig/20220826-081918-ladsgroup.json
08:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
08:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P33241 and previous config saved to /var/cache/conftool/dbconfig/20220826-080411-ladsgroup.json
08:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2025.codfw.wmnet with OS bullseye
07:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33240 and previous config saved to /var/cache/conftool/dbconfig/20220826-074905-ladsgroup.json
07:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 100%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P33239 and previous config saved to /var/cache/conftool/dbconfig/20220826-074801-root.json
07:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2025.codfw.wmnet with reason: host reimage
07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1193 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33238 and previous config saved to /var/cache/conftool/dbconfig/20220826-074434-root.json
07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1192 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33237 and previous config saved to /var/cache/conftool/dbconfig/20220826-074412-root.json
07:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33236 and previous config saved to /var/cache/conftool/dbconfig/20220826-074252-ladsgroup.json
07:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33235 and previous config saved to /var/cache/conftool/dbconfig/20220826-074140-root.json
07:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33234 and previous config saved to /var/cache/conftool/dbconfig/20220826-074126-ladsgroup.json
07:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2025.codfw.wmnet with reason: host reimage
07:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
07:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
07:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33233 and previous config saved to /var/cache/conftool/dbconfig/20220826-074052-ladsgroup.json
07:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 75%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P33232 and previous config saved to /var/cache/conftool/dbconfig/20220826-073256-root.json
07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1193 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33231 and previous config saved to /var/cache/conftool/dbconfig/20220826-072929-root.json
07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1192 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33230 and previous config saved to /var/cache/conftool/dbconfig/20220826-072908-root.json
07:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33229 and previous config saved to /var/cache/conftool/dbconfig/20220826-072635-root.json
07:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P33228 and previous config saved to /var/cache/conftool/dbconfig/20220826-072545-ladsgroup.json
07:24 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2025.codfw.wmnet with OS bullseye
07:23 vgutierrez: Increase roll-out of query-sorting to 15% - T314868
07:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 50%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P33227 and previous config saved to /var/cache/conftool/dbconfig/20220826-071751-root.json
07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1193 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33226 and previous config saved to /var/cache/conftool/dbconfig/20220826-071424-root.json
07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1192 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33225 and previous config saved to /var/cache/conftool/dbconfig/20220826-071403-root.json
07:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33224 and previous config saved to /var/cache/conftool/dbconfig/20220826-071131-root.json
07:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P33223 and previous config saved to /var/cache/conftool/dbconfig/20220826-071039-ladsgroup.json
07:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 25%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P33222 and previous config saved to /var/cache/conftool/dbconfig/20220826-070247-root.json
06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1193 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33221 and previous config saved to /var/cache/conftool/dbconfig/20220826-065919-root.json
06:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1192 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33220 and previous config saved to /var/cache/conftool/dbconfig/20220826-065858-root.json
06:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33219 and previous config saved to /var/cache/conftool/dbconfig/20220826-065626-root.json
06:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33218 and previous config saved to /var/cache/conftool/dbconfig/20220826-065533-ladsgroup.json
06:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33217 and previous config saved to /var/cache/conftool/dbconfig/20220826-065217-ladsgroup.json
06:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 10%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P33216 and previous config saved to /var/cache/conftool/dbconfig/20220826-064742-root.json
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1193 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33215 and previous config saved to /var/cache/conftool/dbconfig/20220826-064414-root.json
06:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1192 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33214 and previous config saved to /var/cache/conftool/dbconfig/20220826-064353-root.json
06:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33213 and previous config saved to /var/cache/conftool/dbconfig/20220826-064121-root.json
06:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P33212 and previous config saved to /var/cache/conftool/dbconfig/20220826-063711-ladsgroup.json
06:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 5%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P33211 and previous config saved to /var/cache/conftool/dbconfig/20220826-063237-root.json
06:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1193 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33210 and previous config saved to /var/cache/conftool/dbconfig/20220826-062910-root.json
06:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1192 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33209 and previous config saved to /var/cache/conftool/dbconfig/20220826-062849-root.json
06:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33208 and previous config saved to /var/cache/conftool/dbconfig/20220826-062616-root.json
06:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P33207 and previous config saved to /var/cache/conftool/dbconfig/20220826-062205-ladsgroup.json
06:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 3%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P33206 and previous config saved to /var/cache/conftool/dbconfig/20220826-061732-root.json
06:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1193 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33205 and previous config saved to /var/cache/conftool/dbconfig/20220826-061405-root.json
06:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1192 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33204 and previous config saved to /var/cache/conftool/dbconfig/20220826-061344-root.json
06:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33203 and previous config saved to /var/cache/conftool/dbconfig/20220826-061112-root.json
06:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P33202 and previous config saved to /var/cache/conftool/dbconfig/20220826-060734-ladsgroup.json
06:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33201 and previous config saved to /var/cache/conftool/dbconfig/20220826-060658-ladsgroup.json
06:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 2%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P33200 and previous config saved to /var/cache/conftool/dbconfig/20220826-060227-root.json
06:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33199 and previous config saved to /var/cache/conftool/dbconfig/20220826-060203-ladsgroup.json
06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T316186)', diff saved to https://phabricator.wikimedia.org/P33198 and previous config saved to /var/cache/conftool/dbconfig/20220826-060146-ladsgroup.json
06:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
06:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
05:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1193 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33197 and previous config saved to /var/cache/conftool/dbconfig/20220826-055900-root.json
05:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1192 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33196 and previous config saved to /var/cache/conftool/dbconfig/20220826-055839-root.json
05:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33195 and previous config saved to /var/cache/conftool/dbconfig/20220826-055607-root.json
05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P33194 and previous config saved to /var/cache/conftool/dbconfig/20220826-055553-ladsgroup.json
05:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P33193 and previous config saved to /var/cache/conftool/dbconfig/20220826-055420-ladsgroup.json
05:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P33192 and previous config saved to /var/cache/conftool/dbconfig/20220826-055229-ladsgroup.json
05:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 1%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P33191 and previous config saved to /var/cache/conftool/dbconfig/20220826-054722-root.json
05:47 marostegui: Failover m2-master
05:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1193 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33190 and previous config saved to /var/cache/conftool/dbconfig/20220826-054356-root.json
05:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1192 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33189 and previous config saved to /var/cache/conftool/dbconfig/20220826-054334-root.json
05:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1185 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33188 and previous config saved to /var/cache/conftool/dbconfig/20220826-054102-root.json
05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P33186 and previous config saved to /var/cache/conftool/dbconfig/20220826-054048-ladsgroup.json
05:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119', diff saved to https://phabricator.wikimedia.org/P33185 and previous config saved to /var/cache/conftool/dbconfig/20220826-054023-root.json
05:39 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1194 for the first time in s7 T313569', diff saved to https://phabricator.wikimedia.org/P33184 and previous config saved to /var/cache/conftool/dbconfig/20220826-053954-marostegui.json
05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P33183 and previous config saved to /var/cache/conftool/dbconfig/20220826-053915-ladsgroup.json
05:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P33182 and previous config saved to /var/cache/conftool/dbconfig/20220826-053724-ladsgroup.json
05:27 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1193 to dbctl T313569', diff saved to https://phabricator.wikimedia.org/P33181 and previous config saved to /var/cache/conftool/dbconfig/20220826-052715-marostegui.json
05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P33180 and previous config saved to /var/cache/conftool/dbconfig/20220826-052544-ladsgroup.json
05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P33179 and previous config saved to /var/cache/conftool/dbconfig/20220826-052410-ladsgroup.json
05:22 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1192 to dbctl T313569', diff saved to https://phabricator.wikimedia.org/P33178 and previous config saved to /var/cache/conftool/dbconfig/20220826-052233-marostegui.json
05:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P33177 and previous config saved to /var/cache/conftool/dbconfig/20220826-052219-ladsgroup.json
05:17 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1185 for the first time in s5 T313569', diff saved to https://phabricator.wikimedia.org/P33176 and previous config saved to /var/cache/conftool/dbconfig/20220826-051721-marostegui.json
05:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P33175 and previous config saved to /var/cache/conftool/dbconfig/20220826-051039-ladsgroup.json
05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P33174 and previous config saved to /var/cache/conftool/dbconfig/20220826-050906-ladsgroup.json
05:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling for maintenance', diff saved to https://phabricator.wikimedia.org/P33173 and previous config saved to /var/cache/conftool/dbconfig/20220826-050652-ladsgroup.json
05:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
05:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
00:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2131 (T312160)', diff saved to https://phabricator.wikimedia.org/P33172 and previous config saved to /var/cache/conftool/dbconfig/20220826-003819-ladsgroup.json
00:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2131', diff saved to https://phabricator.wikimedia.org/P33171 and previous config saved to /var/cache/conftool/dbconfig/20220826-002313-ladsgroup.json
00:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2131', diff saved to https://phabricator.wikimedia.org/P33170 and previous config saved to /var/cache/conftool/dbconfig/20220826-000807-ladsgroup.json
2022-08-25
23:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2131 (T312160)', diff saved to https://phabricator.wikimedia.org/P33169 and previous config saved to /var/cache/conftool/dbconfig/20220825-235300-ladsgroup.json
22:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T316186)', diff saved to https://phabricator.wikimedia.org/P33168 and previous config saved to /var/cache/conftool/dbconfig/20220825-223805-ladsgroup.json
22:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P33167 and previous config saved to /var/cache/conftool/dbconfig/20220825-222259-ladsgroup.json
22:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2131 (T312160)', diff saved to https://phabricator.wikimedia.org/P33165 and previous config saved to /var/cache/conftool/dbconfig/20220825-220937-ladsgroup.json
22:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2131.codfw.wmnet with reason: Maintenance
22:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2131.codfw.wmnet with reason: Maintenance
22:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P33164 and previous config saved to /var/cache/conftool/dbconfig/20220825-220753-ladsgroup.json
21:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T316186)', diff saved to https://phabricator.wikimedia.org/P33163 and previous config saved to /var/cache/conftool/dbconfig/20220825-215247-ladsgroup.json
21:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2180 (T316186)', diff saved to https://phabricator.wikimedia.org/P33162 and previous config saved to /var/cache/conftool/dbconfig/20220825-214722-ladsgroup.json
21:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
21:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
21:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T316186)', diff saved to https://phabricator.wikimedia.org/P33161 and previous config saved to /var/cache/conftool/dbconfig/20220825-214649-ladsgroup.json
21:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P33160 and previous config saved to /var/cache/conftool/dbconfig/20220825-213143-ladsgroup.json
21:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P33159 and previous config saved to /var/cache/conftool/dbconfig/20220825-211637-ladsgroup.json
21:12 bking@cumin2002: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159
21:02 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159
21:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T316186)', diff saved to https://phabricator.wikimedia.org/P33158 and previous config saved to /var/cache/conftool/dbconfig/20220825-210130-ladsgroup.json
20:56 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159
20:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T316186)', diff saved to https://phabricator.wikimedia.org/P33157 and previous config saved to /var/cache/conftool/dbconfig/20220825-202716-ladsgroup.json
20:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
20:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
20:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33156 and previous config saved to /var/cache/conftool/dbconfig/20220825-202647-ladsgroup.json
20:24 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: f37eff3: Make DiscussionTools autotopicsub also opt-out on A/B test wikis (T314693) (duration: 03m 37s)
20:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance
20:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance
20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2115 (T312160)', diff saved to https://phabricator.wikimedia.org/P33155 and previous config saved to /var/cache/conftool/dbconfig/20220825-201756-ladsgroup.json
20:17 urbanecm: [urbanecm@deploy1002 ~]$ rm /var/lock/scap.operations_mediawiki-config.lock # connection to deploy1002 handled, to let me re-sync
20:14 urandom: re-rebooting ms-be2067 to "fix" disk enumeration(?) -- T314049
20:11 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159
20:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P33154 and previous config saved to /var/cache/conftool/dbconfig/20220825-201141-ladsgroup.json
20:07 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159
20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2115', diff saved to https://phabricator.wikimedia.org/P33153 and previous config saved to /var/cache/conftool/dbconfig/20220825-200250-ladsgroup.json
19:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P33152 and previous config saved to /var/cache/conftool/dbconfig/20220825-195635-ladsgroup.json
19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2115', diff saved to https://phabricator.wikimedia.org/P33151 and previous config saved to /var/cache/conftool/dbconfig/20220825-194744-ladsgroup.json
19:42 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159
19:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33150 and previous config saved to /var/cache/conftool/dbconfig/20220825-194129-ladsgroup.json
19:41 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159
19:37 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudservices1003
19:37 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:36 urandom: rebooting ms-be2067 to "fix" disk enumeration(?) -- T314049
19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T316186)', diff saved to https://phabricator.wikimedia.org/P33149 and previous config saved to /var/cache/conftool/dbconfig/20220825-193513-ladsgroup.json
19:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
19:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
19:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33148 and previous config saved to /var/cache/conftool/dbconfig/20220825-193430-ladsgroup.json
19:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2115 (T312160)', diff saved to https://phabricator.wikimedia.org/P33147 and previous config saved to /var/cache/conftool/dbconfig/20220825-193238-ladsgroup.json
19:29 andrew@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudservices1003
19:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33146 and previous config saved to /var/cache/conftool/dbconfig/20220825-191924-ladsgroup.json
19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33145 and previous config saved to /var/cache/conftool/dbconfig/20220825-190417-ladsgroup.json
18:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33144 and previous config saved to /var/cache/conftool/dbconfig/20220825-184911-ladsgroup.json
18:48 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159
18:47 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - bking@cumin2002 - T316159
18:45 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@d00af45]: bump elasticsearch-hadoop to 7.10.2
18:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T316186)', diff saved to https://phabricator.wikimedia.org/P33143 and previous config saved to /var/cache/conftool/dbconfig/20220825-184301-ladsgroup.json
18:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
18:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
18:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T316186)', diff saved to https://phabricator.wikimedia.org/P33142 and previous config saved to /var/cache/conftool/dbconfig/20220825-184233-ladsgroup.json
18:33 ottomata: rolling restart of eventgate-analytics-external to pick up retroactive schema change for android schemas in T316047
18:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P33141 and previous config saved to /var/cache/conftool/dbconfig/20220825-182727-ladsgroup.json
18:19 dancy@deploy1002: rebuilt and synchronized wikiversions files: (no justification provided)
18:18 bmansurov@deploy1002: Started deploy [airflow-dags/research@5712187]: (no justification provided)
18:13 dancy@deploy1002: Installation of scap version "4.15.0" completed for 557 hosts
18:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P33140 and previous config saved to /var/cache/conftool/dbconfig/20220825-181221-ladsgroup.json
18:11 dancy@deploy1002: Installing scap version "4.15.0" for 557 hosts
17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T316186)', diff saved to https://phabricator.wikimedia.org/P33139 and previous config saved to /var/cache/conftool/dbconfig/20220825-175715-ladsgroup.json
17:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T316186)', diff saved to https://phabricator.wikimedia.org/P33138 and previous config saved to /var/cache/conftool/dbconfig/20220825-174946-ladsgroup.json
17:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1131.eqiad.wmnet with reason: Maintenance
17:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1131.eqiad.wmnet with reason: Maintenance
17:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2115 (T312160)', diff saved to https://phabricator.wikimedia.org/P33137 and previous config saved to /var/cache/conftool/dbconfig/20220825-174826-ladsgroup.json
17:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2115.codfw.wmnet with reason: Maintenance
17:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2115.codfw.wmnet with reason: Maintenance
17:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
17:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
17:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33136 and previous config saved to /var/cache/conftool/dbconfig/20220825-173731-ladsgroup.json
17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P33135 and previous config saved to /var/cache/conftool/dbconfig/20220825-172225-ladsgroup.json
17:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P33133 and previous config saved to /var/cache/conftool/dbconfig/20220825-170719-ladsgroup.json
16:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33132 and previous config saved to /var/cache/conftool/dbconfig/20220825-165213-ladsgroup.json
16:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33131 and previous config saved to /var/cache/conftool/dbconfig/20220825-164556-ladsgroup.json
16:40 urandom: shutting down ms-be2067.codfw.wmnet for backplane replacement -- T314049
16:37 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be2067.codfw.wmnet with reason: backplane replacement
16:37 eevans@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be2067.codfw.wmnet with reason: backplane replacement
16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P33130 and previous config saved to /var/cache/conftool/dbconfig/20220825-163050-ladsgroup.json
16:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P33129 and previous config saved to /var/cache/conftool/dbconfig/20220825-161544-ladsgroup.json
16:07 bmansurov@deploy1002: Started deploy [airflow-dags/research@5712187]: (no justification provided)
16:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
16:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
16:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1120 (T312160)', diff saved to https://phabricator.wikimedia.org/P33128 and previous config saved to /var/cache/conftool/dbconfig/20220825-160250-ladsgroup.json
16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33127 and previous config saved to /var/cache/conftool/dbconfig/20220825-160036-ladsgroup.json
16:00 bmansurov@deploy1002: Started deploy [airflow-dags/research@5712187]: (no justification provided)
15:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 (T316186)', diff saved to https://phabricator.wikimedia.org/P33126 and previous config saved to /var/cache/conftool/dbconfig/20220825-155529-ladsgroup.json
15:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3316 (T316186)', diff saved to https://phabricator.wikimedia.org/P33125 and previous config saved to /var/cache/conftool/dbconfig/20220825-155506-ladsgroup.json
15:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
15:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
15:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P33124 and previous config saved to /var/cache/conftool/dbconfig/20220825-155401-ladsgroup.json
15:50 bmansurov@deploy1002: Started deploy [airflow-dags/research@5712187]: (no justification provided)
15:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1120', diff saved to https://phabricator.wikimedia.org/P33123 and previous config saved to /var/cache/conftool/dbconfig/20220825-154743-ladsgroup.json
15:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P33122 and previous config saved to /var/cache/conftool/dbconfig/20220825-154438-ladsgroup.json
15:41 bmansurov@deploy1002: Started deploy [airflow-dags/research@5712187]: (no justification provided)
15:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1120', diff saved to https://phabricator.wikimedia.org/P33121 and previous config saved to /var/cache/conftool/dbconfig/20220825-153237-ladsgroup.json
15:31 bmansurov@deploy1002: Started deploy [airflow-dags/research@5712187]: (no justification provided)
15:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T316186)', diff saved to https://phabricator.wikimedia.org/P33120 and previous config saved to /var/cache/conftool/dbconfig/20220825-152932-ladsgroup.json
15:26 bmansurov@deploy1002: Started deploy [airflow-dags/research@5712187]: (no justification provided)
15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2158 (T316186)', diff saved to https://phabricator.wikimedia.org/P33119 and previous config saved to /var/cache/conftool/dbconfig/20220825-152417-ladsgroup.json
15:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
15:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
15:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
15:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
15:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1120 (T312160)', diff saved to https://phabricator.wikimedia.org/P33118 and previous config saved to /var/cache/conftool/dbconfig/20220825-151731-ladsgroup.json
14:44 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx1001.wikimedia.org with reason: New Kernel
14:43 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx1001.wikimedia.org with reason: New Kernel
14:42 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: New Kernel
14:42 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: New Kernel
14:36 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: New Kernel
14:36 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: New Kernel
14:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on ganeti2025.codfw.wmnet with reason: Remove node for eventual reimage, T311686
14:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on ganeti2025.codfw.wmnet with reason: Remove node for eventual reimage, T311686
14:11 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2002.codfw.wmnet
14:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T316186)', diff saved to https://phabricator.wikimedia.org/P33117 and previous config saved to /var/cache/conftool/dbconfig/20220825-140915-ladsgroup.json
14:07 cgoubert@cumin1001: START - Cookbook sre.hosts.reboot-single for host people2002.codfw.wmnet
13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P33116 and previous config saved to /var/cache/conftool/dbconfig/20220825-135408-ladsgroup.json
13:45 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1003.eqiad.wmnet
13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1120 (T312160)', diff saved to https://phabricator.wikimedia.org/P33115 and previous config saved to /var/cache/conftool/dbconfig/20220825-134318-ladsgroup.json
13:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1120.eqiad.wmnet with reason: Maintenance
13:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1120.eqiad.wmnet with reason: Maintenance
13:39 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1003.eqiad.wmnet
13:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P33114 and previous config saved to /var/cache/conftool/dbconfig/20220825-133902-ladsgroup.json
13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T316186)', diff saved to https://phabricator.wikimedia.org/P33113 and previous config saved to /var/cache/conftool/dbconfig/20220825-132356-ladsgroup.json
13:19 vgutierrez: disable origin coalescing in ats-be globally - T315911
13:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2124 (T316186)', diff saved to https://phabricator.wikimedia.org/P33112 and previous config saved to /var/cache/conftool/dbconfig/20220825-131735-ladsgroup.json
13:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
13:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
13:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P33111 and previous config saved to /var/cache/conftool/dbconfig/20220825-130950-ladsgroup.json
13:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2117 (T316186)', diff saved to https://phabricator.wikimedia.org/P33110 and previous config saved to /var/cache/conftool/dbconfig/20220825-130235-ladsgroup.json
13:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:00 ladsgroup@cumin1001: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
13:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P33109 and previous config saved to /var/cache/conftool/dbconfig/20220825-125806-ladsgroup.json
12:57 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1002.eqiad.wmnet
12:46 ladsgroup@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host db2114.codfw.wmnet
12:45 hashar@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.26 refs T314187
12:40 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1001.eqiad.wmnet
12:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.reboot-single for host db2114.codfw.wmnet
12:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T316186)', diff saved to https://phabricator.wikimedia.org/P33108 and previous config saved to /var/cache/conftool/dbconfig/20220825-123448-ladsgroup.json
12:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
12:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
12:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Testing a script
12:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Testing a script
12:06 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 11 days, 0:00:00 on ms-fe1012.eqiad.wmnet with reason: known depooled, left for investigation
12:06 filippo@cumin1001: START - Cookbook sre.hosts.downtime for 11 days, 0:00:00 on ms-fe1012.eqiad.wmnet with reason: known depooled, left for investigation
11:57 godog: roll-restart swift-proxy on thanos-fe* and ms-fe* (not ms-fe1012)
11:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1116.eqiad.wmnet with reason: Maintenance
11:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1116.eqiad.wmnet with reason: Maintenance
11:40 godog: depool ms-fe1012, leave swift-proxy alone for investigation
11:32 godog: restart swift-proxy on ms-fe1010
11:29 marostegui: Failover m1-master
11:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
11:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance
11:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1109.eqiad.wmnet with reason: Maintenance
11:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1109.eqiad.wmnet with reason: Maintenance
11:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T314041)', diff saved to https://phabricator.wikimedia.org/P33106 and previous config saved to /var/cache/conftool/dbconfig/20220825-110448-ladsgroup.json
10:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P33105 and previous config saved to /var/cache/conftool/dbconfig/20220825-104942-ladsgroup.json
10:42 cgoubert@cumin1001: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-eqiad
10:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P33104 and previous config saved to /var/cache/conftool/dbconfig/20220825-103436-ladsgroup.json
10:23 cgoubert@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-eqiad
10:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T314041)', diff saved to https://phabricator.wikimedia.org/P33103 and previous config saved to /var/cache/conftool/dbconfig/20220825-101930-ladsgroup.json
10:13 cgoubert@cumin1001: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-codfw
10:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
10:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
10:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1137 (T312160)', diff saved to https://phabricator.wikimedia.org/P33102 and previous config saved to /var/cache/conftool/dbconfig/20220825-100915-ladsgroup.json
10:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
10:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
10:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1174', diff saved to https://phabricator.wikimedia.org/P33100 and previous config saved to /var/cache/conftool/dbconfig/20220825-100010-root.json
09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T314041)', diff saved to https://phabricator.wikimedia.org/P33099 and previous config saved to /var/cache/conftool/dbconfig/20220825-095942-ladsgroup.json
09:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
09:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
09:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
09:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
09:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P33098 and previous config saved to /var/cache/conftool/dbconfig/20220825-095611-ladsgroup.json
09:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1137', diff saved to https://phabricator.wikimedia.org/P33097 and previous config saved to /var/cache/conftool/dbconfig/20220825-095408-ladsgroup.json
09:51 moritzm: installing libxslt security updates on bullseye
09:50 cgoubert@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-codfw
09:49 jynus: restart backup1002, backup2002
09:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P33096 and previous config saved to /var/cache/conftool/dbconfig/20220825-094646-ladsgroup.json
09:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1191 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33095 and previous config saved to /var/cache/conftool/dbconfig/20220825-094438-root.json
09:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1190 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33094 and previous config saved to /var/cache/conftool/dbconfig/20220825-094401-root.json
09:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33093 and previous config saved to /var/cache/conftool/dbconfig/20220825-094353-root.json
09:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1186 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33092 and previous config saved to /var/cache/conftool/dbconfig/20220825-094345-root.json
09:39 marostegui: Reboot standby dbproxy hosts
09:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1137', diff saved to https://phabricator.wikimedia.org/P33091 and previous config saved to /var/cache/conftool/dbconfig/20220825-093902-ladsgroup.json
09:35 jynus: restart backup2001
09:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P33090 and previous config saved to /var/cache/conftool/dbconfig/20220825-093140-ladsgroup.json
09:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1191 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33089 and previous config saved to /var/cache/conftool/dbconfig/20220825-092933-root.json
09:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1190 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33088 and previous config saved to /var/cache/conftool/dbconfig/20220825-092856-root.json
09:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33087 and previous config saved to /var/cache/conftool/dbconfig/20220825-092848-root.json
09:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1186 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33086 and previous config saved to /var/cache/conftool/dbconfig/20220825-092840-root.json
09:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1137 (T312160)', diff saved to https://phabricator.wikimedia.org/P33085 and previous config saved to /var/cache/conftool/dbconfig/20220825-092356-ladsgroup.json
09:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T314041)', diff saved to https://phabricator.wikimedia.org/P33084 and previous config saved to /var/cache/conftool/dbconfig/20220825-091633-ladsgroup.json
09:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P33083 and previous config saved to /var/cache/conftool/dbconfig/20220825-091448-root.json
09:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2154 (T314041)', diff saved to https://phabricator.wikimedia.org/P33082 and previous config saved to /var/cache/conftool/dbconfig/20220825-091447-ladsgroup.json
09:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
09:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1191 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33081 and previous config saved to /var/cache/conftool/dbconfig/20220825-091428-root.json
09:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance
09:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1190 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33080 and previous config saved to /var/cache/conftool/dbconfig/20220825-091351-root.json
09:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33079 and previous config saved to /var/cache/conftool/dbconfig/20220825-091344-root.json
09:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
09:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1186 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33078 and previous config saved to /var/cache/conftool/dbconfig/20220825-091336-root.json
09:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2100.codfw.wmnet with reason: Maintenance
09:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T314041)', diff saved to https://phabricator.wikimedia.org/P33077 and previous config saved to /var/cache/conftool/dbconfig/20220825-091325-ladsgroup.json
08:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P33075 and previous config saved to /var/cache/conftool/dbconfig/20220825-085943-root.json
08:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1191 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33074 and previous config saved to /var/cache/conftool/dbconfig/20220825-085924-root.json
08:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1190 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33073 and previous config saved to /var/cache/conftool/dbconfig/20220825-085847-root.json
08:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33072 and previous config saved to /var/cache/conftool/dbconfig/20220825-085839-root.json
08:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1186 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33071 and previous config saved to /var/cache/conftool/dbconfig/20220825-085831-root.json
08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P33070 and previous config saved to /var/cache/conftool/dbconfig/20220825-085819-ladsgroup.json
08:54 moritzm: installing curl security updates on bullseye
08:50 moritzm: installing gnutls28 security updates on bullseye
08:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P33069 and previous config saved to /var/cache/conftool/dbconfig/20220825-084438-root.json
08:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1191 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33068 and previous config saved to /var/cache/conftool/dbconfig/20220825-084419-root.json
08:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1190 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33067 and previous config saved to /var/cache/conftool/dbconfig/20220825-084342-root.json
08:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33066 and previous config saved to /var/cache/conftool/dbconfig/20220825-084334-root.json
08:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1186 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33065 and previous config saved to /var/cache/conftool/dbconfig/20220825-084326-root.json
08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P33064 and previous config saved to /var/cache/conftool/dbconfig/20220825-084313-ladsgroup.json
08:39 jynus: restarting backupmon1001
08:30 marostegui: Failover m1 from db1164 to db1195 - T315864
08:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P33063 and previous config saved to /var/cache/conftool/dbconfig/20220825-082933-root.json
08:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1191 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33062 and previous config saved to /var/cache/conftool/dbconfig/20220825-082915-root.json
08:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1190 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33061 and previous config saved to /var/cache/conftool/dbconfig/20220825-082837-root.json
08:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33060 and previous config saved to /var/cache/conftool/dbconfig/20220825-082830-root.json
08:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1186 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33059 and previous config saved to /var/cache/conftool/dbconfig/20220825-082821-root.json
08:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T314041)', diff saved to https://phabricator.wikimedia.org/P33058 and previous config saved to /var/cache/conftool/dbconfig/20220825-082807-ladsgroup.json
08:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2162 (T314041)', diff saved to https://phabricator.wikimedia.org/P33057 and previous config saved to /var/cache/conftool/dbconfig/20220825-082621-ladsgroup.json
08:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
08:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2162.codfw.wmnet with reason: Maintenance
08:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T314041)', diff saved to https://phabricator.wikimedia.org/P33056 and previous config saved to /var/cache/conftool/dbconfig/20220825-082559-ladsgroup.json
08:23 vgutierrez: Increase roll-out of query-sorting to 5% - T314868
08:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P33055 and previous config saved to /var/cache/conftool/dbconfig/20220825-081429-root.json
08:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1191 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33054 and previous config saved to /var/cache/conftool/dbconfig/20220825-081410-root.json
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1190 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33053 and previous config saved to /var/cache/conftool/dbconfig/20220825-081333-root.json
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33052 and previous config saved to /var/cache/conftool/dbconfig/20220825-081325-root.json
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1186 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33051 and previous config saved to /var/cache/conftool/dbconfig/20220825-081316-root.json
08:13 jynus: stopping bacula services on backup1001 T315864
08:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P33050 and previous config saved to /var/cache/conftool/dbconfig/20220825-081053-ladsgroup.json
08:09 marostegui: Reboot db1195 for kernel upgrade T315864
07:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P33049 and previous config saved to /var/cache/conftool/dbconfig/20220825-075924-root.json
07:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1191 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33048 and previous config saved to /var/cache/conftool/dbconfig/20220825-075905-root.json
07:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1190 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33047 and previous config saved to /var/cache/conftool/dbconfig/20220825-075828-root.json
07:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33046 and previous config saved to /var/cache/conftool/dbconfig/20220825-075820-root.json
07:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1186 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33045 and previous config saved to /var/cache/conftool/dbconfig/20220825-075811-root.json
07:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P33044 and previous config saved to /var/cache/conftool/dbconfig/20220825-075547-ladsgroup.json
07:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2132,2160].codfw.wmnet,db[1117,1164,1195].eqiad.wmnet with reason: Switchover m1 T315864
07:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2132,2160].codfw.wmnet,db[1117,1164,1195].eqiad.wmnet with reason: Switchover m1 T315864
07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1191 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33042 and previous config saved to /var/cache/conftool/dbconfig/20220825-074400-root.json
07:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1190 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33041 and previous config saved to /var/cache/conftool/dbconfig/20220825-074323-root.json
07:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33040 and previous config saved to /var/cache/conftool/dbconfig/20220825-074315-root.json
07:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1186 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P33039 and previous config saved to /var/cache/conftool/dbconfig/20220825-074307-root.json
07:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1137 (T312160)', diff saved to https://phabricator.wikimedia.org/P33038 and previous config saved to /var/cache/conftool/dbconfig/20220825-074220-ladsgroup.json
07:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1137.eqiad.wmnet with reason: Maintenance
07:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1137.eqiad.wmnet with reason: Maintenance
07:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T314041)', diff saved to https://phabricator.wikimedia.org/P33037 and previous config saved to /var/cache/conftool/dbconfig/20220825-074041-ladsgroup.json
07:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2163 (T314041)', diff saved to https://phabricator.wikimedia.org/P33036 and previous config saved to /var/cache/conftool/dbconfig/20220825-073855-ladsgroup.json
07:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
07:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance
07:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T314041)', diff saved to https://phabricator.wikimedia.org/P33035 and previous config saved to /var/cache/conftool/dbconfig/20220825-073834-ladsgroup.json
07:34 marostegui: Promote pc1012 back as pc2 master T315526
07:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 100%: Repooling after cloning db1185', diff saved to https://phabricator.wikimedia.org/P33034 and previous config saved to /var/cache/conftool/dbconfig/20220825-072340-root.json
07:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P33033 and previous config saved to /var/cache/conftool/dbconfig/20220825-072327-ladsgroup.json
07:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 75%: Repooling after cloning db1185', diff saved to https://phabricator.wikimedia.org/P33032 and previous config saved to /var/cache/conftool/dbconfig/20220825-070835-root.json
07:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P33031 and previous config saved to /var/cache/conftool/dbconfig/20220825-070821-ladsgroup.json
06:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 50%: Repooling after cloning db1185', diff saved to https://phabricator.wikimedia.org/P33030 and previous config saved to /var/cache/conftool/dbconfig/20220825-065331-root.json
06:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T314041)', diff saved to https://phabricator.wikimedia.org/P33029 and previous config saved to /var/cache/conftool/dbconfig/20220825-065315-ladsgroup.json
06:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2164 (T314041)', diff saved to https://phabricator.wikimedia.org/P33028 and previous config saved to /var/cache/conftool/dbconfig/20220825-065128-ladsgroup.json
06:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
06:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
06:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
06:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2164.codfw.wmnet with reason: Maintenance
06:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 25%: Repooling after cloning db1185', diff saved to https://phabricator.wikimedia.org/P33027 and previous config saved to /var/cache/conftool/dbconfig/20220825-063826-root.json
06:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1143.eqiad.wmnet with reason: Maintenance
06:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1143.eqiad.wmnet with reason: Maintenance
06:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1138.eqiad.wmnet with reason: Maintenance
06:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1138.eqiad.wmnet with reason: Maintenance
06:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1138.eqiad.wmnet with reason: Maint on s4 old master
06:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1138.eqiad.wmnet with reason: Maint on s4 old master
06:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1138 T315419', diff saved to https://phabricator.wikimedia.org/P33026 and previous config saved to /var/cache/conftool/dbconfig/20220825-062852-ladsgroup.json
06:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1160 to s4 primary and set section read-write T315419', diff saved to https://phabricator.wikimedia.org/P33025 and previous config saved to /var/cache/conftool/dbconfig/20220825-062425-ladsgroup.json
06:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s4 eqiad as read-only for maintenance - T315419', diff saved to https://phabricator.wikimedia.org/P33024 and previous config saved to /var/cache/conftool/dbconfig/20220825-062353-ladsgroup.json
06:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 10%: Repooling after cloning db1185', diff saved to https://phabricator.wikimedia.org/P33023 and previous config saved to /var/cache/conftool/dbconfig/20220825-062321-root.json
06:22 Amir1: Starting s4 eqiad failover from db1138 to db1160 - T315419
06:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 5%: Repooling after cloning db1185', diff saved to https://phabricator.wikimedia.org/P33022 and previous config saved to /var/cache/conftool/dbconfig/20220825-060816-root.json
06:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1114', diff saved to https://phabricator.wikimedia.org/P33020 and previous config saved to /var/cache/conftool/dbconfig/20220825-060601-root.json
05:50 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1191 with minimal weight in s7 T313569', diff saved to https://phabricator.wikimedia.org/P33019 and previous config saved to /var/cache/conftool/dbconfig/20220825-055057-root.json
05:50 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1191 to dbctl T313569', diff saved to https://phabricator.wikimedia.org/P33018 and previous config saved to /var/cache/conftool/dbconfig/20220825-055038-marostegui.json
05:33 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1190 with minimal weight in s4 T313569', diff saved to https://phabricator.wikimedia.org/P33017 and previous config saved to /var/cache/conftool/dbconfig/20220825-053310-root.json
05:32 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1190 to dbctl T313569', diff saved to https://phabricator.wikimedia.org/P33016 and previous config saved to /var/cache/conftool/dbconfig/20220825-053253-marostegui.json
05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1160 with weight 0 T315419', diff saved to https://phabricator.wikimedia.org/P33015 and previous config saved to /var/cache/conftool/dbconfig/20220825-052415-ladsgroup.json
05:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s4 T315419
05:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s4 T315419
05:17 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1188 with minimal weight in s2 T313569', diff saved to https://phabricator.wikimedia.org/P33013 and previous config saved to /var/cache/conftool/dbconfig/20220825-051754-root.json
05:17 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1188 to dbctl T313569', diff saved to https://phabricator.wikimedia.org/P33012 and previous config saved to /var/cache/conftool/dbconfig/20220825-051737-marostegui.json
05:11 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1186 with minimal weight in s1 T313569', diff saved to https://phabricator.wikimedia.org/P33011 and previous config saved to /var/cache/conftool/dbconfig/20220825-051155-root.json
05:11 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1186 to dbctl', diff saved to https://phabricator.wikimedia.org/P33010 and previous config saved to /var/cache/conftool/dbconfig/20220825-051130-marostegui.json
05:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1130', diff saved to https://phabricator.wikimedia.org/P33008 and previous config saved to /var/cache/conftool/dbconfig/20220825-050713-root.json
05:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T314041)', diff saved to https://phabricator.wikimedia.org/P33007 and previous config saved to /var/cache/conftool/dbconfig/20220825-050539-ladsgroup.json
04:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P33006 and previous config saved to /var/cache/conftool/dbconfig/20220825-045033-ladsgroup.json
04:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P33005 and previous config saved to /var/cache/conftool/dbconfig/20220825-043527-ladsgroup.json
04:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T314041)', diff saved to https://phabricator.wikimedia.org/P33004 and previous config saved to /var/cache/conftool/dbconfig/20220825-042020-ladsgroup.json
04:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2166 (T314041)', diff saved to https://phabricator.wikimedia.org/P33003 and previous config saved to /var/cache/conftool/dbconfig/20220825-041833-ladsgroup.json
04:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
04:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2166.codfw.wmnet with reason: Maintenance
04:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T314041)', diff saved to https://phabricator.wikimedia.org/P33002 and previous config saved to /var/cache/conftool/dbconfig/20220825-041812-ladsgroup.json
04:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P33001 and previous config saved to /var/cache/conftool/dbconfig/20220825-040306-ladsgroup.json
03:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2165', diff saved to https://phabricator.wikimedia.org/P33000 and previous config saved to /var/cache/conftool/dbconfig/20220825-034759-ladsgroup.json
03:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2165 (T314041)', diff saved to https://phabricator.wikimedia.org/P32999 and previous config saved to /var/cache/conftool/dbconfig/20220825-033253-ladsgroup.json
03:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2165 (T314041)', diff saved to https://phabricator.wikimedia.org/P32998 and previous config saved to /var/cache/conftool/dbconfig/20220825-033107-ladsgroup.json
03:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
03:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2165.codfw.wmnet with reason: Maintenance
03:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32997 and previous config saved to /var/cache/conftool/dbconfig/20220825-033045-ladsgroup.json
03:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P32996 and previous config saved to /var/cache/conftool/dbconfig/20220825-031539-ladsgroup.json
03:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P32995 and previous config saved to /var/cache/conftool/dbconfig/20220825-030033-ladsgroup.json
02:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32994 and previous config saved to /var/cache/conftool/dbconfig/20220825-024527-ladsgroup.json
02:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32993 and previous config saved to /var/cache/conftool/dbconfig/20220825-024339-ladsgroup.json
02:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
02:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance
02:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32992 and previous config saved to /var/cache/conftool/dbconfig/20220825-024318-ladsgroup.json
02:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P32991 and previous config saved to /var/cache/conftool/dbconfig/20220825-022812-ladsgroup.json
02:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P32990 and previous config saved to /var/cache/conftool/dbconfig/20220825-021306-ladsgroup.json
01:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32989 and previous config saved to /var/cache/conftool/dbconfig/20220825-015800-ladsgroup.json
01:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32988 and previous config saved to /var/cache/conftool/dbconfig/20220825-015612-ladsgroup.json
01:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
01:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
01:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T314041)', diff saved to https://phabricator.wikimedia.org/P32987 and previous config saved to /var/cache/conftool/dbconfig/20220825-015550-ladsgroup.json
01:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P32986 and previous config saved to /var/cache/conftool/dbconfig/20220825-014044-ladsgroup.json
01:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P32985 and previous config saved to /var/cache/conftool/dbconfig/20220825-012538-ladsgroup.json
01:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T314041)', diff saved to https://phabricator.wikimedia.org/P32984 and previous config saved to /var/cache/conftool/dbconfig/20220825-011032-ladsgroup.json
01:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2181 (T314041)', diff saved to https://phabricator.wikimedia.org/P32983 and previous config saved to /var/cache/conftool/dbconfig/20220825-010845-ladsgroup.json
01:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
01:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2181.codfw.wmnet with reason: Maintenance
01:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T314041)', diff saved to https://phabricator.wikimedia.org/P32982 and previous config saved to /var/cache/conftool/dbconfig/20220825-010824-ladsgroup.json
00:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P32981 and previous config saved to /var/cache/conftool/dbconfig/20220825-005318-ladsgroup.json
00:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P32980 and previous config saved to /var/cache/conftool/dbconfig/20220825-003812-ladsgroup.json
00:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T314041)', diff saved to https://phabricator.wikimedia.org/P32979 and previous config saved to /var/cache/conftool/dbconfig/20220825-002306-ladsgroup.json
00:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2152 (T314041)', diff saved to https://phabricator.wikimedia.org/P32978 and previous config saved to /var/cache/conftool/dbconfig/20220825-002120-ladsgroup.json
00:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
00:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
00:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
00:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance
00:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T314041)', diff saved to https://phabricator.wikimedia.org/P32977 and previous config saved to /var/cache/conftool/dbconfig/20220825-001949-ladsgroup.json
00:15 ejegg: fundraising scheduled jobs re-enabled
00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P32976 and previous config saved to /var/cache/conftool/dbconfig/20220825-000443-ladsgroup.json
2022-08-24
23:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P32975 and previous config saved to /var/cache/conftool/dbconfig/20220824-234937-ladsgroup.json
23:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T314041)', diff saved to https://phabricator.wikimedia.org/P32974 and previous config saved to /var/cache/conftool/dbconfig/20220824-233431-ladsgroup.json
23:33 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restarting to apply OpenJDK 8u342 - eevans@cumin1001
23:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T314041)', diff saved to https://phabricator.wikimedia.org/P32973 and previous config saved to /var/cache/conftool/dbconfig/20220824-233046-ladsgroup.json
23:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
23:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
23:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32972 and previous config saved to /var/cache/conftool/dbconfig/20220824-233025-ladsgroup.json
23:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P32971 and previous config saved to /var/cache/conftool/dbconfig/20220824-231519-ladsgroup.json
23:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P32970 and previous config saved to /var/cache/conftool/dbconfig/20220824-230013-ladsgroup.json
22:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32969 and previous config saved to /var/cache/conftool/dbconfig/20220824-224507-ladsgroup.json
22:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32968 and previous config saved to /var/cache/conftool/dbconfig/20220824-224214-ladsgroup.json
22:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
22:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
22:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T314041)', diff saved to https://phabricator.wikimedia.org/P32967 and previous config saved to /var/cache/conftool/dbconfig/20220824-224153-ladsgroup.json
22:37 ryankemper: [Elastic] We're back to green in `cloudelastic-chi`, so cloudelastic is back to fully healthy
22:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P32966 and previous config saved to /var/cache/conftool/dbconfig/20220824-222646-ladsgroup.json
22:20 ryankemper: [Elastic] We've got the cloudelastic instances all back up. A bunch of shard recoveries ongoing; currently the cluster is red. It might go all the way back to green; hard to say until the shard recoveries complete.
22:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P32965 and previous config saved to /var/cache/conftool/dbconfig/20220824-221140-ladsgroup.json
21:58 ryankemper: [Elastic] `ryankemper@cloudelastic1003:~$ sudo systemctl restart elasticsearch_6@cloudelastic-chi-eqiad.service`, 1003 was also oom-killed: `[4165984.362182] Out of memory: Killed process 3759 (java) total-vm:2277062348kB, anon-rss:61648756kB, file-rss:0kB, shmem-rss:0kB, UID:113 pgtables:1448136kB oom_score_adj:0`
21:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T314041)', diff saved to https://phabricator.wikimedia.org/P32964 and previous config saved to /var/cache/conftool/dbconfig/20220824-215634-ladsgroup.json
21:54 ryankemper: [Elastic] `ryankemper@cloudelastic1004:~$ sudo systemctl restart elasticsearch_6@cloudelastic-chi-eqiad.service` Restarting 1004's chi eqiad, it died due to `Aug 24 21:43:21 cloudelastic1004 systemd[1]: elasticsearch_6@cloudelastic-chi-eqiad.service: Main process exited, code=killed, status=9/KILL`
21:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T314041)', diff saved to https://phabricator.wikimedia.org/P32963 and previous config saved to /var/cache/conftool/dbconfig/20220824-215143-ladsgroup.json
21:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1114.eqiad.wmnet with reason: Maintenance
21:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1114.eqiad.wmnet with reason: Maintenance
21:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1116.eqiad.wmnet with reason: Maintenance
21:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1116.eqiad.wmnet with reason: Maintenance
21:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T314041)', diff saved to https://phabricator.wikimedia.org/P32962 and previous config saved to /var/cache/conftool/dbconfig/20220824-215025-ladsgroup.json
21:48 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on 6 hosts with reason: T316159
21:48 bking@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on 6 hosts with reason: T316159
21:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P32961 and previous config saved to /var/cache/conftool/dbconfig/20220824-213519-ladsgroup.json
21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P32959 and previous config saved to /var/cache/conftool/dbconfig/20220824-212013-ladsgroup.json
21:20 mutante: setting weight to 25 (from 30) for appservers and API servers in the range mw1307 through mw1348 because they are of an older hardware type (not changing weights of jobrunners/videoscalers even if in this range) (T304800)
21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T314041)', diff saved to https://phabricator.wikimedia.org/P32958 and previous config saved to /var/cache/conftool/dbconfig/20220824-210507-ladsgroup.json
21:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T314041)', diff saved to https://phabricator.wikimedia.org/P32957 and previous config saved to /var/cache/conftool/dbconfig/20220824-210216-ladsgroup.json
21:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1126.eqiad.wmnet with reason: Maintenance
21:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1126.eqiad.wmnet with reason: Maintenance
21:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32956 and previous config saved to /var/cache/conftool/dbconfig/20220824-210155-ladsgroup.json
20:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P32955 and previous config saved to /var/cache/conftool/dbconfig/20220824-204649-ladsgroup.json
20:44 eevans@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restarting to apply OpenJDK 8u342 - eevans@cumin1001
20:40 mutante: otrs1001 - systemctl reset-failed
20:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P32954 and previous config saved to /var/cache/conftool/dbconfig/20220824-203143-ladsgroup.json
20:21 ejegg: updated standalone SmashPig deploy from 13e9e9cc to 11ba0a1b
20:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32953 and previous config saved to /var/cache/conftool/dbconfig/20220824-201637-ladsgroup.json
20:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T314041)', diff saved to https://phabricator.wikimedia.org/P32952 and previous config saved to /var/cache/conftool/dbconfig/20220824-201344-ladsgroup.json
20:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1101.eqiad.wmnet with reason: Maintenance
20:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1101.eqiad.wmnet with reason: Maintenance
20:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
20:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
20:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T314041)', diff saved to https://phabricator.wikimedia.org/P32951 and previous config saved to /var/cache/conftool/dbconfig/20220824-201224-ladsgroup.json
19:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P32950 and previous config saved to /var/cache/conftool/dbconfig/20220824-195717-ladsgroup.json
19:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P32949 and previous config saved to /var/cache/conftool/dbconfig/20220824-194211-ladsgroup.json
19:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T314041)', diff saved to https://phabricator.wikimedia.org/P32948 and previous config saved to /var/cache/conftool/dbconfig/20220824-192705-ladsgroup.json
19:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1111.eqiad.wmnet with reason: Maintenance
19:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1111.eqiad.wmnet with reason: Maintenance
19:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T314041)', diff saved to https://phabricator.wikimedia.org/P32946 and previous config saved to /var/cache/conftool/dbconfig/20220824-191943-ladsgroup.json
19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P32945 and previous config saved to /var/cache/conftool/dbconfig/20220824-190437-ladsgroup.json
18:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P32944 and previous config saved to /var/cache/conftool/dbconfig/20220824-184931-ladsgroup.json
18:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T314041)', diff saved to https://phabricator.wikimedia.org/P32943 and previous config saved to /var/cache/conftool/dbconfig/20220824-183425-ladsgroup.json
17:46 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restarting to apply OpenJDK 8u342 - eevans@cumin1001
17:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2096.codfw.wmnet with reason: Maintenance
17:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2096.codfw.wmnet with reason: Maintenance
17:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1103.eqiad.wmnet with reason: Maintenance
17:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1103.eqiad.wmnet with reason: Maintenance
17:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T314041)', diff saved to https://phabricator.wikimedia.org/P32942 and previous config saved to /var/cache/conftool/dbconfig/20220824-173409-ladsgroup.json
17:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
17:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
17:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
17:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1167.eqiad.wmnet with reason: Maintenance
17:06 btullis@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1009.eqiad.wmnet with OS bullseye
16:17 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host an-presto1009.eqiad.wmnet with OS bullseye
16:15 btullis@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-presto1007.eqiad.wmnet with OS bullseye
16:05 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host an-presto1007.eqiad.wmnet with OS bullseye
16:00 hashar: Restarted CI Jenkins, Release Jenkins, Gerrit replica and Gerrit
15:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2115 (T312975)', diff saved to https://phabricator.wikimedia.org/P32941 and previous config saved to /var/cache/conftool/dbconfig/20220824-151445-ladsgroup.json
15:12 eevans@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restarting to apply OpenJDK 8u342 - eevans@cumin1001
15:04 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase1016.eqiad.wmnet: Restarting to canary OpenJDK 8u342 - eevans@cumin1001
15:01 btullis: restarting pybal on lvs1019
14:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2115', diff saved to https://phabricator.wikimedia.org/P32940 and previous config saved to /var/cache/conftool/dbconfig/20220824-145939-ladsgroup.json
14:57 btullis: restarting pybal on lvs1020
14:55 eevans@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase1016.eqiad.wmnet: Restarting to canary OpenJDK 8u342 - eevans@cumin1001
14:48 moritzm: powercycling krb2002
14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2115', diff saved to https://phabricator.wikimedia.org/P32939 and previous config saved to /var/cache/conftool/dbconfig/20220824-144432-ladsgroup.json
14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T314041)', diff saved to https://phabricator.wikimedia.org/P32938 and previous config saved to /var/cache/conftool/dbconfig/20220824-143923-ladsgroup.json
14:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2115 (T312975)', diff saved to https://phabricator.wikimedia.org/P32937 and previous config saved to /var/cache/conftool/dbconfig/20220824-142926-ladsgroup.json
14:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2115 (T312975)', diff saved to https://phabricator.wikimedia.org/P32936 and previous config saved to /var/cache/conftool/dbconfig/20220824-142715-ladsgroup.json
14:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2115.codfw.wmnet with reason: Maintenance
14:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet
14:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2115.codfw.wmnet with reason: Maintenance
14:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2101.codfw.wmnet with reason: Maintenance
14:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2101.codfw.wmnet with reason: Maintenance
14:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2131 (T312975)', diff saved to https://phabricator.wikimedia.org/P32935 and previous config saved to /var/cache/conftool/dbconfig/20220824-142623-ladsgroup.json
14:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1185.eqiad.wmnet with OS bullseye
14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P32934 and previous config saved to /var/cache/conftool/dbconfig/20220824-142416-ladsgroup.json
14:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet
14:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1185.eqiad.wmnet with reason: host reimage
14:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2131', diff saved to https://phabricator.wikimedia.org/P32933 and previous config saved to /var/cache/conftool/dbconfig/20220824-141117-ladsgroup.json
14:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P32932 and previous config saved to /var/cache/conftool/dbconfig/20220824-140910-ladsgroup.json
14:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1185.eqiad.wmnet with reason: host reimage
13:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2131', diff saved to https://phabricator.wikimedia.org/P32931 and previous config saved to /var/cache/conftool/dbconfig/20220824-135611-ladsgroup.json
13:55 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1185.eqiad.wmnet with OS bullseye
13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T314041)', diff saved to https://phabricator.wikimedia.org/P32930 and previous config saved to /var/cache/conftool/dbconfig/20220824-135404-ladsgroup.json
13:49 hashar@deploy1002: rebuilt and synchronized wikiversions files: Revert "Group 1 wikis to 1.39.0-wmf.26" # T316085 T314187
13:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T314041)', diff saved to https://phabricator.wikimedia.org/P32929 and previous config saved to /var/cache/conftool/dbconfig/20220824-134118-ladsgroup.json
13:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
13:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2131 (T312975)', diff saved to https://phabricator.wikimedia.org/P32928 and previous config saved to /var/cache/conftool/dbconfig/20220824-134104-ladsgroup.json
13:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
13:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T314041)', diff saved to https://phabricator.wikimedia.org/P32927 and previous config saved to /var/cache/conftool/dbconfig/20220824-134057-ladsgroup.json
13:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2131 (T312975)', diff saved to https://phabricator.wikimedia.org/P32926 and previous config saved to /var/cache/conftool/dbconfig/20220824-133953-ladsgroup.json
13:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2131.codfw.wmnet with reason: Maintenance
13:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2131.codfw.wmnet with reason: Maintenance
13:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1137 (T312975)', diff saved to https://phabricator.wikimedia.org/P32925 and previous config saved to /var/cache/conftool/dbconfig/20220824-133932-ladsgroup.json
13:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: Repooling after cloning db1191', diff saved to https://phabricator.wikimedia.org/P32924 and previous config saved to /var/cache/conftool/dbconfig/20220824-132908-root.json
13:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1137', diff saved to https://phabricator.wikimedia.org/P32919 and previous config saved to /var/cache/conftool/dbconfig/20220824-130920-ladsgroup.json
12:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: Repooling after cloning db1191', diff saved to https://phabricator.wikimedia.org/P32918 and previous config saved to /var/cache/conftool/dbconfig/20220824-125858-root.json
12:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T314041)', diff saved to https://phabricator.wikimedia.org/P32917 and previous config saved to /var/cache/conftool/dbconfig/20220824-125537-ladsgroup.json
12:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1137 (T312975)', diff saved to https://phabricator.wikimedia.org/P32916 and previous config saved to /var/cache/conftool/dbconfig/20220824-125414-ladsgroup.json
12:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1137 (T312975)', diff saved to https://phabricator.wikimedia.org/P32915 and previous config saved to /var/cache/conftool/dbconfig/20220824-125003-ladsgroup.json
12:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1137.eqiad.wmnet with reason: Maintenance
12:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1137.eqiad.wmnet with reason: Maintenance
12:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
12:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
12:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
12:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
12:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1120 (T312975)', diff saved to https://phabricator.wikimedia.org/P32914 and previous config saved to /var/cache/conftool/dbconfig/20220824-124905-ladsgroup.json
12:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 25%: Repooling after cloning db1191', diff saved to https://phabricator.wikimedia.org/P32913 and previous config saved to /var/cache/conftool/dbconfig/20220824-124354-root.json
12:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (T314041)', diff saved to https://phabricator.wikimedia.org/P32912 and previous config saved to /var/cache/conftool/dbconfig/20220824-124346-ladsgroup.json
12:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1104.eqiad.wmnet with reason: Maintenance
12:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1104.eqiad.wmnet with reason: Maintenance
12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance
12:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1120', diff saved to https://phabricator.wikimedia.org/P32911 and previous config saved to /var/cache/conftool/dbconfig/20220824-123358-ladsgroup.json
12:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 10%: Repooling after cloning db1191', diff saved to https://phabricator.wikimedia.org/P32910 and previous config saved to /var/cache/conftool/dbconfig/20220824-122848-root.json
12:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1120', diff saved to https://phabricator.wikimedia.org/P32909 and previous config saved to /var/cache/conftool/dbconfig/20220824-121852-ladsgroup.json
12:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 5%: Repooling after cloning db1191', diff saved to https://phabricator.wikimedia.org/P32908 and previous config saved to /var/cache/conftool/dbconfig/20220824-121343-root.json
12:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1120 (T312975)', diff saved to https://phabricator.wikimedia.org/P32907 and previous config saved to /var/cache/conftool/dbconfig/20220824-120346-ladsgroup.json
11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1120 (T312975)', diff saved to https://phabricator.wikimedia.org/P32906 and previous config saved to /var/cache/conftool/dbconfig/20220824-115935-ladsgroup.json
11:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1120.eqiad.wmnet with reason: Maintenance
11:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1120.eqiad.wmnet with reason: Maintenance
11:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
11:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
11:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
11:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
11:42 klausman@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching ml-cache*: Rolling restart to activate new JRE - klausman@cumin1001
11:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 100%: Repooling after cloning db1190', diff saved to https://phabricator.wikimedia.org/P32905 and previous config saved to /var/cache/conftool/dbconfig/20220824-112938-root.json
11:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 75%: Repooling after cloning db1190', diff saved to https://phabricator.wikimedia.org/P32904 and previous config saved to /var/cache/conftool/dbconfig/20220824-111433-root.json
11:07 klausman@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching ml-cache*: Rolling restart to activate new JRE - klausman@cumin1001
10:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 50%: Repooling after cloning db1190', diff saved to https://phabricator.wikimedia.org/P32903 and previous config saved to /var/cache/conftool/dbconfig/20220824-105928-root.json
10:52 vgutierrez: disable origin coalescing in ats@cp600[78] - T315911
10:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 10%: Repooling after cloning db1190', diff saved to https://phabricator.wikimedia.org/P32901 and previous config saved to /var/cache/conftool/dbconfig/20220824-102919-root.json
10:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 5%: Repooling after cloning db1190', diff saved to https://phabricator.wikimedia.org/P32900 and previous config saved to /var/cache/conftool/dbconfig/20220824-101414-root.json
09:46 vgutierrez: Restart incremental roll-out of query-sorting at 1% - T314868
08:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 100%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P32899 and previous config saved to /var/cache/conftool/dbconfig/20220824-085902-root.json
08:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 100%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P32898 and previous config saved to /var/cache/conftool/dbconfig/20220824-085639-root.json
08:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 75%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P32897 and previous config saved to /var/cache/conftool/dbconfig/20220824-084357-root.json
08:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 75%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P32896 and previous config saved to /var/cache/conftool/dbconfig/20220824-084134-root.json
08:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 50%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P32895 and previous config saved to /var/cache/conftool/dbconfig/20220824-082852-root.json
08:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1174', diff saved to https://phabricator.wikimedia.org/P32893 and previous config saved to /var/cache/conftool/dbconfig/20220824-082809-root.json
08:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 50%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P32892 and previous config saved to /var/cache/conftool/dbconfig/20220824-082630-root.json
08:16 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.26 refs T314187
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 25%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P32891 and previous config saved to /var/cache/conftool/dbconfig/20220824-081347-root.json
08:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 25%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P32890 and previous config saved to /var/cache/conftool/dbconfig/20220824-081125-root.json
07:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1187 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32888 and previous config saved to /var/cache/conftool/dbconfig/20220824-075955-root.json
07:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 100%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32887 and previous config saved to /var/cache/conftool/dbconfig/20220824-075946-root.json
07:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1147', diff saved to https://phabricator.wikimedia.org/P32886 and previous config saved to /var/cache/conftool/dbconfig/20220824-075927-root.json
07:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 10%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P32885 and previous config saved to /var/cache/conftool/dbconfig/20220824-075843-root.json
07:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 10%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P32884 and previous config saved to /var/cache/conftool/dbconfig/20220824-075620-root.json
07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1187 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32883 and previous config saved to /var/cache/conftool/dbconfig/20220824-074451-root.json
07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 75%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32882 and previous config saved to /var/cache/conftool/dbconfig/20220824-074441-root.json
07:40 marostegui: Promote pc1014 to pc2 master T315526
07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1187 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32880 and previous config saved to /var/cache/conftool/dbconfig/20220824-072946-root.json
07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 50%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32879 and previous config saved to /var/cache/conftool/dbconfig/20220824-072937-root.json
07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1187 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32878 and previous config saved to /var/cache/conftool/dbconfig/20220824-071441-root.json
07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 25%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32877 and previous config saved to /var/cache/conftool/dbconfig/20220824-071432-root.json
06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1187 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32876 and previous config saved to /var/cache/conftool/dbconfig/20220824-065937-root.json
06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 10%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32875 and previous config saved to /var/cache/conftool/dbconfig/20220824-065927-root.json
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1187 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32874 and previous config saved to /var/cache/conftool/dbconfig/20220824-064432-root.json
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 5%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32873 and previous config saved to /var/cache/conftool/dbconfig/20220824-064423-root.json
06:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1187 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32872 and previous config saved to /var/cache/conftool/dbconfig/20220824-062927-root.json
06:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 4%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32871 and previous config saved to /var/cache/conftool/dbconfig/20220824-062918-root.json
06:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1129', diff saved to https://phabricator.wikimedia.org/P32869 and previous config saved to /var/cache/conftool/dbconfig/20220824-061532-root.json
06:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1187 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32868 and previous config saved to /var/cache/conftool/dbconfig/20220824-061422-root.json
06:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 3%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32867 and previous config saved to /var/cache/conftool/dbconfig/20220824-061413-root.json
05:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1187 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32866 and previous config saved to /var/cache/conftool/dbconfig/20220824-055918-root.json
05:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 2%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32865 and previous config saved to /var/cache/conftool/dbconfig/20220824-055909-root.json
05:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119', diff saved to https://phabricator.wikimedia.org/P32863 and previous config saved to /var/cache/conftool/dbconfig/20220824-054719-root.json
05:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: Repooling for the first time', diff saved to https://phabricator.wikimedia.org/P32862 and previous config saved to /var/cache/conftool/dbconfig/20220824-054404-root.json
05:40 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1187 with minimal weight', diff saved to https://phabricator.wikimedia.org/P32861 and previous config saved to /var/cache/conftool/dbconfig/20220824-054018-root.json
05:34 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1189 with minimal weight', diff saved to https://phabricator.wikimedia.org/P32860 and previous config saved to /var/cache/conftool/dbconfig/20220824-053434-root.json
05:33 marostegui@cumin1001: dbctl commit (dc=all): 'Move db2180 from s4 to s6', diff saved to https://phabricator.wikimedia.org/P32859 and previous config saved to /var/cache/conftool/dbconfig/20220824-053311-root.json
05:31 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1189 with minimal weight', diff saved to https://phabricator.wikimedia.org/P32858 and previous config saved to /var/cache/conftool/dbconfig/20220824-053141-root.json
2022-08-23
19:35 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on relforge[1003-1004].eqiad.wmnet with reason: T315604
19:34 bking@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on relforge[1003-1004].eqiad.wmnet with reason: T315604
12:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1132.eqiad.wmnet with reason: Maintenance
12:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1132.eqiad.wmnet with reason: Maintenance
12:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 15 hosts with reason: Maintenance
12:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 15 hosts with reason: Maintenance
12:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
12:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
12:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T312972)', diff saved to https://phabricator.wikimedia.org/P32847 and previous config saved to /var/cache/conftool/dbconfig/20220823-125824-marostegui.json
12:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P32846 and previous config saved to /var/cache/conftool/dbconfig/20220823-124317-marostegui.json
12:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2019.codfw.wmnet to cluster codfw and group B
12:39 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2019.codfw.wmnet to cluster codfw and group B
12:33 vgutierrez: Incremental roll-out of query-sorting (15%) - T314868
12:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P32845 and previous config saved to /var/cache/conftool/dbconfig/20220823-122811-marostegui.json
12:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T312972)', diff saved to https://phabricator.wikimedia.org/P32844 and previous config saved to /var/cache/conftool/dbconfig/20220823-121305-marostegui.json
12:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T312972)', diff saved to https://phabricator.wikimedia.org/P32843 and previous config saved to /var/cache/conftool/dbconfig/20220823-121159-marostegui.json
12:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
12:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
12:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
12:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
12:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
12:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T312972)', diff saved to https://phabricator.wikimedia.org/P32842 and previous config saved to /var/cache/conftool/dbconfig/20220823-121055-marostegui.json
11:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P32841 and previous config saved to /var/cache/conftool/dbconfig/20220823-115549-marostegui.json
11:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P32840 and previous config saved to /var/cache/conftool/dbconfig/20220823-114220-root.json
11:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P32839 and previous config saved to /var/cache/conftool/dbconfig/20220823-114043-marostegui.json
11:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P32838 and previous config saved to /var/cache/conftool/dbconfig/20220823-112715-root.json
11:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T312972)', diff saved to https://phabricator.wikimedia.org/P32837 and previous config saved to /var/cache/conftool/dbconfig/20220823-112537-marostegui.json
11:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T312972)', diff saved to https://phabricator.wikimedia.org/P32836 and previous config saved to /var/cache/conftool/dbconfig/20220823-112430-marostegui.json
11:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
11:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
11:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T312972)', diff saved to https://phabricator.wikimedia.org/P32835 and previous config saved to /var/cache/conftool/dbconfig/20220823-112408-marostegui.json
11:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P32834 and previous config saved to /var/cache/conftool/dbconfig/20220823-111210-root.json
11:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 100%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32833 and previous config saved to /var/cache/conftool/dbconfig/20220823-111139-root.json
11:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P32832 and previous config saved to /var/cache/conftool/dbconfig/20220823-110902-marostegui.json
10:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P32831 and previous config saved to /var/cache/conftool/dbconfig/20220823-105706-root.json
10:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 75%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32830 and previous config saved to /var/cache/conftool/dbconfig/20220823-105634-root.json
10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P32829 and previous config saved to /var/cache/conftool/dbconfig/20220823-105356-marostegui.json
10:46 btullis@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons.
10:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P32828 and previous config saved to /var/cache/conftool/dbconfig/20220823-104201-root.json
10:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 60%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32827 and previous config saved to /var/cache/conftool/dbconfig/20220823-104126-root.json
10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T312972)', diff saved to https://phabricator.wikimedia.org/P32826 and previous config saved to /var/cache/conftool/dbconfig/20220823-103850-marostegui.json
10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T312972)', diff saved to https://phabricator.wikimedia.org/P32825 and previous config saved to /var/cache/conftool/dbconfig/20220823-103742-marostegui.json
10:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
10:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
10:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
10:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T312972)', diff saved to https://phabricator.wikimedia.org/P32824 and previous config saved to /var/cache/conftool/dbconfig/20220823-103704-marostegui.json
10:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 50%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32822 and previous config saved to /var/cache/conftool/dbconfig/20220823-102622-root.json
10:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P32821 and previous config saved to /var/cache/conftool/dbconfig/20220823-102158-marostegui.json
10:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 40%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32820 and previous config saved to /var/cache/conftool/dbconfig/20220823-101117-root.json
10:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P32819 and previous config saved to /var/cache/conftool/dbconfig/20220823-101048-root.json
10:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P32818 and previous config saved to /var/cache/conftool/dbconfig/20220823-100652-marostegui.json
09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T312972)', diff saved to https://phabricator.wikimedia.org/P32815 and previous config saved to /var/cache/conftool/dbconfig/20220823-095146-marostegui.json
09:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T312972)', diff saved to https://phabricator.wikimedia.org/P32814 and previous config saved to /var/cache/conftool/dbconfig/20220823-095039-marostegui.json
09:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
09:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
09:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T312972)', diff saved to https://phabricator.wikimedia.org/P32813 and previous config saved to /var/cache/conftool/dbconfig/20220823-095018-marostegui.json
09:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P32810 and previous config saved to /var/cache/conftool/dbconfig/20220823-093512-marostegui.json
09:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 10%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32809 and previous config saved to /var/cache/conftool/dbconfig/20220823-092603-root.json
09:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P32808 and previous config saved to /var/cache/conftool/dbconfig/20220823-092534-root.json
09:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P32807 and previous config saved to /var/cache/conftool/dbconfig/20220823-092006-marostegui.json
09:05 hashar@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.26 refs T314187
09:05 btullis@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons.
09:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T312972)', diff saved to https://phabricator.wikimedia.org/P32804 and previous config saved to /var/cache/conftool/dbconfig/20220823-090500-marostegui.json
09:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1128 (T312972)', diff saved to https://phabricator.wikimedia.org/P32803 and previous config saved to /var/cache/conftool/dbconfig/20220823-090353-marostegui.json
09:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1128.eqiad.wmnet with reason: Maintenance
09:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1128.eqiad.wmnet with reason: Maintenance
09:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T312972)', diff saved to https://phabricator.wikimedia.org/P32802 and previous config saved to /var/cache/conftool/dbconfig/20220823-090332-marostegui.json
08:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 5%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32801 and previous config saved to /var/cache/conftool/dbconfig/20220823-085554-root.json
08:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P32800 and previous config saved to /var/cache/conftool/dbconfig/20220823-085525-root.json
08:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2019.codfw.wmnet
08:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P32799 and previous config saved to /var/cache/conftool/dbconfig/20220823-084826-marostegui.json
08:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2019.codfw.wmnet
08:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2019.codfw.wmnet to cluster codfw and group B
08:44 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2019.codfw.wmnet to cluster codfw and group B
08:41 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on gitlab1004.wikimedia.org with reason: upgrade gitlab1004 to new version
08:41 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 0:20:00 on gitlab1004.wikimedia.org with reason: upgrade gitlab1004 to new version
08:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 2%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32798 and previous config saved to /var/cache/conftool/dbconfig/20220823-084050-root.json
08:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 2%: Repooling', diff saved to https://phabricator.wikimedia.org/P32797 and previous config saved to /var/cache/conftool/dbconfig/20220823-084020-root.json
08:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P32796 and previous config saved to /var/cache/conftool/dbconfig/20220823-083319-marostegui.json
08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1166', diff saved to https://phabricator.wikimedia.org/P32794 and previous config saved to /var/cache/conftool/dbconfig/20220823-082605-root.json
08:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 1%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32793 and previous config saved to /var/cache/conftool/dbconfig/20220823-082545-root.json
08:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P32792 and previous config saved to /var/cache/conftool/dbconfig/20220823-082515-root.json
08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1162', diff saved to https://phabricator.wikimedia.org/P32790 and previous config saved to /var/cache/conftool/dbconfig/20220823-082336-root.json
08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 2%: Repooling', diff saved to https://phabricator.wikimedia.org/P32789 and previous config saved to /var/cache/conftool/dbconfig/20220823-082215-root.json
08:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T312972)', diff saved to https://phabricator.wikimedia.org/P32788 and previous config saved to /var/cache/conftool/dbconfig/20220823-081813-marostegui.json
08:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T312972)', diff saved to https://phabricator.wikimedia.org/P32787 and previous config saved to /var/cache/conftool/dbconfig/20220823-081706-marostegui.json
08:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
08:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T312972)', diff saved to https://phabricator.wikimedia.org/P32786 and previous config saved to /var/cache/conftool/dbconfig/20220823-081645-marostegui.json
08:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2019.codfw.wmnet
08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P32785 and previous config saved to /var/cache/conftool/dbconfig/20220823-080710-root.json
08:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P32784 and previous config saved to /var/cache/conftool/dbconfig/20220823-080139-marostegui.json
07:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P32783 and previous config saved to /var/cache/conftool/dbconfig/20220823-074633-marostegui.json
07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T312972)', diff saved to https://phabricator.wikimedia.org/P32781 and previous config saved to /var/cache/conftool/dbconfig/20220823-073127-marostegui.json
07:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T312972)', diff saved to https://phabricator.wikimedia.org/P32780 and previous config saved to /var/cache/conftool/dbconfig/20220823-073020-marostegui.json
07:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
07:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
07:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
07:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2019.codfw.wmnet with OS bullseye
07:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T312972)', diff saved to https://phabricator.wikimedia.org/P32779 and previous config saved to /var/cache/conftool/dbconfig/20220823-072943-marostegui.json
07:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P32778 and previous config saved to /var/cache/conftool/dbconfig/20220823-071437-marostegui.json
07:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2019.codfw.wmnet with reason: host reimage
07:06 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2019.codfw.wmnet with reason: host reimage
06:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P32777 and previous config saved to /var/cache/conftool/dbconfig/20220823-065931-marostegui.json
06:50 kart_: Updated cxserver to 2022-08-22-093815-production (T308248, T308371)
06:49 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2019.codfw.wmnet with OS bullseye
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T312972)', diff saved to https://phabricator.wikimedia.org/P32776 and previous config saved to /var/cache/conftool/dbconfig/20220823-064425-marostegui.json
06:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T312972)', diff saved to https://phabricator.wikimedia.org/P32775 and previous config saved to /var/cache/conftool/dbconfig/20220823-064318-marostegui.json
06:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
06:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
06:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T312972)', diff saved to https://phabricator.wikimedia.org/P32774 and previous config saved to /var/cache/conftool/dbconfig/20220823-064257-marostegui.json
06:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on ganeti2019.codfw.wmnet with reason: Remove node for eventual reimage, T311686
06:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on ganeti2019.codfw.wmnet with reason: Remove node for eventual reimage, T311686
06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P32773 and previous config saved to /var/cache/conftool/dbconfig/20220823-062751-marostegui.json
06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P32772 and previous config saved to /var/cache/conftool/dbconfig/20220823-061245-marostegui.json
05:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T312972)', diff saved to https://phabricator.wikimedia.org/P32771 and previous config saved to /var/cache/conftool/dbconfig/20220823-055739-marostegui.json
05:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T312972)', diff saved to https://phabricator.wikimedia.org/P32770 and previous config saved to /var/cache/conftool/dbconfig/20220823-053929-marostegui.json
05:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
05:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
05:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
05:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
05:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T312972)', diff saved to https://phabricator.wikimedia.org/P32769 and previous config saved to /var/cache/conftool/dbconfig/20220823-053852-marostegui.json
05:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P32768 and previous config saved to /var/cache/conftool/dbconfig/20220823-052346-marostegui.json
05:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P32767 and previous config saved to /var/cache/conftool/dbconfig/20220823-050840-marostegui.json
04:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T312972)', diff saved to https://phabricator.wikimedia.org/P32765 and previous config saved to /var/cache/conftool/dbconfig/20220823-045334-marostegui.json
04:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1131', diff saved to https://phabricator.wikimedia.org/P32764 and previous config saved to /var/cache/conftool/dbconfig/20220823-045322-root.json
04:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T312972)', diff saved to https://phabricator.wikimedia.org/P32763 and previous config saved to /var/cache/conftool/dbconfig/20220823-045227-marostegui.json
04:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
04:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
03:00 TimStarling: on wtp1025,wtp1027,wtp1029,wtp1031,wtp1033,wtp1035: set scaling_governor to performance T315398
01:41 TimStarling: on mw1411, mw1413, mw1419, mw1429, mw1431, mw1433: set energy_performance_preference to balance_performance T315398
01:11 TimStarling: on mw1411, mw1413, mw1419, mw1429, mw1431, mw1433: set scaling_governor to powersave and energy_performance_preference to performance
00:09 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1187.eqiad.wmnet with OS bullseye
2022-08-22
23:55 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
23:52 pt1979@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1187.eqiad.wmnet with reason: host reimage
23:39 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host db1187.eqiad.wmnet with OS bullseye
21:17 bking@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge elasticsearch and plugin upgrade - bking@cumin2002 - T315604
21:17 bking@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge elasticsearch and plugin upgrade - bking@cumin2002 - T315604
21:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf2002.codfw.wmnet with OS bullseye
21:06 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host db1185.eqiad.wmnet
21:04 pt1979@cumin1001: START - Cookbook sre.hosts.dhcp for host db1185.eqiad.wmnet
21:02 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1185.eqiad.wmnet with OS bullseye
21:01 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host db1185.eqiad.wmnet with OS bullseye
20:59 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1185.eqiad.wmnet with OS bullseye
20:59 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host db1185.eqiad.wmnet with OS bullseye
20:59 pt1979@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1195.eqiad.wmnet with OS bullseye
20:58 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host db1195.eqiad.wmnet with OS bullseye
20:56 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
20:53 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf2002.codfw.wmnet with reason: host reimage
20:51 urbanecm@deploy1002: Synchronized php-1.39.0-wmf.25/skins/Vector/: e0ff763: Layout: Restore disabling of max width on certain pages (T315460) (duration: 03m 37s)
19:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc-wf2002.mgmt.codfw.wmnet with reboot policy FORCED
19:11 xcollazo@deploy1002: Finished deploy [airflow-dags/analytics_test@5ac442f]: Use instance specific HDFS cache on analytics_test (duration: 00m 17s)
19:11 xcollazo@deploy1002: Started deploy [airflow-dags/analytics_test@5ac442f]: Use instance specific HDFS cache on analytics_test
19:04 xcollazo@deploy1002: Finished deploy [airflow-dags/analytics_test@9edd1ab]: Use instance specific HDFS cache on analytics_test (duration: 00m 05s)
19:04 xcollazo@deploy1002: Started deploy [airflow-dags/analytics_test@9edd1ab]: Use instance specific HDFS cache on analytics_test
18:59 xcollazo@deploy1002: Finished deploy [airflow-dags/platform_eng@5ac442f]: Use instance specific HDFS cache on platform_eng (duration: 00m 10s)
18:59 xcollazo@deploy1002: Started deploy [airflow-dags/platform_eng@5ac442f]: Use instance specific HDFS cache on platform_eng
18:54 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc-wf2002.mgmt.codfw.wmnet with reboot policy FORCED
18:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc-wf2001.mgmt.codfw.wmnet with reboot policy FORCED
18:28 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc-wf2001.mgmt.codfw.wmnet with reboot policy FORCED
18:26 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
13:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
13:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
13:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
13:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T312972)', diff saved to https://phabricator.wikimedia.org/P32737 and previous config saved to /var/cache/conftool/dbconfig/20220822-130732-marostegui.json
13:03 jynus: disabled backup scheduling for backup1002, backup2002 T315864
13:01 marostegui@cumin1001: dbctl commit (dc=all): 'es1023 (re)pooling @ 5%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32735 and previous config saved to /var/cache/conftool/dbconfig/20220822-130144-root.json
12:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P32734 and previous config saved to /var/cache/conftool/dbconfig/20220822-125226-marostegui.json
12:52 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubemaster2002.codfw.wmnet
12:48 jayme@cumin1001: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-worker-eqiad
12:46 marostegui@cumin1001: dbctl commit (dc=all): 'es1023 (re)pooling @ 2%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32732 and previous config saved to /var/cache/conftool/dbconfig/20220822-124640-root.json
12:45 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host kubemaster2002.codfw.wmnet
12:39 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubemaster2001.codfw.wmnet
12:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P32731 and previous config saved to /var/cache/conftool/dbconfig/20220822-123720-marostegui.json
12:33 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host kubemaster2001.codfw.wmnet
12:31 marostegui@cumin1001: dbctl commit (dc=all): 'es1023 (re)pooling @ 1%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32730 and previous config saved to /var/cache/conftool/dbconfig/20220822-123135-root.json
12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ldap-replica2006.wikimedia.org
12:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T312972)', diff saved to https://phabricator.wikimedia.org/P32729 and previous config saved to /var/cache/conftool/dbconfig/20220822-122214-marostegui.json
12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1023 for reboot T315542', diff saved to https://phabricator.wikimedia.org/P32728 and previous config saved to /var/cache/conftool/dbconfig/20220822-121401-root.json
11:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Switchover es5 T315542
11:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Switchover es5 T315542
11:36 moritzm: installing libdatetime-timezone-perl updates from SUA update
11:33 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 100%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32725 and previous config saved to /var/cache/conftool/dbconfig/20220822-113352-root.json
11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T312972)', diff saved to https://phabricator.wikimedia.org/P32724 and previous config saved to /var/cache/conftool/dbconfig/20220822-112829-marostegui.json
11:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
11:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T312972)', diff saved to https://phabricator.wikimedia.org/P32723 and previous config saved to /var/cache/conftool/dbconfig/20220822-112808-marostegui.json
11:25 btullis@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host dse-k8s-ctrl1001.eqiad.wmnet
11:18 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 75%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32722 and previous config saved to /var/cache/conftool/dbconfig/20220822-111847-root.json
11:16 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
11:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P32721 and previous config saved to /var/cache/conftool/dbconfig/20220822-111301-marostegui.json
11:03 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 60%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32720 and previous config saved to /var/cache/conftool/dbconfig/20220822-110342-root.json
10:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P32719 and previous config saved to /var/cache/conftool/dbconfig/20220822-105755-marostegui.json
10:48 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 50%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32718 and previous config saved to /var/cache/conftool/dbconfig/20220822-104838-root.json
10:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T312972)', diff saved to https://phabricator.wikimedia.org/P32717 and previous config saved to /var/cache/conftool/dbconfig/20220822-104249-marostegui.json
10:33 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 40%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32716 and previous config saved to /var/cache/conftool/dbconfig/20220822-103333-root.json
10:18 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 30%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32715 and previous config saved to /var/cache/conftool/dbconfig/20220822-101828-root.json
10:03 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 20%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32714 and previous config saved to /var/cache/conftool/dbconfig/20220822-100324-root.json
10:00 vgutierrez: Incremental roll-out of query-sorting (1%) - T314868
09:58 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubestagemaster1001.eqiad.wmnet
09:38 XioNoX: push new policy on pfw3-eqiad - T315578
09:36 jayme@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-worker-eqiad
09:33 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 8%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32709 and previous config saved to /var/cache/conftool/dbconfig/20220822-093314-root.json
09:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P32708 and previous config saved to /var/cache/conftool/dbconfig/20220822-092706-marostegui.json
09:18 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 5%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32706 and previous config saved to /var/cache/conftool/dbconfig/20220822-091810-root.json
09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P32705 and previous config saved to /var/cache/conftool/dbconfig/20220822-091200-marostegui.json
09:03 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 2%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32704 and previous config saved to /var/cache/conftool/dbconfig/20220822-090305-root.json
08:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T312972)', diff saved to https://phabricator.wikimedia.org/P32703 and previous config saved to /var/cache/conftool/dbconfig/20220822-085654-marostegui.json
08:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T312972)', diff saved to https://phabricator.wikimedia.org/P32702 and previous config saved to /var/cache/conftool/dbconfig/20220822-085014-marostegui.json
08:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
08:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
08:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T312972)', diff saved to https://phabricator.wikimedia.org/P32701 and previous config saved to /var/cache/conftool/dbconfig/20220822-084942-marostegui.json
08:48 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 1%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32700 and previous config saved to /var/cache/conftool/dbconfig/20220822-084800-root.json
08:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1020 ', diff saved to https://phabricator.wikimedia.org/P32699 and previous config saved to /var/cache/conftool/dbconfig/20220822-084359-root.json
08:43 marostegui@cumin1001: dbctl commit (dc=all): 'es1020 (re)pooling @ 1%: Repooling after reboot', diff saved to https://phabricator.wikimedia.org/P32698 and previous config saved to /var/cache/conftool/dbconfig/20220822-084335-root.json
08:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P32697 and previous config saved to /var/cache/conftool/dbconfig/20220822-083436-marostegui.json
08:33 moritzm: powercycling wdqs1014 (unresponsive via botched wdqs-categories process)
08:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1020 for reboot T310485', diff saved to https://phabricator.wikimedia.org/P32696 and previous config saved to /var/cache/conftool/dbconfig/20220822-083341-root.json
08:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P32693 and previous config saved to /var/cache/conftool/dbconfig/20220822-081930-marostegui.json
08:18 marostegui@cumin1001: dbctl commit (dc=all): 'Set es1021 with weight 10 T315540', diff saved to https://phabricator.wikimedia.org/P32692 and previous config saved to /var/cache/conftool/dbconfig/20220822-081817-root.json
08:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T312972)', diff saved to https://phabricator.wikimedia.org/P32690 and previous config saved to /var/cache/conftool/dbconfig/20220822-080424-marostegui.json
08:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P32689 and previous config saved to /var/cache/conftool/dbconfig/20220822-080020-root.json
08:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P32688 and previous config saved to /var/cache/conftool/dbconfig/20220822-080012-root.json
07:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P32687 and previous config saved to /var/cache/conftool/dbconfig/20220822-075949-root.json
07:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 100%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32686 and previous config saved to /var/cache/conftool/dbconfig/20220822-075941-root.json
07:54 marostegui@cumin1001: dbctl commit (dc=all): 'Add db2182 to dbctl T311494', diff saved to https://phabricator.wikimedia.org/P32685 and previous config saved to /var/cache/conftool/dbconfig/20220822-075359-marostegui.json
07:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P32684 and previous config saved to /var/cache/conftool/dbconfig/20220822-074515-root.json
07:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P32683 and previous config saved to /var/cache/conftool/dbconfig/20220822-074507-root.json
07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P32682 and previous config saved to /var/cache/conftool/dbconfig/20220822-074443-root.json
07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 75%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32681 and previous config saved to /var/cache/conftool/dbconfig/20220822-074437-root.json
07:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P32677 and previous config saved to /var/cache/conftool/dbconfig/20220822-073010-root.json
07:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P32676 and previous config saved to /var/cache/conftool/dbconfig/20220822-073002-root.json
07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P32675 and previous config saved to /var/cache/conftool/dbconfig/20220822-072938-root.json
07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 50%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32674 and previous config saved to /var/cache/conftool/dbconfig/20220822-072932-root.json
07:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T312972)', diff saved to https://phabricator.wikimedia.org/P32673 and previous config saved to /var/cache/conftool/dbconfig/20220822-072339-marostegui.json
07:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
07:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
07:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P32672 and previous config saved to /var/cache/conftool/dbconfig/20220822-071506-root.json
07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P32671 and previous config saved to /var/cache/conftool/dbconfig/20220822-071458-root.json
07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P32670 and previous config saved to /var/cache/conftool/dbconfig/20220822-071433-root.json
07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 10%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32669 and previous config saved to /var/cache/conftool/dbconfig/20220822-071427-root.json
07:11 marostegui@cumin1001: dbctl commit (dc=all): 'Add db2181 to dbctl T311494', diff saved to https://phabricator.wikimedia.org/P32668 and previous config saved to /var/cache/conftool/dbconfig/20220822-071153-marostegui.json
07:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 7 hosts with reason: Maintenance
07:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 7 hosts with reason: Maintenance
07:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
07:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
07:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T312972)', diff saved to https://phabricator.wikimedia.org/P32667 and previous config saved to /var/cache/conftool/dbconfig/20220822-070804-marostegui.json
07:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P32666 and previous config saved to /var/cache/conftool/dbconfig/20220822-070001-root.json
06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P32665 and previous config saved to /var/cache/conftool/dbconfig/20220822-065953-root.json
06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 2%: Repooling', diff saved to https://phabricator.wikimedia.org/P32664 and previous config saved to /var/cache/conftool/dbconfig/20220822-065929-root.json
06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 5%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32663 and previous config saved to /var/cache/conftool/dbconfig/20220822-065923-root.json
06:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P32662 and previous config saved to /var/cache/conftool/dbconfig/20220822-065258-marostegui.json
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P32661 and previous config saved to /var/cache/conftool/dbconfig/20220822-064457-root.json
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P32660 and previous config saved to /var/cache/conftool/dbconfig/20220822-064448-root.json
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P32659 and previous config saved to /var/cache/conftool/dbconfig/20220822-064424-root.json
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 1%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32658 and previous config saved to /var/cache/conftool/dbconfig/20220822-064418-root.json
06:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119 db1142 db1096', diff saved to https://phabricator.wikimedia.org/P32657 and previous config saved to /var/cache/conftool/dbconfig/20220822-063857-root.json
06:38 marostegui: Install 10.4.26 on db1119, db1142, db1096 T315411
06:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P32656 and previous config saved to /var/cache/conftool/dbconfig/20220822-063752-marostegui.json
06:35 marostegui@cumin1001: dbctl commit (dc=all): 'Add db2180 to dbctl T311494', diff saved to https://phabricator.wikimedia.org/P32655 and previous config saved to /var/cache/conftool/dbconfig/20220822-063533-marostegui.json
06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T312972)', diff saved to https://phabricator.wikimedia.org/P32654 and previous config saved to /var/cache/conftool/dbconfig/20220822-062246-marostegui.json
06:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T312972)', diff saved to https://phabricator.wikimedia.org/P32653 and previous config saved to /var/cache/conftool/dbconfig/20220822-061600-marostegui.json
06:15 marostegui@cumin1001: dbctl commit (dc=all): 'Add db2179 to dbctl T311494', diff saved to https://phabricator.wikimedia.org/P32652 and previous config saved to /var/cache/conftool/dbconfig/20220822-061553-marostegui.json
06:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
06:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
06:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
06:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
06:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
06:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Add db2178 to dbctl T311494', diff saved to https://phabricator.wikimedia.org/P32651 and previous config saved to /var/cache/conftool/dbconfig/20220822-055446-marostegui.json
2022-08-21
12:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db[1111,1127,1132].eqiad.wmnet with reason: 10.6 being 10.6
12:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db[1111,1127,1132].eqiad.wmnet with reason: 10.6 being 10.6
12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool 10.6 hosts', diff saved to https://phabricator.wikimedia.org/P32649 and previous config saved to /var/cache/conftool/dbconfig/20220821-123038-ladsgroup.json
12:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1132', diff saved to https://phabricator.wikimedia.org/P32648 and previous config saved to /var/cache/conftool/dbconfig/20220821-121140-root.json
09:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T314041)', diff saved to https://phabricator.wikimedia.org/P32647 and previous config saved to /var/cache/conftool/dbconfig/20220821-092727-ladsgroup.json
09:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P32646 and previous config saved to /var/cache/conftool/dbconfig/20220821-091221-ladsgroup.json
08:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P32645 and previous config saved to /var/cache/conftool/dbconfig/20220821-085716-ladsgroup.json
08:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T314041)', diff saved to https://phabricator.wikimedia.org/P32644 and previous config saved to /var/cache/conftool/dbconfig/20220821-084209-ladsgroup.json
04:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T314041)', diff saved to https://phabricator.wikimedia.org/P32643 and previous config saved to /var/cache/conftool/dbconfig/20220821-042415-ladsgroup.json
04:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
04:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
04:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
04:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
03:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32642 and previous config saved to /var/cache/conftool/dbconfig/20220821-033020-ladsgroup.json
03:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P32641 and previous config saved to /var/cache/conftool/dbconfig/20220821-031514-ladsgroup.json
03:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P32640 and previous config saved to /var/cache/conftool/dbconfig/20220821-030008-ladsgroup.json
02:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32639 and previous config saved to /var/cache/conftool/dbconfig/20220821-024502-ladsgroup.json
01:35 rzl@cumin2002: dbctl commit (dc=all): 'Depool db1143', diff saved to https://phabricator.wikimedia.org/P32638 and previous config saved to /var/cache/conftool/dbconfig/20220821-013504-rzl.json
2022-08-20
22:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32637 and previous config saved to /var/cache/conftool/dbconfig/20220820-221826-ladsgroup.json
22:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
22:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
17:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 9 hosts with reason: Maintenance
17:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 9 hosts with reason: Maintenance
17:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
17:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
17:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32636 and previous config saved to /var/cache/conftool/dbconfig/20220820-173723-ladsgroup.json
17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P32635 and previous config saved to /var/cache/conftool/dbconfig/20220820-172217-ladsgroup.json
17:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P32634 and previous config saved to /var/cache/conftool/dbconfig/20220820-170711-ladsgroup.json
16:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32633 and previous config saved to /var/cache/conftool/dbconfig/20220820-165203-ladsgroup.json
11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32632 and previous config saved to /var/cache/conftool/dbconfig/20220820-115816-ladsgroup.json
11:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
11:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
11:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T314041)', diff saved to https://phabricator.wikimedia.org/P32631 and previous config saved to /var/cache/conftool/dbconfig/20220820-115755-ladsgroup.json
11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P32630 and previous config saved to /var/cache/conftool/dbconfig/20220820-114249-ladsgroup.json
11:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P32629 and previous config saved to /var/cache/conftool/dbconfig/20220820-112744-ladsgroup.json
11:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T314041)', diff saved to https://phabricator.wikimedia.org/P32628 and previous config saved to /var/cache/conftool/dbconfig/20220820-111238-ladsgroup.json
06:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T314041)', diff saved to https://phabricator.wikimedia.org/P32627 and previous config saved to /var/cache/conftool/dbconfig/20220820-065528-ladsgroup.json
06:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
06:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
06:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T314041)', diff saved to https://phabricator.wikimedia.org/P32626 and previous config saved to /var/cache/conftool/dbconfig/20220820-065507-ladsgroup.json
06:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P32625 and previous config saved to /var/cache/conftool/dbconfig/20220820-064001-ladsgroup.json
06:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P32624 and previous config saved to /var/cache/conftool/dbconfig/20220820-062455-ladsgroup.json
06:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T314041)', diff saved to https://phabricator.wikimedia.org/P32623 and previous config saved to /var/cache/conftool/dbconfig/20220820-060949-ladsgroup.json
01:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T314041)', diff saved to https://phabricator.wikimedia.org/P32622 and previous config saved to /var/cache/conftool/dbconfig/20220820-012602-ladsgroup.json
01:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
01:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
2022-08-19
23:37 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on phab2002.codfw.wmnet with reason: new host in setup
23:37 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on phab2002.codfw.wmnet with reason: new host in setup
23:35 mutante: phab2002 - service phd: stopped phabricator_logmail: disabled, phabricator dumps: disabled, systemd::sysuser: not used (all via Hiera switches) - T280597
23:33 mutante: phab2002 - re-enabled puppet, sshd config ListenAddress fixed by puppet gerrit:824797 - now has phabricator prod role but without LVS/git-ssh - no more error in puppet run - T280597
20:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
20:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
18:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
18:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
18:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 (T312972)', diff saved to https://phabricator.wikimedia.org/P32621 and previous config saved to /var/cache/conftool/dbconfig/20220819-182835-marostegui.json
18:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130', diff saved to https://phabricator.wikimedia.org/P32620 and previous config saved to /var/cache/conftool/dbconfig/20220819-181329-marostegui.json
17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130', diff saved to https://phabricator.wikimedia.org/P32619 and previous config saved to /var/cache/conftool/dbconfig/20220819-175823-marostegui.json
17:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 (T312972)', diff saved to https://phabricator.wikimedia.org/P32618 and previous config saved to /var/cache/conftool/dbconfig/20220819-174317-marostegui.json
17:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1130 (T312972)', diff saved to https://phabricator.wikimedia.org/P32617 and previous config saved to /var/cache/conftool/dbconfig/20220819-171052-marostegui.json
17:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
17:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1130.eqiad.wmnet with reason: Maintenance
17:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T312972)', diff saved to https://phabricator.wikimedia.org/P32616 and previous config saved to /var/cache/conftool/dbconfig/20220819-171031-marostegui.json
16:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P32615 and previous config saved to /var/cache/conftool/dbconfig/20220819-165525-marostegui.json
16:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P32614 and previous config saved to /var/cache/conftool/dbconfig/20220819-164019-marostegui.json
16:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T312972)', diff saved to https://phabricator.wikimedia.org/P32613 and previous config saved to /var/cache/conftool/dbconfig/20220819-162513-marostegui.json
16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T312972)', diff saved to https://phabricator.wikimedia.org/P32612 and previous config saved to /var/cache/conftool/dbconfig/20220819-162253-marostegui.json
16:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
16:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T312972)', diff saved to https://phabricator.wikimedia.org/P32611 and previous config saved to /var/cache/conftool/dbconfig/20220819-162232-marostegui.json
16:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P32610 and previous config saved to /var/cache/conftool/dbconfig/20220819-160726-marostegui.json
15:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
15:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
15:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32609 and previous config saved to /var/cache/conftool/dbconfig/20220819-155611-ladsgroup.json
15:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P32608 and previous config saved to /var/cache/conftool/dbconfig/20220819-155220-marostegui.json
15:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P32607 and previous config saved to /var/cache/conftool/dbconfig/20220819-154105-ladsgroup.json
15:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T312972)', diff saved to https://phabricator.wikimedia.org/P32606 and previous config saved to /var/cache/conftool/dbconfig/20220819-153714-marostegui.json
15:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T312972)', diff saved to https://phabricator.wikimedia.org/P32605 and previous config saved to /var/cache/conftool/dbconfig/20220819-153554-marostegui.json
15:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
15:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
15:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T312972)', diff saved to https://phabricator.wikimedia.org/P32604 and previous config saved to /var/cache/conftool/dbconfig/20220819-153533-marostegui.json
15:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2024.codfw.wmnet with OS bullseye
15:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P32603 and previous config saved to /var/cache/conftool/dbconfig/20220819-152559-ladsgroup.json
15:25 dancy@deploy1002: Installation of scap version "4.14.0" completed for 556 hosts
15:23 dancy@deploy1002: Installing scap version "4.14.0" for 556 hosts
15:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P32602 and previous config saved to /var/cache/conftool/dbconfig/20220819-152027-marostegui.json
15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32601 and previous config saved to /var/cache/conftool/dbconfig/20220819-151053-ladsgroup.json
15:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P32600 and previous config saved to /var/cache/conftool/dbconfig/20220819-150521-marostegui.json
14:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2023.codfw.wmnet with reason: host reimage
14:55 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2023.codfw.wmnet with reason: host reimage
14:52 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kubernetes2024.codfw.wmnet with OS bullseye
14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T312972)', diff saved to https://phabricator.wikimedia.org/P32599 and previous config saved to /var/cache/conftool/dbconfig/20220819-145015-marostegui.json
14:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T312972)', diff saved to https://phabricator.wikimedia.org/P32598 and previous config saved to /var/cache/conftool/dbconfig/20220819-144755-marostegui.json
14:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
14:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
14:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T312972)', diff saved to https://phabricator.wikimedia.org/P32597 and previous config saved to /var/cache/conftool/dbconfig/20220819-144734-marostegui.json
14:33 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kubernetes2023.codfw.wmnet with OS bullseye
14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P32596 and previous config saved to /var/cache/conftool/dbconfig/20220819-143228-marostegui.json
14:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kubernetes2024']
14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P32595 and previous config saved to /var/cache/conftool/dbconfig/20220819-141722-marostegui.json
14:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes2024.mgmt.codfw.wmnet with reboot policy FORCED
14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T312972)', diff saved to https://phabricator.wikimedia.org/P32594 and previous config saved to /var/cache/conftool/dbconfig/20220819-140216-marostegui.json
13:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T312972)', diff saved to https://phabricator.wikimedia.org/P32593 and previous config saved to /var/cache/conftool/dbconfig/20220819-135956-marostegui.json
13:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
13:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
13:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
13:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
13:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T312972)', diff saved to https://phabricator.wikimedia.org/P32592 and previous config saved to /var/cache/conftool/dbconfig/20220819-135917-marostegui.json
13:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kubernetes2024.mgmt.codfw.wmnet with reboot policy FORCED
13:45 marostegui: Install 10.4.26 on db2111 db2148 db2124
13:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P32591 and previous config saved to /var/cache/conftool/dbconfig/20220819-134411-marostegui.json
13:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P32590 and previous config saved to /var/cache/conftool/dbconfig/20220819-132905-marostegui.json
13:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T312972)', diff saved to https://phabricator.wikimedia.org/P32589 and previous config saved to /var/cache/conftool/dbconfig/20220819-131359-marostegui.json
13:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T312972)', diff saved to https://phabricator.wikimedia.org/P32588 and previous config saved to /var/cache/conftool/dbconfig/20220819-131139-marostegui.json
13:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
13:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
13:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
13:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
13:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 9 hosts with reason: Maintenance
13:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 9 hosts with reason: Maintenance
13:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
13:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
11:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 11 hosts with reason: Maintenance
11:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 11 hosts with reason: Maintenance
11:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
11:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
11:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
11:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
11:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T312972)', diff saved to https://phabricator.wikimedia.org/P32587 and previous config saved to /var/cache/conftool/dbconfig/20220819-114703-marostegui.json
11:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P32586 and previous config saved to /var/cache/conftool/dbconfig/20220819-113157-marostegui.json
11:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P32584 and previous config saved to /var/cache/conftool/dbconfig/20220819-111651-marostegui.json
11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T312972)', diff saved to https://phabricator.wikimedia.org/P32583 and previous config saved to /var/cache/conftool/dbconfig/20220819-110145-marostegui.json
10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T312972)', diff saved to https://phabricator.wikimedia.org/P32582 and previous config saved to /var/cache/conftool/dbconfig/20220819-105934-marostegui.json
10:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
10:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
10:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
10:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T312972)', diff saved to https://phabricator.wikimedia.org/P32581 and previous config saved to /var/cache/conftool/dbconfig/20220819-105906-marostegui.json
10:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32580 and previous config saved to /var/cache/conftool/dbconfig/20220819-105212-ladsgroup.json
10:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
10:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
10:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T314041)', diff saved to https://phabricator.wikimedia.org/P32579 and previous config saved to /var/cache/conftool/dbconfig/20220819-105151-ladsgroup.json
10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P32578 and previous config saved to /var/cache/conftool/dbconfig/20220819-104400-marostegui.json
10:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P32577 and previous config saved to /var/cache/conftool/dbconfig/20220819-103645-ladsgroup.json
10:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P32576 and previous config saved to /var/cache/conftool/dbconfig/20220819-102854-marostegui.json
10:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P32575 and previous config saved to /var/cache/conftool/dbconfig/20220819-102139-ladsgroup.json
10:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T312972)', diff saved to https://phabricator.wikimedia.org/P32574 and previous config saved to /var/cache/conftool/dbconfig/20220819-101348-marostegui.json
10:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T314041)', diff saved to https://phabricator.wikimedia.org/P32573 and previous config saved to /var/cache/conftool/dbconfig/20220819-100633-ladsgroup.json
09:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T312972)', diff saved to https://phabricator.wikimedia.org/P32572 and previous config saved to /var/cache/conftool/dbconfig/20220819-095035-marostegui.json
09:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
09:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
09:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T312972)', diff saved to https://phabricator.wikimedia.org/P32571 and previous config saved to /var/cache/conftool/dbconfig/20220819-095014-marostegui.json
09:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P32570 and previous config saved to /var/cache/conftool/dbconfig/20220819-093508-marostegui.json
09:23 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons.
09:21 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
09:21 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
09:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P32569 and previous config saved to /var/cache/conftool/dbconfig/20220819-092002-marostegui.json
09:17 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:11 topranks: running authdns-update on auth1001 to add new include to 0.0.5.e.2.f.d.0.1.0.0.2.ip6.arpa. zone
09:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T312972)', diff saved to https://phabricator.wikimedia.org/P32568 and previous config saved to /var/cache/conftool/dbconfig/20220819-090456-marostegui.json
09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T312972)', diff saved to https://phabricator.wikimedia.org/P32567 and previous config saved to /var/cache/conftool/dbconfig/20220819-090146-marostegui.json
09:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
09:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T312972)', diff saved to https://phabricator.wikimedia.org/P32566 and previous config saved to /var/cache/conftool/dbconfig/20220819-090124-marostegui.json
08:56 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons.
08:56 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P32565 and previous config saved to /var/cache/conftool/dbconfig/20220819-084618-marostegui.json
08:44 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade.
08:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P32564 and previous config saved to /var/cache/conftool/dbconfig/20220819-083112-marostegui.json
08:16 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be2067.codfw.wmnet
08:16 mvernon@cumin1001: START - Cookbook sre.hosts.remove-downtime for ms-be2067.codfw.wmnet
08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T312972)', diff saved to https://phabricator.wikimedia.org/P32563 and previous config saved to /var/cache/conftool/dbconfig/20220819-081606-marostegui.json
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T312972)', diff saved to https://phabricator.wikimedia.org/P32562 and previous config saved to /var/cache/conftool/dbconfig/20220819-081356-marostegui.json
08:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
08:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T312972)', diff saved to https://phabricator.wikimedia.org/P32561 and previous config saved to /var/cache/conftool/dbconfig/20220819-081317-marostegui.json
07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P32559 and previous config saved to /var/cache/conftool/dbconfig/20220819-075812-marostegui.json
07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P32558 and previous config saved to /var/cache/conftool/dbconfig/20220819-074306-marostegui.json
07:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T312972)', diff saved to https://phabricator.wikimedia.org/P32557 and previous config saved to /var/cache/conftool/dbconfig/20220819-072800-marostegui.json
07:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
07:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
07:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T312972)', diff saved to https://phabricator.wikimedia.org/P32556 and previous config saved to /var/cache/conftool/dbconfig/20220819-072422-marostegui.json
07:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T314041)', diff saved to https://phabricator.wikimedia.org/P32555 and previous config saved to /var/cache/conftool/dbconfig/20220819-071934-ladsgroup.json
07:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
07:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
07:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
07:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
07:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P32553 and previous config saved to /var/cache/conftool/dbconfig/20220819-070916-marostegui.json
06:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P32552 and previous config saved to /var/cache/conftool/dbconfig/20220819-065409-marostegui.json
06:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T312972)', diff saved to https://phabricator.wikimedia.org/P32551 and previous config saved to /var/cache/conftool/dbconfig/20220819-063903-marostegui.json
06:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T312972)', diff saved to https://phabricator.wikimedia.org/P32550 and previous config saved to /var/cache/conftool/dbconfig/20220819-061649-marostegui.json
06:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
06:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
06:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T312972)', diff saved to https://phabricator.wikimedia.org/P32549 and previous config saved to /var/cache/conftool/dbconfig/20220819-061628-marostegui.json
06:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1127', diff saved to https://phabricator.wikimedia.org/P32548 and previous config saved to /var/cache/conftool/dbconfig/20220819-061515-root.json
06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P32547 and previous config saved to /var/cache/conftool/dbconfig/20220819-060122-marostegui.json
05:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P32546 and previous config saved to /var/cache/conftool/dbconfig/20220819-054616-marostegui.json
05:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T312972)', diff saved to https://phabricator.wikimedia.org/P32544 and previous config saved to /var/cache/conftool/dbconfig/20220819-053110-marostegui.json
05:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T312972)', diff saved to https://phabricator.wikimedia.org/P32543 and previous config saved to /var/cache/conftool/dbconfig/20220819-052900-marostegui.json
05:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
05:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
05:20 marostegui: Install 10.6.9 on db2122 and db2146
21:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on relforge[1003-1004].eqiad.wmnet with reason: elastic 7 upgrade
21:14 bking@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on relforge[1003-1004].eqiad.wmnet with reason: elastic 7 upgrade
16:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 13 hosts with reason: Maintenance
16:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 13 hosts with reason: Maintenance
16:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
16:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
16:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T312972)', diff saved to https://phabricator.wikimedia.org/P32541 and previous config saved to /var/cache/conftool/dbconfig/20220818-164456-marostegui.json
16:44 demon@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.25 refs T314186
16:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P32540 and previous config saved to /var/cache/conftool/dbconfig/20220818-162950-marostegui.json
16:26 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be2067.codfw.wmnet with reason: disk fault investigation
16:26 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be2067.codfw.wmnet with reason: disk fault investigation
16:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-stretch2001.codfw.wmnet with OS bullseye
16:17 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-stretch2001.codfw.wmnet with OS bullseye
16:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P32539 and previous config saved to /var/cache/conftool/dbconfig/20220818-161444-marostegui.json
15:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T312972)', diff saved to https://phabricator.wikimedia.org/P32538 and previous config saved to /var/cache/conftool/dbconfig/20220818-155938-marostegui.json
15:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T312972)', diff saved to https://phabricator.wikimedia.org/P32537 and previous config saved to /var/cache/conftool/dbconfig/20220818-155410-marostegui.json
15:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
15:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
15:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T312972)', diff saved to https://phabricator.wikimedia.org/P32536 and previous config saved to /var/cache/conftool/dbconfig/20220818-155348-marostegui.json
15:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P32535 and previous config saved to /var/cache/conftool/dbconfig/20220818-153842-marostegui.json
15:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P32534 and previous config saved to /var/cache/conftool/dbconfig/20220818-152335-marostegui.json
15:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T312972)', diff saved to https://phabricator.wikimedia.org/P32533 and previous config saved to /var/cache/conftool/dbconfig/20220818-150829-marostegui.json
15:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T312972)', diff saved to https://phabricator.wikimedia.org/P32532 and previous config saved to /var/cache/conftool/dbconfig/20220818-150621-marostegui.json
15:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
15:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
15:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T312972)', diff saved to https://phabricator.wikimedia.org/P32531 and previous config saved to /var/cache/conftool/dbconfig/20220818-150601-marostegui.json
15:05 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-stretch2001.codfw.wmnet with OS bullseye
14:58 dancy@deploy1002: Started deploy [integration/docroot@a43ff3b]: (no justification provided)
14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P32530 and previous config saved to /var/cache/conftool/dbconfig/20220818-145055-marostegui.json
14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P32529 and previous config saved to /var/cache/conftool/dbconfig/20220818-143549-marostegui.json
14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T312972)', diff saved to https://phabricator.wikimedia.org/P32528 and previous config saved to /var/cache/conftool/dbconfig/20220818-142043-marostegui.json
14:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T312972)', diff saved to https://phabricator.wikimedia.org/P32527 and previous config saved to /var/cache/conftool/dbconfig/20220818-141835-marostegui.json
14:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
14:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
14:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T312972)', diff saved to https://phabricator.wikimedia.org/P32526 and previous config saved to /var/cache/conftool/dbconfig/20220818-141815-marostegui.json
14:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P32525 and previous config saved to /var/cache/conftool/dbconfig/20220818-140309-marostegui.json
13:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P32524 and previous config saved to /var/cache/conftool/dbconfig/20220818-134803-marostegui.json
13:37 jbond: uploaded spicerack_3.2.0 to apt.wikimedia.org bullseye-wikimedia
13:37 samtar@deploy1002: scap failed: average error rate on 5/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)
13:33 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
13:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T312972)', diff saved to https://phabricator.wikimedia.org/P32523 and previous config saved to /var/cache/conftool/dbconfig/20220818-133257-marostegui.json
13:32 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1033.eqiad.wmnet with OS bullseye
12:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T312972)', diff saved to https://phabricator.wikimedia.org/P32522 and previous config saved to /var/cache/conftool/dbconfig/20220818-123241-marostegui.json
12:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
12:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
12:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T312972)', diff saved to https://phabricator.wikimedia.org/P32521 and previous config saved to /var/cache/conftool/dbconfig/20220818-123220-marostegui.json
12:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P32520 and previous config saved to /var/cache/conftool/dbconfig/20220818-121714-marostegui.json
12:04 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet
12:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P32519 and previous config saved to /var/cache/conftool/dbconfig/20220818-120208-marostegui.json
11:55 jbond@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet
11:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T312972)', diff saved to https://phabricator.wikimedia.org/P32518 and previous config saved to /var/cache/conftool/dbconfig/20220818-114702-marostegui.json
11:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T312972)', diff saved to https://phabricator.wikimedia.org/P32517 and previous config saved to /var/cache/conftool/dbconfig/20220818-114555-marostegui.json
11:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
11:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
11:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
11:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
11:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T312972)', diff saved to https://phabricator.wikimedia.org/P32516 and previous config saved to /var/cache/conftool/dbconfig/20220818-114518-marostegui.json
11:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repool db1112', diff saved to https://phabricator.wikimedia.org/P32515 and previous config saved to /var/cache/conftool/dbconfig/20220818-113655-ladsgroup.json
11:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'depool db1112', diff saved to https://phabricator.wikimedia.org/P32514 and previous config saved to /var/cache/conftool/dbconfig/20220818-113556-ladsgroup.json
11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P32513 and previous config saved to /var/cache/conftool/dbconfig/20220818-113012-marostegui.json
11:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P32511 and previous config saved to /var/cache/conftool/dbconfig/20220818-111506-marostegui.json
11:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T312972)', diff saved to https://phabricator.wikimedia.org/P32510 and previous config saved to /var/cache/conftool/dbconfig/20220818-110000-marostegui.json
10:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
10:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T312972)', diff saved to https://phabricator.wikimedia.org/P32508 and previous config saved to /var/cache/conftool/dbconfig/20220818-105531-marostegui.json
10:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1166', diff saved to https://phabricator.wikimedia.org/P32505 and previous config saved to /var/cache/conftool/dbconfig/20220818-104552-ladsgroup.json
10:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P32504 and previous config saved to /var/cache/conftool/dbconfig/20220818-104025-marostegui.json
10:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P32503 and previous config saved to /var/cache/conftool/dbconfig/20220818-102519-marostegui.json
10:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T312972)', diff saved to https://phabricator.wikimedia.org/P32501 and previous config saved to /var/cache/conftool/dbconfig/20220818-101013-marostegui.json
10:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T312972)', diff saved to https://phabricator.wikimedia.org/P32500 and previous config saved to /var/cache/conftool/dbconfig/20220818-100806-marostegui.json
10:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
10:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T312972)', diff saved to https://phabricator.wikimedia.org/P32499 and previous config saved to /var/cache/conftool/dbconfig/20220818-100744-marostegui.json
10:03 jayme@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-worker-codfw
10:00 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
09:52 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
09:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P32498 and previous config saved to /var/cache/conftool/dbconfig/20220818-095238-marostegui.json
09:44 jayme: dnsdisc depooling codfw for services running in kubernetes cluster (for 30-60min due to T310483, T260661)
09:43 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2004.codfw.wmnet
09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P32497 and previous config saved to /var/cache/conftool/dbconfig/20220818-093732-marostegui.json
09:34 _joe_: updating vopsbot to 0.3.0
09:33 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
09:29 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2003.codfw.wmnet
09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T312972)', diff saved to https://phabricator.wikimedia.org/P32496 and previous config saved to /var/cache/conftool/dbconfig/20220818-092226-marostegui.json
09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T312972)', diff saved to https://phabricator.wikimedia.org/P32495 and previous config saved to /var/cache/conftool/dbconfig/20220818-092219-marostegui.json
09:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
09:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
09:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
09:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
09:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T312972)', diff saved to https://phabricator.wikimedia.org/P32494 and previous config saved to /var/cache/conftool/dbconfig/20220818-092130-marostegui.json
09:19 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet
09:18 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet
09:09 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2002.codfw.wmnet
09:09 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1004.eqiad.wmnet
09:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P32493 and previous config saved to /var/cache/conftool/dbconfig/20220818-090624-marostegui.json
08:59 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1004.eqiad.wmnet
08:59 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1003.eqiad.wmnet
08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P32492 and previous config saved to /var/cache/conftool/dbconfig/20220818-085118-marostegui.json
08:49 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1003.eqiad.wmnet
08:49 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner1002.eqiad.wmnet
08:39 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner1002.eqiad.wmnet
08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T312972)', diff saved to https://phabricator.wikimedia.org/P32491 and previous config saved to /var/cache/conftool/dbconfig/20220818-083612-marostegui.json
08:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T312972)', diff saved to https://phabricator.wikimedia.org/P32490 and previous config saved to /var/cache/conftool/dbconfig/20220818-083505-marostegui.json
08:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
08:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
08:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
08:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
08:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T312972)', diff saved to https://phabricator.wikimedia.org/P32489 and previous config saved to /var/cache/conftool/dbconfig/20220818-083417-marostegui.json
08:33 vgutierrez: upgrade to ATS 9.1.3 in cp5014 and cp5016 - T309651
08:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P32488 and previous config saved to /var/cache/conftool/dbconfig/20220818-081911-marostegui.json
08:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P32487 and previous config saved to /var/cache/conftool/dbconfig/20220818-081627-ladsgroup.json
08:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P32486 and previous config saved to /var/cache/conftool/dbconfig/20220818-080405-marostegui.json
08:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P32485 and previous config saved to /var/cache/conftool/dbconfig/20220818-080122-ladsgroup.json
07:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T312972)', diff saved to https://phabricator.wikimedia.org/P32484 and previous config saved to /var/cache/conftool/dbconfig/20220818-074859-marostegui.json
07:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T312972)', diff saved to https://phabricator.wikimedia.org/P32483 and previous config saved to /var/cache/conftool/dbconfig/20220818-074652-marostegui.json
07:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
07:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
07:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
07:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
07:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1104 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P32482 and previous config saved to /var/cache/conftool/dbconfig/20220818-074618-ladsgroup.json
06:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
06:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
06:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
06:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T314041)', diff saved to https://phabricator.wikimedia.org/P32479 and previous config saved to /var/cache/conftool/dbconfig/20220818-064124-ladsgroup.json
06:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P32478 and previous config saved to /var/cache/conftool/dbconfig/20220818-062618-ladsgroup.json
06:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
06:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1104.eqiad.wmnet with reason: Maint
06:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1104.eqiad.wmnet with reason: Maint
06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P32477 and previous config saved to /var/cache/conftool/dbconfig/20220818-061112-ladsgroup.json
06:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1104 T314369', diff saved to https://phabricator.wikimedia.org/P32476 and previous config saved to /var/cache/conftool/dbconfig/20220818-060707-ladsgroup.json
06:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1109 to s8 primary and set section read-write T314369', diff saved to https://phabricator.wikimedia.org/P32475 and previous config saved to /var/cache/conftool/dbconfig/20220818-060213-ladsgroup.json
06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - T314369', diff saved to https://phabricator.wikimedia.org/P32474 and previous config saved to /var/cache/conftool/dbconfig/20220818-060137-ladsgroup.json
06:01 Amir1: Starting s8 eqiad failover from db1104 to db1109 - T314369
05:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T314041)', diff saved to https://phabricator.wikimedia.org/P32473 and previous config saved to /var/cache/conftool/dbconfig/20220818-055606-ladsgroup.json
04:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 31 hosts with reason: Primary switchover s8 T314369
04:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 31 hosts with reason: Primary switchover s8 T314369
04:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1109 with weight 0 T314369', diff saved to https://phabricator.wikimedia.org/P32471 and previous config saved to /var/cache/conftool/dbconfig/20220818-045218-ladsgroup.json
04:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 31 hosts with reason: Primary switchover s8 T314369
04:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 31 hosts with reason: Primary switchover s8 T314369
04:30 TimStarling: on mw1411, mw1413, mw1419, mw1429, mw1431, mw1433: set scaling_governor to performance, attempt 2, T315398
02:15 TimStarling: on mw1411, mw1413, mw1419, mw1429, mw1431, mw1433: set scaling_governor to performance T315398
00:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kubernetes2023.codfw.wmnet']
20:22 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 1ddc661: QuickSurveys: Disable extension on JA wiki (T311015) (duration: 03m 19s)
18:58 demon@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.25 refs T314186
18:58 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-logging2004.codfw.wmnet with OS bullseye
18:55 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1027.eqiad.wmnet with OS bullseye
18:43 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1073.eqiad.wmnet with reason: host reimage
18:40 urandom: disabling reserved space on codfw nodes (RESTBase), /dev/md2 (aka /srv/cassandra/instance-data) -- T314941
18:40 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1027.eqiad.wmnet with reason: host reimage
18:38 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1073.eqiad.wmnet with reason: host reimage
18:36 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1027.eqiad.wmnet with reason: host reimage
18:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T314041)', diff saved to https://phabricator.wikimedia.org/P32469 and previous config saved to /var/cache/conftool/dbconfig/20220817-183223-ladsgroup.json
18:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
18:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
18:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T314041)', diff saved to https://phabricator.wikimedia.org/P32468 and previous config saved to /var/cache/conftool/dbconfig/20220817-183202-ladsgroup.json
18:25 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1073.eqiad.wmnet with OS bullseye
18:22 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1027.eqiad.wmnet with OS bullseye
18:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P32467 and previous config saved to /var/cache/conftool/dbconfig/20220817-181656-ladsgroup.json
18:07 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1056.eqiad.wmnet with OS bullseye
18:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P32466 and previous config saved to /var/cache/conftool/dbconfig/20220817-180150-ladsgroup.json
18:01 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-logging2004.codfw.wmnet with OS bullseye
17:48 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging2004.codfw.wmnet with OS bullseye
17:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T314041)', diff saved to https://phabricator.wikimedia.org/P32465 and previous config saved to /var/cache/conftool/dbconfig/20220817-174644-ladsgroup.json
17:43 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1056.eqiad.wmnet with reason: host reimage
17:42 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2005
17:41 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2005
17:39 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1056.eqiad.wmnet with reason: host reimage
17:33 ladsgroup@deploy1002: Synchronized portals: Migrate wikinews.org to the modern portals (duration: 03m 32s)
17:31 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging2004
17:30 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging2004
17:29 ladsgroup@deploy1002: Synchronized portals/wikipedia.org/assets: Migrate wikinews.org to the modern portals (duration: 03m 29s)
17:24 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1056.eqiad.wmnet with OS bullseye
17:10 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kafka-logging2004.codfw.wmnet with OS bullseye
17:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite2004.codfw.wmnet with OS bullseye
16:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on graphite2004.codfw.wmnet with reason: host reimage
16:55 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on graphite2004.codfw.wmnet with reason: host reimage
11:11 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
11:11 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
09:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 100%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32444 and previous config saved to /var/cache/conftool/dbconfig/20220817-092244-root.json
09:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 100%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32443 and previous config saved to /var/cache/conftool/dbconfig/20220817-092125-root.json
09:10 hashar: Upgraded Gerrit from 3.4.4 to 3.4.5 # T315408
09:09 hashar@deploy1002: Finished deploy [gerrit/gerrit@e11e6a7]: Gerrit to 3.4.5 on gerrit1001 # T315408 (duration: 00m 09s)
09:09 hashar@deploy1002: Started deploy [gerrit/gerrit@e11e6a7]: Gerrit to 3.4.5 on gerrit1001 # T315408
09:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 75%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32442 and previous config saved to /var/cache/conftool/dbconfig/20220817-090739-root.json
09:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 75%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32441 and previous config saved to /var/cache/conftool/dbconfig/20220817-090620-root.json
09:04 hashar@deploy1002: Finished deploy [gerrit/gerrit@e11e6a7]: Gerrit to 3.4.5 on gerrit2002 # T315408 (duration: 00m 11s)
09:03 hashar@deploy1002: Started deploy [gerrit/gerrit@e11e6a7]: Gerrit to 3.4.5 on gerrit2002 # T315408
08:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 50%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32440 and previous config saved to /var/cache/conftool/dbconfig/20220817-085235-root.json
08:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32439 and previous config saved to /var/cache/conftool/dbconfig/20220817-085224-root.json
08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 100%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32438 and previous config saved to /var/cache/conftool/dbconfig/20220817-085136-root.json
08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 50%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32437 and previous config saved to /var/cache/conftool/dbconfig/20220817-085115-root.json
08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 25%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32436 and previous config saved to /var/cache/conftool/dbconfig/20220817-083730-root.json
08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32435 and previous config saved to /var/cache/conftool/dbconfig/20220817-083719-root.json
08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 75%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32434 and previous config saved to /var/cache/conftool/dbconfig/20220817-083631-root.json
08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 25%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32433 and previous config saved to /var/cache/conftool/dbconfig/20220817-083611-root.json
08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 10%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32432 and previous config saved to /var/cache/conftool/dbconfig/20220817-082226-root.json
08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32431 and previous config saved to /var/cache/conftool/dbconfig/20220817-082215-root.json
08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 50%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32430 and previous config saved to /var/cache/conftool/dbconfig/20220817-082127-root.json
08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 10%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32429 and previous config saved to /var/cache/conftool/dbconfig/20220817-082106-root.json
08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 5%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32428 and previous config saved to /var/cache/conftool/dbconfig/20220817-080721-root.json
08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 10%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32427 and previous config saved to /var/cache/conftool/dbconfig/20220817-080710-root.json
08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 10%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32426 and previous config saved to /var/cache/conftool/dbconfig/20220817-080622-root.json
08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 5%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32425 and previous config saved to /var/cache/conftool/dbconfig/20220817-080602-root.json
07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 2%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32424 and previous config saved to /var/cache/conftool/dbconfig/20220817-075216-root.json
07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 5%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32423 and previous config saved to /var/cache/conftool/dbconfig/20220817-075206-root.json
07:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 5%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32422 and previous config saved to /var/cache/conftool/dbconfig/20220817-075118-root.json
07:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 2%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32421 and previous config saved to /var/cache/conftool/dbconfig/20220817-075057-root.json
07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 1%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32420 and previous config saved to /var/cache/conftool/dbconfig/20220817-073712-root.json
07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 1%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32419 and previous config saved to /var/cache/conftool/dbconfig/20220817-073701-root.json
07:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 1%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32418 and previous config saved to /var/cache/conftool/dbconfig/20220817-073613-root.json
07:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 1%: Repooling 10.6', diff saved to https://phabricator.wikimedia.org/P32417 and previous config saved to /var/cache/conftool/dbconfig/20220817-073553-root.json
07:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T314041)', diff saved to https://phabricator.wikimedia.org/P32416 and previous config saved to /var/cache/conftool/dbconfig/20220817-073141-ladsgroup.json
07:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
07:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
07:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
07:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
07:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T314041)', diff saved to https://phabricator.wikimedia.org/P32415 and previous config saved to /var/cache/conftool/dbconfig/20220817-073052-ladsgroup.json
07:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P32414 and previous config saved to /var/cache/conftool/dbconfig/20220817-071546-ladsgroup.json
07:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P32413 and previous config saved to /var/cache/conftool/dbconfig/20220817-070040-ladsgroup.json
06:54 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1034.eqiad.wmnet with OS bullseye
06:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T314041)', diff saved to https://phabricator.wikimedia.org/P32412 and previous config saved to /var/cache/conftool/dbconfig/20220817-064534-ladsgroup.json
06:42 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1028.eqiad.wmnet with OS bullseye
06:38 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1033.eqiad.wmnet with OS bullseye
06:38 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1034.eqiad.wmnet with reason: host reimage
06:37 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1031.eqiad.wmnet with OS bullseye
06:36 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1029.eqiad.wmnet with OS bullseye
06:35 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1034.eqiad.wmnet with reason: host reimage
06:30 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1032.eqiad.wmnet with OS bullseye
06:28 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1027.eqiad.wmnet with OS bullseye
06:21 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1034.eqiad.wmnet with OS bullseye
06:21 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1034.eqiad.wmnet with OS bullseye
06:20 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1029.eqiad.wmnet with reason: host reimage
06:20 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1033.eqiad.wmnet with reason: host reimage
06:20 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1028.eqiad.wmnet with reason: host reimage
06:17 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1031.eqiad.wmnet with reason: host reimage
06:15 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1032.eqiad.wmnet with reason: host reimage
06:13 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1027.eqiad.wmnet with reason: host reimage
06:10 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1028.eqiad.wmnet with reason: host reimage
06:10 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1033.eqiad.wmnet with reason: host reimage
06:10 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1032.eqiad.wmnet with reason: host reimage
06:10 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1029.eqiad.wmnet with reason: host reimage
06:10 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1031.eqiad.wmnet with reason: host reimage
06:10 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1027.eqiad.wmnet with reason: host reimage
06:00 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd1025.eqiad.wmnet with OS bullseye
05:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1034.eqiad.wmnet with OS bullseye
05:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1033.eqiad.wmnet with OS bullseye
05:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1032.eqiad.wmnet with OS bullseye
05:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1031.eqiad.wmnet with OS bullseye
05:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1029.eqiad.wmnet with OS bullseye
05:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1028.eqiad.wmnet with OS bullseye
05:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1027.eqiad.wmnet with OS bullseye
05:51 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1027.eqiad.wmnet with OS bullseye
05:51 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1029.eqiad.wmnet with OS bullseye
05:51 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1028.eqiad.wmnet with OS bullseye
05:51 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1031.eqiad.wmnet with OS bullseye
05:51 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1032.eqiad.wmnet with OS bullseye
05:50 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1033.eqiad.wmnet with OS bullseye
05:50 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1034.eqiad.wmnet with OS bullseye
05:31 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1026.eqiad.wmnet with OS bullseye
05:31 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1034.eqiad.wmnet with OS bullseye
05:31 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1033.eqiad.wmnet with OS bullseye
05:31 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1032.eqiad.wmnet with OS bullseye
05:31 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1031.eqiad.wmnet with OS bullseye
05:31 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1029.eqiad.wmnet with OS bullseye
05:31 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1028.eqiad.wmnet with OS bullseye
05:31 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1027.eqiad.wmnet with OS bullseye
05:26 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1027.eqiad.wmnet with OS bullseye
05:26 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1028.eqiad.wmnet with OS bullseye
05:26 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1031.eqiad.wmnet with OS bullseye
05:26 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1029.eqiad.wmnet with OS bullseye
05:26 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1032.eqiad.wmnet with OS bullseye
05:26 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1033.eqiad.wmnet with OS bullseye
05:26 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1034.eqiad.wmnet with OS bullseye
05:19 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1025.eqiad.wmnet with reason: host reimage
05:16 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1026.eqiad.wmnet with reason: host reimage
05:14 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1030.eqiad.wmnet with OS bullseye
05:13 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1025.eqiad.wmnet with reason: host reimage
05:13 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1026.eqiad.wmnet with reason: host reimage
05:03 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1032.eqiad.wmnet with OS bullseye
05:02 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1032.eqiad.wmnet with OS bullseye
04:59 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1025.eqiad.wmnet with OS bullseye
04:59 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1026.eqiad.wmnet with OS bullseye
04:58 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1030.eqiad.wmnet with reason: host reimage
04:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1034.eqiad.wmnet with OS bullseye
04:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1033.eqiad.wmnet with OS bullseye
04:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1031.eqiad.wmnet with OS bullseye
04:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1029.eqiad.wmnet with OS bullseye
04:57 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1028.eqiad.wmnet with OS bullseye
04:56 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1027.eqiad.wmnet with OS bullseye
04:55 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1030.eqiad.wmnet with reason: host reimage
04:48 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1034.eqiad.wmnet with OS bullseye
04:48 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1033.eqiad.wmnet with OS bullseye
04:48 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1031.eqiad.wmnet with OS bullseye
04:48 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1029.eqiad.wmnet with OS bullseye
04:47 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1028.eqiad.wmnet with OS bullseye
04:47 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1027.eqiad.wmnet with OS bullseye
04:42 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1030.eqiad.wmnet with OS bullseye
04:31 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1030.eqiad.wmnet with OS bullseye
04:25 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1030.eqiad.wmnet with reason: host reimage
04:23 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1034.eqiad.wmnet with OS bullseye
04:23 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1033.eqiad.wmnet with OS bullseye
04:23 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1032.eqiad.wmnet with OS bullseye
04:23 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1031.eqiad.wmnet with OS bullseye
04:23 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1029.eqiad.wmnet with OS bullseye
04:23 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1028.eqiad.wmnet with OS bullseye
04:23 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1027.eqiad.wmnet with OS bullseye
04:23 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1030.eqiad.wmnet with reason: host reimage
04:09 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1030.eqiad.wmnet with OS bullseye
04:08 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd1025.eqiad.wmnet with OS bullseye
02:58 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts elastic[1051-1052].eqiad.wmnet
02:58 ryankemper@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
00:03 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-logging2005.mgmt.codfw.wmnet with reboot policy FORCED
00:02 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging2005.mgmt.codfw.wmnet with reboot policy FORCED
2022-08-16
23:56 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-logging2005.mgmt.codfw.wmnet with reboot policy FORCED
23:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging2004.mgmt.codfw.wmnet with reboot policy FORCED
23:44 mutante: phab1001 - repeated rsync of /srv/repos to phab2002, then chown -R phd /srv/repos/ (without setting the group) - this way UID is fixed and privs match exactly phab1001 - T313360
23:19 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1026.eqiad.wmnet with OS bullseye
23:04 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1026.eqiad.wmnet with reason: host reimage
23:02 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1025.eqiad.wmnet with reason: host reimage
23:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host graphite2004.mgmt.codfw.wmnet with reboot policy FORCED
22:59 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1026.eqiad.wmnet with reason: host reimage
22:59 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1025.eqiad.wmnet with reason: host reimage
22:47 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1026.eqiad.wmnet with OS bullseye
22:47 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1025.eqiad.wmnet with OS bullseye
22:42 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1025.eqiad.wmnet with OS bullseye
22:31 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1025.eqiad.wmnet with OS bullseye
22:30 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1025.eqiad.wmnet with OS bullseye
22:29 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1025.eqiad.wmnet with OS bullseye
22:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Large deletions affecting this replica
22:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Large deletions affecting this replica
22:10 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1025.eqiad.wmnet with reason: host reimage
22:07 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1025.eqiad.wmnet with reason: host reimage
21:56 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1025.eqiad.wmnet with OS bullseye
21:56 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd1025.eqiad.wmnet with OS bullseye
21:54 demon@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.25 refs T314186
21:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1025.eqiad.wmnet with OS bullseye
21:53 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1025.eqiad.wmnet with OS bullseye
21:53 bking@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts elastic1048.eqiad.wmnet
21:53 bking@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:19 cjming@deploy1002: Started scap: Backport for gerrit:823658 mediawikiwiki: set $wgCdnMatchParameterOrder to false
20:18 ori: removed /var/lock/scap.operations_mediawiki-config.lock on deploy1002
20:16 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1055.eqiad.wmnet with reason: host reimage
20:14 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1025.eqiad.wmnet with OS bullseye
20:14 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1055.eqiad.wmnet with reason: host reimage
20:13 cjming@deploy1002: scap failed: LockFailedError Failed to acquire lock "/var/lock/scap.operations_mediawiki-config.lock"; owner is "demon"; reason is "all wikis to 1.39.0-wmf.23 refs T314186" (duration: 00m 00s)
19:58 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1055.eqiad.wmnet with OS bullseye
19:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T314041)', diff saved to https://phabricator.wikimedia.org/P32408 and previous config saved to /var/cache/conftool/dbconfig/20220816-195115-ladsgroup.json
19:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
19:50 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1025.eqiad.wmnet with OS bullseye
19:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
19:50 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1025.eqiad.wmnet with OS bullseye
19:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T314041)', diff saved to https://phabricator.wikimedia.org/P32407 and previous config saved to /var/cache/conftool/dbconfig/20220816-195043-ladsgroup.json
19:49 demon@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.25 refs T314186
19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P32406 and previous config saved to /var/cache/conftool/dbconfig/20220816-193537-ladsgroup.json
19:25 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reimage (bullseye upgrade) - ryankemper@cumin1001 - T289135
19:21 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1025.eqiad.wmnet with OS bullseye
19:20 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1025.eqiad.wmnet with OS bullseye
19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P32405 and previous config saved to /var/cache/conftool/dbconfig/20220816-192031-ladsgroup.json
19:19 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1025.eqiad.wmnet with OS bullseye
19:18 ryankemper@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reimage (bullseye upgrade) - ryankemper@cumin1001 - T289135
19:13 otto@deploy1002: Finished deploy [analytics/refinery@6e47e0e]: Full deploy after last week's interrupted deployment. This syncs the latest refinery to all targets. an-launcher1002 already has these files. (duration: 24m 46s)
19:07 demon@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.23 refs T314186
19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T314041)', diff saved to https://phabricator.wikimedia.org/P32404 and previous config saved to /var/cache/conftool/dbconfig/20220816-190525-ladsgroup.json
19:05 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1025.eqiad.wmnet with OS bullseye
19:04 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirt1025.eqiad.wmnet with OS bullseye
19:04 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1076.eqiad.wmnet with OS bullseye
18:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1025.eqiad.wmnet with OS bullseye
18:51 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1025.eqiad.wmnet with OS bullseye
18:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1025.eqiad.wmnet with OS bullseye
18:48 otto@deploy1002: Started deploy [analytics/refinery@6e47e0e]: Full deploy after last week's interrupted deployment. This syncs the latest refinery to all targets. an-launcher1002 already has these files.
18:42 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1076.eqiad.wmnet with reason: host reimage
18:40 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1025.eqiad.wmnet with OS bullseye
18:40 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1025.eqiad.wmnet with OS bullseye
18:39 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1076.eqiad.wmnet with reason: host reimage
16:02 btullis@deploy1002: Started deploy [airflow-dags/analytics@3c998da]: (no justification provided)
15:48 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be2032.codfw.wmnet
15:48 mvernon@cumin1001: START - Cookbook sre.hosts.remove-downtime for ms-be2032.codfw.wmnet
15:42 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1074.eqiad.wmnet with OS bullseye
15:29 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be2032.codfw.wmnet with reason: RAID battery failure
15:29 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be2032.codfw.wmnet with reason: RAID battery failure
15:25 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1074.eqiad.wmnet with reason: host reimage
15:23 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1074.eqiad.wmnet with reason: host reimage
15:12 jayme@cumin1001: END (PASS) - Cookbook sre.discovery.service-route-jayme (exit_code=0)
15:10 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1074.eqiad.wmnet with OS bullseye
10:34 jelto@cumin1001: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
10:30 jelto: reimaging gitlab2003 (insetup) to test partman recipe from gerrit:823115 - T274463
09:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on db1169.eqiad.wmnet with reason: Maintenance
09:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on db1169.eqiad.wmnet with reason: Maintenance
09:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1163.eqiad.wmnet with reason: Maintenance
09:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1163.eqiad.wmnet with reason: Maintenance
09:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on db1169.eqiad.wmnet with reason: Maintenance
09:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on db1169.eqiad.wmnet with reason: Maintenance
09:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1163.eqiad.wmnet with reason: Maintenance
09:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1163.eqiad.wmnet with reason: Maintenance
08:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
08:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
08:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
08:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
07:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T314041)', diff saved to https://phabricator.wikimedia.org/P32402 and previous config saved to /var/cache/conftool/dbconfig/20220816-074259-ladsgroup.json
07:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
07:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
07:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T314041)', diff saved to https://phabricator.wikimedia.org/P32401 and previous config saved to /var/cache/conftool/dbconfig/20220816-074239-ladsgroup.json
07:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P32400 and previous config saved to /var/cache/conftool/dbconfig/20220816-072733-ladsgroup.json
07:26 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be2067.codfw.wmnet
07:26 mvernon@cumin1001: START - Cookbook sre.hosts.remove-downtime for ms-be2067.codfw.wmnet
07:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
07:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance
07:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
07:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
07:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1169.eqiad.wmnet with reason: Maint
07:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1169.eqiad.wmnet with reason: Maint
07:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P32399 and previous config saved to /var/cache/conftool/dbconfig/20220816-071227-ladsgroup.json
06:58 hashar@deploy1002: Started deploy [integration/docroot@c142ba7]: Drop archived wikibase-vuejs-components storybook - T309872
06:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T314041)', diff saved to https://phabricator.wikimedia.org/P32398 and previous config saved to /var/cache/conftool/dbconfig/20220816-065721-ladsgroup.json
06:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maint
06:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maint
06:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1169', diff saved to https://phabricator.wikimedia.org/P32397 and previous config saved to /var/cache/conftool/dbconfig/20220816-062955-ladsgroup.json
06:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1163 T314380', diff saved to https://phabricator.wikimedia.org/P32396 and previous config saved to /var/cache/conftool/dbconfig/20220816-061413-ladsgroup.json
06:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1118 to s1 primary and set section read-write T314380', diff saved to https://phabricator.wikimedia.org/P32395 and previous config saved to /var/cache/conftool/dbconfig/20220816-060530-ladsgroup.json
06:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - T314380', diff saved to https://phabricator.wikimedia.org/P32394 and previous config saved to /var/cache/conftool/dbconfig/20220816-060455-ladsgroup.json
06:04 Amir1: Starting s1 eqiad failover from db1163 to db1118 - T314380
05:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1118 with weight 0 T314380', diff saved to https://phabricator.wikimedia.org/P32393 and previous config saved to /var/cache/conftool/dbconfig/20220816-053534-ladsgroup.json
05:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 34 hosts with reason: Primary switchover s1 T314380
05:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 34 hosts with reason: Primary switchover s1 T314380
05:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db[2142-2143].codfw.wmnet with reason: After-canary
05:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on db[2142-2143].codfw.wmnet with reason: After-canary
2022-08-15
23:20 mutante: phab2002 - manually removing service IP addresses for git-ssh.codfw.wikimedia.org which were added by puppet even after gerrit:823220 (!) T280597
22:59 mutante: search-loader1001 - killed puppet process that had been running since May
22:52 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage
22:49 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage
22:36 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddumps1001.wikimedia.org with OS bullseye
22:33 mutante: rsyncing /srv/repos and /srv/dumps from phab1001 to phab2002 before applying prod puppet role (T313360)
22:01 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1083.eqiad.wmnet with OS bullseye
19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T314041)', diff saved to https://phabricator.wikimedia.org/P32391 and previous config saved to /var/cache/conftool/dbconfig/20220815-193541-ladsgroup.json
19:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
19:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 (T314041)', diff saved to https://phabricator.wikimedia.org/P32390 and previous config saved to /var/cache/conftool/dbconfig/20220815-193520-ladsgroup.json
19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130', diff saved to https://phabricator.wikimedia.org/P32389 and previous config saved to /var/cache/conftool/dbconfig/20220815-192014-ladsgroup.json
19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130', diff saved to https://phabricator.wikimedia.org/P32388 and previous config saved to /var/cache/conftool/dbconfig/20220815-190508-ladsgroup.json
18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1130 (T314041)', diff saved to https://phabricator.wikimedia.org/P32387 and previous config saved to /var/cache/conftool/dbconfig/20220815-185002-ladsgroup.json
18:49 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1081.eqiad.wmnet with OS bullseye
18:40 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@230a820]: include additional debugging information in HivePartitionRangeSensor logs (duration: 02m 08s)
18:38 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@230a820]: include additional debugging information in HivePartitionRangeSensor logs
18:33 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1081.eqiad.wmnet with reason: host reimage
18:31 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ms-be2067.codfw.wmnet
18:29 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1081.eqiad.wmnet with reason: host reimage
18:24 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be2067.codfw.wmnet
18:16 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1081.eqiad.wmnet with OS bullseye
18:07 herron: thanos compact process was hung, forced thanos-compact restart on thanos-fe2001
17:48 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1052.eqiad.wmnet with OS bullseye
17:32 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1052.eqiad.wmnet with reason: host reimage
17:29 pt1979@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2067.codfw.wmnet
15:32 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be2067.codfw.wmnet with reason: disk fault investigation
15:32 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be2067.codfw.wmnet with reason: disk fault investigation
15:31 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be2032.codfw.wmnet
15:31 mvernon@cumin1001: START - Cookbook sre.hosts.remove-downtime for ms-be2032.codfw.wmnet
15:31 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be2032.codfw.wmnet with reason: RAID battery failure
15:31 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be2032.codfw.wmnet with reason: RAID battery failure
15:31 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be2032.codfw.wmnet
15:31 mvernon@cumin1001: START - Cookbook sre.hosts.remove-downtime for ms-be2032.codfw.wmnet
15:01 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1068.eqiad.wmnet with OS bullseye
14:39 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1068.eqiad.wmnet with reason: host reimage
14:36 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1068.eqiad.wmnet with reason: host reimage
14:23 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1068.eqiad.wmnet with OS bullseye
14:10 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be2032.codfw.wmnet with reason: RAID battery failure
14:10 hnowlan@deploy1002: Started deploy [restbase/deploy@a571f9a]: Add blwiki T310874
14:10 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be2032.codfw.wmnet with reason: RAID battery failure
14:05 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1070.eqiad.wmnet with OS bullseye
13:49 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1070.eqiad.wmnet with reason: host reimage
13:46 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1070.eqiad.wmnet with reason: host reimage
13:34 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1070.eqiad.wmnet with OS bullseye
13:29 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reimage (bullseye upgrade) - ryankemper@cumin1001 - T289135
07:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1130 (T314041)', diff saved to https://phabricator.wikimedia.org/P32386 and previous config saved to /var/cache/conftool/dbconfig/20220815-070955-ladsgroup.json
07:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1130.eqiad.wmnet with reason: Maintenance
07:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1130.eqiad.wmnet with reason: Maintenance
07:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
07:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T312863)', diff saved to https://phabricator.wikimedia.org/P32385 and previous config saved to /var/cache/conftool/dbconfig/20220815-023538-ladsgroup.json
02:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P32384 and previous config saved to /var/cache/conftool/dbconfig/20220815-022032-ladsgroup.json
02:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P32383 and previous config saved to /var/cache/conftool/dbconfig/20220815-020526-ladsgroup.json
01:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T312863)', diff saved to https://phabricator.wikimedia.org/P32382 and previous config saved to /var/cache/conftool/dbconfig/20220815-015020-ladsgroup.json
2022-08-14
08:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T312863)', diff saved to https://phabricator.wikimedia.org/P32380 and previous config saved to /var/cache/conftool/dbconfig/20220814-085443-ladsgroup.json
08:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
08:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
2022-08-13
13:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
13:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
13:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T312863)', diff saved to https://phabricator.wikimedia.org/P32379 and previous config saved to /var/cache/conftool/dbconfig/20220813-133713-ladsgroup.json
13:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P32378 and previous config saved to /var/cache/conftool/dbconfig/20220813-132207-ladsgroup.json
13:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P32377 and previous config saved to /var/cache/conftool/dbconfig/20220813-130701-ladsgroup.json
12:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T312863)', diff saved to https://phabricator.wikimedia.org/P32376 and previous config saved to /var/cache/conftool/dbconfig/20220813-125156-ladsgroup.json
2022-08-12
22:14 ryankemper@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reimage (bullseye upgrade) - ryankemper@cumin1001 - T289135
21:48 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1071.eqiad.wmnet with OS bullseye
21:45 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host clouddb2002-dev.codfw.wmnet with OS bullseye
21:27 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1071.eqiad.wmnet with reason: host reimage
21:25 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1071.eqiad.wmnet with reason: host reimage
21:12 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1071.eqiad.wmnet with OS bullseye
21:10 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb2002-dev.codfw.wmnet with reason: host reimage
21:06 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb2002-dev.codfw.wmnet with reason: host reimage
21:06 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1053.eqiad.wmnet with OS bullseye
20:50 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host clouddb2002-dev.codfw.wmnet with OS bullseye
20:43 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1053.eqiad.wmnet with reason: host reimage
20:39 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1053.eqiad.wmnet with reason: host reimage
20:24 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1053.eqiad.wmnet with OS bullseye
20:12 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1048.eqiad.wmnet with OS bullseye
19:55 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1048.eqiad.wmnet with reason: host reimage
19:53 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1048.eqiad.wmnet with reason: host reimage
19:42 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1048.eqiad.wmnet with OS bullseye
19:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T312863)', diff saved to https://phabricator.wikimedia.org/P32375 and previous config saved to /var/cache/conftool/dbconfig/20220812-193822-ladsgroup.json
19:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
19:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
19:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T312863)', diff saved to https://phabricator.wikimedia.org/P32374 and previous config saved to /var/cache/conftool/dbconfig/20220812-193801-ladsgroup.json
19:33 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1054.eqiad.wmnet with OS bullseye
19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P32373 and previous config saved to /var/cache/conftool/dbconfig/20220812-192255-ladsgroup.json
19:12 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1054.eqiad.wmnet with reason: host reimage
19:09 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1054.eqiad.wmnet with reason: host reimage
19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P32372 and previous config saved to /var/cache/conftool/dbconfig/20220812-190749-ladsgroup.json
18:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet with reason: Maint
18:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet with reason: Maint
18:54 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1054.eqiad.wmnet with OS bullseye
18:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T312863)', diff saved to https://phabricator.wikimedia.org/P32371 and previous config saved to /var/cache/conftool/dbconfig/20220812-185243-ladsgroup.json
18:48 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1066.eqiad.wmnet with OS bullseye
18:25 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1066.eqiad.wmnet with reason: host reimage
18:22 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1066.eqiad.wmnet with reason: host reimage
18:08 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1066.eqiad.wmnet with OS bullseye
18:00 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1064.eqiad.wmnet with OS bullseye
17:42 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1064.eqiad.wmnet with reason: host reimage
17:39 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1064.eqiad.wmnet with reason: host reimage
17:24 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1064.eqiad.wmnet with OS bullseye
17:21 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts netmon2002.wikimedia.org
01:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T312863)', diff saved to https://phabricator.wikimedia.org/P32369 and previous config saved to /var/cache/conftool/dbconfig/20220812-010312-ladsgroup.json
01:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
01:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
01:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance
01:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance
01:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T312863)', diff saved to https://phabricator.wikimedia.org/P32368 and previous config saved to /var/cache/conftool/dbconfig/20220812-010233-ladsgroup.json
00:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P32367 and previous config saved to /var/cache/conftool/dbconfig/20220812-004727-ladsgroup.json
00:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P32366 and previous config saved to /var/cache/conftool/dbconfig/20220812-003221-ladsgroup.json
00:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T312863)', diff saved to https://phabricator.wikimedia.org/P32365 and previous config saved to /var/cache/conftool/dbconfig/20220812-001715-ladsgroup.json
2022-08-11
16:35 mvernon@cumin2002: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: upgrade to 3.11.13 T309896 - mvernon@cumin2002
16:30 mvernon@cumin2002: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: upgrade to 3.11.13 T309896 - mvernon@cumin2002
16:29 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic[1100-1102].eqiad.wmnet with reason: T309810
16:29 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic[1100-1102].eqiad.wmnet with reason: T309810
16:26 inflatador: bking@elastic1054 attempting to ban elastic1100-1102 from cluster due to firewall issues
14:16 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
14:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
14:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
14:15 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
07:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T312863)', diff saved to https://phabricator.wikimedia.org/P32357 and previous config saved to /var/cache/conftool/dbconfig/20220811-070312-ladsgroup.json
07:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1147.eqiad.wmnet with reason: Maintenance
07:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1147.eqiad.wmnet with reason: Maintenance
07:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T312863)', diff saved to https://phabricator.wikimedia.org/P32356 and previous config saved to /var/cache/conftool/dbconfig/20220811-070252-ladsgroup.json
06:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P32355 and previous config saved to /var/cache/conftool/dbconfig/20220811-064746-ladsgroup.json
06:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P32354 and previous config saved to /var/cache/conftool/dbconfig/20220811-063240-ladsgroup.json
06:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
06:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
06:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T312863)', diff saved to https://phabricator.wikimedia.org/P32353 and previous config saved to /var/cache/conftool/dbconfig/20220811-061734-ladsgroup.json
06:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maint
06:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maint
06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1122 to s2 primary and set section read-write T314368', diff saved to https://phabricator.wikimedia.org/P32351 and previous config saved to /var/cache/conftool/dbconfig/20220811-060113-ladsgroup.json
06:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s2 eqiad as read-only for maintenance - T314368', diff saved to https://phabricator.wikimedia.org/P32350 and previous config saved to /var/cache/conftool/dbconfig/20220811-060042-ladsgroup.json
06:00 Amir1: Starting s2 eqiad failover from db1162 to db1122 - T314368
05:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1122 with weight 0 T314368', diff saved to https://phabricator.wikimedia.org/P32349 and previous config saved to /var/cache/conftool/dbconfig/20220811-051913-ladsgroup.json
05:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 25 hosts with reason: Primary switchover s2 T314368
05:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 25 hosts with reason: Primary switchover s2 T314368
m: chown -R librenms /srv/librenms/rrd/ on netmon1003 T314972
03:51 cwhite: chown librenms /srv/librenms/rrd/* on netmon1003 T314972
00:57 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on cp2042.codfw.wmnet with reason: host down; depooled and will debug tomorrow
00:57 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on cp2042.codfw.wmnet with reason: host down; depooled and will debug tomorrow
2022-08-10
17:34 otto@deploy1002: Finished deploy [analytics/refinery@6e47e0e] (hadoop-test): Add missing changes to the deletion script - T270433 - TEST [analytics/refinery@6e47e0e] (duration: 04m 19s)
17:30 fnegri@cumin1001: START - Cookbook sre.hosts.decommission for hosts labweb1002.wikimedia.org
17:30 otto@deploy1002: Started deploy [analytics/refinery@6e47e0e] (hadoop-test): Add missing changes to the deletion script - T270433 - TEST [analytics/refinery@6e47e0e]
17:08 otto@deploy1002: Started deploy [analytics/refinery@d4dd7e4] (hadoop-test): Add safety limits to refinery-drop-older-than - T270433 - TEST [analytics/refinery@d4dd7e4]
17:06 sukhe: testing ATS 9.1.3-1wm1 on cp4032: T309651
17:06 urandom: flushing RESTBase Cassandra tables -row B- to (temporarily) free instance-data space -- T314941
17:05 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on krb2001.codfw.wmnet with reason: btullis codfw maintenance
17:05 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on krb2001.codfw.wmnet with reason: btullis codfw maintenance
17:04 dzahn@cumin2002: START - Cookbook sre.hosts.decommission for hosts gerrit2001.wikimedia.org
17:02 sukhe: testing ATS 9.1.3-1wm1 on cp6008: T309651
16:56 sukhe: testing ATS 9.1.3-1wm1 on cp6016: T309651
16:55 fnegri@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts labweb1001.wikimedia.org
16:55 fnegri@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:32 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts gerrit2001.wikimedia.org
16:32 dzahn@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
16:32 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kubernetes[2013-2014].codfw.wmnet
16:31 jelto@cumin1001: START - Cookbook sre.hosts.remove-downtime for kubernetes[2013-2014].codfw.wmnet
16:13 sukhe: reprepro -C component/trafficserver9 include buster-wikimedia trafficserver_9.1.3-1wm1_amd64.changes: T309651
16:13 dzahn@cumin2002: START - Cookbook sre.hosts.decommission for hosts gerrit2001.wikimedia.org
16:11 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be[2039,2050,2056,2059].codfw.wmnet,thanos-be2004.codfw.wmnet with reason: PDU work
16:10 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be[2039,2050,2056,2059].codfw.wmnet,thanos-be2004.codfw.wmnet with reason: PDU work
16:09 urandom: flushing tables in row D (RESTBase Cassandra cluster) -- T314941
15:54 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gitlab-runner2004.codfw.wmnet
15:54 jelto@cumin1001: START - Cookbook sre.hosts.remove-downtime for gitlab-runner2004.codfw.wmnet
15:53 sukhe: poweroff cp2041, 42 for PDU upgrade: rack D7
15:51 urandom: flushing tables in row B (RESTBase Cassandra cluster) -- T314941
15:49 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2004.codfw.wmnet with reason: PDU maintenance
15:49 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2004.codfw.wmnet with reason: PDU maintenance
15:46 urandom: flushing tables in row A (RESTBase Cassandra cluster) -- T314941
15:46 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on aqs2012.codfw.wmnet with reason: btullis codfw maintenance
15:46 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp[2041-2042].codfw.wmnet with reason: shutdown for PDU upgrade: rack D4
15:46 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on aqs2012.codfw.wmnet with reason: btullis codfw maintenance
15:46 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on cp[2041-2042].codfw.wmnet with reason: shutdown for PDU upgrade: rack D4
15:46 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on aqs2011.codfw.wmnet with reason: btullis codfw maintenance
15:46 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on aqs2011.codfw.wmnet with reason: btullis codfw maintenance
15:45 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on aqs2010.codfw.wmnet with reason: btullis codfw maintenance
15:45 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on aqs2010.codfw.wmnet with reason: btullis codfw maintenance
15:45 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on aqs2009.codfw.wmnet with reason: btullis codfw maintenance
15:45 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on aqs2009.codfw.wmnet with reason: btullis codfw maintenance
15:34 jbond: remove puppetmaster[12]002 from production
15:30 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kafka-main2004.codfw.wmnet
15:30 jelto@cumin1001: START - Cookbook sre.hosts.remove-downtime for kafka-main2004.codfw.wmnet
15:20 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mc[2051-2052].codfw.wmnet
15:20 jelto@cumin1001: START - Cookbook sre.hosts.remove-downtime for mc[2051-2052].codfw.wmnet
15:17 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mc-gp2003.codfw.wmnet
15:17 jelto@cumin1001: START - Cookbook sre.hosts.remove-downtime for mc-gp2003.codfw.wmnet
15:16 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mc2033.codfw.wmnet
15:16 jelto@cumin1001: START - Cookbook sre.hosts.remove-downtime for mc2033.codfw.wmnet
15:14 _joe_: power off krb2002
15:14 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on krb2002.codfw.wmnet with reason: PDU maintenance
15:13 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on krb2002.codfw.wmnet with reason: PDU maintenance
15:13 _joe_: shutting down rdb2010,puppetmaster2002 for d5 maintenance
15:02 jelto: power off mc2035
15:01 jelto: power off mc2034
15:01 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mc2035.codfw.wmnet with reason: PDU swap
15:01 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on mc2035.codfw.wmnet with reason: PDU swap
15:01 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mc2034.codfw.wmnet with reason: PDU swap
15:01 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on mc2034.codfw.wmnet with reason: PDU swap
14:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2094.codfw.wmnet with reason: PDU Maint (T310146)
14:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2094.codfw.wmnet with reason: PDU Maint (T310146)
14:38 urandom: disabling reserved space on eqiad nodes (RESTBase), /dev/md2 (aka /srv/cassandra/instance-data) -- T314941
14:27 sukhe: power off cp2039, cp2040 for PDU upgrade: rack D
14:27 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore2003.codfw.wmnet with reason: PDU maintenance
14:27 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore2003.codfw.wmnet with reason: PDU maintenance
14:25 jelto: power off mc-gp2003
14:25 jelto: power off mc2033
14:24 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:30:00 on kafka-main2004.codfw.wmnet with reason: PDU swap
14:23 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 4:30:00 on kafka-main2004.codfw.wmnet with reason: PDU swap
14:23 sukhe: depool codfw for PDU upgrade: rack D
14:23 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:30:00 on mc-gp2003.codfw.wmnet with reason: PDU swap
14:23 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 4:30:00 on mc-gp2003.codfw.wmnet with reason: PDU swap
14:23 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:30:00 on mc2033.codfw.wmnet with reason: PDU swap
14:23 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 4:30:00 on mc2033.codfw.wmnet with reason: PDU swap
14:13 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp[2039-2040].codfw.wmnet with reason: shutdown for PDU upgrade: rack D4
14:12 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on dns2002.wikimedia.org with reason: shutdown for PDU upgrade: rack D4
14:12 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on dns2002.wikimedia.org with reason: shutdown for PDU upgrade: rack D4
13:32 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on netmon1003.wikimedia.org with reason: pdu
13:32 filippo@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on netmon1003.wikimedia.org with reason: pdu
13:32 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on logstash2003.codfw.wmnet with reason: pdu
13:31 filippo@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on logstash2003.codfw.wmnet with reason: pdu
13:31 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on logstash2029.codfw.wmnet with reason: pdu
13:31 root@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on logstash2029.codfw.wmnet with reason: pdu
13:30 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: T310146
13:30 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 6 hosts with reason: T310146
13:17 elukey: powering on restbase2027
13:12 elukey: powering on restbase2026
13:12 _joe_: powering on restbase2023
13:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T312863)', diff saved to https://phabricator.wikimedia.org/P32343 and previous config saved to /var/cache/conftool/dbconfig/20220810-130108-ladsgroup.json
13:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
13:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1160.eqiad.wmnet with reason: Maintenance
12:37 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic[2072,2084-2085].codfw.wmnet with reason: T310146
12:37 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic[2072,2084-2085].codfw.wmnet with reason: T310146
12:27 jbond: remove confd from servers that shouldn't have it
11:15 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
10:54 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
10:51 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage
10:37 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
10:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db[2181-2182].codfw.wmnet with reason: D6 PDU maint (T310146)
10:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db[2181-2182].codfw.wmnet with reason: D6 PDU maint (T310146)
10:26 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on maps2010.codfw.wmnet with reason: PDU maintenance
10:26 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on maps2010.codfw.wmnet with reason: PDU maintenance
10:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on restbase[2023,2026-2027].codfw.wmnet with reason: PDU maintenance
10:02 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on restbase[2023,2026-2027].codfw.wmnet with reason: PDU maintenance
09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: D8 PDU Maint (T310146)
09:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 6 hosts with reason: D8 PDU Maint (T310146)
09:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool D8 DBs for PDU maint (T310146)', diff saved to https://phabricator.wikimedia.org/P32341 and previous config saved to /var/cache/conftool/dbconfig/20220810-095059-ladsgroup.json
09:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db[2101,2130,2140].codfw.wmnet,dbproxy2004.codfw.wmnet with reason: D6 PDU maint (T310146)
09:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db[2101,2130,2140].codfw.wmnet,dbproxy2004.codfw.wmnet with reason: D6 PDU maint (T310146)
09:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool D6 dbs (T310146)', diff saved to https://phabricator.wikimedia.org/P32340 and previous config saved to /var/cache/conftool/dbconfig/20220810-093433-ladsgroup.json
09:31 jelto: depool services in codfw for upcoming PDU replacement - T309956
09:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
09:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
09:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
09:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
09:28 jynus: shutdown backup2007 before pdu upgrade T310146
09:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db[2093,2120,2129,2172].codfw.wmnet with reason: D5 PDU maint (T310146)
09:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db[2093,2120,2129,2172].codfw.wmnet with reason: D5 PDU maint (T310146)
08:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
08:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
08:31 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:30:00 on gitlab-runner2004.codfw.wmnet with reason: PDU swap
08:31 root@cumin1001: START - Cookbook sre.hosts.downtime for 8:30:00 on gitlab-runner2004.codfw.wmnet with reason: PDU swap
08:29 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be2028.codfw.wmnet with reason: Trying to fix full /
08:28 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be2028.codfw.wmnet with reason: Trying to fix full /
08:28 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.debug (exit_code=0) for Netbox interface ID cr1-drmrs:xe-0/1/2
08:27 ayounsi@cumin1001: START - Cookbook sre.network.debug for Netbox interface ID cr1-drmrs:xe-0/1/2
08:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P32336 and previous config saved to /var/cache/conftool/dbconfig/20220810-082718-ladsgroup.json
08:25 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.debug (exit_code=99) for Netbox interface ID cr1-drmrs:xe-0/1/2
08:25 ayounsi@cumin1001: START - Cookbook sre.network.debug for Netbox interface ID cr1-drmrs:xe-0/1/2
08:24 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.debug (exit_code=99) for Netbox interface ID cr1-drmrs:xe-0/1/2
08:24 ayounsi@cumin1001: START - Cookbook sre.network.debug for Netbox interface ID cr1-drmrs:xe-0/1/2
05:24 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 18:00:00 on kubernetes[2013-2014].codfw.wmnet with reason: PDU maintenance
05:24 oblivian@cumin1001: START - Cookbook sre.hosts.downtime for 18:00:00 on kubernetes[2013-2014].codfw.wmnet with reason: PDU maintenance
05:19 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 18:00:00 on parse[2016-2020].codfw.wmnet with reason: PDU maintenance
05:19 oblivian@cumin1001: START - Cookbook sre.hosts.downtime for 18:00:00 on parse[2016-2020].codfw.wmnet with reason: PDU maintenance
05:12 _joe_: starting to shut down servers in codfw for the PDU maintenance
05:09 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 18:00:00 on 10 hosts with reason: PDU maintenance
05:09 oblivian@cumin1001: START - Cookbook sre.hosts.downtime for 18:00:00 on 10 hosts with reason: PDU maintenance
05:09 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 18:00:00 on mc-gp2003.codfw.wmnet with reason: PDU maintenance
05:09 oblivian@cumin1001: START - Cookbook sre.hosts.downtime for 18:00:00 on mc-gp2003.codfw.wmnet with reason: PDU maintenance
05:06 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 18:00:00 on mc2033.codfw.wmnet with reason: PDU maintenance
05:06 oblivian@cumin1001: START - Cookbook sre.hosts.downtime for 18:00:00 on mc2033.codfw.wmnet with reason: PDU maintenance
05:05 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 18:00:00 on 7 hosts with reason: PDU maintenance
05:05 oblivian@cumin1001: START - Cookbook sre.hosts.downtime for 18:00:00 on 7 hosts with reason: PDU maintenance
2022-08-09
20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
20:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T312863)', diff saved to https://phabricator.wikimedia.org/P32332 and previous config saved to /var/cache/conftool/dbconfig/20220809-205548-ladsgroup.json
20:51 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs1014.eqiad.wmnet
20:51 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs1014.eqiad.wmnet
20:46 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
20:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P32331 and previous config saved to /var/cache/conftool/dbconfig/20220809-204042-ladsgroup.json
20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P32330 and previous config saved to /var/cache/conftool/dbconfig/20220809-202536-ladsgroup.json
20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T312863)', diff saved to https://phabricator.wikimedia.org/P32329 and previous config saved to /var/cache/conftool/dbconfig/20220809-201030-ladsgroup.json
19:57 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on wdqs1014.eqiad.wmnet with reason: T314890
19:57 bking@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on wdqs1014.eqiad.wmnet with reason: T314890
19:56 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on wdqs1016.eqiad.wmnet with reason: T314890
19:56 bking@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on wdqs1016.eqiad.wmnet with reason: T314890
19:55 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on wdqs1015.eqiad.wmnet with reason: T314890
19:55 bking@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on wdqs1015.eqiad.wmnet with reason: T314890
15:27 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1058.eqiad.wmnet with OS bullseye
15:08 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1058.eqiad.wmnet with reason: host reimage
15:05 bking@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1058.eqiad.wmnet with reason: host reimage
14:59 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
14:59 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
m: finished running 'homer "status:active" commit "netmon: Add the netmon1003 host as a syslog destination"' in the cumin1001 host. Homer reported no errors.
14:54 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
14:50 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1058.eqiad.wmnet with OS bullseye
06:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1100 to s5 primary and set section read-write T314370', diff saved to https://phabricator.wikimedia.org/P32322 and previous config saved to /var/cache/conftool/dbconfig/20220809-060159-ladsgroup.json
06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s5 eqiad as read-only for maintenance - T314370', diff saved to https://phabricator.wikimedia.org/P32321 and previous config saved to /var/cache/conftool/dbconfig/20220809-060105-ladsgroup.json
06:00 Amir1: Starting s5 eqiad failover from db1130 to db1100 - T314370
05:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1100 with weight 0 T314370', diff saved to https://phabricator.wikimedia.org/P32320 and previous config saved to /var/cache/conftool/dbconfig/20220809-051251-ladsgroup.json
05:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 22 hosts with reason: Primary switchover s5 T314370
05:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 22 hosts with reason: Primary switchover s5 T314370
02:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T312863)', diff saved to https://phabricator.wikimedia.org/P32318 and previous config saved to /var/cache/conftool/dbconfig/20220809-023113-ladsgroup.json
02:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: Maintenance
02:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1148.eqiad.wmnet with reason: Maintenance
02:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T312863)', diff saved to https://phabricator.wikimedia.org/P32317 and previous config saved to /var/cache/conftool/dbconfig/20220809-023052-ladsgroup.json
02:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P32316 and previous config saved to /var/cache/conftool/dbconfig/20220809-021546-ladsgroup.json
02:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P32315 and previous config saved to /var/cache/conftool/dbconfig/20220809-020040-ladsgroup.json
01:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T312863)', diff saved to https://phabricator.wikimedia.org/P32314 and previous config saved to /var/cache/conftool/dbconfig/20220809-014534-ladsgroup.json
2022-08-08
16:38 bking@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1085.eqiad.wmnet with reason: host reimage
16:26 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1085.eqiad.wmnet with OS bullseye
16:24 bking@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host elastic1085.eqiad.wmnet with OS bullseye
16:19 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1085.eqiad.wmnet with reason: host reimage
16:16 bking@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1085.eqiad.wmnet with reason: host reimage
16:16 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reimage (bullseye upgrade) - ryankemper@cumin1001 - T289135
16:14 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
16:12 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
16:10 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
16:09 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reimage (bullseye upgrade) - ryankemper@cumin1001 - T289135
16:04 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1085.eqiad.wmnet with OS bullseye
16:00 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1084.eqiad.wmnet with OS bullseye
15:58 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster reimage (bullseye upgrade) - ryankemper@cumin1001 - T289135
15:47 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1084.eqiad.wmnet with reason: host reimage
15:46 sukhe: upload reprepro -C main include bullseye-wikimedia python-pynetbox_6.6.0-1+wmf11u1_amd64.changes
15:45 bking@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1084.eqiad.wmnet with reason: host reimage
15:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es2021.codfw.wmnet with reason: Maint
15:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es2021.codfw.wmnet with reason: Maint
15:32 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1084.eqiad.wmnet with OS bullseye
14:59 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
14:55 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
14:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp5001.eqsin.wmnet with reason: depooled: faulty DIMM: T314256
14:46 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on cp5001.eqsin.wmnet with reason: depooled: faulty DIMM: T314256
14:34 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
14:11 kevinbazira@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
12:06 urbanecm@deploy1002: Synchronized php-1.39.0-wmf.23/extensions/GrowthExperiments/: 3eaf155: MentorTools: Do not use MentorWeightManager (T314362) (duration: 03m 31s)
10:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.decommission for hosts db2079.codfw.wmnet
10:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db2079.codfw.wmnet with reason: Decom
10:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on db2079.codfw.wmnet with reason: Decom
08:41 jbond: deploy libtirpc update
07:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T312863)', diff saved to https://phabricator.wikimedia.org/P32310 and previous config saved to /var/cache/conftool/dbconfig/20220808-075723-ladsgroup.json
07:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
07:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
07:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T312863)', diff saved to https://phabricator.wikimedia.org/P32309 and previous config saved to /var/cache/conftool/dbconfig/20220808-075702-ladsgroup.json
07:53 godog: grow sda/sdb 3 by 100G on thanos-be2001 - T314275
07:50 godog: grow sda/sdb 3 by 100G on thanos-be1004 - T314275
07:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P32308 and previous config saved to /var/cache/conftool/dbconfig/20220808-074156-ladsgroup.json
07:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P32307 and previous config saved to /var/cache/conftool/dbconfig/20220808-072650-ladsgroup.json
07:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T312863)', diff saved to https://phabricator.wikimedia.org/P32306 and previous config saved to /var/cache/conftool/dbconfig/20220808-071144-ladsgroup.json
2022-08-07
13:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T312863)', diff saved to https://phabricator.wikimedia.org/P32305 and previous config saved to /var/cache/conftool/dbconfig/20220807-135204-ladsgroup.json
13:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1142.eqiad.wmnet with reason: Maintenance
13:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1142.eqiad.wmnet with reason: Maintenance
13:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T312863)', diff saved to https://phabricator.wikimedia.org/P32304 and previous config saved to /var/cache/conftool/dbconfig/20220807-135143-ladsgroup.json
13:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P32303 and previous config saved to /var/cache/conftool/dbconfig/20220807-133637-ladsgroup.json
13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P32302 and previous config saved to /var/cache/conftool/dbconfig/20220807-132131-ladsgroup.json
13:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T312863)', diff saved to https://phabricator.wikimedia.org/P32301 and previous config saved to /var/cache/conftool/dbconfig/20220807-130625-ladsgroup.json
12:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T312863)', diff saved to https://phabricator.wikimedia.org/P32300 and previous config saved to /var/cache/conftool/dbconfig/20220807-120610-ladsgroup.json
12:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
12:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
12:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T312863)', diff saved to https://phabricator.wikimedia.org/P32299 and previous config saved to /var/cache/conftool/dbconfig/20220807-120549-ladsgroup.json
11:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P32298 and previous config saved to /var/cache/conftool/dbconfig/20220807-115043-ladsgroup.json
11:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P32297 and previous config saved to /var/cache/conftool/dbconfig/20220807-113537-ladsgroup.json
11:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T312863)', diff saved to https://phabricator.wikimedia.org/P32296 and previous config saved to /var/cache/conftool/dbconfig/20220807-112031-ladsgroup.json
2022-08-06
17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T312863)', diff saved to https://phabricator.wikimedia.org/P32295 and previous config saved to /var/cache/conftool/dbconfig/20220806-175916-ladsgroup.json
17:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
17:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1149.eqiad.wmnet with reason: Maintenance
14:23 jbond: upload encore-clojure to puppet7 component
14:17 jbond: upload truss-clojure to puppet7 component
14:13 jbond: upload structured-logging-clojure to puppet7 component
14:06 jbond: upload murphy-clojure to puppet7 component
13:57 jbond: upload logstash-logback-encoder-7.2 to puppet7 component
13:49 jbond: upload kitchensink-clojure to puppet7 component
13:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool hosts with fragile power supply (T314559 T314628)', diff saved to https://phabricator.wikimedia.org/P32292 and previous config saved to /var/cache/conftool/dbconfig/20220805-132709-ladsgroup.json
13:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
13:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
13:09 sukhe: repool codfw
13:02 jbond: upload honeysql-clojure to puppet7 component
12:53 _joe_: progressive repool of services in codfw
12:24 moritzm: installing nano bugfix updates from bullseye point release
11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repool after PDU maint on D3 (T310146)', diff saved to https://phabricator.wikimedia.org/P32291 and previous config saved to /var/cache/conftool/dbconfig/20220805-113729-ladsgroup.json
11:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repool after PDU maint on C6 (T310145)', diff saved to https://phabricator.wikimedia.org/P32290 and previous config saved to /var/cache/conftool/dbconfig/20220805-113555-ladsgroup.json
11:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repool after PDU maint on C5 (T310145)', diff saved to https://phabricator.wikimedia.org/P32289 and previous config saved to /var/cache/conftool/dbconfig/20220805-113436-ladsgroup.json
09:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 12 hosts with reason: Maintenance
09:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 12 hosts with reason: Maintenance
09:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
09:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2110.codfw.wmnet with reason: Maintenance
00:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8 days, 0:00:00 on gerrit2001.wikimedia.org with reason: decom, replaced by gerrit2002
00:53 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 8 days, 0:00:00 on gerrit2001.wikimedia.org with reason: decom, replaced by gerrit2002
00:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit2002.wikimedia.org
00:53 dzahn@cumin1001: START - Cookbook sre.hosts.remove-downtime for gerrit2002.wikimedia.org
00:52 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8 days, 0:00:00 on gerrit2002.wikimedia.org with reason: decom, replaced by gerrit2002
00:52 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 8 days, 0:00:00 on gerrit2002.wikimedia.org with reason: decom, replaced by gerrit2002
00:18 mutante: restarting gerrit for config change - removing old replica T313250
2022-08-04
23:07 mutante: switching gerrit-replica.wikimedia.org to new machine gerrit2002, dropping gerrit-replica-new.wikimedia.org T313250
18:04 btullis@deploy1002: Started deploy [analytics/refinery@2553288]: Regular analytics weekly train [analytics/refinery@2553288]
17:55 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2009.codfw.wmnet with reason: shutdown for PDU upgrade
17:55 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2009.codfw.wmnet with reason: shutdown for PDU upgrade
17:43 mutante: maps2008 - downtime and shutdown for D3 maintenance
17:42 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on maps2008.codfw.wmnet with reason: codfw reboots
17:42 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on maps2008.codfw.wmnet with reason: codfw reboots
17:42 mutante: thumbor2006 - downtime and shutdown for D3 maintenance
17:42 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on thumbor2006.codfw.wmnet with reason: codfw reboots
17:41 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on thumbor2006.codfw.wmnet with reason: codfw reboots
17:39 mutante: mw2386 - systemctl reset-failed
17:31 mutante: phab2001 - systemctl restart ssh-phab, attempting to clear Icinga pybal alerts, related to reboots
17:30 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on dns2001.wikimedia.org with reason: shutdown for PDU upgrade
17:30 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on dns2001.wikimedia.org with reason: shutdown for PDU upgrade
17:29 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dns2001.wikimedia.org with reason: shutdown for PDU upgrade
17:29 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on dns2001.wikimedia.org with reason: shutdown for PDU upgrade
17:16 Emperor: shutdown of moss-fe2002.codfw.wmnet,ms-be20[37,38,43,61,65,69].codfw.wmnet,ms-fe2012.codfw.wmnet,thanos-fe2003.codfw.wmnet for power work T310146
17:16 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cp[2035-2036].codfw.wmnet with reason: shutdown for PDU upgrade
17:15 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on cp[2035-2036].codfw.wmnet with reason: shutdown for PDU upgrade
17:15 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 9 hosts with reason: PDU work
17:15 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 9 hosts with reason: PDU work
16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool D3 for PDU maint', diff saved to https://phabricator.wikimedia.org/P32286 and previous config saved to /var/cache/conftool/dbconfig/20220804-163037-ladsgroup.json
16:06 Emperor: shutdown ms-be20[39,49,54].codfw.wmnet,thanos-be2003 for PDU swap T310145
16:03 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be[2036,2049,2054].codfw.wmnet,thanos-be2003.codfw.wmnet with reason: PDU work
16:02 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be[2036,2049,2054].codfw.wmnet,thanos-be2003.codfw.wmnet with reason: PDU work
16:02 ebysans@deploy1002: Started deploy [analytics/refinery@2553288]: Regular analytics weekly train [analytics/refinery@2553288]
15:50 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2048.codfw.wmnet with reason: T310145
15:50 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2048.codfw.wmnet with reason: T310145
15:43 damilare: payments-wiki upgraded from 0e4a5b3b to 6880236d
15:37 _joe_: uncordoning ml-serve200{1,6}
15:27 sukhe: power off cp2037,cp2038: PDU upgrade
15:25 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:30:00 on phab2001.codfw.wmnet with reason: PDU swap
15:25 jelto: power off phab2001
15:25 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 3:30:00 on phab2001.codfw.wmnet with reason: PDU swap
15:25 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cp[2037-2038].codfw.wmnet with reason: shutdown for PDU upgrade
15:24 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on cp[2037-2038].codfw.wmnet with reason: shutdown for PDU upgrade
15:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db[2116,2127,2167-2168].codfw.wmnet,es2022.codfw.wmnet with reason: Maintenance (T310145)
15:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db[2116,2127,2167-2168].codfw.wmnet,es2022.codfw.wmnet with reason: Maintenance (T310145)
15:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool C6 for PDU maint (T310145)', diff saved to https://phabricator.wikimedia.org/P32285 and previous config saved to /var/cache/conftool/dbconfig/20220804-151958-ladsgroup.json
15:16 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
15:16 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on restbase[2016,2020,2025].codfw.wmnet with reason: PDU maintenance
15:16 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on restbase[2016,2020,2025].codfw.wmnet with reason: PDU maintenance
15:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db[2114,2126,2166].codfw.wmnet with reason: Maintenance (T310145)
15:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db[2114,2126,2166].codfw.wmnet with reason: Maintenance (T310145)
13:58 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be[2058,2064].codfw.wmnet with reason: PDU work
13:58 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be[2058,2064].codfw.wmnet with reason: PDU work
10:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2017.codfw.wmnet with reason: host reimage
10:49 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2017.codfw.wmnet with reason: host reimage
10:30 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2017.codfw.wmnet with OS bullseye
10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on ganeti2017.codfw.wmnet with reason: Remove node for eventual reimage, T311686
10:27 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on ganeti2017.codfw.wmnet with reason: Remove node for eventual reimage, T311686
10:19 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 9:00:00 on 32 hosts with reason: PDU swap
10:19 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 9:00:00 on 32 hosts with reason: PDU swap
10:03 Lucas_WMDE: stashbot temporarily parted and lost several logs between 9:42 UTC and 9:49 UTC; mainly mwdebug helmfile start/done, also ayounsi sre.deploy.python-code cookbook to cumin1001, cumin2002; see IRC logs
10:02 ayounsi@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update requirements + wmf-netbox - ayounsi@cumin1001
09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Add db2177 to s3 T311494', diff saved to https://phabricator.wikimedia.org/P32282 and previous config saved to /var/cache/conftool/dbconfig/20220804-093704-marostegui.json
09:27 ayounsi@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: wmf-netbox.py update - ayounsi@cumin1001
09:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2089.codfw.wmnet
09:26 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:26 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 0614a39: testwiki: Growth: Switch to structured mentor list (T310905) (duration: 03m 38s)
09:25 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: wmf-netbox.py update - ayounsi@cumin1001
08:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1132, db111, db1127, db1143', diff saved to https://phabricator.wikimedia.org/P32281 and previous config saved to /var/cache/conftool/dbconfig/20220804-081958-root.json
08:19 jelto: power off mc2047 and mc2048
08:16 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:45:00 on mc[2047-2048].codfw.wmnet with reason: PDU swap
08:16 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 10:45:00 on mc[2047-2048].codfw.wmnet with reason: PDU swap
08:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch instance to plain disks, T311686
08:04 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch instance to plain disks, T311686
07:55 marostegui: Remove grants for 208.80.154.160/208.80.155.109 T314528
07:49 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db2089 from dbctl T313799', diff saved to https://phabricator.wikimedia.org/P32280 and previous config saved to /var/cache/conftool/dbconfig/20220804-074957-marostegui.json
07:47 godog: grow sda/sdb 3 by 100G on thanos-be2002 - T314275
07:46 godog: grow sda/sdb 3 by 100G on thanos-be1003 - T314275
07:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
07:29 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
07:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
07:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db[2135,2160].codfw.wmnet with reason: codfw pdu maintenance
07:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db[2135,2160].codfw.wmnet with reason: codfw pdu maintenance
07:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
07:09 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2030.codfw.wmnet to cluster codfw and group A
07:09 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to cluster codfw and group A
07:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es[2023-2025].codfw.wmnet with reason: codfw pdu maintenance
07:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es[2023-2025].codfw.wmnet with reason: codfw pdu maintenance
05:36 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2030.codfw.wmnet with reason: host reimage
05:22 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2030.codfw.wmnet with OS bullseye
05:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on ganeti2030.codfw.wmnet with reason: Remove node for eventual reimage, T311686
05:16 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on ganeti2030.codfw.wmnet with reason: Remove node for eventual reimage, T311686
01:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T312972)', diff saved to https://phabricator.wikimedia.org/P32278 and previous config saved to /var/cache/conftool/dbconfig/20220804-012341-marostegui.json
01:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P32277 and previous config saved to /var/cache/conftool/dbconfig/20220804-010834-marostegui.json
00:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P32276 and previous config saved to /var/cache/conftool/dbconfig/20220804-005328-marostegui.json
00:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T312972)', diff saved to https://phabricator.wikimedia.org/P32275 and previous config saved to /var/cache/conftool/dbconfig/20220804-003822-marostegui.json
00:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T312972)', diff saved to https://phabricator.wikimedia.org/P32274 and previous config saved to /var/cache/conftool/dbconfig/20220804-003611-marostegui.json
00:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
00:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
00:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T312972)', diff saved to https://phabricator.wikimedia.org/P32273 and previous config saved to /var/cache/conftool/dbconfig/20220804-003549-marostegui.json
00:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P32272 and previous config saved to /var/cache/conftool/dbconfig/20220804-002043-marostegui.json
00:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P32271 and previous config saved to /var/cache/conftool/dbconfig/20220804-000536-marostegui.json
00:03 mutante: gerrit - service restart to deploy config change to add second replica T313250
00:01 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit.wikimedia.org with reason: service restart
00:00 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit.wikimedia.org with reason: service restart
00:00 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1001.wikimedia.org with reason: service restart
2022-08-03
23:59 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit1001.wikimedia.org with reason: service restart
23:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T312972)', diff saved to https://phabricator.wikimedia.org/P32270 and previous config saved to /var/cache/conftool/dbconfig/20220803-235030-marostegui.json
22:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T312972)', diff saved to https://phabricator.wikimedia.org/P32269 and previous config saved to /var/cache/conftool/dbconfig/20220803-225015-marostegui.json
22:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
22:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
22:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 9 hosts with reason: Maintenance
22:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 9 hosts with reason: Maintenance
22:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
22:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
22:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
22:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
22:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T312972)', diff saved to https://phabricator.wikimedia.org/P32268 and previous config saved to /var/cache/conftool/dbconfig/20220803-224827-marostegui.json
22:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P32267 and previous config saved to /var/cache/conftool/dbconfig/20220803-223321-marostegui.json
22:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P32266 and previous config saved to /var/cache/conftool/dbconfig/20220803-221815-marostegui.json
22:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T312972)', diff saved to https://phabricator.wikimedia.org/P32265 and previous config saved to /var/cache/conftool/dbconfig/20220803-220309-marostegui.json
22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T312972)', diff saved to https://phabricator.wikimedia.org/P32264 and previous config saved to /var/cache/conftool/dbconfig/20220803-220057-marostegui.json
22:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
22:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
22:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
22:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T312972)', diff saved to https://phabricator.wikimedia.org/P32263 and previous config saved to /var/cache/conftool/dbconfig/20220803-220007-marostegui.json
21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P32262 and previous config saved to /var/cache/conftool/dbconfig/20220803-214501-marostegui.json
21:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P32261 and previous config saved to /var/cache/conftool/dbconfig/20220803-212955-marostegui.json
21:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T312972)', diff saved to https://phabricator.wikimedia.org/P32260 and previous config saved to /var/cache/conftool/dbconfig/20220803-211449-marostegui.json
21:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T312972)', diff saved to https://phabricator.wikimedia.org/P32259 and previous config saved to /var/cache/conftool/dbconfig/20220803-211237-marostegui.json
21:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
21:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
21:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 (T312972)', diff saved to https://phabricator.wikimedia.org/P32258 and previous config saved to /var/cache/conftool/dbconfig/20220803-211216-marostegui.json
21:03 ejegg: updated standalone SmashPig deployment from 8e8f0017 to 9b97ea15
20:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P32257 and previous config saved to /var/cache/conftool/dbconfig/20220803-205710-marostegui.json
20:43 urbanecm@deploy1002: Synchronized php-1.39.0-wmf.23/extensions/VisualEditor/includes/VisualEditorParsoidClient.php: a804fe1: Update call to PageConfigFactory::create to use new signature (T314523) (duration: 03m 25s)
20:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P32256 and previous config saved to /var/cache/conftool/dbconfig/20220803-204204-marostegui.json
20:39 urbanecm@deploy1002: sync-file aborted: a804fe1: Update call to PageConfigFactory::create to use new signature (T314523) (duration: 00m 00s)
20:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 (T312972)', diff saved to https://phabricator.wikimedia.org/P32255 and previous config saved to /var/cache/conftool/dbconfig/20220803-202658-marostegui.json
20:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1122 (T312972)', diff saved to https://phabricator.wikimedia.org/P32254 and previous config saved to /var/cache/conftool/dbconfig/20220803-202146-marostegui.json
20:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
20:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1122.eqiad.wmnet with reason: Maintenance
20:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T312972)', diff saved to https://phabricator.wikimedia.org/P32253 and previous config saved to /var/cache/conftool/dbconfig/20220803-202125-marostegui.json
20:08 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) logstash2032.codfw.wmnet on all recursors
20:08 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache logstash2032.codfw.wmnet on all recursors
20:08 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:07 mutante: gerrit - adding second replica T313250
20:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P32252 and previous config saved to /var/cache/conftool/dbconfig/20220803-200619-marostegui.json
19:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P32251 and previous config saved to /var/cache/conftool/dbconfig/20220803-195113-marostegui.json
19:40 ryankemper: T314078 Forgot to mention, restart is at `ryankemper@cumin1001` tmux session `codfw_restarts`
19:39 ryankemper: T314078 Rolling upgrade of codfw hosts; after this all of eqiad/codfw will have the new plugin version and we can resume the `search-loader` instances: `sudo -E cookbook sre.elasticsearch.rolling-operation search_codfw "codfw cluster plugin upgrade" --upgrade --nodes-per-run 3 --start-datetime 2022-08-03T19:38:10 --task-id T314078`
19:38 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster plugin upgrade - ryankemper@cumin1001 - T314078
19:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T312972)', diff saved to https://phabricator.wikimedia.org/P32250 and previous config saved to /var/cache/conftool/dbconfig/20220803-193607-marostegui.json
19:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T312972)', diff saved to https://phabricator.wikimedia.org/P32249 and previous config saved to /var/cache/conftool/dbconfig/20220803-193354-marostegui.json
19:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
19:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
19:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T312972)', diff saved to https://phabricator.wikimedia.org/P32248 and previous config saved to /var/cache/conftool/dbconfig/20220803-193334-marostegui.json
19:25 mutante: gerrit1001 - rsyncing /var/lib/gerrit/review_site/ over to gerrit2002 815401
19:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P32247 and previous config saved to /var/cache/conftool/dbconfig/20220803-191828-marostegui.json
19:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P32246 and previous config saved to /var/cache/conftool/dbconfig/20220803-190321-marostegui.json
18:56 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kubernetes2011.codfw.wmnet
18:56 rzl@cumin1001: START - Cookbook sre.hosts.remove-downtime for kubernetes2011.codfw.wmnet
18:12 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.23 refs T308076
17:58 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kubestage2002.codfw.wmnet
17:58 rzl@cumin1001: START - Cookbook sre.hosts.remove-downtime for kubestage2002.codfw.wmnet
17:57 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mc[2025-2026].codfw.wmnet
17:57 rzl@cumin1001: START - Cookbook sre.hosts.remove-downtime for mc[2025-2026].codfw.wmnet
17:57 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for elastic2044.codfw.wmnet
17:57 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for elastic2044.codfw.wmnet
17:56 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for elastic2043.codfw.wmnet
17:56 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for elastic2043.codfw.wmnet
17:55 ottomata: increasing partitions from 5 to 6 for *.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite topics in Kafka main-eqiad and main-codfw - T314426
17:55 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be2055.codfw.wmnet
17:55 mvernon@cumin1001: START - Cookbook sre.hosts.remove-downtime for ms-be2055.codfw.wmnet
17:00 btullis@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
16:48 Emperor: shutdown moss-fe2001.codfw.wmnet,ms-fe2011.codfw.wmnet,ms-be20[34,35,42,48,55,68].codfw.wmnet PDU work T310145
16:47 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 8 hosts with reason: PDU work
16:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on gerrit2002.wikimedia.org with reason: in setup / flapping
16:47 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 8 hosts with reason: PDU work
16:47 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on gerrit2002.wikimedia.org with reason: in setup / flapping
16:46 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ms-be[2033,2047].codfw.wmnet,thanos-be2002.codfw.wmnet
16:46 mvernon@cumin1001: START - Cookbook sre.hosts.remove-downtime for ms-be[2033,2047].codfw.wmnet,thanos-be2002.codfw.wmnet
16:40 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mc2046.codfw.wmnet
16:40 jayme@cumin1001: START - Cookbook sre.hosts.remove-downtime for mc2046.codfw.wmnet
16:39 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 10 hosts
16:39 jayme@cumin1001: START - Cookbook sre.hosts.remove-downtime for 10 hosts
16:38 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mc2023.codfw.wmnet
16:38 jelto@cumin1001: START - Cookbook sre.hosts.remove-downtime for mc2023.codfw.wmnet
16:37 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on gitlab-runner2002.codfw.wmnet with reason: PDU swap
16:37 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on gitlab-runner2002.codfw.wmnet with reason: PDU swap
16:35 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on mc[2025-2026].codfw.wmnet with reason: PDU swap
16:35 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on mc[2025-2026].codfw.wmnet with reason: PDU swap
16:32 jelto: power off mc2025-2026
16:31 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for rdb2008.codfw.wmnet
16:30 jayme@cumin1001: START - Cookbook sre.hosts.remove-downtime for rdb2008.codfw.wmnet
16:28 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
16:28 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for kubernetes[2009-2010,2020].codfw.wmnet
16:27 jayme@cumin1001: START - Cookbook sre.hosts.remove-downtime for kubernetes[2009-2010,2020].codfw.wmnet
16:11 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 12 hosts
16:11 jelto@cumin1001: START - Cookbook sre.hosts.remove-downtime for 12 hosts
16:08 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 15 hosts
16:08 jayme@cumin1001: START - Cookbook sre.hosts.remove-downtime for 15 hosts
16:08 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs[2005-2008].codfw.wmnet
16:08 mvernon@cumin1001: START - Cookbook sre.hosts.remove-downtime for aqs[2005-2008].codfw.wmnet
15:59 Emperor: shutdown ms-be20[33,47],thanos-be2002 prior to PDU work T310070
15:58 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be[2033,2047].codfw.wmnet,thanos-be2002.codfw.wmnet with reason: PDU work
15:58 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be[2033,2047].codfw.wmnet,thanos-be2002.codfw.wmnet with reason: PDU work
15:52 jelto: pooling mw2259-2270 again
15:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T312972)', diff saved to https://phabricator.wikimedia.org/P32242 and previous config saved to /var/cache/conftool/dbconfig/20220803-154515-marostegui.json
15:38 vgutierrez: clearing ats-be cache on cp6008 - T309651
15:36 elukey: powercycle kafka-logging2003 - not responsive to serial console
15:36 urbanecm@deploy1002: Synchronized php-1.39.0-wmf.22/extensions/GrowthExperiments/includes/NewcomerTasks/AddImage/ServiceImageRecommendationProvider.php: 4438957: ServiceImageRecommendationProvider: Add extra logging when no JSON response received (T313973) (duration: 03m 04s)
15:35 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on maps2009.codfw.wmnet with reason: PDU maintenance
15:35 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on maps2009.codfw.wmnet with reason: PDU maintenance
15:30 vgutierrez: clearing ats-be cache on cp6016 - T309651
15:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P32241 and previous config saved to /var/cache/conftool/dbconfig/20220803-153009-marostegui.json
15:24 jayme@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) _etcd._tcp.eqsin.wmnet on all recursors
15:24 jayme@cumin1001: START - Cookbook sre.dns.wipe-cache _etcd._tcp.eqsin.wmnet on all recursors
15:24 jayme@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) _etcd._tcp.ulsfo.wmnet on all recursors
15:24 jayme@cumin1001: START - Cookbook sre.dns.wipe-cache _etcd._tcp.ulsfo.wmnet on all recursors
15:24 jayme@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) _etcd._tcp.codfw.wmnet on all recursors
15:24 jayme@cumin1001: START - Cookbook sre.dns.wipe-cache _etcd._tcp.codfw.wmnet on all recursors
15:19 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2030.codfw.wmnet with reason: T310070
15:19 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2030.codfw.wmnet with reason: T310070
15:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P32240 and previous config saved to /var/cache/conftool/dbconfig/20220803-151502-marostegui.json
15:10 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for conf2004.codfw.wmnet
15:10 jayme@cumin1001: START - Cookbook sre.hosts.remove-downtime for conf2004.codfw.wmnet
15:04 jelto: power off mc2023
14:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T312972)', diff saved to https://phabricator.wikimedia.org/P32239 and previous config saved to /var/cache/conftool/dbconfig/20220803-145956-marostegui.json
14:59 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on mc2023.codfw.wmnet with reason: PDU swap
14:59 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on mc2023.codfw.wmnet with reason: PDU swap
14:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T312972)', diff saved to https://phabricator.wikimedia.org/P32238 and previous config saved to /var/cache/conftool/dbconfig/20220803-145849-marostegui.json
14:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
14:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
14:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109 (T312972)', diff saved to https://phabricator.wikimedia.org/P32237 and previous config saved to /var/cache/conftool/dbconfig/20220803-145828-marostegui.json
14:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109', diff saved to https://phabricator.wikimedia.org/P32236 and previous config saved to /var/cache/conftool/dbconfig/20220803-144322-marostegui.json
14:34 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2029.codfw.wmnet with reason: T310070
14:33 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2029.codfw.wmnet with reason: T310070
14:32 Emperor: shutdown aqs200[5-8] prior to PDU work T310070
14:31 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on aqs[2005-2008].codfw.wmnet with reason: PDU work
14:31 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on thumbor[2003-2004].codfw.wmnet with reason: PDU swap
14:31 mvernon@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on aqs[2005-2008].codfw.wmnet with reason: PDU work
14:31 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on thumbor[2003-2004].codfw.wmnet with reason: PDU swap
14:28 jelto: power off thumbor2003 and thumbor2004
14:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109', diff saved to https://phabricator.wikimedia.org/P32235 and previous config saved to /var/cache/conftool/dbconfig/20220803-142816-marostegui.json
14:27 moritzm: upgrading ganeti/esams to Ganeti 3.0.2 T312637
14:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109 (T312972)', diff saved to https://phabricator.wikimedia.org/P32234 and previous config saved to /var/cache/conftool/dbconfig/20220803-141310-marostegui.json
14:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1109 (T312972)', diff saved to https://phabricator.wikimedia.org/P32233 and previous config saved to /var/cache/conftool/dbconfig/20220803-141103-marostegui.json
14:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1109.eqiad.wmnet with reason: Maintenance
14:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1109.eqiad.wmnet with reason: Maintenance
14:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T312972)', diff saved to https://phabricator.wikimedia.org/P32232 and previous config saved to /var/cache/conftool/dbconfig/20220803-141042-marostegui.json
14:06 moritzm: installing freetype security updates on bullseye
13:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P32231 and previous config saved to /var/cache/conftool/dbconfig/20220803-135536-marostegui.json
13:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P32230 and previous config saved to /var/cache/conftool/dbconfig/20220803-134030-marostegui.json
13:30 moritzm: installing Java 8 security updates for Buster
13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T312972)', diff saved to https://phabricator.wikimedia.org/P32229 and previous config saved to /var/cache/conftool/dbconfig/20220803-132524-marostegui.json
13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T312972)', diff saved to https://phabricator.wikimedia.org/P32228 and previous config saved to /var/cache/conftool/dbconfig/20220803-131916-marostegui.json
13:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
13:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
13:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T312972)', diff saved to https://phabricator.wikimedia.org/P32227 and previous config saved to /var/cache/conftool/dbconfig/20220803-131855-marostegui.json
13:18 sukhe: depool codfw for PDU upgrade: CR 819798
13:05 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2044.codfw.wmnet with reason: T310070
13:05 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2044.codfw.wmnet with reason: T310070
13:04 pt1979@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P32226 and previous config saved to /var/cache/conftool/dbconfig/20220803-130348-marostegui.json
12:59 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2043.codfw.wmnet with reason: T310070
12:59 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2043.codfw.wmnet with reason: T310070
12:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P32224 and previous config saved to /var/cache/conftool/dbconfig/20220803-124842-marostegui.json
12:40 moritzm: uploaded openjdk-8 8u342-b07-1~deb10u1 to component/jdk8 for buster-wikimedia (rebuild of latest Java 8 security update)
12:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T312972)', diff saved to https://phabricator.wikimedia.org/P32223 and previous config saved to /var/cache/conftool/dbconfig/20220803-123336-marostegui.json
12:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T312972)', diff saved to https://phabricator.wikimedia.org/P32222 and previous config saved to /var/cache/conftool/dbconfig/20220803-122929-marostegui.json
12:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
12:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
12:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
12:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
12:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T312972)', diff saved to https://phabricator.wikimedia.org/P32221 and previous config saved to /var/cache/conftool/dbconfig/20220803-122819-marostegui.json
12:16 ebysans@deploy1002: Started deploy [airflow-dags/analytics@614f7b2]: (no justification provided)
12:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P32220 and previous config saved to /var/cache/conftool/dbconfig/20220803-121313-marostegui.json
11:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P32219 and previous config saved to /var/cache/conftool/dbconfig/20220803-115807-marostegui.json
11:57 marostegui@cumin1001: dbctl commit (dc=all): 'Add db2176 to s1 T311494', diff saved to https://phabricator.wikimedia.org/P32218 and previous config saved to /var/cache/conftool/dbconfig/20220803-115706-marostegui.json
11:49 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cumin2002.codfw.wmnet with reason: PDU maintenance, T310145
11:49 root@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cumin2002.codfw.wmnet with reason: PDU maintenance, T310145
11:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T312972)', diff saved to https://phabricator.wikimedia.org/P32217 and previous config saved to /var/cache/conftool/dbconfig/20220803-114301-marostegui.json
11:17 _joe_: depooling codfw services from all traffic
10:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2011.codfw.wmnet to cluster codfw and group C
10:53 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2011.codfw.wmnet to cluster codfw and group C
10:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2011.codfw.wmnet
10:47 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on kubestage2002.codfw.wmnet with reason: PDU swap
10:46 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on kubestage2002.codfw.wmnet with reason: PDU swap
10:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T312972)', diff saved to https://phabricator.wikimedia.org/P32216 and previous config saved to /var/cache/conftool/dbconfig/20220803-104246-marostegui.json
10:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
10:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
10:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T312972)', diff saved to https://phabricator.wikimedia.org/P32215 and previous config saved to /var/cache/conftool/dbconfig/20220803-104224-marostegui.json
10:41 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2011.codfw.wmnet
10:38 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on restbase[2014-2015,2021-2022].codfw.wmnet with reason: PDU maintenance
10:38 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on restbase[2014-2015,2021-2022].codfw.wmnet with reason: PDU maintenance
10:30 oblivian@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) proton.discovery.wmnet on all recursors
10:30 oblivian@cumin1001: START - Cookbook sre.dns.wipe-cache proton.discovery.wmnet on all recursors
10:29 oblivian@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mathoid.discovery.wmnet on all recursors
10:29 oblivian@cumin1001: START - Cookbook sre.dns.wipe-cache mathoid.discovery.wmnet on all recursors
10:27 oblivian@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) proton.discovery.wmnet on all recursors
10:27 oblivian@cumin1001: START - Cookbook sre.dns.wipe-cache proton.discovery.wmnet on all recursors
10:27 oblivian@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mathoid.discovery.wmnet on all recursors
10:27 oblivian@cumin1001: START - Cookbook sre.dns.wipe-cache mathoid.discovery.wmnet on all recursors
10:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P32213 and previous config saved to /var/cache/conftool/dbconfig/20220803-102718-marostegui.json
10:14 oblivian@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) proton.discovery.wmnet on all recursors
10:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2011.codfw.wmnet with OS bullseye
10:14 oblivian@cumin1001: START - Cookbook sre.dns.wipe-cache proton.discovery.wmnet on all recursors
10:14 oblivian@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mathoid.discovery.wmnet on all recursors
10:14 oblivian@cumin1001: START - Cookbook sre.dns.wipe-cache mathoid.discovery.wmnet on all recursors
10:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P32212 and previous config saved to /var/cache/conftool/dbconfig/20220803-101212-marostegui.json
09:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T312972)', diff saved to https://phabricator.wikimedia.org/P32211 and previous config saved to /var/cache/conftool/dbconfig/20220803-095706-marostegui.json
09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2011.codfw.wmnet with reason: host reimage
09:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T312972)', diff saved to https://phabricator.wikimedia.org/P32210 and previous config saved to /var/cache/conftool/dbconfig/20220803-095559-marostegui.json
09:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
09:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
09:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T312972)', diff saved to https://phabricator.wikimedia.org/P32209 and previous config saved to /var/cache/conftool/dbconfig/20220803-095538-marostegui.json
09:43 vgutierrez: rolling restart of pybal in codfw lvs instances - T310070
09:42 jelto: kubectl cordon kubestage2002
09:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P32208 and previous config saved to /var/cache/conftool/dbconfig/20220803-094032-marostegui.json
09:35 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2011.codfw.wmnet with OS bullseye
09:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2090.codfw.wmnet
09:33 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:33 ebysans@deploy1002: Started deploy [airflow-dags/analytics@674bb8b]: (no justification provided)
09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on ganeti2011.codfw.wmnet with reason: Remove node for eventual reimage, T311686
09:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on ganeti2011.codfw.wmnet with reason: Remove node for eventual reimage, T311686
09:25 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db2090.codfw.wmnet
09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P32207 and previous config saved to /var/cache/conftool/dbconfig/20220803-092525-marostegui.json
09:24 oblivian@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0)
09:20 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db2090 from dbctl T314109', diff saved to https://phabricator.wikimedia.org/P32206 and previous config saved to /var/cache/conftool/dbconfig/20220803-092053-marostegui.json
09:20 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2024.codfw.wmnet
09:15 jelto: power on mc2024
09:10 XioNoX: configure BGP on the esams-drmrs link - T307221
09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T312972)', diff saved to https://phabricator.wikimedia.org/P32205 and previous config saved to /var/cache/conftool/dbconfig/20220803-091019-marostegui.json
09:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T312972)', diff saved to https://phabricator.wikimedia.org/P32204 and previous config saved to /var/cache/conftool/dbconfig/20220803-090912-marostegui.json
09:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
09:08 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp2031.codfw.wmnet
09:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
09:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
09:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
09:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T312972)', diff saved to https://phabricator.wikimedia.org/P32203 and previous config saved to /var/cache/conftool/dbconfig/20220803-090836-marostegui.json
09:07 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp2032.codfw.wmnet
09:06 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2042.codfw.wmnet
09:05 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2043.codfw.wmnet
09:04 jynus: stop backup2006 backup2009 for T310070
09:00 jelto@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host mc2024.codfw.wmnet
09:00 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2024.codfw.wmnet
08:59 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host cp2031.codfw.wmnet
08:59 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host cp2032.codfw.wmnet
08:58 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2042.codfw.wmnet
08:58 jelto@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host mc2024.codfw.wmnet
08:58 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2024.codfw.wmnet
08:57 jayme@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2043.codfw.wmnet
08:57 oblivian@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2041.codfw.wmnet
08:54 XioNoX: put the esams-drmrs link in service - T307221
08:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P32202 and previous config saved to /var/cache/conftool/dbconfig/20220803-085330-marostegui.json
08:53 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:51 oblivian@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc2041.codfw.wmnet
08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P32201 and previous config saved to /var/cache/conftool/dbconfig/20220803-083824-marostegui.json
08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T312972)', diff saved to https://phabricator.wikimedia.org/P32200 and previous config saved to /var/cache/conftool/dbconfig/20220803-082318-marostegui.json
07:49 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db2072.codfw.wmnet
07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db2072 from dbctl T313911', diff saved to https://phabricator.wikimedia.org/P32199 and previous config saved to /var/cache/conftool/dbconfig/20220803-074806-marostegui.json
07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T312972)', diff saved to https://phabricator.wikimedia.org/P32197 and previous config saved to /var/cache/conftool/dbconfig/20220803-072253-marostegui.json
07:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
07:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
07:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
07:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T312972)', diff saved to https://phabricator.wikimedia.org/P32196 and previous config saved to /var/cache/conftool/dbconfig/20220803-072214-marostegui.json
07:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 14 hosts with reason: codfw pdu maintenance
07:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 14 hosts with reason: codfw pdu maintenance
07:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db[2134,2160].codfw.wmnet with reason: codfw pdu maintenance
07:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db[2134,2160].codfw.wmnet with reason: codfw pdu maintenance
07:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 14 hosts with reason: codfw pdu maintenance
07:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 14 hosts with reason: codfw pdu maintenance
07:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es[2020-2022].codfw.wmnet with reason: codfw pdu maintenance
07:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es[2020-2022].codfw.wmnet with reason: codfw pdu maintenance
07:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 10 hosts with reason: codfw pdu maintenance
07:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 10 hosts with reason: codfw pdu maintenance
07:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db[2096,2101,2115,2131].codfw.wmnet with reason: codfw pdu maintenance
07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P32195 and previous config saved to /var/cache/conftool/dbconfig/20220803-070708-marostegui.json
07:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch instance to plain disks, T311686
07:05 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch instance to plain disks, T311686
06:56 godog: grow sda/sdb 3 by 100G on thanos-be2003 - T314275
06:56 godog: grow sda/sdb 3 by 100G on thanos-be1002 - T314275
06:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P32194 and previous config saved to /var/cache/conftool/dbconfig/20220803-065202-marostegui.json
06:46 godog: power up centrallog2002 and prometheus2005 - T310070
06:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2013.codfw.wmnet to cluster codfw and group C
06:37 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2013.codfw.wmnet to cluster codfw and group C
06:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T312972)', diff saved to https://phabricator.wikimedia.org/P32193 and previous config saved to /var/cache/conftool/dbconfig/20220803-063656-marostegui.json
06:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T312972)', diff saved to https://phabricator.wikimedia.org/P32192 and previous config saved to /var/cache/conftool/dbconfig/20220803-063148-marostegui.json
06:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
06:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
06:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 13 hosts with reason: Maintenance
06:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 13 hosts with reason: Maintenance
06:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2161.codfw.wmnet with reason: Maintenance
06:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2161.codfw.wmnet with reason: Maintenance
06:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T312972)', diff saved to https://phabricator.wikimedia.org/P32191 and previous config saved to /var/cache/conftool/dbconfig/20220803-063045-marostegui.json
06:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P32190 and previous config saved to /var/cache/conftool/dbconfig/20220803-061538-marostegui.json
06:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P32189 and previous config saved to /var/cache/conftool/dbconfig/20220803-060032-marostegui.json
05:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T312972)', diff saved to https://phabricator.wikimedia.org/P32188 and previous config saved to /var/cache/conftool/dbconfig/20220803-054526-marostegui.json
05:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T312972)', diff saved to https://phabricator.wikimedia.org/P32187 and previous config saved to /var/cache/conftool/dbconfig/20220803-054106-marostegui.json
05:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
05:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
05:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
05:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
20:51 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
20:50 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.23 refs T308076
20:38 mutante: re-imaging gerrit2002 with buster - because it's on bullseye, needs git-fat and that has not been ported to python3 yet which blocks upgrading gerrit machines otherwise T313250 T243027 T279509
18:04 dancy@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.23 refs T308076
17:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T312972)', diff saved to https://phabricator.wikimedia.org/P32185 and previous config saved to /var/cache/conftool/dbconfig/20220802-175233-marostegui.json
17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db2159', diff saved to https://phabricator.wikimedia.org/P32184 and previous config saved to /var/cache/conftool/dbconfig/20220802-174311-ladsgroup.json
17:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P32183 and previous config saved to /var/cache/conftool/dbconfig/20220802-173723-marostegui.json
17:32 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic[2041-2042,2057].codfw.wmnet with reason: T310070
17:32 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic[2041-2042,2057].codfw.wmnet with reason: T310070
17:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2013.codfw.wmnet
17:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P32182 and previous config saved to /var/cache/conftool/dbconfig/20220802-172217-marostegui.json
17:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2013.codfw.wmnet
17:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T312972)', diff saved to https://phabricator.wikimedia.org/P32181 and previous config saved to /var/cache/conftool/dbconfig/20220802-170711-marostegui.json
17:06 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc[2042-2043].codfw.wmnet with reason: shutdown for PDU upgrade
17:06 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc[2042-2043].codfw.wmnet with reason: shutdown for PDU upgrade
17:05 Emperor: ms-be20[31,32,41,46].codfw.wmnet,ms-fe2010.codfw.wmnet,thanos-fe2002.codfw.wmnet downtime for PDU work T309957
17:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T312972)', diff saved to https://phabricator.wikimedia.org/P32180 and previous config saved to /var/cache/conftool/dbconfig/20220802-170503-marostegui.json
17:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
17:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
17:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
17:04 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: shutdown for PDU replacement
17:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
17:04 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 6 hosts with reason: shutdown for PDU replacement
17:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
17:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
17:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
17:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T312972)', diff saved to https://phabricator.wikimedia.org/P32179 and previous config saved to /var/cache/conftool/dbconfig/20220802-170333-marostegui.json
16:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P32178 and previous config saved to /var/cache/conftool/dbconfig/20220802-164827-marostegui.json
16:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P32177 and previous config saved to /var/cache/conftool/dbconfig/20220802-163321-marostegui.json
16:25 rzl: rzl@stat1007:~$ sudo systemctl stop wmde-analytics-daily-early # wedged, timer will restart it now with max_runtime_seconds
16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T312972)', diff saved to https://phabricator.wikimedia.org/P32176 and previous config saved to /var/cache/conftool/dbconfig/20220802-161815-marostegui.json
16:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T312972)', diff saved to https://phabricator.wikimedia.org/P32175 and previous config saved to /var/cache/conftool/dbconfig/20220802-161607-marostegui.json
16:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
16:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T312972)', diff saved to https://phabricator.wikimedia.org/P32174 and previous config saved to /var/cache/conftool/dbconfig/20220802-161545-marostegui.json
16:10 btullis@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) an-airflow1004.eqiad.wmnet on all recursors
16:10 btullis@cumin1001: START - Cookbook sre.dns.wipe-cache an-airflow1004.eqiad.wmnet on all recursors
16:10 btullis@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:05 btullis@cumin1001: START - Cookbook sre.ganeti.makevm for new host an-airflow1004.eqiad.wmnet
16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P32173 and previous config saved to /var/cache/conftool/dbconfig/20220802-160039-marostegui.json
15:51 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2056.codfw.wmnet with reason: T309957
15:50 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2056.codfw.wmnet with reason: T309957
15:49 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2040.codfw.wmnet with reason: T309957
15:49 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2040.codfw.wmnet with reason: T309957
15:46 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2039.codfw.wmnet with reason: T309957
15:45 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2039.codfw.wmnet with reason: T309957
15:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P32172 and previous config saved to /var/cache/conftool/dbconfig/20220802-154533-marostegui.json
15:37 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc[2040-2041].codfw.wmnet with reason: shutdown for PDU upgrade
15:37 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc[2040-2041].codfw.wmnet with reason: shutdown for PDU upgrade
15:36 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host elastic2037.codfw.wmnet
15:36 bking@cumin1001: START - Cookbook sre.hosts.reboot-single for host elastic2037.codfw.wmnet
15:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T312972)', diff saved to https://phabricator.wikimedia.org/P32171 and previous config saved to /var/cache/conftool/dbconfig/20220802-153027-marostegui.json
15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T312972)', diff saved to https://phabricator.wikimedia.org/P32170 and previous config saved to /var/cache/conftool/dbconfig/20220802-152818-marostegui.json
15:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T312972)', diff saved to https://phabricator.wikimedia.org/P32169 and previous config saved to /var/cache/conftool/dbconfig/20220802-152740-marostegui.json
15:24 moritzm: installing gnupg2 security updates
15:15 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2024.codfw.wmnet with reason: shutdown for PDU upgrade
15:15 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2024.codfw.wmnet with reason: shutdown for PDU upgrade
15:13 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetmaster1004.eqiad.wmnet with OS buster
15:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P32167 and previous config saved to /var/cache/conftool/dbconfig/20220802-151234-marostegui.json
15:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on ganeti-test[2001-2003].codfw.wmnet with reason: Power down for PDU maintenance, T310070
15:10 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on ganeti-test[2001-2003].codfw.wmnet with reason: Power down for PDU maintenance, T310070
15:08 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on thanos-be2001.codfw.wmnet with reason: pdu
15:08 root@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on thanos-be2001.codfw.wmnet with reason: pdu
15:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ms-be[2030,2045,2052].codfw.wmnet with reason: shutdown for PDU replacement
15:07 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 3:00:00 on ms-be[2030,2045,2052].codfw.wmnet with reason: shutdown for PDU replacement
15:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mc-gp2002.codfw.wmnet with reason: Power down for PDU maintenance, T310070
15:06 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on mc-gp2002.codfw.wmnet with reason: Power down for PDU maintenance, T310070
15:04 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2037.codfw.wmnet with reason: T309957
15:04 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2037.codfw.wmnet with reason: T309957
15:01 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: shutdown for PDU upgrade
15:00 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: shutdown for PDU upgrade
14:59 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2025.codfw.wmnet with reason: T309957
14:59 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2025.codfw.wmnet with reason: T309957
14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P32166 and previous config saved to /var/cache/conftool/dbconfig/20220802-145728-marostegui.json
14:54 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2060.codfw.wmnet with OS bullseye
14:53 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on puppetmaster1004.eqiad.wmnet with reason: host reimage
14:50 moritzm: uploaded gnupg2 2.1.18-8~deb9u4+wmf1 to stretch-wikimedia
14:50 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetmaster1004.eqiad.wmnet with reason: host reimage
14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T312972)', diff saved to https://phabricator.wikimedia.org/P32164 and previous config saved to /var/cache/conftool/dbconfig/20220802-144222-marostegui.json
14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T312972)', diff saved to https://phabricator.wikimedia.org/P32163 and previous config saved to /var/cache/conftool/dbconfig/20220802-144013-marostegui.json
14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
14:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T312972)', diff saved to https://phabricator.wikimedia.org/P32162 and previous config saved to /var/cache/conftool/dbconfig/20220802-143952-marostegui.json
14:37 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host puppetmaster1004.eqiad.wmnet with OS buster
14:32 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2060.codfw.wmnet with reason: host reimage
14:28 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2060.codfw.wmnet with reason: host reimage
14:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P32161 and previous config saved to /var/cache/conftool/dbconfig/20220802-142446-marostegui.json
14:23 Emperor: shutdown ms-be20[30,45,52] for PDU work T309957
14:22 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ms-be[2030,2045,2052].codfw.wmnet with reason: shutdown for PDU replacement
14:21 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on ms-be[2030,2045,2052].codfw.wmnet with reason: shutdown for PDU replacement
14:12 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2060.codfw.wmnet with OS bullseye
14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P32160 and previous config saved to /var/cache/conftool/dbconfig/20220802-140940-marostegui.json
14:05 jbond@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host puppetmaster2004.codfw.wmnet with OS buster
14:04 godog: grow sda/sdb 3 by 100G on thanos-be1001 - T314275
14:03 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on centrallog2002.codfw.wmnet with reason: pdu
14:03 root@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on centrallog2002.codfw.wmnet with reason: pdu
14:01 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on prometheus2005.codfw.wmnet with reason: pdu
14:01 root@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on prometheus2005.codfw.wmnet with reason: pdu
13:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T312972)', diff saved to https://phabricator.wikimedia.org/P32159 and previous config saved to /var/cache/conftool/dbconfig/20220802-135435-marostegui.json
13:53 godog: depool and poweroff prometheus2005 - T310070
13:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
13:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T312972)', diff saved to https://phabricator.wikimedia.org/P32157 and previous config saved to /var/cache/conftool/dbconfig/20220802-135155-marostegui.json
13:40 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Enable usage tracking for statement for cebwiki (T296384) - expected to gradually increase number of wbc_entity_usage and probably recentchanges rows on cebwiki, but not too much, see task for details (duration: 03m 06s)
13:39 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2028.codfw.wmnet with OS bullseye
13:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P32156 and previous config saved to /var/cache/conftool/dbconfig/20220802-133648-marostegui.json
13:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on ganeti2028.codfw.wmnet with reason: Power down for PDU maintenance, T309957
13:30 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4:00:00 on ganeti2028.codfw.wmnet with reason: Power down for PDU maintenance, T309957
13:27 jbond@cumin2002: START - Cookbook sre.hosts.reimage for host puppetmaster2004.codfw.wmnet with OS buster
13:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2013.codfw.wmnet with reason: host reimage
13:23 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2028.codfw.wmnet with reason: host reimage
13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P32155 and previous config saved to /var/cache/conftool/dbconfig/20220802-132142-marostegui.json
13:19 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2013.codfw.wmnet with reason: host reimage
13:19 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2028.codfw.wmnet with reason: host reimage
13:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T312972)', diff saved to https://phabricator.wikimedia.org/P32154 and previous config saved to /var/cache/conftool/dbconfig/20220802-130636-marostegui.json
13:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 (T312972)', diff saved to https://phabricator.wikimedia.org/P32153 and previous config saved to /var/cache/conftool/dbconfig/20220802-130428-marostegui.json
13:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
13:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
13:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
13:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
13:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T312972)', diff saved to https://phabricator.wikimedia.org/P32152 and previous config saved to /var/cache/conftool/dbconfig/20220802-130351-marostegui.json
13:02 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2013.codfw.wmnet with OS bullseye
13:00 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2028.codfw.wmnet with OS bullseye
13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on ganeti2013.codfw.wmnet with reason: Remove node for eventual reimage, T311686
12:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on ganeti2013.codfw.wmnet with reason: Remove node for eventual reimage, T311686
12:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P32151 and previous config saved to /var/cache/conftool/dbconfig/20220802-124845-marostegui.json
12:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P32150 and previous config saved to /var/cache/conftool/dbconfig/20220802-123338-marostegui.json
12:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T312972)', diff saved to https://phabricator.wikimedia.org/P32149 and previous config saved to /var/cache/conftool/dbconfig/20220802-121832-marostegui.json
12:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T312972)', diff saved to https://phabricator.wikimedia.org/P32148 and previous config saved to /var/cache/conftool/dbconfig/20220802-121624-marostegui.json
12:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
12:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
10:50 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on an-worker1082.eqiad.wmnet with reason: T312626 btullis
10:50 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on an-worker1082.eqiad.wmnet with reason: T312626 btullis
10:49 godog: grow sda3 by 100G on thanos-be2004 - T314275
10:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 100%: After restart', diff saved to https://phabricator.wikimedia.org/P32147 and previous config saved to /var/cache/conftool/dbconfig/20220802-103318-root.json
10:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 75%: After restart', diff saved to https://phabricator.wikimedia.org/P32146 and previous config saved to /var/cache/conftool/dbconfig/20220802-101813-root.json
10:15 marostegui@cumin1001: dbctl commit (dc=all): 'Add db2175 to s2 T311494', diff saved to https://phabricator.wikimedia.org/P32145 and previous config saved to /var/cache/conftool/dbconfig/20220802-101522-marostegui.json
10:12 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1019.eqiad.wmnet with OS bullseye
10:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 50%: After restart', diff saved to https://phabricator.wikimedia.org/P32144 and previous config saved to /var/cache/conftool/dbconfig/20220802-100308-root.json
10:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32143 and previous config saved to /var/cache/conftool/dbconfig/20220802-100304-root.json
09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db2079 from dbctl T313885', diff saved to https://phabricator.wikimedia.org/P32141 and previous config saved to /var/cache/conftool/dbconfig/20220802-095455-marostegui.json
09:52 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbproxy1019.eqiad.wmnet with reason: host reimage
09:49 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on dbproxy1019.eqiad.wmnet with reason: host reimage
09:49 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
09:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 25%: After restart', diff saved to https://phabricator.wikimedia.org/P32140 and previous config saved to /var/cache/conftool/dbconfig/20220802-094804-root.json
09:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32139 and previous config saved to /var/cache/conftool/dbconfig/20220802-094759-root.json
09:44 godog: grow sdb3 by 100G on thanos-be2004 - T314275
09:43 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
09:42 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
09:37 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host dbproxy1019.eqiad.wmnet with OS bullseye
09:36 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 10%: After restart', diff saved to https://phabricator.wikimedia.org/P32138 and previous config saved to /var/cache/conftool/dbconfig/20220802-093259-root.json
09:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32137 and previous config saved to /var/cache/conftool/dbconfig/20220802-093254-root.json
09:28 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
09:22 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
09:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 5%: After restart', diff saved to https://phabricator.wikimedia.org/P32136 and previous config saved to /var/cache/conftool/dbconfig/20220802-091754-root.json
09:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 10%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32135 and previous config saved to /var/cache/conftool/dbconfig/20220802-091749-root.json
09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2143', diff saved to https://phabricator.wikimedia.org/P32134 and previous config saved to /var/cache/conftool/dbconfig/20220802-091518-root.json
09:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 2%: After restart', diff saved to https://phabricator.wikimedia.org/P32133 and previous config saved to /var/cache/conftool/dbconfig/20220802-090250-root.json
09:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 5%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32132 and previous config saved to /var/cache/conftool/dbconfig/20220802-090245-root.json
08:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1181 (re)pooling @ 1%: After restart', diff saved to https://phabricator.wikimedia.org/P32131 and previous config saved to /var/cache/conftool/dbconfig/20220802-084745-root.json
08:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 1%: After maintenance', diff saved to https://phabricator.wikimedia.org/P32130 and previous config saved to /var/cache/conftool/dbconfig/20220802-084740-root.json
08:46 marostegui: stop mysql on db2095 db2107 db2109 db2137 db2147 db2159 db2160 pc2012 for pdu maintenance on codfw b5 T310070
07:49 moritzm: upgrading drmrs ganeti clusters to 3.0.2 T312637
07:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd2005.codfw.wmnet with reason: Switch instance to plain disks, T311686
07:33 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd2005.codfw.wmnet with reason: Switch instance to plain disks, T311686
07:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd2005.codfw.wmnet with reason: Switch instance to DRBD, T311686
07:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd2005.codfw.wmnet with reason: Switch instance to DRBD, T311686
04:17 ryankemper: [Elastic] Small amendment to my earlier statement; based off epoch time `be_x_oldwiki_titlesuggest_1659407912` was not an old index hanging around after a reindex operation, but rather the new one that the reindex operation was trying to create, but had not yet finished (therefore didn't switch over the aliases). It presumably got interrupted by the reimage of `elastic2059`.
04:07 ryankemper: [Elastic] Per `curl -s https://search.svc.codfw.wmnet:9243/_cat/aliases | grep -i be_x` I see `be_x_oldwiki_titlesuggest` alias points to `be_x_oldwiki_titlesuggest_1658396688`. I think this means the red index is an old index from an in-progress reindex operation. I likely just need to delete `be_x_oldwiki_titlesuggest_1659407912` but doing some quick digging first
04:04 ryankemper: [Elastic] Red cluster status in main codfw elasticsearch cluster (`https://search.svc.codfw.wmnet:9243`); culprit appears to be index `be_x_oldwiki_titlesuggest_1659407912`. Confusingly it has 2 replicas set so it's not clear to me how we got into this state starting from green (in the past we've gone into red status from indices that erroneously had 0 replicas in production)
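Note on the 04:04-04:17 [Elastic] entries: the 04:17 correction rests on reading the numeric suffix of each index name as a Unix epoch time. A minimal sketch of that comparison follows, assuming the trailing `_<digits>` really is the creation time in seconds; the helper name `index_created` is hypothetical and not part of any Elasticsearch or Wikimedia tooling.

```python
from datetime import datetime, timezone


def index_created(index_name: str) -> datetime:
    """Read the trailing _<digits> of an index name as a Unix epoch in seconds.

    Assumption: the suffix encodes the index creation time, as the 04:17 entry infers.
    """
    suffix = index_name.rsplit("_", 1)[1]
    return datetime.fromtimestamp(int(suffix), tz=timezone.utc)


# Index names copied from the log entries above.
aliased = "be_x_oldwiki_titlesuggest_1658396688"  # what the alias points to
red = "be_x_oldwiki_titlesuggest_1659407912"      # the index in red status

for name in (aliased, red):
    print(name, index_created(name).isoformat())

# True here means the red index is the newer one, i.e. the reindex target,
# not a stale leftover from an earlier reindex.
print("red index newer than aliased index:", index_created(red) > index_created(aliased))
```

Under that assumption the red index dates from 2022-08-02 and the aliased one from 2022-07-21, which matches the 04:17 reading that the red index was the unfinished reindex target.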
20:47 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: c19c3e36ab: DiscussionTools: Make new reply buttons available at mediawiki.org (T314076); 24db016c4: viwikibooks: Change wgArticleCountMethod to any (T314239) (duration: 03m 10s)
20:03 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2054.codfw.wmnet with OS bullseye
19:41 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2054.codfw.wmnet with reason: host reimage
19:35 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2054.codfw.wmnet with reason: host reimage
19:12 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2054.codfw.wmnet with OS bullseye
18:56 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2031.codfw.wmnet with OS bullseye
18:44 mutante: gitlab - moved data_persistence group to new parent, under /repos/
18:34 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2031.codfw.wmnet with reason: host reimage
18:32 mutante: gitlab - created group 'data_persistence' - added Ladsgroup and upgraded from member to maintainer
18:27 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2031.codfw.wmnet with reason: host reimage
18:12 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2031.codfw.wmnet with OS bullseye
17:58 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2025.codfw.wmnet with OS bullseye
17:37 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2025.codfw.wmnet with reason: host reimage
17:31 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2025.codfw.wmnet with reason: host reimage
17:18 ryankemper: T289135 T314078 Manually reimaging remaining codfw stretch hosts (`elastic[2025,2031,2054,2059-2060]`) to bullseye, one host at a time, waiting for green cluster status to return between each run. `ryankemper@cumin1001` tmux session `codfw_reimage`
17:16 ryankemper@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2025.codfw.wmnet with OS bullseye
17:08 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reimage (bullseye upgrade) - bking@cumin1001 - T289135
17:08 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reimage (bullseye upgrade) - bking@cumin1001 - T289135
17:06 mutante: alert1001 - systemctl restart nsca - pinged by fundraising tech because fundraising hosts have the "passive check is awol" issue again (T196336)
16:25 moritzm: installing tcpdump updates from bullseye point release
14:29 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on gitlab1004.wikimedia.org with reason: upgrade gitlab1004 to new version
14:05 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reimage (bullseye upgrade) - bking@cumin1001 - T289135
14:04 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2044.codfw.wmnet with OS bullseye
14:04 urbanecm@deploy1002: Synchronized wmf-config/logos.php: bcb7b0d: Adjust width-height ratio of logo to fix display issue (T310961; 2/2) (duration: 03m 17s)
14:01 urbanecm@deploy1002: Synchronized static/images/project-logos/: bcb7b0d: srwikisource: Adjust width-height ratio of logo to fix display issue (T310961; 1/2) (duration: 03m 41s)
13:48 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2044.codfw.wmnet with reason: host reimage
13:44 bking@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2044.codfw.wmnet with reason: host reimage
13:24 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2044.codfw.wmnet with OS bullseye
13:22 bking@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reimage (bullseye upgrade) - bking@cumin1001 - T289135
11:50 moritzm: installing openjdk-8 security updates for stretch
11:43 moritzm: uploaded openjdk-8 8u342-b07-1~deb9u1 for stretch-wikimedia
10:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T314041)', diff saved to https://phabricator.wikimedia.org/P32124 and previous config saved to /var/cache/conftool/dbconfig/20220801-102714-ladsgroup.json
10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P32123 and previous config saved to /var/cache/conftool/dbconfig/20220801-101208-ladsgroup.json
10:09 vgutierrez: test ATS 9.1.2 on cp6016 - T309651
10:05 vgutierrez: test ATS 9.1.2 on cp6008 - T309651
10:00 ebysans@deploy1002: Started deploy [airflow-dags/analytics@4da9195]: (no justification provided)
09:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P32122 and previous config saved to /var/cache/conftool/dbconfig/20220801-095702-ladsgroup.json
09:56 ebysans@deploy1002: Started deploy [airflow-dags/analytics@85585b0]: (no justification provided)
09:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T314041)', diff saved to https://phabricator.wikimedia.org/P32121 and previous config saved to /var/cache/conftool/dbconfig/20220801-094156-ladsgroup.json
09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T314041)', diff saved to https://phabricator.wikimedia.org/P32120 and previous config saved to /var/cache/conftool/dbconfig/20220801-093845-ladsgroup.json
09:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
09:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
09:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1112.eqiad.wmnet with reason: Maintenance
09:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db1112.eqiad.wmnet with reason: Maintenance
09:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Maintenance
09:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Maintenance
09:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2105.codfw.wmnet with reason: Maintenance
09:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db2105.codfw.wmnet with reason: Maintenance
09:21 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2004.codfw.wmnet
09:10 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2004.codfw.wmnet
09:10 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2003.codfw.wmnet
09:01 jelto@cumin1001: START - Cookbook sre.hosts.reboot-single for host gitlab-runner2003.codfw.wmnet
09:00 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab-runner2002.codfw.wmnet