Server Admin Log/Archive 49

2022-02-28

22:36 ebernhardson: start in-place reindex of kmwiki, kmwiktionary and kmwikibooks on cirrus cloudelastic cluster T299707
22:00 tzatziki: running extensions/SecurePoll/cli/wm-scripts/ucoc/populateEditCount.php on each wiki (s1 thru s8 simultaneously) (T302433)
21:39 urbanecm: UTC late B&C window done
21:38 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/VisualEditor/modules/ve-mw/init/targets: e22e4d5: b4dd4c4: VisualEditor backports (T302746) (duration: 00m 51s)
21:30 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.23/includes/htmlform/: 67831a3: Revert "htmlform: Replace some uses of isHidden to isDisabled" (T302512) (duration: 00m 48s)
21:24 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/GrowthExperiments/includes/Specials/SpecialMentorDashboard.php: 706c2bc: Mentor dashboard: Mark mentor-tools as stable (T280307) (duration: 00m 49s)
20:45 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
20:45 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
20:21 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
20:20 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
20:03 tzatziki: creating ucoc_edits table on each wiki for elections voterlist (T302433)
19:51 razzi@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host datahubsearch1003.eqiad.wmnet
19:50 rzl@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
19:50 rzl@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
19:44 rzl@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
19:38 rzl@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
19:28 razzi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:20 cmooney@cumin1001: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
19:18 cmooney@cumin1001: START - Cookbook sre.dns.netbox
19:18 razzi@cumin1001: START - Cookbook sre.dns.netbox
19:18 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1003.eqiad.wmnet
19:14 rzl@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
19:13 rzl@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
19:09 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:05 bblack@cumin1001: START - Cookbook sre.dns.netbox
18:52 bblack@cumin1001: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
18:52 bblack@cumin1001: START - Cookbook sre.dns.netbox
18:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti2007.codfw.wmnet
18:47 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:38 razzi@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 7 days, 0:00:00 on datahubsearch1002.eqiad.wmnet with reason: Node is being set up for first time and puppet run failed
18:38 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on datahubsearch1002.eqiad.wmnet with reason: Node is being set up for first time and puppet run failed
18:30 jmm@cumin2002: START - Cookbook sre.dns.netbox
18:26 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ganeti2007.codfw.wmnet
18:20 mutante: phabricator/diffusion - disable IO and hide http and ssh URIs for source repo 'word2vec' - it's still possible to pull and push via https (operations/debs/word2vec) - https://phabricator.wikimedia.org/source/word2vec/ - https://en.wikipedia.org/wiki/Word2vec T296022
18:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti2007.codfw.wmnet with reason: Remove from Ganeti cluster for decom
18:19 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ganeti2007.codfw.wmnet with reason: Remove from Ganeti cluster for decom
18:04 razzi@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host datahubsearch1002.eqiad.wmnet
17:59 mutante: phabricator/diffusion - disable http and ssh URIs for source repo "iltools" - T296022 - https://commons.wikimedia.org/wiki/User_talk:Inductiveload#c-Inductiveload-2022-02-25T22%3A26%3A00.000Z-Mutante-2022-02-25T20%3A37%3A00.000Z
17:51 cmooney@cumin1001: START - Cookbook sre.hosts.provision for host an-worker1147.mgmt.eqiad.wmnet with reboot policy FORCED
17:48 bblack: lvs1017-20 (all eqiad lvs) - stopping puppet to attempt deploying https://gerrit.wikimedia.org/r/c/operations/puppet/+/765311
17:47 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:45 sukhe: rolling restart of anycast-hc.service on durum* hosts for security updates
17:42 cmooney@cumin1001: START - Cookbook sre.dns.netbox
17:40 razzi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:40 sukhe: rolling restart of anycast-hc.service on doh* hosts for security updates
17:35 razzi@cumin1001: START - Cookbook sre.dns.netbox
17:35 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1002.eqiad.wmnet
17:35 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:28 volans@cumin2002: START - Cookbook sre.dns.netbox
17:25 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:21 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2022.codfw.wmnet with OS bullseye
17:21 ebernhardson: manual trigger of cirrus SaneitizeJobs for with 2hr refresh
17:21 cmooney@cumin1001: START - Cookbook sre.dns.netbox
17:15 razzi@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host datahubsearch1002.eqiad.wmnet
17:15 razzi@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
17:11 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2022.codfw.wmnet with reason: host reimage
17:08 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2022.codfw.wmnet with reason: host reimage
17:05 razzi@cumin1001: START - Cookbook sre.dns.netbox
17:05 razzi@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
17:00 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:58 cmooney@cumin1001: START - Cookbook sre.dns.netbox
16:56 razzi@cumin1001: START - Cookbook sre.dns.netbox
16:56 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1002.eqiad.wmnet
16:54 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes2022.codfw.wmnet with OS bullseye
16:53 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:51 cmooney@cumin1001: START - Cookbook sre.dns.netbox
16:51 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
16:50 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:46 cmooney@cumin1001: START - Cookbook sre.dns.netbox
16:45 papaul: rebooting scs-a1-codfw to clear librenms alert
16:42 klausman@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host ml-staging-etcd2001.codfw.wmnet
16:42 klausman@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
16:33 klausman@cumin2002: START - Cookbook sre.dns.netbox
16:32 klausman@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
16:32 cmooney@cumin1001: START - Cookbook sre.dns.netbox
16:27 klausman@cumin2002: START - Cookbook sre.dns.netbox
16:27 klausman@cumin2002: START - Cookbook sre.ganeti.makevm for new host ml-staging-etcd2001.codfw.wmnet
16:17 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2021.codfw.wmnet with OS bullseye
16:13 klausman@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
16:07 klausman@cumin2002: START - Cookbook sre.dns.netbox
16:07 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2021.codfw.wmnet with reason: host reimage
16:04 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:04 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2021.codfw.wmnet with reason: host reimage
16:01 cmooney@cumin1001: START - Cookbook sre.dns.netbox
15:59 cmooney@cumin1001: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host an-worker1147.mgmt.eqiad.wmnet with reboot policy FORCED
15:59 cmooney@cumin1001: START - Cookbook sre.hosts.provision for host an-worker1147.mgmt.eqiad.wmnet with reboot policy FORCED
15:58 klausman@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts ml-etcd-staging2001
15:56 vgutierrez: rolling upgrade to HAProxy 2.4.14 on HAProxy caching nodes - T290005
15:54 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
15:53 klausman@cumin2002: START - Cookbook sre.hosts.decommission for hosts ml-etcd-staging2001
15:53 klausman@cumin2002: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts ml-etcd-staging2001
15:52 klausman@cumin2002: START - Cookbook sre.hosts.decommission for hosts ml-etcd-staging2001
15:50 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes2021.codfw.wmnet with OS bullseye
15:48 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:46 cmooney@cumin1001: START - Cookbook sre.dns.netbox
15:44 pt1979@cumin2002: START - Cookbook sre.dns.netbox
15:37 pt1979@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
15:33 milimetric@deploy1002: Finished deploy [analytics/refinery@84a0770] (hadoop-test): Add a few wikis to the sqoop list (duration: 07m 16s)
15:30 pt1979@cumin2002: START - Cookbook sre.dns.netbox
15:26 milimetric@deploy1002: Started deploy [analytics/refinery@84a0770] (hadoop-test): Add a few wikis to the sqoop list
15:25 milimetric@deploy1002: Finished deploy [analytics/refinery@84a0770] (thin): Add a few wikis to the sqoop list (duration: 00m 08s)
15:25 milimetric@deploy1002: Started deploy [analytics/refinery@84a0770] (thin): Add a few wikis to the sqoop list
15:23 milimetric@deploy1002: Finished deploy [analytics/refinery@84a0770]: Add a few wikis to the sqoop list (duration: 21m 18s)
15:18 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host kubernetes2020.codfw.wmnet with OS bullseye
15:07 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2020.codfw.wmnet with reason: host reimage
15:06 ntsako@deploy1002: Finished deploy [airflow-dags/analytics@0a2ffb8]: (no justification provided) (duration: 00m 07s)
15:06 ntsako@deploy1002: Started deploy [airflow-dags/analytics@0a2ffb8]: (no justification provided)
15:04 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2020.codfw.wmnet with reason: host reimage
15:02 krinkle@deploy1002: Synchronized wmf-config/InitialiseSettings.php: I616f56 (duration: 00m 49s)
15:02 milimetric@deploy1002: Started deploy [analytics/refinery@84a0770]: Add a few wikis to the sqoop list
14:53 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
14:50 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes2020.codfw.wmnet with OS bullseye
14:48 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2019.codfw.wmnet with OS bullseye
14:44 cmooney@cumin1001: START - Cookbook sre.dns.netbox
14:43 klausman@cumin2001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host ml-etcd-staging2001.codfw.wmnet
14:37 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2019.codfw.wmnet with reason: host reimage
14:35 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2019.codfw.wmnet with reason: host reimage
14:33 klausman@cumin2001: START - Cookbook sre.ganeti.makevm for new host ml-etcd-staging2001.codfw.wmnet
14:20 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes2019.codfw.wmnet with OS bullseye
14:18 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2018.codfw.wmnet with OS bullseye
14:09 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
14:09 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
14:07 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2018.codfw.wmnet with reason: host reimage
14:05 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2018.codfw.wmnet with reason: host reimage
14:03 jelto: update gitlab-ce to 14.7.4 on all GitLab hosts
14:00 ebysans@deploy1002: Finished deploy [airflow-dags/analytics@75e8eb7]: (no justification provided) (duration: 00m 14s)
14:00 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
14:00 ebysans@deploy1002: Started deploy [airflow-dags/analytics@75e8eb7]: (no justification provided)
13:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T302185)', diff saved to https://phabricator.wikimedia.org/P21600 and previous config saved to /var/cache/conftool/dbconfig/20220228-135158-ladsgroup.json
13:50 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes2018.codfw.wmnet with OS bullseye
13:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P21599 and previous config saved to /var/cache/conftool/dbconfig/20220228-133653-ladsgroup.json
13:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P21598 and previous config saved to /var/cache/conftool/dbconfig/20220228-132148-ladsgroup.json
13:14 moritzm: restarting apache on puppet masters to pick up expat security update
13:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T302185)', diff saved to https://phabricator.wikimedia.org/P21597 and previous config saved to /var/cache/conftool/dbconfig/20220228-130644-ladsgroup.json
13:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1111.eqiad.wmnet with OS bullseye
12:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1111.eqiad.wmnet with reason: host reimage
12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Your commit message', diff saved to https://phabricator.wikimedia.org/P21596 and previous config saved to /var/cache/conftool/dbconfig/20220228-124454-ladsgroup.json
12:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1111.eqiad.wmnet with reason: host reimage
12:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1111.eqiad.wmnet with OS bullseye
12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T302185)', diff saved to https://phabricator.wikimedia.org/P21594 and previous config saved to /var/cache/conftool/dbconfig/20220228-123008-ladsgroup.json
12:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1111.eqiad.wmnet with reason: Maintenance
12:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1111.eqiad.wmnet with reason: Maintenance
12:25 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5011.eqsin.wmnet with OS buster
12:24 vgutierrez: pool cp5011 running HAProxy as TLS termination layer - T290005 T271421
12:22 vgutierrez: vgutierrez@apt1001:~$ sudo -i reprepro --component thirdparty/haproxy24 update buster-wikimedia - T290005
12:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300992)', diff saved to https://phabricator.wikimedia.org/P21593 and previous config saved to /var/cache/conftool/dbconfig/20220228-122039-ladsgroup.json
12:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P21592 and previous config saved to /var/cache/conftool/dbconfig/20220228-120535-ladsgroup.json
11:58 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5011.eqsin.wmnet with reason: host reimage
11:55 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5011.eqsin.wmnet with reason: host reimage
11:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P21591 and previous config saved to /var/cache/conftool/dbconfig/20220228-115030-ladsgroup.json
11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T302185)', diff saved to https://phabricator.wikimedia.org/P21590 and previous config saved to /var/cache/conftool/dbconfig/20220228-114230-ladsgroup.json
11:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300992)', diff saved to https://phabricator.wikimedia.org/P21589 and previous config saved to /var/cache/conftool/dbconfig/20220228-113525-ladsgroup.json
11:29 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp5011.eqsin.wmnet with OS buster
11:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P21588 and previous config saved to /var/cache/conftool/dbconfig/20220228-112726-ladsgroup.json
11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T300992)', diff saved to https://phabricator.wikimedia.org/P21587 and previous config saved to /var/cache/conftool/dbconfig/20220228-111700-ladsgroup.json
11:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
11:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
11:12 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1088.eqiad.wmnet with OS buster
11:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P21586 and previous config saved to /var/cache/conftool/dbconfig/20220228-111221-ladsgroup.json
11:09 vgutierrez: pool cp1088 running HAProxy as TLS termination layer - T290005 T271421
10:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T302185)', diff saved to https://phabricator.wikimedia.org/P21585 and previous config saved to /var/cache/conftool/dbconfig/20220228-105716-ladsgroup.json
10:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
10:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
10:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300992)', diff saved to https://phabricator.wikimedia.org/P21584 and previous config saved to /var/cache/conftool/dbconfig/20220228-105447-ladsgroup.json
10:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1114.eqiad.wmnet with OS bullseye
10:48 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1088.eqiad.wmnet with reason: host reimage
10:44 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1088.eqiad.wmnet with reason: host reimage
10:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P21583 and previous config saved to /var/cache/conftool/dbconfig/20220228-103942-ladsgroup.json
10:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1114.eqiad.wmnet with reason: host reimage
10:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1114.eqiad.wmnet with reason: host reimage
10:28 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp1088.eqiad.wmnet with OS buster
10:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P21582 and previous config saved to /var/cache/conftool/dbconfig/20220228-102438-ladsgroup.json
10:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1114.eqiad.wmnet with OS bullseye
10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T302185)', diff saved to https://phabricator.wikimedia.org/P21581 and previous config saved to /var/cache/conftool/dbconfig/20220228-101815-ladsgroup.json
10:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1114.eqiad.wmnet with reason: Maintenance
10:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1114.eqiad.wmnet with reason: Maintenance
10:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T302185)', diff saved to https://phabricator.wikimedia.org/P21580 and previous config saved to /var/cache/conftool/dbconfig/20220228-101726-ladsgroup.json
10:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300992)', diff saved to https://phabricator.wikimedia.org/P21579 and previous config saved to /var/cache/conftool/dbconfig/20220228-100933-ladsgroup.json
10:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P21578 and previous config saved to /var/cache/conftool/dbconfig/20220228-100221-ladsgroup.json
09:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T300992)', diff saved to https://phabricator.wikimedia.org/P21577 and previous config saved to /var/cache/conftool/dbconfig/20220228-095056-ladsgroup.json
09:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
09:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
09:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P21576 and previous config saved to /var/cache/conftool/dbconfig/20220228-094717-ladsgroup.json
09:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T302185)', diff saved to https://phabricator.wikimedia.org/P21575 and previous config saved to /var/cache/conftool/dbconfig/20220228-093212-ladsgroup.json
09:29 volans@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
09:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
09:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
09:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300992)', diff saved to https://phabricator.wikimedia.org/P21574 and previous config saved to /var/cache/conftool/dbconfig/20220228-092830-ladsgroup.json
09:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1126.eqiad.wmnet with OS bullseye
09:22 volans@cumin1001: START - Cookbook sre.dns.netbox
09:16 moritzm: restarting Hue to pick up expat security updates
09:13 moritzm: restarting turnilo to pick up expat security updates
09:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P21573 and previous config saved to /var/cache/conftool/dbconfig/20220228-091325-ladsgroup.json
09:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1126.eqiad.wmnet with reason: host reimage
09:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1126.eqiad.wmnet with reason: host reimage
09:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1126.eqiad.wmnet with OS bullseye
08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P21572 and previous config saved to /var/cache/conftool/dbconfig/20220228-085820-ladsgroup.json
08:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T302185)', diff saved to https://phabricator.wikimedia.org/P21571 and previous config saved to /var/cache/conftool/dbconfig/20220228-085329-ladsgroup.json
08:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1126.eqiad.wmnet with reason: Maintenance
08:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1126.eqiad.wmnet with reason: Maintenance
08:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T302185)', diff saved to https://phabricator.wikimedia.org/P21570 and previous config saved to /var/cache/conftool/dbconfig/20220228-085224-ladsgroup.json
08:51 moritzm: installing expat security updates
08:47 ayounsi@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
08:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300992)', diff saved to https://phabricator.wikimedia.org/P21567 and previous config saved to /var/cache/conftool/dbconfig/20220228-084316-ladsgroup.json
08:39 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
08:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P21566 and previous config saved to /var/cache/conftool/dbconfig/20220228-083720-ladsgroup.json
08:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P21564 and previous config saved to /var/cache/conftool/dbconfig/20220228-082215-ladsgroup.json
08:10 taavi: UTC morning deploys done
08:09 taavi@deploy1002: Synchronized logos/config.yaml: Config: Change temporary logo for slwiki (T302661) (duration: 00m 48s)
08:09 taavi@deploy1002: Synchronized wmf-config/logos.php: Config: Change temporary logo for slwiki (T302661) (duration: 00m 48s)
08:08 taavi@deploy1002: Synchronized static/images/project-logos: Config: Change temporary logo for slwiki (T302661) (duration: 00m 50s)
08:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T302185)', diff saved to https://phabricator.wikimedia.org/P21563 and previous config saved to /var/cache/conftool/dbconfig/20220228-080710-ladsgroup.json
08:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300992)', diff saved to https://phabricator.wikimedia.org/P21562 and previous config saved to /var/cache/conftool/dbconfig/20220228-080613-ladsgroup.json
08:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
08:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
08:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
08:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
08:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300992)', diff saved to https://phabricator.wikimedia.org/P21561 and previous config saved to /var/cache/conftool/dbconfig/20220228-080559-ladsgroup.json
08:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1177.eqiad.wmnet with OS bullseye
08:00 godog: enable notifications for thanos-be1003 in icinga and clear up /srv/swift-storage/sdm1 since it was filling up /
07:58 moritzm: drain instances off ganeti2007 for eventual decom
07:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P21560 and previous config saved to /var/cache/conftool/dbconfig/20220228-075054-ladsgroup.json
07:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
07:45 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
07:44 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
07:43 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
07:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage
07:42 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P21559 and previous config saved to /var/cache/conftool/dbconfig/20220228-073550-ladsgroup.json
07:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS bullseye
07:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T302185)', diff saved to https://phabricator.wikimedia.org/P21558 and previous config saved to /var/cache/conftool/dbconfig/20220228-072546-ladsgroup.json
07:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
07:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1177.eqiad.wmnet with reason: Maintenance
07:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T302185)', diff saved to https://phabricator.wikimedia.org/P21557 and previous config saved to /var/cache/conftool/dbconfig/20220228-072314-ladsgroup.json
07:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300992)', diff saved to https://phabricator.wikimedia.org/P21556 and previous config saved to /var/cache/conftool/dbconfig/20220228-072045-ladsgroup.json
07:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P21555 and previous config saved to /var/cache/conftool/dbconfig/20220228-070809-ladsgroup.json
07:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300992)', diff saved to https://phabricator.wikimedia.org/P21554 and previous config saved to /var/cache/conftool/dbconfig/20220228-070148-ladsgroup.json
07:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
07:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
06:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P21553 and previous config saved to /var/cache/conftool/dbconfig/20220228-065304-ladsgroup.json
06:42 XioNoX: configure BGP between codfw and eqdfw
06:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T302185)', diff saved to https://phabricator.wikimedia.org/P21552 and previous config saved to /var/cache/conftool/dbconfig/20220228-063800-ladsgroup.json
06:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1178.eqiad.wmnet with OS bullseye
06:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
06:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
06:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
06:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
06:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300992)', diff saved to https://phabricator.wikimedia.org/P21551 and previous config saved to /var/cache/conftool/dbconfig/20220228-062236-ladsgroup.json
06:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
06:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1178.eqiad.wmnet with reason: host reimage
06:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P21550 and previous config saved to /var/cache/conftool/dbconfig/20220228-060731-ladsgroup.json
06:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1178.eqiad.wmnet with OS bullseye
05:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T302185)', diff saved to https://phabricator.wikimedia.org/P21549 and previous config saved to /var/cache/conftool/dbconfig/20220228-055626-ladsgroup.json
05:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
05:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance
05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21548 and previous config saved to /var/cache/conftool/dbconfig/20220228-055530-ladsgroup.json
05:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P21547 and previous config saved to /var/cache/conftool/dbconfig/20220228-055226-ladsgroup.json
05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21546 and previous config saved to /var/cache/conftool/dbconfig/20220228-054025-ladsgroup.json
05:38 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.23/includes/content/ContentHandler.php: Backport: ContentHandler: Use ParserOutputAccess for accessing ParserOutput (T302620) (duration: 00m 49s)
05:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300992)', diff saved to https://phabricator.wikimedia.org/P21545 and previous config saved to /var/cache/conftool/dbconfig/20220228-053721-ladsgroup.json
05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21544 and previous config saved to /var/cache/conftool/dbconfig/20220228-052521-ladsgroup.json
05:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T300992)', diff saved to https://phabricator.wikimedia.org/P21543 and previous config saved to /var/cache/conftool/dbconfig/20220228-051905-ladsgroup.json
05:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
05:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
05:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21542 and previous config saved to /var/cache/conftool/dbconfig/20220228-051016-ladsgroup.json
05:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1172.eqiad.wmnet with OS bullseye
04:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
04:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
04:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
04:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
04:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
04:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1172.eqiad.wmnet with reason: host reimage
04:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1172.eqiad.wmnet with OS bullseye
04:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21541 and previous config saved to /var/cache/conftool/dbconfig/20220228-043003-ladsgroup.json
04:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
04:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance

2022-02-27

20:42 XioNoX: configure OSPF between cr2-drmrs and cr2-eqdfw

2022-02-25

23:32 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
23:30 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
21:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21540 and previous config saved to /var/cache/conftool/dbconfig/20220225-213704-ladsgroup.json
21:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21539 and previous config saved to /var/cache/conftool/dbconfig/20220225-212159-ladsgroup.json
21:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21538 and previous config saved to /var/cache/conftool/dbconfig/20220225-210654-ladsgroup.json
21:02 ryankemper: [WDQS] Restarted wdqs eqiad exporters: `ryankemper@cumin1001:~$ sudo -E cumin -b 1 'wdqs1*' 'systemctl restart prometheus-blazegraph-exporter-wdqs-blazegraph.service'`
21:01 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there are no relevant criticals in Icinga, and Grafana looks good. Still looking into `Reduced availability for job jmx_wdqs_updater`; will try restarting blazegraph exporters in eqiad
20:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21537 and previous config saved to /var/cache/conftool/dbconfig/20220225-205149-ladsgroup.json
20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21536 and previous config saved to /var/cache/conftool/dbconfig/20220225-204844-ladsgroup.json
20:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
20:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21535 and previous config saved to /var/cache/conftool/dbconfig/20220225-204836-ladsgroup.json
20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21534 and previous config saved to /var/cache/conftool/dbconfig/20220225-203331-ladsgroup.json
20:31 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
20:31 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
20:31 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
20:30 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@5d384a5]: 0.3.104 (duration: 07m 18s)
20:23 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.104` on canary `wdqs1003`; proceeding to rest of fleet
20:22 ryankemper@deploy1002: Started deploy [wdqs/wdqs@5d384a5]: 0.3.104
20:22 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.104`. Pre-deploy tests passing on canary `wdqs1003`
20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21533 and previous config saved to /var/cache/conftool/dbconfig/20220225-201826-ladsgroup.json
20:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21532 and previous config saved to /var/cache/conftool/dbconfig/20220225-200322-ladsgroup.json
19:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300992)', diff saved to https://phabricator.wikimedia.org/P21531 and previous config saved to /var/cache/conftool/dbconfig/20220225-195917-ladsgroup.json
19:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
19:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
19:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
19:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
19:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
19:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
19:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
19:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
19:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21530 and previous config saved to /var/cache/conftool/dbconfig/20220225-195658-ladsgroup.json
19:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P21529 and previous config saved to /var/cache/conftool/dbconfig/20220225-194153-ladsgroup.json
19:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P21528 and previous config saved to /var/cache/conftool/dbconfig/20220225-192649-ladsgroup.json
19:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21527 and previous config saved to /var/cache/conftool/dbconfig/20220225-191144-ladsgroup.json
19:11 jgleeson: payments updated from 4638c0ec to 3dfac3b2
19:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21526 and previous config saved to /var/cache/conftool/dbconfig/20220225-190939-ladsgroup.json
19:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
19:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
19:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
19:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
19:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
19:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21525 and previous config saved to /var/cache/conftool/dbconfig/20220225-190737-ladsgroup.json
18:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P21524 and previous config saved to /var/cache/conftool/dbconfig/20220225-185233-ladsgroup.json
18:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P21523 and previous config saved to /var/cache/conftool/dbconfig/20220225-183728-ladsgroup.json
18:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21522 and previous config saved to /var/cache/conftool/dbconfig/20220225-182223-ladsgroup.json
18:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T300992)', diff saved to https://phabricator.wikimedia.org/P21521 and previous config saved to /var/cache/conftool/dbconfig/20220225-181918-ladsgroup.json
18:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
18:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
18:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300992)', diff saved to https://phabricator.wikimedia.org/P21520 and previous config saved to /var/cache/conftool/dbconfig/20220225-181911-ladsgroup.json
18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P21519 and previous config saved to /var/cache/conftool/dbconfig/20220225-180406-ladsgroup.json
17:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P21518 and previous config saved to /var/cache/conftool/dbconfig/20220225-174901-ladsgroup.json
17:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300992)', diff saved to https://phabricator.wikimedia.org/P21517 and previous config saved to /var/cache/conftool/dbconfig/20220225-173356-ladsgroup.json
17:29 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: wmf-puppet-dashboard updates: better error messages and code cleanup (prod) (duration: 08m 20s)
17:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300992)', diff saved to https://phabricator.wikimedia.org/P21516 and previous config saved to /var/cache/conftool/dbconfig/20220225-172845-ladsgroup.json
17:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
17:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
17:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300992)', diff saved to https://phabricator.wikimedia.org/P21515 and previous config saved to /var/cache/conftool/dbconfig/20220225-172837-ladsgroup.json
17:21 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: wmf-puppet-dashboard updates: better error messages and code cleanup (prod)
17:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21514 and previous config saved to /var/cache/conftool/dbconfig/20220225-171333-ladsgroup.json
17:12 ebernhardson: manual trigger of cirrus SaneitizeJobs for with 2hr refresh
17:01 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-puppet-dashboard updates: better error messages and code cleanup (duration: 01m 57s)
16:59 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-puppet-dashboard updates: better error messages and code cleanup
16:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21513 and previous config saved to /var/cache/conftool/dbconfig/20220225-165828-ladsgroup.json
16:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300992)', diff saved to https://phabricator.wikimedia.org/P21512 and previous config saved to /var/cache/conftool/dbconfig/20220225-164323-ladsgroup.json
16:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300992)', diff saved to https://phabricator.wikimedia.org/P21511 and previous config saved to /var/cache/conftool/dbconfig/20220225-164020-ladsgroup.json
16:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
16:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
16:36 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3063.esams.wmnet with OS buster
16:35 vgutierrez: pool cp3063 running HAProxy as TLS termination layer - T290005 T271421
16:10 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3063.esams.wmnet with reason: host reimage
16:06 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3063.esams.wmnet with reason: host reimage
15:38 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp3063.esams.wmnet with OS buster
15:36 moritzm: imported PHP 7.4 7.4.28-1+0~20220217.59+debian10~1.gbp1950+wmf1+buster1 to component/php74 for buster-wikimedia T271736
15:25 vgutierrez: pool cp5005 running HAProxy as TLS termination layer - T290005 T271421
15:19 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5005.eqsin.wmnet with OS buster
14:35 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5005.eqsin.wmnet with reason: host reimage
14:32 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5005.eqsin.wmnet with reason: host reimage
14:13 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: fix wmf-puppet-dashboard routes (duration: 07m 47s)
14:05 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: fix wmf-puppet-dashboard routes
14:04 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp5005.eqsin.wmnet with OS buster
13:56 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: deploying wmf-proxy-dashboard and wmf-puppet-dashboard changes for real after fixing the scap config (duration: 04m 50s)
13:52 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: deploying wmf-proxy-dashboard and wmf-puppet-dashboard changes for real after fixing the scap config
off: restoring psql-all-dbs-20220225.sql.gz into netbox
13:30 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): debugging deployment process (duration: 00m 06s)
13:30 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): debugging deployment process
13:30 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): debugging deployment process
13:29 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): debugging deployment process (duration: 00m 05s)
13:29 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): debugging deployment process
12:46 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: updating wmf-proxy-dashboard on eqiad1 (duration: 02m 04s)
12:44 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: updating wmf-proxy-dashboard on eqiad1
12:39 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): updating wmf-proxy-dashboard (duration: 00m 37s)
12:39 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): updating wmf-proxy-dashboard
12:39 moritzm: drain instances off ganeti2007 T302577
12:38 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2040.codfw.wmnet with OS buster
12:32 vgutierrez: pool cp2040 running HAProxy as TLS termination layer - T290005 T271421
12:14 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2040.codfw.wmnet with reason: host reimage
12:11 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2040.codfw.wmnet with reason: host reimage
11:55 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2030.codfw.wmnet to ganeti01.svc.codfw.wmnet
11:53 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp2040.codfw.wmnet with OS buster
11:53 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2030.codfw.wmnet to ganeti01.svc.codfw.wmnet
11:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2029.codfw.wmnet to ganeti01.svc.codfw.wmnet
11:41 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4025.ulsfo.wmnet with OS buster
11:40 vgutierrez: pool cp4025 running HAProxy as TLS termination layer - T290005 T271421
11:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
11:20 XioNoX: re-activate BGP session to Seabone in esams
11:13 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4025.ulsfo.wmnet with reason: host reimage
11:10 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4025.ulsfo.wmnet with reason: host reimage
11:04 moritzm: added ganeti2029 to codfw Ganeti cluster T298998
10:54 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp4025.ulsfo.wmnet with OS buster
10:43 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to ganeti01.svc.codfw.wmnet
10:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet
10:41 moritzm: enabled virtualisation in BIOS for ganeti2029 T298998
10:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
10:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2029.codfw.wmnet with reason: Enable virtualisation in BIOS
10:27 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2029.codfw.wmnet with reason: Enable virtualisation in BIOS
10:22 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2029.codfw.wmnet to ganeti01.svc.codfw.wmnet
10:22 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2029.codfw.wmnet to ganeti01.svc.codfw.wmnet
10:17 vgutierrez: rolling upgrade to HAProxy 2.4.13 on HAProxy cache nodes - T290005
09:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2029.codfw.wmnet
09:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
02:43 cstone: Donation Interface revision changed from a6a9b63e to 4638c0ec

2022-02-24

23:35 ryankemper: T302526 Deployed https://gerrit.wikimedia.org/r/765652 and ran puppet across wcqs*
22:06 mutante: static-bugzilla.wikimedia.org - kubernetes - deployed gerrit:765572 - first prod service behind a k8s ingress (T290966)
22:05 mutante: phabricator - disabled git repo - labs-tools-harvesting-data-refinery/repository/master/
21:50 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2086.codfw.wmnet with OS bullseye
21:45 brennen: end of UTC late backport & config window
21:43 dancy@deploy1002: Started scap: testing scap container image building
21:43 tzatziki: removing 1 file for legal compliance
21:42 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2085.codfw.wmnet with OS bullseye
21:41 mutante: phabricator - disabled git repo "frig" - outdated fundraising stuff, checked with fr-tech, not needed T296022
21:40 brennen@deploy1002: Synchronized php-1.38.0-wmf.23/includes: Backport: Revert "Revert "Revert "Show message fallback keys when using &uselang=qqx""" (duration: 00m 57s)
21:39 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2086.codfw.wmnet with reason: host reimage
21:36 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2086.codfw.wmnet with reason: host reimage
21:34 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2085.codfw.wmnet with reason: host reimage
21:30 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2085.codfw.wmnet with reason: host reimage
21:29 brennen@deploy1002: Synchronized wmf-config/CirrusSearch-production.php: Config: cirrus: Reduce write isolation to only cloudelastic (T295705) (duration: 00m 55s)
21:27 mutante: phabricator - disabling git repo rGEDS (Elasticdash) - only one commit from 2015 - T296022
21:19 tzatziki: removing 1 file for legal compliance
21:19 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2086.codfw.wmnet with OS bullseye
21:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2083.codfw.wmnet with OS bullseye
21:13 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2085.codfw.wmnet with OS bullseye
21:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2084.codfw.wmnet with OS bullseye
21:07 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2083.codfw.wmnet with reason: host reimage
21:05 tzatziki: removing 4 files for legal compliance
21:04 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2083.codfw.wmnet with reason: host reimage
21:02 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: (no justification provided) (duration: 03m 18s)
21:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2084.codfw.wmnet with reason: host reimage
20:59 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2083.codfw.wmnet with OS bullseye
20:58 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: (no justification provided)
20:58 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2084.codfw.wmnet with reason: host reimage
20:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2084.codfw.wmnet with OS bullseye
20:14 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2084.codfw.wmnet with OS bullseye
20:10 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2083.codfw.wmnet with OS bullseye
20:04 ryankemper: T302526 `ryankemper@cumin1001:~$ sudo -E cumin -b 3 'wcqs*' 'enable-puppet "query_service: Simply jvm arg handling - T302526"; sudo run-puppet-agent'` in tmux `wcqs`
20:02 ryankemper: T302526 Depooled `wcqs1001`, ran puppet agent, and restarted `wcqs-blazegraph`. Service came up healthy, proceeding to rest of wcqs fleet
19:57 ryankemper: T302526 `ryankemper@cumin1001:~$ sudo -E cumin -b 6 'wdqs*' 'enable-puppet "query_service: Simply jvm arg handling - T302526"; sudo run-puppet-agent'` in tmux `deploy_window`
19:55 ryankemper: T302526 Depooled canary `wdqs1003`, ran puppet agent, and restarted `wdqs-blazegraph`. Tests look good, proceeding to rest of wdqs fleet
19:48 ryankemper: T302526 (Forgot to merge patch first, take two)
19:48 ryankemper: T302526 Running puppet on wdqs canary: `ryankemper@wdqs1003:~$ sudo enable-puppet "query_service: Simply jvm arg handling - T302526" && sudo run-puppet-agent`
19:46 ryankemper: T302526 Disabling puppet across entire query service (wdqs & wcqs) fleet for merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/761080: `ryankemper@cumin1001:~$ sudo -E cumin 'w*qs*' 'disable-puppet "query_service: Simply jvm arg handling - T302526"'`
19:06 dduvall@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.23 refs T300199
19:00 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2084.codfw.wmnet with OS bullseye
18:56 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2083.codfw.wmnet with OS bullseye
18:55 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host elastic2085.mgmt.codfw.wmnet with reboot policy FORCED
18:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2082.codfw.wmnet with OS bullseye
18:52 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2085.mgmt.codfw.wmnet with reboot policy FORCED
18:51 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host elastic2085.mgmt.codfw.wmnet with reboot policy FORCED
18:45 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2085.mgmt.codfw.wmnet with reboot policy FORCED
18:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2084.mgmt.codfw.wmnet with reboot policy FORCED
18:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2082.codfw.wmnet with reason: host reimage
18:39 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2082.codfw.wmnet with reason: host reimage
18:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2084.mgmt.codfw.wmnet with reboot policy FORCED
18:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2082.codfw.wmnet with OS bullseye
18:21 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300774)', diff saved to https://phabricator.wikimedia.org/P21508 and previous config saved to /var/cache/conftool/dbconfig/20220224-182102-kormat.json
18:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2083.mgmt.codfw.wmnet with reboot policy FORCED
18:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2081.codfw.wmnet with OS bullseye
18:05 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P21506 and previous config saved to /var/cache/conftool/dbconfig/20220224-180557-kormat.json
18:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2083.mgmt.codfw.wmnet with reboot policy FORCED
18:03 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2081.codfw.wmnet with reason: host reimage
18:02 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2082.mgmt.codfw.wmnet with reboot policy FORCED
18:02 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
18:01 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
18:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
18:00 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2081.codfw.wmnet with reason: host reimage
18:00 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
17:59 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
17:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
17:50 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P21504 and previous config saved to /var/cache/conftool/dbconfig/20220224-175052-kormat.json
17:46 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2082.mgmt.codfw.wmnet with reboot policy FORCED
17:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
17:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
17:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
17:44 ryankemper@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic[1039,1043].eqiad.wmnet
17:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
17:43 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2081.codfw.wmnet with OS bullseye
17:40 elukey: `truncate -s 1g /var/log/auth.log.1` on krb1001 to free space on the root partition
17:38 elukey: `truncate -s 1g /var/log/auth.log` on krb1001 to free space on the root partition
17:35 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300774)', diff saved to https://phabricator.wikimedia.org/P21503 and previous config saved to /var/cache/conftool/dbconfig/20220224-173548-kormat.json
17:33 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T300774)', diff saved to https://phabricator.wikimedia.org/P21502 and previous config saved to /var/cache/conftool/dbconfig/20220224-173307-kormat.json
17:33 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
17:33 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
17:33 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300774)', diff saved to https://phabricator.wikimedia.org/P21501 and previous config saved to /var/cache/conftool/dbconfig/20220224-173259-kormat.json
17:32 krinkle@deploy1002: Synchronized wmf-config/: Ia61fea (duration: 00m 52s)
17:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2080.codfw.wmnet with OS bullseye
17:22 ryankemper@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic[1039,1043].eqiad.wmnet
17:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2080.codfw.wmnet with reason: host reimage
17:17 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P21500 and previous config saved to /var/cache/conftool/dbconfig/20220224-171755-kormat.json
17:16 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2080.codfw.wmnet with reason: host reimage
17:11 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
17:11 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
17:11 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
17:11 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
17:02 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P21499 and previous config saved to /var/cache/conftool/dbconfig/20220224-170250-kormat.json
16:59 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2080.codfw.wmnet with OS bullseye
16:50 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2079.codfw.wmnet with OS bullseye
16:50 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
16:50 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
16:47 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300774)', diff saved to https://phabricator.wikimedia.org/P21498 and previous config saved to /var/cache/conftool/dbconfig/20220224-164745-kormat.json
16:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:45 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T300774)', diff saved to https://phabricator.wikimedia.org/P21497 and previous config saved to /var/cache/conftool/dbconfig/20220224-164506-kormat.json
16:45 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
16:45 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
16:45 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300774)', diff saved to https://phabricator.wikimedia.org/P21496 and previous config saved to /var/cache/conftool/dbconfig/20220224-164458-kormat.json
16:44 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
16:41 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:40 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2079.codfw.wmnet with reason: host reimage
16:38 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
16:37 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2079.codfw.wmnet with reason: host reimage
16:34 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
16:29 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P21495 and previous config saved to /var/cache/conftool/dbconfig/20220224-162953-kormat.json
16:27 jbond: deploy new firmware fact
16:26 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:24 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply
16:21 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
16:19 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2079.codfw.wmnet with OS bullseye
16:15 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
16:15 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
16:14 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P21494 and previous config saved to /var/cache/conftool/dbconfig/20220224-161449-kormat.json
16:14 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
16:14 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
16:04 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:59 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300774)', diff saved to https://phabricator.wikimedia.org/P21493 and previous config saved to /var/cache/conftool/dbconfig/20220224-155944-kormat.json
15:57 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T300774)', diff saved to https://phabricator.wikimedia.org/P21492 and previous config saved to /var/cache/conftool/dbconfig/20220224-155708-kormat.json
15:57 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
15:57 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
15:57 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
15:56 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
15:56 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
15:56 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
15:56 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
15:56 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
15:55 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
15:55 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
15:55 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300774)', diff saved to https://phabricator.wikimedia.org/P21491 and previous config saved to /var/cache/conftool/dbconfig/20220224-155521-kormat.json
15:54 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
15:52 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
15:52 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
15:47 moritzm: restarting apache on otrs1001/ticket.wikimedia.org
15:44 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
15:42 moritzm: restarting apache on people.w.o, planet.w.o, releases* to pick up expat update
15:42 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
15:40 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P21490 and previous config saved to /var/cache/conftool/dbconfig/20220224-154016-kormat.json
15:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:37 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
15:36 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase1032.eqiad.wmnet with OS buster
15:25 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P21489 and previous config saved to /var/cache/conftool/dbconfig/20220224-152512-kormat.json
15:10 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300774)', diff saved to https://phabricator.wikimedia.org/P21488 and previous config saved to /var/cache/conftool/dbconfig/20220224-151007-kormat.json
15:09 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1032.eqiad.wmnet with OS buster
15:05 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T300774)', diff saved to https://phabricator.wikimedia.org/P21487 and previous config saved to /var/cache/conftool/dbconfig/20220224-150527-kormat.json
15:05 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
15:05 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
15:05 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300774)', diff saved to https://phabricator.wikimedia.org/P21486 and previous config saved to /var/cache/conftool/dbconfig/20220224-150520-kormat.json
14:50 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P21484 and previous config saved to /var/cache/conftool/dbconfig/20220224-145015-kormat.json
14:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300992)', diff saved to https://phabricator.wikimedia.org/P21483 and previous config saved to /var/cache/conftool/dbconfig/20220224-144511-ladsgroup.json
14:35 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P21482 and previous config saved to /var/cache/conftool/dbconfig/20220224-143509-kormat.json
14:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P21481 and previous config saved to /var/cache/conftool/dbconfig/20220224-143005-ladsgroup.json
14:20 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300774)', diff saved to https://phabricator.wikimedia.org/P21480 and previous config saved to /var/cache/conftool/dbconfig/20220224-142004-kormat.json
14:19 XioNoX: Prepend AS to anycast prefixes learned on the core routers - T302315
14:17 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T300774)', diff saved to https://phabricator.wikimedia.org/P21479 and previous config saved to /var/cache/conftool/dbconfig/20220224-141724-kormat.json
14:17 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
14:17 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
14:17 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300774)', diff saved to https://phabricator.wikimedia.org/P21478 and previous config saved to /var/cache/conftool/dbconfig/20220224-141717-kormat.json
14:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P21477 and previous config saved to /var/cache/conftool/dbconfig/20220224-141501-ladsgroup.json
14:02 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P21476 and previous config saved to /var/cache/conftool/dbconfig/20220224-140212-kormat.json
14:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2121.codfw.wmnet with OS bullseye
14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300992)', diff saved to https://phabricator.wikimedia.org/P21475 and previous config saved to /var/cache/conftool/dbconfig/20220224-135955-ladsgroup.json
13:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300992)', diff saved to https://phabricator.wikimedia.org/P21474 and previous config saved to /var/cache/conftool/dbconfig/20220224-135819-ladsgroup.json
13:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
13:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
13:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300992)', diff saved to https://phabricator.wikimedia.org/P21473 and previous config saved to /var/cache/conftool/dbconfig/20220224-135811-ladsgroup.json
13:47 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P21472 and previous config saved to /var/cache/conftool/dbconfig/20220224-134707-kormat.json
13:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2121.codfw.wmnet with reason: host reimage
13:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P21471 and previous config saved to /var/cache/conftool/dbconfig/20220224-134307-ladsgroup.json
13:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2121.codfw.wmnet with reason: host reimage
13:32 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300774)', diff saved to https://phabricator.wikimedia.org/P21470 and previous config saved to /var/cache/conftool/dbconfig/20220224-133202-kormat.json
13:29 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T300774)', diff saved to https://phabricator.wikimedia.org/P21469 and previous config saved to /var/cache/conftool/dbconfig/20220224-132923-kormat.json
13:29 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
13:29 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
13:29 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300774)', diff saved to https://phabricator.wikimedia.org/P21468 and previous config saved to /var/cache/conftool/dbconfig/20220224-132915-kormat.json
13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P21467 and previous config saved to /var/cache/conftool/dbconfig/20220224-132802-ladsgroup.json
13:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2121.codfw.wmnet with OS bullseye
13:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 10 hosts with reason: Maintenance
13:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 10 hosts with reason: Maintenance
13:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
13:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2121.codfw.wmnet with reason: Maintenance
13:23 Amir1: dbmaint on s7@codfw (T302363)
13:14 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P21466 and previous config saved to /var/cache/conftool/dbconfig/20220224-131410-kormat.json
13:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300992)', diff saved to https://phabricator.wikimedia.org/P21465 and previous config saved to /var/cache/conftool/dbconfig/20220224-131257-ladsgroup.json
13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300992)', diff saved to https://phabricator.wikimedia.org/P21464 and previous config saved to /var/cache/conftool/dbconfig/20220224-131041-ladsgroup.json
13:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
13:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300992)', diff saved to https://phabricator.wikimedia.org/P21463 and previous config saved to /var/cache/conftool/dbconfig/20220224-131033-ladsgroup.json
13:02 moritzm: restarting apache/uwsgi-puppetboard on puppetboard* to pick up expat security updates
12:59 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P21462 and previous config saved to /var/cache/conftool/dbconfig/20220224-125905-kormat.json
12:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P21461 and previous config saved to /var/cache/conftool/dbconfig/20220224-125528-ladsgroup.json
12:44 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300774)', diff saved to https://phabricator.wikimedia.org/P21460 and previous config saved to /var/cache/conftool/dbconfig/20220224-124401-kormat.json
12:41 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T300774)', diff saved to https://phabricator.wikimedia.org/P21459 and previous config saved to /var/cache/conftool/dbconfig/20220224-124122-kormat.json
12:41 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
12:41 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
12:40 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
12:40 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
12:40 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300774)', diff saved to https://phabricator.wikimedia.org/P21458 and previous config saved to /var/cache/conftool/dbconfig/20220224-124036-kormat.json
12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P21457 and previous config saved to /var/cache/conftool/dbconfig/20220224-124024-ladsgroup.json
12:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2079.codfw.wmnet with OS bullseye
12:25 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P21456 and previous config saved to /var/cache/conftool/dbconfig/20220224-122532-kormat.json
12:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300992)', diff saved to https://phabricator.wikimedia.org/P21455 and previous config saved to /var/cache/conftool/dbconfig/20220224-122519-ladsgroup.json
12:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2079.codfw.wmnet with reason: host reimage
12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 (T300992)', diff saved to https://phabricator.wikimedia.org/P21454 and previous config saved to /var/cache/conftool/dbconfig/20220224-122232-ladsgroup.json
12:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
12:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300992)', diff saved to https://phabricator.wikimedia.org/P21453 and previous config saved to /var/cache/conftool/dbconfig/20220224-122224-ladsgroup.json
12:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2079.codfw.wmnet with reason: host reimage
12:11 Amir1: dbmaint on s8@codfw (T302185)
12:10 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P21452 and previous config saved to /var/cache/conftool/dbconfig/20220224-121027-kormat.json
12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P21451 and previous config saved to /var/cache/conftool/dbconfig/20220224-120720-ladsgroup.json
12:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2079.codfw.wmnet with OS bullseye
12:04 arturo: aborrero@apt1001:~$ sudo -i reprepro --component thirdparty/openstack-db update bullseye-wikimedia (T302482)
12:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 12 hosts with reason: Maintenance
12:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 12 hosts with reason: Maintenance
12:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2079.codfw.wmnet with reason: Maintenance
12:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2079.codfw.wmnet with reason: Maintenance
11:55 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300774)', diff saved to https://phabricator.wikimedia.org/P21450 and previous config saved to /var/cache/conftool/dbconfig/20220224-115522-kormat.json
11:52 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T300774)', diff saved to https://phabricator.wikimedia.org/P21449 and previous config saved to /var/cache/conftool/dbconfig/20220224-115246-kormat.json
11:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
11:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P21448 and previous config saved to /var/cache/conftool/dbconfig/20220224-115215-ladsgroup.json
11:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
11:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
11:52 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300774)', diff saved to https://phabricator.wikimedia.org/P21447 and previous config saved to /var/cache/conftool/dbconfig/20220224-115159-kormat.json
11:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300992)', diff saved to https://phabricator.wikimedia.org/P21446 and previous config saved to /var/cache/conftool/dbconfig/20220224-113710-ladsgroup.json
11:36 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P21445 and previous config saved to /var/cache/conftool/dbconfig/20220224-113654-kormat.json
11:35 kart_: Updated cxserver to 2022-02-24-035645-production (T301443, T301952)
11:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300992)', diff saved to https://phabricator.wikimedia.org/P21444 and previous config saved to /var/cache/conftool/dbconfig/20220224-113453-ladsgroup.json
11:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
11:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
11:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
11:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
11:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300992)', diff saved to https://phabricator.wikimedia.org/P21443 and previous config saved to /var/cache/conftool/dbconfig/20220224-113439-ladsgroup.json
11:34 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
11:31 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
11:28 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
11:25 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
11:23 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
11:22 moritzm: rolling restart of thanos frontend swift-proxy/apache to pick up expat security updates
11:22 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
11:21 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P21442 and previous config saved to /var/cache/conftool/dbconfig/20220224-112149-kormat.json
11:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P21441 and previous config saved to /var/cache/conftool/dbconfig/20220224-111935-ladsgroup.json
11:07 hashar: Updated Jenkins job operations-puppet-tests-buster-docker https://gerrit.wikimedia.org/r/c/integration/config/+/765487
11:06 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300774)', diff saved to https://phabricator.wikimedia.org/P21440 and previous config saved to /var/cache/conftool/dbconfig/20220224-110645-kormat.json
11:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P21439 and previous config saved to /var/cache/conftool/dbconfig/20220224-110430-ladsgroup.json
11:04 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300774)', diff saved to https://phabricator.wikimedia.org/P21438 and previous config saved to /var/cache/conftool/dbconfig/20220224-110403-kormat.json
11:04 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
11:04 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
11:03 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300774)', diff saved to https://phabricator.wikimedia.org/P21437 and previous config saved to /var/cache/conftool/dbconfig/20220224-110355-kormat.json
11:03 moritzm: restarting apache/carbon-cache on graphite nodes to pick up expat update
10:54 aqu@deploy1002: Finished deploy [airflow-dags/analytics@97759bf]: Set aqs/hourly start date (duration: 00m 06s)
10:54 aqu@deploy1002: Started deploy [airflow-dags/analytics@97759bf]: Set aqs/hourly start date
10:52 moritzm: restarting apache on main prometheus nodes to pick up expat update
10:49 mmandere: enable-puppet on cp instances after successfully testing varnish package component change - T302301
10:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300992)', diff saved to https://phabricator.wikimedia.org/P21436 and previous config saved to /var/cache/conftool/dbconfig/20220224-104925-ladsgroup.json
10:48 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21435 and previous config saved to /var/cache/conftool/dbconfig/20220224-104851-kormat.json
10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300992)', diff saved to https://phabricator.wikimedia.org/P21434 and previous config saved to /var/cache/conftool/dbconfig/20220224-104708-ladsgroup.json
10:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
10:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300992)', diff saved to https://phabricator.wikimedia.org/P21433 and previous config saved to /var/cache/conftool/dbconfig/20220224-104700-ladsgroup.json
10:33 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21432 and previous config saved to /var/cache/conftool/dbconfig/20220224-103346-kormat.json
10:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P21431 and previous config saved to /var/cache/conftool/dbconfig/20220224-103156-ladsgroup.json
10:18 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300774)', diff saved to https://phabricator.wikimedia.org/P21430 and previous config saved to /var/cache/conftool/dbconfig/20220224-101841-kormat.json
10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P21429 and previous config saved to /var/cache/conftool/dbconfig/20220224-101651-ladsgroup.json
10:16 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T300774)', diff saved to https://phabricator.wikimedia.org/P21428 and previous config saved to /var/cache/conftool/dbconfig/20220224-101559-kormat.json
10:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
10:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
10:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
10:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
10:15 kormat: deploying schema change to s1 T300774
10:13 mmandere: depool cp4028.ulsfo.wmnet - T302301
10:02 moritzm: restarting Apache on edge Prometheus nodes to pick up expat update
10:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300992)', diff saved to https://phabricator.wikimedia.org/P21427 and previous config saved to /var/cache/conftool/dbconfig/20220224-100147-ladsgroup.json
09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300992)', diff saved to https://phabricator.wikimedia.org/P21426 and previous config saved to /var/cache/conftool/dbconfig/20220224-095912-ladsgroup.json
09:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
09:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300992)', diff saved to https://phabricator.wikimedia.org/P21425 and previous config saved to /var/cache/conftool/dbconfig/20220224-095904-ladsgroup.json
09:58 aqu@deploy1002: Finished deploy [airflow-dags/analytics@d28cd92]: Fix aqs/hourly in production by adding memory to driver (duration: 00m 06s)
09:58 aqu@deploy1002: Started deploy [airflow-dags/analytics@d28cd92]: Fix aqs/hourly in production by adding memory to driver
09:58 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@d28cd92]: Fix aqs/hourly in production by adding memory to driver (duration: 00m 09s)
09:58 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@d28cd92]: Fix aqs/hourly in production by adding memory to driver
09:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P21424 and previous config saved to /var/cache/conftool/dbconfig/20220224-094400-ladsgroup.json
09:32 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2067.codfw.wmnet
09:31 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
09:31 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2066.codfw.wmnet
09:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P21423 and previous config saved to /var/cache/conftool/dbconfig/20220224-092855-ladsgroup.json
09:25 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
09:25 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be2067.codfw.wmnet
09:24 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
09:24 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@17a70a0]: (no justification provided) (duration: 00m 08s)
09:24 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
09:24 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@17a70a0]: (no justification provided)
09:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
09:17 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be2068.codfw.wmnet
09:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
09:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
09:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
09:15 urbanecm: Morning B&C window is done
09:14 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/GrowthExperiments/modules/ext.growthExperiments.StructuredTask/StructuredTaskArticleTarget.js: Backport: Structured task: Don't show dialog for confirming leaving suggestions mode upon rejection (T302463) (duration: 00m 50s)
09:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300992)', diff saved to https://phabricator.wikimedia.org/P21422 and previous config saved to /var/cache/conftool/dbconfig/20220224-091350-ladsgroup.json
09:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300992)', diff saved to https://phabricator.wikimedia.org/P21421 and previous config saved to /var/cache/conftool/dbconfig/20220224-091132-ladsgroup.json
09:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
09:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
09:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
09:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
09:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
09:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
09:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
09:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
09:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
09:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
09:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
09:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
09:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
09:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
09:01 urbanecm: Morning B&C window is overrunning
08:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
08:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
08:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
08:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
08:16 moritzm: installing expat security updates
06:36 tgr_: T301030#7734236 running UpdateWeightedTags.php on eswiki
02:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2078.codfw.wmnet with OS bullseye
02:00 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2078.codfw.wmnet with reason: host reimage
01:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2077.codfw.wmnet with OS bullseye
01:57 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2078.codfw.wmnet with reason: host reimage
01:48 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2077.codfw.wmnet with reason: host reimage
01:44 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2077.codfw.wmnet with reason: host reimage
01:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2078.codfw.wmnet with OS bullseye
01:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2076.codfw.wmnet with OS bullseye
01:28 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2076.codfw.wmnet with reason: host reimage
01:27 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2077.codfw.wmnet with OS bullseye
01:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2075.codfw.wmnet with OS bullseye
01:25 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2076.codfw.wmnet with reason: host reimage
01:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2075.codfw.wmnet with reason: host reimage
01:10 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2075.codfw.wmnet with reason: host reimage
01:08 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2076.codfw.wmnet with OS bullseye
01:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic2075.codfw.wmnet with OS bullseye
01:01 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2075.codfw.wmnet with OS bullseye
00:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2074.codfw.wmnet with OS bullseye
00:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2075.codfw.wmnet with OS bullseye
00:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2073.codfw.wmnet with OS bullseye
00:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2074.codfw.wmnet with reason: host reimage
00:45 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2074.codfw.wmnet with reason: host reimage
00:41 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2073.codfw.wmnet with reason: host reimage
00:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2073.codfw.wmnet with reason: host reimage
00:28 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2074.codfw.wmnet with OS bullseye
00:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2079.mgmt.codfw.wmnet with reboot policy FORCED
00:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host elastic2073.codfw.wmnet with OS bullseye
00:06 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2079.mgmt.codfw.wmnet with reboot policy FORCED

2022-02-23

23:39 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2069.codfw.wmnet with OS stretch
23:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
23:09 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2069.codfw.wmnet with reason: host reimage
22:58 mutante: phabricator - disabled empty but active repo: wikidata-query-LDFServer (WQLD) created in 2018 by qchris (T296022)
22:51 mutante: phabricator - disabled empty but active repos: dibyaduttabook and xtools-H (T296022)
22:50 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS stretch
22:37 mutante: phabricator - disabling repository dibyaduttabook
22:09 reedy@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/SecurePoll/cli/wm-scripts/ucoc/: (no justification provided) (duration: 00m 50s)
22:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
22:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
22:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
22:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
21:17 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: bird6 errors expected, not serving any traffic
21:17 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: bird6 errors expected, not serving any traffic
21:11 dduvall@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.23 refs T300199 (duration: 01m 31s)
21:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
21:10 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.23 refs T300199
21:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
21:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
21:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
21:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
21:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
21:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
20:44 taavi: run CentralAuthUser::importLocalNames for FuzzyBot T302399
19:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T302363)', diff saved to https://phabricator.wikimedia.org/P21414 and previous config saved to /var/cache/conftool/dbconfig/20220223-194254-ladsgroup.json
19:35 dancy@deploy1002: scap failed: CalledProcessError Command 'sudo -u mwbuilder /usr/bin/make -C /srv/mwbuilder/release/make-container-image -f Makefile build-and-push-all-images GIT_BASE=https://gerrit.wikimedia.org/r/ BRANCH=master workdir_volume=/srv/mediawiki-staging mv_image_name=docker-registry.discovery.wmnet/restricted/mediawiki-multiversion webserver_image_name=docker-registry.discovery.wmnet/restricted/mediawik
19:35 dancy@deploy1002: Started scap: testing scap container image building
19:33 dancy@deploy1002: scap failed: CalledProcessError Command 'make -f Makefile build-and-push-all-images GIT_BASE=https://gerrit.wikimedia.org/r/ BRANCH=master workdir_volume=/srv/mediawiki-staging mv_image_name=docker-registry.discovery.wmnet/restricted/mediawiki-multiversion webserver_image_name=docker-registry.discovery.wmnet/restricted/mediawiki-webserver' returned non-zero exit status 2. (duration: 00m 03s)
19:33 dancy@deploy1002: Started scap: testing scap container image building
19:32 dancy@deploy1002: Started scap: testing scap container image building
19:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P21413 and previous config saved to /var/cache/conftool/dbconfig/20220223-192749-ladsgroup.json
19:27 dancy@deploy1002: scap failed: CalledProcessError Command 'make -f Makefile build-and-push-all-images GIT_BASE=https://gerrit.wikimedia.org/r/ BRANCH=master workdir_volume=/srv/mediawiki-staging mv_image_name=docker-registry.discovery.wmnet/restricted/mediawiki-multiversion webserver_image_name=docker-registry.discovery.wmnet/restricted/mediawiki-webserver' returned non-zero exit status 2. (duration: 00m 51s)
19:26 dancy@deploy1002: Started scap: testing
19:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P21411 and previous config saved to /var/cache/conftool/dbconfig/20220223-191245-ladsgroup.json
18:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T302363)', diff saved to https://phabricator.wikimedia.org/P21410 and previous config saved to /var/cache/conftool/dbconfig/20220223-185740-ladsgroup.json
18:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1158.eqiad.wmnet with OS bullseye
18:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
18:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1158.eqiad.wmnet with reason: host reimage
18:23 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2069.codfw.wmnet with OS stretch
18:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1158.eqiad.wmnet with OS bullseye
18:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T302363)', diff saved to https://phabricator.wikimedia.org/P21409 and previous config saved to /var/cache/conftool/dbconfig/20220223-181350-ladsgroup.json
18:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
18:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
18:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
18:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1158.eqiad.wmnet with reason: Maintenance
18:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T302363)', diff saved to https://phabricator.wikimedia.org/P21408 and previous config saved to /var/cache/conftool/dbconfig/20220223-180722-ladsgroup.json
17:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2069.codfw.wmnet with OS stretch
17:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21407 and previous config saved to /var/cache/conftool/dbconfig/20220223-175217-ladsgroup.json
17:46 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@17a70a0]: (no justification provided) (duration: 00m 07s)
17:46 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@17a70a0]: (no justification provided)
17:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2066.codfw.wmnet with OS stretch
17:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21406 and previous config saved to /var/cache/conftool/dbconfig/20220223-173711-ladsgroup.json
17:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
17:23 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T302363)', diff saved to https://phabricator.wikimedia.org/P21404 and previous config saved to /var/cache/conftool/dbconfig/20220223-172206-ladsgroup.json
17:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2066.codfw.wmnet with OS stretch
17:14 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2066.codfw.wmnet with OS stretch
17:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1127.eqiad.wmnet with OS bullseye
17:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1127.eqiad.wmnet with reason: host reimage
16:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1127.eqiad.wmnet with reason: host reimage
16:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1127.eqiad.wmnet with OS bullseye
16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T302363)', diff saved to https://phabricator.wikimedia.org/P21403 and previous config saved to /var/cache/conftool/dbconfig/20220223-164453-ladsgroup.json
16:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
16:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
16:42 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2066.codfw.wmnet with OS stretch
16:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2068.codfw.wmnet with OS stretch
16:21 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300774)', diff saved to https://phabricator.wikimedia.org/P21401 and previous config saved to /var/cache/conftool/dbconfig/20220223-162125-kormat.json
16:06 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P21400 and previous config saved to /var/cache/conftool/dbconfig/20220223-160621-kormat.json
16:00 vgutierrez: vgutierrez@apt1001:~$ sudo -i reprepro --component thirdparty/haproxy24 update buster-wikimedia - T290005
15:55 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
15:52 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
15:51 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P21399 and previous config saved to /var/cache/conftool/dbconfig/20220223-155116-kormat.json
15:36 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS stretch
15:36 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300774)', diff saved to https://phabricator.wikimedia.org/P21398 and previous config saved to /var/cache/conftool/dbconfig/20220223-153611-kormat.json
15:30 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T300774)', diff saved to https://phabricator.wikimedia.org/P21397 and previous config saved to /var/cache/conftool/dbconfig/20220223-153044-kormat.json
15:30 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
15:30 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
15:26 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2068.codfw.wmnet with OS stretch
15:19 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase1032.eqiad.wmnet with OS buster
15:17 moritzm: rolling restart of FPM and Apache on mediawiki canaries to pick up expat security updates
15:12 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
15:12 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
15:12 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
15:12 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
15:12 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1033.eqiad.wmnet with OS buster
15:12 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300774)', diff saved to https://phabricator.wikimedia.org/P21396 and previous config saved to /var/cache/conftool/dbconfig/20220223-151207-kormat.json
15:07 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
15:04 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
15:03 moritzm: installing expat security updates
14:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1033.eqiad.wmnet with reason: host reimage
14:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1031.eqiad.wmnet with OS buster
14:57 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P21395 and previous config saved to /var/cache/conftool/dbconfig/20220223-145703-kormat.json
14:56 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1033.eqiad.wmnet with reason: host reimage
14:48 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS stretch
14:48 papaul: power down ms-be2068 for re-image
14:42 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1031.eqiad.wmnet with reason: host reimage
14:41 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P21394 and previous config saved to /var/cache/conftool/dbconfig/20220223-144158-kormat.json
14:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1033.eqiad.wmnet with OS buster
14:39 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1032.eqiad.wmnet with OS buster
14:39 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1031.eqiad.wmnet with reason: host reimage
14:36 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-wikikube
14:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
14:26 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300774)', diff saved to https://phabricator.wikimedia.org/P21393 and previous config saved to /var/cache/conftool/dbconfig/20220223-142652-kormat.json
14:26 mmandere: import varnishkafka_1.1.0-1_amd64.deb, varnishkafka_1.1.0-1.dsc, varnishkafka-dbg_1.1.0-1_amd64.deb to main component - T302301
14:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T302363)', diff saved to https://phabricator.wikimedia.org/P21392 and previous config saved to /var/cache/conftool/dbconfig/20220223-142413-ladsgroup.json
14:21 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300774)', diff saved to https://phabricator.wikimedia.org/P21391 and previous config saved to /var/cache/conftool/dbconfig/20220223-142121-kormat.json
14:21 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
14:21 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
14:21 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300774)', diff saved to https://phabricator.wikimedia.org/P21390 and previous config saved to /var/cache/conftool/dbconfig/20220223-142113-kormat.json
14:19 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1031.eqiad.wmnet with OS buster
14:18 mmandere: import varnish-modules_0.15.0-1+wmf1.dsc, varnish-modules-dbgsym_0.15.0-1+wmf1_amd64.deb, varnish-modules_0.15.0-1+wmf1_amd64.deb to main component - T302301
14:18 taavi: UTC afternoon deploys done
14:17 taavi: deploy second patch for T302248
14:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
14:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
14:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
14:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
14:12 jayme: restarting pybal on lvs1019,lvs2009 - T290966
14:11 mmandere: import libvarnishapi2_6.0.10-1wm1_amd64.deb, libvarnishapi2-dbgsym_6.0.10-1wm1_amd64.deb, libvarnishapi-dev_6.0.10-1wm1_amd64.deb to main component - T302301
14:11 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/DiscussionTools/includes/Hooks/HookUtils.php: 78f0d9d: Fix check for enabling features on mobile (T302388) (duration: 00m 49s)
14:10 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.22/extensions/DiscussionTools/includes/Hooks/HookUtils.php: 815b3d1: Fix check for enabling features on mobile (T302388) (duration: 00m 50s)
14:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P21389 and previous config saved to /var/cache/conftool/dbconfig/20220223-140908-ladsgroup.json
14:08 jayme: restarting pybal on lvs1020,lvs2010 - T290966
14:06 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P21388 and previous config saved to /var/cache/conftool/dbconfig/20220223-140608-kormat.json
14:05 mmandere: import varnish_6.0.10-1wm1.dsc, varnish_6.0.10-1wm1_amd64.deb, varnish-dbg_6.0.6-1wm1_amd64.deb, varnish-dbgsym_6.0.10-1wm1_amd64.deb, varnish-doc_6.0.10-1wm1_all.deb to main component - T302301
13:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
13:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
13:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
13:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
13:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P21387 and previous config saved to /var/cache/conftool/dbconfig/20220223-135404-ladsgroup.json
13:52 mmandere: import libvmod-re2_1.5.3-1.dsc and libvmod-re2_1.5.3-1_amd64.deb to main component - T302301
13:51 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P21386 and previous config saved to /var/cache/conftool/dbconfig/20220223-135103-kormat.json
13:46 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
13:45 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
13:45 Lucas_WMDE: Deployed patch for T302192
13:41 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
13:41 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
13:39 mmandere: import libvmod-netmapper_1.9-1.dsc and libvmod-netmapper_1.9-1_amd64.deb to main component - T302301
13:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T302363)', diff saved to https://phabricator.wikimedia.org/P21385 and previous config saved to /var/cache/conftool/dbconfig/20220223-133858-ladsgroup.json
13:38 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
13:37 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
13:36 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300774)', diff saved to https://phabricator.wikimedia.org/P21384 and previous config saved to /var/cache/conftool/dbconfig/20220223-133559-kormat.json
13:30 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300774)', diff saved to https://phabricator.wikimedia.org/P21383 and previous config saved to /var/cache/conftool/dbconfig/20220223-133031-kormat.json
13:30 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
13:30 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
13:30 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
13:30 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
13:23 Krinkle: debugging on mwdebug1002
13:19 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
13:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
13:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
13:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300992)', diff saved to https://phabricator.wikimedia.org/P21381 and previous config saved to /var/cache/conftool/dbconfig/20220223-131801-ladsgroup.json
13:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
13:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
13:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300774)', diff saved to https://phabricator.wikimedia.org/P21380 and previous config saved to /var/cache/conftool/dbconfig/20220223-131531-kormat.json
13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1174.eqiad.wmnet with OS bullseye
13:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21379 and previous config saved to /var/cache/conftool/dbconfig/20220223-130255-ladsgroup.json
13:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P21378 and previous config saved to /var/cache/conftool/dbconfig/20220223-130026-kormat.json
12:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1174.eqiad.wmnet with reason: host reimage
12:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1174.eqiad.wmnet with reason: host reimage
12:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21377 and previous config saved to /var/cache/conftool/dbconfig/20220223-124751-ladsgroup.json
12:45 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P21376 and previous config saved to /var/cache/conftool/dbconfig/20220223-124521-kormat.json
12:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1174.eqiad.wmnet with OS bullseye
12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T302363)', diff saved to https://phabricator.wikimedia.org/P21375 and previous config saved to /var/cache/conftool/dbconfig/20220223-124027-ladsgroup.json
12:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
12:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance
12:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T302363)', diff saved to https://phabricator.wikimedia.org/P21374 and previous config saved to /var/cache/conftool/dbconfig/20220223-123747-ladsgroup.json
12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300992)', diff saved to https://phabricator.wikimedia.org/P21373 and previous config saved to /var/cache/conftool/dbconfig/20220223-123246-ladsgroup.json
12:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300774)', diff saved to https://phabricator.wikimedia.org/P21372 and previous config saved to /var/cache/conftool/dbconfig/20220223-123017-kormat.json
12:26 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
12:25 jayme@deploy1002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
12:24 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T300774)', diff saved to https://phabricator.wikimedia.org/P21370 and previous config saved to /var/cache/conftool/dbconfig/20220223-122449-kormat.json
12:24 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
12:24 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P21369 and previous config saved to /var/cache/conftool/dbconfig/20220223-122242-ladsgroup.json
12:10 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
12:10 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
12:10 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300774)', diff saved to https://phabricator.wikimedia.org/P21368 and previous config saved to /var/cache/conftool/dbconfig/20220223-121036-kormat.json
12:08 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
12:07 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P21367 and previous config saved to /var/cache/conftool/dbconfig/20220223-120738-ladsgroup.json
12:07 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
12:07 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
12:04 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
12:02 jayme@deploy1002: helmfile [codfw] START helmfile.d/services/miscweb: apply
11:55 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P21366 and previous config saved to /var/cache/conftool/dbconfig/20220223-115531-kormat.json
11:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T302363)', diff saved to https://phabricator.wikimedia.org/P21365 and previous config saved to /var/cache/conftool/dbconfig/20220223-115233-ladsgroup.json
11:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1181.eqiad.wmnet with OS bullseye
11:44 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
11:42 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'sync'.
11:42 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
11:42 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
11:40 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P21364 and previous config saved to /var/cache/conftool/dbconfig/20220223-114026-kormat.json
11:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T300992)', diff saved to https://phabricator.wikimedia.org/P21363 and previous config saved to /var/cache/conftool/dbconfig/20220223-113226-ladsgroup.json
11:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
11:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300992)', diff saved to https://phabricator.wikimedia.org/P21362 and previous config saved to /var/cache/conftool/dbconfig/20220223-113219-ladsgroup.json
11:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1181.eqiad.wmnet with reason: host reimage
11:28 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
11:28 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
11:28 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
11:28 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
11:25 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300774)', diff saved to https://phabricator.wikimedia.org/P21361 and previous config saved to /var/cache/conftool/dbconfig/20220223-112522-kormat.json
11:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1181.eqiad.wmnet with OS bullseye
11:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21360 and previous config saved to /var/cache/conftool/dbconfig/20220223-111714-ladsgroup.json
11:12 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
11:09 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
11:09 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
11:06 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
11:06 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
11:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T302363)', diff saved to https://phabricator.wikimedia.org/P21359 and previous config saved to /var/cache/conftool/dbconfig/20220223-110540-ladsgroup.json
11:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
11:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1181.eqiad.wmnet with reason: Maintenance
11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21358 and previous config saved to /var/cache/conftool/dbconfig/20220223-110209-ladsgroup.json
10:49 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300992)', diff saved to https://phabricator.wikimedia.org/P21357 and previous config saved to /var/cache/conftool/dbconfig/20220223-104704-ladsgroup.json
10:46 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
10:46 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T300774)', diff saved to https://phabricator.wikimedia.org/P21356 and previous config saved to /var/cache/conftool/dbconfig/20220223-104644-kormat.json
10:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
10:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
10:45 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
10:38 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
10:32 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
10:32 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
10:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300992)', diff saved to https://phabricator.wikimedia.org/P21355 and previous config saved to /var/cache/conftool/dbconfig/20220223-103204-ladsgroup.json
10:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
10:32 kormat: running schema change against s3 T300774
10:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
10:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
10:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
10:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300992)', diff saved to https://phabricator.wikimedia.org/P21354 and previous config saved to /var/cache/conftool/dbconfig/20220223-102919-ladsgroup.json
10:14 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P21353 and previous config saved to /var/cache/conftool/dbconfig/20220223-101414-ladsgroup.json
10:11 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P21352 and previous config saved to /var/cache/conftool/dbconfig/20220223-095909-ladsgroup.json
09:49 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
09:49 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
09:49 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
09:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2077 (T302363)', diff saved to https://phabricator.wikimedia.org/P21351 and previous config saved to /var/cache/conftool/dbconfig/20220223-094655-ladsgroup.json
09:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300992)', diff saved to https://phabricator.wikimedia.org/P21350 and previous config saved to /var/cache/conftool/dbconfig/20220223-094405-ladsgroup.json
09:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T300992)', diff saved to https://phabricator.wikimedia.org/P21349 and previous config saved to /var/cache/conftool/dbconfig/20220223-093933-ladsgroup.json
09:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
09:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
09:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300992)', diff saved to https://phabricator.wikimedia.org/P21348 and previous config saved to /var/cache/conftool/dbconfig/20220223-093925-ladsgroup.json
09:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2077.codfw.wmnet with OS bullseye
09:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2077.codfw.wmnet with reason: host reimage
09:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21347 and previous config saved to /var/cache/conftool/dbconfig/20220223-092421-ladsgroup.json
09:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2077.codfw.wmnet with reason: host reimage
09:14 dcausse: restarting blazegraph on wdqs1007 (JVM stuck for 11 hours)
09:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21346 and previous config saved to /var/cache/conftool/dbconfig/20220223-090916-ladsgroup.json
09:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2077.codfw.wmnet with OS bullseye
09:02 godog: bounce prometheus-statsd-exporter on C:prometheus::statsd_exporter - T302372
09:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2077 (T302363)', diff saved to https://phabricator.wikimedia.org/P21345 and previous config saved to /var/cache/conftool/dbconfig/20220223-090109-ladsgroup.json
09:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
09:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
09:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2077.codfw.wmnet with reason: Maintenance
09:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2077.codfw.wmnet with reason: Maintenance
08:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T302363)', diff saved to https://phabricator.wikimedia.org/P21343 and previous config saved to /var/cache/conftool/dbconfig/20220223-085755-ladsgroup.json
08:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300992)', diff saved to https://phabricator.wikimedia.org/P21342 and previous config saved to /var/cache/conftool/dbconfig/20220223-085411-ladsgroup.json
08:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300992)', diff saved to https://phabricator.wikimedia.org/P21341 and previous config saved to /var/cache/conftool/dbconfig/20220223-084951-ladsgroup.json
08:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
08:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
08:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300992)', diff saved to https://phabricator.wikimedia.org/P21340 and previous config saved to /var/cache/conftool/dbconfig/20220223-084938-ladsgroup.json
08:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2108.codfw.wmnet with OS bullseye
08:38 urbanecm: UTC morning B&C window done
08:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
08:35 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 10cb05a: Enable DiscussionTools newtopictool, topicsubscription on MediaWiki.org (T302256) (duration: 00m 49s)
08:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
08:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
08:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21339 and previous config saved to /var/cache/conftool/dbconfig/20220223-083433-ladsgroup.json
08:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2108.codfw.wmnet with reason: host reimage
08:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
08:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.23/includes/pager/IndexPager.php: 38f33d3: ReverseChronologicalPager: Fix displaying date headers for non-revisions (T302343; 5/5) (duration: 00m 48s)
08:32 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.23/includes/pager/ReverseChronologicalPager.php: 38f33d3: ReverseChronologicalPager: Fix displaying date headers for non-revisions (T302343; 4/5) (duration: 00m 53s)
08:31 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
08:31 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
08:31 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.23/includes/specials/pagers/MergeHistoryPager.php: 38f33d3: ReverseChronologicalPager: Fix displaying date headers for non-revisions (T302343; 3/5) (duration: 00m 49s)
08:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2108.codfw.wmnet with reason: host reimage
08:30 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.23/includes/specials/pagers/ContribsPager.php: 38f33d3: ReverseChronologicalPager: Fix displaying date headers for non-revisions (T302343; 2/5) (duration: 00m 49s)
08:29 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.23/includes/actions/pagers/HistoryPager.php: 38f33d3: ReverseChronologicalPager: Fix displaying date headers for non-revisions (T302343; 1/5) (duration: 00m 49s)
08:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
08:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
08:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
08:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
08:24 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: d9e8861: Enable mobile DiscussionTools at ht.wiki (T302259) (duration: 00m 50s)
08:23 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/DiscussionTools/: 269dcfd: Mobile config: Always enable reply/newtopic tools on mobile, disable subscriptions (T302326) (duration: 00m 50s)
08:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
08:21 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.22/extensions/DiscussionTools/: b82e4eb: Mobile config: Always enable reply/newtopic tools on mobile, disable subscriptions (T302326) (duration: 00m 52s)
08:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
08:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
08:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
08:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21338 and previous config saved to /var/cache/conftool/dbconfig/20220223-081929-ladsgroup.json
08:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2108.codfw.wmnet with OS bullseye
08:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
08:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 (T302363)', diff saved to https://phabricator.wikimedia.org/P21337 and previous config saved to /var/cache/conftool/dbconfig/20220223-081338-ladsgroup.json
08:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
08:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance
08:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
08:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
08:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
08:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2118 (T302363)', diff saved to https://phabricator.wikimedia.org/P21336 and previous config saved to /var/cache/conftool/dbconfig/20220223-080609-ladsgroup.json
08:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300992)', diff saved to https://phabricator.wikimedia.org/P21335 and previous config saved to /var/cache/conftool/dbconfig/20220223-080424-ladsgroup.json
07:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300992)', diff saved to https://phabricator.wikimedia.org/P21334 and previous config saved to /var/cache/conftool/dbconfig/20220223-075926-ladsgroup.json
07:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
07:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
07:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300992)', diff saved to https://phabricator.wikimedia.org/P21333 and previous config saved to /var/cache/conftool/dbconfig/20220223-075918-ladsgroup.json
07:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21332 and previous config saved to /var/cache/conftool/dbconfig/20220223-074413-ladsgroup.json
07:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21331 and previous config saved to /var/cache/conftool/dbconfig/20220223-072909-ladsgroup.json
07:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300992)', diff saved to https://phabricator.wikimedia.org/P21330 and previous config saved to /var/cache/conftool/dbconfig/20220223-071404-ladsgroup.json
07:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2118.codfw.wmnet with OS bullseye
07:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T300992)', diff saved to https://phabricator.wikimedia.org/P21329 and previous config saved to /var/cache/conftool/dbconfig/20220223-071038-ladsgroup.json
07:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
07:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
07:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2088.codfw.wmnet with reason: Maintenance
07:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2088.codfw.wmnet with reason: Maintenance
07:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2101.codfw.wmnet with reason: Maintenance
07:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2101.codfw.wmnet with reason: Maintenance
07:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
07:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2125.codfw.wmnet with reason: Maintenance
07:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
07:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2095.codfw.wmnet with reason: Maintenance
07:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
07:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2126.codfw.wmnet with reason: Maintenance
06:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
06:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2138.codfw.wmnet with reason: Maintenance
06:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2118.codfw.wmnet with reason: host reimage
06:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
06:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
06:54 Amir1: dbmaint on s2@codfw (T300992)
06:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
06:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
06:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2118.codfw.wmnet with reason: host reimage
06:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2118.codfw.wmnet with OS bullseye
06:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2118 (T302363)', diff saved to https://phabricator.wikimedia.org/P21328 and previous config saved to /var/cache/conftool/dbconfig/20220223-063733-ladsgroup.json
06:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2118.codfw.wmnet with reason: Maintenance
06:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2118.codfw.wmnet with reason: Maintenance
06:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T302363)', diff saved to https://phabricator.wikimedia.org/P21327 and previous config saved to /var/cache/conftool/dbconfig/20220223-063625-ladsgroup.json
06:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2120.codfw.wmnet with OS bullseye
06:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2120.codfw.wmnet with reason: host reimage
06:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2120.codfw.wmnet with reason: host reimage
05:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2120.codfw.wmnet with OS bullseye
05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 (T302363)', diff saved to https://phabricator.wikimedia.org/P21326 and previous config saved to /var/cache/conftool/dbconfig/20220223-055534-ladsgroup.json
05:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
05:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2120.codfw.wmnet with reason: Maintenance
05:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T302363)', diff saved to https://phabricator.wikimedia.org/P21325 and previous config saved to /var/cache/conftool/dbconfig/20220223-055416-ladsgroup.json
05:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2122.codfw.wmnet with OS bullseye
05:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2122.codfw.wmnet with reason: host reimage
05:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2122.codfw.wmnet with reason: host reimage
05:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2122.codfw.wmnet with OS bullseye
05:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 (T302363)', diff saved to https://phabricator.wikimedia.org/P21324 and previous config saved to /var/cache/conftool/dbconfig/20220223-051125-ladsgroup.json
05:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
05:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2122.codfw.wmnet with reason: Maintenance
05:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T302363)', diff saved to https://phabricator.wikimedia.org/P21323 and previous config saved to /var/cache/conftool/dbconfig/20220223-051026-ladsgroup.json
05:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2150.codfw.wmnet with OS bullseye
04:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
04:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2150.codfw.wmnet with reason: host reimage
04:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
04:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
04:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
04:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
04:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
04:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
04:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
04:36 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.22/includes/page/ParserOutputAccess.php: Backport: ParserOutputAccess: Check for latest revision when checking for cache (T283029) (duration: 00m 50s)
04:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
04:33 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.23/includes/page/ParserOutputAccess.php: Backport: ParserOutputAccess: Check for latest revision when checking for cache (T283029) (duration: 00m 51s)
04:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2150.codfw.wmnet with OS bullseye
04:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 (T302363)', diff saved to https://phabricator.wikimedia.org/P21322 and previous config saved to /var/cache/conftool/dbconfig/20220223-042802-ladsgroup.json
04:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
04:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2150.codfw.wmnet with reason: Maintenance
02:49 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2068.codfw.wmnet with OS stretch
02:49 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2066.codfw.wmnet with OS stretch
02:09 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
02:06 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
01:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2066.codfw.wmnet with OS stretch
01:50 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-be2066.codfw.wmnet with OS stretch
01:41 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
01:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage
01:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
01:27 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2066.codfw.wmnet with reason: host reimage
01:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2068.codfw.wmnet with OS stretch
01:18 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1004.wikimedia.org with OS bullseye
01:08 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2066.codfw.wmnet with OS stretch
01:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2067.codfw.wmnet with OS stretch
01:03 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
01:00 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
00:59 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
00:56 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
00:55 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
00:52 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
00:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
00:51 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcontrol1004.wikimedia.org with OS bullseye
00:44 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
00:29 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
00:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
00:23 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
00:04 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS stretch

2022-02-22

23:20 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
23:18 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1004.wikimedia.org with reason: host reimage
23:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
22:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
22:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
22:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
22:53 dduvall@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/VisualEditor/includes/ApiVisualEditorEdit.php: Backport: VisualEditor: Avoid undefined index for mobileformat ([T302344]) (duration: 00m 49s)
22:52 dduvall@deploy1002: Synchronized php-1.38.0-wmf.23/extensions/DiscussionTools/includes/ApiDiscussionToolsEdit.php: Backport: DiscussionTools: Avoid undefined index for mobileformat ([T302344]) (duration: 00m 51s)
22:45 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
22:32 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
22:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
22:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
22:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
22:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
22:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2067.codfw.wmnet with OS stretch
22:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2078.mgmt.codfw.wmnet with reboot policy FORCED
22:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
22:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
22:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
22:02 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
21:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
21:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
21:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
21:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
21:47 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2078.mgmt.codfw.wmnet with reboot policy FORCED
21:46 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
21:45 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS stretch
21:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2077.mgmt.codfw.wmnet with reboot policy FORCED
21:43 urbanecm@deploy1002: Synchronized wmf-config/filebackend.php: 91b81ac: filebackend: migrate $wmfSwift* to $wmgSwift* (T45956) (duration: 00m 52s)
21:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
21:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
21:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
21:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
21:38 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 99f244c: [Cleanup] Remove non-existent config wgVectorUseWvuiSearch (duration: 00m 50s)
21:34 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 7172327: [Vector] Enable table of contents on beta cluster (duration: 00m 50s)
21:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
21:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 6d1d9a9: InitialiseSettings: General cleanup, wgRemoveGroups (A-D) (T301647) (duration: 00m 50s)
21:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
21:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
21:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2077.mgmt.codfw.wmnet with reboot policy FORCED
21:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
21:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2076.mgmt.codfw.wmnet with reboot policy FORCED
21:25 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: ee7608c: Deploy the fawiki test safety survey to production (T297629) (duration: 00m 51s)
21:19 cwhite: end opensearch upgrade (codfw) T299168
21:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
21:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
21:12 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host elastic2076.mgmt.codfw.wmnet with reboot policy FORCED
21:06 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1004.wikimedia.org with OS bullseye
21:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
21:03 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2069.mgmt.codfw.wmnet with reboot policy FORCED
20:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
20:49 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
20:36 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1004.wikimedia.org with OS bullseye
20:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2069.mgmt.codfw.wmnet with reboot policy FORCED
20:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2068.mgmt.codfw.wmnet with reboot policy FORCED
20:26 cwhite: begin opensearch upgrade (codfw) T299168
20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
20:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
20:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
20:10 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2068.mgmt.codfw.wmnet with reboot policy FORCED
20:09 ryankemper: T302340 [WCQS] Seeing `0.3.104` running on the hosts now
20:09 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2067.mgmt.codfw.wmnet with reboot policy FORCED
20:08 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@5d384a5] (wcqs): Deploy 0.3.104 to WCQS (duration: 02m 33s)
20:07 dduvall@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.23 refs T300199
20:06 ryankemper@deploy1002: Started deploy [wdqs/wdqs@5d384a5] (wcqs): Deploy 0.3.104 to WCQS
20:06 ryankemper: T302340 [WCQS] Forgot to fetch & rebase `deploy1002:/srv/deployment/wdqs/wdqs` before deploy, so `0.3.104` did not actually deploy (still on `0.3.103`). Re-rolling deploy...
20:00 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@f0d05eb] (wcqs): Deploy 0.3.104 to WCQS (duration: 03m 00s)
19:58 ryankemper: T302340 `scap deploy -v --environment wcqs 'Deploy 0.3.104 to WCQS'`
19:57 ryankemper@deploy1002: Started deploy [wdqs/wdqs@f0d05eb] (wcqs): Deploy 0.3.104 to WCQS
19:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
19:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
19:49 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2067.mgmt.codfw.wmnet with reboot policy FORCED
19:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
19:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2066.mgmt.codfw.wmnet with reboot policy FORCED
19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
19:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
19:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
19:25 ryankemper: T302330 `ryankemper@cumin1001:~$ sudo -E cumin '*mwmaint*' 'run-puppet-agent'` (getting https://gerrit.wikimedia.org/r/c/operations/puppet/+/764875 out)
19:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
19:24 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash[2004-2006].codfw.wmnet
19:20 dduvall@deploy1002: Pruned MediaWiki: 1.38.0-wmf.21 (duration: 03m 50s)
19:16 dduvall@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.23 refs T300199 (duration: 49m 17s)
19:11 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts logstash[2004-2006].codfw.wmnet
19:10 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts logstash[1007-1009].eqiad.wmnet
19:07 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be2066.mgmt.codfw.wmnet with reboot policy FORCED
18:58 ssastry@deploy1002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
18:56 ssastry@deploy1002: helmfile [eqiad] START helmfile.d/services/proton: apply
18:55 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts logstash[1007-1009].eqiad.wmnet
18:53 ssastry@deploy1002: helmfile [codfw] DONE helmfile.d/services/proton: apply
18:52 ssastry@deploy1002: helmfile [codfw] START helmfile.d/services/proton: apply
18:50 ssastry@deploy1002: helmfile [staging] DONE helmfile.d/services/proton: apply
18:49 ssastry@deploy1002: helmfile [staging] START helmfile.d/services/proton: apply
18:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
18:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
18:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
18:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
18:33 herron@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts centrallog2001.codfw.wmnet
18:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
18:30 moritzm: rebalance ganeti eqiad row_B (all nodes reimaged in there) T296721
18:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
18:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
18:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
18:27 dduvall@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.23 refs T300199
18:25 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:23 herron@cumin1001: START - Cookbook sre.hosts.decommission for hosts centrallog2001.codfw.wmnet
18:20 pt1979@cumin2002: START - Cookbook sre.dns.netbox
17:52 gehel: depooling WDQS codfw (internal + public) - issues with deployment of new updater version on codfw
17:02 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
17:01 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
16:46 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21316 and previous config saved to /var/cache/conftool/dbconfig/20220222-164604-kormat.json
16:40 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
16:39 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
16:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21315 and previous config saved to /var/cache/conftool/dbconfig/20220222-163059-kormat.json
16:23 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
16:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21314 and previous config saved to /var/cache/conftool/dbconfig/20220222-161554-kormat.json
16:15 papaul: rebooting scs-oe16-esams to clear librenms alert
16:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21313 and previous config saved to /var/cache/conftool/dbconfig/20220222-160049-kormat.json
15:54 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
15:43 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
15:27 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21312 and previous config saved to /var/cache/conftool/dbconfig/20220222-152658-kormat.json
15:26 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
15:26 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
15:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
15:25 urbanecm: Migration of oversight => suppress is done (T112147)
15:25 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript migrateUserGroup.php --wiki=labswiki oversight suppress # T112147
15:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
15:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
15:24 urbanecm: Run `mwscript purgeExpiredUserrights.php enwikiquote` to purge an expired but not yet removed row with the old oversight group (T112147)
15:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
15:20 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
15:20 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
15:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
15:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 4a2a212: Update oversight group to suppress (T112147) (duration: 00m 49s)
15:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
15:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
15:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
15:13 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: 79cfa4e: Remove the oversight group hack (T112147) (duration: 00m 48s)
15:07 urbanecm: Finishing deployment of T112147 that started during B&C time
14:54 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: T301165; errors expected, not serving any traffic
14:53 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: T301165; errors expected, not serving any traffic
14:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
14:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
14:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
14:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
14:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
14:32 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
14:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
14:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
14:31 urbanecm: Run `[urbanecm@mwmaint1002 ~]$ foreachwikiindblist oversight-wikis migrateUserGroup.php oversight suppress` in a tmux session (oversight-wikis.dblist is a temporary dblist from P21310; T112147)
14:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21311 and previous config saved to /var/cache/conftool/dbconfig/20220222-143023-kormat.json
14:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
14:24 urbanecm: mwscript migrateUserGroup.php --wiki=metawiki oversight suppress # T112147
14:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
14:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
14:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
14:22 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
14:21 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: ec07ac0: Add suppress group to privileged groups (T112147) (duration: 00m 49s)
14:21 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
14:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
14:18 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: 6859cd2: Do not delete the suppress group (T112147) (duration: 00m 50s)
14:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21309 and previous config saved to /var/cache/conftool/dbconfig/20220222-141518-kormat.json
14:14 taavi: deploy T302248 patch
14:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21308 and previous config saved to /var/cache/conftool/dbconfig/20220222-141338-marostegui.json
14:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 100%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21307 and previous config saved to /var/cache/conftool/dbconfig/20220222-141148-root.json
14:10 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
14:10 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
14:07 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
14:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21306 and previous config saved to /var/cache/conftool/dbconfig/20220222-140013-kormat.json
13:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21305 and previous config saved to /var/cache/conftool/dbconfig/20220222-135833-marostegui.json
13:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 75%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21304 and previous config saved to /var/cache/conftool/dbconfig/20220222-135644-root.json
13:45 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21303 and previous config saved to /var/cache/conftool/dbconfig/20220222-134509-kormat.json
13:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21302 and previous config saved to /var/cache/conftool/dbconfig/20220222-134329-marostegui.json
13:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 50%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21301 and previous config saved to /var/cache/conftool/dbconfig/20220222-134141-root.json
13:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
13:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
13:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
13:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
13:32 godog: bounce prometheus-blackbox-exporter on prometheus1005 - T302265
13:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
13:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21300 and previous config saved to /var/cache/conftool/dbconfig/20220222-132824-marostegui.json
13:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 25%: After mysql restart', diff saved to https://phabricator.wikimedia.org/P21299 and previous config saved to /var/cache/conftool/dbconfig/20220222-132637-root.json
13:24 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1093.eqiad.wmnet with OS bullseye
13:24 moritzm: rebalance ganeti eqiad row_D (all nodes reimaged in there) T296721
13:23 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21298 and previous config saved to /var/cache/conftool/dbconfig/20220222-131854-marostegui.json
13:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
13:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
13:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21297 and previous config saved to /var/cache/conftool/dbconfig/20220222-131846-marostegui.json
13:13 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
13:11 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
13:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
13:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21296 and previous config saved to /var/cache/conftool/dbconfig/20220222-130342-marostegui.json
13:00 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1004.eqiad.wmnet with OS bullseye
12:59 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
12:50 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: T301165; errors expected, not serving any traffic
12:50 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: T301165; errors expected, not serving any traffic
12:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21295 and previous config saved to /var/cache/conftool/dbconfig/20220222-124837-marostegui.json
12:48 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1004.eqiad.wmnet with reason: host reimage
12:47 godog: bounce prometheus-blackbox-exporter on prometheus1006 - T302265
12:45 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1004.eqiad.wmnet with reason: host reimage
12:44 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21294 and previous config saved to /var/cache/conftool/dbconfig/20220222-124449-kormat.json
12:44 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
12:44 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
12:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21293 and previous config saved to /var/cache/conftool/dbconfig/20220222-123332-marostegui.json
12:32 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1004.eqiad.wmnet with OS bullseye
12:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T300381)', diff saved to https://phabricator.wikimedia.org/P21292 and previous config saved to /var/cache/conftool/dbconfig/20220222-122351-marostegui.json
12:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
12:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
12:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300381)', diff saved to https://phabricator.wikimedia.org/P21291 and previous config saved to /var/cache/conftool/dbconfig/20220222-122124-marostegui.json
12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21290 and previous config saved to /var/cache/conftool/dbconfig/20220222-120619-marostegui.json
11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21289 and previous config saved to /var/cache/conftool/dbconfig/20220222-115808-ladsgroup.json
11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21288 and previous config saved to /var/cache/conftool/dbconfig/20220222-115114-marostegui.json
11:46 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve1003.eqiad.wmnet with OS bullseye
11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P21287 and previous config saved to /var/cache/conftool/dbconfig/20220222-114304-ladsgroup.json
11:42 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P21286 and previous config saved to /var/cache/conftool/dbconfig/20220222-114206-kormat.json
11:40 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1003.eqiad.wmnet with reason: host reimage
11:37 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1003.eqiad.wmnet with reason: host reimage
11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300381)', diff saved to https://phabricator.wikimedia.org/P21285 and previous config saved to /var/cache/conftool/dbconfig/20220222-113609-marostegui.json
11:30 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
11:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P21284 and previous config saved to /var/cache/conftool/dbconfig/20220222-112759-ladsgroup.json
11:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21283 and previous config saved to /var/cache/conftool/dbconfig/20220222-112702-kormat.json
11:25 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
11:24 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1003.eqiad.wmnet with OS bullseye
11:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
11:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
11:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
11:22 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
11:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
11:20 jbond: deploy netbox puppet refactor gerrit:764330 (should be noop)
11:20 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: beta: Allow opening the alpha NewLexeme special page on beta-wikidatawiki (T301234) (Beta only) (duration: 00m 48s)
11:20 jbond: deploy netbox puppet refactor (should be noop)
11:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21282 and previous config saved to /var/cache/conftool/dbconfig/20220222-111254-ladsgroup.json
11:11 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21281 and previous config saved to /var/cache/conftool/dbconfig/20220222-111157-kormat.json
11:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300381)', diff saved to https://phabricator.wikimedia.org/P21280 and previous config saved to /var/cache/conftool/dbconfig/20220222-111144-marostegui.json
11:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
11:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
11:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300381)', diff saved to https://phabricator.wikimedia.org/P21279 and previous config saved to /var/cache/conftool/dbconfig/20220222-111137-marostegui.json
11:10 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
11:08 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
11:06 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve1002.eqiad.wmnet with OS bullseye
11:03 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
11:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T302185)', diff saved to https://phabricator.wikimedia.org/P21278 and previous config saved to /var/cache/conftool/dbconfig/20220222-110118-ladsgroup.json
11:00 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1002.eqiad.wmnet with reason: host reimage
10:59 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
10:56 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P21277 and previous config saved to /var/cache/conftool/dbconfig/20220222-105653-kormat.json
10:56 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1002.eqiad.wmnet with reason: host reimage
10:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21276 and previous config saved to /var/cache/conftool/dbconfig/20220222-105632-marostegui.json
10:56 Lucas_WMDE: Deployed patch for T302192
10:48 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
10:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21275 and previous config saved to /var/cache/conftool/dbconfig/20220222-104613-ladsgroup.json
10:43 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1002.eqiad.wmnet with OS bullseye
10:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21274 and previous config saved to /var/cache/conftool/dbconfig/20220222-104128-marostegui.json
10:36 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1001.eqiad.wmnet with OS bullseye
10:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21273 and previous config saved to /var/cache/conftool/dbconfig/20220222-103109-ladsgroup.json
10:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300381)', diff saved to https://phabricator.wikimedia.org/P21272 and previous config saved to /var/cache/conftool/dbconfig/20220222-102623-marostegui.json
10:24 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: host reimage
10:20 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1001.eqiad.wmnet with reason: host reimage
10:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300381)', diff saved to https://phabricator.wikimedia.org/P21271 and previous config saved to /var/cache/conftool/dbconfig/20220222-101710-marostegui.json
10:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
10:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
10:16 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P21270 and previous config saved to /var/cache/conftool/dbconfig/20220222-101649-kormat.json
10:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
10:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T302185)', diff saved to https://phabricator.wikimedia.org/P21269 and previous config saved to /var/cache/conftool/dbconfig/20220222-101604-ladsgroup.json
10:12 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet
10:07 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve1001.eqiad.wmnet with OS bullseye
10:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1099.eqiad.wmnet with OS bullseye
10:00 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet
09:52 XioNoX: restarting cr2-drmrs for software upgrade
09:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1099.eqiad.wmnet with reason: host reimage
09:47 aqu@deploy1002: Finished deploy [analytics/refinery@ed5c9f9] (hadoop-test): Migrate aqs/hourly to Airflow TEST [analytics/refinery@ed5c9f9] (duration: 00m 03s)
09:47 aqu@deploy1002: Started deploy [analytics/refinery@ed5c9f9] (hadoop-test): Migrate aqs/hourly to Airflow TEST [analytics/refinery@ed5c9f9]
09:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300381)', diff saved to https://phabricator.wikimedia.org/P21268 and previous config saved to /var/cache/conftool/dbconfig/20220222-094740-marostegui.json
09:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1099.eqiad.wmnet with reason: host reimage
09:43 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:38 aqu: Deploying analytics/refinery on hadoop-test only.
09:38 jayme@cumin1001: START - Cookbook sre.dns.netbox
09:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1099.eqiad.wmnet with OS bullseye
09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21267 and previous config saved to /var/cache/conftool/dbconfig/20220222-093235-marostegui.json
09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21266 and previous config saved to /var/cache/conftool/dbconfig/20220222-091730-marostegui.json
09:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
09:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
09:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
09:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300381)', diff saved to https://phabricator.wikimedia.org/P21265 and previous config saved to /var/cache/conftool/dbconfig/20220222-090226-marostegui.json
08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21264 and previous config saved to /var/cache/conftool/dbconfig/20220222-085835-ladsgroup.json
08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300381)', diff saved to https://phabricator.wikimedia.org/P21263 and previous config saved to /var/cache/conftool/dbconfig/20220222-085752-marostegui.json
08:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
08:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
08:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T302185)', diff saved to https://phabricator.wikimedia.org/P21262 and previous config saved to /var/cache/conftool/dbconfig/20220222-085653-ladsgroup.json
08:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
08:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
08:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21261 and previous config saved to /var/cache/conftool/dbconfig/20220222-085536-ladsgroup.json
08:55 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@17a70a0]: Add aqs hourly (duration: 00m 08s)
08:55 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@17a70a0]: Add aqs hourly
08:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21260 and previous config saved to /var/cache/conftool/dbconfig/20220222-084031-ladsgroup.json
08:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300381)', diff saved to https://phabricator.wikimedia.org/P21259 and previous config saved to /var/cache/conftool/dbconfig/20220222-083534-marostegui.json
08:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P21258 and previous config saved to /var/cache/conftool/dbconfig/20220222-082527-ladsgroup.json
08:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
08:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
08:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
08:21 taavi: UTC morning deploys done
08:20 taavi@deploy1002: Synchronized php-1.38.0-wmf.22/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.DesktopArticleTarget.js: Backport: Revert: Don't suppress teardown prompt when pressing escape (T302096) (duration: 00m 49s)
08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21257 and previous config saved to /var/cache/conftool/dbconfig/20220222-082029-marostegui.json
08:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
08:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21256 and previous config saved to /var/cache/conftool/dbconfig/20220222-081022-ladsgroup.json
08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21255 and previous config saved to /var/cache/conftool/dbconfig/20220222-080525-marostegui.json
07:51 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main'.
07:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300381)', diff saved to https://phabricator.wikimedia.org/P21254 and previous config saved to /var/cache/conftool/dbconfig/20220222-075020-marostegui.json
07:49 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main'.
07:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300381)', diff saved to https://phabricator.wikimedia.org/P21253 and previous config saved to /var/cache/conftool/dbconfig/20220222-074106-marostegui.json
07:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
07:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
07:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
07:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
07:31 marostegui: dbmaint on non-pooled hosts s2@eqiad T300381
07:13 marostegui: dbmaint on db2104 (and its replicas) s2@codfw T300381
07:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T302185)', diff saved to https://phabricator.wikimedia.org/P21252 and previous config saved to /var/cache/conftool/dbconfig/20220222-071003-ladsgroup.json
07:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
07:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1172.eqiad.wmnet with reason: Maintenance
07:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2082 (T302185)', diff saved to https://phabricator.wikimedia.org/P21251 and previous config saved to /var/cache/conftool/dbconfig/20220222-070759-ladsgroup.json
07:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2082.codfw.wmnet with OS bullseye
06:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2082.codfw.wmnet with reason: host reimage
06:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2082.codfw.wmnet with reason: host reimage
06:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2082.codfw.wmnet with OS bullseye
06:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2082 (T302185)', diff saved to https://phabricator.wikimedia.org/P21250 and previous config saved to /var/cache/conftool/dbconfig/20220222-062711-ladsgroup.json
06:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
06:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
06:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2082.codfw.wmnet with reason: Maintenance
06:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2082.codfw.wmnet with reason: Maintenance
06:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2085:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21249 and previous config saved to /var/cache/conftool/dbconfig/20220222-062443-ladsgroup.json
06:22 marostegui: dbmaint on db2077 s7@codfw T302222
06:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2085:3311 (T302185)', diff saved to https://phabricator.wikimedia.org/P21248 and previous config saved to /var/cache/conftool/dbconfig/20220222-062018-ladsgroup.json
06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T300775)', diff saved to https://phabricator.wikimedia.org/P21247 and previous config saved to /var/cache/conftool/dbconfig/20220222-061235-marostegui.json
06:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
06:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1175.eqiad.wmnet with reason: Maintenance
06:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2085.codfw.wmnet with OS bullseye
06:10 marostegui: dbmaint on db2077 s7@codfw T302222
05:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2085.codfw.wmnet with reason: host reimage
05:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2085.codfw.wmnet with reason: host reimage
05:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2085.codfw.wmnet with OS bullseye
05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2085:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21246 and previous config saved to /var/cache/conftool/dbconfig/20220222-053901-ladsgroup.json
05:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2085:3311 (T302185)', diff saved to https://phabricator.wikimedia.org/P21245 and previous config saved to /var/cache/conftool/dbconfig/20220222-053836-ladsgroup.json
05:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2085.codfw.wmnet with reason: Maintenance
05:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2085.codfw.wmnet with reason: Maintenance
05:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2086:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21244 and previous config saved to /var/cache/conftool/dbconfig/20220222-053525-ladsgroup.json
05:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2086:3317 (T302185)', diff saved to https://phabricator.wikimedia.org/P21243 and previous config saved to /var/cache/conftool/dbconfig/20220222-053102-ladsgroup.json
05:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2086.codfw.wmnet with OS bullseye
05:16 Amir1: dbmaint on s1@codfw (T302185)
05:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2086.codfw.wmnet with reason: host reimage
05:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2086.codfw.wmnet with reason: host reimage
04:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300992)', diff saved to https://phabricator.wikimedia.org/P21242 and previous config saved to /var/cache/conftool/dbconfig/20220222-045511-ladsgroup.json
04:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2086.codfw.wmnet with OS bullseye
04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2086:3318 (T302185)', diff saved to https://phabricator.wikimedia.org/P21241 and previous config saved to /var/cache/conftool/dbconfig/20220222-045406-ladsgroup.json
04:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2086:3317 (T302185)', diff saved to https://phabricator.wikimedia.org/P21240 and previous config saved to /var/cache/conftool/dbconfig/20220222-045349-ladsgroup.json
04:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2086.codfw.wmnet with reason: Maintenance
04:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2086.codfw.wmnet with reason: Maintenance
04:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21239 and previous config saved to /var/cache/conftool/dbconfig/20220222-044006-ladsgroup.json
04:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2080 (T302185)', diff saved to https://phabricator.wikimedia.org/P21238 and previous config saved to /var/cache/conftool/dbconfig/20220222-042940-ladsgroup.json
04:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21237 and previous config saved to /var/cache/conftool/dbconfig/20220222-042502-ladsgroup.json
04:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2080.codfw.wmnet with OS bullseye
04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2080.codfw.wmnet with reason: host reimage
04:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300992)', diff saved to https://phabricator.wikimedia.org/P21236 and previous config saved to /var/cache/conftool/dbconfig/20220222-040957-ladsgroup.json
04:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2080.codfw.wmnet with reason: host reimage
04:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300992)', diff saved to https://phabricator.wikimedia.org/P21235 and previous config saved to /var/cache/conftool/dbconfig/20220222-040537-ladsgroup.json
04:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
04:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
03:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2080.codfw.wmnet with OS bullseye
03:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2080 (T302185)', diff saved to https://phabricator.wikimedia.org/P21234 and previous config saved to /var/cache/conftool/dbconfig/20220222-035419-ladsgroup.json
03:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2080.codfw.wmnet with reason: Maintenance
03:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2080.codfw.wmnet with reason: Maintenance
03:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2081 (T302185)', diff saved to https://phabricator.wikimedia.org/P21233 and previous config saved to /var/cache/conftool/dbconfig/20220222-035257-ladsgroup.json
03:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2081.codfw.wmnet with OS bullseye
03:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2081.codfw.wmnet with reason: host reimage
03:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2081.codfw.wmnet with reason: host reimage
03:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2081.codfw.wmnet with OS bullseye
03:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2081 (T302185)', diff saved to https://phabricator.wikimedia.org/P21232 and previous config saved to /var/cache/conftool/dbconfig/20220222-030456-ladsgroup.json
03:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2081.codfw.wmnet with reason: Maintenance
03:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2081.codfw.wmnet with reason: Maintenance
02:46 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcontrol1005.wikimedia.org with OS bullseye
02:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
02:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
02:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
02:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
02:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
02:08 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1005.wikimedia.org with reason: host reimage
02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
02:05 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1005.wikimedia.org with reason: host reimage
02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
01:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1005.wikimedia.org with OS bullseye

2022-02-21

22:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300381)', diff saved to https://phabricator.wikimedia.org/P21231 and previous config saved to /var/cache/conftool/dbconfig/20220221-223015-marostegui.json
22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P21230 and previous config saved to /var/cache/conftool/dbconfig/20220221-221510-marostegui.json
22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P21229 and previous config saved to /var/cache/conftool/dbconfig/20220221-220005-marostegui.json
21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300381)', diff saved to https://phabricator.wikimedia.org/P21228 and previous config saved to /var/cache/conftool/dbconfig/20220221-214500-marostegui.json
21:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T300381)', diff saved to https://phabricator.wikimedia.org/P21227 and previous config saved to /var/cache/conftool/dbconfig/20220221-213411-marostegui.json
21:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
21:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
21:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300381)', diff saved to https://phabricator.wikimedia.org/P21226 and previous config saved to /var/cache/conftool/dbconfig/20220221-213403-marostegui.json
21:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P21225 and previous config saved to /var/cache/conftool/dbconfig/20220221-211859-marostegui.json
21:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P21224 and previous config saved to /var/cache/conftool/dbconfig/20220221-210354-marostegui.json
20:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300381)', diff saved to https://phabricator.wikimedia.org/P21223 and previous config saved to /var/cache/conftool/dbconfig/20220221-204849-marostegui.json
20:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T300381)', diff saved to https://phabricator.wikimedia.org/P21222 and previous config saved to /var/cache/conftool/dbconfig/20220221-203708-marostegui.json
20:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
20:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
20:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300381)', diff saved to https://phabricator.wikimedia.org/P21221 and previous config saved to /var/cache/conftool/dbconfig/20220221-203701-marostegui.json
20:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P21220 and previous config saved to /var/cache/conftool/dbconfig/20220221-202156-marostegui.json
20:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P21219 and previous config saved to /var/cache/conftool/dbconfig/20220221-200651-marostegui.json
19:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300381)', diff saved to https://phabricator.wikimedia.org/P21218 and previous config saved to /var/cache/conftool/dbconfig/20220221-195147-marostegui.json
19:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T300381)', diff saved to https://phabricator.wikimedia.org/P21217 and previous config saved to /var/cache/conftool/dbconfig/20220221-193842-marostegui.json
19:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
19:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
19:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
19:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
19:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
19:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
19:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
19:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
19:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
19:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
19:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300381)', diff saved to https://phabricator.wikimedia.org/P21216 and previous config saved to /var/cache/conftool/dbconfig/20220221-192309-marostegui.json
19:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P21215 and previous config saved to /var/cache/conftool/dbconfig/20220221-190801-marostegui.json
19:03 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcontrol1003.wikimedia.org with OS bullseye
18:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P21214 and previous config saved to /var/cache/conftool/dbconfig/20220221-185256-marostegui.json
18:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300381)', diff saved to https://phabricator.wikimedia.org/P21213 and previous config saved to /var/cache/conftool/dbconfig/20220221-183751-marostegui.json
18:33 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300774)', diff saved to https://phabricator.wikimedia.org/P21212 and previous config saved to /var/cache/conftool/dbconfig/20220221-183304-kormat.json
18:33 urbanecm: Password reset for Jrnka ka@SUL per Ticket#2022022010002692
18:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T300381)', diff saved to https://phabricator.wikimedia.org/P21211 and previous config saved to /var/cache/conftool/dbconfig/20220221-182856-marostegui.json
18:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
18:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
18:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21210 and previous config saved to /var/cache/conftool/dbconfig/20220221-182849-marostegui.json
18:18 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P21209 and previous config saved to /var/cache/conftool/dbconfig/20220221-181800-kormat.json
18:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21208 and previous config saved to /var/cache/conftool/dbconfig/20220221-181344-marostegui.json
18:11 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2004.codfw.wmnet with OS bullseye
18:07 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcontrol1003.wikimedia.org with reason: host reimage
18:04 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcontrol1003.wikimedia.org with reason: host reimage
18:02 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P21207 and previous config saved to /var/cache/conftool/dbconfig/20220221-180255-kormat.json
18:02 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
18:02 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: sync
17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P21206 and previous config saved to /var/cache/conftool/dbconfig/20220221-175839-marostegui.json
17:58 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2004.codfw.wmnet with reason: host reimage
17:55 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2004.codfw.wmnet with reason: host reimage
17:50 aqu@deploy1002: Finished deploy [airflow-dags/analytics@17a70a0]: fix missing extra_query_parameters (duration: 00m 07s)
17:50 aqu@deploy1002: Started deploy [airflow-dags/analytics@17a70a0]: fix missing extra_query_parameters
17:47 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300774)', diff saved to https://phabricator.wikimedia.org/P21205 and previous config saved to /var/cache/conftool/dbconfig/20220221-174750-kormat.json
17:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21204 and previous config saved to /var/cache/conftool/dbconfig/20220221-174335-marostegui.json
17:41 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T300774)', diff saved to https://phabricator.wikimedia.org/P21203 and previous config saved to /var/cache/conftool/dbconfig/20220221-174138-kormat.json
17:41 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
17:41 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
17:41 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300774)', diff saved to https://phabricator.wikimedia.org/P21202 and previous config saved to /var/cache/conftool/dbconfig/20220221-174130-kormat.json
17:38 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2004.codfw.wmnet with OS bullseye
17:33 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2003.codfw.wmnet with OS bullseye
17:32 aqu@deploy1002: Finished deploy [airflow-dags/analytics@c2fdce7]: fix aqs hourly DAGs start date (duration: 00m 07s)
17:32 aqu@deploy1002: Started deploy [airflow-dags/analytics@c2fdce7]: fix aqs hourly DAGs start date
17:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21201 and previous config saved to /var/cache/conftool/dbconfig/20220221-173130-marostegui.json
17:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
17:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
17:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300381)', diff saved to https://phabricator.wikimedia.org/P21200 and previous config saved to /var/cache/conftool/dbconfig/20220221-173122-marostegui.json
17:26 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P21199 and previous config saved to /var/cache/conftool/dbconfig/20220221-172626-kormat.json
17:26 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcontrol1003.wikimedia.org with OS bullseye
17:19 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2003.codfw.wmnet with reason: host reimage
17:16 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2003.codfw.wmnet with reason: host reimage
17:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P21198 and previous config saved to /var/cache/conftool/dbconfig/20220221-171618-marostegui.json
17:11 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P21197 and previous config saved to /var/cache/conftool/dbconfig/20220221-171121-kormat.json
17:06 aqu@deploy1002: Finished deploy [airflow-dags/analytics@f1244e0]: Migrate aqs/hourly from Oozie|Hive to Airflow|Spark (duration: 00m 07s)
17:06 aqu@deploy1002: Started deploy [airflow-dags/analytics@f1244e0]: Migrate aqs/hourly from Oozie|Hive to Airflow|Spark
17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P21196 and previous config saved to /var/cache/conftool/dbconfig/20220221-170113-marostegui.json
16:59 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2003.codfw.wmnet with OS bullseye
16:56 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300774)', diff saved to https://phabricator.wikimedia.org/P21195 and previous config saved to /var/cache/conftool/dbconfig/20220221-165616-kormat.json
16:54 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T300774)', diff saved to https://phabricator.wikimedia.org/P21194 and previous config saved to /var/cache/conftool/dbconfig/20220221-165405-kormat.json
16:54 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
16:54 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
16:54 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
16:54 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
16:53 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300774)', diff saved to https://phabricator.wikimedia.org/P21193 and previous config saved to /var/cache/conftool/dbconfig/20220221-165352-kormat.json
16:51 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2002.codfw.wmnet with OS bullseye
16:48 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
16:47 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: sync
16:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300381)', diff saved to https://phabricator.wikimedia.org/P21192 and previous config saved to /var/cache/conftool/dbconfig/20220221-164608-marostegui.json
16:44 mforns@deploy1002: Finished deploy [analytics/refinery@ed5c9f9] (hadoop-test): Deploy Aqs Hourly for Airflow THIN [analytics/refinery@ed5c9f9] (duration: 07m 12s)
16:38 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P21191 and previous config saved to /var/cache/conftool/dbconfig/20220221-163847-kormat.json
16:38 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2002.codfw.wmnet with reason: host reimage
16:37 mforns@deploy1002: Started deploy [analytics/refinery@ed5c9f9] (hadoop-test): Deploy Aqs Hourly for Airflow THIN [analytics/refinery@ed5c9f9]
16:37 mforns@deploy1002: Finished deploy [analytics/refinery@ed5c9f9] (thin): Deploy Aqs Hourly for Airflow THIN [analytics/refinery@ed5c9f9] (duration: 00m 07s)
16:36 mforns@deploy1002: Started deploy [analytics/refinery@ed5c9f9] (thin): Deploy Aqs Hourly for Airflow THIN [analytics/refinery@ed5c9f9]
16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T300381)', diff saved to https://phabricator.wikimedia.org/P21190 and previous config saved to /var/cache/conftool/dbconfig/20220221-163555-marostegui.json
16:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
16:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
16:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300381)', diff saved to https://phabricator.wikimedia.org/P21189 and previous config saved to /var/cache/conftool/dbconfig/20220221-163548-marostegui.json
16:35 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2002.codfw.wmnet with reason: host reimage
16:30 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1093.eqiad.wmnet with OS bullseye
16:23 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P21188 and previous config saved to /var/cache/conftool/dbconfig/20220221-162342-kormat.json
16:21 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
16:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P21187 and previous config saved to /var/cache/conftool/dbconfig/20220221-162043-marostegui.json
16:18 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2002.codfw.wmnet with OS bullseye
16:17 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
16:08 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300774)', diff saved to https://phabricator.wikimedia.org/P21186 and previous config saved to /var/cache/conftool/dbconfig/20220221-160838-kormat.json
16:05 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve200[5-8].codfw.wmnet
16:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P21185 and previous config saved to /var/cache/conftool/dbconfig/20220221-160538-marostegui.json
16:04 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=ml_serve,service=kubesvc
16:03 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=ml-serve,service=kubesvc
16:01 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
16:01 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2001.codfw.wmnet with OS bullseye
15:59 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T300774)', diff saved to https://phabricator.wikimedia.org/P21184 and previous config saved to /var/cache/conftool/dbconfig/20220221-155924-kormat.json
15:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
15:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
15:52 mforns@deploy1002: Finished deploy [analytics/refinery@ed5c9f9]: Deploy Aqs Hourly for Airflow [analytics/refinery@ed5c9f9] (duration: 21m 23s)
15:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300381)', diff saved to https://phabricator.wikimedia.org/P21183 and previous config saved to /var/cache/conftool/dbconfig/20220221-155034-marostegui.json
15:47 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve2001.codfw.wmnet with reason: host reimage
15:45 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
15:45 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
15:45 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
15:45 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2001.codfw.wmnet with reason: host reimage
15:45 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
15:45 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21182 and previous config saved to /var/cache/conftool/dbconfig/20220221-154518-kormat.json
15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T300381)', diff saved to https://phabricator.wikimedia.org/P21181 and previous config saved to /var/cache/conftool/dbconfig/20220221-154118-marostegui.json
15:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
15:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300381)', diff saved to https://phabricator.wikimedia.org/P21180 and previous config saved to /var/cache/conftool/dbconfig/20220221-154110-marostegui.json
15:30 mforns@deploy1002: Started deploy [analytics/refinery@ed5c9f9]: Deploy Aqs Hourly for Airflow [analytics/refinery@ed5c9f9]
15:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21179 and previous config saved to /var/cache/conftool/dbconfig/20220221-153013-kormat.json
15:28 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2001.codfw.wmnet with OS bullseye
15:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P21178 and previous config saved to /var/cache/conftool/dbconfig/20220221-152606-marostegui.json
15:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21177 and previous config saved to /var/cache/conftool/dbconfig/20220221-151945-root.json
15:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P21176 and previous config saved to /var/cache/conftool/dbconfig/20220221-151509-kormat.json
15:11 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync
15:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P21175 and previous config saved to /var/cache/conftool/dbconfig/20220221-151101-marostegui.json
15:10 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync
15:09 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync
15:09 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync
15:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21174 and previous config saved to /var/cache/conftool/dbconfig/20220221-150848-root.json
15:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21173 and previous config saved to /var/cache/conftool/dbconfig/20220221-150442-root.json
15:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21172 and previous config saved to /var/cache/conftool/dbconfig/20220221-150004-kormat.json
14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300381)', diff saved to https://phabricator.wikimedia.org/P21171 and previous config saved to /var/cache/conftool/dbconfig/20220221-145556-marostegui.json
14:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21170 and previous config saved to /var/cache/conftool/dbconfig/20220221-145345-root.json
14:52 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
14:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21169 and previous config saved to /var/cache/conftool/dbconfig/20220221-144938-root.json
14:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T300381)', diff saved to https://phabricator.wikimedia.org/P21168 and previous config saved to /var/cache/conftool/dbconfig/20220221-144707-marostegui.json
14:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
14:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
14:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
14:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300381)', diff saved to https://phabricator.wikimedia.org/P21167 and previous config saved to /var/cache/conftool/dbconfig/20220221-143931-marostegui.json
14:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21166 and previous config saved to /var/cache/conftool/dbconfig/20220221-143841-root.json
14:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21165 and previous config saved to /var/cache/conftool/dbconfig/20220221-143435-root.json
14:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P21164 and previous config saved to /var/cache/conftool/dbconfig/20220221-142426-marostegui.json
14:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21163 and previous config saved to /var/cache/conftool/dbconfig/20220221-142337-root.json
14:22 moritzm: installing twisted security updates
14:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1129 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21162 and previous config saved to /var/cache/conftool/dbconfig/20220221-141931-root.json
14:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
14:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P21161 and previous config saved to /var/cache/conftool/dbconfig/20220221-140922-marostegui.json
14:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21160 and previous config saved to /var/cache/conftool/dbconfig/20220221-140831-root.json
14:05 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
14:00 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1093.eqiad.wmnet with reason: host reimage
13:59 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300774)', diff saved to https://phabricator.wikimedia.org/P21159 and previous config saved to /var/cache/conftool/dbconfig/20220221-135945-kormat.json
13:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
13:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
13:59 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21158 and previous config saved to /var/cache/conftool/dbconfig/20220221-135937-kormat.json
13:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300381)', diff saved to https://phabricator.wikimedia.org/P21156 and previous config saved to /var/cache/conftool/dbconfig/20220221-135417-marostegui.json
13:49 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
13:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T300381)', diff saved to https://phabricator.wikimedia.org/P21154 and previous config saved to /var/cache/conftool/dbconfig/20220221-134542-marostegui.json
13:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
13:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
13:44 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P21153 and previous config saved to /var/cache/conftool/dbconfig/20220221-134433-kormat.json
13:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
13:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21152 and previous config saved to /var/cache/conftool/dbconfig/20220221-133818-marostegui.json
13:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21151 and previous config saved to /var/cache/conftool/dbconfig/20220221-133350-root.json
13:29 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P21150 and previous config saved to /var/cache/conftool/dbconfig/20220221-132928-kormat.json
13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P21149 and previous config saved to /var/cache/conftool/dbconfig/20220221-132313-marostegui.json
13:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21148 and previous config saved to /var/cache/conftool/dbconfig/20220221-131846-root.json
13:14 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21147 and previous config saved to /var/cache/conftool/dbconfig/20220221-131423-kormat.json
13:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P21146 and previous config saved to /var/cache/conftool/dbconfig/20220221-130808-marostegui.json
13:06 moritzm: rebalance ganeti row_C (add nodes reimaged in there) T296721
13:04 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1009.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
13:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21145 and previous config saved to /var/cache/conftool/dbconfig/20220221-130343-root.json
13:02 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1009.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1009.eqiad.wmnet
12:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1009.eqiad.wmnet
12:53 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21144 and previous config saved to /var/cache/conftool/dbconfig/20220221-125326-kormat.json
12:53 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
12:53 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
12:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21143 and previous config saved to /var/cache/conftool/dbconfig/20220221-125303-marostegui.json
12:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21142 and previous config saved to /var/cache/conftool/dbconfig/20220221-124839-root.json
12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300381)', diff saved to https://phabricator.wikimedia.org/P21141 and previous config saved to /var/cache/conftool/dbconfig/20220221-124215-marostegui.json
12:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
12:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
12:40 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
12:40 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
12:36 marostegui: Rebuild templatelinks table on db2077 (s7) T301848
12:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1017.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
12:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
12:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
12:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
12:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
12:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
12:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
12:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21140 and previous config saved to /var/cache/conftool/dbconfig/20220221-123335-root.json
12:30 Lucas_WMDE: Deployed patch for T302215
12:28 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
12:28 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
12:28 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21139 and previous config saved to /var/cache/conftool/dbconfig/20220221-122821-kormat.json
12:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110', diff saved to https://phabricator.wikimedia.org/P21138 and previous config saved to /var/cache/conftool/dbconfig/20220221-122727-marostegui.json
12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300381)', diff saved to https://phabricator.wikimedia.org/P21137 and previous config saved to /var/cache/conftool/dbconfig/20220221-122504-marostegui.json
12:14 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1017.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
12:13 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P21136 and previous config saved to /var/cache/conftool/dbconfig/20220221-121316-kormat.json
12:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1017.eqiad.wmnet
12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21135 and previous config saved to /var/cache/conftool/dbconfig/20220221-120959-marostegui.json
12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1017.eqiad.wmnet
11:58 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P21134 and previous config saved to /var/cache/conftool/dbconfig/20220221-115811-kormat.json
11:58 marostegui: Rebuild templatelinks table on db1129 (s2) T301848
11:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1129 T301848', diff saved to https://phabricator.wikimedia.org/P21133 and previous config saved to /var/cache/conftool/dbconfig/20220221-115750-marostegui.json
11:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P21132 and previous config saved to /var/cache/conftool/dbconfig/20220221-115455-marostegui.json
11:48 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
11:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
11:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
11:43 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21131 and previous config saved to /var/cache/conftool/dbconfig/20220221-114307-kormat.json
11:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300381)', diff saved to https://phabricator.wikimedia.org/P21130 and previous config saved to /var/cache/conftool/dbconfig/20220221-113950-marostegui.json
11:28 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=kubernetes-staging,service=kubesvc
11:28 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21129 and previous config saved to /var/cache/conftool/dbconfig/20220221-112809-kormat.json
11:28 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
11:28 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
11:28 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21128 and previous config saved to /var/cache/conftool/dbconfig/20220221-112801-kormat.json
11:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1012.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
11:26 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1012.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
11:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1012.eqiad.wmnet
11:24 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage1004.eqiad.wmnet with OS bullseye
11:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1012.eqiad.wmnet
11:12 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P21127 and previous config saved to /var/cache/conftool/dbconfig/20220221-111256-kormat.json
11:12 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1004.eqiad.wmnet with reason: host reimage
11:09 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1004.eqiad.wmnet with reason: host reimage
11:05 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2002.codfw.wmnet with OS bullseye
10:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1022.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
10:57 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P21126 and previous config saved to /var/cache/conftool/dbconfig/20220221-105752-kormat.json
10:57 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1022.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
10:54 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-staging2002.codfw.wmnet with reason: host reimage
10:53 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
10:53 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubestage1004.eqiad.wmnet with OS bullseye
10:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1022.eqiad.wmnet
10:48 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-staging2002.codfw.wmnet with reason: host reimage
10:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1022.eqiad.wmnet
10:42 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21125 and previous config saved to /var/cache/conftool/dbconfig/20220221-104247-kormat.json
10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300381)', diff saved to https://phabricator.wikimedia.org/P21124 and previous config saved to /var/cache/conftool/dbconfig/20220221-103931-marostegui.json
10:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
10:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21123 and previous config saved to /var/cache/conftool/dbconfig/20220221-103924-marostegui.json
10:32 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-staging2002.codfw.wmnet with OS bullseye
10:30 Lucas_WMDE: Deployed patch for T302192
10:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21122 and previous config saved to /var/cache/conftool/dbconfig/20220221-102419-marostegui.json
10:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21121 and previous config saved to /var/cache/conftool/dbconfig/20220221-102241-root.json
10:16 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
10:15 jayme@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P21120 and previous config saved to /var/cache/conftool/dbconfig/20220221-100914-marostegui.json
10:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21119 and previous config saved to /var/cache/conftool/dbconfig/20220221-100737-root.json
10:03 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:01 marostegui: Rebuild templatelinks table on s2 codfw master (db2104), lag to be expected on codfw T301848
09:57 moritzm: installing PHP 7.4 security updates (as packaged in Debian)
09:56 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21118 and previous config saved to /var/cache/conftool/dbconfig/20220221-095410-marostegui.json
09:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21117 and previous config saved to /var/cache/conftool/dbconfig/20220221-095233-root.json
09:52 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2001.codfw.wmnet with OS bullseye
09:51 kormat: running schema change against s7 T300774
09:51 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T300774)', diff saved to https://phabricator.wikimedia.org/P21116 and previous config saved to /var/cache/conftool/dbconfig/20220221-095122-kormat.json
09:51 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
09:51 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21115 and previous config saved to /var/cache/conftool/dbconfig/20220221-094826-marostegui.json
09:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
09:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300381)', diff saved to https://phabricator.wikimedia.org/P21114 and previous config saved to /var/cache/conftool/dbconfig/20220221-094819-marostegui.json
09:45 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=kubernetes-staging,service=kubesvc
09:41 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-staging2001.codfw.wmnet with reason: host reimage
09:38 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-staging2001.codfw.wmnet with reason: host reimage
09:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21113 and previous config saved to /var/cache/conftool/dbconfig/20220221-093729-root.json
09:34 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage1003.eqiad.wmnet with OS bullseye
09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1009.eqiad.wmnet with OS buster
09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21112 and previous config saved to /var/cache/conftool/dbconfig/20220221-093314-marostegui.json
09:24 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1003.eqiad.wmnet with reason: host reimage
09:24 godog: deploy prometheus-icinga-exporter 0.19 - T300951
09:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P21111 and previous config saved to /var/cache/conftool/dbconfig/20220221-092226-root.json
09:22 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS bullseye
09:22 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-staging2001.codfw.wmnet with OS bullseye
09:22 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS bullseye
09:22 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
09:20 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1003.eqiad.wmnet with reason: host reimage
09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P21110 and previous config saved to /var/cache/conftool/dbconfig/20220221-091809-marostegui.json
09:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1009.eqiad.wmnet with reason: host reimage
09:04 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubestage1003.eqiad.wmnet with OS bullseye
09:03 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1009.eqiad.wmnet with reason: host reimage
09:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300381)', diff saved to https://phabricator.wikimedia.org/P21109 and previous config saved to /var/cache/conftool/dbconfig/20220221-090305-marostegui.json
08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300381)', diff saved to https://phabricator.wikimedia.org/P21108 and previous config saved to /var/cache/conftool/dbconfig/20220221-085745-marostegui.json
08:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
08:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
08:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
08:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
08:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
08:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
08:50 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1009.eqiad.wmnet with OS buster
08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
08:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21107 and previous config saved to /var/cache/conftool/dbconfig/20220221-084802-marostegui.json
08:38 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=kubernetes-staging,service=kubesvc
08:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P21106 and previous config saved to /var/cache/conftool/dbconfig/20220221-083257-marostegui.json
08:22 godog: update karma to 0.99 on alert* hosts - T284213
08:21 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage2002.codfw.wmnet with OS bullseye
08:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P21105 and previous config saved to /var/cache/conftool/dbconfig/20220221-081752-marostegui.json
08:11 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
08:10 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
08:09 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage2002.codfw.wmnet with reason: host reimage
08:07 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage2002.codfw.wmnet with reason: host reimage
08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21104 and previous config saved to /var/cache/conftool/dbconfig/20220221-080248-marostegui.json
07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21103 and previous config saved to /var/cache/conftool/dbconfig/20220221-075800-marostegui.json
07:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
07:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
07:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
07:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
07:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
07:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
07:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21102 and previous config saved to /var/cache/conftool/dbconfig/20220221-075336-marostegui.json
07:48 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubestage2002.codfw.wmnet with OS bullseye
07:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P21101 and previous config saved to /var/cache/conftool/dbconfig/20220221-073831-marostegui.json
07:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
07:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
07:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
07:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
07:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P21100 and previous config saved to /var/cache/conftool/dbconfig/20220221-072326-marostegui.json
07:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
07:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
07:11 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
07:10 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
07:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
07:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
07:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
07:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
07:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21099 and previous config saved to /var/cache/conftool/dbconfig/20220221-070822-marostegui.json
07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T300381)', diff saved to https://phabricator.wikimedia.org/P21098 and previous config saved to /var/cache/conftool/dbconfig/20220221-070240-marostegui.json
07:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
07:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300381)', diff saved to https://phabricator.wikimedia.org/P21097 and previous config saved to /var/cache/conftool/dbconfig/20220221-070233-marostegui.json
06:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
06:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2107.codfw.wmnet with reason: Maintenance
06:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
06:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
06:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T298554)', diff saved to https://phabricator.wikimedia.org/P21096 and previous config saved to /var/cache/conftool/dbconfig/20220221-065220-ladsgroup.json
06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P21095 and previous config saved to /var/cache/conftool/dbconfig/20220221-064728-marostegui.json
06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1107.eqiad.wmnet with OS bullseye
06:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P21093 and previous config saved to /var/cache/conftool/dbconfig/20220221-063713-ladsgroup.json
06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P21092 and previous config saved to /var/cache/conftool/dbconfig/20220221-063223-marostegui.json
06:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1107.eqiad.wmnet with reason: host reimage
06:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1107.eqiad.wmnet with reason: host reimage
06:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P21091 and previous config saved to /var/cache/conftool/dbconfig/20220221-062206-ladsgroup.json
06:20 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1107.eqiad.wmnet with OS bullseye
06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300381)', diff saved to https://phabricator.wikimedia.org/P21090 and previous config saved to /var/cache/conftool/dbconfig/20220221-061719-marostegui.json
06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300381)', diff saved to https://phabricator.wikimedia.org/P21089 and previous config saved to /var/cache/conftool/dbconfig/20220221-061205-marostegui.json
06:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
06:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T300775)', diff saved to https://phabricator.wikimedia.org/P21088 and previous config saved to /var/cache/conftool/dbconfig/20220221-060804-marostegui.json
06:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
06:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1179.eqiad.wmnet with reason: Maintenance
06:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T298554)', diff saved to https://phabricator.wikimedia.org/P21087 and previous config saved to /var/cache/conftool/dbconfig/20220221-060701-ladsgroup.json
05:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T298554)', diff saved to https://phabricator.wikimedia.org/P21086 and previous config saved to /var/cache/conftool/dbconfig/20220221-054612-ladsgroup.json
05:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
05:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
05:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21085 and previous config saved to /var/cache/conftool/dbconfig/20220221-054604-ladsgroup.json
05:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P21084 and previous config saved to /var/cache/conftool/dbconfig/20220221-053059-ladsgroup.json
05:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P21083 and previous config saved to /var/cache/conftool/dbconfig/20220221-051555-ladsgroup.json
05:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21082 and previous config saved to /var/cache/conftool/dbconfig/20220221-050050-ladsgroup.json
04:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2083 (T302185)', diff saved to https://phabricator.wikimedia.org/P21081 and previous config saved to /var/cache/conftool/dbconfig/20220221-045516-ladsgroup.json
04:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2083.codfw.wmnet with OS bullseye
04:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2083.codfw.wmnet with reason: host reimage
04:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21080 and previous config saved to /var/cache/conftool/dbconfig/20220221-043358-ladsgroup.json
04:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
04:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
04:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298554)', diff saved to https://phabricator.wikimedia.org/P21079 and previous config saved to /var/cache/conftool/dbconfig/20220221-043350-ladsgroup.json
04:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2083.codfw.wmnet with reason: host reimage
04:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P21078 and previous config saved to /var/cache/conftool/dbconfig/20220221-041846-ladsgroup.json
04:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2083.codfw.wmnet with OS bullseye
04:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2083 (T302185)', diff saved to https://phabricator.wikimedia.org/P21077 and previous config saved to /var/cache/conftool/dbconfig/20220221-041529-ladsgroup.json
04:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2083.codfw.wmnet with reason: Maintenance
04:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2083.codfw.wmnet with reason: Maintenance
04:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2084 (T302185)', diff saved to https://phabricator.wikimedia.org/P21076 and previous config saved to /var/cache/conftool/dbconfig/20220221-041123-ladsgroup.json
04:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P21075 and previous config saved to /var/cache/conftool/dbconfig/20220221-040341-ladsgroup.json
03:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2084.codfw.wmnet with OS bullseye
03:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T298554)', diff saved to https://phabricator.wikimedia.org/P21074 and previous config saved to /var/cache/conftool/dbconfig/20220221-034836-ladsgroup.json
03:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2084.codfw.wmnet with reason: host reimage
03:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T298554)', diff saved to https://phabricator.wikimedia.org/P21073 and previous config saved to /var/cache/conftool/dbconfig/20220221-034100-ladsgroup.json
03:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
03:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
03:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T298554)', diff saved to https://phabricator.wikimedia.org/P21072 and previous config saved to /var/cache/conftool/dbconfig/20220221-034052-ladsgroup.json
03:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2084.codfw.wmnet with reason: host reimage
03:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2084.codfw.wmnet with OS bullseye
03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2084 (T302185)', diff saved to https://phabricator.wikimedia.org/P21071 and previous config saved to /var/cache/conftool/dbconfig/20220221-032548-ladsgroup.json
03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P21070 and previous config saved to /var/cache/conftool/dbconfig/20220221-032548-ladsgroup.json
03:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2084.codfw.wmnet with reason: Maintenance
03:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2084.codfw.wmnet with reason: Maintenance
03:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2091 (T302185)', diff saved to https://phabricator.wikimedia.org/P21069 and previous config saved to /var/cache/conftool/dbconfig/20220221-031602-ladsgroup.json
03:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P21068 and previous config saved to /var/cache/conftool/dbconfig/20220221-031039-ladsgroup.json
03:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2091.codfw.wmnet with OS bullseye
02:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T298554)', diff saved to https://phabricator.wikimedia.org/P21067 and previous config saved to /var/cache/conftool/dbconfig/20220221-025534-ladsgroup.json
02:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2091.codfw.wmnet with reason: host reimage
02:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2091.codfw.wmnet with reason: host reimage
02:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T298554)', diff saved to https://phabricator.wikimedia.org/P21066 and previous config saved to /var/cache/conftool/dbconfig/20220221-023852-ladsgroup.json
02:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
02:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
02:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
02:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
02:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2091.codfw.wmnet with OS bullseye
02:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2091 (T302185)', diff saved to https://phabricator.wikimedia.org/P21065 and previous config saved to /var/cache/conftool/dbconfig/20220221-023158-ladsgroup.json
02:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2091.codfw.wmnet with reason: Maintenance
02:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2091.codfw.wmnet with reason: Maintenance
02:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T302185)', diff saved to https://phabricator.wikimedia.org/P21064 and previous config saved to /var/cache/conftool/dbconfig/20220221-022259-ladsgroup.json
02:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
02:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
02:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T298554)', diff saved to https://phabricator.wikimedia.org/P21063 and previous config saved to /var/cache/conftool/dbconfig/20220221-021943-ladsgroup.json
02:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2152.codfw.wmnet with OS bullseye
02:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P21062 and previous config saved to /var/cache/conftool/dbconfig/20220221-020438-ladsgroup.json
01:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2152.codfw.wmnet with reason: host reimage
01:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2152.codfw.wmnet with reason: host reimage
01:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P21061 and previous config saved to /var/cache/conftool/dbconfig/20220221-014934-ladsgroup.json
01:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2152.codfw.wmnet with OS bullseye
01:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2152 (T302185)', diff saved to https://phabricator.wikimedia.org/P21060 and previous config saved to /var/cache/conftool/dbconfig/20220221-013811-ladsgroup.json
01:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
01:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2152.codfw.wmnet with reason: Maintenance
01:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T298554)', diff saved to https://phabricator.wikimedia.org/P21059 and previous config saved to /var/cache/conftool/dbconfig/20220221-013429-ladsgroup.json
01:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T298554)', diff saved to https://phabricator.wikimedia.org/P21058 and previous config saved to /var/cache/conftool/dbconfig/20220221-012649-ladsgroup.json
01:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
01:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
01:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21057 and previous config saved to /var/cache/conftool/dbconfig/20220221-012642-ladsgroup.json
01:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P21056 and previous config saved to /var/cache/conftool/dbconfig/20220221-011137-ladsgroup.json
00:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P21055 and previous config saved to /var/cache/conftool/dbconfig/20220221-005632-ladsgroup.json
00:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21054 and previous config saved to /var/cache/conftool/dbconfig/20220221-004128-ladsgroup.json
00:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T298554)', diff saved to https://phabricator.wikimedia.org/P21053 and previous config saved to /var/cache/conftool/dbconfig/20220221-001641-ladsgroup.json
00:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
00:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance

2022-02-20

12:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
12:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
12:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
12:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
12:27 taavi@deploy1002: Synchronized private/PrivateSettings.php: T302047 (duration: 00m 49s)

2022-02-19

16:50 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
16:49 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
16:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
16:40 ladsgroup@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 48s)
16:38 ladsgroup@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 48s)
16:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
16:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
16:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
16:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
12:24 _joe_: restarted php-fpm on wtp1027
03:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
03:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
03:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
03:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
03:25 legoktm@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 47s)
03:03 legoktm@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 31s)
03:00 legoktm@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 48s)
02:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
02:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
02:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
02:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
02:46 legoktm@deploy1002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 00m 37s)
02:29 ladsgroup@deploy1002: Synchronized private/PrivateSettings.php: T302047 (duration: 00m 48s)
02:16 ladsgroup@deploy1002: Synchronized private/PrivateSettings.php: T302047 (duration: 00m 48s)
02:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2022.codfw.wmnet with OS bullseye
02:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
02:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2022.codfw.wmnet with reason: host reimage
02:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
02:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
01:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
01:58 cdanis@deploy1002: Synchronized wmf-config/InitialiseSettings.php: disable wmgEmergencyCaptcha and enable AbuseFilter throttling for enwiki aebac8fe1 7618ff941 T302047 (duration: 00m 48s)
01:57 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2022.codfw.wmnet with reason: host reimage
01:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kubernetes2022.codfw.wmnet with OS bullseye
01:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
01:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
01:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
01:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
01:34 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2021.codfw.wmnet with OS bullseye
01:33 legoktm@deploy1002: Synchronized private/PrivateSettings.php: T302047 tweaks (duration: 00m 48s)
01:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
01:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
01:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
01:24 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2021.codfw.wmnet with reason: host reimage
01:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
01:21 legoktm@deploy1002: Synchronized private/PrivateSettings.php: T302047 (duration: 00m 49s)
01:19 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2021.codfw.wmnet with reason: host reimage
01:01 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kubernetes2021.codfw.wmnet with OS bullseye
00:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2020.codfw.wmnet with OS bullseye
00:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2020.codfw.wmnet with reason: host reimage
00:45 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2020.codfw.wmnet with reason: host reimage
00:27 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kubernetes2020.codfw.wmnet with OS bullseye
00:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2019.codfw.wmnet with OS bullseye
00:09 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes2019.codfw.wmnet with reason: host reimage
00:05 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes2019.codfw.wmnet with reason: host reimage

2022-02-18

23:47 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host kubernetes2019.codfw.wmnet with OS bullseye
23:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
23:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
23:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
23:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
23:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
23:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
23:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
23:34 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Revert "Revert "enable wmgEmergencyCaptcha for enwiki"" (duration: 00m 50s)
23:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
23:32 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
23:27 pt1979@cumin2002: START - Cookbook sre.dns.netbox
23:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2001.codfw.wmnet with OS bullseye
23:08 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2001.codfw.wmnet with reason: host reimage
23:04 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2001.codfw.wmnet with reason: host reimage
22:46 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2001.codfw.wmnet with OS bullseye
22:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
22:36 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2003.codfw.wmnet with OS bullseye
22:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2003.codfw.wmnet with reason: host reimage
22:20 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
22:17 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2003.codfw.wmnet with reason: host reimage
21:59 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2003.codfw.wmnet with OS bullseye
21:56 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2002.codfw.wmnet with OS bullseye
21:46 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2002.codfw.wmnet with reason: host reimage
21:43 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2002.codfw.wmnet with reason: host reimage
21:24 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2002.codfw.wmnet with OS bullseye
21:22 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-cache2002.codfw.wmnet with OS bullseye
20:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2002.codfw.wmnet with OS bullseye
18:06 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
17:46 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300774)', diff saved to https://phabricator.wikimedia.org/P21045 and previous config saved to /var/cache/conftool/dbconfig/20220218-174640-kormat.json
17:31 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21044 and previous config saved to /var/cache/conftool/dbconfig/20220218-173135-kormat.json
17:26 ariel@deploy1002: Finished deploy [dumps/dumps@f7c16d4]: noop script, dup jobname check for api jobs, do flow dumps in pieces like stubs (duration: 00m 03s)
17:26 ariel@deploy1002: Started deploy [dumps/dumps@f7c16d4]: noop script, dup jobname check for api jobs, do flow dumps in pieces like stubs
17:16 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P21043 and previous config saved to /var/cache/conftool/dbconfig/20220218-171630-kormat.json
17:04 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
17:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
17:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
17:02 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes2022.mgmt.codfw.wmnet with reboot policy FORCED
17:02 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
17:01 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300774)', diff saved to https://phabricator.wikimedia.org/P21042 and previous config saved to /var/cache/conftool/dbconfig/20220218-170125-kormat.json
16:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
16:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
16:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
16:55 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kubernetes2022.mgmt.codfw.wmnet with reboot policy FORCED
16:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
16:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes2021.mgmt.codfw.wmnet with reboot policy FORCED
16:47 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kubernetes2021.mgmt.codfw.wmnet with reboot policy FORCED
16:46 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300774)', diff saved to https://phabricator.wikimedia.org/P21041 and previous config saved to /var/cache/conftool/dbconfig/20220218-164434-kormat.json
16:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
16:45 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
16:44 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300774)', diff saved to https://phabricator.wikimedia.org/P21040 and previous config saved to /var/cache/conftool/dbconfig/20220218-164427-kormat.json
16:42 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes2020.mgmt.codfw.wmnet with reboot policy FORCED
16:34 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kubernetes2020.mgmt.codfw.wmnet with reboot policy FORCED
16:34 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
16:34 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kubernetes2019.mgmt.codfw.wmnet with reboot policy FORCED
16:29 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21039 and previous config saved to /var/cache/conftool/dbconfig/20220218-162922-kormat.json
16:23 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kubernetes2019.mgmt.codfw.wmnet with reboot policy FORCED
16:14 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P21038 and previous config saved to /var/cache/conftool/dbconfig/20220218-161417-kormat.json
16:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
16:10 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
16:07 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-cache2003.mgmt.codfw.wmnet with reboot policy FORCED
15:59 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300774)', diff saved to https://phabricator.wikimedia.org/P21037 and previous config saved to /var/cache/conftool/dbconfig/20220218-155912-kormat.json
15:57 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300774)', diff saved to https://phabricator.wikimedia.org/P21036 and previous config saved to /var/cache/conftool/dbconfig/20220218-155659-kormat.json
15:57 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
15:56 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
15:56 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300774)', diff saved to https://phabricator.wikimedia.org/P21035 and previous config saved to /var/cache/conftool/dbconfig/20220218-155652-kormat.json
15:56 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
15:52 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-cache2003.mgmt.codfw.wmnet with reboot policy FORCED
15:50 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-cache2002.mgmt.codfw.wmnet with reboot policy FORCED
15:41 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21034 and previous config saved to /var/cache/conftool/dbconfig/20220218-154147-kormat.json
15:38 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
15:34 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-cache2002.mgmt.codfw.wmnet with reboot policy FORCED
15:33 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
15:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
15:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
15:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
15:26 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P21033 and previous config saved to /var/cache/conftool/dbconfig/20220218-152641-kormat.json
15:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
15:21 cdanis@deploy1002: Synchronized wmf-config/InitialiseSettings.php: disable wmgEmergencyCaptcha for enwiki 286f99886 T302047 (duration: 00m 49s)
15:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
15:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
15:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
15:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
15:16 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
15:15 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-cache2001.mgmt.codfw.wmnet with reboot policy FORCED
15:14 cdanis@deploy1002: Synchronized wmf-config/InitialiseSettings.php: re-enable AbuseFilter throttling on enwiki 808d82dcd T302047 (duration: 00m 49s)
15:11 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300774)', diff saved to https://phabricator.wikimedia.org/P21032 and previous config saved to /var/cache/conftool/dbconfig/20220218-151136-kormat.json
14:58 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300774)', diff saved to https://phabricator.wikimedia.org/P21031 and previous config saved to /var/cache/conftool/dbconfig/20220218-145820-kormat.json
14:58 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
14:58 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
14:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1009.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
14:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1009.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
14:44 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
14:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
14:29 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
14:29 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
14:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
14:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
14:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
14:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
14:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
14:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
14:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21030 and previous config saved to /var/cache/conftool/dbconfig/20220218-141517-kormat.json
14:06 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
14:04 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
14:03 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
14:02 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
14:02 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
14:01 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
14:01 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
14:01 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
14:00 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
14:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21029 and previous config saved to /var/cache/conftool/dbconfig/20220218-140012-kormat.json
13:59 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
13:59 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
13:45 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P21028 and previous config saved to /var/cache/conftool/dbconfig/20220218-134508-kormat.json
13:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1012.eqiad.wmnet with OS buster
13:31 dcausse: restarting blazegraph on wdqs1012 (JVM stuck for 8 hours)
13:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21027 and previous config saved to /var/cache/conftool/dbconfig/20220218-133003-kormat.json
13:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1012.eqiad.wmnet with reason: host reimage
13:26 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1012.eqiad.wmnet with reason: host reimage
13:13 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21026 and previous config saved to /var/cache/conftool/dbconfig/20220218-131315-kormat.json
13:13 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
13:13 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
13:13 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300774)', diff saved to https://phabricator.wikimedia.org/P21025 and previous config saved to /var/cache/conftool/dbconfig/20220218-131307-kormat.json
13:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1012.eqiad.wmnet with OS buster
13:02 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1093.eqiad.wmnet with OS bullseye
12:58 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21024 and previous config saved to /var/cache/conftool/dbconfig/20220218-125802-kormat.json
12:42 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P21023 and previous config saved to /var/cache/conftool/dbconfig/20220218-124258-kormat.json
12:37 arturo: aborrero@apt1001:~$ sudo -i reprepro -C main includedeb bullseye-wikimedia /home/aborrero/prometheus-openstack-exporter_0.1.4-2_all.deb (T302050)
12:37 arturo: aborrero@apt1001:~$ sudo -i reprepro -C main includedeb buster-wikimedia /home/aborrero/prometheus-openstack-exporter_0.1.4-2_all.deb (T302050)
12:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300774)', diff saved to https://phabricator.wikimedia.org/P21022 and previous config saved to /var/cache/conftool/dbconfig/20220218-122753-kormat.json
12:22 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host elastic1093.eqiad.wmnet with OS bullseye
12:11 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300774)', diff saved to https://phabricator.wikimedia.org/P21021 and previous config saved to /var/cache/conftool/dbconfig/20220218-121126-kormat.json
12:11 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
12:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
12:11 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
12:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
12:11 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21020 and previous config saved to /var/cache/conftool/dbconfig/20220218-121113-kormat.json
12:11 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:08 cmooney@cumin1001: START - Cookbook sre.dns.netbox
12:08 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
12:05 cmooney@cumin1001: START - Cookbook sre.dns.netbox
11:56 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P21019 and previous config saved to /var/cache/conftool/dbconfig/20220218-115608-kormat.json
11:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1017.eqiad.wmnet with OS buster
11:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1017.eqiad.wmnet with reason: host reimage
11:41 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1017.eqiad.wmnet with reason: host reimage
11:41 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P21018 and previous config saved to /var/cache/conftool/dbconfig/20220218-114103-kormat.json
11:27 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1017.eqiad.wmnet with OS buster
11:26 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21017 and previous config saved to /var/cache/conftool/dbconfig/20220218-112558-kormat.json
11:05 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21016 and previous config saved to /var/cache/conftool/dbconfig/20220218-110506-kormat.json
11:05 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
11:05 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
11:05 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21015 and previous config saved to /var/cache/conftool/dbconfig/20220218-110459-kormat.json
10:50 moritzm: installing zsh security updates on stretch
10:49 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21014 and previous config saved to /var/cache/conftool/dbconfig/20220218-104954-kormat.json
10:43 Emperor: truncate swift/server.log.1 to 10G on thanos-be2001 T301657
10:37 Emperor: rsyslog-rotate to clear held-open server.log.1 (ms-be[2028-2030,2032,2037-2038,2040,2046-2047,2050-2051,2053-2054,2057,2060,2063,2065].codfw.wmnet,ms-be[1028-1031,1035-1038,1042,1046,1048-1049,1054,1058-1060,1065,1067].eqiad.wmnet,thanos-be2001.codfw.wmnet) T301657
10:34 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21013 and previous config saved to /var/cache/conftool/dbconfig/20220218-103449-kormat.json
10:20 godog: truncate /var/log/swift/server.log.1 to 30G due to full root fs - T301657
10:19 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21012 and previous config saved to /var/cache/conftool/dbconfig/20220218-101945-kormat.json
10:01 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T300774)', diff saved to https://phabricator.wikimedia.org/P21011 and previous config saved to /var/cache/conftool/dbconfig/20220218-100135-kormat.json
10:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
10:01 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
10:00 kormat: deploying schema change to s2 T300774
09:35 moritzm: draining instances off ganeti1009
09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1022.eqiad.wmnet with OS buster
09:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1022.eqiad.wmnet with reason: host reimage
09:01 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2001.codfw.wmnet
08:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1022.eqiad.wmnet with reason: host reimage
08:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2002.codfw.wmnet
08:54 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2002.codfw.wmnet
08:53 kart_: Updated cxserver to 2022-02-15-050044-production (T301443)
08:52 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
08:50 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
08:47 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
08:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1022.eqiad.wmnet with OS buster
08:45 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
08:39 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
08:39 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
08:19 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
08:19 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
07:57 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
07:57 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
07:57 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
07:57 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
07:42 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
07:42 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
07:41 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
07:41 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
02:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
02:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
02:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
02:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
02:12 cdanis@deploy1002: Synchronized wmf-config/InitialiseSettings.php: enable wmgEmergencyCaptcha for enwiki ff2f7ef64 T302047 (duration: 00m 49s)
02:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
02:03 pt1979@cumin2002: START - Cookbook sre.dns.netbox
02:03 cdanis@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Disable AbuseFilter throttling on enwiki 6692b4642 T302047 (duration: 00m 49s)

2022-02-17

22:28 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:25 pt1979@cumin2002: START - Cookbook sre.dns.netbox
21:19 razzi@cumin1001: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=93) for new host datahubsearch1002.eqiad.wmnet
20:04 dcausse@deploy1002: Finished deploy [wikimedia/discovery/analytics@66350a9]: (no justification provided) (duration: 02m 02s)
20:02 dcausse@deploy1002: Started deploy [wikimedia/discovery/analytics@66350a9]: (no justification provided)
19:54 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase-dev2003.codfw.wmnet with OS buster
19:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P21009 and previous config saved to /var/cache/conftool/dbconfig/20220217-195302-ladsgroup.json
19:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase-dev2003.codfw.wmnet with reason: host reimage
19:41 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase-dev2003.codfw.wmnet with reason: host reimage
19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21008 and previous config saved to /var/cache/conftool/dbconfig/20220217-193757-ladsgroup.json
19:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase-dev2002.codfw.wmnet with OS buster
19:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase-dev2002.codfw.wmnet with reason: host reimage
19:24 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host restbase-dev2003.codfw.wmnet with OS buster
19:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P21007 and previous config saved to /var/cache/conftool/dbconfig/20220217-192252-ladsgroup.json
19:22 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase-dev2002.codfw.wmnet with reason: host reimage
19:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase-dev2001.codfw.wmnet with OS buster
19:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase-dev2001.codfw.wmnet with reason: host reimage
19:08 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:08 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on restbase-dev2001.codfw.wmnet with reason: host reimage
19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P21006 and previous config saved to /var/cache/conftool/dbconfig/20220217-190748-ladsgroup.json
19:04 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host restbase-dev2002.codfw.wmnet with OS buster
19:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
18:54 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300774)', diff saved to https://phabricator.wikimedia.org/P21005 and previous config saved to /var/cache/conftool/dbconfig/20220217-185414-kormat.json
18:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P21004 and previous config saved to /var/cache/conftool/dbconfig/20220217-185414-ladsgroup.json
18:50 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host restbase-dev2001.codfw.wmnet with OS buster
18:39 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P21003 and previous config saved to /var/cache/conftool/dbconfig/20220217-183910-kormat.json
18:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P21002 and previous config saved to /var/cache/conftool/dbconfig/20220217-183909-ladsgroup.json
18:34 accraze@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
18:31 accraze@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
18:24 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P21001 and previous config saved to /var/cache/conftool/dbconfig/20220217-182405-kormat.json
18:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P21000 and previous config saved to /var/cache/conftool/dbconfig/20220217-182405-ladsgroup.json
18:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20999 and previous config saved to /var/cache/conftool/dbconfig/20220217-180900-ladsgroup.json
18:06 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300774)', diff saved to https://phabricator.wikimedia.org/P20998 and previous config saved to /var/cache/conftool/dbconfig/20220217-180647-kormat.json
18:06 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
18:06 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
18:06 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P20997 and previous config saved to /var/cache/conftool/dbconfig/20220217-180639-kormat.json
17:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1105.eqiad.wmnet with OS bullseye
17:54 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on datahubsearch1001.eqiad.wmnet with reason: Node is being set up for first time and puppet run failed
17:54 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on datahubsearch1001.eqiad.wmnet with reason: Node is being set up for first time and puppet run failed
17:53 razzi@cumin1001: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on an-test-coord1001.eqiad.wmnet with reason: Still troubleshooting mariadb issues
17:53 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-test-coord1001.eqiad.wmnet with reason: Still troubleshooting mariadb issues
17:51 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P20995 and previous config saved to /var/cache/conftool/dbconfig/20220217-175135-kormat.json
17:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1105.eqiad.wmnet with reason: host reimage
17:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1105.eqiad.wmnet with reason: host reimage
17:36 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P20994 and previous config saved to /var/cache/conftool/dbconfig/20220217-173630-kormat.json
17:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1105.eqiad.wmnet with OS bullseye
17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20993 and previous config saved to /var/cache/conftool/dbconfig/20220217-172650-ladsgroup.json
17:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20992 and previous config saved to /var/cache/conftool/dbconfig/20220217-172504-ladsgroup.json
17:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
17:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
17:21 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P20991 and previous config saved to /var/cache/conftool/dbconfig/20220217-172124-kormat.json
17:19 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=kubernetes-staging,service=kubesvc
17:19 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestage2001.codfw.wmnet with OS bullseye
17:11 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1002.eqiad.wmnet
17:11 razzi@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host datahubsearch1002.eqiad.wmnet
17:09 XioNoX: stop advertising drmrs from esams
16:47 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:42 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1002.eqiad.wmnet
16:42 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage2001.codfw.wmnet with reason: host reimage
16:39 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage2001.codfw.wmnet with reason: host reimage
16:27 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
16:21 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300774)', diff saved to https://phabricator.wikimedia.org/P20990 and previous config saved to /var/cache/conftool/dbconfig/20220217-162104-kormat.json
16:21 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
16:21 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
16:21 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20989 and previous config saved to /var/cache/conftool/dbconfig/20220217-162056-kormat.json
16:20 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kubestage2001.codfw.wmnet with OS bullseye
16:05 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20988 and previous config saved to /var/cache/conftool/dbconfig/20220217-160551-kormat.json
15:50 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20987 and previous config saved to /var/cache/conftool/dbconfig/20220217-155047-kormat.json
15:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM testvm2002.codfw.wmnet
15:46 ejegg: updated fundraising CiviCRM from 84953e1d to 2874d623
15:41 jmm@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM testvm2002.codfw.wmnet
15:35 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20986 and previous config saved to /var/cache/conftool/dbconfig/20220217-153542-kormat.json
15:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on testvm[2001-2003].codfw.wmnet with reason: Instance restarts
15:26 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on testvm[2001-2003].codfw.wmnet with reason: Instance restarts
15:23 moritzm: imported openjdk-8 8u322-b06-1~deb11u1 for bullseye-wikimedia (forward port of latest Java 8 security fixes)
15:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1012.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
15:20 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1012.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
15:10 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20984 and previous config saved to /var/cache/conftool/dbconfig/20220217-151021-kormat.json
15:10 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
15:10 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
15:09 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
15:09 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
15:09 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
15:09 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
15:09 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20983 and previous config saved to /var/cache/conftool/dbconfig/20220217-150941-kormat.json
15:06 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:01 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
14:54 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20982 and previous config saved to /var/cache/conftool/dbconfig/20220217-145436-kormat.json
14:47 hashar: UTC evening backport and config training has completed.
14:45 hashar@deploy1002: Synchronized wmf-config/interwiki.php: Config: Regen interwiki cache to drop erroneous 'wikipedia' (T301936) (duration: 00m 48s)
14:44 dcausse@deploy1002: Finished deploy [wikimedia/discovery/analytics@3a25565]: (no justification provided) (duration: 02m 04s)
14:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
14:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
14:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
14:42 dcausse@deploy1002: Started deploy [wikimedia/discovery/analytics@3a25565]: (no justification provided)
14:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
14:39 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20981 and previous config saved to /var/cache/conftool/dbconfig/20220217-143931-kormat.json
14:32 hashar@deploy1002: Synchronized php-1.38.0-wmf.22/extensions/WikimediaMaintenance/dumpInterwiki.php: Backport: Stop excluding the 'wikipedia' interwiki prefix (T301936) (duration: 00m 48s)
14:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
14:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
14:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
14:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
14:24 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Enable RelatedArticles for desktop (non-mobile) view at zhwikinews (T299856) (duration: 00m 49s)
14:24 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20980 and previous config saved to /var/cache/conftool/dbconfig/20220217-142427-kormat.json
14:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
14:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
14:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
14:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
14:19 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: General cleanup, wgAddGroups (R-Z) (T301647) (no-op) (duration: 00m 50s)
13:58 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20979 and previous config saved to /var/cache/conftool/dbconfig/20220217-135831-kormat.json
13:58 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
13:58 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
13:43 moritzm: installing paramiko security updates
13:35 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
13:35 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
13:18 moritzm: installing zsh security updates
13:11 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
13:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
13:11 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300774)', diff saved to https://phabricator.wikimedia.org/P20977 and previous config saved to /var/cache/conftool/dbconfig/20220217-131111-kormat.json
13:01 moritzm: installing expat security updates
12:56 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20976 and previous config saved to /var/cache/conftool/dbconfig/20220217-125607-kormat.json
12:41 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20975 and previous config saved to /var/cache/conftool/dbconfig/20220217-124102-kormat.json
12:25 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300774)', diff saved to https://phabricator.wikimedia.org/P20974 and previous config saved to /var/cache/conftool/dbconfig/20220217-122557-kormat.json
12:00 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300774)', diff saved to https://phabricator.wikimedia.org/P20973 and previous config saved to /var/cache/conftool/dbconfig/20220217-120014-kormat.json
12:00 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
12:00 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
12:00 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
12:00 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
12:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20972 and previous config saved to /var/cache/conftool/dbconfig/20220217-120001-kormat.json
11:44 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20971 and previous config saved to /var/cache/conftool/dbconfig/20220217-114456-kormat.json
11:29 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20970 and previous config saved to /var/cache/conftool/dbconfig/20220217-112951-kormat.json
11:28 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: elastic1046.eqiad.wmnet
11:28 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: elastic1046.eqiad.wmnet
11:27 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: elastic1043.eqiad.wmnet
11:27 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: elastic1043.eqiad.wmnet
11:14 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20969 and previous config saved to /var/cache/conftool/dbconfig/20220217-111447-kormat.json
11:01 moritzm: installing python3.5 security updates
10:46 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T300774)', diff saved to https://phabricator.wikimedia.org/P20968 and previous config saved to /var/cache/conftool/dbconfig/20220217-104653-kormat.json
10:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
10:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
10:46 kormat: running schema change against s5 T300774
10:32 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
10:32 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
09:50 moritzm: migrate instances off ganeti1012
09:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1017.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
09:46 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1017.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
09:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
09:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
09:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
09:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
09:39 hashar@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.22 refs T300198
08:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
08:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
08:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
08:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
08:26 urbanecm: UTC early B&C now really done
08:26 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: c0cbd30: Deploy Growth features to 100% of newcomers on most Wikipedias (T301820) (duration: 00m 50s)
08:22 apergos: UTC early B&C window NOT completed, woops.
08:21 apergos: UTC early B&C window completed
08:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
08:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
08:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
08:10 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Enable SectionTranslation in Occitan and Luganda WPs + CX out-of-Beta for Luganda WP (T301443) (duration: 00m 51s)
08:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
06:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
06:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300381)', diff saved to https://phabricator.wikimedia.org/P20967 and previous config saved to /var/cache/conftool/dbconfig/20220217-062708-marostegui.json
06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20966 and previous config saved to /var/cache/conftool/dbconfig/20220217-061203-marostegui.json
05:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20965 and previous config saved to /var/cache/conftool/dbconfig/20220217-055659-marostegui.json
05:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300381)', diff saved to https://phabricator.wikimedia.org/P20964 and previous config saved to /var/cache/conftool/dbconfig/20220217-054154-marostegui.json
04:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (T300381)', diff saved to https://phabricator.wikimedia.org/P20963 and previous config saved to /var/cache/conftool/dbconfig/20220217-041721-marostegui.json
04:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
04:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
04:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300381)', diff saved to https://phabricator.wikimedia.org/P20962 and previous config saved to /var/cache/conftool/dbconfig/20220217-041713-marostegui.json
04:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20961 and previous config saved to /var/cache/conftool/dbconfig/20220217-040208-marostegui.json
03:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20960 and previous config saved to /var/cache/conftool/dbconfig/20220217-034704-marostegui.json
03:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300381)', diff saved to https://phabricator.wikimedia.org/P20959 and previous config saved to /var/cache/conftool/dbconfig/20220217-033159-marostegui.json
02:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T300381)', diff saved to https://phabricator.wikimedia.org/P20958 and previous config saved to /var/cache/conftool/dbconfig/20220217-022128-marostegui.json
02:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
02:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
02:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20957 and previous config saved to /var/cache/conftool/dbconfig/20220217-022121-marostegui.json
02:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20956 and previous config saved to /var/cache/conftool/dbconfig/20220217-020616-marostegui.json
01:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20955 and previous config saved to /var/cache/conftool/dbconfig/20220217-015111-marostegui.json
01:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20954 and previous config saved to /var/cache/conftool/dbconfig/20220217-013607-marostegui.json
00:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20953 and previous config saved to /var/cache/conftool/dbconfig/20220217-001907-marostegui.json
00:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
00:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
00:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300381)', diff saved to https://phabricator.wikimedia.org/P20952 and previous config saved to /var/cache/conftool/dbconfig/20220217-001859-marostegui.json
00:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20951 and previous config saved to /var/cache/conftool/dbconfig/20220217-000355-marostegui.json

2022-02-16

23:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20950 and previous config saved to /var/cache/conftool/dbconfig/20220216-234850-marostegui.json
23:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300381)', diff saved to https://phabricator.wikimedia.org/P20949 and previous config saved to /var/cache/conftool/dbconfig/20220216-233345-marostegui.json
23:28 topranks: test reboot of lsw1-e1-eqiad - not in service.
23:09 tgr@deploy1002: Synchronized wmf-config/logos.php: Config: Use huwiki 500k milestone logos (T301923) (duration: 00m 49s)
23:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
23:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
23:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
23:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
22:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
22:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
22:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
22:58 tgr@deploy1002: Synchronized logos/config.yaml: Config: Add huwiki 500k milestone logos (T301923) (duration: 00m 49s)
22:57 tgr@deploy1002: Synchronized static/images/project-logos/: Config: Add huwiki 500k milestone logos (T301923) (duration: 00m 50s)
22:49 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: GrowthExperiments: Enable image recommendations on eswiki (T301276) (duration: 00m 52s)
22:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20948 and previous config saved to /var/cache/conftool/dbconfig/20220216-222329-root.json
22:15 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: T301165; errors expected, not serving any traffic
22:15 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on doh[6001-6002].wikimedia.org with reason: T301165; errors expected, not serving any traffic
22:15 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: T301165; errors expected, not serving any traffic
22:15 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on durum[6001-6002].drmrs.wmnet with reason: T301165; errors expected, not serving any traffic
22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T300381)', diff saved to https://phabricator.wikimedia.org/P20946 and previous config saved to /var/cache/conftool/dbconfig/20220216-221456-marostegui.json
22:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
22:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
22:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300381)', diff saved to https://phabricator.wikimedia.org/P20945 and previous config saved to /var/cache/conftool/dbconfig/20220216-221448-marostegui.json
22:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20944 and previous config saved to /var/cache/conftool/dbconfig/20220216-220826-root.json
21:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20943 and previous config saved to /var/cache/conftool/dbconfig/20220216-215944-marostegui.json
21:55 tgr@deploy1002: Synchronized php-1.38.0-wmf.22/includes/EditPage.php: Backport: EditPage: Parse wikitext in the usual way in the copyright message (T301890) (duration: 00m 49s)
21:54 mutante: merged Alex's changes, built prometheus-etherpad-exporter_0.6 on deneb, imported on apt1001, ran reprepro export, installed new version on etherpad1003 T301872
21:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20942 and previous config saved to /var/cache/conftool/dbconfig/20220216-215322-root.json
21:52 tgr: ran mwscript updateCollation.php abwiki --force
21:49 tgr@deploy1002: Synchronized php-1.38.0-wmf.22/includes/collation/AbkhazUppercaseCollation.php: Backport: Add Ӷ and Ԥ to Abkhaz collation (T298309) (duration: 00m 49s)
21:48 tgr@deploy1002: Synchronized php-1.38.0-wmf.21/includes/collation/AbkhazUppercaseCollation.php: Backport: Add Ӷ and Ԥ to Abkhaz collation (T298309) (duration: 00m 49s)
21:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20941 and previous config saved to /var/cache/conftool/dbconfig/20220216-214439-marostegui.json
21:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
21:41 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
21:41 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
21:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
21:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20940 and previous config saved to /var/cache/conftool/dbconfig/20220216-213819-root.json
21:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
21:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
21:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
21:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
21:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300381)', diff saved to https://phabricator.wikimedia.org/P20939 and previous config saved to /var/cache/conftool/dbconfig/20220216-212934-marostegui.json
21:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
21:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
21:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
21:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
21:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
21:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20938 and previous config saved to /var/cache/conftool/dbconfig/20220216-212315-root.json
21:16 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: General cleanup, wgAddGroups (J-P) (T301647) (duration: 00m 51s)
21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
21:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
21:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
20:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T300381)', diff saved to https://phabricator.wikimedia.org/P20937 and previous config saved to /var/cache/conftool/dbconfig/20220216-200922-marostegui.json
20:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
20:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
20:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300381)', diff saved to https://phabricator.wikimedia.org/P20936 and previous config saved to /var/cache/conftool/dbconfig/20220216-200914-marostegui.json
19:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20934 and previous config saved to /var/cache/conftool/dbconfig/20220216-195410-marostegui.json
19:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20933 and previous config saved to /var/cache/conftool/dbconfig/20220216-193905-marostegui.json
19:33 tzatziki: removing 28 files for legal compliance
19:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300381)', diff saved to https://phabricator.wikimedia.org/P20932 and previous config saved to /var/cache/conftool/dbconfig/20220216-192400-marostegui.json
19:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
19:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
19:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
18:49 mutante: deploying OTRS config change
18:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T300381)', diff saved to https://phabricator.wikimedia.org/P20931 and previous config saved to /var/cache/conftool/dbconfig/20220216-181706-marostegui.json
18:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
18:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
18:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300381)', diff saved to https://phabricator.wikimedia.org/P20930 and previous config saved to /var/cache/conftool/dbconfig/20220216-181651-marostegui.json
18:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20929 and previous config saved to /var/cache/conftool/dbconfig/20220216-180146-marostegui.json
17:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20926 and previous config saved to /var/cache/conftool/dbconfig/20220216-174641-marostegui.json
17:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300381)', diff saved to https://phabricator.wikimedia.org/P20925 and previous config saved to /var/cache/conftool/dbconfig/20220216-173137-marostegui.json
17:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
17:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
17:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
17:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
17:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
17:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
17:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
17:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
17:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gerrit2002.wikimedia.org with OS bullseye
17:25 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-eqiad: Restarting to pick up Java security updates - hnowlan@cumin1001
17:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
17:13 accraze@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
17:13 accraze@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
17:12 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gerrit2002.wikimedia.org with reason: host reimage
17:07 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2002.wikimedia.org with OS buster
16:58 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host gerrit2002.wikimedia.org with OS bullseye
16:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
16:54 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2002.wikimedia.org with reason: host reimage
16:51 mutante: contint2001 - temp disabled puppet (active CI server) - contint1001 - attempting to install newer docker version (gerrit:758987 T300682)
16:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host contint2002.wikimedia.org with OS buster
16:33 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20923 and previous config saved to /var/cache/conftool/dbconfig/20220216-163308-kormat.json
16:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
16:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
16:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
16:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
16:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/FlaggedRevs/backend/FlaggedRevs.php: Backport: Use ParserOutputAccess for accessing ParserOutput (T283029) (duration: 00m 49s)
16:18 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20922 and previous config saved to /var/cache/conftool/dbconfig/20220216-161803-kormat.json
16:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
16:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
16:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
16:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
16:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T300381)', diff saved to https://phabricator.wikimedia.org/P20921 and previous config saved to /var/cache/conftool/dbconfig/20220216-161054-marostegui.json
16:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
16:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
16:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300381)', diff saved to https://phabricator.wikimedia.org/P20920 and previous config saved to /var/cache/conftool/dbconfig/20220216-161047-marostegui.json
16:10 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/page/ParserOutputAccess.php: Backport: ParserOutputAccess: Cache Parsing inside the class as well (T301310) (duration: 00m 52s)
16:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
16:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
16:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
16:06 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.22/includes/page/ParserOutputAccess.php: Backport: ParserOutputAccess: Cache Parsing inside the class as well (T301310) (duration: 00m 54s)
16:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
16:02 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20919 and previous config saved to /var/cache/conftool/dbconfig/20220216-160257-kormat.json
15:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20918 and previous config saved to /var/cache/conftool/dbconfig/20220216-155542-marostegui.json
15:47 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20917 and previous config saved to /var/cache/conftool/dbconfig/20220216-154752-kormat.json
15:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20916 and previous config saved to /var/cache/conftool/dbconfig/20220216-154037-marostegui.json
15:35 moritzm: installing zsh security updates
15:35 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20915 and previous config saved to /var/cache/conftool/dbconfig/20220216-153456-kormat.json
15:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
15:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
15:34 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20914 and previous config saved to /var/cache/conftool/dbconfig/20220216-153448-kormat.json
15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300381)', diff saved to https://phabricator.wikimedia.org/P20913 and previous config saved to /var/cache/conftool/dbconfig/20220216-152529-marostegui.json
15:19 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20912 and previous config saved to /var/cache/conftool/dbconfig/20220216-151944-kormat.json
15:04 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20911 and previous config saved to /var/cache/conftool/dbconfig/20220216-150439-kormat.json
15:04 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
15:03 jelto@deploy1002: helmfile [staging] START helmfile.d/services/toolhub: apply
15:02 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:01 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/termbox: apply
15:00 jelto@deploy1002: helmfile [staging] START helmfile.d/services/termbox: apply
14:58 pt1979@cumin2002: START - Cookbook sre.dns.netbox
14:49 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20910 and previous config saved to /var/cache/conftool/dbconfig/20220216-144934-kormat.json
14:47 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300774)', diff saved to https://phabricator.wikimedia.org/P20909 and previous config saved to /var/cache/conftool/dbconfig/20220216-144726-kormat.json
14:47 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
14:47 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
14:44 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-eqiad: Restarting to pick up Java security updates - hnowlan@cumin1001
14:35 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
14:35 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
14:35 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20908 and previous config saved to /var/cache/conftool/dbconfig/20220216-143535-kormat.json
14:21 moritzm: migrate instances off ganeti1017
14:20 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20907 and previous config saved to /var/cache/conftool/dbconfig/20220216-142030-kormat.json
14:17 sukhe: disabled puppet on all doh* hosts except doh3001
14:17 moritzm: failover the ganeti master to ganeti1024 T296721
14:16 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic2073.mgmt.codfw.wmnet with reboot policy FORCED
14:16 volans@cumin2002: START - Cookbook sre.hosts.provision for host elastic2073.mgmt.codfw.wmnet with reboot policy FORCED
14:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T300381)', diff saved to https://phabricator.wikimedia.org/P20906 and previous config saved to /var/cache/conftool/dbconfig/20220216-141546-marostegui.json
14:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
14:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
14:13 mforns@deploy1002: Finished deploy [airflow-dags/analytics@8991326]: (no justification provided) (duration: 00m 07s)
14:13 mforns@deploy1002: Started deploy [airflow-dags/analytics@8991326]: (no justification provided)
14:05 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20905 and previous config saved to /var/cache/conftool/dbconfig/20220216-140526-kormat.json
13:50 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20903 and previous config saved to /var/cache/conftool/dbconfig/20220216-135021-kormat.json
13:46 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300774)', diff saved to https://phabricator.wikimedia.org/P20902 and previous config saved to /var/cache/conftool/dbconfig/20220216-134612-kormat.json
13:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
13:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
13:46 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300774)', diff saved to https://phabricator.wikimedia.org/P20901 and previous config saved to /var/cache/conftool/dbconfig/20220216-134559-kormat.json
13:30 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20900 and previous config saved to /var/cache/conftool/dbconfig/20220216-133054-kormat.json
13:29 jayme@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
13:29 jayme@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
13:29 jayme@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
13:28 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
13:27 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
13:27 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
13:24 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
13:23 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300775)', diff saved to https://phabricator.wikimedia.org/P20899 and previous config saved to /var/cache/conftool/dbconfig/20220216-132322-marostegui.json
13:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
13:23 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
13:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
13:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
13:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1112.eqiad.wmnet with reason: Maintenance
13:21 jayme@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
13:21 jayme@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
13:16 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:15 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20898 and previous config saved to /var/cache/conftool/dbconfig/20220216-131549-kormat.json
13:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
13:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
13:12 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
13:00 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300774)', diff saved to https://phabricator.wikimedia.org/P20897 and previous config saved to /var/cache/conftool/dbconfig/20220216-130044-kormat.json
12:46 moritzm: installing apache-log4j1.2 security updates
12:42 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300774)', diff saved to https://phabricator.wikimedia.org/P20896 and previous config saved to /var/cache/conftool/dbconfig/20220216-124232-kormat.json
12:42 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
12:42 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
12:42 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20895 and previous config saved to /var/cache/conftool/dbconfig/20220216-124225-kormat.json
12:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20894 and previous config saved to /var/cache/conftool/dbconfig/20220216-122720-kormat.json
12:12 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20893 and previous config saved to /var/cache/conftool/dbconfig/20220216-121215-kormat.json
12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300381)', diff saved to https://phabricator.wikimedia.org/P20892 and previous config saved to /var/cache/conftool/dbconfig/20220216-120840-marostegui.json
12:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20891 and previous config saved to /var/cache/conftool/dbconfig/20220216-120659-ladsgroup.json
12:06 moritzm: configure ganeti1024/ganeti1027/ganeti1028 as master candidates for eqiad Ganeti cluster
11:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1011.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
11:57 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20890 and previous config saved to /var/cache/conftool/dbconfig/20220216-115711-kormat.json
11:55 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1011.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
11:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1011.eqiad.wmnet
11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20889 and previous config saved to /var/cache/conftool/dbconfig/20220216-115336-marostegui.json
11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20888 and previous config saved to /var/cache/conftool/dbconfig/20220216-115155-ladsgroup.json
11:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1011.eqiad.wmnet
11:43 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300774)', diff saved to https://phabricator.wikimedia.org/P20887 and previous config saved to /var/cache/conftool/dbconfig/20220216-114310-kormat.json
11:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
11:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
11:43 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300774)', diff saved to https://phabricator.wikimedia.org/P20886 and previous config saved to /var/cache/conftool/dbconfig/20220216-114303-kormat.json
11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20885 and previous config saved to /var/cache/conftool/dbconfig/20220216-113831-marostegui.json
11:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20884 and previous config saved to /var/cache/conftool/dbconfig/20220216-113650-ladsgroup.json
11:27 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20883 and previous config saved to /var/cache/conftool/dbconfig/20220216-112758-kormat.json
11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300381)', diff saved to https://phabricator.wikimedia.org/P20882 and previous config saved to /var/cache/conftool/dbconfig/20220216-112326-marostegui.json
11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20881 and previous config saved to /var/cache/conftool/dbconfig/20220216-112145-ladsgroup.json
11:12 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20880 and previous config saved to /var/cache/conftool/dbconfig/20220216-111253-kormat.json
11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20879 and previous config saved to /var/cache/conftool/dbconfig/20220216-110816-ladsgroup.json
11:07 moritzm: restarting apache on prometheus nodes to pick up expat security updates
10:57 kormat@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300774)', diff saved to https://phabricator.wikimedia.org/P20878 and previous config saved to /var/cache/conftool/dbconfig/20220216-105748-kormat.json
10:55 kormat@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300774)', diff saved to https://phabricator.wikimedia.org/P20877 and previous config saved to /var/cache/conftool/dbconfig/20220216-105540-kormat.json
10:55 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
10:55 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20875 and previous config saved to /var/cache/conftool/dbconfig/20220216-105312-ladsgroup.json
10:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
10:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
10:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20873 and previous config saved to /var/cache/conftool/dbconfig/20220216-103807-ladsgroup.json
10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
10:31 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
10:31 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20872 and previous config saved to /var/cache/conftool/dbconfig/20220216-102302-ladsgroup.json
10:20 moritzm: installing expat security updates
10:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T300381)', diff saved to https://phabricator.wikimedia.org/P20871 and previous config saved to /var/cache/conftool/dbconfig/20220216-101354-marostegui.json
10:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
10:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
10:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20870 and previous config saved to /var/cache/conftool/dbconfig/20220216-101346-marostegui.json
09:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20869 and previous config saved to /var/cache/conftool/dbconfig/20220216-095841-marostegui.json
09:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1146.eqiad.wmnet with OS bullseye
09:52 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
09:50 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
09:45 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
09:44 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20868 and previous config saved to /var/cache/conftool/dbconfig/20220216-094337-marostegui.json
09:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1146.eqiad.wmnet with reason: host reimage
09:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1146.eqiad.wmnet with reason: host reimage
09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20867 and previous config saved to /var/cache/conftool/dbconfig/20220216-092832-marostegui.json
09:25 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main'.
09:24 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main'.
09:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1146.eqiad.wmnet with OS bullseye
09:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
09:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
09:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
09:09 hashar@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.22 refs T300198 (duration: 00m 49s)
09:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'T300510', diff saved to https://phabricator.wikimedia.org/P20866 and previous config saved to /var/cache/conftool/dbconfig/20220216-090924-ladsgroup.json
09:08 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.22 refs T300198
09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20865 and previous config saved to /var/cache/conftool/dbconfig/20220216-090737-ladsgroup.json
09:07 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:01 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
08:39 urbanecm: Set an email for developer account Osnard and re-enable it (T301796)
08:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20864 and previous config saved to /var/cache/conftool/dbconfig/20220216-083832-root.json
08:33 dcausse: restarting blazegraph on wdqs1005 (JVM stuck for 4 hours)
08:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20863 and previous config saved to /var/cache/conftool/dbconfig/20220216-082329-root.json
08:18 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts prometheus1004.eqiad.wmnet
08:13 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: 9001a8c: Use $wgGroupInheritsPermissions for "confirmed" group (T275334; 2/2) (duration: 03m 39s)
08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
08:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
08:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
08:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20862 and previous config saved to /var/cache/conftool/dbconfig/20220216-081056-marostegui.json
08:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
08:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
08:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
08:10 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
08:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 9001a8c: Use $wgGroupInheritsPermissions for "confirmed" group (T275334; 1/2) (duration: 00m 51s)
08:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20861 and previous config saved to /var/cache/conftool/dbconfig/20220216-080825-root.json
08:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20860 and previous config saved to /var/cache/conftool/dbconfig/20220216-080717-ladsgroup.json
08:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20859 and previous config saved to /var/cache/conftool/dbconfig/20220216-080531-ladsgroup.json
08:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
08:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance
07:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20858 and previous config saved to /var/cache/conftool/dbconfig/20220216-075321-root.json
07:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20857 and previous config saved to /var/cache/conftool/dbconfig/20220216-073818-root.json
07:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: Maintenance
07:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1133.eqiad.wmnet with OS bullseye
07:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1133.eqiad.wmnet with reason: host reimage
07:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1133.eqiad.wmnet with reason: host reimage
07:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20856 and previous config saved to /var/cache/conftool/dbconfig/20220216-071125-ladsgroup.json
07:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
07:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
07:00 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1133.eqiad.wmnet with OS bullseye
06:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20855 and previous config saved to /var/cache/conftool/dbconfig/20220216-065620-ladsgroup.json
06:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20854 and previous config saved to /var/cache/conftool/dbconfig/20220216-064115-ladsgroup.json
06:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
06:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
06:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
06:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
06:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
06:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/FlaggedRevs/maintenance/pruneRevData.php: Backport: Clean up flaggedtemplate rows for deleted pages too (T296380) (duration: 00m 52s)
06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20853 and previous config saved to /var/cache/conftool/dbconfig/20220216-062610-ladsgroup.json
06:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
06:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
06:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
06:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
06:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
06:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1156.eqiad.wmnet with OS bullseye
06:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
06:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: host reimage
05:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1156.eqiad.wmnet with OS bullseye
05:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300510)', diff saved to https://phabricator.wikimedia.org/P20852 and previous config saved to /var/cache/conftool/dbconfig/20220216-054749-ladsgroup.json
05:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
05:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
05:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
05:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
05:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
05:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
05:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
05:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
05:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance
05:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance

2022-02-15

23:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED
23:40 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED
23:37 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2002.mgmt.codfw.wmnet with reboot policy FORCED
23:30 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2002.mgmt.codfw.wmnet with reboot policy FORCED
23:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2001.mgmt.codfw.wmnet with reboot policy FORCED
23:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host restbase-dev2001.mgmt.codfw.wmnet with reboot policy FORCED
23:15 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
23:14 tzatziki: Removing one file for legal compliance
23:10 pt1979@cumin2002: START - Cookbook sre.dns.netbox
23:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300381)', diff saved to https://phabricator.wikimedia.org/P20850 and previous config saved to /var/cache/conftool/dbconfig/20220215-230454-marostegui.json
22:55 tzatziki: Removing 5 files for legal compliance
22:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20849 and previous config saved to /var/cache/conftool/dbconfig/20220215-224950-marostegui.json
22:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20848 and previous config saved to /var/cache/conftool/dbconfig/20220215-223445-marostegui.json
22:28 jhuneidi@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: sync on production
22:27 jhuneidi@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply on staging
22:27 jhuneidi@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply on production
22:26 jhuneidi@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: sync on production
22:26 jhuneidi@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply on staging
22:25 jhuneidi@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply on production
22:24 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: sync on staging
22:23 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
22:23 jhuneidi@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
22:21 jhuneidi@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
22:21 jhuneidi@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
22:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300381)', diff saved to https://phabricator.wikimedia.org/P20847 and previous config saved to /var/cache/conftool/dbconfig/20220215-221940-marostegui.json
22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T300381)', diff saved to https://phabricator.wikimedia.org/P20846 and previous config saved to /var/cache/conftool/dbconfig/20220215-220041-marostegui.json
22:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
22:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300381)', diff saved to https://phabricator.wikimedia.org/P20845 and previous config saved to /var/cache/conftool/dbconfig/20220215-220034-marostegui.json
22:00 hoo: Updated the Wikidata property suggester with data from the 2022-02-07 JSON dump (with pre-applied T132839 workarounds)
21:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20844 and previous config saved to /var/cache/conftool/dbconfig/20220215-214529-marostegui.json
21:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:41 urbanecm: UTC late B&C window completed
21:41 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 2e0b51f: amiwiki: Deploy Growth features to newcomers (duration: 00m 49s)
21:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:36 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: b3e8161: Apply max width setting to all Wikisource page namespaces (T300563; 2/2) (duration: 00m 49s)
21:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:36 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: b3e8161: Apply max width setting to all Wikisource page namespaces (T300563; 1/2) (duration: 00m 50s)
21:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20843 and previous config saved to /var/cache/conftool/dbconfig/20220215-213024-marostegui.json
21:22 eileen: civicrm revision 815e3091 -> 84953e1d
21:20 eileen: localsettings checkout revision (02f4888c -> 2a6d2e45)
21:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300381)', diff saved to https://phabricator.wikimedia.org/P20842 and previous config saved to /var/cache/conftool/dbconfig/20220215-211519-marostegui.json
21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: d97b43e: Remove MFUseDesktopContributionsPage config (T300583) (duration: 00m 52s)
20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T300381)', diff saved to https://phabricator.wikimedia.org/P20841 and previous config saved to /var/cache/conftool/dbconfig/20220215-205547-marostegui.json
20:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
20:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300381)', diff saved to https://phabricator.wikimedia.org/P20840 and previous config saved to /var/cache/conftool/dbconfig/20220215-205539-marostegui.json
20:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20838 and previous config saved to /var/cache/conftool/dbconfig/20220215-204035-marostegui.json
20:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20837 and previous config saved to /var/cache/conftool/dbconfig/20220215-202530-marostegui.json
20:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300381)', diff saved to https://phabricator.wikimedia.org/P20836 and previous config saved to /var/cache/conftool/dbconfig/20220215-201025-marostegui.json
19:52 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1015.eqiad.wmnet with OS buster
19:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T300381)', diff saved to https://phabricator.wikimedia.org/P20835 and previous config saved to /var/cache/conftool/dbconfig/20220215-195051-marostegui.json
19:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
19:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
19:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300381)', diff saved to https://phabricator.wikimedia.org/P20834 and previous config saved to /var/cache/conftool/dbconfig/20220215-195042-marostegui.json
19:43 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1015.eqiad.wmnet with reason: host reimage
19:40 bblack@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1015.eqiad.wmnet with reason: host reimage
19:39 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic1093.mgmt.eqiad.wmnet with reboot policy FORCED
19:38 herron: beginning rolling restart of kafka-main clusters for updates
19:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20833 and previous config saved to /var/cache/conftool/dbconfig/20220215-193537-marostegui.json
19:30 cmooney@cumin1001: START - Cookbook sre.hosts.provision for host elastic1093.mgmt.eqiad.wmnet with reboot policy FORCED
19:30 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host lvs1015.eqiad.wmnet with OS buster
19:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:28 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:27 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:25 bblack@cumin1001: START - Cookbook sre.dns.netbox
19:23 cmooney@cumin1001: START - Cookbook sre.dns.netbox
19:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20832 and previous config saved to /var/cache/conftool/dbconfig/20220215-192033-marostegui.json
19:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:12 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.22/skins/Vector: Backport: Revert "Add fetch tests from WVUI" (duration: 01m 07s)
19:09 bblack: lvs1019 - start pybal/puppet with real routing, taking over low-traffic from lvs1020
19:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
19:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300381)', diff saved to https://phabricator.wikimedia.org/P20831 and previous config saved to /var/cache/conftool/dbconfig/20220215-190528-marostegui.json
18:58 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
18:53 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
18:50 bblack: cr[12]-eqiad - edit static fallback for low-traffic (lvs1015 -> lvs1019)
18:41 bblack: lvs1019 - disable puppet/pybal, reboot - T301142
18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T300381)', diff saved to https://phabricator.wikimedia.org/P20830 and previous config saved to /var/cache/conftool/dbconfig/20220215-184037-marostegui.json
18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20829 and previous config saved to /var/cache/conftool/dbconfig/20220215-184023-marostegui.json
18:39 herron: beginning rolling restart of kafka-logging clusters for updates
18:36 bblack: lvs1019 - first prod puppetization + pybal start
18:35 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host gerrit2002.mgmt.codfw.wmnet with reboot policy FORCED
18:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
18:27 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
18:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20828 and previous config saved to /var/cache/conftool/dbconfig/20220215-182519-marostegui.json
18:18 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase1031.eqiad.wmnet with OS buster
18:12 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1014.eqiad.wmnet with OS buster
18:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20827 and previous config saved to /var/cache/conftool/dbconfig/20220215-181012-marostegui.json
18:02 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1014.eqiad.wmnet with reason: host reimage
17:59 bblack@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1014.eqiad.wmnet with reason: host reimage
17:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20826 and previous config saved to /var/cache/conftool/dbconfig/20220215-175508-marostegui.json
17:48 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host lvs1014.eqiad.wmnet with OS buster
17:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
17:47 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1031.eqiad.wmnet with OS buster
17:42 bblack@cumin1001: START - Cookbook sre.dns.netbox
17:40 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
17:39 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: sync on main
17:38 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply on main
17:38 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply on main
17:38 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply on main
17:36 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: sync on main
17:36 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply on main
17:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20824 and previous config saved to /var/cache/conftool/dbconfig/20220215-173536-marostegui.json
17:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
17:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
17:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300381)', diff saved to https://phabricator.wikimedia.org/P20823 and previous config saved to /var/cache/conftool/dbconfig/20220215-173529-marostegui.json
17:34 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
17:33 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
17:32 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
17:32 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
17:26 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: sync on main
17:26 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply on main
17:20 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:restbase-codfw: Restarting to pick up Java security updates - hnowlan@cumin1001
17:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20822 and previous config saved to /var/cache/conftool/dbconfig/20220215-172024-marostegui.json
17:14 bblack: lvs1018 - bringing pybal online for production upload traffic
17:08 bblack: cr[12]-eqiad: manual edit static fallback route for high-traffic2 from lvs1014 to lvs1018 - T301142
17:06 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
17:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20821 and previous config saved to /var/cache/conftool/dbconfig/20220215-170520-marostegui.json
17:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1011.eqiad.wmnet with OS buster
16:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host contint2002.mgmt.codfw.wmnet with reboot policy FORCED
16:56 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
16:55 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:55 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
16:54 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1011.eqiad.wmnet with reason: host reimage
16:51 bblack: lvs1018 - reboot
16:51 pt1979@cumin2002: START - Cookbook sre.dns.netbox
16:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300381)', diff saved to https://phabricator.wikimedia.org/P20820 and previous config saved to /var/cache/conftool/dbconfig/20220215-165015-marostegui.json
16:50 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1011.eqiad.wmnet with reason: host reimage
16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024 (T300006)', diff saved to https://phabricator.wikimedia.org/P20819 and previous config saved to /var/cache/conftool/dbconfig/20220215-164611-ladsgroup.json
16:39 cwhite: logstash switchback to eqiad complete T299168
16:38 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
16:38 bblack: lvs1018 - puppeting into prod role for first time
16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024', diff saved to https://phabricator.wikimedia.org/P20818 and previous config saved to /var/cache/conftool/dbconfig/20220215-163106-ladsgroup.json
16:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T300381)', diff saved to https://phabricator.wikimedia.org/P20817 and previous config saved to /var/cache/conftool/dbconfig/20220215-162949-marostegui.json
16:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
16:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
16:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20816 and previous config saved to /var/cache/conftool/dbconfig/20220215-162941-marostegui.json
16:26 bblack: lvs1014 - downtimed - stopping puppet+pybal to fail traffic over to lvs1020 - T301142
16:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024', diff saved to https://phabricator.wikimedia.org/P20815 and previous config saved to /var/cache/conftool/dbconfig/20220215-161601-ladsgroup.json
16:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20814 and previous config saved to /var/cache/conftool/dbconfig/20220215-161436-marostegui.json
16:11 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus2004.codfw.wmnet
16:01 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus2004.codfw.wmnet
16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance es1024 (T300006)', diff saved to https://phabricator.wikimedia.org/P20813 and previous config saved to /var/cache/conftool/dbconfig/20220215-160055-ladsgroup.json
15:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20812 and previous config saved to /var/cache/conftool/dbconfig/20220215-155931-marostegui.json
15:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1024.eqiad.wmnet with OS bullseye
15:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20811 and previous config saved to /var/cache/conftool/dbconfig/20220215-154427-marostegui.json
15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T300381)', diff saved to https://phabricator.wikimedia.org/P20810 and previous config saved to /var/cache/conftool/dbconfig/20220215-152455-marostegui.json
15:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
15:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
15:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300381)', diff saved to https://phabricator.wikimedia.org/P20809 and previous config saved to /var/cache/conftool/dbconfig/20220215-152448-marostegui.json
15:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host es1024.eqiad.wmnet with OS bullseye
15:11 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus1004.eqiad.wmnet
15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20808 and previous config saved to /var/cache/conftool/dbconfig/20220215-151026-ladsgroup.json
15:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20807 and previous config saved to /var/cache/conftool/dbconfig/20220215-150943-marostegui.json
15:09 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS bullseye
14:56 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:restbase-codfw: Restarting to pick up Java security updates - hnowlan@cumin1001
14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20806 and previous config saved to /var/cache/conftool/dbconfig/20220215-145521-ladsgroup.json
14:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20805 and previous config saved to /var/cache/conftool/dbconfig/20220215-145438-marostegui.json
14:50 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
14:40 hnowlan: removing java packages from all maps hosts
14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20804 and previous config saved to /var/cache/conftool/dbconfig/20220215-144016-ladsgroup.json
14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300381)', diff saved to https://phabricator.wikimedia.org/P20803 and previous config saved to /var/cache/conftool/dbconfig/20220215-143934-marostegui.json
14:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:37 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS bullseye
14:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:30 Lucas_WMDE: UTC afternoon backport window done
14:28 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: General cleanup (T301647) (wgAddGroups F-I) (duration: 02m 41s)
14:28 moritzm: installing clamav security updates on otrs1001 / ticket.wikimedia.org
14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20800 and previous config saved to /var/cache/conftool/dbconfig/20220215-142511-ladsgroup.json
14:24 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts prometheus1004.eqiad.wmnet
14:23 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1004.eqiad.wmnet
14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T300381)', diff saved to https://phabricator.wikimedia.org/P20799 and previous config saved to /var/cache/conftool/dbconfig/20220215-141916-marostegui.json
14:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
14:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300381)', diff saved to https://phabricator.wikimedia.org/P20798 and previous config saved to /var/cache/conftool/dbconfig/20220215-141908-marostegui.json
14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20797 and previous config saved to /var/cache/conftool/dbconfig/20220215-141411-ladsgroup.json
14:07 hnowlan: removing java packages from maps2005
14:06 volans: deployed spicerack v2.0.0 on cumin hosts
14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300775)', diff saved to https://phabricator.wikimedia.org/P20796 and previous config saved to /var/cache/conftool/dbconfig/20220215-140408-marostegui.json
14:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20795 and previous config saved to /var/cache/conftool/dbconfig/20220215-140404-marostegui.json
14:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1123.eqiad.wmnet with reason: Maintenance
14:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1022.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
14:02 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1022.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
14:02 volans@cumin2002: END (PASS) - Cookbook sre.hosts.test-cookbook (exit_code=0) testing new spicerack release
14:02 volans@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on cumin2002.codfw.wmnet with reason: testing new spicerack
14:02 volans@cumin2002: START - Cookbook sre.hosts.downtime for 0:05:00 on cumin2002.codfw.wmnet with reason: testing new spicerack
14:02 volans@cumin2002: START - Cookbook sre.hosts.test-cookbook testing new spicerack release
14:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
14:01 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
14:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
14:01 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
13:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20794 and previous config saved to /var/cache/conftool/dbconfig/20220215-135907-ladsgroup.json
13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20793 and previous config saved to /var/cache/conftool/dbconfig/20220215-134859-marostegui.json
13:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20792 and previous config saved to /var/cache/conftool/dbconfig/20220215-134402-ladsgroup.json
13:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300381)', diff saved to https://phabricator.wikimedia.org/P20791 and previous config saved to /var/cache/conftool/dbconfig/20220215-133354-marostegui.json
13:33 vgutierrez: rolling restart of envoy on cp nodes
13:33 vgutierrez: enable puppet on cache::(text|upload)_envoy nodes
13:31 moritzm: installing lxml security updates
13:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20790 and previous config saved to /var/cache/conftool/dbconfig/20220215-132857-ladsgroup.json
13:25 vgutierrez: disable puppet on cache::(text|upload)_envoy nodes
13:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:15 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:15 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:14 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:14 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
13:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T300381)', diff saved to https://phabricator.wikimedia.org/P20789 and previous config saved to /var/cache/conftool/dbconfig/20220215-131427-marostegui.json
13:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
13:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1170.eqiad.wmnet with OS bullseye
13:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus1006.eqiad.wmnet
13:01 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus2006.codfw.wmnet
13:00 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus2006.codfw.wmnet
13:00 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus1006.eqiad.wmnet
12:58 volans@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: Release v0.4.0 - volans@cumin2002
12:57 volans@cumin2002: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: Release v0.4.0 - volans@cumin2002
12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
12:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
12:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300381)', diff saved to https://phabricator.wikimedia.org/P20788 and previous config saved to /var/cache/conftool/dbconfig/20220215-125548-marostegui.json
12:54 volans@deploy1002: Finished deploy [homer/deploy@94bed87]: Release v0.4.0 (duration: 01m 28s)
12:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1024.mgmt.eqiad.wmnet with reboot policy GRACEFUL
12:52 volans@deploy1002: Started deploy [homer/deploy@94bed87]: Release v0.4.0
12:51 volans: uploaded spicerack_2.0.0 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
12:47 ryankemper@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic2035.codfw.wmnet
12:46 marostegui@cumin1001: START - Cookbook sre.hosts.provision for host es1024.mgmt.eqiad.wmnet with reboot policy GRACEFUL
12:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1170.eqiad.wmnet with OS bullseye
12:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T300510)', diff saved to https://phabricator.wikimedia.org/P20787 and previous config saved to /var/cache/conftool/dbconfig/20220215-124207-ladsgroup.json
12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20786 and previous config saved to /var/cache/conftool/dbconfig/20220215-124043-marostegui.json
12:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20785 and previous config saved to /var/cache/conftool/dbconfig/20220215-124035-ladsgroup.json
12:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
12:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance
12:32 topranks: Modifying anycast_import policy on cr1-eqiad to validate / prep for changes to support wikidough IPv6.
12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20784 and previous config saved to /var/cache/conftool/dbconfig/20220215-122533-marostegui.json
12:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2104.codfw.wmnet with OS bullseye
12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300381)', diff saved to https://phabricator.wikimedia.org/P20783 and previous config saved to /var/cache/conftool/dbconfig/20220215-121028-marostegui.json
11:50 sukhe: running homer for Gerrit 762788 and T301165
11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T300381)', diff saved to https://phabricator.wikimedia.org/P20782 and previous config saved to /var/cache/conftool/dbconfig/20220215-114950-marostegui.json
11:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
11:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
11:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2104.codfw.wmnet with OS bullseye
11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
11:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
11:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance
11:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
11:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
11:23 moritzm: rolling out Java 8 security updates for buster
11:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
11:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
11:10 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.22 refs T300198
11:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
11:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
11:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
11:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling es1024 (T300006)', diff saved to https://phabricator.wikimedia.org/P20781 and previous config saved to /var/cache/conftool/dbconfig/20220215-110420-ladsgroup.json
11:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on es1024.eqiad.wmnet with reason: Maintenance
11:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on es1024.eqiad.wmnet with reason: Maintenance
11:01 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:sessionstore: Restarting to pick up Java security updates - hnowlan@cumin1001
10:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
10:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20780 and previous config saved to /var/cache/conftool/dbconfig/20220215-105354-marostegui.json
10:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20779 and previous config saved to /var/cache/conftool/dbconfig/20220215-103849-marostegui.json
10:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
10:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
10:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
10:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
10:25 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching A:sessionstore: Restarting to pick up Java security updates - hnowlan@cumin1001
10:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
10:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20778 and previous config saved to /var/cache/conftool/dbconfig/20220215-102345-marostegui.json
10:23 ladsgroup@deploy1002: Synchronized wmf-config/db-production.php: Config: Revert "db-production: Stop writes to es5" (T300976) (duration: 00m 55s)
10:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Setting weight to es1023 T300006', diff saved to https://phabricator.wikimedia.org/P20777 and previous config saved to /var/cache/conftool/dbconfig/20220215-101817-root.json
10:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote es1023 to es5 primary and set section read-write T300006', diff saved to https://phabricator.wikimedia.org/P20776 and previous config saved to /var/cache/conftool/dbconfig/20220215-101412-root.json
10:10 Amir1: Starting es5 eqiad failover from es1024 to es1023 - T300006
10:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20775 and previous config saved to /var/cache/conftool/dbconfig/20220215-100840-marostegui.json
10:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
10:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20774 and previous config saved to /var/cache/conftool/dbconfig/20220215-100333-marostegui.json
10:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
10:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300381)', diff saved to https://phabricator.wikimedia.org/P20773 and previous config saved to /var/cache/conftool/dbconfig/20220215-100325-marostegui.json
10:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set es1023 with weight 0 T300006', diff saved to https://phabricator.wikimedia.org/P20772 and previous config saved to /var/cache/conftool/dbconfig/20220215-100253-ladsgroup.json
10:01 ladsgroup@deploy1002: Synchronized wmf-config/db-production.php: Config: db-production: Stop writes to es5 (T300976) (duration: 00m 49s)
10:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
09:58 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.20 (duration: 03m 08s)
09:55 hashar@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.22 refs T300198 (duration: 45m 55s)
09:49 moritzm: migrate instances off ganeti1022
09:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es5 T300006
09:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es5 T300006
09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20771 and previous config saved to /var/cache/conftool/dbconfig/20220215-094821-marostegui.json
09:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
09:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 6 hosts with reason: Maintenance
09:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
09:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20769 and previous config saved to /var/cache/conftool/dbconfig/20220215-093316-marostegui.json
09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300381)', diff saved to https://phabricator.wikimedia.org/P20768 and previous config saved to /var/cache/conftool/dbconfig/20220215-091811-marostegui.json
09:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300381)', diff saved to https://phabricator.wikimedia.org/P20767 and previous config saved to /var/cache/conftool/dbconfig/20220215-091606-marostegui.json
09:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
09:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
09:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
09:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300381)', diff saved to https://phabricator.wikimedia.org/P20766 and previous config saved to /var/cache/conftool/dbconfig/20220215-091554-marostegui.json
09:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
09:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
09:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
09:09 hashar@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.22 refs T300198
09:04 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2008.codfw.wmnet
09:04 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2007.codfw.wmnet
08:56 volans: rolling out python3-wmflib 1.0.2-1 across the fleet
08:54 moritzm: imported openjdk-8 8u322-b06-1~deb10u1 for buster-wikimedia (forward port of latest Java 8 security fixes)
08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20764 and previous config saved to /var/cache/conftool/dbconfig/20220215-084544-marostegui.json
08:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2135.codfw.wmnet with OS bullseye
08:32 moritzm: installing apache security updates on thanos nodes
08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300381)', diff saved to https://phabricator.wikimedia.org/P20763 and previous config saved to /var/cache/conftool/dbconfig/20220215-083039-marostegui.json
08:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300381)', diff saved to https://phabricator.wikimedia.org/P20762 and previous config saved to /var/cache/conftool/dbconfig/20220215-082533-marostegui.json
08:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
08:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
08:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300381)', diff saved to https://phabricator.wikimedia.org/P20761 and previous config saved to /var/cache/conftool/dbconfig/20220215-082519-marostegui.json
08:15 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2135.codfw.wmnet with OS bullseye
08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20760 and previous config saved to /var/cache/conftool/dbconfig/20220215-081015-marostegui.json
08:00 marostegui: Failover m3 from db1107 to db1183 - T301219
07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20759 and previous config saved to /var/cache/conftool/dbconfig/20220215-075510-marostegui.json
07:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300381)', diff saved to https://phabricator.wikimedia.org/P20758 and previous config saved to /var/cache/conftool/dbconfig/20220215-074005-marostegui.json
07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300381)', diff saved to https://phabricator.wikimedia.org/P20757 and previous config saved to /var/cache/conftool/dbconfig/20220215-073701-marostegui.json
07:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
07:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20756 and previous config saved to /var/cache/conftool/dbconfig/20220215-073653-marostegui.json
07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20755 and previous config saved to /var/cache/conftool/dbconfig/20220215-072149-marostegui.json
07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20754 and previous config saved to /var/cache/conftool/dbconfig/20220215-070644-marostegui.json
06:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20753 and previous config saved to /var/cache/conftool/dbconfig/20220215-065139-marostegui.json
06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300381)', diff saved to https://phabricator.wikimedia.org/P20752 and previous config saved to /var/cache/conftool/dbconfig/20220215-064631-marostegui.json
06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
06:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
06:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
06:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
06:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
06:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300381)', diff saved to https://phabricator.wikimedia.org/P20751 and previous config saved to /var/cache/conftool/dbconfig/20220215-064209-marostegui.json
06:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20750 and previous config saved to /var/cache/conftool/dbconfig/20220215-062705-marostegui.json
06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20749 and previous config saved to /var/cache/conftool/dbconfig/20220215-061200-marostegui.json
05:59 marostegui: Remove watchdog@10.% user from pc1-pc3 T301442
05:58 marostegui: Remove watchdog@10.% user from es1-es5 T301442
05:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300381)', diff saved to https://phabricator.wikimedia.org/P20748 and previous config saved to /var/cache/conftool/dbconfig/20220215-055655-marostegui.json
05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300381)', diff saved to https://phabricator.wikimedia.org/P20747 and previous config saved to /var/cache/conftool/dbconfig/20220215-055441-marostegui.json
05:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
05:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
05:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
05:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
05:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
05:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance
02:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling db2136 (after maint)', diff saved to https://phabricator.wikimedia.org/P20746 and previous config saved to /var/cache/conftool/dbconfig/20220215-023518-ladsgroup.json
02:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
02:14 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@3dc404c] (eqiad): Merge "Update kartotherian-package to f239c6e" (duration: 06m 19s)
02:09 mbsantos@deploy1002: Started deploy [kartotherian/deploy@3dc404c] (eqiad): Merge "Update kartotherian-package to f239c6e"
02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

2022-02-14

22:04 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
22:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
22:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:51 pt1979@cumin2002: START - Cookbook sre.dns.netbox
21:25 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
21:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:15 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
21:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:33 mutante: mx/exim: re-adding donate@wikimedia.org email alias (OTRS -> ITS) (T297915)
20:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20744 and previous config saved to /var/cache/conftool/dbconfig/20220214-202720-ladsgroup.json
20:27 mutante: mx/exim: removing donate@wikimedia.org email alias (OTRS -> ITS) - was alias for fundraising@ (T297915)
20:24 mutante: mx/exim: removing wikimania@wikimedia.org email alias (OTRS -> ITS) (T297915)
20:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20743 and previous config saved to /var/cache/conftool/dbconfig/20220214-201215-ladsgroup.json
19:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P20742 and previous config saved to /var/cache/conftool/dbconfig/20220214-195711-ladsgroup.json
19:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20741 and previous config saved to /var/cache/conftool/dbconfig/20220214-194206-ladsgroup.json
19:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300662)', diff saved to https://phabricator.wikimedia.org/P20740 and previous config saved to /var/cache/conftool/dbconfig/20220214-193732-marostegui.json
19:36 herron: prometheus2006 systemctl reset-failed
19:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20739 and previous config saved to /var/cache/conftool/dbconfig/20220214-192227-marostegui.json
19:13 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:08 pt1979@cumin2002: START - Cookbook sre.dns.netbox
19:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20738 and previous config saved to /var/cache/conftool/dbconfig/20220214-190722-marostegui.json
19:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20737 and previous config saved to /var/cache/conftool/dbconfig/20220214-190235-ladsgroup.json
19:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
19:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
19:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20736 and previous config saved to /var/cache/conftool/dbconfig/20220214-190228-ladsgroup.json
19:01 volans: uploaded python3-wmflib_1.0.2 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
18:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300662)', diff saved to https://phabricator.wikimedia.org/P20735 and previous config saved to /var/cache/conftool/dbconfig/20220214-185218-marostegui.json
18:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T300662)', diff saved to https://phabricator.wikimedia.org/P20734 and previous config saved to /var/cache/conftool/dbconfig/20220214-185103-marostegui.json
18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
18:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20733 and previous config saved to /var/cache/conftool/dbconfig/20220214-185056-marostegui.json
18:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P20732 and previous config saved to /var/cache/conftool/dbconfig/20220214-184723-ladsgroup.json
18:44 mutante: contint2001 - disabling puppet, try replacing docker version (docker-io -> docker-ce), contint1001 first which is currently NOT the active server - gerrit:758987 T300682
18:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20731 and previous config saved to /var/cache/conftool/dbconfig/20220214-183551-marostegui.json
18:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P20730 and previous config saved to /var/cache/conftool/dbconfig/20220214-183218-ladsgroup.json
18:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20729 and previous config saved to /var/cache/conftool/dbconfig/20220214-182046-marostegui.json
18:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20728 and previous config saved to /var/cache/conftool/dbconfig/20220214-181714-ladsgroup.json
18:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20727 and previous config saved to /var/cache/conftool/dbconfig/20220214-180541-marostegui.json
18:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20726 and previous config saved to /var/cache/conftool/dbconfig/20220214-180427-marostegui.json
18:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
18:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
18:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300662)', diff saved to https://phabricator.wikimedia.org/P20725 and previous config saved to /var/cache/conftool/dbconfig/20220214-180419-marostegui.json
17:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts etherpad1002.eqiad.wmnet
17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20724 and previous config saved to /var/cache/conftool/dbconfig/20220214-174915-marostegui.json
17:48 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts etherpad1002.eqiad.wmnet
17:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance - hw issues
17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance - hw issues
17:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20722 and previous config saved to /var/cache/conftool/dbconfig/20220214-173526-ladsgroup.json
17:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
17:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
17:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20721 and previous config saved to /var/cache/conftool/dbconfig/20220214-173410-marostegui.json
17:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (hw issue)', diff saved to https://phabricator.wikimedia.org/P20720 and previous config saved to /var/cache/conftool/dbconfig/20220214-172924-ladsgroup.json
17:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300662)', diff saved to https://phabricator.wikimedia.org/P20719 and previous config saved to /var/cache/conftool/dbconfig/20220214-171905-marostegui.json
17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T300662)', diff saved to https://phabricator.wikimedia.org/P20718 and previous config saved to /var/cache/conftool/dbconfig/20220214-171750-marostegui.json
17:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
17:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
17:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300662)', diff saved to https://phabricator.wikimedia.org/P20717 and previous config saved to /var/cache/conftool/dbconfig/20220214-171743-marostegui.json
17:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20715 and previous config saved to /var/cache/conftool/dbconfig/20220214-170238-marostegui.json
17:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
17:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
16:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
16:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
16:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
16:54 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 49s)
16:54 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 50s)
16:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
16:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20714 and previous config saved to /var/cache/conftool/dbconfig/20220214-164733-marostegui.json
16:40 razzi@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host datahubsearch1002.eqiad.wmnet
16:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300662)', diff saved to https://phabricator.wikimedia.org/P20713 and previous config saved to /var/cache/conftool/dbconfig/20220214-163228-marostegui.json
16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T300662)', diff saved to https://phabricator.wikimedia.org/P20712 and previous config saved to /var/cache/conftool/dbconfig/20220214-163113-marostegui.json
16:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
16:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
16:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
16:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
16:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
16:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
16:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
16:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
16:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300662)', diff saved to https://phabricator.wikimedia.org/P20711 and previous config saved to /var/cache/conftool/dbconfig/20220214-163016-marostegui.json
16:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
16:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20710 and previous config saved to /var/cache/conftool/dbconfig/20220214-161511-marostegui.json
16:08 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1002.eqiad.wmnet
16:07 jbond: update mx1001 to disable ldap validation of gmail emails gerrit:762442 (already on mx2001)
16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20709 and previous config saved to /var/cache/conftool/dbconfig/20220214-160007-marostegui.json
15:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
15:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
15:45 vgutierrez: re-enable puppet on cp nodes running HAProxy - T290005
15:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300662)', diff saved to https://phabricator.wikimedia.org/P20708 and previous config saved to /var/cache/conftool/dbconfig/20220214-154502-marostegui.json
15:43 sukhe: running authdns-update for T301165
15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T300662)', diff saved to https://phabricator.wikimedia.org/P20707 and previous config saved to /var/cache/conftool/dbconfig/20220214-154147-marostegui.json
15:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
15:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
15:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20706 and previous config saved to /var/cache/conftool/dbconfig/20220214-154139-marostegui.json
15:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
15:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
15:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
15:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T298554)', diff saved to https://phabricator.wikimedia.org/P20705 and previous config saved to /var/cache/conftool/dbconfig/20220214-153811-ladsgroup.json
15:37 jayme: published image docker-registry.discovery.wmnet/prometheus-statsd-exporter:0.0.10
15:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20704 and previous config saved to /var/cache/conftool/dbconfig/20220214-152635-marostegui.json
15:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20703 and previous config saved to /var/cache/conftool/dbconfig/20220214-152306-ladsgroup.json
15:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20701 and previous config saved to /var/cache/conftool/dbconfig/20220214-151130-marostegui.json
15:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P20700 and previous config saved to /var/cache/conftool/dbconfig/20220214-150801-ladsgroup.json
14:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20699 and previous config saved to /var/cache/conftool/dbconfig/20220214-145625-marostegui.json
14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T300662)', diff saved to https://phabricator.wikimedia.org/P20698 and previous config saved to /var/cache/conftool/dbconfig/20220214-145508-marostegui.json
14:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300662)', diff saved to https://phabricator.wikimedia.org/P20697 and previous config saved to /var/cache/conftool/dbconfig/20220214-145501-marostegui.json
14:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T298554)', diff saved to https://phabricator.wikimedia.org/P20696 and previous config saved to /var/cache/conftool/dbconfig/20220214-145257-ladsgroup.json
14:51 vgutierrez: disable puppet on cp nodes running HAProxy - T290005
14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20695 and previous config saved to /var/cache/conftool/dbconfig/20220214-143956-marostegui.json
14:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:36 Lucas_WMDE: UTC afternoon backport window done
14:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:35 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: General cleanup (T301647) (should be a no-op) (duration: 00m 48s)
14:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:30 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: trwikisource: Enable ULS webfonts by default (T283626) (duration: 00m 48s)
14:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:27 moritzm: installing Java 8/stretch security updates
14:26 jnuche: Jenkins upgrade complete
14:25 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [WikibaseMediaInfo] Make synonyms profile the default (T301559) (duration: 00m 48s)
14:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20694 and previous config saved to /var/cache/conftool/dbconfig/20220214-142452-marostegui.json
14:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:17 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Fix missing icons for apiportalwiki and wikimaniawiki (T301636) (duration: 00m 49s)
14:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T298554)', diff saved to https://phabricator.wikimedia.org/P20693 and previous config saved to /var/cache/conftool/dbconfig/20220214-141304-ladsgroup.json
14:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
14:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
14:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
14:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
14:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20692 and previous config saved to /var/cache/conftool/dbconfig/20220214-141251-ladsgroup.json
14:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ printf '%s\n' 'https://en.wikipedia.org/static/images/sul/foundation-black.png' | mwscript purgeList.php # T301636
14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300662)', diff saved to https://phabricator.wikimedia.org/P20691 and previous config saved to /var/cache/conftool/dbconfig/20220214-140947-marostegui.json
14:09 lucaswerkmeister-wmde@deploy1002: Synchronized static/images/sul/foundation-black.png: Config: Upload logo for apiportalwiki in wmgCentralAuthLoginIcon (T301636) (duration: 00m 49s)
14:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T300662)', diff saved to https://phabricator.wikimedia.org/P20690 and previous config saved to /var/cache/conftool/dbconfig/20220214-140832-marostegui.json
14:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
14:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
14:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300662)', diff saved to https://phabricator.wikimedia.org/P20689 and previous config saved to /var/cache/conftool/dbconfig/20220214-140824-marostegui.json
13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20688 and previous config saved to /var/cache/conftool/dbconfig/20220214-135746-ladsgroup.json
13:54 jnuche: Jenkins contint instances are going to be restarted soon
13:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20687 and previous config saved to /var/cache/conftool/dbconfig/20220214-135320-marostegui.json
13:47 moritzm: rolling restart of apache on logstash* to pick up expat security updates
13:43 mmandere@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp4031.ulsfo.wmnet
13:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P20686 and previous config saved to /var/cache/conftool/dbconfig/20220214-134242-ladsgroup.json
13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20685 and previous config saved to /var/cache/conftool/dbconfig/20220214-133815-marostegui.json
13:33 mmandere@cumin1001: START - Cookbook sre.hosts.decommission for hosts cp4031.ulsfo.wmnet
13:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20684 and previous config saved to /var/cache/conftool/dbconfig/20220214-132736-ladsgroup.json
13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300662)', diff saved to https://phabricator.wikimedia.org/P20683 and previous config saved to /var/cache/conftool/dbconfig/20220214-132310-marostegui.json
13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T300662)', diff saved to https://phabricator.wikimedia.org/P20682 and previous config saved to /var/cache/conftool/dbconfig/20220214-132155-marostegui.json
13:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
13:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
13:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
13:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300662)', diff saved to https://phabricator.wikimedia.org/P20681 and previous config saved to /var/cache/conftool/dbconfig/20220214-132135-marostegui.json
13:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20680 and previous config saved to /var/cache/conftool/dbconfig/20220214-130630-marostegui.json
12:53 arturo: merging https://gerrit.wikimedia.org/r/c/operations/homer/public/+/755478 to core routers
12:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20679 and previous config saved to /var/cache/conftool/dbconfig/20220214-125125-marostegui.json
12:48 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1016.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
12:47 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1016.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
12:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T298554)', diff saved to https://phabricator.wikimedia.org/P20678 and previous config saved to /var/cache/conftool/dbconfig/20220214-123636-ladsgroup.json
12:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
12:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
12:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T298554)', diff saved to https://phabricator.wikimedia.org/P20677 and previous config saved to /var/cache/conftool/dbconfig/20220214-123629-ladsgroup.json
12:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300662)', diff saved to https://phabricator.wikimedia.org/P20676 and previous config saved to /var/cache/conftool/dbconfig/20220214-123620-marostegui.json
12:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T300662)', diff saved to https://phabricator.wikimedia.org/P20675 and previous config saved to /var/cache/conftool/dbconfig/20220214-123506-marostegui.json
12:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
12:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
12:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
12:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
12:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300662)', diff saved to https://phabricator.wikimedia.org/P20674 and previous config saved to /var/cache/conftool/dbconfig/20220214-123446-marostegui.json
12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1016.eqiad.wmnet
12:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20673 and previous config saved to /var/cache/conftool/dbconfig/20220214-122124-ladsgroup.json
12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1016.eqiad.wmnet
12:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20672 and previous config saved to /var/cache/conftool/dbconfig/20220214-121941-marostegui.json
12:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20671 and previous config saved to /var/cache/conftool/dbconfig/20220214-120619-ladsgroup.json
12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20670 and previous config saved to /var/cache/conftool/dbconfig/20220214-120436-marostegui.json
11:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 for schema change', diff saved to https://phabricator.wikimedia.org/P20669 and previous config saved to /var/cache/conftool/dbconfig/20220214-115250-marostegui.json
11:51 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1021.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
11:51 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1009.eqiad.wmnet
11:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T298554)', diff saved to https://phabricator.wikimedia.org/P20668 and previous config saved to /var/cache/conftool/dbconfig/20220214-115115-ladsgroup.json
11:50 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1021.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300662)', diff saved to https://phabricator.wikimedia.org/P20667 and previous config saved to /var/cache/conftool/dbconfig/20220214-114931-marostegui.json
11:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T300662)', diff saved to https://phabricator.wikimedia.org/P20666 and previous config saved to /var/cache/conftool/dbconfig/20220214-114817-marostegui.json
11:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
11:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
11:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
11:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
11:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1021.eqiad.wmnet
11:42 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1021.eqiad.wmnet
11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T298554)', diff saved to https://phabricator.wikimedia.org/P20665 and previous config saved to /var/cache/conftool/dbconfig/20220214-113850-ladsgroup.json
11:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
11:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T298554)', diff saved to https://phabricator.wikimedia.org/P20664 and previous config saved to /var/cache/conftool/dbconfig/20220214-113842-ladsgroup.json
11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20663 and previous config saved to /var/cache/conftool/dbconfig/20220214-112337-ladsgroup.json
11:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300382)', diff saved to https://phabricator.wikimedia.org/P20662 and previous config saved to /var/cache/conftool/dbconfig/20220214-111708-marostegui.json
11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20661 and previous config saved to /var/cache/conftool/dbconfig/20220214-110833-ladsgroup.json
11:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20660 and previous config saved to /var/cache/conftool/dbconfig/20220214-110203-marostegui.json
10:56 moritzm: restart apache/FPM on mediawiki canaries to pick up expat security updates
10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T298554)', diff saved to https://phabricator.wikimedia.org/P20659 and previous config saved to /var/cache/conftool/dbconfig/20220214-105328-ladsgroup.json
10:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20658 and previous config saved to /var/cache/conftool/dbconfig/20220214-104659-marostegui.json
10:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T298554)', diff saved to https://phabricator.wikimedia.org/P20657 and previous config saved to /var/cache/conftool/dbconfig/20220214-104143-ladsgroup.json
10:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
10:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
10:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T298554)', diff saved to https://phabricator.wikimedia.org/P20656 and previous config saved to /var/cache/conftool/dbconfig/20220214-104136-ladsgroup.json
10:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300382)', diff saved to https://phabricator.wikimedia.org/P20655 and previous config saved to /var/cache/conftool/dbconfig/20220214-103154-marostegui.json
10:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20654 and previous config saved to /var/cache/conftool/dbconfig/20220214-102631-ladsgroup.json
10:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T300382)', diff saved to https://phabricator.wikimedia.org/P20653 and previous config saved to /var/cache/conftool/dbconfig/20220214-102142-marostegui.json
10:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
10:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
10:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300382)', diff saved to https://phabricator.wikimedia.org/P20652 and previous config saved to /var/cache/conftool/dbconfig/20220214-102135-marostegui.json
10:12 jayme: published image docker-registry.discovery.wmnet/cfssl-issuer:0.2.2-1
10:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20650 and previous config saved to /var/cache/conftool/dbconfig/20220214-101126-ladsgroup.json
10:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20649 and previous config saved to /var/cache/conftool/dbconfig/20220214-100630-marostegui.json
09:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T298554)', diff saved to https://phabricator.wikimedia.org/P20648 and previous config saved to /var/cache/conftool/dbconfig/20220214-095622-ladsgroup.json
09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20647 and previous config saved to /var/cache/conftool/dbconfig/20220214-095125-marostegui.json
09:44 jayme: published image docker-registry.discovery.wmnet/cfssl-issuer:0.2.2-0
09:40 vgutierrez: update haproxy to 2.4.12 on cp4032 - T290005
09:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300382)', diff saved to https://phabricator.wikimedia.org/P20646 and previous config saved to /var/cache/conftool/dbconfig/20220214-093621-marostegui.json
09:34 vgutierrez: update haproxy to 2.4.12 on cp4026 - T290005
09:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T300382)', diff saved to https://phabricator.wikimedia.org/P20645 and previous config saved to /var/cache/conftool/dbconfig/20220214-092602-marostegui.json
09:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
09:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300382)', diff saved to https://phabricator.wikimedia.org/P20644 and previous config saved to /var/cache/conftool/dbconfig/20220214-092555-marostegui.json
09:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T298554)', diff saved to https://phabricator.wikimedia.org/P20643 and previous config saved to /var/cache/conftool/dbconfig/20220214-091422-ladsgroup.json
09:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
09:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
09:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
09:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20642 and previous config saved to /var/cache/conftool/dbconfig/20220214-091050-marostegui.json
08:58 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2008.codfw.wmnet with OS bullseye
08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20641 and previous config saved to /var/cache/conftool/dbconfig/20220214-085546-marostegui.json
08:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
08:48 taavi: UTC morning deploys done (for real this time)
08:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
08:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
08:46 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
08:45 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: prod: WRITE_NEW for CentralAuth hidden level migration (T289068) (duration: 00m 49s)
08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300382)', diff saved to https://phabricator.wikimedia.org/P20640 and previous config saved to /var/cache/conftool/dbconfig/20220214-084041-marostegui.json
08:40 urbanecm: Reopen UTC morning B&C for a last deploy
08:40 urbanecm: UTC morning B&C window done
08:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 1b0daef: Fixed typo for SectionTranslation in testwiki: lu -> lg (duration: 00m 48s)
08:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
08:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
08:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
08:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T300382)', diff saved to https://phabricator.wikimedia.org/P20639 and previous config saved to /var/cache/conftool/dbconfig/20220214-083051-marostegui.json
08:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
08:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300382)', diff saved to https://phabricator.wikimedia.org/P20638 and previous config saved to /var/cache/conftool/dbconfig/20220214-083043-marostegui.json
08:29 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2008.codfw.wmnet with OS bullseye
08:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
08:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
08:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
08:19 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=arywiki --fix # T291737
08:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
08:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20637 and previous config saved to /var/cache/conftool/dbconfig/20220214-081538-marostegui.json
08:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: db0e71e: arywiki: Add Portal and Draft namespaces (T291737) (duration: 00m 52s)
08:13 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2007.codfw.wmnet with OS bullseye
08:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
08:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
08:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
08:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
08:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20636 and previous config saved to /var/cache/conftool/dbconfig/20220214-080034-marostegui.json
07:56 dcausse: restart blazegraph on wdqs1013 (jvm stuck for 26h)
07:48 moritzm: installing expat security updates
07:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300382)', diff saved to https://phabricator.wikimedia.org/P20635 and previous config saved to /var/cache/conftool/dbconfig/20220214-074529-marostegui.json
07:43 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS bullseye
07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300382)', diff saved to https://phabricator.wikimedia.org/P20634 and previous config saved to /var/cache/conftool/dbconfig/20220214-073544-marostegui.json
07:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
07:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
07:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
07:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
07:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
07:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
07:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
07:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300382)', diff saved to https://phabricator.wikimedia.org/P20633 and previous config saved to /var/cache/conftool/dbconfig/20220214-071718-marostegui.json
07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20632 and previous config saved to /var/cache/conftool/dbconfig/20220214-070214-marostegui.json
06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20631 and previous config saved to /var/cache/conftool/dbconfig/20220214-064709-marostegui.json
06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300382)', diff saved to https://phabricator.wikimedia.org/P20630 and previous config saved to /var/cache/conftool/dbconfig/20220214-063204-marostegui.json
06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300382)', diff saved to https://phabricator.wikimedia.org/P20629 and previous config saved to /var/cache/conftool/dbconfig/20220214-062219-marostegui.json
06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
06:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
06:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
06:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
06:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
05:56 marostegui: Deploy schema change on s5 master (db1130) T300775
05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance

2022-02-13

23:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20627 and previous config saved to /var/cache/conftool/dbconfig/20220213-231742-marostegui.json
23:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20626 and previous config saved to /var/cache/conftool/dbconfig/20220213-230237-marostegui.json
22:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P20625 and previous config saved to /var/cache/conftool/dbconfig/20220213-224733-marostegui.json
22:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20624 and previous config saved to /var/cache/conftool/dbconfig/20220213-223228-marostegui.json
19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:26 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/page/WikiPage.php: Backport: WikiPage: Cast the category values to string in updateCategoryCounts (T301433) (duration: 00m 49s)
15:39 godog: shorten /var/log/swift/server.log.1 on thanos-be2001 to recover some space
10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20623 and previous config saved to /var/cache/conftool/dbconfig/20220213-100348-marostegui.json
10:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
10:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300775)', diff saved to https://phabricator.wikimedia.org/P20622 and previous config saved to /var/cache/conftool/dbconfig/20220213-100340-marostegui.json
09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20621 and previous config saved to /var/cache/conftool/dbconfig/20220213-094836-marostegui.json
09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P20620 and previous config saved to /var/cache/conftool/dbconfig/20220213-093331-marostegui.json
09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300775)', diff saved to https://phabricator.wikimedia.org/P20619 and previous config saved to /var/cache/conftool/dbconfig/20220213-091826-marostegui.json

2022-02-12

22:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300775)', diff saved to https://phabricator.wikimedia.org/P20617 and previous config saved to /var/cache/conftool/dbconfig/20220212-225806-marostegui.json
22:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
22:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
22:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
22:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1161.eqiad.wmnet with reason: Maintenance
12:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
12:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
10:02 jelto: update gitlab-runner1001 and gitlab-runner2001 to gitlab-runner 14.7.0
09:52 jelto: update gitlab1001 to gitlab-ce 14.7.2-ce.0
09:41 jelto: update gitlab2001 to gitlab-ce 14.7.2-ce.0
08:49 elukey: truncate /var/log/auth.log to 1g on krb1001 to free space on root partition (original log saved under /srv)
07:23 dcausse: restarting blazegraph on wdqs1004 (JVM stuck for 4 hours)
03:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
03:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
03:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20616 and previous config saved to /var/cache/conftool/dbconfig/20220212-032710-marostegui.json
03:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20615 and previous config saved to /var/cache/conftool/dbconfig/20220212-031205-marostegui.json
02:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P20614 and previous config saved to /var/cache/conftool/dbconfig/20220212-025700-marostegui.json
02:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20613 and previous config saved to /var/cache/conftool/dbconfig/20220212-024155-marostegui.json

2022-02-11

23:23 inflatador: puppet-merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/762006
22:47 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
22:36 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
22:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
22:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
22:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
22:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
22:20 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
22:09 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
21:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:41 tzatziki: removed 16 emails from accounts with deleteUserEmail.php
19:14 mutante: running puppet on all ores machines to install aspell-hi (gerrit:761974) which for some reason was installed on a random subset of ores servers (1002,2001,2005 but not the other 19 ones) T300195 T252581 - after this the package is now installed on 18 servers (1001-1009, 2001-2009)
16:54 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
16:54 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
16:54 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
16:53 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
16:53 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
16:53 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
16:32 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host datahubsearch1001.eqiad.wmnet
16:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20611 and previous config saved to /var/cache/conftool/dbconfig/20220211-161324-marostegui.json
16:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
16:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
16:03 btullis@cumin1001: START - Cookbook sre.ganeti.makevm for new host datahubsearch1001.eqiad.wmnet
14:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts auth2001.codfw.wmnet
14:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20610 and previous config saved to /var/cache/conftool/dbconfig/20220211-142045-root.json
14:07 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts auth2001.codfw.wmnet
14:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20609 and previous config saved to /var/cache/conftool/dbconfig/20220211-140540-root.json
13:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20608 and previous config saved to /var/cache/conftool/dbconfig/20220211-135037-root.json
13:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20607 and previous config saved to /var/cache/conftool/dbconfig/20220211-133533-root.json
13:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20606 and previous config saved to /var/cache/conftool/dbconfig/20220211-132028-root.json
13:19 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1011.eqiad.wmnet with OS buster
13:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
13:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
13:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
13:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300662)', diff saved to https://phabricator.wikimedia.org/P20605 and previous config saved to /var/cache/conftool/dbconfig/20220211-131507-marostegui.json
13:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
13:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
12:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
12:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1016.eqiad.wmnet with OS buster
12:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1016.eqiad.wmnet with OS buster
10:43 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
10:42 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
10:42 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
10:42 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
10:42 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
10:42 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
10:41 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
10:40 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
10:40 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1021.eqiad.wmnet with OS buster
10:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1021.eqiad.wmnet with OS buster
10:05 jelto@deploy1002: helmfile [staging] DONE helmfile.d/services/termbox: apply
10:05 jelto@deploy1002: helmfile [staging] START helmfile.d/services/termbox: apply
09:29 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
09:29 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20599 and previous config saved to /var/cache/conftool/dbconfig/20220211-090223-marostegui.json
08:57 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti1011.eqiad.wmnet with OS buster
08:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
06:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
06:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 8 hosts with reason: Maintenance
06:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
06:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance
06:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20598 and previous config saved to /var/cache/conftool/dbconfig/20220211-062306-marostegui.json
06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20597 and previous config saved to /var/cache/conftool/dbconfig/20220211-060801-marostegui.json
05:56 marostegui: Remove watchdog@10.% user from s6 codfw T301442
05:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P20596 and previous config saved to /var/cache/conftool/dbconfig/20220211-055256-marostegui.json
05:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20595 and previous config saved to /var/cache/conftool/dbconfig/20220211-053752-marostegui.json
02:33 eileen: checkout revision (ccd5afc3 -> 815e3091)
02:32 eileen: civicrm: revision 815e3091, config 02f4888c
00:38 thcipriani: UTC late backport done
00:33 thcipriani@deploy1002: Synchronized dblists/desktop-improvements.dblist: Config: Make Vector 2022 the default skin for MediaWiki.org (T298519) (duration: 00m 48s)
00:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:31 thcipriani@deploy1002: Synchronized wmf-config/config/mediawikiwiki.yaml: Config: Make Vector 2022 the default skin for MediaWiki.org (T298519) (duration: 00m 48s)
00:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:16 bwang@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: urwiki: Add patroller usergroup (T301491) (duration: 00m 49s)
00:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
00:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
00:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
00:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298554)', diff saved to https://phabricator.wikimedia.org/P20594 and previous config saved to /var/cache/conftool/dbconfig/20220211-001425-ladsgroup.json

2022-02-10

23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20593 and previous config saved to /var/cache/conftool/dbconfig/20220210-235920-ladsgroup.json
23:54 cstone: Donation Interface revision changed from dbcb5254 to a6a9b63e
23:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20592 and previous config saved to /var/cache/conftool/dbconfig/20220210-234416-ladsgroup.json
23:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298554)', diff saved to https://phabricator.wikimedia.org/P20591 and previous config saved to /var/cache/conftool/dbconfig/20220210-232911-ladsgroup.json
23:18 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
23:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T298554)', diff saved to https://phabricator.wikimedia.org/P20590 and previous config saved to /var/cache/conftool/dbconfig/20220210-231004-ladsgroup.json
23:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
23:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
22:39 mutante: etherpad - successfully switched to etherpad1003 (bullseye) and etherpad 1.8.16 - on second attempt after making it listen on IPv6 to work behind envoy (T300568) - https://gerrit.wikimedia.org/r/c/operations/puppet/+/761727/
22:34 bblack@cumin1001: START - Cookbook sre.dns.netbox
22:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
22:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
22:28 bblack@cumin1001: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
22:27 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1013.eqiad.wmnet with OS buster
22:26 bblack@cumin1001: START - Cookbook sre.dns.netbox
22:24 mutante: etherpad - one more short downtime for maintenance - downtimed in alertmanager and icinga
22:04 bblack@cumin1001: START - Cookbook sre.hosts.reimage for host lvs1013.eqiad.wmnet with OS buster
21:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
21:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20589 and previous config saved to /var/cache/conftool/dbconfig/20220210-215354-ladsgroup.json
21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20588 and previous config saved to /var/cache/conftool/dbconfig/20220210-213849-ladsgroup.json
21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20587 and previous config saved to /var/cache/conftool/dbconfig/20220210-212344-ladsgroup.json
21:16 bblack: cr1-eqiad - manual config, static fallback for high-traffic1 to lvs1017
21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20586 and previous config saved to /var/cache/conftool/dbconfig/20220210-210839-ladsgroup.json
21:08 bblack: lvs1017 - bringing pybal online with real routing, flips high-traffic (text-cluster) traffic from lvs1020 -> lvs1017
20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T298554)', diff saved to https://phabricator.wikimedia.org/P20585 and previous config saved to /var/cache/conftool/dbconfig/20220210-204831-ladsgroup.json
20:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
20:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
20:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
20:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20584 and previous config saved to /var/cache/conftool/dbconfig/20220210-204818-ladsgroup.json
20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20583 and previous config saved to /var/cache/conftool/dbconfig/20220210-203313-ladsgroup.json
20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20582 and previous config saved to /var/cache/conftool/dbconfig/20220210-201808-ladsgroup.json
20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:08 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.21 refs T300197
20:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20581 and previous config saved to /var/cache/conftool/dbconfig/20220210-200304-ladsgroup.json
19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T298554)', diff saved to https://phabricator.wikimedia.org/P20580 and previous config saved to /var/cache/conftool/dbconfig/20220210-194518-ladsgroup.json
19:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
19:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T298554)', diff saved to https://phabricator.wikimedia.org/P20579 and previous config saved to /var/cache/conftool/dbconfig/20220210-194510-ladsgroup.json
19:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20578 and previous config saved to /var/cache/conftool/dbconfig/20220210-193005-ladsgroup.json
19:25 bblack: lvs1017 reboot again for clean network config - T301142
19:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20577 and previous config saved to /var/cache/conftool/dbconfig/20220210-191501-ladsgroup.json
19:13 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@828a428] (eqiad): Configure geoshapes postgres max conns (duration: 01m 29s)
19:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:13 urbanecm@deploy1002: Synchronized wmf-config/flaggedrevs.php: 72f3b31: Migrate $wmfStandardAutoPromote to $wmgStandardAutoPromote (T45956) (duration: 00m 49s)
19:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:12 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@828a428] (eqiad): Configure geoshapes postgres max conns
19:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:11 bblack: lvs1017 rebooting for sanity-check after prod config - T301142
19:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300382)', diff saved to https://phabricator.wikimedia.org/P20576 and previous config saved to /var/cache/conftool/dbconfig/20220210-190840-marostegui.json
19:03 otto@deploy1002: Finished deploy [airflow-dags/research@b871faf]: (no justification provided) (duration: 00m 03s)
19:03 otto@deploy1002: Started deploy [airflow-dags/research@b871faf]: (no justification provided)
19:01 otto@deploy1002: Finished deploy [airflow-dags/research@b871faf]: (no justification provided) (duration: 00m 27s)
19:01 otto@deploy1002: Started deploy [airflow-dags/research@b871faf]: (no justification provided)
18:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T298554)', diff saved to https://phabricator.wikimedia.org/P20575 and previous config saved to /var/cache/conftool/dbconfig/20220210-185956-ladsgroup.json
18:53 ebernhardson: restart all mjolnir daemons on search-loader1001 and 2001 to purge old cached node lists
18:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20574 and previous config saved to /var/cache/conftool/dbconfig/20220210-185336-marostegui.json
18:52 jgiannelos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: sync on production
18:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
18:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
18:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
18:49 jgiannelos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply on staging
18:49 jgiannelos@deploy1002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply on production
18:49 jgiannelos@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: sync on production
18:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
18:46 jgiannelos@deploy1002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply on staging
18:46 jgiannelos@deploy1002: helmfile [codfw] START helmfile.d/services/mobileapps: apply on production
18:45 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: sync on staging
18:45 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase1031.eqiad.wmnet with OS buster
18:45 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase1032.eqiad.wmnet with OS buster
18:45 jgiannelos@deploy1002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply on production
18:45 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host restbase1033.eqiad.wmnet with OS buster
18:45 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply on staging
18:44 jgiannelos@deploy1002: helmfile [staging] START helmfile.d/services/mobileapps: apply on staging
18:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
18:43 bblack: lvs1013 - stopping puppet+pybal for move to lvs1017, high-traffic1 traffic fails over to lvs1020 for now - T301142
18:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
18:42 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/content/ContentHandler.php: Backport: ContentHandler: Avoid saving in ParserCache in search index jobs (T285993) (duration: 00m 50s)
18:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
18:40 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/includes/content/ContentHandler.php: Backport: ContentHandler: Avoid saving in ParserCache in search index jobs (T285993) (duration: 00m 50s)
18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20573 and previous config saved to /var/cache/conftool/dbconfig/20220210-184012-marostegui.json
18:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
18:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300775)', diff saved to https://phabricator.wikimedia.org/P20572 and previous config saved to /var/cache/conftool/dbconfig/20220210-184004-marostegui.json
18:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20571 and previous config saved to /var/cache/conftool/dbconfig/20220210-183831-marostegui.json
18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2088:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20570 and previous config saved to /var/cache/conftool/dbconfig/20220210-183107-ladsgroup.json
18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T298554)', diff saved to https://phabricator.wikimedia.org/P20569 and previous config saved to /var/cache/conftool/dbconfig/20220210-182959-ladsgroup.json
18:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
18:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298554)', diff saved to https://phabricator.wikimedia.org/P20568 and previous config saved to /var/cache/conftool/dbconfig/20220210-182952-ladsgroup.json
18:29 bblack@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:28 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id` (duration: 01m 01s)
18:27 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id`
18:26 bblack@cumin1001: START - Cookbook sre.dns.netbox
18:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2088:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20567 and previous config saved to /var/cache/conftool/dbconfig/20220210-182547-ladsgroup.json
18:25 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id` (duration: 00m 15s)
18:25 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@a5be8ac] (eqiad): Remove references to cassandra `storage_id`
18:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P20566 and previous config saved to /var/cache/conftool/dbconfig/20220210-182500-marostegui.json
18:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300382)', diff saved to https://phabricator.wikimedia.org/P20565 and previous config saved to /var/cache/conftool/dbconfig/20220210-182326-marostegui.json
18:18 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1033.eqiad.wmnet with OS buster
18:17 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1032.eqiad.wmnet with OS buster
18:16 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host restbase1031.eqiad.wmnet with OS buster
18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20564 and previous config saved to /var/cache/conftool/dbconfig/20220210-181447-ladsgroup.json
18:13 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@bf5fb8e] (eqiad): Remove unused kartotherian-postgres reference (duration: 00m 14s)
18:13 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@bf5fb8e] (eqiad): Remove unused kartotherian-postgres reference
18:12 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@5699db7] (eqiad): Remove unused kartotherian-layermixer reference (duration: 04m 52s)
18:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2088.codfw.wmnet with OS bullseye
18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P20563 and previous config saved to /var/cache/conftool/dbconfig/20220210-180955-marostegui.json
18:07 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@5699db7] (eqiad): Remove unused kartotherian-layermixer reference
18:06 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d (duration: 05m 58s)
18:00 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d
17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20562 and previous config saved to /var/cache/conftool/dbconfig/20220210-175942-ladsgroup.json
17:57 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d (duration: 05m 59s)
17:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300775)', diff saved to https://phabricator.wikimedia.org/P20561 and previous config saved to /var/cache/conftool/dbconfig/20220210-175450-marostegui.json
17:51 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@4312bc3] (eqiad): Update kartotherian-package to dd11f2d
17:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298554)', diff saved to https://phabricator.wikimedia.org/P20560 and previous config saved to /var/cache/conftool/dbconfig/20220210-174438-ladsgroup.json
17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2088.codfw.wmnet with OS bullseye
17:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2088:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20559 and previous config saved to /var/cache/conftool/dbconfig/20220210-173957-ladsgroup.json
17:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2088:3311 (T300510)', diff saved to https://phabricator.wikimedia.org/P20558 and previous config saved to /var/cache/conftool/dbconfig/20220210-173932-ladsgroup.json
17:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2088.codfw.wmnet with reason: Maintenance
17:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2088.codfw.wmnet with reason: Maintenance
17:36 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1011.eqiad.wmnet with OS stretch
17:31 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1010.eqiad.wmnet with OS stretch
17:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
17:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
17:28 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe1009.eqiad.wmnet with OS stretch
17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T298554)', diff saved to https://phabricator.wikimedia.org/P20557 and previous config saved to /var/cache/conftool/dbconfig/20220210-172635-ladsgroup.json
17:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
17:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
17:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
17:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T300382)', diff saved to https://phabricator.wikimedia.org/P20556 and previous config saved to /var/cache/conftool/dbconfig/20220210-172307-marostegui.json
17:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
17:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300382)', diff saved to https://phabricator.wikimedia.org/P20555 and previous config saved to /var/cache/conftool/dbconfig/20220210-172300-marostegui.json
17:15 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
17:15 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
17:15 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
17:14 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
17:14 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
17:14 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
17:12 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ms-fe1011.eqiad.wmnet with OS stretch
17:10 rzl: rzl@cumin2001:~$ sudo cumin A:mw "enable-puppet T273323"
17:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20553 and previous config saved to /var/cache/conftool/dbconfig/20220210-170755-marostegui.json
17:06 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ms-fe1010.eqiad.wmnet with OS stretch
17:05 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ms-fe1009.eqiad.wmnet with OS stretch
17:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dbmonitor1002.wikimedia.org
17:03 rzl: rzl@cumin2001:~$ sudo cumin A:mw "disable-puppet T273323"
17:01 mutante: etherpad going down for maintenance
16:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.decommission for hosts dbmonitor1002.wikimedia.org
16:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20552 and previous config saved to /var/cache/conftool/dbconfig/20220210-165250-marostegui.json
16:50 otto@deploy1002: Finished deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided) (duration: 00m 10s)
16:50 otto@deploy1002: Started deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided)
16:50 otto@deploy1002: Finished deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided) (duration: 01m 46s)
16:48 otto@deploy1002: Started deploy [airflow-dags/analytics@5b6ba8e]: (no justification provided)
16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300382)', diff saved to https://phabricator.wikimedia.org/P20551 and previous config saved to /var/cache/conftool/dbconfig/20220210-163746-marostegui.json
16:37 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@5b6ba8e]: (no justification provided) (duration: 00m 08s)
16:37 otto@deploy1002: Started deploy [airflow-dags/analytics_test@5b6ba8e]: (no justification provided)
16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T300382)', diff saved to https://phabricator.wikimedia.org/P20550 and previous config saved to /var/cache/conftool/dbconfig/20220210-163633-marostegui.json
16:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
16:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
16:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
16:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300382)', diff saved to https://phabricator.wikimedia.org/P20549 and previous config saved to /var/cache/conftool/dbconfig/20220210-163620-marostegui.json
16:22 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 00m 11s)
16:22 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
16:22 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 07m 49s)
16:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20548 and previous config saved to /var/cache/conftool/dbconfig/20220210-162115-marostegui.json
16:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
16:14 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
16:14 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 04m 19s)
16:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
16:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
16:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
16:09 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
16:09 ppchelko@deploy1002: Synchronized w/tmp_settings_bench.php: Config: gerrit 761433 settings benchmark - measure new static php array config load (duration: 00m 49s)
16:08 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided) (duration: 00m 46s)
16:07 otto@deploy1002: Started deploy [airflow-dags/analytics_test@66d6cad]: (no justification provided)
16:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20547 and previous config saved to /var/cache/conftool/dbconfig/20220210-160611-marostegui.json
16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298554)', diff saved to https://phabricator.wikimedia.org/P20546 and previous config saved to /var/cache/conftool/dbconfig/20220210-160417-ladsgroup.json
16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T300510)', diff saved to https://phabricator.wikimedia.org/P20545 and previous config saved to /var/cache/conftool/dbconfig/20220210-160046-ladsgroup.json
16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T300510)', diff saved to https://phabricator.wikimedia.org/P20544 and previous config saved to /var/cache/conftool/dbconfig/20220210-160003-ladsgroup.json
15:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300382)', diff saved to https://phabricator.wikimedia.org/P20543 and previous config saved to /var/cache/conftool/dbconfig/20220210-155106-marostegui.json
15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20542 and previous config saved to /var/cache/conftool/dbconfig/20220210-154913-ladsgroup.json
15:39 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/includes/Storage/DerivedPageDataUpdater.php: Backport: DerivedPageDataUpdater: Set ParserOutput when it's passed to it (T301309) (duration: 00m 50s)
15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20541 and previous config saved to /var/cache/conftool/dbconfig/20220210-153408-ladsgroup.json
15:32 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/includes/Storage/DerivedPageDataUpdater.php: Backport: DerivedPageDataUpdater: Set ParserOutput when it's passed to it (T301309) (duration: 00m 53s)
15:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2138.codfw.wmnet with OS bullseye
15:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
15:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
15:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
15:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
15:20 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
15:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply on pinkunicorn
15:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
15:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
15:19 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
15:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
15:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298554)', diff saved to https://phabricator.wikimedia.org/P20538 and previous config saved to /var/cache/conftool/dbconfig/20220210-151903-ladsgroup.json
15:17 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
15:16 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:58 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:58 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:57 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:56 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2138.codfw.wmnet with OS bullseye
14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T300382)', diff saved to https://phabricator.wikimedia.org/P20537 and previous config saved to /var/cache/conftool/dbconfig/20220210-145047-marostegui.json
14:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
14:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20536 and previous config saved to /var/cache/conftool/dbconfig/20220210-145040-marostegui.json
14:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138 (T300510)', diff saved to https://phabricator.wikimedia.org/P20535 and previous config saved to /var/cache/conftool/dbconfig/20220210-144913-ladsgroup.json
14:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
14:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20534 and previous config saved to /var/cache/conftool/dbconfig/20220210-143535-marostegui.json
14:23 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2006.codfw.wmnet
14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20533 and previous config saved to /var/cache/conftool/dbconfig/20220210-142030-marostegui.json
14:19 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2006.codfw.wmnet
14:19 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ml-serve2005.codfw.wmnet
14:10 elukey: `elukey@cumin1001:~$ homer 'cr*codfw*' commit "Add ml-serve2006 to the k8s ml-serve-codfw cluster's neighbors"`
14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20532 and previous config saved to /var/cache/conftool/dbconfig/20220210-140525-marostegui.json
14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T298554)', diff saved to https://phabricator.wikimedia.org/P20531 and previous config saved to /var/cache/conftool/dbconfig/20220210-140500-ladsgroup.json
14:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
14:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
14:00 moritzm: installing apache security updates on phab1001/phabricator.wikimedia.org
13:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20530 and previous config saved to /var/cache/conftool/dbconfig/20220210-135411-marostegui.json
13:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
13:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
13:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
13:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
13:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
13:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
13:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300382)', diff saved to https://phabricator.wikimedia.org/P20529 and previous config saved to /var/cache/conftool/dbconfig/20220210-135332-marostegui.json
13:50 moritzm: installing apache security updates on otrs1001/ticket.wikimedia.org
13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20527 and previous config saved to /var/cache/conftool/dbconfig/20220210-133827-marostegui.json
13:28 moritzm: installing lxml security updates
13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20526 and previous config saved to /var/cache/conftool/dbconfig/20220210-132323-marostegui.json
13:22 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus1003.eqiad.wmnet
13:09 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus1003.eqiad.wmnet
13:08 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts prometheus2003.codfw.wmnet
13:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300382)', diff saved to https://phabricator.wikimedia.org/P20525 and previous config saved to /var/cache/conftool/dbconfig/20220210-130818-marostegui.json
12:59 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus2003.codfw.wmnet
12:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
12:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298554)', diff saved to https://phabricator.wikimedia.org/P20524 and previous config saved to /var/cache/conftool/dbconfig/20220210-125850-ladsgroup.json
12:58 filippo@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts prometheus2003.codfw.wmnet
12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300382)', diff saved to https://phabricator.wikimedia.org/P20523 and previous config saved to /var/cache/conftool/dbconfig/20220210-125503-marostegui.json
12:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
12:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
12:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20522 and previous config saved to /var/cache/conftool/dbconfig/20220210-125456-marostegui.json
12:50 moritzm: installing apr security updates
12:49 filippo@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus2003.codfw.wmnet
12:48 Lucas_WMDE: printf '%s\n' 'https://query.wikidata.org/index.html' 'https://query.wikidata.org/embed.html' | mwscript purgeList.php # T301457 just in case
12:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20521 and previous config saved to /var/cache/conftool/dbconfig/20220210-124346-ladsgroup.json
12:40 taavi: UTC morning deploys done
12:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20520 and previous config saved to /var/cache/conftool/dbconfig/20220210-123951-marostegui.json
12:39 taavi@deploy1002: Synchronized logos/config.yaml: Config: banwikisource: Fix logo size (T296459) (duration: 00m 49s)
12:39 taavi: purge banwikisource logos via purgeList.php T296459
12:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:39 taavi@deploy1002: Synchronized wmf-config/logos.php: Config: banwikisource: Fix logo size (T296459) (duration: 00m 49s)
12:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:38 taavi@deploy1002: Synchronized static/images/project-logos/: Config: banwikisource: Fix logo size (T296459) (duration: 00m 50s)
12:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:34 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: InitialiseSettings: move ombudsmen.wikimedia.org to ombuds.wikimedia.org (T273323) (duration: 00m 49s)
12:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:30 taavi@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: MWMultiVersion: move ombudsmen.wikimedia.org to ombuds.wikimedia.org (T273323) (duration: 00m 49s)
12:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20519 and previous config saved to /var/cache/conftool/dbconfig/20220210-122841-ladsgroup.json
12:25 taavi@deploy1002: Synchronized wmf-config/MetaContactPages.php: Config: Define a contact form for Chapter/Thorg application status (T298024) (duration: 00m 50s)
12:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20518 and previous config saved to /var/cache/conftool/dbconfig/20220210-122446-marostegui.json
12:23 moritzm: installing pillow security updates
12:18 taavi: echo "https://query.wikidata.org/" | mwscript purgeList.php # T301457
12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298554)', diff saved to https://phabricator.wikimedia.org/P20517 and previous config saved to /var/cache/conftool/dbconfig/20220210-121336-ladsgroup.json
12:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20516 and previous config saved to /var/cache/conftool/dbconfig/20220210-120941-marostegui.json
12:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20515 and previous config saved to /var/cache/conftool/dbconfig/20220210-120729-marostegui.json
12:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
12:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
12:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
12:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
12:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
12:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
12:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20514 and previous config saved to /var/cache/conftool/dbconfig/20220210-120701-marostegui.json
11:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts restbase2009.codfw.wmnet
11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20513 and previous config saved to /var/cache/conftool/dbconfig/20220210-115156-marostegui.json
11:43 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2009.codfw.wmnet
11:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20512 and previous config saved to /var/cache/conftool/dbconfig/20220210-114224-root.json
11:40 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts restbase2010.codfw.wmnet
11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20511 and previous config saved to /var/cache/conftool/dbconfig/20220210-113651-marostegui.json
11:27 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2010.codfw.wmnet
11:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20510 and previous config saved to /var/cache/conftool/dbconfig/20220210-112720-root.json
11:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20509 and previous config saved to /var/cache/conftool/dbconfig/20220210-112147-marostegui.json
11:21 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
11:21 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T300382)', diff saved to https://phabricator.wikimedia.org/P20508 and previous config saved to /var/cache/conftool/dbconfig/20220210-112034-marostegui.json
11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
11:20 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
11:20 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
11:20 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
11:19 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
11:18 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
11:18 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
11:18 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
11:18 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
11:17 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
11:16 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
11:16 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
11:16 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
11:16 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
11:16 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
11:16 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
11:15 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on staging
11:15 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
11:15 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
11:15 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
11:14 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on staging
11:14 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
11:14 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
11:14 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
11:14 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
11:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20507 and previous config saved to /var/cache/conftool/dbconfig/20220210-111217-root.json
11:11 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
11:10 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
11:10 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
11:09 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
11:08 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
11:08 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
11:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
11:06 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
11:06 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
11:06 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
11:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
11:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
11:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
11:05 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
11:04 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
11:03 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
11:03 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1021.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
11:03 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1021.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
11:01 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/FlaggedRevs/backend/FlaggedRevs.php: Backport: Short circuit updating stats when the page is not reviewable (T301433) (duration: 00m 49s)
11:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
10:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
10:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
10:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T298554)', diff saved to https://phabricator.wikimedia.org/P20506 and previous config saved to /var/cache/conftool/dbconfig/20220210-105853-ladsgroup.json
10:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
10:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
10:58 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/FlaggedRevs/backend/FlaggedRevs.php: Backport: Short circuit updating stats when the page is not reviewable (T301433) (duration: 00m 50s)
10:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
10:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20505 and previous config saved to /var/cache/conftool/dbconfig/20220210-105713-root.json
10:46 moritzm: installing ruby2.5 security updates
10:44 arturo: deploying https://gerrit.wikimedia.org/r/c/operations/homer/public/+/761435 to core routers
10:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20503 and previous config saved to /var/cache/conftool/dbconfig/20220210-104208-root.json
10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20502 and previous config saved to /var/cache/conftool/dbconfig/20220210-103324-marostegui.json
10:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
10:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300382)', diff saved to https://phabricator.wikimedia.org/P20501 and previous config saved to /var/cache/conftool/dbconfig/20220210-103317-marostegui.json
10:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20500 and previous config saved to /var/cache/conftool/dbconfig/20220210-101812-marostegui.json
10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20499 and previous config saved to /var/cache/conftool/dbconfig/20220210-100307-marostegui.json
09:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
09:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
09:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298554)', diff saved to https://phabricator.wikimedia.org/P20498 and previous config saved to /var/cache/conftool/dbconfig/20220210-094929-ladsgroup.json
09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300382)', diff saved to https://phabricator.wikimedia.org/P20497 and previous config saved to /var/cache/conftool/dbconfig/20220210-094802-marostegui.json
09:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300382)', diff saved to https://phabricator.wikimedia.org/P20496 and previous config saved to /var/cache/conftool/dbconfig/20220210-094655-marostegui.json
09:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
09:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
09:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20495 and previous config saved to /var/cache/conftool/dbconfig/20220210-094647-marostegui.json
09:43 elukey: update pcc facts
09:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20494 and previous config saved to /var/cache/conftool/dbconfig/20220210-093425-ladsgroup.json
09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20493 and previous config saved to /var/cache/conftool/dbconfig/20220210-093141-marostegui.json
09:30 marostegui: Remove watchdog@10.% user from db2071 T301442
09:27 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20492 and previous config saved to /var/cache/conftool/dbconfig/20220210-092727-marostegui.json
09:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20491 and previous config saved to /var/cache/conftool/dbconfig/20220210-091920-ladsgroup.json
09:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298554)', diff saved to https://phabricator.wikimedia.org/P20489 and previous config saved to /var/cache/conftool/dbconfig/20220210-090415-ladsgroup.json
09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20488 and previous config saved to /var/cache/conftool/dbconfig/20220210-090129-marostegui.json
09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20487 and previous config saved to /var/cache/conftool/dbconfig/20220210-090023-marostegui.json
09:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
09:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300382)', diff saved to https://phabricator.wikimedia.org/P20486 and previous config saved to /var/cache/conftool/dbconfig/20220210-090016-marostegui.json
08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20485 and previous config saved to /var/cache/conftool/dbconfig/20220210-084511-marostegui.json
08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20484 and previous config saved to /var/cache/conftool/dbconfig/20220210-083006-marostegui.json
08:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300382)', diff saved to https://phabricator.wikimedia.org/P20483 and previous config saved to /var/cache/conftool/dbconfig/20220210-081501-marostegui.json
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300382)', diff saved to https://phabricator.wikimedia.org/P20482 and previous config saved to /var/cache/conftool/dbconfig/20220210-081354-marostegui.json
08:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
08:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300382)', diff saved to https://phabricator.wikimedia.org/P20481 and previous config saved to /var/cache/conftool/dbconfig/20220210-081340-marostegui.json
07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20480 and previous config saved to /var/cache/conftool/dbconfig/20220210-075836-marostegui.json
07:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T298554)', diff saved to https://phabricator.wikimedia.org/P20479 and previous config saved to /var/cache/conftool/dbconfig/20220210-074404-ladsgroup.json
07:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
07:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
07:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298554)', diff saved to https://phabricator.wikimedia.org/P20478 and previous config saved to /var/cache/conftool/dbconfig/20220210-074356-ladsgroup.json
07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20477 and previous config saved to /var/cache/conftool/dbconfig/20220210-074331-marostegui.json
07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300775)', diff saved to https://phabricator.wikimedia.org/P20476 and previous config saved to /var/cache/conftool/dbconfig/20220210-072933-marostegui.json
07:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
07:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1110.eqiad.wmnet with reason: Maintenance
07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20475 and previous config saved to /var/cache/conftool/dbconfig/20220210-072925-marostegui.json
07:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20474 and previous config saved to /var/cache/conftool/dbconfig/20220210-072852-ladsgroup.json
07:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300382)', diff saved to https://phabricator.wikimedia.org/P20473 and previous config saved to /var/cache/conftool/dbconfig/20220210-072826-marostegui.json
07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300382)', diff saved to https://phabricator.wikimedia.org/P20472 and previous config saved to /var/cache/conftool/dbconfig/20220210-072718-marostegui.json
07:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
07:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20471 and previous config saved to /var/cache/conftool/dbconfig/20220210-072711-marostegui.json
07:16 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2006.codfw.wmnet with OS bullseye
07:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P20470 and previous config saved to /var/cache/conftool/dbconfig/20220210-071421-marostegui.json
07:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20469 and previous config saved to /var/cache/conftool/dbconfig/20220210-071347-ladsgroup.json
07:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20468 and previous config saved to /var/cache/conftool/dbconfig/20220210-071206-marostegui.json
07:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1115.eqiad.wmnet with OS bullseye
06:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P20467 and previous config saved to /var/cache/conftool/dbconfig/20220210-065916-marostegui.json
06:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298554)', diff saved to https://phabricator.wikimedia.org/P20466 and previous config saved to /var/cache/conftool/dbconfig/20220210-065842-ladsgroup.json
06:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20465 and previous config saved to /var/cache/conftool/dbconfig/20220210-065701-marostegui.json
06:46 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2006.codfw.wmnet with OS bullseye
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20464 and previous config saved to /var/cache/conftool/dbconfig/20220210-064411-marostegui.json
06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20463 and previous config saved to /var/cache/conftool/dbconfig/20220210-064156-marostegui.json
06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20462 and previous config saved to /var/cache/conftool/dbconfig/20220210-064149-marostegui.json
06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
06:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1100.eqiad.wmnet with reason: Maintenance
06:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20461 and previous config saved to /var/cache/conftool/dbconfig/20220210-064059-root.json
06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300382)', diff saved to https://phabricator.wikimedia.org/P20460 and previous config saved to /var/cache/conftool/dbconfig/20220210-064049-marostegui.json
06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300382)', diff saved to https://phabricator.wikimedia.org/P20459 and previous config saved to /var/cache/conftool/dbconfig/20220210-064021-marostegui.json
06:28 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1115.eqiad.wmnet with OS bullseye
06:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20458 and previous config saved to /var/cache/conftool/dbconfig/20220210-062556-root.json
06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20457 and previous config saved to /var/cache/conftool/dbconfig/20220210-062517-marostegui.json
06:23 marostegui@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1115.eqiad.wmnet with OS bullseye
06:18 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1115.eqiad.wmnet with OS bullseye
06:13 marostegui@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host db1115.eqiad.wmnet with OS bullseye
06:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20456 and previous config saved to /var/cache/conftool/dbconfig/20220210-061052-root.json
06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20455 and previous config saved to /var/cache/conftool/dbconfig/20220210-061012-marostegui.json
06:07 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1115.eqiad.wmnet with OS bullseye
06:01 marostegui: Drop tendril database from db1115 T297605
05:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20454 and previous config saved to /var/cache/conftool/dbconfig/20220210-055548-root.json
05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300382)', diff saved to https://phabricator.wikimedia.org/P20453 and previous config saved to /var/cache/conftool/dbconfig/20220210-055507-marostegui.json
05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300382)', diff saved to https://phabricator.wikimedia.org/P20452 and previous config saved to /var/cache/conftool/dbconfig/20220210-055400-marostegui.json
05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
05:49 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchangeslinked group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20451 and previous config saved to /var/cache/conftool/dbconfig/20220210-054911-marostegui.json
05:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20450 and previous config saved to /var/cache/conftool/dbconfig/20220210-054045-root.json
05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T298554)', diff saved to https://phabricator.wikimedia.org/P20449 and previous config saved to /var/cache/conftool/dbconfig/20220210-054003-ladsgroup.json
05:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
05:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
05:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20448 and previous config saved to /var/cache/conftool/dbconfig/20220210-053956-ladsgroup.json
05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20447 and previous config saved to /var/cache/conftool/dbconfig/20220210-052451-ladsgroup.json
05:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20446 and previous config saved to /var/cache/conftool/dbconfig/20220210-050946-ladsgroup.json
04:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20445 and previous config saved to /var/cache/conftool/dbconfig/20220210-045442-ladsgroup.json
03:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20444 and previous config saved to /var/cache/conftool/dbconfig/20220210-032310-ladsgroup.json
03:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
03:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
03:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298554)', diff saved to https://phabricator.wikimedia.org/P20443 and previous config saved to /var/cache/conftool/dbconfig/20220210-032303-ladsgroup.json
03:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20442 and previous config saved to /var/cache/conftool/dbconfig/20220210-030758-ladsgroup.json
02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20441 and previous config saved to /var/cache/conftool/dbconfig/20220210-025253-ladsgroup.json
02:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298554)', diff saved to https://phabricator.wikimedia.org/P20440 and previous config saved to /var/cache/conftool/dbconfig/20220210-023749-ladsgroup.json
01:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T298554)', diff saved to https://phabricator.wikimedia.org/P20439 and previous config saved to /var/cache/conftool/dbconfig/20220210-011920-ladsgroup.json
01:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
01:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
00:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:37 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: jawikivoyage: Change module talk namespace from トーク to ノート (T262155) (duration: 00m 50s)
00:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:19 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: jawikivoyage: Change talk namespace names from トーク to ノート (T262155) (duration: 00m 54s)
00:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
00:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance

2022-02-09

23:48 mutante: apt1001 - delete etherpad-lite for bullseye source package, built, uploaded and imported 1.8.16-2 in bullseye-wikimedia, now source and binary packages in APT, simulated install on etherpad1003 works T300568
23:18 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
23:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
23:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
23:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20438 and previous config saved to /var/cache/conftool/dbconfig/20220209-230745-ladsgroup.json
22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20437 and previous config saved to /var/cache/conftool/dbconfig/20220209-225240-ladsgroup.json
22:50 bking@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic[1032-1038,1040-1042,1044-1047].eqiad.wmnet
22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20435 and previous config saved to /var/cache/conftool/dbconfig/20220209-223736-ladsgroup.json
22:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20434 and previous config saved to /var/cache/conftool/dbconfig/20220209-222231-ladsgroup.json
21:51 hoo: T299422: Started Wikibase rebuildItemsPerSite in 100k page batches on mwmaint1002 for wikidatawiki. Can be killed at any time, if necessary.
20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T298554)', diff saved to https://phabricator.wikimedia.org/P20432 and previous config saved to /var/cache/conftool/dbconfig/20220209-205619-ladsgroup.json
20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20431 and previous config saved to /var/cache/conftool/dbconfig/20220209-205606-ladsgroup.json
20:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:48 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.21 refs T300197 (duration: 00m 51s)
20:47 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.21 refs T300197
20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20430 and previous config saved to /var/cache/conftool/dbconfig/20220209-204101-ladsgroup.json
20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20429 and previous config saved to /var/cache/conftool/dbconfig/20220209-202557-ladsgroup.json
20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20428 and previous config saved to /var/cache/conftool/dbconfig/20220209-201052-ladsgroup.json
19:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:45 urbanecm: UTC evening B&C window completed
19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/GrowthExperiments/includes/Specials/SpecialMentorDashboard.php: 3da81ec: Mentor dashboard: Mark mentor-tools as beta (T280307) (duration: 00m 49s)
19:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:37 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.21/extensions/WikimediaEvents/: 588fa93: Track changes of growthexperiments-mentor-away-timestamp (T280307) (duration: 00m 49s)
19:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/GrowthExperiments/: 9675848: 49202e7: Deploy M2 Mentor settings module (T280307) (duration: 00m 51s)
19:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikimediaEvents/includes/PrefUpdateInstrumentation.php: a307ac4: Track changes of growthexperiments-mentor-away-timestamp (T280307) (duration: 00m 50s)
19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:23 urbanecm: [urbanecm@deploy1002 /srv/mediawiki-staging (master % u=)]$ rm v5.4.2\) # delete untracked file found in staging dir; created by Reedy, contains scap's logo
19:09 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:04 pt1979@cumin2002: START - Cookbook sre.dns.netbox
18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T298554)', diff saved to https://phabricator.wikimedia.org/P20427 and previous config saved to /var/cache/conftool/dbconfig/20220209-184430-ladsgroup.json
18:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
18:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20426 and previous config saved to /var/cache/conftool/dbconfig/20220209-184423-ladsgroup.json
18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20425 and previous config saved to /var/cache/conftool/dbconfig/20220209-182918-ladsgroup.json
18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20424 and previous config saved to /var/cache/conftool/dbconfig/20220209-181413-ladsgroup.json
18:00 elukey: copy calico debs from buster-wikimedia's component/calico-future to bullseye-wikimedia component/calico317
17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20423 and previous config saved to /var/cache/conftool/dbconfig/20220209-175909-ladsgroup.json
17:37 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b] (duration: 07m 04s)
17:34 elukey: upload rsyslog 8.2102.0-2+deb11u1+wmf1 packages to bullseye-wikimedia component/rsyslog-k8s
17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (hadoop-test): Regular analytics weekly train HADOOP-TEST [analytics/refinery@55b229b]
17:30 joal@deploy1002: Finished deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b] (duration: 00m 07s)
17:30 joal@deploy1002: Started deploy [analytics/refinery@55b229b] (thin): Regular analytics weekly train THIN [analytics/refinery@55b229b]
17:27 joal@deploy1002: Finished deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b] (duration: 22m 00s)
17:07 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster1001 - T300740
17:05 joal@deploy1002: Started deploy [analytics/refinery@55b229b]: Regular analytics weekly train [analytics/refinery@55b229b]
16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T298554)', diff saved to https://phabricator.wikimedia.org/P20422 and previous config saved to /var/cache/conftool/dbconfig/20220209-163102-ladsgroup.json
16:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
16:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
16:21 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-staging,name=eqiad
16:17 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 03s)
16:17 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
16:16 otto@deploy1002: Finished deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided) (duration: 00m 20s)
16:16 otto@deploy1002: Started deploy [airflow-dags/analytics_test@ddd10b4]: (no justification provided)
15:57 jayme: ran sudo rm /var/run/confd-template/.k8s-ingress-staging*.err on puppetmaster2001 - T300740
15:56 jayme: restarting pybal on lvs1015,lvs2009 - T300740
15:44 jbond: change puppet hiera preference site vs site/role gerrit:761339
15:43 jayme@cumin1001: conftool action : set/pooled=yes:weight=10; selector: cluster=kubernetes-staging,service=kubesvc
15:31 jayme: restarting pybal on lvs2010,lvs1020 - T300740
15:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
15:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20420 and previous config saved to /var/cache/conftool/dbconfig/20220209-152522-ladsgroup.json
15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20419 and previous config saved to /var/cache/conftool/dbconfig/20220209-151017-ladsgroup.json
15:06 moritzm: imported jenkins 2.319.3 to thirdparty/ci T301361
14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20418 and previous config saved to /var/cache/conftool/dbconfig/20220209-145513-ladsgroup.json
14:43 ema: prometheus: remove atskafka target files - '/srv/prometheus/ops/targets/atskafka_*' T247497
14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20416 and previous config saved to /var/cache/conftool/dbconfig/20220209-144008-ladsgroup.json
14:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T300510)', diff saved to https://phabricator.wikimedia.org/P20415 and previous config saved to /var/cache/conftool/dbconfig/20220209-143642-ladsgroup.json
14:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2126.codfw.wmnet with OS bullseye
14:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:22 reedy@deploy1002: Finished scap: Downgrading symfony/console (v5.4.3 => v5.4.2) T301320 (duration: 01m 31s)
14:20 reedy@deploy1002: Started scap: Downgrading symfony/console (v5.4.3 => v5.4.2) T301320
13:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2126.codfw.wmnet with OS bullseye
13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T300510)', diff saved to https://phabricator.wikimedia.org/P20414 and previous config saved to /var/cache/conftool/dbconfig/20220209-135515-ladsgroup.json
13:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
13:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2126.codfw.wmnet with reason: Maintenance
13:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye (T300510)
13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Migrate to bullseye (T300510)
13:48 jelto: update scap to 4.3.1 on all hosts - T301307
13:38 reedy@deploy1002: Finished scap: Downgrading symfony/console (v5.4.3 => v5.4.2) T301320 (duration: 01m 34s)
13:36 reedy@deploy1002: Started scap: Downgrading symfony/console (v5.4.3 => v5.4.2) T301320
13:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T298554)', diff saved to https://phabricator.wikimedia.org/P20412 and previous config saved to /var/cache/conftool/dbconfig/20220209-131938-ladsgroup.json
13:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
13:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
13:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
13:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
13:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
13:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:41 Lucas_WMDE: UTC morning backport+config window done
12:40 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: sawikisource: Add audio book namespace (T282970) (duration: 00m 50s)
12:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:14 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/MWRealm.php: Config: Stop writing to $wmfRealm (T45956) (3/3) (duration: 00m 49s)
12:13 lucaswerkmeister-wmde@deploy1002: Synchronized multiversion/buildConfigCache.php: Config: Stop writing to $wmfRealm (T45956) (2/3) (duration: 00m 49s)
12:11 lucaswerkmeister-wmde@deploy1002: Synchronized tests/loggingTest.php: Config: Stop writing to $wmfRealm (T45956) (1/3) (duration: 01m 38s)
12:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300775)', diff saved to https://phabricator.wikimedia.org/P20411 and previous config saved to /var/cache/conftool/dbconfig/20220209-112029-marostegui.json
11:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
11:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1100.eqiad.wmnet with reason: Maintenance
11:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-fe[2005-2008].codfw.wmnet
10:50 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
10:45 akosiaris: T300568 upload prometheus-etherpad-exporter_0.5_amd64 to apt.wikimedia.org bullseye-wikimedia/main
10:35 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
10:34 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
10:34 jayme@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
10:32 jayme@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
10:25 jelto@deploy1002: Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 00m 22s)
10:25 jelto@deploy1002: Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided)
10:20 jelto: update scap to 4.3.1 on A:restbase-canary - T301307
10:17 jelto: update scap to 4.3.1 on A:mw-canary or A:parsoid-canary or A:mw-jobrunner-canary - T301307
10:16 ariel@deploy1002: Finished deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2 (duration: 00m 03s)
10:15 ariel@deploy1002: Started deploy [dumps/dumps@9993036]: fix up default api jobs entry for siteinfo v2
10:15 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts ms-fe[2005-2008].codfw.wmnet
10:14 volans: uploaded python3-wmflib_1.0.1 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
10:11 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-fe[2005-2008].codfw.wmnet
10:03 akosiaris: T300568 upload prometheus-etherpad-exporter_0.4_amd64 to apt.wikimedia.org bullseye-wikimedia/main
10:02 Emperor: rolling restart of swift frontends T301251
09:46 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
09:45 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
09:45 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
09:45 elukey: update my ssh key on all network devices (will commit only when the diff is my key only)
09:44 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
09:41 ema: cp3050: stop and disable atskafka-webrequest.service T247497
09:15 ema: cp3050: ats-backend-restart to set the number of allowed Lua states back from 64 to 256 (default) T265625
08:21 dcausse: restarting blazegraph on wdqs1004 (jvm stuck for 5 hours)
07:55 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
07:42 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Remove logpager group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20410 and previous config saved to /var/cache/conftool/dbconfig/20220209-073528-marostegui.json
04:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
04:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
03:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
03:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
03:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20407 and previous config saved to /var/cache/conftool/dbconfig/20220209-034800-ladsgroup.json
03:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20406 and previous config saved to /var/cache/conftool/dbconfig/20220209-033255-ladsgroup.json
03:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20405 and previous config saved to /var/cache/conftool/dbconfig/20220209-031750-ladsgroup.json
03:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20404 and previous config saved to /var/cache/conftool/dbconfig/20220209-030245-ladsgroup.json
02:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T298554)', diff saved to https://phabricator.wikimedia.org/P20403 and previous config saved to /var/cache/conftool/dbconfig/20220209-023446-ladsgroup.json
02:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
02:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 11 hosts with reason: Maintenance
02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 11 hosts with reason: Maintenance
02:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
02:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance

2022-02-08

23:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2055.codfw.wmnet with OS buster
23:48 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2054.codfw.wmnet with OS buster
23:22 tzatziki: removing 1 file for legal compliance
23:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2055.codfw.wmnet with OS buster
23:20 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2053.codfw.wmnet with OS buster
23:17 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2054.codfw.wmnet with OS buster
23:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2052.codfw.wmnet with OS buster
22:50 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2053.codfw.wmnet with OS buster
22:44 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: sync on main
22:42 dzahn@deploy1002: helmfile [staging] START helmfile.d/services/miscweb: apply on main
22:41 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2052.codfw.wmnet with OS buster
22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20402 and previous config saved to /var/cache/conftool/dbconfig/20220208-221545-marostegui.json
22:12 topranks: doing planned 1-by-1 shutdown of ports xe-0/1/1, xe-0/1/2 and xe-0/1/9 on cr2-esams, to test reliability of each following user reports of issues at AMS-IX.
22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20401 and previous config saved to /var/cache/conftool/dbconfig/20220208-220041-marostegui.json
21:59 ryankemper: T294805 elastic10[68-83] erroneously weren't in pybal, added them just now: `sudo confctl select 'cluster=elasticsearch' set/pooled=yes:weight=10` (there's no hosts in the `conftool-data` list that we want depooled so we're okay setting all to pooled w/ equal weight)
21:59 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch
21:58 ryankemper@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: cluster=elasticsearch,name=elastic1*
21:53 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
21:52 ryankemper@puppetmaster1001: conftool action : GET; selector: service=search
21:47 ryankemper: [Elastic] `ryankemper@elastic1081:~$ sudo systemctl restart elasticsearch_6*psi*` (9600 but not 9200 seemed to be having connectivity issues)
21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20400 and previous config saved to /var/cache/conftool/dbconfig/20220208-214536-marostegui.json
21:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20399 and previous config saved to /var/cache/conftool/dbconfig/20220208-213031-marostegui.json
21:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T300402)', diff saved to https://phabricator.wikimedia.org/P20398 and previous config saved to /var/cache/conftool/dbconfig/20220208-212558-marostegui.json
21:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
21:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
21:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20397 and previous config saved to /var/cache/conftool/dbconfig/20220208-212550-marostegui.json
21:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20396 and previous config saved to /var/cache/conftool/dbconfig/20220208-211046-marostegui.json
20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20395 and previous config saved to /var/cache/conftool/dbconfig/20220208-205541-marostegui.json
20:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
20:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
20:52 jhuneidi@deploy1002: Finished scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0 (duration: 16m 17s)
20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:43 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2051.codfw.wmnet with OS buster
20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20394 and previous config saved to /var/cache/conftool/dbconfig/20220208-204036-marostegui.json
20:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20393 and previous config saved to /var/cache/conftool/dbconfig/20220208-203634-ladsgroup.json
20:36 jhuneidi@deploy1002: Started scap: sync again in attempt to deploy 1.38.0-wmf.21 to group0
20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20392 and previous config saved to /var/cache/conftool/dbconfig/20220208-203529-marostegui.json
20:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
20:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
20:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20391 and previous config saved to /var/cache/conftool/dbconfig/20220208-203521-marostegui.json
20:33 ryankemper: T294805 Banned `elastic10[32-47]` from main, omega, and psi elasticsearch clusters. Shards are relocating on main and omega clusters as expected, but they don't seem to be moving on psi. Investigating that currently. Might have to do with row allocation constraints, but unsure currently
20:28 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2050.codfw.wmnet with OS buster
20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20390 and previous config saved to /var/cache/conftool/dbconfig/20220208-202127-ladsgroup.json
20:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20389 and previous config saved to /var/cache/conftool/dbconfig/20220208-202016-marostegui.json
20:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:17 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.21 refs T300197
20:14 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2051.codfw.wmnet with OS buster
20:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20388 and previous config saved to /var/cache/conftool/dbconfig/20220208-200621-ladsgroup.json
20:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20387 and previous config saved to /var/cache/conftool/dbconfig/20220208-200512-marostegui.json
20:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2049.codfw.wmnet with OS buster
19:58 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2050.codfw.wmnet with OS buster
19:55 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2048.codfw.wmnet with OS buster
19:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20386 and previous config saved to /var/cache/conftool/dbconfig/20220208-195115-ladsgroup.json
19:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20385 and previous config saved to /var/cache/conftool/dbconfig/20220208-195007-marostegui.json
19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T300402)', diff saved to https://phabricator.wikimedia.org/P20384 and previous config saved to /var/cache/conftool/dbconfig/20220208-194528-marostegui.json
19:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
19:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
19:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20383 and previous config saved to /var/cache/conftool/dbconfig/20220208-194520-marostegui.json
19:32 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2049.codfw.wmnet with OS buster
19:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20382 and previous config saved to /var/cache/conftool/dbconfig/20220208-193016-marostegui.json
19:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2047.codfw.wmnet with OS buster
19:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2048.codfw.wmnet with OS buster
19:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2046.codfw.wmnet with OS buster
19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T298554)', diff saved to https://phabricator.wikimedia.org/P20381 and previous config saved to /var/cache/conftool/dbconfig/20220208-192055-ladsgroup.json
19:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
19:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20380 and previous config saved to /var/cache/conftool/dbconfig/20220208-192047-ladsgroup.json
19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20379 and previous config saved to /var/cache/conftool/dbconfig/20220208-191511-marostegui.json
19:12 jhuneidi@deploy1002: Pruned MediaWiki: 1.38.0-wmf.19 (duration: 03m 12s)
19:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
19:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
19:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:09 jhuneidi@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.21 refs T300197 (duration: 39m 34s)
19:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20378 and previous config saved to /var/cache/conftool/dbconfig/20220208-190542-ladsgroup.json
19:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20377 and previous config saved to /var/cache/conftool/dbconfig/20220208-190006-marostegui.json
18:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment (duration: 02m 02s)
18:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
18:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@49ba844]: query_clicks: resolve parse error in comment
18:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2047.codfw.wmnet with OS buster
18:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T300402)', diff saved to https://phabricator.wikimedia.org/P20376 and previous config saved to /var/cache/conftool/dbconfig/20220208-185420-marostegui.json
18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
18:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
18:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2046.codfw.wmnet with OS buster
18:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2045.codfw.wmnet with OS buster
18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2044.codfw.wmnet with OS buster
18:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
18:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20375 and previous config saved to /var/cache/conftool/dbconfig/20220208-185037-ladsgroup.json
18:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
18:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
18:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20374 and previous config saved to /var/cache/conftool/dbconfig/20220208-184832-marostegui.json
18:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
18:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
18:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
18:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20373 and previous config saved to /var/cache/conftool/dbconfig/20220208-183532-ladsgroup.json
18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
18:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20372 and previous config saved to /var/cache/conftool/dbconfig/20220208-183328-marostegui.json
18:29 jhuneidi@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.21 refs T300197
18:22 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup (duration: 02m 03s)
18:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2045.codfw.wmnet with OS buster
18:20 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2044.codfw.wmnet with OS buster
18:20 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@ceff02f]: query_clicks: adjust start_date and catchup
18:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20371 and previous config saved to /var/cache/conftool/dbconfig/20220208-181823-marostegui.json
18:13 moritzm: installing expat security updates
18:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2043.codfw.wmnet with OS buster
18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T298554)', diff saved to https://phabricator.wikimedia.org/P20370 and previous config saved to /var/cache/conftool/dbconfig/20220208-180810-ladsgroup.json
18:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
18:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20369 and previous config saved to /var/cache/conftool/dbconfig/20220208-180803-ladsgroup.json
18:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20368 and previous config saved to /var/cache/conftool/dbconfig/20220208-180316-marostegui.json
17:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2042.codfw.wmnet with OS buster
17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T300402)', diff saved to https://phabricator.wikimedia.org/P20367 and previous config saved to /var/cache/conftool/dbconfig/20220208-175844-marostegui.json
17:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
17:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20366 and previous config saved to /var/cache/conftool/dbconfig/20220208-175837-marostegui.json
17:58 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow (duration: 02m 01s)
17:56 bblack@cumin1001: conftool action : set/pooled=no; selector: name=cp4031.ulsfo.wmnet
17:56 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@79cb98e]: move query clicks from oozie to airflow
17:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
17:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
17:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
17:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20365 and previous config saved to /var/cache/conftool/dbconfig/20220208-175258-ladsgroup.json
17:52 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
17:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20364 and previous config saved to /var/cache/conftool/dbconfig/20220208-174332-marostegui.json
17:40 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2043.codfw.wmnet with OS buster
17:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2041.codfw.wmnet with OS buster
17:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20363 and previous config saved to /var/cache/conftool/dbconfig/20220208-173753-ladsgroup.json
17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 8 hosts with reason: Maintenance
17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
17:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20362 and previous config saved to /var/cache/conftool/dbconfig/20220208-173611-marostegui.json
17:28 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2042.codfw.wmnet with OS buster
17:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20361 and previous config saved to /var/cache/conftool/dbconfig/20220208-172827-marostegui.json
17:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2040.codfw.wmnet with OS buster
17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20360 and previous config saved to /var/cache/conftool/dbconfig/20220208-172248-ladsgroup.json
17:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20359 and previous config saved to /var/cache/conftool/dbconfig/20220208-172106-marostegui.json
17:17 rzl: rzl@cumin1001:~$ sudo cumin A:mw "enable-puppet T273323"
17:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20358 and previous config saved to /var/cache/conftool/dbconfig/20220208-171323-marostegui.json
17:11 rzl: rzl@cumin1001:~$ sudo cumin A:mw "disable-puppet T273323"
17:11 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconciliation job (duration: 02m 01s)
17:09 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@88cdfdc]: Deploy rdf-streaming-updater reconciliation job
17:08 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2041.codfw.wmnet with OS buster
17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T300402)', diff saved to https://phabricator.wikimedia.org/P20357 and previous config saved to /var/cache/conftool/dbconfig/20220208-170812-marostegui.json
17:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
17:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20356 and previous config saved to /var/cache/conftool/dbconfig/20220208-170805-marostegui.json
17:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2039.codfw.wmnet with OS buster
17:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P20355 and previous config saved to /var/cache/conftool/dbconfig/20220208-170601-marostegui.json
16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20354 and previous config saved to /var/cache/conftool/dbconfig/20220208-165445-ladsgroup.json
16:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
16:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20353 and previous config saved to /var/cache/conftool/dbconfig/20220208-165436-ladsgroup.json
16:54 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20352 and previous config saved to /var/cache/conftool/dbconfig/20220208-165300-marostegui.json
16:51 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host mc2040.codfw.wmnet with OS buster
16:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
16:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2040.codfw.wmnet with OS buster
16:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20351 and previous config saved to /var/cache/conftool/dbconfig/20220208-165057-marostegui.json
16:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
16:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
16:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
16:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2038.codfw.wmnet with OS buster
16:45 dancy@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: Choose wikiversions.php file relative to MWMultiVersion.php (revived) (duration: 00m 49s)
16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20350 and previous config saved to /var/cache/conftool/dbconfig/20220208-163932-ladsgroup.json
16:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20349 and previous config saved to /var/cache/conftool/dbconfig/20220208-163755-marostegui.json
16:37 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
16:37 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
16:35 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2039.codfw.wmnet with OS buster
16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20348 and previous config saved to /var/cache/conftool/dbconfig/20220208-162427-ladsgroup.json
16:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20347 and previous config saved to /var/cache/conftool/dbconfig/20220208-162250-marostegui.json
16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T300402)', diff saved to https://phabricator.wikimedia.org/P20346 and previous config saved to /var/cache/conftool/dbconfig/20220208-161812-marostegui.json
16:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
16:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
16:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20345 and previous config saved to /var/cache/conftool/dbconfig/20220208-161805-marostegui.json
16:16 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host mc2038.codfw.wmnet with OS buster
16:13 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20344 and previous config saved to /var/cache/conftool/dbconfig/20220208-160922-ladsgroup.json
16:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20343 and previous config saved to /var/cache/conftool/dbconfig/20220208-160300-marostegui.json
15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20342 and previous config saved to /var/cache/conftool/dbconfig/20220208-154755-marostegui.json
15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T298554)', diff saved to https://phabricator.wikimedia.org/P20341 and previous config saved to /var/cache/conftool/dbconfig/20220208-154049-ladsgroup.json
15:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
15:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20340 and previous config saved to /var/cache/conftool/dbconfig/20220208-154042-ladsgroup.json
15:33 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
15:33 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20339 and previous config saved to /var/cache/conftool/dbconfig/20220208-153251-marostegui.json
15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T300402)', diff saved to https://phabricator.wikimedia.org/P20338 and previous config saved to /var/cache/conftool/dbconfig/20220208-152812-marostegui.json
15:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
15:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
15:27 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20337 and previous config saved to /var/cache/conftool/dbconfig/20220208-152536-ladsgroup.json
15:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
15:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20336 and previous config saved to /var/cache/conftool/dbconfig/20220208-152525-marostegui.json
15:18 Emperor: depooling ms-fe200[5-8] T301251
15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20335 and previous config saved to /var/cache/conftool/dbconfig/20220208-151032-ladsgroup.json
15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20334 and previous config saved to /var/cache/conftool/dbconfig/20220208-151020-marostegui.json
14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20333 and previous config saved to /var/cache/conftool/dbconfig/20220208-145731-marostegui.json
14:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
14:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1098.eqiad.wmnet with reason: Maintenance
14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20332 and previous config saved to /var/cache/conftool/dbconfig/20220208-145724-marostegui.json
14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20331 and previous config saved to /var/cache/conftool/dbconfig/20220208-145527-ladsgroup.json
14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20330 and previous config saved to /var/cache/conftool/dbconfig/20220208-145516-marostegui.json
14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20329 and previous config saved to /var/cache/conftool/dbconfig/20220208-144219-marostegui.json
14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20328 and previous config saved to /var/cache/conftool/dbconfig/20220208-144011-marostegui.json
14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T300402)', diff saved to https://phabricator.wikimedia.org/P20327 and previous config saved to /var/cache/conftool/dbconfig/20220208-143545-marostegui.json
14:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
14:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
14:35 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2001.codfw.wmnet
14:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
14:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
14:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20326 and previous config saved to /var/cache/conftool/dbconfig/20220208-143302-marostegui.json
14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T298554)', diff saved to https://phabricator.wikimedia.org/P20325 and previous config saved to /var/cache/conftool/dbconfig/20220208-142815-ladsgroup.json
14:28 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be2001.codfw.wmnet
14:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
14:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20324 and previous config saved to /var/cache/conftool/dbconfig/20220208-142808-ladsgroup.json
14:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P20323 and previous config saved to /var/cache/conftool/dbconfig/20220208-142714-marostegui.json
14:26 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2001.codfw.wmnet with OS bullseye
14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20322 and previous config saved to /var/cache/conftool/dbconfig/20220208-141757-marostegui.json
14:17 godog: update PERC firmware on thanos-be2001 - T288937
14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20321 and previous config saved to /var/cache/conftool/dbconfig/20220208-141303-ladsgroup.json
14:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20320 and previous config saved to /var/cache/conftool/dbconfig/20220208-141210-marostegui.json
14:07 godog: update NIC firmware on thanos-be2001 - T288937
14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20319 and previous config saved to /var/cache/conftool/dbconfig/20220208-140252-marostegui.json
13:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20318 and previous config saved to /var/cache/conftool/dbconfig/20220208-135758-ladsgroup.json
13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20317 and previous config saved to /var/cache/conftool/dbconfig/20220208-134748-marostegui.json
13:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
13:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
13:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
13:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T300402)', diff saved to https://phabricator.wikimedia.org/P20316 and previous config saved to /var/cache/conftool/dbconfig/20220208-134324-marostegui.json
13:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
13:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
13:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20315 and previous config saved to /var/cache/conftool/dbconfig/20220208-134254-ladsgroup.json
13:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
13:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
13:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20314 and previous config saved to /var/cache/conftool/dbconfig/20220208-134022-marostegui.json
13:37 moritzm: migrating instances off ganeti1021
13:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300775)', diff saved to https://phabricator.wikimedia.org/P20313 and previous config saved to /var/cache/conftool/dbconfig/20220208-133558-marostegui.json
13:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
13:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
13:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20312 and previous config saved to /var/cache/conftool/dbconfig/20220208-133550-marostegui.json
13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20310 and previous config saved to /var/cache/conftool/dbconfig/20220208-132517-marostegui.json
13:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20309 and previous config saved to /var/cache/conftool/dbconfig/20220208-132045-marostegui.json
13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T298554)', diff saved to https://phabricator.wikimedia.org/P20308 and previous config saved to /var/cache/conftool/dbconfig/20220208-131430-ladsgroup.json
13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20307 and previous config saved to /var/cache/conftool/dbconfig/20220208-131427-ladsgroup.json
13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
13:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
13:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20306 and previous config saved to /var/cache/conftool/dbconfig/20220208-131319-ladsgroup.json
13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20305 and previous config saved to /var/cache/conftool/dbconfig/20220208-131012-marostegui.json
13:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P20304 and previous config saved to /var/cache/conftool/dbconfig/20220208-130541-marostegui.json
12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20303 and previous config saved to /var/cache/conftool/dbconfig/20220208-125922-ladsgroup.json
12:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20302 and previous config saved to /var/cache/conftool/dbconfig/20220208-125814-ladsgroup.json
12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20301 and previous config saved to /var/cache/conftool/dbconfig/20220208-125508-marostegui.json
12:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20300 and previous config saved to /var/cache/conftool/dbconfig/20220208-125036-marostegui.json
12:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P20299 and previous config saved to /var/cache/conftool/dbconfig/20220208-124418-ladsgroup.json
12:43 Amir1: shut down dbmonitor1002 (T297605)
12:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20298 and previous config saved to /var/cache/conftool/dbconfig/20220208-124309-ladsgroup.json
12:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week (T297605)
12:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on dbmonitor1002.wikimedia.org with reason: Host will be shutdown in a week (T297605)
12:37 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host thanos-be2001.codfw.wmnet with OS bullseye
12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20297 and previous config saved to /var/cache/conftool/dbconfig/20220208-122913-ladsgroup.json
12:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20296 and previous config saved to /var/cache/conftool/dbconfig/20220208-122805-ladsgroup.json
12:27 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti1011.eqiad.wmnet with OS buster
12:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1182.eqiad.wmnet with OS bullseye
12:19 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
12:19 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2010.codfw.wmnet with reason: Decommissioning
12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300775)', diff saved to https://phabricator.wikimedia.org/P20295 and previous config saved to /var/cache/conftool/dbconfig/20220208-121430-marostegui.json
12:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
12:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20294 and previous config saved to /var/cache/conftool/dbconfig/20220208-121422-marostegui.json
12:11 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2010.wmnet
12:11 hnowlan: Running c-foreach-nt decommission on restbase2010 in advance of decommissioning
12:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T300402)', diff saved to https://phabricator.wikimedia.org/P20293 and previous config saved to /var/cache/conftool/dbconfig/20220208-120603-marostegui.json
12:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
12:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20292 and previous config saved to /var/cache/conftool/dbconfig/20220208-120556-marostegui.json
12:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: d9902a4: cowikimedia: Let admins grant confirmed and accountcreator flags (T300948) (duration: 00m 50s)
12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T298554)', diff saved to https://phabricator.wikimedia.org/P20291 and previous config saved to /var/cache/conftool/dbconfig/20220208-120102-ladsgroup.json
12:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
12:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
12:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20290 and previous config saved to /var/cache/conftool/dbconfig/20220208-120054-ladsgroup.json
11:59 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS buster
11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20289 and previous config saved to /var/cache/conftool/dbconfig/20220208-115918-marostegui.json
11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2019.wmnet
11:59 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2020.wmnet
11:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2019.codfw.wmnet with OS buster
11:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1182.eqiad.wmnet with OS bullseye
11:51 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2020.codfw.wmnet with OS buster
11:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20288 and previous config saved to /var/cache/conftool/dbconfig/20220208-115051-marostegui.json
11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300510)', diff saved to https://phabricator.wikimedia.org/P20287 and previous config saved to /var/cache/conftool/dbconfig/20220208-114639-ladsgroup.json
11:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
11:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1182.eqiad.wmnet with reason: Maintenance
11:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20286 and previous config saved to /var/cache/conftool/dbconfig/20220208-114549-ladsgroup.json
11:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P20285 and previous config saved to /var/cache/conftool/dbconfig/20220208-114413-marostegui.json
11:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20284 and previous config saved to /var/cache/conftool/dbconfig/20220208-113910-ladsgroup.json
11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P20283 and previous config saved to /var/cache/conftool/dbconfig/20220208-113547-marostegui.json
11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20282 and previous config saved to /var/cache/conftool/dbconfig/20220208-113045-ladsgroup.json
11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20281 and previous config saved to /var/cache/conftool/dbconfig/20220208-112909-marostegui.json
11:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20280 and previous config saved to /var/cache/conftool/dbconfig/20220208-112406-ladsgroup.json
11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20279 and previous config saved to /var/cache/conftool/dbconfig/20220208-112042-marostegui.json
11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20278 and previous config saved to /var/cache/conftool/dbconfig/20220208-111540-ladsgroup.json
11:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P20277 and previous config saved to /var/cache/conftool/dbconfig/20220208-110901-ladsgroup.json
11:06 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2020.codfw.wmnet with OS buster
11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T300402)', diff saved to https://phabricator.wikimedia.org/P20276 and previous config saved to /var/cache/conftool/dbconfig/20220208-110154-marostegui.json
11:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
11:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1179.eqiad.wmnet with reason: Maintenance
11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20275 and previous config saved to /var/cache/conftool/dbconfig/20220208-110147-marostegui.json
10:59 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2019.codfw.wmnet with OS buster
10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300775)', diff saved to https://phabricator.wikimedia.org/P20274 and previous config saved to /var/cache/conftool/dbconfig/20220208-105453-marostegui.json
10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
10:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
10:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20273 and previous config saved to /var/cache/conftool/dbconfig/20220208-105440-marostegui.json
10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20272 and previous config saved to /var/cache/conftool/dbconfig/20220208-105356-ladsgroup.json
10:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1162.eqiad.wmnet with OS bullseye
10:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20271 and previous config saved to /var/cache/conftool/dbconfig/20220208-104642-marostegui.json
10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T298554)', diff saved to https://phabricator.wikimedia.org/P20270 and previous config saved to /var/cache/conftool/dbconfig/20220208-104421-ladsgroup.json
10:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
10:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20269 and previous config saved to /var/cache/conftool/dbconfig/20220208-104414-ladsgroup.json
10:43 elukey: update pcc facts
10:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20268 and previous config saved to /var/cache/conftool/dbconfig/20220208-103935-marostegui.json
10:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P20267 and previous config saved to /var/cache/conftool/dbconfig/20220208-103137-marostegui.json
10:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20266 and previous config saved to /var/cache/conftool/dbconfig/20220208-102909-ladsgroup.json
10:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P20265 and previous config saved to /var/cache/conftool/dbconfig/20220208-102430-marostegui.json
10:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1162.eqiad.wmnet with OS bullseye
10:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20264 and previous config saved to /var/cache/conftool/dbconfig/20220208-101631-marostegui.json
10:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20263 and previous config saved to /var/cache/conftool/dbconfig/20220208-101404-ladsgroup.json
10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300510)', diff saved to https://phabricator.wikimedia.org/P20262 and previous config saved to /var/cache/conftool/dbconfig/20220208-101238-ladsgroup.json
10:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
10:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1162.eqiad.wmnet with reason: Maintenance
10:09 jayme: updated scap to 4.3.0 on all hosts - T300804
10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20261 and previous config saved to /var/cache/conftool/dbconfig/20220208-100926-marostegui.json
09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20260 and previous config saved to /var/cache/conftool/dbconfig/20220208-095916-marostegui.json
09:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1096.eqiad.wmnet with reason: Maintenance
09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20259 and previous config saved to /var/cache/conftool/dbconfig/20220208-095909-marostegui.json
09:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20258 and previous config saved to /var/cache/conftool/dbconfig/20220208-095900-ladsgroup.json
09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T300402)', diff saved to https://phabricator.wikimedia.org/P20257 and previous config saved to /var/cache/conftool/dbconfig/20220208-095427-marostegui.json
09:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
09:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20256 and previous config saved to /var/cache/conftool/dbconfig/20220208-095420-marostegui.json
09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20255 and previous config saved to /var/cache/conftool/dbconfig/20220208-094358-marostegui.json
09:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20254 and previous config saved to /var/cache/conftool/dbconfig/20220208-093915-marostegui.json
09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T298554)', diff saved to https://phabricator.wikimedia.org/P20253 and previous config saved to /var/cache/conftool/dbconfig/20220208-093315-ladsgroup.json
09:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
09:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P20252 and previous config saved to /var/cache/conftool/dbconfig/20220208-092853-marostegui.json
09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P20251 and previous config saved to /var/cache/conftool/dbconfig/20220208-092410-marostegui.json
09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20250 and previous config saved to /var/cache/conftool/dbconfig/20220208-091349-marostegui.json
09:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
09:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
09:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20249 and previous config saved to /var/cache/conftool/dbconfig/20220208-090906-marostegui.json
08:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T300402)', diff saved to https://phabricator.wikimedia.org/P20248 and previous config saved to /var/cache/conftool/dbconfig/20220208-084851-marostegui.json
08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
08:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
08:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300775)', diff saved to https://phabricator.wikimedia.org/P20247 and previous config saved to /var/cache/conftool/dbconfig/20220208-083815-marostegui.json
08:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
08:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20246 and previous config saved to /var/cache/conftool/dbconfig/20220208-083808-marostegui.json
08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20245 and previous config saved to /var/cache/conftool/dbconfig/20220208-082303-marostegui.json
08:20 marostegui: Stop MySQL on db1115 to back up tendril T297605
08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P20244 and previous config saved to /var/cache/conftool/dbconfig/20220208-080758-marostegui.json
08:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
08:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20243 and previous config saved to /var/cache/conftool/dbconfig/20220208-080709-marostegui.json
07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20242 and previous config saved to /var/cache/conftool/dbconfig/20220208-075254-marostegui.json
07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20241 and previous config saved to /var/cache/conftool/dbconfig/20220208-075204-marostegui.json
07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P20240 and previous config saved to /var/cache/conftool/dbconfig/20220208-073659-marostegui.json
07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20239 and previous config saved to /var/cache/conftool/dbconfig/20220208-072155-marostegui.json
07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T300402)', diff saved to https://phabricator.wikimedia.org/P20238 and previous config saved to /var/cache/conftool/dbconfig/20220208-070339-marostegui.json
07:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
07:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
06:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2134.codfw.wmnet with OS bullseye
06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
06:22 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2134.codfw.wmnet with OS bullseye
06:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300775)', diff saved to https://phabricator.wikimedia.org/P20237 and previous config saved to /var/cache/conftool/dbconfig/20220208-060943-marostegui.json
06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
06:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
06:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
06:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions group from s1 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20236 and previous config saved to /var/cache/conftool/dbconfig/20220208-060310-marostegui.json
02:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
02:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
02:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:12 ryankemper: T294805 Re-enabling puppet across eqiad elastic fleet: `ryankemper@cumin1001:~$ sudo cumin -b 8 'elastic1*' 'sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent'` tmux session `elastic`
00:12 ryankemper: T294805 old psi masters are out, done with all elastic master operations
00:05 ryankemper: T294805 new psi masters `elastic1073`, `elastic1075`, and `elastic1083` are in

2022-02-07

23:39 ryankemper: T294805 Removed old masters `elastic1034` and `elastic1038` (and `elastic1040` was removed earlier)
23:35 ryankemper: T294805 Bringing in new omega master `elastic1057`
23:31 ryankemper: T294805 Bringing in new omega master `elastic1076`
23:27 ryankemper: T294805 Bringing in new master `elastic1068`
23:27 ryankemper: T294805 Main search cluster all done, proceeding to `omega` cluster
23:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2053.mgmt.codfw.wmnet with reboot policy FORCED
23:17 cwhite: end opensearch upgrade (eqiad) T299168
23:09 ryankemper: T294805 Kicking out the final master `elastic1036` (which is also the currently elected leader); after this we'll be back to 3 masters as intended
23:06 ryankemper: T294805 Running puppet and restarting elasticsearch services on `elastic1040` to make it no longer a master
23:04 ryankemper: T294805 Bringing in new master `elastic1081`: `sudo systemctl restart elasticsearch_6@production-search-eqiad.service elasticsearch_6@production-search-psi-eqiad.service`
23:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2053.mgmt.codfw.wmnet with reboot policy FORCED
23:04 ryankemper: T294805 Bringing in new master `elastic1081`: `sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent`
22:59 ryankemper: T294805 `sudo systemctl restart elasticsearch_6@production-search-eqiad.service elasticsearch_6@production-search-omega-eqiad.service` on `elastic1074`
22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2052.mgmt.codfw.wmnet with reboot policy FORCED
22:57 ryankemper: T294805 Running puppet agent on new master elastic1074.eqiad.wmnet: `sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent`
22:48 ryankemper: T294805 Disabled puppet across all of elastic1* in preparation for bringing new master hosts in
22:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20235 and previous config saved to /var/cache/conftool/dbconfig/20220207-224733-ladsgroup.json
22:45 inflatador: T294805 puppet-merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/736118
22:44 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2052.mgmt.codfw.wmnet with reboot policy FORCED
22:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2051.mgmt.codfw.wmnet with reboot policy FORCED
22:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20234 and previous config saved to /var/cache/conftool/dbconfig/20220207-223228-ladsgroup.json
22:25 cwhite: begin opensearch upgrade (eqiad) T299168
22:21 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2051.mgmt.codfw.wmnet with reboot policy FORCED
22:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P20233 and previous config saved to /var/cache/conftool/dbconfig/20220207-221723-ladsgroup.json
22:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2050.mgmt.codfw.wmnet with reboot policy FORCED
22:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20232 and previous config saved to /var/cache/conftool/dbconfig/20220207-221345-ladsgroup.json
22:11 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2055.mgmt.codfw.wmnet with reboot policy FORCED
22:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20231 and previous config saved to /var/cache/conftool/dbconfig/20220207-220218-ladsgroup.json
22:01 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2050.mgmt.codfw.wmnet with reboot policy FORCED
22:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2049.mgmt.codfw.wmnet with reboot policy FORCED
22:00 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2055.mgmt.codfw.wmnet with reboot policy FORCED
21:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20230 and previous config saved to /var/cache/conftool/dbconfig/20220207-215840-ladsgroup.json
21:46 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2049.mgmt.codfw.wmnet with reboot policy FORCED
21:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P20229 and previous config saved to /var/cache/conftool/dbconfig/20220207-214335-ladsgroup.json
21:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2048.mgmt.codfw.wmnet with reboot policy FORCED
21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20228 and previous config saved to /var/cache/conftool/dbconfig/20220207-213650-ladsgroup.json
21:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
21:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
21:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20227 and previous config saved to /var/cache/conftool/dbconfig/20220207-212830-ladsgroup.json
21:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2048.mgmt.codfw.wmnet with reboot policy FORCED
21:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2047.mgmt.codfw.wmnet with reboot policy FORCED
21:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
21:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
21:09 otto@deploy1002: Finished deploy [airflow-dags/analytics-test@6d936db]: (no justification provided) (duration: 00m 08s)
21:09 otto@deploy1002: Started deploy [airflow-dags/analytics-test@6d936db]: (no justification provided)
21:04 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2047.mgmt.codfw.wmnet with reboot policy FORCED
21:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1129.eqiad.wmnet with OS bullseye
20:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
20:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
20:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20225 and previous config saved to /var/cache/conftool/dbconfig/20220207-205620-ladsgroup.json
20:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2046.mgmt.codfw.wmnet with reboot policy FORCED
20:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20223 and previous config saved to /var/cache/conftool/dbconfig/20220207-204115-ladsgroup.json
20:34 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2046.mgmt.codfw.wmnet with reboot policy FORCED
20:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db1129.eqiad.wmnet with OS bullseye
20:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300510)', diff saved to https://phabricator.wikimedia.org/P20222 and previous config saved to /var/cache/conftool/dbconfig/20220207-203120-ladsgroup.json
20:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
20:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1129.eqiad.wmnet with reason: Maintenance
20:30 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@9afb96d]: (no justification provided) (duration: 00m 08s)
20:30 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@9afb96d]: (no justification provided)
20:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P20221 and previous config saved to /var/cache/conftool/dbconfig/20220207-202611-ladsgroup.json
20:23 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: old kernel
20:23 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: old kernel
20:19 eileen: revision 7dcdc017 -> ccd5afc3 civicrm update
20:19 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@ef5783e]: (no justification provided) (duration: 00m 07s)
20:18 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@ef5783e]: (no justification provided)
20:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2045.mgmt.codfw.wmnet with reboot policy FORCED
20:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20220 and previous config saved to /var/cache/conftool/dbconfig/20220207-201106-ladsgroup.json
20:08 mbsantos@deploy1002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync on main
20:08 mbsantos@deploy1002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply on main
20:05 mbsantos@deploy1002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync on main
19:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2045.mgmt.codfw.wmnet with reboot policy FORCED
19:55 mbsantos@deploy1002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply on main
19:44 mforns@deploy1002: Finished deploy [airflow-dags/analytics-test@c83a4bc]: (no justification provided) (duration: 00m 08s)
19:44 mforns@deploy1002: Started deploy [airflow-dags/analytics-test@c83a4bc]: (no justification provided)
19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20219 and previous config saved to /var/cache/conftool/dbconfig/20220207-194020-ladsgroup.json
19:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
19:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20218 and previous config saved to /var/cache/conftool/dbconfig/20220207-194013-ladsgroup.json
19:36 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2044.mgmt.codfw.wmnet with reboot policy FORCED
19:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20217 and previous config saved to /var/cache/conftool/dbconfig/20220207-192508-ladsgroup.json
19:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:19 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mc2044.mgmt.codfw.wmnet with reboot policy FORCED
19:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P20216 and previous config saved to /var/cache/conftool/dbconfig/20220207-191003-ladsgroup.json
19:08 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:05 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Turn on wgVectorLanguageAlertInSidebar for all wikis (T300559) (duration: 00m 49s)
19:03 pt1979@cumin2002: START - Cookbook sre.dns.netbox
18:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20215 and previous config saved to /var/cache/conftool/dbconfig/20220207-185459-ladsgroup.json
18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T298554)', diff saved to https://phabricator.wikimedia.org/P20214 and previous config saved to /var/cache/conftool/dbconfig/20220207-183059-ladsgroup.json
18:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
18:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
18:20 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS buster
18:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
18:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
18:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
18:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20213 and previous config saved to /var/cache/conftool/dbconfig/20220207-180857-ladsgroup.json
18:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on restbase2020.codfw.wmnet with reason: Firmware upgrade
18:02 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on restbase2020.codfw.wmnet with reason: Firmware upgrade
18:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on restbase2019.codfw.wmnet with reason: Firmware upgrade
18:02 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on restbase2019.codfw.wmnet with reason: Firmware upgrade
18:01 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
17:56 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2020.wmnet
17:56 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase2019.wmnet
17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20212 and previous config saved to /var/cache/conftool/dbconfig/20220207-175352-ladsgroup.json
17:51 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
17:42 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2042.mgmt.codfw.wmnet with reboot policy FORCED
17:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P20211 and previous config saved to /var/cache/conftool/dbconfig/20220207-173848-ladsgroup.json
17:26 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2042.mgmt.codfw.wmnet with reboot policy FORCED
17:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2030.codfw.wmnet with OS buster
17:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20210 and previous config saved to /var/cache/conftool/dbconfig/20220207-172343-ladsgroup.json
16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T298554)', diff saved to https://phabricator.wikimedia.org/P20209 and previous config saved to /var/cache/conftool/dbconfig/20220207-165952-ladsgroup.json
16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20208 and previous config saved to /var/cache/conftool/dbconfig/20220207-165944-ladsgroup.json
16:55 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2030.codfw.wmnet with OS buster
16:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2029.codfw.wmnet with OS buster
16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20207 and previous config saved to /var/cache/conftool/dbconfig/20220207-164439-ladsgroup.json
16:41 moritzm: switch kubestagetcd2003 to plain disk storage
16:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2003.codfw.wmnet with reason: Switch to plain disk storage
16:38 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2003.codfw.wmnet with reason: Switch to plain disk storage
16:30 moritzm: switch kubestagetcd2002 to plain disk storage
16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P20206 and previous config saved to /var/cache/conftool/dbconfig/20220207-162935-ladsgroup.json
16:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch to plain disk storage
16:29 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2002.codfw.wmnet with reason: Switch to plain disk storage
16:24 moritzm: switch kubestagetcd2001 to plain disk storage
16:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch to plain disk storage
16:22 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd2001.codfw.wmnet with reason: Switch to plain disk storage
16:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2029.codfw.wmnet with OS buster
16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20205 and previous config saved to /var/cache/conftool/dbconfig/20220207-161430-ladsgroup.json
16:05 moritzm: migrating instances off ganeti1021
16:04 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS bullseye
16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T298554)', diff saved to https://phabricator.wikimedia.org/P20204 and previous config saved to /var/cache/conftool/dbconfig/20220207-160441-ladsgroup.json
16:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
16:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20203 and previous config saved to /var/cache/conftool/dbconfig/20220207-160433-ladsgroup.json
15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20201 and previous config saved to /var/cache/conftool/dbconfig/20220207-154928-ladsgroup.json
15:47 moritzm: installing pillow security updates
15:44 jayme@deploy1002: Finished deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided) (duration: 02m 30s)
15:41 jayme@deploy1002: Started deploy [restbase/deploy@0848b15] (dev-cluster): (no justification provided)
15:40 jayme: updated scap to 4.3.0 on A:mw-canary, A:parsoid-canary, A:mw-jobrunner-canary, A:restbase-canary - T300804
15:37 jayme: uploaded scap 4.3-0 to apt.w.o - T300804
15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P20200 and previous config saved to /var/cache/conftool/dbconfig/20220207-153424-ladsgroup.json
15:30 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS bullseye
15:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20199 and previous config saved to /var/cache/conftool/dbconfig/20220207-151917-ladsgroup.json
15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T298554)', diff saved to https://phabricator.wikimedia.org/P20198 and previous config saved to /var/cache/conftool/dbconfig/20220207-151018-ladsgroup.json
15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20197 and previous config saved to /var/cache/conftool/dbconfig/20220207-150959-ladsgroup.json
14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20196 and previous config saved to /var/cache/conftool/dbconfig/20220207-145454-ladsgroup.json
14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P20195 and previous config saved to /var/cache/conftool/dbconfig/20220207-143950-ladsgroup.json
14:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20194 and previous config saved to /var/cache/conftool/dbconfig/20220207-142445-ladsgroup.json
14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T298554)', diff saved to https://phabricator.wikimedia.org/P20193 and previous config saved to /var/cache/conftool/dbconfig/20220207-141452-ladsgroup.json
14:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
14:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
13:14 jbond: update ferm on bullseye
13:12 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1020.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
13:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1020.eqiad.wmnet
13:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1020.eqiad.wmnet
12:44 moritzm: installing ruby2.7 security updates
12:40 volans@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc2043.mgmt.codfw.wmnet with reboot policy FORCED
12:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:34 moritzm: revert kubestagetcd1006 to plain disk storage
12:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:32 taavi: UTC morning deploys done
12:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1006.eqiad.wmnet with reason: Switch to plain disk storage
12:32 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Ensure GlobalBlocking is not loaded without CentralAuth (T299371) (2/2) (duration: 00m 48s)
12:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1006.eqiad.wmnet with reason: Switch to plain disk storage
12:31 moritzm: revert kubestagetcd1005 to plain disk storage
12:31 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: Ensure GlobalBlocking is not loaded without CentralAuth (T299371) (1/2) (duration: 00m 48s)
12:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:27 taavi@deploy1002: Synchronized w/robots.php: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (3/3) (duration: 00m 48s)
12:26 taavi@deploy1002: Synchronized wmf-config: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (2/3) (duration: 00m 48s)
12:25 taavi@deploy1002: Synchronized multiversion: Config: Migrate $wmfRealm calls to $wmgRealm (T45956) (1/3) (duration: 00m 48s)
12:25 volans@cumin2002: START - Cookbook sre.hosts.provision for host mc2043.mgmt.codfw.wmnet with reboot policy FORCED
12:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1005.eqiad.wmnet with reason: Switch to plain disk storage
12:22 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1005.eqiad.wmnet with reason: Switch to plain disk storage
12:19 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Remove redundant patrolmarks flag from patroller usergroup (T300913) (duration: 00m 48s)
12:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:17 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1009.eqiad.wmnet
12:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:09 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Stop capturing media change tags (T286362) (2/2) (duration: 00m 50s)
12:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:08 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: Stop capturing media change tags (T286362) (1/2) (duration: 00m 50s)
12:07 moritzm: revert kubestagetcd1004 to plain disk storage
12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: Switch to plain disk storage
12:06 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on kubestagetcd1004.eqiad.wmnet with reason: Switch to plain disk storage
11:59 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1008.eqiad.wmnet
11:40 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1007.eqiad.wmnet
11:18 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on production
11:18 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
11:18 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: sync on production
11:15 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production
11:14 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on staging
11:14 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: sync on production
11:00 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1006.eqiad.wmnet
10:51 mmandere: rolling upgrade of varnish from version 6.0.9 to 6.0.10 across DCs T300264
10:49 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus2004.codfw.wmnet
10:49 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus1004.eqiad.wmnet
10:22 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1005.eqiad.wmnet
09:59 btullis@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=aqs1004.eqiad.wmnet
09:21 godog: temp-disable mfa for 'filippo' - T296629
09:09 jayme: uncordoned kubernetes1014 - T301099
08:02 jayme: powercycle kubernetes1014 - T301099
06:20 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on kubernetes1014.eqiad.wmnet with reason: potential HW error
06:20 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on kubernetes1014.eqiad.wmnet with reason: potential HW error
06:10 jayme: draining kubernetes1014

2022-02-05

22:10 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
21:28 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2003-dev.codfw.wmnet with OS bullseye
20:15 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
19:29 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2002-dev.codfw.wmnet with OS bullseye
18:48 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
17:53 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
16:54 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
06:11 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
06:09 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt2001-dev.codfw.wmnet with OS bullseye
05:41 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt2001-dev.codfw.wmnet with OS bullseye

2022-02-04

23:43 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
23:43 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
23:02 inflatador: bking@deployment-puppetmaster04 local commit to public/private repo, see T299797 for more details
22:37 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
22:36 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
19:44 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudservices2002-dev.wikimedia.org with OS bullseye
18:52 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudservices2002-dev.wikimedia.org with OS bullseye
17:00 arturo: add mcrouter 2022.01.31.00-1 to bullseye-wikimedia (T300578)
16:48 jbond: add new ferm package ferm_2.5.1-1+wmf11u2
16:38 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:35 pt1979@cumin2002: START - Cookbook sre.dns.netbox
16:05 elukey: unmask prometheus-mysqld-exporter.service and clean up the old @analytics + wmf_auto_restart units (service+timer) not used anymore on an-coord100[12]
14:25 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
14:18 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
12:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1020.eqiad.wmnet with OS buster
11:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20174 and previous config saved to /var/cache/conftool/dbconfig/20220204-114117-root.json
11:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20173 and previous config saved to /var/cache/conftool/dbconfig/20220204-112613-root.json
11:14 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1020.eqiad.wmnet with OS buster
11:13 akosiaris@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20172 and previous config saved to /var/cache/conftool/dbconfig/20220204-111110-root.json
11:07 akosiaris@cumin1001: START - Cookbook sre.dns.netbox
11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Remove all special groups from s1 codfw T263127', diff saved to https://phabricator.wikimedia.org/P20171 and previous config saved to /var/cache/conftool/dbconfig/20220204-110427-marostegui.json
10:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20170 and previous config saved to /var/cache/conftool/dbconfig/20220204-105606-root.json
10:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P20165 and previous config saved to /var/cache/conftool/dbconfig/20220204-104102-root.json
10:40 moritzm: rebalancing row A in ganeti/eqiad, all nodes of that row are now running Buster T296721
10:03 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1008.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
10:02 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1008.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1008.eqiad.wmnet
09:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1008.eqiad.wmnet
08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Remove watchlist group from s4 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P20164 and previous config saved to /var/cache/conftool/dbconfig/20220204-082010-marostegui.json
07:18 elukey: `git checkout main.html` on miscweb1002:/srv/org/wikidata/query to avoid puppet corrective actions (and the host being listed in alarms)
07:09 elukey: cleanup wmf_auto_restart_prometheus-mysqld-exporter@analytics-meta on an-test-coord1001 and unmasked wmf_auto_restart_prometheus-mysqld-exporter (now used)
07:03 elukey: clean up wmf_auto_restart_prometheus-mysqld-exporter@matomo on matomo1002 (not used anymore, listed as failed)
07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 schema change', diff saved to https://phabricator.wikimedia.org/P20163 and previous config saved to /var/cache/conftool/dbconfig/20220204-070003-marostegui.json
06:00 legoktm: uploaded pygments 2.11.2 to apt.wm.o (T298399)
02:48 ryankemper@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic2035.codfw.wmnet
02:42 ryankemper@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts elastic2035.codfw.wmnet
02:41 ryankemper@cumin1001: START - Cookbook sre.hosts.decommission for hosts elastic2035.codfw.wmnet
01:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
01:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
01:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
01:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
01:04 brennen: for-real end of utc late backport & config window
01:04 brennen@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/Thanks/modules/ext.thanks.flowthank.js: Backport: Correct attribute for flow thanks (T300831) (duration: 00m 49s)
00:50 brennen: reopening utc late backport window for Correct attribute for flow thanks (T300831)
00:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:12 cjming: end of UTC late backport & config window
00:11 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Update icons, wordmark for test wikis (T299512) (duration: 00m 49s)
00:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:10 cjming@deploy1002: Synchronized static/images/mobile/copyright/: Config: Update icons, wordmark for test wikis (T299512) (duration: 00m 53s)
00:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

2022-02-03

23:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20159 and previous config saved to /var/cache/conftool/dbconfig/20220203-233447-marostegui.json
23:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20158 and previous config saved to /var/cache/conftool/dbconfig/20220203-231942-marostegui.json
23:15 ryankemper: T294805 Added a silence on alerts.wikimedia.org for `CirrusSearchJVMGCOldPoolFlatlined`
23:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P20157 and previous config saved to /var/cache/conftool/dbconfig/20220203-230437-marostegui.json
22:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20156 and previous config saved to /var/cache/conftool/dbconfig/20220203-224933-marostegui.json
22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20155 and previous config saved to /var/cache/conftool/dbconfig/20220203-223923-marostegui.json
22:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
22:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
22:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20154 and previous config saved to /var/cache/conftool/dbconfig/20220203-223916-marostegui.json
22:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20153 and previous config saved to /var/cache/conftool/dbconfig/20220203-222411-marostegui.json
22:18 ryankemper: T294805 Monitoring https://grafana.wikimedia.org/d/000000455/elasticsearch-percentiles?orgId=1&var-cirrus_group=eqiad&var-cluster=elasticsearch&var-exported_cluster=production-search&var-smoothing=1&refresh=1m&from=now-3h&to=now as new hosts join the fleet
22:18 ryankemper: T294805 Bringing in new eqiad hosts in batches of 4, with 15-20 mins between batches: `ryankemper@cumin1001:~$ sudo -E cumin -b 4 'elastic1*' 'sudo run-puppet-agent --force; sudo run-puppet-agent; sleep 900'` tmux session `es_eqiad`
22:13 ryankemper: T294805 https://gerrit.wikimedia.org/r/c/operations/puppet/+/759617/ fixed the dependency issues, going to start bringing new hosts into service
22:09 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P20152 and previous config saved to /var/cache/conftool/dbconfig/20220203-220906-marostegui.json
22:05 eileen: civicrm revision 7dcdc017 -> 04cbf35b
22:04 volans@cumin2002: START - Cookbook sre.dns.netbox
21:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20150 and previous config saved to /var/cache/conftool/dbconfig/20220203-215402-marostegui.json
21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T300402)', diff saved to https://phabricator.wikimedia.org/P20149 and previous config saved to /var/cache/conftool/dbconfig/20220203-215154-marostegui.json
21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1177.eqiad.wmnet with reason: Maintenance
21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2079.codfw.wmnet with reason: Maintenance
21:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
21:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
21:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20148 and previous config saved to /var/cache/conftool/dbconfig/20220203-215121-marostegui.json
21:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20147 and previous config saved to /var/cache/conftool/dbconfig/20220203-213616-marostegui.json
21:28 rzl: root@apt1001:/home/rzl# reprepro copy bullseye-wikimedia buster-wikimedia envoyproxy # T300324
21:27 rzl: root@apt1001:/home/rzl# reprepro copy stretch-wikimedia buster-wikimedia envoyproxy # T300324
21:21 ryankemper: T294805 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/759588; hoping this resolves dependency issues. Running puppet agent on `elastic1068`
21:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P20145 and previous config saved to /var/cache/conftool/dbconfig/20220203-212111-marostegui.json
21:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20144 and previous config saved to /var/cache/conftool/dbconfig/20220203-210607-marostegui.json
21:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T300402)', diff saved to https://phabricator.wikimedia.org/P20143 and previous config saved to /var/cache/conftool/dbconfig/20220203-210358-marostegui.json
21:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
21:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1172.eqiad.wmnet with reason: Maintenance
21:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20142 and previous config saved to /var/cache/conftool/dbconfig/20220203-210350-marostegui.json
20:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20140 and previous config saved to /var/cache/conftool/dbconfig/20220203-204846-marostegui.json
20:43 rzl: rzl@mwmaint1002:~$ sudo systemctl start mediawiki_job_recount_categories.service # T299823
20:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P20139 and previous config saved to /var/cache/conftool/dbconfig/20220203-203341-marostegui.json
20:26 ryankemper: T294805 Running puppet on `elastic1068` failed, looks like `/usr/share/elasticsearch/lib` wasn't there: https://phabricator.wikimedia.org/P20138
20:25 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mx1001.wikimedia.org with reason: systemd testing
20:25 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mx1001.wikimedia.org with reason: systemd testing
20:22 ryankemper: T294805 Running puppet on single elastic host: `ryankemper@elastic1068:~$ sudo run-puppet-agent --force`
20:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20137 and previous config saved to /var/cache/conftool/dbconfig/20220203-201836-marostegui.json
20:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T300402)', diff saved to https://phabricator.wikimedia.org/P20136 and previous config saved to /var/cache/conftool/dbconfig/20220203-201729-marostegui.json
20:17 ryankemper: T294805 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/759317 to activate roles for elastic eqiad replacement hosts
20:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
20:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1126.eqiad.wmnet with reason: Maintenance
20:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20135 and previous config saved to /var/cache/conftool/dbconfig/20220203-201721-marostegui.json
20:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:16 ryankemper: T294805 Disabled puppet on `elastic1*` in preparation for bringing new hosts into service: `ryankemper@cumin1001:~$ sudo cumin 'elastic1*' 'sudo disable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805"'`
20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:13 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1003.eqiad.wmnet with OS buster
20:11 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.38.0-wmf.20 refs T293961
20:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:08 mutante: planet1002/planet2002 - sudo systemctl start planet-update-en to manually start update after adding diff.wikimedia.org T230444
20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:07 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/Hooks.php: Backport: Drop skin override (T300814) (2/2) (duration: 00m 49s)
20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:06 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/skin.json: Backport: Drop skin override (T300814) (1/2) (duration: 00m 49s)
20:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudbackup1004.eqiad.wmnet with OS buster
20:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20134 and previous config saved to /var/cache/conftool/dbconfig/20220203-200217-marostegui.json
19:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P20133 and previous config saved to /var/cache/conftool/dbconfig/20220203-194712-marostegui.json
19:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
19:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:41 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudbackup1003.eqiad.wmnet with OS buster
19:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS buster
19:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:39 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudbackup1004.eqiad.wmnet with OS buster
19:35 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1003.eqiad.wmnet with OS buster
19:34 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/Hooks.php: Backport: Pass skin name to Hooks::isSkinLegacy (T299971) (duration: 00m 49s)
19:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:33 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/ContentTranslation/modules/entrypoints/ext.cx.entrypoints.contributionsmenu.js: Backport: Update skin checks with new vector skin key. (T298916 T300814) (duration: 00m 50s)
19:33 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudbackup1004.eqiad.wmnet with OS buster
19:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20132 and previous config saved to /var/cache/conftool/dbconfig/20220203-193208-marostegui.json
19:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:29 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/modules/ext.wikiEditor.js: Backport: New bucket for abtest data (T291308) (2/2) (duration: 00m 50s)
19:28 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/includes/Hooks.php: Backport: New bucket for abtest data (T291308) (1/2) (duration: 00m 49s)
19:27 taavi@deploy1002: Synchronized php-1.38.0-wmf.20/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.trackSubscriber.js: Backport: New bucket for abtest data (T291308) (duration: 00m 50s)
19:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:26 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: commonswiki: Add three domains to the wgCopyUploadsDomains allowlist (T299835 T300848) (duration: 00m 54s)
19:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
18:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:42 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
18:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T300402)', diff saved to https://phabricator.wikimedia.org/P20131 and previous config saved to /var/cache/conftool/dbconfig/20220203-183648-marostegui.json
18:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
18:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1167.eqiad.wmnet with reason: Maintenance
18:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20130 and previous config saved to /var/cache/conftool/dbconfig/20220203-183634-marostegui.json
18:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20129 and previous config saved to /var/cache/conftool/dbconfig/20220203-182129-marostegui.json
18:17 dancy: restarted php7.2-fpm processes on mediawiki12
18:10 dancy: killed 8 spinning php7.2-fpm processes on mediawiki12
18:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P20128 and previous config saved to /var/cache/conftool/dbconfig/20220203-180624-marostegui.json
17:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20127 and previous config saved to /var/cache/conftool/dbconfig/20220203-175120-marostegui.json
17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T300402)', diff saved to https://phabricator.wikimedia.org/P20126 and previous config saved to /var/cache/conftool/dbconfig/20220203-174913-marostegui.json
17:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
17:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1114.eqiad.wmnet with reason: Maintenance
17:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20125 and previous config saved to /var/cache/conftool/dbconfig/20220203-174905-marostegui.json
17:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20122 and previous config saved to /var/cache/conftool/dbconfig/20220203-173400-marostegui.json
17:22 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts restbase2011.codfw.wmnet
17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P20120 and previous config saved to /var/cache/conftool/dbconfig/20220203-171856-marostegui.json
17:13 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
17:12 hnowlan@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts restbase2011.codfw.wmnet
17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20118 and previous config saved to /var/cache/conftool/dbconfig/20220203-170351-marostegui.json
17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T300402)', diff saved to https://phabricator.wikimedia.org/P20117 and previous config saved to /var/cache/conftool/dbconfig/20220203-170144-marostegui.json
17:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
17:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1111.eqiad.wmnet with reason: Maintenance
17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20116 and previous config saved to /var/cache/conftool/dbconfig/20220203-170136-marostegui.json
16:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20115 and previous config saved to /var/cache/conftool/dbconfig/20220203-164632-marostegui.json
16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318', diff saved to https://phabricator.wikimedia.org/P20114 and previous config saved to /var/cache/conftool/dbconfig/20220203-163127-marostegui.json
16:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20113 and previous config saved to /var/cache/conftool/dbconfig/20220203-162316-marostegui.json
16:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20111 and previous config saved to /var/cache/conftool/dbconfig/20220203-161622-marostegui.json
16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3318 (T300402)', diff saved to https://phabricator.wikimedia.org/P20110 and previous config saved to /var/cache/conftool/dbconfig/20220203-161515-marostegui.json
16:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
16:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
16:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20109 and previous config saved to /var/cache/conftool/dbconfig/20220203-161508-marostegui.json
16:10 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2030.mgmt.codfw.wmnet with reboot policy FORCED
16:10 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2030.mgmt.codfw.wmnet with reboot policy FORCED
16:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20108 and previous config saved to /var/cache/conftool/dbconfig/20220203-160811-marostegui.json
16:00 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
16:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20107 and previous config saved to /var/cache/conftool/dbconfig/20220203-160003-marostegui.json
15:55 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts restbase2011.codfw.wmnet
15:55 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts restbase2011.codfw.wmnet
15:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164', diff saved to https://phabricator.wikimedia.org/P20106 and previous config saved to /var/cache/conftool/dbconfig/20220203-155306-marostegui.json
15:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P20105 and previous config saved to /var/cache/conftool/dbconfig/20220203-154458-marostegui.json
15:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20104 and previous config saved to /var/cache/conftool/dbconfig/20220203-153801-marostegui.json
15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1164 (T298558)', diff saved to https://phabricator.wikimedia.org/P20103 and previous config saved to /var/cache/conftool/dbconfig/20220203-153653-marostegui.json
15:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
15:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1164.eqiad.wmnet with reason: Maintenance
15:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20102 and previous config saved to /var/cache/conftool/dbconfig/20220203-153646-marostegui.json
15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
15:34 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
15:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20101 and previous config saved to /var/cache/conftool/dbconfig/20220203-152953-marostegui.json
15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T300402)', diff saved to https://phabricator.wikimedia.org/P20100 and previous config saved to /var/cache/conftool/dbconfig/20220203-152746-marostegui.json
15:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1178.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20099 and previous config saved to /var/cache/conftool/dbconfig/20220203-152739-marostegui.json
15:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20098 and previous config saved to /var/cache/conftool/dbconfig/20220203-152141-marostegui.json
15:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20097 and previous config saved to /var/cache/conftool/dbconfig/20220203-151234-marostegui.json
15:12 moritzm: installing apache security updates on gerrit1001
15:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P20096 and previous config saved to /var/cache/conftool/dbconfig/20220203-150636-marostegui.json
14:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P20095 and previous config saved to /var/cache/conftool/dbconfig/20220203-145729-marostegui.json
14:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20094 and previous config saved to /var/cache/conftool/dbconfig/20220203-145132-marostegui.json
14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20093 and previous config saved to /var/cache/conftool/dbconfig/20220203-145024-marostegui.json
14:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
14:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20092 and previous config saved to /var/cache/conftool/dbconfig/20220203-145017-marostegui.json
14:44 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
14:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20091 and previous config saved to /var/cache/conftool/dbconfig/20220203-144224-marostegui.json
14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (T300402)', diff saved to https://phabricator.wikimedia.org/P20090 and previous config saved to /var/cache/conftool/dbconfig/20220203-144017-marostegui.json
14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1104.eqiad.wmnet with reason: Maintenance
14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1116.eqiad.wmnet with reason: Maintenance
14:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
14:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20089 and previous config saved to /var/cache/conftool/dbconfig/20220203-143544-marostegui.json
14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20088 and previous config saved to /var/cache/conftool/dbconfig/20220203-143512-marostegui.json
14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20087 and previous config saved to /var/cache/conftool/dbconfig/20220203-142039-marostegui.json
14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P20086 and previous config saved to /var/cache/conftool/dbconfig/20220203-142007-marostegui.json
14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P20085 and previous config saved to /var/cache/conftool/dbconfig/20220203-140534-marostegui.json
14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20084 and previous config saved to /var/cache/conftool/dbconfig/20220203-140503-marostegui.json
13:53 XioNoX: eqiad: push Capirca generated border-in filters
13:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20083 and previous config saved to /var/cache/conftool/dbconfig/20220203-135029-marostegui.json
13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T298558)', diff saved to https://phabricator.wikimedia.org/P20082 and previous config saved to /var/cache/conftool/dbconfig/20220203-134952-marostegui.json
13:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
13:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1119.eqiad.wmnet with reason: Maintenance
13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20081 and previous config saved to /var/cache/conftool/dbconfig/20220203-134944-marostegui.json
13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T300402)', diff saved to https://phabricator.wikimedia.org/P20080 and previous config saved to /var/cache/conftool/dbconfig/20220203-134746-marostegui.json
13:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20079 and previous config saved to /var/cache/conftool/dbconfig/20220203-134739-marostegui.json
13:44 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:40 jayme@cumin1001: START - Cookbook sre.dns.netbox
13:35 jbond: disable puppet fleet wide for puppetdb restart
13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20078 and previous config saved to /var/cache/conftool/dbconfig/20220203-133439-marostegui.json
13:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20077 and previous config saved to /var/cache/conftool/dbconfig/20220203-133234-marostegui.json
13:28 marostegui: Test T300858
13:28 moritzm: installing apache security updates
13:27 jayme: moved kubernetes staging master,nodes,etcd from wikimedia_cluster "kubernetes" to "kubernetes-staging" - T273866
13:27 XioNoX: esams: push Capirca generated border-in filters
13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P20076 and previous config saved to /var/cache/conftool/dbconfig/20220203-131935-marostegui.json
13:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P20075 and previous config saved to /var/cache/conftool/dbconfig/20220203-131729-marostegui.json
13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
13:15 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1020.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
13:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20074 and previous config saved to /var/cache/conftool/dbconfig/20220203-130430-marostegui.json
13:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20073 and previous config saved to /var/cache/conftool/dbconfig/20220203-130224-marostegui.json
12:58 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
12:57 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T300402)', diff saved to https://phabricator.wikimedia.org/P20072 and previous config saved to /var/cache/conftool/dbconfig/20220203-125737-marostegui.json
12:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
12:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20071 and previous config saved to /var/cache/conftool/dbconfig/20220203-125730-marostegui.json
12:53 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
12:53 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
12:53 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
12:52 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
12:51 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
12:49 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
12:49 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
12:49 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
12:49 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
12:48 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
12:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
12:44 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on staging
12:44 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
12:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
12:43 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
12:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20069 and previous config saved to /var/cache/conftool/dbconfig/20220203-124225-marostegui.json
12:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:38 taavi: UTC morning backport window done
12:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:33 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: mniwiktionary: Add localized mobile wordmark (T294709) (2/2) (duration: 00m 49s)
12:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:32 taavi@deploy1002: Synchronized static/images/mobile/copyright/wiktionary-wordmark-mni.svg: Config: mniwiktionary: Add localized mobile wordmark (T294709) (1/2) (duration: 00m 50s)
12:29 XioNoX: eqsin: push Capirca generated border-in filters
12:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P20068 and previous config saved to /var/cache/conftool/dbconfig/20220203-122720-marostegui.json
12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T298558)', diff saved to https://phabricator.wikimedia.org/P20067 and previous config saved to /var/cache/conftool/dbconfig/20220203-122612-marostegui.json
12:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
12:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
12:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
12:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1106.eqiad.wmnet with reason: Maintenance
12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Maintenance
12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Maintenance
12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2103.codfw.wmnet with reason: Maintenance
12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20066 and previous config saved to /var/cache/conftool/dbconfig/20220203-122529-marostegui.json
12:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:19 XioNoX: codfw: push Capirca generated border-in filters
12:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:16 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: commonswiki: Add www.gbols.smns-bw.org to the wgCopyUploadsDomains allowlist (T300842) (duration: 00m 50s)
12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20065 and previous config saved to /var/cache/conftool/dbconfig/20220203-121216-marostegui.json
12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20064 and previous config saved to /var/cache/conftool/dbconfig/20220203-121024-marostegui.json
12:10 XioNoX: eqord: push Capirca generated border-in filters
12:09 mlitn@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [WikibaseMediaInfo] Stop normalizing full text scores (T296631) (duration: 00m 52s)
12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T300402)', diff saved to https://phabricator.wikimedia.org/P20063 and previous config saved to /var/cache/conftool/dbconfig/20220203-120832-marostegui.json
12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20062 and previous config saved to /var/cache/conftool/dbconfig/20220203-120825-marostegui.json
11:57 kart_: Updated cxserver to 2022-02-03-112745-production, this should unbreak Flores MT!
11:57 XioNoX: ulsfo: push Capirca generated border-in filters
11:55 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: sync on production
11:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P20061 and previous config saved to /var/cache/conftool/dbconfig/20220203-115519-marostegui.json
11:53 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply on staging
11:53 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply on production
11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20060 and previous config saved to /var/cache/conftool/dbconfig/20220203-115320-marostegui.json
11:51 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: sync on production
11:49 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply on staging
11:49 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply on production
11:47 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: sync on staging
11:46 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply on production
11:46 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply on staging
11:45 moritzm: installing openjdk-11 security updates
11:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20059 and previous config saved to /var/cache/conftool/dbconfig/20220203-114015-marostegui.json
11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T298558)', diff saved to https://phabricator.wikimedia.org/P20058 and previous config saved to /var/cache/conftool/dbconfig/20220203-113907-marostegui.json
11:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
11:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1184.eqiad.wmnet with reason: Maintenance
11:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20057 and previous config saved to /var/cache/conftool/dbconfig/20220203-113859-marostegui.json
11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P20056 and previous config saved to /var/cache/conftool/dbconfig/20220203-113815-marostegui.json
11:36 arturo: reprepro changes @ apt1001 after merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/758050
11:33 moritzm: draining ganeti1020 for eventual reimage
11:26 vgutierrez: rolling varnish-fe restart to catch the new listen_depth config value
11:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20055 and previous config saved to /var/cache/conftool/dbconfig/20220203-112355-marostegui.json
11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20054 and previous config saved to /var/cache/conftool/dbconfig/20220203-112311-marostegui.json
11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T300402)', diff saved to https://phabricator.wikimedia.org/P20053 and previous config saved to /var/cache/conftool/dbconfig/20220203-111921-marostegui.json
11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
11:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20052 and previous config saved to /var/cache/conftool/dbconfig/20220203-111908-marostegui.json
11:15 topranks: Adding BGP peering to lsw1-f1-eqiad on cr2-eqiad. T299758.
11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P20051 and previous config saved to /var/cache/conftool/dbconfig/20220203-110850-marostegui.json
11:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20050 and previous config saved to /var/cache/conftool/dbconfig/20220203-110403-marostegui.json
10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20049 and previous config saved to /var/cache/conftool/dbconfig/20220203-105345-marostegui.json
10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T298558)', diff saved to https://phabricator.wikimedia.org/P20048 and previous config saved to /var/cache/conftool/dbconfig/20220203-105238-marostegui.json
10:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
10:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1099.eqiad.wmnet with reason: Maintenance
10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20047 and previous config saved to /var/cache/conftool/dbconfig/20220203-105230-marostegui.json
10:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P20046 and previous config saved to /var/cache/conftool/dbconfig/20220203-104858-marostegui.json
10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20045 and previous config saved to /var/cache/conftool/dbconfig/20220203-103725-marostegui.json
10:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20044 and previous config saved to /var/cache/conftool/dbconfig/20220203-103354-marostegui.json
10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20043 and previous config saved to /var/cache/conftool/dbconfig/20220203-103008-marostegui.json
10:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
10:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20042 and previous config saved to /var/cache/conftool/dbconfig/20220203-103001-marostegui.json
10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P20041 and previous config saved to /var/cache/conftool/dbconfig/20220203-102221-marostegui.json
10:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20040 and previous config saved to /var/cache/conftool/dbconfig/20220203-101456-marostegui.json
10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1015.eqiad.wmnet
10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1014.eqiad.wmnet
10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20039 and previous config saved to /var/cache/conftool/dbconfig/20220203-100716-marostegui.json
10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1013.eqiad.wmnet
10:07 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1012.eqiad.wmnet
10:06 btullis@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1010.eqiad.wmnet
10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1015.eqiad.wmnet
10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1014.eqiad.wmnet
10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1013.eqiad.wmnet
10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1012.eqiad.wmnet
10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1011.eqiad.wmnet
10:06 btullis@puppetmaster1001: conftool action : set/weight=10; selector: name=aqs1010.eqiad.wmnet
09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P20038 and previous config saved to /var/cache/conftool/dbconfig/20220203-095952-marostegui.json
09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T298558)', diff saved to https://phabricator.wikimedia.org/P20037 and previous config saved to /var/cache/conftool/dbconfig/20220203-095907-marostegui.json
09:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
09:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1135.eqiad.wmnet with reason: Maintenance
09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20036 and previous config saved to /var/cache/conftool/dbconfig/20220203-095859-marostegui.json
09:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1183.eqiad.wmnet with OS bullseye
09:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20034 and previous config saved to /var/cache/conftool/dbconfig/20220203-094447-marostegui.json
09:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20033 and previous config saved to /var/cache/conftool/dbconfig/20220203-094354-marostegui.json
09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T300402)', diff saved to https://phabricator.wikimedia.org/P20032 and previous config saved to /var/cache/conftool/dbconfig/20220203-094107-marostegui.json
09:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
09:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
09:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20031 and previous config saved to /var/cache/conftool/dbconfig/20220203-094059-marostegui.json
09:31 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1183.eqiad.wmnet with OS bullseye
09:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P20030 and previous config saved to /var/cache/conftool/dbconfig/20220203-092850-marostegui.json
09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20029 and previous config saved to /var/cache/conftool/dbconfig/20220203-092554-marostegui.json
09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20028 and previous config saved to /var/cache/conftool/dbconfig/20220203-091345-marostegui.json
09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T298558)', diff saved to https://phabricator.wikimedia.org/P20027 and previous config saved to /var/cache/conftool/dbconfig/20220203-091237-marostegui.json
09:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
09:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1134.eqiad.wmnet with reason: Maintenance
09:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
09:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance
09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20026 and previous config saved to /var/cache/conftool/dbconfig/20220203-091224-marostegui.json
09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P20025 and previous config saved to /var/cache/conftool/dbconfig/20220203-091050-marostegui.json
09:00 marostegui: Failover m2 from db1183 to db1159 - T300329
08:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20024 and previous config saved to /var/cache/conftool/dbconfig/20220203-085720-marostegui.json
08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20023 and previous config saved to /var/cache/conftool/dbconfig/20220203-085545-marostegui.json
08:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T300402)', diff saved to https://phabricator.wikimedia.org/P20022 and previous config saved to /var/cache/conftool/dbconfig/20220203-085159-marostegui.json
08:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
08:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300402)', diff saved to https://phabricator.wikimedia.org/P20021 and previous config saved to /var/cache/conftool/dbconfig/20220203-085151-marostegui.json
08:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P20020 and previous config saved to /var/cache/conftool/dbconfig/20220203-084215-marostegui.json
08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20019 and previous config saved to /var/cache/conftool/dbconfig/20220203-083647-marostegui.json
08:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20018 and previous config saved to /var/cache/conftool/dbconfig/20220203-082710-marostegui.json
08:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T298558)', diff saved to https://phabricator.wikimedia.org/P20017 and previous config saved to /var/cache/conftool/dbconfig/20220203-082302-marostegui.json
08:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
08:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1163.eqiad.wmnet with reason: Maintenance
08:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
08:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298558)', diff saved to https://phabricator.wikimedia.org/P20016 and previous config saved to /var/cache/conftool/dbconfig/20220203-082249-marostegui.json
08:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P20015 and previous config saved to /var/cache/conftool/dbconfig/20220203-082142-marostegui.json
08:10 dcausse: restarting blazegraph on wdqs1013 (jvm stuck for 5 hours)
08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20014 and previous config saved to /var/cache/conftool/dbconfig/20220203-080745-marostegui.json
08:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300402)', diff saved to https://phabricator.wikimedia.org/P20013 and previous config saved to /var/cache/conftool/dbconfig/20220203-080637-marostegui.json
08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T300402)', diff saved to https://phabricator.wikimedia.org/P20012 and previous config saved to /var/cache/conftool/dbconfig/20220203-080254-marostegui.json
08:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
08:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300402)', diff saved to https://phabricator.wikimedia.org/P20011 and previous config saved to /var/cache/conftool/dbconfig/20220203-080247-marostegui.json
07:55 _joe_: restarted php-fpm on wtp1029, segfaulting
07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P20010 and previous config saved to /var/cache/conftool/dbconfig/20220203-075240-marostegui.json
07:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20009 and previous config saved to /var/cache/conftool/dbconfig/20220203-074742-marostegui.json
07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T298558)', diff saved to https://phabricator.wikimedia.org/P20008 and previous config saved to /var/cache/conftool/dbconfig/20220203-073735-marostegui.json
07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P20007 and previous config saved to /var/cache/conftool/dbconfig/20220203-073237-marostegui.json
07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T298558)', diff saved to https://phabricator.wikimedia.org/P20006 and previous config saved to /var/cache/conftool/dbconfig/20220203-073129-marostegui.json
07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1169.eqiad.wmnet with reason: Maintenance
07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
07:23 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2078,2133].codfw.wmnet,db[1117,1159,1183].eqiad.wmnet with reason: Switchover m2 T300329
07:23 root@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2078,2133].codfw.wmnet,db[1117,1159,1183].eqiad.wmnet with reason: Switchover m2 T300329
07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T300402)', diff saved to https://phabricator.wikimedia.org/P20005 and previous config saved to /var/cache/conftool/dbconfig/20220203-071732-marostegui.json
07:14 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
07:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
07:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T300402)', diff saved to https://phabricator.wikimedia.org/P20004 and previous config saved to /var/cache/conftool/dbconfig/20220203-071348-marostegui.json
07:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
07:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
07:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
07:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
07:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
07:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
07:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300402)', diff saved to https://phabricator.wikimedia.org/P20003 and previous config saved to /var/cache/conftool/dbconfig/20220203-071141-marostegui.json
07:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298558)', diff saved to https://phabricator.wikimedia.org/P20002 and previous config saved to /var/cache/conftool/dbconfig/20220203-071111-marostegui.json
06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P20001 and previous config saved to /var/cache/conftool/dbconfig/20220203-065636-marostegui.json
06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P20000 and previous config saved to /var/cache/conftool/dbconfig/20220203-065606-marostegui.json
06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P19999 and previous config saved to /var/cache/conftool/dbconfig/20220203-064131-marostegui.json
06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P19998 and previous config saved to /var/cache/conftool/dbconfig/20220203-064101-marostegui.json
06:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T300402)', diff saved to https://phabricator.wikimedia.org/P19997 and previous config saved to /var/cache/conftool/dbconfig/20220203-062627-marostegui.json
06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T298558)', diff saved to https://phabricator.wikimedia.org/P19996 and previous config saved to /var/cache/conftool/dbconfig/20220203-062556-marostegui.json
06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T300402)', diff saved to https://phabricator.wikimedia.org/P19995 and previous config saved to /var/cache/conftool/dbconfig/20220203-062243-marostegui.json
06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
06:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
06:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
06:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
06:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
06:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
06:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T298558)', diff saved to https://phabricator.wikimedia.org/P19994 and previous config saved to /var/cache/conftool/dbconfig/20220203-061703-marostegui.json
06:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
06:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
01:12 brennen: UTC late backport window finished
01:11 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ganeti2029.codfw.wmnet with OS buster
01:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
01:09 brennen@deploy1002: Finished scap: Backports: Changes the labels of the Vector skins (T299927) and Pass skin name to Hooks::isSkinLegacy (T299971) (duration: 24m 48s)
01:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
01:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:44 brennen@deploy1002: Started scap: Backports: Changes the labels of the Vector skins (T299927) and Pass skin name to Hooks::isSkinLegacy (T299971)
00:43 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2029.codfw.wmnet with OS buster

2022-02-02

22:26 mutante: gitlab - introducing parameter to fetch TLS certs either with acmechief or certbot (if in cloud). Boolean $use_acmechief = lookup('profile::gitlab::use_acmechief'), confirmed noop in prod on gitlab1001.wikimedia.org (T297411)
21:36 ejegg: updated CiviCRM from 2bd5fb5e to 7dcdc017
20:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:04 dancy@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.20 refs T293961 (duration: 00m 49s)
20:03 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.20 refs T293961
19:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:49 dancy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: cowikimedia: Allow bureaucrats to remove sysop and bureaucrat flags (T300779) (duration: 00m 50s)
19:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:44 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:42 dancy@deploy1002: Synchronized multiversion/MWMultiVersion.php: Config: multiversion: Improve error message if wikiversions.php has wrong format (duration: 00m 49s)
19:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:37 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 62b2acb: Migration mode enabled everywhere (T299927) (duration: 00m 49s)
19:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:27 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.20/skins/Vector/includes/SkinVector.php: bdc20dd: Fix the opt in URl (T300097) (duration: 00m 49s)
19:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:24 urbanecm@deploy1002: Synchronized wmf-config/: a48f8bd: Migrate calls of wmf* constants to wmg* constants (T45956) (duration: 00m 51s)
19:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300402)', diff saved to https://phabricator.wikimedia.org/P19993 and previous config saved to /var/cache/conftool/dbconfig/20220202-191918-marostegui.json
19:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:14 urbanecm@deploy1002: Synchronized multiversion/buildConfigCache.php: 83f1f6a: Consistently write to $wmgRealm the same value as to $wmfRealm (T45956) (duration: 00m 49s)
19:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:10 urbanecm: Purge https://en.wikipedia.org/static/images/project-logos/{kywiki,kywiki-1.5x,kywiki-2x}.png (T300241)
19:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:09 topranks: Running homer to enable interface et-1/0/2 on cr1-eqiad (towards lsw1-e1-eqiad) to test connectivity.
19:09 urbanecm@deploy1002: Synchronized logos/config.yaml: 335cbee: kywiki: update logo (3/3; T300241) (duration: 00m 49s)
19:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:08 urbanecm@deploy1002: Synchronized wmf-config/logos.php: 335cbee: kywiki: update logo (2/3; T300241) (duration: 00m 53s)
19:07 urbanecm@deploy1002: Synchronized static/images/project-logos/: 335cbee: kywiki: update logo (1/3; T300241) (duration: 00m 50s)
19:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19992 and previous config saved to /var/cache/conftool/dbconfig/20220202-190414-marostegui.json
18:52 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
18:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19991 and previous config saved to /var/cache/conftool/dbconfig/20220202-184909-marostegui.json
18:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T300402)', diff saved to https://phabricator.wikimedia.org/P19990 and previous config saved to /var/cache/conftool/dbconfig/20220202-183404-marostegui.json
18:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T300402)', diff saved to https://phabricator.wikimedia.org/P19989 and previous config saved to /var/cache/conftool/dbconfig/20220202-183034-marostegui.json
18:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
18:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300402)', diff saved to https://phabricator.wikimedia.org/P19988 and previous config saved to /var/cache/conftool/dbconfig/20220202-183027-marostegui.json
18:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
18:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
18:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
18:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
18:17 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
18:16 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.20/includes/filerepo/file/ForeignAPIFile.php: Backport: Revert "Support audio on filepage in InstantCommons" (T300751) (duration: 00m 51s)
18:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19987 and previous config saved to /var/cache/conftool/dbconfig/20220202-181522-marostegui.json
18:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19986 and previous config saved to /var/cache/conftool/dbconfig/20220202-180018-marostegui.json
17:45 cwhite: end logstash upgrade (codfw) T299168
17:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T300402)', diff saved to https://phabricator.wikimedia.org/P19985 and previous config saved to /var/cache/conftool/dbconfig/20220202-174513-marostegui.json
17:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T300402)', diff saved to https://phabricator.wikimedia.org/P19984 and previous config saved to /var/cache/conftool/dbconfig/20220202-174138-marostegui.json
17:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
17:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
17:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
17:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
17:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300402)', diff saved to https://phabricator.wikimedia.org/P19983 and previous config saved to /var/cache/conftool/dbconfig/20220202-174125-marostegui.json
17:32 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
17:26 cwhite: begin logstash upgrade (codfw) T299168
17:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19982 and previous config saved to /var/cache/conftool/dbconfig/20220202-172620-marostegui.json
17:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19981 and previous config saved to /var/cache/conftool/dbconfig/20220202-171115-marostegui.json
16:59 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
16:59 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
16:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T300402)', diff saved to https://phabricator.wikimedia.org/P19979 and previous config saved to /var/cache/conftool/dbconfig/20220202-165611-marostegui.json
16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2012.codfw.wmnet
16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2011.codfw.wmnet
16:47 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2010.codfw.wmnet
16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2012.codfw.wmnet
16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2011.codfw.wmnet
16:46 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2010.codfw.wmnet
16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2012.codfw.wmnet
16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2011.codfw.wmnet
16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2010.codfw.wmnet
16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2012.codfw.wmnet
16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2011.codfw.wmnet
16:45 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2010.codfw.wmnet
16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2005.codfw.wmnet
16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2006.codfw.wmnet
16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2007.codfw.wmnet
16:42 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2008.codfw.wmnet
16:41 Emperor: standardising nginx weights for codfw swift proxies to match eqiad ones T300738
16:41 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ms-fe2009.codfw.wmnet
16:41 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=nginx,name=ms-fe2009.codfw.wmnet
16:39 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
16:38 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
16:30 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
16:27 mvernon@puppetmaster1001: conftool action : set/pooled=yes; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
16:26 mvernon@puppetmaster1001: conftool action : set/weight=40; selector: service=swift-fe,name=ms-fe2009.codfw.wmnet
16:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T300402)', diff saved to https://phabricator.wikimedia.org/P19977 and previous config saved to /var/cache/conftool/dbconfig/20220202-162435-marostegui.json
16:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
16:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
16:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19976 and previous config saved to /var/cache/conftool/dbconfig/20220202-162428-marostegui.json
16:24 jbond: disable ldap email checks on mx2001
16:19 Emperor: rolling restart of swift frontends to bring new ones into service T300738
16:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19975 and previous config saved to /var/cache/conftool/dbconfig/20220202-160923-marostegui.json
15:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19974 and previous config saved to /var/cache/conftool/dbconfig/20220202-155418-marostegui.json
15:45 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
15:44 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:43 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
15:43 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:41 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
15:41 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19973 and previous config saved to /var/cache/conftool/dbconfig/20220202-153913-marostegui.json
15:37 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
15:37 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:35 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
15:35 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:34 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
15:34 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:32 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
15:32 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19972 and previous config saved to /var/cache/conftool/dbconfig/20220202-153206-marostegui.json
15:32 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
15:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
15:32 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
15:30 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
15:30 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
15:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
15:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
15:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
15:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300402)', diff saved to https://phabricator.wikimedia.org/P19970 and previous config saved to /var/cache/conftool/dbconfig/20220202-152552-marostegui.json
15:19 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
15:16 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
15:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19969 and previous config saved to /var/cache/conftool/dbconfig/20220202-151047-marostegui.json
15:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19968 and previous config saved to /var/cache/conftool/dbconfig/20220202-150832-root.json
15:00 XioNoX: esams: push Capirca generated loopback filters
14:59 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2029.mgmt.codfw.wmnet with reboot policy FORCED
14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19967 and previous config saved to /var/cache/conftool/dbconfig/20220202-145542-marostegui.json
14:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19966 and previous config saved to /var/cache/conftool/dbconfig/20220202-145329-root.json
14:47 jayme@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:44 XioNoX: codfw: push Capirca generated loopback filters
14:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T300402)', diff saved to https://phabricator.wikimedia.org/P19965 and previous config saved to /var/cache/conftool/dbconfig/20220202-144038-marostegui.json
14:39 jayme@cumin1001: START - Cookbook sre.dns.netbox
14:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19963 and previous config saved to /var/cache/conftool/dbconfig/20220202-143825-root.json
14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T300402)', diff saved to https://phabricator.wikimedia.org/P19962 and previous config saved to /var/cache/conftool/dbconfig/20220202-143221-marostegui.json
14:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
14:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
14:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19961 and previous config saved to /var/cache/conftool/dbconfig/20220202-143214-marostegui.json
14:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19960 and previous config saved to /var/cache/conftool/dbconfig/20220202-142321-root.json
14:21 XioNoX: eqsin: push Capirca generated loopback filters
14:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19959 and previous config saved to /var/cache/conftool/dbconfig/20220202-141709-marostegui.json
14:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
14:15 XioNoX: cr2-eqdfw: push Capirca generated loopback filters
14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Remove weight from es1020 - as it is the master', diff saved to https://phabricator.wikimedia.org/P19958 and previous config saved to /var/cache/conftool/dbconfig/20220202-141455-marostegui.json
14:13 vgutierrez: pool cp1087 running envoy as TLS terminator - T271421
14:09 XioNoX: cr2-eqord: push Capirca generated loopback filters
14:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19957 and previous config saved to /var/cache/conftool/dbconfig/20220202-140818-root.json
14:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1179 schema change', diff saved to https://phabricator.wikimedia.org/P19956 and previous config saved to /var/cache/conftool/dbconfig/20220202-140317-marostegui.json
14:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19955 and previous config saved to /var/cache/conftool/dbconfig/20220202-140239-root.json
14:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19954 and previous config saved to /var/cache/conftool/dbconfig/20220202-140204-marostegui.json
13:50 elukey: move docker on ml-serve-ctrl* nodes from device mapper to overlay2
13:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19953 and previous config saved to /var/cache/conftool/dbconfig/20220202-134735-root.json
13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19952 and previous config saved to /var/cache/conftool/dbconfig/20220202-134659-marostegui.json
13:40 XioNoX: ULSFO routers: push Capirca generated loopback filters
13:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19951 and previous config saved to /var/cache/conftool/dbconfig/20220202-133713-marostegui.json
13:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
13:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
13:35 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on production
13:34 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on canary
13:34 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on production
13:34 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on canary
13:33 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on canary
13:33 otto@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on production
13:32 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: sync on production
13:32 otto@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: sync on canary
13:32 ottomata: roll restarting eventgate-main to pick up stream-configs for rdf-streaming-updater.reconcile
13:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19949 and previous config saved to /var/cache/conftool/dbconfig/20220202-133231-root.json
13:31 otto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on canary
13:31 otto@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync on canary
13:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
13:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
13:30 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
13:30 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
13:29 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync on production
13:28 otto@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync on canary
13:28 otto@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: sync on production
13:25 XioNoX: rename cr3-ulsfo loopback terms in preparation of move to Capirca
13:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
13:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19947 and previous config saved to /var/cache/conftool/dbconfig/20220202-132510-marostegui.json
13:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19946 and previous config saved to /var/cache/conftool/dbconfig/20220202-131728-root.json
13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19945 and previous config saved to /var/cache/conftool/dbconfig/20220202-131006-marostegui.json
13:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
13:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
13:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
13:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
13:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19944 and previous config saved to /var/cache/conftool/dbconfig/20220202-130224-root.json
12:59 taavi@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: ULS: Remove unused ULSEventLogging variable (T275894) (duration: 00m 49s)
12:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19942 and previous config saved to /var/cache/conftool/dbconfig/20220202-125500-marostegui.json
12:54 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Clean-up decommissioned Print schema configs (T196159) (duration: 00m 50s)
12:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 100%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19941 and previous config saved to /var/cache/conftool/dbconfig/20220202-125034-root.json
12:43 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1087.eqiad.wmnet with OS buster
12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T298558)', diff saved to https://phabricator.wikimedia.org/P19940 and previous config saved to /var/cache/conftool/dbconfig/20220202-124122-marostegui.json
12:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
12:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298558)', diff saved to https://phabricator.wikimedia.org/P19939 and previous config saved to /var/cache/conftool/dbconfig/20220202-124115-marostegui.json
12:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19938 and previous config saved to /var/cache/conftool/dbconfig/20220202-123956-marostegui.json
12:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 75%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19937 and previous config saved to /var/cache/conftool/dbconfig/20220202-123531-root.json
12:34 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1019.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
12:32 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1019.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
12:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T300402)', diff saved to https://phabricator.wikimedia.org/P19936 and previous config saved to /var/cache/conftool/dbconfig/20220202-123127-marostegui.json
12:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
12:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
12:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1019.eqiad.wmnet
12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19934 and previous config saved to /var/cache/conftool/dbconfig/20220202-122610-marostegui.json
12:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300402)', diff saved to https://phabricator.wikimedia.org/P19933 and previous config saved to /var/cache/conftool/dbconfig/20220202-122112-marostegui.json
12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1019.eqiad.wmnet
12:20 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 65%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19932 and previous config saved to /var/cache/conftool/dbconfig/20220202-122027-root.json
12:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
12:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
12:11 taavi@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: prod: READ_NEW for CentralAuth hidden level migration (T289068) (duration: 00m 50s)
12:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P19930 and previous config saved to /var/cache/conftool/dbconfig/20220202-121105-marostegui.json
12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P19929 and previous config saved to /var/cache/conftool/dbconfig/20220202-120608-marostegui.json
12:05 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 50%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19928 and previous config saved to /var/cache/conftool/dbconfig/20220202-120524-root.json
11:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T298558)', diff saved to https://phabricator.wikimedia.org/P19927 and previous config saved to /var/cache/conftool/dbconfig/20220202-115601-marostegui.json
11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P19926 and previous config saved to /var/cache/conftool/dbconfig/20220202-115103-marostegui.json
11:50 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp1087.eqiad.wmnet with OS buster
11:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 40%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19925 and previous config saved to /var/cache/conftool/dbconfig/20220202-115020-root.json
11:48 vgutierrez: depool cp1087 to be reimaged as cache::text_envoy - T271421
11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T298558)', diff saved to https://phabricator.wikimedia.org/P19924 and previous config saved to /var/cache/conftool/dbconfig/20220202-114639-marostegui.json
11:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
11:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
11:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
11:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1112.eqiad.wmnet with reason: Maintenance
11:45 _joe_: repooling thanos-fe1001 T300119
11:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
11:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T300402)', diff saved to https://phabricator.wikimedia.org/P19923 and previous config saved to /var/cache/conftool/dbconfig/20220202-113558-marostegui.json
11:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 25%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19922 and previous config saved to /var/cache/conftool/dbconfig/20220202-113516-root.json
11:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
11:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
11:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298558)', diff saved to https://phabricator.wikimedia.org/P19921 and previous config saved to /var/cache/conftool/dbconfig/20220202-113007-marostegui.json
11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T300402)', diff saved to https://phabricator.wikimedia.org/P19920 and previous config saved to /var/cache/conftool/dbconfig/20220202-112849-marostegui.json
11:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
11:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
11:28 _joe_: depooling thanos-fe1001 for testing T300119
11:20 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 15%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19919 and previous config saved to /var/cache/conftool/dbconfig/20220202-112013-root.json
11:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
11:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
11:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300402)', diff saved to https://phabricator.wikimedia.org/P19918 and previous config saved to /var/cache/conftool/dbconfig/20220202-111804-marostegui.json
11:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P19917 and previous config saved to /var/cache/conftool/dbconfig/20220202-111502-marostegui.json
11:05 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 10%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19916 and previous config saved to /var/cache/conftool/dbconfig/20220202-110509-root.json
11:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P19915 and previous config saved to /var/cache/conftool/dbconfig/20220202-110259-marostegui.json
10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P19914 and previous config saved to /var/cache/conftool/dbconfig/20220202-105957-marostegui.json
10:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 5%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19913 and previous config saved to /var/cache/conftool/dbconfig/20220202-105006-root.json
10:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P19912 and previous config saved to /var/cache/conftool/dbconfig/20220202-104755-marostegui.json
10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T298558)', diff saved to https://phabricator.wikimedia.org/P19911 and previous config saved to /var/cache/conftool/dbconfig/20220202-104453-marostegui.json
10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Remove recentchanges and recentchanges groups from s4 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P19910 and previous config saved to /var/cache/conftool/dbconfig/20220202-103830-marostegui.json
10:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 2%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19909 and previous config saved to /var/cache/conftool/dbconfig/20220202-103502-root.json
10:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es1021 after reimage', diff saved to https://phabricator.wikimedia.org/P19908 and previous config saved to /var/cache/conftool/dbconfig/20220202-103436-marostegui.json
10:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1021 (re)pooling @ 1%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P19907 and previous config saved to /var/cache/conftool/dbconfig/20220202-103401-root.json
10:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T300402)', diff saved to https://phabricator.wikimedia.org/P19906 and previous config saved to /var/cache/conftool/dbconfig/20220202-103250-marostegui.json
10:28 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
10:27 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
10:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T298558)', diff saved to https://phabricator.wikimedia.org/P19905 and previous config saved to /var/cache/conftool/dbconfig/20220202-102717-marostegui.json
10:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
10:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1123.eqiad.wmnet with reason: Maintenance
10:23 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
10:22 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
10:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1008.eqiad.wmnet with OS buster
10:21 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
10:21 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
10:12 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
10:11 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
10:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es1021.eqiad.wmnet with OS bullseye
10:10 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
10:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Maintenance
10:09 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
10:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Maintenance
10:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
10:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2105.codfw.wmnet with reason: Maintenance
10:06 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 04s)
10:06 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
10:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
10:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
09:53 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1008.eqiad.wmnet with OS buster
09:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1019.eqiad.wmnet with OS buster
09:40 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host es1021.eqiad.wmnet with OS bullseye
09:39 moritzm: installing apache/apache-modsecurity2 security updates
09:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host es1021.mgmt.eqiad.wmnet with reboot policy GRACEFUL
09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1011.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
09:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1011.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1162 (T300402)', diff saved to https://phabricator.wikimedia.org/P19904 and previous config saved to /var/cache/conftool/dbconfig/20220202-093231-marostegui.json
09:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
09:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
09:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19903 and previous config saved to /var/cache/conftool/dbconfig/20220202-093223-marostegui.json
09:28 marostegui@cumin1001: START - Cookbook sre.hosts.provision for host es1021.mgmt.eqiad.wmnet with reboot policy GRACEFUL
09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P19902 and previous config saved to /var/cache/conftool/dbconfig/20220202-091718-marostegui.json
09:17 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1019.eqiad.wmnet with OS buster
09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1021 T300127', diff saved to https://phabricator.wikimedia.org/P19901 and previous config saved to /var/cache/conftool/dbconfig/20220202-091355-marostegui.json
09:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
09:10 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
09:10 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
09:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
09:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
09:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
09:08 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
09:08 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
09:07 marostegui@deploy1002: Synchronized wmf-config/db-production.php: Enable writes on es4 T300127 (duration: 00m 50s)
09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P19900 and previous config saved to /var/cache/conftool/dbconfig/20220202-090214-marostegui.json
09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es1020 to es4 primary and set section read-write T300127', diff saved to https://phabricator.wikimedia.org/P19899 and previous config saved to /var/cache/conftool/dbconfig/20220202-090121-marostegui.json
09:00 marostegui: Starting es4 eqiad failover from es1021 to es1020 - T300127
08:52 aqu@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
08:52 aqu@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
08:48 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Switchover es4 T300127
08:48 root@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Switchover es4 T300127
08:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19898 and previous config saved to /var/cache/conftool/dbconfig/20220202-084709-marostegui.json
08:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19897 and previous config saved to /var/cache/conftool/dbconfig/20220202-084150-marostegui.json
08:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
08:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
08:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300402)', diff saved to https://phabricator.wikimedia.org/P19896 and previous config saved to /var/cache/conftool/dbconfig/20220202-084143-marostegui.json
08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P19895 and previous config saved to /var/cache/conftool/dbconfig/20220202-082638-marostegui.json
08:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P19894 and previous config saved to /var/cache/conftool/dbconfig/20220202-081134-marostegui.json
07:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T300402)', diff saved to https://phabricator.wikimedia.org/P19893 and previous config saved to /var/cache/conftool/dbconfig/20220202-075629-marostegui.json
07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T300402)', diff saved to https://phabricator.wikimedia.org/P19892 and previous config saved to /var/cache/conftool/dbconfig/20220202-075244-marostegui.json
07:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
07:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance
07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300402)', diff saved to https://phabricator.wikimedia.org/P19891 and previous config saved to /var/cache/conftool/dbconfig/20220202-075236-marostegui.json
07:51 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (eqiad1) (duration: 04m 09s)
07:47 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (eqiad1)
07:46 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
07:45 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
07:44 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard (duration: 02m 19s)
07:42 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: update wmf-proxy-dashboard
07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Set es1020 with weight 10 T300127', diff saved to https://phabricator.wikimedia.org/P19890 and previous config saved to /var/cache/conftool/dbconfig/20220202-073918-root.json
07:38 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: Switchover es4 T300127
07:38 root@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: Switchover es4 T300127
07:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P19889 and previous config saved to /var/cache/conftool/dbconfig/20220202-073731-marostegui.json
07:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
07:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
07:36 marostegui@deploy1002: Synchronized wmf-config/db-production.php: Disable writes on es4 T300127 (duration: 00m 50s)
07:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
07:30 marostegui@deploy1002: Synchronized wmf-config/ProductionServices.php: Disable writes on es4 T300127 (duration: 00m 51s)
07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P19888 and previous config saved to /var/cache/conftool/dbconfig/20220202-072227-marostegui.json
07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T300402)', diff saved to https://phabricator.wikimedia.org/P19887 and previous config saved to /var/cache/conftool/dbconfig/20220202-070722-marostegui.json
07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300402)', diff saved to https://phabricator.wikimedia.org/P19886 and previous config saved to /var/cache/conftool/dbconfig/20220202-070012-marostegui.json
07:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
07:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
07:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
07:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
06:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
06:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
06:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
06:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
02:54 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
02:48 pt1979@cumin2002: START - Cookbook sre.dns.netbox
02:29 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2008.codfw.wmnet with OS buster
02:19 ejegg: updated CiviCRM from 0513f1b7 to 3d379e25
01:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2008.codfw.wmnet with OS buster
01:40 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2007.codfw.wmnet with OS buster
01:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
01:13 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve2007.codfw.wmnet with OS buster
01:12 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
01:12 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-serve2007.codfw.wmnet with OS buster
01:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
01:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
01:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
01:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
01:03 ebernhardson@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: rdf-streaming-updater: add the reconciliation stream (T279541) (duration: 00m 49s)
00:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2007.codfw.wmnet with OS buster
00:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:52 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:51 urbanecm: UTC late B&C window completed
00:50 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: b560843: Add wgUploadNavigationUrl upload page of ptwikinews (T300466) (duration: 00m 50s)
00:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2006.codfw.wmnet with OS buster
00:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:40 urbanecm@deploy1002: Synchronized docroot/noc/db.php: 06444c1: Start writing to some wmg* constants (T45956; 2/2) (duration: 00m 49s)
00:39 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: 06444c1: Start writing to some wmg* constants (T45956; 1/2) (duration: 00m 49s)
00:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:29 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: b2c13c6: Enable migration mode on all group 0, group 1 and desktop-improvement wikis (T299927) (duration: 01m 58s)
00:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:17 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2006.codfw.wmnet with OS buster
00:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

2022-02-01

22:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve2005.codfw.wmnet with OS buster
22:48 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet2002-dev.codfw.wmnet with OS bullseye
22:22 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
22:21 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host ml-serve2005.codfw.wmnet with OS buster
22:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-serve2005.codfw.wmnet with OS buster
21:55 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2002-dev.codfw.wmnet with OS bullseye
21:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:14 Lucas_WMDE: Deployed patch for T297754
21:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
21:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:42 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.17 (duration: 01m 35s)
20:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:38 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.20 refs T293961
20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298558)', diff saved to https://phabricator.wikimedia.org/P19884 and previous config saved to /var/cache/conftool/dbconfig/20220201-202806-marostegui.json
20:27 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
20:21 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.18 (duration: 04m 08s)
20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
20:20 ejegg: updated payments-wiki from 933e8669 to dbcb5254
20:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P19882 and previous config saved to /var/cache/conftool/dbconfig/20220201-201259-marostegui.json
20:12 dancy@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.20 refs T293961 (duration: 51m 42s)
20:05 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P19881 and previous config saved to /var/cache/conftool/dbconfig/20220201-195755-marostegui.json
19:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
19:55 joal@deploy1002: Finished deploy [analytics/refinery@6a7983e] (hadoop-test): Hotfix analytics weekly train TEST [analytics/refinery@6a7983e] (duration: 05m 51s)
19:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:49 joal@deploy1002: Started deploy [analytics/refinery@6a7983e] (hadoop-test): Hotfix analytics weekly train TEST [analytics/refinery@6a7983e]
19:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T298558)', diff saved to https://phabricator.wikimedia.org/P19880 and previous config saved to /var/cache/conftool/dbconfig/20220201-194250-marostegui.json
19:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T298558)', diff saved to https://phabricator.wikimedia.org/P19879 and previous config saved to /var/cache/conftool/dbconfig/20220201-194144-marostegui.json
19:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
19:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1148.eqiad.wmnet with reason: Maintenance
19:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298558)', diff saved to https://phabricator.wikimedia.org/P19878 and previous config saved to /var/cache/conftool/dbconfig/20220201-194136-marostegui.json
19:40 joal@deploy1002: Finished deploy [analytics/refinery@6a7983e] (thin): Hotfix analytics weekly train THIN [analytics/refinery@6a7983e] (duration: 00m 07s)
19:40 joal@deploy1002: Started deploy [analytics/refinery@6a7983e] (thin): Hotfix analytics weekly train THIN [analytics/refinery@6a7983e]
19:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P19877 and previous config saved to /var/cache/conftool/dbconfig/20220201-192632-marostegui.json
19:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:22 joal@deploy1002: Finished deploy [analytics/refinery@6a7983e]: Hotfix analytics weekly train [analytics/refinery@6a7983e] (duration: 19m 09s)
19:20 dancy@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.20 refs T293961
19:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:19 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2002.codfw.wmnet with OS buster
19:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
19:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
19:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P19876 and previous config saved to /var/cache/conftool/dbconfig/20220201-191127-marostegui.json
19:02 joal@deploy1002: Started deploy [analytics/refinery@6a7983e]: Hotfix analytics weekly train [analytics/refinery@6a7983e]
18:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T298558)', diff saved to https://phabricator.wikimedia.org/P19875 and previous config saved to /var/cache/conftool/dbconfig/20220201-185622-marostegui.json
18:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T298558)', diff saved to https://phabricator.wikimedia.org/P19874 and previous config saved to /var/cache/conftool/dbconfig/20220201-185516-marostegui.json
18:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
18:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1149.eqiad.wmnet with reason: Maintenance
18:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298558)', diff saved to https://phabricator.wikimedia.org/P19873 and previous config saved to /var/cache/conftool/dbconfig/20220201-185507-marostegui.json
18:45 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-staging2002.codfw.wmnet with OS buster
18:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-staging2001.codfw.wmnet with OS buster
18:40 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on production
18:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19872 and previous config saved to /var/cache/conftool/dbconfig/20220201-184027-root.json
18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P19871 and previous config saved to /var/cache/conftool/dbconfig/20220201-184002-marostegui.json
18:38 dcausse@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync on canary
18:38 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply on canary
18:38 dcausse@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply on production
18:36 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on production
18:35 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: sync on canary
18:33 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply on canary
18:33 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply on production
18:30 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: sync on production
18:29 dcausse@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply on canary
18:29 dcausse@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply on production
18:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19870 and previous config saved to /var/cache/conftool/dbconfig/20220201-182523-root.json
18:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS buster
18:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160', diff saved to https://phabricator.wikimedia.org/P19869 and previous config saved to /var/cache/conftool/dbconfig/20220201-182458-marostegui.json
18:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-staging2001.codfw.wmnet with OS buster
18:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 60%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19868 and previous config saved to /var/cache/conftool/dbconfig/20220201-181019-root.json
18:10 cwhite: end logstash upgrade (eqiad) T299168
18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1160 (T298558)', diff saved to https://phabricator.wikimedia.org/P19867 and previous config saved to /var/cache/conftool/dbconfig/20220201-180953-marostegui.json
18:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1160 (T298558)', diff saved to https://phabricator.wikimedia.org/P19866 and previous config saved to /var/cache/conftool/dbconfig/20220201-180847-marostegui.json
18:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
18:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance
18:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298558)', diff saved to https://phabricator.wikimedia.org/P19865 and previous config saved to /var/cache/conftool/dbconfig/20220201-180839-marostegui.json
18:04 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase2017.wmnet
18:03 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase2017.codfw.wmnet with OS buster
17:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ml-staging2001.codfw.wmnet with OS buster
17:57 urbanecm@deploy1002: Synchronized wmf-config/config/amiwiki.yaml: 7f8bc6d: amiwiki: Deploy Growth features in dark mode (3/3) (duration: 00m 49s)
17:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
17:56 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudnet2004-dev.codfw.wmnet with OS bullseye
17:56 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: 7f8bc6d: amiwiki: Deploy Growth features in dark mode (2/3) (duration: 00m 50s)
17:55 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: 7f8bc6d: amiwiki: Deploy Growth features in dark mode (1/3) (duration: 00m 51s)
17:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19864 and previous config saved to /var/cache/conftool/dbconfig/20220201-175516-root.json
17:54 btullis@deploy1002: Finished deploy [analytics/refinery@c24f002] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c24f002] (duration: 05m 41s)
17:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
17:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
17:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P19863 and previous config saved to /var/cache/conftool/dbconfig/20220201-175334-marostegui.json
17:53 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
17:52 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php amiwiki
17:50 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php amiwiki growthexperiments
17:49 btullis@deploy1002: Started deploy [analytics/refinery@c24f002] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@c24f002]
17:48 btullis@deploy1002: Finished deploy [analytics/refinery@c24f002] (thin): Regular analytics weekly train THIN [analytics/refinery@c24f002] (duration: 00m 07s)
17:48 btullis@deploy1002: Started deploy [analytics/refinery@c24f002] (thin): Regular analytics weekly train THIN [analytics/refinery@c24f002]
17:47 cwhite: begin logstash upgrade (eqiad) T299168
17:42 btullis@deploy1002: Finished deploy [analytics/refinery@c24f002]: Regular analytics weekly train [analytics/refinery@c24f002] (duration: 11m 29s)
17:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 40%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19862 and previous config saved to /var/cache/conftool/dbconfig/20220201-174012-root.json
17:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P19861 and previous config saved to /var/cache/conftool/dbconfig/20220201-173830-marostegui.json
17:30 btullis@deploy1002: Started deploy [analytics/refinery@c24f002]: Regular analytics weekly train [analytics/refinery@c24f002]
17:29 btullis: about to deploy analytics/refinery
17:26 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2004-dev.codfw.wmnet with OS bullseye
17:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19860 and previous config saved to /var/cache/conftool/dbconfig/20220201-172509-root.json
17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T298558)', diff saved to https://phabricator.wikimedia.org/P19859 and previous config saved to /var/cache/conftool/dbconfig/20220201-172325-marostegui.json
17:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T298558)', diff saved to https://phabricator.wikimedia.org/P19858 and previous config saved to /var/cache/conftool/dbconfig/20220201-172219-marostegui.json
17:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
17:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
17:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
17:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1121.eqiad.wmnet with reason: Maintenance
17:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19857 and previous config saved to /var/cache/conftool/dbconfig/20220201-172205-marostegui.json
17:21 vgutierrez: pool cp2039 running envoy as TLS terminator - T271421
17:17 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host restbase2017.codfw.wmnet with OS buster
17:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 20%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19856 and previous config saved to /var/cache/conftool/dbconfig/20220201-171005-root.json
17:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P19855 and previous config saved to /var/cache/conftool/dbconfig/20220201-170701-marostegui.json
16:58 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2039.codfw.wmnet with OS buster
16:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 10%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19854 and previous config saved to /var/cache/conftool/dbconfig/20220201-165501-root.json
16:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P19852 and previous config saved to /var/cache/conftool/dbconfig/20220201-165156-marostegui.json
16:51 papaul: rebooting pfw3a-codfw and pfw3b for JUNOS upgrade
16:50 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2008.mgmt.codfw.wmnet with reboot policy FORCED
16:49 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
16:43 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2008.mgmt.codfw.wmnet with reboot policy FORCED
16:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 5%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19851 and previous config saved to /var/cache/conftool/dbconfig/20220201-163958-root.json
16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19850 and previous config saved to /var/cache/conftool/dbconfig/20220201-163651-marostegui.json
16:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19849 and previous config saved to /var/cache/conftool/dbconfig/20220201-163545-marostegui.json
16:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
16:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1146.eqiad.wmnet with reason: Maintenance
16:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298558)', diff saved to https://phabricator.wikimedia.org/P19848 and previous config saved to /var/cache/conftool/dbconfig/20220201-163537-marostegui.json
16:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 1%: repooling after schema change', diff saved to https://phabricator.wikimedia.org/P19847 and previous config saved to /var/cache/conftool/dbconfig/20220201-162454-root.json
16:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P19846 and previous config saved to /var/cache/conftool/dbconfig/20220201-162033-marostegui.json
16:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T300402)', diff saved to https://phabricator.wikimedia.org/P19845 and previous config saved to /var/cache/conftool/dbconfig/20220201-161353-marostegui.json
16:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
16:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
16:12 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp2039.codfw.wmnet with OS buster
16:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
16:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
16:11 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2007.mgmt.codfw.wmnet with reboot policy FORCED
16:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
16:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
16:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
16:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
16:10 vgutierrez: depool cp2039 to be reimaged as cache::text_envoy - T271421
16:09 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 03s)
16:09 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
16:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P19844 and previous config saved to /var/cache/conftool/dbconfig/20220201-160528-marostegui.json
16:05 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 10s)
16:04 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:55 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2007.mgmt.codfw.wmnet with reboot policy FORCED
15:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T298558)', diff saved to https://phabricator.wikimedia.org/P19843 and previous config saved to /var/cache/conftool/dbconfig/20220201-155023-marostegui.json
15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T298558)', diff saved to https://phabricator.wikimedia.org/P19842 and previous config saved to /var/cache/conftool/dbconfig/20220201-154716-marostegui.json
15:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
15:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance
15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19841 and previous config saved to /var/cache/conftool/dbconfig/20220201-154709-marostegui.json
15:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1010.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
15:34 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 08s)
15:34 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2006.mgmt.codfw.wmnet with reboot policy FORCED
15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P19840 and previous config saved to /var/cache/conftool/dbconfig/20220201-153204-marostegui.json
15:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
15:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1139.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
15:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1102.eqiad.wmnet with reason: Maintenance
15:24 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2006.mgmt.codfw.wmnet with reboot policy FORCED
15:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19839 and previous config saved to /var/cache/conftool/dbconfig/20220201-152323-marostegui.json
15:22 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided) (duration: 00m 09s)
15:22 ebysans@deploy1002: Started deploy [airflow-dags/analytics-test@3ad07a0]: (no justification provided)
15:21 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1010.eqiad.wmnet to ganeti01.svc.eqiad.wmnet
15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1010.eqiad.wmnet
15:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P19838 and previous config saved to /var/cache/conftool/dbconfig/20220201-151700-marostegui.json
15:13 kart_: Deployed Flores MT for cxserver + Updated cxserver to 2022-01-13-174407-production (T298584, T292412, T292415, T298679, T298752) + Updated cxserver to 2022-02-01-141918-production (T298592)
15:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1010.eqiad.wmnet
15:10 jelto: update scap to 4.2.2 on all hosts - T300392
15:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P19837 and previous config saved to /var/cache/conftool/dbconfig/20220201-150818-marostegui.json
15:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1016.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
15:07 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1016.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
15:05 mmandere@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum6002.drmrs.wmnet
15:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19836 and previous config saved to /var/cache/conftool/dbconfig/20220201-150155-marostegui.json
15:01 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: sync on production
15:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T298558)', diff saved to https://phabricator.wikimedia.org/P19835 and previous config saved to /var/cache/conftool/dbconfig/20220201-150049-marostegui.json
15:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
15:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
15:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298558)', diff saved to https://phabricator.wikimedia.org/P19834 and previous config saved to /var/cache/conftool/dbconfig/20220201-150041-marostegui.json
14:59 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply on staging
14:59 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply on production
14:58 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: sync on production
14:56 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply on staging
14:56 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply on production
14:53 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: sync on staging
14:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P19833 and previous config saved to /var/cache/conftool/dbconfig/20220201-145314-marostegui.json
14:52 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply on production
14:52 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply on staging
14:52 mmandere@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum6002.drmrs.wmnet
14:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P19832 and previous config saved to /var/cache/conftool/dbconfig/20220201-144536-marostegui.json
14:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19831 and previous config saved to /var/cache/conftool/dbconfig/20220201-143809-marostegui.json
14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19830 and previous config saved to /var/cache/conftool/dbconfig/20220201-143504-marostegui.json
14:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
14:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1144.eqiad.wmnet with reason: Maintenance
14:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300402)', diff saved to https://phabricator.wikimedia.org/P19829 and previous config saved to /var/cache/conftool/dbconfig/20220201-143456-marostegui.json
14:30 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve2005.mgmt.codfw.wmnet with reboot policy FORCED
14:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P19828 and previous config saved to /var/cache/conftool/dbconfig/20220201-143031-marostegui.json
14:21 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ml-serve2005.mgmt.codfw.wmnet with reboot policy FORCED
14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P19827 and previous config saved to /var/cache/conftool/dbconfig/20220201-141952-marostegui.json
14:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T298558)', diff saved to https://phabricator.wikimedia.org/P19826 and previous config saved to /var/cache/conftool/dbconfig/20220201-141527-marostegui.json
14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T298558)', diff saved to https://phabricator.wikimedia.org/P19825 and previous config saved to /var/cache/conftool/dbconfig/20220201-141420-marostegui.json
14:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
14:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1141.eqiad.wmnet with reason: Maintenance
14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298558)', diff saved to https://phabricator.wikimedia.org/P19824 and previous config saved to /var/cache/conftool/dbconfig/20220201-141413-marostegui.json
14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P19823 and previous config saved to /var/cache/conftool/dbconfig/20220201-140447-marostegui.json
13:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P19822 and previous config saved to /var/cache/conftool/dbconfig/20220201-135908-marostegui.json
13:54 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on internal
13:54 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
13:52 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: sync on external
13:50 kharlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply on staging
13:50 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on internal
13:50 kharlan@deploy1002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply on external
13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T300402)', diff saved to https://phabricator.wikimedia.org/P19821 and previous config saved to /var/cache/conftool/dbconfig/20220201-134942-marostegui.json
13:49 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on internal
13:48 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.
13:48 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: sync on external
13:47 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
13:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300402)', diff saved to https://phabricator.wikimedia.org/P19820 and previous config saved to /var/cache/conftool/dbconfig/20220201-134740-marostegui.json
13:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
13:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
13:47 kharlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply on staging
13:47 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on external
13:47 kharlan@deploy1002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply on internal
13:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
13:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
13:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
13:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
13:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19819 and previous config saved to /var/cache/conftool/dbconfig/20220201-134524-marostegui.json
13:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P19818 and previous config saved to /var/cache/conftool/dbconfig/20220201-134403-marostegui.json
13:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: sync on staging
13:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
13:43 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
13:43 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
13:41 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.
13:41 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on external
13:41 kharlan@deploy1002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply on internal
13:41 kharlan@deploy1002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply on staging
13:38 btullis@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
13:32 btullis@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.
13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P19817 and previous config saved to /var/cache/conftool/dbconfig/20220201-133020-marostegui.json
13:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T298558)', diff saved to https://phabricator.wikimedia.org/P19816 and previous config saved to /var/cache/conftool/dbconfig/20220201-132858-marostegui.json
13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T298558)', diff saved to https://phabricator.wikimedia.org/P19815 and previous config saved to /var/cache/conftool/dbconfig/20220201-132652-marostegui.json
13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1142.eqiad.wmnet with reason: Maintenance
13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance
13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 12 hosts with reason: Maintenance
13:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
13:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance
13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298558)', diff saved to https://phabricator.wikimedia.org/P19814 and previous config saved to /var/cache/conftool/dbconfig/20220201-132624-marostegui.json
13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P19813 and previous config saved to /var/cache/conftool/dbconfig/20220201-131515-marostegui.json
13:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P19812 and previous config saved to /var/cache/conftool/dbconfig/20220201-131119-marostegui.json
13:09 hashar: Restarting CI Jenkins
13:09 hashar: Restarting Gerrit
13:01 hashar: Restarted Jenkins on releases1002.eqiad.wmnet
13:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19810 and previous config saved to /var/cache/conftool/dbconfig/20220201-130010-marostegui.json
12:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19809 and previous config saved to /var/cache/conftool/dbconfig/20220201-125805-marostegui.json
12:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
12:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P19808 and previous config saved to /var/cache/conftool/dbconfig/20220201-125615-marostegui.json
12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
12:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
12:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2123.codfw.wmnet with reason: Maintenance
12:56 marostegui: Set innodb_adaptive_hash_index=OFF on: db1129 es1029 es1030 es1028 es1020 es1023 T268869
12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19807 and previous config saved to /var/cache/conftool/dbconfig/20220201-125605-marostegui.json
12:52 mmandere@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum6001.drmrs.wmnet
12:42 mmandere@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum6001.drmrs.wmnet
12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298558)', diff saved to https://phabricator.wikimedia.org/P19806 and previous config saved to /var/cache/conftool/dbconfig/20220201-124110-marostegui.json
12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P19805 and previous config saved to /var/cache/conftool/dbconfig/20220201-124100-marostegui.json
12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T298558)', diff saved to https://phabricator.wikimedia.org/P19804 and previous config saved to /var/cache/conftool/dbconfig/20220201-124004-marostegui.json
12:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1147.eqiad.wmnet with reason: Maintenance
12:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
12:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1145.eqiad.wmnet with reason: Maintenance
12:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
12:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
12:39 moritzm: installing openjdk-11 security updates
12:31 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: sync on production
12:30 oblivian@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply on staging
12:30 oblivian@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply on production
12:30 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: sync on production
12:30 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply on staging
12:29 oblivian@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply on production
12:29 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: sync on staging
12:28 oblivian@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply on production
12:28 oblivian@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply on staging
12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P19803 and previous config saved to /var/cache/conftool/dbconfig/20220201-122556-marostegui.json
12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19802 and previous config saved to /var/cache/conftool/dbconfig/20220201-121051-marostegui.json
12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T300402)', diff saved to https://phabricator.wikimedia.org/P19801 and previous config saved to /var/cache/conftool/dbconfig/20220201-120847-marostegui.json
12:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
12:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1096.eqiad.wmnet with reason: Maintenance
12:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300402)', diff saved to https://phabricator.wikimedia.org/P19800 and previous config saved to /var/cache/conftool/dbconfig/20220201-120839-marostegui.json
11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298558)', diff saved to https://phabricator.wikimedia.org/P19799 and previous config saved to /var/cache/conftool/dbconfig/20220201-115923-marostegui.json
11:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P19798 and previous config saved to /var/cache/conftool/dbconfig/20220201-115334-marostegui.json
11:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19797 and previous config saved to /var/cache/conftool/dbconfig/20220201-114418-marostegui.json
11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P19796 and previous config saved to /var/cache/conftool/dbconfig/20220201-113830-marostegui.json
11:31 elukey: roll restart ORES to pick up logging change (use XFF header when possible) - T299137
11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P19795 and previous config saved to /var/cache/conftool/dbconfig/20220201-112913-marostegui.json
11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T300402)', diff saved to https://phabricator.wikimedia.org/P19794 and previous config saved to /var/cache/conftool/dbconfig/20220201-112325-marostegui.json
11:19 hnowlan: roll-restarting maps services in eqiad for updates
11:17 hnowlan: roll-restarting maps services in codfw for updates
11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T300402)', diff saved to https://phabricator.wikimedia.org/P19793 and previous config saved to /var/cache/conftool/dbconfig/20220201-111420-marostegui.json
11:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
11:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1110.eqiad.wmnet with reason: Maintenance
11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300402)', diff saved to https://phabricator.wikimedia.org/P19792 and previous config saved to /var/cache/conftool/dbconfig/20220201-111413-marostegui.json
11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T298558)', diff saved to https://phabricator.wikimedia.org/P19791 and previous config saved to /var/cache/conftool/dbconfig/20220201-111409-marostegui.json
11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1181 (T298558)', diff saved to https://phabricator.wikimedia.org/P19790 and previous config saved to /var/cache/conftool/dbconfig/20220201-110855-marostegui.json
11:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
11:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298558)', diff saved to https://phabricator.wikimedia.org/P19789 and previous config saved to /var/cache/conftool/dbconfig/20220201-110848-marostegui.json
10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P19788 and previous config saved to /var/cache/conftool/dbconfig/20220201-105906-marostegui.json
10:59 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
10:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
10:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
10:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
10:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2105.codfw.wmnet with OS bullseye
10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19787 and previous config saved to /var/cache/conftool/dbconfig/20220201-105343-marostegui.json
10:53 Lucas_WMDE: Deployed patch for T297754
10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100', diff saved to https://phabricator.wikimedia.org/P19786 and previous config saved to /var/cache/conftool/dbconfig/20220201-104402-marostegui.json
10:41 vgutierrez: restart ATS-TLS on cp3058
10:41 marostegui@cumin1001: dbctl commit (dc=all): 'Remove all special groups from s4 codfw T263127', diff saved to https://phabricator.wikimedia.org/P19785 and previous config saved to /var/cache/conftool/dbconfig/20220201-104118-marostegui.json
10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P19784 and previous config saved to /var/cache/conftool/dbconfig/20220201-103838-marostegui.json
10:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1100 (T300402)', diff saved to https://phabricator.wikimedia.org/P19783 and previous config saved to /var/cache/conftool/dbconfig/20220201-102857-marostegui.json
10:25 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions from s4 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P19782 and previous config saved to /var/cache/conftool/dbconfig/20220201-102512-marostegui.json
10:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1010.eqiad.wmnet with OS buster
10:24 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2105.codfw.wmnet with OS bullseye
10:24 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Bumeh-ctr out of all services on: 5 hosts
10:24 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Bumeh-ctr out of all services on: 5 hosts
10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1100 (T300402)', diff saved to https://phabricator.wikimedia.org/P19781 and previous config saved to /var/cache/conftool/dbconfig/20220201-102356-marostegui.json
10:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
10:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1100.eqiad.wmnet with reason: Maintenance
10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298558)', diff saved to https://phabricator.wikimedia.org/P19780 and previous config saved to /var/cache/conftool/dbconfig/20220201-102333-marostegui.json
10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300402)', diff saved to https://phabricator.wikimedia.org/P19779 and previous config saved to /var/cache/conftool/dbconfig/20220201-102300-marostegui.json
10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T298558)', diff saved to https://phabricator.wikimedia.org/P19778 and previous config saved to /var/cache/conftool/dbconfig/20220201-102221-marostegui.json
10:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
10:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298558)', diff saved to https://phabricator.wikimedia.org/P19777 and previous config saved to /var/cache/conftool/dbconfig/20220201-102207-marostegui.json
10:14 vgutierrez: pool cp3062 running envoy as TLS terminator - T271421
10:10 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply on staging
10:10 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply on production
10:08 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: sync on production
10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P19775 and previous config saved to /var/cache/conftool/dbconfig/20220201-100756-marostegui.json
10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19774 and previous config saved to /var/cache/conftool/dbconfig/20220201-100703-marostegui.json
10:05 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply on staging
10:05 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply on production
10:01 ayounsi@cumin1001: START - Cookbook sre.ganeti.makevm for new host netflow6001.drmrs.wmnet
10:01 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3062.esams.wmnet with OS buster
10:01 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: sync on staging
10:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 75%: repooling', diff saved to https://phabricator.wikimedia.org/P19773 and previous config saved to /var/cache/conftool/dbconfig/20220201-100052-root.json
10:00 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply on production
10:00 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply on staging
09:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1010.eqiad.wmnet with OS buster
09:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P19772 and previous config saved to /var/cache/conftool/dbconfig/20220201-095251-marostegui.json
09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P19771 and previous config saved to /var/cache/conftool/dbconfig/20220201-095158-marostegui.json
09:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 50%: repooling', diff saved to https://phabricator.wikimedia.org/P19770 and previous config saved to /var/cache/conftool/dbconfig/20220201-094548-root.json
09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T300402)', diff saved to https://phabricator.wikimedia.org/P19769 and previous config saved to /var/cache/conftool/dbconfig/20220201-093747-marostegui.json
09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T300402)', diff saved to https://phabricator.wikimedia.org/P19768 and previous config saved to /var/cache/conftool/dbconfig/20220201-093717-marostegui.json
09:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
09:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300402)', diff saved to https://phabricator.wikimedia.org/P19767 and previous config saved to /var/cache/conftool/dbconfig/20220201-093709-marostegui.json
09:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T298558)', diff saved to https://phabricator.wikimedia.org/P19766 and previous config saved to /var/cache/conftool/dbconfig/20220201-093653-marostegui.json
09:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 25%: repooling', diff saved to https://phabricator.wikimedia.org/P19765 and previous config saved to /var/cache/conftool/dbconfig/20220201-093044-root.json
09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P19764 and previous config saved to /var/cache/conftool/dbconfig/20220201-092204-marostegui.json
09:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2127.codfw.wmnet with OS bullseye
09:20 moritzm: installing apache/apache-modsecurity2 security updates
09:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2149.codfw.wmnet with OS bullseye
09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T298558)', diff saved to https://phabricator.wikimedia.org/P19763 and previous config saved to /var/cache/conftool/dbconfig/20220201-091541-marostegui.json
09:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 10%: repooling', diff saved to https://phabricator.wikimedia.org/P19762 and previous config saved to /var/cache/conftool/dbconfig/20220201-091541-root.json
09:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
09:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19761 and previous config saved to /var/cache/conftool/dbconfig/20220201-091534-marostegui.json
09:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131', diff saved to https://phabricator.wikimedia.org/P19760 and previous config saved to /var/cache/conftool/dbconfig/20220201-090700-marostegui.json
09:03 vgutierrez@cumin1001: START - Cookbook sre.hosts.reimage for host cp3062.esams.wmnet with OS buster
09:02 mmandere: apt1001 Delete unused stretch and buster dist libvarnishapi1 package T300264
09:01 vgutierrez: depool cp3062 to be reimaged as cache::text_envoy - T271421
09:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 5%: repooling', diff saved to https://phabricator.wikimedia.org/P19759 and previous config saved to /var/cache/conftool/dbconfig/20220201-090031-root.json
09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19758 and previous config saved to /var/cache/conftool/dbconfig/20220201-090029-marostegui.json
08:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1100.eqiad.wmnet with OS bullseye
08:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1131 (T300402)', diff saved to https://phabricator.wikimedia.org/P19757 and previous config saved to /var/cache/conftool/dbconfig/20220201-085155-marostegui.json
08:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1131 (T300402)', diff saved to https://phabricator.wikimedia.org/P19756 and previous config saved to /var/cache/conftool/dbconfig/20220201-085040-marostegui.json
08:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
08:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1131.eqiad.wmnet with reason: Maintenance
08:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
08:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance
08:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300402)', diff saved to https://phabricator.wikimedia.org/P19755 and previous config saved to /var/cache/conftool/dbconfig/20220201-084956-marostegui.json
08:46 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2127.codfw.wmnet with OS bullseye
08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P19754 and previous config saved to /var/cache/conftool/dbconfig/20220201-084524-marostegui.json
08:43 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2149.codfw.wmnet with OS bullseye
08:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2074.codfw.wmnet with OS bullseye
08:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2109.codfw.wmnet with OS bullseye
08:38 moritzm: draining ganeti1016 for eventual reimage
08:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P19753 and previous config saved to /var/cache/conftool/dbconfig/20220201-083452-marostegui.json
08:33 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1100.eqiad.wmnet with OS bullseye
08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19752 and previous config saved to /var/cache/conftool/dbconfig/20220201-083020-marostegui.json
08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19751 and previous config saved to /var/cache/conftool/dbconfig/20220201-082906-marostegui.json
08:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
08:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 10 hosts with reason: Maintenance
08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 10 hosts with reason: Maintenance
08:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2121.codfw.wmnet with reason: Maintenance
08:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298558)', diff saved to https://phabricator.wikimedia.org/P19750 and previous config saved to /var/cache/conftool/dbconfig/20220201-082825-marostegui.json
08:28 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1100.eqiad.wmnet with OS bullseye
08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti1008.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
08:23 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti1008.eqiad.wmnet with reason: Remove from Ganeti cluster for reimage
08:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P19749 and previous config saved to /var/cache/conftool/dbconfig/20220201-081947-marostegui.json
08:14 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1100.eqiad.wmnet with OS bullseye
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19748 and previous config saved to /var/cache/conftool/dbconfig/20220201-081321-marostegui.json
08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1100 for reimage T300473', diff saved to https://phabricator.wikimedia.org/P19747 and previous config saved to /var/cache/conftool/dbconfig/20220201-081050-marostegui.json
08:07 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2109.codfw.wmnet with OS bullseye
08:06 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2074.codfw.wmnet with OS bullseye
08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 100%: repooling', diff saved to https://phabricator.wikimedia.org/P19746 and previous config saved to /var/cache/conftool/dbconfig/20220201-080449-root.json
08:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T300402)', diff saved to https://phabricator.wikimedia.org/P19745 and previous config saved to /var/cache/conftool/dbconfig/20220201-080442-marostegui.json
08:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T300402)', diff saved to https://phabricator.wikimedia.org/P19744 and previous config saved to /var/cache/conftool/dbconfig/20220201-080328-marostegui.json
08:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
08:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1165.eqiad.wmnet with reason: Maintenance
08:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300402)', diff saved to https://phabricator.wikimedia.org/P19743 and previous config saved to /var/cache/conftool/dbconfig/20220201-080315-marostegui.json
08:01 filippo@puppetmaster1001: conftool action : set/pooled=no; selector: name=prometheus1003.eqiad.wmnet
07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P19742 and previous config saved to /var/cache/conftool/dbconfig/20220201-075816-marostegui.json
07:56 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus1005.eqiad.wmnet
07:56 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus1005.eqiad.wmnet
07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 75%: repooling', diff saved to https://phabricator.wikimedia.org/P19741 and previous config saved to /var/cache/conftool/dbconfig/20220201-074945-root.json
07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P19740 and previous config saved to /var/cache/conftool/dbconfig/20220201-074810-marostegui.json
07:47 filippo@puppetmaster1001: conftool action : set/pooled=yes; selector: name=prometheus1005.eqiad.wmnet
07:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T298558)', diff saved to https://phabricator.wikimedia.org/P19739 and previous config saved to /var/cache/conftool/dbconfig/20220201-074311-marostegui.json
07:39 filippo@puppetmaster1001: conftool action : set/weight=10; selector: name=prometheus1005.eqiad.wmnet
07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 50%: repooling', diff saved to https://phabricator.wikimedia.org/P19738 and previous config saved to /var/cache/conftool/dbconfig/20220201-073441-root.json
07:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P19737 and previous config saved to /var/cache/conftool/dbconfig/20220201-073306-marostegui.json
07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T298558)', diff saved to https://phabricator.wikimedia.org/P19736 and previous config saved to /var/cache/conftool/dbconfig/20220201-073256-marostegui.json
07:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
07:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1127.eqiad.wmnet with reason: Maintenance
07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19735 and previous config saved to /var/cache/conftool/dbconfig/20220201-073248-marostegui.json
07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 25%: repooling', diff saved to https://phabricator.wikimedia.org/P19734 and previous config saved to /var/cache/conftool/dbconfig/20220201-071938-root.json
07:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T300402)', diff saved to https://phabricator.wikimedia.org/P19733 and previous config saved to /var/cache/conftool/dbconfig/20220201-071801-marostegui.json
07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19732 and previous config saved to /var/cache/conftool/dbconfig/20220201-071743-marostegui.json
07:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T300402)', diff saved to https://phabricator.wikimedia.org/P19731 and previous config saved to /var/cache/conftool/dbconfig/20220201-071648-marostegui.json
07:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
07:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1168.eqiad.wmnet with reason: Maintenance
07:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300402)', diff saved to https://phabricator.wikimedia.org/P19730 and previous config saved to /var/cache/conftool/dbconfig/20220201-071640-marostegui.json
07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 10%: repooling', diff saved to https://phabricator.wikimedia.org/P19729 and previous config saved to /var/cache/conftool/dbconfig/20220201-070434-root.json
07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P19728 and previous config saved to /var/cache/conftool/dbconfig/20220201-070239-marostegui.json
07:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P19727 and previous config saved to /var/cache/conftool/dbconfig/20220201-070135-marostegui.json
06:50 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host db1110.eqiad.wmnet with OS bullseye
06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 5%: repooling', diff saved to https://phabricator.wikimedia.org/P19726 and previous config saved to /var/cache/conftool/dbconfig/20220201-064930-root.json
06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19725 and previous config saved to /var/cache/conftool/dbconfig/20220201-064734-marostegui.json
06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P19724 and previous config saved to /var/cache/conftool/dbconfig/20220201-064631-marostegui.json
06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19723 and previous config saved to /var/cache/conftool/dbconfig/20220201-064620-marostegui.json
06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1101.eqiad.wmnet with reason: Maintenance
06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
06:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
06:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19722 and previous config saved to /var/cache/conftool/dbconfig/20220201-064549-marostegui.json
06:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: repooling', diff saved to https://phabricator.wikimedia.org/P19721 and previous config saved to /var/cache/conftool/dbconfig/20220201-064149-root.json
06:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T300402)', diff saved to https://phabricator.wikimedia.org/P19720 and previous config saved to /var/cache/conftool/dbconfig/20220201-063126-marostegui.json
06:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19719 and previous config saved to /var/cache/conftool/dbconfig/20220201-063044-marostegui.json
06:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T300402)', diff saved to https://phabricator.wikimedia.org/P19718 and previous config saved to /var/cache/conftool/dbconfig/20220201-063013-marostegui.json
06:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
06:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1180.eqiad.wmnet with reason: Maintenance
06:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
06:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
06:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 8 hosts with reason: Maintenance
06:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 8 hosts with reason: Maintenance
06:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
06:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2129.codfw.wmnet with reason: Maintenance
06:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: repooling', diff saved to https://phabricator.wikimedia.org/P19717 and previous config saved to /var/cache/conftool/dbconfig/20220201-062646-root.json
06:24 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1110.eqiad.wmnet with OS bullseye
06:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110 for reimage T300473', diff saved to https://phabricator.wikimedia.org/P19716 and previous config saved to /var/cache/conftool/dbconfig/20220201-062111-marostegui.json
06:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P19715 and previous config saved to /var/cache/conftool/dbconfig/20220201-061540-marostegui.json
06:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: repooling', diff saved to https://phabricator.wikimedia.org/P19714 and previous config saved to /var/cache/conftool/dbconfig/20220201-061142-root.json
06:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19713 and previous config saved to /var/cache/conftool/dbconfig/20220201-060035-marostegui.json
05:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3317 (T298558)', diff saved to https://phabricator.wikimedia.org/P19712 and previous config saved to /var/cache/conftool/dbconfig/20220201-055921-marostegui.json
05:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
05:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1098.eqiad.wmnet with reason: Maintenance
05:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: repooling', diff saved to https://phabricator.wikimedia.org/P19711 and previous config saved to /var/cache/conftool/dbconfig/20220201-055638-root.json
05:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3312 (T298558)', diff saved to https://phabricator.wikimedia.org/P19710 and previous config saved to /var/cache/conftool/dbconfig/20220201-055327-marostegui.json
05:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
05:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1105.eqiad.wmnet with reason: Maintenance
05:08 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet2004-dev.codfw.wmnet with OS bullseye
03:37 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2004-dev.codfw.wmnet with OS bullseye
03:36 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudnet2004-dev.codfw.wmnet with OS bullseye
02:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
02:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
02:18 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudnet2004-dev.codfw.wmnet with OS bullseye
02:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
02:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
01:48 ryankemper: T282117 Merged https://gerrit.wikimedia.org/r/c/operations/dns/+/717606 and successfully ran `sudo -i authdns-update` on `authdns1001`. `commons-query.wikimedia.org` is online now. (sidenote: go-live date of service is 2022-02-01)
01:42 ryankemper: T299222 `ryankemper@cumin1001:~$ sudo cumin 'wcqs*' 'sudo rm -fv /etc/default/wcqs-updater'`
01:42 ryankemper: T299222 `ryankemper@cumin1001:~$ sudo cumin 'wdqs*' 'sudo rm -fv /etc/default/wdqs-updater'`
01:25 ryankemper: T299222 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/757124; running puppet on `w*qs*` before purging old filepaths
00:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:24 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Enable Local upload on ptwikinews (T300466) (duration: 00m 50s)
00:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:18 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there are no relevant criticals in Icinga, and Grafana looks good
00:11 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Lower The Wikipedia Library extension edit count (T288070) (duration: 00m 50s)
00:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn
00:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn
00:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

