Server Admin Log/Archive 65

2023-04-30

14:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2184.codfw.wmnet with reason: Host down T335640
14:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db2184.codfw.wmnet with reason: Host down T335640
08:06 elukey: powercycle ores1002 (mgmt console tty not usable, host frozen)

2023-04-29

23:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1132.eqiad.wmnet with reason: Maint
23:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1132.eqiad.wmnet with reason: Maint
22:54 rzl@cumin2002: dbctl commit (dc=all): 'Depool db1132', diff saved to https://phabricator.wikimedia.org/P47290 and previous config saved to /var/cache/conftool/dbconfig/20230429-225457-rzl.json

2023-04-28

22:46 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:46 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entries for new frack nodes - pt1979@cumin2002"
22:33 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entries for new frack nodes - pt1979@cumin2002"
22:31 pt1979@cumin2002: START - Cookbook sre.dns.netbox
21:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1001.wikimedia.org with reason: setup
21:53 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit1001.wikimedia.org with reason: setup
20:25 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit2002.wikimedia.org with reason: setup
20:25 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit2002.wikimedia.org with reason: setup
20:24 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1003.wikimedia.org with reason: setup
20:24 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit1003.wikimedia.org with reason: setup
20:24 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gerrit1001.wikimedia.org with reason: setup
20:24 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on gerrit1001.wikimedia.org with reason: setup
19:20 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
19:20 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
19:16 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
19:16 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
19:10 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
19:10 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
19:07 otto@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
19:07 otto@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
17:50 htriedman@deploy1002: Finished deploy [airflow-dags/platform_eng@d56b7fb]: (no justification provided) (duration: 00m 10s)
17:50 htriedman@deploy1002: Started deploy [airflow-dags/platform_eng@d56b7fb]: (no justification provided)
15:39 elukey@deploy1002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
15:23 jynus: update schema for backup1-codfw (mediabackups) T327157
15:07 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 2519
15:01 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 2519
14:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['stat1004']
14:57 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['stat1004']
14:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['stat1004']
14:50 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['stat1004']
13:21 vgutierrez: import haproxy 2.7.7 on apt.wm.o thirdparty/haproxy27 for bullseye
12:36 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
12:35 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
12:35 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
12:34 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
12:31 akosiaris@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
12:30 akosiaris@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
12:29 akosiaris@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
12:29 akosiaris@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
12:08 jclark@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new server sretest1003 - jclark@cumin1001"
12:06 jclark@cumin1001: START - Cookbook sre.dns.netbox
10:43 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2003.codfw.wmnet with OS bullseye
10:28 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2003.codfw.wmnet with reason: host reimage
10:25 elukey@deploy1002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
10:25 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2003.codfw.wmnet with reason: host reimage
10:13 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2002.codfw.wmnet with OS bullseye
10:11 klausman@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2003.codfw.wmnet with OS bullseye
10:01 vgutierrez: restarting varnish on cp5017 and cp5025 to drop port 80 - T322774
09:58 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2002.codfw.wmnet with reason: host reimage
09:55 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2002.codfw.wmnet with reason: host reimage
09:42 klausman@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2002.codfw.wmnet with OS bullseye
09:31 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2001.codfw.wmnet with OS bullseye
09:24 elukey@deploy1002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
09:13 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2001.codfw.wmnet with reason: host reimage
09:11 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2001.codfw.wmnet with reason: host reimage
08:57 klausman@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2001.codfw.wmnet with OS bullseye
08:47 jnuche@deploy1002: Installing scap version "4.51.0" for 593 hosts
08:29 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2003.codfw.wmnet with OS buster
08:23 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1002.eqiad.wmnet with OS bookworm
08:14 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2003.codfw.wmnet with reason: host reimage
08:11 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2003.codfw.wmnet with reason: host reimage
07:57 klausman@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2003.codfw.wmnet with OS buster
07:55 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2002.codfw.wmnet with OS buster
07:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
07:44 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
07:41 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2002.codfw.wmnet with reason: host reimage
07:37 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2002.codfw.wmnet with reason: host reimage
07:30 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
07:23 klausman@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2002.codfw.wmnet with OS buster
07:22 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache2001.codfw.wmnet with OS buster
07:04 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache2001.codfw.wmnet with reason: host reimage
07:00 klausman@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache2001.codfw.wmnet with reason: host reimage
06:46 klausman@cumin2002: START - Cookbook sre.hosts.reimage for host ml-cache2001.codfw.wmnet with OS buster
05:57 XioNoX: push pfw policies - T335554
05:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.debug (exit_code=0) for Netbox circuit ID 112
05:29 ayounsi@cumin1001: START - Cookbook sre.network.debug for Netbox circuit ID 112
05:28 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 393731
05:27 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 393731
04:08 eileen: config revision changed from b33fa934 to 2eef4039
03:16 ejegg: SmashPig upgraded from db9fa965 to a9fa7a2c
03:08 ejegg: payments-wiki upgraded from 91582d93 to 61951572
03:05 eileen: config revision changed from 98f2afbb to b33fa934
02:55 eileen: civicrm upgraded from b4a05476 to e7904ea6
02:13 eileen: civicrm upgraded from 601d223e to b4a05476

2023-04-27

22:17 zabe@deploy1002: Finished scap: T334295 (duration: 06m 58s)
22:10 zabe@deploy1002: Started scap: T334295
20:29 TheresNoTime: close UTC late backport window
20:27 samtar@deploy1002: Finished scap: Backport for [cawikisource] Add a wordmark (Vector 2022) (T331823), [cawiktionary] Add a wordmark (Vector 2022) (T331823) (duration: 07m 19s)
20:21 samtar@deploy1002: superpes and samtar: Backport for [cawikisource] Add a wordmark (Vector 2022) (T331823), [cawiktionary] Add a wordmark (Vector 2022) (T331823) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
20:20 samtar@deploy1002: Started scap: Backport for [cawikisource] Add a wordmark (Vector 2022) (T331823), [cawiktionary] Add a wordmark (Vector 2022) (T331823)
20:20 samtar@deploy1002: Finished scap: Backport for [cawikibooks] Add a wordmark (Vector 2022) (T331823), [cawikinews] Add a wordmark (Vector 2022) (T331823), [cawikiquote] Add a wordmark (Vector 2022) (T331823) (duration: 09m 43s)
20:11 samtar@deploy1002: samtar and superpes: Backport for [cawikibooks] Add a wordmark (Vector 2022) (T331823), [cawikinews] Add a wordmark (Vector 2022) (T331823), [cawikiquote] Add a wordmark (Vector 2022) (T331823) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
20:10 samtar@deploy1002: Started scap: Backport for [cawikibooks] Add a wordmark (Vector 2022) (T331823), [cawikinews] Add a wordmark (Vector 2022) (T331823), [cawikiquote] Add a wordmark (Vector 2022) (T331823)
19:27 xcollazo@deploy1002: Finished deploy [airflow-dags/platform_eng@bc37201]: (no justification provided) (duration: 00m 10s)
19:27 ejegg: payments-wiki upgraded from 7fa25437 to 91582d93
19:27 xcollazo@deploy1002: Started deploy [airflow-dags/platform_eng@bc37201]: (no justification provided)
19:16 xcollazo@deploy1002: Finished deploy [airflow-dags/platform_eng@f162f4d]: Deploying T333001 on platform_eng Airflow instance. (duration: 12m 01s)
19:04 xcollazo@deploy1002: Started deploy [airflow-dags/platform_eng@f162f4d]: Deploying T333001 on platform_eng Airflow instance.
18:47 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.6 refs T330212
18:37 jhuneidi@deploy1002: Finished scap: Backport for Replace references to actionsToolbar (T335469) (duration: 16m 10s)
18:22 jhuneidi@deploy1002: jhuneidi and jforrester: Backport for Replace references to actionsToolbar (T335469) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
18:21 jhuneidi@deploy1002: Started scap: Backport for Replace references to actionsToolbar (T335469)
17:51 herron@cumin1001: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=1) for host kafkamon1003.eqiad.wmnet with OS bullseye
17:39 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
17:35 herron@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafkamon1003.eqiad.wmnet with reason: host reimage
17:27 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@5a46db1] (releasing): (no justification provided) (duration: 00m 40s)
17:27 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@5a46db1] (releasing): (no justification provided)
17:14 hnowlan@deploy1002: Finished deploy [restbase/deploy@a08f56d]: Deploying new wikis: T333272 T334460 T334741 T335020 (duration: 03m 29s)
17:11 hnowlan@deploy1002: Started deploy [restbase/deploy@a08f56d]: Deploying new wikis: T333272 T334460 T334741 T335020
17:06 mutante: deploy2002 - armed the keyholder (sudo keyholder arm and enter passphrase from deployment-key-passphrase in pwstore) - monitoring alert should resolve - T335435
17:01 herron@cumin1001: START - Cookbook sre.ganeti.reimage for host kafkamon1003.eqiad.wmnet with OS bullseye
16:56 volans: uploaded python3-wmflib_1.2.2 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia,bookworm-wikimedia
16:20 herron@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host kafkamon1003.eqiad.wmnet
16:20 herron@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kafkamon1003.eqiad.wmnet - herron@cumin1001"
16:19 herron@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM kafkamon1003.eqiad.wmnet - herron@cumin1001"
16:05 herron@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) kafkamon1003.eqiad.wmnet on all recursors
16:05 herron@cumin1001: START - Cookbook sre.dns.wipe-cache kafkamon1003.eqiad.wmnet on all recursors
16:05 herron@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:05 herron@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kafkamon1003.eqiad.wmnet - herron@cumin1001"
16:01 herron@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM kafkamon1003.eqiad.wmnet - herron@cumin1001"
15:59 herron@cumin1001: START - Cookbook sre.dns.netbox
15:59 herron@cumin1001: START - Cookbook sre.ganeti.makevm for new host kafkamon1003.eqiad.wmnet
15:58 vgutierrez: restarting varnish on cp5018 and cp5026 to drop port 80 - T322774
15:55 jbond: upload puppetboard_4.3.0-1_all.deb to bookworm-wikimedia
15:37 legoktm@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
15:35 legoktm@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
15:35 legoktm@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
15:35 legoktm@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
15:34 legoktm@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
15:34 legoktm@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
15:34 legoktm@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
15:33 legoktm@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
15:33 legoktm@deploy1002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
15:32 legoktm@deploy1002: helmfile [codfw] START helmfile.d/services/shellbox: apply
15:29 krinkle@deploy1002: Synchronized wmf-config/mc.php: Ia174ea2b0645 (duration: 06m 05s)
15:25 legoktm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
15:23 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:22 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
15:22 legoktm@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
15:22 legoktm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
15:22 legoktm@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
15:22 legoktm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
15:21 claime: repooled mw2331.codfw.wmnet - T335486
15:21 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2331.codfw.wmnet
15:21 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for mw2331.codfw.wmnet
15:21 legoktm@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
15:21 legoktm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
15:21 legoktm@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
15:20 legoktm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
15:20 legoktm@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
15:18 legoktm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
15:17 legoktm@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
15:14 legoktm@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
15:13 legoktm@deploy1002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
15:10 vgutierrez: restarting varnish on cp5019 and cp5027 to drop port 80 - T322774
15:01 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:59 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
14:58 claime: repooling mw2330.codfw.wmnet - T335487
14:58 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2330.codfw.wmnet
14:58 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for mw2330.codfw.wmnet
14:56 Lucas_WMDE: UTC afternoon backport+config window done
14:55 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for Add language codes cal and tpv to wmgExtraLanguageNames (T308062) (duration: 07m 55s)
14:49 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde and noa: Backport for Add language codes cal and tpv to wmgExtraLanguageNames (T308062) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
14:47 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for Add language codes cal and tpv to wmgExtraLanguageNames (T308062)
14:46 ejegg: payments-wiki upgraded from f30bc859 to 7fa25437
14:46 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for lowiki: Use Western style (0-9) numerals (T335345) (duration: 08m 53s)
14:38 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde and stang: Backport for lowiki: Use Western style (0-9) numerals (T335345) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
14:37 ejegg: disabled fundraising job ingenico_recurring_fill_scheme_ids (it's all done)
14:37 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for lowiki: Use Western style (0-9) numerals (T335345)
14:36 vgutierrez: restarting varnish on cp5020 and cp5028 to drop port 80 - T322774
14:35 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for Close cnwikimedia (T274083) (duration: 11m 05s)
14:29 legoktm@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
14:28 legoktm@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
14:28 legoktm@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
14:28 legoktm@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
14:27 legoktm@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
14:27 legoktm@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
14:27 legoktm@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
14:26 legoktm@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
14:26 legoktm@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
14:26 legoktm@deploy1002: helmfile [staging] START helmfile.d/services/shellbox: apply
14:25 legoktm@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
14:25 lucaswerkmeister-wmde@deploy1002: lucaswerkmeister-wmde and stang: Backport for Close cnwikimedia (T274083) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
14:25 legoktm@deploy1002: helmfile [staging] START helmfile.d/services/shellbox: apply
14:25 moritzm: restarting apache/FPM on mw canaries to pick up curl update
14:24 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for Close cnwikimedia (T274083)
14:20 lucaswerkmeister-wmde@deploy1002: Finished scap: Backport for labtestwiki: disable cirrus completion index (duration: 09m 31s)
14:13 moritzm: installing curl security updates on buster
14:12 lucaswerkmeister-wmde@deploy1002: dcausse and lucaswerkmeister-wmde: Backport for labtestwiki: disable cirrus completion index synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
14:11 lucaswerkmeister-wmde@deploy1002: Started scap: Backport for labtestwiki: disable cirrus completion index
14:09 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache1003.eqiad.wmnet with OS bullseye
14:05 samtar@deploy1002: Finished scap: Backport for Enable $wgCampaignEventsEnableMultipleOrganizers in production (T334088) (duration: 38m 35s)
14:00 legoktm@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
13:59 legoktm@deploy1002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
13:59 legoktm@deploy1002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
13:59 legoktm@deploy1002: helmfile [staging] START helmfile.d/services/shellbox: apply
13:48 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache1003.eqiad.wmnet with reason: host reimage
13:45 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache1003.eqiad.wmnet with reason: host reimage
13:35 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
13:33 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-cache1003.eqiad.wmnet with OS bullseye
13:31 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
13:30 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
13:28 samtar@deploy1002: samtar and cmelo: Backport for Enable $wgCampaignEventsEnableMultipleOrganizers in production (T334088) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
13:26 samtar@deploy1002: Started scap: Backport for Enable $wgCampaignEventsEnableMultipleOrganizers in production (T334088)
13:20 samtar@deploy1002: Finished scap: Backport for metawiki: Give campaignevents-organize-events to campaignevents-beta-tester only (T334088) (duration: 15m 07s)
13:20 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
13:06 samtar@deploy1002: samtar and cmelo: Backport for metawiki: Give campaignevents-organize-events to campaignevents-beta-tester only (T334088) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
13:05 samtar@deploy1002: Started scap: Backport for metawiki: Give campaignevents-organize-events to campaignevents-beta-tester only (T334088)
13:04 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache1002.eqiad.wmnet with OS bullseye
12:56 vgutierrez: restarting varnish on cp5021 and cp5029 to drop port 80 - T322774
12:43 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache1002.eqiad.wmnet with reason: host reimage
12:40 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache1002.eqiad.wmnet with reason: host reimage
12:29 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-cache1002.eqiad.wmnet with OS bullseye
12:27 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-cache1001.eqiad.wmnet with OS bullseye
12:12 moritzm: imported puppet 5.5.22-2+deb13u3 to bookworm-wikimedia T330495
11:56 jbond: upload python3-pypuppetdb_3.1.0-1_all.deb to bookworm
11:47 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 23951
11:44 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 23951
11:44 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 54994
11:41 krinkle@deploy1002: Synchronized wmf-config/: I195978 (duration: 06m 29s)
11:14 hnowlan@puppetmaster1001: conftool action : set/weight=6; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
11:13 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 54994
11:09 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
11:09 vgutierrez: restarting varnish on cp5022 and cp5030 to drop port 80 - T322774
11:07 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
11:03 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
11:00 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
10:59 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
10:59 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
10:33 vgutierrez: restarting varnish on cp5023 and cp5031 to drop port 80 - T322774
10:24 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-cache1001.eqiad.wmnet with reason: host reimage
10:20 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-cache1001.eqiad.wmnet with reason: host reimage
10:09 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-cache1001.eqiad.wmnet with OS bullseye
10:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1002.wikimedia.org
10:04 elukey@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host ml-cache1001.eqiad.wmnet with OS bullseye
10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1002.wikimedia.org
10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1001.eqiad.wmnet
09:55 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-cache1001.eqiad.wmnet with OS bullseye
09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet
09:54 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-cache1001.eqiad.wmnet with OS bullseye
09:43 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:42 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
09:42 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2331.codfw.wmnet with reason: PSU failure
09:42 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2331.codfw.wmnet with reason: PSU failure
09:41 cgoubert@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 7 days, 0:00:00 on mw2331.codfw.wmnet with reason: PSU failure
09:41 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2331.codfw.wmnet with reason: PSU failure
09:41 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2330.codfw.wmnet with reason: PSU failure
09:41 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2330.codfw.wmnet with reason: PSU failure
09:40 claime: depooling mw2330.codfw.wmnet for HW troubleshooting - T335487
09:39 godog: delete all 2023 replica=unset blocks from thanos - T335406
09:37 claime: depooling mw2331.codfw.wmnet for HW troubleshooting - T335486
09:36 vgutierrez: restarting varnish on cp5024 and cp5032 to drop port 80 - T322774
09:34 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host ml-cache1001.eqiad.wmnet with OS bullseye
09:29 moritzm: imported prometheus-rsyslog-exporter to bookworm-wikimedia T330495
09:29 moritzm: imported wmf-certificates to bookworm-wikimedia T330495
09:14 vgutierrez: restarting varnish on cp4037 and cp4045 to drop port 80 - T322774
09:11 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@fb6f0ea] (releasing): (no justification provided) (duration: 00m 40s)
09:10 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@fb6f0ea] (releasing): (no justification provided)
09:09 godog: restart thanos-compact on thanos-fe2001 - T335406
09:06 moritzm: uploaded debdeploy 0.0.99.13+deb12u1 to bookworm-wikimedia T330495
09:00 godog: delete overlapping block 01GY1CQ4EAKRV9BQ8D9JB1VWGJ from thanos - T335406
08:39 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.debug (exit_code=0) for Netbox circuit ID 112
08:39 ayounsi@cumin1001: START - Cookbook sre.network.debug for Netbox circuit ID 112
08:24 vgutierrez: restarting varnish on cp4038 and cp4046 to drop port 80 - T322774
08:22 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 199524
08:17 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 199524
07:50 apergos: UTC morning backport and config training window complete
07:45 jnuche@deploy1002: Finished scap: Backport for Hide wrong "this reference is used 0 times" in citation dialog (T241885 T335410) (duration: 08m 33s)
07:43 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 15169
07:38 jnuche@deploy1002: thiemowmde and jnuche: Backport for Hide wrong "this reference is used 0 times" in citation dialog (T241885 T335410) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
07:37 jnuche@deploy1002: Started scap: Backport for Hide wrong "this reference is used 0 times" in citation dialog (T241885 T335410)
07:31 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 15169
07:23 moritzm: uploaded debmonitor-client 0.3.2-1+deb12u1 to bookworm-wikimedia T330495
05:56 XioNoX: Configure 1:1 NAT for new fr-tech hosts - T335441
05:51 XioNoX: downgrade SGIX RS BGP sessions to non-primary
00:01 zabe@deploy1002: Finished scap: T334295 (duration: 06m 53s)

2023-04-26

23:54 zabe@deploy1002: Started scap: T334295
23:32 zabe@deploy1002: Finished scap: Backport for Fix `a.image:not(.noviewer,.metadata),a.thumbimage:not(.noviewer,.metadata)' is not a valid selector` bug (T335451) (duration: 07m 07s)
23:26 zabe@deploy1002: zabe and nray: Backport for Fix `a.image:not(.noviewer,.metadata),a.thumbimage:not(.noviewer,.metadata)' is not a valid selector` bug (T335451) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
23:25 zabe@deploy1002: Started scap: Backport for Fix `a.image:not(.noviewer,.metadata),a.thumbimage:not(.noviewer,.metadata)' is not a valid selector` bug (T335451)
22:06 samtar@deploy1002: Finished scap: Backport for interwiki: update URL to XTools (duration: 09m 43s)
21:57 samtar@deploy1002: musikanimal and samtar: Backport for interwiki: update URL to XTools synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
21:56 samtar@deploy1002: Started scap: Backport for interwiki: update URL to XTools
21:39 brett: Re-enable Puppet on LVS[4008-4010] - T263797
21:02 bblack@cumin1001: conftool action : set/pooled=yes; selector: service=labweb-ssl
21:00 bblack@cumin1001: conftool action : set/pooled=yes; selector: service=labweb
20:37 jhuneidi@deploy1002: Finished scap: Backport for Set Vector 2022 as default skin on Polish Wikipedia (T335311) (duration: 09m 22s)
20:29 jhuneidi@deploy1002: jhuneidi and jdrewniak: Backport for Set Vector 2022 as default skin on Polish Wikipedia (T335311) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
20:28 jhuneidi@deploy1002: Started scap: Backport for Set Vector 2022 as default skin on Polish Wikipedia (T335311)
19:47 brett: Disable Puppet on LVS[4008-4010] for rollout of LVS maglev hashing scheduler - T263797
19:16 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@eb07d71]: fetch_conda: path globs must not be quoted (duration: 00m 27s)
19:15 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@eb07d71]: fetch_conda: path globs must not be quoted
19:10 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@5f2ec35]: repoint shebang lines of conda env (duration: 00m 23s)
19:10 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@5f2ec35]: repoint shebang lines of conda env
18:34 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@ba52b43]: replace python env deployment method with conda env from gitlab (duration: 00m 24s)
18:33 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@ba52b43]: replace python env deployment method with conda env from gitlab
18:16 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.41.0-wmf.6 refs T330212 (duration: 06m 04s)
18:10 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.41.0-wmf.6 refs T330212
17:37 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp5016
17:37 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:37 robh@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5016 decommissioned, removing all IPs except the asset tag one - robh@cumin1001"
17:35 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 36351
17:31 robh@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5016 decommissioned, removing all IPs except the asset tag one - robh@cumin1001"
17:29 robh@cumin1001: START - Cookbook sre.dns.netbox
17:24 robh@cumin1001: START - Cookbook sre.hosts.decommission for hosts cp5016
17:23 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp5015
17:23 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:23 robh@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5015 decommissioned, removing all IPs except the asset tag one - robh@cumin1001"
17:22 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 36351
17:21 robh@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5015 decommissioned, removing all IPs except the asset tag one - robh@cumin1001"
17:17 robh@cumin1001: START - Cookbook sre.dns.netbox
17:13 robh@cumin1001: START - Cookbook sre.hosts.decommission for hosts cp5015
17:12 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp5014
17:12 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:12 robh@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5014 decommissioned, removing all IPs except the asset tag one - robh@cumin1001"
17:11 robh@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5014 decommissioned, removing all IPs except the asset tag one - robh@cumin1001"
16:56 robh@cumin1001: START - Cookbook sre.dns.netbox
16:50 robh@cumin1001: START - Cookbook sre.hosts.decommission for hosts cp5014
16:48 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cp5013
16:48 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:48 robh@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5013 decommissioned, removing all IPs except the asset tag one - robh@cumin1001"
16:46 robh@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp5013 decommissioned, removing all IPs except the asset tag one - robh@cumin1001"
16:44 robh@cumin1001: START - Cookbook sre.dns.netbox
16:36 robh@cumin1001: START - Cookbook sre.hosts.decommission for hosts cp5013
16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest2002.codfw.wmnet with OS bullseye
16:33 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
16:31 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
16:29 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5014.eqsin.wmnet with OS bullseye
16:29 robh@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - robh@cumin1001"
16:29 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5013.eqsin.wmnet with OS bullseye
16:29 robh@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - robh@cumin1001"
16:29 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5015.eqsin.wmnet with OS bullseye
16:29 robh@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - robh@cumin1001"
16:29 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5016.eqsin.wmnet with OS bullseye
16:29 robh@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - robh@cumin1001"
16:17 vgutierrez: restarting varnish on cp4039 and cp4047 to drop port 80 - T322774
16:10 robh@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - robh@cumin1001"
16:08 robh@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - robh@cumin1001"
16:05 robh@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - robh@cumin1001"
15:51 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5016.eqsin.wmnet with reason: host reimage
15:49 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5015.eqsin.wmnet with reason: host reimage
15:46 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5014.eqsin.wmnet with reason: host reimage
15:44 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5016.eqsin.wmnet with reason: host reimage
15:44 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5015.eqsin.wmnet with reason: host reimage
15:43 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5014.eqsin.wmnet with reason: host reimage
15:43 robh@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - robh@cumin1001"
15:41 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest2002.codfw.wmnet with reason: host reimage
15:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest2002.codfw.wmnet with reason: host reimage
15:34 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1002.eqiad.wmnet with OS bookworm
15:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2002.codfw.wmnet with OS bullseye
15:21 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5013.eqsin.wmnet with reason: host reimage
15:20 htriedman@deploy1002: Finished deploy [airflow-dags/platform_eng@5061681]: (no justification provided) (duration: 00m 20s)
15:19 htriedman@deploy1002: Started deploy [airflow-dags/platform_eng@5061681]: (no justification provided)
15:18 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5013.eqsin.wmnet with reason: host reimage
15:14 robh@cumin1001: START - Cookbook sre.hosts.reimage for host cp5016.eqsin.wmnet with OS bullseye
15:14 robh@cumin1001: START - Cookbook sre.hosts.reimage for host cp5015.eqsin.wmnet with OS bullseye
15:13 robh@cumin1001: START - Cookbook sre.hosts.reimage for host cp5014.eqsin.wmnet with OS bullseye
14:45 robh@cumin1001: START - Cookbook sre.hosts.reimage for host cp5013.eqsin.wmnet with OS bullseye
14:41 vgutierrez: restarting varnish on cp4040 and cp4048 to drop port 80 - T322774
14:34 cgoubert@deploy1002: Finished scap: Backport for Revert "debug.json: List primary DC servers first" (duration: 08m 07s)
14:31 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0)
14:28 cgoubert@deploy1002: cgoubert: Backport for Revert "debug.json: List primary DC servers first" synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
14:26 cgoubert@deploy1002: Started scap: Backport for Revert "debug.json: List primary DC servers first"
14:24 cgoubert@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter Switchback - T327920 (duration: 69m 03s)
14:16 marostegui: Update dns for parsercache T327920
14:10 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters
14:08 claime: Phase 9.5 Update DNS records for new database masters
14:08 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0)
14:07 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl
14:07 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0)
14:05 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance
14:05 claime: Restarting maintenance jobs - T327920
14:04 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-envoy-on-jobrunners (exit_code=0)
14:04 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restart-envoy-on-jobrunners
14:03 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0)
14:03 cgoubert@cumin1001: MediaWiki read-only period ends at: 2023-04-26 14:03:01.527715
14:00 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.02-set-readonly
13:59 claime: Going to read-only for mediawiki datacenter switchback - T327920
13:55 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0)
13:55 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
13:54 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 1239
13:53 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 1239
13:51 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 136106
13:50 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 136106
13:47 cgoubert@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=99)
13:46 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
13:45 cgoubert@cumin1001: END (FAIL) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=99)
13:45 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
13:45 claime: Stopping maintenance scripts for datacenter switchback - T327920
13:43 vgutierrez: restarting varnish on cp4041 and cp4049 to drop port 80 - T322774
13:35 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0)
13:35 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks
13:35 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-optional-warmup-caches (exit_code=0)
13:31 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-optional-warmup-caches
13:31 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0)
13:25 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
13:25 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0)
13:25 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-disable-puppet
13:23 claime: Starting mediawiki datacenter switchback preparation - T327920
13:15 cgoubert@deploy1002: Locking from deployment [ALL REPOSITORIES]: Datacenter Switchback - T327920
13:14 claime: Locking scap for datacenter switchback - T327920
13:13 vgutierrez: restarting varnish on cp4042 and cp4050 to drop port 80 - T322774
13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
13:06 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
13:06 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
12:56 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
12:55 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cp5013.mgmt.eqsin.wmnet with reboot policy FORCED
12:52 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
12:49 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
12:49 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
12:49 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
12:40 robh@cumin1001: START - Cookbook sre.hosts.provision for host cp5013.mgmt.eqsin.wmnet with reboot policy FORCED
12:13 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@93a04bd] (releasing): (no justification provided) (duration: 00m 36s)
12:13 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@93a04bd] (releasing): (no justification provided)
12:11 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@93a04bd] (releasing): (no justification provided) (duration: 00m 33s)
12:10 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@93a04bd] (releasing): (no justification provided)
12:10 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@93a04bd] (releasing): (no justification provided) (duration: 01m 15s)
12:09 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@93a04bd] (releasing): (no justification provided)
12:03 jnuche@deploy1002: Finished deploy [releng/jenkins-deploy@93a04bd] (releasing): (no justification provided) (duration: 00m 34s)
12:03 jnuche@deploy1002: Started deploy [releng/jenkins-deploy@93a04bd] (releasing): (no justification provided)
11:37 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
11:27 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
11:25 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
11:16 moritzm: import php-excimer 1.0.2-1+wmf3+buster1+icu67 to component/icu67 T332964
11:15 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
10:54 btullis@deploy1002: Finished deploy [analytics/refinery@571f955] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@571f955] (duration: 01m 30s)
10:52 btullis@deploy1002: Started deploy [analytics/refinery@571f955] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@571f955]
10:52 btullis@deploy1002: Finished deploy [analytics/refinery@571f955] (thin): Regular analytics weekly train THIN [analytics/refinery@571f955] (duration: 02m 08s)
10:50 btullis@deploy1002: Started deploy [analytics/refinery@571f955] (thin): Regular analytics weekly train THIN [analytics/refinery@571f955]
10:49 btullis@deploy1002: Finished deploy [analytics/refinery@571f955]: Regular analytics weekly train [analytics/refinery@571f955] (duration: 05m 23s)
10:44 btullis@deploy1002: Started deploy [analytics/refinery@571f955]: Regular analytics weekly train [analytics/refinery@571f955]
10:25 vgutierrez: restarting varnish on cp4043 and cp4051 to drop port 80 - T322774
10:10 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 1828
10:07 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
10:05 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 1828
09:57 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
09:54 akosiaris@deploy1002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
09:54 akosiaris@deploy1002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
09:49 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
09:49 btullis@cumin1001: Added views for new wiki: kbdwiktionary T333270
09:31 vgutierrez: restarting varnish on cp4044 and cp4052 to drop port 80 - T322774
09:26 btullis@deploy1002: Finished deploy [analytics/refinery@571f955] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@571f955] (duration: 00m 04s)
09:26 btullis@deploy1002: Started deploy [analytics/refinery@571f955] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@571f955]
09:25 btullis@deploy1002: Finished deploy [analytics/refinery@571f955] (thin): Regular analytics weekly train THIN [analytics/refinery@571f955] (duration: 00m 05s)
09:25 btullis@deploy1002: Started deploy [analytics/refinery@571f955] (thin): Regular analytics weekly train THIN [analytics/refinery@571f955]
09:24 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
09:20 btullis@deploy1002: Finished deploy [analytics/refinery@571f955]: Regular analytics weekly train [analytics/refinery@571f955] (duration: 00m 46s)
09:19 btullis@deploy1002: Started deploy [analytics/refinery@571f955]: Regular analytics weekly train [analytics/refinery@571f955]
09:05 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db2139.codfw.wmnet with reason: T335396
09:05 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db2139.codfw.wmnet with reason: T335396
08:53 moritzm: installing golang-1.11 security updates
08:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
08:52 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
08:51 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 32934
08:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
08:40 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 32934
08:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
08:36 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
08:22 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
08:17 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
08:17 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
08:17 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
08:16 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
08:12 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
08:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
08:00 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1002.eqiad.wmnet with OS bookworm
07:41 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
07:39 Emperor: start to load new swift backends, drain old ones T335278 T335279 T335280 T335281
07:39 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1002.eqiad.wmnet with OS bookworm
07:35 elukey@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: sync
07:34 elukey@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: sync
07:33 elukey@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
07:33 elukey@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
07:32 elukey@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: sync
07:32 elukey@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: sync
07:28 taavi@deploy1002: Finished scap: Backport for Beta-Wikidata: Enable Labels in Wikidata edit summaries (T327062) (duration: 07m 48s)
07:22 taavi@deploy1002: taavi and migr: Backport for Beta-Wikidata: Enable Labels in Wikidata edit summaries (T327062) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
07:20 taavi@deploy1002: Started scap: Backport for Beta-Wikidata: Enable Labels in Wikidata edit summaries (T327062)
07:18 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 38082
07:16 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 38082
07:08 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 9584
07:06 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 9584
07:06 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 4826
07:04 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 4826
07:03 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 55818
07:01 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 55818
07:01 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 49544
07:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
06:59 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 49544
06:57 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 133840
06:56 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 133840
06:55 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4826
06:53 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4826
06:53 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 18106
06:51 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 18106
06:50 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 7552
06:49 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 7552
06:49 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45796
06:49 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 45796
06:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 140407
06:48 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 140407
06:47 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 1828
06:45 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 1828
06:45 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 38082
06:44 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 38082
06:43 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4657
06:42 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4657
06:42 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 1239
06:40 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 1239
06:40 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 36351
06:38 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 36351
06:38 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 17676
06:38 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 17676
06:37 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45498
06:37 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 45498
06:37 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 134823
06:36 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 134823
06:36 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9583
06:35 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 9583
06:35 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 24482
06:34 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 24482
06:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 137831
06:32 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 137831
06:32 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9002
06:32 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 9002
06:32 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 23951
06:31 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 23951
06:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9299
06:29 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 9299
06:29 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 8529
06:27 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 8529
06:27 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 38040
06:25 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 38040
06:25 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4651
06:24 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4651
06:24 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 132132
06:23 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 132132
06:23 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 58552
06:21 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 58552
06:21 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 23947
06:20 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 23947
06:20 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 17961
06:20 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 17961
06:20 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 54994
06:19 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 54994
06:19 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 55818
06:18 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 55818
06:18 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9009
06:17 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 9009
06:17 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4773
06:16 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4773
06:15 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 133840
06:15 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 133840
06:15 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 140951
06:14 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 140951
06:14 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4761
06:14 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4761
06:14 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 49544
06:12 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 49544
06:12 ayounsi@cumin1001: END (ERROR) - Cookbook sre.network.peering (exit_code=97) with action 'email' for AS: 6939
06:12 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 6939
06:12 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 136907
06:11 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 136907
06:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4775
06:10 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4775
06:10 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 199524
06:09 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 199524
06:09 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 23824
06:08 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 23824
06:08 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 18403
06:07 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 18403
06:07 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 136106
06:06 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 136106
06:06 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 35280
06:04 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 35280
06:03 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 10089
06:02 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 10089
06:02 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 906
06:01 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 906
06:01 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9584
06:01 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 9584
06:00 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 139836
05:59 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 139836
05:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 10030
05:58 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 10030
05:58 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 38158
05:57 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 38158
05:57 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 63199
05:56 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 63199
05:55 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 131285
05:54 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 131285
05:54 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 2518
05:53 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 2518
05:53 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 55967
05:52 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 55967
05:52 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 2519
05:51 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 2519
05:51 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45430
05:50 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 45430
05:46 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 703
05:46 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 703
05:45 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 703
05:44 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 703
05:44 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 703
05:43 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 703
05:33 XioNoX: bounce SGIX RS BGP - T327284
05:21 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 59369
05:20 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 59369
05:19 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 59360
05:19 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 59360
05:19 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 59360
05:19 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 59360
04:55 eileen: civicrm upgraded from 2bc9f372 to 601d223e

2023-04-25

21:40 mutante: gerrit1003 - chown -R gerrit2:gerrit2 /var/lib/gerrit2/review_site/ - T326368
21:19 mutante: gerrit1003 - chown -R gerrit2:gerrit2 /srv/gerrit T333143 T326368
21:17 mutante: gerrit1003 - mv /srv/gerrit/plugins/lfs /srv/gerrit/data/ T333143 T326368
21:14 mutante: gerrit1003 - manually replacing deploy2002 with deploy1002 in /srv/deployment/gerrit/gerrit-cache/.config to fix initial scap deployment T257317 T326368
21:12 mutante: once again running into T257317 when applying gerrit role to new hardware
21:06 mutante: adding production gerrit role to new machine gerrit1003 - monitoring downtimed - but it has a service IP that is going to be added by this and can't be downtimed? (Bug: T326368)
21:04 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on gerrit1003.wikimedia.org with reason: setup
21:04 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on gerrit1003.wikimedia.org with reason: setup
19:48 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
19:48 bking@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on wdqs2012.codfw.wmnet with reason: attempting WDQS stack on bullseye
19:48 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2006.codfw.wmnet
19:48 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs2006.codfw.wmnet
19:46 bking@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wdqs2009.codfw.wmnet
19:46 bking@cumin1001: START - Cookbook sre.hosts.remove-downtime for wdqs2009.codfw.wmnet
19:46 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on wdqs2006.codfw.wmnet with reason: attempting WDQS stack on bullseye
19:46 bking@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on wdqs2006.codfw.wmnet with reason: attempting WDQS stack on bullseye
19:23 inflatador: bking@cumin1001 finishing WDQS deploy...restarting `wdqs-categories` across lvs-managed hosts
18:57 bking@deploy1002: Finished deploy [wdqs/wdqs@0e051d8]: 0.3.123 (duration: 17m 29s)
18:39 bking@deploy1002: Started deploy [wdqs/wdqs@0e051d8]: 0.3.123
18:18 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.6 refs T330212
16:55 ejegg: payments-wiki upgraded from 2a4c450d to f30bc859
15:39 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
15:39 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
15:34 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
15:33 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
15:32 akosiaris@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
15:31 akosiaris@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
15:31 akosiaris@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
15:31 akosiaris@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
15:30 dancy@deploy1002: Installation of scap version "4.50.0" completed for 1 hosts
15:30 dancy@deploy1002: Installing scap version "4.50.0" for 1 hosts
15:28 XioNoX: update cr2-eqsin BBIX interface
15:27 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
15:27 btullis@cumin1001: Added views for new wiki: azwikimedia T330442
15:25 dancy@deploy1002: Installing scap version "4.50.0" for 592 hosts
15:24 cgoubert@cumin2002: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool restbase-async in eqiad: T335015
15:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on irc2002.wikimedia.org with reason: Non-functional, WIP for Bullseye update
15:22 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on irc2002.wikimedia.org with reason: Non-functional, WIP for Bullseye update
15:22 claime: Datacenter Service Switchback concluded - T335015
15:21 cgoubert@deploy1002: Synchronized README: check the deployment server after switchback - T335015 (duration: 19m 55s)
15:19 cgoubert@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase-async.discovery.wmnet on all recursors
15:19 cgoubert@cumin2002: START - Cookbook sre.dns.wipe-cache restbase-async.discovery.wmnet on all recursors
15:19 cgoubert@cumin2002: START - Cookbook sre.discovery.service-route depool restbase-async in eqiad: T335015
15:19 claime: Restoring restbase-async to codfw only - T335015
15:18 cgoubert@deploy1002: Finished deploy [restbase/deploy@a08f56d]: (no justification provided) (duration: 13m 06s)
15:08 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1002.eqiad.wmnet with OS bullseye
15:05 cgoubert@deploy1002: Started deploy [restbase/deploy@a08f56d]: (no justification provided)
15:02 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
15:02 inflatador: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
15:02 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
15:02 btullis@cumin1001: Added views for new wiki: vewikimedia T330704
15:01 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
15:01 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
15:01 btullis@cumin1001: Added views for new wiki: ckbwiktionary T331834
15:01 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
15:01 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
15:00 btullis@cumin1001: Added views for new wiki: fatwiki T335018
15:00 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
15:00 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
15:00 btullis@cumin1001: Added views for new wiki: kcgwiktionary T334739
15:00 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
14:59 btullis@cumin1001: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0)
14:59 btullis@cumin1001: Added views for new wiki: guwwikinews T334408
14:59 btullis@cumin1001: START - Cookbook sre.wikireplicas.add-wiki
14:58 bking@deploy1002: Finished deploy [wdqs/wdqs@0e051d8]: 0.3.123 (duration: 07m 38s)
14:54 cgoubert@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter Service Switchback - T335015 (duration: 81m 19s)
14:51 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
14:50 bking@deploy1002: Started deploy [wdqs/wdqs@0e051d8]: 0.3.123
14:48 herron@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
14:45 claime: Running authdns-update - T335015
14:45 inflatador: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.123`. Pre-deploy tests passing on canary `wdqs1003`
14:44 claime: Switch deployment server back to eqiad - T335015
14:43 claime: All active/active services repooled in codfw - T335015
14:43 cgoubert@cumin1001: END (FAIL) - Cookbook sre.discovery.datacenter (exit_code=93) pool all active/active services in codfw: Datacenter Services Switchback - T335015
14:36 herron@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS bullseye
14:35 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging1001.eqiad.wmnet with OS bullseye
14:26 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: Datacenter Services Switchback - T335015
14:26 claime: All services pooled in eqiad, all depooled in codfw, proceeding with repooling active/active services in codfw - T335015
14:25 cgoubert@cumin1001: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) status all services in all: None - None
14:25 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter status all services in all: None - None
14:25 cgoubert@cumin1001: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all services in codfw: Datacenter Services Switchback - T335015
14:19 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Services Switchback - T335015
14:19 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
14:18 cgoubert@cumin1001: END (ERROR) - Cookbook sre.discovery.datacenter (exit_code=93) depool all services in codfw: Datacenter Services Switchback - T335015
14:16 herron@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1001.eqiad.wmnet with reason: host reimage
14:04 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Services Switchback - T335015
14:04 cgoubert@cumin1001: END (ERROR) - Cookbook sre.discovery.datacenter (exit_code=93) depool all services in codfw: Datacenter Services Switchback - T335015
14:02 herron@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging1001.eqiad.wmnet with OS bullseye
14:01 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Services Switchback - T335015
14:00 claime: Starting Datacenter Services Switchback - T335015
13:53 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-worker1002.eqiad.wmnet
13:47 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-test-worker1002.eqiad.wmnet
13:33 cgoubert@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter Service Switchback - T335015
13:30 inflatador: bking@cumin1001 transfer.py wdqs2009.codfw.wmnet:/srv/wdqs wdqs2022.codfw.wmnet:/srv/wdqs
13:26 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wdqs2009.codfw.wmnet with reason: attempting WDQS stack on bullseye
13:26 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wdqs2009.codfw.wmnet with reason: attempting WDQS stack on bullseye
13:06 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
13:05 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
13:05 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
13:05 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
13:04 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
13:03 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
13:03 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
13:02 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
11:44 jmm@cumin2002: END (PASS) - Cookbook sre.o11y.roll-restart-reboot-thanos-fe (exit_code=0) rolling restart_daemons on A:thanos-fe
11:40 jmm@cumin2002: START - Cookbook sre.o11y.roll-restart-reboot-thanos-fe rolling restart_daemons on A:thanos-fe
10:58 cgoubert@cumin1001: conftool action : set/weight=20; selector: name=mw2394.codfw.wmnet
10:57 cgoubert@cumin1001: conftool action : set/weight=20; selector: name=mw2395.codfw.wmnet
10:57 cgoubert@cumin1001: conftool action : set/weight=20; selector: name=mw2410.codfw.wmnet
10:56 cgoubert@cumin1001: conftool action : set/weight=20; selector: name=mw2411.codfw.wmnet
10:52 cgoubert@cumin1001: conftool action : set/weight=25; selector: dc=codfw,cluster=videoscaler,service=canary
10:52 cgoubert@cumin1001: conftool action : set/weight=25; selector: dc=codfw,cluster=jobrunner,service=canary
10:21 moritzm: installing libxml2 security updates on bullseye
09:34 moritzm: upgrade php-excimer on remaining mediawiki hosts to 1.0.2-1+wmf3+buster1 (which rebases Excimer to 1.1.1) T332964
08:51 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1003.eqiad.wmnet
08:43 mvernon@cumin1001: START - Cookbook sre.hosts.reboot-single for host thanos-be1003.eqiad.wmnet
07:53 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1002.eqiad.wmnet with OS bookworm
07:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
06:12 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 46887
06:12 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 46887
06:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 4557
06:11 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 4557
04:08 ejegg: re-enabled fundraising scheduled jobs
04:07 ejegg: civicrm upgraded from fa5265bf to 2bc9f372
03:55 ejegg: civicrm upgraded from 14644f30 to fa5265bf
03:52 mwpresync@deploy2002: Pruned MediaWiki: 1.41.0-wmf.4 (duration: 02m 06s)
03:50 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.41.0-wmf.6 refs T330212 (duration: 48m 05s)
03:16 eileen: config revision changed from 554bb874 to d1462a30
03:02 mwpresync@deploy2002: Started scap: testwikis wikis to 1.41.0-wmf.6 refs T330212

2023-04-24

23:15 eileen: civicrm upgraded from c17c8db2 to 26150ed4
22:00 eileen: civicrm upgraded from 3466c2d3 to c17c8db2
20:53 cjming: end of UTC late backport window
20:52 cjming@deploy2002: Finished scap: Backport for Fix InvalidCharacterError: Failed to execute 'add' on 'DOMTokenList' (T335149) (duration: 11m 25s)
20:42 cjming@deploy2002: cjming and nray: Backport for Fix InvalidCharacterError: Failed to execute 'add' on 'DOMTokenList' (T335149) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
20:42 htriedman@deploy2002: Finished deploy [airflow-dags/platform_eng@6e76561]: (no justification provided) (duration: 00m 23s)
20:41 htriedman@deploy2002: Started deploy [airflow-dags/platform_eng@6e76561]: (no justification provided)
20:41 cjming@deploy2002: Started scap: Backport for Fix InvalidCharacterError: Failed to execute 'add' on 'DOMTokenList' (T335149)
20:38 cjming@deploy2002: Finished scap: Backport for [fywiki] Add portal and portal talk namespace (T334807) (duration: 07m 26s)
20:32 cjming@deploy2002: cjming and superpes: Backport for [fywiki] Add portal and portal talk namespace (T334807) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
20:30 cjming@deploy2002: Started scap: Backport for [fywiki] Add portal and portal talk namespace (T334807)
20:28 cjming@deploy2002: Finished scap: Backport for [guwwikinews] Add a HD logo for vector legacy (T335162) (duration: 07m 22s)
20:22 cjming@deploy2002: superpes and cjming: Backport for [guwwikinews] Add a HD logo for vector legacy (T335162) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
20:21 cjming@deploy2002: Started scap: Backport for [guwwikinews] Add a HD logo for vector legacy (T335162)
20:19 cjming@deploy2002: Finished scap: Backport for [kcgwiktionary] Add a HD logo for vector legacy (T335162) (duration: 07m 51s)
20:13 cjming@deploy2002: superpes and cjming: Backport for [kcgwiktionary] Add a HD logo for vector legacy (T335162) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
20:11 cjming@deploy2002: Started scap: Backport for [kcgwiktionary] Add a HD logo for vector legacy (T335162)
19:45 eoghan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on aphlict1001.eqiad.wmnet with reason: aphlict1002 is now active for testing
19:42 eoghan@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on aphlict1001.eqiad.wmnet with reason: aphlict1002 is now active for testing
19:29 eoghan@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) aphlict.discovery.wmnet on all recursors
19:29 eoghan@cumin1001: START - Cookbook sre.dns.wipe-cache aphlict.discovery.wmnet on all recursors
18:44 bking@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
17:51 wfan: payments-wiki upgraded from a6288840 to 2a4c450d
17:43 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cp5013.mgmt.eqsin.wmnet with reboot policy FORCED
17:36 robh@cumin1001: START - Cookbook sre.hosts.provision for host cp5013.mgmt.eqsin.wmnet with reboot policy FORCED
17:35 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cp5015.mgmt.eqsin.wmnet with reboot policy FORCED
17:32 robh@cumin1001: START - Cookbook sre.hosts.provision for host cp5015.mgmt.eqsin.wmnet with reboot policy FORCED
17:26 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cp5014.mgmt.eqsin.wmnet with reboot policy FORCED
17:20 robh@cumin1001: START - Cookbook sre.hosts.provision for host cp5014.mgmt.eqsin.wmnet with reboot policy FORCED
17:19 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cp5013.mgmt.eqsin.wmnet with reboot policy FORCED
17:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
17:04 robh@cumin1001: START - Cookbook sre.hosts.provision for host cp5013.mgmt.eqsin.wmnet with reboot policy FORCED
17:03 jhancock@cumin2002: START - Cookbook sre.dns.netbox
16:54 jhancock@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
16:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
16:39 robh@cumin1001: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5016
16:37 robh@cumin1001: START - Cookbook sre.network.configure-switch-interfaces for host cp5016
16:37 robh@cumin1001: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5015
16:36 robh@cumin1001: START - Cookbook sre.network.configure-switch-interfaces for host cp5015
16:36 robh@cumin1001: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5014
16:35 robh@cumin1001: START - Cookbook sre.network.configure-switch-interfaces for host cp5014
16:35 robh@cumin1001: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cp5013
16:34 robh@cumin1001: START - Cookbook sre.network.configure-switch-interfaces for host cp5013
15:48 ejegg: payments-wiki upgraded from 25d867dc to a6288840
15:14 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:14 robh@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: old cp server work - robh@cumin1001"
15:11 robh@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: old cp server work - robh@cumin1001"
15:09 robh@cumin1001: START - Cookbook sre.dns.netbox
15:09 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:09 robh@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: old cp server work - robh@cumin1001"
15:08 vgutierrez: restarting haproxy on cp3064 - T334448
15:07 robh@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: old cp server work - robh@cumin1001"
15:05 robh@cumin1001: START - Cookbook sre.dns.netbox
14:59 eoghan@cumin1001: END (PASS) - Cookbook sre.gitlab.failover (exit_code=0) Failover of gitlab from gitlab1003.wikimedia.org to gitlab1004.wikimedia.org
14:58 inflatador: bking@wdqs1015 repool wdqs1015 as lag is back down
14:56 eoghan@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) https://gitlab-replica.wikimedia.org/ https://gitlab-replica-old.wikimedia.org/ on all recursors
14:56 eoghan@cumin1001: START - Cookbook sre.dns.wipe-cache https://gitlab-replica.wikimedia.org/ https://gitlab-replica-old.wikimedia.org/ on all recursors
14:47 mutante: DNS - new project language "btm" added - Mandailing language is spoken in Indonesia - https://en.wikipedia.org/wiki/Mandailing_language
14:31 herron: re-enabled icinga meta monitoring on wikitech-static T333837
14:07 herron: disabled icinga meta monitoring on wikitech-static T333837
14:07 herron: beginning alert host failover from alert2001 to alert1001 T333837
13:40 dcausse: repooling wdqs1005
13:32 claime: Deployed push-notifications production for switch to mw-api-int - T334061
13:32 moritzm: installing libxml2 security updates on bullseye
13:27 urbanecm@deploy2002: Finished scap: Backport for Update InterwikiSortOrders (T335019) (duration: 06m 59s)
13:24 eoghan@cumin1001: START - Cookbook sre.gitlab.failover Failover of gitlab from gitlab1003.wikimedia.org to gitlab1004.wikimedia.org
13:24 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
13:23 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
13:20 urbanecm@deploy2002: Started scap: Backport for Update InterwikiSortOrders (T335019)
13:15 urbanecm@deploy2002: Finished scap: Backport for Disable wmgNewUserMessageOnAutoCreate from Extension:NewUserMessage on knwikisource (T335090) (duration: 11m 02s)
13:14 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
13:14 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
13:13 claime: Deploying push-notifications production for switch to mw-api-int - T334061
13:05 urbanecm@deploy2002: urbanecm and anzx: Backport for Disable wmgNewUserMessageOnAutoCreate from Extension:NewUserMessage on knwikisource (T335090) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
13:04 urbanecm@deploy2002: Started scap: Backport for Disable wmgNewUserMessageOnAutoCreate from Extension:NewUserMessage on knwikisource (T335090)
12:56 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
12:29 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
12:28 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
12:28 claime: Deploying push-notifications staging for switch to mw-api-int - T334061
11:23 cgoubert@cumin1001: conftool action : set/weight=30; selector: dc=codfw,cluster=api_appserver,service=canary
11:21 cgoubert@cumin1001: conftool action : set/weight=25; selector: dc=codfw,cluster=appserver,service=canary
11:19 cgoubert@cumin1001: conftool action : set/weight=30; selector: dc=eqiad,cluster=appserver,service=canary
11:18 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:17 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
11:14 cgoubert@cumin1001: conftool action : set/weight=10; selector: dc=codfw,cluster=parsoid,service=canary
11:13 cgoubert@cumin1001: conftool action : set/weight=10; selector: dc=eqiad,cluster=parsoid,service=canary
11:13 claime: Fixing appserver clusters canary weights
10:56 jynus: deployed new ssh key for jcrespo on production cluster
10:29 claime: Datacenter switchover live testing setting db to read-only and back in eqiad successful - T327920
10:29 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0)
10:29 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite
10:29 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0)
10:29 cgoubert@cumin1001: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly
10:27 claime: Datacenter switchover live testing setting db to read-only and back in eqiad - T327920
10:26 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Ilooremeta out of all services on: 801 hosts
10:26 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Ilooremeta out of all services on: 801 hosts
10:24 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Ilooremeta out of all services on: 1262 hosts
10:22 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Ilooremeta out of all services on: 1262 hosts
10:22 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Hghani out of all services on: 1262 hosts
10:20 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Hghani out of all services on: 1262 hosts
10:18 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Hghani out of all services on: 801 hosts
10:18 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Hghani out of all services on: 801 hosts
10:17 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Hibashaath out of all services on: 801 hosts
10:17 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Hibashaath out of all services on: 801 hosts
10:16 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Hibashaath out of all services on: 1262 hosts
10:14 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Hibashaath out of all services on: 1262 hosts
10:11 marostegui: Enable replication eqiad -> codfw on s1 dbmaint eqiad T335266
10:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 38 hosts with reason: Enabling replication T335266
10:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on 38 hosts with reason: Enabling replication T335266
10:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 35 hosts with reason: Enabling replication T335266
10:08 marostegui: Enable replication eqiad -> codfw on s4 dbmaint eqiad T335266
10:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on 35 hosts with reason: Enabling replication T335266
10:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 24 hosts with reason: Enabling replication T335266
10:06 marostegui: Enable replication eqiad -> codfw on s3 dbmaint eqiad T335266
10:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on 24 hosts with reason: Enabling replication T335266
10:01 moritzm: installing git security updates
09:55 slyngs: Update LDAP schema wmf-user: T148048
09:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 28 hosts with reason: Enabling replication T335266
09:55 marostegui: Enable replication eqiad -> codfw on s7 dbmaint eqiad T335266
09:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on 28 hosts with reason: Enabling replication T335266
09:25 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host an-worker1110.eqiad.wmnet
09:21 moritzm: upgrade php-excimer on mw canaries to 1.0.2-1+wmf3+buster1 (which rebases Excimer to 1.1.1) T332964
08:45 moritzm: uploaded php-excimer 1.0.2-1+wmf3+buster1 (which rebases Excimer to 1.1.1) to component/php74 for buster-wikimedia T332964
08:44 marostegui: Enable replication eqiad -> codfw on s8 dbmaint eqiad T335266
08:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 34 hosts with reason: Enabling replication T335266
08:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on 34 hosts with reason: Enabling replication T335266
08:33 marostegui: Enable replication eqiad -> codfw on s5 dbmaint eqiad T335266
08:32 cgoubert@deploy2002: Finished scap: testing T329857 (duration: 14m 29s)
08:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 26 hosts with reason: Enabling replication T335266
08:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on 26 hosts with reason: Enabling replication T335266
08:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 27 hosts with reason: Enabling replication T335266
08:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on 27 hosts with reason: Enabling replication T335266
08:28 marostegui: Enable replication eqiad -> codfw on s6 dbmaint eqiad T335266
08:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 27 hosts with reason: Enabling replication T335266
08:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on 27 hosts with reason: Enabling replication T335266
08:26 marostegui: Enable replication eqiad -> codfw on s2 dbmaint eqiad T335266
08:25 btullis@cumin1001: START - Cookbook sre.hosts.dhcp for host an-worker1110.eqiad.wmnet
08:21 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-worker1110.eqiad.wmnet with reason: Upgrading RAID controller firmware
08:21 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-worker1110.eqiad.wmnet with reason: Upgrading RAID controller firmware
08:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 10 hosts with reason: Enabling replication T335266
08:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on 10 hosts with reason: Enabling replication T335266
08:20 marostegui: Enable replication eqiad -> codfw on x1 dbmaint eqiad T335266
08:18 cgoubert@deploy2002: Started scap: testing T329857
08:17 marostegui: Enable replication eqiad -> codfw on es5 dbmaint eqiad T335266
08:14 claime: Deploying 909302 on deploy2002 for T329857
08:10 claime: Disabling puppet on deploy2002 - T329857
08:09 claime: Deploying 909302 on deploy1002 for T329857
08:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 6 hosts with reason: Enabling replication T335266
08:08 marostegui: Enable replication eqiad -> codfw on es4 dbmaint eqiad T335266
08:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on 6 hosts with reason: Enabling replication T335266
08:07 marostegui: Enable replication eqiad -> codfw on pc3 dbmaint eqiad T335266
08:06 marostegui: Enable replication eqiad -> codfw on pc2 dbmaint eqiad T335266
08:05 marostegui: Enable replication eqiad -> codfw on pc1 dbmaint eqiad T335266
07:53 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.41 in codfw
07:51 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.41 in codfw
07:45 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab1004.wikimedia.org with OS bullseye
07:44 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.59 in codfw
07:42 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.59 in codfw
07:39 dcausse: restarting blazegraph on wdqs1005 (stuck for 3+ days)
07:38 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.4a in codfw
07:36 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.4a in codfw
07:24 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage
07:21 jelto@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab1004.wikimedia.org with reason: host reimage
07:06 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab1004.wikimedia.org with OS bullseye

2023-04-22

05:41 joe: <thumbor/codfw>$ helmfile --state-values-set roll_restart=1 -e codfw sync
05:40 oblivian@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
05:39 oblivian@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: sync
05:39 oblivian@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
05:39 oblivian@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
05:15 hashar@deploy2002: Finished deploy [integration/docroot@b816911]: Update Grafana URL (duration: 00m 11s)
05:15 hashar@deploy2002: Started deploy [integration/docroot@b816911]: Update Grafana URL
05:10 joe: sudo cumin -b 1 -s 20 'A:swift-fe-codfw' 'systemctl restart swift-proxy.service'
04:33 vgutierrez: restart haproxy on cp1087 - T334448

2023-04-21

18:27 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.remove-ghost-objects (exit_code=99) from container wikipedia-en-local-public.a8 in codfw
18:25 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-en-local-public.a8 in codfw
15:57 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for Set wmgUseGraphWithJsonNamespace = true for mediawikiwiki (T124748 T335130) (duration: 10m 01s)
15:48 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Set wmgUseGraphWithJsonNamespace = true for mediawikiwiki (T124748 T335130) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
15:47 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for Set wmgUseGraphWithJsonNamespace = true for mediawikiwiki (T124748 T335130)
12:18 duesen: reverted monkey-patch, mwdebug2001 and deploy2002 are back to wmf/1.41.0-wmf.5 (T335183)
11:56 duesen: monkey-patching Ib11a871ff on mwdebug2001 to investigate T335183
09:03 Amir1: finish of the wikibase populate sites table
08:35 Amir1: start of foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https
03:19 eileen: civicrm upgraded from 5b63c2b2 to 0fad720a
03:11 eileen: civicrm upgraded from a2e7c079 to 5b63c2b2
01:41 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2011.codfw.wmnet with OS bullseye
01:41 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
01:39 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
01:22 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2011.codfw.wmnet with reason: host reimage
01:19 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2011.codfw.wmnet with reason: host reimage
00:37 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2010.codfw.wmnet with OS bullseye
00:37 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
00:35 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
00:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2010.codfw.wmnet with reason: host reimage
00:15 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2010.codfw.wmnet with reason: host reimage
00:10 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host backup2011.codfw.wmnet with OS bullseye

2023-04-20

22:48 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
22:24 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['backup2011']
22:18 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2011']
22:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['backup2011']
22:17 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2011']
21:47 zabe@deploy2002: Finished scap: Backport for Update interwiki cache (duration: 06m 26s)
21:42 zabe@deploy2002: zabe: Backport for Update interwiki cache synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
21:41 zabe@deploy2002: Started scap: Backport for Update interwiki cache
21:35 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1001.eqiad.wmnet
21:35 zabe@deploy2002: Finished scap: T334394 (duration: 07m 46s)
21:28 zabe@deploy2002: zabe: T334394 synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
21:28 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
21:27 zabe@deploy2002: Started scap: T334394
21:26 zabe: create Wikinews Gungbe # T334394
21:22 inflatador: bking@cumin1001 repool wdqs2012 T331300
21:19 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
21:19 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
21:18 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
21:18 inflatador: bking@cumin1001 depool wdqs2009 for data xfer T331300
21:03 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1001.eqiad.wmnet
21:03 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host backup2011.mgmt.codfw.wmnet with reboot policy FORCED
20:57 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
20:54 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1001.eqiad.wmnet
20:47 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
20:36 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
20:33 eevans@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host sessionstore1001.eqiad.wmnet
20:31 thcipriani@deploy2002: Finished scap: Backport for Fix TypeError: trigger.attr is not a function (T335148) (duration: 09m 53s)
20:22 thcipriani@deploy2002: nray and thcipriani: Backport for Fix TypeError: trigger.attr is not a function (T335148) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
20:22 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
20:21 thcipriani@deploy2002: Started scap: Backport for Fix TypeError: trigger.attr is not a function (T335148)
19:58 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
19:57 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
19:54 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
19:47 zabe@deploy2002: Finished scap: Backport for Update interwiki cache (duration: 06m 47s)
19:41 zabe@deploy2002: zabe: Backport for Update interwiki cache synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
19:40 zabe@deploy2002: Started scap: Backport for Update interwiki cache
19:34 zabe@deploy2002: Finished scap: T333266 (duration: 07m 04s)
19:29 zabe@deploy2002: zabe: T333266 synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
19:28 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
19:27 zabe@deploy2002: Started scap: T333266
19:27 zabe: create Wiktionary Kabardian # T333266
19:16 inflatador: bking@cumin1001 depool wdqs2012.codfw.wmnet for data xfer T331300
19:16 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
19:15 bking@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
19:13 bking@cumin1001: START - Cookbook sre.wdqs.data-transfer
18:58 zabe@deploy2002: Finished scap: Backport for Disable VE as default editor on kcgwiktionary (T334730), db-production: Fix indentation, Update interwiki cache (duration: 07m 06s)
18:53 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host backup2011.mgmt.codfw.wmnet with reboot policy FORCED
18:52 zabe@deploy2002: zabe: Backport for Disable VE as default editor on kcgwiktionary (T334730), db-production: Fix indentation, Update interwiki cache synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add backup2011 DNS entries - pt1979@cumin2002"
18:51 zabe@deploy2002: Started scap: Backport for Disable VE as default editor on kcgwiktionary (T334730), db-production: Fix indentation, Update interwiki cache
18:50 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab1003.wikimedia.org with OS bullseye
18:50 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add backup2011 DNS entries - pt1979@cumin2002"
18:50 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host backup2010.codfw.wmnet with OS bullseye
18:47 pt1979@cumin2002: START - Cookbook sre.dns.netbox
18:36 zabe@deploy2002: Finished scap: T335016 (duration: 07m 28s)
18:30 zabe@deploy2002: zabe: T335016 synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
18:29 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab1003.wikimedia.org with reason: host reimage
18:29 zabe@deploy2002: Started scap: T335016
18:29 zabe: create Wikipedia Fante # T335016
18:26 jelto@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab1003.wikimedia.org with reason: host reimage
18:17 zabe@deploy2002: Finished scap: Backport for Add messages for Fante Wikipedia (fatwiki) (T335016), Localisation updates from https://translatewiki.net., Localisation updates from https://translatewiki.net. (duration: 23m 58s)
18:10 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab1003.wikimedia.org with OS bullseye
18:05 zabe@deploy2002: zabe: Backport for Add messages for Fante Wikipedia (fatwiki) (T335016), Localisation updates from https://translatewiki.net., Localisation updates from https://translatewiki.net. synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
18:01 sukhe: enable puppet and run agent in A:lvs and A:eqiad CR 910563
18:00 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2003.wikimedia.org with OS bullseye
18:00 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
17:59 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
17:54 sukhe: disable puppet in A:lvs and A:eqiad to test CR 910563
17:53 zabe@deploy2002: Started scap: Backport for Add messages for Fante Wikipedia (fatwiki) (T335016), Localisation updates from https://translatewiki.net., Localisation updates from https://translatewiki.net.
17:48 zabe@deploy2002: Finished scap: create kcgwiktionary (T334730) (duration: 08m 08s)
17:42 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
17:41 zabe@deploy2002: zabe: create kcgwiktionary (T334730) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
17:39 zabe@deploy2002: Started scap: create kcgwiktionary (T334730)
17:39 zabe: create Wiktionary Tyap # T334730
17:39 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
17:24 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host bast2003.wikimedia.org with OS bullseye
17:02 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching aqs10[10,13,16,19].eqiad.wmnet: Testing rolling restart (rack1) — T334754 - eevans@cumin1001
16:31 eevans@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching aqs10[10,13,16,19].eqiad.wmnet: Testing rolling restart (rack1) — T334754 - eevans@cumin1001
16:25 SandraEbele: Deployed refinery using scap, then deployed onto hdfs as part of weekly deployment train.
16:23 claime: repooling parse2010 after fix - T335138
16:22 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse2010.codfw.wmnet
16:22 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse2010.codfw.wmnet
16:20 stevemunene@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host an-airflow1006.eqiad.wmnet with OS buster
16:16 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['bast2003']
16:16 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['bast2003']
16:16 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['bast2003']
16:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['bast2003']
16:15 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['bast2003']
16:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['bast2003']
16:09 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host bast2003.mgmt.codfw.wmnet with reboot policy FORCED
16:08 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:08 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: setting sretest2001 back to offline - pt1979@cumin2002"
16:07 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: setting sretest2001 back to offline - pt1979@cumin2002"
16:04 stevemunene@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-airflow1006.eqiad.wmnet with reason: host reimage
16:03 pt1979@cumin2002: START - Cookbook sre.dns.netbox
16:01 stevemunene@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-airflow1006.eqiad.wmnet with reason: host reimage
15:59 ebysans@deploy2002: Finished deploy [analytics/refinery@1631dea] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@1631dea] (duration: 01m 29s)
15:58 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host bast2003.mgmt.codfw.wmnet with reboot policy FORCED
15:58 ebysans@deploy2002: Started deploy [analytics/refinery@1631dea] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@1631dea]
15:57 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:57 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add bast2003 DNS entries - pt1979@cumin2002"
15:56 ebysans@deploy2002: Finished deploy [analytics/refinery@1631dea] (thin): Regular analytics weekly train THIN [analytics/refinery@1631dea] (duration: 00m 08s)
15:56 ebysans@deploy2002: Started deploy [analytics/refinery@1631dea] (thin): Regular analytics weekly train THIN [analytics/refinery@1631dea]
15:55 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add bast2003 DNS entries - pt1979@cumin2002"
15:54 ebysans@deploy2002: Finished deploy [analytics/refinery@1631dea]: Regular analytics weekly train [analytics/refinery@1631dea] (duration: 08m 30s)
15:51 pt1979@cumin2002: START - Cookbook sre.dns.netbox
15:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns2006.wikimedia.org with OS bullseye
15:48 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
15:48 stevemunene@cumin1001: START - Cookbook sre.ganeti.reimage for host an-airflow1006.eqiad.wmnet with OS buster
15:47 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
15:47 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:46 pt1979@cumin2002: START - Cookbook sre.dns.netbox
15:46 ebysans@deploy2002: Started deploy [analytics/refinery@1631dea]: Regular analytics weekly train [analytics/refinery@1631dea]
15:44 SandraEbele: deploying weekly deployment train for analytics refinery.
15:38 sukhe: sudo cumin -b1 -s1200 'A:cp and A:eqsin' 'varnish-frontend-restart'
15:37 stevemunene@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host an-airflow1006.eqiad.wmnet
15:37 stevemunene@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM an-airflow1006.eqiad.wmnet - stevemunene@cumin1001"
15:36 stevemunene@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM an-airflow1006.eqiad.wmnet - stevemunene@cumin1001"
15:33 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye
15:32 bking@cumin1001: START - Cookbook sre.hosts.downtime for 12 days, 0:00:00 on wdqs2022.codfw.wmnet with reason: attempting WDQS stack on bullseye
15:32 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns2006.wikimedia.org with reason: host reimage
15:31 ejegg: payments-wiki upgraded from 66be66e0 to 744d82c6
15:28 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns2006.wikimedia.org with reason: host reimage
15:27 sukhe: run puppet manually in A:cp and A:eqsin to pick up CR 910005
15:26 sukhe: re-enable puppet in A:cp and A:eqsin
15:23 sukhe: varnish-frontend-restart cp5022
15:21 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:20 jclark@cumin1001: START - Cookbook sre.dns.netbox
15:15 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dns2006.wikimedia.org with OS bullseye
14:56 sukhe: disable puppet in A:cp and A:eqsin to test CR 910005
14:50 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for Make $wmgUseGraphWithJsonNamespace depend on $wmgUseJsonConfig (T335130) (duration: 07m 40s)
14:49 stevemunene@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) an-airflow1006.eqiad.wmnet on all recursors
14:49 stevemunene@cumin1001: START - Cookbook sre.dns.wipe-cache an-airflow1006.eqiad.wmnet on all recursors
14:49 stevemunene@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:49 stevemunene@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM an-airflow1006.eqiad.wmnet - stevemunene@cumin1001"
14:47 stevemunene@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM an-airflow1006.eqiad.wmnet - stevemunene@cumin1001"
14:45 stevemunene@cumin1001: START - Cookbook sre.dns.netbox
14:45 stevemunene@cumin1001: START - Cookbook sre.ganeti.makevm for new host an-airflow1006.eqiad.wmnet
14:43 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Make $wmgUseGraphWithJsonNamespace depend on $wmgUseJsonConfig (T335130) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
14:42 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for Make $wmgUseGraphWithJsonNamespace depend on $wmgUseJsonConfig (T335130)
14:39 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on parse2010.codfw.wmnet with reason: PSU failure
14:39 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on parse2010.codfw.wmnet with reason: PSU failure
14:33 claime: depooling parse2010 for PSU failure
13:35 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.remove-ghost-objects (exit_code=99) from container wikipedia-en-local-public.a8 in codfw
13:33 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-en-local-public.a8 in codfw
12:44 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.remove-ghost-objects (exit_code=99) from container wikipedia-en-local-public.a8 in codfw
12:42 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-en-local-public.a8 in codfw
12:12 ladsgroup@deploy2002: Finished scap: Backport for Set wmgUseGraphWithJsonNamespace = false for mediawikiwiki (T124748) (duration: 07m 48s)
12:05 ladsgroup@deploy2002: aklapper and ladsgroup: Backport for Set wmgUseGraphWithJsonNamespace = false for mediawikiwiki (T124748) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
12:04 ladsgroup@deploy2002: Started scap: Backport for Set wmgUseGraphWithJsonNamespace = false for mediawikiwiki (T124748)
10:57 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
10:57 moritzm: installing openvswitch security updates on bullseye
10:57 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
10:43 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.remove-ghost-objects (exit_code=99) from container wikipedia-en-local-public.a8 in codfw
10:41 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-en-local-public.a8 in codfw
09:43 isaranto@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
09:42 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:42 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:40 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:40 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:35 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:35 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:06 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.18 in codfw
09:04 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.18 in codfw
08:57 isaranto@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
08:17 jnuche@deploy2002: rebuilt and synchronized wikiversions files: all wikis to 1.41.0-wmf.5 refs T330211
07:24 moritzm: uploaded imagemagick 8:6.9.10.23+dfsg-2.1+deb10u1+wmf1 to apt.wikimedia.org for buster-wikimedia T328901
06:25 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 14593
06:24 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 14593
06:19 moritzm: installing tomcat9 security updates
06:15 joe: enabled requestctl rule for T332061
06:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on krb2002.codfw.wmnet with reason: Non-functional, WIP for Bullseye update
06:09 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on krb2002.codfw.wmnet with reason: Non-functional, WIP for Bullseye update
03:49 eileen: civicrm upgraded from efdf9434 to a2e7c079
00:02 mutante: LDAP - adding uid fnavas-foundation to group wmf - T331482

2023-04-19

23:36 zabe@deploy2002: Finished scap: gerrit:910078 (duration: 06m 40s)
23:29 zabe@deploy2002: Started scap: gerrit:910078
23:15 tzatziki: removing 1 file for legal compliance
23:02 tzatziki: removing 3 files for legal compliance
22:34 bking@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs2022.codfw.wmnet with OS bullseye
22:10 tzatziki: removing 5 files for legal compliance
21:38 bking@cumin1001: START - Cookbook sre.hosts.reimage for host wdqs2022.codfw.wmnet with OS bullseye
20:16 zabe@deploy2002: Finished scap: Backport for Revert "Revert "dewiki: Allow 'crats to remove sysopship and manage importers"" (T331921) (duration: 07m 26s)
20:10 zabe@deploy2002: zabe: Backport for Revert "Revert "dewiki: Allow 'crats to remove sysopship and manage importers"" (T331921) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
20:09 zabe@deploy2002: Started scap: Backport for Revert "Revert "dewiki: Allow 'crats to remove sysopship and manage importers"" (T331921)
19:49 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dns2006.wikimedia.org with OS bullseye
19:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns2005.wikimedia.org with OS bullseye
19:09 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns2005.wikimedia.org with reason: host reimage
19:06 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns2005.wikimedia.org with reason: host reimage
19:05 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dns2005.wikimedia.org with OS bullseye
19:04 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host dns2005.wikimedia.org with OS bullseye
19:04 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
19:02 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
18:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dns2006.wikimedia.org with OS bullseye
18:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns2004.wikimedia.org with OS bullseye
18:52 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
18:50 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
18:39 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns2005.wikimedia.org with reason: host reimage
18:36 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns2005.wikimedia.org with reason: host reimage
18:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns2004.wikimedia.org with reason: host reimage
18:31 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on dns2004.wikimedia.org with reason: host reimage
18:28 sukhe@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: LVS reimaging in eqiad, blocking deploys T321309 (duration: 286m 39s)
18:25 sukhe: restart pybal on lvs1017 to pick up bgp-med change: T321309
18:23 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dns2005.wikimedia.org with OS bullseye
18:22 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1017.eqiad.wmnet with OS bullseye
18:04 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
18:01 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1017.eqiad.wmnet with reason: host reimage
18:00 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host dns2004.wikimedia.org with OS bullseye
17:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dns2006']
17:57 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns2006']
17:57 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dns2005']
17:57 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns2005']
17:57 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dns2005']
17:56 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns2005']
17:56 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dns2006']
17:56 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dns2005']
17:55 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns2006']
17:55 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns2005']
17:55 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['dns2004']
17:50 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dns2004']
17:46 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1017.eqiad.wmnet with OS bullseye
17:36 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dns2006.mgmt.codfw.wmnet with reboot policy FORCED
17:21 sukhe: stop pybal in lvs1017 for reimaging
17:14 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dns2006.mgmt.codfw.wmnet with reboot policy FORCED
17:14 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dns2005.mgmt.codfw.wmnet with reboot policy FORCED
17:05 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dns2005.mgmt.codfw.wmnet with reboot policy FORCED
17:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dns2004.mgmt.codfw.wmnet with reboot policy FORCED
16:41 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dns2004.mgmt.codfw.wmnet with reboot policy FORCED
16:39 sukhe: restart pybal on lvs1018 to remove bgp-med change: T321309
16:39 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dns2004.mgmt.codfw.wmnet with reboot policy FORCED
16:35 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs1018
16:35 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs1018
16:23 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1018.eqiad.wmnet with OS bullseye
16:17 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:09 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:09 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:09 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:06 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:06 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: host reimage
16:06 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:05 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:05 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:04 jbond@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:04 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:04 jbond@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:04 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2010.codfw.wmnet']
16:02 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1018.eqiad.wmnet with reason: host reimage
15:49 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dns2004.mgmt.codfw.wmnet with reboot policy FORCED
15:48 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1018.eqiad.wmnet with OS bullseye
15:47 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs1018
15:47 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs1018
15:44 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dns2004.mgmt.codfw.wmnet with reboot policy FORCED
15:42 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host dns2004.mgmt.codfw.wmnet with reboot policy FORCED
15:36 mutante: DNS - added new project language "fat" (fat.wikipedia.org) - the "Fante" language, a dialect of Akan, spoken by 2.8 million people in Ghana - https://en.wikipedia.org/wiki/Fante_dialect T335016
15:34 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:34 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for dns200[4-6] - pt1979@cumin2002"
15:33 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for dns200[4-6] - pt1979@cumin2002"
15:30 pt1979@cumin2002: START - Cookbook sre.dns.netbox
15:20 sukhe: stop pybal on lvs1018 for reimaging: T321309
14:54 sukhe: restart pybal on lvs1019 to pick up bgp-med change
14:42 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs1019
14:42 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs1019
14:38 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1019.eqiad.wmnet with OS bullseye
14:22 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1019.eqiad.wmnet with reason: host reimage
14:19 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1019.eqiad.wmnet with reason: host reimage
14:04 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1019.eqiad.wmnet with OS bullseye
13:41 sukhe@deploy2002: Locking from deployment [ALL REPOSITORIES]: LVS reimaging in eqiad, blocking deploys T321309
13:41 sukhe@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: LVS reimaging in eqiad, blocking deploys T321309 (duration: 00m 16s)
13:41 sukhe@deploy2002: Locking from deployment [ALL REPOSITORIES]: LVS reimaging in eqiad, blocking deploys T321309
13:28 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.remove-ghost-objects (exit_code=99) from container wikipedia-en-local-public.a8 in codfw
13:25 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-en-local-public.a8 in codfw
13:16 klausman@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
13:16 klausman@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
13:14 taavi@deploy2002: Finished scap: Backport for cleanup: Remove duplicate permission config of confirmed users (duration: 11m 32s)
13:09 moritzm: installing lldpd security updates
13:04 taavi@deploy2002: func and taavi: Backport for cleanup: Remove duplicate permission config of confirmed users synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
13:02 taavi@deploy2002: Started scap: Backport for cleanup: Remove duplicate permission config of confirmed users
11:18 hnowlan@puppetmaster1001: conftool action : set/weight=7; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
10:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2165.codfw.wmnet with reason: Maintenance
10:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2165.codfw.wmnet with reason: Maintenance
10:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1126.eqiad.wmnet with reason: Maintenance
10:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1126.eqiad.wmnet with reason: Maintenance
10:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
10:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2112.codfw.wmnet with reason: Maintenance
10:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
10:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1163.eqiad.wmnet with reason: Maintenance
10:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2107.codfw.wmnet with reason: Maintenance
10:46 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-en-local-public.1a in codfw
10:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2107.codfw.wmnet with reason: Maintenance
10:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1162.eqiad.wmnet with reason: Maintenance
10:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1162.eqiad.wmnet with reason: Maintenance
10:43 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-en-local-public.1a in codfw
10:42 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-en-local-public.1a in eqiad
10:40 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-en-local-public.1a in eqiad
10:37 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.e4 in eqiad
10:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T333332)', diff saved to https://phabricator.wikimedia.org/P47260 and previous config saved to /var/cache/conftool/dbconfig/20230419-103603-ladsgroup.json
10:34 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.e4 in eqiad
10:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P47259 and previous config saved to /var/cache/conftool/dbconfig/20230419-102057-ladsgroup.json
10:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2105.codfw.wmnet with reason: Maintenance
10:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2105.codfw.wmnet with reason: Maintenance
10:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P47258 and previous config saved to /var/cache/conftool/dbconfig/20230419-101614-root.json
10:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1157.eqiad.wmnet with reason: Maintenance
10:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1157.eqiad.wmnet with reason: Maintenance
10:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
10:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2140.codfw.wmnet with reason: Maintenance
10:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1160.eqiad.wmnet with reason: Maintenance
10:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3315 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P47257 and previous config saved to /var/cache/conftool/dbconfig/20230419-100746-root.json
10:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1160.eqiad.wmnet with reason: Maintenance
10:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P47256 and previous config saved to /var/cache/conftool/dbconfig/20230419-100550-ladsgroup.json
10:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2113.codfw.wmnet with reason: Maintenance
10:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2113.codfw.wmnet with reason: Maintenance
10:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1130.eqiad.wmnet with reason: Maintenance
10:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1130.eqiad.wmnet with reason: Maintenance
10:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
10:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2129.codfw.wmnet with reason: Maintenance
10:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1131.eqiad.wmnet with reason: Maintenance
10:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1131.eqiad.wmnet with reason: Maintenance
10:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2118.codfw.wmnet with reason: Maintenance
10:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2118.codfw.wmnet with reason: Maintenance
10:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
10:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1181.eqiad.wmnet with reason: Maintenance
10:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P47255 and previous config saved to /var/cache/conftool/dbconfig/20230419-100109-root.json
09:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P47254 and previous config saved to /var/cache/conftool/dbconfig/20230419-095807-root.json
09:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 100%: Pooling', diff saved to https://phabricator.wikimedia.org/P47253 and previous config saved to /var/cache/conftool/dbconfig/20230419-095316-root.json
09:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3315 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P47252 and previous config saved to /var/cache/conftool/dbconfig/20230419-095241-root.json
09:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T333332)', diff saved to https://phabricator.wikimedia.org/P47250 and previous config saved to /var/cache/conftool/dbconfig/20230419-095044-ladsgroup.json
09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T333332)', diff saved to https://phabricator.wikimedia.org/P47249 and previous config saved to /var/cache/conftool/dbconfig/20230419-094836-ladsgroup.json
09:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
09:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
09:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P47248 and previous config saved to /var/cache/conftool/dbconfig/20230419-094604-root.json
09:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P47247 and previous config saved to /var/cache/conftool/dbconfig/20230419-094302-root.json
09:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 75%: Pooling', diff saved to https://phabricator.wikimedia.org/P47246 and previous config saved to /var/cache/conftool/dbconfig/20230419-093812-root.json
09:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3315 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P47245 and previous config saved to /var/cache/conftool/dbconfig/20230419-093737-root.json
09:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P47244 and previous config saved to /var/cache/conftool/dbconfig/20230419-093059-root.json
09:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P47243 and previous config saved to /var/cache/conftool/dbconfig/20230419-092757-root.json
09:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 50%: Pooling', diff saved to https://phabricator.wikimedia.org/P47242 and previous config saved to /var/cache/conftool/dbconfig/20230419-092307-root.json
09:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3315 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P47241 and previous config saved to /var/cache/conftool/dbconfig/20230419-092232-root.json
09:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P47240 and previous config saved to /var/cache/conftool/dbconfig/20230419-091554-root.json
09:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P47239 and previous config saved to /var/cache/conftool/dbconfig/20230419-091252-root.json
09:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 25%: Pooling', diff saved to https://phabricator.wikimedia.org/P47238 and previous config saved to /var/cache/conftool/dbconfig/20230419-090802-root.json
09:07 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
09:07 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
09:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3315 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P47237 and previous config saved to /var/cache/conftool/dbconfig/20230419-090727-root.json
09:07 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
09:07 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
09:05 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
09:05 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
09:03 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
09:03 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
09:01 elukey@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: sync
09:00 elukey@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: sync
09:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P47236 and previous config saved to /var/cache/conftool/dbconfig/20230419-090050-root.json
09:00 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: sync
08:59 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: sync
08:59 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
08:59 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
08:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P47235 and previous config saved to /var/cache/conftool/dbconfig/20230419-085748-root.json
08:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 10%: Pooling', diff saved to https://phabricator.wikimedia.org/P47234 and previous config saved to /var/cache/conftool/dbconfig/20230419-085257-root.json
08:52 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
08:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3315 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P47233 and previous config saved to /var/cache/conftool/dbconfig/20230419-085222-root.json
08:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 4%: Repooling', diff saved to https://phabricator.wikimedia.org/P47232 and previous config saved to /var/cache/conftool/dbconfig/20230419-084545-root.json
08:45 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
08:45 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
08:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
08:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P47231 and previous config saved to /var/cache/conftool/dbconfig/20230419-084243-root.json
08:40 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 9%: Pooling', diff saved to https://phabricator.wikimedia.org/P47230 and previous config saved to /var/cache/conftool/dbconfig/20230419-083753-root.json
08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3315 (re)pooling @ 4%: Repooling', diff saved to https://phabricator.wikimedia.org/P47229 and previous config saved to /var/cache/conftool/dbconfig/20230419-083717-root.json
08:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:01:00 on db2185.codfw.wmnet,db[1115,1215].eqiad.wmnet with reason: Test
08:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 0:01:00 on db2185.codfw.wmnet,db[1115,1215].eqiad.wmnet with reason: Test
08:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 3%: Repooling', diff saved to https://phabricator.wikimedia.org/P47228 and previous config saved to /var/cache/conftool/dbconfig/20230419-083040-root.json
08:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P47227 and previous config saved to /var/cache/conftool/dbconfig/20230419-082738-root.json
08:24 jnuche@deploy2002: Synchronized php: group1 wikis to 1.41.0-wmf.5 refs T330211 (duration: 05m 43s)
08:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 100%: Pooling', diff saved to https://phabricator.wikimedia.org/P47226 and previous config saved to /var/cache/conftool/dbconfig/20230419-082345-root.json
08:23 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 8%: Pooling', diff saved to https://phabricator.wikimedia.org/P47225 and previous config saved to /var/cache/conftool/dbconfig/20230419-082247-root.json
08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3315 (re)pooling @ 3%: Repooling', diff saved to https://phabricator.wikimedia.org/P47224 and previous config saved to /var/cache/conftool/dbconfig/20230419-082213-root.json
08:18 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.41.0-wmf.5 refs T330211
08:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 2%: Repooling', diff saved to https://phabricator.wikimedia.org/P47223 and previous config saved to /var/cache/conftool/dbconfig/20230419-081535-root.json
08:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 75%: Pooling', diff saved to https://phabricator.wikimedia.org/P47222 and previous config saved to /var/cache/conftool/dbconfig/20230419-080841-root.json
08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 7%: Pooling', diff saved to https://phabricator.wikimedia.org/P47221 and previous config saved to /var/cache/conftool/dbconfig/20230419-080742-root.json
08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3315 (re)pooling @ 2%: Repooling', diff saved to https://phabricator.wikimedia.org/P47220 and previous config saved to /var/cache/conftool/dbconfig/20230419-080708-root.json
08:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P47219 and previous config saved to /var/cache/conftool/dbconfig/20230419-080030-root.json
07:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 50%: Pooling', diff saved to https://phabricator.wikimedia.org/P47218 and previous config saved to /var/cache/conftool/dbconfig/20230419-075336-root.json
07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 6%: Pooling', diff saved to https://phabricator.wikimedia.org/P47217 and previous config saved to /var/cache/conftool/dbconfig/20230419-075237-root.json
07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3315 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P47216 and previous config saved to /var/cache/conftool/dbconfig/20230419-075203-root.json
07:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 25%: Pooling', diff saved to https://phabricator.wikimedia.org/P47215 and previous config saved to /var/cache/conftool/dbconfig/20230419-073831-root.json
07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 5%: Pooling', diff saved to https://phabricator.wikimedia.org/P47214 and previous config saved to /var/cache/conftool/dbconfig/20230419-073732-root.json
07:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 10%: Pooling', diff saved to https://phabricator.wikimedia.org/P47213 and previous config saved to /var/cache/conftool/dbconfig/20230419-072326-root.json
07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 4%: Pooling', diff saved to https://phabricator.wikimedia.org/P47212 and previous config saved to /var/cache/conftool/dbconfig/20230419-072228-root.json
07:15 XioNoX: update TLS cert on pfw - T334676
07:13 kartik@deploy2002: Finished scap: Backport for Enable Content/Section translation on 6 Wikipedias (T327102) (duration: 09m 33s)
07:10 XioNoX: push pfw policies - T334983
07:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T333332)', diff saved to https://phabricator.wikimedia.org/P47211 and previous config saved to /var/cache/conftool/dbconfig/20230419-070920-ladsgroup.json
07:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 5%: Pooling', diff saved to https://phabricator.wikimedia.org/P47210 and previous config saved to /var/cache/conftool/dbconfig/20230419-070822-root.json
07:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 3%: Pooling', diff saved to https://phabricator.wikimedia.org/P47209 and previous config saved to /var/cache/conftool/dbconfig/20230419-070723-root.json
07:05 kartik@deploy2002: kartik: Backport for Enable Content/Section translation on 6 Wikipedias (T327102) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
07:03 kartik@deploy2002: Started scap: Backport for Enable Content/Section translation on 6 Wikipedias (T327102)
06:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P47208 and previous config saved to /var/cache/conftool/dbconfig/20230419-065413-ladsgroup.json
06:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 4%: Pooling', diff saved to https://phabricator.wikimedia.org/P47207 and previous config saved to /var/cache/conftool/dbconfig/20230419-065317-root.json
06:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 2%: Pooling', diff saved to https://phabricator.wikimedia.org/P47206 and previous config saved to /var/cache/conftool/dbconfig/20230419-065218-root.json
06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110 T335011', diff saved to https://phabricator.wikimedia.org/P47205 and previous config saved to /var/cache/conftool/dbconfig/20230419-064122-root.json
06:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P47204 and previous config saved to /var/cache/conftool/dbconfig/20230419-063907-ladsgroup.json
06:38 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
06:38 gmodena@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply
06:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 3%: Pooling', diff saved to https://phabricator.wikimedia.org/P47203 and previous config saved to /var/cache/conftool/dbconfig/20230419-063812-root.json
06:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1219 (re)pooling @ 1%: Pooling', diff saved to https://phabricator.wikimedia.org/P47202 and previous config saved to /var/cache/conftool/dbconfig/20230419-063713-root.json
06:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T333332)', diff saved to https://phabricator.wikimedia.org/P47201 and previous config saved to /var/cache/conftool/dbconfig/20230419-062401-ladsgroup.json
06:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 2%: Pooling', diff saved to https://phabricator.wikimedia.org/P47200 and previous config saved to /var/cache/conftool/dbconfig/20230419-062307-root.json
06:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1113 (s5,s6)', diff saved to https://phabricator.wikimedia.org/P47197 and previous config saved to /var/cache/conftool/dbconfig/20230419-062123-root.json
06:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2178 (T333332)', diff saved to https://phabricator.wikimedia.org/P47196 and previous config saved to /var/cache/conftool/dbconfig/20230419-062007-ladsgroup.json
06:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
06:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2178.codfw.wmnet with reason: Maintenance
06:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47195 and previous config saved to /var/cache/conftool/dbconfig/20230419-061944-ladsgroup.json
06:14 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1219 to dbctl T326669', diff saved to https://phabricator.wikimedia.org/P47194 and previous config saved to /var/cache/conftool/dbconfig/20230419-061414-marostegui.json
06:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1212 (re)pooling @ 1%: Pooling', diff saved to https://phabricator.wikimedia.org/P47193 and previous config saved to /var/cache/conftool/dbconfig/20230419-060803-root.json
06:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P47192 and previous config saved to /var/cache/conftool/dbconfig/20230419-060437-ladsgroup.json
05:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P47191 and previous config saved to /var/cache/conftool/dbconfig/20230419-054931-ladsgroup.json
05:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47190 and previous config saved to /var/cache/conftool/dbconfig/20230419-053425-ladsgroup.json
05:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47189 and previous config saved to /var/cache/conftool/dbconfig/20230419-053027-ladsgroup.json
05:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
05:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
05:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T333332)', diff saved to https://phabricator.wikimedia.org/P47188 and previous config saved to /var/cache/conftool/dbconfig/20230419-053003-ladsgroup.json
05:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P47187 and previous config saved to /var/cache/conftool/dbconfig/20230419-051457-ladsgroup.json
04:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P47186 and previous config saved to /var/cache/conftool/dbconfig/20230419-045951-ladsgroup.json
04:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T333332)', diff saved to https://phabricator.wikimedia.org/P47185 and previous config saved to /var/cache/conftool/dbconfig/20230419-044445-ladsgroup.json
04:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2157 (T333332)', diff saved to https://phabricator.wikimedia.org/P47184 and previous config saved to /var/cache/conftool/dbconfig/20230419-044050-ladsgroup.json
04:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
04:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2157.codfw.wmnet with reason: Maintenance
04:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47183 and previous config saved to /var/cache/conftool/dbconfig/20230419-044027-ladsgroup.json
04:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P47182 and previous config saved to /var/cache/conftool/dbconfig/20230419-042520-ladsgroup.json
04:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P47181 and previous config saved to /var/cache/conftool/dbconfig/20230419-041013-ladsgroup.json
03:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47180 and previous config saved to /var/cache/conftool/dbconfig/20230419-035507-ladsgroup.json
03:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47178 and previous config saved to /var/cache/conftool/dbconfig/20230419-035112-ladsgroup.json
03:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
03:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
03:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T333332)', diff saved to https://phabricator.wikimedia.org/P47177 and previous config saved to /var/cache/conftool/dbconfig/20230419-035048-ladsgroup.json
03:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P47176 and previous config saved to /var/cache/conftool/dbconfig/20230419-033542-ladsgroup.json
03:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P47175 and previous config saved to /var/cache/conftool/dbconfig/20230419-032036-ladsgroup.json
03:12 ejegg: payments-wiki upgraded from a01e5ae8 to 66be66e0
03:11 ejegg: civicrm upgraded from 39bbe8cc to efdf9434
03:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T333332)', diff saved to https://phabricator.wikimedia.org/P47174 and previous config saved to /var/cache/conftool/dbconfig/20230419-030530-ladsgroup.json
03:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2128 (T333332)', diff saved to https://phabricator.wikimedia.org/P47173 and previous config saved to /var/cache/conftool/dbconfig/20230419-030234-ladsgroup.json
03:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
03:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
03:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
03:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2128.codfw.wmnet with reason: Maintenance
03:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T333332)', diff saved to https://phabricator.wikimedia.org/P47172 and previous config saved to /var/cache/conftool/dbconfig/20230419-030205-ladsgroup.json
02:47 ejegg: civicrm upgraded from dab8912d to 39bbe8cc
02:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P47171 and previous config saved to /var/cache/conftool/dbconfig/20230419-024658-ladsgroup.json
02:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P47170 and previous config saved to /var/cache/conftool/dbconfig/20230419-023152-ladsgroup.json
02:19 cstone: payments-wiki upgraded from c01a32c4 to a01e5ae8
02:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T333332)', diff saved to https://phabricator.wikimedia.org/P47168 and previous config saved to /var/cache/conftool/dbconfig/20230419-021646-ladsgroup.json
02:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2123 (T333332)', diff saved to https://phabricator.wikimedia.org/P47167 and previous config saved to /var/cache/conftool/dbconfig/20230419-021051-ladsgroup.json
02:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
02:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2123.codfw.wmnet with reason: Maintenance
02:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T333332)', diff saved to https://phabricator.wikimedia.org/P47166 and previous config saved to /var/cache/conftool/dbconfig/20230419-021028-ladsgroup.json
02:03 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1075.eqiad.wmnet with OS bullseye
02:03 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
02:01 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
01:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P47165 and previous config saved to /var/cache/conftool/dbconfig/20230419-015522-ladsgroup.json
01:46 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1073.eqiad.wmnet with OS bullseye
01:46 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
01:44 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
01:42 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1075.eqiad.wmnet with reason: host reimage
01:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P47164 and previous config saved to /var/cache/conftool/dbconfig/20230419-014016-ladsgroup.json
01:38 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1075.eqiad.wmnet with reason: host reimage
01:37 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1074.eqiad.wmnet with OS bullseye
01:36 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
01:34 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
01:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T333332)', diff saved to https://phabricator.wikimedia.org/P47163 and previous config saved to /var/cache/conftool/dbconfig/20230419-012509-ladsgroup.json
01:23 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1075.eqiad.wmnet with OS bullseye
01:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2111 (T333332)', diff saved to https://phabricator.wikimedia.org/P47162 and previous config saved to /var/cache/conftool/dbconfig/20230419-012114-ladsgroup.json
01:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2111.codfw.wmnet with reason: Maintenance
01:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2111.codfw.wmnet with reason: Maintenance
01:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2101.codfw.wmnet with reason: Maintenance
01:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2101.codfw.wmnet with reason: Maintenance
01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1072.eqiad.wmnet with OS bullseye
01:18 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
01:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
01:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
01:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T333332)', diff saved to https://phabricator.wikimedia.org/P47161 and previous config saved to /var/cache/conftool/dbconfig/20230419-011754-ladsgroup.json
01:16 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
01:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1074.eqiad.wmnet with reason: host reimage
01:10 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1074.eqiad.wmnet with reason: host reimage
01:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1073.eqiad.wmnet with reason: host reimage
01:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P47160 and previous config saved to /var/cache/conftool/dbconfig/20230419-010247-ladsgroup.json
01:01 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1073.eqiad.wmnet with reason: host reimage
00:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1072.eqiad.wmnet with reason: host reimage
00:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P47159 and previous config saved to /var/cache/conftool/dbconfig/20230419-004741-ladsgroup.json
00:44 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1072.eqiad.wmnet with reason: host reimage
00:39 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1074.eqiad.wmnet with OS bullseye
00:37 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1073.eqiad.wmnet with OS bullseye
00:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1075.mgmt.eqiad.wmnet with reboot policy FORCED
00:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T333332)', diff saved to https://phabricator.wikimedia.org/P47158 and previous config saved to /var/cache/conftool/dbconfig/20230419-003235-ladsgroup.json
00:30 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be1075.mgmt.eqiad.wmnet with reboot policy FORCED
00:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1210 (T333332)', diff saved to https://phabricator.wikimedia.org/P47157 and previous config saved to /var/cache/conftool/dbconfig/20230419-002952-ladsgroup.json
00:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
00:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1210.eqiad.wmnet with reason: Maintenance
00:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T333332)', diff saved to https://phabricator.wikimedia.org/P47156 and previous config saved to /var/cache/conftool/dbconfig/20230419-002929-ladsgroup.json
00:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be1072.eqiad.wmnet with OS bullseye
00:24 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1074.mgmt.eqiad.wmnet with reboot policy FORCED
00:19 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be1074.mgmt.eqiad.wmnet with reboot policy FORCED
00:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED
00:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P47155 and previous config saved to /var/cache/conftool/dbconfig/20230419-001423-ladsgroup.json
00:10 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
00:02 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED
00:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED
00:01 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED

2023-04-18

23:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P47154 and previous config saved to /var/cache/conftool/dbconfig/20230418-235916-ladsgroup.json
23:58 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
23:53 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED
23:50 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED
23:49 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
23:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T333332)', diff saved to https://phabricator.wikimedia.org/P47153 and previous config saved to /var/cache/conftool/dbconfig/20230418-234410-ladsgroup.json
23:43 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
23:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1200 (T333332)', diff saved to https://phabricator.wikimedia.org/P47152 and previous config saved to /var/cache/conftool/dbconfig/20230418-234032-ladsgroup.json
23:40 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
23:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1200.eqiad.wmnet with reason: Maintenance
23:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T333332)', diff saved to https://phabricator.wikimedia.org/P47151 and previous config saved to /var/cache/conftool/dbconfig/20230418-234008-ladsgroup.json
23:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P47150 and previous config saved to /var/cache/conftool/dbconfig/20230418-232502-ladsgroup.json
23:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P47149 and previous config saved to /var/cache/conftool/dbconfig/20230418-230956-ladsgroup.json
22:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T333332)', diff saved to https://phabricator.wikimedia.org/P47148 and previous config saved to /var/cache/conftool/dbconfig/20230418-225449-ladsgroup.json
22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1185 (T333332)', diff saved to https://phabricator.wikimedia.org/P47147 and previous config saved to /var/cache/conftool/dbconfig/20230418-225211-ladsgroup.json
22:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
22:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1185.eqiad.wmnet with reason: Maintenance
22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T333332)', diff saved to https://phabricator.wikimedia.org/P47146 and previous config saved to /var/cache/conftool/dbconfig/20230418-225148-ladsgroup.json
22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P47145 and previous config saved to /var/cache/conftool/dbconfig/20230418-223642-ladsgroup.json
22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183', diff saved to https://phabricator.wikimedia.org/P47144 and previous config saved to /var/cache/conftool/dbconfig/20230418-222135-ladsgroup.json
22:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1183 (T333332)', diff saved to https://phabricator.wikimedia.org/P47143 and previous config saved to /var/cache/conftool/dbconfig/20230418-220629-ladsgroup.json
22:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1183 (T333332)', diff saved to https://phabricator.wikimedia.org/P47142 and previous config saved to /var/cache/conftool/dbconfig/20230418-220350-ladsgroup.json
22:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
22:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1183.eqiad.wmnet with reason: Maintenance
22:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T333332)', diff saved to https://phabricator.wikimedia.org/P47141 and previous config saved to /var/cache/conftool/dbconfig/20230418-220327-ladsgroup.json
21:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P47140 and previous config saved to /var/cache/conftool/dbconfig/20230418-214820-ladsgroup.json
21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P47139 and previous config saved to /var/cache/conftool/dbconfig/20230418-213314-ladsgroup.json
21:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T333332)', diff saved to https://phabricator.wikimedia.org/P47138 and previous config saved to /var/cache/conftool/dbconfig/20230418-211808-ladsgroup.json
21:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T333332)', diff saved to https://phabricator.wikimedia.org/P47137 and previous config saved to /var/cache/conftool/dbconfig/20230418-211529-ladsgroup.json
21:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
21:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
21:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
21:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1161.eqiad.wmnet with reason: Maintenance
21:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
21:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
21:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47136 and previous config saved to /var/cache/conftool/dbconfig/20230418-211354-ladsgroup.json
20:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P47134 and previous config saved to /var/cache/conftool/dbconfig/20230418-205848-ladsgroup.json
20:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315', diff saved to https://phabricator.wikimedia.org/P47133 and previous config saved to /var/cache/conftool/dbconfig/20230418-204339-ladsgroup.json
20:32 TheresNoTime: close UTC late backport window
20:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47132 and previous config saved to /var/cache/conftool/dbconfig/20230418-202833-ladsgroup.json
20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47131 and previous config saved to /var/cache/conftool/dbconfig/20230418-202554-ladsgroup.json
20:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
20:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47130 and previous config saved to /var/cache/conftool/dbconfig/20230418-202530-ladsgroup.json
20:25 samtar@deploy2002: Finished scap: Backport for Remove weird VisualEditor config hack from 2015, Simplify some more VisualEditor configuration (duration: 10m 32s)
20:16 samtar@deploy2002: matmarex and samtar: Backport for Remove weird VisualEditor config hack from 2015, Simplify some more VisualEditor configuration synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
20:14 samtar@deploy2002: Started scap: Backport for Remove weird VisualEditor config hack from 2015, Simplify some more VisualEditor configuration
20:13 samtar@deploy2002: Finished scap: Backport for Enable visual enhancements on pages using on dewiki (T318596) (duration: 07m 49s)
20:13 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirtlocal1003.eqiad.wmnet with OS bullseye
20:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P47129 and previous config saved to /var/cache/conftool/dbconfig/20230418-201024-ladsgroup.json
20:06 samtar@deploy2002: matmarex and samtar: Backport for Enable visual enhancements on pages using on dewiki (T318596) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
20:05 samtar@deploy2002: Started scap: Backport for Enable visual enhancements on pages using on dewiki (T318596)
19:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P47126 and previous config saved to /var/cache/conftool/dbconfig/20230418-195518-ladsgroup.json
19:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T333332)', diff saved to https://phabricator.wikimedia.org/P47125 and previous config saved to /var/cache/conftool/dbconfig/20230418-194401-ladsgroup.json
19:43 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirtlocal1003.eqiad.wmnet with reason: host reimage
19:40 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1003.eqiad.wmnet with reason: host reimage
19:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47124 and previous config saved to /var/cache/conftool/dbconfig/20230418-194012-ladsgroup.json
19:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T333332)', diff saved to https://phabricator.wikimedia.org/P47123 and previous config saved to /var/cache/conftool/dbconfig/20230418-193832-ladsgroup.json
19:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1113.eqiad.wmnet with reason: Maintenance
19:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1113.eqiad.wmnet with reason: Maintenance
19:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T333332)', diff saved to https://phabricator.wikimedia.org/P47122 and previous config saved to /var/cache/conftool/dbconfig/20230418-193809-ladsgroup.json
19:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P47121 and previous config saved to /var/cache/conftool/dbconfig/20230418-192855-ladsgroup.json
19:24 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1003.eqiad.wmnet with OS bullseye
19:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P47120 and previous config saved to /var/cache/conftool/dbconfig/20230418-192302-ladsgroup.json
19:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P47119 and previous config saved to /var/cache/conftool/dbconfig/20230418-191348-ladsgroup.json
19:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P47118 and previous config saved to /var/cache/conftool/dbconfig/20230418-190756-ladsgroup.json
19:03 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T333332)', diff saved to https://phabricator.wikimedia.org/P47117 and previous config saved to /var/cache/conftool/dbconfig/20230418-185842-ladsgroup.json
18:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2176 (T333332)', diff saved to https://phabricator.wikimedia.org/P47116 and previous config saved to /var/cache/conftool/dbconfig/20230418-185627-ladsgroup.json
18:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
18:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2176.codfw.wmnet with reason: Maintenance
18:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T333332)', diff saved to https://phabricator.wikimedia.org/P47115 and previous config saved to /var/cache/conftool/dbconfig/20230418-185604-ladsgroup.json
18:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T333332)', diff saved to https://phabricator.wikimedia.org/P47114 and previous config saved to /var/cache/conftool/dbconfig/20230418-185250-ladsgroup.json
18:51 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host cloudswift1002.mgmt.eqiad.wmnet with reboot policy FORCED
18:51 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host cloudswift1001.mgmt.eqiad.wmnet with reboot policy FORCED
18:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T333332)', diff saved to https://phabricator.wikimedia.org/P47113 and previous config saved to /var/cache/conftool/dbconfig/20230418-185010-ladsgroup.json
18:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1110.eqiad.wmnet with reason: Maintenance
18:49 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:49 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entries for cloudswift100[1-2] - pt1979@cumin2002"
18:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1110.eqiad.wmnet with reason: Maintenance
18:48 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS entries for cloudswift100[1-2] - pt1979@cumin2002"
18:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
18:46 pt1979@cumin2002: START - Cookbook sre.dns.netbox
18:44 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
18:43 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs1020
18:43 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs1020
18:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P47112 and previous config saved to /var/cache/conftool/dbconfig/20230418-184058-ladsgroup.json
18:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:28 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:26 taavi@deploy2002: Finished scap: 909693 and 909700 (duration: 07m 36s)
18:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P47111 and previous config saved to /var/cache/conftool/dbconfig/20230418-182551-ladsgroup.json
18:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:19 taavi@deploy2002: taavi: 909693 and 909700 synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
18:18 taavi@deploy2002: Started scap: 909693 and 909700
18:15 taavi@deploy2002: Finished scap: Backport for Add temporary message for Graph being disabled (T334895), Add temporary message for Graph being disabled (T334895), Add temporary tracking category for Graph being disabled (T334895), Add temporary tracking category for Graph being disabled (T334895) (duration: 37m 33s)
18:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T333332)', diff saved to https://phabricator.wikimedia.org/P47110 and previous config saved to /var/cache/conftool/dbconfig/20230418-181045-ladsgroup.json
18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2174 (T333332)', diff saved to https://phabricator.wikimedia.org/P47109 and previous config saved to /var/cache/conftool/dbconfig/20230418-180830-ladsgroup.json
18:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
18:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2174.codfw.wmnet with reason: Maintenance
18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T333332)', diff saved to https://phabricator.wikimedia.org/P47108 and previous config saved to /var/cache/conftool/dbconfig/20230418-180807-ladsgroup.json
17:59 taavi@deploy2002: taavi: Backport for Add temporary message for Graph being disabled (T334895), Add temporary message for Graph being disabled (T334895), Add temporary tracking category for Graph being disabled (T334895), Add temporary tracking category for Graph being disabled (T334895) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1
17:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P47107 and previous config saved to /var/cache/conftool/dbconfig/20230418-175301-ladsgroup.json
17:48 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:47 jclark@cumin1001: START - Cookbook sre.dns.netbox
17:47 jclark@cumin1001: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
17:46 jclark@cumin1001: START - Cookbook sre.dns.netbox
17:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P47106 and previous config saved to /var/cache/conftool/dbconfig/20230418-173754-ladsgroup.json
17:37 taavi@deploy2002: Started scap: Backport for Add temporary message for Graph being disabled (T334895), Add temporary message for Graph being disabled (T334895), Add temporary tracking category for Graph being disabled (T334895), Add temporary tracking category for Graph being disabled (T334895)
17:26 htriedman@deploy2002: Finished deploy [airflow-dags/platform_eng@3b8ab60]: (no justification provided) (duration: 00m 12s)
17:26 htriedman@deploy2002: Started deploy [airflow-dags/platform_eng@3b8ab60]: (no justification provided)
17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T333332)', diff saved to https://phabricator.wikimedia.org/P47105 and previous config saved to /var/cache/conftool/dbconfig/20230418-172247-ladsgroup.json
17:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2173 (T333332)', diff saved to https://phabricator.wikimedia.org/P47104 and previous config saved to /var/cache/conftool/dbconfig/20230418-172032-ladsgroup.json
17:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
17:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
17:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
17:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2173.codfw.wmnet with reason: Maintenance
17:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T333332)', diff saved to https://phabricator.wikimedia.org/P47103 and previous config saved to /var/cache/conftool/dbconfig/20230418-171951-ladsgroup.json
17:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P47102 and previous config saved to /var/cache/conftool/dbconfig/20230418-170445-ladsgroup.json
16:57 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host htmldumper1001.eqiad.wmnet with OS bullseye
16:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P47101 and previous config saved to /var/cache/conftool/dbconfig/20230418-164939-ladsgroup.json
16:44 hnowlan@puppetmaster1001: conftool action : set/weight=6; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
16:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T333332)', diff saved to https://phabricator.wikimedia.org/P47100 and previous config saved to /var/cache/conftool/dbconfig/20230418-163432-ladsgroup.json
16:33 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on htmldumper1001.eqiad.wmnet with reason: host reimage
16:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 (T333332)', diff saved to https://phabricator.wikimedia.org/P47099 and previous config saved to /var/cache/conftool/dbconfig/20230418-163217-ladsgroup.json
16:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
16:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T333332)', diff saved to https://phabricator.wikimedia.org/P47098 and previous config saved to /var/cache/conftool/dbconfig/20230418-163154-ladsgroup.json
16:29 ariel@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on htmldumper1001.eqiad.wmnet with reason: host reimage
16:23 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:22 sukhe@cumin2002: START - Cookbook sre.dns.netbox
16:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P47097 and previous config saved to /var/cache/conftool/dbconfig/20230418-161648-ladsgroup.json
16:14 cgoubert@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool restbase-async in codfw: Depool from primary DC following network maintenance
16:09 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase-async.discovery.wmnet on all recursors
16:09 cgoubert@cumin1001: START - Cookbook sre.dns.wipe-cache restbase-async.discovery.wmnet on all recursors
16:09 cgoubert@cumin1001: START - Cookbook sre.discovery.service-route depool restbase-async in codfw: Depool from primary DC following network maintenance
16:08 claime: depooling restbase-async from codfw
16:08 cgoubert@cumin1001: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in eqiad: End of maintenance - T333377
16:08 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter pool all active/active services in eqiad: End of maintenance - T333377
16:04 cgoubert@cumin1001: END (FAIL) - Cookbook sre.discovery.datacenter (exit_code=93) pool all active/active services in eqiad: End of maintenance - T333377
16:03 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter pool all active/active services in eqiad: End of maintenance - T333377
16:03 cgoubert@cumin1001: END (FAIL) - Cookbook sre.discovery.datacenter (exit_code=93) pool all active/active services in eqiad: End of maintenance - T333377
16:03 ariel@cumin1001: START - Cookbook sre.hosts.reimage for host htmldumper1001.eqiad.wmnet with OS bullseye
16:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P47095 and previous config saved to /var/cache/conftool/dbconfig/20230418-160141-ladsgroup.json
16:00 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter pool all active/active services in eqiad: End of maintenance - T333377
16:00 cgoubert@cumin1001: END (ERROR) - Cookbook sre.discovery.datacenter (exit_code=93) pool all active/active services in eqiad: End of maintenance - T333377
15:54 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter pool all active/active services in eqiad: End of maintenance - T333377
15:54 cgoubert@cumin1001: END (ERROR) - Cookbook sre.discovery.datacenter (exit_code=93) pool all active/active services in eqiad: End of maintenance - T333377
15:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T333332)', diff saved to https://phabricator.wikimedia.org/P47093 and previous config saved to /var/cache/conftool/dbconfig/20230418-154635-ladsgroup.json
15:45 sukhe: enable puppet in A:lvs and A:codfw to test CR 908909
15:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 (T333332)', diff saved to https://phabricator.wikimedia.org/P47092 and previous config saved to /var/cache/conftool/dbconfig/20230418-154219-ladsgroup.json
15:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
15:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
15:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T333332)', diff saved to https://phabricator.wikimedia.org/P47091 and previous config saved to /var/cache/conftool/dbconfig/20230418-154156-ladsgroup.json
15:38 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter pool all active/active services in eqiad: End of maintenance - T333377
15:38 cgoubert@cumin1001: END (ERROR) - Cookbook sre.discovery.datacenter (exit_code=93) pool all active/active services in eqiad: End of maintenance - T333377
15:37 sukhe: disable puppet in A:lvs and A:codfw to test CR 908909
15:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P47090 and previous config saved to /var/cache/conftool/dbconfig/20230418-152649-ladsgroup.json
15:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P47089 and previous config saved to /var/cache/conftool/dbconfig/20230418-151143-ladsgroup.json
15:07 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter pool all active/active services in eqiad: End of maintenance - T333377
15:07 claime: repooling all eqiad active/active services post T333377
14:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T333332)', diff saved to https://phabricator.wikimedia.org/P47088 and previous config saved to /var/cache/conftool/dbconfig/20230418-145637-ladsgroup.json
14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2153 (T333332)', diff saved to https://phabricator.wikimedia.org/P47087 and previous config saved to /var/cache/conftool/dbconfig/20230418-145422-ladsgroup.json
14:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
14:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2153.codfw.wmnet with reason: Maintenance
14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T333332)', diff saved to https://phabricator.wikimedia.org/P47086 and previous config saved to /var/cache/conftool/dbconfig/20230418-145359-ladsgroup.json
14:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P47085 and previous config saved to /var/cache/conftool/dbconfig/20230418-143852-ladsgroup.json
14:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P47084 and previous config saved to /var/cache/conftool/dbconfig/20230418-142346-ladsgroup.json
14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T333332)', diff saved to https://phabricator.wikimedia.org/P47083 and previous config saved to /var/cache/conftool/dbconfig/20230418-140840-ladsgroup.json
14:06 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase1018.eqiad.wmnet
14:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T333332)', diff saved to https://phabricator.wikimedia.org/P47082 and previous config saved to /var/cache/conftool/dbconfig/20230418-140626-ladsgroup.json
14:06 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase102[5-7].eqiad.wmnet
14:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
14:06 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase103[03].eqiad.wmnet
14:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2146.codfw.wmnet with reason: Maintenance
14:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T333332)', diff saved to https://phabricator.wikimedia.org/P47081 and previous config saved to /var/cache/conftool/dbconfig/20230418-140602-ladsgroup.json
14:04 sukhe: running authdns-update to repool eqiad after switch maint: T333377
13:57 btullis@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts an-worker1110.eqiad.wmnet
13:57 btullis@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts an-worker1110.eqiad.wmnet
13:55 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for asw2-d-eqiad
13:55 cmooney@cumin1001: START - Cookbook sre.hosts.remove-downtime for asw2-d-eqiad
13:52 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 270 hosts
13:51 jmm@puppetmaster1001: conftool action : set/pooled=yes; selector: name=ldap-replica1004.wikimedia.org
13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P47080 and previous config saved to /var/cache/conftool/dbconfig/20230418-135056-ladsgroup.json
13:49 cmooney@cumin1001: START - Cookbook sre.hosts.remove-downtime for 270 hosts
13:41 elukey: restart etcdmirror on conf2005 (down due to conf1009 under maintenance)
13:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P47079 and previous config saved to /var/cache/conftool/dbconfig/20230418-133549-ladsgroup.json
13:25 topranks: Rebooting asw2-d-eqiad virtual-chassis (all row D top-of-rack switches) to upgrade JunOS. Row D going down T333377
13:22 xSavitar: RESTBase/Proton deployment complete
13:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T333332)', diff saved to https://phabricator.wikimedia.org/P47078 and previous config saved to /var/cache/conftool/dbconfig/20230418-132042-ladsgroup.json
13:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2145 (T333332)', diff saved to https://phabricator.wikimedia.org/P47076 and previous config saved to /var/cache/conftool/dbconfig/20230418-131827-ladsgroup.json
13:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
13:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2145.codfw.wmnet with reason: Maintenance
13:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
13:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
13:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T333332)', diff saved to https://phabricator.wikimedia.org/P47075 and previous config saved to /var/cache/conftool/dbconfig/20230418-131738-ladsgroup.json
13:16 derick@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
13:15 derick@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
13:15 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on asw2-d-eqiad with reason: eqiad row D upgrade
13:15 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on asw2-d-eqiad with reason: eqiad row D upgrade
13:14 derick@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
13:13 derick@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
13:12 derick@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
13:12 jbond: disable puppet fleet wide T333377
13:11 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 270 hosts with reason: eqiad row D upgrade
13:10 derick@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
13:06 topranks: disabling ping offload on cr1-eqiad and cr2-eqiad in advance of row D switch upgrade T333377
13:06 jbond: upload libapache2-mod-auth-cas_1.2-1+wmf12u1
13:04 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 270 hosts with reason: eqiad row D upgrade
13:03 derick@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
13:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P47074 and previous config saved to /var/cache/conftool/dbconfig/20230418-130231-ladsgroup.json
13:02 derick@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
12:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P47073 and previous config saved to /var/cache/conftool/dbconfig/20230418-124724-ladsgroup.json
12:40 sukhe: run authdns-update to depool eqiad for switch upgrade
12:39 moritzm: imported puppet 5.5.22-2+deb12u2 for bookworm-wikimedia T330495
12:36 jiji@cumin1001: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) status all services in all: None - None
12:36 jiji@cumin1001: START - Cookbook sre.discovery.datacenter status all services in all: None - None
12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T333332)', diff saved to https://phabricator.wikimedia.org/P47072 and previous config saved to /var/cache/conftool/dbconfig/20230418-123218-ladsgroup.json
12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2130 (T333332)', diff saved to https://phabricator.wikimedia.org/P47071 and previous config saved to /var/cache/conftool/dbconfig/20230418-122903-ladsgroup.json
12:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
12:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2130.codfw.wmnet with reason: Maintenance
12:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T333332)', diff saved to https://phabricator.wikimedia.org/P47070 and previous config saved to /var/cache/conftool/dbconfig/20230418-122839-ladsgroup.json
12:27 jiji@cumin1001: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all active/active services in eqiad: eqiad row D switches upgrade - T333377
12:27 jiji@cumin1001: START - Cookbook sre.discovery.datacenter depool all active/active services in eqiad: eqiad row D switches upgrade - T333377
12:26 jiji@cumin1001: END (FAIL) - Cookbook sre.discovery.datacenter (exit_code=93) depool all active/active services in eqiad: eqiad row D switches upgrade - T333377
12:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P47069 and previous config saved to /var/cache/conftool/dbconfig/20230418-121333-ladsgroup.json
11:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P47068 and previous config saved to /var/cache/conftool/dbconfig/20230418-115827-ladsgroup.json
11:57 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase103[03].eqiad.wmnet
11:57 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase102[5-7].eqiad.wmnet
11:57 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=restbase1018.eqiad.wmnet
11:50 jiji@cumin1001: START - Cookbook sre.discovery.datacenter depool all active/active services in eqiad: eqiad row D switches upgrade - T333377
11:49 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1102.eqiad.wmnet
11:49 jynus@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:49 jynus@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1102.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1001"
11:48 effie: depooling eqiad due to eqiad row D switches upgrade - T333377
11:46 jynus@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1102.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1001"
11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T333332)', diff saved to https://phabricator.wikimedia.org/P47067 and previous config saved to /var/cache/conftool/dbconfig/20230418-114320-ladsgroup.json
11:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2116 (T333332)', diff saved to https://phabricator.wikimedia.org/P47066 and previous config saved to /var/cache/conftool/dbconfig/20230418-114106-ladsgroup.json
11:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2116.codfw.wmnet with reason: Maintenance
11:40 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2116.codfw.wmnet with reason: Maintenance
11:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T333332)', diff saved to https://phabricator.wikimedia.org/P47065 and previous config saved to /var/cache/conftool/dbconfig/20230418-114042-ladsgroup.json
11:39 jynus@cumin1001: START - Cookbook sre.dns.netbox
11:34 jynus@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1102.eqiad.wmnet
11:32 btullis@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts an-worker1110.eqiad.wmnet
11:30 btullis@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts an-worker1110.eqiad.wmnet
11:27 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1116.eqiad.wmnet
11:27 jynus@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:27 jynus@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1116.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1001"
11:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P47064 and previous config saved to /var/cache/conftool/dbconfig/20230418-112536-ladsgroup.json
11:24 jynus@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1116.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jynus@cumin1001"
11:22 jynus@cumin1001: START - Cookbook sre.dns.netbox
11:22 taavi@deploy2002: Finished scap: Backport for Hide raw Graph tags (T334895) (duration: 07m 09s)
11:16 jynus@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1116.eqiad.wmnet
11:16 taavi@deploy2002: taavi: Backport for Hide raw Graph tags (T334895) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
11:14 taavi@deploy2002: Started scap: Backport for Hide raw Graph tags (T334895)
11:10 urbanecm@deploy2002: Finished scap: Backport for [Growth] Prepare for a Personalized praise config variable change (T334630) (duration: 06m 43s)
11:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P47063 and previous config saved to /var/cache/conftool/dbconfig/20230418-111029-ladsgroup.json
11:03 urbanecm@deploy2002: Started scap: Backport for [Growth] Prepare for a Personalized praise config variable change (T334630)
11:00 elukey: puppet cert clean kafka_jumbo-eqiad_broker on puppetmaster1001 - remove old certificate (not used anymore)
10:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T333332)', diff saved to https://phabricator.wikimedia.org/P47062 and previous config saved to /var/cache/conftool/dbconfig/20230418-105523-ladsgroup.json
10:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2103 (T333332)', diff saved to https://phabricator.wikimedia.org/P47061 and previous config saved to /var/cache/conftool/dbconfig/20230418-105308-ladsgroup.json
10:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance
10:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2103.codfw.wmnet with reason: Maintenance
10:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2102.codfw.wmnet with reason: Maintenance
10:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2102.codfw.wmnet with reason: Maintenance
10:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
10:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
10:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
10:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
10:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T333332)', diff saved to https://phabricator.wikimedia.org/P47060 and previous config saved to /var/cache/conftool/dbconfig/20230418-105131-ladsgroup.json
10:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P47059 and previous config saved to /var/cache/conftool/dbconfig/20230418-103625-ladsgroup.json
10:25 jmm@puppetmaster1001: conftool action : set/pooled=no; selector: name=ldap-replica1004.wikimedia.org
10:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P47058 and previous config saved to /var/cache/conftool/dbconfig/20230418-102119-ladsgroup.json
10:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T333332)', diff saved to https://phabricator.wikimedia.org/P47057 and previous config saved to /var/cache/conftool/dbconfig/20230418-100612-ladsgroup.json
10:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1218 (T333332)', diff saved to https://phabricator.wikimedia.org/P47056 and previous config saved to /var/cache/conftool/dbconfig/20230418-100359-ladsgroup.json
10:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1218.eqiad.wmnet with reason: Maintenance
10:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1218.eqiad.wmnet with reason: Maintenance
08:38 jnuche@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.5 refs T330211
08:37 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on an-worker1110.eqiad.wmnet with reason: Upgrading RAID controller firmware
08:37 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on an-worker1110.eqiad.wmnet with reason: Upgrading RAID controller firmware
08:12 zabe@deploy2002: Finished scap: Backport for Add separate config for enabling JsonConfig (duration: 07m 43s)
08:08 dcausse: repooling wdqs2011
08:06 zabe@deploy2002: zabe: Backport for Add separate config for enabling JsonConfig synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
08:04 zabe@deploy2002: Started scap: Backport for Add separate config for enabling JsonConfig
07:51 cgoubert@deploy2002: Finished scap: Forcing redeploy (duration: 02m 31s)
07:50 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1212 to dbctl T326669', diff saved to https://phabricator.wikimedia.org/P47055 and previous config saved to /var/cache/conftool/dbconfig/20230418-075032-marostegui.json
07:48 cgoubert@deploy2002: Started scap: Forcing redeploy
07:41 zabe@deploy2002: Finished scap: T334895 (duration: 06m 42s)
07:35 zabe@deploy2002: Started scap: T334895
07:30 zabe@deploy2002: Finished scap: T334895 (duration: 06m 37s)
07:24 zabe@deploy2002: Started scap: T334895
07:20 zabe@deploy2002: scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki=aawiki --force-version "1.41.0-wmf.4" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.8ZJFnr01rx"' returned non-zero exit status 255. (duration: 00m 00s)
07:20 zabe@deploy2002: Started scap: T334895
07:18 zabe@deploy2002: scap failed: CalledProcessError Command '/usr/local/bin/mwscript mergeMessageFileList.php --wiki=aawiki --force-version "1.41.0-wmf.4" --list-file="/srv/mediawiki-staging/wmf-config/extension-list" --output="/tmp/tmp.c2xgrltrG8"' returned non-zero exit status 255. (duration: 00m 01s)
07:18 zabe@deploy2002: Started scap: T334895
07:16 joe: added requestctl rule for T332061 in logging mode
07:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1109.eqiad.wmnet
07:06 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:06 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1109.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:05 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1109.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:03 marostegui@cumin1001: START - Cookbook sre.dns.netbox
06:59 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1109.eqiad.wmnet
06:11 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2142 to x2 primary T334821', diff saved to https://phabricator.wikimedia.org/P47054 and previous config saved to /var/cache/conftool/dbconfig/20230418-061101-root.json
06:06 marostegui: Starting x2 codfw failover from db2144 to db2142 - T334821
06:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover x2 T334821
06:02 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 16591
06:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover x2 T334821
06:01 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 16591
03:53 mwpresync@deploy2002: Pruned MediaWiki: 1.41.0-wmf.3 (duration: 02m 08s)
03:51 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.41.0-wmf.5 refs T330211 (duration: 49m 03s)
03:30 eileen: civicrm upgraded from 0b8e303d to dab8912d
03:02 mwpresync@deploy2002: Started scap: testwikis wikis to 1.41.0-wmf.5 refs T330211
01:38 eileen: civicrm upgraded from cd0f886d to 0b8e303d
00:54 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cassandra-dev2001.codfw.wmnet
00:54 eevans@cumin1001: START - Cookbook sre.hosts.remove-downtime for cassandra-dev2001.codfw.wmnet
00:28 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cassandra-dev2001.codfw.wmnet with reason: testing systemd unit changes — T327954
00:28 eevans@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on cassandra-dev2001.codfw.wmnet with reason: testing systemd unit changes — T327954
00:26 eileen: config revision changed from 7da418a4 to f25cb7cc

2023-04-17

22:00 zabe@deploy2002: Finished scap: Backport for Fix infinite loop for self-redirects with variants conversion (T333050) (duration: 06m 52s)
22:00 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 13 hosts with reason: T333377 maint
21:59 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 13 hosts with reason: T333377 maint
21:54 zabe@deploy2002: zabe: Backport for Fix infinite loop for self-redirects with variants conversion (T333050) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
21:53 zabe@deploy2002: Started scap: Backport for Fix infinite loop for self-redirects with variants conversion (T333050)
21:45 zabe@deploy2002: Finished scap: Backport for RC: Handle deleted story (T334829) (duration: 07m 01s)
21:39 zabe@deploy2002: zabe: Backport for RC: Handle deleted story (T334829) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
21:38 zabe@deploy2002: Started scap: Backport for RC: Handle deleted story (T334829)
21:20 sbassett: Deployed updated mitigation for T333140
21:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T333332)', diff saved to https://phabricator.wikimedia.org/P47053 and previous config saved to /var/cache/conftool/dbconfig/20230417-211909-ladsgroup.json
21:17 inflatador: bking@cumin1001 ban cloudelastic1004 for upcoming switch maintenance T333377
21:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P47052 and previous config saved to /var/cache/conftool/dbconfig/20230417-210403-ladsgroup.json
20:52 urbanecm@deploy2002: Finished scap: Backport for [trwikiquote] Add a HD logo for Vector legacy (T334732) (duration: 07m 02s)
20:50 otto@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
20:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P47051 and previous config saved to /var/cache/conftool/dbconfig/20230417-204856-ladsgroup.json
20:48 joal@deploy2002: Started restart [analytics/aqs/deploy@d273fde]: Restarting AQS to pick up new druid datasource
20:46 urbanecm@deploy2002: urbanecm and superpes: Backport for [trwikiquote] Add a HD logo for Vector legacy (T334732) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
20:44 urbanecm@deploy2002: Started scap: Backport for [trwikiquote] Add a HD logo for Vector legacy (T334732)
20:35 urbanecm@deploy2002: Finished scap: Backport for Mobile editor: Don't try to take over if the form has already been submitted (T334794 T334797 T334877), Mobile editor: Don't try to take over on non-wikitext content (T334799) (duration: 09m 14s)
20:35 otto@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T333332)', diff saved to https://phabricator.wikimedia.org/P47049 and previous config saved to /var/cache/conftool/dbconfig/20230417-203350-ladsgroup.json
20:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2175 (T333332)', diff saved to https://phabricator.wikimedia.org/P47048 and previous config saved to /var/cache/conftool/dbconfig/20230417-203108-ladsgroup.json
20:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2175.codfw.wmnet with reason: Maintenance
20:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2175.codfw.wmnet with reason: Maintenance
20:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P47047 and previous config saved to /var/cache/conftool/dbconfig/20230417-203056-ladsgroup.json
20:27 urbanecm@deploy2002: urbanecm and matmarex: Backport for Mobile editor: Don't try to take over if the form has already been submitted (T334794 T334797 T334877), Mobile editor: Don't try to take over on non-wikitext content (T334799) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
20:26 urbanecm@deploy2002: Started scap: Backport for Mobile editor: Don't try to take over if the form has already been submitted (T334794 T334797 T334877), Mobile editor: Don't try to take over on non-wikitext content (T334799)
20:25 urbanecm@deploy2002: Finished scap: Backport for Stop using redundant $wmg variables for VisualEditor extension (T119117) (duration: 08m 19s)
20:18 urbanecm@deploy2002: urbanecm and matmarex: Backport for Stop using redundant $wmg variables for VisualEditor extension (T119117) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
20:17 urbanecm@deploy2002: Started scap: Backport for Stop using redundant $wmg variables for VisualEditor extension (T119117)
20:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P47046 and previous config saved to /var/cache/conftool/dbconfig/20230417-201549-ladsgroup.json
20:14 urbanecm@deploy2002: Finished scap: Backport for ruwiki: Allow sysop to add/remove confirmed group (T334780) (duration: 07m 31s)
20:08 urbanecm@deploy2002: urbanecm and stang: Backport for ruwiki: Allow sysop to add/remove confirmed group (T334780) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
20:06 urbanecm@deploy2002: Started scap: Backport for ruwiki: Allow sysop to add/remove confirmed group (T334780)
20:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P47045 and previous config saved to /var/cache/conftool/dbconfig/20230417-200043-ladsgroup.json
19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P47044 and previous config saved to /var/cache/conftool/dbconfig/20230417-194537-ladsgroup.json
19:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P47043 and previous config saved to /var/cache/conftool/dbconfig/20230417-194253-ladsgroup.json
19:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
19:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2170.codfw.wmnet with reason: Maintenance
19:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T333332)', diff saved to https://phabricator.wikimedia.org/P47042 and previous config saved to /var/cache/conftool/dbconfig/20230417-194229-ladsgroup.json
19:32 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab2003.wikimedia.org with OS bullseye
19:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P47041 and previous config saved to /var/cache/conftool/dbconfig/20230417-192723-ladsgroup.json
19:16 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
19:13 jelto@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
19:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P47040 and previous config saved to /var/cache/conftool/dbconfig/20230417-191217-ladsgroup.json
19:00 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
18:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T333332)', diff saved to https://phabricator.wikimedia.org/P47039 and previous config saved to /var/cache/conftool/dbconfig/20230417-185710-ladsgroup.json
18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2148 (T333332)', diff saved to https://phabricator.wikimedia.org/P47038 and previous config saved to /var/cache/conftool/dbconfig/20230417-184525-ladsgroup.json
18:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2148.codfw.wmnet with reason: Maintenance
18:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2148.codfw.wmnet with reason: Maintenance
18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P47037 and previous config saved to /var/cache/conftool/dbconfig/20230417-184502-ladsgroup.json
18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P47036 and previous config saved to /var/cache/conftool/dbconfig/20230417-182956-ladsgroup.json
18:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P47035 and previous config saved to /var/cache/conftool/dbconfig/20230417-181449-ladsgroup.json
17:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P47034 and previous config saved to /var/cache/conftool/dbconfig/20230417-175943-ladsgroup.json
17:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P47033 and previous config saved to /var/cache/conftool/dbconfig/20230417-175700-ladsgroup.json
17:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
17:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T333332)', diff saved to https://phabricator.wikimedia.org/P47032 and previous config saved to /var/cache/conftool/dbconfig/20230417-175636-ladsgroup.json
17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P47031 and previous config saved to /var/cache/conftool/dbconfig/20230417-174130-ladsgroup.json
17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P47030 and previous config saved to /var/cache/conftool/dbconfig/20230417-172623-ladsgroup.json
17:26 SandraEbele: restarted turnilo with 'sudo systemctl restart turnilo'
17:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['backup2010']
17:18 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2010']
17:17 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['backup2010']
17:16 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2010']
17:14 jhancock@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['backup2010']
17:14 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2010']
17:13 SandraEbele: restarted Oozie page view-druid-daily job 0174450-220913162928808-oozie-oozi-C
17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T333332)', diff saved to https://phabricator.wikimedia.org/P47029 and previous config saved to /var/cache/conftool/dbconfig/20230417-171117-ladsgroup.json
17:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T333332)', diff saved to https://phabricator.wikimedia.org/P47028 and previous config saved to /var/cache/conftool/dbconfig/20230417-170838-ladsgroup.json
17:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
17:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
17:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2126.codfw.wmnet with reason: Maintenance
17:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2126.codfw.wmnet with reason: Maintenance
17:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T333332)', diff saved to https://phabricator.wikimedia.org/P47027 and previous config saved to /var/cache/conftool/dbconfig/20230417-170757-ladsgroup.json
17:04 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['backup2010']
17:03 volans: installed spicerack_6.4.2 on cumin1001
17:01 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['backup2010']
16:59 xcollazo@deploy2002: Finished deploy [airflow-dags/analytics@f8dad05]: analytics: deploy Airflow ArchiveOperator should have a number of retries of 0. T332216 (duration: 00m 12s)
16:59 xcollazo@deploy2002: Started deploy [airflow-dags/analytics@f8dad05]: analytics: deploy Airflow ArchiveOperator should have a number of retries of 0. T332216
16:56 SandraEbele: restarted Oozie page view-druid-hourly job 0174449-220913162928808-oozie-oozi-C
16:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P47026 and previous config saved to /var/cache/conftool/dbconfig/20230417-165251-ladsgroup.json
16:49 volans: installed spicerack_6.4.2 on cumin2002
16:46 volans: uploaded spicerack_6.4.2 to apt.wikimedia.org bullseye-wikimedia
16:44 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host backup2010.mgmt.codfw.wmnet with reboot policy FORCED
16:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P47025 and previous config saved to /var/cache/conftool/dbconfig/20230417-163744-ladsgroup.json
16:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T333332)', diff saved to https://phabricator.wikimedia.org/P47024 and previous config saved to /var/cache/conftool/dbconfig/20230417-162238-ladsgroup.json
16:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 (T333332)', diff saved to https://phabricator.wikimedia.org/P47023 and previous config saved to /var/cache/conftool/dbconfig/20230417-161955-ladsgroup.json
16:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2125.codfw.wmnet with reason: Maintenance
16:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2125.codfw.wmnet with reason: Maintenance
16:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T333332)', diff saved to https://phabricator.wikimedia.org/P47022 and previous config saved to /var/cache/conftool/dbconfig/20230417-161931-ladsgroup.json
16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P47021 and previous config saved to /var/cache/conftool/dbconfig/20230417-160425-ladsgroup.json
16:02 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host backup2010.mgmt.codfw.wmnet with reboot policy FORCED
15:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P47020 and previous config saved to /var/cache/conftool/dbconfig/20230417-155654-root.json
15:53 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 05m 30s)
15:50 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:50 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add newly racked backup2010 hosts in codfw - jhancock@cumin2002"
15:50 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add newly racked backup2010 hosts in codfw - jhancock@cumin2002"
15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P47019 and previous config saved to /var/cache/conftool/dbconfig/20230417-154918-ladsgroup.json
15:48 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 05m 59s)
15:42 jhancock@cumin2002: START - Cookbook sre.dns.netbox
15:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P47018 and previous config saved to /var/cache/conftool/dbconfig/20230417-154149-root.json
15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T333332)', diff saved to https://phabricator.wikimedia.org/P47017 and previous config saved to /var/cache/conftool/dbconfig/20230417-153412-ladsgroup.json
15:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2104 (T333332)', diff saved to https://phabricator.wikimedia.org/P47016 and previous config saved to /var/cache/conftool/dbconfig/20230417-153134-ladsgroup.json
15:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance
15:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance
15:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
15:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance
15:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
15:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
15:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
15:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance
15:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T333332)', diff saved to https://phabricator.wikimedia.org/P47015 and previous config saved to /var/cache/conftool/dbconfig/20230417-152916-ladsgroup.json
15:27 urbanecm@deploy2002: Finished scap: Expose the sfsblock-bypass right so it can be assigned to global groups (T334856; second try) (duration: 06m 22s)
15:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P47014 and previous config saved to /var/cache/conftool/dbconfig/20230417-152644-root.json
15:21 urbanecm@deploy2002: Started scap: Expose the sfsblock-bypass right so it can be assigned to global groups (T334856; second try)
15:20 urbanecm@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: LVS Maint - Outage (duration: 23m 03s)
15:18 sukhe: run authdns-update and repool eqiad
15:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P47013 and previous config saved to /var/cache/conftool/dbconfig/20230417-151409-ladsgroup.json
15:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P47012 and previous config saved to /var/cache/conftool/dbconfig/20230417-151138-root.json
15:09 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs1020
15:09 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs1020
15:07 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host lvs1020.eqiad.wmnet with OS bullseye
15:07 vgutierrez: rolling restart of HAProxy in the text cluster - T334448
14:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P47011 and previous config saved to /var/cache/conftool/dbconfig/20230417-145902-ladsgroup.json
14:57 urbanecm@deploy2002: Locking from deployment [ALL REPOSITORIES]: LVS Maint - Outage
14:57 urbanecm@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: LVS Maint - Outage (duration: 00m 01s)
14:57 urbanecm@deploy2002: Locking from deployment [ALL REPOSITORIES]: LVS Maint - Outage
14:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P47010 and previous config saved to /var/cache/conftool/dbconfig/20230417-145633-root.json
14:55 claime: repooled mw1375.eqiad.wmnet
14:54 claime: depooling mw1375.eqiad.wmnet
14:53 ladsgroup@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: LVS Maint - Outage (T334703) (duration: 13m 39s)
14:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1222 (T333332)', diff saved to https://phabricator.wikimedia.org/P47009 and previous config saved to /var/cache/conftool/dbconfig/20230417-144356-ladsgroup.json
14:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1222 (T333332)', diff saved to https://phabricator.wikimedia.org/P47008 and previous config saved to /var/cache/conftool/dbconfig/20230417-144133-ladsgroup.json
14:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P47007 and previous config saved to /var/cache/conftool/dbconfig/20230417-144128-root.json
14:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
14:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
14:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T333332)', diff saved to https://phabricator.wikimedia.org/P47006 and previous config saved to /var/cache/conftool/dbconfig/20230417-144109-ladsgroup.json
14:40 ladsgroup@deploy2002: Locking from deployment [ALL REPOSITORIES]: LVS Maint - Outage (T334703)
14:31 cgoubert@cumin1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=parsoid
14:31 claime: repooling parsoid in eqiad
14:31 cgoubert@cumin1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=appserver
14:31 claime: repooling appserver in eqiad
14:30 cgoubert@cumin1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=api_appserver
14:30 claime: repooling api_appserver in eqiad
14:30 sukhe: running authdns-update to depool eqiad
14:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P47005 and previous config saved to /var/cache/conftool/dbconfig/20230417-142623-root.json
14:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P47004 and previous config saved to /var/cache/conftool/dbconfig/20230417-142603-ladsgroup.json
14:25 urbanecm@deploy2002: Finished scap: Backport for Expose the 'sfsblock-bypass' right so it can be assigned to global groups (T334856) (duration: 07m 36s)
14:24 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1020.eqiad.wmnet with reason: host reimage
14:21 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1020.eqiad.wmnet with reason: host reimage
14:19 urbanecm@deploy2002: urbanecm and maurelio: Backport for Expose the 'sfsblock-bypass' right so it can be assigned to global groups (T334856) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
14:17 urbanecm@deploy2002: Started scap: Backport for Expose the 'sfsblock-bypass' right so it can be assigned to global groups (T334856)
14:14 elukey: upload amd-k8s-device-plugin deb (1.25.2.3-1) to bullseye-wikimedia - T333009
14:12 claime: Migrated linkrecommendation to mw-api-int - T334060
14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P47003 and previous config saved to /var/cache/conftool/dbconfig/20230417-141056-ladsgroup.json
14:10 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
14:09 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
14:08 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
14:07 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1020.eqiad.wmnet with OS bullseye
14:07 claime: Migrating linkrecommendation to mw-api-int - T334060
14:06 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T333332)', diff saved to https://phabricator.wikimedia.org/P47002 and previous config saved to /var/cache/conftool/dbconfig/20230417-135550-ladsgroup.json
13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1197 (T333332)', diff saved to https://phabricator.wikimedia.org/P47001 and previous config saved to /var/cache/conftool/dbconfig/20230417-135334-ladsgroup.json
13:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1197.eqiad.wmnet with reason: Maintenance
13:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1197.eqiad.wmnet with reason: Maintenance
13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T333332)', diff saved to https://phabricator.wikimedia.org/P47000 and previous config saved to /var/cache/conftool/dbconfig/20230417-135311-ladsgroup.json
13:47 moritzm: installing mariadb-10.3 security updates (Debian packaged version, not the wmf-mariadb packages)
13:39 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.e4 in codfw
13:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P46999 and previous config saved to /var/cache/conftool/dbconfig/20230417-133804-ladsgroup.json
13:37 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.e4 in codfw
13:30 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1132.eqiad.wmnet
13:23 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-worker1132.eqiad.wmnet
13:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P46998 and previous config saved to /var/cache/conftool/dbconfig/20230417-132258-ladsgroup.json
13:12 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
13:10 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
13:10 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
13:09 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
13:08 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
13:08 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
13:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T333332)', diff saved to https://phabricator.wikimedia.org/P46997 and previous config saved to /var/cache/conftool/dbconfig/20230417-130751-ladsgroup.json
13:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1188 (T333332)', diff saved to https://phabricator.wikimedia.org/P46996 and previous config saved to /var/cache/conftool/dbconfig/20230417-130535-ladsgroup.json
13:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1188.eqiad.wmnet with reason: Maintenance
13:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1188.eqiad.wmnet with reason: Maintenance
13:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T333332)', diff saved to https://phabricator.wikimedia.org/P46995 and previous config saved to /var/cache/conftool/dbconfig/20230417-130512-ladsgroup.json
12:59 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
12:59 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
12:59 claime: Migrating linkrecommendation staging to mw-api-int - T334060
12:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P46994 and previous config saved to /var/cache/conftool/dbconfig/20230417-125006-ladsgroup.json
12:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P46993 and previous config saved to /var/cache/conftool/dbconfig/20230417-123500-ladsgroup.json
12:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T333332)', diff saved to https://phabricator.wikimedia.org/P46992 and previous config saved to /var/cache/conftool/dbconfig/20230417-121953-ladsgroup.json
12:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T333332)', diff saved to https://phabricator.wikimedia.org/P46991 and previous config saved to /var/cache/conftool/dbconfig/20230417-121734-ladsgroup.json
12:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1182.eqiad.wmnet with reason: Maintenance
12:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1182.eqiad.wmnet with reason: Maintenance
12:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P46990 and previous config saved to /var/cache/conftool/dbconfig/20230417-121710-ladsgroup.json
12:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P46989 and previous config saved to /var/cache/conftool/dbconfig/20230417-120204-ladsgroup.json
11:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119 T326669', diff saved to https://phabricator.wikimedia.org/P46987 and previous config saved to /var/cache/conftool/dbconfig/20230417-115847-marostegui.json
11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P46986 and previous config saved to /var/cache/conftool/dbconfig/20230417-114658-ladsgroup.json
11:33 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1132.eqiad.wmnet
11:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P46985 and previous config saved to /var/cache/conftool/dbconfig/20230417-113152-ladsgroup.json
11:30 btullis@cumin1001: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1132.eqiad.wmnet
11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P46984 and previous config saved to /var/cache/conftool/dbconfig/20230417-113031-ladsgroup.json
11:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
11:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T333332)', diff saved to https://phabricator.wikimedia.org/P46983 and previous config saved to /var/cache/conftool/dbconfig/20230417-113008-ladsgroup.json
11:23 kamila@deploy2002: conftool action : set/pooled=yes:weight=10; selector: service=thumbor,name=kubernetes201[0123].codfw.wmnet
11:17 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1109 from dbctl T334820', diff saved to https://phabricator.wikimedia.org/P46981 and previous config saved to /var/cache/conftool/dbconfig/20230417-111724-marostegui.json
11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P46980 and previous config saved to /var/cache/conftool/dbconfig/20230417-111501-ladsgroup.json
11:10 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1132.eqiad.wmnet with OS buster
10:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P46979 and previous config saved to /var/cache/conftool/dbconfig/20230417-105955-ladsgroup.json
10:59 ladsgroup@deploy2002: Finished scap: Backport for filebackend: Find thumbnails from all backends in FileBackendMultiWrite (T331138) (duration: 07m 16s)
10:54 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1132.eqiad.wmnet with reason: host reimage
10:53 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.98 in codfw
10:53 ladsgroup@deploy2002: ladsgroup: Backport for filebackend: Find thumbnails from all backends in FileBackendMultiWrite (T331138) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
10:51 ladsgroup@deploy2002: Started scap: Backport for filebackend: Find thumbnails from all backends in FileBackendMultiWrite (T331138)
10:51 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1132.eqiad.wmnet with reason: host reimage
10:50 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.98 in codfw
10:49 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-commons-local-public.98 in eqiad
10:46 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-commons-local-public.98 in eqiad
10:45 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.remove-ghost-objects (exit_code=0) from container wikipedia-en-local-public.1a in eqiad
10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T333332)', diff saved to https://phabricator.wikimedia.org/P46978 and previous config saved to /var/cache/conftool/dbconfig/20230417-104449-ladsgroup.json
10:42 mvernon@cumin2002: START - Cookbook sre.swift.remove-ghost-objects from container wikipedia-en-local-public.1a in eqiad
10:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T333332)', diff saved to https://phabricator.wikimedia.org/P46977 and previous config saved to /var/cache/conftool/dbconfig/20230417-104229-ladsgroup.json
10:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1156.eqiad.wmnet with reason: Maintenance
10:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1156.eqiad.wmnet with reason: Maintenance
10:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P46976 and previous config saved to /var/cache/conftool/dbconfig/20230417-104144-ladsgroup.json
10:32 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1132.eqiad.wmnet with OS buster
10:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P46974 and previous config saved to /var/cache/conftool/dbconfig/20230417-102637-ladsgroup.json
10:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P46973 and previous config saved to /var/cache/conftool/dbconfig/20230417-101131-ladsgroup.json
10:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46972 and previous config saved to /var/cache/conftool/dbconfig/20230417-100003-root.json
09:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P46971 and previous config saved to /var/cache/conftool/dbconfig/20230417-095625-ladsgroup.json
09:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T333332)', diff saved to https://phabricator.wikimedia.org/P46970 and previous config saved to /var/cache/conftool/dbconfig/20230417-095404-ladsgroup.json
09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
09:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
09:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T333332)', diff saved to https://phabricator.wikimedia.org/P46969 and previous config saved to /var/cache/conftool/dbconfig/20230417-095311-ladsgroup.json
09:48 ladsgroup@deploy2002: Finished scap: Backport for Also broadcast RCFeed/IRC events to irc1002/irc2002 (T331702) (duration: 44m 21s)
09:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46968 and previous config saved to /var/cache/conftool/dbconfig/20230417-094459-root.json
09:38 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1116.eqiad.wmnet with reason: T334066
09:38 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1116.eqiad.wmnet with reason: T334066
09:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P46967 and previous config saved to /var/cache/conftool/dbconfig/20230417-093804-ladsgroup.json
09:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46966 and previous config saved to /var/cache/conftool/dbconfig/20230417-092954-root.json
09:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P46965 and previous config saved to /var/cache/conftool/dbconfig/20230417-092258-ladsgroup.json
09:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46964 and previous config saved to /var/cache/conftool/dbconfig/20230417-091449-root.json
09:12 ladsgroup@deploy2002: jmm and ladsgroup: Backport for Also broadcast RCFeed/IRC events to irc1002/irc2002 (T331702) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
09:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T333332)', diff saved to https://phabricator.wikimedia.org/P46963 and previous config saved to /var/cache/conftool/dbconfig/20230417-090751-ladsgroup.json
09:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T333332)', diff saved to https://phabricator.wikimedia.org/P46962 and previous config saved to /var/cache/conftool/dbconfig/20230417-090535-ladsgroup.json
09:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1129.eqiad.wmnet with reason: Maintenance
09:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1129.eqiad.wmnet with reason: Maintenance
09:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 (T333332)', diff saved to https://phabricator.wikimedia.org/P46961 and previous config saved to /var/cache/conftool/dbconfig/20230417-090512-ladsgroup.json
09:04 ladsgroup@deploy2002: Started scap: Backport for Also broadcast RCFeed/IRC events to irc1002/irc2002 (T331702)
09:04 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host netflow6002.drmrs.wmnet
09:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow6002.drmrs.wmnet on all recursors
09:04 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow6002.drmrs.wmnet on all recursors
09:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM netflow6002.drmrs.wmnet - jmm@cumin2002"
09:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM netflow6002.drmrs.wmnet - jmm@cumin2002"
08:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46960 and previous config saved to /var/cache/conftool/dbconfig/20230417-085944-root.json
08:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow6002.drmrs.wmnet on all recursors
08:59 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow6002.drmrs.wmnet on all recursors
08:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow6002.drmrs.wmnet - jmm@cumin2002"
08:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow6002.drmrs.wmnet - jmm@cumin2002"
08:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 100%: Maint done', diff saved to https://phabricator.wikimedia.org/P46959 and previous config saved to /var/cache/conftool/dbconfig/20230417-085623-ladsgroup.json
08:55 kamila@deploy2002: conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
08:55 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:55 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow6002.drmrs.wmnet
08:54 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
08:54 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes201[0123].codfw.wmnet
08:52 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host netflow6002.drmrs.wmnet
08:52 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow6002.drmrs.wmnet on all recursors
08:52 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow6002.drmrs.wmnet on all recursors
08:52 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:52 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM netflow6002.drmrs.wmnet - jmm@cumin2002"
08:51 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM netflow6002.drmrs.wmnet - jmm@cumin2002"
08:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P46958 and previous config saved to /var/cache/conftool/dbconfig/20230417-085005-ladsgroup.json
08:48 kamila@deploy2002: conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes201[0123].codfw.wmnet
08:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46957 and previous config saved to /var/cache/conftool/dbconfig/20230417-084439-root.json
08:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 75%: Maint done', diff saved to https://phabricator.wikimedia.org/P46956 and previous config saved to /var/cache/conftool/dbconfig/20230417-084118-ladsgroup.json
08:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:39 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
08:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P46955 and previous config saved to /var/cache/conftool/dbconfig/20230417-083459-ladsgroup.json
08:34 jmm@cumin2002: START - Cookbook sre.dns.netbox
08:33 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow6002.drmrs.wmnet - jmm@cumin2002"
08:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46954 and previous config saved to /var/cache/conftool/dbconfig/20230417-082934-root.json
08:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 25%: Maint done', diff saved to https://phabricator.wikimedia.org/P46953 and previous config saved to /var/cache/conftool/dbconfig/20230417-082613-ladsgroup.json
08:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 (T333332)', diff saved to https://phabricator.wikimedia.org/P46952 and previous config saved to /var/cache/conftool/dbconfig/20230417-081953-ladsgroup.json
08:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 (T333332)', diff saved to https://phabricator.wikimedia.org/P46951 and previous config saved to /var/cache/conftool/dbconfig/20230417-081732-ladsgroup.json
08:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1122.eqiad.wmnet with reason: Maintenance
08:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1122.eqiad.wmnet with reason: Maintenance
08:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1102.eqiad.wmnet with reason: Maintenance
08:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1102.eqiad.wmnet with reason: Maintenance
08:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 10%: Maint done', diff saved to https://phabricator.wikimedia.org/P46950 and previous config saved to /var/cache/conftool/dbconfig/20230417-081108-ladsgroup.json
08:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
08:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
07:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1100.eqiad.wmnet
07:58 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:58 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1100.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1214 (re)pooling @ 100%: Pooling db1214 T326669', diff saved to https://phabricator.wikimedia.org/P46948 and previous config saved to /var/cache/conftool/dbconfig/20230417-075818-root.json
07:57 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1100.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:55 marostegui@cumin1001: START - Cookbook sre.dns.netbox
07:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow6002.drmrs.wmnet - jmm@cumin2002"
07:49 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1100.eqiad.wmnet
07:49 vgutierrez: restart haproxy on cp3054 - T334448
07:44 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow6002.drmrs.wmnet on all recursors
07:44 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow6002.drmrs.wmnet on all recursors
07:44 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow6002.drmrs.wmnet - jmm@cumin2002"
07:43 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
07:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow6002.drmrs.wmnet - jmm@cumin2002"
07:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1214 (re)pooling @ 75%: Pooling db1214 T326669', diff saved to https://phabricator.wikimedia.org/P46946 and previous config saved to /var/cache/conftool/dbconfig/20230417-074313-root.json
07:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
07:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow6002.drmrs.wmnet
07:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
07:30 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: host reimage
07:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1214 (re)pooling @ 50%: Pooling db1214 T326669', diff saved to https://phabricator.wikimedia.org/P46945 and previous config saved to /var/cache/conftool/dbconfig/20230417-072809-root.json
07:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
07:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1214 (re)pooling @ 25%: Pooling db1214 T326669', diff saved to https://phabricator.wikimedia.org/P46944 and previous config saved to /var/cache/conftool/dbconfig/20230417-071304-root.json
06:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1214 (re)pooling @ 10%: Pooling db1214 T326669', diff saved to https://phabricator.wikimedia.org/P46943 and previous config saved to /var/cache/conftool/dbconfig/20230417-065759-root.json
06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1109 T334820', diff saved to https://phabricator.wikimedia.org/P46942 and previous config saved to /var/cache/conftool/dbconfig/20230417-064525-marostegui.json
06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1214 (re)pooling @ 5%: Pooling db1214 T326669', diff saved to https://phabricator.wikimedia.org/P46941 and previous config saved to /var/cache/conftool/dbconfig/20230417-064254-root.json
06:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1214 (re)pooling @ 4%: Pooling db1214 T326669', diff saved to https://phabricator.wikimedia.org/P46940 and previous config saved to /var/cache/conftool/dbconfig/20230417-062749-root.json
06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1214 (re)pooling @ 3%: Pooling db1214 T326669', diff saved to https://phabricator.wikimedia.org/P46939 and previous config saved to /var/cache/conftool/dbconfig/20230417-061244-root.json
05:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1214 (re)pooling @ 2%: Pooling db1214 T326669', diff saved to https://phabricator.wikimedia.org/P46938 and previous config saved to /var/cache/conftool/dbconfig/20230417-055739-root.json
05:57 marostegui@cumin1001: dbctl commit (dc=all): 'Change db1152 weight', diff saved to https://phabricator.wikimedia.org/P46937 and previous config saved to /var/cache/conftool/dbconfig/20230417-055721-root.json
05:56 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1152 to x2 primary T334663', diff saved to https://phabricator.wikimedia.org/P46936 and previous config saved to /var/cache/conftool/dbconfig/20230417-055644-root.json
05:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1214 (re)pooling @ 1%: Pooling db1214 T326669', diff saved to https://phabricator.wikimedia.org/P46935 and previous config saved to /var/cache/conftool/dbconfig/20230417-054235-root.json
05:41 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1214 to dbctl T326669', diff saved to https://phabricator.wikimedia.org/P46934 and previous config saved to /var/cache/conftool/dbconfig/20230417-054154-marostegui.json
05:33 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1100 from dbctl T329352', diff saved to https://phabricator.wikimedia.org/P46933 and previous config saved to /var/cache/conftool/dbconfig/20230417-053310-marostegui.json
05:32 marostegui: Stop MariaDB on db1112 to clone db1212 - this will generate lag on s3 wiki replicas
05:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1112 T326669', diff saved to https://phabricator.wikimedia.org/P46931 and previous config saved to /var/cache/conftool/dbconfig/20230417-051903-marostegui.json
04:48 phedenskog@deploy2002: Finished deploy [performance/navtiming@e21f08f]: (no justification provided) (duration: 00m 06s)
04:48 phedenskog@deploy2002: Started deploy [performance/navtiming@e21f08f]: (no justification provided)

2023-04-16

07:54 vgutierrez: restart haproxy on cp2033 to clear unexpected service restart alerts - T334448
01:49 legoktm: legoktm@mwmaint2002:~$ mwscript extensions/Translate/scripts/moveTranslatableBundle.php --wiki commonswiki "Commons:Picture of the Year/2021/Help" "Commons:Picture of the Year/Help" "Legoktm" --reason "make non-year specific" --skip-talkpages

2023-04-15

07:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T333332)', diff saved to https://phabricator.wikimedia.org/P46929 and previous config saved to /var/cache/conftool/dbconfig/20230415-071327-ladsgroup.json
06:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P46928 and previous config saved to /var/cache/conftool/dbconfig/20230415-065821-ladsgroup.json
06:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P46927 and previous config saved to /var/cache/conftool/dbconfig/20230415-064314-ladsgroup.json
06:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T333332)', diff saved to https://phabricator.wikimedia.org/P46926 and previous config saved to /var/cache/conftool/dbconfig/20230415-062808-ladsgroup.json
06:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2181 (T333332)', diff saved to https://phabricator.wikimedia.org/P46925 and previous config saved to /var/cache/conftool/dbconfig/20230415-062558-ladsgroup.json
06:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2181.codfw.wmnet with reason: Maintenance
06:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2181.codfw.wmnet with reason: Maintenance
06:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T333332)', diff saved to https://phabricator.wikimedia.org/P46924 and previous config saved to /var/cache/conftool/dbconfig/20230415-062534-ladsgroup.json
06:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P46923 and previous config saved to /var/cache/conftool/dbconfig/20230415-061028-ladsgroup.json
05:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P46922 and previous config saved to /var/cache/conftool/dbconfig/20230415-055521-ladsgroup.json
05:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T333332)', diff saved to https://phabricator.wikimedia.org/P46921 and previous config saved to /var/cache/conftool/dbconfig/20230415-054015-ladsgroup.json
05:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3318 (T333332)', diff saved to https://phabricator.wikimedia.org/P46920 and previous config saved to /var/cache/conftool/dbconfig/20230415-053804-ladsgroup.json
05:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
05:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
05:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T333332)', diff saved to https://phabricator.wikimedia.org/P46919 and previous config saved to /var/cache/conftool/dbconfig/20230415-053752-ladsgroup.json
05:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P46918 and previous config saved to /var/cache/conftool/dbconfig/20230415-052246-ladsgroup.json
05:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P46917 and previous config saved to /var/cache/conftool/dbconfig/20230415-050739-ladsgroup.json
04:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T333332)', diff saved to https://phabricator.wikimedia.org/P46916 and previous config saved to /var/cache/conftool/dbconfig/20230415-045233-ladsgroup.json
04:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3318 (T333332)', diff saved to https://phabricator.wikimedia.org/P46915 and previous config saved to /var/cache/conftool/dbconfig/20230415-045023-ladsgroup.json
04:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
04:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2167.codfw.wmnet with reason: Maintenance
04:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T333332)', diff saved to https://phabricator.wikimedia.org/P46914 and previous config saved to /var/cache/conftool/dbconfig/20230415-044959-ladsgroup.json
04:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P46913 and previous config saved to /var/cache/conftool/dbconfig/20230415-043453-ladsgroup.json
04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P46912 and previous config saved to /var/cache/conftool/dbconfig/20230415-041947-ladsgroup.json
04:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T333332)', diff saved to https://phabricator.wikimedia.org/P46911 and previous config saved to /var/cache/conftool/dbconfig/20230415-040440-ladsgroup.json
04:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2166 (T333332)', diff saved to https://phabricator.wikimedia.org/P46910 and previous config saved to /var/cache/conftool/dbconfig/20230415-040230-ladsgroup.json
04:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2166.codfw.wmnet with reason: Maintenance
04:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2166.codfw.wmnet with reason: Maintenance
04:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T333332)', diff saved to https://phabricator.wikimedia.org/P46909 and previous config saved to /var/cache/conftool/dbconfig/20230415-040207-ladsgroup.json
03:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P46908 and previous config saved to /var/cache/conftool/dbconfig/20230415-034700-ladsgroup.json
03:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P46907 and previous config saved to /var/cache/conftool/dbconfig/20230415-033154-ladsgroup.json
03:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T333332)', diff saved to https://phabricator.wikimedia.org/P46906 and previous config saved to /var/cache/conftool/dbconfig/20230415-031648-ladsgroup.json
03:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2164 (T333332)', diff saved to https://phabricator.wikimedia.org/P46905 and previous config saved to /var/cache/conftool/dbconfig/20230415-031437-ladsgroup.json
03:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
03:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
03:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance
03:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2164.codfw.wmnet with reason: Maintenance
03:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T333332)', diff saved to https://phabricator.wikimedia.org/P46904 and previous config saved to /var/cache/conftool/dbconfig/20230415-031356-ladsgroup.json
02:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P46903 and previous config saved to /var/cache/conftool/dbconfig/20230415-025850-ladsgroup.json
02:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P46902 and previous config saved to /var/cache/conftool/dbconfig/20230415-024344-ladsgroup.json
02:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T333332)', diff saved to https://phabricator.wikimedia.org/P46901 and previous config saved to /var/cache/conftool/dbconfig/20230415-022837-ladsgroup.json
02:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2163 (T333332)', diff saved to https://phabricator.wikimedia.org/P46900 and previous config saved to /var/cache/conftool/dbconfig/20230415-022627-ladsgroup.json
02:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance
02:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2163.codfw.wmnet with reason: Maintenance
02:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T333332)', diff saved to https://phabricator.wikimedia.org/P46899 and previous config saved to /var/cache/conftool/dbconfig/20230415-022604-ladsgroup.json
02:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P46898 and previous config saved to /var/cache/conftool/dbconfig/20230415-021057-ladsgroup.json
01:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P46897 and previous config saved to /var/cache/conftool/dbconfig/20230415-015551-ladsgroup.json
01:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T333332)', diff saved to https://phabricator.wikimedia.org/P46896 and previous config saved to /var/cache/conftool/dbconfig/20230415-014045-ladsgroup.json
01:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2162 (T333332)', diff saved to https://phabricator.wikimedia.org/P46895 and previous config saved to /var/cache/conftool/dbconfig/20230415-013835-ladsgroup.json
01:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2162.codfw.wmnet with reason: Maintenance
01:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2162.codfw.wmnet with reason: Maintenance
01:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T333332)', diff saved to https://phabricator.wikimedia.org/P46894 and previous config saved to /var/cache/conftool/dbconfig/20230415-013811-ladsgroup.json
01:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T333332)', diff saved to https://phabricator.wikimedia.org/P46893 and previous config saved to /var/cache/conftool/dbconfig/20230415-012753-ladsgroup.json
01:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P46892 and previous config saved to /var/cache/conftool/dbconfig/20230415-012305-ladsgroup.json
01:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P46891 and previous config saved to /var/cache/conftool/dbconfig/20230415-011246-ladsgroup.json
01:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P46890 and previous config saved to /var/cache/conftool/dbconfig/20230415-010759-ladsgroup.json
00:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P46889 and previous config saved to /var/cache/conftool/dbconfig/20230415-005740-ladsgroup.json
00:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T333332)', diff saved to https://phabricator.wikimedia.org/P46888 and previous config saved to /var/cache/conftool/dbconfig/20230415-005252-ladsgroup.json
00:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2161 (T333332)', diff saved to https://phabricator.wikimedia.org/P46887 and previous config saved to /var/cache/conftool/dbconfig/20230415-005042-ladsgroup.json
00:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2161.codfw.wmnet with reason: Maintenance
00:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2161.codfw.wmnet with reason: Maintenance
00:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T333332)', diff saved to https://phabricator.wikimedia.org/P46886 and previous config saved to /var/cache/conftool/dbconfig/20230415-005019-ladsgroup.json
00:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T333332)', diff saved to https://phabricator.wikimedia.org/P46885 and previous config saved to /var/cache/conftool/dbconfig/20230415-004233-ladsgroup.json
00:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P46884 and previous config saved to /var/cache/conftool/dbconfig/20230415-003512-ladsgroup.json
00:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 (T333332)', diff saved to https://phabricator.wikimedia.org/P46883 and previous config saved to /var/cache/conftool/dbconfig/20230415-003315-ladsgroup.json
00:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
00:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
00:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46882 and previous config saved to /var/cache/conftool/dbconfig/20230415-003251-ladsgroup.json
00:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P46881 and previous config saved to /var/cache/conftool/dbconfig/20230415-002006-ladsgroup.json
00:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P46880 and previous config saved to /var/cache/conftool/dbconfig/20230415-001745-ladsgroup.json
00:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T333332)', diff saved to https://phabricator.wikimedia.org/P46879 and previous config saved to /var/cache/conftool/dbconfig/20230415-000500-ladsgroup.json
00:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2154 (T333332)', diff saved to https://phabricator.wikimedia.org/P46878 and previous config saved to /var/cache/conftool/dbconfig/20230415-000249-ladsgroup.json
00:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2154.codfw.wmnet with reason: Maintenance
00:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P46877 and previous config saved to /var/cache/conftool/dbconfig/20230415-000239-ladsgroup.json
00:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2154.codfw.wmnet with reason: Maintenance
00:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T333332)', diff saved to https://phabricator.wikimedia.org/P46876 and previous config saved to /var/cache/conftool/dbconfig/20230415-000226-ladsgroup.json

2023-04-14

23:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46875 and previous config saved to /var/cache/conftool/dbconfig/20230414-234732-ladsgroup.json
23:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P46874 and previous config saved to /var/cache/conftool/dbconfig/20230414-234720-ladsgroup.json
23:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46873 and previous config saved to /var/cache/conftool/dbconfig/20230414-234516-ladsgroup.json
23:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
23:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
23:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46872 and previous config saved to /var/cache/conftool/dbconfig/20230414-234453-ladsgroup.json
23:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P46871 and previous config saved to /var/cache/conftool/dbconfig/20230414-233213-ladsgroup.json
23:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P46870 and previous config saved to /var/cache/conftool/dbconfig/20230414-232946-ladsgroup.json
23:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T333332)', diff saved to https://phabricator.wikimedia.org/P46869 and previous config saved to /var/cache/conftool/dbconfig/20230414-231707-ladsgroup.json
23:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2152 (T333332)', diff saved to https://phabricator.wikimedia.org/P46868 and previous config saved to /var/cache/conftool/dbconfig/20230414-231557-ladsgroup.json
23:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
23:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2152.codfw.wmnet with reason: Maintenance
23:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
23:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
23:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
23:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
23:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
23:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
23:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1214.eqiad.wmnet with reason: Maintenance
23:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1214.eqiad.wmnet with reason: Maintenance
23:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T333332)', diff saved to https://phabricator.wikimedia.org/P46867 and previous config saved to /var/cache/conftool/dbconfig/20230414-231440-ladsgroup.json
23:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P46866 and previous config saved to /var/cache/conftool/dbconfig/20230414-231440-ladsgroup.json
22:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P46865 and previous config saved to /var/cache/conftool/dbconfig/20230414-225934-ladsgroup.json
22:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46864 and previous config saved to /var/cache/conftool/dbconfig/20230414-225934-ladsgroup.json
22:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46863 and previous config saved to /var/cache/conftool/dbconfig/20230414-225717-ladsgroup.json
22:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
22:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
22:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T333332)', diff saved to https://phabricator.wikimedia.org/P46862 and previous config saved to /var/cache/conftool/dbconfig/20230414-225654-ladsgroup.json
22:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P46861 and previous config saved to /var/cache/conftool/dbconfig/20230414-224428-ladsgroup.json
22:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P46860 and previous config saved to /var/cache/conftool/dbconfig/20230414-224147-ladsgroup.json
22:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1211 (T333332)', diff saved to https://phabricator.wikimedia.org/P46859 and previous config saved to /var/cache/conftool/dbconfig/20230414-222921-ladsgroup.json
22:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1211 (T333332)', diff saved to https://phabricator.wikimedia.org/P46858 and previous config saved to /var/cache/conftool/dbconfig/20230414-222814-ladsgroup.json
22:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1211.eqiad.wmnet with reason: Maintenance
22:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1211.eqiad.wmnet with reason: Maintenance
22:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T333332)', diff saved to https://phabricator.wikimedia.org/P46857 and previous config saved to /var/cache/conftool/dbconfig/20230414-222750-ladsgroup.json
22:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P46856 and previous config saved to /var/cache/conftool/dbconfig/20230414-222641-ladsgroup.json
22:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P46855 and previous config saved to /var/cache/conftool/dbconfig/20230414-221244-ladsgroup.json
22:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T333332)', diff saved to https://phabricator.wikimedia.org/P46854 and previous config saved to /var/cache/conftool/dbconfig/20230414-221134-ladsgroup.json
22:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 (T333332)', diff saved to https://phabricator.wikimedia.org/P46853 and previous config saved to /var/cache/conftool/dbconfig/20230414-220918-ladsgroup.json
22:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
22:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
22:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
22:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
22:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T333332)', diff saved to https://phabricator.wikimedia.org/P46852 and previous config saved to /var/cache/conftool/dbconfig/20230414-220838-ladsgroup.json
21:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P46851 and previous config saved to /var/cache/conftool/dbconfig/20230414-215738-ladsgroup.json
21:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P46850 and previous config saved to /var/cache/conftool/dbconfig/20230414-215331-ladsgroup.json
21:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T333332)', diff saved to https://phabricator.wikimedia.org/P46849 and previous config saved to /var/cache/conftool/dbconfig/20230414-214231-ladsgroup.json
21:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1209 (T333332)', diff saved to https://phabricator.wikimedia.org/P46848 and previous config saved to /var/cache/conftool/dbconfig/20230414-214123-ladsgroup.json
21:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1209.eqiad.wmnet with reason: Maintenance
21:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1209.eqiad.wmnet with reason: Maintenance
21:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T333332)', diff saved to https://phabricator.wikimedia.org/P46847 and previous config saved to /var/cache/conftool/dbconfig/20230414-214100-ladsgroup.json
21:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P46846 and previous config saved to /var/cache/conftool/dbconfig/20230414-213825-ladsgroup.json
21:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P46845 and previous config saved to /var/cache/conftool/dbconfig/20230414-212554-ladsgroup.json
21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T333332)', diff saved to https://phabricator.wikimedia.org/P46844 and previous config saved to /var/cache/conftool/dbconfig/20230414-212319-ladsgroup.json
21:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 (T333332)', diff saved to https://phabricator.wikimedia.org/P46843 and previous config saved to /var/cache/conftool/dbconfig/20230414-212102-ladsgroup.json
21:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
21:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T333332)', diff saved to https://phabricator.wikimedia.org/P46842 and previous config saved to /var/cache/conftool/dbconfig/20230414-212039-ladsgroup.json
21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P46841 and previous config saved to /var/cache/conftool/dbconfig/20230414-211048-ladsgroup.json
21:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P46840 and previous config saved to /var/cache/conftool/dbconfig/20230414-210533-ladsgroup.json
20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T333332)', diff saved to https://phabricator.wikimedia.org/P46838 and previous config saved to /var/cache/conftool/dbconfig/20230414-205541-ladsgroup.json
20:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1203 (T333332)', diff saved to https://phabricator.wikimedia.org/P46837 and previous config saved to /var/cache/conftool/dbconfig/20230414-205333-ladsgroup.json
20:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1203.eqiad.wmnet with reason: Maintenance
20:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1203.eqiad.wmnet with reason: Maintenance
20:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T333332)', diff saved to https://phabricator.wikimedia.org/P46836 and previous config saved to /var/cache/conftool/dbconfig/20230414-205310-ladsgroup.json
20:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P46835 and previous config saved to /var/cache/conftool/dbconfig/20230414-205026-ladsgroup.json
20:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P46834 and previous config saved to /var/cache/conftool/dbconfig/20230414-203804-ladsgroup.json
20:36 papaul: rebooting labstore1004 for mgmt interface issue
20:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T333332)', diff saved to https://phabricator.wikimedia.org/P46833 and previous config saved to /var/cache/conftool/dbconfig/20230414-203520-ladsgroup.json
20:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 (T333332)', diff saved to https://phabricator.wikimedia.org/P46832 and previous config saved to /var/cache/conftool/dbconfig/20230414-203304-ladsgroup.json
20:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
20:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T333332)', diff saved to https://phabricator.wikimedia.org/P46831 and previous config saved to /var/cache/conftool/dbconfig/20230414-203241-ladsgroup.json
20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1207 (T333332)', diff saved to https://phabricator.wikimedia.org/P46830 and previous config saved to /var/cache/conftool/dbconfig/20230414-203220-ladsgroup.json
20:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
20:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1207.eqiad.wmnet with reason: Maintenance
20:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T333332)', diff saved to https://phabricator.wikimedia.org/P46829 and previous config saved to /var/cache/conftool/dbconfig/20230414-203156-ladsgroup.json
20:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P46828 and previous config saved to /var/cache/conftool/dbconfig/20230414-202257-ladsgroup.json
20:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P46827 and previous config saved to /var/cache/conftool/dbconfig/20230414-201734-ladsgroup.json
20:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P46826 and previous config saved to /var/cache/conftool/dbconfig/20230414-201650-ladsgroup.json
20:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T333332)', diff saved to https://phabricator.wikimedia.org/P46825 and previous config saved to /var/cache/conftool/dbconfig/20230414-200751-ladsgroup.json
20:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1193 (T333332)', diff saved to https://phabricator.wikimedia.org/P46824 and previous config saved to /var/cache/conftool/dbconfig/20230414-200543-ladsgroup.json
20:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
20:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1193.eqiad.wmnet with reason: Maintenance
20:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T333332)', diff saved to https://phabricator.wikimedia.org/P46823 and previous config saved to /var/cache/conftool/dbconfig/20230414-200520-ladsgroup.json
20:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P46822 and previous config saved to /var/cache/conftool/dbconfig/20230414-200226-ladsgroup.json
20:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P46821 and previous config saved to /var/cache/conftool/dbconfig/20230414-200144-ladsgroup.json
19:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P46820 and previous config saved to /var/cache/conftool/dbconfig/20230414-195014-ladsgroup.json
19:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T333332)', diff saved to https://phabricator.wikimedia.org/P46819 and previous config saved to /var/cache/conftool/dbconfig/20230414-194720-ladsgroup.json
19:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T333332)', diff saved to https://phabricator.wikimedia.org/P46818 and previous config saved to /var/cache/conftool/dbconfig/20230414-194637-ladsgroup.json
19:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2121 (T333332)', diff saved to https://phabricator.wikimedia.org/P46817 and previous config saved to /var/cache/conftool/dbconfig/20230414-194504-ladsgroup.json
19:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
19:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
19:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T333332)', diff saved to https://phabricator.wikimedia.org/P46816 and previous config saved to /var/cache/conftool/dbconfig/20230414-194441-ladsgroup.json
19:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1206 (T333332)', diff saved to https://phabricator.wikimedia.org/P46815 and previous config saved to /var/cache/conftool/dbconfig/20230414-194424-ladsgroup.json
19:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1206.eqiad.wmnet with reason: Maintenance
19:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1206.eqiad.wmnet with reason: Maintenance
19:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T333332)', diff saved to https://phabricator.wikimedia.org/P46814 and previous config saved to /var/cache/conftool/dbconfig/20230414-194401-ladsgroup.json
19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P46813 and previous config saved to /var/cache/conftool/dbconfig/20230414-193507-ladsgroup.json
19:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P46812 and previous config saved to /var/cache/conftool/dbconfig/20230414-192934-ladsgroup.json
19:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P46811 and previous config saved to /var/cache/conftool/dbconfig/20230414-192855-ladsgroup.json
19:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T333332)', diff saved to https://phabricator.wikimedia.org/P46810 and previous config saved to /var/cache/conftool/dbconfig/20230414-192001-ladsgroup.json
19:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1192 (T333332)', diff saved to https://phabricator.wikimedia.org/P46809 and previous config saved to /var/cache/conftool/dbconfig/20230414-191854-ladsgroup.json
19:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1192.eqiad.wmnet with reason: Maintenance
19:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1192.eqiad.wmnet with reason: Maintenance
19:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T333332)', diff saved to https://phabricator.wikimedia.org/P46808 and previous config saved to /var/cache/conftool/dbconfig/20230414-191831-ladsgroup.json
19:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P46807 and previous config saved to /var/cache/conftool/dbconfig/20230414-191428-ladsgroup.json
19:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P46806 and previous config saved to /var/cache/conftool/dbconfig/20230414-191348-ladsgroup.json
19:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P46805 and previous config saved to /var/cache/conftool/dbconfig/20230414-190324-ladsgroup.json
18:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T333332)', diff saved to https://phabricator.wikimedia.org/P46804 and previous config saved to /var/cache/conftool/dbconfig/20230414-185921-ladsgroup.json
18:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T333332)', diff saved to https://phabricator.wikimedia.org/P46803 and previous config saved to /var/cache/conftool/dbconfig/20230414-185842-ladsgroup.json
18:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 (T333332)', diff saved to https://phabricator.wikimedia.org/P46802 and previous config saved to /var/cache/conftool/dbconfig/20230414-185705-ladsgroup.json
18:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
18:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
18:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T333332)', diff saved to https://phabricator.wikimedia.org/P46801 and previous config saved to /var/cache/conftool/dbconfig/20230414-185642-ladsgroup.json
18:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1196 (T333332)', diff saved to https://phabricator.wikimedia.org/P46800 and previous config saved to /var/cache/conftool/dbconfig/20230414-185630-ladsgroup.json
18:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
18:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1196.eqiad.wmnet with reason: Maintenance
18:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1196.eqiad.wmnet with reason: Maintenance
18:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T333332)', diff saved to https://phabricator.wikimedia.org/P46799 and previous config saved to /var/cache/conftool/dbconfig/20230414-185545-ladsgroup.json
18:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P46798 and previous config saved to /var/cache/conftool/dbconfig/20230414-184818-ladsgroup.json
18:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P46797 and previous config saved to /var/cache/conftool/dbconfig/20230414-184135-ladsgroup.json
18:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P46796 and previous config saved to /var/cache/conftool/dbconfig/20230414-184038-ladsgroup.json
18:36 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
18:33 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
18:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T333332)', diff saved to https://phabricator.wikimedia.org/P46795 and previous config saved to /var/cache/conftool/dbconfig/20230414-183311-ladsgroup.json
18:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P46794 and previous config saved to /var/cache/conftool/dbconfig/20230414-182629-ladsgroup.json
18:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P46793 and previous config saved to /var/cache/conftool/dbconfig/20230414-182532-ladsgroup.json
18:18 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:17 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T333332)', diff saved to https://phabricator.wikimedia.org/P46792 and previous config saved to /var/cache/conftool/dbconfig/20230414-181123-ladsgroup.json
18:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T333332)', diff saved to https://phabricator.wikimedia.org/P46791 and previous config saved to /var/cache/conftool/dbconfig/20230414-181025-ladsgroup.json
18:09 mutante: doc1002, doc2001 - manually remove php7.3-fpm restart timers to fix T334735 and alerting - T322357 - systemctl stop wmf_auto_restart_php7.3-fpm.timer; systemctl stop wmf_auto_restart_php7.3-fpm.service; rm /lib/systemd/system/wmf_auto_restart_php7.3-fpm.*
18:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1186 (T333332)', diff saved to https://phabricator.wikimedia.org/P46790 and previous config saved to /var/cache/conftool/dbconfig/20230414-180812-ladsgroup.json
18:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1186.eqiad.wmnet with reason: Maintenance
18:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1186.eqiad.wmnet with reason: Maintenance
18:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T333332)', diff saved to https://phabricator.wikimedia.org/P46789 and previous config saved to /var/cache/conftool/dbconfig/20230414-180748-ladsgroup.json
18:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 (T333332)', diff saved to https://phabricator.wikimedia.org/P46788 and previous config saved to /var/cache/conftool/dbconfig/20230414-180606-ladsgroup.json
18:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
18:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
18:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
18:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
18:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
18:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
18:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
18:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
18:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T333332)', diff saved to https://phabricator.wikimedia.org/P46787 and previous config saved to /var/cache/conftool/dbconfig/20230414-180430-ladsgroup.json
18:03 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:03 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
17:57 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1016.eqiad.wmnet with OS bullseye
17:53 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1014.eqiad.wmnet with OS bullseye
17:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P46786 and previous config saved to /var/cache/conftool/dbconfig/20230414-175242-ladsgroup.json
17:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P46785 and previous config saved to /var/cache/conftool/dbconfig/20230414-174924-ladsgroup.json
17:49 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
17:47 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet
17:45 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1016.eqiad.wmnet with reason: host reimage
17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T333332)', diff saved to https://phabricator.wikimedia.org/P46784 and previous config saved to /var/cache/conftool/dbconfig/20230414-174356-ladsgroup.json
17:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1178.eqiad.wmnet with reason: Maintenance
17:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1178.eqiad.wmnet with reason: Maintenance
17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T333332)', diff saved to https://phabricator.wikimedia.org/P46783 and previous config saved to /var/cache/conftool/dbconfig/20230414-174333-ladsgroup.json
17:42 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1016.eqiad.wmnet with reason: host reimage
17:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1014.eqiad.wmnet with reason: host reimage
17:39 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1072']
17:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P46782 and previous config saved to /var/cache/conftool/dbconfig/20230414-173734-ladsgroup.json
17:36 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1014.eqiad.wmnet with reason: host reimage
17:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P46781 and previous config saved to /var/cache/conftool/dbconfig/20230414-173418-ladsgroup.json
17:29 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1016.eqiad.wmnet with OS bullseye
17:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P46780 and previous config saved to /var/cache/conftool/dbconfig/20230414-172826-ladsgroup.json
17:27 pt1979@cumin2002: START - Cookbook sre.hosts.dhcp for host cloudvirtlocal1001.eqiad.wmnet
17:25 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
17:24 dcausse@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
17:23 dcausse@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
17:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T333332)', diff saved to https://phabricator.wikimedia.org/P46779 and previous config saved to /var/cache/conftool/dbconfig/20230414-172229-ladsgroup.json
17:21 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
17:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T333332)', diff saved to https://phabricator.wikimedia.org/P46778 and previous config saved to /var/cache/conftool/dbconfig/20230414-172016-ladsgroup.json
17:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1184.eqiad.wmnet with reason: Maintenance
17:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1184.eqiad.wmnet with reason: Maintenance
17:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T333332)', diff saved to https://phabricator.wikimedia.org/P46777 and previous config saved to /var/cache/conftool/dbconfig/20230414-171953-ladsgroup.json
17:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T333332)', diff saved to https://phabricator.wikimedia.org/P46776 and previous config saved to /var/cache/conftool/dbconfig/20230414-171911-ladsgroup.json
17:17 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1014.eqiad.wmnet with OS bullseye
17:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1202 (T333332)', diff saved to https://phabricator.wikimedia.org/P46775 and previous config saved to /var/cache/conftool/dbconfig/20230414-171702-ladsgroup.json
17:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
17:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T333332)', diff saved to https://phabricator.wikimedia.org/P46774 and previous config saved to /var/cache/conftool/dbconfig/20230414-171638-ladsgroup.json
17:15 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1015.eqiad.wmnet with OS bullseye
17:15 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
17:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P46773 and previous config saved to /var/cache/conftool/dbconfig/20230414-171320-ladsgroup.json
17:11 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
17:10 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
17:05 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
17:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P46772 and previous config saved to /var/cache/conftool/dbconfig/20230414-170447-ladsgroup.json
17:04 hnowlan@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
17:04 hnowlan@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=thumbor,name=kubernetes201[0123].codfw.wmnet
17:03 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1015.eqiad.wmnet with reason: host reimage
17:02 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P46771 and previous config saved to /var/cache/conftool/dbconfig/20230414-170133-ladsgroup.json
17:00 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1015.eqiad.wmnet with reason: host reimage
16:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T333332)', diff saved to https://phabricator.wikimedia.org/P46770 and previous config saved to /var/cache/conftool/dbconfig/20230414-165814-ladsgroup.json
16:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P46769 and previous config saved to /var/cache/conftool/dbconfig/20230414-164940-ladsgroup.json
16:47 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1015.eqiad.wmnet with OS bullseye
16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P46768 and previous config saved to /var/cache/conftool/dbconfig/20230414-164627-ladsgroup.json
16:39 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
16:38 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
16:38 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
16:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T333332)', diff saved to https://phabricator.wikimedia.org/P46767 and previous config saved to /var/cache/conftool/dbconfig/20230414-163434-ladsgroup.json
16:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T333332)', diff saved to https://phabricator.wikimedia.org/P46766 and previous config saved to /var/cache/conftool/dbconfig/20230414-163221-ladsgroup.json
16:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1169.eqiad.wmnet with reason: Maintenance
16:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1169.eqiad.wmnet with reason: Maintenance
16:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
16:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
16:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T333332)', diff saved to https://phabricator.wikimedia.org/P46765 and previous config saved to /var/cache/conftool/dbconfig/20230414-163120-ladsgroup.json
16:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1139.eqiad.wmnet with reason: Maintenance
16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T333332)', diff saved to https://phabricator.wikimedia.org/P46764 and previous config saved to /var/cache/conftool/dbconfig/20230414-163110-ladsgroup.json
16:30 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1194 (T333332)', diff saved to https://phabricator.wikimedia.org/P46763 and previous config saved to /var/cache/conftool/dbconfig/20230414-162911-ladsgroup.json
16:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
16:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
16:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T333332)', diff saved to https://phabricator.wikimedia.org/P46762 and previous config saved to /var/cache/conftool/dbconfig/20230414-162848-ladsgroup.json
16:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P46761 and previous config saved to /var/cache/conftool/dbconfig/20230414-161604-ladsgroup.json
16:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P46760 and previous config saved to /var/cache/conftool/dbconfig/20230414-161341-ladsgroup.json
16:06 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs1013.eqiad.wmnet with OS bullseye
16:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P46759 and previous config saved to /var/cache/conftool/dbconfig/20230414-160058-ladsgroup.json
15:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P46758 and previous config saved to /var/cache/conftool/dbconfig/20230414-155835-ladsgroup.json
15:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T333332)', diff saved to https://phabricator.wikimedia.org/P46757 and previous config saved to /var/cache/conftool/dbconfig/20230414-155758-ladsgroup.json
15:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
15:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1177.eqiad.wmnet with reason: Maintenance
15:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T333332)', diff saved to https://phabricator.wikimedia.org/P46756 and previous config saved to /var/cache/conftool/dbconfig/20230414-155735-ladsgroup.json
15:53 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1013.eqiad.wmnet with reason: host reimage
15:52 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:52 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
15:50 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1013.eqiad.wmnet with reason: host reimage
15:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T333332)', diff saved to https://phabricator.wikimedia.org/P46755 and previous config saved to /var/cache/conftool/dbconfig/20230414-154551-ladsgroup.json
15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T333332)', diff saved to https://phabricator.wikimedia.org/P46754 and previous config saved to /var/cache/conftool/dbconfig/20230414-154339-ladsgroup.json
15:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1135.eqiad.wmnet with reason: Maintenance
15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T333332)', diff saved to https://phabricator.wikimedia.org/P46753 and previous config saved to /var/cache/conftool/dbconfig/20230414-154329-ladsgroup.json
15:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1135.eqiad.wmnet with reason: Maintenance
15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T333332)', diff saved to https://phabricator.wikimedia.org/P46752 and previous config saved to /var/cache/conftool/dbconfig/20230414-154316-ladsgroup.json
15:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P46751 and previous config saved to /var/cache/conftool/dbconfig/20230414-154228-ladsgroup.json
15:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 (T333332)', diff saved to https://phabricator.wikimedia.org/P46750 and previous config saved to /var/cache/conftool/dbconfig/20230414-154119-ladsgroup.json
15:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
15:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
15:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T333332)', diff saved to https://phabricator.wikimedia.org/P46749 and previous config saved to /var/cache/conftool/dbconfig/20230414-154056-ladsgroup.json
15:36 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs1013.eqiad.wmnet with OS bullseye
15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P46748 and previous config saved to /var/cache/conftool/dbconfig/20230414-152809-ladsgroup.json
15:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P46747 and previous config saved to /var/cache/conftool/dbconfig/20230414-152722-ladsgroup.json
15:26 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P46746 and previous config saved to /var/cache/conftool/dbconfig/20230414-152550-ladsgroup.json
15:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P46745 and previous config saved to /var/cache/conftool/dbconfig/20230414-151303-ladsgroup.json
15:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T333332)', diff saved to https://phabricator.wikimedia.org/P46744 and previous config saved to /var/cache/conftool/dbconfig/20230414-151216-ladsgroup.json
15:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1172 (T333332)', diff saved to https://phabricator.wikimedia.org/P46743 and previous config saved to /var/cache/conftool/dbconfig/20230414-151108-ladsgroup.json
15:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1172.eqiad.wmnet with reason: Maintenance
15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1172.eqiad.wmnet with reason: Maintenance
15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P46742 and previous config saved to /var/cache/conftool/dbconfig/20230414-151043-ladsgroup.json
15:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
15:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
15:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T333332)', diff saved to https://phabricator.wikimedia.org/P46741 and previous config saved to /var/cache/conftool/dbconfig/20230414-151037-ladsgroup.json
15:04 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:04 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
14:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T333332)', diff saved to https://phabricator.wikimedia.org/P46740 and previous config saved to /var/cache/conftool/dbconfig/20230414-145756-ladsgroup.json
14:55 cgoubert@cumin1001: conftool action : set/pooled=yes; selector: name=mw1349.eqiad.wmnet
14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T333332)', diff saved to https://phabricator.wikimedia.org/P46739 and previous config saved to /var/cache/conftool/dbconfig/20230414-145544-ladsgroup.json
14:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1134.eqiad.wmnet with reason: Maintenance
14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T333332)', diff saved to https://phabricator.wikimedia.org/P46738 and previous config saved to /var/cache/conftool/dbconfig/20230414-145537-ladsgroup.json
14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P46737 and previous config saved to /var/cache/conftool/dbconfig/20230414-145531-ladsgroup.json
14:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1134.eqiad.wmnet with reason: Maintenance
14:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T333332)', diff saved to https://phabricator.wikimedia.org/P46736 and previous config saved to /var/cache/conftool/dbconfig/20230414-145521-ladsgroup.json
14:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T333332)', diff saved to https://phabricator.wikimedia.org/P46735 and previous config saved to /var/cache/conftool/dbconfig/20230414-145327-ladsgroup.json
14:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
14:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
14:53 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
14:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
14:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46734 and previous config saved to /var/cache/conftool/dbconfig/20230414-145245-ladsgroup.json
14:49 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pybal-test2002.codfw.wmnet
14:49 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:49 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pybal-test2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
14:48 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pybal-test2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
14:45 sukhe@cumin2002: START - Cookbook sre.dns.netbox
14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P46733 and previous config saved to /var/cache/conftool/dbconfig/20230414-144024-ladsgroup.json
14:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P46732 and previous config saved to /var/cache/conftool/dbconfig/20230414-144014-ladsgroup.json
14:38 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:38 jclark@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mngmt dns fundrasing - jclark@cumin1001"
14:38 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts pybal-test2002.codfw.wmnet
14:38 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pybal-test2001.codfw.wmnet
14:38 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P46731 and previous config saved to /var/cache/conftool/dbconfig/20230414-143738-ladsgroup.json
14:37 jclark@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mngmt dns fundrasing - jclark@cumin1001"
14:36 sukhe@cumin2002: START - Cookbook sre.dns.netbox
14:35 jclark@cumin1001: START - Cookbook sre.dns.netbox
14:32 sukhe@cumin2002: START - Cookbook sre.hosts.decommission for hosts pybal-test2001.codfw.wmnet
14:30 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
14:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
14:29 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
14:27 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T333332)', diff saved to https://phabricator.wikimedia.org/P46730 and previous config saved to /var/cache/conftool/dbconfig/20230414-142518-ladsgroup.json
14:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132', diff saved to https://phabricator.wikimedia.org/P46729 and previous config saved to /var/cache/conftool/dbconfig/20230414-142508-ladsgroup.json
14:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P46728 and previous config saved to /var/cache/conftool/dbconfig/20230414-142232-ladsgroup.json
14:21 claime: rebooting list1001 for cpu bump
14:11 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
14:11 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
14:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1132 (T333332)', diff saved to https://phabricator.wikimedia.org/P46727 and previous config saved to /var/cache/conftool/dbconfig/20230414-141002-ladsgroup.json
14:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1132 (T333332)', diff saved to https://phabricator.wikimedia.org/P46726 and previous config saved to /var/cache/conftool/dbconfig/20230414-140749-ladsgroup.json
14:07 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1132.eqiad.wmnet with reason: Maintenance
14:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1132.eqiad.wmnet with reason: Maintenance
14:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46725 and previous config saved to /var/cache/conftool/dbconfig/20230414-140725-ladsgroup.json
14:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46724 and previous config saved to /var/cache/conftool/dbconfig/20230414-140616-ladsgroup.json
14:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
14:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T333332)', diff saved to https://phabricator.wikimedia.org/P46723 and previous config saved to /var/cache/conftool/dbconfig/20230414-140553-ladsgroup.json
14:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T333332)', diff saved to https://phabricator.wikimedia.org/P46722 and previous config saved to /var/cache/conftool/dbconfig/20230414-140401-ladsgroup.json
14:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
14:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
14:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1167.eqiad.wmnet with reason: Maintenance
14:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1167.eqiad.wmnet with reason: Maintenance
14:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1116.eqiad.wmnet with reason: Maintenance
14:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1116.eqiad.wmnet with reason: Maintenance
14:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T333332)', diff saved to https://phabricator.wikimedia.org/P46721 and previous config saved to /var/cache/conftool/dbconfig/20230414-140258-ladsgroup.json
13:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P46720 and previous config saved to /var/cache/conftool/dbconfig/20230414-135220-ladsgroup.json
13:51 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P46719 and previous config saved to /var/cache/conftool/dbconfig/20230414-135047-ladsgroup.json
13:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P46718 and previous config saved to /var/cache/conftool/dbconfig/20230414-134751-ladsgroup.json
13:45 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
13:44 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
13:42 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
13:42 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
13:37 jclark@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS buster
13:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P46717 and previous config saved to /var/cache/conftool/dbconfig/20230414-133714-ladsgroup.json
13:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P46716 and previous config saved to /var/cache/conftool/dbconfig/20230414-133540-ladsgroup.json
13:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P46715 and previous config saved to /var/cache/conftool/dbconfig/20230414-133245-ladsgroup.json
13:31 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:30 jclark@cumin1001: START - Cookbook sre.dns.netbox
13:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T333332)', diff saved to https://phabricator.wikimedia.org/P46714 and previous config saved to /var/cache/conftool/dbconfig/20230414-132208-ladsgroup.json
13:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T333332)', diff saved to https://phabricator.wikimedia.org/P46713 and previous config saved to /var/cache/conftool/dbconfig/20230414-132034-ladsgroup.json
13:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1128 (T333332)', diff saved to https://phabricator.wikimedia.org/P46712 and previous config saved to /var/cache/conftool/dbconfig/20230414-131956-ladsgroup.json
13:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1128.eqiad.wmnet with reason: Maintenance
13:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1128.eqiad.wmnet with reason: Maintenance
13:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T333332)', diff saved to https://phabricator.wikimedia.org/P46711 and previous config saved to /var/cache/conftool/dbconfig/20230414-131932-ladsgroup.json
13:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T333332)', diff saved to https://phabricator.wikimedia.org/P46710 and previous config saved to /var/cache/conftool/dbconfig/20230414-131824-ladsgroup.json
13:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
13:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
13:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
13:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T333332)', diff saved to https://phabricator.wikimedia.org/P46709 and previous config saved to /var/cache/conftool/dbconfig/20230414-131739-ladsgroup.json
13:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T333332)', diff saved to https://phabricator.wikimedia.org/P46708 and previous config saved to /var/cache/conftool/dbconfig/20230414-131631-ladsgroup.json
13:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1114.eqiad.wmnet with reason: Maintenance
13:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1114.eqiad.wmnet with reason: Maintenance
13:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T333332)', diff saved to https://phabricator.wikimedia.org/P46707 and previous config saved to /var/cache/conftool/dbconfig/20230414-131607-ladsgroup.json
13:11 ottomata: granting IdempotentWrite on kafka jumbo-eqiad cluster to User:ANONYMOUS - this will allow for use of newer kafka producers that have enabled transactional writes by default. `kafka acls --add --allow-principal User:ANONYMOUS --cluster --operation IdempotentWrite`
13:07 ottomata: creating User:ANONYMOUS ACLs on kafka-test cluster https://wikitech.wikimedia.org/wiki/Kafka/Administration#Kafka_ACLs
13:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P46706 and previous config saved to /var/cache/conftool/dbconfig/20230414-130426-ladsgroup.json
13:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P46705 and previous config saved to /var/cache/conftool/dbconfig/20230414-130234-ladsgroup.json
13:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P46704 and previous config saved to /var/cache/conftool/dbconfig/20230414-130101-ladsgroup.json
12:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119', diff saved to https://phabricator.wikimedia.org/P46703 and previous config saved to /var/cache/conftool/dbconfig/20230414-124920-ladsgroup.json
12:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P46702 and previous config saved to /var/cache/conftool/dbconfig/20230414-124727-ladsgroup.json
12:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P46701 and previous config saved to /var/cache/conftool/dbconfig/20230414-124553-ladsgroup.json
12:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1119 (T333332)', diff saved to https://phabricator.wikimedia.org/P46700 and previous config saved to /var/cache/conftool/dbconfig/20230414-123413-ladsgroup.json
12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T333332)', diff saved to https://phabricator.wikimedia.org/P46699 and previous config saved to /var/cache/conftool/dbconfig/20230414-123221-ladsgroup.json
12:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 (T333332)', diff saved to https://phabricator.wikimedia.org/P46698 and previous config saved to /var/cache/conftool/dbconfig/20230414-123201-ladsgroup.json
12:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1119.eqiad.wmnet with reason: Maintenance
12:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1119.eqiad.wmnet with reason: Maintenance
12:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T333332)', diff saved to https://phabricator.wikimedia.org/P46697 and previous config saved to /var/cache/conftool/dbconfig/20230414-123138-ladsgroup.json
12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T333332)', diff saved to https://phabricator.wikimedia.org/P46696 and previous config saved to /var/cache/conftool/dbconfig/20230414-123047-ladsgroup.json
12:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 (T333332)', diff saved to https://phabricator.wikimedia.org/P46695 and previous config saved to /var/cache/conftool/dbconfig/20230414-123011-ladsgroup.json
12:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1136.eqiad.wmnet with reason: Maintenance
12:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1136.eqiad.wmnet with reason: Maintenance
12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T333332)', diff saved to https://phabricator.wikimedia.org/P46694 and previous config saved to /var/cache/conftool/dbconfig/20230414-122948-ladsgroup.json
12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T333332)', diff saved to https://phabricator.wikimedia.org/P46693 and previous config saved to /var/cache/conftool/dbconfig/20230414-122939-ladsgroup.json
12:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1111.eqiad.wmnet with reason: Maintenance
12:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1111.eqiad.wmnet with reason: Maintenance
12:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109 (T333332)', diff saved to https://phabricator.wikimedia.org/P46692 and previous config saved to /var/cache/conftool/dbconfig/20230414-122915-ladsgroup.json
12:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P46691 and previous config saved to /var/cache/conftool/dbconfig/20230414-121632-ladsgroup.json
12:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P46690 and previous config saved to /var/cache/conftool/dbconfig/20230414-121442-ladsgroup.json
12:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109', diff saved to https://phabricator.wikimedia.org/P46689 and previous config saved to /var/cache/conftool/dbconfig/20230414-121409-ladsgroup.json
12:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P46688 and previous config saved to /var/cache/conftool/dbconfig/20230414-120125-ladsgroup.json
11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P46687 and previous config saved to /var/cache/conftool/dbconfig/20230414-115935-ladsgroup.json
11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109', diff saved to https://phabricator.wikimedia.org/P46686 and previous config saved to /var/cache/conftool/dbconfig/20230414-115903-ladsgroup.json
11:50 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
11:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T333332)', diff saved to https://phabricator.wikimedia.org/P46685 and previous config saved to /var/cache/conftool/dbconfig/20230414-114619-ladsgroup.json
11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T333332)', diff saved to https://phabricator.wikimedia.org/P46684 and previous config saved to /var/cache/conftool/dbconfig/20230414-114429-ladsgroup.json
11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 (T333332)', diff saved to https://phabricator.wikimedia.org/P46683 and previous config saved to /var/cache/conftool/dbconfig/20230414-114407-ladsgroup.json
11:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1118.eqiad.wmnet with reason: Maintenance
11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109 (T333332)', diff saved to https://phabricator.wikimedia.org/P46682 and previous config saved to /var/cache/conftool/dbconfig/20230414-114356-ladsgroup.json
11:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1118.eqiad.wmnet with reason: Maintenance
11:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance
11:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1106.eqiad.wmnet with reason: Maintenance
11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T333332)', diff saved to https://phabricator.wikimedia.org/P46681 and previous config saved to /var/cache/conftool/dbconfig/20230414-114219-ladsgroup.json
11:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1127.eqiad.wmnet with reason: Maintenance
11:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1127.eqiad.wmnet with reason: Maintenance
11:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1109 (T333332)', diff saved to https://phabricator.wikimedia.org/P46680 and previous config saved to /var/cache/conftool/dbconfig/20230414-114148-ladsgroup.json
11:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1109.eqiad.wmnet with reason: Maintenance
11:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1109.eqiad.wmnet with reason: Maintenance
11:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
10:49 kamila@deploy2002: conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
10:43 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
10:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1120.eqiad.wmnet
10:40 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:40 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1120.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
10:39 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1120.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
10:37 marostegui@cumin1001: START - Cookbook sre.dns.netbox
10:32 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1120.eqiad.wmnet
10:26 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
10:08 kamila@deploy2002: conftool action : set/pooled=inactive; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
09:53 cgoubert@cumin1001: conftool action : set/pooled=yes; selector: name=mw2.*.codfw.wmnet,cluster=api_appserver
09:53 cgoubert@cumin1001: conftool action : set/pooled=yes; selector: name=mw2.*.codfw.wmnet,cluster=appserver
09:45 kamila@deploy2002: conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
09:22 cgoubert@cumin1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=parsoid
09:21 kamila@deploy2002: conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes201[0123].codfw.wmnet
09:16 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup2002.codfw.wmnet with reason: systemd package upgrade
09:16 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on backup2002.codfw.wmnet with reason: systemd package upgrade
08:51 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
08:35 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
08:21 arturo: aborrero@apt2001:~ $ sudo -i reprepro --noskipold --component thirdparty/kubeadm-k8s-1-23 update buster-wikimedia (T298005)
07:55 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest1002.eqiad.wmnet with OS bookworm
07:39 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1100 T329352', diff saved to https://phabricator.wikimedia.org/P46679 and previous config saved to /var/cache/conftool/dbconfig/20230414-062553-marostegui.json
06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1107.eqiad.wmnet
06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1107.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
06:08 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1107.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
06:06 marostegui@cumin1001: START - Cookbook sre.dns.netbox
06:01 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1107.eqiad.wmnet
04:04 ejegg: SmashPig upgraded from 24d700f4 to db9fa965
01:37 fab@deploy2002: Finished deploy [airflow-dags/research@f8dad05]: (no justification provided) (duration: 00m 11s)
01:37 fab@deploy2002: Started deploy [airflow-dags/research@f8dad05]: (no justification provided)
01:07 fab@deploy2002: Finished deploy [airflow-dags/research@f8dad05]: (no justification provided) (duration: 00m 10s)
01:07 fab@deploy2002: Started deploy [airflow-dags/research@f8dad05]: (no justification provided)
01:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
00:05 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
00:04 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye

2023-04-13

23:44 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
23:41 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
23:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
23:16 ejegg: civicrm upgraded from 2d5ede8d to cd0f886d
22:00 ryankemper: T333656 `ryankemper@dns1001:~$ sudo -i authdns-update` after merge of https://gerrit.wikimedia.org/r/905754 => `OK - authdns-update successful on all nodes!`
21:38 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
21:37 SandraEbele: Successfully deployed analytics refinery using scap, then deployed onto HDFS.
21:28 mutante: https://query-preview.wikidata.org has been deactivated at ATS layer - T333656
21:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
21:25 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
21:10 brennen@deploy2002: rebuilt and synchronized wikiversions files: all wikis to 1.41.0-wmf.4 refs T330210
21:03 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
21:02 mutante: doc1002 (doc.wikimedia.org) - switching from PHP 7.3 to 7.4 - systemctl stop php7.3-fpm, restart php7.4-fpm, apt-get remove --purge php7.3*, systemctl restart apache2. - all tests still working (on deployment server: httpbb --hosts doc1002.eqiad.wmnet /srv/deployment/httpbb-tests/doc/test_doc.yaml) T322357 T319477
21:01 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
20:55 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on doc1002.eqiad.wmnet with reason: maintenance
20:55 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 0:15:00 on doc1002.eqiad.wmnet with reason: maintenance
20:55 urbanecm@deploy2002: Finished scap: Backport for Only log 'visualEditorFeatureUse' events if 'editAttemptStep' events are being logged (T334157), Stop using redundant $wmg variable for MobileFrontend extension (T119117) (duration: 06m 26s)
20:50 urbanecm@deploy2002: urbanecm and matmarex: Backport for Only log 'visualEditorFeatureUse' events if 'editAttemptStep' events are being logged (T334157), Stop using redundant $wmg variable for MobileFrontend extension (T119117) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
20:48 urbanecm@deploy2002: Started scap: Backport for Only log 'visualEditorFeatureUse' events if 'editAttemptStep' events are being logged (T334157), Stop using redundant $wmg variable for MobileFrontend extension (T119117)
20:46 mutante: doc2001 - systemctl stop php7.3-fpm; systemctl restart php7.4-fpm - needed because after gerrit:901612 we had BOTH PHP versions, 7.3 and 7.4, each running its own php-fpm process; packages for both versions are also installed, so manual package removal is needed as well - apt-get remove php7.3* T322357 T319477
20:38 urbanecm@deploy2002: Finished scap: Backport for enwiki: Remove userrights from `founder` (T334692) (duration: 05m 55s)
20:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
20:34 urbanecm@deploy2002: urbanecm: Backport for enwiki: Remove userrights from `founder` (T334692) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
20:32 urbanecm@deploy2002: Started scap: Backport for enwiki: Remove userrights from `founder` (T334692)
20:32 urbanecm@deploy2002: Finished scap: Backport for [wikitech] Add a logo and a wordmark for Vector 2022 (T334666) (duration: 05m 41s)
20:31 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
20:27 urbanecm@deploy2002: superpes and urbanecm: Backport for [wikitech] Add a logo and a wordmark for Vector 2022 (T334666) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
20:27 mutante: doc2001 - switching PHP version from 7.3 to 7.4 for T322357
20:26 urbanecm@deploy2002: Started scap: Backport for [wikitech] Add a logo and a wordmark for Vector 2022 (T334666)
20:25 urbanecm@deploy2002: Finished scap: Backport for Enable mobile page tabs for everyone in ruwiki (T334395) (duration: 06m 49s)
20:20 urbanecm@deploy2002: urbanecm and matmarex: Backport for Enable mobile page tabs for everyone in ruwiki (T334395) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
20:19 urbanecm@deploy2002: Started scap: Backport for Enable mobile page tabs for everyone in ruwiki (T334395)
20:15 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
20:15 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.3 refs T330210
20:14 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
19:55 brennen@deploy2002: rebuilt and synchronized wikiversions files: all wikis to 1.41.0-wmf.4 refs T330210
19:29 sukhe: restart pybal on lvs2009 to pick up bgp-med change and pool
19:25 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2009
19:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
19:25 brett@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2009
19:25 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2009
19:25 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2009
19:18 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
19:16 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
19:03 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.3 refs T330210
18:59 bblack: lvs1020: restart pybal for experiment...
18:58 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
18:57 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
18:56 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
18:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2009.codfw.wmnet with OS bullseye
18:46 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
18:45 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirtlocal1002.eqiad.wmnet with reason: host reimage
18:44 dcausse@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
18:44 dcausse@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
18:42 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1002.eqiad.wmnet with reason: host reimage
18:38 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2009.codfw.wmnet with reason: host reimage
18:35 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2009.codfw.wmnet with reason: host reimage
18:34 brennen@deploy2002: rebuilt and synchronized wikiversions files: all wikis to 1.41.0-wmf.4 refs T330210
18:26 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:26 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
18:23 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
18:23 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
18:16 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2009.codfw.wmnet with OS bullseye
18:07 dcausse@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
18:07 dcausse@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
17:57 brett: Disable Puppet/PyBal on lvs2009 in preparation for reimaging - T321309
17:55 brett: restarting pybal on lvs2008 to pick up bgp-med change
17:49 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: T334057
17:48 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1102.eqiad.wmnet with reason: T334057
17:46 brett@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2008
17:46 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2008
17:37 brett@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2008
17:37 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2008
17:37 brett@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2008
17:36 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2008
17:31 ejegg: payments-wiki upgraded from 4dcba0a9 to c01a32c4
17:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2008.codfw.wmnet with OS bullseye
17:28 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
17:28 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
17:28 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
17:28 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
17:27 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
17:27 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
17:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2008.codfw.wmnet with reason: host reimage
17:09 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2008.codfw.wmnet with reason: host reimage
16:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2008.codfw.wmnet with OS bullseye
16:46 hnowlan@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=thumbor,name=kubernetes201[0123].codfw.wmnet
16:31 sukhe: sudo cumin -b1 -s30 'A:cp-text' 'ats-backend-restart': T332650
16:28 jhancock@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['ms-be2067']
16:28 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be2067']
16:27 sukhe: enable puppet on A:cp-text to merge CR 907937
16:23 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes201[0123].codfw.wmnet
16:21 sukhe: disable puppet on A:cp-text to merge CR 907937
16:14 stevemunene@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1132.eqiad.wmnet with reason: host reimage
16:10 stevemunene@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1132.eqiad.wmnet with reason: host reimage
16:05 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
16:04 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
15:58 stevemunene@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1132.eqiad.wmnet with OS buster
15:51 ebysans@deploy2002: Finished deploy [analytics/refinery@4e8f1ac] (hadoop-test): Update druid pageview hourly and daily tables TEST [analytics/refinery@4e8f1ac] (duration: 01m 26s)
15:51 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
15:51 andrew@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1001"
15:50 ebysans@deploy2002: Started deploy [analytics/refinery@4e8f1ac] (hadoop-test): Update druid pageview hourly and daily tables TEST [analytics/refinery@4e8f1ac]
15:49 andrew@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1001"
15:49 ebysans@deploy2002: Finished deploy [analytics/refinery@4e8f1ac] (thin): Update druid pageview hourly and daily tables THIN [analytics/refinery@4e8f1ac] (duration: 00m 08s)
15:49 ebysans@deploy2002: Started deploy [analytics/refinery@4e8f1ac] (thin): Update druid pageview hourly and daily tables THIN [analytics/refinery@4e8f1ac]
15:49 stevemunene@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1132.eqiad.wmnet with OS buster
15:48 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
15:48 andrew@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1001"
15:47 ebysans@deploy2002: Finished deploy [analytics/refinery@4e8f1ac]: Update druid pageview hourly and daily tables [analytics/refinery@4e8f1ac] (duration: 06m 24s)
15:47 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirtlocal1003.eqiad.wmnet with OS bullseye
15:47 andrew@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1001"
15:46 andrew@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1001"
15:46 brett: Disable Puppet/PyBal on lvs2008 in preparation for reimaging - T321309
15:44 andrew@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - andrew@cumin1001"
15:42 SandraEbele: paused Oozie pageview-druid-hourly job.
15:41 ebysans@deploy2002: Started deploy [analytics/refinery@4e8f1ac]: Update druid pageview hourly and daily tables [analytics/refinery@4e8f1ac]
15:36 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2007
15:36 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2007
15:33 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirtlocal1002.eqiad.wmnet with reason: host reimage
15:31 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
15:31 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
15:30 stevemunene@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1132.eqiad.wmnet with OS buster
15:29 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirtlocal1003.eqiad.wmnet with reason: host reimage
15:29 SandraEbele: deploying analytics refinery-update pageview druid table
15:25 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1001.eqiad.wmnet with reason: host reimage
15:25 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1002.eqiad.wmnet with reason: host reimage
15:25 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirtlocal1003.eqiad.wmnet with reason: host reimage
15:25 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
15:24 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
15:24 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
15:23 stevemunene@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1132.eqiad.wmnet with OS buster
15:22 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
15:22 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
15:19 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
15:19 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
15:17 claime: cxserver migrated to mw-api-int on kubernetes, take three - T334204
15:14 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
15:13 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
15:13 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
15:13 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
15:13 moritzm: remove runc packages installed on mw1349-mw1436, these were once used for a load test with dragonfly and are no longer needed
15:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
15:11 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
15:10 claime: Migrating cxserver to mw-api-int on kubernetes, take three - T334204
15:10 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
15:09 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
15:09 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1003.eqiad.wmnet with OS bullseye
15:07 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
15:06 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
15:06 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
15:05 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
15:04 moritzm: installing unbound security updates on buster
15:03 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
15:03 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
15:00 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirtlocal1003.eqiad.wmnet with OS bullseye
14:49 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
14:41 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
14:39 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
14:36 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
14:36 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
14:26 sukhe: restart pybal on lvs2007 to pick up bgp-med change CR 908552
14:23 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
14:23 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
14:20 moritzm: installing mariadb-10.3 security updates (as shipped in Debian, not the wmf-mariadb packages)
14:19 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
14:14 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1002.eqiad.wmnet
14:09 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
14:06 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
14:06 kamila@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
14:05 elukey@cumin1001: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1002.eqiad.wmnet
14:04 vgutierrez: rolling restart of HAProxy on A:cp-text - T334448
14:02 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
13:54 sukhe: [puppetmaster] sudo /usr/local/sbin/puppet-facts-upload --proxy http://webproxy.eqiad.wmnet:8080; failing PCC for recently reimaged node
13:45 mbsantos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
13:45 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1003.eqiad.wmnet with OS bullseye
13:45 mbsantos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
13:44 andrew@cumin1001: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['cloudvirtlocal1003']
13:44 andrew@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirtlocal1003']
13:43 jelto@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host gitlab2003.wikimedia.org with OS bullseye
13:43 mbsantos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
13:42 mbsantos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
13:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46674 and previous config saved to /var/cache/conftool/dbconfig/20230413-134030-root.json
13:38 mbsantos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
13:37 mbsantos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
13:33 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
13:31 sukhe@cumin2002: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs2007
13:30 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host lvs2007
13:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46673 and previous config saved to /var/cache/conftool/dbconfig/20230413-132525-root.json
13:23 vgutierrez: restarting haproxy in cp5022 - T334448
13:19 jgiannelos@deploy2002: Finished deploy [restbase/deploy@a08f56d]: (no justification provided) (duration: 17m 02s)
13:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46672 and previous config saved to /var/cache/conftool/dbconfig/20230413-131021-root.json
13:04 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
13:02 jgiannelos@deploy2002: Started deploy [restbase/deploy@a08f56d]: (no justification provided)
12:57 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
12:57 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
12:56 claime: Migrating cxserver to mw-api-int on kubernetes, take two - T334204
12:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46671 and previous config saved to /var/cache/conftool/dbconfig/20230413-125516-root.json
12:49 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
12:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46670 and previous config saved to /var/cache/conftool/dbconfig/20230413-124011-root.json
12:38 moritzm: installing systemd security updates on buster
12:33 moritzm: installing Django security updates
12:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46669 and previous config saved to /var/cache/conftool/dbconfig/20230413-122506-root.json
12:21 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus3001.esams.wmnet
12:21 moritzm: remove imagemagick 8:6.9.10.23+dfsg-2.1+deb10u1+wmf1 from apt.wikimedia.org (obsoleted by 8:6.9.10.23+dfsg-2.1+deb10u4 from the Debian archive) T328901
12:15 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host prometheus3001.esams.wmnet
12:11 moritzm: installing imagemagick security updates for buster T328901
12:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 4%: Repooling', diff saved to https://phabricator.wikimedia.org/P46668 and previous config saved to /var/cache/conftool/dbconfig/20230413-121001-root.json
11:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 3%: Repooling', diff saved to https://phabricator.wikimedia.org/P46667 and previous config saved to /var/cache/conftool/dbconfig/20230413-115456-root.json
11:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 2%: Repooling', diff saved to https://phabricator.wikimedia.org/P46666 and previous config saved to /var/cache/conftool/dbconfig/20230413-113951-root.json
11:34 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1120 from dbctl T334580', diff saved to https://phabricator.wikimedia.org/P46665 and previous config saved to /var/cache/conftool/dbconfig/20230413-113435-marostegui.json
11:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46664 and previous config saved to /var/cache/conftool/dbconfig/20230413-112446-root.json
11:24 moritzm: installing imagemagick security updates
11:18 cgoubert@deploy2002: Finished scap: Updating mw-on-k8s certificates (duration: 01m 56s)
11:16 cgoubert@deploy2002: Started scap: Updating mw-on-k8s certificates
11:15 claime: Re-deploying mw-on-k8s to update certificates - T334561
10:39 claime: updating appservers and api certificates - T334561
10:23 Emperor: clear old 2/22/Free-object-universal-property.svg thumbs from wikipedia-commons-local-thumb.22 T334303
10:15 stevemunene@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1132.eqiad.wmnet with OS buster
10:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1223 (re)pooling @ 100%: Pooling db1223 T326669', diff saved to https://phabricator.wikimedia.org/P46662 and previous config saved to /var/cache/conftool/dbconfig/20230413-101307-root.json
10:07 moritzm: installing tomcat security updates
09:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1223 (re)pooling @ 75%: Pooling db1223 T326669', diff saved to https://phabricator.wikimedia.org/P46661 and previous config saved to /var/cache/conftool/dbconfig/20230413-095802-root.json
09:53 taavi: taavi@mwmaint2002 ~ $ mwscript emptyUserGroup.php --wiki frwikinews editor # T333750
09:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1223 (re)pooling @ 50%: Pooling db1223 T326669', diff saved to https://phabricator.wikimedia.org/P46660 and previous config saved to /var/cache/conftool/dbconfig/20230413-094257-root.json
09:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1223 (re)pooling @ 25%: Pooling db1223 T326669', diff saved to https://phabricator.wikimedia.org/P46659 and previous config saved to /var/cache/conftool/dbconfig/20230413-092752-root.json
09:25 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host dse-k8s-worker1001.eqiad.wmnet
09:22 stevemunene@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1132.eqiad.wmnet with OS buster
09:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1223 (re)pooling @ 10%: Pooling db1223 T326669', diff saved to https://phabricator.wikimedia.org/P46658 and previous config saved to /var/cache/conftool/dbconfig/20230413-091247-root.json
09:12 elukey@cumin1001: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1001.eqiad.wmnet
09:04 jmm@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: centrallog1001.eqiad.wmnet
09:04 jmm@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: centrallog1001.eqiad.wmnet
09:01 stevemunene@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1132.eqiad.wmnet with OS buster
08:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1223 (re)pooling @ 5%: Pooling db1223 T326669', diff saved to https://phabricator.wikimedia.org/P46657 and previous config saved to /var/cache/conftool/dbconfig/20230413-085742-root.json
08:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1223 (re)pooling @ 4%: Pooling db1223 T326669', diff saved to https://phabricator.wikimedia.org/P46656 and previous config saved to /var/cache/conftool/dbconfig/20230413-084238-root.json
08:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1221 (re)pooling @ 100%: Pooling', diff saved to https://phabricator.wikimedia.org/P46655 and previous config saved to /var/cache/conftool/dbconfig/20230413-084036-root.json
08:36 moritzm: installing git security updates
08:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1210 (re)pooling @ 100%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46654 and previous config saved to /var/cache/conftool/dbconfig/20230413-083457-root.json
08:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1223 (re)pooling @ 3%: Pooling db1223 T326669', diff saved to https://phabricator.wikimedia.org/P46653 and previous config saved to /var/cache/conftool/dbconfig/20230413-082732-root.json
08:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1221 (re)pooling @ 75%: Pooling', diff saved to https://phabricator.wikimedia.org/P46652 and previous config saved to /var/cache/conftool/dbconfig/20230413-082532-root.json
08:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1210 (re)pooling @ 75%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46651 and previous config saved to /var/cache/conftool/dbconfig/20230413-081952-root.json
08:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1223 (re)pooling @ 2%: Pooling db1223 T326669', diff saved to https://phabricator.wikimedia.org/P46650 and previous config saved to /var/cache/conftool/dbconfig/20230413-081227-root.json
08:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1221 (re)pooling @ 50%: Pooling', diff saved to https://phabricator.wikimedia.org/P46649 and previous config saved to /var/cache/conftool/dbconfig/20230413-081027-root.json
08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1210 (re)pooling @ 50%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46647 and previous config saved to /var/cache/conftool/dbconfig/20230413-080447-root.json
08:00 moritzm: imported perccli 007.1910.0000.000 to bookworm-wikimedia-private T330495
07:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1223 (re)pooling @ 1%: Pooling db1223 T326669', diff saved to https://phabricator.wikimedia.org/P46646 and previous config saved to /var/cache/conftool/dbconfig/20230413-075722-root.json
07:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1221 (re)pooling @ 25%: Pooling', diff saved to https://phabricator.wikimedia.org/P46645 and previous config saved to /var/cache/conftool/dbconfig/20230413-075522-root.json
07:55 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1223 to dbctl T326669', diff saved to https://phabricator.wikimedia.org/P46644 and previous config saved to /var/cache/conftool/dbconfig/20230413-075513-marostegui.json
07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1210 (re)pooling @ 25%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46643 and previous config saved to /var/cache/conftool/dbconfig/20230413-074942-root.json
07:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1221 (re)pooling @ 10%: Pooling', diff saved to https://phabricator.wikimedia.org/P46642 and previous config saved to /var/cache/conftool/dbconfig/20230413-074010-root.json
07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1210 (re)pooling @ 10%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46641 and previous config saved to /var/cache/conftool/dbconfig/20230413-073437-root.json
07:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 10 hosts with reason: Cloning db1117
07:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on 10 hosts with reason: Cloning db1117
07:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1221 (re)pooling @ 5%: Pooling', diff saved to https://phabricator.wikimedia.org/P46639 and previous config saved to /var/cache/conftool/dbconfig/20230413-072505-root.json
07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1210 (re)pooling @ 5%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46638 and previous config saved to /var/cache/conftool/dbconfig/20230413-071932-root.json
07:14 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab2003.wikimedia.org with OS bullseye
07:14 slyngs: Puppet: move htcacheclean to httpd class https://gerrit.wikimedia.org/r/c/operations/puppet/+/904102
07:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1221 (re)pooling @ 4%: Pooling', diff saved to https://phabricator.wikimedia.org/P46637 and previous config saved to /var/cache/conftool/dbconfig/20230413-071000-root.json
07:09 moritzm: update bookworm installer to rc1 T330495
07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1210 (re)pooling @ 4%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46636 and previous config saved to /var/cache/conftool/dbconfig/20230413-070428-root.json
06:59 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
06:56 jelto@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
06:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1221 (re)pooling @ 3%: Pooling', diff saved to https://phabricator.wikimedia.org/P46635 and previous config saved to /var/cache/conftool/dbconfig/20230413-065456-root.json
06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1210 (re)pooling @ 3%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46634 and previous config saved to /var/cache/conftool/dbconfig/20230413-064922-root.json
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1114 to clone db1214 T326669', diff saved to https://phabricator.wikimedia.org/P46632 and previous config saved to /var/cache/conftool/dbconfig/20230413-064452-marostegui.json
06:43 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
06:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1221 (re)pooling @ 2%: Pooling', diff saved to https://phabricator.wikimedia.org/P46631 and previous config saved to /var/cache/conftool/dbconfig/20230413-063951-root.json
06:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1210 (re)pooling @ 2%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46630 and previous config saved to /var/cache/conftool/dbconfig/20230413-063417-root.json
06:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1221 (re)pooling @ 1%: Pooling', diff saved to https://phabricator.wikimedia.org/P46629 and previous config saved to /var/cache/conftool/dbconfig/20230413-062446-root.json
06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1221 to dbctl T326669', diff saved to https://phabricator.wikimedia.org/P46628 and previous config saved to /var/cache/conftool/dbconfig/20230413-062231-marostegui.json
06:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1210 (re)pooling @ 1%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46627 and previous config saved to /var/cache/conftool/dbconfig/20230413-061913-root.json
06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1210 to dbctl T326669', diff saved to https://phabricator.wikimedia.org/P46626 and previous config saved to /var/cache/conftool/dbconfig/20230413-061716-marostegui.json
02:08 fab@deploy2002: Finished deploy [airflow-dags/research@f8dad05]: (no justification provided) (duration: 00m 10s)
02:07 fab@deploy2002: Started deploy [airflow-dags/research@f8dad05]: (no justification provided)
02:01 fab@deploy2002: Finished deploy [airflow-dags/research@f8dad05]: (no justification provided) (duration: 00m 11s)
02:01 fab@deploy2002: Started deploy [airflow-dags/research@f8dad05]: (no justification provided)
02:00 ejegg: civicrm upgraded from 0f37f981 to 2d5ede8d
01:41 fab@deploy2002: Finished deploy [airflow-dags/research@f8dad05]: (no justification provided) (duration: 00m 10s)
01:41 fab@deploy2002: Started deploy [airflow-dags/research@f8dad05]: (no justification provided)
01:23 fab@deploy2002: Finished deploy [airflow-dags/research@f8dad05]: (no justification provided) (duration: 00m 10s)
01:23 fab@deploy2002: Started deploy [airflow-dags/research@f8dad05]: (no justification provided)
00:22 krinkle@deploy2002: Finished deploy [integration/docroot@f68055d]: (no justification provided) (duration: 00m 28s)
00:21 krinkle@deploy2002: Started deploy [integration/docroot@f68055d]: (no justification provided)

2023-04-12

23:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T333332)', diff saved to https://phabricator.wikimedia.org/P46625 and previous config saved to /var/cache/conftool/dbconfig/20230412-230933-ladsgroup.json
22:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P46624 and previous config saved to /var/cache/conftool/dbconfig/20230412-225427-ladsgroup.json
22:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P46623 and previous config saved to /var/cache/conftool/dbconfig/20230412-223921-ladsgroup.json
22:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T333332)', diff saved to https://phabricator.wikimedia.org/P46622 and previous config saved to /var/cache/conftool/dbconfig/20230412-222414-ladsgroup.json
22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2182 (T333332)', diff saved to https://phabricator.wikimedia.org/P46621 and previous config saved to /var/cache/conftool/dbconfig/20230412-222141-ladsgroup.json
22:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
22:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2182.codfw.wmnet with reason: Maintenance
22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46620 and previous config saved to /var/cache/conftool/dbconfig/20230412-222117-ladsgroup.json
22:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P46619 and previous config saved to /var/cache/conftool/dbconfig/20230412-220611-ladsgroup.json
21:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2007.codfw.wmnet with OS bullseye
21:52 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for sessionstore1001.eqiad.wmnet
21:52 eevans@cumin1001: START - Cookbook sre.hosts.remove-downtime for sessionstore1001.eqiad.wmnet
21:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P46618 and previous config saved to /var/cache/conftool/dbconfig/20230412-215104-ladsgroup.json
21:38 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2007.codfw.wmnet with reason: host reimage
21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46617 and previous config saved to /var/cache/conftool/dbconfig/20230412-213558-ladsgroup.json
21:35 urandom: restarting Cassandra —sessionstore1001— to reenable native transport — T327954
21:35 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2007.codfw.wmnet with reason: host reimage
21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46616 and previous config saved to /var/cache/conftool/dbconfig/20230412-213325-ladsgroup.json
21:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
21:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46615 and previous config saved to /var/cache/conftool/dbconfig/20230412-213301-ladsgroup.json
21:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P46614 and previous config saved to /var/cache/conftool/dbconfig/20230412-211755-ladsgroup.json
21:16 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2007.codfw.wmnet with OS bullseye
21:04 mutante: gerrit1001 - pushing data over to gerrit1003 via rsync, with bwlimit option: rsync -avp --bwlimit=1m /srv/gerrit/ rsync://gerrit1003.wikimedia.org/gerrit-data/ (T326368)
21:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P46613 and previous config saved to /var/cache/conftool/dbconfig/20230412-210249-ladsgroup.json
21:01 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host lvs2007.codfw.wmnet with OS bullseye
21:01 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2007.codfw.wmnet with OS bullseye
20:58 brett: Disable Puppet/PyBal on lvs2007 in preparation for reimaging - T321309
20:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46612 and previous config saved to /var/cache/conftool/dbconfig/20230412-204742-ladsgroup.json
20:47 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
20:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46611 and previous config saved to /var/cache/conftool/dbconfig/20230412-204508-ladsgroup.json
20:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
20:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2168.codfw.wmnet with reason: Maintenance
20:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T333332)', diff saved to https://phabricator.wikimedia.org/P46610 and previous config saved to /var/cache/conftool/dbconfig/20230412-204445-ladsgroup.json
20:38 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
20:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P46609 and previous config saved to /var/cache/conftool/dbconfig/20230412-202939-ladsgroup.json
20:15 zabe@deploy2002: Finished scap: Backport for Drop unused VectorPageTools feature flag (T332090), Set Vector 2022 as default skin on Welsh Wikipedia (T334279) (duration: 10m 19s)
20:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P46608 and previous config saved to /var/cache/conftool/dbconfig/20230412-201432-ladsgroup.json
20:06 zabe@deploy2002: zabe and jdlrobson: Backport for Drop unused VectorPageTools feature flag (T332090), Set Vector 2022 as default skin on Welsh Wikipedia (T334279) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
20:05 zabe@deploy2002: Started scap: Backport for Drop unused VectorPageTools feature flag (T332090), Set Vector 2022 as default skin on Welsh Wikipedia (T334279)
19:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T333332)', diff saved to https://phabricator.wikimedia.org/P46606 and previous config saved to /var/cache/conftool/dbconfig/20230412-195926-ladsgroup.json
19:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2159 (T333332)', diff saved to https://phabricator.wikimedia.org/P46605 and previous config saved to /var/cache/conftool/dbconfig/20230412-195453-ladsgroup.json
19:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
19:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
19:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
19:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2159.codfw.wmnet with reason: Maintenance
19:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T333332)', diff saved to https://phabricator.wikimedia.org/P46604 and previous config saved to /var/cache/conftool/dbconfig/20230412-195423-ladsgroup.json
19:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
19:43 zabe@deploy2002: Finished scap: Backport for Revert "Ensure ApiHelp correctly types values in TOCData objects", Revert "Ensure ApiHelp correctly types values in TOCData objects" (duration: 06m 40s)
19:41 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
19:41 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
19:40 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
19:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P46603 and previous config saved to /var/cache/conftool/dbconfig/20230412-193917-ladsgroup.json
19:38 zabe@deploy2002: zabe: Backport for Revert "Ensure ApiHelp correctly types values in TOCData objects", Revert "Ensure ApiHelp correctly types values in TOCData objects" synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
19:37 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
19:37 zabe@deploy2002: Started scap: Backport for Revert "Ensure ApiHelp correctly types values in TOCData objects", Revert "Ensure ApiHelp correctly types values in TOCData objects"
19:37 urandom: sessionstore1001: systemctl stop cassandra-a.service && systemctl start cassandra-a.service — T327954
19:36 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
19:35 zabe@deploy2002: Sync cancelled.
19:32 zabe@deploy2002: jforrester and zabe: Backport for composer.json: Explicitly pin psr/http-message to 1.0.1 (T333993), Ensure ApiHelp correctly types values in TOCData objects (T334551), Ensure ApiHelp correctly types values in TOCData objects (T334551) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.
19:30 zabe@deploy2002: Started scap: Backport for composer.json: Explicitly pin psr/http-message to 1.0.1 (T333993), Ensure ApiHelp correctly types values in TOCData objects (T334551), Ensure ApiHelp correctly types values in TOCData objects (T334551)
19:28 urandom: restart Cassandra —sessionstore1001— to disable native transport for testing — T327954
19:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P46602 and previous config saved to /var/cache/conftool/dbconfig/20230412-192411-ladsgroup.json
19:17 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on sessionstore1001.eqiad.wmnet with reason: Reproducing dissonant cluster state
19:16 eevans@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on sessionstore1001.eqiad.wmnet with reason: Reproducing dissonant cluster state
19:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T333332)', diff saved to https://phabricator.wikimedia.org/P46601 and previous config saved to /var/cache/conftool/dbconfig/20230412-190904-ladsgroup.json
18:42 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:42 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:41 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
18:39 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
18:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2150 (T333332)', diff saved to https://phabricator.wikimedia.org/P46600 and previous config saved to /var/cache/conftool/dbconfig/20230412-183822-ladsgroup.json
18:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
18:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2150.codfw.wmnet with reason: Maintenance
18:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T333332)', diff saved to https://phabricator.wikimedia.org/P46599 and previous config saved to /var/cache/conftool/dbconfig/20230412-183758-ladsgroup.json
18:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P46598 and previous config saved to /var/cache/conftool/dbconfig/20230412-182252-ladsgroup.json
18:16 dancy@deploy2002: Synchronized php: group1 wikis to 1.41.0-wmf.4 refs T330210 (duration: 06m 02s)
18:10 dancy@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.41.0-wmf.4 refs T330210
18:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P46597 and previous config saved to /var/cache/conftool/dbconfig/20230412-180746-ladsgroup.json
17:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T333332)', diff saved to https://phabricator.wikimedia.org/P46596 and previous config saved to /var/cache/conftool/dbconfig/20230412-175240-ladsgroup.json
17:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2122 (T333332)', diff saved to https://phabricator.wikimedia.org/P46595 and previous config saved to /var/cache/conftool/dbconfig/20230412-174806-ladsgroup.json
17:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
17:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2122.codfw.wmnet with reason: Maintenance
17:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T333332)', diff saved to https://phabricator.wikimedia.org/P46594 and previous config saved to /var/cache/conftool/dbconfig/20230412-174743-ladsgroup.json
17:47 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
17:46 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
17:44 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1002.eqiad.wmnet with OS bullseye
17:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P46593 and previous config saved to /var/cache/conftool/dbconfig/20230412-173237-ladsgroup.json
17:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P46592 and previous config saved to /var/cache/conftool/dbconfig/20230412-171730-ladsgroup.json
17:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T333332)', diff saved to https://phabricator.wikimedia.org/P46591 and previous config saved to /var/cache/conftool/dbconfig/20230412-171219-ladsgroup.json
17:06 ejegg: payments-wiki upgraded from efe7e408 to 4dcba0a9
17:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T333332)', diff saved to https://phabricator.wikimedia.org/P46590 and previous config saved to /var/cache/conftool/dbconfig/20230412-170224-ladsgroup.json
16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2121 (T333332)', diff saved to https://phabricator.wikimedia.org/P46589 and previous config saved to /var/cache/conftool/dbconfig/20230412-165951-ladsgroup.json
16:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
16:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance
16:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T333332)', diff saved to https://phabricator.wikimedia.org/P46588 and previous config saved to /var/cache/conftool/dbconfig/20230412-165928-ladsgroup.json
16:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P46587 and previous config saved to /var/cache/conftool/dbconfig/20230412-165712-ladsgroup.json
16:54 topranks: Updating routing-options on drmrs asw switches to add empty rib inet6 stanza T334281
16:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
16:51 topranks: Updating routing-options on Eqiad lsw1 switches to add empty rib inet6 stanza T334281
16:50 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
16:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P46586 and previous config saved to /var/cache/conftool/dbconfig/20230412-164422-ladsgroup.json
16:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P46585 and previous config saved to /var/cache/conftool/dbconfig/20230412-164206-ladsgroup.json
16:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P46584 and previous config saved to /var/cache/conftool/dbconfig/20230412-162915-ladsgroup.json
16:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T333332)', diff saved to https://phabricator.wikimedia.org/P46583 and previous config saved to /var/cache/conftool/dbconfig/20230412-162700-ladsgroup.json
16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T333332)', diff saved to https://phabricator.wikimedia.org/P46582 and previous config saved to /var/cache/conftool/dbconfig/20230412-162448-ladsgroup.json
16:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
16:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2179.codfw.wmnet with reason: Maintenance
16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T333332)', diff saved to https://phabricator.wikimedia.org/P46581 and previous config saved to /var/cache/conftool/dbconfig/20230412-162422-ladsgroup.json
16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T333332)', diff saved to https://phabricator.wikimedia.org/P46580 and previous config saved to /var/cache/conftool/dbconfig/20230412-161409-ladsgroup.json
16:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2120 (T333332)', diff saved to https://phabricator.wikimedia.org/P46579 and previous config saved to /var/cache/conftool/dbconfig/20230412-161135-ladsgroup.json
16:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
16:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2120.codfw.wmnet with reason: Maintenance
16:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T333332)', diff saved to https://phabricator.wikimedia.org/P46578 and previous config saved to /var/cache/conftool/dbconfig/20230412-161112-ladsgroup.json
16:09 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
16:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P46577 and previous config saved to /var/cache/conftool/dbconfig/20230412-160916-ladsgroup.json
16:05 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
16:05 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
16:04 otto@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
16:04 otto@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
16:04 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
16:04 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs2010.codfw.wmnet with OS bullseye
16:03 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
16:03 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
16:02 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
15:58 hnowlan@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=thumbor,name=kubernetes201[0123].codfw.wmnet
15:57 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:57 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
15:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P46576 and previous config saved to /var/cache/conftool/dbconfig/20230412-155606-ladsgroup.json
15:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P46575 and previous config saved to /var/cache/conftool/dbconfig/20230412-155410-ladsgroup.json
15:52 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:52 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
15:49 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:49 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
15:47 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:47 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
15:46 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2010.codfw.wmnet with reason: host reimage
15:45 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:44 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
15:44 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2010.codfw.wmnet with reason: host reimage
15:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P46573 and previous config saved to /var/cache/conftool/dbconfig/20230412-154100-ladsgroup.json
15:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T333332)', diff saved to https://phabricator.wikimedia.org/P46572 and previous config saved to /var/cache/conftool/dbconfig/20230412-153903-ladsgroup.json
15:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T333332)', diff saved to https://phabricator.wikimedia.org/P46571 and previous config saved to /var/cache/conftool/dbconfig/20230412-153651-ladsgroup.json
15:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2172.codfw.wmnet with reason: Maintenance
15:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2172.codfw.wmnet with reason: Maintenance
15:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T333332)', diff saved to https://phabricator.wikimedia.org/P46570 and previous config saved to /var/cache/conftool/dbconfig/20230412-153627-ladsgroup.json
15:30 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
15:30 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
15:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T333332)', diff saved to https://phabricator.wikimedia.org/P46569 and previous config saved to /var/cache/conftool/dbconfig/20230412-152553-ladsgroup.json
15:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2108 (T333332)', diff saved to https://phabricator.wikimedia.org/P46568 and previous config saved to /var/cache/conftool/dbconfig/20230412-152320-ladsgroup.json
15:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
15:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2108.codfw.wmnet with reason: Maintenance
15:22 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs2010.codfw.wmnet with OS bullseye
15:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
15:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2100.codfw.wmnet with reason: Maintenance
15:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
15:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance
15:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P46567 and previous config saved to /var/cache/conftool/dbconfig/20230412-152120-ladsgroup.json
15:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
15:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
15:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T333332)', diff saved to https://phabricator.wikimedia.org/P46566 and previous config saved to /var/cache/conftool/dbconfig/20230412-152104-ladsgroup.json
15:14 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
15:14 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P46565 and previous config saved to /var/cache/conftool/dbconfig/20230412-150614-ladsgroup.json
15:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P46564 and previous config saved to /var/cache/conftool/dbconfig/20230412-150557-ladsgroup.json
15:05 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
15:05 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
15:04 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
15:04 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
15:02 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
15:00 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
15:00 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
14:59 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
14:59 otto@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
14:59 otto@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
14:59 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
14:58 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
14:58 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
14:57 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
14:57 otto@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
14:57 otto@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
14:56 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
14:56 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
14:55 otto@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
14:55 otto@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
14:54 otto@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
14:53 otto@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
14:53 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
14:53 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
14:52 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
14:52 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
14:52 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
14:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T333332)', diff saved to https://phabricator.wikimedia.org/P46563 and previous config saved to /var/cache/conftool/dbconfig/20230412-145108-ladsgroup.json
14:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P46562 and previous config saved to /var/cache/conftool/dbconfig/20230412-145051-ladsgroup.json
14:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T333332)', diff saved to https://phabricator.wikimedia.org/P46561 and previous config saved to /var/cache/conftool/dbconfig/20230412-144856-ladsgroup.json
14:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
14:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
14:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2155.codfw.wmnet with reason: Maintenance
14:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2155.codfw.wmnet with reason: Maintenance
14:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T333332)', diff saved to https://phabricator.wikimedia.org/P46560 and previous config saved to /var/cache/conftool/dbconfig/20230412-144815-ladsgroup.json
14:44 otto@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
14:43 otto@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
14:43 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
14:43 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
14:42 otto@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
14:42 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
14:41 otto@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
14:40 moritzm: installing apache security updates on phab1004 (phabricator.wikimedia.org)
14:38 moritzm: installing apache security updates on gerrit1001
14:36 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
14:36 kamila@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
14:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T333332)', diff saved to https://phabricator.wikimedia.org/P46559 and previous config saved to /var/cache/conftool/dbconfig/20230412-143545-ladsgroup.json
14:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1202 (T333332)', diff saved to https://phabricator.wikimedia.org/P46558 and previous config saved to /var/cache/conftool/dbconfig/20230412-143331-ladsgroup.json
14:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
14:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1202.eqiad.wmnet with reason: Maintenance
14:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P46557 and previous config saved to /var/cache/conftool/dbconfig/20230412-143309-ladsgroup.json
14:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T333332)', diff saved to https://phabricator.wikimedia.org/P46556 and previous config saved to /var/cache/conftool/dbconfig/20230412-143308-ladsgroup.json
14:32 kamila@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
14:23 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
14:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46554 and previous config saved to /var/cache/conftool/dbconfig/20230412-142045-root.json
14:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P46553 and previous config saved to /var/cache/conftool/dbconfig/20230412-141802-ladsgroup.json
14:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P46552 and previous config saved to /var/cache/conftool/dbconfig/20230412-141801-ladsgroup.json
14:13 kamila@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
14:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes kswiki --fix # T334277, fixed the one remaining link
14:07 moritzm: re-enabled Puppet in codfw/edges after puppetdb maintenance
14:07 cmooney@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
14:06 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
14:05 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
14:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46550 and previous config saved to /var/cache/conftool/dbconfig/20230412-140540-root.json
14:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P46549 and previous config saved to /var/cache/conftool/dbconfig/20230412-140255-ladsgroup.json
14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T333332)', diff saved to https://phabricator.wikimedia.org/P46548 and previous config saved to /var/cache/conftool/dbconfig/20230412-140045-ladsgroup.json
14:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2147.codfw.wmnet with reason: Maintenance
14:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2147.codfw.wmnet with reason: Maintenance
14:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
14:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46547 and previous config saved to /var/cache/conftool/dbconfig/20230412-135959-ladsgroup.json
13:55 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
13:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46546 and previous config saved to /var/cache/conftool/dbconfig/20230412-135035-root.json
13:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T333332)', diff saved to https://phabricator.wikimedia.org/P46545 and previous config saved to /var/cache/conftool/dbconfig/20230412-134749-ladsgroup.json
13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1194 (T333332)', diff saved to https://phabricator.wikimedia.org/P46544 and previous config saved to /var/cache/conftool/dbconfig/20230412-134535-ladsgroup.json
13:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
13:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1194.eqiad.wmnet with reason: Maintenance
13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T333332)', diff saved to https://phabricator.wikimedia.org/P46543 and previous config saved to /var/cache/conftool/dbconfig/20230412-134512-ladsgroup.json
13:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P46542 and previous config saved to /var/cache/conftool/dbconfig/20230412-134453-ladsgroup.json
13:43 moritzm: stop Puppet in codfw/edges for puppetdb maintenance
13:43 Lucas_WMDE: UTC afternoon backport+config window done
13:39 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for Make VE on officewiki use Parsoid directly (T320529 T333402) (duration: 09m 48s)
13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on puppetdb2002.codfw.wmnet with reason: puppetdb maintenance
13:36 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on puppetdb2002.codfw.wmnet with reason: puppetdb maintenance
13:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46541 and previous config saved to /var/cache/conftool/dbconfig/20230412-133531-root.json
13:34 sukhe: [puppetmaster] sudo /usr/local/sbin/puppet-facts-upload --proxy http://webproxy.eqiad.wmnet:8080 to update PCC
13:30 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde and daniel: Backport for Make VE on officewiki use Parsoid directly (T320529 T333402) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
13:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P46540 and previous config saved to /var/cache/conftool/dbconfig/20230412-133006-ladsgroup.json
13:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P46539 and previous config saved to /var/cache/conftool/dbconfig/20230412-132946-ladsgroup.json
13:29 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for Make VE on officewiki use Parsoid directly (T320529 T333402)
13:28 eoghan: Stopping puppet on gitlab hosts to slow-rollout puppet ssh key management - T333840
13:26 elukey: upload AMD ROCm 5.4 debian packages to wikimedia-bullseye:thirdparty/amd-rocm54 - T295661
13:22 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint2002:~$ mwscript namespaceDupes kswiki --fix | tee >(phaste -t T334277) # P46538; errors on stderr, cf. T328634
13:20 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for GrowthExperiments: enable add link frontend in 7,8th round wikis (T304551 T308133) (duration: 13m 30s)
13:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46537 and previous config saved to /var/cache/conftool/dbconfig/20230412-132026-root.json
13:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P46535 and previous config saved to /var/cache/conftool/dbconfig/20230412-131459-ladsgroup.json
13:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46533 and previous config saved to /var/cache/conftool/dbconfig/20230412-131440-ladsgroup.json
13:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46532 and previous config saved to /var/cache/conftool/dbconfig/20230412-131227-ladsgroup.json
13:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
13:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2138.codfw.wmnet with reason: Maintenance
13:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46531 and previous config saved to /var/cache/conftool/dbconfig/20230412-131204-ladsgroup.json
13:08 lucaswerkmeister-wmde@deploy2002: sgimeno and lucaswerkmeister-wmde: Backport for GrowthExperiments: enable add link frontend in 7,8th round wikis (T304551 T308133) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
13:07 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
13:07 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for GrowthExperiments: enable add link frontend in 7,8th round wikis (T304551 T308133)
13:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46530 and previous config saved to /var/cache/conftool/dbconfig/20230412-130521-root.json
13:03 moritzm: installing nodejs security updates on buster
13:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idm1001.wikimedia.org
12:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T333332)', diff saved to https://phabricator.wikimedia.org/P46529 and previous config saved to /var/cache/conftool/dbconfig/20230412-125953-ladsgroup.json
12:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1191 (T333332)', diff saved to https://phabricator.wikimedia.org/P46528 and previous config saved to /var/cache/conftool/dbconfig/20230412-125739-ladsgroup.json
12:57 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
12:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idm1001.wikimedia.org
12:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1191.eqiad.wmnet with reason: Maintenance
12:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T333332)', diff saved to https://phabricator.wikimedia.org/P46527 and previous config saved to /var/cache/conftool/dbconfig/20230412-125716-ladsgroup.json
12:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P46526 and previous config saved to /var/cache/conftool/dbconfig/20230412-125658-ladsgroup.json
12:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1123 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46525 and previous config saved to /var/cache/conftool/dbconfig/20230412-125016-root.json
12:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P46524 and previous config saved to /var/cache/conftool/dbconfig/20230412-124210-ladsgroup.json
12:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P46523 and previous config saved to /var/cache/conftool/dbconfig/20230412-124151-ladsgroup.json
12:35 moritzm: installing intel-microcode security updates
12:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P46522 and previous config saved to /var/cache/conftool/dbconfig/20230412-122703-ladsgroup.json
12:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46521 and previous config saved to /var/cache/conftool/dbconfig/20230412-122645-ladsgroup.json
12:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46520 and previous config saved to /var/cache/conftool/dbconfig/20230412-122433-ladsgroup.json
12:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
12:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2137.codfw.wmnet with reason: Maintenance
12:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T333332)', diff saved to https://phabricator.wikimedia.org/P46519 and previous config saved to /var/cache/conftool/dbconfig/20230412-122409-ladsgroup.json
12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1120 T334580', diff saved to https://phabricator.wikimedia.org/P46518 and previous config saved to /var/cache/conftool/dbconfig/20230412-121420-marostegui.json
12:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T333332)', diff saved to https://phabricator.wikimedia.org/P46517 and previous config saved to /var/cache/conftool/dbconfig/20230412-121157-ladsgroup.json
12:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1174 (T333332)', diff saved to https://phabricator.wikimedia.org/P46516 and previous config saved to /var/cache/conftool/dbconfig/20230412-120943-ladsgroup.json
12:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
12:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1174.eqiad.wmnet with reason: Maintenance
12:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
12:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P46515 and previous config saved to /var/cache/conftool/dbconfig/20230412-120903-ladsgroup.json
12:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1171.eqiad.wmnet with reason: Maintenance
12:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46514 and previous config saved to /var/cache/conftool/dbconfig/20230412-120853-ladsgroup.json
11:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P46513 and previous config saved to /var/cache/conftool/dbconfig/20230412-115357-ladsgroup.json
11:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P46512 and previous config saved to /var/cache/conftool/dbconfig/20230412-115347-ladsgroup.json
11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T333332)', diff saved to https://phabricator.wikimedia.org/P46509 and previous config saved to /var/cache/conftool/dbconfig/20230412-113850-ladsgroup.json
11:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P46508 and previous config saved to /var/cache/conftool/dbconfig/20230412-113840-ladsgroup.json
11:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T333332)', diff saved to https://phabricator.wikimedia.org/P46507 and previous config saved to /var/cache/conftool/dbconfig/20230412-113638-ladsgroup.json
11:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2136.codfw.wmnet with reason: Maintenance
11:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2136.codfw.wmnet with reason: Maintenance
11:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T333332)', diff saved to https://phabricator.wikimedia.org/P46506 and previous config saved to /var/cache/conftool/dbconfig/20230412-113615-ladsgroup.json
11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46505 and previous config saved to /var/cache/conftool/dbconfig/20230412-112334-ladsgroup.json
11:23 marostegui: dbmaint Upgrade db1106 to mariadb 11.1 (eqiad) T333289
11:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3317 (T333332)', diff saved to https://phabricator.wikimedia.org/P46504 and previous config saved to /var/cache/conftool/dbconfig/20230412-112217-ladsgroup.json
11:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
11:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1170.eqiad.wmnet with reason: Maintenance
11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T333332)', diff saved to https://phabricator.wikimedia.org/P46503 and previous config saved to /var/cache/conftool/dbconfig/20230412-112154-ladsgroup.json
11:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P46502 and previous config saved to /var/cache/conftool/dbconfig/20230412-112108-ladsgroup.json
11:12 moritzm: installing gnutls28 security updates on buster
11:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P46501 and previous config saved to /var/cache/conftool/dbconfig/20230412-110647-ladsgroup.json
11:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P46500 and previous config saved to /var/cache/conftool/dbconfig/20230412-110602-ladsgroup.json
11:00 cgoubert@cumin1001: conftool action : set/pooled=yes; selector: name=mw2448.*.codfw.wmnet
10:59 claime: repooling mw2448.codfw.wmnet - T334429
10:59 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for mw2448.codfw.wmnet
10:59 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for mw2448.codfw.wmnet
10:56 moritzm: installing apache2 security updates on Buster
10:56 moritzm: installing apache2 security updates on Bullseye
10:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46499 and previous config saved to /var/cache/conftool/dbconfig/20230412-105356-root.json
10:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P46498 and previous config saved to /var/cache/conftool/dbconfig/20230412-105141-ladsgroup.json
10:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T333332)', diff saved to https://phabricator.wikimedia.org/P46497 and previous config saved to /var/cache/conftool/dbconfig/20230412-105056-ladsgroup.json
10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T333332)', diff saved to https://phabricator.wikimedia.org/P46496 and previous config saved to /var/cache/conftool/dbconfig/20230412-104843-ladsgroup.json
10:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2119.codfw.wmnet with reason: Maintenance
10:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2119.codfw.wmnet with reason: Maintenance
10:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T333332)', diff saved to https://phabricator.wikimedia.org/P46495 and previous config saved to /var/cache/conftool/dbconfig/20230412-104820-ladsgroup.json
10:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46494 and previous config saved to /var/cache/conftool/dbconfig/20230412-103851-root.json
10:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T333332)', diff saved to https://phabricator.wikimedia.org/P46493 and previous config saved to /var/cache/conftool/dbconfig/20230412-103635-ladsgroup.json
10:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T333332)', diff saved to https://phabricator.wikimedia.org/P46492 and previous config saved to /var/cache/conftool/dbconfig/20230412-103421-ladsgroup.json
10:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
10:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1158.eqiad.wmnet with reason: Maintenance
10:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T333332)', diff saved to https://phabricator.wikimedia.org/P46491 and previous config saved to /var/cache/conftool/dbconfig/20230412-103348-ladsgroup.json
10:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P46490 and previous config saved to /var/cache/conftool/dbconfig/20230412-103314-ladsgroup.json
10:29 hashar@deploy2002: Finished deploy [integration/docroot@ab848e3]: Dummy deploy with dsh file managed by Puppet (duration: 00m 04s)
10:29 hashar@deploy2002: Started deploy [integration/docroot@ab848e3]: Dummy deploy with dsh file managed by Puppet
10:29 hashar@deploy2002: Finished deploy [integration/docroot@ab848e3]: Dummy deploy with dsh file managed by Puppet (duration: 00m 06s)
10:29 hashar@deploy2002: Started deploy [integration/docroot@ab848e3]: Dummy deploy with dsh file managed by Puppet
10:29 hashar@deploy2002: Finished deploy [integration/docroot@ab848e3]: Dummy deploy with dsh file managed by Puppet (duration: 00m 02s)
10:29 hashar@deploy2002: Started deploy [integration/docroot@ab848e3]: Dummy deploy with dsh file managed by Puppet
10:28 hashar@deploy2002: Finished deploy [zuul/deploy@4c6859c]: Dummy deploy with dsh file managed by Puppet (duration: 00m 02s)
10:28 hashar@deploy2002: Started deploy [zuul/deploy@4c6859c]: Dummy deploy with dsh file managed by Puppet
10:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46489 and previous config saved to /var/cache/conftool/dbconfig/20230412-102346-root.json
10:18 Emperor: clearing out 24 ghost objects from Swift T327253
10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P46488 and previous config saved to /var/cache/conftool/dbconfig/20230412-101841-ladsgroup.json
10:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P46487 and previous config saved to /var/cache/conftool/dbconfig/20230412-101808-ladsgroup.json
10:10 cgoubert@deploy2002: Synchronized README: (no justification provided) (duration: 05m 44s)
10:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46486 and previous config saved to /var/cache/conftool/dbconfig/20230412-100841-root.json
10:06 hnowlan@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136', diff saved to https://phabricator.wikimedia.org/P46485 and previous config saved to /var/cache/conftool/dbconfig/20230412-100335-ladsgroup.json
10:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T333332)', diff saved to https://phabricator.wikimedia.org/P46484 and previous config saved to /var/cache/conftool/dbconfig/20230412-100301-ladsgroup.json
10:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1123 to clone db1223 T326669', diff saved to https://phabricator.wikimedia.org/P46482 and previous config saved to /var/cache/conftool/dbconfig/20230412-100111-marostegui.json
10:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2110 (T333332)', diff saved to https://phabricator.wikimedia.org/P46481 and previous config saved to /var/cache/conftool/dbconfig/20230412-100049-ladsgroup.json
10:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2110.codfw.wmnet with reason: Maintenance
10:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2110.codfw.wmnet with reason: Maintenance
10:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T333332)', diff saved to https://phabricator.wikimedia.org/P46480 and previous config saved to /var/cache/conftool/dbconfig/20230412-100026-ladsgroup.json
09:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46479 and previous config saved to /var/cache/conftool/dbconfig/20230412-095336-root.json
09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1136 (T333332)', diff saved to https://phabricator.wikimedia.org/P46478 and previous config saved to /var/cache/conftool/dbconfig/20230412-094829-ladsgroup.json
09:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1136 (T333332)', diff saved to https://phabricator.wikimedia.org/P46477 and previous config saved to /var/cache/conftool/dbconfig/20230412-094615-ladsgroup.json
09:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1136.eqiad.wmnet with reason: Maintenance
09:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1136.eqiad.wmnet with reason: Maintenance
09:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T333332)', diff saved to https://phabricator.wikimedia.org/P46476 and previous config saved to /var/cache/conftool/dbconfig/20230412-094551-ladsgroup.json
09:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P46475 and previous config saved to /var/cache/conftool/dbconfig/20230412-094520-ladsgroup.json
09:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46474 and previous config saved to /var/cache/conftool/dbconfig/20230412-093831-root.json
09:34 claime: Reverted migrating cxserver to mw-api-int on kubernetes - T334204
09:34 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
09:34 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
09:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P46473 and previous config saved to /var/cache/conftool/dbconfig/20230412-093045-ladsgroup.json
09:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P46472 and previous config saved to /var/cache/conftool/dbconfig/20230412-093013-ladsgroup.json
09:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46470 and previous config saved to /var/cache/conftool/dbconfig/20230412-092327-root.json
09:21 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab2003.wikimedia.org with OS bullseye
09:21 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
09:20 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
09:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P46469 and previous config saved to /var/cache/conftool/dbconfig/20230412-091539-ladsgroup.json
09:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T333332)', diff saved to https://phabricator.wikimedia.org/P46468 and previous config saved to /var/cache/conftool/dbconfig/20230412-091507-ladsgroup.json
09:13 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
09:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T333332)', diff saved to https://phabricator.wikimedia.org/P46467 and previous config saved to /var/cache/conftool/dbconfig/20230412-091255-ladsgroup.json
09:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2106.codfw.wmnet with reason: Maintenance
09:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2106.codfw.wmnet with reason: Maintenance
09:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2099.codfw.wmnet with reason: Maintenance
09:12 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
09:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2099.codfw.wmnet with reason: Maintenance
09:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
09:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
09:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T333332)', diff saved to https://phabricator.wikimedia.org/P46466 and previous config saved to /var/cache/conftool/dbconfig/20230412-091151-ladsgroup.json
09:11 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
09:11 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
09:07 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
09:06 claime: Migrating cxserver to mw-api-int on kubernetes - T334204
09:04 jelto@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
09:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 (T333332)', diff saved to https://phabricator.wikimedia.org/P46464 and previous config saved to /var/cache/conftool/dbconfig/20230412-090032-ladsgroup.json
08:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 (T333332)', diff saved to https://phabricator.wikimedia.org/P46463 and previous config saved to /var/cache/conftool/dbconfig/20230412-085816-ladsgroup.json
08:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1127.eqiad.wmnet with reason: Maintenance
08:57 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1127.eqiad.wmnet with reason: Maintenance
08:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P46462 and previous config saved to /var/cache/conftool/dbconfig/20230412-085644-ladsgroup.json
08:51 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
08:51 aqu@deploy2002: Finished deploy [airflow-dags/analytics@18ae3be]: Deploy airflow-dags including webrequest load job - Analytics [airflow-dags@18ae3be] (duration: 00m 12s)
08:50 aqu@deploy2002: Started deploy [airflow-dags/analytics@18ae3be]: Deploy airflow-dags including webrequest load job - Analytics [airflow-dags@18ae3be]
08:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P46460 and previous config saved to /var/cache/conftool/dbconfig/20230412-084138-ladsgroup.json
08:37 marostegui: dbmaint Deploy schema change on s1 codfw with replication T334536
08:35 aqu@deploy2002: Finished deploy [analytics/refinery@f3389dc] (thin): Deploy analytics_refinery in production thin [analytics/refinery@f3389dc] (duration: 00m 07s)
08:35 aqu@deploy2002: Started deploy [analytics/refinery@f3389dc] (thin): Deploy analytics_refinery in production thin [analytics/refinery@f3389dc]
08:35 moritzm: imported puppet 5.5.22-2+deb12u1 for bookworm-wikimedia component/puppet5 T330495
08:34 aqu@deploy2002: Finished deploy [analytics/refinery@f3389dc]: Deploy analytics_refinery in production [analytics/refinery@f3389dc] (duration: 00m 41s)
08:34 aqu@deploy2002: Started deploy [analytics/refinery@f3389dc]: Deploy analytics_refinery in production [analytics/refinery@f3389dc]
08:33 aqu: About to deploy analytics/refinery in production
08:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1199 (T333332)', diff saved to https://phabricator.wikimedia.org/P46459 and previous config saved to /var/cache/conftool/dbconfig/20230412-082632-ladsgroup.json
08:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1199 (T333332)', diff saved to https://phabricator.wikimedia.org/P46458 and previous config saved to /var/cache/conftool/dbconfig/20230412-082424-ladsgroup.json
08:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1199.eqiad.wmnet with reason: Maintenance
08:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1199.eqiad.wmnet with reason: Maintenance
08:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T333332)', diff saved to https://phabricator.wikimedia.org/P46457 and previous config saved to /var/cache/conftool/dbconfig/20230412-082400-ladsgroup.json
08:17 hashar@deploy2002: Synchronized wmf-config/CommonSettings-labs.php: [Beta Cluster] Replicate WebResponseSetCookie wgHooks migration here too - T333926 (duration: 05m 51s)
08:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P46456 and previous config saved to /var/cache/conftool/dbconfig/20230412-080854-ladsgroup.json
08:03 marostegui: dbmaint Deploy schema change on s3 codfw with replication enabled (only for testwiki and test2wiki) T334536
08:01 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab2003.wikimedia.org with OS bullseye
07:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1218 (re)pooling @ 100%: Pooling db1218 T326669', diff saved to https://phabricator.wikimedia.org/P46455 and previous config saved to /var/cache/conftool/dbconfig/20230412-075703-root.json
07:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46454 and previous config saved to /var/cache/conftool/dbconfig/20230412-075422-root.json
07:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P46453 and previous config saved to /var/cache/conftool/dbconfig/20230412-075348-ladsgroup.json
07:45 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
07:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1218 (re)pooling @ 75%: Pooling db1218 T326669', diff saved to https://phabricator.wikimedia.org/P46451 and previous config saved to /var/cache/conftool/dbconfig/20230412-074158-root.json
07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1107 from dbctl T334447', diff saved to https://phabricator.wikimedia.org/P46450 and previous config saved to /var/cache/conftool/dbconfig/20230412-073921-marostegui.json
07:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46449 and previous config saved to /var/cache/conftool/dbconfig/20230412-073917-root.json
07:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 (T333332)', diff saved to https://phabricator.wikimedia.org/P46448 and previous config saved to /var/cache/conftool/dbconfig/20230412-073841-ladsgroup.json
07:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 (T333332)', diff saved to https://phabricator.wikimedia.org/P46447 and previous config saved to /var/cache/conftool/dbconfig/20230412-073633-ladsgroup.json
07:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1190.eqiad.wmnet with reason: Maintenance
07:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1190.eqiad.wmnet with reason: Maintenance
07:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
07:36 moritzm: installing python-cryptography security updates
07:35 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T333332)', diff saved to https://phabricator.wikimedia.org/P46446 and previous config saved to /var/cache/conftool/dbconfig/20230412-073550-ladsgroup.json
07:30 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
07:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1222 (re)pooling @ 75%: Pooling', diff saved to https://phabricator.wikimedia.org/P46445 and previous config saved to /var/cache/conftool/dbconfig/20230412-072812-root.json
07:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1218 (re)pooling @ 50%: Pooling db1218 T326669', diff saved to https://phabricator.wikimedia.org/P46444 and previous config saved to /var/cache/conftool/dbconfig/20230412-072654-root.json
07:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46443 and previous config saved to /var/cache/conftool/dbconfig/20230412-072412-root.json
07:21 moritzm: installing xen security updates
07:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P46442 and previous config saved to /var/cache/conftool/dbconfig/20230412-072044-ladsgroup.json
07:16 marostegui: Drop flaggedrevs tables from ptwikisource T332594
07:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1222 (re)pooling @ 50%: Pooling', diff saved to https://phabricator.wikimedia.org/P46441 and previous config saved to /var/cache/conftool/dbconfig/20230412-071307-root.json
07:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1218 (re)pooling @ 25%: Pooling db1218 T326669', diff saved to https://phabricator.wikimedia.org/P46440 and previous config saved to /var/cache/conftool/dbconfig/20230412-071149-root.json
07:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46439 and previous config saved to /var/cache/conftool/dbconfig/20230412-070907-root.json
07:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P46438 and previous config saved to /var/cache/conftool/dbconfig/20230412-070538-ladsgroup.json
06:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1222 (re)pooling @ 25%: Pooling', diff saved to https://phabricator.wikimedia.org/P46437 and previous config saved to /var/cache/conftool/dbconfig/20230412-065802-root.json
06:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1218 (re)pooling @ 10%: Pooling db1218 T326669', diff saved to https://phabricator.wikimedia.org/P46436 and previous config saved to /var/cache/conftool/dbconfig/20230412-065644-root.json
06:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46435 and previous config saved to /var/cache/conftool/dbconfig/20230412-065402-root.json
06:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T333332)', diff saved to https://phabricator.wikimedia.org/P46434 and previous config saved to /var/cache/conftool/dbconfig/20230412-065032-ladsgroup.json
06:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T333332)', diff saved to https://phabricator.wikimedia.org/P46433 and previous config saved to /var/cache/conftool/dbconfig/20230412-064823-ladsgroup.json
06:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1149.eqiad.wmnet with reason: Maintenance
06:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1149.eqiad.wmnet with reason: Maintenance
06:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T333332)', diff saved to https://phabricator.wikimedia.org/P46432 and previous config saved to /var/cache/conftool/dbconfig/20230412-064800-ladsgroup.json
06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1222 (re)pooling @ 10%: Pooling', diff saved to https://phabricator.wikimedia.org/P46431 and previous config saved to /var/cache/conftool/dbconfig/20230412-064257-root.json
06:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1218 (re)pooling @ 5%: Pooling db1218 T326669', diff saved to https://phabricator.wikimedia.org/P46430 and previous config saved to /var/cache/conftool/dbconfig/20230412-064139-root.json
06:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46429 and previous config saved to /var/cache/conftool/dbconfig/20230412-063858-root.json
06:38 vgutierrez: restart haproxy on cp2035 - T334448
06:33 marostegui: Stop mariadb on db1121 to clone db1221; this will generate lag on clouddb replicas for s4 T326669
06:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P46427 and previous config saved to /var/cache/conftool/dbconfig/20230412-063253-ladsgroup.json
06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1121 to clone db1221 T326669', diff saved to https://phabricator.wikimedia.org/P46426 and previous config saved to /var/cache/conftool/dbconfig/20230412-063224-marostegui.json
06:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1222 (re)pooling @ 5%: Pooling', diff saved to https://phabricator.wikimedia.org/P46425 and previous config saved to /var/cache/conftool/dbconfig/20230412-062752-root.json
06:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1218 (re)pooling @ 4%: Pooling db1218 T326669', diff saved to https://phabricator.wikimedia.org/P46424 and previous config saved to /var/cache/conftool/dbconfig/20230412-062634-root.json
06:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46423 and previous config saved to /var/cache/conftool/dbconfig/20230412-062353-root.json
06:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148', diff saved to https://phabricator.wikimedia.org/P46422 and previous config saved to /var/cache/conftool/dbconfig/20230412-061747-ladsgroup.json
06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1222 (re)pooling @ 4%: Pooling', diff saved to https://phabricator.wikimedia.org/P46421 and previous config saved to /var/cache/conftool/dbconfig/20230412-061248-root.json
06:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1218 (re)pooling @ 3%: Pooling db1218 T326669', diff saved to https://phabricator.wikimedia.org/P46420 and previous config saved to /var/cache/conftool/dbconfig/20230412-061129-root.json
06:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1148 (T333332)', diff saved to https://phabricator.wikimedia.org/P46419 and previous config saved to /var/cache/conftool/dbconfig/20230412-060241-ladsgroup.json
06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T333332)', diff saved to https://phabricator.wikimedia.org/P46418 and previous config saved to /var/cache/conftool/dbconfig/20230412-060133-ladsgroup.json
06:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1148.eqiad.wmnet with reason: Maintenance
06:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1148.eqiad.wmnet with reason: Maintenance
06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T333332)', diff saved to https://phabricator.wikimedia.org/P46417 and previous config saved to /var/cache/conftool/dbconfig/20230412-060109-ladsgroup.json
05:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1222 (re)pooling @ 3%: Pooling', diff saved to https://phabricator.wikimedia.org/P46416 and previous config saved to /var/cache/conftool/dbconfig/20230412-055743-root.json
05:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1218 (re)pooling @ 2%: Pooling db1218 T326669', diff saved to https://phabricator.wikimedia.org/P46415 and previous config saved to /var/cache/conftool/dbconfig/20230412-055624-root.json
05:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P46414 and previous config saved to /var/cache/conftool/dbconfig/20230412-054603-ladsgroup.json
05:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110 to clone db1210 T326669', diff saved to https://phabricator.wikimedia.org/P46412 and previous config saved to /var/cache/conftool/dbconfig/20230412-054258-marostegui.json
05:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1222 (re)pooling @ 2%: Pooling', diff saved to https://phabricator.wikimedia.org/P46411 and previous config saved to /var/cache/conftool/dbconfig/20230412-054238-root.json
05:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1218 (re)pooling @ 1%: Pooling db1218 T326669', diff saved to https://phabricator.wikimedia.org/P46410 and previous config saved to /var/cache/conftool/dbconfig/20230412-054120-root.json
05:41 krinkle@deploy2002: Synchronized php-1.41.0-wmf.4/includes/libs/objectcache/: Ie3a2215d33: disable WANCache cool-off feature (duration: 06m 00s)
05:40 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1218 to dbctl T326669', diff saved to https://phabricator.wikimedia.org/P46409 and previous config saved to /var/cache/conftool/dbconfig/20230412-054024-marostegui.json
05:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147', diff saved to https://phabricator.wikimedia.org/P46408 and previous config saved to /var/cache/conftool/dbconfig/20230412-053057-ladsgroup.json
05:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1222 (re)pooling @ 1%: Pooling', diff saved to https://phabricator.wikimedia.org/P46407 and previous config saved to /var/cache/conftool/dbconfig/20230412-052733-root.json
05:25 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1222 to dbctl T326669', diff saved to https://phabricator.wikimedia.org/P46406 and previous config saved to /var/cache/conftool/dbconfig/20230412-052504-marostegui.json
05:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T333332)', diff saved to https://phabricator.wikimedia.org/P46405 and previous config saved to /var/cache/conftool/dbconfig/20230412-051550-ladsgroup.json
05:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1147 (T333332)', diff saved to https://phabricator.wikimedia.org/P46404 and previous config saved to /var/cache/conftool/dbconfig/20230412-051342-ladsgroup.json
05:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1147.eqiad.wmnet with reason: Maintenance
05:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1147.eqiad.wmnet with reason: Maintenance
05:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46403 and previous config saved to /var/cache/conftool/dbconfig/20230412-051319-ladsgroup.json
04:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P46402 and previous config saved to /var/cache/conftool/dbconfig/20230412-045813-ladsgroup.json
04:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P46401 and previous config saved to /var/cache/conftool/dbconfig/20230412-044306-ladsgroup.json
04:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46400 and previous config saved to /var/cache/conftool/dbconfig/20230412-042800-ladsgroup.json
04:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46399 and previous config saved to /var/cache/conftool/dbconfig/20230412-042552-ladsgroup.json
04:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
04:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1146.eqiad.wmnet with reason: Maintenance
04:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
04:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1145.eqiad.wmnet with reason: Maintenance
04:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46398 and previous config saved to /var/cache/conftool/dbconfig/20230412-042510-ladsgroup.json
04:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P46397 and previous config saved to /var/cache/conftool/dbconfig/20230412-041003-ladsgroup.json
03:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314', diff saved to https://phabricator.wikimedia.org/P46396 and previous config saved to /var/cache/conftool/dbconfig/20230412-035457-ladsgroup.json
03:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46395 and previous config saved to /var/cache/conftool/dbconfig/20230412-033951-ladsgroup.json
03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T333332)', diff saved to https://phabricator.wikimedia.org/P46394 and previous config saved to /var/cache/conftool/dbconfig/20230412-033742-ladsgroup.json
03:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
03:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1144.eqiad.wmnet with reason: Maintenance
03:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T333332)', diff saved to https://phabricator.wikimedia.org/P46393 and previous config saved to /var/cache/conftool/dbconfig/20230412-033719-ladsgroup.json
03:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P46392 and previous config saved to /var/cache/conftool/dbconfig/20230412-032213-ladsgroup.json
03:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P46391 and previous config saved to /var/cache/conftool/dbconfig/20230412-030707-ladsgroup.json
02:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1143 (T333332)', diff saved to https://phabricator.wikimedia.org/P46390 and previous config saved to /var/cache/conftool/dbconfig/20230412-025200-ladsgroup.json
02:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1143 (T333332)', diff saved to https://phabricator.wikimedia.org/P46389 and previous config saved to /var/cache/conftool/dbconfig/20230412-024952-ladsgroup.json
02:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1143.eqiad.wmnet with reason: Maintenance
02:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1143.eqiad.wmnet with reason: Maintenance
02:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T333332)', diff saved to https://phabricator.wikimedia.org/P46388 and previous config saved to /var/cache/conftool/dbconfig/20230412-024929-ladsgroup.json
02:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P46387 and previous config saved to /var/cache/conftool/dbconfig/20230412-023422-ladsgroup.json
02:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P46386 and previous config saved to /var/cache/conftool/dbconfig/20230412-021916-ladsgroup.json
02:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T333332)', diff saved to https://phabricator.wikimedia.org/P46385 and previous config saved to /var/cache/conftool/dbconfig/20230412-020410-ladsgroup.json
02:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T333332)', diff saved to https://phabricator.wikimedia.org/P46384 and previous config saved to /var/cache/conftool/dbconfig/20230412-020201-ladsgroup.json
02:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1142.eqiad.wmnet with reason: Maintenance
02:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1142.eqiad.wmnet with reason: Maintenance
02:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T333332)', diff saved to https://phabricator.wikimedia.org/P46383 and previous config saved to /var/cache/conftool/dbconfig/20230412-020138-ladsgroup.json
01:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P46382 and previous config saved to /var/cache/conftool/dbconfig/20230412-014632-ladsgroup.json
01:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P46381 and previous config saved to /var/cache/conftool/dbconfig/20230412-013126-ladsgroup.json
01:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T333332)', diff saved to https://phabricator.wikimedia.org/P46380 and previous config saved to /var/cache/conftool/dbconfig/20230412-011619-ladsgroup.json
01:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T333332)', diff saved to https://phabricator.wikimedia.org/P46379 and previous config saved to /var/cache/conftool/dbconfig/20230412-011411-ladsgroup.json
01:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1141.eqiad.wmnet with reason: Maintenance
01:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1141.eqiad.wmnet with reason: Maintenance
01:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 (T333332)', diff saved to https://phabricator.wikimedia.org/P46378 and previous config saved to /var/cache/conftool/dbconfig/20230412-011348-ladsgroup.json
01:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T333332)', diff saved to https://phabricator.wikimedia.org/P46377 and previous config saved to /var/cache/conftool/dbconfig/20230412-010832-ladsgroup.json
00:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P46376 and previous config saved to /var/cache/conftool/dbconfig/20230412-005841-ladsgroup.json
00:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P46375 and previous config saved to /var/cache/conftool/dbconfig/20230412-005325-ladsgroup.json
00:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138', diff saved to https://phabricator.wikimedia.org/P46374 and previous config saved to /var/cache/conftool/dbconfig/20230412-004335-ladsgroup.json
00:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P46373 and previous config saved to /var/cache/conftool/dbconfig/20230412-003819-ladsgroup.json
00:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1138 (T333332)', diff saved to https://phabricator.wikimedia.org/P46372 and previous config saved to /var/cache/conftool/dbconfig/20230412-002829-ladsgroup.json
00:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1138 (T333332)', diff saved to https://phabricator.wikimedia.org/P46371 and previous config saved to /var/cache/conftool/dbconfig/20230412-002620-ladsgroup.json
00:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1138.eqiad.wmnet with reason: Maintenance
00:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1138.eqiad.wmnet with reason: Maintenance
00:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T333332)', diff saved to https://phabricator.wikimedia.org/P46370 and previous config saved to /var/cache/conftool/dbconfig/20230412-002557-ladsgroup.json
00:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T333332)', diff saved to https://phabricator.wikimedia.org/P46369 and previous config saved to /var/cache/conftool/dbconfig/20230412-002312-ladsgroup.json
00:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P46368 and previous config saved to /var/cache/conftool/dbconfig/20230412-001051-ladsgroup.json

2023-04-11

23:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P46367 and previous config saved to /var/cache/conftool/dbconfig/20230411-235544-ladsgroup.json
23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T333332)', diff saved to https://phabricator.wikimedia.org/P46366 and previous config saved to /var/cache/conftool/dbconfig/20230411-235225-ladsgroup.json
23:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2177.codfw.wmnet with reason: Maintenance
23:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2177.codfw.wmnet with reason: Maintenance
23:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T333332)', diff saved to https://phabricator.wikimedia.org/P46365 and previous config saved to /var/cache/conftool/dbconfig/20230411-235202-ladsgroup.json
23:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T333332)', diff saved to https://phabricator.wikimedia.org/P46364 and previous config saved to /var/cache/conftool/dbconfig/20230411-234038-ladsgroup.json
23:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T333332)', diff saved to https://phabricator.wikimedia.org/P46363 and previous config saved to /var/cache/conftool/dbconfig/20230411-233930-ladsgroup.json
23:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
23:39 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
23:39 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1121.eqiad.wmnet with reason: Maintenance
23:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1121.eqiad.wmnet with reason: Maintenance
23:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P46362 and previous config saved to /var/cache/conftool/dbconfig/20230411-233655-ladsgroup.json
23:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P46361 and previous config saved to /var/cache/conftool/dbconfig/20230411-232149-ladsgroup.json
23:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T333332)', diff saved to https://phabricator.wikimedia.org/P46360 and previous config saved to /var/cache/conftool/dbconfig/20230411-230643-ladsgroup.json
22:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2156 (T333332)', diff saved to https://phabricator.wikimedia.org/P46359 and previous config saved to /var/cache/conftool/dbconfig/20230411-223732-ladsgroup.json
22:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
22:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2186.codfw.wmnet with reason: Maintenance
22:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2156.codfw.wmnet with reason: Maintenance
22:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2156.codfw.wmnet with reason: Maintenance
22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T333332)', diff saved to https://phabricator.wikimedia.org/P46358 and previous config saved to /var/cache/conftool/dbconfig/20230411-223651-ladsgroup.json
22:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P46357 and previous config saved to /var/cache/conftool/dbconfig/20230411-222145-ladsgroup.json
22:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P46356 and previous config saved to /var/cache/conftool/dbconfig/20230411-220638-ladsgroup.json
21:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T333332)', diff saved to https://phabricator.wikimedia.org/P46355 and previous config saved to /var/cache/conftool/dbconfig/20230411-215132-ladsgroup.json
21:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T333332)', diff saved to https://phabricator.wikimedia.org/P46354 and previous config saved to /var/cache/conftool/dbconfig/20230411-212053-ladsgroup.json
21:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2149.codfw.wmnet with reason: Maintenance
21:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2149.codfw.wmnet with reason: Maintenance
20:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
20:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2139.codfw.wmnet with reason: Maintenance
20:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T333332)', diff saved to https://phabricator.wikimedia.org/P46353 and previous config saved to /var/cache/conftool/dbconfig/20230411-205239-ladsgroup.json
20:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P46352 and previous config saved to /var/cache/conftool/dbconfig/20230411-203733-ladsgroup.json
20:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P46351 and previous config saved to /var/cache/conftool/dbconfig/20230411-202227-ladsgroup.json
20:19 mforns@deploy2002: Finished deploy [airflow-dags/analytics@fcc4c9b]: (no justification provided) (duration: 00m 11s)
20:19 mforns@deploy2002: Started deploy [airflow-dags/analytics@fcc4c9b]: (no justification provided)
20:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T333332)', diff saved to https://phabricator.wikimedia.org/P46350 and previous config saved to /var/cache/conftool/dbconfig/20230411-200720-ladsgroup.json
20:05 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
19:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2127 (T333332)', diff saved to https://phabricator.wikimedia.org/P46349 and previous config saved to /var/cache/conftool/dbconfig/20230411-193640-ladsgroup.json
19:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2127.codfw.wmnet with reason: Maintenance
19:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2127.codfw.wmnet with reason: Maintenance
19:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T333332)', diff saved to https://phabricator.wikimedia.org/P46348 and previous config saved to /var/cache/conftool/dbconfig/20230411-193628-ladsgroup.json
19:31 ejegg: payments-wiki upgraded from ad6e5801 to 153bdf64
19:29 ejegg: civicrm upgraded from e2fdb4a4 to 0f37f981
19:22 andrew@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['cloudvirtlocal1003']
19:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P46347 and previous config saved to /var/cache/conftool/dbconfig/20230411-192122-ladsgroup.json
19:19 eileen: civicrm upgraded from b573aee4 to e2fdb4a4
19:16 andrew@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirtlocal1003']
19:16 andrew@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['cloudvirtlocal1002']
19:10 andrew@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirtlocal1002']
19:08 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
19:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P46346 and previous config saved to /var/cache/conftool/dbconfig/20230411-190616-ladsgroup.json
19:05 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
19:05 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:59 ebysans@deploy2002: Finished deploy [airflow-dags/analytics@d2cd28d]: (no justification provided) (duration: 00m 11s)
18:59 ebysans@deploy2002: Started deploy [airflow-dags/analytics@d2cd28d]: (no justification provided)
18:58 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:57 andrew@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['cloudvirtlocal1001']
18:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T333332)', diff saved to https://phabricator.wikimedia.org/P46345 and previous config saved to /var/cache/conftool/dbconfig/20230411-185110-ladsgroup.json
18:50 andrew@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirtlocal1001']
18:38 demon@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.4 refs T330210
18:32 zabe@deploy2002: Finished scap: close wowikiquote (T334482) (duration: 06m 46s)
18:25 zabe@deploy2002: Started scap: close wowikiquote (T334482)
18:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T333332)', diff saved to https://phabricator.wikimedia.org/P46344 and previous config saved to /var/cache/conftool/dbconfig/20230411-182024-ladsgroup.json
18:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2109.codfw.wmnet with reason: Maintenance
18:20 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirtlocal1001.eqiad.wmnet with OS bullseye
18:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2109.codfw.wmnet with reason: Maintenance
18:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
18:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
18:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T333332)', diff saved to https://phabricator.wikimedia.org/P46343 and previous config saved to /var/cache/conftool/dbconfig/20230411-181123-ladsgroup.json
17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs3006.esams.wmnet with OS bullseye
17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P46342 and previous config saved to /var/cache/conftool/dbconfig/20230411-175617-ladsgroup.json
17:42 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs3006.esams.wmnet with reason: host reimage
17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P46341 and previous config saved to /var/cache/conftool/dbconfig/20230411-174110-ladsgroup.json
17:38 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs3006.esams.wmnet with reason: host reimage
17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T333332)', diff saved to https://phabricator.wikimedia.org/P46340 and previous config saved to /var/cache/conftool/dbconfig/20230411-172604-ladsgroup.json
17:17 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs3006.esams.wmnet with OS bullseye
17:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 (T333332)', diff saved to https://phabricator.wikimedia.org/P46339 and previous config saved to /var/cache/conftool/dbconfig/20230411-171600-ladsgroup.json
17:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1198.eqiad.wmnet with reason: Maintenance
17:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1198.eqiad.wmnet with reason: Maintenance
17:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T333332)', diff saved to https://phabricator.wikimedia.org/P46338 and previous config saved to /var/cache/conftool/dbconfig/20230411-171537-ladsgroup.json
17:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P46337 and previous config saved to /var/cache/conftool/dbconfig/20230411-170031-ladsgroup.json
16:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P46336 and previous config saved to /var/cache/conftool/dbconfig/20230411-164524-ladsgroup.json
16:33 sbassett: Deployed security mitigation update for T333140
16:33 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
16:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T333332)', diff saved to https://phabricator.wikimedia.org/P46335 and previous config saved to /var/cache/conftool/dbconfig/20230411-163018-ladsgroup.json
16:30 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
16:29 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
16:27 mforns@deploy2002: Finished deploy [airflow-dags/analytics@ce3d4d6]: (no justification provided) (duration: 00m 11s)
16:27 mforns@deploy2002: Started deploy [airflow-dags/analytics@ce3d4d6]: (no justification provided)
16:23 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
16:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 (T333332)', diff saved to https://phabricator.wikimedia.org/P46334 and previous config saved to /var/cache/conftool/dbconfig/20230411-162020-ladsgroup.json
16:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
16:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1189.eqiad.wmnet with reason: Maintenance
16:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T333332)', diff saved to https://phabricator.wikimedia.org/P46333 and previous config saved to /var/cache/conftool/dbconfig/20230411-161956-ladsgroup.json
16:19 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
16:19 brett: Disable Puppet/PyBal on lvs3006 in preparation for reimaging - T321309
16:18 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
16:12 hnowlan@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
16:11 hnowlan@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
16:09 hnowlan@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
16:08 hnowlan@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
16:07 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-worker1132.eqiad.wmnet with reason: More tests are needed before the host can be added to prod
16:06 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-worker1132.eqiad.wmnet with reason: More tests are needed before the host can be added to prod
16:05 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1132.eqiad.wmnet with OS buster
16:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P46332 and previous config saved to /var/cache/conftool/dbconfig/20230411-160450-ladsgroup.json
15:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P46331 and previous config saved to /var/cache/conftool/dbconfig/20230411-154943-ladsgroup.json
15:37 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1132.eqiad.wmnet with reason: host reimage
15:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T333332)', diff saved to https://phabricator.wikimedia.org/P46330 and previous config saved to /var/cache/conftool/dbconfig/20230411-153437-ladsgroup.json
15:34 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1132.eqiad.wmnet with reason: host reimage
15:33 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: installation failed due to read-only database
15:32 aokoth@cumin1001: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: installation failed due to read-only database
15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T333332)', diff saved to https://phabricator.wikimedia.org/P46329 and previous config saved to /var/cache/conftool/dbconfig/20230411-152438-ladsgroup.json
15:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
15:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance
15:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T333332)', diff saved to https://phabricator.wikimedia.org/P46328 and previous config saved to /var/cache/conftool/dbconfig/20230411-152413-ladsgroup.json
15:21 moritzm: installing xen security updates
15:13 ebysans@deploy2002: Finished deploy [analytics/refinery@f3389dc] (hadoop-test): Update pageview hourly table with referer data field TEST [analytics/refinery@f3389dc] (duration: 01m 28s)
15:11 ebysans@deploy2002: Started deploy [analytics/refinery@f3389dc] (hadoop-test): Update pageview hourly table with referer data field TEST [analytics/refinery@f3389dc]
15:10 ebysans@deploy2002: Finished deploy [analytics/refinery@f3389dc] (thin): Update pageview hourly table with referer data field THIN [analytics/refinery@f3389dc] (duration: 00m 08s)
15:10 ebysans@deploy2002: Started deploy [analytics/refinery@f3389dc] (thin): Update pageview hourly table with referer data field THIN [analytics/refinery@f3389dc]
15:09 ebysans@deploy2002: Finished deploy [analytics/refinery@f3389dc]: Update pageview hourly table with referer data field [analytics/refinery@f3389dc] (duration: 05m 34s)
15:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P46327 and previous config saved to /var/cache/conftool/dbconfig/20230411-150907-ladsgroup.json
15:03 ebysans@deploy2002: Started deploy [analytics/refinery@f3389dc]: Update pageview hourly table with referer data field [analytics/refinery@f3389dc]
15:01 SandraEbele: deploying analytics refinery to update hive pageview hourly table with referer_data field.
14:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P46326 and previous config saved to /var/cache/conftool/dbconfig/20230411-145401-ladsgroup.json
14:53 SandraEbele: paused pageview hourly job.
14:51 herron@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:51 herron@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add kafka-logging1005 ipv6 - herron@cumin1001"
14:48 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host an-worker1132.eqiad.wmnet with OS buster
14:47 herron@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add kafka-logging1005 ipv6 - herron@cumin1001"
14:45 herron@cumin1001: START - Cookbook sre.dns.netbox
14:42 moritzm: installing Tomcat security updates
14:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T333332)', diff saved to https://phabricator.wikimedia.org/P46325 and previous config saved to /var/cache/conftool/dbconfig/20230411-143854-ladsgroup.json
14:34 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
14:34 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: service=thumbor,name=kubernetes201[0123].codfw.wmnet
14:29 jnuche@deploy2002: Installing scap version "4.49.0" for 590 hosts
14:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T333332)', diff saved to https://phabricator.wikimedia.org/P46324 and previous config saved to /var/cache/conftool/dbconfig/20230411-142857-ladsgroup.json
14:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
14:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1166.eqiad.wmnet with reason: Maintenance
14:27 jnuche@deploy2002: Installing scap version "4.49.0" for 590 hosts
14:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
14:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1150.eqiad.wmnet with reason: Maintenance
14:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T333332)', diff saved to https://phabricator.wikimedia.org/P46323 and previous config saved to /var/cache/conftool/dbconfig/20230411-141944-ladsgroup.json
14:16 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on mw2448.codfw.wmnet with reason: HW failure
14:16 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on mw2448.codfw.wmnet with reason: HW failure
14:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P46321 and previous config saved to /var/cache/conftool/dbconfig/20230411-140438-ladsgroup.json
14:00 claime: Revoking kafka_main-codfw_broker and kafka_main-eqiad_broker puppet CA certs - T319372
13:55 elukey: remove old puppet certificates for kafka main brokers from A:kafka-main - T319372
13:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P46320 and previous config saved to /var/cache/conftool/dbconfig/20230411-134932-ladsgroup.json
13:46 elukey: powercycle analytics1069, down for some days now, host stuck from the mgmt/serial console
13:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T333332)', diff saved to https://phabricator.wikimedia.org/P46319 and previous config saved to /var/cache/conftool/dbconfig/20230411-133425-ladsgroup.json
13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T333332)', diff saved to https://phabricator.wikimedia.org/P46318 and previous config saved to /var/cache/conftool/dbconfig/20230411-132348-ladsgroup.json
13:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1123.eqiad.wmnet with reason: Maintenance
13:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1123.eqiad.wmnet with reason: Maintenance
13:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T333332)', diff saved to https://phabricator.wikimedia.org/P46317 and previous config saved to /var/cache/conftool/dbconfig/20230411-132324-ladsgroup.json
13:21 taavi@deploy2002: Finished scap: Backport for Deploy Nearby feature on most wikis [2/2] (T334079) (duration: 08m 25s)
13:14 taavi@deploy2002: wmde-fisch and taavi: Backport for Deploy Nearby feature on most wikis [2/2] (T334079) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
13:13 taavi@deploy2002: Started scap: Backport for Deploy Nearby feature on most wikis [2/2] (T334079)
13:11 taavi@deploy2002: Finished scap: Backport for Deploy Nearby feature on most wikis [1/2] (T334079) (duration: 07m 24s)
13:09 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab2003.wikimedia.org with OS bullseye
13:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P46316 and previous config saved to /var/cache/conftool/dbconfig/20230411-130817-ladsgroup.json
13:05 taavi@deploy2002: taavi and wmde-fisch: Backport for Deploy Nearby feature on most wikis [1/2] (T334079) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
13:04 taavi@deploy2002: Started scap: Backport for Deploy Nearby feature on most wikis [1/2] (T334079)
12:54 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
12:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P46315 and previous config saved to /var/cache/conftool/dbconfig/20230411-125310-ladsgroup.json
12:50 jelto@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
12:38 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
12:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T333332)', diff saved to https://phabricator.wikimedia.org/P46314 and previous config saved to /var/cache/conftool/dbconfig/20230411-123803-ladsgroup.json
12:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T333332)', diff saved to https://phabricator.wikimedia.org/P46313 and previous config saved to /var/cache/conftool/dbconfig/20230411-122735-ladsgroup.json
12:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
12:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
12:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1112.eqiad.wmnet with reason: Maintenance
12:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1112.eqiad.wmnet with reason: Maintenance
12:24 cgoubert@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2448.*.codfw.wmnet
12:24 claime: Setting mw2448.codfw.wmnet to pooled=inactive - T334429
12:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1102.eqiad.wmnet with reason: Maintenance
12:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1102.eqiad.wmnet with reason: Maintenance
12:16 ladsgroup@cumin1001: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 8:00:00 on db1113.eqiad.wmnet with reason: Maintenance
12:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1113.eqiad.wmnet with reason: Maintenance
11:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46312 and previous config saved to /var/cache/conftool/dbconfig/20230411-115137-root.json
11:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46311 and previous config saved to /var/cache/conftool/dbconfig/20230411-113631-root.json
11:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46310 and previous config saved to /var/cache/conftool/dbconfig/20230411-112126-root.json
11:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: Pooling', diff saved to https://phabricator.wikimedia.org/P46309 and previous config saved to /var/cache/conftool/dbconfig/20230411-111854-root.json
11:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46308 and previous config saved to /var/cache/conftool/dbconfig/20230411-110621-root.json
11:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: Pooling', diff saved to https://phabricator.wikimedia.org/P46307 and previous config saved to /var/cache/conftool/dbconfig/20230411-110349-root.json
10:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46306 and previous config saved to /var/cache/conftool/dbconfig/20230411-105116-root.json
10:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1107 T334447', diff saved to https://phabricator.wikimedia.org/P46305 and previous config saved to /var/cache/conftool/dbconfig/20230411-105100-marostegui.json
10:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: Pooling', diff saved to https://phabricator.wikimedia.org/P46304 and previous config saved to /var/cache/conftool/dbconfig/20230411-104844-root.json
10:36 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes201[0123].codfw.wmnet
10:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46303 and previous config saved to /var/cache/conftool/dbconfig/20230411-103611-root.json
10:36 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
10:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: Pooling', diff saved to https://phabricator.wikimedia.org/P46302 and previous config saved to /var/cache/conftool/dbconfig/20230411-103339-root.json
10:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46301 and previous config saved to /var/cache/conftool/dbconfig/20230411-102106-root.json
10:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: Pooling', diff saved to https://phabricator.wikimedia.org/P46300 and previous config saved to /var/cache/conftool/dbconfig/20230411-101835-root.json
10:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: Pooling', diff saved to https://phabricator.wikimedia.org/P46298 and previous config saved to /var/cache/conftool/dbconfig/20230411-100330-root.json
09:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 4%: Pooling', diff saved to https://phabricator.wikimedia.org/P46297 and previous config saved to /var/cache/conftool/dbconfig/20230411-094825-root.json
09:44 jelto@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host gitlab2003.wikimedia.org with OS bullseye
09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 3%: Pooling', diff saved to https://phabricator.wikimedia.org/P46296 and previous config saved to /var/cache/conftool/dbconfig/20230411-093320-root.json
09:27 Amir1: start of watchlist clean up of a user in wikidatawiki (T328501)
09:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1211 (re)pooling @ 100%: Pooling', diff saved to https://phabricator.wikimedia.org/P46295 and previous config saved to /var/cache/conftool/dbconfig/20230411-092224-root.json
09:20 moritzm: installing nodejs security updates on buster
09:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 2%: Pooling', diff saved to https://phabricator.wikimedia.org/P46294 and previous config saved to /var/cache/conftool/dbconfig/20230411-091815-root.json
09:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1211 (re)pooling @ 75%: Pooling', diff saved to https://phabricator.wikimedia.org/P46293 and previous config saved to /var/cache/conftool/dbconfig/20230411-090720-root.json
09:04 moritzm: installing pcre2 security updates
09:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 1%: Pooling', diff saved to https://phabricator.wikimedia.org/P46292 and previous config saved to /var/cache/conftool/dbconfig/20230411-090310-root.json
08:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1122 to clone db1222 T326669', diff saved to https://phabricator.wikimedia.org/P46290 and previous config saved to /var/cache/conftool/dbconfig/20230411-085654-marostegui.json
08:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1211 (re)pooling @ 50%: Pooling', diff saved to https://phabricator.wikimedia.org/P46289 and previous config saved to /var/cache/conftool/dbconfig/20230411-085215-root.json
08:50 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1211 (re)pooling @ 25%: Pooling', diff saved to https://phabricator.wikimedia.org/P46288 and previous config saved to /var/cache/conftool/dbconfig/20230411-083710-root.json
08:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1209 (re)pooling @ 100%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46287 and previous config saved to /var/cache/conftool/dbconfig/20230411-083339-root.json
08:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46286 and previous config saved to /var/cache/conftool/dbconfig/20230411-083106-root.json
08:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 100%: Pooling', diff saved to https://phabricator.wikimedia.org/P46285 and previous config saved to /var/cache/conftool/dbconfig/20230411-082521-root.json
08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1211 (re)pooling @ 10%: Pooling', diff saved to https://phabricator.wikimedia.org/P46284 and previous config saved to /var/cache/conftool/dbconfig/20230411-082205-root.json
08:19 aqu@deploy2002: Finished deploy [analytics/refinery@bed78f6] (hadoop-test): Deploy analytics_refinery including last webrequest load scripts in TEST 3rd try [analytics/refinery@bed78f6] (duration: 01m 25s)
08:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1209 (re)pooling @ 75%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46283 and previous config saved to /var/cache/conftool/dbconfig/20230411-081834-root.json
08:18 aqu@deploy2002: Started deploy [analytics/refinery@bed78f6] (hadoop-test): Deploy analytics_refinery including last webrequest load scripts in TEST 3rd try [analytics/refinery@bed78f6]
08:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46282 and previous config saved to /var/cache/conftool/dbconfig/20230411-081601-root.json
08:15 aqu: About to deploy analytics/refinery (To migrate webrequest load from Oozie to Airflow)
08:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 75%: Pooling', diff saved to https://phabricator.wikimedia.org/P46281 and previous config saved to /var/cache/conftool/dbconfig/20230411-081016-root.json
08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1211 (re)pooling @ 5%: Pooling', diff saved to https://phabricator.wikimedia.org/P46280 and previous config saved to /var/cache/conftool/dbconfig/20230411-080700-root.json
08:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1209 (re)pooling @ 50%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46279 and previous config saved to /var/cache/conftool/dbconfig/20230411-080329-root.json
08:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46278 and previous config saved to /var/cache/conftool/dbconfig/20230411-080057-root.json
07:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 50%: Pooling', diff saved to https://phabricator.wikimedia.org/P46277 and previous config saved to /var/cache/conftool/dbconfig/20230411-075511-root.json
07:54 vgutierrez: restart haproxy on cp2033 - T334448
07:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1211 (re)pooling @ 4%: Pooling', diff saved to https://phabricator.wikimedia.org/P46276 and previous config saved to /var/cache/conftool/dbconfig/20230411-075155-root.json
07:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1209 (re)pooling @ 25%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46275 and previous config saved to /var/cache/conftool/dbconfig/20230411-074824-root.json
07:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46274 and previous config saved to /var/cache/conftool/dbconfig/20230411-074552-root.json
07:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 25%: Pooling', diff saved to https://phabricator.wikimedia.org/P46273 and previous config saved to /var/cache/conftool/dbconfig/20230411-074006-root.json
07:39 jelto@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host gitlab2003.wikimedia.org with OS bullseye
07:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1103.eqiad.wmnet
07:39 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:39 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1103.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:37 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1103.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1211 (re)pooling @ 3%: Pooling', diff saved to https://phabricator.wikimedia.org/P46272 and previous config saved to /var/cache/conftool/dbconfig/20230411-073651-root.json
07:35 marostegui@cumin1001: START - Cookbook sre.dns.netbox
07:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1209 (re)pooling @ 10%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46271 and previous config saved to /var/cache/conftool/dbconfig/20230411-073319-root.json
07:30 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1103.eqiad.wmnet
07:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46270 and previous config saved to /var/cache/conftool/dbconfig/20230411-073047-root.json
07:30 dcausse: restarting blazegraph on wdqs1007 (stuck for 48 hours)
07:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 10%: Pooling', diff saved to https://phabricator.wikimedia.org/P46269 and previous config saved to /var/cache/conftool/dbconfig/20230411-072501-root.json
07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1211 (re)pooling @ 2%: Pooling', diff saved to https://phabricator.wikimedia.org/P46268 and previous config saved to /var/cache/conftool/dbconfig/20230411-072146-root.json
07:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1209 (re)pooling @ 5%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46267 and previous config saved to /var/cache/conftool/dbconfig/20230411-071815-root.json
07:18 zabe@deploy2002: Finished scap: Backport for Add blkwiki to wgSitename (T334351) (duration: 08m 08s)
07:16 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1103 from dbctl T332293', diff saved to https://phabricator.wikimedia.org/P46266 and previous config saved to /var/cache/conftool/dbconfig/20230411-071647-marostegui.json
07:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46265 and previous config saved to /var/cache/conftool/dbconfig/20230411-071542-root.json
07:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 393731
07:11 zabe@deploy2002: zabe and jhsoby: Backport for Add blkwiki to wgSitename (T334351) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
07:11 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 393731
07:11 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 150279
07:10 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 150279
07:10 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 35467
07:10 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 35467
07:10 zabe@deploy2002: Started scap: Backport for Add blkwiki to wgSitename (T334351)
07:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 5%: Pooling', diff saved to https://phabricator.wikimedia.org/P46264 and previous config saved to /var/cache/conftool/dbconfig/20230411-070956-root.json
07:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1211 (re)pooling @ 1%: Pooling', diff saved to https://phabricator.wikimedia.org/P46263 and previous config saved to /var/cache/conftool/dbconfig/20230411-070641-root.json
07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1211 to dbctl T326669', diff saved to https://phabricator.wikimedia.org/P46262 and previous config saved to /var/cache/conftool/dbconfig/20230411-070609-marostegui.json
07:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1209 (re)pooling @ 4%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46261 and previous config saved to /var/cache/conftool/dbconfig/20230411-070310-root.json
07:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46260 and previous config saved to /var/cache/conftool/dbconfig/20230411-070037-root.json
06:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1118 T334375', diff saved to https://phabricator.wikimedia.org/P46258 and previous config saved to /var/cache/conftool/dbconfig/20230411-065734-marostegui.json
06:56 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1163 to s1 primary T334375', diff saved to https://phabricator.wikimedia.org/P46257 and previous config saved to /var/cache/conftool/dbconfig/20230411-065639-root.json
06:56 marostegui: Starting s1 eqiad failover from db1118 to db1163 - T334375
06:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 4%: Pooling', diff saved to https://phabricator.wikimedia.org/P46256 and previous config saved to /var/cache/conftool/dbconfig/20230411-065452-root.json
06:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1209 (re)pooling @ 3%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46255 and previous config saved to /var/cache/conftool/dbconfig/20230411-064805-root.json
06:43 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
06:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 3%: Pooling', diff saved to https://phabricator.wikimedia.org/P46254 and previous config saved to /var/cache/conftool/dbconfig/20230411-063947-root.json
06:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1209 (re)pooling @ 2%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46252 and previous config saved to /var/cache/conftool/dbconfig/20230411-063300-root.json
06:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 2%: Pooling', diff saved to https://phabricator.wikimedia.org/P46251 and previous config saved to /var/cache/conftool/dbconfig/20230411-062442-root.json
06:21 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1163 with weight 0 T334375', diff saved to https://phabricator.wikimedia.org/P46250 and previous config saved to /var/cache/conftool/dbconfig/20230411-062127-root.json
06:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 37 hosts with reason: Primary switchover s1 T334375
06:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 37 hosts with reason: Primary switchover s1 T334375
06:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1209 (re)pooling @ 1%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46249 and previous config saved to /var/cache/conftool/dbconfig/20230411-061755-root.json
06:16 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1209 to dbctl T326206', diff saved to https://phabricator.wikimedia.org/P46248 and previous config saved to /var/cache/conftool/dbconfig/20230411-061642-marostegui.json
06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110 to clone db1210 T326669', diff saved to https://phabricator.wikimedia.org/P46246 and previous config saved to /var/cache/conftool/dbconfig/20230411-061044-marostegui.json
06:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1224 (re)pooling @ 1%: Pooling', diff saved to https://phabricator.wikimedia.org/P46245 and previous config saved to /var/cache/conftool/dbconfig/20230411-060937-root.json
06:09 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1224 to dbctl T326206', diff saved to https://phabricator.wikimedia.org/P46244 and previous config saved to /var/cache/conftool/dbconfig/20230411-060922-marostegui.json
05:45 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Swakiyama out of all services on: 814 hosts
05:45 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Swakiyama out of all services on: 814 hosts
05:44 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Swakiyama out of all services on: 1241 hosts
05:43 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Swakiyama out of all services on: 1241 hosts
04:10 eileen: civicrm upgraded from bc2f5ccc to b573aee4
03:54 mwpresync@deploy2002: Pruned MediaWiki: 1.41.0-wmf.2 (duration: 02m 15s)
03:52 mwpresync@deploy2002: Finished scap: testwikis wikis to 1.41.0-wmf.4 refs T330210 (duration: 49m 57s)
03:02 mwpresync@deploy2002: Started scap: testwikis wikis to 1.41.0-wmf.4 refs T330210
00:37 eileen: civicrm upgraded from 001e156a to bc2f5ccc
00:13 eileen: civicrm upgraded from 223f655a to 001e156a

2023-04-10

23:07 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts miscweb1002.eqiad.wmnet
23:07 dzahn@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
23:07 dzahn@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: miscweb1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1001"
23:06 dzahn@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: miscweb1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1001"
23:00 dzahn@cumin1001: START - Cookbook sre.dns.netbox
22:55 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts miscweb1002.eqiad.wmnet
22:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on miscweb1002.eqiad.wmnet with reason: decom
22:53 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on miscweb1002.eqiad.wmnet with reason: decom
21:53 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs3005.esams.wmnet with OS bullseye
21:46 urandom: restarting Cassandra, sessionstore1001-a, to restore native transport settings — T327954
21:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs3005.esams.wmnet with reason: host reimage
21:33 eevans@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host sessionstore1001.eqiad.wmnet
21:32 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs3005.esams.wmnet with reason: host reimage
21:31 urandom: restarting Cassandra, sessionstore1002-a — T327954
21:22 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
21:21 eevans@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host sessionstore1001.eqiad.wmnet
21:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs3005.esams.wmnet with OS bullseye
21:14 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs3005.esams.wmnet with OS bullseye
21:13 sbassett: Deployed updated security mitigation for T333140
21:10 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
21:08 eevans@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host sessionstore1001.eqiad.wmnet
21:06 urandom: restarting Cassandra, sessionstore1003-a — T327954
21:04 urandom: restarting Cassandra, sessionstore1002-a — T327954
20:57 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
20:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs3005.esams.wmnet with reason: host reimage
20:36 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs3005.esams.wmnet with reason: host reimage
20:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs3005.esams.wmnet with OS bullseye
20:09 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED
20:07 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED
20:07 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
20:05 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
19:53 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirtlocal1003']
19:52 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirtlocal1003']
19:52 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirtlocal1002']
19:52 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirtlocal1002']
19:51 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudvirtlocal1001']
19:51 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudvirtlocal1001']
19:48 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
19:35 brett: Disable Puppet/PyBal on lvs3005 in preparation for reimaging - T321309
19:25 mutante: mw2488 - scap pull - T334429
19:22 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs6002.drmrs.wmnet
19:22 sukhe@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs6002.drmrs.wmnet
19:19 mforns@deploy2002: Finished deploy [airflow-dags/analytics@6d6f1ec]: (no justification provided) (duration: 00m 11s)
19:19 mforns@deploy2002: Started deploy [airflow-dags/analytics@6d6f1ec]: (no justification provided)
19:16 mutante: power-cycling mw2448 - down, no console output T334429
19:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs6002.drmrs.wmnet with OS bullseye
18:46 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs6002.drmrs.wmnet with reason: host reimage
18:43 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs6002.drmrs.wmnet with reason: host reimage
18:34 herron@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:34 herron@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add kafka-logging1004 ipv6 - herron@cumin1001"
18:33 herron@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add kafka-logging1004 ipv6 - herron@cumin1001"
18:31 herron@cumin1001: START - Cookbook sre.dns.netbox
18:22 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs6002.drmrs.wmnet with OS bullseye
18:16 krinkle@deploy2002: Synchronized wmf-config/: (no justification provided) (duration: 587m 34s)
17:29 brett: Disable Puppet/PyBal on lvs6002 in preparation for reimaging - T321309
16:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs6001.drmrs.wmnet with OS bullseye
16:31 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs6001.drmrs.wmnet with reason: host reimage
16:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs6001.drmrs.wmnet with reason: host reimage
16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs6001.drmrs.wmnet with OS bullseye
15:53 herron: centrallog1002:~# systemctl restart rsyslog
15:46 brett: Disable Puppet/PyBal on lvs6001 in preparation for reimaging - T321309
14:57 sukhe: enable puppet on A:lvs and A:ulsfo to merge 906580
14:52 sukhe: disable puppet on A:lvs and A:ulsfo to merge 906580
14:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46242 and previous config saved to /var/cache/conftool/dbconfig/20230410-141052-root.json
13:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46241 and previous config saved to /var/cache/conftool/dbconfig/20230410-135547-root.json
13:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46240 and previous config saved to /var/cache/conftool/dbconfig/20230410-134042-root.json
13:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46239 and previous config saved to /var/cache/conftool/dbconfig/20230410-132538-root.json
13:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46238 and previous config saved to /var/cache/conftool/dbconfig/20230410-131033-root.json
12:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46237 and previous config saved to /var/cache/conftool/dbconfig/20230410-125528-root.json
12:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46236 and previous config saved to /var/cache/conftool/dbconfig/20230410-124023-root.json
12:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1183 (re)pooling @ 100%: Pooling T334080', diff saved to https://phabricator.wikimedia.org/P46235 and previous config saved to /var/cache/conftool/dbconfig/20230410-122112-root.json
12:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1183 (re)pooling @ 75%: Pooling T334080', diff saved to https://phabricator.wikimedia.org/P46234 and previous config saved to /var/cache/conftool/dbconfig/20230410-120607-root.json
11:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1183 (re)pooling @ 50%: Pooling T334080', diff saved to https://phabricator.wikimedia.org/P46233 and previous config saved to /var/cache/conftool/dbconfig/20230410-115102-root.json
11:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 100%: Pooling', diff saved to https://phabricator.wikimedia.org/P46232 and previous config saved to /var/cache/conftool/dbconfig/20230410-114733-root.json
11:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1183 (re)pooling @ 25%: Pooling T334080', diff saved to https://phabricator.wikimedia.org/P46231 and previous config saved to /var/cache/conftool/dbconfig/20230410-113557-root.json
11:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 75%: Pooling', diff saved to https://phabricator.wikimedia.org/P46230 and previous config saved to /var/cache/conftool/dbconfig/20230410-113228-root.json
11:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1201 to clone db1224 T326669', diff saved to https://phabricator.wikimedia.org/P46228 and previous config saved to /var/cache/conftool/dbconfig/20230410-112524-marostegui.json
11:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1183 (re)pooling @ 10%: Pooling T334080', diff saved to https://phabricator.wikimedia.org/P46227 and previous config saved to /var/cache/conftool/dbconfig/20230410-112052-root.json
11:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 50%: Pooling', diff saved to https://phabricator.wikimedia.org/P46226 and previous config saved to /var/cache/conftool/dbconfig/20230410-111723-root.json
11:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1183 (re)pooling @ 5%: Pooling T334080', diff saved to https://phabricator.wikimedia.org/P46225 and previous config saved to /var/cache/conftool/dbconfig/20230410-110548-root.json
11:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 25%: Pooling', diff saved to https://phabricator.wikimedia.org/P46224 and previous config saved to /var/cache/conftool/dbconfig/20230410-110218-root.json
10:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1183 (re)pooling @ 4%: Pooling T334080', diff saved to https://phabricator.wikimedia.org/P46222 and previous config saved to /var/cache/conftool/dbconfig/20230410-105043-root.json
10:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 10%: Pooling', diff saved to https://phabricator.wikimedia.org/P46221 and previous config saved to /var/cache/conftool/dbconfig/20230410-104714-root.json
10:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1183 (re)pooling @ 3%: Pooling T334080', diff saved to https://phabricator.wikimedia.org/P46220 and previous config saved to /var/cache/conftool/dbconfig/20230410-103538-root.json
10:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 5%: Pooling', diff saved to https://phabricator.wikimedia.org/P46219 and previous config saved to /var/cache/conftool/dbconfig/20230410-103209-root.json
10:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1183 (re)pooling @ 2%: Pooling T334080', diff saved to https://phabricator.wikimedia.org/P46218 and previous config saved to /var/cache/conftool/dbconfig/20230410-102033-root.json
10:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 4%: Pooling', diff saved to https://phabricator.wikimedia.org/P46217 and previous config saved to /var/cache/conftool/dbconfig/20230410-101704-root.json
10:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1183 (re)pooling @ 1%: Pooling T334080', diff saved to https://phabricator.wikimedia.org/P46216 and previous config saved to /var/cache/conftool/dbconfig/20230410-100528-root.json
10:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 3%: Pooling', diff saved to https://phabricator.wikimedia.org/P46215 and previous config saved to /var/cache/conftool/dbconfig/20230410-100159-root.json
09:58 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1183 to s5 depooled T334080', diff saved to https://phabricator.wikimedia.org/P46214 and previous config saved to /var/cache/conftool/dbconfig/20230410-095846-marostegui.json
09:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46213 and previous config saved to /var/cache/conftool/dbconfig/20230410-095530-root.json
09:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 2%: Pooling', diff saved to https://phabricator.wikimedia.org/P46212 and previous config saved to /var/cache/conftool/dbconfig/20230410-094654-root.json
09:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46211 and previous config saved to /var/cache/conftool/dbconfig/20230410-094025-root.json
09:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1109 (re)pooling @ 1%: Pooling', diff saved to https://phabricator.wikimedia.org/P46210 and previous config saved to /var/cache/conftool/dbconfig/20230410-093149-root.json
09:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46209 and previous config saved to /var/cache/conftool/dbconfig/20230410-092520-root.json
09:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46207 and previous config saved to /var/cache/conftool/dbconfig/20230410-091015-root.json
09:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46206 and previous config saved to /var/cache/conftool/dbconfig/20230410-090141-root.json
08:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46205 and previous config saved to /var/cache/conftool/dbconfig/20230410-085511-root.json
08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1220 (re)pooling @ 100%: Pooling', diff saved to https://phabricator.wikimedia.org/P46204 and previous config saved to /var/cache/conftool/dbconfig/20230410-085117-root.json
08:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46203 and previous config saved to /var/cache/conftool/dbconfig/20230410-084636-root.json
08:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46202 and previous config saved to /var/cache/conftool/dbconfig/20230410-084006-root.json
08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1220 (re)pooling @ 75%: Pooling', diff saved to https://phabricator.wikimedia.org/P46201 and previous config saved to /var/cache/conftool/dbconfig/20230410-083613-root.json
08:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46200 and previous config saved to /var/cache/conftool/dbconfig/20230410-083131-root.json
08:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 4%: Repooling', diff saved to https://phabricator.wikimedia.org/P46199 and previous config saved to /var/cache/conftool/dbconfig/20230410-082501-root.json
08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1220 (re)pooling @ 50%: Pooling', diff saved to https://phabricator.wikimedia.org/P46198 and previous config saved to /var/cache/conftool/dbconfig/20230410-082108-root.json
08:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46197 and previous config saved to /var/cache/conftool/dbconfig/20230410-081626-root.json
08:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 3%: Repooling', diff saved to https://phabricator.wikimedia.org/P46196 and previous config saved to /var/cache/conftool/dbconfig/20230410-080956-root.json
08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1220 (re)pooling @ 25%: Pooling', diff saved to https://phabricator.wikimedia.org/P46195 and previous config saved to /var/cache/conftool/dbconfig/20230410-080603-root.json
08:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46194 and previous config saved to /var/cache/conftool/dbconfig/20230410-080121-root.json
08:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 100%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46193 and previous config saved to /var/cache/conftool/dbconfig/20230410-080115-root.json
07:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 2%: Repooling', diff saved to https://phabricator.wikimedia.org/P46192 and previous config saved to /var/cache/conftool/dbconfig/20230410-075451-root.json
07:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1220 (re)pooling @ 10%: Pooling', diff saved to https://phabricator.wikimedia.org/P46191 and previous config saved to /var/cache/conftool/dbconfig/20230410-075058-root.json
07:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46190 and previous config saved to /var/cache/conftool/dbconfig/20230410-074617-root.json
07:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 75%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46189 and previous config saved to /var/cache/conftool/dbconfig/20230410-074610-root.json
07:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46188 and previous config saved to /var/cache/conftool/dbconfig/20230410-073947-root.json
07:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1220 (re)pooling @ 5%: Pooling', diff saved to https://phabricator.wikimedia.org/P46187 and previous config saved to /var/cache/conftool/dbconfig/20230410-073553-root.json
07:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1163 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46186 and previous config saved to /var/cache/conftool/dbconfig/20230410-073112-root.json
07:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 50%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46185 and previous config saved to /var/cache/conftool/dbconfig/20230410-073105-root.json
07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1163', diff saved to https://phabricator.wikimedia.org/P46184 and previous config saved to /var/cache/conftool/dbconfig/20230410-072206-marostegui.json
07:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1220 (re)pooling @ 4%: Pooling', diff saved to https://phabricator.wikimedia.org/P46183 and previous config saved to /var/cache/conftool/dbconfig/20230410-072048-root.json
07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1109 T326669', diff saved to https://phabricator.wikimedia.org/P46181 and previous config saved to /var/cache/conftool/dbconfig/20230410-071747-marostegui.json
07:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 25%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46180 and previous config saved to /var/cache/conftool/dbconfig/20230410-071600-root.json
07:09 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46179 and previous config saved to /var/cache/conftool/dbconfig/20230410-070948-root.json
07:09 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts db1101.eqiad.wmnet
07:09 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:09 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1101.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1220 (re)pooling @ 3%: Pooling', diff saved to https://phabricator.wikimedia.org/P46178 and previous config saved to /var/cache/conftool/dbconfig/20230410-070544-root.json
07:05 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1101.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:03 marostegui@cumin1001: START - Cookbook sre.dns.netbox
07:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 10%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46177 and previous config saved to /var/cache/conftool/dbconfig/20230410-070056-root.json
06:58 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1101.eqiad.wmnet
06:54 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46176 and previous config saved to /var/cache/conftool/dbconfig/20230410-065443-root.json
06:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103 T334374', diff saved to https://phabricator.wikimedia.org/P46175 and previous config saved to /var/cache/conftool/dbconfig/20230410-065149-marostegui.json
06:50 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1179 to x1 primary T334374', diff saved to https://phabricator.wikimedia.org/P46174 and previous config saved to /var/cache/conftool/dbconfig/20230410-065047-root.json
06:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1220 (re)pooling @ 2%: Pooling', diff saved to https://phabricator.wikimedia.org/P46173 and previous config saved to /var/cache/conftool/dbconfig/20230410-065039-root.json
06:50 marostegui: Starting x1 eqiad failover from db1103 to db1179 - T334374
06:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 5%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46172 and previous config saved to /var/cache/conftool/dbconfig/20230410-064551-root.json
06:39 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46171 and previous config saved to /var/cache/conftool/dbconfig/20230410-063939-root.json
06:39 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1179 with weight 0 T334374', diff saved to https://phabricator.wikimedia.org/P46170 and previous config saved to /var/cache/conftool/dbconfig/20230410-063916-root.json
06:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 12 hosts with reason: Primary switchover x1 T334374
06:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 12 hosts with reason: Primary switchover x1 T334374
06:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1220 (re)pooling @ 1%: Pooling', diff saved to https://phabricator.wikimedia.org/P46169 and previous config saved to /var/cache/conftool/dbconfig/20230410-063534-root.json
06:35 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1220 to dbctl T326669', diff saved to https://phabricator.wikimedia.org/P46168 and previous config saved to /var/cache/conftool/dbconfig/20230410-063458-marostegui.json
06:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 4%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46167 and previous config saved to /var/cache/conftool/dbconfig/20230410-063046-root.json
06:24 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46166 and previous config saved to /var/cache/conftool/dbconfig/20230410-062434-root.json
06:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 3%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46165 and previous config saved to /var/cache/conftool/dbconfig/20230410-061541-root.json
06:09 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46164 and previous config saved to /var/cache/conftool/dbconfig/20230410-060929-root.json
06:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 2%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46163 and previous config saved to /var/cache/conftool/dbconfig/20230410-060037-root.json
05:54 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46162 and previous config saved to /var/cache/conftool/dbconfig/20230410-055424-root.json
05:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1161 T334080', diff saved to https://phabricator.wikimedia.org/P46160 and previous config saved to /var/cache/conftool/dbconfig/20230410-055005-marostegui.json
05:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1207 (re)pooling @ 1%: Pooling T326669', diff saved to https://phabricator.wikimedia.org/P46159 and previous config saved to /var/cache/conftool/dbconfig/20230410-054532-root.json
05:45 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1207 to dbctl T326669', diff saved to https://phabricator.wikimedia.org/P46158 and previous config saved to /var/cache/conftool/dbconfig/20230410-054504-marostegui.json
05:39 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46157 and previous config saved to /var/cache/conftool/dbconfig/20230410-053919-root.json

2023-04-08

17:57 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1073']

2023-04-07

18:19 xcollazo@deploy2002: Finished deploy [airflow-dags/platform_eng@5c4ebda]: (no justification provided) (duration: 00m 35s)
18:18 xcollazo@deploy2002: Started deploy [airflow-dags/platform_eng@5c4ebda]: (no justification provided)
17:02 urandom: restart Cassandra, sessionstore1001-a (re-enabling CQL) — T327954
11:05 aqu@deploy2002: Finished deploy [analytics/refinery@e70da10] (hadoop-test): Deploy analytics_refinery including last webrequest load scripts in TEST 2nd try [analytics/refinery@e70da10] (duration: 01m 33s)
11:03 aqu@deploy2002: Started deploy [analytics/refinery@e70da10] (hadoop-test): Deploy analytics_refinery including last webrequest load scripts in TEST 2nd try [analytics/refinery@e70da10]
10:40 aqu@deploy2002: Finished deploy [analytics/refinery@eb4c2b2] (hadoop-test): Deploy analytics_refinery including last webrequest load scripts in TEST [analytics/refinery@eb4c2b2] (duration: 00m 06s)
10:40 aqu@deploy2002: Started deploy [analytics/refinery@eb4c2b2] (hadoop-test): Deploy analytics_refinery including last webrequest load scripts in TEST [analytics/refinery@eb4c2b2]
10:34 aqu: About to deploy analytics/refinery in test cluster
09:23 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:23 ayounsi@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sonicmgmt - ayounsi@cumin1001"
09:22 ayounsi@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: sonicmgmt - ayounsi@cumin1001"
09:20 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
01:17 urandom: rebooting sessionstore1001 — T327954
01:10 urandom: rebooting sessionstore1001 — T327954
01:02 urandom: rebooting sessionstore1001 — T327954
00:39 urandom: rebooting sessionstore1001 — T327954

2023-04-06

22:05 ejegg: SmashPig upgraded from 7c19151f to 24d700f4
22:04 ejegg: payments-wiki upgraded from 75b068a1 to 0f15a101
21:52 sbassett: Deployed updated mitigation for T333140
21:19 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
21:18 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
21:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T333332)', diff saved to https://phabricator.wikimedia.org/P46154 and previous config saved to /var/cache/conftool/dbconfig/20230406-211054-ladsgroup.json
21:05 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
21:04 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
21:02 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
21:02 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
21:00 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
21:00 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
20:59 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
20:57 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
20:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P46153 and previous config saved to /var/cache/conftool/dbconfig/20230406-205548-ladsgroup.json
20:53 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
20:50 eevans@cumin1001: conftool action : set/pooled=yes; selector: name=ms-fe1014.eqiad.wmnet
20:49 eevans@cumin1001: conftool action : set/pooled=yes; selector: name=ms-fe1013.eqiad.wmnet
20:49 eevans@cumin1001: conftool action : set/weight=40; selector: name=ms-fe1014.eqiad.wmnet
20:49 eevans@cumin1001: conftool action : set/weight=40; selector: name=ms-fe1013.eqiad.wmnet
20:45 eevans@cumin1001: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:swift-fe-eqiad
20:45 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "remove info for new ssw as need to set back to planned to make homer happy - cmooney@cumin1001 - T322937"
20:43 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "remove info for new ssw as need to set back to planned to make homer happy - cmooney@cumin1001 - T322937"
20:41 eevans@cumin1001: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
20:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P46152 and previous config saved to /var/cache/conftool/dbconfig/20230406-204041-ladsgroup.json
20:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T333332)', diff saved to https://phabricator.wikimedia.org/P46151 and previous config saved to /var/cache/conftool/dbconfig/20230406-202535-ladsgroup.json
20:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2180 (T333332)', diff saved to https://phabricator.wikimedia.org/P46150 and previous config saved to /var/cache/conftool/dbconfig/20230406-202319-ladsgroup.json
20:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2180.codfw.wmnet with reason: Maintenance
20:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2180.codfw.wmnet with reason: Maintenance
20:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T333332)', diff saved to https://phabricator.wikimedia.org/P46149 and previous config saved to /var/cache/conftool/dbconfig/20230406-202256-ladsgroup.json
20:16 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe1014.eqiad.wmnet
20:15 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe1013.eqiad.wmnet
20:09 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-fe1014.eqiad.wmnet
20:09 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-fe1013.eqiad.wmnet
20:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P46148 and previous config saved to /var/cache/conftool/dbconfig/20230406-200750-ladsgroup.json
19:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P46147 and previous config saved to /var/cache/conftool/dbconfig/20230406-195243-ladsgroup.json
19:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T333332)', diff saved to https://phabricator.wikimedia.org/P46146 and previous config saved to /var/cache/conftool/dbconfig/20230406-193737-ladsgroup.json
19:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3316 (T333332)', diff saved to https://phabricator.wikimedia.org/P46145 and previous config saved to /var/cache/conftool/dbconfig/20230406-193510-ladsgroup.json
19:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
19:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2171.codfw.wmnet with reason: Maintenance
19:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T333332)', diff saved to https://phabricator.wikimedia.org/P46144 and previous config saved to /var/cache/conftool/dbconfig/20230406-193447-ladsgroup.json
19:26 mforns@deploy2002: Finished deploy [airflow-dags/analytics@b454afd]: (no justification provided) (duration: 00m 11s)
19:26 mforns@deploy2002: Started deploy [airflow-dags/analytics@b454afd]: (no justification provided)
19:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P46143 and previous config saved to /var/cache/conftool/dbconfig/20230406-191941-ladsgroup.json
19:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P46142 and previous config saved to /var/cache/conftool/dbconfig/20230406-190435-ladsgroup.json
18:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T333332)', diff saved to https://phabricator.wikimedia.org/P46141 and previous config saved to /var/cache/conftool/dbconfig/20230406-184929-ladsgroup.json
18:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3316 (T333332)', diff saved to https://phabricator.wikimedia.org/P46140 and previous config saved to /var/cache/conftool/dbconfig/20230406-184701-ladsgroup.json
18:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
18:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2169.codfw.wmnet with reason: Maintenance
18:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T333332)', diff saved to https://phabricator.wikimedia.org/P46139 and previous config saved to /var/cache/conftool/dbconfig/20230406-184638-ladsgroup.json
18:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P46138 and previous config saved to /var/cache/conftool/dbconfig/20230406-183132-ladsgroup.json
18:18 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs3007.esams.wmnet with OS bullseye
18:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P46137 and previous config saved to /var/cache/conftool/dbconfig/20230406-181625-ladsgroup.json
18:02 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs3007.esams.wmnet with reason: host reimage
18:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T333332)', diff saved to https://phabricator.wikimedia.org/P46136 and previous config saved to /var/cache/conftool/dbconfig/20230406-180119-ladsgroup.json
17:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2158 (T333332)', diff saved to https://phabricator.wikimedia.org/P46135 and previous config saved to /var/cache/conftool/dbconfig/20230406-175854-ladsgroup.json
17:58 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs3007.esams.wmnet with reason: host reimage
17:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
17:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on db2187.codfw.wmnet with reason: Maintenance
17:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2158.codfw.wmnet with reason: Maintenance
17:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2158.codfw.wmnet with reason: Maintenance
17:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T333332)', diff saved to https://phabricator.wikimedia.org/P46134 and previous config saved to /var/cache/conftool/dbconfig/20230406-175813-ladsgroup.json
17:49 volans@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling update on A:netbox
17:49 volans@cumin1001: START - Cookbook sre.netbox.update-extras rolling update on A:netbox
17:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P46133 and previous config saved to /var/cache/conftool/dbconfig/20230406-174306-ladsgroup.json
17:36 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs3007.esams.wmnet with OS bullseye
17:34 volans@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling update on A:netbox
17:34 volans@cumin1001: START - Cookbook sre.netbox.update-extras rolling update on A:netbox
17:32 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lsw1-f1-eqiad.mgmt with reason: test on ssw1-e1-eqiad will take ospf on lsw1-f1-eqiad down.
17:32 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on lsw1-f1-eqiad.mgmt with reason: test on ssw1-e1-eqiad will take ospf on lsw1-f1-eqiad down.
17:32 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lsw1-e1-eqiad.mgmt with reason: test on ssw1-e1-eqiad will take ospf on lsw1-e1-eqiad down.
17:31 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on lsw1-e1-eqiad.mgmt with reason: test on ssw1-e1-eqiad will take ospf on lsw1-e1-eqiad down.
17:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P46132 and previous config saved to /var/cache/conftool/dbconfig/20230406-172800-ladsgroup.json
17:22 sukhe@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts lvs3007.esams.wmnet
17:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T333332)', diff saved to https://phabricator.wikimedia.org/P46131 and previous config saved to /var/cache/conftool/dbconfig/20230406-171254-ladsgroup.json
17:12 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts lvs3007.esams.wmnet
17:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2151 (T333332)', diff saved to https://phabricator.wikimedia.org/P46130 and previous config saved to /var/cache/conftool/dbconfig/20230406-171028-ladsgroup.json
17:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2151.codfw.wmnet with reason: Maintenance
17:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2151.codfw.wmnet with reason: Maintenance
17:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
17:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2141.codfw.wmnet with reason: Maintenance
17:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T333332)', diff saved to https://phabricator.wikimedia.org/P46129 and previous config saved to /var/cache/conftool/dbconfig/20230406-170928-ladsgroup.json
17:05 aqu@deploy2002: Finished deploy [airflow-dags/analytics@318480e]: Fix for dump_month_of_daily_pageviews dag - Analytics [airflow-dags@318480e] (duration: 00m 14s)
17:05 aqu@deploy2002: Started deploy [airflow-dags/analytics@318480e]: Fix for dump_month_of_daily_pageviews dag - Analytics [airflow-dags@318480e]
16:58 jelto@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host gitlab2003.wikimedia.org with OS bullseye
16:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P46128 and previous config saved to /var/cache/conftool/dbconfig/20230406-165422-ladsgroup.json
16:41 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs6003.drmrs.wmnet
16:41 sukhe@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs6003.drmrs.wmnet
16:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P46127 and previous config saved to /var/cache/conftool/dbconfig/20230406-163916-ladsgroup.json
16:34 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs6003.drmrs.wmnet with OS bullseye
16:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T333332)', diff saved to https://phabricator.wikimedia.org/P46126 and previous config saved to /var/cache/conftool/dbconfig/20230406-162409-ladsgroup.json
16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2124 (T333332)', diff saved to https://phabricator.wikimedia.org/P46125 and previous config saved to /var/cache/conftool/dbconfig/20230406-162144-ladsgroup.json
16:21 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance
16:21 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2124.codfw.wmnet with reason: Maintenance
16:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T333332)', diff saved to https://phabricator.wikimedia.org/P46124 and previous config saved to /var/cache/conftool/dbconfig/20230406-162120-ladsgroup.json
16:15 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs6003.drmrs.wmnet with reason: host reimage
16:12 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs6003.drmrs.wmnet with reason: host reimage
16:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P46123 and previous config saved to /var/cache/conftool/dbconfig/20230406-160614-ladsgroup.json
16:05 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
16:05 topranks: Enable BGP EVPN sessions between eqiad row e/f Leaf and Spine devices
15:53 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs6003.drmrs.wmnet with OS bullseye
15:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P46122 and previous config saved to /var/cache/conftool/dbconfig/20230406-155108-ladsgroup.json
15:42 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs6003.drmrs.wmnet with OS bullseye
15:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T333332)', diff saved to https://phabricator.wikimedia.org/P46121 and previous config saved to /var/cache/conftool/dbconfig/20230406-153602-ladsgroup.json
15:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2117 (T333332)', diff saved to https://phabricator.wikimedia.org/P46120 and previous config saved to /var/cache/conftool/dbconfig/20230406-153335-ladsgroup.json
15:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance
15:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2117.codfw.wmnet with reason: Maintenance
15:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T333332)', diff saved to https://phabricator.wikimedia.org/P46119 and previous config saved to /var/cache/conftool/dbconfig/20230406-153312-ladsgroup.json
15:28 ladsgroup@deploy2002: Finished scap: Backport for Disable writes on group2 for DT backend (duration: 08m 11s)
15:21 ladsgroup@deploy2002: ladsgroup: Backport for Disable writes on group2 for DT backend synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
15:20 fab@deploy2002: Finished deploy [airflow-dags/research@2192f15]: (no justification provided) (duration: 00m 11s)
15:20 fab@deploy2002: Started deploy [airflow-dags/research@2192f15]: (no justification provided)
15:20 ladsgroup@deploy2002: Started scap: Backport for Disable writes on group2 for DT backend
15:19 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs6003.drmrs.wmnet with reason: host reimage
15:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P46118 and previous config saved to /var/cache/conftool/dbconfig/20230406-151806-ladsgroup.json
15:18 jgiannelos@deploy2002: Finished deploy [restbase/deploy@8fb20e9]: (no justification provided) (duration: 21m 01s)
15:16 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs6003.drmrs.wmnet with reason: host reimage
15:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P46117 and previous config saved to /var/cache/conftool/dbconfig/20230406-150300-ladsgroup.json
14:57 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs6003.drmrs.wmnet with OS bullseye
14:57 sukhe@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs6003.drmrs.wmnet with OS bullseye
14:57 jgiannelos@deploy2002: Started deploy [restbase/deploy@8fb20e9]: (no justification provided)
14:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T333332)', diff saved to https://phabricator.wikimedia.org/P46116 and previous config saved to /var/cache/conftool/dbconfig/20230406-144753-ladsgroup.json
14:46 ladsgroup@deploy2002: Finished scap: Backport for Disable DT backend on enwiki (duration: 07m 14s)
14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T333332)', diff saved to https://phabricator.wikimedia.org/P46115 and previous config saved to /var/cache/conftool/dbconfig/20230406-144437-ladsgroup.json
14:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2114.codfw.wmnet with reason: Maintenance
14:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db2114.codfw.wmnet with reason: Maintenance
14:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
14:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
14:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T333332)', diff saved to https://phabricator.wikimedia.org/P46114 and previous config saved to /var/cache/conftool/dbconfig/20230406-144332-ladsgroup.json
14:42 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Sync data for new ssw1 spine switches in eqiad. - cmooney@cumin1001 - T322937"
14:40 ladsgroup@deploy2002: ladsgroup: Backport for Disable DT backend on enwiki synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
14:40 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Sync data for new ssw1 spine switches in eqiad. - cmooney@cumin1001 - T322937"
14:39 ladsgroup@deploy2002: Started scap: Backport for Disable DT backend on enwiki
14:39 hnowlan@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1019*,lvs2009*} and A:lvs (T320967)
14:37 hnowlan@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1019*,lvs2009*} and A:lvs (T320967)
14:37 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs6003.drmrs.wmnet with reason: host reimage
14:34 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs6003.drmrs.wmnet with reason: host reimage
14:33 jelto@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host gitlab2003.wikimedia.org with OS bullseye
14:32 hnowlan@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1020*,lvs2010*} and A:lvs (T320967)
14:30 hnowlan@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1020*,lvs2010*} and A:lvs (T320967)
14:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P46113 and previous config saved to /var/cache/conftool/dbconfig/20230406-142826-ladsgroup.json
14:21 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
14:21 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
14:21 elukey: upgrade istioctl on deploy[12]002 and istio-cni on ml-serve[12]00[1-8] manually - T334068
14:14 elukey: upload new istio-cni and istioctl 1.15.7 debian package versions to bullseye-wikimedia - T334068
14:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P46112 and previous config saved to /var/cache/conftool/dbconfig/20230406-141319-ladsgroup.json
14:12 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host lvs6003.drmrs.wmnet with OS bullseye
14:10 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for Add session schema config for mobile apps (T331481) (duration: 07m 54s)
14:08 fab@deploy2002: Finished deploy [airflow-dags/research@2192f15]: (no justification provided) (duration: 00m 11s)
14:08 fab@deploy2002: Started deploy [airflow-dags/research@2192f15]: (no justification provided)
14:03 lucaswerkmeister-wmde@deploy2002: mazevedo and lucaswerkmeister-wmde: Backport for Add session schema config for mobile apps (T331481) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
14:02 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for Add session schema config for mobile apps (T331481)
14:01 sukhe@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts lvs6003.drmrs.wmnet
13:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T333332)', diff saved to https://phabricator.wikimedia.org/P46111 and previous config saved to /var/cache/conftool/dbconfig/20230406-135813-ladsgroup.json
13:56 urandom: rebooting sessionstore1001 — T327954
13:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1187 (T333332)', diff saved to https://phabricator.wikimedia.org/P46110 and previous config saved to /var/cache/conftool/dbconfig/20230406-135604-ladsgroup.json
13:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1187.eqiad.wmnet with reason: Maintenance
13:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1187.eqiad.wmnet with reason: Maintenance
13:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T333332)', diff saved to https://phabricator.wikimedia.org/P46109 and previous config saved to /var/cache/conftool/dbconfig/20230406-135541-ladsgroup.json
13:51 sukhe@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts lvs6003.drmrs.wmnet
13:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P46108 and previous config saved to /var/cache/conftool/dbconfig/20230406-134035-ladsgroup.json
13:40 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
13:34 jelto@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host gitlab2003.wikimedia.org with OS bullseye
13:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P46106 and previous config saved to /var/cache/conftool/dbconfig/20230406-132528-ladsgroup.json
13:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T333332)', diff saved to https://phabricator.wikimedia.org/P46104 and previous config saved to /var/cache/conftool/dbconfig/20230406-131022-ladsgroup.json
13:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T333332)', diff saved to https://phabricator.wikimedia.org/P46103 and previous config saved to /var/cache/conftool/dbconfig/20230406-130812-ladsgroup.json
13:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1180.eqiad.wmnet with reason: Maintenance
13:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1180.eqiad.wmnet with reason: Maintenance
13:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T333332)', diff saved to https://phabricator.wikimedia.org/P46102 and previous config saved to /var/cache/conftool/dbconfig/20230406-130749-ladsgroup.json
12:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P46101 and previous config saved to /var/cache/conftool/dbconfig/20230406-125242-ladsgroup.json
12:50 godog: import grafana 9.4 T317887
12:41 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
12:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P46100 and previous config saved to /var/cache/conftool/dbconfig/20230406-123735-ladsgroup.json
12:26 dcausse: restarting blazegraph on wdqs1012 (BlazegraphFreeAllocatorsDecreasingRapidly)
12:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T333332)', diff saved to https://phabricator.wikimedia.org/P46099 and previous config saved to /var/cache/conftool/dbconfig/20230406-122229-ladsgroup.json
12:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1173 (T333332)', diff saved to https://phabricator.wikimedia.org/P46098 and previous config saved to /var/cache/conftool/dbconfig/20230406-122018-ladsgroup.json
12:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1173.eqiad.wmnet with reason: Maintenance
12:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1173.eqiad.wmnet with reason: Maintenance
12:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T333332)', diff saved to https://phabricator.wikimedia.org/P46097 and previous config saved to /var/cache/conftool/dbconfig/20230406-121955-ladsgroup.json
12:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P46096 and previous config saved to /var/cache/conftool/dbconfig/20230406-120448-ladsgroup.json
11:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P46095 and previous config saved to /var/cache/conftool/dbconfig/20230406-114942-ladsgroup.json
11:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T333332)', diff saved to https://phabricator.wikimedia.org/P46094 and previous config saved to /var/cache/conftool/dbconfig/20230406-113436-ladsgroup.json
11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T333332)', diff saved to https://phabricator.wikimedia.org/P46093 and previous config saved to /var/cache/conftool/dbconfig/20230406-113226-ladsgroup.json
11:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1168.eqiad.wmnet with reason: Maintenance
11:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1168.eqiad.wmnet with reason: Maintenance
11:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T333332)', diff saved to https://phabricator.wikimedia.org/P46092 and previous config saved to /var/cache/conftool/dbconfig/20230406-113203-ladsgroup.json
11:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P46091 and previous config saved to /var/cache/conftool/dbconfig/20230406-111657-ladsgroup.json
11:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P46090 and previous config saved to /var/cache/conftool/dbconfig/20230406-110151-ladsgroup.json
10:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T333332)', diff saved to https://phabricator.wikimedia.org/P46089 and previous config saved to /var/cache/conftool/dbconfig/20230406-104644-ladsgroup.json
10:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T333332)', diff saved to https://phabricator.wikimedia.org/P46088 and previous config saved to /var/cache/conftool/dbconfig/20230406-104435-ladsgroup.json
10:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
10:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1165.eqiad.wmnet with reason: Maintenance
10:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1165.eqiad.wmnet with reason: Maintenance
10:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
10:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1140.eqiad.wmnet with reason: Maintenance
10:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T333332)', diff saved to https://phabricator.wikimedia.org/P46087 and previous config saved to /var/cache/conftool/dbconfig/20230406-104319-ladsgroup.json
10:41 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
10:41 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
10:40 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirtlocal1003.mgmt.eqiad.wmnet with reboot policy FORCED
10:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P46086 and previous config saved to /var/cache/conftool/dbconfig/20230406-102813-ladsgroup.json
10:28 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
10:27 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
10:27 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
10:26 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
10:13 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1003.mgmt.eqiad.wmnet with reboot policy FORCED
10:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P46085 and previous config saved to /var/cache/conftool/dbconfig/20230406-101306-ladsgroup.json
09:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T333332)', diff saved to https://phabricator.wikimedia.org/P46084 and previous config saved to /var/cache/conftool/dbconfig/20230406-095800-ladsgroup.json
09:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T333332)', diff saved to https://phabricator.wikimedia.org/P46083 and previous config saved to /var/cache/conftool/dbconfig/20230406-095640-ladsgroup.json
09:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1113.eqiad.wmnet with reason: Maintenance
09:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1113.eqiad.wmnet with reason: Maintenance
09:43 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
09:42 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
09:39 kevinbazira@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
09:38 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
09:38 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-main-codfw cluster: Roll restart of jvm daemons.
09:30 elukey: kafka main codfw cluster migrated to PKI TLS certs for brokers - T319372
09:22 jelto@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host gitlab2003.wikimedia.org with OS bullseye
09:19 cgoubert@deploy2002: Finished scap: Backport for jobrunners: Raise memory_limit to match parsoid (T333528) (duration: 07m 11s)
09:13 cgoubert@deploy2002: cgoubert: Backport for jobrunners: Raise memory_limit to match parsoid (T333528) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
09:12 cgoubert@deploy2002: Started scap: Backport for jobrunners: Raise memory_limit to match parsoid (T333528)
08:40 elukey: powercycle ml-serve2004 - host frozen, racadm getsel shows multi-bit errors in various DIMM slots
08:28 jelto@cumin2002: START - Cookbook sre.hosts.reimage for host gitlab2003.wikimedia.org with OS bullseye
08:09 hashar@deploy2002: rebuilt and synchronized wikiversions files: all wikis to 1.41.0-wmf.3 refs T330209
08:08 volans: restarting update-ubuntu-mirror.service on mirror1001 to check if it was a transient error
07:56 elukey@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-main-codfw cluster: Roll restart of jvm daemons.
07:31 apergos: UTC morning backport and config training window done
07:28 moritzm: installing ghostscript security updates
07:19 kartik@deploy2002: Finished scap: Backport for Enable Section Translation on Kashmiri Wikipedia (T326541) (duration: 09m 31s)
07:16 zabe: zabe@mwmaint2002:~$ mwscript extensions/Translate/scripts/moveTranslatableBundle.php --wiki metawiki "Abuse filter maintainer" "Abuse filter maintainers" "Zabe" --reason "per request T334147"
07:11 kartik@deploy2002: kartik: Backport for Enable Section Translation on Kashmiri Wikipedia (T326541) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
07:09 kartik@deploy2002: Started scap: Backport for Enable Section Translation on Kashmiri Wikipedia (T326541)
02:07 fab@deploy2002: Finished deploy [airflow-dags/research@2192f15]: (no justification provided) (duration: 00m 21s)
02:06 fab@deploy2002: Started deploy [airflow-dags/research@2192f15]: (no justification provided)
00:50 urandom: rebooting sessionstore1001 — T327954
00:19 urandom: rebooting Cassandra on sessionstore1001 — T327954

2023-04-05

23:58 legoktm@deploy2002: Finished scap: Backport for Remove misleading "disable" of Special:Mostlinkedcategories (T310456) (duration: 07m 55s)
23:55 urandom: rebooting Cassandra on sessionstore1001 — T327954
23:52 legoktm@deploy2002: legoktm: Backport for Remove misleading "disable" of Special:Mostlinkedcategories (T310456) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
23:50 legoktm@deploy2002: Started scap: Backport for Remove misleading "disable" of Special:Mostlinkedcategories (T310456)
23:44 legoktm@deploy2002: Finished scap: Backport for Add <link rel="me"> to verify Mastodon account on mediawiki.org (duration: 07m 47s)
23:38 legoktm@deploy2002: legoktm: Backport for Add <link rel="me"> to verify Mastodon account on mediawiki.org synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
23:36 legoktm@deploy2002: Started scap: Backport for Add <link rel="me"> to verify Mastodon account on mediawiki.org
22:36 topranks: enabling lsw1-e1-eqiad port et-0/0/51 to ssw1-e1-eqiad et-0/0/80 T322937
22:33 urandom: rebooting Cassandra on sessionstore1001 — T327954
22:21 urandom: restarting Cassandra on sessionstore1001 to apply (intentionally) unreachable native transport — T327954
22:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs5005.eqsin.wmnet with OS bullseye
21:45 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
21:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5005.eqsin.wmnet with reason: host reimage
21:31 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
21:31 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for ssw link addresses in eqiad - cmooney@cumin1001"
21:30 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for ssw link addresses in eqiad - cmooney@cumin1001"
21:28 cjming: end of UTC late backport window
21:23 cjming@deploy2002: Finished scap: Backport for [mgwiki] Replace the wordmark on Vector 2022 (T334022) (duration: 07m 58s)
21:21 cmooney@cumin1001: START - Cookbook sre.dns.netbox
21:16 cjming@deploy2002: superpes and cjming: Backport for [mgwiki] Replace the wordmark on Vector 2022 (T334022) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
21:16 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs5005.eqsin.wmnet with OS bullseye
21:15 cjming@deploy2002: Started scap: Backport for [mgwiki] Replace the wordmark on Vector 2022 (T334022)
21:10 cjming@deploy2002: Finished scap: Backport for Add static mobile United_States page to facilitate synthetic testing of T331681 (T331681) (duration: 10m 06s)
21:10 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
21:10 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for ssw link addresses in eqiad - cmooney@cumin1001"
21:09 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for ssw link addresses in eqiad - cmooney@cumin1001"
21:07 cmooney@cumin1001: START - Cookbook sre.dns.netbox
21:02 cjming@deploy2002: cjming and nray: Backport for Add static mobile United_States page to facilitate synthetic testing of T331681 (T331681) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
21:01 cjming: UTC late backport & config window continuing
21:00 cjming@deploy2002: Started scap: Backport for Add static mobile United_States page to facilitate synthetic testing of T331681 (T331681)
20:58 cjming@deploy2002: Finished scap: Backport for Undeploy SimilarEditors from Beta (T331718) (duration: 35m 41s)
20:57 brett: Disable Puppet/PyBal on lvs5005 in preparation for reimaging - T321309
20:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs5004.eqsin.wmnet with OS bullseye
20:44 cjming@deploy2002: tsepothoabala and cjming: Backport for Undeploy SimilarEditors from Beta (T331718) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
20:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5004.eqsin.wmnet with reason: host reimage
20:22 cjming@deploy2002: Started scap: Backport for Undeploy SimilarEditors from Beta (T331718)
20:21 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5004.eqsin.wmnet with reason: host reimage
20:17 mforns@deploy2002: Finished deploy [airflow-dags/analytics@2192f15]: (no justification provided) (duration: 00m 12s)
20:17 mforns@deploy2002: Started deploy [airflow-dags/analytics@2192f15]: (no justification provided)
20:03 mforns@deploy2002: Finished deploy [analytics/refinery@eb4c2b2] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eb4c2b2] (duration: 01m 34s)
20:01 mforns@deploy2002: Started deploy [analytics/refinery@eb4c2b2] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@eb4c2b2]
20:01 mforns@deploy2002: Finished deploy [analytics/refinery@eb4c2b2] (thin): Regular analytics weekly train THIN [analytics/refinery@eb4c2b2] (duration: 00m 08s)
20:01 mforns@deploy2002: Started deploy [analytics/refinery@eb4c2b2] (thin): Regular analytics weekly train THIN [analytics/refinery@eb4c2b2]
20:01 mforns@deploy2002: Finished deploy [analytics/refinery@eb4c2b2]: Regular analytics weekly train [analytics/refinery@eb4c2b2] (duration: 06m 26s)
19:54 mforns@deploy2002: Started deploy [analytics/refinery@eb4c2b2]: Regular analytics weekly train [analytics/refinery@eb4c2b2]
19:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs5004.eqsin.wmnet with OS bullseye
19:30 brett: Disable Puppet/PyBal on lvs5004 in preparation for reimaging - T321309
19:27 mforns@deploy2002: Finished deploy [analytics/refinery@944a995] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@944a995] (duration: 01m 29s)
19:25 mforns@deploy2002: Started deploy [analytics/refinery@944a995] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@944a995]
19:25 mforns@deploy2002: Finished deploy [analytics/refinery@944a995] (thin): Regular analytics weekly train THIN [analytics/refinery@944a995] (duration: 00m 08s)
19:25 mforns@deploy2002: Started deploy [analytics/refinery@944a995] (thin): Regular analytics weekly train THIN [analytics/refinery@944a995]
19:25 mforns@deploy2002: Finished deploy [analytics/refinery@944a995]: Regular analytics weekly train [analytics/refinery@944a995] (duration: 06m 31s)
19:19 mforns@deploy2002: Started deploy [analytics/refinery@944a995]: Regular analytics weekly train [analytics/refinery@944a995]
19:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4009.ulsfo.wmnet with OS bullseye
18:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4009.ulsfo.wmnet with reason: host reimage
18:52 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4009.ulsfo.wmnet with reason: host reimage
18:37 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs4009.ulsfo.wmnet with OS bullseye
18:37 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host lvs4009.ulsfo.wmnet with OS bullseye
17:50 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs4009.ulsfo.wmnet with OS bullseye
17:32 brett: Disable Puppet/PyBal on lvs4009 in preparation for reimaging - T321309
17:28 cjming: deploying labs-only change
17:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs4008.ulsfo.wmnet with OS bullseye
17:06 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
17:03 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
16:56 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lists1003.wikimedia.org with reason: Moar CPUs!
16:56 jhathaway@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on lists1003.wikimedia.org with reason: Moar CPUs!
16:54 hnowlan@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
16:54 hnowlan@puppetmaster1001: conftool action : set/weight=10; selector: service=thumbor,name=thumbor100[1256].eqiad.wmnet
16:52 cgoubert@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool restbase-async in codfw: Depool from primary DC following network maintenance
16:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bullseye
16:47 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host lvs4008.ulsfo.wmnet with OS bullseye
16:47 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase-async.discovery.wmnet on all recursors
16:47 cgoubert@cumin1001: START - Cookbook sre.dns.wipe-cache restbase-async.discovery.wmnet on all recursors
16:47 cgoubert@cumin1001: START - Cookbook sre.discovery.service-route depool restbase-async in codfw: Depool from primary DC following network maintenance
16:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
16:37 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs4008.ulsfo.wmnet with reason: host reimage
16:36 hnowlan@puppetmaster1001: conftool action : set/weight=6; selector: service=thumbor,name=thumbor100[1256].eqiad.wmnet
16:30 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
16:30 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: sync
16:20 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs4008.ulsfo.wmnet with OS bullseye
16:18 hnowlan@puppetmaster1001: conftool action : set/weight=8; selector: service=thumbor,name=thumbor100[1256].eqiad.wmnet
16:04 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/changeprop: sync
16:04 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/changeprop: sync
16:02 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host kafka-test1010.eqiad.wmnet with OS bullseye
15:55 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=10; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
15:50 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: sync
15:47 brett: Disable Puppet/PyBal on lvs4008 in preparation for reimaging - T321309
15:44 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
15:42 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/changeprop: sync
15:42 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/changeprop: sync
15:41 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1010.eqiad.wmnet with reason: host reimage
15:39 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: sync
15:31 moritzm: restarting FPM on mediawiki canaries to pick up pcre security update
15:30 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=8; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
15:27 elukey@cumin1001: START - Cookbook sre.ganeti.reimage for host kafka-test1010.eqiad.wmnet with OS bullseye
15:25 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
15:21 moritzm: installing pcre2 security updates on buster
15:21 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=7; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
15:16 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
15:15 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for Revert "VisualEditorFeatureUse sampling rate to 1 everywhere" (duration: 07m 42s)
15:14 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
15:11 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
15:10 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host kafka-test1009.eqiad.wmnet with OS bullseye
15:09 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde and phuedx: Backport for Revert "VisualEditorFeatureUse sampling rate to 1 everywhere" synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
15:09 moritzm: installing nodejs security updates on buster
15:09 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: sync
15:08 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: sync
15:07 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for Revert "VisualEditorFeatureUse sampling rate to 1 everywhere"
15:05 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
15:04 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
15:03 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
15:03 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
14:54 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
14:51 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1009.eqiad.wmnet with reason: host reimage
14:48 dcausse@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
14:48 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
14:48 dcausse@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
14:36 elukey@cumin1001: START - Cookbook sre.ganeti.reimage for host kafka-test1009.eqiad.wmnet with OS bullseye
14:33 elukey: restart kafka on kafka-main1005 to pick up the new TLS certificate (PKI based) - T319372
14:31 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host kafka-test1008.eqiad.wmnet with OS bullseye
14:31 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-main1005.eqiad.wmnet with reason: restart kafka, switch to PKI
14:30 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-main1005.eqiad.wmnet with reason: restart kafka, switch to PKI
14:14 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
14:14 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
14:14 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
14:11 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirtlocal1002.mgmt.eqiad.wmnet with reboot policy FORCED
14:11 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1008.eqiad.wmnet with reason: host reimage
14:00 elukey: powercycle an-worker1132
13:58 elukey@cumin1001: START - Cookbook sre.ganeti.reimage for host kafka-test1008.eqiad.wmnet with OS bullseye
13:57 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1010.eqiad.wmnet
13:54 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
13:54 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
13:53 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
13:53 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
13:52 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1010.eqiad.wmnet
13:52 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1009.eqiad.wmnet
13:52 elukey: restart kafka on kafka-main1004 to pick up the new TLS certificate (PKI based) - T319372
13:49 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-main1004.eqiad.wmnet with reason: restart kafka, switch to PKI
13:48 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-main1004.eqiad.wmnet with reason: restart kafka, switch to PKI
13:48 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1009.eqiad.wmnet
13:46 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for VisualEditorFeatureUse sampling rate to 1 everywhere (T333168) (duration: 14m 47s)
13:33 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde and phuedx: Backport for VisualEditorFeatureUse sampling rate to 1 everywhere (T333168) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
13:31 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for VisualEditorFeatureUse sampling rate to 1 everywhere (T333168)
13:29 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for mediawiki.edit_attempt: Ignore events from PHP MPC (T309985) (duration: 10m 52s)
13:28 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1002.mgmt.eqiad.wmnet with reboot policy FORCED
13:28 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:27 jclark@cumin1001: START - Cookbook sre.dns.netbox
13:26 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirtlocal1002.mgmt.eqiad.wmnet with reboot policy FORCED
13:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46079 and previous config saved to /var/cache/conftool/dbconfig/20230405-132318-root.json
13:21 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1002.mgmt.eqiad.wmnet with reboot policy FORCED
13:19 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde and phuedx: Backport for mediawiki.edit_attempt: Ignore events from PHP MPC (T309985) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
13:19 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirtlocal1002.mgmt.eqiad.wmnet with reboot policy FORCED
13:18 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for mediawiki.edit_attempt: Ignore events from PHP MPC (T309985)
13:17 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for GrowthExperiments: enable add link backend in wiki rounds (8,9th) (T308133 T308134) (duration: 08m 00s)
13:16 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1002.mgmt.eqiad.wmnet with reboot policy FORCED
13:15 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:14 jclark@cumin1001: START - Cookbook sre.dns.netbox
13:10 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde and sgimeno: Backport for GrowthExperiments: enable add link backend in wiki rounds (8,9th) (T308133 T308134) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
13:09 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for GrowthExperiments: enable add link backend in wiki rounds (8,9th) (T308133 T308134)
13:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46078 and previous config saved to /var/cache/conftool/dbconfig/20230405-130813-root.json
13:03 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1008.eqiad.wmnet
13:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46077 and previous config saved to /var/cache/conftool/dbconfig/20230405-130315-root.json
13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46076 and previous config saved to /var/cache/conftool/dbconfig/20230405-130121-root.json
12:58 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1008.eqiad.wmnet
12:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46075 and previous config saved to /var/cache/conftool/dbconfig/20230405-125308-root.json
12:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46074 and previous config saved to /var/cache/conftool/dbconfig/20230405-124810-root.json
12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46073 and previous config saved to /var/cache/conftool/dbconfig/20230405-124616-root.json
12:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46072 and previous config saved to /var/cache/conftool/dbconfig/20230405-123804-root.json
12:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46071 and previous config saved to /var/cache/conftool/dbconfig/20230405-123305-root.json
12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46070 and previous config saved to /var/cache/conftool/dbconfig/20230405-123111-root.json
12:27 moritzm: installing xapian-core security updates
12:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46069 and previous config saved to /var/cache/conftool/dbconfig/20230405-122259-root.json
12:20 samtar@deploy2002: Finished scap: Backport for Remove WikiEditor's Realtime Preview config vars (T327515) (duration: 07m 41s)
12:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46068 and previous config saved to /var/cache/conftool/dbconfig/20230405-121801-root.json
12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46067 and previous config saved to /var/cache/conftool/dbconfig/20230405-121606-root.json
12:13 samtar@deploy2002: samwilson and samtar: Backport for Remove WikiEditor's Realtime Preview config vars (T327515) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
12:12 samtar@deploy2002: Started scap: Backport for Remove WikiEditor's Realtime Preview config vars (T327515)
12:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46066 and previous config saved to /var/cache/conftool/dbconfig/20230405-120754-root.json
12:04 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
12:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46065 and previous config saved to /var/cache/conftool/dbconfig/20230405-120256-root.json
12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46064 and previous config saved to /var/cache/conftool/dbconfig/20230405-120101-root.json
11:54 moritzm: installing apache2 security updates on buster
11:53 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
11:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 4%: Repooling', diff saved to https://phabricator.wikimedia.org/P46063 and previous config saved to /var/cache/conftool/dbconfig/20230405-115249-root.json
11:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46062 and previous config saved to /var/cache/conftool/dbconfig/20230405-114751-root.json
11:47 slyngshede@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host testvm2004.codfw.wmnet with OS bullseye
11:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46061 and previous config saved to /var/cache/conftool/dbconfig/20230405-114557-root.json
11:45 TheresNoTime: `[samtar@mwmaint2002 ~]$ echo 'https://en.wikipedia.org/robots.txt' | mwscript purgeList.php` T334038
11:40 samtar@deploy2002: Finished scap: Backport for Remove possibly significant whitespace from robots.txt (T334038) (duration: 07m 14s)
11:38 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
11:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 3%: Repooling', diff saved to https://phabricator.wikimedia.org/P46060 and previous config saved to /var/cache/conftool/dbconfig/20230405-113745-root.json
11:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw1414.eqiad.wmnet
11:34 samtar@deploy2002: legoktm and samtar: Backport for Remove possibly significant whitespace from robots.txt (T334038) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
11:34 slyngshede@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2004.codfw.wmnet with reason: host reimage
11:33 samtar@deploy2002: Started scap: Backport for Remove possibly significant whitespace from robots.txt (T334038)
11:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 4%: Repooling', diff saved to https://phabricator.wikimedia.org/P46059 and previous config saved to /var/cache/conftool/dbconfig/20230405-113246-root.json
11:31 slyngshede@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2004.codfw.wmnet with reason: host reimage
11:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1120 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46058 and previous config saved to /var/cache/conftool/dbconfig/20230405-113052-root.json
11:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P46057 and previous config saved to /var/cache/conftool/dbconfig/20230405-113031-root.json
11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mw1414.eqiad.wmnet
11:28 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
11:28 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
11:24 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
11:23 ladsgroup@deploy2002: Finished scap: Backport for Revert "Revert "Revert "Revert "mwscript: Switch to use run.php"""" (T326800) (duration: 08m 45s)
11:23 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
11:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1100 (re)pooling @ 2%: Repooling', diff saved to https://phabricator.wikimedia.org/P46056 and previous config saved to /var/cache/conftool/dbconfig/20230405-112240-root.json
11:22 slyngshede@cumin1001: START - Cookbook sre.ganeti.reimage for host testvm2004.codfw.wmnet with OS bullseye
11:17 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
11:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 3%: Repooling', diff saved to https://phabricator.wikimedia.org/P46055 and previous config saved to /var/cache/conftool/dbconfig/20230405-111742-root.json
11:17 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
11:17 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
11:16 ladsgroup@deploy2002: ladsgroup: Backport for Revert "Revert "Revert "Revert "mwscript: Switch to use run.php"""" (T326800) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
11:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P46054 and previous config saved to /var/cache/conftool/dbconfig/20230405-111527-root.json
11:15 ladsgroup@deploy2002: Started scap: Backport for Revert "Revert "Revert "Revert "mwscript: Switch to use run.php"""" (T326800)
11:14 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
11:12 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
11:12 moritzm: installing systemd security updates on buster
11:12 slyngshede@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host testvm2002.codfw.wmnet with OS bullseye
11:10 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
11:07 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1100 with 1% weight', diff saved to https://phabricator.wikimedia.org/P46053 and previous config saved to /var/cache/conftool/dbconfig/20230405-110717-root.json
11:05 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1130 to s5 primary T331302', diff saved to https://phabricator.wikimedia.org/P46052 and previous config saved to /var/cache/conftool/dbconfig/20230405-110530-root.json
11:05 marostegui: Starting s5 eqiad failover from db1100 to db1130 - T331302
11:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 2%: Repooling', diff saved to https://phabricator.wikimedia.org/P46051 and previous config saved to /var/cache/conftool/dbconfig/20230405-110237-root.json
11:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P46050 and previous config saved to /var/cache/conftool/dbconfig/20230405-110022-root.json
11:00 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
11:00 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
10:59 slyngshede@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
10:59 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
10:56 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
10:56 slyngshede@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
10:50 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
10:50 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
10:50 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
10:49 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
10:48 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
10:48 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
10:47 slyngshede@cumin1001: START - Cookbook sre.ganeti.reimage for host testvm2002.codfw.wmnet with OS bullseye
10:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46049 and previous config saved to /var/cache/conftool/dbconfig/20230405-104732-root.json
10:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P46048 and previous config saved to /var/cache/conftool/dbconfig/20230405-104517-root.json
10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1130 with weight 0 T331302', diff saved to https://phabricator.wikimedia.org/P46047 and previous config saved to /var/cache/conftool/dbconfig/20230405-104422-marostegui.json
10:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 24 hosts with reason: Primary switchover s5 T331302
10:43 hnowlan@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
10:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 24 hosts with reason: Primary switchover s5 T331302
10:43 hnowlan@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
10:41 hnowlan@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
10:40 hnowlan@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
10:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P46046 and previous config saved to /var/cache/conftool/dbconfig/20230405-103012-root.json
10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1120 T326669', diff saved to https://phabricator.wikimedia.org/P46044 and previous config saved to /var/cache/conftool/dbconfig/20230405-102215-marostegui.json
10:20 slyngshede@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host testvm2002.codfw.wmnet with OS bullseye
10:17 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
10:17 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
10:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P46043 and previous config saved to /var/cache/conftool/dbconfig/20230405-101507-root.json
10:14 elukey: restart purged on cp5032, cp1082, cp6004, cp1090 - errors after restart of kafka main eqiad brokers
10:12 elukey: restart purged on cp6015 to verify whether the failed connections to brokers are only temporary
10:11 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host kafka-test1007.eqiad.wmnet with OS bullseye
10:09 slyngshede@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
10:06 slyngshede@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
10:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1122 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P46041 and previous config saved to /var/cache/conftool/dbconfig/20230405-100003-root.json
09:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1122', diff saved to https://phabricator.wikimedia.org/P46040 and previous config saved to /var/cache/conftool/dbconfig/20230405-095954-marostegui.json
09:57 slyngshede@cumin1001: START - Cookbook sre.ganeti.reimage for host testvm2002.codfw.wmnet with OS bullseye
09:56 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1162 to s2 primary T334067', diff saved to https://phabricator.wikimedia.org/P46039 and previous config saved to /var/cache/conftool/dbconfig/20230405-095600-root.json
09:55 marostegui: Starting s2 eqiad failover from db1122 to db1162 - T334067
09:54 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
09:51 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1007.eqiad.wmnet with reason: host reimage
09:42 slyngshede@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host testvm2002.codfw.wmnet with OS bullseye
09:36 elukey@cumin1001: START - Cookbook sre.ganeti.reimage for host kafka-test1007.eqiad.wmnet with OS bullseye
09:35 elukey: restart kafka on kafka-main1003 to pick up the new TLS certificate (PKI based) - T319372
09:34 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1007.eqiad.wmnet
09:34 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-main1003.eqiad.wmnet with reason: restart kafka, switch to PKI
09:34 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-main1003.eqiad.wmnet with reason: restart kafka, switch to PKI
09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1162 with weight 0 T334067', diff saved to https://phabricator.wikimedia.org/P46038 and previous config saved to /var/cache/conftool/dbconfig/20230405-093155-marostegui.json
09:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s2 T334067
09:30 elukey@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1007.eqiad.wmnet
09:29 slyngshede@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
09:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary switchover s2 T334067
09:26 slyngshede@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage
09:15 slyngshede@cumin1001: START - Cookbook sre.ganeti.reimage for host testvm2002.codfw.wmnet with OS bullseye
08:58 hashar@deploy2002: Synchronized php: group1 wikis to 1.41.0-wmf.3 refs T330209 (duration: 05m 46s)
08:52 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.41.0-wmf.3 refs T330209
08:39 filippo@cumin1001: conftool action : set/pooled=no; selector: name=thanos-fe1003.eqiad.wmnet,service=thanos-web
08:28 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2067.codfw.wmnet with OS bullseye
08:27 elukey@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host kafka-test1006.eqiad.wmnet with OS bullseye
08:25 hashar@deploy2002: Synchronized wmf-config/InitialiseSettings.php: Remove akwiki from CX config (take 2, it was not fully deployed due to a scap lock issue on the spare server) (duration: 06m 06s)
08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1107 T326669', diff saved to https://phabricator.wikimedia.org/P46036 and previous config saved to /var/cache/conftool/dbconfig/20230405-082240-root.json
08:09 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
08:07 elukey: restart kafka on kafka-main1002 to pick up the new TLS certificate (PKI based) - T319372
08:06 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
08:02 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-main1002.eqiad.wmnet with reason: restart kafka, switch to PKI
08:02 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-main1002.eqiad.wmnet with reason: restart kafka, switch to PKI
07:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1104.eqiad.wmnet
07:59 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:59 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1104.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:56 elukey@cumin1001: START - Cookbook sre.ganeti.reimage for host kafka-test1006.eqiad.wmnet with OS bullseye
07:54 elukey@cumin1001: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host kafka-test1006.eqiad.wmnet with OS bullseye
07:54 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1104.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:52 marostegui@cumin1001: START - Cookbook sre.dns.netbox
07:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
07:47 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1104.eqiad.wmnet
07:46 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1104 from dbctl T329481', diff saved to https://phabricator.wikimedia.org/P46035 and previous config saved to /var/cache/conftool/dbconfig/20230405-073102-marostegui.json
07:30 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS bullseye
07:24 elukey@cumin1001: START - Cookbook sre.ganeti.reimage for host kafka-test1006.eqiad.wmnet with OS bullseye
07:20 marostegui: Stop mariadb on db1101 T331381
07:11 kartik@deploy2002: Finished scap: Backport for Remove akwiki from CX config (duration: 07m 22s)
07:11 marostegui: Failover m5-master T333377
07:05 kartik@deploy2002: kartik: Backport for Remove akwiki from CX config synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
07:04 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.debug (exit_code=0) for Netbox circuit ID 33
07:04 ayounsi@cumin1001: START - Cookbook sre.network.debug for Netbox circuit ID 33
07:04 kartik@deploy2002: Started scap: Backport for Remove akwiki from CX config
07:03 marostegui: Failover m3-master T333377
04:17 TimStarling: restarted swift-proxy on ms-fe* T328872

2023-04-04

23:40 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirtlocal1002.mgmt.eqiad.wmnet with reboot policy FORCED
23:34 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1002.mgmt.eqiad.wmnet with reboot policy FORCED
23:28 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirtlocal1002.mgmt.eqiad.wmnet with reboot policy FORCED
23:25 tstarling@deploy2002: Synchronized src/Profiler.php: re-enable excimer T331882 (duration: 06m 25s)
23:21 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1002.mgmt.eqiad.wmnet with reboot policy FORCED
23:21 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
23:00 jclark@cumin1001: START - Cookbook sre.hosts.provision for host cloudvirtlocal1001.mgmt.eqiad.wmnet with reboot policy FORCED
22:58 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:58 jclark@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns cloudvirtlocal - jclark@cumin1001"
22:57 jclark@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns cloudvirtlocal - jclark@cumin1001"
22:55 jclark@cumin1001: START - Cookbook sre.dns.netbox
22:33 cstone: civicrm upgraded from 4231191f to 223f655a
22:26 mutante: deploying change to block scap execution on inactive deployment server via gerrit:904502 T330756
22:19 ejegg: payments-wiki upgraded from 49a2e104 to 75b068a1
21:39 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts miscweb2002.codfw.wmnet
21:39 dzahn@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
21:39 dzahn@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: miscweb2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1001"
21:37 dzahn@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: miscweb2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - dzahn@cumin1001"
21:26 dzahn@cumin1001: START - Cookbook sre.dns.netbox
21:22 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts miscweb2002.codfw.wmnet
20:56 sbassett: Deployed mitigation for T333140
20:44 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on miscweb2002.codfw.wmnet with reason: decom
20:44 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on miscweb2002.codfw.wmnet with reason: decom
20:44 TheresNoTime: closing UTC late backport window
20:38 samtar@deploy2002: Finished scap: Backport for Clean up history page visual diffs beta feature config (T333448) (duration: 06m 42s)
20:33 samtar@deploy2002: matmarex and samtar: Backport for Clean up history page visual diffs beta feature config (T333448) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
20:31 samtar@deploy2002: Started scap: Backport for Clean up history page visual diffs beta feature config (T333448)
20:27 samtar@deploy2002: Finished scap: Backport for EditCheck: catch errors from TransactionSquasher (T324733) (duration: 08m 23s)
20:23 inflatador: bking@cumin1001 unban elastic nodes post switch maintenance T331882
20:20 samtar@deploy2002: matmarex and samtar: Backport for EditCheck: catch errors from TransactionSquasher (T324733) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
20:18 samtar@deploy2002: Started scap: Backport for EditCheck: catch errors from TransactionSquasher (T324733)
20:11 samtar@deploy2002: Finished scap: Backport for Revert "Revert "Enable hidden tag for "Edit Check" project on Wikipedias"" (T324733) (duration: 07m 30s)
20:10 mutante: deploying ATS config change on cp2* for query.wikidata.org
20:06 ryankemper: T331896 Running puppet on wcqs fleet to pick up new miscweb gui_url: `ryankemper@cumin1001:~$ sudo -E cumin -b 2 'wcqs*' 'run-puppet-agent'`
20:05 samtar@deploy2002: matmarex and samtar: Backport for Revert "Revert "Enable hidden tag for "Edit Check" project on Wikipedias"" (T324733) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
20:03 samtar@deploy2002: Started scap: Backport for Revert "Revert "Enable hidden tag for "Edit Check" project on Wikipedias"" (T324733)
20:03 mutante: running puppet on cp5*, cp4*...
20:00 ryankemper: T331896 Running puppet on wdqs fleet to pick up new miscweb gui_url: `ryankemper@cumin1001:~$ sudo -E cumin -b 6 'wdqs*' 'run-puppet-agent'`
19:58 hashar@deploy2002: Finished deploy [gerrit/gerrit@dbaaa7a]: wm-zuul-status: change pending jobs SUCCESS > INFO | T214068 (duration: 00m 07s)
19:58 hashar@deploy2002: Started deploy [gerrit/gerrit@dbaaa7a]: wm-zuul-status: change pending jobs SUCCESS > INFO | T214068
19:55 mutante: https://query.wikidata.org and WCQS GUIs are switching to new backend VMs on bullseye in codfw T330090 T331896
19:46 hashar@deploy2002: Finished scap: Backport for Replace usages of Hooks::register() (T334005) (duration: 06m 55s)
19:40 hashar@deploy2002: hashar: Backport for Replace usages of Hooks::register() (T334005) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
19:39 hashar@deploy2002: Started scap: Backport for Replace usages of Hooks::register() (T334005)
19:10 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.3 refs T330209
18:05 dcausse@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
18:05 dcausse@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
17:22 ladsgroup@deploy2002: Finished scap: Backport for Revert "mergeMessageFileList.php: move code out of file scope." (T333966) (duration: 38m 18s)
17:04 ladsgroup@deploy2002: ladsgroup: Backport for Revert "mergeMessageFileList.php: move code out of file scope." (T333966) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
16:56 dcausse@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
16:55 dcausse@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
16:55 dcausse@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
16:55 dcausse@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
16:44 ladsgroup@deploy2002: Started scap: Backport for Revert "mergeMessageFileList.php: move code out of file scope." (T333966)
16:37 dcausse@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
16:37 dcausse@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
16:17 ladsgroup@deploy2002: Finished scap: Backport for Revert "external store: Depool es4 (cluster26) from writes for maintenance" (T333961) (duration: 07m 31s)
16:11 ladsgroup@deploy2002: ladsgroup: Backport for Revert "external store: Depool es4 (cluster26) from writes for maintenance" (T333961) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
16:10 ladsgroup@deploy2002: Started scap: Backport for Revert "external store: Depool es4 (cluster26) from writes for maintenance" (T333961)
16:07 jynus@cumin1001: dbctl commit (dc=all): 'Repool es1021 for reads', diff saved to https://phabricator.wikimedia.org/P46031 and previous config saved to /var/cache/conftool/dbconfig/20230404-160702-jynus.json
16:01 jynus@cumin1001: dbctl commit (dc=all): 'Repool es1021 for reads (only 10%)', diff saved to https://phabricator.wikimedia.org/P46030 and previous config saved to /var/cache/conftool/dbconfig/20230404-160146-jynus.json
15:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on es1022.eqiad.wmnet with reason: T333961
15:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on es1022.eqiad.wmnet with reason: T333961
15:58 jynus: restart es1021, several connections in a "stuck" state T333961
15:50 dancy@deploy2002: Installation of scap version "4.48.0" completed for 592 hosts
15:49 dancy@deploy2002: Installing scap version "4.48.0" for 592 hosts
15:31 jynus: restart es1021, several connections in a "stuck" state T333961
15:25 jynus@cumin1001: dbctl commit (dc=all): 'Depool es1021 reads', diff saved to https://phabricator.wikimedia.org/P46029 and previous config saved to /var/cache/conftool/dbconfig/20230404-152501-jynus.json
15:23 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
15:19 jiji@cumin1001: END (FAIL) - Cookbook sre.discovery.datacenter (exit_code=93) pool all active/active services in eqiad: eqiad row C switches upgrade - T331882
15:18 ladsgroup@deploy2002: Finished scap: Backport for external store: Depool es4 (cluster26) from writes for maintenance (T333961) (duration: 11m 30s)
15:16 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db1150.eqiad.wmnet with reason: pending s3 reprovisioning
15:16 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db1150.eqiad.wmnet with reason: pending s3 reprovisioning
15:12 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
15:08 ladsgroup@deploy2002: ladsgroup and jynus: Backport for external store: Depool es4 (cluster26) from writes for maintenance (T333961) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
15:06 ladsgroup@deploy2002: Started scap: Backport for external store: Depool es4 (cluster26) from writes for maintenance (T333961)
14:54 urbanecm: [urbanecm@mwmaint2002 /srv/mediawiki/php]$ mwscript extensions/CentralAuth/maintenance/migrateAccount.php --wiki=metawiki -u 'Translation Notification Bot (T255246)' --auto # T255246
14:43 jiji@cumin1001: START - Cookbook sre.discovery.datacenter pool all active/active services in eqiad: eqiad row C switches upgrade - T331882
14:39 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
14:39 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
14:38 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
14:38 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
14:38 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
14:37 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
14:36 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
14:36 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
14:28 vgutierrez: switch cp6008 (upload) and cp6016 (text) to use a single UDS socket between haproxy and varnish - T333965
14:21 jynus: stop es1022 for debugging T333961
14:15 Lucas_WMDE: UTC afternoon backport+config window done
14:15 lucaswerkmeister-wmde@deploy2002: Finished scap: Backport for Use HookContainer to register hooks inside hooks (T333926) (duration: 10m 50s)
14:10 stevemunene@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1018.eqiad.wmnet
14:09 stevemunene@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1013.eqiad.wmnet
14:09 stevemunene@puppetmaster1001: conftool action : set/pooled=yes; selector: name=aqs1012.eqiad.wmnet
14:09 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.debug (exit_code=0) for Netbox circuit ID 33
14:09 ayounsi@cumin1001: START - Cookbook sre.network.debug for Netbox circuit ID 33
14:09 stevemunene@puppetmaster1001: conftool action : set/pooled=yes; selector: name=datahubsearch1003.eqiad.wmnet
14:05 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde: Backport for Use HookContainer to register hooks inside hooks (T333926) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
14:04 lucaswerkmeister-wmde@deploy2002: Started scap: Backport for Use HookContainer to register hooks inside hooks (T333926)
13:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool es1022 T333961', diff saved to https://phabricator.wikimedia.org/P46027 and previous config saved to /var/cache/conftool/dbconfig/20230404-134415-ladsgroup.json
13:42 Emperor: repool thanos-fe1003 re T331882
13:41 Emperor: repool ms-fe1011 re T331882
13:38 steve_munene: leave hdfs safemode T331882
13:38 inflatador: reboot elastic2038 to clear soft lock
13:34 sukhe: run authdns-update for CR 905612, reverting depool of eqiad
13:30 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=thumbor1006.eqiad.wmnet
13:25 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
13:25 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
13:11 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1009.eqiad.wmnet
13:11 XioNoX: asw2-c-eqiad> request system reboot all-members - T331882
13:10 urbanecm@deploy2002: Finished scap: Backport for ckbwiktionary: Add logo (T331831) (duration: 07m 00s)
13:05 akosiaris@cumin1001: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all active/active services in eqiad: eqiad row C switches upgrade - T331882
13:03 urbanecm@deploy2002: Started scap: Backport for ckbwiktionary: Add logo (T331831)
13:02 ayounsi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 227 hosts with reason: eqiad row C upgrade
12:57 ayounsi@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 227 hosts with reason: eqiad row C upgrade
12:57 steve_munene: putting hdfs into safe mode as part of T331882
12:52 ayounsi@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on 228 hosts with reason: eqiad row C upgrade
12:52 ayounsi@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 228 hosts with reason: eqiad row C upgrade
12:44 akosiaris@cumin1001: START - Cookbook sre.discovery.datacenter depool all active/active services in eqiad: eqiad row C switches upgrade - T331882
12:43 Emperor: depool thanos-fe1003 re T331882
12:38 Emperor: depool ms-fe1011 re T331882
12:32 sukhe: [finished] run authdns-update for CR: 905603 depool eqiad
12:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 38 hosts with reason: Row c switch maint T331882
12:31 sukhe: run authdns-update for CR: 905603 depool eqiad
12:31 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on 38 hosts with reason: Row c switch maint T331882
12:28 stevemunene@puppetmaster1001: conftool action : set/pooled=no; selector: name=aqs1018.eqiad.wmnet
12:28 volans@cumin1001: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling update on A:netbox
12:28 stevemunene@puppetmaster1001: conftool action : set/pooled=no; selector: name=aqs1013.eqiad.wmnet
12:28 volans@cumin1001: START - Cookbook sre.netbox.update-extras rolling update on A:netbox
12:28 stevemunene@puppetmaster1001: conftool action : set/pooled=no; selector: name=aqs1012.eqiad.wmnet
12:28 volans@cumin1001: END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling update on A:netbox-canary
12:27 volans@cumin1001: START - Cookbook sre.netbox.update-extras rolling update on A:netbox-canary
12:26 stevemunene@puppetmaster1001: conftool action : set/pooled=no; selector: name=datahubsearch1003.eqiad.wmnet
12:24 TimStarling: I noticed that mw2382 was still talking to mwlog1002. It still had old php-fpm7.4 processes despite the scap. So I manually restarted php-fpm on it.
12:17 tstarling@deploy2002: Synchronized src/Profiler.php: T331882 disable profiling for switch maintenance (duration: 05m 58s)
11:35 hnowlan@puppetmaster1001: conftool action : set/pooled=inactive; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
11:24 moritzm: installing joblib security updates
10:17 hnowlan@puppetmaster1001: conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet
09:51 hashar@deploy2002: rebuilt and synchronized wikiversions files: Revert "group0 wikis to 1.41.0-wmf.3" | T330209
09:42 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.3 refs T330209
09:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T333332)', diff saved to https://phabricator.wikimedia.org/P46025 and previous config saved to /var/cache/conftool/dbconfig/20230404-091639-ladsgroup.json
09:19 hashar@deploy2002: Pruned MediaWiki: 1.41.0-wmf.1 (duration: 02m 16s)
09:13 hashar@deploy2002: Finished scap: testwikis wikis to 1.41.0-wmf.3 refs T330209 (duration: 40m 20s)
09:09 moritzm: installing libmicrohttpd security updates
09:07 moritzm: installing libdatetime-timezone-perl updates
09:04 akosiaris@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.
09:04 akosiaris@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.
09:04 akosiaris@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
09:04 akosiaris@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
09:03 akosiaris@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
09:03 akosiaris@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
09:03 akosiaris@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
09:03 akosiaris@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
09:03 akosiaris@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
09:02 akosiaris@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
09:02 akosiaris@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
09:02 akosiaris@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
09:02 akosiaris@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
09:02 akosiaris@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
09:01 akosiaris@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
09:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P46024 and previous config saved to /var/cache/conftool/dbconfig/20230404-090133-ladsgroup.json
09:01 akosiaris@deploy2002: helmfile [codfw] START helmfile.d/admin 'sync'.
08:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P46023 and previous config saved to /var/cache/conftool/dbconfig/20230404-085553-ladsgroup.json
08:55 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad
08:53 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad
08:46 akosiaris@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
08:46 akosiaris@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
08:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P46022 and previous config saved to /var/cache/conftool/dbconfig/20230404-084627-ladsgroup.json
08:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P46021 and previous config saved to /var/cache/conftool/dbconfig/20230404-084048-ladsgroup.json
08:35 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad
08:35 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad
08:32 hashar@deploy2002: Started scap: testwikis wikis to 1.41.0-wmf.3 refs T330209
08:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T333332)', diff saved to https://phabricator.wikimedia.org/P46020 and previous config saved to /var/cache/conftool/dbconfig/20230404-083120-ladsgroup.json
08:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1201 (T333332)', diff saved to https://phabricator.wikimedia.org/P46019 and previous config saved to /var/cache/conftool/dbconfig/20230404-082911-ladsgroup.json
08:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1201.eqiad.wmnet with reason: Maintenance
08:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on db1201.eqiad.wmnet with reason: Maintenance
08:28 hashar: Deleting mediawiki/core branch `wmf/branch_cut_pretest` pointing at `430d25d1a1858edfa4a6199dfe1f0eb3743a219a` # T330209
08:27 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams
08:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P46017 and previous config saved to /var/cache/conftool/dbconfig/20230404-082543-ladsgroup.json
08:25 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams
08:22 godog: upgrade grafana* to grafana 9.3.11 - T333915
08:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1162 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P46016 and previous config saved to /var/cache/conftool/dbconfig/20230404-081039-ladsgroup.json
08:01 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams
08:01 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams
08:00 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs
08:00 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs
07:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
07:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
07:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1162 T333918', diff saved to https://phabricator.wikimedia.org/P46015 and previous config saved to /var/cache/conftool/dbconfig/20230404-074848-ladsgroup.json
07:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1122 to s2 primary T333918', diff saved to https://phabricator.wikimedia.org/P46014 and previous config saved to /var/cache/conftool/dbconfig/20230404-074656-ladsgroup.json
07:46 Amir1: Starting s2 eqiad failover from db1162 to db1122 - T333918
07:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2001.codfw.wmnet
07:36 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs
07:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2001.codfw.wmnet
07:35 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs
07:35 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
07:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
07:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2002.codfw.wmnet
07:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2002.codfw.wmnet
07:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1122 with weight 0 T333918', diff saved to https://phabricator.wikimedia.org/P46013 and previous config saved to /var/cache/conftool/dbconfig/20230404-072817-ladsgroup.json
07:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s2 T333918
07:27 hashar@deploy2002: Finished deploy [gerrit/gerrit@453b038]: Gerrit plugin update and switching from git-fat to git-lfs (duration: 00m 08s)
07:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary switchover s2 T333918
07:27 hashar@deploy2002: Started deploy [gerrit/gerrit@453b038]: Gerrit plugin update and switching from git-fat to git-lfs
07:23 hashar@deploy2002: Finished deploy [gerrit/gerrit@453b038]: Gerrit plugin update and switching from git-fat to git-lfs (duration: 00m 05s)
07:23 hashar@deploy2002: Started deploy [gerrit/gerrit@453b038]: Gerrit plugin update and switching from git-fat to git-lfs
06:09 XioNoX: stage new Junos on asw2-c-eqiad - T331882

2023-04-03

21:53 ryankemper: T331896 `sudo -E cumin -b 4 'wdqs*' 'sudo run-puppet-agent'`
21:42 maryum: undeployed mitigation for T333140
21:25 inflatador: bking@cumin ban cloudelastic1003 from all cloudelastic clusters T331882
21:22 maryum: deployed mitigation for T333140
21:17 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 10 hosts with reason: T331882 eqiad row C maint
21:16 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 10 hosts with reason: T331882 eqiad row C maint
21:12 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wcqs1003.eqiad.wmnet,wdqs[1010,1013-1014].eqiad.wmnet with reason: T331882 eqiad row C maint
21:12 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wcqs1003.eqiad.wmnet,wdqs[1010,1013-1014].eqiad.wmnet with reason: T331882 eqiad row C maint
20:37 kindrobot: close UTC late backport window
20:36 kindrobot@deploy2002: Finished scap: Backport for make "advanced mode" default on ptwikinews mobile (T290812) (duration: 10m 47s)
20:31 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs5006.eqsin.wmnet with OS bullseye
20:26 kindrobot@deploy2002: jdlrobson and kindrobot: Backport for make "advanced mode" default on ptwikinews mobile (T290812) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
20:25 kindrobot@deploy2002: Started scap: Backport for make "advanced mode" default on ptwikinews mobile (T290812)
20:19 kindrobot@deploy2002: Finished scap: Backport for [refactor] split out Minerva configuration from main config, Disable Vector js/css sharing on pl.wikipedia (T332809) (duration: 12m 05s)
20:10 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5006.eqsin.wmnet with reason: host reimage
20:08 kindrobot@deploy2002: kindrobot and jdlrobson: Backport for [refactor] split out Minerva configuration from main config, Disable Vector js/css sharing on pl.wikipedia (T332809) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
20:07 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5006.eqsin.wmnet with reason: host reimage
20:07 kindrobot@deploy2002: Started scap: Backport for [refactor] split out Minerva configuration from main config, Disable Vector js/css sharing on pl.wikipedia (T332809)
20:03 kindrobot: start UTC late backport window
19:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs5006.eqsin.wmnet with OS bullseye
19:38 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host lvs5006.eqsin.wmnet with OS bullseye
19:36 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
19:35 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
19:09 cwhite: manually upgrade vopsbot on alert2001 to version 0.3.3
18:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5006.eqsin.wmnet with reason: host reimage
18:55 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5006.eqsin.wmnet with reason: host reimage
18:30 brett@cumin2002: START - Cookbook sre.hosts.reimage for host lvs5006.eqsin.wmnet with OS bullseye
18:14 brett: Disable Puppet/PyBal on lvs5006 in preparation for reimaging - T321309
16:02 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin
15:59 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin
15:52 jdrewniak@deploy2002: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 05m 33s)
15:51 cstone: payments-wiki upgraded from 60d0aed5 to 49a2e104
15:46 jdrewniak@deploy2002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 06m 14s)
15:37 volans: restarted sirenbot (vopsbot) on alert2001 (msg="could not find the topic for this channel stored. Is the bot in the channel?")
15:36 mfossati@deploy2002: Finished deploy [airflow-dags/platform_eng@04b4841]: (no justification provided) (duration: 00m 12s)
15:36 mfossati@deploy2002: Started deploy [airflow-dags/platform_eng@04b4841]: (no justification provided)
15:30 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin
15:30 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin
15:27 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw
15:26 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw
15:12 sukhe: rolling restart of bird.service on doh* and not doh2002
15:07 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw
15:07 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw
15:05 aqu@deploy2002: Finished deploy [airflow-dags/analytics_test@fabc2cf]: Deploy refine webrequest job on analytics_test to fix matching Oozie job (duration: 00m 11s)
15:04 aqu@deploy2002: Started deploy [airflow-dags/analytics_test@fabc2cf]: Deploy refine webrequest job on analytics_test to fix matching Oozie job
14:30 stevemunene@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Investigate service failures from bullseye upgrade
14:30 stevemunene@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Investigate service failures from bullseye upgrade
13:50 claime: Testing deploy server dsh group inclusion - T329857
13:47 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1075.eqiad.wmnet']
13:47 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1074.eqiad.wmnet']
13:46 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
13:45 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
13:44 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
13:44 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
13:44 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
13:44 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
13:42 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
13:42 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
13:35 taavi@deploy2002: Finished scap: Backport for GrowthExperiments: add link backend amends (T308133) (duration: 07m 15s)
13:34 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1073']
13:32 jclark@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-be1072']
13:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 11062
13:30 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 11062
13:29 taavi@deploy2002: sgimeno and taavi: Backport for GrowthExperiments: add link backend amends (T308133) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
13:28 taavi@deploy2002: Started scap: Backport for GrowthExperiments: add link backend amends (T308133)
13:25 taavi@deploy2002: Finished scap: Backport for Enable visual enhancements on pages using on huwiki (T333570) (duration: 16m 06s)
13:18 taavi@deploy2002: matmarex and taavi: Backport for Enable visual enhancements on pages using on huwiki (T333570) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
13:09 taavi@deploy2002: Started scap: Backport for Enable visual enhancements on pages using on huwiki (T333570)
12:55 dcausse@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
12:54 dcausse@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/rdf-streaming-updater: apply
12:11 jbond@cumin2002: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=netbox
12:02 jbond: testing netbox failover cookbook
12:02 jbond@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=netbox
11:31 hnowlan@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
11:31 hnowlan@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
11:31 hnowlan@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
11:31 hnowlan@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
11:29 hnowlan@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
11:29 hnowlan@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
11:06 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
11:04 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
11:01 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
10:58 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
10:35 vgutierrez: Extend the ESI test to text@eqsin, revert https://gerrit.wikimedia.org/r/c/operations/puppet/+/905173/ if this gives any issue - T308799
10:26 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
10:26 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED
10:23 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED
10:23 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
10:20 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
10:19 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
09:19 elukey: move kafka-jumbo1006's kafka broker cert to PKI - T296064
09:19 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-jumbo1006.eqiad.wmnet with reason: restart kafka, switch to PKI
09:19 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-jumbo1006.eqiad.wmnet with reason: restart kafka, switch to PKI
08:54 elukey: move kafka-jumbo1009's kafka broker cert to PKI - T296064
08:53 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-jumbo1009.eqiad.wmnet with reason: restart kafka, switch to PKI
08:53 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-jumbo1009.eqiad.wmnet with reason: restart kafka, switch to PKI
08:52 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo
08:50 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo
08:32 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo
08:31 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo
08:31 vgutierrez: rolling upgrade to HAProxy 2.6.12 in A:cp-ulsfo
08:29 elukey: move kafka-main1001's kafka broker to PKI - T319372
08:26 vgutierrez: fetch HAProxy 2.6.12 on thirdparty/haproxy26 for bullseye (apt.wm.o)
08:24 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-main1001.eqiad.wmnet with reason: restart kafka, switch to PKI
08:23 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-main1001.eqiad.wmnet with reason: restart kafka, switch to PKI
08:03 elukey: move kafka-jumbo1008's kafka broker cert to PKI - T296064
08:03 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-jumbo1008.eqiad.wmnet with reason: restart kafka, switch to PKI
08:02 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-jumbo1008.eqiad.wmnet with reason: restart kafka, switch to PKI
07:43 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-jumbo1007.eqiad.wmnet with reason: restart kafka, switch to PKI
07:43 elukey: move kafka-jumbo1007's kafka broker cert to PKI - T296064
06:53 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-jumbo1005.eqiad.wmnet with reason: restart kafka, switch to PKI
06:52 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-jumbo1005.eqiad.wmnet with reason: restart kafka, switch to PKI
06:52 elukey: move kafka-jumbo1005's kafka broker cert to PKI - T296064

2023-04-01

00:13 denisse@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host prometheus5002.eqsin.wmnet with OS bullseye

Other archives

2000s

Archive 1: 2004 Jun - 2004 Sep
Archive 2: 2004 Oct - 2004 Nov
Archive 3: 2004 Dec - 2005 Mar
Archive 4: 2005 Apr - 2005 Jul
Archive 5: 2005 Aug - 2005 Oct, with revision history 2004-06-23 to 2005-11-25
Archive 6: 2005 Nov - 2006 Feb
Archive 7: 2006 Mar - 2006 Jun
Archive 8: 2006 Jul - 2006 Sep
Archive 9: 2006 Oct - 2007 Jan, with revision history 2005-11-25 to 2007-02-21
Archive 10: 2007 Feb - 2007 Jun
Archive 11: 2007 Jul - 2007 Dec
Archive 12: 2008 Jan - 2008 Jul
Archive 12a: 2008 Aug
Archive 12b: 2008 Sep
Archive 13: 2008 Oct - 2009 Jun
Archive 14: 2009 Jun - 2009 Dec
