23:55 denisse@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus5002.eqsin.wmnet with reason: host reimage
23:52 denisse@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus5002.eqsin.wmnet with reason: host reimage
23:21 denisse@cumin1001: START - Cookbook sre.ganeti.reimage for host prometheus5002.eqsin.wmnet with OS bullseye
23:14 denisse@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host prometheus6002.drmrs.wmnet with OS bullseye
23:10 denisse@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host prometheus4002.ulsfo.wmnet with OS bullseye
23:02 denisse@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host prometheus5002.eqsin.wmnet
23:02 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
23:01 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
23:01 denisse@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus6002.drmrs.wmnet with reason: host reimage
22:58 denisse@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus4002.ulsfo.wmnet with reason: host reimage
22:57 denisse@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus6002.drmrs.wmnet with reason: host reimage
22:55 denisse@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus4002.ulsfo.wmnet with reason: host reimage
22:43 denisse@cumin1001: START - Cookbook sre.ganeti.reimage for host prometheus6002.drmrs.wmnet with OS bullseye
22:41 denisse@cumin1001: START - Cookbook sre.ganeti.reimage for host prometheus4002.ulsfo.wmnet with OS bullseye
22:25 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on miscweb[2002-2003].codfw.wmnet,miscweb[1002-1003].eqiad.wmnet with reason: maintenance
22:24 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 0:20:00 on miscweb[2002-2003].codfw.wmnet,miscweb[1002-1003].eqiad.wmnet with reason: maintenance
22:01 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus5002.eqsin.wmnet on all recursors
22:01 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus5002.eqsin.wmnet on all recursors
22:01 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:01 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
22:01 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1075.eqiad.wmnet']
22:00 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
21:07 denisse@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus5002
21:06 denisse@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus5002.eqsin.wmnet
21:05 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus5002.eqsin.wmnet on all recursors
21:05 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus5002.eqsin.wmnet on all recursors
21:05 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
21:05 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
21:04 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
21:02 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus5002.eqsin.wmnet on all recursors
21:02 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus5002.eqsin.wmnet on all recursors
21:02 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
21:02 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
21:00 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
20:58 denisse@cumin1001: START - Cookbook sre.ganeti.makevm for new host prometheus5002.eqsin.wmnet
20:41 denisse@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus5002.eqsin.wmnet
20:41 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus5002.eqsin.wmnet on all recursors
20:41 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus5002.eqsin.wmnet on all recursors
20:40 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:40 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
20:39 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
20:38 denisse@cumin1001: START - Cookbook sre.ganeti.reimage for host prometheus6002.drmrs.wmnet with OS bullseye
20:38 denisse@cumin1001: START - Cookbook sre.ganeti.reimage for host prometheus4002.ulsfo.wmnet with OS bullseye
20:38 denisse@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host prometheus6002.drmrs.wmnet
20:38 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus6002.drmrs.wmnet - denisse@cumin1001"
20:37 denisse@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
20:37 denisse@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host prometheus4002.ulsfo.wmnet
20:37 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4002.ulsfo.wmnet - denisse@cumin1001"
20:37 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus6002.drmrs.wmnet - denisse@cumin1001"
20:33 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
20:30 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4002.ulsfo.wmnet - denisse@cumin1001"
20:16 jclark@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1075.mgmt.eqiad.wmnet with reboot policy FORCED
20:05 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1075.mgmt.eqiad.wmnet with reboot policy FORCED
20:00 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1074.mgmt.eqiad.wmnet with reboot policy FORCED
19:58 denisse@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host prometheus3002.esams.wmnet with OS bullseye
19:45 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
19:45 denisse@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus3002.esams.wmnet with reason: host reimage
19:45 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ms-be1074.mgmt.eqiad.wmnet with reboot policy FORCED
19:42 denisse@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus3002.esams.wmnet with reason: host reimage
19:41 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED
19:40 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1072.mgmt.eqiad.wmnet with reboot policy FORCED
19:39 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1073.mgmt.eqiad.wmnet with reboot policy FORCED
19:37 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus6002.drmrs.wmnet on all recursors
19:37 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus6002.drmrs.wmnet on all recursors
19:37 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:37 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus6002.drmrs.wmnet - denisse@cumin1001"
19:36 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus6002.drmrs.wmnet - denisse@cumin1001"
19:35 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['an-worker1152.eqiad.wmnet']
19:34 denisse@cumin1001: START - Cookbook sre.ganeti.makevm for new host prometheus6002.drmrs.wmnet
19:33 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus5002.eqsin.wmnet on all recursors
19:33 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus5002.eqsin.wmnet on all recursors
19:33 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:33 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
19:33 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1073.eqiad.wmnet']
19:33 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
19:32 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus5002.eqsin.wmnet - denisse@cumin1001"
19:30 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1075.mgmt.eqiad.wmnet with reboot policy FORCED
19:30 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1074.mgmt.eqiad.wmnet with reboot policy FORCED
19:30 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus4002.ulsfo.wmnet on all recursors
19:30 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus4002.ulsfo.wmnet on all recursors
19:30 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:30 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4002.ulsfo.wmnet - denisse@cumin1001"
19:29 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4002.ulsfo.wmnet - denisse@cumin1001"
19:28 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1075.mgmt.eqiad.wmnet with reboot policy FORCED
19:28 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1074.mgmt.eqiad.wmnet with reboot policy FORCED
19:28 denisse@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
18:41 denisse@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host prometheus3002.esams.wmnet
18:40 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus3002.esams.wmnet - denisse@cumin1001"
18:40 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus3002.esams.wmnet - denisse@cumin1001"
17:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-fe1004.eqiad.wmnet with reason: host reimage
17:49 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-fe1004.eqiad.wmnet with reason: host reimage
17:48 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-fe1004.eqiad.wmnet with OS bullseye
17:44 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host thanos-fe1004.eqiad.wmnet with OS bullseye
17:40 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus3002.esams.wmnet on all recursors
17:40 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus3002.esams.wmnet on all recursors
17:40 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:40 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus3002.esams.wmnet - denisse@cumin1001"
17:39 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus3002.esams.wmnet - denisse@cumin1001"
17:27 denisse@cumin1001: START - Cookbook sre.hosts.decommission for hosts prometheus3002.esams.wmnet
17:23 denisse@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus3002.esams.wmnet
17:23 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus3002.esams.wmnet on all recursors
17:23 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus3002.esams.wmnet on all recursors
17:23 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:23 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM prometheus3002.esams.wmnet - denisse@cumin1001"
17:22 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM prometheus3002.esams.wmnet - denisse@cumin1001"
17:20 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus3002.esams.wmnet on all recursors
17:20 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus3002.esams.wmnet on all recursors
17:20 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:20 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus3002.esams.wmnet - denisse@cumin1001"
17:19 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus3002.esams.wmnet - denisse@cumin1001"
17:18 aqu@deploy2002: Finished deploy [airflow-dags/analytics@9182e44]: Fix for VirtualPageview Dag - Analytics [airflow-dags@9182e44] (duration: 00m 11s)
17:18 aqu@deploy2002: Started deploy [airflow-dags/analytics@9182e44]: Fix for VirtualPageview Dag - Analytics [airflow-dags@9182e44]
17:16 ebernhardson@deploy2002: Started deploy [airflow-dags/search@48778b4]: bump discolytics to 0.11.0
17:16 denisse@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus3002.esams.wmnet
17:16 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus3002.esams.wmnet on all recursors
17:16 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus3002.esams.wmnet on all recursors
17:16 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:16 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM prometheus3002.esams.wmnet - denisse@cumin1001"
17:15 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM prometheus3002.esams.wmnet - denisse@cumin1001"
17:13 denisse@cumin1001: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus3002.esams.wmnet - denisse@cumin1001"
16:55 sukhe: restart pybal on lvs4008 to set it primary LVS for high-traffic1
16:54 aqu@deploy2002: Finished deploy [airflow-dags/analytics@2aae7d0]: Fix for VirtualPageview Dag - Analytics [airflow-dags@2aae7d0] (duration: 00m 10s)
16:54 aqu@deploy2002: Started deploy [airflow-dags/analytics@2aae7d0]: Fix for VirtualPageview Dag - Analytics [airflow-dags@2aae7d0]
16:30 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-fe1004.eqiad.wmnet with OS bullseye
16:29 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host thanos-fe1004.eqiad.wmnet with OS bullseye
16:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-fe1004.eqiad.wmnet with OS bullseye
16:28 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host thanos-fe1004.eqiad.wmnet with OS bullseye
16:15 btullis@deploy2002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
16:15 btullis@deploy2002: helmfile [staging] START helmfile.d/services/datahub: sync on main
15:14 ladsgroup@deploy1002: ladsgroup: Backport for Revert "Revert "Revert "mwscript: Switch to use run.php""" synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
15:14 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-fe1004.eqiad.wmnet with OS bullseye
15:10 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host thanos-fe1004.eqiad.wmnet with OS bullseye
15:10 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-fe1004.eqiad.wmnet with OS bullseye
15:08 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe1014.eqiad.wmnet with reason: host reimage
15:08 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host thanos-fe1004.eqiad.wmnet with OS bullseye
15:07 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host thanos-fe1004.eqiad.wmnet with OS bullseye
15:07 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host thanos-fe1004.eqiad.wmnet with OS bullseye
13:11 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-jumbo1004.eqiad.wmnet with reason: restart kafka, switch to PKI
13:11 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-jumbo1004.eqiad.wmnet with reason: restart kafka, switch to PKI
11:09 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Upgrading Gitlab
11:08 eoghan@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Upgrading Gitlab
11:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2067.codfw.wmnet with OS bullseye
10:46 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
10:45 Amir1: Failover m1 from db1101 to db1164 - T333123
10:44 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2067.codfw.wmnet with reason: host reimage
10:32 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['an-worker1149.eqiad.wmnet']
10:28 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2067.codfw.wmnet with OS bullseye
10:25 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup1001.eqiad.wmnet with reason: preparing for m1 primary db switchover
10:25 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on backup1001.eqiad.wmnet with reason: preparing for m1 primary db switchover
10:18 eoghan@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Upgrading Gitlab
09:54 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: reprovisioning after maintenance
09:54 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: reprovisioning after maintenance
09:54 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-jumbo1003.eqiad.wmnet with reason: restart kafka, switch to PKI
09:53 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-jumbo1003.eqiad.wmnet with reason: restart kafka, switch to PKI
09:03 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on kafka-jumbo1002.eqiad.wmnet with reason: restart kafka, switch to PKI
09:03 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 0:15:00 on kafka-jumbo1002.eqiad.wmnet with reason: restart kafka, switch to PKI
08:47 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab2003.wikimedia.org with OS bullseye
08:38 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on an-worker1091.eqiad.wmnet with reason: Replacing battery
08:38 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on an-worker1091.eqiad.wmnet with reason: Replacing battery
08:32 jelto@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
08:27 jelto@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2003.wikimedia.org with reason: host reimage
01:08 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gerrit1003.wikimedia.org with OS bullseye
01:07 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
01:04 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus3002.esams.wmnet - denisse@cumin1001"
00:06 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage
00:04 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus3002.esams.wmnet on all recursors
00:04 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache prometheus3002.esams.wmnet on all recursors
00:04 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
00:04 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus3002.esams.wmnet - denisse@cumin1001"
00:02 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus3002.esams.wmnet - denisse@cumin1001"
2023-03-30
20:14 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1217.eqiad.wmnet with reason: host reimage
20:12 thcipriani@deploy2002: nray and thcipriani: Backport for Remove inline script from United States static page (T331681) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
20:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1216.eqiad.wmnet with reason: host reimage
14:17 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-main-codfw cluster: Roll restart of jvm daemons.
12:08 btullis@deploy2002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
12:02 ladsgroup@deploy2002: ladsgroup: Backport for Set externallinks to WRITE BOTH everywhere (T321662) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
11:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1136.eqiad.wmnet with reason: Maintenance
11:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1136.eqiad.wmnet with reason: Maintenance
11:12 hnowlan@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:12 hnowlan@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add service records for rest-gateway - hnowlan@cumin1001"
11:11 hnowlan@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add service records for rest-gateway - hnowlan@cumin1001"
11:03 ladsgroup@deploy2002: ladsgroup: Backport for Revert "Revert "mwscript: Switch to use run.php"" (T326800) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
11:03 claime: Re-enabling puppet for cp-text - T331318
10:58 volans@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1074.mgmt.eqiad.wmnet with reboot policy FORCED
10:58 volans@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1075.mgmt.eqiad.wmnet with reboot policy FORCED
10:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
10:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
10:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P45994 and previous config saved to /var/cache/conftool/dbconfig/20230330-105011-ladsgroup.json
10:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1136 T333538', diff saved to https://phabricator.wikimedia.org/P45993 and previous config saved to /var/cache/conftool/dbconfig/20230330-104928-ladsgroup.json
10:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1181 to s7 primary T333538', diff saved to https://phabricator.wikimedia.org/P45992 and previous config saved to /var/cache/conftool/dbconfig/20230330-104617-ladsgroup.json
10:45 Amir1: Starting s7 eqiad failover from db1136 to db1181 - T333538
10:44 volans@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1075.mgmt.eqiad.wmnet with reboot policy FORCED
10:35 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-main-eqiad cluster: Roll restart of jvm daemons.
10:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P45989 and previous config saved to /var/cache/conftool/dbconfig/20230330-103506-ladsgroup.json
10:29 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-be1075.mgmt.eqiad.wmnet with reboot policy FORCED
10:27 jclark@cumin1001: START - Cookbook sre.hosts.provision for host ms-be1075.mgmt.eqiad.wmnet with reboot policy FORCED
10:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1181 with weight 0 T333538', diff saved to https://phabricator.wikimedia.org/P45988 and previous config saved to /var/cache/conftool/dbconfig/20230330-102012-ladsgroup.json
10:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P45987 and previous config saved to /var/cache/conftool/dbconfig/20230330-102002-ladsgroup.json
10:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 T333538
10:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s7 T333538
00:02 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1225.mgmt.eqiad.wmnet with reboot policy FORCED
2023-03-29
23:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1224.mgmt.eqiad.wmnet with reboot policy FORCED
23:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1223.mgmt.eqiad.wmnet with reboot policy FORCED
23:53 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on contint2002.wikimedia.org with reason: WIP-known-to-be-debugged-new-host
23:52 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on contint2002.wikimedia.org with reason: WIP-known-to-be-debugged-new-host
23:51 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1224.mgmt.eqiad.wmnet with reboot policy FORCED
23:50 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1223.mgmt.eqiad.wmnet with reboot policy FORCED
23:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1221.mgmt.eqiad.wmnet with reboot policy FORCED
23:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1222.mgmt.eqiad.wmnet with reboot policy FORCED
23:48 mutante: contint2002 - a2dismod mpm_event (ONCE AGAIN this year old issue when applying roles with apache for the first time) - running puppet - now it can actually install PHP 7.3 and start apache T324659
23:48 mutante: contint2002 - a2dismod mpm_event (ONCE AGAIN this year old issue when applying roles with apache for the first time) - running puppet - now it can actually install PHP 7.3 and start apache
23:23 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1072.eqiad.wmnet']
15:06 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main2001.codfw.wmnet with reason: Stop kafka, dist-upgrade
15:06 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main2001.codfw.wmnet with reason: Stop kafka, dist-upgrade
13:41 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main2002.codfw.wmnet with reason: stop kafka, dist-upgrade
13:41 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main2002.codfw.wmnet with reason: stop kafka, dist-upgrade
13:29 sukhe: disable puppet on A:lvs to test Python 2 deprecation change: T321309
13:21 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde and arlolra: Backport for Enabled native gallery editing in Parsoid (T329662) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
13:10 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde and matmarex: Backport for Enable history page visual diffs on remaining wikis (T314588) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
13:09 dcausse@deploy2002: Started deploy [airflow-dags/search@92e9876]: (no justification provided)
12:50 mvernon@cumin1001: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
12:43 mvernon@cumin1001: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
10:52 claime: Running puppet on dns-auth - T333120
10:50 claime: Switching mw-api-int to production - T333120
10:50 claime: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1019*,lvs2009*} and A:lvs (T333120)
10:49 cgoubert@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1019*,lvs2009*} and A:lvs (T333120)
10:46 cgoubert@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1019*,lvs2009*} and A:lvs (T333120)
10:43 cgoubert@cumin1001: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1020*,lvs2010*} and A:lvs (T333120)
10:41 cgoubert@cumin1001: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1020*,lvs2010*} and A:lvs (T333120)
10:37 claime: Switching mw-api-int to lvs_setup - T333120
10:02 hnowlan@deploy2002: Started deploy [restbase/deploy@c265f3f]: Add ckbwiktionary, anpwiki T332093 T332379
09:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox-dev2002.codfw.wmnet with reason: a good reason - ayounsi@cumin1001
09:58 claime: running puppet on O:kubernetes::worker and O:lvs::balancer - T333120
09:58 denisse: updating prometheus3001 to bullseye
09:57 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code netbox to netbox-dev2002.codfw.wmnet with reason: a good reason - ayounsi@cumin1001
09:57 claime: Adding mw-api-int to service_catalog in service_setup - T333120
09:56 ayounsi@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox-dev2002.codfw.wmnet with reason: a good reason - ayounsi@cumin1001
09:54 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code netbox to netbox-dev2002.codfw.wmnet with reason: a good reason - ayounsi@cumin1001
09:54 ayounsi@cumin1001: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) netbox to netbox-dev2002.codfw.wmnet with reason: a good reason - ayounsi@cumin1001
09:50 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code netbox to netbox-dev2002.codfw.wmnet with reason: a good reason - ayounsi@cumin1001
09:50 ayounsi@cumin1001: END (FAIL) - Cookbook sre.deploy.python-code (exit_code=99) netbox to netbox-dev2002.codfw.wmnet with reason: a good reason - ayounsi@cumin1001
09:50 ayounsi@cumin1001: START - Cookbook sre.deploy.python-code netbox to netbox-dev2002.codfw.wmnet with reason: a good reason - ayounsi@cumin1001
09:27 filippo@deploy2002: filippo: Backport for Revert "Failover statsd to graphite2004" synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
09:02 elukey: move kafka on kafka-jumbo1001 to PKI TLS certs - T296064
09:02 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-jumbo1001.eqiad.wmnet with reason: restart kafka, upgrade to PKI
09:02 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-jumbo1001.eqiad.wmnet with reason: restart kafka, upgrade to PKI
08:03 volans: installed spicerack v6.4.0 on cumin1001
07:35 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main2003.codfw.wmnet with reason: Stop kafka, dist-upgrade
07:34 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main2003.codfw.wmnet with reason: Stop kafka, dist-upgrade
18:36 xcollazo@deploy2002: Started deploy [airflow-dags/platform_eng@0f1c9e8]: Deploy latest image_suggestions on platform_eng Airflow instance
18:33 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1207.mgmt.eqiad.wmnet with reboot policy FORCED
18:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1208.mgmt.eqiad.wmnet with reboot policy FORCED
18:32 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1207.mgmt.eqiad.wmnet with reboot policy FORCED
18:28 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1208.mgmt.eqiad.wmnet with reboot policy FORCED
18:28 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1207.mgmt.eqiad.wmnet with reboot policy FORCED
18:25 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:25 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new db nodes - pt1979@cumin2002"
18:23 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for new db nodes - pt1979@cumin2002"
15:20 jnuche@deploy2002: Started scap: testwikis wikis to 1.41.0-wmf.2 refs T330208
15:15 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: installation failed due to read-only database
15:15 aokoth@cumin1001: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: installation failed due to read-only database
14:51 akosiaris@cumin1001: END (FAIL) - Cookbook sre.discovery.datacenter (exit_code=93) pool all active/active services in eqiad: eqiad row B switches upgrade done - T330165
14:32 akosiaris@cumin1001: START - Cookbook sre.discovery.datacenter pool all active/active services in eqiad: eqiad row B switches upgrade done - T330165
14:31 sukhe: run authdns-update to revert eqiad depool
12:16 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 45295
12:15 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 45295
12:09 eoghan@cumin1001: START - Cookbook sre.ganeti.reimage for host aphlict1002.eqiad.wmnet with OS bullseye
11:57 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main1002.eqiad.wmnet with reason: stop kafka and dist-upgrade
11:57 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main1002.eqiad.wmnet with reason: stop kafka and dist-upgrade
11:56 elukey: dist-upgrade kafka-main1002 to debian bullseye - T332013
10:16 stevemunene@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-client1002.eqiad.wmnet with reason: host reimage
10:12 stevemunene@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-client1002.eqiad.wmnet with reason: host reimage
09:56 stevemunene@cumin1001: START - Cookbook sre.ganeti.reimage for host an-test-client1002.eqiad.wmnet with OS bullseye
09:45 vgutierrez@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on cp2035.codfw.wmnet with reason: HW issues
09:45 vgutierrez@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on cp2035.codfw.wmnet with reason: HW issues
09:28 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main1001.eqiad.wmnet with reason: stop kafka and dist-upgrade
09:28 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main1001.eqiad.wmnet with reason: stop kafka and dist-upgrade
09:12 jbond@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Nicolas Fraison out of all services on: 2048 hosts
09:11 jbond@cumin1001: START - Cookbook sre.idm.logout Logging Nicolas Fraison out of all services on: 2048 hosts
09:11 jbond@cumin1001: END (ERROR) - Cookbook sre.idm.logout (exit_code=97) Logging Nicolas Fraison out of systemdlogoutd on: 2048 hosts
09:11 jbond@cumin1001: START - Cookbook sre.idm.logout Logging Nicolas Fraison out of systemdlogoutd on: 2048 hosts
08:58 vgutierrez: restart ipmiseld on cp2035
08:50 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.wikimedia.org
08:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on 16 hosts with reason: Switch maintenance
08:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on 16 hosts with reason: Switch maintenance
08:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on 21 hosts with reason: Switch maintenance
08:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 8:00:00 on 21 hosts with reason: Switch maintenance
08:04 oblivian@deploy2002: oblivian and filippo: Backport for Failover statsd to graphite2004 (T330165) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
08:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on es[1020-1022].eqiad.wmnet with reason: Switch maintenance
21:11 tzatziki: moving Universal Code of Conduct/Enforcement guidelines -> Universal Code of Conduct/Enforcement guidelines/Version 1 on metawiki with `extensions/Translate/scripts/moveTranslatableBundle.php`
20:45 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudvirt1022.eqiad.wmnet
20:45 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:45 andrew@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudvirt1022.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1001"
20:43 andrew@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudvirt1022.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1001"
20:36 andrew@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudvirt1022.eqiad.wmnet
20:35 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudvirt1021.eqiad.wmnet
20:35 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:35 andrew@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudvirt1021.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1001"
20:33 andrew@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudvirt1021.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1001"
20:25 andrew@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudvirt1021.eqiad.wmnet
20:25 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudvirt1017.eqiad.wmnet
20:25 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:25 andrew@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudvirt1017.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1001"
20:23 andrew@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudvirt1017.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1001"
15:17 elukey@deploy2002: Synchronized private/PrivateSettings.php: (no justification provided) (duration: 06m 10s)
15:05 eoghan@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host aphlict1002.eqiad.wmnet
14:56 eoghan@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) aphlict1002.eqiad.wmnet on all recursors
14:56 eoghan@cumin1001: START - Cookbook sre.dns.wipe-cache aphlict1002.eqiad.wmnet on all recursors
14:56 eoghan@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:56 eoghan@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM aphlict1002.eqiad.wmnet - eoghan@cumin1001"
14:55 eoghan@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM aphlict1002.eqiad.wmnet - eoghan@cumin1001"
14:08 taavi@deploy2002: taavi: Backport for namespaceDupes: Remove extra addQuotes() calls (T333166) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
13:04 taavi@deploy2002: superpes and taavi: Backport for [huwiki] Add Draft and Draft_talk namespaces (T333083) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
11:55 elukey@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons.
10:10 elukey: dist-upgrade kafka-main1003 manually to bullseye - T332013
10:03 Emperor: depool ms-fe2009
09:47 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main1003.eqiad.wmnet with reason: stop kafka and dist-upgrade
09:47 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main1003.eqiad.wmnet with reason: stop kafka and dist-upgrade
09:45 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45295
09:44 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 45295
09:41 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:30 ladsgroup@deploy1002: ladsgroup: Backport for EntityUsageTable: Mark query as read-only (T332941) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
08:28 jynus: restarting bacula at backup1001 T331510
08:25 urbanecm@deploy2002: Synchronized wmf-config/InitialiseSettings.php: 63dd23b: [Growth] eswiki: Enable mentorship for 50% of newcomers (T332737, T285235) (duration: 06m 09s)
05:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1120 T332292', diff saved to https://phabricator.wikimedia.org/P45942 and previous config saved to /var/cache/conftool/dbconfig/20230327-051941-root.json
05:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2132,2160].codfw.wmnet,db[1101,1117,1164].eqiad.wmnet with reason: m1 master switch T331510
05:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2132,2160].codfw.wmnet,db[1101,1117,1164].eqiad.wmnet with reason: m1 master switch T331510
07:54 hashar@deploy2002: Started deploy [integration/docroot@ab848e3]: build: Updating eslint-config-wikimedia to 0.24.0
00:59 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doc1002.eqiad.wmnet with reason: WIP-known-to-be-debugged-new-host
00:58 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on doc1002.eqiad.wmnet with reason: WIP-known-to-be-debugged-new-host
00:57 mutante: doc1002 - issue is mismatched UIDs again, most likely. doc-uploader is debmonitor on new host
00:56 mutante: doc1002 - manually running rsync to doc2002 - which failed with status 23 when started by timer
00:09 tzatziki: removing 2 files for legal compliance
23:50 tzatziki: removing 1 file for legal compliance
21:08 mutante: mwmaint1002 ferm rules for rsyncd_access from miscweb removed by puppet after I4fe17f which reverted a8af0339bde14018e8. manually deleted rsyncd config and stopped rsync service. complete noop on mwmaint2002 which is currently the active mwmaint server. T328907
10:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 21 days, 0:00:00 on krb2002.codfw.wmnet with reason: Non-functional, WIP for Bullseye update
10:55 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 21 days, 0:00:00 on krb2002.codfw.wmnet with reason: Non-functional, WIP for Bullseye update
10:35 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
10:00 marostegui: Upgrade db1204 to mariadb 10.6 T330861
08:57 hashar: Fixed up Gerrit > GitHub replication which broke at 5:00 UTC by updating the Github RSA ssh host key T332972
05:37 hashar: gerrit: refreshed ssh host key for `github.com`
05:28 hashar: Restarted Gerrit
05:26 hashar: Stopping Gerrit
05:26 hashar@deploy2002: Finished deploy [gerrit/gerrit@c1cbda4]: Update js plugins for EarlyWarning bot (T330850) and displaying Zuul status on changes (T241068) (duration: 00m 10s)
05:26 hashar@deploy2002: Started deploy [gerrit/gerrit@c1cbda4]: Update js plugins for EarlyWarning bot (T330850) and displaying Zuul status on changes (T241068)
05:22 hashar: Restarting gerrit replica on gerrit2002.wikimedia.org
05:21 hashar@deploy2002: Finished deploy [gerrit/gerrit@c1cbda4]: Update js plugins for EarlyWarning bot (T330850) and displaying Zuul status on changes (T241068) (duration: 00m 07s)
05:20 hashar@deploy2002: Started deploy [gerrit/gerrit@c1cbda4]: Update js plugins for EarlyWarning bot (T330850) and displaying Zuul status on changes (T241068)
05:17 hashar: Restarting Gerrit for deploying plugins updates
05:10 ejegg: Standalone SmashPig upgraded from 3b84e4cb to 50139e82
22:09 mutante: moscovium - when doing an in-place upgrade from buster to bullseye and you replace the string in sources.list, you also need to replace "bullseye-updates" with "bullseye-security" in the security.debian.org lines - that this is needed is called a bug at https://shagain.club/index.php/archives/641/ - T327068
22:00 mutante: moscovium - apt-get full-upgrade ; apt autoremove ; replace buster with bullseye in sources.list ; repeat apt-get upgrade/full-upgrade etc. (https://wiki.debian.org/DebianUpgrade) T327068
22:00 denisse@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host doc2002.codfw.wmnet with OS bullseye
19:36 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doc2002.codfw.wmnet on all recursors
19:36 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache doc2002.codfw.wmnet on all recursors
19:36 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:36 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doc2002.codfw.wmnet - denisse@cumin1001"
19:35 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doc2002.codfw.wmnet - denisse@cumin1001"
19:31 denisse@cumin1001: START - Cookbook sre.ganeti.makevm for new host doc2002.codfw.wmnet
19:28 denisse@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doc2002
19:28 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:28 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doc2002 decommissioned, removing all IPs except the asset tag one - denisse@cumin1001"
19:20 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doc2002 decommissioned, removing all IPs except the asset tag one - denisse@cumin1001"
16:59 sukhe: rolling out CR 901333 to A:cp-text T313578
16:45 sukhe: disable Puppet in A:cp to test and then merge CR 901333
16:17 elukey@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-main2002.codfw.wmnet with OS bullseye
16:07 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-main2002.codfw.wmnet with OS bullseye
16:04 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main2002.codfw.wmnet with reason: stop kafka and reimage
16:04 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main2002.codfw.wmnet with reason: stop kafka and reimage
14:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) irc1002.wikimedia.org on all recursors
14:24 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache irc1002.wikimedia.org on all recursors
14:24 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:24 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM irc1002.wikimedia.org - jmm@cumin2002"
14:21 sukhe@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host pybal-test2003.codfw.wmnet with OS bullseye
14:19 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cephosd1002.eqiad.wmnet
14:16 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM irc1002.wikimedia.org - jmm@cumin2002"
13:29 samtar@deploy2002: samtar and sgimeno: Backport for GrowthExperiments: disable add a link backend (T304551) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
11:08 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-main2004.codfw.wmnet with OS bullseye
11:07 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main2004.codfw.wmnet with reason: stop kafka and reimage
11:06 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main2004.codfw.wmnet with reason: stop kafka and reimage
11:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on irc2002.wikimedia.org with reason: host reimage
10:56 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on irc2002.wikimedia.org with reason: host reimage
10:44 jmm@cumin2002: START - Cookbook sre.ganeti.reimage for host irc2002.wikimedia.org with OS bullseye
10:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host irc2002.wikimedia.org
10:38 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2005.codfw.wmnet with OS bullseye
10:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) irc2002.wikimedia.org on all recursors
10:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache irc2002.wikimedia.org on all recursors
10:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
10:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM irc2002.wikimedia.org - jmm@cumin2002"
10:18 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2005.codfw.wmnet with reason: host reimage
10:15 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2005.codfw.wmnet with reason: host reimage
10:10 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM irc2002.wikimedia.org - jmm@cumin2002"
10:08 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host irc2002.wikimedia.org
10:01 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-main2005.codfw.wmnet with OS bullseye
09:57 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main2005.codfw.wmnet with reason: stop kafka and reimage
09:57 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main2005.codfw.wmnet with reason: stop kafka and reimage
09:47 moritzm: uploaded prometheus-druid-exporter 0.8-2 for bullseye-wikimedia T332584 T332589
08:21 elukey: clean up docker and reboot kubernetes2024 to enable overlay2 - T332803
08:11 vgutierrez: testing HAProxy 2.6.11 in cp4044 - T332796
08:08 vgutierrez: fetch haproxy 2.6.11 in apt.wm.o thirdparty/haproxy26 for bullseye & buster
08:04 vgutierrez: rolling rollback to HAProxy 2.6.9 in cache text cluster - T332796
07:54 elukey: clean up docker and reboot kubernetes2023 to enable overlay2 - T332803
07:50 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubernetes2023.codfw.wmnet with reason: Restart docker with overlay
07:49 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kubernetes2023.codfw.wmnet with reason: Restart docker with overlay
07:49 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubernetes2024.codfw.wmnet with reason: Restart docker with overlay
07:49 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kubernetes2024.codfw.wmnet with reason: Restart docker with overlay
07:42 elukey: clean up docker on kubernetes1024 (cordon + stop kubelet + docker + clean /var/lib/docker/*) and reboot to enable overlay2 - T332803
07:38 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubernetes1024.eqiad.wmnet with reason: Restart docker with overlay
07:37 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kubernetes1024.eqiad.wmnet with reason: Restart docker with overlay
07:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45928 and previous config saved to /var/cache/conftool/dbconfig/20230323-072315-root.json
07:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45927 and previous config saved to /var/cache/conftool/dbconfig/20230323-070811-root.json
06:53 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P45926 and previous config saved to /var/cache/conftool/dbconfig/20230323-065306-root.json
06:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P45925 and previous config saved to /var/cache/conftool/dbconfig/20230323-063800-root.json
06:22 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P45924 and previous config saved to /var/cache/conftool/dbconfig/20230323-062255-root.json
06:07 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P45923 and previous config saved to /var/cache/conftool/dbconfig/20230323-060750-root.json
05:37 denisse@cumin1001: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host doc2002.codfw.wmnet with OS bullseye
05:34 stevemunene@cumin1001: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host an-test-client1002.eqiad.wmnet with OS bullseye
04:25 denisse@cumin1001: START - Cookbook sre.ganeti.reimage for host doc2002.codfw.wmnet with OS bullseye
02:07 denisse@cumin1001: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host doc2002.codfw.wmnet with OS bullseye
02:00 mutante: rsyncing ~4GB files for static-codereview.wikimedia.org from old to newer VMs for T331896 - no automatic sync / deploy for these
00:57 denisse@cumin1001: START - Cookbook sre.ganeti.reimage for host doc2002.codfw.wmnet with OS bullseye
00:57 denisse@cumin1001: END (ERROR) - Cookbook sre.ganeti.reimage (exit_code=97) for host doc2002.codfw.wmnet with OS bullseye
00:57 denisse@cumin1001: START - Cookbook sre.ganeti.reimage for host doc2002.codfw.wmnet with OS bullseye
00:27 denisse@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doc2002.codfw.wmnet
00:10 denisse@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host doc1003.eqiad.wmnet with OS bullseye
2023-03-22
23:59 denisse@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doc1003.eqiad.wmnet with reason: host reimage
23:56 denisse@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on doc1003.eqiad.wmnet with reason: host reimage
23:46 denisse@cumin1001: START - Cookbook sre.ganeti.reimage for host doc1003.eqiad.wmnet with OS bullseye
23:34 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doc2002.codfw.wmnet on all recursors
23:34 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache doc2002.codfw.wmnet on all recursors
23:34 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
23:33 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doc2002.codfw.wmnet - denisse@cumin1001"
23:32 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doc2002.codfw.wmnet - denisse@cumin1001"
20:59 taavi@deploy2002: taavi: Backport for Set OATHAuthMultipleDevicesMigrationStage in IS synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
20:12 denisse@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doc1003.eqiad.wmnet on all recursors
20:11 denisse@cumin1001: START - Cookbook sre.dns.wipe-cache doc1003.eqiad.wmnet on all recursors
20:11 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
20:11 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doc1003.eqiad.wmnet - denisse@cumin1001"
20:10 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doc1003.eqiad.wmnet - denisse@cumin1001"
20:01 denisse@cumin1001: START - Cookbook sre.ganeti.makevm for new host doc1003.wikimedia.org
18:16 dancy@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.41.0-wmf.1 refs T330207
18:12 mutante: rsyncing /srv/org/wikimedia/sitemaps files for https://sitemaps.wikimedia.org from old to new machines. most other things are auto-deployed by puppet or puppet running initial scap or automatic rsync. this is not. rsync -av /srv/org/wikimedia/sitemaps/ rsync://miscweb2003.codfw.wmnet/miscapps-srv/org/wikimedia/sitemaps/ T331896 - but also see T332101
17:53 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dborch1002.wikimedia.org
17:53 jhathaway@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:53 jhathaway@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dborch1002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jhathaway@cumin1001"
17:38 _joe_: stopping apache on mwdebug1001 to test the new envoy error page
17:15 hashar@deploy2002: Synchronized composer.json: build: add local typos check to composer.json # T332121 (duration: 06m 44s)
17:12 jhathaway@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dborch1002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jhathaway@cumin1001"
15:23 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2004.codfw.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware
15:23 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2004.codfw.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware
15:22 hnowlan: removing java packages from maps hosts
14:21 stevemunene@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-client1002.eqiad.wmnet with reason: host reimage
14:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45917 and previous config saved to /var/cache/conftool/dbconfig/20230322-141923-root.json
14:17 stevemunene@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-client1002.eqiad.wmnet with reason: host reimage
14:17 sukhe: enable Puppet on A:wikidough to roll out dnsdist.conf change
14:13 sukhe: disable Puppet on A:wikidough to roll out dnsdist.conf change
14:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45916 and previous config saved to /var/cache/conftool/dbconfig/20230322-140418-root.json
14:02 stevemunene@cumin1001: START - Cookbook sre.ganeti.reimage for host an-test-client1002.eqiad.wmnet with OS bullseye
13:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P45915 and previous config saved to /var/cache/conftool/dbconfig/20230322-134913-root.json
13:35 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1014.mgmt.eqiad.wmnet with reboot policy FORCED
13:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P45914 and previous config saved to /var/cache/conftool/dbconfig/20230322-133409-root.json
13:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P45913 and previous config saved to /var/cache/conftool/dbconfig/20230322-131904-root.json
11:30 marostegui: Poweroff db1121 (lag will show on wikireplicas for s4 section) T323961
11:24 elukey@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts kafka-main2005.codfw.wmnet
11:24 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafka-main2005.codfw.wmnet
11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depool needs to be rebooted T323961', diff saved to https://phabricator.wikimedia.org/P45910 and previous config saved to /var/cache/conftool/dbconfig/20230322-112031-root.json
11:17 elukey@cumin1001: START - Cookbook sre.hosts.reboot-single for host kafka-main2005.codfw.wmnet
11:16 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main2005.codfw.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware
11:16 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main2005.codfw.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware
10:23 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main2005.codfw.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware
10:22 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main2005.codfw.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware
10:16 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1004.eqiad.wmnet with OS bullseye
10:07 stevemunene@cumin1001: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host an-test-client1002.eqiad.wmnet with OS bullseye
09:56 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1004.eqiad.wmnet with reason: host reimage
09:54 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1004.eqiad.wmnet with reason: host reimage
09:38 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-main1004.eqiad.wmnet with OS bullseye
09:36 elukey@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts kafka-main1004.eqiad.wmnet
09:27 elukey@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host kafka-main1004.eqiad.wmnet
09:27 elukey@cumin1001: START - Cookbook sre.hosts.reboot-single for host kafka-main1004.eqiad.wmnet
09:01 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on kafka-main1004.eqiad.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware
09:01 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on kafka-main1004.eqiad.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware
08:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on pybal-test2003.codfw.wmnet with reason: Some tests with pybal/Bullseye
08:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on pybal-test2003.codfw.wmnet with reason: Some tests with pybal/Bullseye
08:52 stevemunene@cumin1001: START - Cookbook sre.ganeti.reimage for host an-test-client1002.eqiad.wmnet with OS bullseye
08:25 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
08:25 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
20:31 taavi@deploy2002: matmarex and taavi: Backport for Simplify/Fix wgDiscussionToolsEnablePermalinksBackend config synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
16:50 jbond: copy /usr/bin/prometheus-ipmi-exporter from bullseye to buster
16:46 jhathaway@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dborch1002.wikimedia.org on all recursors
16:46 jhathaway@cumin1001: START - Cookbook sre.dns.wipe-cache dborch1002.wikimedia.org on all recursors
16:46 jhathaway@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:46 jhathaway@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM dborch1002.wikimedia.org - jhathaway@cumin1001"
16:45 jhathaway@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM dborch1002.wikimedia.org - jhathaway@cumin1001"
16:28 jbond: upload prometheus-ipmi-exporter_1.6.1 to bullseye
16:15 stevemunene@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) an-test-client1002.eqiad.wmnet on all recursors
16:15 stevemunene@cumin1001: START - Cookbook sre.dns.wipe-cache an-test-client1002.eqiad.wmnet on all recursors
16:14 stevemunene@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:14 stevemunene@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM an-test-client1002.eqiad.wmnet - stevemunene@cumin1001"
16:13 stevemunene@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM an-test-client1002.eqiad.wmnet - stevemunene@cumin1001"
13:11 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on kafka-main1005.eqiad.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware
13:11 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on kafka-main1005.eqiad.wmnet with reason: Stop kafka, update idrac/bios/nic-firmware
13:05 elukey: move kafka mirror maker instances to PKI migration settings (new truststores) - T319372
11:20 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
11:09 joal: Unpause mediacounts_load airflow job with start_date set to 2023-03-21T10:00
10:14 joal@deploy2002: Started deploy [analytics/refinery@0bb61e9]: Regular analytics weekly train [analytics/refinery@0bb61e9]
09:43 elukey@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-main1005.eqiad.wmnet with OS bullseye
09:39 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on kafka-main1005.eqiad.wmnet with reason: Stop kafka, attempt to reimage
09:39 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on kafka-main1005.eqiad.wmnet with reason: Stop kafka, attempt to reimage
09:25 phedenskog@deploy2002: Started deploy [performance/navtiming@d2b97ad]: (no justification provided)
09:06 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cephosd[1001-1005].eqiad.wmnet with reason: Systemd units failing, puppet tries to bring them up periodically, spam on IRC
09:05 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on cephosd[1001-1005].eqiad.wmnet with reason: Systemd units failing, puppet tries to bring them up periodically, spam on IRC
08:31 elukey: move purged daemons on cp nodes to a new CA bundle (to allow accepting kafka clients using PKI tls certs) - T319372
06:50 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 13150
06:49 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 13150
21:52 samtar@deploy2002: jdlrobson and samtar: Backport for Add languages to Minerva HTML (T331905) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
21:11 samtar@deploy2002: samtar and aleksandar: Backport for Rename project and project talk namespace for shwiki (T332614) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
19:49 mutante: miscweb1003 - manually edit /srv/deployment/iegreview/iegreview-cache/.config and replace tin.eqiad.wmnet with deployment.eqiad.wmnet (which is an alias for deploy2002.codfw.wmnet) T257317 T332623 T331896
18:05 mutante: miscweb1003 - syntax error in httpd config due to "Unknown Authn provider: ldap" - comes from static-rt vhost (T331896)
18:04 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1019.eqiad.wmnet
18:04 sukhe@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs1019.eqiad.wmnet
17:59 mutante: when applying apache role for the first time on new hosts we still have the same old conflict: miscweb1003 - manual "a2dismod mpm_event" to be able to let puppet enable mod PHP (T196968)
17:57 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on miscweb1003.eqiad.wmnet with reason: maintenance
17:57 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on miscweb1003.eqiad.wmnet with reason: maintenance
17:55 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on lvs1019.eqiad.wmnet with reason: reboot for kernel update
17:55 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:40:00 on lvs1019.eqiad.wmnet with reason: reboot for kernel update
17:26 akosiaris: disable puppet on rdb*, netbox*, ores*, registry*
17:14 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on lvs3006.esams.wmnet with reason: reboot for kernel update
17:14 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:40:00 on lvs3006.esams.wmnet with reason: reboot for kernel update
17:14 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet with reason: reboot for kernel update
17:14 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:40:00 on lvs2009.codfw.wmnet,lvs1019.eqiad.wmnet with reason: reboot for kernel update
14:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es2029 and promote es2027 to es3 master', diff saved to https://phabricator.wikimedia.org/P45896 and previous config saved to /var/cache/conftool/dbconfig/20230320-143951-root.json
14:11 TheresNoTime: close UTC afternoon backport window
14:10 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lvs1018.eqiad.wmnet with reason: rebooting for kernel updates
14:10 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on lvs1018.eqiad.wmnet with reason: rebooting for kernel updates
13:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host cuminunpriv1001.eqiad.wmnet with OS bullseye
13:39 samtar@deploy2002: aleksandar and samtar: Backport for SITENAME change of Serbo-Croatian Wikipedia (T332468) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
13:35 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lvs2008.codfw.wmnet with reason: rebooting for kernel updates
13:35 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on lvs2008.codfw.wmnet with reason: rebooting for kernel updates
13:34 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lvs3005.esams.wmnet with reason: rebooting for kernel updates
13:34 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on lvs3005.esams.wmnet with reason: rebooting for kernel updates
22:47 fab@deploy2002: Started deploy [airflow-dags/research@5edcd7b]: (no justification provided)
14:26 apergos: rsync of xmldata public dir from screen as ariel on dumpsdata1004 to dumpsdata1005, no bandwidth cap
13:46 apergos: rsync of xmldata private dir from screen as ariel on dumpsdata1004 to dumpsdata1005, no bandwidth cap
07:55 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on cephosd[1001-1005].eqiad.wmnet with reason: Systemd units failing, puppet tries to bring them up periodically, spam on IRC
07:55 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on cephosd[1001-1005].eqiad.wmnet with reason: Systemd units failing, puppet tries to bring them up periodically, spam on IRC
19:06 ebernhardson@deploy2002: Started deploy [airflow-dags/search@7d75578]: enable templating of ores threshold fetch
18:35 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lvs6002.drmrs.wmnet with reason: rebooting for kernel updates
18:35 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on lvs6002.drmrs.wmnet with reason: rebooting for kernel updates
18:34 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lvs5005.eqsin.wmnet with reason: rebooting for kernel updates
18:34 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on lvs5005.eqsin.wmnet with reason: rebooting for kernel updates
18:32 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on lvs1017.eqiad.wmnet with reason: rebooting for kernel updates
18:31 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:40:00 on lvs1017.eqiad.wmnet with reason: rebooting for kernel updates
18:09 fab@deploy2002: Started deploy [airflow-dags/research@5edcd7b]: (no justification provided)
18:04 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lvs2007.codfw.wmnet with reason: rebooting for kernel updates
18:04 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on lvs2007.codfw.wmnet with reason: rebooting for kernel updates
17:35 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lvs6001.drmrs.wmnet with reason: rebooting for kernel updates
17:35 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on lvs6001.drmrs.wmnet with reason: rebooting for kernel updates
17:31 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs5004.eqsin.wmnet
17:31 sukhe@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs5004.eqsin.wmnet
17:29 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lvs4008.ulsfo.wmnet with reason: rebooting for kernel updates
17:29 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on lvs4008.ulsfo.wmnet with reason: rebooting for kernel updates
17:05 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on lvs5004.eqsin.wmnet with reason: rebooting for kernel updates
17:05 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:30:00 on lvs5004.eqsin.wmnet with reason: rebooting for kernel updates
15:50 bking@cumin1001: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
05:56 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1106 to dbctl', diff saved to https://phabricator.wikimedia.org/P45887 and previous config saved to /var/cache/conftool/dbconfig/20230317-055643-marostegui.json
01:05 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2010.codfw.wmnet
01:05 sukhe@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2010.codfw.wmnet
00:35 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on lvs1020.eqiad.wmnet with reason: rebooting for kernel updates
00:35 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:10:00 on lvs1020.eqiad.wmnet with reason: rebooting for kernel updates
00:26 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on lvs2010.codfw.wmnet with reason: rebooting for kernel updates
00:26 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:10:00 on lvs2010.codfw.wmnet with reason: rebooting for kernel updates
00:13 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on lvs5006.eqsin.wmnet with reason: rebooting for kernel updates
00:13 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:10:00 on lvs5006.eqsin.wmnet with reason: rebooting for kernel updates
2023-03-16
23:41 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on lvs6003.drmrs.wmnet with reason: rebooting for kernel updates
23:40 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:10:00 on lvs6003.drmrs.wmnet with reason: rebooting for kernel updates
23:33 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:25:00 on lvs3007.esams.wmnet with reason: rebooting for kernel updates
23:33 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:25:00 on lvs3007.esams.wmnet with reason: rebooting for kernel updates
23:31 dzahn@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host miscweb2003.codfw.wmnet with OS bullseye
23:28 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host miscweb1003.eqiad.wmnet with OS bullseye
23:20 ebernhardson@deploy2002: Started deploy [airflow-dags/search@e6f0142]: bump discolytics env to 0.7.0
23:18 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on miscweb2003.codfw.wmnet with reason: host reimage
23:15 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on miscweb2003.codfw.wmnet with reason: host reimage
23:14 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on miscweb1003.eqiad.wmnet with reason: host reimage
23:11 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on miscweb1003.eqiad.wmnet with reason: host reimage
23:01 dzahn@cumin1001: START - Cookbook sre.ganeti.reimage for host miscweb1003.eqiad.wmnet with OS bullseye
23:00 dzahn@cumin2002: START - Cookbook sre.ganeti.reimage for host miscweb2003.codfw.wmnet with OS bullseye
22:49 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host miscweb1003.eqiad.wmnet
22:42 dzahn@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host miscweb2003.codfw.wmnet
22:39 dzahn@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) miscweb1003.eqiad.wmnet on all recursors
22:39 dzahn@cumin1001: START - Cookbook sre.dns.wipe-cache miscweb1003.eqiad.wmnet on all recursors
22:39 dzahn@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:39 dzahn@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM miscweb1003.eqiad.wmnet - dzahn@cumin1001"
22:38 dzahn@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM miscweb1003.eqiad.wmnet - dzahn@cumin1001"
22:35 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host miscweb1003.eqiad.wmnet
22:32 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) miscweb2003.codfw.wmnet on all recursors
22:32 dzahn@cumin2002: START - Cookbook sre.dns.wipe-cache miscweb2003.codfw.wmnet on all recursors
22:32 dzahn@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:32 dzahn@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM miscweb2003.codfw.wmnet - dzahn@cumin2002"
22:31 dzahn@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM miscweb2003.codfw.wmnet - dzahn@cumin2002"
20:28 samtar@deploy2002: samtar and sharvaniharan: Backport for Remove sampling from breadCrumbs schema synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
18:40 xcollazo@deploy2002: Started deploy [airflow-dags/platform_eng@5c2c701]: (no justification provided)
18:38 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ms-be2067.codfw.wmnet
18:37 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host thanos-fe1004.eqiad.wmnet with OS bullseye
18:03 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs4009.ulsfo.wmnet
18:03 sukhe@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs4009.ulsfo.wmnet
17:41 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:25:00 on lvs4009.ulsfo.wmnet with reason: rebooting for kernel updates
17:41 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:25:00 on lvs4009.ulsfo.wmnet with reason: rebooting for kernel updates
17:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host thanos-fe1004.eqiad.wmnet with OS bullseye
17:40 ayounsi@cumin2002: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling update on A:netbox-canary
17:40 ayounsi@cumin2002: START - Cookbook sre.netbox.update-extras rolling update on A:netbox-canary
17:36 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host thanos-fe1004.eqiad.wmnet with OS bullseye
17:30 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host thanos-fe1004.eqiad.wmnet with OS bullseye
17:21 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host ms-fe1013.eqiad.wmnet with OS bullseye
17:05 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on lvs4008.ulsfo.wmnet with reason: rebooting for kernel updates
17:05 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 0:15:00 on lvs4008.ulsfo.wmnet with reason: rebooting for kernel updates
16:59 xcollazo@deploy2002: Finished deploy [airflow-dags/platform_eng@e17ee96]: First deploy after Airflow 2.5.1 upgrade. (duration: 00m 24s)
16:58 xcollazo@deploy2002: Started deploy [airflow-dags/platform_eng@e17ee96]: First deploy after Airflow 2.5.1 upgrade.
16:56 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs4010.ulsfo.wmnet
16:56 sukhe@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs4010.ulsfo.wmnet
16:47 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lvs4010.ulsfo.wmnet with reason: rebooting for kernel updates
16:46 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on lvs4010.ulsfo.wmnet with reason: rebooting for kernel updates
16:31 Emperor: reboot ms-be2067 again to see if the missing drive comes back
16:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2067.codfw.wmnet
15:39 claime: Pooled new mw hosts mw24[20-51].codfw.wmnet - T326363
15:28 sukhe: enable puppet on R:class = dnsrecursor to merge CR: 898957 [done]
13:28 kharlan@deploy2002: kharlan: Backport for GrowthExperiments: Remove unused GENewImpactD3Enabled flag synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
10:52 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw
10:50 vgutierrez@cumin1001: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw
10:42 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
10:42 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
10:40 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
10:39 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
10:38 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin
10:37 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin
10:33 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
10:33 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 32 hosts with reason: new_install
10:32 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
10:32 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 32 hosts with reason: new_install
10:32 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw
10:31 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw
10:31 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
10:31 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
10:31 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
10:31 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
10:30 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
10:29 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
10:28 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
10:26 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1179 to move it to x1', diff saved to https://phabricator.wikimedia.org/P45885 and previous config saved to /var/cache/conftool/dbconfig/20230316-100945-root.json
08:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1105.eqiad.wmnet
08:51 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:51 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1105.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
08:49 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db1105.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
06:23 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1105 from dbctl T331874', diff saved to https://phabricator.wikimedia.org/P45883 and previous config saved to /var/cache/conftool/dbconfig/20230316-062307-root.json
06:03 marostegui: Failover m5 from db1106 to db1176 - T332155
05:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: m5 master switch T332155
05:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: m5 master switch T332155
21:08 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab2002.codfw.wmnet,phab1004.eqiad.wmnet with reason: maintenance
21:08 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on phab2002.codfw.wmnet,phab1004.eqiad.wmnet with reason: maintenance
20:56 brennen@deploy2002: Finished deploy [phabricator/deployment@9e9b406]: test deploy of current state to phab2002 (T331915) (duration: 00m 31s)
20:55 brennen@deploy2002: Started deploy [phabricator/deployment@9e9b406]: test deploy of current state to phab2002 (T331915)
20:54 brennen: starting phabricator window a touch early with a test deploy to phab2002
20:33 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host doh3002.wikimedia.org with OS bullseye
20:27 samtar@deploy2002: samtar and tsepothoabala: Backport for Deploy action blocks on itwiki (T330533) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
19:22 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh6002.wikimedia.org with reason: host reimage
19:17 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host doh1001.wikimedia.org with OS bullseye
19:16 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host doh2001.wikimedia.org with OS bullseye
19:15 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host doh5002.wikimedia.org with OS bullseye
19:14 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host doh3001.wikimedia.org with OS bullseye
19:05 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host doh6002.wikimedia.org with OS bullseye
19:03 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host doh6001.wikimedia.org with OS bullseye
18:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh5002.wikimedia.org with reason: host reimage
18:49 mutante: adding new language prefix anp.wikipedia.org - Angika, an Eastern Indo-Aryan language spoken in some parts of the Indian states of Bihar and Jharkhand, as well as in parts of Nepal. (T332115)
18:49 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh5002.wikimedia.org with reason: host reimage
18:46 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh6001.wikimedia.org with reason: host reimage
18:42 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh6001.wikimedia.org with reason: host reimage
18:25 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host doh6001.wikimedia.org with OS bullseye
15:52 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
15:49 herron@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging1002.eqiad.wmnet with reason: host reimage
15:44 sukhe@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host doh4002.wikimedia.org with OS bullseye
15:34 herron@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging1002.eqiad.wmnet with OS bullseye
15:33 mvernon@cumin1001: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:eqiad and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
15:30 mvernon@cumin1001: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:eqiad and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
09:05 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1106 from dbctl T331875', diff saved to https://phabricator.wikimedia.org/P45872 and previous config saved to /var/cache/conftool/dbconfig/20230315-090515-root.json
08:40 hashar@deploy2002: Finished deploy [integration/docroot@5abe9c6]: Link Groovy doc of PipelineLib - T222199 (duration: 00m 19s)
08:40 hashar@deploy2002: Started deploy [integration/docroot@5abe9c6]: Link Groovy doc of PipelineLib - T222199
08:15 vgutierrez@cumin1001: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on A:cp-upload_ulsfo
08:15 vgutierrez@cumin1001: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo
07:40 tgr_: UTC morning deploys done
07:39 mvernon@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ms-be2067.codfw.wmnet
2023-03-14
20:04 jhathaway@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
20:03 jhathaway@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1002.eqiad.wmnet with OS bookworm
19:47 topranks: Reboot cloudsw1-b1-codfw to upgrade JunOS version T327919
19:44 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cloudsw1-b1-codfw,cloudsw1-b1-codfw IPv6,cloudsw1-b1-codfw.mgmt with reason: cloudsw1-b1-codfw OS upgrade
19:44 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on cloudsw1-b1-codfw,cloudsw1-b1-codfw IPv6,cloudsw1-b1-codfw.mgmt with reason: cloudsw1-b1-codfw OS upgrade
19:32 jhathaway@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
19:30 brennen: 1.40.0-wmf.27 train (T330205): uneventful at group0. i'm afk for about an hour.
16:16 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pki2001.codfw.wmnet
16:16 jbond@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
16:16 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jbond@cumin1001"
16:13 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: pki2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jbond@cumin1001"
16:04 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 12:00:00 on cephosd[1001-1005].eqiad.wmnet with reason: Bootstrapping ceph
16:04 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 12:00:00 on cephosd[1001-1005].eqiad.wmnet with reason: Bootstrapping ceph
16:00 jbond@cumin1001: START - Cookbook sre.hosts.decommission for hosts pki2001.codfw.wmnet
15:59 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2003.codfw.wmnet with OS bullseye
15:36 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
15:35 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: installation failed due to read-only database
15:35 aokoth@cumin1001: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: installation failed due to read-only database
15:32 herron@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2003.codfw.wmnet with reason: host reimage
15:30 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on pki2001.codfw.wmnet with reason: decommission
15:30 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on pki2001.codfw.wmnet with reason: decommission
15:19 herron@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging2003.codfw.wmnet with OS bullseye
14:37 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pki1001.eqiad.wmnet with OS bullseye
14:19 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki1001.eqiad.wmnet with reason: host reimage
14:16 claime: All active/active services in eqiad repooled, DNS issues resolved - T331541
14:16 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on pki1001.eqiad.wmnet with reason: host reimage
14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Decrease db2122 weight', diff saved to https://phabricator.wikimedia.org/P45866 and previous config saved to /var/cache/conftool/dbconfig/20230314-140926-root.json
14:01 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host pki1001.eqiad.wmnet with OS bullseye
14:00 jbond: reimage pki1001
13:58 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
13:58 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1002.eqiad.wmnet with OS bookworm
13:33 bblack: rolling out recdns fixup for missing 10/8 ECS affecting local inter-dc discovery/geoip results (again, with sukhe's more-correct variant!)
13:27 TheresNoTime: close UTC afternoon backport window
13:20 samtar@deploy2002: samtar and urbanecm: Backport for arwiki: Add new throttle rule (T331973) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
13:11 samtar@deploy2002: esanders and samtar: Backport for Enable VE on more namespaces on foundationwiki (T331079) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
13:05 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
13:04 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudlb2003-dev.codfw.wmnet with reason: host reimage
13:02 aborrero@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudlb2002-dev.codfw.wmnet with reason: host reimage
12:58 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudlb2003-dev.codfw.wmnet with reason: host reimage
12:58 aborrero@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudlb2002-dev.codfw.wmnet with reason: host reimage
12:44 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudlb2003-dev.codfw.wmnet with OS bullseye
12:43 aborrero@cumin2002: START - Cookbook sre.hosts.reimage for host cloudlb2002-dev.codfw.wmnet with OS bullseye
12:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
12:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
12:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T329260)', diff saved to https://phabricator.wikimedia.org/P45864 and previous config saved to /var/cache/conftool/dbconfig/20230314-123515-marostegui.json
12:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P45863 and previous config saved to /var/cache/conftool/dbconfig/20230314-122009-marostegui.json
12:20 TheresNoTime: `Command '['helmfile', '-e', 'eqiad', '--selector', 'name=canary', 'apply']' returned non-zero exit status 1.` (P45862) during scap deployment of T297396 + T331680 — scap rolled back
12:18 jbond@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host pki-root1001.eqiad.wmnet with OS bullseye
12:13 cgoubert@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) pool appservers-ro in eqiad: T331541
12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P45861 and previous config saved to /var/cache/conftool/dbconfig/20230314-120503-marostegui.json
11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T329260)', diff saved to https://phabricator.wikimedia.org/P45860 and previous config saved to /var/cache/conftool/dbconfig/20230314-114957-marostegui.json
11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T329260)', diff saved to https://phabricator.wikimedia.org/P45857 and previous config saved to /var/cache/conftool/dbconfig/20230314-112354-marostegui.json
11:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
11:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T329260)', diff saved to https://phabricator.wikimedia.org/P45856 and previous config saved to /var/cache/conftool/dbconfig/20230314-112333-marostegui.json
11:19 jbond@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) api-ro.discovery.wmnet on all recursors
11:19 jbond@cumin1001: START - Cookbook sre.dns.wipe-cache api-ro.discovery.wmnet on all recursors
11:13 claime: We are encountering unexpected DNS anycast issues following T331541, latencies are increased but no production outage.
11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P45855 and previous config saved to /var/cache/conftool/dbconfig/20230314-110826-marostegui.json
11:03 akosiaris@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mathoid.discovery.wmnet on all recursors
11:03 akosiaris@cumin1001: START - Cookbook sre.dns.wipe-cache mathoid.discovery.wmnet on all recursors
11:02 jbond@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) api-ro.discovery.wmnet on all recursors
11:02 jbond@cumin1001: START - Cookbook sre.dns.wipe-cache api-ro.discovery.wmnet on all recursors
11:02 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pki-root1001.eqiad.wmnet with reason: host reimage
10:58 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on pki-root1001.eqiad.wmnet with reason: host reimage
10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P45854 and previous config saved to /var/cache/conftool/dbconfig/20230314-105319-marostegui.json
10:48 cgoubert@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool restbase-async in codfw: T331541
10:47 cgoubert@cumin1001: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in eqiad: Datacenter Switchover - eqiad RO repool - T331541
10:43 jbond@cumin1001: START - Cookbook sre.hosts.reimage for host pki-root1001.eqiad.wmnet with OS bullseye
10:42 jbond: reimage pki-root1001
10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T329260)', diff saved to https://phabricator.wikimedia.org/P45853 and previous config saved to /var/cache/conftool/dbconfig/20230314-103813-marostegui.json
10:33 cgoubert@cumin1001: START - Cookbook sre.discovery.datacenter pool all active/active services in eqiad: Datacenter Switchover - eqiad RO repool - T331541
10:32 claime: Repooling all active/active services in eqiad - T331541
10:32 cgoubert@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-optional-warmup-caches (exit_code=0)
10:29 jbond@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) pki.discovery.wmnet on all recursors
10:28 jbond@cumin1001: START - Cookbook sre.dns.wipe-cache pki.discovery.wmnet on all recursors
10:21 jbond: move pki.discovery.wmnet to pki2002 (bullseye)
10:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2156 (T329260)', diff saved to https://phabricator.wikimedia.org/P45852 and previous config saved to /var/cache/conftool/dbconfig/20230314-101918-marostegui.json
10:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
10:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
10:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
10:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
10:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T329260)', diff saved to https://phabricator.wikimedia.org/P45851 and previous config saved to /var/cache/conftool/dbconfig/20230314-101840-marostegui.json
10:15 jayme: enabling puppet on P:calico::kubernetes for T325268
10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P45850 and previous config saved to /var/cache/conftool/dbconfig/20230314-100334-marostegui.json
10:02 claime: Locking scap deployment for service switchover - T331541
10:00 claime: Locking scap deployment for service switchover - T330651
09:56 jayme: disabling puppet on P:calico::kubernetes for T325268
09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P45849 and previous config saved to /var/cache/conftool/dbconfig/20230314-094828-marostegui.json
09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T329260)', diff saved to https://phabricator.wikimedia.org/P45848 and previous config saved to /var/cache/conftool/dbconfig/20230314-093321-marostegui.json
09:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T329260)', diff saved to https://phabricator.wikimedia.org/P45847 and previous config saved to /var/cache/conftool/dbconfig/20230314-090649-marostegui.json
09:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
09:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
08:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
08:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
08:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T329260)', diff saved to https://phabricator.wikimedia.org/P45846 and previous config saved to /var/cache/conftool/dbconfig/20230314-084249-marostegui.json
08:38 vgutierrez: test HAProxy 2.6.10 in cp4044 and cp4045
08:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P45845 and previous config saved to /var/cache/conftool/dbconfig/20230314-082743-marostegui.json
08:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P45843 and previous config saved to /var/cache/conftool/dbconfig/20230314-081236-marostegui.json
07:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T329260)', diff saved to https://phabricator.wikimedia.org/P45842 and previous config saved to /var/cache/conftool/dbconfig/20230314-075730-marostegui.json
07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2127 (T329260)', diff saved to https://phabricator.wikimedia.org/P45841 and previous config saved to /var/cache/conftool/dbconfig/20230314-073210-marostegui.json
07:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T329260)', diff saved to https://phabricator.wikimedia.org/P45840 and previous config saved to /var/cache/conftool/dbconfig/20230314-073149-marostegui.json
07:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P45839 and previous config saved to /var/cache/conftool/dbconfig/20230314-071643-marostegui.json
07:13 marostegui: Migrate db2135 to mariadb m5 codfw dbmaint 10.6
07:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P45838 and previous config saved to /var/cache/conftool/dbconfig/20230314-070137-marostegui.json
06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T329260)', diff saved to https://phabricator.wikimedia.org/P45837 and previous config saved to /var/cache/conftool/dbconfig/20230314-064630-marostegui.json
06:42 denisse@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts centrallog1001
06:42 denisse@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
06:42 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: centrallog1001 decommissioned, removing all IPs except the asset tag one - denisse@cumin1001"
06:41 hashar: gerrit: changed `operations/puppet` merge strategy to allow "content merges" (see `ops` list for the rationale)
06:36 denisse@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: centrallog1001 decommissioned, removing all IPs except the asset tag one - denisse@cumin1001"
03:02 mwpresync@deploy2002: Started scap: testwikis wikis to 1.40.0-wmf.27 refs T330205
02:22 legoktm: removed user's 2FA on wikitech for T331955
02:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T329260)', diff saved to https://phabricator.wikimedia.org/P45835 and previous config saved to /var/cache/conftool/dbconfig/20230314-022023-marostegui.json
02:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P45834 and previous config saved to /var/cache/conftool/dbconfig/20230314-020517-marostegui.json
01:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P45833 and previous config saved to /var/cache/conftool/dbconfig/20230314-015011-marostegui.json
01:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T329260)', diff saved to https://phabricator.wikimedia.org/P45832 and previous config saved to /var/cache/conftool/dbconfig/20230314-013504-marostegui.json
01:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2176 (T329260)', diff saved to https://phabricator.wikimedia.org/P45831 and previous config saved to /var/cache/conftool/dbconfig/20230314-012442-marostegui.json
01:24 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
01:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
01:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T329260)', diff saved to https://phabricator.wikimedia.org/P45830 and previous config saved to /var/cache/conftool/dbconfig/20230314-012421-marostegui.json
01:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P45829 and previous config saved to /var/cache/conftool/dbconfig/20230314-010915-marostegui.json
00:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P45828 and previous config saved to /var/cache/conftool/dbconfig/20230314-005409-marostegui.json
00:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T329260)', diff saved to https://phabricator.wikimedia.org/P45827 and previous config saved to /var/cache/conftool/dbconfig/20230314-003903-marostegui.json
00:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2174 (T329260)', diff saved to https://phabricator.wikimedia.org/P45826 and previous config saved to /var/cache/conftool/dbconfig/20230314-002840-marostegui.json
00:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
00:28 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
00:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T329260)', diff saved to https://phabricator.wikimedia.org/P45825 and previous config saved to /var/cache/conftool/dbconfig/20230314-002819-marostegui.json
00:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P45824 and previous config saved to /var/cache/conftool/dbconfig/20230314-001313-marostegui.json
2023-03-13
23:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P45823 and previous config saved to /var/cache/conftool/dbconfig/20230313-235807-marostegui.json
23:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T329260)', diff saved to https://phabricator.wikimedia.org/P45822 and previous config saved to /var/cache/conftool/dbconfig/20230313-234301-marostegui.json
23:39 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1003.eqiad.wmnet
23:33 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1003.eqiad.wmnet
23:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2173 (T329260)', diff saved to https://phabricator.wikimedia.org/P45821 and previous config saved to /var/cache/conftool/dbconfig/20230313-233127-marostegui.json
23:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
23:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
23:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
23:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
23:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T329260)', diff saved to https://phabricator.wikimedia.org/P45820 and previous config saved to /var/cache/conftool/dbconfig/20230313-233050-marostegui.json
23:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P45819 and previous config saved to /var/cache/conftool/dbconfig/20230313-231544-marostegui.json
23:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P45818 and previous config saved to /var/cache/conftool/dbconfig/20230313-230038-marostegui.json
22:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T329260)', diff saved to https://phabricator.wikimedia.org/P45817 and previous config saved to /var/cache/conftool/dbconfig/20230313-224532-marostegui.json
22:40 zabe@deploy2002: Started scap: gerrit:898037
22:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 (T329260)', diff saved to https://phabricator.wikimedia.org/P45816 and previous config saved to /var/cache/conftool/dbconfig/20230313-223331-marostegui.json
22:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
22:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
22:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T329260)', diff saved to https://phabricator.wikimedia.org/P45815 and previous config saved to /var/cache/conftool/dbconfig/20230313-223309-marostegui.json
22:30 sbassett@deploy2002: Synchronized wmf-config/InitialiseSettings.php: Set ext:StopForumSpam to enforce on es.wikiversity (duration: 06m 59s)
22:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P45814 and previous config saved to /var/cache/conftool/dbconfig/20230313-221803-marostegui.json
22:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P45813 and previous config saved to /var/cache/conftool/dbconfig/20230313-220257-marostegui.json
21:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T329260)', diff saved to https://phabricator.wikimedia.org/P45812 and previous config saved to /var/cache/conftool/dbconfig/20230313-214751-marostegui.json
21:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 (T329260)', diff saved to https://phabricator.wikimedia.org/P45811 and previous config saved to /var/cache/conftool/dbconfig/20230313-213544-marostegui.json
21:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
21:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
21:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T329260)', diff saved to https://phabricator.wikimedia.org/P45810 and previous config saved to /var/cache/conftool/dbconfig/20230313-213523-marostegui.json
21:23 herron@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-logging2001.codfw.wmnet with OS bullseye
21:21 wfan: remove -d for jobs-dlocal queue runner
21:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P45809 and previous config saved to /var/cache/conftool/dbconfig/20230313-212017-marostegui.json
21:06 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1001.eqiad.wmnet
21:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P45808 and previous config saved to /var/cache/conftool/dbconfig/20230313-210510-marostegui.json
21:04 herron@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
21:01 herron@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-logging2001.codfw.wmnet with reason: host reimage
21:01 ejegg: enabled jobs-dlocal queue runner
21:00 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
20:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T329260)', diff saved to https://phabricator.wikimedia.org/P45807 and previous config saved to /var/cache/conftool/dbconfig/20230313-205004-marostegui.json
20:47 herron@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging2001.codfw.wmnet with OS bullseye
20:43 ebernhardson@deploy2002: Finished deploy [airflow-dags/search@8685c9e]: drop_dated_directories.py must run through skein (duration: 00m 14s)
20:43 ebernhardson@deploy2002: Started deploy [airflow-dags/search@8685c9e]: drop_dated_directories.py must run through skein
20:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2153 (T329260)', diff saved to https://phabricator.wikimedia.org/P45806 and previous config saved to /var/cache/conftool/dbconfig/20230313-203824-marostegui.json
20:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
20:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
20:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T329260)', diff saved to https://phabricator.wikimedia.org/P45805 and previous config saved to /var/cache/conftool/dbconfig/20230313-203802-marostegui.json
20:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P45804 and previous config saved to /var/cache/conftool/dbconfig/20230313-202256-marostegui.json
20:16 kindrobot@deploy2002: kindrobot and ksarabia: Backport for Add header at top of main page (T325362) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
20:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P45803 and previous config saved to /var/cache/conftool/dbconfig/20230313-200750-marostegui.json
20:02 eevans@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sessionstore1001.eqiad.wmnet
20:02 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1001.eqiad.wmnet
19:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T329260)', diff saved to https://phabricator.wikimedia.org/P45802 and previous config saved to /var/cache/conftool/dbconfig/20230313-195244-marostegui.json
19:52 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
19:50 eevans@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sessionstore1003.eqiad.wmnet
19:50 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1003.eqiad.wmnet
19:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T329260)', diff saved to https://phabricator.wikimedia.org/P45801 and previous config saved to /var/cache/conftool/dbconfig/20230313-194148-marostegui.json
19:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
19:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
19:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T329260)', diff saved to https://phabricator.wikimedia.org/P45800 and previous config saved to /var/cache/conftool/dbconfig/20230313-194116-marostegui.json
19:39 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1003.eqiad.wmnet
19:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P45799 and previous config saved to /var/cache/conftool/dbconfig/20230313-192610-marostegui.json
19:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P45798 and previous config saved to /var/cache/conftool/dbconfig/20230313-191104-marostegui.json
19:07 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1001.eqiad.wmnet
19:00 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
18:59 eevans@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sessionstore1002.eqiad.wmnet
18:59 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1002.eqiad.wmnet
18:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T329260)', diff saved to https://phabricator.wikimedia.org/P45797 and previous config saved to /var/cache/conftool/dbconfig/20230313-185558-marostegui.json
18:49 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1002.eqiad.wmnet
18:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
18:36 ebernhardson@deploy2002: Started deploy [airflow-dags/search@a8d066e]: Parameterize streaming updater reconcile start date
18:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
18:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T329260)', diff saved to https://phabricator.wikimedia.org/P45795 and previous config saved to /var/cache/conftool/dbconfig/20230313-183628-marostegui.json
18:33 eevans@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sessionstore1002.eqiad.wmnet
18:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P45794 and previous config saved to /var/cache/conftool/dbconfig/20230313-182121-marostegui.json
18:17 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1002.eqiad.wmnet
18:11 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1002.eqiad.wmnet
18:07 eevans@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sessionstore1001.eqiad.wmnet
18:07 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1001.eqiad.wmnet
18:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P45793 and previous config saved to /var/cache/conftool/dbconfig/20230313-180615-marostegui.json
17:56 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
17:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T329260)', diff saved to https://phabricator.wikimedia.org/P45792 and previous config saved to /var/cache/conftool/dbconfig/20230313-175109-marostegui.json
17:50 dancy@deploy2002: Finished scap: test cleanup (duration: 06m 40s)
17:44 dancy@deploy2002: Started scap: test cleanup
17:43 eevans@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sessionstore1001.eqiad.wmnet
17:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2130 (T329260)', diff saved to https://phabricator.wikimedia.org/P45791 and previous config saved to /var/cache/conftool/dbconfig/20230313-174030-marostegui.json
17:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
17:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
17:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T329260)', diff saved to https://phabricator.wikimedia.org/P45790 and previous config saved to /var/cache/conftool/dbconfig/20230313-174009-marostegui.json
17:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P45789 and previous config saved to /var/cache/conftool/dbconfig/20230313-172503-marostegui.json
17:16 mvernon@cumin1001: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
17:15 dancy@deploy2002: Started scap: testing T329857
17:13 eevans@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts sessionstore1001.eqiad.wmnet
17:11 mvernon@cumin1001: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:codfw and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
17:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P45788 and previous config saved to /var/cache/conftool/dbconfig/20230313-170955-marostegui.json
17:08 dancy@deploy2002: Installation of scap version "4.46.0" completed for 553 hosts
17:07 dancy@deploy2002: Installing scap version "4.46.0" for 553 hosts
17:04 bd808: Ran cache.purge_openstack_users() for Striker following deploy of e1f7491 (T331674)
17:04 dancy@deploy2002: Installing scap version "4.46.0" for 553 hosts
16:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T329260)', diff saved to https://phabricator.wikimedia.org/P45787 and previous config saved to /var/cache/conftool/dbconfig/20230313-165449-marostegui.json
16:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2116 (T329260)', diff saved to https://phabricator.wikimedia.org/P45785 and previous config saved to /var/cache/conftool/dbconfig/20230313-164410-marostegui.json
16:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
16:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
16:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T329260)', diff saved to https://phabricator.wikimedia.org/P45784 and previous config saved to /var/cache/conftool/dbconfig/20230313-164349-marostegui.json
16:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P45783 and previous config saved to /var/cache/conftool/dbconfig/20230313-162843-marostegui.json
16:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P45782 and previous config saved to /var/cache/conftool/dbconfig/20230313-161337-marostegui.json
15:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T329260)', diff saved to https://phabricator.wikimedia.org/P45781 and previous config saved to /var/cache/conftool/dbconfig/20230313-155830-marostegui.json
15:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2103 (T329260)', diff saved to https://phabricator.wikimedia.org/P45780 and previous config saved to /var/cache/conftool/dbconfig/20230313-154641-marostegui.json
15:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
15:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
15:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
15:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
15:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T329260)', diff saved to https://phabricator.wikimedia.org/P45779 and previous config saved to /var/cache/conftool/dbconfig/20230313-150523-marostegui.json
14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P45778 and previous config saved to /var/cache/conftool/dbconfig/20230313-145016-marostegui.json
14:38 jbond: disable puppet fleet wide to debug strange issue
14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P45777 and previous config saved to /var/cache/conftool/dbconfig/20230313-143510-marostegui.json
14:23 claime: switch noc.wikimedia.org from eqiad to codfw - T331634
14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T329260)', diff saved to https://phabricator.wikimedia.org/P45776 and previous config saved to /var/cache/conftool/dbconfig/20230313-142004-marostegui.json
14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2182 (T329260)', diff saved to https://phabricator.wikimedia.org/P45774 and previous config saved to /var/cache/conftool/dbconfig/20230313-141409-marostegui.json
14:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
14:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2182.codfw.wmnet with reason: Maintenance
14:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T329260)', diff saved to https://phabricator.wikimedia.org/P45773 and previous config saved to /var/cache/conftool/dbconfig/20230313-141348-marostegui.json
13:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P45772 and previous config saved to /var/cache/conftool/dbconfig/20230313-135842-marostegui.json
13:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P45770 and previous config saved to /var/cache/conftool/dbconfig/20230313-134336-marostegui.json
13:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T329260)', diff saved to https://phabricator.wikimedia.org/P45769 and previous config saved to /var/cache/conftool/dbconfig/20230313-132829-marostegui.json
13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3317 (T329260)', diff saved to https://phabricator.wikimedia.org/P45768 and previous config saved to /var/cache/conftool/dbconfig/20230313-132123-marostegui.json
13:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
13:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T329260)', diff saved to https://phabricator.wikimedia.org/P45767 and previous config saved to /var/cache/conftool/dbconfig/20230313-132101-marostegui.json
13:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P45766 and previous config saved to /var/cache/conftool/dbconfig/20230313-130555-marostegui.json
13:05 taavi@deploy2002: stang and taavi: Backport for zhwiki: Add movefile to extendedconfirmed (T331691) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
12:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P45764 and previous config saved to /var/cache/conftool/dbconfig/20230313-125049-marostegui.json
12:48 hnowlan: restarting codfw thumbor instances to attempt to remedy 502 issues
12:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317 (T329260)', diff saved to https://phabricator.wikimedia.org/P45763 and previous config saved to /var/cache/conftool/dbconfig/20230313-123543-marostegui.json
12:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3317 (T329260)', diff saved to https://phabricator.wikimedia.org/P45762 and previous config saved to /var/cache/conftool/dbconfig/20230313-122928-marostegui.json
12:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
12:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
12:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T329260)', diff saved to https://phabricator.wikimedia.org/P45761 and previous config saved to /var/cache/conftool/dbconfig/20230313-122906-marostegui.json
12:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P45760 and previous config saved to /var/cache/conftool/dbconfig/20230313-121400-marostegui.json
11:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P45759 and previous config saved to /var/cache/conftool/dbconfig/20230313-115854-marostegui.json
11:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T329260)', diff saved to https://phabricator.wikimedia.org/P45758 and previous config saved to /var/cache/conftool/dbconfig/20230313-114348-marostegui.json
10:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2159 (T329260)', diff saved to https://phabricator.wikimedia.org/P45757 and previous config saved to /var/cache/conftool/dbconfig/20230313-104322-marostegui.json
10:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
10:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
10:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
10:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2159.codfw.wmnet with reason: Maintenance
10:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T329260)', diff saved to https://phabricator.wikimedia.org/P45756 and previous config saved to /var/cache/conftool/dbconfig/20230313-104246-marostegui.json
10:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P45755 and previous config saved to /var/cache/conftool/dbconfig/20230313-102740-marostegui.json
10:12 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 55701
10:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P45754 and previous config saved to /var/cache/conftool/dbconfig/20230313-101234-marostegui.json
10:10 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 55701
10:10 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 38193
10:10 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 38193
10:10 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 46632
10:10 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 46632
10:10 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 6663
10:09 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 6663
10:08 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45558
10:08 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 45558
10:08 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 38082
10:07 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 38082
10:07 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 668
10:06 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 668
09:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T329260)', diff saved to https://phabricator.wikimedia.org/P45753 and previous config saved to /var/cache/conftool/dbconfig/20230313-095728-marostegui.json
09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2150 (T329260)', diff saved to https://phabricator.wikimedia.org/P45752 and previous config saved to /var/cache/conftool/dbconfig/20230313-095119-marostegui.json
09:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
09:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2150.codfw.wmnet with reason: Maintenance
09:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T329260)', diff saved to https://phabricator.wikimedia.org/P45751 and previous config saved to /var/cache/conftool/dbconfig/20230313-095058-marostegui.json
09:48 jayme: pcc-worker1003:~# rm -r /srv/jenkins/puppet-compiler/40076 - / back to 70%
09:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P45750 and previous config saved to /var/cache/conftool/dbconfig/20230313-093552-marostegui.json
09:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122', diff saved to https://phabricator.wikimedia.org/P45749 and previous config saved to /var/cache/conftool/dbconfig/20230313-092045-marostegui.json
09:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2122 (T329260)', diff saved to https://phabricator.wikimedia.org/P45748 and previous config saved to /var/cache/conftool/dbconfig/20230313-090539-marostegui.json
08:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2122 (T329260)', diff saved to https://phabricator.wikimedia.org/P45747 and previous config saved to /var/cache/conftool/dbconfig/20230313-085937-marostegui.json
08:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2122.codfw.wmnet with reason: Maintenance
08:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2122.codfw.wmnet with reason: Maintenance
08:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T329260)', diff saved to https://phabricator.wikimedia.org/P45746 and previous config saved to /var/cache/conftool/dbconfig/20230313-085916-marostegui.json
08:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P45745 and previous config saved to /var/cache/conftool/dbconfig/20230313-084409-marostegui.json
08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121', diff saved to https://phabricator.wikimedia.org/P45744 and previous config saved to /var/cache/conftool/dbconfig/20230313-082903-marostegui.json
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2121 (T329260)', diff saved to https://phabricator.wikimedia.org/P45743 and previous config saved to /var/cache/conftool/dbconfig/20230313-081357-marostegui.json
08:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2121 (T329260)', diff saved to https://phabricator.wikimedia.org/P45742 and previous config saved to /var/cache/conftool/dbconfig/20230313-080759-marostegui.json
08:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
08:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2121.codfw.wmnet with reason: Maintenance
08:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T329260)', diff saved to https://phabricator.wikimedia.org/P45741 and previous config saved to /var/cache/conftool/dbconfig/20230313-080738-marostegui.json
07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P45740 and previous config saved to /var/cache/conftool/dbconfig/20230313-075232-marostegui.json
07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120', diff saved to https://phabricator.wikimedia.org/P45739 and previous config saved to /var/cache/conftool/dbconfig/20230313-073725-marostegui.json
07:37 marostegui: Remove pagetriage_log from enwiki T328309
07:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2120 (T329260)', diff saved to https://phabricator.wikimedia.org/P45738 and previous config saved to /var/cache/conftool/dbconfig/20230313-072219-marostegui.json
07:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2120 (T329260)', diff saved to https://phabricator.wikimedia.org/P45737 and previous config saved to /var/cache/conftool/dbconfig/20230313-071522-marostegui.json
07:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2120.codfw.wmnet with reason: Maintenance
07:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2120.codfw.wmnet with reason: Maintenance
07:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T329260)', diff saved to https://phabricator.wikimedia.org/P45736 and previous config saved to /var/cache/conftool/dbconfig/20230313-071501-marostegui.json
06:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P45735 and previous config saved to /var/cache/conftool/dbconfig/20230313-065954-marostegui.json
06:52 marostegui_: Remove pagetriage_log from testwiki and test2wiki T328309
06:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108', diff saved to https://phabricator.wikimedia.org/P45734 and previous config saved to /var/cache/conftool/dbconfig/20230313-064448-marostegui.json
06:35 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9873
06:35 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 9873
06:34 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9507
06:34 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 9507
06:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 15830
06:33 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 15830
06:31 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9902
06:31 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 9902
06:29 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 8966
06:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2108 (T329260)', diff saved to https://phabricator.wikimedia.org/P45733 and previous config saved to /var/cache/conftool/dbconfig/20230313-062942-marostegui.json
06:29 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 8966
06:27 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 34549
06:27 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 34549
06:25 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 29357
06:25 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 29357
06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2108 (T329260)', diff saved to https://phabricator.wikimedia.org/P45732 and previous config saved to /var/cache/conftool/dbconfig/20230313-062244-marostegui.json
06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2108.codfw.wmnet with reason: Maintenance
06:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2108.codfw.wmnet with reason: Maintenance
06:21 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 138886
06:19 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'configure' for AS: 138886
06:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
06:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
06:16 marostegui_: Deploy schema change on s3 codfw dbmaint T329684
06:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
06:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
04:37 kart_: Updated cxserver to 2023-03-09-061555-production (T331097, T327102, T326541)
19:39 milimetric@deploy2002: Finished deploy [analytics/refinery@898a942] (thin): Special deploy for pageview job migration [analytics/refinery@898a942] (duration: 00m 09s)
19:38 milimetric@deploy2002: Started deploy [analytics/refinery@898a942] (thin): Special deploy for pageview job migration [analytics/refinery@898a942]
19:38 milimetric@deploy2002: Finished deploy [analytics/refinery@898a942]: Special deploy for pageview job migration [analytics/refinery@898a942] (duration: 08m 08s)
19:33 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host ms-fe1014.mgmt.eqiad.wmnet with reboot policy FORCED
19:30 milimetric@deploy2002: Started deploy [analytics/refinery@898a942]: Special deploy for pageview job migration [analytics/refinery@898a942]
19:27 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with reboot policy FORCED
19:24 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with reboot policy FORCED
19:23 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:23 cmjohnson@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new ms-fe servers - cmjohnson@cumin1001"
19:17 cmjohnson@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: new ms-fe servers - cmjohnson@cumin1001"
15:09 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudlb2002-dev.mgmt.codfw.wmnet with reboot policy FORCED
15:08 cmooney@cumin1001: START - Cookbook sre.hosts.provision for host cloudlb2003-dev.mgmt.codfw.wmnet with reboot policy FORCED
14:52 cmooney@cumin1001: START - Cookbook sre.hosts.provision for host cloudlb2002-dev.mgmt.codfw.wmnet with reboot policy FORCED
14:50 mvernon@cumin1001: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on A:eqiad and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
14:47 mvernon@cumin1001: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:eqiad and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
14:38 cmooney@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Release v0.6.1 update - cmooney@cumin1001
14:36 cmooney@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Release v0.6.1 update - cmooney@cumin1001
14:22 cmooney@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Release v0.6.1 update - cmooney@cumin1001
14:20 cmooney@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Release v0.6.1 update - cmooney@cumin1001
14:09 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for pki2002.codfw.wmnet: Renew puppet certificate - jbond@cumin1001
09:58 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
09:58 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove netbox-generated DNS records which have been defined manually. - cmooney@cumin1001"
09:57 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove netbox-generated DNS records which have been defined manually. - cmooney@cumin1001"
22:40 bd808: Forced puppet run on cloudweb100[34] to apply quick fix for T331674
22:25 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
22:25 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS records for new links to cloudsw1-b1-codfw - cmooney@cumin1001"
22:24 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS records for new links to cloudsw1-b1-codfw - cmooney@cumin1001"
22:20 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns1003.wikimedia.org with OS bullseye
22:20 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - sukhe@cumin2002"
22:19 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns2003.wikimedia.org with reason: host reimage
21:20 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
21:20 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Adjust and remove reverse DNS records after cloudsw1-b1-codfw migration. - cmooney@cumin1001"
21:19 ryankemper@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster restart to enable incr shard recovery throughput - ryankemper@cumin1001 - T317816
21:18 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Adjust and remove reverse DNS records after cloudsw1-b1-codfw migration. - cmooney@cumin1001"
21:18 samtar@deploy2002: samtar and jforrester: Backport for Unload RenameUser, now part of core: Part II of II synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
21:10 sukhe@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns2003
21:10 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host dns1003.wikimedia.org with OS bullseye
21:09 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns2003
21:09 sukhe@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns1003
21:08 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns1003
21:07 sukhe@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dns1003.wikimedia.org with OS bullseye
21:03 samtar@deploy2002: samtar and jforrester: Backport for Unload RenameUser, now part of core: Part I of II synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
21:02 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dns2003.mgmt.codfw.wmnet on all recursors
21:02 sukhe@cumin2002: START - Cookbook sre.dns.wipe-cache dns2003.mgmt.codfw.wmnet on all recursors
19:51 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad cluster restart to enable incr shard recovery throughput - ryankemper@cumin1001 - T317816
19:46 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 12:00:00 on an-worker1078.eqiad.wmnet with reason: Replacing RAID BBU
19:46 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 12:00:00 on an-worker1078.eqiad.wmnet with reason: Replacing RAID BBU
19:15 sukhe@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dns1003
19:15 sukhe@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host dns1003
19:14 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:14 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add dns1003 (renamed from authdns1001) - sukhe@cumin2002"
19:12 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add dns1003 (renamed from authdns1001) - sukhe@cumin2002"
19:10 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: all wikis to 1.40.0-wmf.26 refs T330204
18:34 sukhe: [correction] homer "cr*-codfw*" commit "Remove authdns2001 from homer, T330670"
18:34 sukhe: homer "cr*-codfw*" commit "Remove authdns1001 from homer, T330670"
18:31 sukhe: homer "cr*-eqiad*" commit "Remove authdns1001 from homer, T330670"
18:26 cmooney@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
18:26 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts authdns[1001,2001].wikimedia.org
18:26 sukhe@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
18:25 sukhe@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: authdns[1001,2001].wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
18:24 sukhe@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: authdns[1001,2001].wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin2002"
18:22 sukhe: running puppet-agent on A:dns-auth to remove deprecated authdns[12]001
17:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T329260)', diff saved to https://phabricator.wikimedia.org/P45725 and previous config saved to /var/cache/conftool/dbconfig/20230309-174723-marostegui.json
17:36 sukhe: cr1-eqiad: set routing-options static route 208.80.154.238/32 next-hop 208.80.154.10
17:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P45724 and previous config saved to /var/cache/conftool/dbconfig/20230309-173217-marostegui.json
17:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P45723 and previous config saved to /var/cache/conftool/dbconfig/20230309-171711-marostegui.json
17:13 topranks: Add EBGP peering from cr1-codfw to cloudsw1-b1-codfw (prod links) T327919
17:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T329260)', diff saved to https://phabricator.wikimedia.org/P45722 and previous config saved to /var/cache/conftool/dbconfig/20230309-170205-marostegui.json
16:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T329260)', diff saved to https://phabricator.wikimedia.org/P45721 and previous config saved to /var/cache/conftool/dbconfig/20230309-165210-marostegui.json
16:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
16:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
16:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T329260)', diff saved to https://phabricator.wikimedia.org/P45720 and previous config saved to /var/cache/conftool/dbconfig/20230309-165149-marostegui.json
16:51 topranks: Add EBGP peering from cr1-codfw to cloudsw1-b1-codfw (cloud vrf) T327919
16:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P45719 and previous config saved to /var/cache/conftool/dbconfig/20230309-163643-marostegui.json
16:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
16:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
16:26 marostegui@cumin1001: dbctl commit (dc=all): 'db2163 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45718 and previous config saved to /var/cache/conftool/dbconfig/20230309-162608-root.json
16:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P45717 and previous config saved to /var/cache/conftool/dbconfig/20230309-162137-marostegui.json
16:18 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host acmechief1001.eqiad.wmnet with OS bullseye
16:11 marostegui@cumin1001: dbctl commit (dc=all): 'db2163 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45716 and previous config saved to /var/cache/conftool/dbconfig/20230309-161103-root.json
16:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T329260)', diff saved to https://phabricator.wikimedia.org/P45715 and previous config saved to /var/cache/conftool/dbconfig/20230309-160630-marostegui.json
16:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on acmechief1001.eqiad.wmnet with reason: host reimage
16:01 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on acmechief1001.eqiad.wmnet with reason: host reimage
16:00 marostegui: Failover m5 from db1183 to db1176 - T330847
15:55 marostegui@cumin1001: dbctl commit (dc=all): 'db2163 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P45714 and previous config saved to /var/cache/conftool/dbconfig/20230309-155558-root.json
15:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T329260)', diff saved to https://phabricator.wikimedia.org/P45713 and previous config saved to /var/cache/conftool/dbconfig/20230309-155520-marostegui.json
15:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
15:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
15:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T329260)', diff saved to https://phabricator.wikimedia.org/P45712 and previous config saved to /var/cache/conftool/dbconfig/20230309-155459-marostegui.json
15:40 marostegui@cumin1001: dbctl commit (dc=all): 'db2163 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P45711 and previous config saved to /var/cache/conftool/dbconfig/20230309-154053-root.json
15:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P45710 and previous config saved to /var/cache/conftool/dbconfig/20230309-153953-marostegui.json
15:29 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host acmechief1001.eqiad.wmnet with OS bullseye
15:27 brett: Enable puppet on R:acme_chief::cert - T321309
15:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P45709 and previous config saved to /var/cache/conftool/dbconfig/20230309-152447-marostegui.json
15:15 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:15 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS records for codfw cr links to cloudsw-b1-codfw. - cmooney@cumin1001"
15:14 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS records for codfw cr links to cloudsw-b1-codfw. - cmooney@cumin1001"
15:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T329260)', diff saved to https://phabricator.wikimedia.org/P45706 and previous config saved to /var/cache/conftool/dbconfig/20230309-150940-marostegui.json
15:06 brett: Disable puppet on R:acme_chief::cert for acmechief maintenance - T321309
15:04 TheresNoTime: close UTC afternoon backport window
15:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on db[2135,2160].codfw.wmnet,db[1117,1176,1183].eqiad.wmnet with reason: m5 master switch T330847
15:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:30:00 on db[2135,2160].codfw.wmnet,db[1117,1176,1183].eqiad.wmnet with reason: m5 master switch T330847
14:55 zabe@deploy2002: awight and zabe: Backport for Drop unused FlaggedRevs threshold level names (T277883) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
14:11 jgiannelos@deploy2002: Started deploy [restbase/deploy@f774711]: (no justification provided)
14:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T329260)', diff saved to https://phabricator.wikimedia.org/P45705 and previous config saved to /var/cache/conftool/dbconfig/20230309-140915-marostegui.json
14:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
14:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
14:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
14:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
14:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T329260)', diff saved to https://phabricator.wikimedia.org/P45704 and previous config saved to /var/cache/conftool/dbconfig/20230309-140850-marostegui.json
14:08 Emperor: testing disk-swap in ms-be1066 T329305
14:07 samtar@deploy2002: daniel and samtar: Backport for Bump parsoid parser cache writes to 50%. (T320534) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T329203)', diff saved to https://phabricator.wikimedia.org/P45703 and previous config saved to /var/cache/conftool/dbconfig/20230309-140510-marostegui.json
14:00 aqu@deploy2002: Finished deploy [airflow-dags/analytics@9fba86b]: Upgrade to 2.5.1 from origin/T326194_airflow_deb_creation_with_gitlab_ci [airflow-dags@9fba86b] (duration: 00m 13s)
14:00 aqu@deploy2002: Started deploy [airflow-dags/analytics@9fba86b]: Upgrade to 2.5.1 from origin/T326194_airflow_deb_creation_with_gitlab_ci [airflow-dags@9fba86b]
13:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P45702 and previous config saved to /var/cache/conftool/dbconfig/20230309-135343-marostegui.json
13:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P45701 and previous config saved to /var/cache/conftool/dbconfig/20230309-135004-marostegui.json
13:42 moritzm: restarting FPM/Apache on mw canaries to pick up curl updates
13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P45700 and previous config saved to /var/cache/conftool/dbconfig/20230309-133837-marostegui.json
13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P45699 and previous config saved to /var/cache/conftool/dbconfig/20230309-133458-marostegui.json
13:34 moritzm: installing curl security updates
13:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2135,2160].codfw.wmnet,db[1117,1176,1183].eqiad.wmnet with reason: Topology changes
13:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2135,2160].codfw.wmnet,db[1117,1176,1183].eqiad.wmnet with reason: Topology changes
13:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T329260)', diff saved to https://phabricator.wikimedia.org/P45698 and previous config saved to /var/cache/conftool/dbconfig/20230309-132331-marostegui.json
13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T329203)', diff saved to https://phabricator.wikimedia.org/P45697 and previous config saved to /var/cache/conftool/dbconfig/20230309-131951-marostegui.json
13:17 vgutierrez: rolling restart of pybal in lvs2009 and lvs2010
13:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T329260)', diff saved to https://phabricator.wikimedia.org/P45696 and previous config saved to /var/cache/conftool/dbconfig/20230309-131136-marostegui.json
13:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
13:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
13:04 btullis@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
13:04 btullis@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: btullis-T331115 - btullis@cumin1001"
13:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
13:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
13:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T329260)', diff saved to https://phabricator.wikimedia.org/P45695 and previous config saved to /var/cache/conftool/dbconfig/20230309-130315-marostegui.json
12:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P45694 and previous config saved to /var/cache/conftool/dbconfig/20230309-124809-marostegui.json
12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T329203)', diff saved to https://phabricator.wikimedia.org/P45693 and previous config saved to /var/cache/conftool/dbconfig/20230309-124025-marostegui.json
12:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
12:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T329203)', diff saved to https://phabricator.wikimedia.org/P45692 and previous config saved to /var/cache/conftool/dbconfig/20230309-124004-marostegui.json
12:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P45691 and previous config saved to /var/cache/conftool/dbconfig/20230309-123303-marostegui.json
12:30 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45690 and previous config saved to /var/cache/conftool/dbconfig/20230309-123015-root.json
12:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P45689 and previous config saved to /var/cache/conftool/dbconfig/20230309-122458-marostegui.json
12:22 moritzm: rebalancing ganeti eqiad/C after completion of bullseye updates T311687
12:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T329260)', diff saved to https://phabricator.wikimedia.org/P45688 and previous config saved to /var/cache/conftool/dbconfig/20230309-121756-marostegui.json
12:15 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45687 and previous config saved to /var/cache/conftool/dbconfig/20230309-121510-root.json
12:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P45686 and previous config saved to /var/cache/conftool/dbconfig/20230309-120951-marostegui.json
12:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T329260)', diff saved to https://phabricator.wikimedia.org/P45685 and previous config saved to /var/cache/conftool/dbconfig/20230309-120559-marostegui.json
12:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
12:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T329260)', diff saved to https://phabricator.wikimedia.org/P45684 and previous config saved to /var/cache/conftool/dbconfig/20230309-120537-marostegui.json
12:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P45683 and previous config saved to /var/cache/conftool/dbconfig/20230309-120005-root.json
11:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T329203)', diff saved to https://phabricator.wikimedia.org/P45682 and previous config saved to /var/cache/conftool/dbconfig/20230309-115445-marostegui.json
11:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P45681 and previous config saved to /var/cache/conftool/dbconfig/20230309-115031-marostegui.json
11:47 marostegui: Deploy schema change on s1 codfw dbmaint T329684
11:45 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P45680 and previous config saved to /var/cache/conftool/dbconfig/20230309-114500-root.json
11:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
11:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
11:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2180 (T329684)', diff saved to https://phabricator.wikimedia.org/P45679 and previous config saved to /var/cache/conftool/dbconfig/20230309-114338-marostegui.json
11:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
11:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
11:40 moritzm: installing git security updates
11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P45678 and previous config saved to /var/cache/conftool/dbconfig/20230309-113525-marostegui.json
11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2156 (T329203)', diff saved to https://phabricator.wikimedia.org/P45677 and previous config saved to /var/cache/conftool/dbconfig/20230309-112804-marostegui.json
11:28 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
11:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
11:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
11:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
11:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T329203)', diff saved to https://phabricator.wikimedia.org/P45676 and previous config saved to /var/cache/conftool/dbconfig/20230309-112739-marostegui.json
11:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T329260)', diff saved to https://phabricator.wikimedia.org/P45675 and previous config saved to /var/cache/conftool/dbconfig/20230309-112019-marostegui.json
11:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P45674 and previous config saved to /var/cache/conftool/dbconfig/20230309-111233-marostegui.json
11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T329260)', diff saved to https://phabricator.wikimedia.org/P45673 and previous config saved to /var/cache/conftool/dbconfig/20230309-110827-marostegui.json
11:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
11:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T329260)', diff saved to https://phabricator.wikimedia.org/P45672 and previous config saved to /var/cache/conftool/dbconfig/20230309-110806-marostegui.json
11:01 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 9 hosts
11:01 cmooney@cumin1001: START - Cookbook sre.hosts.remove-downtime for 9 hosts
10:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P45671 and previous config saved to /var/cache/conftool/dbconfig/20230309-105726-marostegui.json
10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P45670 and previous config saved to /var/cache/conftool/dbconfig/20230309-105259-marostegui.json
10:47 topranks: Resetting PIC in slot 1/0 on cr2-codfw T331527
10:45 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on 9 hosts with reason: cr2-codfw linecard 1/0 reset
10:44 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:20:00 on 9 hosts with reason: cr2-codfw linecard 1/0 reset
10:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T329203)', diff saved to https://phabricator.wikimedia.org/P45669 and previous config saved to /var/cache/conftool/dbconfig/20230309-104220-marostegui.json
10:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P45668 and previous config saved to /var/cache/conftool/dbconfig/20230309-103753-marostegui.json
10:32 hashar@deploy2002: Finished deploy [integration/docroot@095a329]: Add 'Test coverage' link for MW core and a few others (duration: 00m 08s)
10:32 hashar@deploy2002: Started deploy [integration/docroot@095a329]: Add 'Test coverage' link for MW core and a few others
10:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti1011.eqiad.wmnet to cluster eqiad and group C
10:26 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
10:26 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
10:25 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
10:24 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
10:23 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
10:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T329260)', diff saved to https://phabricator.wikimedia.org/P45667 and previous config saved to /var/cache/conftool/dbconfig/20230309-102247-marostegui.json
10:22 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
10:22 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 9 hosts with reason: cr2-codfw linecard 1/0 reset
10:22 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
10:22 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
10:22 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on 9 hosts with reason: cr2-codfw linecard 1/0 reset
10:21 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
10:21 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
10:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1011.eqiad.wmnet to cluster eqiad and group C
10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1011.eqiad.wmnet
10:19 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
10:19 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
10:13 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
10:13 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
10:13 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
10:13 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
10:12 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
10:11 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
10:11 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
10:11 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
10:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1011.eqiad.wmnet
10:11 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
10:10 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
10:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T329260)', diff saved to https://phabricator.wikimedia.org/P45666 and previous config saved to /var/cache/conftool/dbconfig/20230309-101042-marostegui.json
10:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2136.codfw.wmnet with reason: Maintenance
10:10 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
10:10 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti1011.eqiad.wmnet to cluster eqiad and group C
10:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2136.codfw.wmnet with reason: Maintenance
10:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T329260)', diff saved to https://phabricator.wikimedia.org/P45665 and previous config saved to /var/cache/conftool/dbconfig/20230309-101020-marostegui.json
10:10 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti1011.eqiad.wmnet to cluster eqiad and group C
10:10 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
10:06 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T329203)', diff saved to https://phabricator.wikimedia.org/P45664 and previous config saved to /var/cache/conftool/dbconfig/20230309-100611-marostegui.json
10:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
10:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
10:01 topranks: commencing work to drain cr2-codfw ports on card 1/0 (T331601)
09:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1011.eqiad.wmnet
09:55 marostegui: Deploy schema change on s4 codfw dbmaint T329684
09:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P45663 and previous config saved to /var/cache/conftool/dbconfig/20230309-095514-marostegui.json
09:53 marostegui: Deploy schema change on s8 codfw dbmaint T329684
09:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1011.eqiad.wmnet
09:48 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 9 hosts
09:48 cmooney@cumin1001: START - Cookbook sre.hosts.remove-downtime for 9 hosts
09:46 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45662 and previous config saved to /var/cache/conftool/dbconfig/20230309-094602-root.json
09:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P45661 and previous config saved to /var/cache/conftool/dbconfig/20230309-094008-marostegui.json
09:33 topranks: resetting Pic 1/0 on cr1-codfw
09:32 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on cr2-codfw,cr2-codfw IPv6 with reason: cr1-codfw linecard 1/0 reset
09:32 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on cr2-codfw,cr2-codfw IPv6 with reason: cr1-codfw linecard 1/0 reset
09:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
09:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T329203)', diff saved to https://phabricator.wikimedia.org/P45660 and previous config saved to /var/cache/conftool/dbconfig/20230309-093120-marostegui.json
09:30 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45659 and previous config saved to /var/cache/conftool/dbconfig/20230309-093057-root.json
09:29 elukey: delete old/unused ML-related docker images from the registry - T331513
09:27 topranks: disabling Transit cct on cr1-codfw xe-1/0/1:0 (T331527)
09:25 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on pfw3-codfw with reason: cr1-codfw linecard 1/0 reset
09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T329260)', diff saved to https://phabricator.wikimedia.org/P45658 and previous config saved to /var/cache/conftool/dbconfig/20230309-092502-marostegui.json
09:25 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on pfw3-codfw with reason: cr1-codfw linecard 1/0 reset
09:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti1011.eqiad.wmnet with OS bullseye
09:21 jnuche@deploy2002: Installation of scap version "latest" completed for 553 hosts
09:20 jnuche@deploy2002: Installing scap version "latest" for 553 hosts
09:19 marostegui: Deploy schema change on s7 codfw dbmaint T329684
09:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P45657 and previous config saved to /var/cache/conftool/dbconfig/20230309-091613-marostegui.json
09:15 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P45656 and previous config saved to /var/cache/conftool/dbconfig/20230309-091552-root.json
09:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T329260)', diff saved to https://phabricator.wikimedia.org/P45655 and previous config saved to /var/cache/conftool/dbconfig/20230309-091400-marostegui.json
09:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2119.codfw.wmnet with reason: Maintenance
09:13 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: cr1-codfw linecard 1/0 reset
09:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2119.codfw.wmnet with reason: Maintenance
09:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T329260)', diff saved to https://phabricator.wikimedia.org/P45654 and previous config saved to /var/cache/conftool/dbconfig/20230309-091338-marostegui.json
09:13 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on 6 hosts with reason: cr1-codfw linecard 1/0 reset
09:12 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:30:00 on 10 hosts with reason: cr1-codfw linecard 1/0 reset
09:12 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on 10 hosts with reason: cr1-codfw linecard 1/0 reset
09:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1011.eqiad.wmnet with reason: host reimage
09:06 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1011.eqiad.wmnet with reason: host reimage
09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P45653 and previous config saved to /var/cache/conftool/dbconfig/20230309-090107-marostegui.json
09:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P45652 and previous config saved to /var/cache/conftool/dbconfig/20230309-090048-root.json
08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P45651 and previous config saved to /var/cache/conftool/dbconfig/20230309-085832-marostegui.json
08:54 marostegui: Deploy schema change on s2 codfw dbmaint T329684
08:54 marostegui: Deploy schema change on s5 codfw dbmaint T329684
08:54 marostegui: Deploy schema change on s6 codfw dbmaint T329684
08:51 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti1011.eqiad.wmnet with OS bullseye
08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T329203)', diff saved to https://phabricator.wikimedia.org/P45650 and previous config saved to /var/cache/conftool/dbconfig/20230309-084601-marostegui.json
08:45 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P45649 and previous config saved to /var/cache/conftool/dbconfig/20230309-084543-root.json
08:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2180 (T329684)', diff saved to https://phabricator.wikimedia.org/P45648 and previous config saved to /var/cache/conftool/dbconfig/20230309-084359-marostegui.json
08:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
08:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2180.codfw.wmnet with reason: Maintenance
08:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P45647 and previous config saved to /var/cache/conftool/dbconfig/20230309-084326-marostegui.json
08:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
08:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
08:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
08:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2141.codfw.wmnet with reason: Maintenance
08:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T329260)', diff saved to https://phabricator.wikimedia.org/P45644 and previous config saved to /var/cache/conftool/dbconfig/20230309-082820-marostegui.json
08:23 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti1011.eqiad.wmnet with reason: remove from cluster for reimage
08:23 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti1011.eqiad.wmnet with reason: remove from cluster for reimage
08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45643 and previous config saved to /var/cache/conftool/dbconfig/20230309-082257-root.json
08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db2104 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45642 and previous config saved to /var/cache/conftool/dbconfig/20230309-082059-root.json
08:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2110 (T329260)', diff saved to https://phabricator.wikimedia.org/P45641 and previous config saved to /var/cache/conftool/dbconfig/20230309-081707-marostegui.json
08:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
08:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T329260)', diff saved to https://phabricator.wikimedia.org/P45640 and previous config saved to /var/cache/conftool/dbconfig/20230309-081646-marostegui.json
08:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2127 (T329203)', diff saved to https://phabricator.wikimedia.org/P45639 and previous config saved to /var/cache/conftool/dbconfig/20230309-080858-marostegui.json
08:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
08:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
08:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T329203)', diff saved to https://phabricator.wikimedia.org/P45638 and previous config saved to /var/cache/conftool/dbconfig/20230309-080837-marostegui.json
08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P45637 and previous config saved to /var/cache/conftool/dbconfig/20230309-080752-root.json
08:05 marostegui@cumin1001: dbctl commit (dc=all): 'db2104 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P45636 and previous config saved to /var/cache/conftool/dbconfig/20230309-080555-root.json
08:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P45635 and previous config saved to /var/cache/conftool/dbconfig/20230309-080140-marostegui.json
07:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P45634 and previous config saved to /var/cache/conftool/dbconfig/20230309-075331-marostegui.json
07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P45633 and previous config saved to /var/cache/conftool/dbconfig/20230309-075247-root.json
07:50 marostegui@cumin1001: dbctl commit (dc=all): 'db2104 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P45632 and previous config saved to /var/cache/conftool/dbconfig/20230309-075050-root.json
07:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P45631 and previous config saved to /var/cache/conftool/dbconfig/20230309-074633-marostegui.json
07:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P45630 and previous config saved to /var/cache/conftool/dbconfig/20230309-073825-marostegui.json
07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P45629 and previous config saved to /var/cache/conftool/dbconfig/20230309-073743-root.json
07:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2104 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P45628 and previous config saved to /var/cache/conftool/dbconfig/20230309-073545-root.json
07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T329260)', diff saved to https://phabricator.wikimedia.org/P45627 and previous config saved to /var/cache/conftool/dbconfig/20230309-073127-marostegui.json
07:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T329203)', diff saved to https://phabricator.wikimedia.org/P45626 and previous config saved to /var/cache/conftool/dbconfig/20230309-072319-marostegui.json
07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P45625 and previous config saved to /var/cache/conftool/dbconfig/20230309-072238-root.json
07:20 marostegui@cumin1001: dbctl commit (dc=all): 'db2104 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P45624 and previous config saved to /var/cache/conftool/dbconfig/20230309-072040-root.json
07:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2104 (T329684)', diff saved to https://phabricator.wikimedia.org/P45623 and previous config saved to /var/cache/conftool/dbconfig/20230309-071853-marostegui.json
07:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
07:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2104.codfw.wmnet with reason: Maintenance
07:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
07:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2097.codfw.wmnet with reason: Maintenance
07:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T329260)', diff saved to https://phabricator.wikimedia.org/P45622 and previous config saved to /var/cache/conftool/dbconfig/20230309-071809-marostegui.json
07:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2106.codfw.wmnet with reason: Maintenance
07:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2106.codfw.wmnet with reason: Maintenance
07:15 marostegui: Deploy schema change on s3 eqiad dbmaint T329684
07:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 15 hosts with reason: Schema change
07:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 15 hosts with reason: Schema change
07:13 marostegui: Deploy schema change on s7 eqiad dbmaint T329684
07:13 marostegui: Deploy schema change on s8 eqiad dbmaint T329684
07:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1113:3316', diff saved to https://phabricator.wikimedia.org/P45621 and previous config saved to /var/cache/conftool/dbconfig/20230309-071029-root.json
07:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2099.codfw.wmnet with reason: Maintenance
07:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2099.codfw.wmnet with reason: Maintenance
07:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T329684)', diff saved to https://phabricator.wikimedia.org/P45620 and previous config saved to /var/cache/conftool/dbconfig/20230309-070805-marostegui.json
07:07 marostegui@cumin1001: dbctl commit (dc=all): 'db2117 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P45619 and previous config saved to /var/cache/conftool/dbconfig/20230309-070733-root.json
07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T329684)', diff saved to https://phabricator.wikimedia.org/P45618 and previous config saved to /var/cache/conftool/dbconfig/20230309-070658-marostegui.json
07:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
07:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
07:06 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
07:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
07:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2117 (T329684)', diff saved to https://phabricator.wikimedia.org/P45617 and previous config saved to /var/cache/conftool/dbconfig/20230309-070327-marostegui.json
07:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
07:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db2117.codfw.wmnet with reason: Maintenance
07:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T329684)', diff saved to https://phabricator.wikimedia.org/P45616 and previous config saved to /var/cache/conftool/dbconfig/20230309-070223-marostegui.json
07:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
07:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 6:00:00 on db1113.eqiad.wmnet with reason: Maintenance
06:48 marostegui: Deploy schema change on s1 eqiad dbmaint T329684
06:48 marostegui: Deploy schema change on s4 eqiad dbmaint T329684
06:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
06:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2165.codfw.wmnet with reason: Maintenance
06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T329203)', diff saved to https://phabricator.wikimedia.org/P45615 and previous config saved to /var/cache/conftool/dbconfig/20230309-064538-marostegui.json
06:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
06:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
06:43 marostegui: Deploy schema change on s2 eqiad dbmaint T329684
06:42 marostegui: Deploy schema change on s5 eqiad dbmaint T329684
06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Schema change
06:40 marostegui: Deploy schema change on s6 eqiad dbmaint T329684
06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Schema change
06:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2140.codfw.wmnet with reason: Maintenance
06:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2140.codfw.wmnet with reason: Maintenance
06:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2112.codfw.wmnet with reason: Maintenance
06:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2112.codfw.wmnet with reason: Maintenance
04:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T329260)', diff saved to https://phabricator.wikimedia.org/P45614 and previous config saved to /var/cache/conftool/dbconfig/20230309-040925-marostegui.json
03:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P45613 and previous config saved to /var/cache/conftool/dbconfig/20230309-035418-marostegui.json
03:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P45612 and previous config saved to /var/cache/conftool/dbconfig/20230309-033912-marostegui.json
03:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T329260)', diff saved to https://phabricator.wikimedia.org/P45611 and previous config saved to /var/cache/conftool/dbconfig/20230309-032406-marostegui.json
03:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2181 (T329260)', diff saved to https://phabricator.wikimedia.org/P45610 and previous config saved to /var/cache/conftool/dbconfig/20230309-030445-marostegui.json
03:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
03:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
03:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T329260)', diff saved to https://phabricator.wikimedia.org/P45609 and previous config saved to /var/cache/conftool/dbconfig/20230309-030424-marostegui.json
02:59 sukhe: run keyholder arm on acmechief2001
02:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P45608 and previous config saved to /var/cache/conftool/dbconfig/20230309-024917-marostegui.json
02:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P45607 and previous config saved to /var/cache/conftool/dbconfig/20230309-023411-marostegui.json
02:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T329260)', diff saved to https://phabricator.wikimedia.org/P45606 and previous config saved to /var/cache/conftool/dbconfig/20230309-021905-marostegui.json
01:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3318 (T329260)', diff saved to https://phabricator.wikimedia.org/P45604 and previous config saved to /var/cache/conftool/dbconfig/20230309-015831-marostegui.json
01:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
01:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
01:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T329260)', diff saved to https://phabricator.wikimedia.org/P45603 and previous config saved to /var/cache/conftool/dbconfig/20230309-015810-marostegui.json
01:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P45602 and previous config saved to /var/cache/conftool/dbconfig/20230309-014303-marostegui.json
01:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P45601 and previous config saved to /var/cache/conftool/dbconfig/20230309-012757-marostegui.json
01:18 ebernhardson@deploy2002: Started deploy [airflow-dags/search@558da74]: correct eventgate datacenter partitioning in sensors
01:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T329260)', diff saved to https://phabricator.wikimedia.org/P45600 and previous config saved to /var/cache/conftool/dbconfig/20230309-011251-marostegui.json
00:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3318 (T329260)', diff saved to https://phabricator.wikimedia.org/P45599 and previous config saved to /var/cache/conftool/dbconfig/20230309-005220-marostegui.json
00:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
00:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
00:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T329260)', diff saved to https://phabricator.wikimedia.org/P45598 and previous config saved to /var/cache/conftool/dbconfig/20230309-005210-marostegui.json
00:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P45597 and previous config saved to /var/cache/conftool/dbconfig/20230309-003703-marostegui.json
00:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P45596 and previous config saved to /var/cache/conftool/dbconfig/20230309-002157-marostegui.json
00:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T329260)', diff saved to https://phabricator.wikimedia.org/P45594 and previous config saved to /var/cache/conftool/dbconfig/20230309-000651-marostegui.json
23:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2166 (T329260)', diff saved to https://phabricator.wikimedia.org/P45593 and previous config saved to /var/cache/conftool/dbconfig/20230308-234534-marostegui.json
23:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
23:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
23:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T329260)', diff saved to https://phabricator.wikimedia.org/P45592 and previous config saved to /var/cache/conftool/dbconfig/20230308-234502-marostegui.json
23:42 ebernhardson@deploy2002: Finished deploy [airflow-dags/search@29f73a4]: update virtualenv entry_points to use relative paths (duration: 00m 14s)
23:42 ebernhardson@deploy2002: Started deploy [airflow-dags/search@29f73a4]: update virtualenv entry_points to use relative paths
23:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P45591 and previous config saved to /var/cache/conftool/dbconfig/20230308-232956-marostegui.json
23:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P45590 and previous config saved to /var/cache/conftool/dbconfig/20230308-231449-marostegui.json
22:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T329260)', diff saved to https://phabricator.wikimedia.org/P45589 and previous config saved to /var/cache/conftool/dbconfig/20230308-225943-marostegui.json
22:44 hashar: Upgrading CI Jenkins
22:42 tgr: UTC late deploys done
22:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2164 (T329260)', diff saved to https://phabricator.wikimedia.org/P45588 and previous config saved to /var/cache/conftool/dbconfig/20230308-224044-marostegui.json
22:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
22:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
22:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
22:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
22:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T329260)', diff saved to https://phabricator.wikimedia.org/P45587 and previous config saved to /var/cache/conftool/dbconfig/20230308-224018-marostegui.json
22:27 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1001.eqiad.wmnet
22:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P45586 and previous config saved to /var/cache/conftool/dbconfig/20230308-222512-marostegui.json
22:21 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
22:20 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sessionstore1001.eqiad.wmnet
22:12 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host sessionstore1001.eqiad.wmnet
22:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P45585 and previous config saved to /var/cache/conftool/dbconfig/20230308-221006-marostegui.json
22:09 kindrobot: hand off backport window UTC late to tgr for self-service
21:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T329260)', diff saved to https://phabricator.wikimedia.org/P45584 and previous config saved to /var/cache/conftool/dbconfig/20230308-215500-marostegui.json
21:07 hashar@deploy2002: Started deploy [releng/jenkins-deploy@0e465ac] (releasing): (no justification provided)
20:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2163 (T329260)', diff saved to https://phabricator.wikimedia.org/P45583 and previous config saved to /var/cache/conftool/dbconfig/20230308-205435-marostegui.json
20:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
20:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
20:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T329260)', diff saved to https://phabricator.wikimedia.org/P45582 and previous config saved to /var/cache/conftool/dbconfig/20230308-205414-marostegui.json
20:51 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host acmechief2001.codfw.wmnet with OS bullseye
20:41 mutante: deploy2002 - systemctl restart keyholder-proxy.service to fix T331568 - after this SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh -i /etc/keyholder.d/deploy_jenkins -l deploy-jenkins releases1002.eqiad.wmnet works
20:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on acmechief2001.codfw.wmnet with reason: host reimage
20:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P45581 and previous config saved to /var/cache/conftool/dbconfig/20230308-203907-marostegui.json
20:36 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on acmechief2001.codfw.wmnet with reason: host reimage
20:24 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host acmechief2001.codfw.wmnet with OS bullseye
20:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P45580 and previous config saved to /var/cache/conftool/dbconfig/20230308-202401-marostegui.json
20:18 urandom: power cycle restbase2022 (unresponsive; cannot SSH)
20:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T329260)', diff saved to https://phabricator.wikimedia.org/P45579 and previous config saved to /var/cache/conftool/dbconfig/20230308-200855-marostegui.json
20:01 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host acmechief-test1001.eqiad.wmnet with OS bullseye
19:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2162 (T329260)', diff saved to https://phabricator.wikimedia.org/P45578 and previous config saved to /var/cache/conftool/dbconfig/20230308-194646-marostegui.json
19:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2162.codfw.wmnet with reason: Maintenance
19:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2162.codfw.wmnet with reason: Maintenance
19:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T329260)', diff saved to https://phabricator.wikimedia.org/P45577 and previous config saved to /var/cache/conftool/dbconfig/20230308-194625-marostegui.json
19:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on acmechief-test1001.eqiad.wmnet with reason: host reimage
19:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on acmechief-test1001.eqiad.wmnet with reason: host reimage
19:31 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host acmechief-test1001.eqiad.wmnet with OS bullseye
19:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P45576 and previous config saved to /var/cache/conftool/dbconfig/20230308-193118-marostegui.json
19:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P45575 and previous config saved to /var/cache/conftool/dbconfig/20230308-191612-marostegui.json
19:14 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
19:14 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse entries for new links from CRs to cloudsw1-b1-codfw. - cmooney@cumin1001"
19:13 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add reverse entries for new links from CRs to cloudsw1-b1-codfw. - cmooney@cumin1001"
19:09 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.26 refs T330204
19:09 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for acmechief-test2001.codfw.wmnet
19:09 sukhe@cumin2002: START - Cookbook sre.hosts.remove-downtime for acmechief-test2001.codfw.wmnet
19:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T329260)', diff saved to https://phabricator.wikimedia.org/P45574 and previous config saved to /var/cache/conftool/dbconfig/20230308-190106-marostegui.json
18:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T328817)', diff saved to https://phabricator.wikimedia.org/P45573 and previous config saved to /var/cache/conftool/dbconfig/20230308-184328-marostegui.json
18:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2161 (T329260)', diff saved to https://phabricator.wikimedia.org/P45572 and previous config saved to /var/cache/conftool/dbconfig/20230308-184204-marostegui.json
18:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
18:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
18:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T329260)', diff saved to https://phabricator.wikimedia.org/P45571 and previous config saved to /var/cache/conftool/dbconfig/20230308-184143-marostegui.json
18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109 (T318605)', diff saved to https://phabricator.wikimedia.org/P45570 and previous config saved to /var/cache/conftool/dbconfig/20230308-183020-ladsgroup.json
18:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P45569 and previous config saved to /var/cache/conftool/dbconfig/20230308-182822-marostegui.json
18:28 inflatador: bking@cumin2002 repool elastic1060-1066 to finish off T322082
18:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T329203)', diff saved to https://phabricator.wikimedia.org/P45568 and previous config saved to /var/cache/conftool/dbconfig/20230308-182726-marostegui.json
18:27 inflatador: bking@cumin2002 unban elastic1060-1066 to finish off T322082
18:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P45567 and previous config saved to /var/cache/conftool/dbconfig/20230308-182637-marostegui.json
18:16 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host acmechief-test2001.codfw.wmnet with OS bullseye
18:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109', diff saved to https://phabricator.wikimedia.org/P45566 and previous config saved to /var/cache/conftool/dbconfig/20230308-181514-ladsgroup.json
18:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P45565 and previous config saved to /var/cache/conftool/dbconfig/20230308-181316-marostegui.json
18:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P45564 and previous config saved to /var/cache/conftool/dbconfig/20230308-181220-marostegui.json
18:12 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "update location of elastic1064 - bking@cumin2002 - T322082"
18:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P45563 and previous config saved to /var/cache/conftool/dbconfig/20230308-181131-marostegui.json
18:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109', diff saved to https://phabricator.wikimedia.org/P45562 and previous config saved to /var/cache/conftool/dbconfig/20230308-180008-ladsgroup.json
17:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T328817)', diff saved to https://phabricator.wikimedia.org/P45561 and previous config saved to /var/cache/conftool/dbconfig/20230308-175810-marostegui.json
17:58 herron: failing grafana over from codfw to eqiad
17:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P45560 and previous config saved to /var/cache/conftool/dbconfig/20230308-175714-marostegui.json
17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on acmechief-test2001.codfw.wmnet with reason: host reimage
17:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T329260)', diff saved to https://phabricator.wikimedia.org/P45559 and previous config saved to /var/cache/conftool/dbconfig/20230308-175625-marostegui.json
17:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic1066.mgmt.eqiad.wmnet with reboot policy GRACEFUL
17:48 bking@cumin2002: START - Cookbook sre.hosts.provision for host elastic1066.mgmt.eqiad.wmnet with reboot policy GRACEFUL
17:47 bking@cumin2002: START - Cookbook sre.hosts.provision for host elastic1064.mgmt.eqiad.wmnet with reboot policy GRACEFUL
17:47 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host acmechief-test2001.codfw.wmnet with OS bullseye
17:46 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['elastic1064.eqiad.wmnet']
17:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2176 (T328817)', diff saved to https://phabricator.wikimedia.org/P45558 and previous config saved to /var/cache/conftool/dbconfig/20230308-174535-marostegui.json
17:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
17:45 bking@cumin2002: START - Cookbook sre.hosts.provision for host elastic1065.mgmt.eqiad.wmnet with reboot policy GRACEFUL
17:45 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
17:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T328817)', diff saved to https://phabricator.wikimedia.org/P45557 and previous config saved to /var/cache/conftool/dbconfig/20230308-174514-marostegui.json
17:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1109 (T318605)', diff saved to https://phabricator.wikimedia.org/P45556 and previous config saved to /var/cache/conftool/dbconfig/20230308-174501-ladsgroup.json
17:43 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['elastic1066.eqiad.wmnet']
17:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T329203)', diff saved to https://phabricator.wikimedia.org/P45555 and previous config saved to /var/cache/conftool/dbconfig/20230308-174208-marostegui.json
17:38 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic1065.eqiad.wmnet']
17:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2154 (T329260)', diff saved to https://phabricator.wikimedia.org/P45554 and previous config saved to /var/cache/conftool/dbconfig/20230308-173701-marostegui.json
17:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
17:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
17:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T329260)', diff saved to https://phabricator.wikimedia.org/P45553 and previous config saved to /var/cache/conftool/dbconfig/20230308-173640-marostegui.json
17:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
17:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T329203)', diff saved to https://phabricator.wikimedia.org/P45551 and previous config saved to /var/cache/conftool/dbconfig/20230308-173104-marostegui.json
17:31 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['elastic1065.eqiad.wmnet']
17:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P45550 and previous config saved to /var/cache/conftool/dbconfig/20230308-173007-marostegui.json
17:28 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['elastic1064.eqiad.wmnet']
17:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P45549 and previous config saved to /var/cache/conftool/dbconfig/20230308-172134-marostegui.json
17:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P45548 and previous config saved to /var/cache/conftool/dbconfig/20230308-171558-marostegui.json
17:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P45547 and previous config saved to /var/cache/conftool/dbconfig/20230308-171501-marostegui.json
17:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P45546 and previous config saved to /var/cache/conftool/dbconfig/20230308-170627-marostegui.json
17:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1109 (T318605)', diff saved to https://phabricator.wikimedia.org/P45545 and previous config saved to /var/cache/conftool/dbconfig/20230308-170512-ladsgroup.json
17:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1109.eqiad.wmnet with reason: Maintenance
17:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1109.eqiad.wmnet with reason: Maintenance
17:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P45543 and previous config saved to /var/cache/conftool/dbconfig/20230308-170051-marostegui.json
16:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T328817)', diff saved to https://phabricator.wikimedia.org/P45542 and previous config saved to /var/cache/conftool/dbconfig/20230308-165955-marostegui.json
16:52 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic1063.mgmt.eqiad.wmnet with reboot policy GRACEFUL
16:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T329260)', diff saved to https://phabricator.wikimedia.org/P45541 and previous config saved to /var/cache/conftool/dbconfig/20230308-165121-marostegui.json
16:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2174 (T328817)', diff saved to https://phabricator.wikimedia.org/P45540 and previous config saved to /var/cache/conftool/dbconfig/20230308-164807-marostegui.json
16:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
16:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
16:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T328817)', diff saved to https://phabricator.wikimedia.org/P45539 and previous config saved to /var/cache/conftool/dbconfig/20230308-164746-marostegui.json
16:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T329203)', diff saved to https://phabricator.wikimedia.org/P45538 and previous config saved to /var/cache/conftool/dbconfig/20230308-164545-marostegui.json
16:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T329203)', diff saved to https://phabricator.wikimedia.org/P45537 and previous config saved to /var/cache/conftool/dbconfig/20230308-163311-marostegui.json
16:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
16:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
16:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T329203)', diff saved to https://phabricator.wikimedia.org/P45536 and previous config saved to /var/cache/conftool/dbconfig/20230308-163249-marostegui.json
16:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P45535 and previous config saved to /var/cache/conftool/dbconfig/20230308-163240-marostegui.json
16:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2152 (T329260)', diff saved to https://phabricator.wikimedia.org/P45534 and previous config saved to /var/cache/conftool/dbconfig/20230308-163230-marostegui.json
16:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
16:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
16:29 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "update location of elastic1060 - bking@cumin2002 - T322082"
16:22 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic1060.mgmt.eqiad.wmnet with reboot policy GRACEFUL
16:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic1061.mgmt.eqiad.wmnet with reboot policy GRACEFUL
16:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P45533 and previous config saved to /var/cache/conftool/dbconfig/20230308-161737-marostegui.json
16:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P45532 and previous config saved to /var/cache/conftool/dbconfig/20230308-161727-marostegui.json
16:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
16:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2100.codfw.wmnet with reason: Maintenance
16:14 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic1062.mgmt.eqiad.wmnet with reboot policy GRACEFUL
16:10 bking@cumin2002: START - Cookbook sre.hosts.provision for host elastic1062.mgmt.eqiad.wmnet with reboot policy GRACEFUL
16:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on elastic1062.eqiad.wmnet with reason: re-rack
16:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on elastic1062.eqiad.wmnet with reason: re-rack
16:08 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host elastic1062.eqiad.wmnet
16:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on elastic1061.eqiad.wmnet with reason: re-rack
16:06 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on elastic1061.eqiad.wmnet with reason: re-rack
16:05 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 4 hosts
16:05 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 4 hosts
16:03 bking@cumin2002: START - Cookbook sre.hosts.provision for host elastic1060.mgmt.eqiad.wmnet with reboot policy GRACEFUL
16:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P45531 and previous config saved to /var/cache/conftool/dbconfig/20230308-160231-marostegui.json
16:02 bking@cumin2002: START - Cookbook sre.hosts.provision for host elastic1061.mgmt.eqiad.wmnet with reboot policy GRACEFUL
16:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T328817)', diff saved to https://phabricator.wikimedia.org/P45530 and previous config saved to /var/cache/conftool/dbconfig/20230308-160221-marostegui.json
16:00 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host elastic1062.eqiad.wmnet
16:00 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host elastic1061.eqiad.wmnet
15:59 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic1062.eqiad.wmnet']
15:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
15:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2173 (T328817)', diff saved to https://phabricator.wikimedia.org/P45529 and previous config saved to /var/cache/conftool/dbconfig/20230308-154736-marostegui.json
15:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
15:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
15:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T329203)', diff saved to https://phabricator.wikimedia.org/P45528 and previous config saved to /var/cache/conftool/dbconfig/20230308-154724-marostegui.json
15:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
15:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T328817)', diff saved to https://phabricator.wikimedia.org/P45527 and previous config saved to /var/cache/conftool/dbconfig/20230308-154709-marostegui.json
15:33 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['elastic1061.eqiad.wmnet']
15:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P45526 and previous config saved to /var/cache/conftool/dbconfig/20230308-153202-marostegui.json
15:31 bking@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['elastic1060.eqiad.wmnet']
15:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2113.codfw.wmnet with reason: Maintenance
15:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2113.codfw.wmnet with reason: Maintenance
15:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P45525 and previous config saved to /var/cache/conftool/dbconfig/20230308-151656-marostegui.json
15:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T328817)', diff saved to https://phabricator.wikimedia.org/P45524 and previous config saved to /var/cache/conftool/dbconfig/20230308-150150-marostegui.json
14:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T329260)', diff saved to https://phabricator.wikimedia.org/P45523 and previous config saved to /var/cache/conftool/dbconfig/20230308-145245-marostegui.json
14:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 (T328817)', diff saved to https://phabricator.wikimedia.org/P45522 and previous config saved to /var/cache/conftool/dbconfig/20230308-144934-marostegui.json
14:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
14:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
14:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T328817)', diff saved to https://phabricator.wikimedia.org/P45521 and previous config saved to /var/cache/conftool/dbconfig/20230308-144924-marostegui.json
14:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T329203)', diff saved to https://phabricator.wikimedia.org/P45520 and previous config saved to /var/cache/conftool/dbconfig/20230308-144659-marostegui.json
14:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
14:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
14:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
14:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
14:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T329203)', diff saved to https://phabricator.wikimedia.org/P45519 and previous config saved to /var/cache/conftool/dbconfig/20230308-144634-marostegui.json
14:46 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
14:46 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
14:45 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
14:44 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
14:43 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
14:42 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
14:41 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
14:40 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
14:40 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
14:39 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
14:39 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
14:38 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
14:37 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
14:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P45518 and previous config saved to /var/cache/conftool/dbconfig/20230308-143739-marostegui.json
14:37 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
14:36 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
14:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P45517 and previous config saved to /var/cache/conftool/dbconfig/20230308-143418-marostegui.json
14:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P45516 and previous config saved to /var/cache/conftool/dbconfig/20230308-143127-marostegui.json
14:25 inflatador: bking@cumin2002 powering down elastic1060-66 for re-rack T322082
14:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P45514 and previous config saved to /var/cache/conftool/dbconfig/20230308-142233-marostegui.json
14:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P45513 and previous config saved to /var/cache/conftool/dbconfig/20230308-141911-marostegui.json
14:16 TheresNoTime: close UTC afternoon backport window
14:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P45511 and previous config saved to /var/cache/conftool/dbconfig/20230308-141621-marostegui.json
14:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T329260)', diff saved to https://phabricator.wikimedia.org/P45510 and previous config saved to /var/cache/conftool/dbconfig/20230308-140727-marostegui.json
14:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T328817)', diff saved to https://phabricator.wikimedia.org/P45509 and previous config saved to /var/cache/conftool/dbconfig/20230308-140405-marostegui.json
14:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T329203)', diff saved to https://phabricator.wikimedia.org/P45508 and previous config saved to /var/cache/conftool/dbconfig/20230308-140115-marostegui.json
13:51 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 (T328817)', diff saved to https://phabricator.wikimedia.org/P45507 and previous config saved to /var/cache/conftool/dbconfig/20230308-135153-marostegui.json
13:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
13:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
13:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T328817)', diff saved to https://phabricator.wikimedia.org/P45506 and previous config saved to /var/cache/conftool/dbconfig/20230308-135132-marostegui.json
13:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T329203)', diff saved to https://phabricator.wikimedia.org/P45505 and previous config saved to /var/cache/conftool/dbconfig/20230308-134945-marostegui.json
13:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
13:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
13:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
13:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
13:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T329203)', diff saved to https://phabricator.wikimedia.org/P45504 and previous config saved to /var/cache/conftool/dbconfig/20230308-134034-marostegui.json
13:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2178 (T329260)', diff saved to https://phabricator.wikimedia.org/P45503 and previous config saved to /var/cache/conftool/dbconfig/20230308-134002-marostegui.json
13:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
13:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2178.codfw.wmnet with reason: Maintenance
13:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T329260)', diff saved to https://phabricator.wikimedia.org/P45502 and previous config saved to /var/cache/conftool/dbconfig/20230308-133940-marostegui.json
13:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P45501 and previous config saved to /var/cache/conftool/dbconfig/20230308-133626-marostegui.json
13:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P45500 and previous config saved to /var/cache/conftool/dbconfig/20230308-132528-marostegui.json
13:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P45499 and previous config saved to /var/cache/conftool/dbconfig/20230308-132434-marostegui.json
13:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P45498 and previous config saved to /var/cache/conftool/dbconfig/20230308-132120-marostegui.json
13:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P45497 and previous config saved to /var/cache/conftool/dbconfig/20230308-131022-marostegui.json
13:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315', diff saved to https://phabricator.wikimedia.org/P45496 and previous config saved to /var/cache/conftool/dbconfig/20230308-130928-marostegui.json
13:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T328817)', diff saved to https://phabricator.wikimedia.org/P45495 and previous config saved to /var/cache/conftool/dbconfig/20230308-130613-marostegui.json
12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2153 (T328817)', diff saved to https://phabricator.wikimedia.org/P45494 and previous config saved to /var/cache/conftool/dbconfig/20230308-125548-marostegui.json
12:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
12:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T328817)', diff saved to https://phabricator.wikimedia.org/P45493 and previous config saved to /var/cache/conftool/dbconfig/20230308-125527-marostegui.json
12:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T329203)', diff saved to https://phabricator.wikimedia.org/P45492 and previous config saved to /var/cache/conftool/dbconfig/20230308-125515-marostegui.json
12:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3315 (T329260)', diff saved to https://phabricator.wikimedia.org/P45491 and previous config saved to /var/cache/conftool/dbconfig/20230308-125422-marostegui.json
12:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3315 (T329260)', diff saved to https://phabricator.wikimedia.org/P45490 and previous config saved to /var/cache/conftool/dbconfig/20230308-124945-marostegui.json
12:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
12:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
12:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T329260)', diff saved to https://phabricator.wikimedia.org/P45489 and previous config saved to /var/cache/conftool/dbconfig/20230308-124924-marostegui.json
12:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T329203)', diff saved to https://phabricator.wikimedia.org/P45488 and previous config saved to /var/cache/conftool/dbconfig/20230308-124344-marostegui.json
12:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
12:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
12:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T329203)', diff saved to https://phabricator.wikimedia.org/P45487 and previous config saved to /var/cache/conftool/dbconfig/20230308-124334-marostegui.json
12:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P45486 and previous config saved to /var/cache/conftool/dbconfig/20230308-124021-marostegui.json
12:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P45485 and previous config saved to /var/cache/conftool/dbconfig/20230308-123418-marostegui.json
12:31 hnowlan: running authdns-update for r/890398
12:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P45484 and previous config saved to /var/cache/conftool/dbconfig/20230308-122827-marostegui.json
12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P45483 and previous config saved to /var/cache/conftool/dbconfig/20230308-122515-marostegui.json
12:22 hnowlan@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
12:22 hnowlan@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add service records for device-analytics - hnowlan@cumin1001"
12:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P45482 and previous config saved to /var/cache/conftool/dbconfig/20230308-121912-marostegui.json
12:17 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc1039.eqiad.wmnet with OS bullseye
12:14 jmm@cumin2002: START - Cookbook sre.ganeti.reimage for host urldownloader1003.wikimedia.org with OS bullseye
12:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P45480 and previous config saved to /var/cache/conftool/dbconfig/20230308-121321-marostegui.json
12:10 hnowlan@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add service records for device-analytics - hnowlan@cumin1001"
12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T328817)', diff saved to https://phabricator.wikimedia.org/P45479 and previous config saved to /var/cache/conftool/dbconfig/20230308-121009-marostegui.json
12:09 jmm@cumin2002: END (ERROR) - Cookbook sre.ganeti.reimage (exit_code=97) for host urldownloader1003.wikimedia.org with OS bullseye
12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T329260)', diff saved to https://phabricator.wikimedia.org/P45478 and previous config saved to /var/cache/conftool/dbconfig/20230308-120406-marostegui.json
12:01 claime: restbase-async back in standard state - T330651
12:01 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1039.eqiad.wmnet with reason: host reimage
12:00 cgoubert@cumin1001: END (PASS) - Cookbook sre.discovery.service-route (exit_code=0) depool restbase-async in codfw: T330651
11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2157 (T329260)', diff saved to https://phabricator.wikimedia.org/P45477 and previous config saved to /var/cache/conftool/dbconfig/20230308-115935-marostegui.json
11:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T328817)', diff saved to https://phabricator.wikimedia.org/P45476 and previous config saved to /var/cache/conftool/dbconfig/20230308-115924-marostegui.json
11:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
11:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2157.codfw.wmnet with reason: Maintenance
11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T329260)', diff saved to https://phabricator.wikimedia.org/P45475 and previous config saved to /var/cache/conftool/dbconfig/20230308-115913-marostegui.json
11:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
11:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T328817)', diff saved to https://phabricator.wikimedia.org/P45474 and previous config saved to /var/cache/conftool/dbconfig/20230308-115903-marostegui.json
11:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T329203)', diff saved to https://phabricator.wikimedia.org/P45473 and previous config saved to /var/cache/conftool/dbconfig/20230308-115815-marostegui.json
11:57 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc1039.eqiad.wmnet with reason: host reimage
11:55 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) restbase-async.discovery.wmnet on all recursors
11:55 cgoubert@cumin1001: START - Cookbook sre.dns.wipe-cache restbase-async.discovery.wmnet on all recursors
11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T329203)', diff saved to https://phabricator.wikimedia.org/P45471 and previous config saved to /var/cache/conftool/dbconfig/20230308-114652-marostegui.json
11:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
11:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T329203)', diff saved to https://phabricator.wikimedia.org/P45470 and previous config saved to /var/cache/conftool/dbconfig/20230308-114642-marostegui.json
11:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1113:3315', diff saved to https://phabricator.wikimedia.org/P45469 and previous config saved to /var/cache/conftool/dbconfig/20230308-114553-root.json
11:44 jiji@cumin1001: START - Cookbook sre.hosts.reimage for host mc1039.eqiad.wmnet with OS bullseye
11:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P45468 and previous config saved to /var/cache/conftool/dbconfig/20230308-114407-marostegui.json
11:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P45467 and previous config saved to /var/cache/conftool/dbconfig/20230308-114357-marostegui.json
11:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P45466 and previous config saved to /var/cache/conftool/dbconfig/20230308-113136-marostegui.json
11:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315', diff saved to https://phabricator.wikimedia.org/P45465 and previous config saved to /var/cache/conftool/dbconfig/20230308-112901-marostegui.json
11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P45464 and previous config saved to /var/cache/conftool/dbconfig/20230308-112850-marostegui.json
11:27 otto@deploy2002: Started deploy [analytics/refinery@d4aaff9]: Regular analytics weekly train [analytics/refinery@d4aaff9]
11:21 claime: Traffic: repool eqiad for user traffic - T331285
11:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P45463 and previous config saved to /var/cache/conftool/dbconfig/20230308-111628-marostegui.json
11:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3315 (T329260)', diff saved to https://phabricator.wikimedia.org/P45462 and previous config saved to /var/cache/conftool/dbconfig/20230308-111355-marostegui.json
11:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T328817)', diff saved to https://phabricator.wikimedia.org/P45461 and previous config saved to /var/cache/conftool/dbconfig/20230308-111344-marostegui.json
11:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3315 (T329260)', diff saved to https://phabricator.wikimedia.org/P45460 and previous config saved to /var/cache/conftool/dbconfig/20230308-110907-marostegui.json
11:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
11:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T329260)', diff saved to https://phabricator.wikimedia.org/P45459 and previous config saved to /var/cache/conftool/dbconfig/20230308-110846-marostegui.json
11:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2145 (T328817)', diff saved to https://phabricator.wikimedia.org/P45458 and previous config saved to /var/cache/conftool/dbconfig/20230308-110306-marostegui.json
11:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
11:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
11:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T329203)', diff saved to https://phabricator.wikimedia.org/P45457 and previous config saved to /var/cache/conftool/dbconfig/20230308-110121-marostegui.json
10:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
10:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T328817)', diff saved to https://phabricator.wikimedia.org/P45456 and previous config saved to /var/cache/conftool/dbconfig/20230308-105347-marostegui.json
10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P45455 and previous config saved to /var/cache/conftool/dbconfig/20230308-105339-marostegui.json
10:52 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
10:52 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
10:52 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
10:51 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
10:51 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
10:50 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
10:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T329203)', diff saved to https://phabricator.wikimedia.org/P45454 and previous config saved to /var/cache/conftool/dbconfig/20230308-105043-marostegui.json
10:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2136.codfw.wmnet with reason: Maintenance
10:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2136.codfw.wmnet with reason: Maintenance
10:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T329203)', diff saved to https://phabricator.wikimedia.org/P45453 and previous config saved to /var/cache/conftool/dbconfig/20230308-105022-marostegui.json
10:50 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
10:49 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
10:48 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
10:42 otto@deploy2002: Started deploy [analytics/refinery@eb29334]: Regular analytics weekly train [analytics/refinery@eb29334]
10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P45452 and previous config saved to /var/cache/conftool/dbconfig/20230308-103840-marostegui.json
10:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128', diff saved to https://phabricator.wikimedia.org/P45451 and previous config saved to /var/cache/conftool/dbconfig/20230308-103833-marostegui.json
10:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P45450 and previous config saved to /var/cache/conftool/dbconfig/20230308-103515-marostegui.json
10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P45449 and previous config saved to /var/cache/conftool/dbconfig/20230308-102334-marostegui.json
10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2128 (T329260)', diff saved to https://phabricator.wikimedia.org/P45448 and previous config saved to /var/cache/conftool/dbconfig/20230308-102326-marostegui.json
10:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P45447 and previous config saved to /var/cache/conftool/dbconfig/20230308-102009-marostegui.json
10:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
10:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
10:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance
10:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2128.codfw.wmnet with reason: Maintenance
10:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T329260)', diff saved to https://phabricator.wikimedia.org/P45446 and previous config saved to /var/cache/conftool/dbconfig/20230308-101944-marostegui.json
10:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T328817)', diff saved to https://phabricator.wikimedia.org/P45445 and previous config saved to /var/cache/conftool/dbconfig/20230308-100826-marostegui.json
10:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T329203)', diff saved to https://phabricator.wikimedia.org/P45444 and previous config saved to /var/cache/conftool/dbconfig/20230308-100502-marostegui.json
10:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P45443 and previous config saved to /var/cache/conftool/dbconfig/20230308-100437-marostegui.json
09:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2130 (T328817)', diff saved to https://phabricator.wikimedia.org/P45442 and previous config saved to /var/cache/conftool/dbconfig/20230308-095804-marostegui.json
09:57 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
09:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
09:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T328817)', diff saved to https://phabricator.wikimedia.org/P45441 and previous config saved to /var/cache/conftool/dbconfig/20230308-095742-marostegui.json
09:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T329203)', diff saved to https://phabricator.wikimedia.org/P45440 and previous config saved to /var/cache/conftool/dbconfig/20230308-095320-marostegui.json
09:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2119.codfw.wmnet with reason: Maintenance
09:53 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2119.codfw.wmnet with reason: Maintenance
09:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T329203)', diff saved to https://phabricator.wikimedia.org/P45439 and previous config saved to /var/cache/conftool/dbconfig/20230308-095259-marostegui.json
09:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123', diff saved to https://phabricator.wikimedia.org/P45438 and previous config saved to /var/cache/conftool/dbconfig/20230308-094931-marostegui.json
09:45 claime: Rebuilding production-images for 894687
09:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P45437 and previous config saved to /var/cache/conftool/dbconfig/20230308-094236-marostegui.json
09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P45436 and previous config saved to /var/cache/conftool/dbconfig/20230308-093752-marostegui.json
09:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2123 (T329260)', diff saved to https://phabricator.wikimedia.org/P45435 and previous config saved to /var/cache/conftool/dbconfig/20230308-093424-marostegui.json
09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2123 (T329260)', diff saved to https://phabricator.wikimedia.org/P45434 and previous config saved to /var/cache/conftool/dbconfig/20230308-093106-marostegui.json
09:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2123.codfw.wmnet with reason: Maintenance
09:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2123.codfw.wmnet with reason: Maintenance
09:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T329260)', diff saved to https://phabricator.wikimedia.org/P45433 and previous config saved to /var/cache/conftool/dbconfig/20230308-093045-marostegui.json
09:30 moritzm: drain ganeti1011 for eventual reimage to Bullseye T311687
09:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P45432 and previous config saved to /var/cache/conftool/dbconfig/20230308-092729-marostegui.json
09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P45431 and previous config saved to /var/cache/conftool/dbconfig/20230308-092246-marostegui.json
09:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P45430 and previous config saved to /var/cache/conftool/dbconfig/20230308-091538-marostegui.json
09:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T328817)', diff saved to https://phabricator.wikimedia.org/P45429 and previous config saved to /var/cache/conftool/dbconfig/20230308-091223-marostegui.json
09:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T329203)', diff saved to https://phabricator.wikimedia.org/P45428 and previous config saved to /var/cache/conftool/dbconfig/20230308-090739-marostegui.json
09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2116 (T328817)', diff saved to https://phabricator.wikimedia.org/P45426 and previous config saved to /var/cache/conftool/dbconfig/20230308-090156-marostegui.json
09:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
09:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T328817)', diff saved to https://phabricator.wikimedia.org/P45425 and previous config saved to /var/cache/conftool/dbconfig/20230308-090134-marostegui.json
09:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P45424 and previous config saved to /var/cache/conftool/dbconfig/20230308-090031-marostegui.json
08:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2110 (T329203)', diff saved to https://phabricator.wikimedia.org/P45423 and previous config saved to /var/cache/conftool/dbconfig/20230308-085608-marostegui.json
08:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
08:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T329203)', diff saved to https://phabricator.wikimedia.org/P45422 and previous config saved to /var/cache/conftool/dbconfig/20230308-085546-marostegui.json
08:52 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45421 and previous config saved to /var/cache/conftool/dbconfig/20230308-085159-root.json
08:50 vgutierrez: re-enable HAProxy systemd service unit hardening in ulsfo - T323944
08:49 moritzm: installing git security updates
08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P45420 and previous config saved to /var/cache/conftool/dbconfig/20230308-084628-marostegui.json
08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111 (T329260)', diff saved to https://phabricator.wikimedia.org/P45419 and previous config saved to /var/cache/conftool/dbconfig/20230308-084525-marostegui.json
08:41 marostegui: Deploy schema change on s3 eqiad dbmaint T329203
08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2111 (T329260)', diff saved to https://phabricator.wikimedia.org/P45418 and previous config saved to /var/cache/conftool/dbconfig/20230308-084053-marostegui.json
08:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2111.codfw.wmnet with reason: Maintenance
08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P45417 and previous config saved to /var/cache/conftool/dbconfig/20230308-084040-marostegui.json
08:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2111.codfw.wmnet with reason: Maintenance
08:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1113:3315', diff saved to https://phabricator.wikimedia.org/P45416 and previous config saved to /var/cache/conftool/dbconfig/20230308-083843-marostegui.json
08:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance
08:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2101.codfw.wmnet with reason: Maintenance
08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3315 (re)pooling @ 15%: Repooling', diff saved to https://phabricator.wikimedia.org/P45415 and previous config saved to /var/cache/conftool/dbconfig/20230308-083731-root.json
08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45414 and previous config saved to /var/cache/conftool/dbconfig/20230308-083654-root.json
08:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P45413 and previous config saved to /var/cache/conftool/dbconfig/20230308-083618-marostegui.json
08:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 15 hosts with reason: Schema change
08:34 marostegui: Deploy schema change on s3 eqiad dbmaint T329260
08:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 15 hosts with reason: Schema change
08:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 14 hosts with reason: Schema change
08:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 14 hosts with reason: Schema change
08:32 marostegui: Deploy schema change on s5 eqiad dbmaint T329260
08:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P45412 and previous config saved to /var/cache/conftool/dbconfig/20230308-083121-marostegui.json
08:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P45411 and previous config saved to /var/cache/conftool/dbconfig/20230308-082533-marostegui.json
08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P45410 and previous config saved to /var/cache/conftool/dbconfig/20230308-082149-root.json
08:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315 (T329260)', diff saved to https://phabricator.wikimedia.org/P45409 and previous config saved to /var/cache/conftool/dbconfig/20230308-082112-marostegui.json
08:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3315 (T329260)', diff saved to https://phabricator.wikimedia.org/P45408 and previous config saved to /var/cache/conftool/dbconfig/20230308-081809-marostegui.json
08:18 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
08:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
08:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T329260)', diff saved to https://phabricator.wikimedia.org/P45407 and previous config saved to /var/cache/conftool/dbconfig/20230308-081748-marostegui.json
08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T328817)', diff saved to https://phabricator.wikimedia.org/P45406 and previous config saved to /var/cache/conftool/dbconfig/20230308-081614-marostegui.json
08:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 19 hosts with reason: Schema change
08:15 marostegui: Deploy schema change on s8 eqiad dbmaint T329260
08:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 19 hosts with reason: Schema change
08:15 marostegui: Deploy schema change on s7 eqiad dbmaint T329260
08:15 marostegui: Deploy schema change on s4 eqiad dbmaint T329260
08:15 marostegui: Deploy schema change on s1 eqiad dbmaint T329260
08:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 15 hosts with reason: Schema change
08:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 15 hosts with reason: Schema change
08:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2093.codfw.wmnet
08:10 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
08:10 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2093.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
08:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T329203)', diff saved to https://phabricator.wikimedia.org/P45405 and previous config saved to /var/cache/conftool/dbconfig/20230308-081027-marostegui.json
08:09 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2093.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P45404 and previous config saved to /var/cache/conftool/dbconfig/20230308-080644-root.json
08:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2103 (T328817)', diff saved to https://phabricator.wikimedia.org/P45403 and previous config saved to /var/cache/conftool/dbconfig/20230308-080431-marostegui.json
08:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
08:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
08:02 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db2093.codfw.wmnet
08:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P45402 and previous config saved to /var/cache/conftool/dbconfig/20230308-080241-marostegui.json
08:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 20 hosts with reason: Schema change
08:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 20 hosts with reason: Schema change
08:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 22 hosts with reason: Schema change
08:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 22 hosts with reason: Schema change
07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T329203)', diff saved to https://phabricator.wikimedia.org/P45401 and previous config saved to /var/cache/conftool/dbconfig/20230308-075857-marostegui.json
07:58 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2106.codfw.wmnet with reason: Maintenance
07:58 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2106.codfw.wmnet with reason: Maintenance
07:56 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
07:56 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2102.codfw.wmnet with reason: Maintenance
07:51 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P45400 and previous config saved to /var/cache/conftool/dbconfig/20230308-075139-root.json
07:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2099.codfw.wmnet with reason: Maintenance
07:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2099.codfw.wmnet with reason: Maintenance
07:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
07:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
07:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110', diff saved to https://phabricator.wikimedia.org/P45399 and previous config saved to /var/cache/conftool/dbconfig/20230308-074735-marostegui.json
07:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1109', diff saved to https://phabricator.wikimedia.org/P45398 and previous config saved to /var/cache/conftool/dbconfig/20230308-074427-marostegui.json
07:42 taavi@deploy2002: Started deploy [horizon/deploy@9d02cd6]: updating wmf-sudo-dashboard
07:36 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P45397 and previous config saved to /var/cache/conftool/dbconfig/20230308-073633-root.json
07:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2140.codfw.wmnet with reason: Maintenance
07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2140.codfw.wmnet with reason: Maintenance
07:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1110 (T329260)', diff saved to https://phabricator.wikimedia.org/P45396 and previous config saved to /var/cache/conftool/dbconfig/20230308-073228-marostegui.json
07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1109 T330991', diff saved to https://phabricator.wikimedia.org/P45395 and previous config saved to /var/cache/conftool/dbconfig/20230308-073110-root.json
07:30 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1126 to s8 primary T330991', diff saved to https://phabricator.wikimedia.org/P45394 and previous config saved to /var/cache/conftool/dbconfig/20230308-073005-root.json
07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1110 (T329260)', diff saved to https://phabricator.wikimedia.org/P45393 and previous config saved to /var/cache/conftool/dbconfig/20230308-072932-marostegui.json
07:29 marostegui: Starting s8 eqiad failover from db1109 to db1126 - T330991
07:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1110.eqiad.wmnet with reason: Maintenance
07:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1110.eqiad.wmnet with reason: Maintenance
07:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2107.codfw.wmnet with reason: Maintenance
07:22 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2107.codfw.wmnet with reason: Maintenance
07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db2175 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P45392 and previous config saved to /var/cache/conftool/dbconfig/20230308-072128-root.json
07:05 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1126 with weight 0 T330991', diff saved to https://phabricator.wikimedia.org/P45391 and previous config saved to /var/cache/conftool/dbconfig/20230308-070544-root.json
07:05 root@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T330991
07:05 root@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T330991
07:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2175 (T329260)', diff saved to https://phabricator.wikimedia.org/P45390 and previous config saved to /var/cache/conftool/dbconfig/20230308-070458-marostegui.json
07:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
07:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
07:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
07:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance
06:53 marostegui: Failover m3 from db1101 to db1159 - T331387
06:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2134,2160].codfw.wmnet,db[1101,1117,1159].eqiad.wmnet with reason: m3 master switchover T331387
06:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2134,2160].codfw.wmnet,db[1101,1117,1159].eqiad.wmnet with reason: m3 master switchover T331387
06:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2134,2160].codfw.wmnet,db[1101,1117,1159].eqiad.wmnet with reason: m3 master switchover T331384
06:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2134,2160].codfw.wmnet,db[1101,1117,1159].eqiad.wmnet with reason: m3 master switchover T331384
06:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2112.codfw.wmnet with reason: Maintenance
06:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2112.codfw.wmnet with reason: Maintenance
05:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T329260)', diff saved to https://phabricator.wikimedia.org/P45389 and previous config saved to /var/cache/conftool/dbconfig/20230308-055038-marostegui.json
05:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P45388 and previous config saved to /var/cache/conftool/dbconfig/20230308-053531-marostegui.json
05:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P45387 and previous config saved to /var/cache/conftool/dbconfig/20230308-052024-marostegui.json
05:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T329260)', diff saved to https://phabricator.wikimedia.org/P45386 and previous config saved to /var/cache/conftool/dbconfig/20230308-050517-marostegui.json
04:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2175 (T329260)', diff saved to https://phabricator.wikimedia.org/P45385 and previous config saved to /var/cache/conftool/dbconfig/20230308-040451-marostegui.json
04:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
04:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2175.codfw.wmnet with reason: Maintenance
04:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45384 and previous config saved to /var/cache/conftool/dbconfig/20230308-040430-marostegui.json
03:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P45383 and previous config saved to /var/cache/conftool/dbconfig/20230308-034923-marostegui.json
03:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312', diff saved to https://phabricator.wikimedia.org/P45382 and previous config saved to /var/cache/conftool/dbconfig/20230308-033416-marostegui.json
03:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45381 and previous config saved to /var/cache/conftool/dbconfig/20230308-031910-marostegui.json
03:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45380 and previous config saved to /var/cache/conftool/dbconfig/20230308-031257-marostegui.json
03:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
03:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
03:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T329260)', diff saved to https://phabricator.wikimedia.org/P45379 and previous config saved to /var/cache/conftool/dbconfig/20230308-031246-marostegui.json
02:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P45378 and previous config saved to /var/cache/conftool/dbconfig/20230308-025739-marostegui.json
02:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T329203)', diff saved to https://phabricator.wikimedia.org/P45377 and previous config saved to /var/cache/conftool/dbconfig/20230308-024536-marostegui.json
02:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P45376 and previous config saved to /var/cache/conftool/dbconfig/20230308-024233-marostegui.json
02:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P45375 and previous config saved to /var/cache/conftool/dbconfig/20230308-023029-marostegui.json
02:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T329260)', diff saved to https://phabricator.wikimedia.org/P45374 and previous config saved to /var/cache/conftool/dbconfig/20230308-022726-marostegui.json
02:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2148 (T329260)', diff saved to https://phabricator.wikimedia.org/P45373 and previous config saved to /var/cache/conftool/dbconfig/20230308-022116-marostegui.json
02:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
02:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2148.codfw.wmnet with reason: Maintenance
02:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45372 and previous config saved to /var/cache/conftool/dbconfig/20230308-022054-marostegui.json
02:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P45371 and previous config saved to /var/cache/conftool/dbconfig/20230308-021523-marostegui.json
02:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P45370 and previous config saved to /var/cache/conftool/dbconfig/20230308-020547-marostegui.json
02:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T329203)', diff saved to https://phabricator.wikimedia.org/P45369 and previous config saved to /var/cache/conftool/dbconfig/20230308-020016-marostegui.json
01:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T328817)', diff saved to https://phabricator.wikimedia.org/P45368 and previous config saved to /var/cache/conftool/dbconfig/20230308-015921-marostegui.json
01:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312', diff saved to https://phabricator.wikimedia.org/P45367 and previous config saved to /var/cache/conftool/dbconfig/20230308-015040-marostegui.json
01:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2176 (T329203)', diff saved to https://phabricator.wikimedia.org/P45366 and previous config saved to /var/cache/conftool/dbconfig/20230308-014659-marostegui.json
01:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
01:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance
01:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T329203)', diff saved to https://phabricator.wikimedia.org/P45365 and previous config saved to /var/cache/conftool/dbconfig/20230308-014637-marostegui.json
01:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P45364 and previous config saved to /var/cache/conftool/dbconfig/20230308-014415-marostegui.json
01:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45363 and previous config saved to /var/cache/conftool/dbconfig/20230308-013534-marostegui.json
01:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P45362 and previous config saved to /var/cache/conftool/dbconfig/20230308-013131-marostegui.json
01:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45361 and previous config saved to /var/cache/conftool/dbconfig/20230308-012918-marostegui.json
01:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P45360 and previous config saved to /var/cache/conftool/dbconfig/20230308-012908-marostegui.json
01:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
01:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
01:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T329260)', diff saved to https://phabricator.wikimedia.org/P45359 and previous config saved to /var/cache/conftool/dbconfig/20230308-012901-marostegui.json
01:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P45358 and previous config saved to /var/cache/conftool/dbconfig/20230308-011624-marostegui.json
01:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T328817)', diff saved to https://phabricator.wikimedia.org/P45357 and previous config saved to /var/cache/conftool/dbconfig/20230308-011401-marostegui.json
01:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P45356 and previous config saved to /var/cache/conftool/dbconfig/20230308-011354-marostegui.json
01:08 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host ncredir1002.eqiad.wmnet with OS bullseye
01:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T328817)', diff saved to https://phabricator.wikimedia.org/P45355 and previous config saved to /var/cache/conftool/dbconfig/20230308-010321-marostegui.json
01:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
01:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2179.codfw.wmnet with reason: Maintenance
01:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T328817)', diff saved to https://phabricator.wikimedia.org/P45354 and previous config saved to /var/cache/conftool/dbconfig/20230308-010300-marostegui.json
01:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174 (T329203)', diff saved to https://phabricator.wikimedia.org/P45353 and previous config saved to /var/cache/conftool/dbconfig/20230308-010117-marostegui.json
00:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P45352 and previous config saved to /var/cache/conftool/dbconfig/20230308-005848-marostegui.json
00:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir1002.eqiad.wmnet with reason: host reimage
00:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir1002.eqiad.wmnet with reason: host reimage
00:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P45351 and previous config saved to /var/cache/conftool/dbconfig/20230308-004753-marostegui.json
00:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2174 (T329203)', diff saved to https://phabricator.wikimedia.org/P45350 and previous config saved to /var/cache/conftool/dbconfig/20230308-004744-marostegui.json
00:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
00:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance
00:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T329203)', diff saved to https://phabricator.wikimedia.org/P45349 and previous config saved to /var/cache/conftool/dbconfig/20230308-004722-marostegui.json
00:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126 (T329260)', diff saved to https://phabricator.wikimedia.org/P45348 and previous config saved to /var/cache/conftool/dbconfig/20230308-004341-marostegui.json
00:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2126 (T329260)', diff saved to https://phabricator.wikimedia.org/P45347 and previous config saved to /var/cache/conftool/dbconfig/20230308-004115-marostegui.json
00:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
00:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
00:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2126.codfw.wmnet with reason: Maintenance
00:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2126.codfw.wmnet with reason: Maintenance
00:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T329260)', diff saved to https://phabricator.wikimedia.org/P45346 and previous config saved to /var/cache/conftool/dbconfig/20230308-004049-marostegui.json
00:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P45345 and previous config saved to /var/cache/conftool/dbconfig/20230308-003240-marostegui.json
00:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P45344 and previous config saved to /var/cache/conftool/dbconfig/20230308-003216-marostegui.json
00:32 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host ncredir1002.eqiad.wmnet with OS bullseye
00:29 brett@cumin2002: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host ncredir1002.eqiad.wmnet with OS bullseye
00:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P45343 and previous config saved to /var/cache/conftool/dbconfig/20230308-002543-marostegui.json
00:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T328817)', diff saved to https://phabricator.wikimedia.org/P45342 and previous config saved to /var/cache/conftool/dbconfig/20230308-001734-marostegui.json
00:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P45341 and previous config saved to /var/cache/conftool/dbconfig/20230308-001709-marostegui.json
00:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125', diff saved to https://phabricator.wikimedia.org/P45340 and previous config saved to /var/cache/conftool/dbconfig/20230308-001036-marostegui.json
00:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T328817)', diff saved to https://phabricator.wikimedia.org/P45339 and previous config saved to /var/cache/conftool/dbconfig/20230308-000538-marostegui.json
00:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
00:05 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2172.codfw.wmnet with reason: Maintenance
00:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T328817)', diff saved to https://phabricator.wikimedia.org/P45338 and previous config saved to /var/cache/conftool/dbconfig/20230308-000516-marostegui.json
00:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2173 (T329203)', diff saved to https://phabricator.wikimedia.org/P45337 and previous config saved to /var/cache/conftool/dbconfig/20230308-000203-marostegui.json
2023-03-07
23:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2125 (T329260)', diff saved to https://phabricator.wikimedia.org/P45336 and previous config saved to /var/cache/conftool/dbconfig/20230307-235529-marostegui.json
23:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P45335 and previous config saved to /var/cache/conftool/dbconfig/20230307-235010-marostegui.json
23:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2125 (T329260)', diff saved to https://phabricator.wikimedia.org/P45334 and previous config saved to /var/cache/conftool/dbconfig/20230307-234858-marostegui.json
23:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2125.codfw.wmnet with reason: Maintenance
23:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2125.codfw.wmnet with reason: Maintenance
23:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T329260)', diff saved to https://phabricator.wikimedia.org/P45333 and previous config saved to /var/cache/conftool/dbconfig/20230307-234837-marostegui.json
23:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2173 (T329203)', diff saved to https://phabricator.wikimedia.org/P45332 and previous config saved to /var/cache/conftool/dbconfig/20230307-234741-marostegui.json
23:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
23:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
23:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
23:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2173.codfw.wmnet with reason: Maintenance
23:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T329203)', diff saved to https://phabricator.wikimedia.org/P45331 and previous config saved to /var/cache/conftool/dbconfig/20230307-234715-marostegui.json
23:40 ryankemper@deploy2002: Finished deploy [airflow-dags/search@3419b7d]: initial deployment to new search platform airflow 2 instance - ryankemper (duration: 00m 15s)
23:39 ryankemper@deploy2002: Started deploy [airflow-dags/search@3419b7d]: initial deployment to new search platform airflow 2 instance - ryankemper
23:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P45329 and previous config saved to /var/cache/conftool/dbconfig/20230307-233503-marostegui.json
23:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P45328 and previous config saved to /var/cache/conftool/dbconfig/20230307-233330-marostegui.json
23:32 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host ncredir1002.eqiad.wmnet with OS bullseye
23:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P45327 and previous config saved to /var/cache/conftool/dbconfig/20230307-233209-marostegui.json
23:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T328817)', diff saved to https://phabricator.wikimedia.org/P45326 and previous config saved to /var/cache/conftool/dbconfig/20230307-231957-marostegui.json
23:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104', diff saved to https://phabricator.wikimedia.org/P45325 and previous config saved to /var/cache/conftool/dbconfig/20230307-231824-marostegui.json
23:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P45324 and previous config saved to /var/cache/conftool/dbconfig/20230307-231702-marostegui.json
23:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2104 (T329260)', diff saved to https://phabricator.wikimedia.org/P45323 and previous config saved to /var/cache/conftool/dbconfig/20230307-230317-marostegui.json
23:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T329203)', diff saved to https://phabricator.wikimedia.org/P45322 and previous config saved to /var/cache/conftool/dbconfig/20230307-230156-marostegui.json
22:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2104 (T329260)', diff saved to https://phabricator.wikimedia.org/P45321 and previous config saved to /var/cache/conftool/dbconfig/20230307-225951-marostegui.json
22:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2104.codfw.wmnet with reason: Maintenance
22:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2104.codfw.wmnet with reason: Maintenance
22:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
22:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2097.codfw.wmnet with reason: Maintenance
22:54 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host ncredir2002.codfw.wmnet with OS bullseye
22:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
22:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
22:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T329260)', diff saved to https://phabricator.wikimedia.org/P45319 and previous config saved to /var/cache/conftool/dbconfig/20230307-225110-marostegui.json
22:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 (T329203)', diff saved to https://phabricator.wikimedia.org/P45318 and previous config saved to /var/cache/conftool/dbconfig/20230307-224803-marostegui.json
22:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
22:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2170.codfw.wmnet with reason: Maintenance
22:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T329203)', diff saved to https://phabricator.wikimedia.org/P45317 and previous config saved to /var/cache/conftool/dbconfig/20230307-224742-marostegui.json
22:44 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host ncredir1001.eqiad.wmnet with OS bullseye
22:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir2002.codfw.wmnet with reason: host reimage
22:36 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir2002.codfw.wmnet with reason: host reimage
22:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P45316 and previous config saved to /var/cache/conftool/dbconfig/20230307-223603-marostegui.json
22:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P45315 and previous config saved to /var/cache/conftool/dbconfig/20230307-223235-marostegui.json
22:31 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir1001.eqiad.wmnet with reason: host reimage
22:26 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir1001.eqiad.wmnet with reason: host reimage
22:26 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host ncredir2002.codfw.wmnet with OS bullseye
22:23 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host ncredir2001.codfw.wmnet with OS bullseye
22:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P45314 and previous config saved to /var/cache/conftool/dbconfig/20230307-222056-marostegui.json
22:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T328817)', diff saved to https://phabricator.wikimedia.org/P45313 and previous config saved to /var/cache/conftool/dbconfig/20230307-221931-marostegui.json
22:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
22:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
22:19 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
22:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2155.codfw.wmnet with reason: Maintenance
22:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T328817)', diff saved to https://phabricator.wikimedia.org/P45312 and previous config saved to /var/cache/conftool/dbconfig/20230307-221854-marostegui.json
22:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P45311 and previous config saved to /var/cache/conftool/dbconfig/20230307-221729-marostegui.json
22:14 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host ncredir1001.eqiad.wmnet with OS bullseye
22:13 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host ncredir4002.ulsfo.wmnet with OS bullseye
22:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir2001.codfw.wmnet with reason: host reimage
22:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir2001.codfw.wmnet with reason: host reimage
22:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T329260)', diff saved to https://phabricator.wikimedia.org/P45310 and previous config saved to /var/cache/conftool/dbconfig/20230307-220550-marostegui.json
22:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1197 (T329260)', diff saved to https://phabricator.wikimedia.org/P45309 and previous config saved to /var/cache/conftool/dbconfig/20230307-220438-marostegui.json
22:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
22:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1197.eqiad.wmnet with reason: Maintenance
22:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T329260)', diff saved to https://phabricator.wikimedia.org/P45308 and previous config saved to /var/cache/conftool/dbconfig/20230307-220416-marostegui.json
22:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P45307 and previous config saved to /var/cache/conftool/dbconfig/20230307-220348-marostegui.json
22:03 mforns@deploy2002: Started deploy [airflow-dags/analytics@9fba86b]: (no justification provided)
22:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T329203)', diff saved to https://phabricator.wikimedia.org/P45306 and previous config saved to /var/cache/conftool/dbconfig/20230307-220222-marostegui.json
21:59 sukhe@cumin2002: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host durum6002.drmrs.wmnet with OS bullseye
21:58 inflatador: bking@cumin2002 depool elastic row D hosts to prepare for T322082
21:57 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 7 hosts with reason: re-rack
21:56 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 7 hosts with reason: re-rack
21:56 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host ncredir2001.codfw.wmnet with OS bullseye
21:54 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host ncredir3002.esams.wmnet with OS bullseye
21:52 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4002.ulsfo.wmnet with reason: host reimage
21:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P45305 and previous config saved to /var/cache/conftool/dbconfig/20230307-214910-marostegui.json
21:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P45304 and previous config saved to /var/cache/conftool/dbconfig/20230307-214841-marostegui.json
21:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 (T329203)', diff saved to https://phabricator.wikimedia.org/P45303 and previous config saved to /var/cache/conftool/dbconfig/20230307-214824-marostegui.json
21:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
21:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
21:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T329203)', diff saved to https://phabricator.wikimedia.org/P45302 and previous config saved to /var/cache/conftool/dbconfig/20230307-214802-marostegui.json
21:45 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum6002.drmrs.wmnet with reason: host reimage
21:43 TheresNoTime: close UTC late backport window
21:42 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum6002.drmrs.wmnet with reason: host reimage
21:41 inflatador: bking@cumin2002 ban elastic row D hosts to prepare for T322082
21:40 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2073.codfw.wmnet with OS bullseye
21:40 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
21:39 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host ncredir4002.ulsfo.wmnet with OS bullseye
21:37 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host ncredir4001.ulsfo.wmnet with OS bullseye
21:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir3002.esams.wmnet with reason: host reimage
21:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P45301 and previous config saved to /var/cache/conftool/dbconfig/20230307-213403-marostegui.json
21:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T328817)', diff saved to https://phabricator.wikimedia.org/P45300 and previous config saved to /var/cache/conftool/dbconfig/20230307-213334-marostegui.json
21:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P45299 and previous config saved to /var/cache/conftool/dbconfig/20230307-213256-marostegui.json
21:32 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir3002.esams.wmnet with reason: host reimage
21:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T328817)', diff saved to https://phabricator.wikimedia.org/P45298 and previous config saved to /var/cache/conftool/dbconfig/20230307-212138-marostegui.json
21:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
21:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2147.codfw.wmnet with reason: Maintenance
21:20 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.26 refs T330204
21:19 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4001.ulsfo.wmnet with reason: host reimage
21:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T329260)', diff saved to https://phabricator.wikimedia.org/P45297 and previous config saved to /var/cache/conftool/dbconfig/20230307-211857-marostegui.json
21:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P45296 and previous config saved to /var/cache/conftool/dbconfig/20230307-211749-marostegui.json
21:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1188 (T329260)', diff saved to https://phabricator.wikimedia.org/P45295 and previous config saved to /var/cache/conftool/dbconfig/20230307-211744-marostegui.json
21:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
21:17 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host ncredir3002.esams.wmnet with OS bullseye
21:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1188.eqiad.wmnet with reason: Maintenance
21:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T329260)', diff saved to https://phabricator.wikimedia.org/P45294 and previous config saved to /var/cache/conftool/dbconfig/20230307-211723-marostegui.json
21:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4001.ulsfo.wmnet with reason: host reimage
21:15 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host ncredir3001.esams.wmnet with OS bullseye
21:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
21:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
21:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T328817)', diff saved to https://phabricator.wikimedia.org/P45293 and previous config saved to /var/cache/conftool/dbconfig/20230307-211159-marostegui.json
21:10 bblack: lvs500[45]: re-enabling/pooling, back to normal flow
21:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T329203)', diff saved to https://phabricator.wikimedia.org/P45292 and previous config saved to /var/cache/conftool/dbconfig/20230307-210243-marostegui.json
21:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P45291 and previous config saved to /var/cache/conftool/dbconfig/20230307-210216-marostegui.json
20:58 ebernhardson@deploy2002: Finished deploy [airflow-dags/search@9924c93]: test deploy new airflow instance (duration: 02m 03s)
20:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P45290 and previous config saved to /var/cache/conftool/dbconfig/20230307-205653-marostegui.json
20:56 ebernhardson@deploy2002: Started deploy [airflow-dags/search@9924c93]: test deploy new airflow instance
20:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir3001.esams.wmnet with reason: host reimage
20:56 ebernhardson@deploy2002: deploy aborted: test deploy new airflow instance (duration: 00m 01s)
20:56 ebernhardson@deploy2002: Started deploy [airflow-dags/search@9924c93]: test deploy new airflow instance
20:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
20:52 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir3001.esams.wmnet with reason: host reimage
20:50 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
20:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2153 (T329203)', diff saved to https://phabricator.wikimedia.org/P45289 and previous config saved to /var/cache/conftool/dbconfig/20230307-204925-marostegui.json
20:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
20:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2153.codfw.wmnet with reason: Maintenance
20:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T329203)', diff saved to https://phabricator.wikimedia.org/P45288 and previous config saved to /var/cache/conftool/dbconfig/20230307-204904-marostegui.json
20:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P45287 and previous config saved to /var/cache/conftool/dbconfig/20230307-204710-marostegui.json
20:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314', diff saved to https://phabricator.wikimedia.org/P45286 and previous config saved to /var/cache/conftool/dbconfig/20230307-204146-marostegui.json
20:35 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host ncredir3001.esams.wmnet with OS bullseye
20:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P45284 and previous config saved to /var/cache/conftool/dbconfig/20230307-203357-marostegui.json
20:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T329260)', diff saved to https://phabricator.wikimedia.org/P45283 and previous config saved to /var/cache/conftool/dbconfig/20230307-203203-marostegui.json
20:30 ebernhardson@deploy2002: deploy aborted: test deploy new airflow instance (duration: 00m 02s)
20:30 ebernhardson@deploy2002: Started deploy [airflow-dags/search@9924c93]: test deploy new airflow instance
20:30 ebernhardson@deploy2002: Finished deploy [wikimedia/discovery/analytics@c8dc6d5]: test deploy old airflow instance (duration: 00m 05s)
20:29 ebernhardson@deploy2002: Started deploy [wikimedia/discovery/analytics@c8dc6d5]: test deploy old airflow instance
20:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2073.codfw.wmnet with OS bullseye
20:27 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1040.eqiad.wmnet with OS bullseye
20:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1182 (T329260)', diff saved to https://phabricator.wikimedia.org/P45282 and previous config saved to /var/cache/conftool/dbconfig/20230307-202713-marostegui.json
20:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
20:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1182.eqiad.wmnet with reason: Maintenance
20:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45281 and previous config saved to /var/cache/conftool/dbconfig/20230307-202652-marostegui.json
20:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T328817)', diff saved to https://phabricator.wikimedia.org/P45280 and previous config saved to /var/cache/conftool/dbconfig/20230307-202640-marostegui.json
20:24 jhuneidi@deploy2002: Started scap: testwikis wikis to 1.40.0-wmf.26 refs T330204
20:19 brett@cumin2002: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host ncredir6002.drmrs.wmnet with OS bullseye
20:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P45279 and previous config saved to /var/cache/conftool/dbconfig/20230307-201851-marostegui.json
20:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2138.codfw.wmnet with reason: Maintenance
20:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T328817)', diff saved to https://phabricator.wikimedia.org/P45276 and previous config saved to /var/cache/conftool/dbconfig/20230307-201353-marostegui.json
20:12 ebernhardson@deploy2002: Started deploy [airflow-dags/search@9924c93]: initial deployment to search platform airflow 2 instance
20:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P45274 and previous config saved to /var/cache/conftool/dbconfig/20230307-201145-marostegui.json
20:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T329203)', diff saved to https://phabricator.wikimedia.org/P45273 and previous config saved to /var/cache/conftool/dbconfig/20230307-200344-marostegui.json
20:01 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir6002.drmrs.wmnet with reason: host reimage
19:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P45272 and previous config saved to /var/cache/conftool/dbconfig/20230307-195846-marostegui.json
19:57 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir6002.drmrs.wmnet with reason: host reimage
19:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312', diff saved to https://phabricator.wikimedia.org/P45270 and previous config saved to /var/cache/conftool/dbconfig/20230307-195639-marostegui.json
19:51 sukhe@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host durum5002.eqsin.wmnet with OS bullseye
19:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T329203)', diff saved to https://phabricator.wikimedia.org/P45268 and previous config saved to /var/cache/conftool/dbconfig/20230307-194934-marostegui.json
19:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
19:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2146.codfw.wmnet with reason: Maintenance
19:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T329203)', diff saved to https://phabricator.wikimedia.org/P45267 and previous config saved to /var/cache/conftool/dbconfig/20230307-194913-marostegui.json
19:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P45266 and previous config saved to /var/cache/conftool/dbconfig/20230307-194340-marostegui.json
19:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45265 and previous config saved to /var/cache/conftool/dbconfig/20230307-194132-marostegui.json
19:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1170:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45264 and previous config saved to /var/cache/conftool/dbconfig/20230307-193639-marostegui.json
19:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
19:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1170.eqiad.wmnet with reason: Maintenance
19:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T329260)', diff saved to https://phabricator.wikimedia.org/P45263 and previous config saved to /var/cache/conftool/dbconfig/20230307-193617-marostegui.json
19:35 sukhe@cumin2002: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host durum6001.drmrs.wmnet with OS bullseye
19:35 sukhe@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host durum4002.ulsfo.wmnet with OS bullseye
19:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P45262 and previous config saved to /var/cache/conftool/dbconfig/20230307-193406-marostegui.json
19:32 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum5002.eqsin.wmnet with reason: host reimage
19:31 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1040.eqiad.wmnet with OS bullseye
19:29 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5002.eqsin.wmnet with reason: host reimage
19:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T328817)', diff saved to https://phabricator.wikimedia.org/P45261 and previous config saved to /var/cache/conftool/dbconfig/20230307-192833-marostegui.json
19:22 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
19:21 brett@cumin2002: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host ncredir6001.drmrs.wmnet with OS bullseye
19:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P45260 and previous config saved to /var/cache/conftool/dbconfig/20230307-192111-marostegui.json
19:19 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4002.ulsfo.wmnet with reason: host reimage
19:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P45259 and previous config saved to /var/cache/conftool/dbconfig/20230307-191900-marostegui.json
19:17 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
19:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T328817)', diff saved to https://phabricator.wikimedia.org/P45258 and previous config saved to /var/cache/conftool/dbconfig/20230307-191717-marostegui.json
19:17 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4002.ulsfo.wmnet with reason: host reimage
19:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
19:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2137.codfw.wmnet with reason: Maintenance
19:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T328817)', diff saved to https://phabricator.wikimedia.org/P45257 and previous config saved to /var/cache/conftool/dbconfig/20230307-191656-marostegui.json
19:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2072.codfw.wmnet with OS bullseye
19:13 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
19:08 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1035.eqiad.wmnet with OS bullseye
19:06 sukhe@cumin2002: START - Cookbook sre.ganeti.reimage for host durum5002.eqsin.wmnet with OS bullseye
19:06 sukhe@cumin2002: START - Cookbook sre.ganeti.reimage for host durum4002.ulsfo.wmnet with OS bullseye
19:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P45256 and previous config saved to /var/cache/conftool/dbconfig/20230307-190604-marostegui.json
19:04 sukhe@cumin2002: START - Cookbook sre.ganeti.reimage for host durum6001.drmrs.wmnet with OS bullseye
19:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T329203)', diff saved to https://phabricator.wikimedia.org/P45255 and previous config saved to /var/cache/conftool/dbconfig/20230307-190353-marostegui.json
19:01 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir6001.drmrs.wmnet with reason: host reimage
19:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P45254 and previous config saved to /var/cache/conftool/dbconfig/20230307-190149-marostegui.json
19:01 sukhe@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host durum5001.eqsin.wmnet with OS bullseye
18:57 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir6001.drmrs.wmnet with reason: host reimage
18:56 sukhe@cumin2002: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host durum6001.drmrs.wmnet with OS bullseye
18:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T329260)', diff saved to https://phabricator.wikimedia.org/P45253 and previous config saved to /var/cache/conftool/dbconfig/20230307-185058-marostegui.json
18:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
18:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2145 (T329203)', diff saved to https://phabricator.wikimedia.org/P45252 and previous config saved to /var/cache/conftool/dbconfig/20230307-184907-marostegui.json
18:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
18:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2145.codfw.wmnet with reason: Maintenance
18:48 sukhe@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host durum4001.ulsfo.wmnet with OS bullseye
18:47 jhuneidi@deploy2002: Started scap: testwikis wikis to 1.40.0-wmf.26 refs T330204
18:46 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
18:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P45251 and previous config saved to /var/cache/conftool/dbconfig/20230307-184642-marostegui.json
18:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T329260)', diff saved to https://phabricator.wikimedia.org/P45250 and previous config saved to /var/cache/conftool/dbconfig/20230307-184506-marostegui.json
18:45 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
18:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
18:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
18:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1156.eqiad.wmnet with reason: Maintenance
18:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45249 and previous config saved to /var/cache/conftool/dbconfig/20230307-184428-marostegui.json
18:43 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum5001.eqsin.wmnet with reason: host reimage
18:40 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
18:39 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host ncredir6001.drmrs.wmnet with OS bullseye
18:39 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum5001.eqsin.wmnet with reason: host reimage
18:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
18:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
18:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T329203)', diff saved to https://phabricator.wikimedia.org/P45248 and previous config saved to /var/cache/conftool/dbconfig/20230307-183810-marostegui.json
18:37 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum6001.drmrs.wmnet with reason: host reimage
18:35 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4001.ulsfo.wmnet with reason: host reimage
18:32 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4001.ulsfo.wmnet with reason: host reimage
18:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T328817)', diff saved to https://phabricator.wikimedia.org/P45247 and previous config saved to /var/cache/conftool/dbconfig/20230307-183136-marostegui.json
18:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P45246 and previous config saved to /var/cache/conftool/dbconfig/20230307-182921-marostegui.json
18:29 dancy: dancy@deploy2002: Fixing up /srv/mediawiki-staging/.git permissions
18:26 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2072.codfw.wmnet with OS bullseye
18:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2071.codfw.wmnet with OS bullseye
18:26 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"
18:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P45245 and previous config saved to /var/cache/conftool/dbconfig/20230307-182304-marostegui.json
18:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T328817)', diff saved to https://phabricator.wikimedia.org/P45244 and previous config saved to /var/cache/conftool/dbconfig/20230307-182035-marostegui.json
18:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2136.codfw.wmnet with reason: Maintenance
18:20 sukhe@cumin2002: START - Cookbook sre.ganeti.reimage for host durum6001.drmrs.wmnet with OS bullseye
18:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2136.codfw.wmnet with reason: Maintenance
18:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T328817)', diff saved to https://phabricator.wikimedia.org/P45243 and previous config saved to /var/cache/conftool/dbconfig/20230307-182013-marostegui.json
18:19 sukhe@cumin2002: START - Cookbook sre.ganeti.reimage for host durum5001.eqsin.wmnet with OS bullseye
18:18 sukhe@cumin2002: START - Cookbook sre.ganeti.reimage for host durum4001.ulsfo.wmnet with OS bullseye
18:17 sukhe@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host durum3002.esams.wmnet with OS bullseye
18:16 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host ncredir5002.eqsin.wmnet with OS bullseye
18:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P45242 and previous config saved to /var/cache/conftool/dbconfig/20230307-181414-marostegui.json
18:12 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1035.eqiad.wmnet with OS bullseye
18:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P45241 and previous config saved to /var/cache/conftool/dbconfig/20230307-180757-marostegui.json
18:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P45240 and previous config saved to /var/cache/conftool/dbconfig/20230307-180506-marostegui.json
18:01 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum3002.esams.wmnet with reason: host reimage
17:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45239 and previous config saved to /var/cache/conftool/dbconfig/20230307-175907-marostegui.json
17:57 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum3002.esams.wmnet with reason: host reimage
17:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45238 and previous config saved to /var/cache/conftool/dbconfig/20230307-175314-marostegui.json
17:53 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1146.eqiad.wmnet with reason: Maintenance
17:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1146.eqiad.wmnet with reason: Maintenance
17:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T329203)', diff saved to https://phabricator.wikimedia.org/P45237 and previous config saved to /var/cache/conftool/dbconfig/20230307-175251-marostegui.json
17:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119', diff saved to https://phabricator.wikimedia.org/P45236 and previous config saved to /var/cache/conftool/dbconfig/20230307-175000-marostegui.json
17:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
17:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1139.eqiad.wmnet with reason: Maintenance
17:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T329260)', diff saved to https://phabricator.wikimedia.org/P45235 and previous config saved to /var/cache/conftool/dbconfig/20230307-174848-marostegui.json
17:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir5002.eqsin.wmnet with reason: host reimage
17:47 volans@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
17:44 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir5002.eqsin.wmnet with reason: host reimage
17:40 sukhe@cumin2002: START - Cookbook sre.ganeti.reimage for host durum3002.esams.wmnet with OS bullseye
17:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2130 (T329203)', diff saved to https://phabricator.wikimedia.org/P45234 and previous config saved to /var/cache/conftool/dbconfig/20230307-173923-marostegui.json
17:39 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
17:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2130.codfw.wmnet with reason: Maintenance
17:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T329203)', diff saved to https://phabricator.wikimedia.org/P45233 and previous config saved to /var/cache/conftool/dbconfig/20230307-173901-marostegui.json
17:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T328817)', diff saved to https://phabricator.wikimedia.org/P45232 and previous config saved to /var/cache/conftool/dbconfig/20230307-173453-marostegui.json
17:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P45231 and previous config saved to /var/cache/conftool/dbconfig/20230307-173341-marostegui.json
17:31 sukhe@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host durum3001.esams.wmnet with OS bullseye
17:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P45229 and previous config saved to /var/cache/conftool/dbconfig/20230307-172354-marostegui.json
17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T328817)', diff saved to https://phabricator.wikimedia.org/P45230 and previous config saved to /var/cache/conftool/dbconfig/20230307-172354-marostegui.json
17:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2119.codfw.wmnet with reason: Maintenance
17:23 sukhe@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host durum2002.codfw.wmnet with OS bullseye
17:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2119.codfw.wmnet with reason: Maintenance
17:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T328817)', diff saved to https://phabricator.wikimedia.org/P45228 and previous config saved to /var/cache/conftool/dbconfig/20230307-172333-marostegui.json
17:22 brett@cumin2002: START - Cookbook sre.ganeti.reimage for host ncredir5002.eqsin.wmnet with OS bullseye
17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P45227 and previous config saved to /var/cache/conftool/dbconfig/20230307-171834-marostegui.json
17:15 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum3001.esams.wmnet with reason: host reimage
17:12 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum3001.esams.wmnet with reason: host reimage
17:09 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum2002.codfw.wmnet with reason: host reimage
17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P45226 and previous config saved to /var/cache/conftool/dbconfig/20230307-170848-marostegui.json
17:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P45225 and previous config saved to /var/cache/conftool/dbconfig/20230307-170826-marostegui.json
17:06 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum2002.codfw.wmnet with reason: host reimage
17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T329260)', diff saved to https://phabricator.wikimedia.org/P45224 and previous config saved to /var/cache/conftool/dbconfig/20230307-170328-marostegui.json
17:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T329260)', diff saved to https://phabricator.wikimedia.org/P45223 and previous config saved to /var/cache/conftool/dbconfig/20230307-170215-marostegui.json
17:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1129.eqiad.wmnet with reason: Maintenance
17:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1129.eqiad.wmnet with reason: Maintenance
17:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 (T329260)', diff saved to https://phabricator.wikimedia.org/P45222 and previous config saved to /var/cache/conftool/dbconfig/20230307-170154-marostegui.json
16:57 sukhe@cumin2002: START - Cookbook sre.ganeti.reimage for host durum3001.esams.wmnet with OS bullseye
16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T329203)', diff saved to https://phabricator.wikimedia.org/P45221 and previous config saved to /var/cache/conftool/dbconfig/20230307-165340-marostegui.json
16:53 sukhe@cumin2002: START - Cookbook sre.ganeti.reimage for host durum2002.codfw.wmnet with OS bullseye
16:53 xcollazo@deploy2002: Started deploy [airflow-dags/platform_eng@9924c93]: (no justification provided)
16:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110', diff saved to https://phabricator.wikimedia.org/P45220 and previous config saved to /var/cache/conftool/dbconfig/20230307-165319-marostegui.json
16:52 sukhe@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host durum2001.codfw.wmnet with OS bullseye
16:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
16:47 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
16:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P45219 and previous config saved to /var/cache/conftool/dbconfig/20230307-164647-marostegui.json
16:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2116 (T329203)', diff saved to https://phabricator.wikimedia.org/P45218 and previous config saved to /var/cache/conftool/dbconfig/20230307-164010-marostegui.json
16:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
16:39 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum2001.codfw.wmnet with reason: host reimage
16:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2116.codfw.wmnet with reason: Maintenance
16:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T329203)', diff saved to https://phabricator.wikimedia.org/P45217 and previous config saved to /var/cache/conftool/dbconfig/20230307-163948-marostegui.json
16:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2110 (T328817)', diff saved to https://phabricator.wikimedia.org/P45216 and previous config saved to /var/cache/conftool/dbconfig/20230307-163813-marostegui.json
16:36 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum2001.codfw.wmnet with reason: host reimage
16:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P45215 and previous config saved to /var/cache/conftool/dbconfig/20230307-163140-marostegui.json
16:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2110 (T328817)', diff saved to https://phabricator.wikimedia.org/P45214 and previous config saved to /var/cache/conftool/dbconfig/20230307-162616-marostegui.json
16:26 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-eqiad cluster: Roll restart of jvm daemons.
16:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
16:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2110.codfw.wmnet with reason: Maintenance
16:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T328817)', diff saved to https://phabricator.wikimedia.org/P45213 and previous config saved to /var/cache/conftool/dbconfig/20230307-162554-marostegui.json
16:25 sukhe@cumin2002: START - Cookbook sre.ganeti.reimage for host durum2001.codfw.wmnet with OS bullseye
16:25 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2071.codfw.wmnet with OS bullseye
16:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P45212 and previous config saved to /var/cache/conftool/dbconfig/20230307-162442-marostegui.json
16:21 brett@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host ncredir5001.eqsin.wmnet with OS bullseye
16:17 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1037']
16:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122 (T329260)', diff saved to https://phabricator.wikimedia.org/P45211 and previous config saved to /var/cache/conftool/dbconfig/20230307-161634-marostegui.json
16:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1122 (T329260)', diff saved to https://phabricator.wikimedia.org/P45210 and previous config saved to /var/cache/conftool/dbconfig/20230307-161132-marostegui.json
16:11 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1122.eqiad.wmnet with reason: Maintenance
16:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1122.eqiad.wmnet with reason: Maintenance
16:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45209 and previous config saved to /var/cache/conftool/dbconfig/20230307-161111-marostegui.json
16:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P45208 and previous config saved to /var/cache/conftool/dbconfig/20230307-161047-marostegui.json
16:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P45207 and previous config saved to /var/cache/conftool/dbconfig/20230307-160935-marostegui.json
16:04 sukhe@cumin2002: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host durum1002.eqiad.wmnet with OS bullseye
16:01 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1022.eqiad.wmnet with OS bullseye
15:56 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1040']
15:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P45206 and previous config saved to /var/cache/conftool/dbconfig/20230307-155604-marostegui.json
15:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P45205 and previous config saved to /var/cache/conftool/dbconfig/20230307-155541-marostegui.json
15:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 (T329203)', diff saved to https://phabricator.wikimedia.org/P45204 and previous config saved to /var/cache/conftool/dbconfig/20230307-155428-marostegui.json
15:44 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1022.eqiad.wmnet with reason: host reimage
15:42 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1040']
15:41 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1022.eqiad.wmnet with reason: host reimage
15:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P45203 and previous config saved to /var/cache/conftool/dbconfig/20230307-154058-marostegui.json
15:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2103 (T329203)', diff saved to https://phabricator.wikimedia.org/P45202 and previous config saved to /var/cache/conftool/dbconfig/20230307-154049-marostegui.json
15:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
15:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T328817)', diff saved to https://phabricator.wikimedia.org/P45201 and previous config saved to /var/cache/conftool/dbconfig/20230307-154034-marostegui.json
15:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2103.codfw.wmnet with reason: Maintenance
15:36 sukhe@cumin2002: START - Cookbook sre.ganeti.reimage for host durum1002.eqiad.wmnet with OS bullseye
15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T329260)', diff saved to https://phabricator.wikimedia.org/P45199 and previous config saved to /var/cache/conftool/dbconfig/20230307-152545-marostegui.json
15:25 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-eqiad cluster: Roll restart of jvm daemons.
11:43 akosiaris@cumin1001: START - Cookbook sre.ganeti.reimage for host kubernetes1016.eqiad.wmnet with OS bullseye
11:42 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1014.eqiad.wmnet with OS bullseye
11:38 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1008.eqiad.wmnet with OS bullseye
11:38 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1010.eqiad.wmnet with OS bullseye
11:38 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1009.eqiad.wmnet with OS bullseye
11:37 akosiaris@cumin1001: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host kubernetes1015.eqiad.wmnet with OS bullseye
11:36 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1011.eqiad.wmnet with OS bullseye
11:33 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1012.eqiad.wmnet with OS bullseye
11:29 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1007.eqiad.wmnet with OS bullseye
11:28 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host kubernetes1005.eqiad.wmnet with OS bullseye
11:28 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kubernetes1013.eqiad.wmnet with OS bullseye
11:24 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1014.eqiad.wmnet with reason: host reimage
11:23 akosiaris@cumin1001: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host kubernetes1006.eqiad.wmnet with OS bullseye
11:21 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kubernetes1009.eqiad.wmnet with reason: host reimage
11:21 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kubernetes1008.eqiad.wmnet with reason: host reimage
11:19 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1010.eqiad.wmnet with reason: host reimage
11:19 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kubernetes1011.eqiad.wmnet with reason: host reimage
11:19 akosiaris@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kubernetes1013.eqiad.wmnet with reason: host reimage
11:17 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1012.eqiad.wmnet with reason: host reimage
11:14 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1006.eqiad.wmnet with reason: host reimage
11:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T329203)', diff saved to https://phabricator.wikimedia.org/P45193 and previous config saved to /var/cache/conftool/dbconfig/20230307-111421-marostegui.json
11:14 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1014.eqiad.wmnet with reason: host reimage
11:14 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1013.eqiad.wmnet with reason: host reimage
11:13 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1012.eqiad.wmnet with reason: host reimage
11:13 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1011.eqiad.wmnet with reason: host reimage
11:12 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1010.eqiad.wmnet with reason: host reimage
11:12 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1009.eqiad.wmnet with reason: host reimage
11:12 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1007.eqiad.wmnet with reason: host reimage
11:11 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1008.eqiad.wmnet with reason: host reimage
11:09 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1005.eqiad.wmnet with reason: host reimage
11:08 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1007.eqiad.wmnet with reason: host reimage
11:06 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1006.eqiad.wmnet with reason: host reimage
11:06 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1005.eqiad.wmnet with reason: host reimage
11:05 akosiaris@cumin1001: END (FAIL) - Cookbook sre.ganeti.reimage (exit_code=99) for host kubernetes1016.eqiad.wmnet with OS bullseye
11:00 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1014.eqiad.wmnet with OS bullseye
11:00 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1013.eqiad.wmnet with OS bullseye
10:59 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1012.eqiad.wmnet with OS bullseye
10:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P45192 and previous config saved to /var/cache/conftool/dbconfig/20230307-105914-marostegui.json
10:59 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1011.eqiad.wmnet with OS bullseye
10:59 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1010.eqiad.wmnet with OS bullseye
10:58 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1009.eqiad.wmnet with OS bullseye
10:57 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1008.eqiad.wmnet with OS bullseye
10:56 akosiaris@cumin1001: START - Cookbook sre.ganeti.reimage for host kubernetes1016.eqiad.wmnet with OS bullseye
10:55 akosiaris@cumin1001: START - Cookbook sre.ganeti.reimage for host kubernetes1015.eqiad.wmnet with OS bullseye
10:54 akosiaris@cumin1001: START - Cookbook sre.ganeti.reimage for host kubernetes1006.eqiad.wmnet with OS bullseye
10:54 akosiaris@cumin1001: START - Cookbook sre.ganeti.reimage for host kubernetes1005.eqiad.wmnet with OS bullseye
10:53 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1007.eqiad.wmnet with OS bullseye
10:51 akosiaris: manually label kubemaster1001, kubemaster1002 giving them role master T307943
10:48 arturo: apt2001: pull latest packages for thirdparty/kubeadm-k8s-1-22 buster-wikimedia (T286856)
10:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P45191 and previous config saved to /var/cache/conftool/dbconfig/20230307-104408-marostegui.json
10:39 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host kubemaster1001.eqiad.wmnet with OS bullseye
10:38 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host kubemaster1002.eqiad.wmnet with OS bullseye
10:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T329203)', diff saved to https://phabricator.wikimedia.org/P45190 and previous config saved to /var/cache/conftool/dbconfig/20230307-102901-marostegui.json
10:28 arturo: apt1001: pull latest packages for thirdparty/kubeadm-k8s-1-22 buster-wikimedia (T286856)
10:21 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubemaster1002.eqiad.wmnet with reason: host reimage
10:19 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubemaster1001.eqiad.wmnet with reason: host reimage
10:16 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubemaster1002.eqiad.wmnet with reason: host reimage
10:16 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubemaster1001.eqiad.wmnet with reason: host reimage
10:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2181 (T329203)', diff saved to https://phabricator.wikimedia.org/P45189 and previous config saved to /var/cache/conftool/dbconfig/20230307-100807-marostegui.json
10:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
10:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2181.codfw.wmnet with reason: Maintenance
10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T329203)', diff saved to https://phabricator.wikimedia.org/P45188 and previous config saved to /var/cache/conftool/dbconfig/20230307-100745-marostegui.json
10:07 akosiaris@cumin1001: START - Cookbook sre.ganeti.reimage for host kubemaster1002.eqiad.wmnet with OS bullseye
10:07 akosiaris@cumin1001: START - Cookbook sre.ganeti.reimage for host kubemaster1001.eqiad.wmnet with OS bullseye
10:05 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host kubetcd1005.eqiad.wmnet with OS bullseye
09:54 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host kubetcd1006.eqiad.wmnet with OS bullseye
09:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P45187 and previous config saved to /var/cache/conftool/dbconfig/20230307-095239-marostegui.json
09:39 akosiaris: schedule downtime for PyBal backends health on lvs1019, lvs1020
09:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318', diff saved to https://phabricator.wikimedia.org/P45186 and previous config saved to /var/cache/conftool/dbconfig/20230307-093732-marostegui.json
09:35 akosiaris@cumin1001: END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host kubetcd1004.eqiad.wmnet with OS bullseye
09:33 moritzm: installing apr-util security updates on Bullseye
09:23 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubetcd1004.eqiad.wmnet with reason: host reimage
09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3318 (T329203)', diff saved to https://phabricator.wikimedia.org/P45184 and previous config saved to /var/cache/conftool/dbconfig/20230307-092226-marostegui.json
09:21 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubetcd1006.eqiad.wmnet with reason: host reimage
09:18 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubetcd1005.eqiad.wmnet with reason: host reimage
09:16 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubetcd1006.eqiad.wmnet with reason: host reimage
09:16 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubetcd1004.eqiad.wmnet with reason: host reimage
09:16 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubetcd1005.eqiad.wmnet with reason: host reimage
09:14 moritzm: installing PHP 7.4 security updates (as packaged in Debian Bullseye, not our internal build for Buster)
09:07 akosiaris@cumin1001: START - Cookbook sre.ganeti.reimage for host kubetcd1006.eqiad.wmnet with OS bullseye
09:06 akosiaris@cumin1001: START - Cookbook sre.ganeti.reimage for host kubetcd1005.eqiad.wmnet with OS bullseye
09:06 akosiaris@cumin1001: START - Cookbook sre.ganeti.reimage for host kubetcd1004.eqiad.wmnet with OS bullseye
09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2168:3318 (T329203)', diff saved to https://phabricator.wikimedia.org/P45182 and previous config saved to /var/cache/conftool/dbconfig/20230307-090130-marostegui.json
09:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
09:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2168.codfw.wmnet with reason: Maintenance
09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T329203)', diff saved to https://phabricator.wikimedia.org/P45181 and previous config saved to /var/cache/conftool/dbconfig/20230307-090109-marostegui.json
08:51 akosiaris: T331126 Scheduled 24H downtime for all wikikube eqiad hosts and all LVS services powered by the cluster
08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P45180 and previous config saved to /var/cache/conftool/dbconfig/20230307-084602-marostegui.json
08:43 dcausse: closing the UTC morning backport window
08:42 nfraison@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-conf1003.eqiad.wmnet with OS bullseye
08:35 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1101 from dbctl T329352', diff saved to https://phabricator.wikimedia.org/P45179 and previous config saved to /var/cache/conftool/dbconfig/20230307-083542-marostegui.json
08:34 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 23 hosts with reason: Reinitialize eqiad with k8s 1.23
08:33 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 23 hosts with reason: Reinitialize eqiad with k8s 1.23
08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318', diff saved to https://phabricator.wikimedia.org/P45178 and previous config saved to /var/cache/conftool/dbconfig/20230307-083056-marostegui.json
08:28 dcausse@deploy2002: dcausse: Backport for Properly pass the page id on page moves (T331127) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
08:23 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
08:23 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
08:23 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
08:23 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
08:22 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
08:22 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
08:22 nfraison@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-conf1003.eqiad.wmnet with reason: host reimage
08:21 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
08:21 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart (exit_code=0) rolling restart_daemons on A:maps-replica-codfw
08:20 marostegui: Failover m3 from db1159 to db1101 - T331384
08:20 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
08:19 nfraison@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-conf1003.eqiad.wmnet with reason: host reimage
08:18 jmm@cumin2002: START - Cookbook sre.maps.roll-restart rolling restart_daemons on A:maps-replica-codfw
08:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2134,2160].codfw.wmnet,db[1101,1117,1159].eqiad.wmnet with reason: m3 master switchover T331384
08:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2134,2160].codfw.wmnet,db[1101,1117,1159].eqiad.wmnet with reason: m3 master switchover T331384
08:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T329203)', diff saved to https://phabricator.wikimedia.org/P45177 and previous config saved to /var/cache/conftool/dbconfig/20230307-081549-marostegui.json
08:15 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
08:12 jmm@cumin2002: END (PASS) - Cookbook sre.maps.roll-restart (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad
08:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2134,2160].codfw.wmnet,db[1101,1117,1159].eqiad.wmnet with reason: m3 master switchover T331384
08:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2134,2160].codfw.wmnet,db[1101,1117,1159].eqiad.wmnet with reason: m3 master switchover T331384
08:09 jmm@cumin2002: START - Cookbook sre.maps.roll-restart rolling restart_daemons on A:maps-replica-eqiad
08:07 nfraison@cumin1001: START - Cookbook sre.hosts.reimage for host an-conf1003.eqiad.wmnet with OS bullseye
07:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3318 (T329203)', diff saved to https://phabricator.wikimedia.org/P45176 and previous config saved to /var/cache/conftool/dbconfig/20230307-075453-marostegui.json
07:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
07:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2167.codfw.wmnet with reason: Maintenance
07:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T329203)', diff saved to https://phabricator.wikimedia.org/P45175 and previous config saved to /var/cache/conftool/dbconfig/20230307-075443-marostegui.json
07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P45174 and previous config saved to /var/cache/conftool/dbconfig/20230307-073936-marostegui.json
07:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 15 hosts with reason: Row A switch maintenance T329073
07:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 15 hosts with reason: Row A switch maintenance T329073
07:34 vgutierrez: enable haproxy systemd service unit hardening in cp4044 - T323944
07:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db[2142-2144].codfw.wmnet with reason: Row A switch maintenance T329073
07:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db[2142-2144].codfw.wmnet with reason: Row A switch maintenance T329073
07:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db[1151-1153].eqiad.wmnet with reason: Row A switch maintenance T329073
07:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db[1151-1153].eqiad.wmnet with reason: Row A switch maintenance T329073
07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1115.eqiad.wmnet with reason: Row A switch maintenance T329073
07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1115.eqiad.wmnet with reason: Row A switch maintenance T329073
07:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 6 hosts with reason: Row A switch maintenance T329073
07:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on 6 hosts with reason: Row A switch maintenance T329073
07:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101 (s7,s8) T331381', diff saved to https://phabricator.wikimedia.org/P45172 and previous config saved to /var/cache/conftool/dbconfig/20230307-072454-root.json
07:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P45171 and previous config saved to /var/cache/conftool/dbconfig/20230307-072429-marostegui.json
07:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2166 (T329203)', diff saved to https://phabricator.wikimedia.org/P45170 and previous config saved to /var/cache/conftool/dbconfig/20230307-070923-marostegui.json
06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2166 (T329203)', diff saved to https://phabricator.wikimedia.org/P45169 and previous config saved to /var/cache/conftool/dbconfig/20230307-064752-marostegui.json
06:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
06:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2166.codfw.wmnet with reason: Maintenance
06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T329203)', diff saved to https://phabricator.wikimedia.org/P45168 and previous config saved to /var/cache/conftool/dbconfig/20230307-064730-marostegui.json
06:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 34 hosts with reason: Schema change on s4 eqiad
06:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 34 hosts with reason: Schema change on s4 eqiad
06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 37 hosts with reason: Schema change on s1 eqiad
06:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 37 hosts with reason: Schema change on s1 eqiad
06:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2095.codfw.wmnet
06:36 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
06:36 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2095.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
06:34 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2095.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
06:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P45167 and previous config saved to /var/cache/conftool/dbconfig/20230307-063223-marostegui.json
06:28 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db2095.codfw.wmnet
06:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
06:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2105.codfw.wmnet with reason: Maintenance
06:20 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
06:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1157.eqiad.wmnet with reason: Maintenance
06:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P45166 and previous config saved to /var/cache/conftool/dbconfig/20230307-061717-marostegui.json
06:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
06:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2129.codfw.wmnet with reason: Maintenance
06:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
06:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1131.eqiad.wmnet with reason: Maintenance
06:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2164 (T329203)', diff saved to https://phabricator.wikimedia.org/P45165 and previous config saved to /var/cache/conftool/dbconfig/20230307-060210-marostegui.json
05:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2164 (T329203)', diff saved to https://phabricator.wikimedia.org/P45164 and previous config saved to /var/cache/conftool/dbconfig/20230307-054153-marostegui.json
05:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
05:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
05:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
05:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2164.codfw.wmnet with reason: Maintenance
05:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T329203)', diff saved to https://phabricator.wikimedia.org/P45163 and previous config saved to /var/cache/conftool/dbconfig/20230307-054127-marostegui.json
05:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P45162 and previous config saved to /var/cache/conftool/dbconfig/20230307-052620-marostegui.json
05:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P45161 and previous config saved to /var/cache/conftool/dbconfig/20230307-051113-marostegui.json
04:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2163 (T329203)', diff saved to https://phabricator.wikimedia.org/P45160 and previous config saved to /var/cache/conftool/dbconfig/20230307-045607-marostegui.json
03:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2163 (T329203)', diff saved to https://phabricator.wikimedia.org/P45159 and previous config saved to /var/cache/conftool/dbconfig/20230307-035541-marostegui.json
03:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
03:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2163.codfw.wmnet with reason: Maintenance
03:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T329203)', diff saved to https://phabricator.wikimedia.org/P45158 and previous config saved to /var/cache/conftool/dbconfig/20230307-035520-marostegui.json
03:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P45157 and previous config saved to /var/cache/conftool/dbconfig/20230307-034013-marostegui.json
03:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162', diff saved to https://phabricator.wikimedia.org/P45156 and previous config saved to /var/cache/conftool/dbconfig/20230307-032506-marostegui.json
03:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2162 (T329203)', diff saved to https://phabricator.wikimedia.org/P45155 and previous config saved to /var/cache/conftool/dbconfig/20230307-031000-marostegui.json
02:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2162 (T329203)', diff saved to https://phabricator.wikimedia.org/P45154 and previous config saved to /var/cache/conftool/dbconfig/20230307-024912-marostegui.json
02:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2162.codfw.wmnet with reason: Maintenance
02:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2162.codfw.wmnet with reason: Maintenance
02:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T329203)', diff saved to https://phabricator.wikimedia.org/P45153 and previous config saved to /var/cache/conftool/dbconfig/20230307-024850-marostegui.json
02:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P45152 and previous config saved to /var/cache/conftool/dbconfig/20230307-023344-marostegui.json
02:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P45151 and previous config saved to /var/cache/conftool/dbconfig/20230307-021837-marostegui.json
02:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161 (T329203)', diff saved to https://phabricator.wikimedia.org/P45150 and previous config saved to /var/cache/conftool/dbconfig/20230307-020330-marostegui.json
01:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2161 (T329203)', diff saved to https://phabricator.wikimedia.org/P45149 and previous config saved to /var/cache/conftool/dbconfig/20230307-014152-marostegui.json
01:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
01:41 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2161.codfw.wmnet with reason: Maintenance
01:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T329203)', diff saved to https://phabricator.wikimedia.org/P45148 and previous config saved to /var/cache/conftool/dbconfig/20230307-014130-marostegui.json
01:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P45147 and previous config saved to /var/cache/conftool/dbconfig/20230307-012624-marostegui.json
01:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P45146 and previous config saved to /var/cache/conftool/dbconfig/20230307-011117-marostegui.json
00:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154 (T329203)', diff saved to https://phabricator.wikimedia.org/P45145 and previous config saved to /var/cache/conftool/dbconfig/20230307-005611-marostegui.json
00:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2154 (T329203)', diff saved to https://phabricator.wikimedia.org/P45144 and previous config saved to /var/cache/conftool/dbconfig/20230307-003547-marostegui.json
00:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
00:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2154.codfw.wmnet with reason: Maintenance
00:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T329203)', diff saved to https://phabricator.wikimedia.org/P45143 and previous config saved to /var/cache/conftool/dbconfig/20230307-003525-marostegui.json
00:23 mutante: people* - determined which users had a public_html dir in eqiad but not in codfw, created that dir, and rsynced via push from people1003 to people2002 for the 7 affected users. Re-enabled the temporarily disabled puppet to restore the live-hacked rsync config. T330091
00:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P45142 and previous config saved to /var/cache/conftool/dbconfig/20230307-002019-marostegui.json
00:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P45141 and previous config saved to /var/cache/conftool/dbconfig/20230307-000512-marostegui.json
2023-03-06
23:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2152 (T329203)', diff saved to https://phabricator.wikimedia.org/P45140 and previous config saved to /var/cache/conftool/dbconfig/20230306-235006-marostegui.json
23:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2152 (T329203)', diff saved to https://phabricator.wikimedia.org/P45139 and previous config saved to /var/cache/conftool/dbconfig/20230306-232933-marostegui.json
23:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
23:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2152.codfw.wmnet with reason: Maintenance
23:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wcqs1001.eqiad.wmnet,wdqs[1003-1004,1006,1011].eqiad.wmnet with reason: switch maintenance
23:20 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wcqs1001.eqiad.wmnet,wdqs[1003-1004,1006,1011].eqiad.wmnet with reason: switch maintenance
23:19 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 12 hosts with reason: switch maintenance
23:19 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 12 hosts with reason: switch maintenance
23:16 inflatador: bking@cumin2002 ban row A cloudelastic hosts T329073
23:04 inflatador: bking@cumin2002 'depool wcqs and wdqs row A hosts T329073'
22:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
22:49 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2098.codfw.wmnet with reason: Maintenance
22:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
22:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
22:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T329203)', diff saved to https://phabricator.wikimedia.org/P45138 and previous config saved to /var/cache/conftool/dbconfig/20230306-223044-marostegui.json
22:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P45137 and previous config saved to /var/cache/conftool/dbconfig/20230306-221537-marostegui.json
22:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P45136 and previous config saved to /var/cache/conftool/dbconfig/20230306-220031-marostegui.json
21:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T329203)', diff saved to https://phabricator.wikimedia.org/P45135 and previous config saved to /var/cache/conftool/dbconfig/20230306-214524-marostegui.json
21:45 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons.
21:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1203 (T329203)', diff saved to https://phabricator.wikimedia.org/P45133 and previous config saved to /var/cache/conftool/dbconfig/20230306-212358-marostegui.json
21:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
21:23 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1203.eqiad.wmnet with reason: Maintenance
21:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T329203)', diff saved to https://phabricator.wikimedia.org/P45132 and previous config saved to /var/cache/conftool/dbconfig/20230306-212336-marostegui.json
21:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P45131 and previous config saved to /var/cache/conftool/dbconfig/20230306-210829-marostegui.json
20:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P45130 and previous config saved to /var/cache/conftool/dbconfig/20230306-205322-marostegui.json
20:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1193 (T329203)', diff saved to https://phabricator.wikimedia.org/P45129 and previous config saved to /var/cache/conftool/dbconfig/20230306-203816-marostegui.json
20:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1193 (T329203)', diff saved to https://phabricator.wikimedia.org/P45128 and previous config saved to /var/cache/conftool/dbconfig/20230306-201704-marostegui.json
20:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1193.eqiad.wmnet with reason: Maintenance
20:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1193.eqiad.wmnet with reason: Maintenance
20:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T329203)', diff saved to https://phabricator.wikimedia.org/P45127 and previous config saved to /var/cache/conftool/dbconfig/20230306-201643-marostegui.json
20:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T328817)', diff saved to https://phabricator.wikimedia.org/P45126 and previous config saved to /var/cache/conftool/dbconfig/20230306-200843-marostegui.json
20:04 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons.
20:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T329260)', diff saved to https://phabricator.wikimedia.org/P45125 and previous config saved to /var/cache/conftool/dbconfig/20230306-200354-marostegui.json
20:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P45124 and previous config saved to /var/cache/conftool/dbconfig/20230306-200136-marostegui.json
19:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P45123 and previous config saved to /var/cache/conftool/dbconfig/20230306-195336-marostegui.json
19:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P45122 and previous config saved to /var/cache/conftool/dbconfig/20230306-194848-marostegui.json
19:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P45121 and previous config saved to /var/cache/conftool/dbconfig/20230306-194630-marostegui.json
19:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P45120 and previous config saved to /var/cache/conftool/dbconfig/20230306-193829-marostegui.json
19:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P45119 and previous config saved to /var/cache/conftool/dbconfig/20230306-193341-marostegui.json
19:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T329203)', diff saved to https://phabricator.wikimedia.org/P45118 and previous config saved to /var/cache/conftool/dbconfig/20230306-193123-marostegui.json
19:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T328817)', diff saved to https://phabricator.wikimedia.org/P45117 and previous config saved to /var/cache/conftool/dbconfig/20230306-192322-marostegui.json
19:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T329260)', diff saved to https://phabricator.wikimedia.org/P45116 and previous config saved to /var/cache/conftool/dbconfig/20230306-191835-marostegui.json
19:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2180 (T329260)', diff saved to https://phabricator.wikimedia.org/P45115 and previous config saved to /var/cache/conftool/dbconfig/20230306-191622-marostegui.json
19:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2180.codfw.wmnet with reason: Maintenance
19:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2180.codfw.wmnet with reason: Maintenance
19:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T329260)', diff saved to https://phabricator.wikimedia.org/P45114 and previous config saved to /var/cache/conftool/dbconfig/20230306-191600-marostegui.json
19:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1192 (T329203)', diff saved to https://phabricator.wikimedia.org/P45113 and previous config saved to /var/cache/conftool/dbconfig/20230306-190943-marostegui.json
19:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
19:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1192.eqiad.wmnet with reason: Maintenance
19:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T329203)', diff saved to https://phabricator.wikimedia.org/P45112 and previous config saved to /var/cache/conftool/dbconfig/20230306-190921-marostegui.json
19:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P45111 and previous config saved to /var/cache/conftool/dbconfig/20230306-190054-marostegui.json
18:56 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1036']
18:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2177 (T328817)', diff saved to https://phabricator.wikimedia.org/P45110 and previous config saved to /var/cache/conftool/dbconfig/20230306-185559-marostegui.json
18:55 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
18:55 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2177.codfw.wmnet with reason: Maintenance
18:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T328817)', diff saved to https://phabricator.wikimedia.org/P45109 and previous config saved to /var/cache/conftool/dbconfig/20230306-185537-marostegui.json
18:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P45108 and previous config saved to /var/cache/conftool/dbconfig/20230306-185415-marostegui.json
18:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P45107 and previous config saved to /var/cache/conftool/dbconfig/20230306-184547-marostegui.json
18:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P45106 and previous config saved to /var/cache/conftool/dbconfig/20230306-184030-marostegui.json
18:40 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1035']
18:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P45105 and previous config saved to /var/cache/conftool/dbconfig/20230306-183908-marostegui.json
18:38 mutante: phabricator - locked and archived project acl*discovery-repository-admins (T324171)
18:33 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1035']
18:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T329260)', diff saved to https://phabricator.wikimedia.org/P45104 and previous config saved to /var/cache/conftool/dbconfig/20230306-183040-marostegui.json
18:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P45103 and previous config saved to /var/cache/conftool/dbconfig/20230306-182524-marostegui.json
18:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3316 (T329260)', diff saved to https://phabricator.wikimedia.org/P45102 and previous config saved to /var/cache/conftool/dbconfig/20230306-182508-marostegui.json
18:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
18:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2171.codfw.wmnet with reason: Maintenance
18:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T329260)', diff saved to https://phabricator.wikimedia.org/P45101 and previous config saved to /var/cache/conftool/dbconfig/20230306-182447-marostegui.json
18:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T329203)', diff saved to https://phabricator.wikimedia.org/P45100 and previous config saved to /var/cache/conftool/dbconfig/20230306-182402-marostegui.json
18:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T328817)', diff saved to https://phabricator.wikimedia.org/P45099 and previous config saved to /var/cache/conftool/dbconfig/20230306-181017-marostegui.json
18:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P45098 and previous config saved to /var/cache/conftool/dbconfig/20230306-180940-marostegui.json
18:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1178 (T329203)', diff saved to https://phabricator.wikimedia.org/P45097 and previous config saved to /var/cache/conftool/dbconfig/20230306-180249-marostegui.json
18:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
18:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1178.eqiad.wmnet with reason: Maintenance
18:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T329203)', diff saved to https://phabricator.wikimedia.org/P45096 and previous config saved to /var/cache/conftool/dbconfig/20230306-180228-marostegui.json
17:54 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316', diff saved to https://phabricator.wikimedia.org/P45095 and previous config saved to /var/cache/conftool/dbconfig/20230306-175433-marostegui.json
17:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2156 (T328817)', diff saved to https://phabricator.wikimedia.org/P45094 and previous config saved to /var/cache/conftool/dbconfig/20230306-175254-marostegui.json
17:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
17:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2186.codfw.wmnet with reason: Maintenance
17:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
17:52 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2156.codfw.wmnet with reason: Maintenance
17:52 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T328817)', diff saved to https://phabricator.wikimedia.org/P45093 and previous config saved to /var/cache/conftool/dbconfig/20230306-175218-marostegui.json
17:47 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P45092 and previous config saved to /var/cache/conftool/dbconfig/20230306-174721-marostegui.json
17:42 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
17:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T329260)', diff saved to https://phabricator.wikimedia.org/P45091 and previous config saved to /var/cache/conftool/dbconfig/20230306-173927-marostegui.json
17:37 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P45090 and previous config saved to /var/cache/conftool/dbconfig/20230306-173711-marostegui.json
17:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3316 (T329260)', diff saved to https://phabricator.wikimedia.org/P45089 and previous config saved to /var/cache/conftool/dbconfig/20230306-173350-marostegui.json
17:33 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
17:33 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance
17:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T329260)', diff saved to https://phabricator.wikimedia.org/P45088 and previous config saved to /var/cache/conftool/dbconfig/20230306-173328-marostegui.json
17:32 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P45087 and previous config saved to /var/cache/conftool/dbconfig/20230306-173215-marostegui.json
17:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P45086 and previous config saved to /var/cache/conftool/dbconfig/20230306-172205-marostegui.json
17:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P45085 and previous config saved to /var/cache/conftool/dbconfig/20230306-171821-marostegui.json
17:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T329203)', diff saved to https://phabricator.wikimedia.org/P45084 and previous config saved to /var/cache/conftool/dbconfig/20230306-171708-marostegui.json
17:07 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T328817)', diff saved to https://phabricator.wikimedia.org/P45083 and previous config saved to /var/cache/conftool/dbconfig/20230306-170657-marostegui.json
17:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P45082 and previous config saved to /var/cache/conftool/dbconfig/20230306-170315-marostegui.json
16:54 andrew@deploy2002: Finished deploy [horizon/deploy@9d02cd6]: Updating member dashboard to reflect new role names (take two) -- T330759 (duration: 05m 19s)
16:49 andrew@deploy2002: Started deploy [horizon/deploy@9d02cd6]: Updating member dashboard to reflect new role names (take two) -- T330759
16:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T329260)', diff saved to https://phabricator.wikimedia.org/P45081 and previous config saved to /var/cache/conftool/dbconfig/20230306-164808-marostegui.json
16:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2158 (T329260)', diff saved to https://phabricator.wikimedia.org/P45080 and previous config saved to /var/cache/conftool/dbconfig/20230306-164245-marostegui.json
16:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
16:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance
16:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance
16:42 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-restbase (exit_code=0) rolling restart_daemons on A:restbase-codfw
16:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance
16:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T329260)', diff saved to https://phabricator.wikimedia.org/P45079 and previous config saved to /var/cache/conftool/dbconfig/20230306-164158-marostegui.json
16:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2149 (T328817)', diff saved to https://phabricator.wikimedia.org/P45078 and previous config saved to /var/cache/conftool/dbconfig/20230306-163806-marostegui.json
16:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
16:37 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2149.codfw.wmnet with reason: Maintenance
16:32 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-restbase rolling restart_daemons on A:restbase-codfw
16:29 volans@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1007.mgmt.eqiad.wmnet with reboot policy GRACEFUL
16:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P45077 and previous config saved to /var/cache/conftool/dbconfig/20230306-162651-marostegui.json
16:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1177 (T329203)', diff saved to https://phabricator.wikimedia.org/P45076 and previous config saved to /var/cache/conftool/dbconfig/20230306-161652-marostegui.json
16:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
16:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1177.eqiad.wmnet with reason: Maintenance
16:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T329203)', diff saved to https://phabricator.wikimedia.org/P45075 and previous config saved to /var/cache/conftool/dbconfig/20230306-161631-marostegui.json
16:13 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
16:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2139.codfw.wmnet with reason: Maintenance
16:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T328817)', diff saved to https://phabricator.wikimedia.org/P45074 and previous config saved to /var/cache/conftool/dbconfig/20230306-161321-marostegui.json
16:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151', diff saved to https://phabricator.wikimedia.org/P45073 and previous config saved to /var/cache/conftool/dbconfig/20230306-161144-marostegui.json
16:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P45072 and previous config saved to /var/cache/conftool/dbconfig/20230306-160124-marostegui.json
15:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P45071 and previous config saved to /var/cache/conftool/dbconfig/20230306-155815-marostegui.json
15:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2151 (T329260)', diff saved to https://phabricator.wikimedia.org/P45070 and previous config saved to /var/cache/conftool/dbconfig/20230306-155638-marostegui.json
15:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2151 (T329260)', diff saved to https://phabricator.wikimedia.org/P45069 and previous config saved to /var/cache/conftool/dbconfig/20230306-155428-marostegui.json
15:54 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance
15:54 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2151.codfw.wmnet with reason: Maintenance
15:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
15:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2141.codfw.wmnet with reason: Maintenance
15:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T329260)', diff saved to https://phabricator.wikimedia.org/P45068 and previous config saved to /var/cache/conftool/dbconfig/20230306-155030-marostegui.json
15:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P45067 and previous config saved to /var/cache/conftool/dbconfig/20230306-154618-marostegui.json
15:44 otto@deploy2002: Started deploy [analytics/refinery@ee8981b] (hadoop-test): (no justification provided)
15:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127', diff saved to https://phabricator.wikimedia.org/P45066 and previous config saved to /var/cache/conftool/dbconfig/20230306-154308-marostegui.json
15:39 otto@deploy2002: Started deploy [analytics/refinery@d4d723a] (hadoop-test): (no justification provided)
15:36 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2014.codfw.wmnet
15:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P45065 and previous config saved to /var/cache/conftool/dbconfig/20230306-153524-marostegui.json
15:35 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2013.codfw.wmnet
15:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T329203)', diff saved to https://phabricator.wikimedia.org/P45064 and previous config saved to /var/cache/conftool/dbconfig/20230306-153111-marostegui.json
15:30 volans@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on ml-serve1007.eqiad.wmnet with reason: testing provision cookbook
15:30 volans@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on ml-serve1007.eqiad.wmnet with reason: testing provision cookbook
15:29 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-fe2014.codfw.wmnet
15:29 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-fe2013.codfw.wmnet
15:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2127 (T328817)', diff saved to https://phabricator.wikimedia.org/P45063 and previous config saved to /var/cache/conftool/dbconfig/20230306-152801-marostegui.json
15:26 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe2013.codfw.wmnet
15:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124', diff saved to https://phabricator.wikimedia.org/P45062 and previous config saved to /var/cache/conftool/dbconfig/20230306-152017-marostegui.json
15:17 eevans@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-fe2013.codfw.wmnet
15:16 zabe@deploy2002: zabe: Backport for Add logo for azwikimedia and vewikimedia (T331177) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
15:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T329260)', diff saved to https://phabricator.wikimedia.org/P45060 and previous config saved to /var/cache/conftool/dbconfig/20230306-150510-marostegui.json
15:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2127 (T328817)', diff saved to https://phabricator.wikimedia.org/P45059 and previous config saved to /var/cache/conftool/dbconfig/20230306-150115-marostegui.json
15:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
15:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2127.codfw.wmnet with reason: Maintenance
15:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T328817)', diff saved to https://phabricator.wikimedia.org/P45058 and previous config saved to /var/cache/conftool/dbconfig/20230306-150054-marostegui.json
14:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2124 (T329260)', diff saved to https://phabricator.wikimedia.org/P45057 and previous config saved to /var/cache/conftool/dbconfig/20230306-145945-marostegui.json
14:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2124.codfw.wmnet with reason: Maintenance
14:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2124.codfw.wmnet with reason: Maintenance
14:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T329260)', diff saved to https://phabricator.wikimedia.org/P45056 and previous config saved to /var/cache/conftool/dbconfig/20230306-145924-marostegui.json
14:57 herron: failing grafana over to codfw T329073
14:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
14:50 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1171.eqiad.wmnet with reason: Maintenance
14:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T329203)', diff saved to https://phabricator.wikimedia.org/P45055 and previous config saved to /var/cache/conftool/dbconfig/20230306-145052-marostegui.json
14:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P45054 and previous config saved to /var/cache/conftool/dbconfig/20230306-144547-marostegui.json
14:44 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P45053 and previous config saved to /var/cache/conftool/dbconfig/20230306-144417-marostegui.json
14:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P45051 and previous config saved to /var/cache/conftool/dbconfig/20230306-143546-marostegui.json
14:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109', diff saved to https://phabricator.wikimedia.org/P45050 and previous config saved to /var/cache/conftool/dbconfig/20230306-143041-marostegui.json
14:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P45049 and previous config saved to /var/cache/conftool/dbconfig/20230306-142910-marostegui.json
14:20 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P45048 and previous config saved to /var/cache/conftool/dbconfig/20230306-142039-marostegui.json
14:15 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2109 (T328817)', diff saved to https://phabricator.wikimedia.org/P45047 and previous config saved to /var/cache/conftool/dbconfig/20230306-141534-marostegui.json
14:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T329260)', diff saved to https://phabricator.wikimedia.org/P45046 and previous config saved to /var/cache/conftool/dbconfig/20230306-141404-marostegui.json
14:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T329203)', diff saved to https://phabricator.wikimedia.org/P45045 and previous config saved to /var/cache/conftool/dbconfig/20230306-140533-marostegui.json
14:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2117 (T329260)', diff saved to https://phabricator.wikimedia.org/P45044 and previous config saved to /var/cache/conftool/dbconfig/20230306-140339-marostegui.json
14:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance
14:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2117.codfw.wmnet with reason: Maintenance
14:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T329260)', diff saved to https://phabricator.wikimedia.org/P45043 and previous config saved to /var/cache/conftool/dbconfig/20230306-140317-marostegui.json
13:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2109 (T328817)', diff saved to https://phabricator.wikimedia.org/P45042 and previous config saved to /var/cache/conftool/dbconfig/20230306-134820-marostegui.json
13:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P45041 and previous config saved to /var/cache/conftool/dbconfig/20230306-134811-marostegui.json
13:48 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
13:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2109.codfw.wmnet with reason: Maintenance
13:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
13:34 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T328817)', diff saved to https://phabricator.wikimedia.org/P45040 and previous config saved to /var/cache/conftool/dbconfig/20230306-133451-marostegui.json
13:34 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-restbase (exit_code=0) rolling restart_daemons on A:restbase-canary
13:34 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-restbase rolling restart_daemons on A:restbase-canary
13:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114', diff saved to https://phabricator.wikimedia.org/P45039 and previous config saved to /var/cache/conftool/dbconfig/20230306-133304-marostegui.json
13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P45038 and previous config saved to /var/cache/conftool/dbconfig/20230306-131945-marostegui.json
13:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T329260)', diff saved to https://phabricator.wikimedia.org/P45037 and previous config saved to /var/cache/conftool/dbconfig/20230306-131758-marostegui.json
13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T329260)', diff saved to https://phabricator.wikimedia.org/P45036 and previous config saved to /var/cache/conftool/dbconfig/20230306-131545-marostegui.json
13:15 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance
13:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db2114.codfw.wmnet with reason: Maintenance
13:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
13:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
13:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T329260)', diff saved to https://phabricator.wikimedia.org/P45035 and previous config saved to /var/cache/conftool/dbconfig/20230306-131214-marostegui.json
13:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1167 (T329203)', diff saved to https://phabricator.wikimedia.org/P45034 and previous config saved to /var/cache/conftool/dbconfig/20230306-130933-marostegui.json
13:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
13:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1016,1020-1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
13:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance
13:09 moritzm: rearmed keyholder on deploy1002 following reboot
13:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance
13:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T329203)', diff saved to https://phabricator.wikimedia.org/P45033 and previous config saved to /var/cache/conftool/dbconfig/20230306-130854-marostegui.json
13:08 nfraison@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-conf1002.eqiad.wmnet with OS bullseye
13:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P45032 and previous config saved to /var/cache/conftool/dbconfig/20230306-130438-marostegui.json
12:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P45031 and previous config saved to /var/cache/conftool/dbconfig/20230306-125707-marostegui.json
12:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P45030 and previous config saved to /var/cache/conftool/dbconfig/20230306-125348-marostegui.json
12:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T328817)', diff saved to https://phabricator.wikimedia.org/P45029 and previous config saved to /var/cache/conftool/dbconfig/20230306-124932-marostegui.json
12:48 nfraison@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-conf1002.eqiad.wmnet with reason: host reimage
12:46 nfraison@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-conf1002.eqiad.wmnet with reason: host reimage
12:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1198 (T328817)', diff saved to https://phabricator.wikimedia.org/P45028 and previous config saved to /var/cache/conftool/dbconfig/20230306-124341-marostegui.json
12:43 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
12:43 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1198.eqiad.wmnet with reason: Maintenance
12:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T328817)', diff saved to https://phabricator.wikimedia.org/P45027 and previous config saved to /var/cache/conftool/dbconfig/20230306-124308-marostegui.json
12:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P45026 and previous config saved to /var/cache/conftool/dbconfig/20230306-124200-marostegui.json
12:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126', diff saved to https://phabricator.wikimedia.org/P45025 and previous config saved to /var/cache/conftool/dbconfig/20230306-123841-marostegui.json
12:32 nfraison@cumin1001: START - Cookbook sre.hosts.reimage for host an-conf1002.eqiad.wmnet with OS bullseye
12:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P45024 and previous config saved to /var/cache/conftool/dbconfig/20230306-122801-marostegui.json
12:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1201 (T329260)', diff saved to https://phabricator.wikimedia.org/P45023 and previous config saved to /var/cache/conftool/dbconfig/20230306-122654-marostegui.json
12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1201 (T329260)', diff saved to https://phabricator.wikimedia.org/P45022 and previous config saved to /var/cache/conftool/dbconfig/20230306-122546-marostegui.json
12:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1201.eqiad.wmnet with reason: Maintenance
12:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1201.eqiad.wmnet with reason: Maintenance
12:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T329260)', diff saved to https://phabricator.wikimedia.org/P45021 and previous config saved to /var/cache/conftool/dbconfig/20230306-122524-marostegui.json
12:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1126 (T329203)', diff saved to https://phabricator.wikimedia.org/P45020 and previous config saved to /var/cache/conftool/dbconfig/20230306-122334-marostegui.json
12:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P45019 and previous config saved to /var/cache/conftool/dbconfig/20230306-121255-marostegui.json
12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P45018 and previous config saved to /var/cache/conftool/dbconfig/20230306-121018-marostegui.json
12:03 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1126 (T329203)', diff saved to https://phabricator.wikimedia.org/P45017 and previous config saved to /var/cache/conftool/dbconfig/20230306-120328-marostegui.json
12:03 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1126.eqiad.wmnet with reason: Maintenance
12:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1126.eqiad.wmnet with reason: Maintenance
11:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T328817)', diff saved to https://phabricator.wikimedia.org/P45016 and previous config saved to /var/cache/conftool/dbconfig/20230306-115748-marostegui.json
11:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P45015 and previous config saved to /var/cache/conftool/dbconfig/20230306-115511-marostegui.json
11:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1189 (T328817)', diff saved to https://phabricator.wikimedia.org/P45014 and previous config saved to /var/cache/conftool/dbconfig/20230306-115201-marostegui.json
11:51 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
11:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1189.eqiad.wmnet with reason: Maintenance
11:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T328817)', diff saved to https://phabricator.wikimedia.org/P45013 and previous config saved to /var/cache/conftool/dbconfig/20230306-115140-marostegui.json
11:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1116.eqiad.wmnet with reason: Maintenance
11:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1116.eqiad.wmnet with reason: Maintenance
11:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T329203)', diff saved to https://phabricator.wikimedia.org/P45012 and previous config saved to /var/cache/conftool/dbconfig/20230306-114354-marostegui.json
11:42 vgutierrez: enable ESI testing in cp4044 - T308799
11:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T329260)', diff saved to https://phabricator.wikimedia.org/P45011 and previous config saved to /var/cache/conftool/dbconfig/20230306-114004-marostegui.json
11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1187 (T329260)', diff saved to https://phabricator.wikimedia.org/P45010 and previous config saved to /var/cache/conftool/dbconfig/20230306-113856-marostegui.json
11:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1187.eqiad.wmnet with reason: Maintenance
11:38 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1187.eqiad.wmnet with reason: Maintenance
11:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T329260)', diff saved to https://phabricator.wikimedia.org/P45009 and previous config saved to /var/cache/conftool/dbconfig/20230306-113835-marostegui.json
11:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P45008 and previous config saved to /var/cache/conftool/dbconfig/20230306-113633-marostegui.json
11:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P45007 and previous config saved to /var/cache/conftool/dbconfig/20230306-112847-marostegui.json
11:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P45006 and previous config saved to /var/cache/conftool/dbconfig/20230306-112328-marostegui.json
11:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P45005 and previous config saved to /var/cache/conftool/dbconfig/20230306-112126-marostegui.json
11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1004.eqiad.wmnet
11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host poolcounter1004.eqiad.wmnet
11:13 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114', diff saved to https://phabricator.wikimedia.org/P45003 and previous config saved to /var/cache/conftool/dbconfig/20230306-111340-marostegui.json
11:09 jbond: enable puppet fleet wide after puppetdb reboot
11:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P45002 and previous config saved to /var/cache/conftool/dbconfig/20230306-110822-marostegui.json
11:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 (T328817)', diff saved to https://phabricator.wikimedia.org/P45001 and previous config saved to /var/cache/conftool/dbconfig/20230306-110620-marostegui.json
11:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1179 (T328817)', diff saved to https://phabricator.wikimedia.org/P45000 and previous config saved to /var/cache/conftool/dbconfig/20230306-110031-marostegui.json
11:00 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1179.eqiad.wmnet with reason: Maintenance
11:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1179.eqiad.wmnet with reason: Maintenance
11:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T328817)', diff saved to https://phabricator.wikimedia.org/P44999 and previous config saved to /var/cache/conftool/dbconfig/20230306-110009-marostegui.json
10:58 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1114 (T329203)', diff saved to https://phabricator.wikimedia.org/P44998 and previous config saved to /var/cache/conftool/dbconfig/20230306-105834-marostegui.json
10:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T329260)', diff saved to https://phabricator.wikimedia.org/P44997 and previous config saved to /var/cache/conftool/dbconfig/20230306-105315-marostegui.json
10:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T329260)', diff saved to https://phabricator.wikimedia.org/P44996 and previous config saved to /var/cache/conftool/dbconfig/20230306-105206-marostegui.json
10:52 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
10:51 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1180.eqiad.wmnet with reason: Maintenance
10:51 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T329260)', diff saved to https://phabricator.wikimedia.org/P44995 and previous config saved to /var/cache/conftool/dbconfig/20230306-105145-marostegui.json
10:49 jbond: disable puppet fleet wide to reboot puppetdb
10:45 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P44994 and previous config saved to /var/cache/conftool/dbconfig/20230306-104503-marostegui.json
10:36 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P44993 and previous config saved to /var/cache/conftool/dbconfig/20230306-103639-marostegui.json
10:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter1005.eqiad.wmnet
10:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1114 (T329203)', diff saved to https://phabricator.wikimedia.org/P44992 and previous config saved to /var/cache/conftool/dbconfig/20230306-103525-marostegui.json
10:35 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1114.eqiad.wmnet with reason: Maintenance
10:35 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1114.eqiad.wmnet with reason: Maintenance
10:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T329203)', diff saved to https://phabricator.wikimedia.org/P44991 and previous config saved to /var/cache/conftool/dbconfig/20230306-103503-marostegui.json
10:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host poolcounter1005.eqiad.wmnet
10:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P44990 and previous config saved to /var/cache/conftool/dbconfig/20230306-102956-marostegui.json
10:29 vgutierrez: enable haproxy systemd service unit hardening in cp4045 - T323944
10:29 nfraison@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-conf1001.eqiad.wmnet with OS bullseye
10:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P44989 and previous config saved to /var/cache/conftool/dbconfig/20230306-102132-marostegui.json
10:19 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P44988 and previous config saved to /var/cache/conftool/dbconfig/20230306-101957-marostegui.json
10:14 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T328817)', diff saved to https://phabricator.wikimedia.org/P44987 and previous config saved to /var/cache/conftool/dbconfig/20230306-101450-marostegui.json
10:12 otto@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1175 (T328817)', diff saved to https://phabricator.wikimedia.org/P44986 and previous config saved to /var/cache/conftool/dbconfig/20230306-100901-marostegui.json
10:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
10:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1175.eqiad.wmnet with reason: Maintenance
10:08 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T328817)', diff saved to https://phabricator.wikimedia.org/P44985 and previous config saved to /var/cache/conftool/dbconfig/20230306-100840-marostegui.json
10:08 nfraison@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-conf1001.eqiad.wmnet with reason: host reimage
10:06 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T329260)', diff saved to https://phabricator.wikimedia.org/P44984 and previous config saved to /var/cache/conftool/dbconfig/20230306-100626-marostegui.json
10:05 nfraison@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-conf1001.eqiad.wmnet with reason: host reimage
10:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111', diff saved to https://phabricator.wikimedia.org/P44983 and previous config saved to /var/cache/conftool/dbconfig/20230306-100450-marostegui.json
10:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1173 (T329260)', diff saved to https://phabricator.wikimedia.org/P44982 and previous config saved to /var/cache/conftool/dbconfig/20230306-100417-marostegui.json
10:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
10:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1173.eqiad.wmnet with reason: Maintenance
10:03 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T329260)', diff saved to https://phabricator.wikimedia.org/P44981 and previous config saved to /var/cache/conftool/dbconfig/20230306-100356-marostegui.json
09:59 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host deploy1002.eqiad.wmnet
09:59 otto@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
09:53 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P44980 and previous config saved to /var/cache/conftool/dbconfig/20230306-095333-marostegui.json
09:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1111 (T329203)', diff saved to https://phabricator.wikimedia.org/P44979 and previous config saved to /var/cache/conftool/dbconfig/20230306-094944-marostegui.json
09:49 nfraison@cumin1001: START - Cookbook sre.hosts.reimage for host an-conf1001.eqiad.wmnet with OS bullseye
09:49 nfraison@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host an-conf1001.eqiad.wmnet with OS bullseye
09:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P44978 and previous config saved to /var/cache/conftool/dbconfig/20230306-094849-marostegui.json
09:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host deploy1002.eqiad.wmnet
09:43 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P44977 and previous config saved to /var/cache/conftool/dbconfig/20230306-094341-root.json
09:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P44976 and previous config saved to /var/cache/conftool/dbconfig/20230306-093827-marostegui.json
09:36 nfraison@cumin1001: START - Cookbook sre.hosts.reimage for host an-conf1001.eqiad.wmnet with OS bullseye
09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P44975 and previous config saved to /var/cache/conftool/dbconfig/20230306-093343-marostegui.json
09:28 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P44974 and previous config saved to /var/cache/conftool/dbconfig/20230306-092836-root.json
09:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1111 (T329203)', diff saved to https://phabricator.wikimedia.org/P44973 and previous config saved to /var/cache/conftool/dbconfig/20230306-092557-marostegui.json
09:25 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1111.eqiad.wmnet with reason: Maintenance
09:25 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1111.eqiad.wmnet with reason: Maintenance
09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T329203)', diff saved to https://phabricator.wikimedia.org/P44972 and previous config saved to /var/cache/conftool/dbconfig/20230306-092536-marostegui.json
09:23 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T328817)', diff saved to https://phabricator.wikimedia.org/P44971 and previous config saved to /var/cache/conftool/dbconfig/20230306-092320-marostegui.json
09:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T329260)', diff saved to https://phabricator.wikimedia.org/P44970 and previous config saved to /var/cache/conftool/dbconfig/20230306-091836-marostegui.json
09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1166 (T328817)', diff saved to https://phabricator.wikimedia.org/P44969 and previous config saved to /var/cache/conftool/dbconfig/20230306-091733-marostegui.json
09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T329260)', diff saved to https://phabricator.wikimedia.org/P44968 and previous config saved to /var/cache/conftool/dbconfig/20230306-091728-marostegui.json
09:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
09:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
09:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1166.eqiad.wmnet with reason: Maintenance
09:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1168.eqiad.wmnet with reason: Maintenance
09:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T329260)', diff saved to https://phabricator.wikimedia.org/P44967 and previous config saved to /var/cache/conftool/dbconfig/20230306-091706-marostegui.json
09:14 dcausse: depooling & restarting blazegraph on wdqs1006 (stuck for 48+ hours)
09:13 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P44966 and previous config saved to /var/cache/conftool/dbconfig/20230306-091330-root.json
09:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P44965 and previous config saved to /var/cache/conftool/dbconfig/20230306-091030-marostegui.json
09:06 hashar@deploy2002: Finished deploy [gerrit/gerrit@b725ff6]: Gerrit to 3.5.5 on gerrit1001 (duration: 00m 12s)
09:06 hashar@deploy2002: Started deploy [gerrit/gerrit@b725ff6]: Gerrit to 3.5.5 on gerrit1001
09:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
09:04 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1145.eqiad.wmnet with reason: Maintenance
09:04 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T328817)', diff saved to https://phabricator.wikimedia.org/P44964 and previous config saved to /var/cache/conftool/dbconfig/20230306-090416-marostegui.json
09:02 vgutierrez: disabling haproxy systemd service unit hardening in ulsfo - T323944
09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P44963 and previous config saved to /var/cache/conftool/dbconfig/20230306-090200-marostegui.json
09:00 hashar@deploy2002: Finished deploy [gerrit/gerrit@b725ff6]: Gerrit to 3.5.5 on gerrit2002 (duration: 00m 07s)
09:00 hashar@deploy2002: Started deploy [gerrit/gerrit@b725ff6]: Gerrit to 3.5.5 on gerrit2002
08:58 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P44962 and previous config saved to /var/cache/conftool/dbconfig/20230306-085825-root.json
08:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104', diff saved to https://phabricator.wikimedia.org/P44961 and previous config saved to /var/cache/conftool/dbconfig/20230306-085523-marostegui.json
08:49 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P44960 and previous config saved to /var/cache/conftool/dbconfig/20230306-084910-marostegui.json
08:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P44959 and previous config saved to /var/cache/conftool/dbconfig/20230306-084653-marostegui.json
08:43 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P44958 and previous config saved to /var/cache/conftool/dbconfig/20230306-084320-root.json
08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1104 (T329203)', diff saved to https://phabricator.wikimedia.org/P44957 and previous config saved to /var/cache/conftool/dbconfig/20230306-084017-marostegui.json
08:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123', diff saved to https://phabricator.wikimedia.org/P44956 and previous config saved to /var/cache/conftool/dbconfig/20230306-083403-marostegui.json
08:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T329260)', diff saved to https://phabricator.wikimedia.org/P44955 and previous config saved to /var/cache/conftool/dbconfig/20230306-083147-marostegui.json
08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T329260)', diff saved to https://phabricator.wikimedia.org/P44954 and previous config saved to /var/cache/conftool/dbconfig/20230306-083038-marostegui.json
08:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
08:30 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
08:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1165.eqiad.wmnet with reason: Maintenance
08:28 moritzm: rolling restart of Apache on mw* to pick up apr-util security updates
08:28 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P44953 and previous config saved to /var/cache/conftool/dbconfig/20230306-082815-root.json
08:27 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
08:26 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1140.eqiad.wmnet with reason: Maintenance
08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T329260)', diff saved to https://phabricator.wikimedia.org/P44952 and previous config saved to /var/cache/conftool/dbconfig/20230306-082645-marostegui.json
08:24 jmm@cumin2002: END (PASS) - Cookbook sre.elasticsearch.restart-nginx (exit_code=0) rolling restart_daemons on A:elastic-eqiad
08:18 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1123 (T328817)', diff saved to https://phabricator.wikimedia.org/P44951 and previous config saved to /var/cache/conftool/dbconfig/20230306-081857-marostegui.json
08:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (T329203)', diff saved to https://phabricator.wikimedia.org/P44950 and previous config saved to /var/cache/conftool/dbconfig/20230306-081711-marostegui.json
08:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1104.eqiad.wmnet with reason: Maintenance
08:16 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1104.eqiad.wmnet with reason: Maintenance
08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T329203)', diff saved to https://phabricator.wikimedia.org/P44949 and previous config saved to /var/cache/conftool/dbconfig/20230306-081639-marostegui.json
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P44948 and previous config saved to /var/cache/conftool/dbconfig/20230306-081310-root.json
08:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1123 (T328817)', diff saved to https://phabricator.wikimedia.org/P44947 and previous config saved to /var/cache/conftool/dbconfig/20230306-081305-marostegui.json
08:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1123.eqiad.wmnet with reason: Maintenance
08:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1123.eqiad.wmnet with reason: Maintenance
08:12 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T328817)', diff saved to https://phabricator.wikimedia.org/P44946 and previous config saved to /var/cache/conftool/dbconfig/20230306-081244-marostegui.json
08:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P44945 and previous config saved to /var/cache/conftool/dbconfig/20230306-081138-marostegui.json
08:01 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P44944 and previous config saved to /var/cache/conftool/dbconfig/20230306-080132-marostegui.json
08:00 jmm@cumin2002: START - Cookbook sre.elasticsearch.restart-nginx rolling restart_daemons on A:elastic-eqiad
07:57 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P44943 and previous config saved to /var/cache/conftool/dbconfig/20230306-075737-marostegui.json
07:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P44942 and previous config saved to /var/cache/conftool/dbconfig/20230306-075632-marostegui.json
07:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2122', diff saved to https://phabricator.wikimedia.org/P44941 and previous config saved to /var/cache/conftool/dbconfig/20230306-074830-root.json
07:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318', diff saved to https://phabricator.wikimedia.org/P44940 and previous config saved to /var/cache/conftool/dbconfig/20230306-074626-marostegui.json
07:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112', diff saved to https://phabricator.wikimedia.org/P44939 and previous config saved to /var/cache/conftool/dbconfig/20230306-074231-marostegui.json
07:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T329260)', diff saved to https://phabricator.wikimedia.org/P44938 and previous config saved to /var/cache/conftool/dbconfig/20230306-074125-marostegui.json
07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T329260)', diff saved to https://phabricator.wikimedia.org/P44937 and previous config saved to /var/cache/conftool/dbconfig/20230306-073707-marostegui.json
07:37 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1113.eqiad.wmnet with reason: Maintenance
07:31 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3318 (T329203)', diff saved to https://phabricator.wikimedia.org/P44936 and previous config saved to /var/cache/conftool/dbconfig/20230306-073119-marostegui.json
07:27 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1112 (T328817)', diff saved to https://phabricator.wikimedia.org/P44935 and previous config saved to /var/cache/conftool/dbconfig/20230306-072724-marostegui.json
07:23 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db2094.codfw.wmnet
07:23 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
07:23 marostegui@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2094.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:22 marostegui@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: db2094.codfw.wmnet decommissioned, removing all IPs except the asset tag one - marostegui@cumin1001"
07:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1112 (T328817)', diff saved to https://phabricator.wikimedia.org/P44934 and previous config saved to /var/cache/conftool/dbconfig/20230306-072132-marostegui.json
07:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
07:21 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
07:21 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1112.eqiad.wmnet with reason: Maintenance
07:20 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1112.eqiad.wmnet with reason: Maintenance
07:15 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db2094.codfw.wmnet
07:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3318 (T329203)', diff saved to https://phabricator.wikimedia.org/P44933 and previous config saved to /var/cache/conftool/dbconfig/20230306-070814-marostegui.json
07:08 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1101.eqiad.wmnet with reason: Maintenance
07:07 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1101.eqiad.wmnet with reason: Maintenance
07:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
07:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 12:00:00 on db1102.eqiad.wmnet with reason: Maintenance
06:29 apergos: rsync from dumpsdata1001 in ariel screen session of xmldatadumps/public to dumpsdata1007, no bandwidth cap
06:03 apergos: rsync from dumpsdata1001 in ariel screen session of xmldatadumps/private to dumpsdata1007 (did this for 1006 about an hour ago, forgot to log), no bandwidth cap
2023-03-04
14:56 andrew@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: Updating member dashboard to reflect new role names -- T330759 (duration: 02m 17s)
14:53 andrew@deploy1002: Started deploy [horizon/deploy@9d02cd6]: Updating member dashboard to reflect new role names -- T330759
14:44 andrew@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: Updating member dashboard to reflect new role names -- T330759 (duration: 08m 56s)
14:35 andrew@deploy1002: Started deploy [horizon/deploy@9d02cd6]: Updating member dashboard to reflect new role names -- T330759
2023-03-03
20:50 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2070.codfw.wmnet with OS bullseye
20:41 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic1059.mgmt.eqiad.wmnet with reboot policy GRACEFUL
20:35 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1040.mgmt.eqiad.wmnet with reboot policy FORCED
20:33 bking@cumin2002: START - Cookbook sre.hosts.provision for host elastic1059.mgmt.eqiad.wmnet with reboot policy GRACEFUL
20:30 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudcephosd1040.mgmt.eqiad.wmnet with reboot policy FORCED
20:29 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1039.mgmt.eqiad.wmnet with reboot policy FORCED
20:25 bking@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update location of elastic1058 - bking@cumin2002 - T322082"
20:24 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudcephosd1039.mgmt.eqiad.wmnet with reboot policy FORCED
20:23 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1038.mgmt.eqiad.wmnet with reboot policy FORCED
20:17 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudcephosd1038.mgmt.eqiad.wmnet with reboot policy FORCED
20:17 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1037.mgmt.eqiad.wmnet with reboot policy FORCED
20:13 bking@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host elastic1058.mgmt.eqiad.wmnet with reboot policy GRACEFUL
20:12 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudcephosd1037.mgmt.eqiad.wmnet with reboot policy FORCED
20:09 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1036.mgmt.eqiad.wmnet with reboot policy FORCED
20:05 bking@cumin2002: START - Cookbook sre.hosts.provision for host elastic1058.mgmt.eqiad.wmnet with reboot policy GRACEFUL
19:53 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host cloudcephosd1036.mgmt.eqiad.wmnet with reboot policy FORCED
19:52 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1035.mgmt.eqiad.wmnet with reboot policy FORCED
15:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1004.wikimedia.org
15:11 bking@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['elastic1053.eqiad.wmnet']
15:02 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1004.wikimedia.org on all recursors
15:02 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1004.wikimedia.org on all recursors
15:02 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:02 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1004.wikimedia.org - jmm@cumin2002"
14:58 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1004.wikimedia.org - jmm@cumin2002"
14:56 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1004.wikimedia.org
14:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host urldownloader1003.wikimedia.org
14:27 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) urldownloader1003.wikimedia.org on all recursors
14:27 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache urldownloader1003.wikimedia.org on all recursors
14:27 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
14:27 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1003.wikimedia.org - jmm@cumin2002"
14:27 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 14 hosts with reason: rerack
14:26 bking@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 14 hosts with reason: rerack
14:24 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1002.eqiad.wmnet with OS bookworm
14:16 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM urldownloader1003.wikimedia.org - jmm@cumin2002"
14:10 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host urldownloader1003.wikimedia.org
14:09 inflatador: bking@cumin2002 banning elastic1053-59 from the cluster in preparation for T322082
14:02 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
13:51 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1002.eqiad.wmnet with OS bookworm
13:16 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 20485
13:16 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 20485
13:15 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 20485
13:15 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'email' for AS: 20485
12:55 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
11:29 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1002.eqiad.wmnet with OS bookworm
11:17 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet with OS bookworm
11:13 moritzm: imported PHP 7.4 1:7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u2+icu67u1 to component/icu67 (build of PHP against co-installable ICU67) T329491
10:39 vgutierrez: restart ntp.service in dns2001
10:30 jelto@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Install software version upgrade
10:25 moritzm: installing 5.10.162 kernels on buster systems running Linux 5.10
10:12 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jonas Kress (WMDE) out of all services on: 1119 hosts
10:12 root@cumin2002: START - Cookbook sre.idm.logout Logging Jonas Kress (WMDE) out of all services on: 1119 hosts
09:56 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Tobias Andersson out of all services on: 1119 hosts
09:55 root@cumin2002: START - Cookbook sre.idm.logout Logging Tobias Andersson out of all services on: 1119 hosts
09:54 root@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Tobias Andersson out of all services on: 909 hosts
09:54 root@cumin2002: START - Cookbook sre.idm.logout Logging Tobias Andersson out of all services on: 909 hosts
09:45 jelto@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Install software version upgrade
09:45 jelto@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Install software version upgrade
09:27 jelto@cumin1001: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Install software version upgrade
09:10 elukey@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:10 elukey@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:07 elukey@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
09:01 jelto@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Install software version upgrade
08:45 jelto@cumin1001: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Install software version upgrade
08:36 vgutierrez: restarting ntp in dns1001
07:29 elukey: truncate /var/log/auth.log.1 on krb1001 to free space (root partition almost filled up)
01:12 mutante: releases1002: deleting /usr/local/sbin/sync-srv-org-wikimedia-reprepro-releases1002.eqiad.wmnet which confusingly contains an rsync command to rsync from releases1001 which does not exist anymore T330960
00:13 mutante: switching releases.wikimedia.org from eqiad to codfw - T330960
2023-03-02
23:40 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wdqs[2001-2003].codfw.wmnet
23:40 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
23:39 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs[2001-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
22:45 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wdqs[2001-2003].codfw.wmnet decommissioned, removing all IPs except the asset tag one - ryankemper@cumin2002"
15:59 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
15:59 cmooney@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix DNS typo in record for cr2-eqiad gr-3/3/0.2 - cmooney@cumin1001"
15:58 cmooney@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Fix DNS typo in record for cr2-eqiad gr-3/3/0.2 - cmooney@cumin1001"
15:44 root@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['cloudcephosd1005']
15:39 aokoth@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: installation failed due to read-only database
15:39 aokoth@cumin1001: START - Cookbook sre.hosts.downtime for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: installation failed due to read-only database
14:20 cgoubert@deploy2002: cgoubert: Backport for debug.json: List primary DC servers first (T327920) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
11:07 dcaro@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudcephosd1010.eqiad.wmnet
11:07 dcaro@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
11:07 dcaro@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephosd1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dcaro@cumin1001"
10:57 moritzm: upgrade cloudweb to PHP 1:7.4.33-1+0~20221108.73+debian10~1.gbpa00350a+wmf10u2 T330270
10:56 dcaro@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudcephosd1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - dcaro@cumin1001"
10:37 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1003.eqiad.wmnet with OS bullseye
10:35 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1002.eqiad.wmnet with OS bullseye
10:35 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1005.eqiad.wmnet with reason: host reimage
10:33 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1004.eqiad.wmnet with OS bullseye
10:32 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1005.eqiad.wmnet with reason: host reimage
10:30 root@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1001.eqiad.wmnet with OS bullseye